C and Map Reduce on Android
My talk at IDC
About Me
Academic degrees (all from IDC): M.Sc. in Computer Science, M.B.A. in Marketing
Experience Samsung OctopUI Redbend/Matrix Varonis
About Me
Blog – www.api-solutions.com
Open Source
Profiterole – Android based Map Reduce
Sherlock Hash – Android based Metadata Management
Teaching
Basic Android – Matrix, Redbend
Advanced Android – John Bryce
Software Engineering – MAMRAM units
Android & Software Engineering Talks – Google, IASA, IDC
Android books reviewer/co-author
Plan
1. Intro to Android
2. Programming Paradigms and Languages (brief)
3. NDK
4. Map Reduce
5. Q&A
Credits
http://www.cs.iastate.edu/courses/archive/f05/cs587/ (used with permission)
http://www.ceid.upatras.gr/courses/katanemhmena/ds2/ (used with permission)
http://www.cs.berkeley.edu/~matei/ (used with permission)
This Talk
It is hard to be in your shoes …
We are going to look at a few topics from your courses and how they mesh with Android development:
Programming Workshop: C language development in Android
Functional and Logic Programming: Map Reduce implementation in Android
Android (from wiki)
Android is a Linux-based operating system designed primarily for smart phones and tablet computers.
Android from Wiki
The first Android-powered phone was sold in October 2008.
Android is open source and Google releases the code under the Apache License. This open-source code and permissive licensing allow the software to be freely modified and distributed by device manufacturers, wireless carriers and enthusiast developers.
Additionally, Android has a large community of developers writing applications ("apps") that extend the functionality of devices, written primarily in a customized version of the Java programming language.
In October 2012, there were approximately 700,000 apps available for Android, and the estimated number of applications downloaded from Google Play, Android's primary app store, was 25 billion.
Programming Paradigms
A programming paradigm is a conceptual model for creating programs, supported by a programming language.
Paradigms differ in the concepts and abstractions used to:
Represent the elements of a program, such as objects, functions, variables, constraints, etc.
Represent the steps that compose a computation, such as assignment, evaluation, continuations, data flows, etc.
Languages to Paradigms Mapping
Java/C# – Object Oriented: represents concepts as "objects" that have data fields (attributes that describe the object) and associated procedures known as methods.
C – Procedural: describes computation in terms of statements that change a program state. What is program state? The contents of the program's data structures.
Lisp (Scheme/Clojure) – Functional: a style of building the structure and elements of computer programs that treats computation as the evaluation of mathematical functions and avoids state and mutable data. Functional programming emphasizes functions that produce results that depend only on their inputs and not on the program state.
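To make the contrast concrete, here is a small sketch of the same computation written in a procedural style and in a functional style. It is illustrative only: the class and method names are made up for this example, and the functional version assumes Java 8 or later (streams).

```java
import java.util.Arrays;
import java.util.List;

final class ParadigmDemo {
    // Procedural style: statements repeatedly change a piece of program state.
    static int sumOfSquaresProcedural(List<Integer> xs) {
        int total = 0;                 // the program state
        for (int x : xs) {
            total += x * x;            // each iteration mutates the state
        }
        return total;
    }

    // Functional style: the result is an expression over the input only,
    // with no variable being reassigned by our code.
    static int sumOfSquaresFunctional(List<Integer> xs) {
        return xs.stream().mapToInt(x -> x * x).sum();
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 3, 5, 7);
        System.out.println(sumOfSquaresProcedural(xs)); // prints 84
        System.out.println(sumOfSquaresFunctional(xs)); // prints 84
    }
}
```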
NDK
Java
Lingua franca of Android development
General, concurrent, class-based, object-oriented language
Android has the major Java language libraries (io, net, util, lang)
Compiles to class format, then is transformed to dex format and runs on the Dalvik virtual machine
C Development On Android
Why develop in C on Android?
Which Problem Does It Solve
Increase performance by implementing demanding tasks in a native language
Java can be slower than native code and may not be suitable for some functions (e.g., 3D, sound)
Number processing without too many allocations
Reuse an existing legacy library that you cannot afford to rewrite in Java (BREW/Symbian)
JNI can be used as a wrapper around this legacy code
Legacy code can be migrated slowly to a newer platform
Justification
Pros:
Reuse: allows access to useful native code
Efficiency: use the best language for the task
Cons:
Extra work: javah, creating shared native libs
Difficult code to write and maintain
NDK (Native Development Kit)
A toolset that lets you embed native source code in your app
It is aimed to:
Bring native libraries to Android (code reusability)
Make some parts of the application really fast using code generated for ARM-like CPUs
JNI in Android
The Android NDK is nothing more than a complement to the Android SDK that helps you to:
Generate JNI-compatible shared libraries that can run on the Android platform running on ARM CPUs.
Copy the generated libraries to a proper location of your application to be included in .apks.
It also provides a set of cross-toolchains (compilers, linkers, etc.) that can generate native ARM binaries on Linux, OS X and Windows (with Cygwin).
All the rest is JNI
What is a Native Method
Functions written in a language other than Java
They could be C, C++, or even assembly
What is JNI
Java interface to non-Java code. It is Java's link to the "outside world"
Native methods are compiled into a dynamic link library (.dll, .so, etc.)
The OS loads and links the library into the process that is running the Java Virtual Machine
Part of the Java Development Kit (JDK), serves as glue between the Java side and the native side of an application
Allows Java code that runs inside a Java Virtual Machine (JVM) to interoperate with applications and libraries written in other programming languages, such as C, C++, and assembly
JNI Overview
Interactions with Native Code
Access to Java world from native code
Access to native code from Java
Using The JNI
Java calls C: embedding C in Java
C calls Java: using Java features from C
Embedding C in Java
1. Declare the method using the keyword native, provide no implementation.
2. Make sure the Java code loads the needed library
3. Run the javah utility to generate names/headers
4. Implement the method in C
5. Compile as a shared library
class HelloWorld {
    public native void displayHelloWorld();

    static {
        System.loadLibrary("hello");
    }

    public static void main(String[] args) {
        new HelloWorld().displayHelloWorld();
    }
}
HelloWorld.h
#include "jni.h"
/* Header for class HelloWorld */
#ifndef _Included_HelloWorld
#define _Included_HelloWorld
#ifdef __cplusplus
extern "C" {
#endif
/*
 * Class:     HelloWorld
 * Method:    displayHelloWorld
 * Signature: ()V
 */
JNIEXPORT void JNICALL Java_HelloWorld_displayHelloWorld(JNIEnv *env, jobject);
#ifdef __cplusplus
}
#endif
#endif
JNIEnv *env: the JVM reference
jobject: the calling object
HelloWorldImpl.c
#include <jni.h>
#include "HelloWorld.h"
#include <stdio.h>

JNIEXPORT void JNICALL
Java_HelloWorld_displayHelloWorld(JNIEnv *env, jobject obj)
{
    printf("Hello world!\n");
    return;
}
Create a Shared Library
class HelloWorld {
    . . .
    System.loadLibrary("hello");
    . . .
}
Compile the native code into a shared library:
popeye (Linux)
cc -shared -I/usr/java/j2sdk1.4.1_04/include \
   -I/usr/java/j2sdk1.4.1_04/include/linux \
   HelloWorldImpl.c -o libhello.so
Mapping Example
class Prompt {
    private native String getLine(String prompt);
}
JNIEXPORT jstring JNICALL Java_Prompt_getLine(JNIEnv *, jobject, jstring);
Native function name: prefix "Java_" + fully qualified class name + "_" + method name
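The naming rule can be sketched as a small helper that builds the C function name javah would generate. This is a simplified illustration, not a full implementation of the JNI rules: it handles only the package separator and the "_1" escape for underscores, and it ignores overloaded native methods (which get a signature suffix) and non-ASCII characters. The class and method names of the helper are made up for this example.

```java
final class JniNames {
    // Escape one name: the package separator ('.' or '/') becomes '_',
    // and a literal '_' becomes "_1". Other JNI escapes are not handled.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '_') {
                sb.append("_1");
            } else if (c == '.' || c == '/') {
                sb.append('_');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Prefix + fully qualified class name + "_" + method name.
    static String nativeName(String qualifiedClass, String method) {
        return "Java_" + mangle(qualifiedClass) + "_" + mangle(method);
    }

    public static void main(String[] args) {
        System.out.println(nativeName("Prompt", "getLine"));
        // prints Java_Prompt_getLine
        System.out.println(nativeName("HelloWorld", "displayHelloWorld"));
        // prints Java_HelloWorld_displayHelloWorld
    }
}
```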
Accessing Java Strings
/* Illegal */
JNIEXPORT jstring JNICALL
Java_Prompt_getLine(JNIEnv *env, jobject obj, jstring prompt)
{
    printf("%s", prompt);
    ...
}
This jstring type is different from the regular C string type
/* correct way */
JNIEXPORT jstring JNICALL
Java_Prompt_getLine(JNIEnv *env, jobject obj, jstring prompt)
{
    char buf[128];
    const char *str = (*env)->GetStringUTFChars(env, prompt, 0);
    printf("%s", str);
    /* release the memory allocated for the string operation */
    (*env)->ReleaseStringUTFChars(env, prompt, str);
    ...
}
For the functions associated with JNI objects, see the web page: http://java.sun.com/j2se/1.3/docs/guide/jni/spec/jniTOC.doc.html
Accessing Java Array
/* Illegal */
JNIEXPORT jint JNICALL
Java_IntArray_sumArray(JNIEnv *env, jobject obj, jintArray arr)
{
    int i, sum = 0;
    for (i = 0; i < 10; i++) {
        sum += arr[i];
    }
    ...
}
/* correct way */
JNIEXPORT jint JNICALL
Java_IntArray_sumArray(JNIEnv *env, jobject obj, jintArray arr)
{
    int i, sum = 0;
    /* 1. obtain the length of the array */
    jsize len = (*env)->GetArrayLength(env, arr);
    /* 2. obtain a pointer to the elements of the array */
    jint *body = (*env)->GetIntArrayElements(env, arr, 0);
    /* 3. operate on each individual primitive or jobject */
    for (i = 0; i < len; i++) {
        sum += body[i];
    }
    /* 4. release the memory allocated for the array */
    (*env)->ReleaseIntArrayElements(env, arr, body, 0);
    return sum;
}
Accessing Java Member Variables
/* static int field */
fid = (*env)->GetStaticFieldID(env, cls, "si", "I");            /* 1. get the field ID */
si = (*env)->GetStaticIntField(env, cls, fid);                  /* 2. find the field variable */
(*env)->SetStaticIntField(env, cls, fid, 200);                  /* 3. perform operation on the primitive */

/* instance String field */
fid = (*env)->GetFieldID(env, cls, "s", "Ljava/lang/String;");  /* 1. get the field ID */
jstr = (*env)->GetObjectField(env, obj, fid);                   /* 2. find the field variable */
jstr = (*env)->NewStringUTF(env, "123");                        /* 3. perform operation on the object */
(*env)->SetObjectField(env, obj, fid, jstr);

class FieldAccess {
    static int si;   /* signature is "I" */
    String s;        /* signature is "Ljava/lang/String;" */
}
/* run javap -s -p FieldAccess to get the signature */
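The signature strings follow a mechanical encoding, which the sketch below reproduces for common cases so you can see where "I" and "Ljava/lang/String;" come from. It is a hand-written illustration (the helper names are made up here); in practice javap -s remains the reliable way to get a signature.

```java
final class JniDescriptors {
    // Type signature of a single Java type, e.g. int -> "I".
    static String descriptor(Class<?> type) {
        if (type.isArray()) return "[" + descriptor(type.getComponentType());
        if (type == void.class) return "V";
        if (type == boolean.class) return "Z";
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == float.class) return "F";
        if (type == double.class) return "D";
        // Reference types: 'L' + class name with '/' separators + ';'
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // Method signature: parameter types in parentheses, then the return type.
    static String methodDescriptor(Class<?> returnType, Class<?>... params) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> p : params) sb.append(descriptor(p));
        return sb.append(')').append(descriptor(returnType)).toString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(int.class));                     // prints I
        System.out.println(descriptor(String.class));                  // prints Ljava/lang/String;
        System.out.println(methodDescriptor(void.class, int.class));   // prints (I)V
    }
}
```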
Calling a Java Method
1. Find the class of the object
Call GetObjectClass
2. Find the method ID of the object
Call GetMethodID, which performs a lookup for the Java method in a given class
3. Call the method
JNI provides an API for each type of method, e.g., CallVoidMethod(), etc.
You pass the object, method ID, and the actual arguments to the method (e.g., CallVoidMethod)
Example of call:
jclass cls = (*env)->GetObjectClass(env, obj);
jmethodID mid = (*env)->GetMethodID(env, cls, "hello", "(I)V");
(*env)->CallVoidMethod(env, obj, mid, parm1);
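If the three steps look unfamiliar, it may help to compare them with Java reflection, which follows the same find-class, find-method, invoke pattern. The sketch below is an analogy only and does not go through JNI; Greeter and callHello are hypothetical names introduced for this example.

```java
import java.lang.reflect.Method;

final class ReflectiveCall {
    // Hypothetical target: hello(int) returning void, i.e. JNI signature (I)V.
    public static class Greeter {
        int lastArg;

        public void hello(int n) {
            lastArg = n;
        }
    }

    static void callHello(Object obj, int arg) {
        try {
            Class<?> cls = obj.getClass();                   // 1. like GetObjectClass
            Method mid = cls.getMethod("hello", int.class);  // 2. like GetMethodID
            mid.invoke(obj, arg);                            // 3. like CallVoidMethod
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("lookup or call failed", e);
        }
    }

    public static void main(String[] args) {
        Greeter g = new Greeter();
        callHello(g, 42);
        System.out.println(g.lastArg); // prints 42
    }
}
```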
Synchronization
Synchronize is available as a C call
Wait and Notify calls through JNIEnv do work and are safe to use
Could use native threading operations for native to native threading, but this may cost portability
In Java:
synchronized (obj) {
    ... /* synchronized block */ ...
}
In C:
(*env)->MonitorEnter(env, obj);
/* synchronized block */
(*env)->MonitorExit(env, obj);
Summary
Use JNI judiciously – with a good reason
Expect higher development costs
Tutorials and more at the end of this presentation
Map Reduce
Android World
Android smartphones are getting smarter
They handle more and more data
Applications sleep most of the time and don't run
Impossible to have fault-tolerant file systems
Save battery and CPU power
Reuse of containers
Sharing resources – ashmem, strings pool
Problem
Single-thread approaches for data handling (sort/search/analyze) are naïve:
Getting slower
Awkward to develop and maintain
No multi-core/threading utilization
Problem Domain
Functionality needed for semi-structured, text-based data:
Word Counting
Inverted Index: a mapping between words and the documents where the words appear
Distributed Grep
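As a small illustration of the inverted index, here is a single-threaded sketch that maps each word to the ids of the documents containing it. The class name, the use of list positions as document ids, and the whitespace tokenization are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class InvertedIndex {
    // Builds word -> sorted set of document ids (the position in the list).
    static Map<String, Set<Integer>> build(List<String> docs) {
        Map<String, Set<Integer>> index = new TreeMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String word : docs.get(id).toLowerCase().split("\\s+")) {
                if (word.isEmpty()) continue;   // skip leading-whitespace artifact
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(id);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "the quick brown fox",
                "the fox ate the mouse",
                "how now brown cow");
        Map<String, Set<Integer>> index = build(docs);
        System.out.println(index.get("brown")); // prints [0, 2]
        System.out.println(index.get("the"));   // prints [0, 1]
    }
}
```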
Word Count Execution
Input -> Map -> Shuffle & Sort -> Reduce -> Output
Input (one split per Map):
the quick brown fox
the fox ate the mouse
how now brown cow
Map output:
the, 1; brown, 1; fox, 1; quick, 1
the, 1; fox, 1; the, 1; ate, 1; mouse, 1
how, 1; now, 1; brown, 1; cow, 1
Reduce output (after Shuffle & Sort):
brown, 2; fox, 2; how, 1; now, 1; the, 3
ate, 1; cow, 1; mouse, 1; quick, 1
www.recessframework.org/page/map-reduce-anonymous-functions-lambdas-php
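The word count flow can be sketched in a few lines of single-threaded Java, with one method per phase. This is a minimal illustration of the phases, not the Profiterole implementation; all names are chosen for this example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class WordCount {
    // Map: one input line -> a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) out.add(new AbstractMap.SimpleEntry<>(word, 1));
        }
        return out;
    }

    // Shuffle & sort: group the values by key, with keys in sorted order.
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            groups.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return groups;
    }

    // Reduce: sum all the counts collected for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) sum += c;
        return sum;
    }

    static SortedMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) pairs.addAll(map(line));
        SortedMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : shuffle(pairs).entrySet()) {
            result.put(g.getKey(), reduce(g.getKey(), g.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(
                "the quick brown fox", "the fox ate the mouse", "how now brown cow")));
    }
}
```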
Map Reduce
Is a framework for processing highly distributable problems across huge datasets using a large number of computers/threads/cpus. The framework contains both Map and Reduce functions.
The motivation is the large size of the input data combined with the many machines available (which therefore need to be used effectively)
Map
Original list -> Function -> New list
Reduce
Original list -> Function -> Result (1000)
Map-Reduce
1. "Map" step: The master node takes the input, partitions it up into smaller sub-problems, and distributes them to worker nodes. A worker node may do this again in turn, leading to a multi-level tree structure. The worker node processes the smaller problem, and passes the answer back to its master node.
2. "Reduce" step: The master node then collects the answers to all the sub-problems and combines them in some way to form the output – the answer to the problem it was originally trying to solve.
Map
Is a higher-order function that applies a given function to each element of a list, returning a list of results. It is often called apply-to-all when considered in functional form.
(defn bubble [x] (* x x))
(map #(bubble %1) [1 3 5 7])
;=> (1 9 25 49)
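For readers more at home in Java, the same apply-to-all idea can be sketched with a generic helper. This assumes Java 8 or later; MapDemo and its map method are names made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class MapDemo {
    // Applies f to each element of xs and returns the list of results.
    // The input list is not modified.
    static <A, B> List<B> map(Function<A, B> f, List<A> xs) {
        return xs.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> squares = map(x -> x * x, Arrays.asList(1, 3, 5, 7));
        System.out.println(squares); // prints [1, 9, 25, 49]
    }
}
```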
Reduce
Reduce and accumulate are a family of higher-order functions that analyze a data structure and recombine through use of a given combining operation the results of recursively processing its constituent parts, building up a return value.
(reduce * [1 2 3 4 5 6 6])
;=> 4320
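A rough Java counterpart is a left fold that combines the elements one by one with a binary operation. The sketch below assumes a non-empty list and uses names invented for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BinaryOperator;

final class ReduceDemo {
    // Left fold with no initial value: ((x0 op x1) op x2) op ...
    static <T> T reduce(BinaryOperator<T> op, List<T> xs) {
        Iterator<T> it = xs.iterator();
        if (!it.hasNext()) {
            throw new NoSuchElementException("cannot reduce an empty list");
        }
        T acc = it.next();
        while (it.hasNext()) {
            acc = op.apply(acc, it.next());
        }
        return acc;
    }

    public static void main(String[] args) {
        Integer product = reduce((a, b) -> a * b, Arrays.asList(1, 2, 3, 4, 5, 6, 6));
        System.out.println(product); // prints 4320
    }
}
```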
Map-Reduce
Map:
Accepts input key/value pair
Emits intermediate key/value pair
Reduce:
Accepts intermediate key/value pair
Emits output key/value pair
Very big data -> MAP -> REDUCE -> Result
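The key/value contract can be written down as two small interfaces plus a sequential driver. The sketch below uses it for a grep-like job: the input pairs are (file name, text), the intermediate pairs are (file name, matching line), and the output pairs are (file name, list of matching lines). All interface and method names are assumptions for illustration and are not taken from any particular library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class MiniMapReduce {
    interface Emitter<K, V> { void emit(K key, V value); }

    // Map: accepts an input key/value pair, emits intermediate pairs.
    interface Mapper<K1, V1, K2, V2> { void map(K1 key, V1 value, Emitter<K2, V2> out); }

    // Reduce: accepts an intermediate key with its values, returns the output value.
    interface Reducer<K2, V2, V3> { V3 reduce(K2 key, List<V2> values); }

    static <K1, V1, K2 extends Comparable<K2>, V2, V3> SortedMap<K2, V3> run(
            Map<K1, V1> input, Mapper<K1, V1, K2, V2> mapper, Reducer<K2, V2, V3> reducer) {
        // Group intermediate values by key (the shuffle step).
        SortedMap<K2, List<V2>> groups = new TreeMap<>();
        for (Map.Entry<K1, V1> e : input.entrySet()) {
            mapper.map(e.getKey(), e.getValue(),
                    (k, v) -> groups.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        SortedMap<K2, V3> output = new TreeMap<>();
        for (Map.Entry<K2, List<V2>> g : groups.entrySet()) {
            output.put(g.getKey(), reducer.reduce(g.getKey(), g.getValue()));
        }
        return output;
    }

    // Grep-like job: for each file, collect the lines containing the needle.
    static SortedMap<String, List<String>> grep(Map<String, String> files, String needle) {
        return MiniMapReduce.<String, String, String, String, List<String>>run(
                files,
                (name, text, out) -> {
                    for (String line : text.split("\n")) {
                        if (line.contains(needle)) out.emit(name, line);
                    }
                },
                (name, lines) -> lines);
    }
}
```

Files with no matching line emit nothing in the map step, so they do not appear in the output at all.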
Profiterole
Map Reduce Library for Android
Scalability in terms of number of threads
Small (fast)
Callbacks for custom data types
https://code.google.com/p/profiterole/
Example of Profiterole output
SDK Logical View (by packages)
User Level
• API
• Samples
Core
• Map Reduce Implementation
• Tests
Result
• Waffle Database
Key Code (async thread pool)
MapCallback<TMapInput> mapper = new MapCallback<TMapInput>();
List<HashMap<String, Integer>> maps = new LinkedList<HashMap<String, Integer>>();
int numThreads = 25;
ExecutorService pool = Executors.newFixedThreadPool(numThreads);
CompletionService<OutputUnit> futurePool =
        new ExecutorCompletionService<MapCallback.OutputUnit>(pool);
Set<Future<OutputUnit>> futureSet = new HashSet<Future<OutputUnit>>();
// linear addition of jobs, parallel execution
for (TMapInput m : input) {
    futureSet.add(futurePool.submit(mapper.makeWorker(m)));
}
// stop accepting new jobs; the submitted jobs keep running
pool.shutdown();
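The snippet stops after the jobs are submitted. A self-contained sketch of the whole pattern, including collecting the partial results as they complete and merging them, might look like the following. It uses only java.util.concurrent and does not use the Profiterole classes; the word-count job and all names are assumptions chosen for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelWordCount {
    // Map phase for one chunk: count the words of that chunk only.
    static Map<String, Integer> mapChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.trim().split("\\s+")) {
            if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    static Map<String, Integer> count(List<String> chunks, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        CompletionService<Map<String, Integer>> completion =
                new ExecutorCompletionService<>(pool);
        try {
            // Linear addition of jobs, parallel execution.
            for (String chunk : chunks) {
                completion.submit(() -> mapChunk(chunk));
            }
            pool.shutdown(); // no new jobs; submitted jobs keep running

            // Reduce phase: merge partial results in completion order.
            Map<String, Integer> total = new TreeMap<>();
            for (int i = 0; i < chunks.size(); i++) {
                Map<String, Integer> partial = completion.take().get();
                for (Map.Entry<String, Integer> e : partial.entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a map job failed", e);
        } finally {
            pool.shutdownNow(); // no effect if all jobs already finished
        }
    }
}
```

Merging happens on the calling thread, so the shared result map needs no locking; only the map phase runs in parallel.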
Summary Map Reduce
Many problems can be phrased this way
Elegant and powerful
MapReduce is not suitable for all problems, but when it works, it may save you quite
Summary
Programming paradigms are important tools in our professional tool box
Managed to show you the importance of courses you are taking
Touched on Android development
Links & Credits
http://mobile.tutsplus.com/tutorials/android/ndk-tutorial/ (JNI tutorial)
http://code.google.com/p/profiterole/
http://code.google.com/p/profiterole/downloads/list
THANK YOU!