2014 International Software Testing Conference in Seoul

Seoul Software Testing Conference Testing Big Data: Unit Test in Hadoop (Part II) Jongwook Woo (PhD) High-Performance Internet Computing Center (HiPIC) Educational Partner with Cloudera and Grants Awardee of Amazon AWS Computer Information Systems Department California State University, Los Angeles

Upload: jongwook-woo

Post on 20-Aug-2015


Page 1: 2014 International Software Testing Conference in Seoul

Seoul Software Testing Conference

Testing Big Data: Unit Test in Hadoop (Part II)

Jongwook Woo (PhD)

High-Performance Internet Computing Center (HiPIC)
Educational Partner with Cloudera and Grants Awardee of Amazon AWS

Computer Information Systems Department
California State University, Los Angeles

Page 2: 2014 International Software Testing Conference in Seoul

Contents

- Test in General
- Use Cases: Big Data in Hadoop and Ecosystems
- Unit Test in Hadoop

Page 3: 2014 International Software Testing Conference in Seoul

Test in General

- Quality Assurance
  - TDD (Test Driven Development)
    - Unit Test: tests functional units of the S/W
  - BDD (Behavior Driven Development)
    - Based on TDD
    - Tests the behavior of the S/W
  - Integration Test: tests integrated components
    - A group of unit tests
  - CI (Continuous Integration) Server
    - Hudson, Jenkins, etc.

Page 4: 2014 International Software Testing Conference in Seoul

CI Server

- Continuous Integration Server
  - Based on TDD (Test Driven Development)
    - All developers commit their updates every day
    - The CI server compiles the code and runs the unit tests
    - If a test fails, all developers receive the failure email
      - Everyone knows who committed the bad code
  - Hudson, Jenkins, etc.
    - Support SCM version control tools
      - CVS, Subversion, Git

Page 5: 2014 International Software Testing Conference in Seoul

Test in Hadoop

- Much harder
- Plain JUnit alone cannot test code that runs in Hadoop
  - Cluster
  - Server
  - Parallel computing

Page 6: 2014 International Software Testing Conference in Seoul

Use Cases: Shopzilla

- Hadoop's Elephant In The Room
  - Hadoop testing
  - Quality Assurance
    - Unit Test: functional units of the S/W
    - Integration Test: integrated components
    - BDD Test: behavior of the S/W
  - Augmented Development
    - Use a dev cluster?
      - Takes too much time each day
    - Hadoop-In-A-Box

Page 7: 2014 International Software Testing Conference in Seoul

Use Cases: Shopzilla

- Hadoop-In-A-Box
  - Fully compatible mock environment
    - Without a cluster
    - Mock cluster state
  - Test locally
    - Single-node pseudo cluster
    - MiniMRCluster
    - => can test HDFS, Pig

Page 8: 2014 International Software Testing Conference in Seoul

Use Cases: Yahoo

- Developer
  - Wants to run Hadoop code on the local machine
  - Does not want to run Hadoop code on the Hadoop cluster
- Yahoo HIT
  - Hadoop Integration Test
  - Runs Hadoop tests in the Hadoop ecosystem
    - Deploy HIT on a single Hadoop node or on a cluster
    - Run tests in Hadoop, Pig, Hive, Oozie,…

Page 9: 2014 International Software Testing Conference in Seoul

Unit Test in Hadoop

- MRUnit testing framework
  - Is based on JUnit
  - Was donated to Apache by Cloudera
  - Can test MapReduce programs
    - Written for Hadoop versions 0.20, 0.23.x, 1.0.x, and 2.x
  - Can test a Mapper, a Reducer, or a Mapper and Reducer together

Page 10: 2014 International Software Testing Conference in Seoul

Unit Test in Hadoop

- WordCount Example
  - Reads text files and counts how often words occur
    - The input and the output are text files
  - Needs three classes
    - WordCount.java: Driver class with main function
    - WordMapper.java: Mapper class with map method
    - SumReducer.java: Reducer class with reduce method

Page 11: 2014 International Software Testing Conference in Seoul

WordCount Example

- WordMapper.java
  - Mapper class with map function
  - For the given sample input, assuming two map nodes
    - The sample input is distributed to the maps
    - The first map emits:
      <Hello, 1> <World, 1> <Bye, 1> <World, 1>
    - The second map emits:
      <Hello, 1> <Hadoop, 1> <Goodbye, 1> <Hadoop, 1>
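The map step can be tried out without a Hadoop installation. The following is a minimal plain-Java sketch, not the Hadoop Mapper itself: the class name MapStepSketch is made up for illustration, and the two input lines are an assumption inferred from the pairs the slide lists.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for the map step; no Hadoop classes are used.
// MapStepSketch is a hypothetical name, not part of the WordCount job.
class MapStepSketch {

    // Emits one (word, 1) pair per whitespace-separated token, in input order.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        StringTokenizer words = new StringTokenizer(line);
        while (words.hasMoreTokens()) {
            emitted.add(new SimpleEntry<>(words.nextToken(), 1));
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Assumed sample lines, one per map node (inferred from the slide's output).
        System.out.println(map("Hello World Bye World"));
        System.out.println(map("Hello Hadoop Goodbye Hadoop"));
    }
}
```

Each call stands for one map node: it sees only its own line and emits a pair for every token, including repeated words.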

Page 12: 2014 International Software Testing Conference in Seoul

WordCount Example

- SumReducer.java
  - Reducer class with reduce function
  - For the input from the two Mappers
    - The reduce method just sums up the values, which are the occurrence counts for each key
    - Thus the output of the job is:
      <Bye, 1> <Goodbye, 1> <Hadoop, 2> <Hello, 2> <World, 2>
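The summing rule of the reduce step is small enough to check by hand. Below is a hedged plain-Java sketch of that rule only; ReduceStepSketch is a hypothetical name, and plain Integer lists stand in for Hadoop's IntWritable values.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for the reduce step; no Hadoop classes are used.
// ReduceStepSketch is a hypothetical name chosen for this illustration.
class ReduceStepSketch {

    // Sums the occurrence counts collected for one key.
    static int reduce(String key, List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        // One call per key, with the values gathered from both mappers.
        System.out.println("Bye " + reduce("Bye", Arrays.asList(1)));
        System.out.println("Hadoop " + reduce("Hadoop", Arrays.asList(1, 1)));
        System.out.println("World " + reduce("World", Arrays.asList(1, 1)));
    }
}
```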

Page 13: 2014 International Software Testing Conference in Seoul

WordCount.java (Driver)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {
        public static void main(String[] args) throws Exception {
            // Require exactly two arguments: the input path and the output path
            if (args.length != 2) {
                System.out.println("usage: [input] [output]");
                System.exit(-1);
            }

            Job job = Job.getInstance(new Configuration());
            // Output (key, value) types
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // Mapper and Reducer classes
            job.setMapperClass(WordMapper.class);
            job.setReducerClass(SumReducer.class);
            // Input and output format classes
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            // Input and output paths
            FileInputFormat.setInputPaths(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // Driver class, used to locate the job jar
            job.setJarByClass(WordCount.class);
            // submit() returns without waiting for the job to finish
            job.submit();
        }
    }

Page 14: 2014 International Software Testing Conference in Seoul

WordCount.java

Check that the input and output paths are given:

    if (args.length != 2) {
        System.out.println("usage: [input] [output]");
        System.exit(-1);
    }

Page 15: 2014 International Software Testing Conference in Seoul

WordCount.java

Set output (key, value) types:

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

Page 16: 2014 International Software Testing Conference in Seoul

WordCount.java

Set Mapper/Reducer classes:

    job.setMapperClass(WordMapper.class);
    job.setReducerClass(SumReducer.class);

Page 17: 2014 International Software Testing Conference in Seoul

WordCount.java

Set Input/Output format classes:

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

Page 18: 2014 International Software Testing Conference in Seoul

WordCount.java

Set Input/Output paths:

    FileInputFormat.setInputPaths(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

Page 19: 2014 International Software Testing Conference in Seoul

WordCount.java

Set Driver class:

    job.setJarByClass(WordCount.class);

Page 20: 2014 International Software Testing Conference in Seoul

WordCount.java

Submit the job to the master node (the call returns without waiting for the job to finish):

    job.submit();

Page 21: 2014 International Software Testing Conference in Seoul

WordMapper.java (Mapper class)

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordMapper extends Mapper<Object, Text, Text, IntWritable> {
        // Reused output key and the constant count of 1
        private Text word = new Text();
        private final static IntWritable one = new IntWritable(1);

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Break line into words for processing
            StringTokenizer wordList = new StringTokenizer(value.toString());
            while (wordList.hasMoreTokens()) {
                // Emit (word, 1) for every token
                word.set(wordList.nextToken());
                context.write(word, one);
            }
        }
    }

Page 22: 2014 International Software Testing Conference in Seoul

WordMapper.java

Extends the Mapper class with input/output keys and values:

    public class WordMapper extends Mapper<Object, Text, Text, IntWritable> {

Page 23: 2014 International Software Testing Conference in Seoul

WordMapper.java

Output (key, value) types:

    private Text word = new Text();
    private final static IntWritable one = new IntWritable(1);

Page 24: 2014 International Software Testing Conference in Seoul

WordMapper.java

Input (key, value) types, and output as Context type:

    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException

Page 25: 2014 International Software Testing Conference in Seoul

WordMapper.java

Read words from each line of the input file:

    // Break line into words for processing
    StringTokenizer wordList = new StringTokenizer(value.toString());
    while (wordList.hasMoreTokens()) {

Page 26: 2014 International Software Testing Conference in Seoul

WordMapper.java

Count each word by emitting (word, 1) for every token:

    word.set(wordList.nextToken());
    context.write(word, one);

Page 27: 2014 International Software Testing Conference in Seoul

Shuffler/Sorter

- Maps emit (key, value) pairs
- Shuffler/Sorter of the Hadoop framework
  - Sorts the (key, value) pairs by key
  - Then appends the values to make (key, list of values) pairs
  - For example,
    - The first and second maps emit:
      <Hello, 1> <World, 1> <Bye, 1> <World, 1> <Hello, 1> <Hadoop, 1> <Goodbye, 1> <Hadoop, 1>
    - The shuffler produces the following, which becomes the input of the reducer:
      <Bye, 1>, <Goodbye, 1>, <Hadoop, <1,1>>, <Hello, <1, 1>>, <World, <1,1>>
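The grouping the shuffler performs can be imitated in memory with a sorted map. This is only a sketch of the sort-then-group idea under that assumption; it says nothing about how Hadoop implements the shuffle across nodes, and ShuffleSortSketch and pair are hypothetical names.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory imitation of sort-by-key plus group-values; not Hadoop's shuffle.
class ShuffleSortSketch {

    // Small helper to build one emitted (key, value) pair.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Groups the values of equal keys; the TreeMap keeps keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> emitted) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : emitted) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The eight pairs emitted by the two maps in the example above.
        List<Map.Entry<String, Integer>> emitted = Arrays.asList(
            pair("Hello", 1), pair("World", 1), pair("Bye", 1), pair("World", 1),
            pair("Hello", 1), pair("Hadoop", 1), pair("Goodbye", 1), pair("Hadoop", 1));
        System.out.println(shuffle(emitted));
    }
}
```

Each entry of the returned map corresponds to one reduce call: the key, plus the list of values collected for it.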

Page 28: 2014 International Software Testing Conference in Seoul

SumReducer.java (Reducer class)

    import java.io.IOException;
    import java.util.Iterator;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        // Reused output value holding the total for the current key
        private IntWritable totalWordCount = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the counts collected for this word
            int wordCount = 0;
            Iterator<IntWritable> it = values.iterator();
            while (it.hasNext()) {
                wordCount += it.next().get();
            }
            // Emit (word, total count)
            totalWordCount.set(wordCount);
            context.write(key, totalWordCount);
        }
    }

Page 29: 2014 International Software Testing Conference in Seoul

SumReducer.java

Extends the Reducer class with input/output keys and values:

    public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

Page 30: 2014 International Software Testing Conference in Seoul

SumReducer.java

Set output value type:

    private IntWritable totalWordCount = new IntWritable();

Page 31: 2014 International Software Testing Conference in Seoul

SumReducer.java

Set input (key, list of values) type and output as Context class:

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException

Page 32: 2014 International Software Testing Conference in Seoul

SumReducer.java

For each word, sum the values:

    int wordCount = 0;
    Iterator<IntWritable> it = values.iterator();
    while (it.hasNext()) {
        wordCount += it.next().get();
    }

Page 33: 2014 International Software Testing Conference in Seoul

SumReducer.java

For each word, the total count becomes the value:

    totalWordCount.set(wordCount);
    context.write(key, totalWordCount);

Page 34: 2014 International Software Testing Conference in Seoul

SumReducer

- Reducer
  - Input: the shuffler's output becomes the input of the reducer
    <Bye, 1>, <Goodbye, 1>, <Hadoop, <1,1>>, <Hello, <1, 1>>, <World, <1,1>>
  - Output
    <Bye, 1>, <Goodbye, 1>, <Hadoop, 2>, <Hello, 2>, <World, 2>

Page 35: 2014 International Software Testing Conference in Seoul

MRUnit Test

- How to unit test in Hadoop
  - Extend a JUnit test
    - With the org.apache.hadoop.mrunit.* API
  - Need to test Driver, Mapper, Reducer
    - MapReduceDriver, MapDriver, ReduceDriver
    - Add input with expected output
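The pattern behind the three drivers, which is to feed an input, declare the expected output, and let the driver compare, can be shown without Hadoop on the classpath. The sketch below is a hypothetical miniature and not the MRUnit API: TinyMapDriver, wordMapper, and the "word=1" string encoding are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical miniature of the "input plus expected output" test pattern.
// This is NOT the MRUnit API; every name here is made up for illustration.
class TinyMapDriver {
    private final Function<String, List<String>> mapper;
    private final List<String> expected = new ArrayList<>();
    private String input = "";

    TinyMapDriver(Function<String, List<String>> mapper) {
        this.mapper = mapper;
    }

    // Records the line that will be passed to the mapper.
    TinyMapDriver withInput(String line) {
        this.input = line;
        return this;
    }

    // Records one expected (key, value) pair, in emission order.
    TinyMapDriver withOutput(String key, int value) {
        expected.add(key + "=" + value);
        return this;
    }

    // Runs the mapper on the input and fails if the emitted pairs differ.
    void runTest() {
        List<String> actual = mapper.apply(input);
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    // Word-count mapper under test: emits "word=1" for every token.
    static List<String> wordMapper(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word + "=1");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        new TinyMapDriver(TinyMapDriver::wordMapper)
            .withInput("cat cat dog")
            .withOutput("cat", 1).withOutput("cat", 1).withOutput("dog", 1)
            .runTest();
        System.out.println("mapper test passed");
    }
}
```

The fluent with-methods mirror the shape of the test methods later in the deck; the real drivers additionally handle Hadoop's Writable types and the Context object.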

Page 36: 2014 International Software Testing Conference in Seoul

MRUnit Test

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    // WordMapper and SumReducer use the org.apache.hadoop.mapreduce API,
    // so the drivers come from the mrunit.mapreduce package
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
    import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class TestWordCount {
        // Input key type is Object to match WordMapper's Mapper<Object, Text, Text, IntWritable>
        MapReduceDriver<Object, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
        MapDriver<Object, Text, Text, IntWritable> mapDriver;
        ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

        @Before
        public void setUp() {
            WordMapper mapper = new WordMapper();
            SumReducer reducer = new SumReducer();

            mapDriver = new MapDriver<Object, Text, Text, IntWritable>();
            mapDriver.setMapper(mapper);

            reduceDriver = new ReduceDriver<Text, IntWritable, Text, IntWritable>();
            reduceDriver.setReducer(reducer);

            mapReduceDriver = new MapReduceDriver<Object, Text, Text, IntWritable, Text, IntWritable>();
            mapReduceDriver.setMapper(mapper);
            mapReduceDriver.setReducer(reducer);
        }

Page 37: 2014 International Software Testing Conference in Seoul

MRUnit Test

        @Test
        public void testMapper() {
            mapDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
            mapDriver.withOutput(new Text("cat"), new IntWritable(1));
            mapDriver.withOutput(new Text("cat"), new IntWritable(1));
            mapDriver.withOutput(new Text("dog"), new IntWritable(1));
            mapDriver.runTest();
        }

        @Test
        public void testReducer() {
            List<IntWritable> values = new ArrayList<IntWritable>();
            values.add(new IntWritable(1));
            values.add(new IntWritable(1));
            reduceDriver.withInput(new Text("cat"), values);
            reduceDriver.withOutput(new Text("cat"), new IntWritable(2));
            reduceDriver.runTest();
        }

        @Test
        public void testMapReduce() {
            mapReduceDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
            mapReduceDriver.addOutput(new Text("cat"), new IntWritable(2));
            mapReduceDriver.addOutput(new Text("dog"), new IntWritable(1));
            mapReduceDriver.runTest();
        }
    }

Page 38: 2014 International Software Testing Conference in Seoul

MRUnit Test

Using the MRUnit API, declare the MapReduce, Mapper, and Reducer drivers with their input/output (key, value) types:

    MapReduceDriver<Object, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;
    MapDriver<Object, Text, Text, IntWritable> mapDriver;
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;

Page 39: 2014 International Software Testing Conference in Seoul

MRUnit Test

JUnit runs setUp() before executing each test method:

    @Before
    public void setUp() {
        // instantiate the mapper, the reducer, and the three drivers
    }

Page 40: 2014 International Software Testing Conference in Seoul

MRUnit Test

Instantiate the WordCount Mapper and Reducer:

    WordMapper mapper = new WordMapper();
    SumReducer reducer = new SumReducer();

Page 41: 2014 International Software Testing Conference in Seoul

MRUnit Test

Instantiate and set the Mapper driver with input/output (key, value) types:

    mapDriver = new MapDriver<Object, Text, Text, IntWritable>();
    mapDriver.setMapper(mapper);

Page 42: 2014 International Software Testing Conference in Seoul

MRUnit Test

Instantiate and set the Reducer driver with input/output (key, value) types:

    reduceDriver = new ReduceDriver<Text, IntWritable, Text, IntWritable>();
    reduceDriver.setReducer(reducer);

Page 43: 2014 International Software Testing Conference in Seoul

MRUnit Test

Instantiate and set the MapReduce driver with input/output (key, value) types:

    mapReduceDriver = new MapReduceDriver<Object, Text, Text, IntWritable, Text, IntWritable>();
    mapReduceDriver.setMapper(mapper);
    mapReduceDriver.setReducer(reducer);

Page 44: 2014 International Software Testing Conference in Seoul

MRUnit Test

Mapper test: define sample input with expected output:

    @Test
    public void testMapper() {
        mapDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
        mapDriver.withOutput(new Text("cat"), new IntWritable(1));
        mapDriver.withOutput(new Text("cat"), new IntWritable(1));
        mapDriver.withOutput(new Text("dog"), new IntWritable(1));
        mapDriver.runTest();
    }

Page 45: 2014 International Software Testing Conference in Seoul

MRUnit Test

Reducer test: define sample input with expected output:

    @Test
    public void testReducer() {
        List<IntWritable> values = new ArrayList<IntWritable>();
        values.add(new IntWritable(1));
        values.add(new IntWritable(1));
        reduceDriver.withInput(new Text("cat"), values);
        reduceDriver.withOutput(new Text("cat"), new IntWritable(2));
        reduceDriver.runTest();
    }

Page 46: 2014 International Software Testing Conference in Seoul

MRUnit Test

MapReduce test: define sample input with expected output:

    @Test
    public void testMapReduce() {
        mapReduceDriver.withInput(new LongWritable(1), new Text("cat cat dog"));
        mapReduceDriver.addOutput(new Text("cat"), new IntWritable(2));
        mapReduceDriver.addOutput(new Text("dog"), new IntWritable(1));
        mapReduceDriver.runTest();
    }

Page 47: 2014 International Software Testing Conference in Seoul

MRUnit Test in Practice

- Need to implement unit tests
  - How many?
    - For all Map, Reduce, and Driver classes
  - Problems?
    - It mostly works
    - But it does not support complicated Map and Reduce APIs
  - How many problems can you detect?
    - Depends on how well you implement the MRUnit code
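How much a test suite detects depends largely on which inputs it tries. As a hedged illustration, the plain-Java sketch below applies the same StringTokenizer rule as WordMapper to a few edge cases that a single "cat cat dog" test would miss. EdgeCaseSketch and countWords are hypothetical names, and the chosen cases are assumptions about what is worth testing, not a list from the talk.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Edge-case probe for the word-count rule; plain Java, no Hadoop or MRUnit.
// EdgeCaseSketch and countWords are hypothetical names for illustration.
class EdgeCaseSketch {

    // Tokenizes on default whitespace delimiters and sums the count per token.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer words = new StringTokenizer(text);
        while (words.hasMoreTokens()) {
            counts.merge(words.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Empty line: nothing should be emitted.
        System.out.println(countWords(""));
        // Mixed whitespace: repeated spaces and tabs separate tokens.
        System.out.println(countWords("cat  cat\tdog"));
        // Case: "Cat" and "cat" are counted as different words.
        System.out.println(countWords("Cat cat"));
        // Punctuation: "cat," keeps its comma and differs from "cat".
        System.out.println(countWords("cat, cat"));
    }
}
```

Whether the last two behaviours are bugs depends on the requirements; the point is that a test has to state the expectation before a failure can reveal the mismatch.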

Page 48: 2014 International Software Testing Conference in Seoul

Conclusion

- MRUnit for Hadoop unit tests
- Development
- Integrate with the QA site through a CI server
- Need to use it

Page 49: 2014 International Software Testing Conference in Seoul

Questions?

Page 50: 2014 International Software Testing Conference in Seoul

References

1. Hadoop WordCount example with new map reduce api (http://codesfusion.blogspot.com/2013/10/hadoop-wordcount-with-new-map-reduce-api.html)
2. Hadoop Word Count Example (http://wiki.apache.org/hadoop/WordCount)
3. Example: WordCount v1.0, Cloudera Hadoop Tutorial (http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial/ht_walk_through.html)
4. Testing Word Count (https://cwiki.apache.org/confluence/display/MRUNIT/Testing+Word+Count)
5. Apache MRUnit Tutorial (https://cwiki.apache.org/confluence/display/MRUNIT/MRUnit+Tutorial)
6. Hadoop Integration Test Suite, Shopzilla (https://github.com/shopzilla/hadoop-integration-test-suite)
7. Hadoop's Elephant in the Room, Jeremy Lucas, Shopzilla (http://tech.shopzilla.com/2013/04/hadoops-elephant-in-the-room/)
8. Facebook Test MapReduce Local (https://github.com/facebook/hadoop-20/blob/master/src/test/org/apache/hadoop/mapreduce/TestMapReduceLocal.java)
9. Yahoo HIT Hadoop Integrated Testing (http://www.slideshare.net/ydn/hi-tv3?from_search=1)
