scala 20140715
![Page 1: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/1.jpg)
Intro to Apache Spark: Fast cluster computing engine for Hadoop
Intro to Scala: Object-oriented and functional language for the Java Virtual Machine
ACM SIGKDD, 7/9/2014
Roger Huang
Lead System Architect
@BigDataWrangler
![Page 2: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/2.jpg)
Intro to Spark: Intro to Scala | 7/9/2014
About me: Roger Huang
• Visa
– Digital & Mobile Products Architecture, Strategic Projects & Infrastructure
– Search infrastructure
– Customer segmentation
– Logging framework
– Splunk on Hadoop (Hunk)
– Real-time monitoring
– Data
• PayPal
– Java infrastructure
![Page 3: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/3.jpg)
Different perspectives on an elephant Scala
![Page 4: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/4.jpg)
Outline
• Spark
– Hadoop ecosystem
• Scala
– Background
• Why Scala?
– For the computer scientist
– For the Java / OO programmer
– For the Spark developer
– For the Big Data developer
– For the Big Data scientist / mathematician
– For the system architect
![Page 5: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/5.jpg)
Spark in the Hadoop ecosystem
![Page 6: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/6.jpg)
Spark Ecosystem of Software Projects
• Spark [Ognen]
– APIs: Scala, Python [Robert], Java
• “SQL”
– Shark (Hive + Spark) [Roger]
– SparkSQL (alpha)
• Machine Learning Library (MLlib) [Omar]
– Clustering
– Classification
• Binary classification
• Linear regression
– Recommendations
• Spark Streaming [Chance]
• GraphX [Srini]
• …
![Page 7: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/7.jpg)
Resilient Distributed Dataset
• Fault-tolerant collection of elements, partitioned across the nodes of the cluster, that can be operated on in parallel
• Data sources for RDDs
– Parallelized collections
• From Scala collections
– Hadoop datasets
• From HDFS or any Hadoop-supported storage system (HBase, Amazon S3, …)
• Text files, SequenceFile, any Hadoop InputFormat
• Two types of operations
– Transformation
• Takes an existing dataset and creates a new one
– Action
• Takes a dataset, runs a computation, and returns a value to the driver program
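The split between transformations and actions can be illustrated without a cluster. The sketch below is only an analogy built on a plain Scala collection view, not the Spark API: the object name `LazyPipelineSketch` and the `evaluations` counter are hypothetical, introduced here to show that describing a new dataset does no work until something asks for a result.

```scala
// Analogy only: a local Scala view stands in for an RDD. This is NOT Spark code.
object LazyPipelineSketch {
  // Counts how many times the mapping function has actually run.
  var evaluations = 0

  val numbers = List(1, 2, 3, 4, 5)

  // "Transformation": describes a new dataset; nothing is computed yet.
  val doubled = numbers.view.map { n => evaluations += 1; n * 2 }
  val evaluationsBeforeAction = evaluations

  // "Action": forces the computation and hands a value back to the caller.
  val total = doubled.sum
  val evaluationsAfterAction = evaluations

  def main(args: Array[String]): Unit = {
    println(s"before action: $evaluationsBeforeAction evaluations")
    println(s"total = $total, after action: $evaluationsAfterAction evaluations")
  }
}
```

In this local sketch the mapping function runs only when `sum` is called, which mirrors how a transformation describes work and an action triggers it.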
![Page 8: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/8.jpg)
(Some) RDD Operations
• Transformations
– map(func)
– filter(func)
– flatMap(func)
– mapPartitions(func)
– mapPartitionsWithIndex(func)
– sample(withReplacement, fraction, seed)
– union(otherDataset)
– distinct()
– groupByKey()
– reduceByKey(func)
– sortByKey()
– join(otherDataset)
– cogroup(otherDataset)
– cartesian(otherDataset)
• Actions
– reduce(func)
– collect()
– count()
– first()
– take(n)
– takeSample(withReplacement, num, seed)
– saveAsTextFile(path)
– saveAsSequenceFile(path)
– countByKey()
– foreach(func)
– …
![Page 9: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/9.jpg)
Scala background
• Scalable, object-oriented, functional language
– Version 2.11 (4/2014)
• Runs on the Java Virtual Machine
• Martin Odersky
– javac
– Java generics
• http://scala-lang.org/, REPL
• http://www.scala-lang.org/api/current
• http://scala-ide.org/
• http://www.scala-sbt.org/, Simple build tool
• Who’s using Scala?
– Twitter, LinkedIn, …
• Powered by Scala
– Apache Spark, Apache Kafka, Akka,…
![Page 11: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/11.jpg)
Scala for the computer scientist: functional programming (FP)
![Page 12: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/12.jpg)
Scala for the computer scientist: functional programming (FP)
• Math functions, e.g., f(x) = y
– A function has a single responsibility
– A function has no side effects
– A function is referentially transparent
• A function outputs the same value for the same inputs.
• Functional programming
– Expresses computation as the evaluation and composition of mathematical functions
– Avoids side effects and mutating state data
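A small sketch may make the contrast concrete. The names `PuritySketch`, `square`, and `nextId` are hypothetical, chosen only for illustration: one function is pure and referentially transparent, the other depends on and mutates outside state.

```scala
object PuritySketch {
  // Pure: the result depends only on the argument, and nothing else changes.
  def square(x: Int): Int = x * x

  // Impure: reads and mutates state that lives outside the function.
  private var counter = 0
  def nextId(): Int = { counter += 1; counter }

  // Referentially transparent: square(4) can be replaced by 16 anywhere.
  val sameEveryTime = square(4) == square(4)

  // Not referentially transparent: each call to nextId() yields a new value.
  val differsEachTime = nextId() != nextId()

  def main(args: Array[String]): Unit = {
    println(s"square is stable: $sameEveryTime")
    println(s"nextId changes:   $differsEachTime")
  }
}
```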
![Page 13: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/13.jpg)
Why functional programming?
• Multi-core processors
• Concurrency
– Computation as a series of independent data transformations
– Parallel data transformations without side effects
• Referential transparency
![Page 14: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/14.jpg)
Scala for the computer scientist: functional programming
• Functions
– Lambda, closure
• For-comprehensions
• Type inference
• Pattern matching
• Higher order functions
– map, flatMap, foldLeft
• And more …
![Page 15: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/15.jpg)
FP: functions
• Anonymous function
– Function without a name
– Lambda function
• Example
– scala> List(100, 200, 300) map { _ * 10 / 100 }
– res0: List[Int] = List(10, 20, 30)
• Closure (Wikipedia)
– Closure = a function, together with a referencing environment: a table storing a reference to each of the non-local variables of that function.
– A closure allows a function to access those non-local variables even when invoked outside its immediate lexical scope.
![Page 16: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/16.jpg)
FP: functions
• applyPercentage is an example of a closure
– scala> var percentage = 10
– percentage: Int = 10
– scala> val applyPercentage = (amount: Int) => amount * percentage / 100
– applyPercentage: Int => Int = <function1>
– scala> percentage = 20
– percentage: Int = 20
– scala> List(100, 200, 300) map applyPercentage
– res1: List[Int] = List(20, 40, 60)
– scala>
![Page 17: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/17.jpg)
FP: functions
• Anonymous function
• Closure
![Page 18: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/18.jpg)
FP: Higher order functions
scala> :load Person.scala
Loading Person.scala...
defined class Person
scala> val jd = new Person("John", "Doe", 17)
jd: Person = Person@372a6e85
scala> val rh = new Person("Roger", "Huang", 34)
rh: Person = Person@611c4041
scala> val people = Array(jd, rh)
people: Array[Person] = Array(Person@372a6e85, Person@611c4041)
scala> val (minors, adults) = people partition (_.age < 18)
minors: Array[Person] = Array(Person@372a6e85)
adults: Array[Person] = Array(Person@611c4041)
scala>
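The session above loads Person.scala without showing it. A minimal sketch of what that file might contain, assuming a plain (non-case) class with a public `age` field, which would be consistent with the `Person@…` output and the `_.age` accessor; the field names `firstName` and `lastName` and the wrapper object `PartitionSketch` are guesses made for illustration:

```scala
// Hypothetical contents of Person.scala; the slides do not show the file.
class Person(val firstName: String, val lastName: String, val age: Int)

object PartitionSketch {
  val jd = new Person("John", "Doe", 17)
  val rh = new Person("Roger", "Huang", 34)
  val people = Array(jd, rh)

  // partition is a higher-order function: it takes the predicate as an argument
  // and returns the elements that satisfy it and those that do not.
  val (minors, adults) = people.partition(_.age < 18)

  def main(args: Array[String]): Unit = {
    println("minors: " + minors.map(_.firstName).mkString(", "))
    println("adults: " + adults.map(_.firstName).mkString(", "))
  }
}
```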
![Page 19: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/19.jpg)
FP: Higher order functions
• A higher-order function (HOF)
– Takes a function as an argument, and/or
– Returns a function
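Both directions can be sketched in a few lines. The names `HigherOrderSketch`, `twice`, and `percentOf` are hypothetical helpers written for this example, not part of any library.

```scala
object HigherOrderSketch {
  // Takes a function as an argument (and also returns one): applies f two times.
  def twice(f: Int => Int): Int => Int = x => f(f(x))

  // Returns a function: builds a percentage calculator for a fixed rate.
  def percentOf(percentage: Int): Int => Int =
    amount => amount * percentage / 100

  val addTwo = twice(_ + 1)
  val tenPercent = percentOf(10)

  // map is itself a higher-order function; here it receives tenPercent.
  val discounts = List(100, 200, 300).map(tenPercent)

  def main(args: Array[String]): Unit = {
    println(addTwo(5))
    println(discounts)
  }
}
```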
![Page 20: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/20.jpg)
FP: Higher order functions: map
• Creates a new collection from an existing collection by applying a function
• Anonymous function
scala> List(1, 2, 3) map { (x: Int) => x + 1 }
res0: List[Int] = List(2, 3, 4)
• Function literal (placeholder syntax)
scala> List(1, 2, 3) map { _ + 1 }
res1: List[Int] = List(2, 3, 4)
• Passing an existing function
scala> def addOne(num: Int) = num + 1
addOne: (num: Int)Int
scala> List(1, 2, 3) map addOne
res2: List[Int] = List(2, 3, 4)
![Page 21: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/21.jpg)
FP: Higher order functions: map
![Page 22: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/22.jpg)
FP: Higher order functions: flatMap
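This slide carries only a diagram, so here is a small sketch of the difference between map and flatMap on an ordinary Scala list. The sample sentences and the object name `FlatMapSketch` are made up for illustration.

```scala
object FlatMapSketch {
  val lines = List("spark is fast", "scala is fun")

  // map keeps one result per input element, so splitting gives a list of lists.
  val nested: List[List[String]] = lines.map(_.split(" ").toList)

  // flatMap applies the same function and then flattens the results into one list.
  val words: List[String] = lines.flatMap(_.split(" ").toList)

  def main(args: Array[String]): Unit = {
    println(nested)
    println(words)
  }
}
```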
![Page 23: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/23.jpg)
FP: for-comprehension
• Syntax
– for ( <generator> | <guard> ) <expression> [yield] <expression>
• Types
– Imperative form. Does not return a value.
scala> val aList = List(1, 2, 3)
aList: List[Int] = List(1, 2, 3)
scala> val bList = List(4, 5, 6)
bList: List[Int] = List(4, 5, 6)
scala> for { a <- aList; if (a < 2); b <- bList; if (b < 7) } println( a + b )
5
6
7
![Page 24: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/24.jpg)
FP: for-comprehension
• Syntax
– for ( <generator> | <guard> ) <expression> [yield] <expression>
• Types
– Functional form (a.k.a. sequence comprehension). Returns/yields a value.
scala> for { a <- aList; b <- bList} yield a + b
res0: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)
scala> res0.take(1)
res1: List[Int] = List(5)
scala> for { a <- aList; if (a < 2); b <- bList } yield a + b
res2: List[Int] = List(5, 6, 7)
scala>
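A for-comprehension with yield is syntactic sugar for calls to withFilter, flatMap, and map. The sketch below writes the last example from the session both ways; the object name `ForDesugarSketch` is hypothetical, and the hand-written form is an equivalent expansion rather than the compiler's literal output.

```scala
object ForDesugarSketch {
  val aList = List(1, 2, 3)
  val bList = List(4, 5, 6)

  // The for-comprehension from the session above.
  val viaFor = for { a <- aList; if a < 2; b <- bList } yield a + b

  // An equivalent chain: the guard becomes withFilter, the outer generator
  // becomes flatMap, and the innermost generator becomes map.
  val viaCalls =
    aList.withFilter(a => a < 2).flatMap(a => bList.map(b => a + b))

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaCalls)
  }
}
```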
![Page 25: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/25.jpg)
FP: for-comprehension
![Page 26: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/26.jpg)
FP: foldLeft
• scala> val numbers = 1.to(10)
• numbers: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
• scala> def add( a:Int, b:Int ): Int = { a + b }
• add: (a: Int, b: Int)Int
• scala> numbers.foldLeft(0){ add }
• res0: Int = 55
• scala> numbers.foldLeft(0){ (acc, b) => acc + b }
• res1: Int = 55
• scala>
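To see how foldLeft threads the accumulator through the collection, scanLeft can be used to expose every intermediate value. This is a small sketch on a shorter range (1 to 5) than the session uses; the object name `FoldTraceSketch` is hypothetical.

```scala
object FoldTraceSketch {
  val numbers = 1.to(5)

  // foldLeft starts from the seed 0 and combines it with each element, left to right.
  val sum = numbers.foldLeft(0) { (acc, b) => acc + b }

  // scanLeft runs the same computation but keeps every intermediate accumulator,
  // starting with the seed itself.
  val steps = numbers.scanLeft(0) { (acc, b) => acc + b }.toList

  def main(args: Array[String]): Unit = {
    println(steps.mkString(" -> "))
    println(sum)
  }
}
```

The last element of `steps` is the value that foldLeft returns.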
![Page 27: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/27.jpg)
FP: foldLeft
![Page 28: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/28.jpg)
FP: find the last item in an array
• scala> val ns = Array(20, 40, 60)
• ns: Array[Int] = Array(20, 40, 60)
• scala> ns.foldLeft(ns.head) {(acc, b) => b}
• res0: Int = 60
• scala>
![Page 29: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/29.jpg)
FP: reverse an array w/ foldLeft
• scala> val ns = Array(20, 40, 60)
• ns: Array[Int] = Array(20, 40, 60)
• scala> ns.foldLeft( Array[Int]() ) { (acc, b) => b +: acc}
• res1: Array[Int] = Array(60, 40, 20)
• scala>
![Page 30: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/30.jpg)
FP: reverse an array w/ foldLeft
![Page 32: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/32.jpg)
Scala for the Java / OO developer
• Interoperable w/ Java
• Case classes
• Mixins with traits
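A brief sketch of two of these points, mixin composition and Java interoperability. All trait and class names here (`Named`, `Greeting`, `Shouting`, `Speaker`) are hypothetical; the only Java class used is the standard `java.util.ArrayList`.

```scala
object MixinSketch {
  trait Named { def name: String }

  // Each trait adds behaviour on top of the abstract member `name`.
  trait Greeting extends Named { def greet: String = "Hello, " + name }
  trait Shouting extends Named { def shout: String = name.toUpperCase + "!" }

  // A class can mix in several traits, which Java's single class inheritance cannot express.
  class Speaker(val name: String) extends Greeting with Shouting

  val speaker = new Speaker("Scala")

  // Java interoperability: a Java collection is used directly from Scala.
  val javaList = new java.util.ArrayList[String]()
  javaList.add(speaker.greet)
  javaList.add(speaker.shout)

  def main(args: Array[String]): Unit = println(javaList)
}
```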
![Page 33: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/33.jpg)
Scala for the Java / OO developer
• Case class
– Implements equals(), hashCode(), toString()
– Can be used in pattern matching
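The two properties listed above can be shown with a short sketch. The `Person` case class and the `describe` function are hypothetical, written for this example; the threshold of 18 simply echoes the earlier partition example.

```scala
object CaseClassSketch {
  case class Person(firstName: String, lastName: String, age: Int)

  val a = Person("John", "Doe", 17)   // no `new` needed: a companion apply is generated
  val b = Person("John", "Doe", 17)

  // Generated equals/hashCode compare field values, not object identity.
  val structurallyEqual = a == b
  val sameHash = a.hashCode == b.hashCode

  // Generated toString shows the class name and the field values.
  val printed = a.toString

  // Pattern matching destructures the case class into its fields.
  def describe(p: Person): String = p match {
    case Person(first, _, age) if age < 18 => first + " is a minor"
    case Person(first, _, _)               => first + " is an adult"
  }

  def main(args: Array[String]): Unit = {
    println(printed)
    println(describe(a))
  }
}
```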
![Page 34: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/34.jpg)
Scala for the Java / OO developer
• http://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html
• map
– <R> Stream<R> map(Function<? super T,? extends R> mapper)
– Returns a stream consisting of the results of applying the given function to the elements of this stream. This is an intermediate operation.
• flatMap
– <R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)
– Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream. (If a mapped stream is null, an empty stream is used instead.) This is an intermediate operation.
![Page 36: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/36.jpg)
Scala for the Spark developer
• Resilient Distributed Dataset (RDD)
• A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. Represents an immutable, partitioned collection of elements that can be operated on in parallel. This class contains the basic operations available on all RDDs, such as map, filter, and persist.
• http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD
![Page 38: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/38.jpg)
Scala for the Big Data developer
• Spark
– Programming API in Scala
– Implemented in Scala
• Scalding
– Scala DSL on top of Cascading
– Data processing API and processing query planner used for defining, sharing, and executing data-processing workflows
– Abstractions: tuples, pipes, source/sink taps
• Algebird
• Summingbird
– Library that lets you write MapReduce programs that look like native Scala or Java collection transformations
– Execute them on a number of well-known distributed MapReduce platforms, including Storm and Scalding.
![Page 40: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/40.jpg)
Scala for the Big Data scientist / mathematician
• Monoid
– If you want to “attach” operations such as +, -, *, / or <= to data objects (e.g., Bloom filters), then you want to provide monoid forms of those data objects
– Consists of
• A set of objects
• A binary operation that satisfies the monoid axioms (associativity and an identity element)
• Monad
– If you want to create a data processing pipeline that transforms the state of a data object
– Composition
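A minimal sketch of both ideas in plain Scala. The `Monoid` trait below is a hypothetical, hand-rolled definition for illustration and is not Algebird's API; the monad part uses the standard library's `Option`, whose `flatMap` chains steps that may each fail. The helper names `combineAll`, `parseAge`, and `checkAdult` are likewise made up.

```scala
object MonoidSketch {
  // Hypothetical minimal monoid: an identity element plus an associative combine.
  trait Monoid[A] {
    def zero: A
    def combine(x: A, y: A): A
  }

  val intAddition: Monoid[Int] = new Monoid[Int] {
    def zero: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  val stringSetUnion: Monoid[Set[String]] = new Monoid[Set[String]] {
    def zero: Set[String] = Set.empty[String]
    def combine(x: Set[String], y: Set[String]): Set[String] = x union y
  }

  // Any monoid can be folded the same way, whatever the data object is.
  def combineAll[A](xs: Seq[A], m: Monoid[A]): A = xs.foldLeft(m.zero)(m.combine)

  val total = combineAll(Seq(1, 2, 3, 4), intAddition)
  val merged = combineAll(Seq(Set("a"), Set("b"), Set("a", "c")), stringSetUnion)

  // Monad-style composition: each step returns an Option, and flatMap chains them.
  def parseAge(s: String): Option[Int] =
    if (s.nonEmpty && s.forall(_.isDigit)) Some(s.toInt) else None
  def checkAdult(age: Int): Option[Int] = if (age >= 18) Some(age) else None

  val accepted = parseAge("34").flatMap(checkAdult)
  val tooYoung = parseAge("17").flatMap(checkAdult)
  val invalid = parseAge("abc").flatMap(checkAdult)

  def main(args: Array[String]): Unit = {
    println(total)
    println(merged)
    println(List(accepted, tooYoung, invalid))
  }
}
```

Because combine is associative, partial results computed on different partitions can be merged in any grouping, which is what makes monoids a good fit for distributed aggregation.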
![Page 42: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/42.jpg)
Scala for the system architect
• Concurrency
• Problem:
– Threads
– Shared mutable state
– Locks
• Solution:
– message passing concurrency w/ Actors
– Future, Promise
• Abstractions
– Actor
• an object that processes a message
• encapsulates state (state not shared)
– ActorRef
– Message, usually sent asynchronously
– Mailbox
– ActorSystem
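Actors need the Akka library, so the sketch below covers only the Future and Promise part of the solution, using the Scala standard library's `scala.concurrent` package. The object name `FutureSketch`, the method names, and the 5-second timeout are arbitrary choices for illustration; blocking with `Await` is done here only so that the example can return a plain value.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  // Two independent computations run concurrently; no shared mutable state, no locks.
  def sumInTwoHalves(): Int = {
    val firstHalf = Future { (1 to 50).sum }
    val secondHalf = Future { (51 to 100).sum }

    // Futures compose with a for-comprehension; the result is itself a Future.
    val total = for { a <- firstHalf; b <- secondHalf } yield a + b
    Await.result(total, 5.seconds)
  }

  // A Promise is the write side of a Future: one party completes it, another reads it.
  def completeViaPromise(): String = {
    val promise = Promise[String]()
    Future { promise.success("done") }
    Await.result(promise.future, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(sumInTwoHalves())
    println(completeViaPromise())
  }
}
```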
![Page 43: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/43.jpg)
Scala for the system architect: Akka
• Fault tolerance
– Supervision
– Strategies
• Resume, restart, stop, escalate, …
• Scale out: remote actors
– Via configuration
![Page 44: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/44.jpg)
Scala for the system architect
• Parallel collections
– scala> import scala.collection.parallel.immutable._
– import scala.collection.parallel.immutable._
– scala> ParVector(10, 20, 30, 40, 50, 60, 70, 80, 90) .map { x =>
– | println( Thread.currentThread.getName); x / 2 }
– ForkJoinPool-1-worker-13
– ForkJoinPool-1-worker-1
– ForkJoinPool-1-worker-1
– ForkJoinPool-1-worker-9
– ForkJoinPool-1-worker-11
– ForkJoinPool-1-worker-5
– ForkJoinPool-1-worker-3
– ForkJoinPool-1-worker-15
– ForkJoinPool-1-worker-7
– res0: scala.collection.parallel.immutable.ParVector[Int] = ParVector(5, 10, 15, 20, 25, 30, 35, 40, 45)
– scala>
![Page 45: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/45.jpg)
Sequential collections
![Page 46: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/46.jpg)
Parallel collections
![Page 48: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/48.jpg)
Different perspectives on an elephant Scala
![Page 49: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/49.jpg)
Spark in the Hadoop ecosystem
![Page 50: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/50.jpg)
References
• http://scala-lang.org/
• Scala in Action, Nilanjan Raychaudhuri
• Grokking Functional Programming, Aslam Khan
• Michael Noll
![Page 51: Scala 20140715](https://reader034.vdocuments.site/reader034/viewer/2022050920/54c66e414a7959e6368b46a1/html5/thumbnails/51.jpg)
Intro to Apache Spark: Fast cluster computing engine for Hadoop
Intro to Scala: Object-oriented and functional language for the Java Virtual Machine
ACM SIGKDD, 7/9/2014
Roger Huang
Lead System Architect
Digital & Mobile Products Architecture