Collecting Uncertain Data the Reactive Way Jeff Smith @jeffksmithjr

Upload: jeff-smith

Post on 21-Apr-2017

TRANSCRIPT

Collecting Uncertain Data the Reactive Way

Jeff Smith @jeffksmithjr

x.ai is a personal assistant who schedules meetings for you

Reactive Machine Learning

Machine Learning Systems

Traits of Reactive Systems

Reactive Strategies

Reactive Machine Learning

Collecting Data

What’s for dinner?

Reactive Data Collection

Modeling Uncertain Data

Certain Data Model

case class ZebraReading(sensorId: Int, locationId: Int, timestamp: Long, count: Int)

Uncertainty Interval

27 33

Uncertain Data Model

case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)
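One way to read the uncertain model is that a consumer can derive an interval for zebras alone by scaling both bounds by the zebra fraction. The sketch below assumes `percentZebras` is a fraction in [0, 1] (the sample reading later in the deck uses 0.60); `ZebraEstimate` and `zebraBounds` are hypothetical names that do not appear in the slides.

```scala
// Uncertain data model, as shown on the slide.
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)

object ZebraEstimate {
  // Hypothetical helper: scales both interval bounds by the zebra fraction.
  // Assumes percentZebras is a fraction in [0, 1], not a 0-100 percentage.
  def zebraBounds(r: PreyReading): (Double, Double) = {
    require(r.animalsLowerBound <= r.animalsUpperBound, "lower bound exceeds upper bound")
    require(r.percentZebras >= 0.0 && r.percentZebras <= 1.0, "percentZebras must be in [0, 1]")
    (r.animalsLowerBound * r.percentZebras, r.animalsUpperBound * r.percentZebras)
  }
}
```

The result stays an interval rather than collapsing to a single count, so the uncertainty in the original reading is carried through to whatever consumes it.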

Scaling Data Collection

Simple Data Architecture

Mutable State

case class Region(id: Int)

import collection.mutable.HashMap

var densities = new HashMap[Region, Double]()

densities.put(Region(4), 52.4)

Scaling with Queues

Out of Order Updates

// Updates arrive in the intended order: the last write wins.
densities.put(Region(6), 73.6)
densities.put(Region(6), 0.5)
densities.get(Region(6)).get  // 0.5

// The same updates arrive in the opposite order: a different value wins.
densities.put(Region(6), 0.5)
densities.put(Region(6), 73.6)
densities.get(Region(6)).get  // 73.6

Concurrent Collections

import collection.mutable._

// Note: SynchronizedMap is deprecated since Scala 2.11 and removed in 2.13.
var synchronizedDensities =
  new LinkedHashMap[Region, Double]() with SynchronizedMap[Region, Double]
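On current Scala versions, where `SynchronizedMap` is no longer available, a comparable thread-safe map can be sketched with the standard library's `scala.collection.concurrent.TrieMap`. This is a substitution for illustration, not what the slides use; `ConcurrentDensities` and `record` are hypothetical names.

```scala
import scala.collection.concurrent.TrieMap

case class Region(id: Int)

object ConcurrentDensities {
  // Thread-safe, lock-free map from the Scala standard library.
  val densities: TrieMap[Region, Double] = TrieMap.empty[Region, Double]

  // Hypothetical helper: overwrites whatever value the region had before.
  def record(region: Region, density: Double): Unit =
    densities.update(region, density)
}
```

A concurrent collection keeps individual writes safe, but it still overwrites in place, so the value that survives depends on which update happens to arrive last.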

Scaling with Locks

Immutable Facts

import play.api.libs.json.Json

case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)

implicit val preyReadingFormatter = Json.format[PreyReading]

Immutable Facts

import java.lang.System.currentTimeMillis

val reading = PreyReading(36, 12, currentTimeMillis(), 12.0, 18.0, 0.60)

val setDoc = bucket.set[PreyReading](readingId(reading), reading)
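Because each reading is written once and carries its own timestamp, a reader can recover the most recent state without caring about the order in which facts arrived. The following is a minimal in-memory sketch of that idea, leaving out the database entirely; `ReadingLog` and `latestReading` are hypothetical names, and readings for one location are assumed to have distinct timestamps.

```scala
case class PreyReading(sensorId: Int,
                       locationId: Int,
                       timestamp: Long,
                       animalsLowerBound: Double,
                       animalsUpperBound: Double,
                       percentZebras: Double)

object ReadingLog {
  // Hypothetical helper: picks the most recent reading for a location by the
  // reading's own timestamp, so arrival order does not affect the answer.
  // Assumes timestamps are distinct within a location.
  def latestReading(facts: Seq[PreyReading], locationId: Int): Option[PreyReading] = {
    val forLocation = facts.filter(_.locationId == locationId)
    if (forLocation.isEmpty) None else Some(forLocation.maxBy(_.timestamp))
  }
}
```

Compared with the mutable map shown earlier, nothing is overwritten: both readings remain available, and "latest" becomes a question asked at query time rather than a side effect of write order.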

Scaling with Distributed Databases

Handling Incomplete Data

Distributed Data Storage

Querying Complete Data

(bucket.searchValues[PreyReading]("prey", "by_sensor_id")
  (new Query().setIncludeDocs(true)))
  .enumerate.apply(Iteratee.foreach { doc =>
    println(s"Prey Reading: $doc")
  })

Complete Data

Partition Tolerance

Querying Incomplete Data

(bucket.searchValues[PreyReading]("prey", "by_sensor_id")
  (new Query().setIncludeDocs(true)))
  .enumerate.apply(Iteratee.foreach { doc =>
    println(s"Prey Reading: $doc")
  })

Incomplete Data
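The same query can return fewer readings when part of the cluster is unreachable, and nothing in the result itself says so. One possible way to make that visible, sketched here under assumptions, is to return the values together with a measure of how much of the cluster answered. `PartialResult`, `nodesAnswered`, and `nodesTotal` are hypothetical names; the database client used in the slides is not modelled.

```scala
// Hypothetical wrapper: a query answer plus how much of the cluster produced it.
case class PartialResult[A](values: Seq[A], nodesAnswered: Int, nodesTotal: Int) {
  require(nodesTotal > 0, "nodesTotal must be positive")
  require(nodesAnswered >= 0 && nodesAnswered <= nodesTotal, "nodesAnswered out of range")

  // Fraction of nodes that contributed to this answer.
  def completeness: Double = nodesAnswered.toDouble / nodesTotal

  // True only when every node answered.
  def isComplete: Boolean = nodesAnswered == nodesTotal
}
```

A caller can then decide whether an answer built from, say, three of four nodes is good enough to act on, instead of treating every result as if it were complete.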

Reactive Data Collection

For Later

reactivemachinelearning.com
medium.com/data-engineering

M A N N I N G

Jeff Smith

x.ai @xdotai [email protected] New York, New York

skillsmatter.com/conferences/6862-scala-exchange-2015#skillscasts

Thank You
