Modeling the IoT with TitanDB and Cassandra

Post on 09-Jan-2017

Category:

Software

Modeling the IoT with TitanDB and Cassandra

Intro

• Ted Wilmes

• Data warehouse engineer at WellAware - wellaware.us

• Building a SaaS oil and gas production monitoring and analytics platform

• Collect production O&G data from the field via cellular, satellite, and other means and deliver to our customers via mobile and browser clients

© 2015. All Rights Reserved. 2

1. The property graph model and TitanDB

2. Modeling IoT

3. Time series and performance

Property graph model

[Diagram: person (name: Ted) --knows (metOn: June 1, 2012)--> person (name: George)]

Querying with Gremlin

g.V().hasLabel("person").has("name", "Ted").out("knows").values("name")

> George
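To make the traversal concrete, here is a minimal in-memory sketch of the same two-vertex graph in plain Python. It is not Titan or TinkerPop code: the `PropertyGraph` class and `names_known_by` helper are hypothetical names used only to show what "labeled vertices and edges, both carrying properties" means and what the query above walks over.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory property graph: labeled vertices and edges with properties."""

    def __init__(self):
        self.vertices = {}                   # vertex id -> {"label": ..., "props": {...}}
        self.out_edges = defaultdict(list)   # vertex id -> [(edge label, edge props, target id)]

    def add_vertex(self, vid, label, **props):
        self.vertices[vid] = {"label": label, "props": props}

    def add_edge(self, src, label, dst, **props):
        self.out_edges[src].append((label, props, dst))

    def out(self, vid, edge_label):
        """Ids of vertices reached over outgoing edges with the given label."""
        return [dst for (label, _props, dst) in self.out_edges[vid] if label == edge_label]


def names_known_by(graph, name):
    """Rough analogue of hasLabel("person").has("name", name).out("knows").values("name")."""
    starts = [vid for vid, v in graph.vertices.items()
              if v["label"] == "person" and v["props"].get("name") == name]
    return [graph.vertices[dst]["props"]["name"]
            for vid in starts
            for dst in graph.out(vid, "knows")]


graph = PropertyGraph()
graph.add_vertex(1, "person", name="Ted")
graph.add_vertex(2, "person", name="George")
graph.add_edge(1, "knows", 2, metOn="June 1, 2012")   # the edge carries its own property

print(names_known_by(graph, "Ted"))   # ['George']
```

The edge direction matters: starting from George and following outgoing "knows" edges returns nothing in this sketch.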

TitanDB

• Graph database that supports pluggable storage layers

• Designed from the ground up to provide OLTP performance against large graphs, with a particular focus on supporting high-degree vertices (vertices with many edges)

• Implements Apache TinkerPop 3 APIs

• Cassandra acts as a solid foundation, providing high availability, performance, and ease of operation

Our Internet of Things

[Diagram: Things, People, Organizations, Places, Time]

A hypothetical use case: IoT…

in SPACE

Spaceship

Mars Base

Space Station

Rocket

Satellite

Many dimensions

[Diagram: Rocket linked to Starfleet (operates), Acme Rockets (builds), Delta Booster (isModel), Major Tom (pilots), and Joyce (maintains)]

Many times, a “thing” is a system of systems

http://stardust.jpl.nasa.gov/mission/delta2.html

[Diagram: Rocket broken down into 1st Stage, 2nd Stage, and 3rd Stage, with components Interstage, Fuel Tank, Oxidizer, and Guidance Electronics (CPU, Memory)]

[Diagram: Guidance Electronics with CPU, Memory, and JVM; JVM with Heap Usage and Thread Count]
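The "system of systems" idea can be sketched as a recursive walk over containment edges. The nesting below is an assumption read off the diagram labels (Guidance Electronics holding CPU, Memory, and JVM; JVM holding Heap Usage and Thread Count), and `HAS_PART` and `descendants` are hypothetical names, not part of any Titan schema.

```python
# Hypothetical containment edges, assumed from the diagram labels.
HAS_PART = {
    "Guidance Electronics": ["CPU", "Memory", "JVM"],
    "JVM": ["Heap Usage", "Thread Count"],
}


def descendants(thing, has_part=HAS_PART):
    """Depth-first list of everything nested under `thing`."""
    found = []
    for part in has_part.get(thing, []):
        found.append(part)
        found.extend(descendants(part, has_part))
    return found


print(descendants("Guidance Electronics"))
# ['CPU', 'Memory', 'JVM', 'Heap Usage', 'Thread Count']
```

In a graph database the same question becomes a repeated outgoing-edge traversal, so zooming in one more level is just one more hop.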

Continuing to zoom in

[Diagram: JVM with Heap Usage and Thread Count; an Alarm Condition, an Alarm, and Joyce joined by edges labeled monitors, triggers, and notifies]

[Diagram: Alarm Condition, Alarm, Joyce, Major Tom, and Starfleet joined by edges labeled triggers, notifies, reports to, and employs (two edges)]

IoT modeling in summary

• Things can be interconnected systems of other things

• A high-fidelity model of ‘reality’ supports a wider variety of use cases than a disconnected set of entities

• An IoT app is only partly about things; don’t forget to include everything else (social, organizational, etc.)

Time series & Performance

Time series in Titan

Heap Usage

JVM

?

Our basic time series requirements

• Support a large volume of low-latency writes

• Low-latency retrieval, primarily of the most recent data

A selection of factors affecting Titan performance

• Titan deployment topology and configuration
• All your usual Cassandra tuning tips and tricks
• Titan JVM tuning
  • selection of appropriate garbage collector
  • GC parameters
  • like Cassandra, worthwhile to adjust NewSize
• Data modeling
• Indexing
  • Global graph indices (native Titan vs. external)
  • Vertex centric indices
• Titan's different caches: transaction cache & the database-level cache

Deployment options

[Diagram labels: mars-north-1; Local; Embedded; Remote]

But first, time series with CQL

* Brady Gentile - https://academy.datastax.com/demos/getting-started-time-series-data-modeling

CQL

First approach

[Diagram: Heap Usage -> Chunk (chunkStart: 1442880000000, chunkEnd: 1442966400000) -> Observation (tstamp: 1442880000001), Observation (tstamp: 1442880000002); Heap Usage -> Chunk (chunkStart: 1442966400000, chunkEnd: 1442966400000)]

• Intuitive and easy to query
• You can imagine adding further levels to the hierarchy following a year->month->day format
• Individual observations can be associated with other pieces of data
• Observations can be filtered by timestamp with an edge filter, but you still have to retrieve a large number of disparate vertices
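As a rough illustration of the chunking step, the sketch below assigns a timestamp to its chunk. It assumes day-wide chunks aligned to UTC midnight and a half-open [chunkStart, chunkEnd) interval; the first chunk's values are consistent with that (1442966400000 - 1442880000000 = 86,400,000 ms), but the width and both function names are assumptions rather than anything the deck states.

```python
from collections import defaultdict

DAY_MS = 24 * 60 * 60 * 1000   # 86,400,000 ms, the assumed chunk width


def chunk_bounds(tstamp_ms, width_ms=DAY_MS):
    """Return (chunkStart, chunkEnd) of the chunk holding tstamp_ms, end exclusive."""
    chunk_start = tstamp_ms - (tstamp_ms % width_ms)
    return chunk_start, chunk_start + width_ms


def group_by_chunk(tstamps, width_ms=DAY_MS):
    """Map each chunkStart to the timestamps that fall inside that chunk."""
    chunks = defaultdict(list)
    for tstamp in tstamps:
        chunks[chunk_bounds(tstamp, width_ms)[0]].append(tstamp)
    return dict(chunks)


print(chunk_bounds(1442880000001))   # (1442880000000, 1442966400000)
print(group_by_chunk([1442880000001, 1442880000002, 1442966400000]))
```

Adding a year->month->day hierarchy would repeat the same bucketing at coarser widths.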

A further refinement

[Diagram: Heap Usage -> Chunk (chunkStart: 1442880000000, chunkEnd: 1442966400000); Heap Usage -> Chunk (chunkStart: 1442966400000, chunkEnd: 1442966400000)]

• How do we reduce the number of vertices (think Cassandra partitions) that we need to retrieve?

1. Move all properties to the edge
2. Make the edge “undirected”

or, a combo of the two approaches:

1. Copy the properties to the edge
2. Keep the discrete observation vertex

[Diagram: Heap Usage -> Chunk; the observation's timestamp and value become tstamp and value properties on an edge]

Chunk vertex with its observations

Vertex ID chunkStart chunkEnd obs. @ t2 obs. @ t1 obs. @ t0

Observations in time descending order
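A small sketch of why the descending order helps: if a chunk's observations are kept newest-first, the most recent readings are always at the front of the row. `ChunkRow` is a hypothetical in-memory stand-in for the row layout above, not Titan's actual storage code, and the sample values are made up.

```python
import bisect


class ChunkRow:
    """One chunk's observations, kept newest-first like the row sketched above."""

    def __init__(self, chunk_start, chunk_end):
        self.chunk_start = chunk_start
        self.chunk_end = chunk_end
        self._neg_tstamps = []   # negated timestamps: ascending here means time-descending
        self._values = []

    def add(self, tstamp, value):
        if not (self.chunk_start <= tstamp < self.chunk_end):
            raise ValueError("timestamp falls outside this chunk")
        i = bisect.bisect_left(self._neg_tstamps, -tstamp)
        self._neg_tstamps.insert(i, -tstamp)
        self._values.insert(i, value)

    def newest(self, n=1):
        """The n most recent (tstamp, value) pairs, newest first."""
        return [(-t, v) for t, v in zip(self._neg_tstamps[:n], self._values[:n])]


row = ChunkRow(1442880000000, 1442966400000)
row.add(1442880000001, 10.0)   # made-up sample values
row.add(1442880000003, 30.0)
row.add(1442880000002, 20.0)   # arrives out of order, still lands in sorted position
print(row.newest(2))           # [(1442880000003, 30.0), (1442880000002, 20.0)]
```

Reading "the latest n" is then a prefix read of one row instead of a fetch of n separate vertices.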

Sample Gremlin queries

• observations > 1442162072000
  • chunk.outE().has("tstamp", gt(1442162072000))
• observations between 1442162072000 and 1442162073000
  • chunk.outE().has("tstamp", between(1442162072000, 1442162073000))
• Most recent observation before now
  • chunk.outE().has("tstamp", lte(System.currentTimeMillis())).order().by("tstamp", decr).limit(1)
• You can wrap this in your own time series specific API
  • new SeriesQuery(series1).interval(startTstamp, endTstamp).decr().limit(1)
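A wrapper along the lines of the SeriesQuery call could look like the sketch below. It is a hypothetical in-memory illustration in Python, not the author's class: the method names mirror the slide, `run()` is an added assumption to execute the query, the interval is treated as half-open like TinkerPop's between(), and every sample point except the first is made up.

```python
class SeriesQuery:
    """Hypothetical fluent query over (tstamp, value) pairs; not a Titan or TinkerPop class."""

    def __init__(self, observations):
        self._observations = list(observations)
        self._start = None
        self._end = None
        self._descending = False
        self._limit = None

    def interval(self, start_tstamp, end_tstamp):
        # Half-open [start, end): the start is included, the end is not.
        self._start, self._end = start_tstamp, end_tstamp
        return self

    def decr(self):
        self._descending = True
        return self

    def limit(self, n):
        self._limit = n
        return self

    def run(self):
        rows = [(t, v) for t, v in self._observations
                if (self._start is None or t >= self._start)
                and (self._end is None or t < self._end)]
        rows.sort(key=lambda row: row[0], reverse=self._descending)
        return rows if self._limit is None else rows[: self._limit]


series1 = [(1442162072000, 500.123), (1442162072500, 501.0), (1442162073000, 502.5)]
latest = SeriesQuery(series1).interval(1442162072000, 1442162073000).decr().limit(1).run()
print(latest)   # [(1442162072500, 501.0)]
```

A real implementation would translate each builder call into the corresponding edge filter, ordering, and limit steps rather than filtering a Python list.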

Pros and cons vs. separate CQL or other tsdb

• Pros
  • Allows for a single unified view of your IoT data, maintaining direct connectivity between sensor data & the other entities
  • Gremlin works well for processing streams of time series data
• Cons
  • Storage format is not as compact
  • Extra overhead of managing ‘chunks’ versus CQL primary key taking care of that for us (e.g. chunk cache)

[Diagram: Heap Usage with two outgoing hasChunk edges to Chunk vertices (chunkStart: 1442880000000, chunkEnd: 1442966400000) and (chunkStart: 1442966400000, chunkEnd: 1442966400000)]

A simple query - retrieve all the heap usage chunks

gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000

Getting a vertex by id

gremlin> g.V(4)
==>v[4]

[Sequence diagram: "Does this vertex exist?" answered "Yes"]

Vertex is now loaded in Titan transaction cache

Aside - a tool of the trade

Profiler with socket tracing

Retrieving properties

gremlin> g.V(4).valueMap()
==>[sensorType:[heap usage], units:[bytes]]

[Sequence diagram: "Retrieve properties" answered "Two properties"]

Vertex properties are now loaded in the Titan transaction cache

2 Round trips

[Sequence diagram: "Does this vertex exist?" answered "Yes", then "Retrieve properties" answered "Two properties"]

• Not a big deal for a single vertex lookup with property retrieval, but it can add up

• Exacerbated by the magnitude of latency between Titan and Cassandra
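A back-of-the-envelope sketch of how those two round trips add up. The latency figures are hypothetical placeholders, and the model assumes every lookup is issued sequentially with nothing served from cache.

```python
ROUND_TRIPS_PER_LOOKUP = 2   # exists check + property fetch, as on the slide


def lookup_latency_ms(lookups, rtt_ms, round_trips=ROUND_TRIPS_PER_LOOKUP):
    """Network time for `lookups` sequential, uncached vertex lookups."""
    return lookups * round_trips * rtt_ms


for rtt_ms in (0.5, 5.0):    # hypothetical low and high Titan-to-Cassandra latencies
    print(rtt_ms, lookup_latency_ms(1, rtt_ms), lookup_latency_ms(10_000, rtt_ms))
```

With these made-up numbers a single lookup costs 1 ms or 10 ms of network time, while ten thousand lookups cost 10 s or 100 s, which is why the round-trip count is worth reducing.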

Querying for adjacent vertices

gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000

[Sequence diagram, four sequential requests: "Does this vertex exist?", "Get edges", "Get 1st chunk properties", "Get 2nd chunk properties"]

Batch requests

[Sequence diagram: "Does this vertex exist?", then "Get edges", then "Get 1st chunk properties" and "Get 2nd chunk properties" issued together]

• query.batch = true
• “Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend.” - http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/titan-config-ref.html

gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000

Remove initial exists query

• storage.batch-loading = true
• WARNING - this disables vertex ‘exists’ checks

gremlin> g.V(4).out('hasChunk').values('chunkStart')
==>1442880000000
==>1442966400000

[Sequence diagram: the "Does this vertex exist?" request is dropped, leaving "Get edges", then "Get 1st chunk properties" and "Get 2nd chunk properties" issued together]
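The request counts in the last three sequence diagrams can be summarized with a simple model. This is only a sketch of the slides' picture (one exists check, one edge fetch, one property fetch per adjacent vertex unless batched); `traversal_round_trips` is a hypothetical helper, and real counts also depend on what is already cached.

```python
def traversal_round_trips(adjacent_vertices, batch_queries=False, exists_check=True):
    """Round trips for g.V(id).out(label).values(key) under the slides' simplified model."""
    trips = 1 if exists_check else 0     # "Does this vertex exist?"
    trips += 1                           # "Get edges"
    if batch_queries:
        trips += min(adjacent_vertices, 1)   # all property fetches share one request
    else:
        trips += adjacent_vertices           # one property fetch per adjacent vertex
    return trips


print(traversal_round_trips(2))                                          # 4
print(traversal_round_trips(2, batch_queries=True))                      # 3
print(traversal_round_trips(2, batch_queries=True, exists_check=False))  # 2
```

The gap widens with fan-out: in this model a vertex with 1,000 chunks costs 1,002 round trips unbatched and 3 with batching.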

Optimizing your writes

gremlin> chunk.addEdge("hasObservation", chunk, "tstamp", 1442162072000, "value", 500.123)

[Sequence diagram: "Does this vertex exist?", then "Write new edge"]

• Remove the read from your write path: storage.batch-loading = true

• Batch your commits; measure latency and throughput on your system to find a good commit size
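Batching commits can be sketched as below. `write_in_batches` and its `commit` callback are hypothetical; in a real writer the callback would add the edges and commit the graph transaction, and the batch size is the knob to vary while measuring latency and throughput.

```python
def write_in_batches(observations, commit, batch_size):
    """Call commit(batch) once per batch_size observations; return the number of commits."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commits = 0
    batch = []
    for observation in observations:
        batch.append(observation)
        if len(batch) == batch_size:
            commit(batch)
            commits += 1
            batch = []          # start a fresh list so committed batches are not mutated
    if batch:                   # flush the final, partially filled batch
        commit(batch)
        commits += 1
    return commits


committed = []
print(write_in_batches(range(10), committed.append, batch_size=4))   # 3
print(committed)   # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Larger batches mean fewer commits but more work at risk per failed transaction, so the right size has to be measured rather than guessed.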

storage.batch-loading=false

storage.batch-loading=true

Quick and dirty write performance numbers

[Chart: writes per second (wps), axis from 0 to 90,000 in steps of 22,500]

• 9 m3.2xlarge nodes w/ C* 2.2, RF = 3, writing @ quorum, default C* settings
• 1 m3.2xlarge “client” w/ Titan 1.0-SNAPSHOT, 10 write threads writing 100 million points in total across 100,000 series

In summary

• Understanding the underlying data storage format can help with performance tuning
• Writes
  • remove reads from the write path where possible
  • test different batch commit sizes
  • when writing vertices you may need to adjust ids.block-size and ids.renew-percentage
• Reads
  • batch communication between Titan and Cassandra with query.batch=true
  • make use of global and vertex centric indices when possible

What questions do you have? And thanks!

Thanks to the Apache TinkerPop and TitanDB teams, my awesome coworkers, and the folks at DataStax for putting on an excellent summit!

Ted Wilmes Data Warehouse Engineer

@trwilmes tedwilmes@wellaware.us

Thank you
