an introduction to apache geode (incubating)

87
A NEW PLATFORM FOR A NEW ERA

Upload: anthony-baker

Post on 13-Aug-2015

554 views

Category:

Technology


2 download

TRANSCRIPT

A NEW PLATFORM FOR A NEW ERA

2© Copyright 2015 Pivotal. All rights reserved. 2© Copyright 2015 Pivotal. All rights reserved.

Open Sourcing GemFire

An Introduction to Apache Geode (incubating)The open source distributed in-memory database

Anthony [email protected]@metatype

Dan [email protected]

3© Copyright 2015 Pivotal. All rights reserved.

Topics

GemFire history, architecture, and use cases

Geode, the open source core of GemFire

Example application and demo

Technical deep dive

4© Copyright 2015 Pivotal. All rights reserved.

It’s all about the

DATAand how you use it

5© Copyright 2015 Pivotal. All rights reserved.

2004 2008 2014

• Massive increase in data volumes

• Falling margins per transaction

• Increasing cost of IT maintenance

• Need for elasticity in systems

• Financial Services Providers (every major Wall Street bank)

• Department of Defense

• Real Time response needs• Time to market constraints • Need for flexible data

models across enterprise• Distributed development• Persistence + In-memory

• Global data visibility needs• Fast Ingest needs for data• Need to allow devices to

hook into enterprise data• Always on

• Largest travel Portal• Airlines• Trade clearing• Online gambling

• Largest Telcos• Large mfrers• Largest Payroll processor• Auto insurance giants• Largest rail systems on

earth

Hybrid Transactional/Analytics grids

Our GemFire Journey Over The Years

6© Copyright 2015 Pivotal. All rights reserved.

Apps at Scale Have Unique Needs

Pivotal GemFire is the distributed, in-memory database for apps that need:

1. Elastic scale-out performance

2. High performance database capabilities in distributed systems

3. Mission critical availability and resiliency

4. Flexibility for developers to create unique applications

5. Easy administration of distributed data grids

7© Copyright 2015 Pivotal. All rights reserved.

1. Elastic Scale-Out Performance

Nodes

Ops / SecLinear scalability

Elastic capacity +/-

Latency-minimizing data distribution

China Railway Corporation

“The system is operating with solid performance and uptime. Now, we have a reliable, economically sound production system that supports record volumes and has room to grow”

Dr. Jiansheng Zhu, Vice Director of China Academy of Railway Sciences

• 4.5 million ticket purchases & 20 million users per day.

• Spikes of 15,000 tickets sold per minute, 40,000 visits per second.

8© Copyright 2015 Pivotal. All rights reserved.

2. High Performance Database Capabilities

Performance-optimized

persistence

Configurable consistency Partitioned Replicated Disabled

Distributed & continuous queries

Distributed transactions, indexing, archival

Newedge

“Our global deployment of Pivotal GemFire’s distributed cache gives me a single version of the trade – resolving hard-to-test-for synchronization issues that exist within any globally distributed business application architecture”

Michael Benillouche, Global Head of Data Management

9© Copyright 2015 Pivotal. All rights reserved.

3. Mission Critical Availability and Resiliency

Cluster to cluster WAN connectivity

Cluster resilience & failover

XX Gire

“We can track and collect money at our 4,000+ kiosks and branches – even without a reliable Internet connection. GemFire provides the core data grid and a significant amount of related functionality to help us handle this unreliable network problem”

Gustavo Valdez, Chief of Architecture and Development

• 19 million payment transactions per month

• 4000+ points of sale with intermittent Internet connectivity

WAN

10© Copyright 2015 Pivotal. All rights reserved.

4. Flexibility for Developers to Create Unique Apps• Data Structures:

– User-defined classes– Documents (JSON)

• Native language support:– Java, C++, C#

• API’s– Java: Hashmap– Memcache– Spring Data GemFire– C++ Serialization API’s

• Embedded query authoring– Object Query Language (OQL)

• Publish & subscribe framework for continous query & reliable asynchronous event queues

• App-server embedded functionality:– Web app session state caching– L2 Hibernate

11© Copyright 2015 Pivotal. All rights reserved.

5. Easy Administration of Distributed Data Grids• Auto tuning of distributed computing resources to optimize

performance

• Cluster monitoring dashboard– Cluster and node status & performance

• Offline performance statistics analysis tool– View historical logs and events to diagnose performance and resource bottlenecks

• Command-line tool for taking action on clusters and nodes

12© Copyright 2015 Pivotal. All rights reserved.

What makes it fast?

Design center is RAM not HDD

Really demanding customers

Avoid and minimize, particularly on the critical read/write paths:– Network hops, copying data, contention, distributed locking, disk seeks, garbage

Lots and lots of testing– Establish and monitor performance baselines– Distributed systems testing is difficult!

13© Copyright 2015 Pivotal. All rights reserved.

Horizontal Scaling for GemFire (Geode) ReadsWith Consistent Latency and CPU

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers

• Partitioned region with redundancy and 1K data size

0

2

4

6

8

10

12

14

16

18

0

1

2

3

4

5

6

2 4 6 8 10

Sp

ee

du

p

Server Hosts

speedup

latency (ms)

CPU %

14© Copyright 2015 Pivotal. All rights reserved.

GemFire (Geode) 3.5-4.5X Faster Than Cassandra for YCSB

15© Copyright 2015 Pivotal. All rights reserved.

Roadmap

HDFS persistence

Off-heap storage

Lucene indexes

Spark integration

Cloud Foundry service

…and other ideas from the Geode community!

16© Copyright 2015 Pivotal. All rights reserved. 16© Copyright 2015 Pivotal. All rights reserved.

Use cases and patterns

17© Copyright 2015 Pivotal. All rights reserved.

“Low touch” Usage Patterns

Simple template for TCServer, TC, App servers

Shared nothing persistence, Global session stateHTTP Session management

Set Cache in hibernate.cfg.xml

Support for query and entity cachingHibernate L2 Cache plugin

Servers understand the memcached wire protocol

Use any memcached clientMemcached protocol

<bean id="cacheManager" class="org.springframework.data.gemfire.support.GemfireCacheManager"Spring Cache Abstraction

18© Copyright 2015 Pivotal. All rights reserved.

Application Patterns

Caching for speed and scale– Read-through, Write-through, Write-behind

Geode as the OLTP system of record– Data in-memory for low latency, on disk for durability

Parallel compute engine

Real-time analytics

19© Copyright 2015 Pivotal. All rights reserved.

Development Patterns

Configure the cluster programmatically or declaratively using cache.xml, gfsh, or SpringDataGemFire

<cache> <region name="turbineSensorData" refid="PARTITION_PERSISTENT"> <partition-attributes redundant-copies="1" total-num-buckets="43"/> </region></cache>

20© Copyright 2015 Pivotal. All rights reserved.

Development Patterns

Write key-value data into a Region using { create | put | putAll | remove }

– The value can be flat data, nested objects, JSON, …

Region sensorData = cache.getRegion("TurbineSensorData"); SensorKey key = new SensorKey(31415926, "2013-05-19T19:22Z"); // turbineId, timestampsensorData.put(key, new TurbineReading() .setAmbientTemp(75) .setOperatingTemp(80) .setWindDirection(0) .setWindSpeed(30) .setPowerOutput(5000) .setRPM(5));

21© Copyright 2015 Pivotal. All rights reserved.

Development Patterns

Read values from a Region by key using { get | getAll }

TurbineReading data = sensorData.get(new SensorKey(31415926,"2013-05-19T19:22Z"));

Query values using OQL

// finds all sensor readings for the given turbine SELECT * from /TurbineSensorData.entrySet WHERE key.turbineId = 31415926

// finds all sensor readings where the operating temp exceeds a threshold SELECT * from /TurbineSensorData.entrySet WHERE value.operatingTemp > 120

Apply indexes to optimize queries

22© Copyright 2015 Pivotal. All rights reserved.

Development Patterns

Execute functions to operate on local data in parallel

Respond to updates using CacheListeners and Events

Automatic redundancy, partitioning, distribution, consistency, network partition detection & recovery, load balancing, …

23© Copyright 2015 Pivotal. All rights reserved. 23© Copyright 2015 Pivotal. All rights reserved.

Apache Geode (incubating)The open source core of GemFire

24© Copyright 2015 Pivotal. All rights reserved.

Why OSS? Why Now? Why Apache?

Open Source Software is fundamentally changing buying patterns– Developers have to endorse product selection (No longer CIO handshake)– Community endorsement is key to product visibility– Open source credentials attract the best developers– Vendor credibility directly tied to street credibility of product

Align with the tides of history– Customers increasingly asking to participate in product development– Allow product development to happen with full transparency

Apache is where you go to build Open Source street cred– Transparent, meritocracy which puts developers in charge– Roman keeps shouting “Apache!” every few hours

25© Copyright 2015 Pivotal. All rights reserved.

Geode Will Be A Significant Apache Project

Over a 1000 person years invested into cutting edge R&D

Thousands of production customers in very demanding verticals

Cutting edge use cases that have shaped product thinking

Tens of thousands of distributed , scaled up tests that can randomize every aspect of the product

A core technology team that has stayed together since founding

Performance differentiators that are baked into every aspect of the product

26© Copyright 2015 Pivotal. All rights reserved.

Geode or GemFire?

Geode is a project, GemFire is a product

We donated everything but the kitchen sink*

More code drops imminent; going forward all development happens OSS-style (“The Apache Way”)

* Multi-site WAN replication, continuous queries, and native (C/C++) client driver

27© Copyright 2015 Pivotal. All rights reserved.

geode.incubator.apache.org

28© Copyright 2015 Pivotal. All rights reserved.

github.com/apache/incubator-geode

XApache

29© Copyright 2015 Pivotal. All rights reserved.

Download the source

30© Copyright 2015 Pivotal. All rights reserved. 30© Copyright 2015 Pivotal. All rights reserved.

Geode Demo

31© Copyright 2015 Pivotal. All rights reserved.

Post RegionPartitioned

People RegionPartitioned

Social Network

Person

Name: StringDescription:String

Post

Id: PostId(name, date)Text: String

32© Copyright 2015 Pivotal. All rights reserved.

Partition put

Client

Server 1

Server 2

Server 3

Bucket 1

Bucket 1

Bucket 2

Bucket 2

#(LOL)=1Put LOL

33© Copyright 2015 Pivotal. All rights reserved.

Partition put

Client

Server 1

Server 2

Server 3

LOL

LOL

Bucket 2

Bucket 2

ReplicateTo Secondary

34© Copyright 2015 Pivotal. All rights reserved.

“User” Use Case – Save Objectspublic interface PersonRepository extends CrudRepository<Person, String> {}

@AutowiredPersonRepository people;

public static void main(String[] args) { people.save(new Person(name)); posts.save(new Post(new PostId(name, date), text));}

Nested Objects,Compound Keys

35© Copyright 2015 Pivotal. All rights reserved.

“User” Use Case – Save Objectspublic interface PersonRepository extends CrudRepository<Person, String> {}

@AutowiredPersonRepository people;

public static void main(String[] args) { people.save(new Person(name)); posts.save(new Post(new PostId(name, date), text));}

Automatically SerializedWith PDX

36© Copyright 2015 Pivotal. All rights reserved.

Configuration

<gfe:partitioned-region id="people" copies="1"/>

<bean id="pdxSerializer” class="com.gemstone.gemfire.pdx.ReflectionBasedAutoSerializer">

<constructor-arg value="io.pivotal.happysocial.model.*"/></bean>

<gfe:cache pdx-serializer-ref="pdxSerializer"/>

37© Copyright 2015 Pivotal. All rights reserved.

Data Analyst – Determine Sentiment

Find all of the posts for a user

Analyze their content

38© Copyright 2015 Pivotal. All rights reserved.

First try – Just use a Query

public interface PostRepository extends GemfireRepository<Post, PostId> {

@Query("select * from /posts where id.person=$1") public Collection<Post> findPosts(String personName);}

Collection<Post> posts = postRepository.findPosts(personName);String sentiment = sentimentAnalyzer.analyze(posts);

39© Copyright 2015 Pivotal. All rights reserved.

First try – Just use a Query

public interface PostRepository extends GemfireRepository<Post, PostId> {

@Query("select * from /posts where id.person=$1") public Collection<Post> findPosts(String personName);}

Collection<Post> posts = postRepository.findPosts(personName);String sentiment = sentimentAnalyzer.analyze(posts);

Query Nested Objects

40© Copyright 2015 Pivotal. All rights reserved.

Use an Index

<gfe:index id="postAuthor" expression="id.person" from="/posts”

/>

41© Copyright 2015 Pivotal. All rights reserved.

Still could be more efficient

Client

Server 1

Server 2

Server 3

Joe: LOL!!

Joe: LOL!!

EJ: arrg

Maya: Hii

Jess: sup

Jess: ok

Hitting multipleNodes

Bringing too muchData to the client

42© Copyright 2015 Pivotal. All rights reserved.

Colocate the data

Client

Server 1

Server 2

Server 3

Joe: LOL!! Joe: LOL!!

EJ: arrgMaya: Hii

Jess: sup Jess: ok

<gfe:partitioned-region id="posts" copies="1" colocated-with="people”>

<gfe:partition-resolver ref="partitionResolver"/>

</gfe:partitioned-region>

43© Copyright 2015 Pivotal. All rights reserved.

Send behavior to data

Client

Server 1

Server 2

Server 3

Joe: LOL!! Joe: LOL!!

EJ: arrgMaya: Hii

Jess: sup Jess: ok

Execution functiongetSentimentOn Joe, Jess

Execute on Joe

Execute on Jess

44© Copyright 2015 Pivotal. All rights reserved.

Sample Function – Client Side

@Component@OnRegion(region = "posts")public interface FunctionClient { public List<SentimentResult> getSentiment(@Filter Set<String> people);}

45© Copyright 2015 Pivotal. All rights reserved.

Sample Function – Server Side

@GemfireFunction(HA=true)public SentimentResult getSentiment(Region<PostId, Post> localPosts, @Filter Set<String> personNames) throws Exception {

String personName = personNames.iterator().next(); Collection<Post> posts = localPosts.query("id.person='" personName + "'");

String sentiment = sentimentAnalyzer.analyze(posts); return new SentimentResult(sentiment, personName);}

46© Copyright 2015 Pivotal. All rights reserved.

Sample Function – Server Side

@GemfireFunction(HA=true)public SentimentResult getSentiment(Region<PostId, Post> localPosts, @Filter Set<String> personNames) throws Exception {

String personName = personNames.iterator().next(); Collection<Post> posts = localPosts.query("id.person='" personName + "'"); String sentiment = sentimentAnalyzer.analyze(posts); return new SentimentResult(sentiment, personName);}

47© Copyright 2015 Pivotal. All rights reserved.

Sample Function – Server Side

@GemfireFunction(HA=true)public SentimentResult getSentiment(Region<PostId, Post> localPosts, @Filter Set<String> personNames) throws Exception {

String personName = personNames.iterator().next(); Collection<Post> posts = localPosts.query("id.person='" personName + "'");

String sentiment = sentimentAnalyzer.analyze(posts); return new SentimentResult(sentiment, personName);}

48© Copyright 2015 Pivotal. All rights reserved. 48© Copyright 2015 Pivotal. All rights reserved.

Demo

49© Copyright 2015 Pivotal. All rights reserved. 49© Copyright 2015 Pivotal. All rights reserved.

Design• Persistence

• PDX Serialization

50© Copyright 2015 Pivotal. All rights reserved.

Persistence – Design Goals

High availability

Low latency, high throughput

Durability

Minimal impact to in memory speed– Minimize IO operations– Minimize locking– Minimize deserialization

51© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 3Server 2

Bucket 3*primary*

Bucket 1secondary

Bucket 3secondary

Bucket 2*primary*

Server 1

Bucket 2secondary

Bucket 1*primary*

52© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 1 Server 3Server 2

Bucket 2secondary

Bucket 1*primary*

Bucket 3*primary*

Bucket 1secondary

Bucket 3secondary

Bucket 2*primary*

What if a server crashes?

53© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 1 Server 3Server 2

Bucket 2secondary

Bucket 1*primary*

Bucket 3*primary*

Bucket 1*primary*

Bucket 3secondary

Bucket 2*primary*

Bucket 2secondary

Primary Failover

RestoreRedundancy

Bucket 1secondary

54© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 1 Server 3Server 2

Bucket 2secondary

Bucket 1*primary*

Bucket 3*primary*

Bucket 1*primary*

Bucket 3secondary

Bucket 2*primary*

Bucket 2secondary

Bucket 1secondary

What if they all crash?

55© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 3Server 2

Bucket 3*primary*

Bucket 1*primary*

Bucket 3secondary

Bucket 2*primary*

Bucket 2secondary

Bucket 1secondary

Server 1

Bucket 2secondary

Bucket 1*primary*

Compare Persistent ViewOn Restart

56© Copyright 2015 Pivotal. All rights reserved.

Persistence – Shared NothingServer 1

Bucket 2secondary

Bucket 1*primary*

Server 3Server 2

Bucket 3*primary*

Bucket 1secondary

Bucket 3secondary

Bucket 2*primary*

Bucket 2secondary

Bucket 1secondary

Recover Latest Data &

Rebalance

57© Copyright 2015 Pivotal. All rights reserved.

Modify k1->v5

Create k6->v6

Create k1->v1

Create k2->v2

Modifyk1->v3

Create k4->v4

Modify k1->v5

Create k6->v6

Member 1

Persistence - Operation Logs

Put k6->v6

Oplog2.crf

Oplog1.crf

Append to operation log

58© Copyright 2015 Pivotal. All rights reserved.

Modify k1->v5

Create k6->v6

Create k1->v1

Modifyk1->v3

Member 1

Persistence - Compaction

Oplog2.crf

Oplog1.crf

Create k2->v2

Create k4->v4

Copy live data forward

59© Copyright 2015 Pivotal. All rights reserved.

Modify k1->v5

Create k6->v6

Persistence - Compaction

Create k2->v2

Create k4->v4

Oplog2.crf

Member 1

Modify k4->v7Oplog3.crf

Put k4->v7

60© Copyright 2015 Pivotal. All rights reserved. 60© Copyright 2013 Pivotal. All rights reserved.

Design• Persistence

• PDX Serialization

61© Copyright 2015 Pivotal. All rights reserved.

Fixed or flexible schema?

id name age pet_id

{ id : 1, name : “Fred”, age : 42, pet : { name : “Barney”, type : “dino” }}

OR

62© Copyright 2015 Pivotal. All rights reserved.

But how to serialize data?

http://blog.pivotal.io/pivotal/products/data-serialization-how-to-run-multiple-big-data-apps-at-once-with-gemfire

63© Copyright 2015 Pivotal. All rights reserved.

Portable Data eXchange

C#, C++, Java, JSON

| header | data || pdx | length | dsid | typeId | fields | offsets |

64© Copyright 2015 Pivotal. All rights reserved.

Efficient for queries

SELECT p.name from /Person p WHERE p.pet.type = “dino”

{ id : 1, name : “Fred”, age : 42, pet : { name : “Barney”, type : “dino” }}

single field deserialization

65© Copyright 2015 Pivotal. All rights reserved.

Easy to use

Access from Java, C#, C++, JSON

Domain objects not required

Automatic type definition– No IDL compiler or schema required– No hand-coded read/write methods

66© Copyright 2015 Pivotal. All rights reserved.

Distributed type registry

Member A Member B

Distributed Type Definitions

Person p1 = …region.put(“Fred”, p1);

67© Copyright 2015 Pivotal. All rights reserved.

Distributed type registry

Member A Member B

Distributed Type Definitions

Person p1 = …region.put(“Fred”, p1);

automatic definition

68© Copyright 2015 Pivotal. All rights reserved.

Distributed type registry

Member A Member B

Distributed Type Definitions

Person p1 = …region.put(“Fred”, p1);

automatic definition

69© Copyright 2015 Pivotal. All rights reserved.

Distributed type registry

Member A Member B

Distributed Type Definitions

Person p1 = …region.put(“Fred”, p1);

automatic definition

replicate serialized data containing

typeId

70© Copyright 2015 Pivotal. All rights reserved.

Schema evolution

PDX provides forwards and backwards compatibility, no code required

Member A Member B

Distributed Type Definitions

v1

v2

Application #2

Application #1

v2 objects preserve data from missing fields

v1 objects use default values to fill in new fields

71© Copyright 2015 Pivotal. All rights reserved.

Questions?

http://geode.incubator.apache.org

[email protected]

[email protected]

http://github.com/apache/incubator-geode

72© Copyright 2015 Pivotal. All rights reserved. 72© Copyright 2013 Pivotal. All rights reserved.

Bonus Content

73© Copyright 2015 Pivotal. All rights reserved. 73© Copyright 2015 Pivotal. All rights reserved.

Design• Persistence

• Asynchronous Events

• Client Subscriptions

• PDX Serialization

74© Copyright 2015 Pivotal. All rights reserved.

Asynchronous Events – Design Goals

High availability

Low latency, high throughput

Deliver events to a receiver without impacting the write path

75© Copyright 2015 Pivotal. All rights reserved.

Member 3

Member 1

Serial Queues

Member 2

LOL!!put

Primary Queue

Secondary Queue

Enqueue

sup

LOL!!

supsup

LOL!!

Replicate

76© Copyright 2015 Pivotal. All rights reserved.

Member 3

Member 1

Member 2

Serial Queues

sup

LOL!!

sup

LOL!!

AsyncEventListener { processBatch()}

LOL!!

sup

Dispatch Events from Primary

77© Copyright 2015 Pivotal. All rights reserved.

put

Secondary Queue(Partition 1)

LOL!!

sup

sup

LOL!!

Primary Queue (Partition 2)

Parallel Queues

Primary Queue (Partition 1)

LOL!!

sup

sup

LOL!!

Word

Word

78© Copyright 2015 Pivotal. All rights reserved.

LOL!!

sup

sup

LOL!!

Parallel Queues

LOL!!

sup

sup

LOL!!

Word

WordAsyncEventListener { processBatch()}

AsyncEventListener { processBatch()}

ParallelDispatch

79© Copyright 2015 Pivotal. All rights reserved. 79© Copyright 2013 Pivotal. All rights reserved.

Design• Persistence

• Asynchronous Events

• Client Subscriptions

• PDX Serialization

80© Copyright 2015 Pivotal. All rights reserved.

“There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one

errors.”

– the internet

81© Copyright 2015 Pivotal. All rights reserved.

Client driver

Intelligent – understands data distribution for single hop network access

Caching – can be configured to locally cache data for even faster access

Events – registerInterest() in keys to receive push notifications

82© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Useful for refreshing client cache, pushing events (“topics”) to multiple clients

Highly available & scalable via in-memory replicated queues

Events are ordered, at-least-once delivery

Durable subscriptions, conflation optional

83© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Member A Member B

region.registerInterest(“Fred”);

Client

Client

84© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Member A Member B

Person p1 = …region.put(“Fred”, p1);

region.registerInterest(“Fred”);

Client

Client

85© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Member A Member B

Person p1 = …region.put(“Fred”, p1);

region.registerInterest(“Fred”);

Client

Client

86© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Member A Member B

Person p1 = …region.put(“Fred”, p1);

region.registerInterest(“Fred”);

Client

Client

87© Copyright 2015 Pivotal. All rights reserved.

Client subscriptions

Member A Member B

Person p1 = …region.put(“Fred”, p1);

region.registerInterest(“Fred”);

Client

Client