geode - day 2
Day 2
Swapnil Bawaskar @sbawaskar
(incubating)
• Clients - in depth
• Function Execution
• Serialization
• Transactions
• Data Colocation
Agenda
• Client
• A process (Your Application) connected to the Geode server(s)
• Can maintain near cache
• Run OQL queries on local data
• Can be notified about events on the servers
Concepts
[Diagram: an Application with a Client Cache, connected to Regions on a GemFire Server]
• Client Notifications
• Register Interest
• Individual Keys OR RegEx for Keys
• Updates Local Copy
• Examples:
• region.registerInterest("key-1");
• region1.registerInterestRegex("[a-z]+");
• Continuous Query
• Receive Notification when Query condition met on server
• Example:
• SELECT * FROM /tradeOrder t WHERE t.price > 100.00
• Can be DURABLE
Concepts
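The two notification styles above can be modeled in a few lines of plain Java. This is a sketch only and does not use the Geode API: the class and method names are hypothetical, and it assumes that a regex interest applies to keys matching the whole pattern and that a continuous query notifies when an updated value satisfies the WHERE clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java model of interest registration and a continuous query; not the Geode API.
class ClientNotificationModel {
    // Assumption: a regex interest covers keys that match the whole pattern.
    static boolean matchesInterest(String regex, String key) {
        return Pattern.matches(regex, key);
    }

    // Models the WHERE clause of: SELECT * FROM /tradeOrder t WHERE t.price > 100.00
    static boolean matchesQuery(double price) {
        return price > 100.00;
    }

    // Returns the keys of the updates that would trigger a notification.
    static List<String> notifiedKeys(String[] keys, double[] prices) {
        List<String> notified = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            if (matchesQuery(prices[i])) {
                notified.add(keys[i]);
            }
        }
        return notified;
    }
}
```

Under the full-match assumption, the pattern [a-z]+ covers a key such as "abc" but not "key-1", because of the hyphen and the digit.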
• Functions
• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Push code vs pull data
• Run in Server’s process
• Highly available
• Member oriented
Concepts
[Diagram: Submit (f1); Execute Functions f1, f2, … fn]
Concepts
[Diagram: a Client Region calls FunctionService.onRegion.withFilter.execute with filter = Keys X, Y. Inside the Server Distributed System, Partitioned Region Data Accessors forward the execute call to the Partitioned Region Data Stores (X, Y and Z are shown), and the results flow back to ResultCollector.getResult. The steps are numbered 1 to 6.]
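The routing idea behind a filtered function call can be sketched in plain Java. This is a model, not the Geode API: the class name, the hash-based routing rule and the "sum" function are all assumptions made for illustration. It shows the function running only on the members that own a filter key, with the partial results combined the way a result collector would combine them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java model of "push code to the data"; not the Geode API.
class FunctionRoutingModel {
    // Each "member" holds the entries it owns.
    private final List<Map<String, Integer>> members = new ArrayList<>();

    FunctionRoutingModel(int memberCount) {
        for (int i = 0; i < memberCount; i++) {
            members.add(new HashMap<String, Integer>());
        }
    }

    // Hypothetical routing rule: the key's hash picks the owning member.
    int memberFor(String key) {
        return Math.floorMod(key.hashCode(), members.size());
    }

    void put(String key, int value) {
        members.get(memberFor(key)).put(key, value);
    }

    // Groups the filter keys by owning member, runs the function on each of
    // those members against local data only, then adds up the partial results.
    int sumOnRegion(Set<String> filter) {
        Map<Integer, Set<String>> keysByMember = new TreeMap<>();
        for (String key : filter) {
            int member = memberFor(key);
            if (!keysByMember.containsKey(member)) {
                keysByMember.put(member, new HashSet<String>());
            }
            keysByMember.get(member).add(key);
        }
        int total = 0;
        for (Map.Entry<Integer, Set<String>> e : keysByMember.entrySet()) {
            total += executeLocally(members.get(e.getKey()), e.getValue());
        }
        return total;
    }

    // The "function": sums the values of the given keys found in local data.
    static int executeLocally(Map<String, Integer> localData, Set<String> keys) {
        int partial = 0;
        for (String key : keys) {
            Integer value = localData.get(key);
            if (value != null) {
                partial += value;
            }
        }
        return partial;
    }
}
```

Members that own none of the filter keys are never visited, which is the point of pushing code instead of pulling data.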
• Functions
• Listeners
• CacheWriter / CacheListener
• AsyncEventListener (queue / batch)
• Parallel or Serial
• Conflation
Concepts
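Queueing, batching and conflation for an asynchronous listener can be illustrated with a small plain-Java model. It is a sketch under assumptions, not Geode's queue implementation: the class name is hypothetical, and conflation is modeled as "a newer value for a key replaces the value already waiting in the queue".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a conflating, batching event queue; not the Geode API.
class ConflatingQueueModel {
    // Insertion-ordered map: re-putting a key keeps its position and replaces its value.
    private final LinkedHashMap<String, String> pending = new LinkedHashMap<>();
    private final int batchSize;

    ConflatingQueueModel(int batchSize) {
        this.batchSize = batchSize;
    }

    // Conflation: only the latest value per key stays queued.
    void enqueue(String key, String value) {
        pending.put(key, value);
    }

    // Removes and returns up to batchSize events, oldest key first.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        while (it.hasNext() && batch.size() < batchSize) {
            Map.Entry<String, String> e = it.next();
            batch.add(e.getKey() + "=" + e.getValue());
            it.remove();
        }
        return batch;
    }
}
```

With a batch size of 2, enqueueing k1=v1, k2=v2, k1=v3, k3=v4 delivers a first batch holding k1=v3 and k2=v2, then a second batch holding k3=v4; the intermediate k1=v1 is never delivered.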
Fixed or flexible schema?
id name age pet_id
or
{id:1,name:"Fred",age:42,pet:{name:"Barney",type:"dino"}}
C#, C++, Java, JSON
No IDL, no schemas, no hand-coding
Schema evolution (Forward and Backward Compatible)
* domain object classes not required
| header | data |
| pdx | length | dsid | typeid | fields | offsets |
Portable Data eXchange
Efficient for queries
{id:1,name:"Fred",age:42,pet:{name:"Barney",type:"dino"}}
SELECT p.name FROM /Person p WHERE p.pet.type = "dino"
single field deserialization
But how fast is it?
Benchmark: https://github.com/eishay/jvm-serializers
Schema evolution
Member A Member B
Distributed Type Definitions
v2 v1
Application #1
Application #2
v2 objects preserve data from missing fields
v1 objects use default values to fill in new fields
PDX provides forwards and backwards compatibility, no code required
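The two compatibility rules above can be modeled in plain Java. This is not the PDX wire format or the Geode API: the serialized form is stood in for by a field map, and PersonV1, PersonV2 and the added age field are hypothetical. The sketch shows an older reader carrying unknown fields through a read and write cycle, and a newer reader filling a missing field with a default.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of forward and backward compatibility; not the PDX format.
class SchemaEvolutionModel {
    // v1 of a hypothetical Person type knows only id and name.
    static class PersonV1 {
        int id;
        String name;
        // Fields written by a newer version that this version does not read.
        Map<String, Object> unread = new LinkedHashMap<>();

        static PersonV1 read(Map<String, Object> fields) {
            PersonV1 p = new PersonV1();
            Map<String, Object> rest = new LinkedHashMap<>(fields);
            p.id = (Integer) rest.remove("id");
            p.name = (String) rest.remove("name");
            p.unread = rest; // preserved so a later write does not lose them
            return p;
        }

        Map<String, Object> write() {
            Map<String, Object> out = new LinkedHashMap<>();
            out.put("id", id);
            out.put("name", name);
            out.putAll(unread);
            return out;
        }
    }

    // v2 adds an age field.
    static class PersonV2 {
        int id;
        String name;
        int age;

        static PersonV2 read(Map<String, Object> fields) {
            PersonV2 p = new PersonV2();
            p.id = (Integer) fields.get("id");
            p.name = (String) fields.get("name");
            Object age = fields.get("age");
            p.age = (age == null) ? 0 : (Integer) age; // default for data written by v1
            return p;
        }
    }
}
```

A v1 application that reads a v2 object, changes the name and writes it back still emits the age field; a v2 application that reads a v1 object sees age as 0.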
• Persistent Regions
• Durability
• WAL for efficient writing
• Consistent recovery
• Value always one disk seek away
• Compaction
Concepts
[Diagram: Member 1 appends operations to oplog files. Modify k1->v5, Create k6->v6, Create k2->v2 and Create k4->v4 are shown with Oplog2.crf; a Put k4->v7 on the region is recorded as Modify k4->v7 in Oplog3.crf.]
[Diagram: Server 1 through Server N, each a JVM whose Cache holds a Region (java.util.Map) with Key/Value entries K01 = May, K02 = Tim.]
Persistence - Shared Nothing
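The append-only write path, recovery and compaction can be sketched with a plain-Java model. It is an illustration only, not Geode's oplog format: the class name and record layout are hypothetical, and a null value is assumed to stand for a destroy. Writes append to the log, recovery replays the log in order, and compaction rewrites the log so that only live entries remain.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of an append-only operation log; not Geode's oplog format.
class OplogModel {
    static final class Record {
        final String key;
        final String value; // null models a destroy

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<Record> log = new ArrayList<>();

    // Every create, modify or destroy is appended; nothing is updated in place.
    void append(String key, String value) {
        log.add(new Record(key, value));
    }

    // Recovery replays the log in order; the last record for a key wins.
    Map<String, String> recover() {
        Map<String, String> region = new LinkedHashMap<>();
        for (Record r : log) {
            if (r.value == null) {
                region.remove(r.key);
            } else {
                region.put(r.key, r.value);
            }
        }
        return region;
    }

    // Compaction keeps one record per live key and returns how many records were dropped.
    int compact() {
        Map<String, String> live = recover();
        int before = log.size();
        log.clear();
        for (Map.Entry<String, String> e : live.entrySet()) {
            log.add(new Record(e.getKey(), e.getValue()));
        }
        return before - log.size();
    }

    int size() {
        return log.size();
    }
}
```

Replaying Create k2->v2, Create k4->v4, Modify k4->v7 recovers k4 as v7, and compaction then drops the superseded k4->v4 record.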
[Diagram: Server 1, Server 2 and Server 3 hold buckets B1, B2 and B3, each bucket with a Primary and a Secondary copy]
Server 1 waits for others when it starts
Fetches missed operations on restart
Concepts - HA
CacheTransactionManager mgr = cache.getCacheTransactionManager();
mgr.begin();
region1.put("K1", "V1");
region1.put("K2", "V2");
region2.put("K2", "V2");
mgr.commit();
Transaction - API
TransactionId txId = mgr.suspend();
// … other non-transactional work
mgr.resume(txId);
mgr.tryResume(…);

region.putIfAbsent(K, V);
region.replace(K, V, V);
region.remove(K, V);
Single Entry
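The three single-entry calls are the atomic operations of java.util.concurrent.ConcurrentMap, which Geode's Region interface extends. Their semantics can therefore be tried out with a ConcurrentHashMap standing in for a region, as in this sketch; the class name and values are made up for illustration, and nothing here touches a real cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// A ConcurrentHashMap stands in for a region; not a Geode cache.
class SingleEntryOpsDemo {
    static ConcurrentMap<String, String> run() {
        ConcurrentMap<String, String> region = new ConcurrentHashMap<>();
        region.putIfAbsent("K", "V1");   // stores V1: the key was absent
        region.putIfAbsent("K", "V2");   // no effect: the key is present
        region.replace("K", "V1", "V3"); // succeeds: current value is V1
        region.replace("K", "V1", "V4"); // no effect: current value is V3
        region.remove("K", "V1");        // no effect: current value is V3
        return region;
    }
}
```

Each call checks its condition and applies its change atomically for that one entry, so no explicit transaction is needed for compare-and-set style updates.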
• Repeatable Read
• Thread sees own changes
• Other threads do not until commit is called
• Optimistic
• No Entry level locks, Readers not blocked
• Conflict Detection
• NOT Persistent (yet)
• Not a problem if you have at least one member up
• Data must be colocated
Transaction - Semantics
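The optimistic semantics listed above can be modeled in plain Java. This is a sketch, not Geode's implementation: the class names are hypothetical, conflict detection is assumed to work by comparing per-key versions at commit time, and commit returns false where Geode would raise a commit conflict. A transaction buffers its writes, sees its own changes, takes no entry locks, and fails at commit if another transaction changed a key it touched.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of optimistic transactions; not the Geode implementation.
class OptimisticTxModel {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    // Non-transactional read of committed state.
    synchronized String read(String key) {
        return values.get(key);
    }

    private synchronized long versionOf(String key) {
        Long v = versions.get(key);
        return v == null ? 0L : v;
    }

    Tx begin() {
        return new Tx();
    }

    class Tx {
        private final Map<String, Long> seenVersions = new HashMap<>();
        private final Map<String, String> snapshot = new HashMap<>();
        private final Map<String, String> writes = new HashMap<>();

        // Records the version and value of a key the first time the transaction touches it.
        private void touch(String key) {
            synchronized (OptimisticTxModel.this) {
                if (!seenVersions.containsKey(key)) {
                    seenVersions.put(key, versionOf(key));
                    snapshot.put(key, read(key));
                }
            }
        }

        // Repeatable read: own writes first, else the value seen on first access.
        String get(String key) {
            touch(key);
            return writes.containsKey(key) ? writes.get(key) : snapshot.get(key);
        }

        // Writes are buffered and invisible to others until commit.
        void put(String key, String value) {
            touch(key);
            writes.put(key, value);
        }

        // Returns false on conflict, i.e. if any touched key changed since it was first seen.
        boolean commit() {
            synchronized (OptimisticTxModel.this) {
                for (Map.Entry<String, Long> seen : seenVersions.entrySet()) {
                    if (versionOf(seen.getKey()) != seen.getValue()) {
                        return false;
                    }
                }
                for (Map.Entry<String, String> w : writes.entrySet()) {
                    values.put(w.getKey(), w.getValue());
                    versions.put(w.getKey(), versionOf(w.getKey()) + 1);
                }
                return true;
            }
        }
    }
}
```

If two transactions both put K1 before either commits, the first commit succeeds and the second one reports a conflict.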
• Inspiration
• "For scalability, applications should manipulate single collection of data that lives on one JVM" (Pat Helland, Life Beyond Distributed Transactions)
• Custom Partitioning
• Within one Partitioned Region
• E.g. All trades in January
• Data Colocation
• Between Two or more Partitioned Regions
• All Orders of a Customer
• Implement a PartitionResolver
Concepts - Data Colocation
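The effect of a partition resolver can be sketched in plain Java. This is a model of the routing idea, not Geode's PartitionResolver interface: the class names, the OrderKey type and the bucket count are assumptions made for illustration. If every order key hands back its customer id as the routing object, all orders of a customer hash to the same bucket as the customer entry itself.

```java
// Plain-Java model of routing by a shared routing object; not the Geode API.
class ColocationModel {
    // Illustrative bucket count; an assumption for this sketch.
    static final int BUCKETS = 113;

    // The bucket is chosen from the routing object, not from the full key.
    static int bucketFor(Object routingObject) {
        return Math.floorMod(routingObject.hashCode(), BUCKETS);
    }

    // Hypothetical order key whose routing object is the owning customer's id.
    static final class OrderKey {
        final String customerId;
        final int orderId;

        OrderKey(String customerId, int orderId) {
            this.customerId = customerId;
            this.orderId = orderId;
        }

        Object routingObject() {
            return customerId;
        }
    }

    static int bucketForCustomer(String customerId) {
        return bucketFor(customerId);
    }

    static int bucketForOrder(OrderKey key) {
        return bucketFor(key.routingObject());
    }
}
```

Because the customer and all of its orders share a bucket, a transaction or function that touches them can run on a single member.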
Hands On
• Serialize Teeny using PDX
• Trigger a CQ when popularity > 5
• Implement function to find the most popular domain
Hands On