Work With Hundreds of Hot Terabytes in JVMs
Per Minborg, CTO, Speedment, Inc.
Do not Cross the Brook for Water
Why would you use a slow remote database when you can have all your data available directly in your JVMs, ready for concurrent low-latency access?
Scenario
>10 TB
Application
In-JVM-Cache
Fraud Detection, Credit Card Company, Web Shop, Stock Trade, Bank, etc.
Source of Truth
Table of Contents
• Two Important Aspects of Big Data Latency
• Cache synchronization strategies
• How can you have JVMs that are in the TBs?
• Speedment SQL Reflector
Two Important Aspects of Big Data Latency
• No matter how advanced a database you use, it is data locality that really counts
• Be aware of the big change in memory pricing
Compare Latencies Using the Speed of Light
During the time a database makes a 1 s query, how far will the light move?
Latencies compared:
• Database
• Disk seek
• Intra-data center TCP
• SSD
• Main Memory
• CPU L3 cache
• CPU L2 cache
• CPU L1 cache
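The comparison can be worked out as simple arithmetic: distance equals the speed of light times the latency. The sketch below does this in Java. Only the 1 s database query comes from the slides; every other latency is an assumed, commonly quoted order of magnitude, and the class and method names are made up for illustration.

```java
final class LightDistance {
    // Speed of light in vacuum in metres per second (exact by SI definition).
    static final double SPEED_OF_LIGHT_M_PER_S = 299_792_458.0;

    /** Distance in metres that light travels during the given latency in nanoseconds. */
    static double metresTravelled(double latencyNanos) {
        return SPEED_OF_LIGHT_M_PER_S * latencyNanos / 1e9;
    }

    public static void main(String[] args) {
        // Only the 1 s database query is taken from the slides.
        // All other latencies are ASSUMED rough orders of magnitude.
        String[] names = {
            "Database query (1 s)", "Disk seek", "Intra-data center TCP", "SSD",
            "Main memory", "CPU L3 cache", "CPU L2 cache", "CPU L1 cache"
        };
        double[] nanos = {
            1e9, 10e6, 500e3, 150e3,
            100, 15, 7, 0.5
        };
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-24s %,20.2f m%n", names[i], metresTravelled(nanos[i]));
        }
    }
}
```

With these assumptions, light covers roughly 300,000 km during the 1 s query, but only a fraction of a metre during an L1 cache access.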
“Back to the Future”: How much does 1 GB cost?
Cost of 1 GB RAM - Back to the Future
$ 5
$ 0.04
$ 720,000
$ 67,000,000,000
Source: http://www.jcmit.com/memoryprice.htm
Conclusion
• Keep your data close
• RAM is close enough, cheap and getting even cheaper
Cache Synchronization Strategies
Common ways:
Poll Caching
• Data is evicted, refreshed or marked as old
• Evicted elements are reloaded
• Data changes all the time
• A system restart either warms up the cache or uses a cold cache
Dump and Load Caching
• Dumps are reloaded periodically
• All data elements are reloaded
• Data remains unchanged between reloads
• A system restart is just a reload
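To make the poll caching bullets concrete, here is a minimal sketch of a cache that treats entries older than a maximum age as evicted and reloads them from the source of truth on the next lookup. The class name `PollCache`, the loader function and the injectable clock are all assumptions made for illustration; they are not taken from any particular product.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class PollCache<K, V> {
    // A cached value together with the time it was loaded.
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;
        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // hypothetical call to the source of truth
    private final long maxAgeMillis;       // the "eviction time"
    private final LongSupplier clock;      // injectable so the behaviour can be tested

    PollCache(Function<K, V> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if it is missing or too old. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) ->
                (old == null || now - old.loadedAtMillis >= maxAgeMillis)
                        ? new Entry<>(loader.apply(k), now)   // cache miss: slow path
                        : old);                               // cache hit: fast path
        return entry.value;
    }
}
```

A cache created with `new PollCache<>(db::load, 60_000, System::currentTimeMillis)` (where `db::load` stands for any loader) can serve data up to one minute old, and every miss pays the full cost of the remote lookup.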
Cache Synchronization Strategies
The Speedment way:
Reactive Persistent Caching
• Changed data is captured in the database
• Changed data events are pushed into the cache
• Events are grouped in transactions
• Cache updates are persisted
• Data changes all the time
• On system restart, the missed events are replayed
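The reactive approach can be sketched as a cache that applies whole transactions of change events and remembers the last transaction it has seen, so that a replay after a restart skips what was already applied. This is an illustrative sketch only, not Speedment's implementation: the types `ReactiveCache`, `Change` and `Transaction` are hypothetical, transaction ids are assumed to increase monotonically, and persisting the cache itself is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReactiveCache {
    enum Op { INSERT, UPDATE, DELETE }

    // One captured row change (hypothetical shape).
    static final class Change {
        final Op op;
        final String key;
        final String value;   // ignored for DELETE
        Change(Op op, String key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    // Changes that were committed together in the database.
    static final class Transaction {
        final long id;        // ASSUMED to increase with commit order
        final List<Change> changes;
        Transaction(long id, List<Change> changes) {
            this.id = id;
            this.changes = changes;
        }
    }

    private final Map<String, String> data = new HashMap<>();
    private long lastAppliedTxId = 0;

    /** Applies all changes of a transaction as one unit; already seen transactions are skipped. */
    synchronized void apply(Transaction tx) {
        if (tx.id <= lastAppliedTxId) {
            return;           // makes replaying after a restart idempotent
        }
        for (Change c : tx.changes) {
            if (c.op == Op.DELETE) {
                data.remove(c.key);
            } else {
                data.put(c.key, c.value);
            }
        }
        lastAppliedTxId = tx.id;
    }

    synchronized String get(String key) {
        return data.get(key);
    }

    synchronized long lastAppliedTxId() {
        return lastAppliedTxId;
    }
}
```

Because readers and `apply` synchronize on the same object, a lookup never observes half of a transaction, and lookups never touch the database.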
Comparison
 | Dump and Load Caching | Poll Caching | Reactive Persistent Caching
Max Data Age | Dump period | Eviction time | Replication latency (ms)
Lookup Performance | Consistently instant | ~20% slow | Consistently instant
Consistency | Eventually consistent | Inconsistent (stale data) | Eventually consistent
Database Cache Update Load | Total size | Depends on eviction time and access pattern | Rate of change
Restart | Complete reload | Eviction time | Down time update rate -> 10% of down time
Conventional Java Applications
• Java objects are garbage collected periodically
• Garbage collection time increases with Java heap size
• Garbage collection time increases with Java heap mutation rate
• “The app has hit the GC wall”
• Hard to meet reasonable SLAs with JVMs larger than about 16 GB
• 10 TB of data and 10 GB JVMs -> ~1000 JVMs
Hazelcast High-Density Memory Store (HDMS)
• Stores data outside of the Java heap
• The Garbage Collector does not see the HDMS content
• Scales up to terabytes of main memory in a single JVM
• Use any number of nodes
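The general idea of keeping data outside the Java heap can be illustrated with the standard `java.nio.ByteBuffer.allocateDirect` call. The sketch below is not how Hazelcast HDMS is implemented; it is a small assumed example (the class `OffHeapLongArray` is made up) showing that the values live in native memory, while only a small buffer object stays on the heap for the garbage collector to track.

```java
import java.nio.ByteBuffer;

final class OffHeapLongArray {
    private final ByteBuffer buffer;   // the backing memory is allocated outside the Java heap
    private final int capacity;

    OffHeapLongArray(int capacity) {
        this.capacity = capacity;
        // A single direct buffer is limited to about 2 GB; larger stores need several buffers.
        this.buffer = ByteBuffer.allocateDirect(Math.multiplyExact(capacity, Long.BYTES));
    }

    void set(int index, long value) {
        buffer.putLong(offsetOf(index), value);
    }

    long get(int index) {
        return buffer.getLong(offsetOf(index));
    }

    // Translates an element index into a byte offset, rejecting out-of-range indexes.
    private int offsetOf(int index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return index * Long.BYTES;
    }
}
```

The garbage collector never scans the stored values, so adding more data of this kind does not lengthen heap collections in the way that ordinary Java objects would.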
Hot Restart Persistence
• Persists the data in the Hazelcast maps to a file
• Operations are appended to the file
• SSD backing device recommended
• 1.3 GB/s reload per node
  • 10 GB in 6 s
  • 100 GB in 1 min
  • 1 TB in 10 min
• 6.5 GB/s reload in a system with 10 nodes (1 active and 1 backup)
  • 10 GB in 1 s
  • 100 GB in 12 s
  • 1 TB in 2 min
• 65 GB/s reload in a system with 100 nodes
  • 1 TB in 12 s
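One way to read the cluster figures is that every entry is stored twice (1 active copy and 1 backup), so only half of the raw reload bandwidth restores unique data. That interpretation is an assumption, not something the slides state explicitly. Under it, the aggregate rates follow from the per-node rate as in this small sketch (the class and method names are made up):

```java
final class ReloadThroughput {
    /**
     * Aggregate reload rate of unique data in GB/s, ASSUMING each entry is stored
     * copiesPerEntry times across the cluster (for example 1 active + 1 backup = 2).
     */
    static double aggregateGbPerSecond(int nodes, double perNodeGbPerSecond, int copiesPerEntry) {
        return nodes * perNodeGbPerSecond / copiesPerEntry;
    }

    public static void main(String[] args) {
        double perNode = 1.3;   // GB/s per node, from the slide
        System.out.println("10 nodes:  " + aggregateGbPerSecond(10, perNode, 2) + " GB/s");
        System.out.println("100 nodes: " + aggregateGbPerSecond(100, perNode, 2) + " GB/s");
    }
}
```

This reproduces the 6.5 GB/s and 65 GB/s figures; the reload times quoted on the slide appear to be rounded estimates on top of these rates.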
Compressed Oops in Java 8 (36 bits)
• Uses the default -XX:+UseCompressedOops together with -XX:ObjectAlignmentInBytes=16 (the default alignment is 8 bytes)
• A 64-bit JVM can use “compressed” memory references
• This allows the heap to be up to 64 GB without the overhead of 64-bit object references
• As all objects must be 8- or 16-byte aligned, the lower 3 or 4 bits of the address are always zero and do not need to be stored. With 16-byte alignment, a reference can address 4 billion * 16 bytes, or 64 GB
• Uses 32-bit references
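The arithmetic behind compressed references can be checked with a few lines of Java. The sketch below models a zero-based heap only (a real JVM may also add a heap base address), and the class and method names are illustrative rather than part of any JVM API.

```java
final class CompressedOops {
    /** Bytes addressable by a 32-bit reference when every object is aligned to alignmentBytes. */
    static long maxAddressableBytes(int alignmentBytes) {
        return (1L << 32) * alignmentBytes;
    }

    /** Number of low address bits that are always zero for the given power-of-two alignment. */
    static int shiftFor(int alignmentBytes) {
        return Integer.numberOfTrailingZeros(alignmentBytes);
    }

    /** Drops the always-zero low bits so that the address fits in 32 bits. */
    static int compress(long address, int alignmentBytes) {
        return (int) (address >>> shiftFor(alignmentBytes));
    }

    /** Restores the full address by treating the reference as unsigned and shifting it back. */
    static long decompress(int reference, int alignmentBytes) {
        return (reference & 0xFFFF_FFFFL) << shiftFor(alignmentBytes);
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        System.out.println(maxAddressableBytes(8) / gib);    // 32 (GiB) with 8-byte alignment
        System.out.println(maxAddressableBytes(16) / gib);   // 64 (GiB) with 16-byte alignment
    }
}
```

With 16-byte alignment the shift is 4 bits, which gives the 32 + 4 = 36 bits of effective address space mentioned in the slide title.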
Scenario
>10 TB
Application
In-JVM-Cache
Fraud Detection, Credit Card Company, Web Shop, Stock Trade, Bank, Back Testing
Source of Truth
What is Speedment?
• Database SQL Reflector
• Code generation tool -> Automatic domain model extraction from databases
• In-JVM-memory technology
• Pluggable storage engines (ConcurrentHashMap, OffHeapConcurrentHashMap, Hazelcast, etc.)
• Transaction-aware
Speedment SQL Reflector
• Detects changes in a database
• Buffers the changes
• Can replay the changes later on
• Will preserve order
• Will preserve transactions
• Sees data as it was persisted
• Detects changes from any source
Database: INSERT, UPDATE, DELETE
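The buffering and replay behaviour can be pictured as an append-only log that hands changes back in their original order, grouped by transaction. The sketch below is a simplified illustration under stated assumptions, not the SQL Reflector's actual design: the `ChangeBuffer` and `Change` types are hypothetical, and the changes of one transaction are assumed to be appended contiguously in commit order.

```java
import java.util.ArrayList;
import java.util.List;

final class ChangeBuffer {
    enum Op { INSERT, UPDATE, DELETE }

    // One detected row change, tagged with the transaction it belongs to (hypothetical shape).
    static final class Change {
        final long txId;
        final Op op;
        final String key;
        Change(long txId, Op op, String key) {
            this.txId = txId;
            this.op = op;
            this.key = key;
        }
    }

    private final List<Change> log = new ArrayList<>();

    /** Buffers a change and returns the log position that follows it. */
    synchronized int append(Change change) {
        log.add(change);
        return log.size();
    }

    /**
     * Replays everything from the given log position, in original order,
     * with consecutive changes of the same transaction grouped together.
     */
    synchronized List<List<Change>> replayFrom(int position) {
        List<List<Change>> transactions = new ArrayList<>();
        List<Change> current = null;
        for (Change c : log.subList(position, log.size())) {
            if (current == null || current.get(0).txId != c.txId) {
                current = new ArrayList<>();   // a new transaction starts here
                transactions.add(current);
            }
            current.add(c);
        }
        return transactions;
    }
}
```

A consumer that was offline only needs to remember the last position it processed and call `replayFrom` with it to catch up.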
Download Trial @ www.speedment.com
1. Connect to Your Existing SQL DB
2. Automatic Schema Analysis
3. Just Push Play…
Super Easy Integration
Config hcConfig = new Config();
// Add optimized serialization Factories for Hazelcast
// that are generated automatically
SpeedmentHazelcastConfig.addTo(hcConfig);
HazelcastInstance hcInstance = Hazelcast.newHazelcastInstance(hcConfig);
// Tell Speedment what Hazelcast instance to use
ProjectManager.getInstance().putProperty(HazelcastInstance.class, hcInstance);
// Automatically build all Database metadata (e.g. Schema, Tables and Columns)
new TreeBuilder().build();
// Load selected data from the database into the Hazelcast maps and start tracking DB
ProjectManager.getInstance().init();
Licenses and Services
• Free trial (60 days)
• Speedment’s and Hazelcast’s Enterprise licenses are aligned
• Start-up package with Speedment experts
• Project consultants
Thank you!
@Speedment
www.speedment.com