Scaling Your Cache

Transcript
Page 1: Scaling Your Cache

Scaling Your Cache

Alex Miller (@puredanger)

Page 2: Scaling Your Cache

Mission

• Why does caching work?

• What’s hard about caching?

• How do we make choices as we design a caching architecture?

Page 3: Scaling Your Cache

What is caching?

Page 4: Scaling Your Cache

Lots of data

Page 5: Scaling Your Cache

Memory Hierarchy

Register

L1 cache

L2 cache

RAM

Disk

Remote disk

Clock cycles to access (approximate):

Register: 1

L1 cache: 3

L2 cache: 15

RAM: 200

Disk: 10,000,000

Remote disk: 1,000,000,000

Page 6: Scaling Your Cache

Facts of Life

The memory hierarchy runs from fast, small, and expensive to slow, big, and cheap:

Register → L1 Cache → L2 Cache → Main Memory → Local Disk → Remote Disk

Page 7: Scaling Your Cache

Caching to the rescue!

Page 8: Scaling Your Cache

Temporal Locality

[Chart: a request stream played against an empty cache. Hits: 0%]

Page 9: Scaling Your Cache

Temporal Locality

[Chart: the same request stream replayed. With no items cached, hits are 0%; with recently used items kept in the cache, hits reach 65%.]

Page 10: Scaling Your Cache

Non-uniform distribution

[Chart: web page hits, ordered by rank. Pageviews per rank fall off steeply, and the cumulative percentage of total hits climbs quickly: a few top-ranked pages account for most of the traffic.]
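The long-tail shape described above is often modeled as a Zipf distribution. As a rough sketch (assuming an idealized Zipf distribution with exponent 1, which only approximates real traffic; the class and numbers here are illustrative, not the slide's data), the share of hits captured by caching only the top-ranked pages can be estimated:

```java
// Estimate the fraction of hits captured by caching the k most popular
// of n pages, assuming Zipf(1)-distributed popularity:
// share(k) = H_k / H_n, where H_m is the m-th harmonic number.
public class ZipfHitShare {
    static double harmonic(int m) {
        double h = 0.0;
        for (int i = 1; i <= m; i++) h += 1.0 / i;
        return h;
    }

    public static double share(int k, int n) {
        return harmonic(k) / harmonic(n);
    }

    public static void main(String[] args) {
        // A tiny fraction of pages captures a large share of hits.
        System.out.printf("top 17 of 1000 pages: %.0f%% of hits%n",
                100 * share(17, 1000));
    }
}
```

Under this model, caching under 2% of the pages captures roughly 46% of the hits; real traffic is often even more skewed.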

Page 11: Scaling Your Cache

Temporal locality + non-uniform distribution

Page 12: Scaling Your Cache

17,000 pageviews; assume avg page load = 250 ms

Cache 17 pages / 80% of views; cached page load = 10 ms

New avg load = 58 ms

Trade memory for latency reduction.
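The arithmetic behind those numbers is a weighted average of hit and miss latency; a minimal sketch (the class name is illustrative):

```java
// Expected page load time with a cache:
// hitRate * hitMs + (1 - hitRate) * missMs.
// With the slide's numbers: 0.8 * 10 + 0.2 * 250 = 58 ms.
public class CacheLatency {
    public static double expectedLoadMs(double hitRate, double hitMs, double missMs) {
        return hitRate * hitMs + (1 - hitRate) * missMs;
    }

    public static void main(String[] args) {
        System.out.printf("%.1f ms%n", expectedLoadMs(0.80, 10, 250)); // prints 58.0 ms
    }
}
```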

Page 13: Scaling Your Cache

The hidden benefit: reduces database load

[Chart: load on the database vs. the in-memory cache as traffic grows; caching keeps database load below the line of over-provisioning.]

Page 14: Scaling Your Cache

A brief aside...

• What is Ehcache?

• What is Terracotta?

Page 15: Scaling Your Cache

Ehcache Example

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("employees");
cache.put(new Element(employee.getId(), employee));
Element element = cache.get(employee.getId());

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>

Page 16: Scaling Your Cache

Terracotta

[Diagram: many app nodes, each connected to two Terracotta servers.]

Page 17: Scaling Your Cache

But things are not always so simple...

Page 18: Scaling Your Cache

Pain of Large Data Sets

• How do I choose which elements stay in memory and which go to disk?

• How do I choose which elements to evict when I have too many?

• How do I balance cache size against other memory uses?

Page 19: Scaling Your Cache

Eviction

When cache memory is full, what do I do?

• Delete - Evict elements

• Overflow to disk - Move to slower, bigger storage

• Delete local - But keep remote data
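The LRU policy named on these slides can be sketched in a few lines with java.util.LinkedHashMap in access order; this illustrates the policy only and is not Ehcache's implementation (the class name is made up):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: LinkedHashMap with accessOrder = true keeps
// entries ordered by recency of access, and removeEldestEntry evicts
// the least recently used entry once the cap is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        super(16, 0.75f, true); // true => access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

For example, in a two-entry cache, putting a and b, reading a, then putting c evicts b, because a was used more recently.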

Page 20: Scaling Your Cache

Eviction in Ehcache

Evict with “Least Recently Used” policy:

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>

Page 21: Scaling Your Cache

Spill to Disk in Ehcache

Spill to disk:

<diskStore path="java.io.tmpdir"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="true"
       maxElementsOnDisk="1000000"
       diskExpiryThreadIntervalSeconds="120"
       diskSpoolBufferSizeMB="30"/>

Page 22: Scaling Your Cache

Terracotta Clustering

Terracotta configuration:

<terracottaConfig url="server1:9510,server2:9510"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false">
  <terracotta/>
</cache>

Page 23: Scaling Your Cache

Pain of Stale Data

• How tolerant am I of seeing values changed on the underlying data source?

• How tolerant am I of seeing values changed by another node?

Page 24: Scaling Your Cache

Expiration

[Timeline 0–9: with TTI=4, an entry expires 4 ticks after its last access; with TTL=4, it expires 4 ticks after creation, regardless of access.]
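A sketch of those two clocks on a single entry (abstract ticks and an invented class, not Ehcache code): TTL runs from creation, TTI from last access, and the entry is expired when either runs out.

```java
// One cache entry with both expiration clocks from the slide:
// TTL (time to live) counts from creation, TTI (time to idle) counts
// from the last access; a read resets only the idle clock.
public class ExpiringEntry {
    final long created;
    long lastAccessed;
    final long tti;
    final long ttl;

    ExpiringEntry(long now, long tti, long ttl) {
        this.created = now;
        this.lastAccessed = now;
        this.tti = tti;
        this.ttl = ttl;
    }

    boolean isExpired(long now) {
        return (now - created) >= ttl || (now - lastAccessed) >= tti;
    }

    void touch(long now) {
        lastAccessed = now; // a read resets the idle clock, not the TTL clock
    }
}
```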

Page 25: Scaling Your Cache

TTI and TTL in Ehcache

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>

Page 26: Scaling Your Cache

Replication in Ehcache

<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
    properties="hostName=fully_qualified_hostname_or_ip,
                peerDiscovery=automatic,
                multicastGroupAddress=230.0.0.1,
                multicastGroupPort=4446,
                timeToLive=32"/>

<cache name="employees" ...>
  <cacheEventListenerFactory
      class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
      properties="replicateAsynchronously=true,
                  replicatePuts=true,
                  replicatePutsViaCopy=false,
                  replicateUpdates=true,
                  replicateUpdatesViaCopy=true,
                  replicateRemovals=true,
                  asynchronousReplicationIntervalMillis=1000"/>
</cache>

Page 27: Scaling Your Cache

Terracotta Clustering

Still use TTI and TTL to manage stale data between cache and data source

Coherent by default, but reads can be relaxed with coherentReads="false"

Page 28: Scaling Your Cache

Write-Through Caching (Ehcache 2.0)

• Keep the cache in sync with the database

• Cache write --> database write

• Ehcache 2.0 adds a new API:

• Cache.putWithWriter(...)

• CacheWriter

• write(Element)

• writeAll(Collection<Element>)

• delete(Object key)

• deleteAll(Collection<Object> keys)
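The write-through idea itself fits in a few lines. In this sketch the SystemOfRecord interface and class names are invented stand-ins; in Ehcache 2.0 the same role is played by the CacheWriter listed above.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal write-through sketch: every cache write is pushed to the
// backing store before the in-memory copy is updated, so cache and
// database cannot diverge. Not thread-safe; illustration only.
public class WriteThroughCache<K, V> {
    interface SystemOfRecord<K, V> { void write(K key, V value); }

    private final Map<K, V> memory = new HashMap<>();
    private final SystemOfRecord<K, V> database;

    public WriteThroughCache(SystemOfRecord<K, V> database) {
        this.database = database;
    }

    public void put(K key, V value) {
        database.write(key, value); // synchronous: caller waits for the store
        memory.put(key, value);
    }

    public V get(K key) {
        return memory.get(key);
    }
}
```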

Page 29: Scaling Your Cache

Write-Behind Caching (Ehcache 2.0)

• Allow database writes to lag cache updates

• Improves write latency

• Possibly reduces overall writes

• Use with read-through cache

• API same as write-through

• Other features:

• Batching, coalescing, rate limiting, retry
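How lagging writes can reduce total database traffic is easiest to see with a coalescing buffer; here is a minimal single-threaded sketch (invented names; Ehcache's real write-behind runs on a background thread with the batching, rate limiting, and retry listed above):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal write-behind sketch: puts update the cache immediately and
// queue the write; flush() later pushes the queue to the store in one
// batch. Repeated writes to the same key coalesce to the latest value.
public class WriteBehindCache<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final Map<K, V> pending = new LinkedHashMap<>(); // coalesces by key

    public void put(K key, V value) {
        memory.put(key, value);
        pending.put(key, value); // overwrites an unflushed write to the same key
    }

    // Drain pending writes to the store; returns the number of store writes.
    public int flush(Map<K, V> database) {
        int writes = pending.size();
        database.putAll(pending);
        pending.clear();
        return writes;
    }

    public V get(K key) {
        return memory.get(key);
    }
}
```

Three puts, two of them to the same key, reach the store as only two writes.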

Page 30: Scaling Your Cache

Pain of Loading

• How do I pre-load the cache on startup?

• How do I avoid re-loading the data on every node?

Page 31: Scaling Your Cache

Persistent Disk Store

<diskStore path="java.io.tmpdir"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="true"
       maxElementsOnDisk="1000000"
       diskExpiryThreadIntervalSeconds="120"
       diskSpoolBufferSizeMB="30"
       diskPersistent="true"/>

Page 32: Scaling Your Cache

Bootstrap Cache Loader

Bootstrap a new cache node from a peer:

<bootstrapCacheLoaderFactory
    class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"
    properties="bootstrapAsynchronously=true,
                maximumChunkSizeBytes=5000000"
    propertySeparator=","/>

On startup, create a background thread to pull the existing cache data from another peer.

Page 33: Scaling Your Cache

Terracotta Persistence

Nothing needed beyond setting up Terracotta clustering.

Terracotta will automatically bootstrap:
- the cache key set on startup
- cache values on demand

Page 34: Scaling Your Cache

Cluster Bulk Load (Ehcache 2.0)

• Terracotta clustered caches only

• Performance

• 6 nodes, 8 threads, 3M entries

• Coherent: 211.9 sec (1416 TPS)

• Bulk Load: 19.0 sec (15790 TPS) - 11.2x

Page 35: Scaling Your Cache

Cluster Bulk Load (Ehcache 2.0)

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("hotItems");

// Enter bulk loading mode for this node
cache.setCoherent(false);

// Bulk load
for (Item item : getItemHotList()) {
  cache.put(new Element(item.getId(), item));
}

// End bulk loading mode
cache.setCoherent(true);

Page 36: Scaling Your Cache

Pain of Concurrency

• Locking

• Transactions

Page 37: Scaling Your Cache

Thread safety

• Cache-level operations are thread-safe

• No public API for multi-key or multi-cache composite operations

• Provide external locking
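What "external locking" around a composite operation looks like, in miniature (plain maps stand in for caches; the names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Each cache operation is thread-safe on its own, but a read-then-write
// across keys or caches is not atomic. Wrapping the composite operation
// in an external lock restores atomicity.
public class CompositeOps {
    private static final ReentrantLock lock = new ReentrantLock();

    // Atomically move a value from one cache to another.
    public static <K, V> void move(Map<K, V> from, Map<K, V> to, K key) {
        lock.lock();
        try {
            V value = from.remove(key);
            if (value != null) {
                to.put(key, value);
            }
        } finally {
            lock.unlock();
        }
    }
}
```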

Page 38: Scaling Your Cache

BlockingCache

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("items");
manager.replaceCacheWithDecoratedCache(
    cache, new BlockingCache(cache));

• Cache decorator

• On cache miss, get()s block until someone writes the key to the cache

• Optional timeout

Page 39: Scaling Your Cache

SelfPopulatingCache

• Cache decorator, extends BlockingCache

• get() of unknown key will construct the entry using a supplied factory

• “Read through” caching

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("items");
manager.replaceCacheWithDecoratedCache(
    cache, new SelfPopulatingCache(cache, cacheEntryFactory));
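The read-through behavior can be imitated with ConcurrentHashMap.computeIfAbsent, which both builds a missing entry with a supplied factory and blocks concurrent callers asking for the same key. This is an analogy for SelfPopulatingCache, not its code (the class name is invented):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through map in miniature: a miss runs the factory once; other
// threads requesting the same key wait for its result.
public class ReadThroughMap<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    public ReadThroughMap(Function<K, V> factory) {
        this.factory = factory;
    }

    public V get(K key) {
        return map.computeIfAbsent(key, factory);
    }
}
```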

Page 40: Scaling Your Cache

JTA (Ehcache 2.0)

• Cache acts as XAResource

• Works with any JTA Transaction Manager

• Autodetects JBossTM, Bitronix, Atomikos

• Transactionally move data between database, cache, queue, etc.

Page 41: Scaling Your Cache

Pain of Duplication

• How do I get failover capability while avoiding excessive duplication of data?

Page 42: Scaling Your Cache

Partitioning + Terracotta Virtual Memory

• Each node (mostly) holds data it has seen

• Use a load balancer to get app-level partitioning

• Use fine-grained locking to get concurrency

• Use memory flush/fault to handle memory overflow and availability

• Use causal ordering to guarantee coherency

Page 43: Scaling Your Cache

Pain of Ignorance

• Is my cache being used? How much?

• Is caching improving latency?

• How much memory is my cache using?

Page 44: Scaling Your Cache

Dev Console

• Ehcache console

• See caches, configuration, statistics

• Dynamic configuration changes

• Hibernate console

• Hibernate-specific view of caches

• Hibernate stats

Page 45: Scaling Your Cache

Scaling Your Cache

Page 46: Scaling Your Cache

Scalability Continuum

[Diagram: scaling options ordered by number of JVMs, with causal ordering marked per stage:

1 JVM: Ehcache (causal ordering: no)

2 or more JVMs: Ehcache RMI (causal ordering: no)

2 or more big JVMs: Ehcache disk store (causal ordering: yes)

2 or more JVMs, more scale: Terracotta OSS (causal ordering: yes)

Lots of JVMs: Ehcache FX / Terracotta FX (causal ordering: yes)

Runtime management and control: Ehcache DX; Ehcache EX and FX.]

Page 47: Scaling Your Cache

Thanks!

• Twitter - @puredanger

• Blog - http://tech.puredanger.com

• Terracotta - http://terracotta.org

• Ehcache - http://ehcache.org

