Post on 25-Feb-2016

Page 1: Everything you wanted to know about Velocity (but were afraid to cache)

Everything you wanted to know about Velocity (but were afraid to cache)

Scott Colestock
[email protected]
Marcato Partners, LLC

Page 2: Everything you wanted to know about Velocity (but were afraid to cache)

About Scott

• Scott Colestock
• [email protected]
• Twitter: scolestock
• Marcato Partners (MarcatoPartners.com)
  – One of three partners
  – Focused on agile coaching
  – Focused on helping early-stage startup ventures in the mobile space

Page 3: Everything you wanted to know about Velocity (but were afraid to cache)

What is it?

Velocity is a distributed key/value cache that provides .NET developers with a way to increase performance and scalability when writing data-centric applications.

Page 4: Everything you wanted to know about Velocity (but were afraid to cache)

What is it? (2)

• The combined RAM available to all servers in a Velocity cluster is presented to Velocity clients as a unified whole
• Any serializable CLR object can be stored
  – Actual location within cluster is transparent
  – Client is a simple key/value API at heart
• Run as a service accessed across the network
• Additional servers can be added on demand

Page 5: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 6: Everything you wanted to know about Velocity (but were afraid to cache)

Motivation

• Data-centric applications have been the norm for a long while
  – Relational data
  – More recently, “service-obtained” data
• Velocity is about increasing performance by bringing the data physically closer to the consumer
  – Reduce pressure on underlying data stores/services
• Velocity can be about storing data in value-added form (logically closer to the consumer)
  – Object graphs
  – Output caching (not explicit in V1)
  – Aggregated data in xml or other transformed formats

Page 7: Everything you wanted to know about Velocity (but were afraid to cache)

Motivation (2)

• Databases are always a point of high contention as you scale out, and tuning is expensive
  – Are your data retrieval sprocs getting harder to maintain - excessive sql chops required?
• Service calls for reference data (internal/external) are often slow or intentionally throttled
• Caching has always been considered a solution for these issues…

Page 8: Everything you wanted to know about Velocity (but were afraid to cache)

Motivation (3)

• Machine-local caching solutions (like Microsoft’s “Enterprise Library Caching Application Block”) can provide a partial answer
  – Easy key/value API
  – Flexible store (memory, disk-backed, etc.)
  – Flexible expiration and eviction policy
• Limitations:
  – Limited by the memory available to a single node…
  – Application recycles typically mean you lose the cache
  – In a load-balanced environment, a large data set means you will frequently “miss” when attempting to load from cache…

Page 9: Everything you wanted to know about Velocity (but were afraid to cache)

Motivation (4)

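[Diagram: a load balancer in front of three machines, each with its own machine-local cache. The caches hold keys 3, 5, 23; keys 7, 11, 47; and keys 12, 16, 33 respectively.]

Machine-local caches wind up being sparsely populated when used with a load balancer (if the data set has many keys)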
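
The sparse-population effect can be illustrated with a small simulation. This is only a sketch in Python using plain in-memory sets, not the Velocity API; the workload (90 keys, each requested four times in a row) and the round-robin balancer are assumptions chosen to make the arithmetic easy to follow.

```python
from itertools import cycle


def hit_rate(requests, cache_per_request):
    """Replay the requests against the cache chosen for each one; return hits / total."""
    hits = 0
    for key, cache in zip(requests, cache_per_request):
        if key in cache:
            hits += 1
        else:
            cache.add(key)  # a miss loads the item and populates that cache
    return hits / len(requests)


# Hypothetical workload: 90 distinct keys, each requested four times in a row.
requests = [key for key in range(90) for _ in range(4)]

# Machine-local caches: a round-robin load balancer spreads the requests
# over three servers, and each server only ever sees its own cache.
local_caches = [set(), set(), set()]
local_hit_rate = hit_rate(requests, cycle(local_caches))

# Single logical cache: every request consults the same cache.
shared_cache = set()
shared_hit_rate = hit_rate(requests, cycle([shared_cache]))

print(f"machine-local: {local_hit_rate:.2f}, single logical cache: {shared_hit_rate:.2f}")
```

With this particular workload each key is loaded three times across the local caches but only once into the shared cache, which is the gap the slide describes.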

Page 10: Everything you wanted to know about Velocity (but were afraid to cache)

Motivation (5)

• With machine-local caches, you have no central place to update/delete cached items
• This means you can only cache data that can afford to be stale by some time period
  – If the time period is short, you need a low TTL (time-to-live, aka expiration) which means more cache misses
• You can’t cache data that must have changes visible to the system in (near) real time
• With a single logical cache, you have one cache to shoot in the event of an update/delete
  – Might be able to live with no expiration

Page 11: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 12: Everything you wanted to know about Velocity (but were afraid to cache)

Windows Server AppFabric Caching

• History: AppFabric caching was a separate component
  – Public debut at TechEd 2008 (earlier?)
  – Codename: Velocity
• “Dublin” was a separate effort, focused on providing a hosting and management environment around WCF/WF
• November 2009: Technologies grouped under heading of “Windows Server AppFabric”
• RTW in June 2010…

Page 13: Everything you wanted to know about Velocity (but were afraid to cache)

Relationship to Windows Azure AppFabric

• Service bus: Handle communication and authentication for accessing applications
  – Expose apps through firewalls, NAT gateways, etc.
  – Assist cloud-based apps talking to on-premise apps
  – Other composite app scenarios; pub/sub
• Access Control Service: Allow you to avoid setting up federated identity agreements just to grant partner/customer access to your cloud-based or on-premise apps.
• Today: Only common marketing/branding with Windows Server AppFabric.
• Later: Common services for both

Page 14: Everything you wanted to know about Velocity (but were afraid to cache)

Cache-Aside Pattern

• In the current version, the out-of-box support is for the “cache-aside” pattern.
  – Check cache
  – If miss, retrieve data, then populate the cache
• Lots of other patterns you might contemplate (and simulate) with what is provided
  – Read-through/Write-through
  – Refresh-ahead/Write-behind
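
The two cache-aside steps (check the cache, and on a miss retrieve the data and populate the cache) can be sketched as follows. This is a minimal Python model, not Velocity client code: `DictCache`, `get_with_cache_aside`, and `load_from_store` are hypothetical names, and the dictionary stands in for the distributed cache.

```python
class DictCache:
    """In-memory stand-in for a key/value cache client (hypothetical)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)  # None signals a miss

    def put(self, key, value):
        self._items[key] = value


def get_with_cache_aside(cache, key, load_from_store):
    """Check the cache first; on a miss, load from the store and populate the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_store(key)  # e.g. a database query or service call
        cache.put(key, value)
    return value


# Usage: the loader runs once; the second read is served from the cache.
calls = []


def load_product(key):
    calls.append(key)
    return {"id": key, "name": "Bike"}


cache = DictCache()
first = get_with_cache_aside(cache, "product:1", load_product)
second = get_with_cache_aside(cache, "product:1", load_product)
```

The application owns both steps here, which is what distinguishes cache-aside from read-through, where the cache itself would call the loader.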

Page 15: Everything you wanted to know about Velocity (but were afraid to cache)

Cache-Aside Pattern

Page 16: Everything you wanted to know about Velocity (but were afraid to cache)

Logical Hierarchy

[Diagram: a cache cluster of three servers (Server A, Server B, Server C), each running a cache host (Cache Host A, B, C). The diagram shows the named cache “Product Catalog” and the default cache, with the regions “Sports”, “Region 1”, and “Region 3”.]

• Client apps work with a single logical unit of cache
• Server process is DistributedCacheService.exe
• Caches explicitly created with TTL, expiration, HA policy
• Regions represent a partition of data (subset of key/value pairs). Live on one node. Unit of replication/failover.
• Regions can be implicit or explicit. Use explicit only for bulk gets or searching.

Page 17: Everything you wanted to know about Velocity (but were afraid to cache)

Logical Hierarchy

[Diagram: the named cache “Product Catalog” and the default cache, with the regions “Sports” and “Region 1”. A region holds rows like these:]

ID (Key)    Payload (Value)    Tags/VersionInfo
1           Foo                …
2           Bar                …
3           Baz                …

Page 18: Everything you wanted to know about Velocity (but were afraid to cache)

Physical Layout

[Diagram: a load balancer in front of three web servers (Web Server A, B, C, each running IIS 7.x). Behind them is a cache cluster of three cache servers (Cache Server A, B, C), each running a cache host.]

Page 19: Everything you wanted to know about Velocity (but were afraid to cache)

Combined Deployment

[Diagram: a load balancer in front of three web servers (Web Server A, B, C, each running IIS 7.x), shown together with three cache hosts.]

Page 20: Everything you wanted to know about Velocity (but were afraid to cache)

Physical Layout

• Configuration store contains cache policies and global partition map (how keys divide into regions, which servers have which regions)

• If Sql config store, servers will send heartbeat to Sql. Otherwise, heartbeat goes to one or more “lead hosts”

• Partition map used by “Global Partition Manager” (one node in the cluster, but auto failover) to communicate routing information to Velocity clients
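
A partition map of the kind described above (keys divide into regions, regions live on servers) can be sketched like this. It is an illustrative Python model only: the region count, the use of `crc32` as the key hash, the host names, and the round-robin assignment are all assumptions, not how Velocity actually assigns regions.

```python
import zlib

REGION_COUNT = 8  # hypothetical; not a real Velocity setting
HOSTS = ["CacheHostA", "CacheHostB", "CacheHostC"]  # hypothetical host names


def region_for(key):
    """Map a key to a region with a stable hash (crc32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % REGION_COUNT


def build_partition_map(hosts):
    """Assign regions to hosts round-robin: region number -> host name."""
    return {region: hosts[region % len(hosts)] for region in range(REGION_COUNT)}


def host_for(key, partition_map):
    """Route a key: find its region, then the host that owns that region."""
    return partition_map[region_for(key)]


def fail_over(partition_map, failed_host, surviving_hosts):
    """Return a new map with the failed host's regions moved to surviving hosts."""
    return {
        region: surviving_hosts[region % len(surviving_hosts)] if host == failed_host else host
        for region, host in partition_map.items()
    }


partition_map = build_partition_map(HOSTS)
print(host_for("product:42", partition_map))

# After a host fails, only its regions are re-routed.
after_failure = fail_over(partition_map, "CacheHostB", ["CacheHostA", "CacheHostC"])
```

Because routing goes through the region rather than directly from key to host, moving a region changes one map entry instead of every key.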

[Diagram: a load balancer in front of three web servers (Web Server A, B, C, each running IIS 7.x), a cache cluster of three cache servers (Cache Server A, B, C, each running a cache host), and a config store (file share or Sql Server).]

Page 21: Everything you wanted to know about Velocity (but were afraid to cache)

Regions as unit of replication/failover (Global Partition Manager in action)

[Diagram: a cache cluster of three servers (Server A, B, C), each running a cache host (Cache Host A, B, C), showing the named cache “Product Catalog”, the default cache, and the regions “Sports” and “Region 1”.]

Page 22: Everything you wanted to know about Velocity (but were afraid to cache)

Regions as unit of replication/failover (When using Secondaries)

[Diagram: a cache cluster of three servers (Server A, B, C), each running a cache host (Cache Host A, B, C), showing the named cache “Product Catalog”, the default cache, the regions “Sports” and “Region 1”, and their copies “Sports secondary” and “Region 1 secondary”.]

(Updates done synchronously)

Page 23: Everything you wanted to know about Velocity (but were afraid to cache)

Local Cache

• Local cache is an option that can be enabled when creating the cache client (DataCacheFactory)
• Allows a local cache to be populated that will prevent network hop (and serialization) if request can be satisfied locally
• Best when data set is (relatively) small, changes infrequently, and stale data is acceptable
• Can expire via TTL or notifications (which might be late/lost)
• Can specify max object count before evicting LRU
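
The TTL expiry and max-object-count LRU eviction mentioned above can be modeled in a few lines. This is a Python sketch of the idea, not the AppFabric local cache implementation: `LocalCache` and its parameters are hypothetical, and the clock is injected so expiry can be demonstrated without sleeping.

```python
import time
from collections import OrderedDict


class LocalCache:
    """Hypothetical in-process cache with a TTL and a max object count (LRU eviction)."""

    def __init__(self, max_objects, ttl_seconds, clock=time.monotonic):
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first
        self._max_objects = max_objects
        self._ttl = ttl_seconds
        self._clock = clock

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None  # miss: the caller goes to the cache cluster
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired by TTL
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)
        self._items.move_to_end(key)
        if len(self._items) > self._max_objects:
            self._items.popitem(last=False)  # evict the least recently used item


# Usage with a fake clock so the TTL can be shown deterministically.
now = [0.0]
cache = LocalCache(max_objects=2, ttl_seconds=10, clock=lambda: now[0])
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used entry
cache.put("c", 3)  # over the limit, so "b" (least recently used) is evicted
```

A hit on this cache avoids both the network hop and deserialization, at the price of possibly serving a value that changed in the cluster within the TTL.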

[Diagram: a load balancer in front of three web servers (Web Server A, B, C, each running IIS 7.x), each with its own local cache, and a cache cluster of three cache servers (Cache Server A, B, C), each running a cache host.]

Page 24: Everything you wanted to know about Velocity (but were afraid to cache)

Data Types and Caching Considerations

• Reference Data: Product catalogs, “lookup” tables, other slow-moving content
  – Safe to cache for a defined period of time because you probably live with staleness already
  – “Local” cache option might be desirable for small data sets
• Activity Data: Shopping carts or other transient transaction state
  – Accessed for read and write operations, but not shared. Low/No concurrency considerations – exclusive write.
  – Safe to cache for reads and keep in cache for writes
• Resource Data: Inventory, Orders, and other core transactional data
  – Accessed concurrently for read and write
  – Caching will require a concurrency model to be chosen and managed

Page 25: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 26: Everything you wanted to know about Velocity (but were afraid to cache)

Deploy/Install Considerations

• Windows “Application Server” Role required
• A few critical updates (see install guide)
• .NET 3.5 SP1 for cache clients; .NET 4 for servers
• You’ll need PowerShell 2 (already in Win7/Win2k8R2)
• Windows XP cannot be a client…
• “Install” and “Configure” for AppFabric are two distinct steps

Page 27: Everything you wanted to know about Velocity (but were afraid to cache)

Deploy/Install Considerations

• Primary screen of interest is choosing your configuration store:
  – XML/File share
  – Sql-Based
• File share avoids the need for Sql Server, but requires that some nodes in the cache cluster be special (“Lead Hosts”)
• Using Sql as the configuration store is the better engineering choice for production – you may have other reasons to avoid it.

Page 28: Everything you wanted to know about Velocity (but were afraid to cache)

Deploy/Install Considerations

• As you build out your AppFabric Cache Cluster, you will do “New Cluster” on the first node, and “Join Cluster” on subsequent nodes

• Ultimately, all of Windows Server AppFabric is a set of features underneath the Application Server Role – so standard command line installations work.
  – Setup.exe /install /i cachingservice,cacheclient,cacheadmin /l:c:\temp\setup.log

Page 29: Everything you wanted to know about Velocity (but were afraid to cache)

AppFabric as Application Server “Role Service”

Page 30: Everything you wanted to know about Velocity (but were afraid to cache)

Deploy/Install Considerations

• Can do a “Cache client” install for clients, or for internal apps, just incorporate client assemblies in your own build/deploy process

Microsoft.ApplicationServer.Caching.Core.dll
Microsoft.ApplicationServer.Caching.Client.dll
Microsoft.WindowsFabric.Common.dll
Microsoft.WindowsFabric.Data.Common.dll

Page 31: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 32: Everything you wanted to know about Velocity (but were afraid to cache)

Caching Classes

DataCacheFactory
  DataCacheFactory()
  DataCacheFactory(configuration)
  DataCache GetCache(string cache)
  GetDefaultCache()

DataCacheFactoryConfiguration (can set these via configuration)
  LocalCacheProperties
  NotificationProperties
  SecurityProperties
  DataCacheServerEndpoint[] Servers

DataCache
  Add     Adds a new object to the cache. Exception if the item is already in the cache.
  Put     Adds a new object to the cache. Replaces if already in cache.
  Get     Returns an object from the cache.
  Remove  Removes an object from the cache.

Page 33: Everything you wanted to know about Velocity (but were afraid to cache)

Caching Classes

Page 34: Everything you wanted to know about Velocity (but were afraid to cache)

DataCache with DataCacheItemVersion

• GetCacheItem: returns tags and version info
• GetIfNewer: lets you use that version info!
• Put and Remove have overloads that take version info
  – Allows for an optimistic concurrency model
  – Will only succeed if version information matches what is current for the cached item
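
The optimistic concurrency model above (a write succeeds only if the caller's version matches the current version of the cached item) can be sketched as follows. This is a Python model of the idea, not the DataCache API: `VersionedCache`, `VersionMismatch`, and `update_with_retry` are hypothetical names, and integer versions stand in for DataCacheItemVersion.

```python
class VersionMismatch(Exception):
    """Raised when the caller's version is not the item's current version."""


class VersionedCache:
    """Hypothetical in-memory cache that tracks a version number per item."""

    def __init__(self):
        self._items = {}  # key -> (value, version)

    def get_cache_item(self, key):
        """Return (value, version), or None if the key is absent."""
        return self._items.get(key)

    def put(self, key, value, expected_version=None):
        """Store the value; if a version is supplied it must match the current one."""
        current = self._items.get(key)
        current_version = current[1] if current else 0
        if expected_version is not None and expected_version != current_version:
            raise VersionMismatch(key)
        self._items[key] = (value, current_version + 1)
        return current_version + 1


def update_with_retry(cache, key, change, attempts=3):
    """Read, modify, and write back; re-read and retry if someone else wrote first."""
    for _ in range(attempts):
        value, version = cache.get_cache_item(key)
        try:
            return cache.put(key, change(value), expected_version=version)
        except VersionMismatch:
            continue
    raise VersionMismatch(key)


cache = VersionedCache()
cache.put("stock", 10)                            # unconditional put, version 1
value, version = cache.get_cache_item("stock")    # two writers both read version 1
cache.put("stock", 9, expected_version=version)   # the first writer wins
try:
    cache.put("stock", 8, expected_version=version)  # the second writer is stale
    second_writer_failed = False
except VersionMismatch:
    second_writer_failed = True
```

No lock is held between the read and the write; the losing writer simply re-reads and tries again, which suits data where conflicts are rare.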

Page 35: Everything you wanted to know about Velocity (but were afraid to cache)

DataCache and Locking

• GetAndLock: Allows you to lock a cache item for a specified time period, even if not present
  – (Will fail if already locked)
  – public Object GetAndLock (string key, TimeSpan timeout, out DataCacheLockHandle lockHandle, bool forceLock)
• Useful when attempting to get multiple servers to coordinate “cache pre-load” activity
• PutAndUnlock: Update an item and release its lock, with given key and lock handle
• Unlock: Explicitly unlock, optionally extend TTL
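
The pre-load coordination described above can be modeled with a small pessimistic-locking sketch. This is Python, not the DataCache API: `LockingCache`, `ItemLocked`, and the integer lock handles are hypothetical, the lock is always taken even when the key is absent (loosely mirroring the forceLock case), and the clock is injected so lock expiry can be shown without waiting.

```python
import itertools
import time


class ItemLocked(Exception):
    """Raised when an item is locked by someone else or the handle is wrong."""


class LockingCache:
    """Hypothetical in-memory cache with timed locks per key."""

    def __init__(self, clock=time.monotonic):
        self._items = {}
        self._locks = {}  # key -> (handle, expires_at)
        self._handles = itertools.count(1)
        self._clock = clock

    def get_and_lock(self, key, timeout_seconds):
        """Lock the key (present or not) and return (value, lock handle)."""
        lock = self._locks.get(key)
        if lock is not None and self._clock() < lock[1]:
            raise ItemLocked(key)  # fails if already locked
        handle = next(self._handles)
        self._locks[key] = (handle, self._clock() + timeout_seconds)
        return self._items.get(key), handle

    def put_and_unlock(self, key, value, handle):
        """Store the value and release the lock; the handle must match."""
        lock = self._locks.get(key)
        if lock is None or lock[0] != handle:
            raise ItemLocked(key)
        self._items[key] = value
        del self._locks[key]


# Two servers race to pre-load the same item; only the first gets the lock.
now = [0.0]
cache = LockingCache(clock=lambda: now[0])
value, handle = cache.get_and_lock("catalog", timeout_seconds=30)
try:
    cache.get_and_lock("catalog", timeout_seconds=30)
    second_server_got_lock = True
except ItemLocked:
    second_server_got_lock = False  # this server waits instead of loading again
cache.put_and_unlock("catalog", ["bike", "helmet"], handle)
```

The timeout matters: if the server holding the lock dies mid-load, the lock lapses and another server can take over.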

Page 36: Everything you wanted to know about Velocity (but were afraid to cache)

DataCache and Tags/Regions

• Explicitly created regions live on a single node…can create a hot spot for both call volume and memory growth

• But they offer bulk retrieval and flexible tag-based retrieves

• For secondary indexes, instead of regions: simulate secondary indexes with your own secondary-to-primary mapping cache
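
The secondary-to-primary mapping idea can be sketched with two plain dictionaries standing in for two caches. This is an illustrative Python model, not Velocity code: the cache names, the product shape, and the helper functions are hypothetical, and it deliberately ignores the case where a product changes category.

```python
# Primary cache: product id -> product (a dict stands in for the cache).
products = {}
# Mapping cache: secondary key (category) -> list of primary keys (product ids).
ids_by_category = {}


def put_product(product):
    """Store the product and keep the category-to-ids mapping in step."""
    products[product["id"]] = product
    ids = ids_by_category.setdefault(product["category"], [])
    if product["id"] not in ids:
        ids.append(product["id"])


def get_by_category(category):
    """Look up the ids for a category, then fetch each product by primary key."""
    ids = ids_by_category.get(category, [])
    # Skip ids whose product has since been evicted from the primary cache.
    return [products[i] for i in ids if i in products]


put_product({"id": 1, "name": "Bike", "category": "Sports"})
put_product({"id": 2, "name": "Helmet", "category": "Sports"})
put_product({"id": 3, "name": "Novel", "category": "Books"})
sports = get_by_category("Sports")
```

Because the items are looked up by ordinary keys, they stay spread across the cluster instead of being pinned to the single node that an explicit region would live on.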

Page 37: Everything you wanted to know about Velocity (but were afraid to cache)

Administrative Model

• Administration for AppFabric Caching done purely through PowerShell

• Can administer entire Cache Cluster from wherever administrative portion of install has been done – all nodes addressable from single command line location

• Use-CacheCluster points the shell at a particular cluster to administer

• Get-Command -module DistributedCacheAdministration

Page 38: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 39: Everything you wanted to know about Velocity (but were afraid to cache)

What we’ll cover

• What motivates this product/technology
• Terms / Pictures / Concepts
• Deploy / Install Process
• A lap around the API & Admin model
• Demos
• Gotchas

Page 40: Everything you wanted to know about Velocity (but were afraid to cache)

Gotchas

• Balance number of nodes in cluster with memory per node.
  – Too many nodes = cluster overhead, too much memory per node = GC overhead
• If you don’t use Sql Config Store, you need to manually run Start-CacheHost after reboot
• Consider the nature of data stored in cache, and secure appropriately (don’t let cache be weakest link)
• Sql Config Store requires high Sql privileges right now at point of install
• Currently service runs as network service account
• Consider what you will do when cache is down
  – You can go after source of truth
  – How do you avoid leaving stale data in the cache?
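
One way to think about the last point is sketched below: reads fall back to the source of truth when the cache is down, and writes that could not invalidate the cache are remembered and replayed once it returns. This is a Python model under stated assumptions, not a recommendation from the talk: `FlakyCache`, `CacheUnavailable`, and the pending-invalidation set are all hypothetical.

```python
class CacheUnavailable(Exception):
    """Raised by the stand-in cache while it is 'down'."""


class FlakyCache:
    """Hypothetical in-memory cache that can be switched off to simulate an outage."""

    def __init__(self):
        self.available = True
        self._items = {}

    def _check(self):
        if not self.available:
            raise CacheUnavailable()

    def get(self, key):
        self._check()
        return self._items.get(key)

    def put(self, key, value):
        self._check()
        self._items[key] = value

    def remove(self, key):
        self._check()
        self._items.pop(key, None)


def read(cache, store, key):
    """Cache-aside read that goes to the source of truth if the cache is down."""
    try:
        value = cache.get(key)
    except CacheUnavailable:
        return store[key]
    if value is None:
        value = store[key]
        try:
            cache.put(key, value)
        except CacheUnavailable:
            pass  # serving the request matters more than populating the cache
    return value


def write(cache, store, key, value, pending_invalidations):
    """Update the store, then invalidate the cached copy or remember to do so later."""
    store[key] = value
    try:
        cache.remove(key)
    except CacheUnavailable:
        pending_invalidations.add(key)


def replay_invalidations(cache, pending_invalidations):
    """Once the cache is reachable again, remove every key that changed meanwhile."""
    for key in list(pending_invalidations):
        cache.remove(key)
        pending_invalidations.discard(key)
```

In this sketch the pending set lives in one process; with several web servers it would need to live somewhere shared, or the cache contents would have to be treated as suspect after any outage.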

Page 41: Everything you wanted to know about Velocity (but were afraid to cache)

Resources

• AppFabric Caching and Deployment Guide
  – http://bit.ly/AppFabMgmt
• AppFabric Development Center
  – http://bit.ly/AppFabDevCtr
• AppFabric Forums
  – http://bit.ly/AppFabForum
• NHibernate integration
  – http://sourceforge.net/projects/nhcontrib/files/NHibernate.Caches/
• Entity Framework integration (basis for)
  – http://code.msdn.microsoft.com/EFProviderWrappers
• Recent MSDN:
  – http://msdn.microsoft.com/en-us/magazine/ff714581.aspx

Page 42: Everything you wanted to know about Velocity (but were afraid to cache)

Thank you -

Questions?