Scalability
Recommended Text:
Database Replication
Bettina Kemme, Ricardo Jiménez-Peris, Marta Patiño-Martínez. Morgan & Claypool Publishers.
Scalability
! Load Balancing
! Provisioning/auto scaling
! State
• Caching
• Commonly used items of the database are kept in memory.
• Replication
• Logical items of the database (e.g., tuples, objects) have multiple physical copies located on different nodes.
• Partitioning
• Logical items of the database (e.g., tuples, objects) are divided among multiple physical nodes.
! Fault Tolerance
Load Balancing
! Determines where requests/transactions will be executed
! Blind techniques
• Require minimal or no state on the load balancer
• Strategies:
• Random
• Round robin
! Load aware
• Consider the load on replicas when routing requests
• Different requests may generate different load
• Replica nodes may have different capacity
• Strategies:
• Shortest queue first
• Simple to implement; just track the number of active requests (a sketch follows below)
• Least loaded
• Replicas need to periodically provide load info to the load balancer
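A minimal sketch of shortest-queue-first routing (illustrative only; the Replica and balancer classes are assumptions, not from the slides):

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    class Replica {
        final String host;
        final AtomicInteger active = new AtomicInteger(0);  // requests in flight
        Replica(String host) { this.host = host; }
    }

    class ShortestQueueBalancer {
        private final List<Replica> replicas;
        ShortestQueueBalancer(List<Replica> replicas) { this.replicas = replicas; }

        // Route to the replica with the fewest active requests.
        Replica pick() {
            Replica best = replicas.get(0);
            for (Replica r : replicas)
                if (r.active.get() < best.active.get()) best = r;
            best.active.incrementAndGet();  // count the request we are about to send
            return best;
        }

        // Call when the replica finishes the request.
        void done(Replica r) { r.active.decrementAndGet(); }
    }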
Load Balancing
! Application aware
• Uses knowledge of the application
• Strategies:
• Shortest execution length
• Profile request types to generate execution time estimates
• Estimates the load at each replica by keeping track of pending requests and their type, i.e., determines an application-specific load metric
• Routes to the least loaded replica
• Locality-aware request distribution (LARD)
• Takes into account the data accessed by requests, and routes similar requests to the same replica to increase cache hit rates
• Conflict-aware distribution
• Executes potentially conflicting requests on the same replica for early conflict detection
Load Balancing Comparison
! Blind
• General
• Easy to implement
• Suboptimal
! Load aware
• Better performance
• Requires more state
! Application aware
• Best performance
• Requires even more state
• Harder to implement
• Brittle – needs to change if the application changes
AWS Elastic Load Balancer
! Balances load between EC2 instances
! Can distribute load across availability zones
! Within an availability zone:
• Round-robin among the least-busy instances
• Tracks active connections per instance
• Does not monitor CPU
! Supports the ability to stick user sessions to specific EC2 instances
AppEngine Load Balancing
! Google is not saying!
Self Provisioning
! Automatically adjust the worker pool or number of replicas to the current workload
• Add workers/replicas when load increases
• Retire workers/replicas when load decreases
! Approaches
• Reactive (a minimal sketch follows below)
• Reconfigure when a load threshold is reached
• Proactive
• Use a prediction mechanism to trigger reconfiguration
• Reconfigure before saturation
• Based on time series or machine learning approaches
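A minimal sketch of the reactive approach (the WorkerPool interface and the thresholds are illustrative assumptions):

    interface WorkerPool {
        int size();
        int minSize();
        int maxSize();
        void addWorker();     // provision a new worker/replica
        void retireWorker();  // drain and release an idle worker/replica
    }

    class ReactiveProvisioner {
        // Reconfigure when a load threshold is crossed.
        void check(double avgUtilization, WorkerPool pool) {
            if (avgUtilization > 0.80 && pool.size() < pool.maxSize()) {
                pool.addWorker();       // scale up as load increases
            } else if (avgUtilization < 0.20 && pool.size() > pool.minSize()) {
                pool.retireWorker();    // scale down when load decreases
            }
        }
    }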
Self Provisioning
! Retiring replicas/workers
• Move the instance's load to the rest of the system
• All new requests go to other nodes
• Wait for outstanding transactions to commit
• When idle, release
! Adding replicas/workers
• Transfer the state/database to the new replica
• Receive all updates that occurred during the transfer
• Add replica to pool
Self Provisioning Considerations
! Load metrics
• Use low-level system metrics to determine node utilization
• CPU, I/O utilization
• Use application-level metrics
• Response time, e.g., transaction completes within X milliseconds
! Cost/latency prediction
• How long and how expensive it is to add a replica/worker
• Reactive: the higher the latency, the lower the reconfiguration threshold
• Proactive: the higher the latency, the farther into the future we need to predict
! Monitoring
• Monitoring isn't free
• The more data we collect, the higher the impact on the system
AWS Auto Scaling
! Scale automatically according to user-defined conditions
! Enabled by CloudWatch
! Create scale up/down policy
! Associate scaling policy with CloudWatch alarm
AS Tutorial
! Step 1: Download the command line tools
• http://aws.amazon.com/developertools/2535
! Step 2: Create a credentials file
• AWSAccessKeyId=<Write your AWS access ID>
• AWSSecretKey=<Write your AWS secret key>
! Step 3: Set environment variables
• AWS_AUTO_SCALING_HOME=<directory where you installed the tools>
• JAVA_HOME=<java installation home directory>
• PATH=$AWS_AUTO_SCALING_HOME/bin:$PATH
• AWS_CREDENTIAL_FILE=<the file created in step 2>
! Step 4: Test
• as-cmd --help prints a list of available commands
AS Tutorial
! Step 5: Create an EC2 security group and a load balancer
! Step 6: Create a launch configuration
• as-create-launch-config ece1779_config --image-id ami-37ec3c5e --instance-type m1.small --group default
! Step 7: Create an auto scaling group
• as-create-auto-scaling-group ece1779_group --availability-zones us-east-1a
  --launch-configuration ece1779_config
  --min-size 1 --max-size 3
  --load-balancers ece1779LoadBalancer
  --health-check-type ELB --grace-period 300
• --health-check-type can be set to EC2 (machine-level health) or ELB (application-level health)
! Step 8: Create a scaling policy
• as-put-scaling-policy ScaleUp --auto-scaling-group ece1779_group --adjustment=1 --type ChangeInCapacity
• Other adjustment types: ExactCapacity and PercentChangeInCapacity
AS Tutorial
! Step 9: Create alarms on CloudWatch and associate with AS policy
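For example, with the CloudWatch command line tools of the same vintage (the alarm name and threshold values are illustrative; the --alarm-actions value is the policy ARN printed by as-put-scaling-policy):

• mon-put-metric-alarm HighCPUAlarm --metric-name CPUUtilization
  --namespace "AWS/EC2" --statistic Average --period 60
  --evaluation-periods 5 --threshold 80
  --comparison-operator GreaterThanThreshold
  --dimensions "AutoScalingGroupName=ece1779_group"
  --alarm-actions <ARN returned by as-put-scaling-policy>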
AppEngine Auto Scaling
! App owner can adjust the frontend instance class, the number of idle instances, and the pending latency
AppEngine Auto Scaling
! Can configure number of backend nodes
! war/WEB-INF/backends.xml
! Administer
• AppEngine management console
• appcfg.sh command line tool (part of the Java SDK)
Caching
! Objective:
• Reduce load on the storage server
• Reduce latency to access data
! Store recent/hot/frequent data in RAM
! Volatile
! Fast read/write access
memcached
! A distributed memory cache implementation
! Provides a key-value store interface:
    put(key, value)
    value = get(key)
! Scalable and consistent temporary storage
• Secondary system that provides fast access to data
• Data stored reliably somewhere else
Google memcache
! Inspired by memcached
! Commonly used to cache datastore entities by their keys
! Key is up to 250 bytes long
• Larger keys are hashed
! Value can be up to 1 megabyte
! Single-value operations are atomic
• No support for multi-item transactions
! Support for atomic numeric value increment/decrement
Usage Model
! Data accesses:
    get value from memcache
    if cache miss:
        fetch data from datastore
        put value in memcache
    operate on value
! Data updates:
• Possible to overwrite the value with new data
• No transactional way to do this
• The update may succeed in the datastore and fail in memcache
• Updates may happen in a different order in the datastore and memcache
• Instead: invalidate the item on update
• Fetch a fresh value on the next access
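A minimal Java sketch of both patterns using the App Engine memcache API (fetchFromDatastore and writeToDatastore are hypothetical helpers, not part of the API):

    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;

    class CachedStore {
        private final MemcacheService cache = MemcacheServiceFactory.getMemcacheService();

        // Data access: try the cache first, fall back to the datastore on a miss.
        Object read(String key) {
            Object value = cache.get(key);           // null on a cache miss
            if (value == null) {
                value = fetchFromDatastore(key);     // authoritative copy
                cache.put(key, value);               // populate for later reads
            }
            return value;
        }

        // Data update: write to the reliable store, then invalidate the cached item.
        void update(String key, Object newValue) {
            writeToDatastore(key, newValue);
            cache.delete(key);                       // next read fetches the fresh value
        }

        // Hypothetical datastore helpers (not part of the memcache API).
        private Object fetchFromDatastore(String key) { /* ... */ return null; }
        private void writeToDatastore(String key, Object value) { /* ... */ }
    }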
Memcache Java API
! Key and value can be any Java serializable object
! Obtain memcache pointer
• MemcacheService syncCache = MemcacheServiceFactory.getMemcacheService();
! Put
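The slide's code did not survive the transcript; a minimal sketch using the syncCache handle from above (keys and values are illustrative; Expiration is com.google.appengine.api.memcache.Expiration):

    // Key and value can be any Serializable objects.
    syncCache.put("count:hits", 0L);
    // Optional expiration: evict the entry after one hour.
    syncCache.put("greeting", "hello", Expiration.byDeltaSeconds(3600));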
Memcache Java API
! Get
! Delete
! Contains
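The slide's code is likewise missing; an illustrative sketch:

    Object hits = syncCache.get("count:hits");         // returns null on a miss
    boolean cached = syncCache.contains("count:hits"); // membership test
    boolean removed = syncCache.delete("count:hits");  // true if the key was found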
Memcache Java API
! Increment
! Cache statistics
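Again the slide's code was not captured; an illustrative sketch (Stats is com.google.appengine.api.memcache.Stats):

    // Atomic increment; returns the new value, or null if the key is absent.
    Long n = syncCache.increment("count:hits", 1L);
    // Variant that initializes the counter to 0 if the key is absent.
    syncCache.increment("count:hits", 1L, 0L);

    // Cache-wide statistics.
    Stats stats = syncCache.getStatistics();
    long hitCount = stats.getHitCount();
    long missCount = stats.getMissCount();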
Replication
! Fault tolerance
• High availability despite failures
• If one replica fails, there is another one that can take over
! Throughput
• Support a large user base
• Each replica handles a fraction of the users
• Scalability
! Response time
• Support a geographically distributed user base
• There is a replica close to the user
• Fast local access
Stateless vs. Stateful
! Stateless
• Services that require mainly computing
• State can be regenerated on demand
• Easy to replicate
! Stateful
• Services use business-critical data
• Good performance depends on state availability
• Harder to replicate
Transactions
! User-defined sequence of read and write operations on data items that meets the following properties:
• Atomicity
• Transaction either executes entirely and commits, or it aborts, leaving no changes in the database
• All or nothing
• Consistency
• Integrity constraints on the data are maintained
• Isolation
• Provides the impression that no other transaction is executing in the system
• Durability
• Once the transaction commits, the changes are indeed reflected in the database, even in the event of failures
Replica Control
! Task of keeping data copies consistent as items are updated
! There is a set of database nodes R_A, R_B, …
! The database consists of a set of logical data items x, y, …
! Each logical item x has physical copies x_A, x_B, …
• where x_A resides on R_A
! A transaction is a sequence of read and write operations on logical data items
! The replica control mechanism maps operations on logical data items onto the physical copies
Read-One-Write-All (ROWA)
! Common replica control method
! Reads can be sent to any replica
• Logical read operation r_i(x) of transaction T_i
• Mapped to r_i(x_A) on one particular copy of x
! Updates are performed on all replicas
• Logical write operation w_i(x) of transaction T_i
• Mapped to w_i(x_A), w_i(x_B), … on all copies of x
! ROWA works well because in most applications reads >> writes
1-copy equivalence
! A replicated system provides the same semantics as a non-replicated system
• 1-copy-isolation
• 1-copy-atomicity
• 1-copy-durability
• 1-copy-consistency
! Requires tight coupling of the replica control mechanism with the database mechanisms that achieve atomicity, consistency, isolation, and durability
Serializability
! Most well-known isolation model
! Provides the strongest isolation level
! The concurrent execution of a set of transactions must be equivalent to some possible serial execution of that set
! Conflict: two operations that access the same item, are from different transactions, and at least one is a write
! To be serializable, all conflicts have to execute in the same order
• T_i has to appear to execute before T_j, or T_j before T_i
• Acyclic serialization graph
[Figure: example schedules. S1 and S2 are serial executions, S3 is serializable, S4 is unserializable]
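For example (illustrative schedules, not the ones from the original figure), where r1(x)/w1(x) are T1's read/write of x and c1 its commit: the schedule r1(x) w1(x) r2(x) w2(x) c1 c2 orders every conflict T1 before T2, so it is equivalent to the serial execution T1, T2 and is serializable. In contrast, r1(x) w2(x) w1(x) c1 c2 orders the r1(x)/w2(x) conflict T1 before T2 but the w2(x)/w1(x) conflict T2 before T1, producing a cycle in the serialization graph, so it is unserializable.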
Serializability
! Use concurrency control to enforce transaction isolation
• Acquire a shared lock before a read
• Acquire an exclusive lock before a write
• Release all locks at the end of the transaction (2-phase locking)
1-copy-serializability
! First attempt: just make sure schedules are serializable at each replica
• Does not work
! Need to order all conflicting operations in the same way
! Whenever one local schedule executes one of T_i's operations before a conflicting operation of T_j, then T_i must execute before T_j at all replicas
1-copy-atomicity
! The transaction executes in its entirety and commits, or it aborts and does not leave any effects on the database
! On a replicated database
• The transaction has to reach the same decision, either all (commit) or nothing (abort), at all replicas at which it executes
• ROWA:
• All replicas for update transactions
• 1 replica for read-only transactions
• Requires an agreement protocol to force all replicas to reach the same decision on the outcome of a transaction
• 2-phase commit (sketched below)
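A minimal sketch of 2-phase commit from the coordinator's side (the Participant interface is an illustrative assumption):

    import java.util.List;

    interface Participant {
        boolean prepare();  // persist changes and vote yes/no
        void commit();
        void abort();
    }

    class Coordinator {
        // Phase 1: collect votes; phase 2: commit only if everyone voted yes.
        boolean twoPhaseCommit(List<Participant> replicas) {
            for (Participant p : replicas) {
                if (!p.prepare()) {               // any "no" vote aborts everywhere
                    for (Participant q : replicas) q.abort();
                    return false;
                }
            }
            for (Participant p : replicas) p.commit();
            return true;
        }
    }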
Read-One-Write-All (ROWA)
! Key design alternatives:
• Where are updates executed?
• When are changes propagated?
• How are changes propagated?
Primary Copy vs. Update Anywhere
! Primary copy
• All updates are executed first on a single node, the primary
• Advantages: simpler to implement
• Disadvantages: the primary can become a bottleneck
! Update anywhere/everywhere
• Updates and read-only requests can be sent to any replica
• Advantages: potentially more scalable
• Disadvantages: harder to guarantee consistency
Eager vs. Lazy
! Eager replication
• Confirmation is only returned to the user once all secondaries execute the update
• Advantages: strong consistency
• Disadvantages: replica coordination can be slow
! Lazy/asynchronous replication
• Confirmation is returned to the client after updates are applied to a single replica
• Updates are subsequently propagated to the other replicas
• Advantages: fast
• Disadvantages: weaker consistency, potential for conflicts
Eager Primary Copy
[Figure: eager primary copy execution. Legend: S(y) = acquire shared lock, X(y) = acquire exclusive lock, U(y) = unlock]
Eager Primary Copy Properties
! Strengths
• 1-copy-serializability
• 1-copy-atomicity
! Weaknesses
• Transaction only commits when all replicas have executed its writes
• Execution time determined by the slowest replica
• 2-phase commit is slow and expensive
• Combination of eager write propagation and locking can lead to blocking in the presence of long read transactions
• Primary may become a bottleneck, as it has to execute all reads for update transactions
• Single point of failure
Eager Update Anywhere
Eager Update Anywhere Properties
! Strengths
• 1-copy-serializability
• 1-copy-atomicity
• No single point of failure
! Weaknesses
• Possibility of distributed deadlock
• Adds significant complexity to the protocol
• Transaction only commits when all replicas have executed its writes
• Execution time determined by the slowest replica
• 2-phase commit is slow and expensive
• Combination of eager write propagation and locking can lead to blocking in the presence of long read transactions
Lazy Primary Copy
Lazy Primary Copy Properties
! Strengths
• Fast
• Transactions don't involve remote communication
! Weaknesses
• Weak consistency
• Remote replicas have stale data
• Does not comply with 1-copy-atomicity
• Similar bottleneck issues as eager primary copy
Lazy Update Anywhere
Lazy Update Anywhere Properties
! Strengths
• Fast
• Transactions don't involve remote communication
! Weaknesses
• Weak consistency
• Remote replicas have stale data
• Does not comply with 1-copy-atomicity or 1-copy-isolation
Processing Write Operations
! Writes have to be executed on all replicas
! Write processing is the main overhead of replication
! Symmetric update processing
• SQL statement is sent to all replicas
• All replicas parse the statement, determine the tuples affected, and perform the modification/deletion/insertion
• Pros: reuses existing mechanisms
• Cons: redundant execution; execution has to be deterministic (consider an update that sets a timestamp)
! Asymmetric update processing
• SQL statement is executed locally on a single replica
• Writeset (the extracted changes) is sent to the other replicas
• Identifier and after-image of each updated record (a sketch of a writeset entry follows)
• Pros: efficiency
• Cons: harder to implement
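A minimal sketch of what one asymmetric writeset entry might carry (field names are illustrative assumptions):

    // One writeset entry: the identifier and after-image of an updated record.
    class WritesetEntry {
        final String table;        // which table the record belongs to
        final Object primaryKey;   // identifies the updated record
        final byte[] afterImage;   // serialized new version of the record
        WritesetEntry(String table, Object primaryKey, byte[] afterImage) {
            this.table = table;
            this.primaryKey = primaryKey;
            this.afterImage = afterImage;
        }
    }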
Replication Architecture
! Kernel-based or white-box approach
• Implemented inside the database
• Pros: best integration with concurrency control and other transaction control mechanisms
• Cons: hard to maintain and port to other databases; confined to a single database system
! Middleware-based or black-box approach
• Replication performed by an outside component interposed between the database replicas and the clients
• Pros: does not require changes to the database; portability, modularity; can use different database systems
• Cons: has to re-implement concurrency control; coarse-grain locking → less parallelism
! Grey-box approach
• Database provides a special replication API
• Example: a writeset extraction/application API
• Middleware calls on the API
MySQL Replication
! ROWA
! Primary copy
! Eager and lazy replication
! Symmetric and asymmetric update processing
! Full and partial replication
MySQL Isolation Levels
! SERIALIZABLE
! REPEATABLE READ (default)
• By default, reads do not acquire locks
• A snapshot is created at transaction start
• Multiple reads of the same item will return a consistent value
• Writes by others will not be seen
• Possible to acquire explicit locks
• SELECT * FROM sometable FOR UPDATE
• SELECT * FROM sometable LOCK IN SHARE MODE
! READ COMMITTED
! READ UNCOMMITTED
App Engine Datastore
! High Replication Datastore (HRD)
• Data is replicated across multiple data centers using a consensus algorithm
• Queries are eventually consistent
• Pros: highest level of availability for reads and writes
• Cons: higher latency on writes due to the propagation of data
! Master/Slave Datastore
• One data center holds the master copy of all data
• Data written to the master data center is replicated asynchronously to all other (slave) data centers
• Pros: strong consistency for all reads and queries
• Cons: temporary unavailability during data center downtime
Failures
! Machine crashes
! Network errors
• Partitions → quorum protocols
• Message loss → TCP handles this case
• Message corruption → TCP handles this case
Self Healing
! Normal operation
• Replica control protocols keep replicas consistent
• Provide enough information to recover from failures
! Failover
• Failure detection
• In case of primary failure, choose a new primary
• Voting
• Deterministic assignment scheme, e.g., choose the surviving replica with the smallest ID
• Deal with outstanding transactions
• Preserve atomicity
• All submitted transactions should either commit or abort at all remaining replicas
! Recovery
• Restart the failed replica, or create a new replica
• Integrate it into the replication system
Failure Tolerant Architecture
! Decouple the application to tolerate failures
! Implemented as shared-nothing independent modules
! Use reliable message queues for coordination
! When a task fails, it is just restarted
! In addition, each part of the application can scale independently
! Examples: Amazon SQS, AppEngine Task Queues
AppEngine Task Queues
! Task Queues
• Asynchronous processing
• Image transcoding off the critical path
! Scheduled Tasks
• Periodic processing, cron jobs
• Fetching and caching data from remote services
• Sending daily status emails
• Generating reports
Task Queues
! A facility for enqueueing HTTP requests
! Fault tolerant
• Retry if the task does not succeed (i.e., does not return an HTTP 200 status)
• Retry rate uses an exponential backoff strategy
! Enqueueing a task is fast
• About 3 times faster than writing to the datastore
Adding a Task
! Tasks consist of:
• URL
• If null, the URL is set to /_ah/queue/queue-name
• Default URL for the default queue: /_ah/queue/default
• Name
• If null, AppEngine assigns a unique name
• The name can be used to prevent enqueueing duplicates
• Request parameters
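A minimal enqueueing sketch with the Java Task Queue API (the handler URL and parameter are illustrative; in SDKs of this era the package may still be com.google.appengine.api.labs.taskqueue):

    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskOptions;

    Queue queue = QueueFactory.getDefaultQueue();   // URL /_ah/queue/default
    queue.add(TaskOptions.Builder
            .withUrl("/worker/resize")              // illustrative handler URL
            .param("imageKey", "abc123"));          // illustrative request parameter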
Configuring Task Queues
! Task queues are defined in WEB-INF/queue.xml
! An application can have up to 10 queues
! Configure queues to determine how fast tasks execute
• bucket-size → maximum number of tokens for the queue; to execute, a task needs a token
• rate → speed at which tokens are replenished
Securing Tasks
! AppEngine's front end recognizes requests coming from a queue as if they were coming from a developer account
! Can use web.xml to restrict access to task URLs
Sharding
! Challenges:
• Very large working set
• Slows reads
• Facebook/Google user table
• Too many writes
! Solution:
• Partition the data into shards
• Assign shards to different machines
• Denormalize the data
Sharding Strategies
! Range-based partitioning
• E.g., usernames from a to m assigned to shard 1, n to z to shard 2
• Pros: simple
• Cons: hard to get load balancing right
! Hash-based partitioning
• Compute a hash of the key; different shards are responsible for different hash ranges (see the sketch after this list)
• Pros: good load balancing
• Cons: a little more complex
! Directory-based partitioning
• A lookup service keeps track of the partitioning scheme
• Pros: flexibility
• Cons: the lookup service may become a bottleneck
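A minimal sketch of hash-based shard selection (names are illustrative):

    class HashSharding {
        // Map a key to one of numShards shards via its hash.
        static int shardFor(String key, int numShards) {
            // floorMod keeps the result non-negative even for negative hash codes
            return Math.floorMod(key.hashCode(), numShards);
        }
    }

With this simple modulo scheme, changing numShards remaps most keys; consistent hashing is the usual refinement when shards are added or removed.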
Denormalization
! Data that is accessed together is stored together
• E.g., multi-valued properties in AppEngine datastore entities
! Eliminates costly joins
! May require replication
• E.g., a comment may be stored on both the commenter's and the commentee's profiles
Sharding
! Pros
• High availability
• Faster queries
• More write bandwidth
! Cons
• Queries get more complicated
• Need to figure out which shard to target
• May need to join data from multiple shards
• Referential integrity
• Foreign key constraints are hard to enforce across shards
• Not supported in many databases