Supply Frame: High Availability in Web Content Delivery

Aleksandar Bilanovic, SRE at Supply Frame, Inc. High Availability in Web Content Delivery 14.10.2014

Upload: aleksandar-bilanovic

Post on 27-Jun-2015


Page 1: Supply frame high availability in web content delivery

Aleksandar Bilanovic, SRE at Supply Frame, Inc.

High Availability in Web Content Delivery

14.10.2014

Page 2: Supply frame high availability in web content delivery

What is and how to achieve High Availability

Minimizing the risk associated with service failure and providing maximum uptime for the application.

To achieve it, we need to engineer and plan the data center, network, servers, OS, applications and people.

It is about eliminating single points of failure (SPOF), detecting errors, and building an automated, reliable crossover to backup infrastructure.

Pros: high uptime, faster web content delivery, satisfied users, dealing with capacity rather than with service denials.

Cons: high price; risk that the HA architecture becomes unmaintainable due to complexity; high complexity can itself contribute to failure and downtime.
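As a rough illustration of why eliminating SPOF pays off, the sketch below compares the availability of a single path with that of a path where each component has a redundant twin. The 99% figures are assumed values chosen for the example, not numbers from the talk, and the model assumes component failures are independent.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def series(*availabilities):
    """All components must be up: multiply the availabilities."""
    return prod(availabilities)


def parallel(*availabilities):
    """Redundant components: the group is down only if every member is down."""
    return 1 - prod(1 - a for a in availabilities)


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Assumed figures: a router and a server, each 99% available on its own.
single_path = series(0.99, 0.99)
redundant_path = series(parallel(0.99, 0.99), parallel(0.99, 0.99))

for label, a in [("single path", single_path), ("redundant path", redundant_path)]:
    print(f"{label}: {a:.4%} available, ~{downtime_hours_per_year(a):.1f} h down per year")
```

The same arithmetic also hints at the cost side: every added redundant component is one more thing to buy, monitor and keep consistent.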

Page 3: Supply frame high availability in web content delivery

HA in Web Content Delivery: Supply Frame, Inc. Primer

Page 4: Supply frame high availability in web content delivery

HA in Web Content Delivery: Data center

• Physical vs. virtual
• Carrier neutral
• Redundant power with industrial UPS / diesel generators
• Climate control
• Fire suppression
• Physical access control
• Backup data center with redundant dark fiber cross connect

Page 5: Supply frame high availability in web content delivery

HA in Web Content Delivery: Infrastructure Network

• Internet access: BGP routing with multiple IP transits
• Local network: switch clusters for core/distribution/access layers
• Primary and backup data center routing
• Link aggregation
• Redundant power supplies

Page 6: Supply frame high availability in web content delivery

HA Network: normal operations

Page 7: Supply frame high availability in web content delivery

HA Network: IP transit failure

Page 8: Supply frame high availability in web content delivery

HA Network: router failure

Page 9: Supply frame high availability in web content delivery

HA Network: switch failure

Page 10: Supply frame high availability in web content delivery

HA Network: cross connect failure

Page 11: Supply frame high availability in web content delivery

HA Network: link aggregation failure
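The diagram for this slide is not part of the transcript. As a minimal sketch of the idea, assuming a hash-based distribution of flows over the member links of an aggregate, the example below shows a flow moving to a surviving link when its current link fails. The link names, the flow tuple and the use of CRC32 are all hypothetical; real switches and bonding drivers use their own hash policies.

```python
import zlib


def pick_link(flow, active_links):
    """Map a flow onto one active member link using a stable hash."""
    if not active_links:
        raise RuntimeError("all member links are down")
    key = "|".join(str(field) for field in flow).encode()
    return active_links[zlib.crc32(key) % len(active_links)]


# Hypothetical flow: (source IP, destination IP, source port, destination port).
flow = ("192.0.2.7", "198.51.100.20", 51514, 443)
links = ["eth0", "eth1"]

before = pick_link(flow, links)
# Simulate a failed member: remove it, and the same flow lands on a survivor.
survivors = [link for link in links if link != before]
after = pick_link(flow, survivors)
print(f"flow used {before}; after that link fails it uses {after}")
```

Traffic keeps flowing as long as at least one member link is up, at reduced aggregate capacity.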

Page 12: Supply frame high availability in web content delivery

HA in Web Content Delivery: Servers

• Server class machines only
• Redundant power supplies
• Redundant Array of Inexpensive / Independent Disks (RAID)
• Remote server console (iDRAC)

Page 13: Supply frame high availability in web content delivery

HA in Web Content Delivery: OS / App

• OS performance tuning (TCP/IP, number of open files, various memory buffers, etc.)
• Redundant databases / API backends
• Protecting servers / app performance from aggressive crawlers (iptables recent module on LBs)
• OS / app services monitoring (Nagios, Riemann, Graphite, DynECT, Pingdom, New Relic)
• Data backup (online and offline)
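The crawler protection mentioned on this slide relies on the iptables recent module, which remembers recently seen source addresses and can match on how often an address appeared within a time window (its `--seconds` and `--hitcount` options). The talk does not show the actual rules, so the sketch below only models the idea in Python. The class name, the thresholds and the exact counting semantics are assumptions and do not reproduce iptables behaviour precisely.

```python
from collections import defaultdict, deque


class RecentLimiter:
    """Per-source sliding-window limiter, loosely modeled on 'recent' matching."""

    def __init__(self, seconds, max_hits):
        self.seconds = seconds      # length of the window
        self.max_hits = max_hits    # hits tolerated inside one window
        self.hits = defaultdict(deque)

    def allow(self, src_ip, now):
        """Record a hit at time `now` and say whether it should be let through."""
        window = self.hits[src_ip]
        # Forget hits that have aged out of the window.
        while window and now - window[0] >= self.seconds:
            window.popleft()
        # Rejected hits are recorded too, so a crawler that keeps hammering
        # stays blocked until it slows down.
        window.append(now)
        return len(window) <= self.max_hits


# Assumed policy: at most 3 requests per 10 seconds from one address.
limiter = RecentLimiter(seconds=10, max_hits=3)
crawler = "203.0.113.5"
decisions = [limiter.allow(crawler, t) for t in (0, 1, 2, 3, 20)]
print(decisions)
```

Doing this on the load balancers keeps abusive traffic away from the web nodes and backends altogether.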

Page 14: Supply frame high availability in web content delivery

HA in Web Content Delivery: crossover to backup infrastructure

• DynECT / Akamai GTM probing services for server / service failures and DC failover
• Pacemaker / Corosync cluster
• Load balancing services using haproxy (HTTP/TCP load balancer)
  • uninterrupted services during deploy
  • high performance in web content delivery (the number of requests served scales almost linearly with the number of web nodes)
  • eliminating SPOF
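To make the "uninterrupted services during deploy" point concrete, here is a minimal sketch of round-robin balancing that skips nodes marked as unhealthy or draining. It is not haproxy's implementation or configuration; the class, the method names and the node names are hypothetical.

```python
class RoundRobinPool:
    """Round-robin over backends, skipping unhealthy or draining nodes."""

    def __init__(self, names):
        self.names = list(names)
        self.healthy = {name: True for name in self.names}
        self.draining = {name: False for name in self.names}
        self._next = 0

    def set_healthy(self, name, is_healthy):
        """Result of a health check for one node."""
        self.healthy[name] = is_healthy

    def set_draining(self, name, is_draining):
        """Take a node out of rotation (for example while deploying to it)."""
        self.draining[name] = is_draining

    def pick(self):
        """Return the next eligible node, or fail if none is left."""
        for _ in range(len(self.names)):
            name = self.names[self._next % len(self.names)]
            self._next += 1
            if self.healthy[name] and not self.draining[name]:
                return name
        raise RuntimeError("no backend available")


pool = RoundRobinPool(["web1", "web2", "web3"])
normal = [pool.pick() for _ in range(3)]

# Deploy to web2: drain it, keep serving from the others, then put it back.
pool.set_draining("web2", True)
during_deploy = [pool.pick() for _ in range(4)]
pool.set_draining("web2", False)

print(normal, during_deploy)
```

With one node out of rotation at a time, a rolling deploy never leaves the pool empty, and adding nodes adds capacity in roughly the same proportion.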

Page 15: Supply frame high availability in web content delivery

HA OS/App: DynECT / Akamai GTM
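The diagram for this slide is not in the transcript. The general idea of probe-driven data center failover can be sketched as follows: an external service probes each site and answers with the backup site's address once the primary has failed several probes in a row. This is an illustrative model only, not the DynECT or Akamai GTM API; the class, the failure threshold and the addresses (taken from documentation ranges) are assumptions.

```python
class DataCenter:
    """Tracks probe results for one site."""

    def __init__(self, name, address, fail_threshold=3):
        self.name = name
        self.address = address
        self.fail_threshold = fail_threshold  # consecutive failures before "down"
        self.failures = 0

    def record_probe(self, ok):
        """A successful probe resets the counter; a failed one increments it."""
        self.failures = 0 if ok else self.failures + 1

    @property
    def up(self):
        return self.failures < self.fail_threshold


def resolve(datacenters):
    """Return the address of the first site, in priority order, that is up."""
    for dc in datacenters:
        if dc.up:
            return dc.address
    # Nothing looks healthy: answer with the primary rather than with nothing.
    return datacenters[0].address


primary = DataCenter("primary", "192.0.2.10")
backup = DataCenter("backup", "198.51.100.10")
sites = [primary, backup]

answers = [resolve(sites)]
for _ in range(3):              # three failed probes in a row
    primary.record_probe(False)
answers.append(resolve(sites))
primary.record_probe(True)      # primary recovers
answers.append(resolve(sites))
print(answers)
```

Requiring several consecutive failures trades a slightly slower failover for protection against flapping on a single lost probe.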

Page 16: Supply frame high availability in web content delivery

HA OS/App: haproxy

Page 17: Supply frame high availability in web content delivery

HA OS/App: corosync / pacemaker LB cluster (us-lax-1w-lb-00)

Page 18: Supply frame high availability in web content delivery

HA in Web Content Delivery: People

• Human error: no HA architecture can predict that
• HA people: Follow the Sun
• Everyone has to know something about everything and everything about something (network, systems, application, automation, programming ...)