Gartner Infrastructure and Operations Summit Berlin 2015 - DevOps Journey

Beginning the DevOps Journey in Real Money Gaming Kelly Looney 27.11.2014


Post on 06-Aug-2015



DevOps in Real Money Gaming

Context: 600M € company – online sports betting, online casino, poker, other games

Two primary technologies combined via a business merger (turn-of-the-century architecture)

• Sports – .NET/SQL Server
• Poker, Casino, and “Platform” – Java/Oracle

• Datacenters in Gibraltar, Vienna, and other points in Europe, now in the US
• Over 2000 servers in production
• 200 people in Ops and Infrastructure
• Development centers in Vienna, Ukraine, and Hyderabad
• Over 700 development team members


In 2013… the Challenge

• Different faces/rules for different markets
• Monolithic app with many single points of failure
• Dev teams focused on horizontal components
• Totally separate Ops, Maintenance, and Dev teams
• Clashing cultures from merger/locations/code bases
• Up 24/7 with millions of €/day wagered
• In-house build, deploy, monitoring…

AppDynamics picture of the Beast

What we have done and are doing…

• Global Agile Transformation – classes, coaches, 96 Scrum teams
  • Craig Larman, Luke Hohmann (Innovation Games)
  • DTO (Damon Edwards, Alex Honor) for DevOps principles
  • Now exploring SAFe

• Several organization changes
  • Components -> Features -> Services
  • Ops -> LeanOps -> Delivery Units


Cultural changes we have encouraged

• Old-style developers
  • Responsibilities: write code
  • Focus: know ONE THING really, really well
  • Deep expertise = respect
• What we want now is developers who:
  • Understand our company goals
  • Understand requirements and tests
  • Write, build, integrate, and test code incrementally
  • Can demonstrate and explain working systems
  • Maintain their code in production
  • Understand operations

Deep expertise is great, but varied knowledge is just as important

Wow, you want developers to do everything… How is that possible?

• First the right attitude… then
• Today’s tools and processes:

1. Agile provides continuous “customer” access
2. Distributed versioning (typically Git) puts full source control into individual developers’ hands
3. Continuous Integration isolates mistakes
4. Jenkins-Vagrant-Puppet-Chef-SaltStack pipelines make infrastructure and deployment mostly automatic regardless of complexity
   • Deploy to Test, UAT, Staging, Production
5. Monitoring lets you see and assess your running service
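As a rough illustration of point 4, promoting one build through Test, UAT, Staging, and Production can be sketched in a few lines of Python. This is not the Jenkins or Puppet setup from the talk: `promote`, `deploy`, and `smoke_test` are hypothetical stand-ins for whatever the real tooling does at each stage. The sketch only shows that the same build moves through every environment in order and stops at the first failed check.

```python
from typing import Callable, List

# Stage names come from the slide; everything else is illustrative.
STAGES = ["Test", "UAT", "Staging", "Production"]


def promote(
    build_id: str,
    deploy: Callable[[str, str], None],
    smoke_test: Callable[[str, str], bool],
) -> List[str]:
    """Deploy one build to each stage in order; stop at the first failed check.

    Returns the list of stages the build passed.
    """
    passed: List[str] = []
    for stage in STAGES:
        deploy(build_id, stage)              # hypothetical hook: push the build
        if not smoke_test(build_id, stage):  # hypothetical hook: verify it
            break
        passed.append(stage)
    return passed


if __name__ == "__main__":
    deployments = []
    result = promote(
        "build-42",
        deploy=lambda build, stage: deployments.append((build, stage)),
        smoke_test=lambda build, stage: stage != "Staging",  # pretend Staging fails
    )
    print("passed:", result)
    print("deployed to:", [stage for _, stage in deployments])
```

Because a failure in an early stage stops the promotion, a broken build never reaches Production in this model, which is the practical meaning of "mostly automatic".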


What we have done and are doing… tech

• Tool changes
  • SVN -> Git; in-house deploy -> Jenkins/TeamCity, Puppet, Chef, Rundeck
  • Bare metal -> VMware -> now headed to Docker/containers
  • Monitoring… AppDynamics – more to come

• Architectural principles
  • Less centralized, fewer failure points
  • Code to create a server is the asset, not the server
  • Throw cheap machines, not faster CPUs or bigger DBs, at scaling problems
  • Use RDBs when needed, otherwise avoid
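The principle "code to create a server is the asset, not the server" can be illustrated with a minimal Python sketch. It assumes a server is described by nothing more than a set of package versions; `ServerSpec`, `plan`, and `converge` are hypothetical names, the package versions are made up, and this does not reproduce how Puppet or Chef work internally. The idea it shows is that the description lives in version control and any machine can be brought in line with it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServerSpec:
    """Desired state of a server, kept in version control (hypothetical fields)."""
    name: str
    packages: Dict[str, str]  # package name -> required version


def plan(spec: ServerSpec, installed: Dict[str, str]) -> List[str]:
    """Return the actions needed to bring `installed` in line with `spec`."""
    actions = []
    for pkg, version in sorted(spec.packages.items()):
        if installed.get(pkg) != version:
            actions.append(f"install {pkg}={version}")
    for pkg in sorted(set(installed) - set(spec.packages)):
        actions.append(f"remove {pkg}")
    return actions


def converge(spec: ServerSpec) -> Dict[str, str]:
    """Simulate applying the plan: afterwards the machine matches the spec."""
    return dict(spec.packages)


if __name__ == "__main__":
    web = ServerSpec("web", {"nginx": "1.6", "openjdk": "7"})  # made-up versions
    fresh_machine = {"telnetd": "0.17"}
    print(plan(web, fresh_machine))
    print(plan(web, converge(web)))  # a converged machine needs no further actions
```

Since the plan is derived from the spec every time, a lost or broken machine is replaced by running the code again, not by repairing the server by hand.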


Containers are changing hosting

• Virtualization efficiency and cost savings are obvious
• The most interesting issue is the separation of concerns presented
  • “developer-land” vs. infrastructure


What to do about quality?

First steps: workshops at each main development site

• We pulled all sorts of people together
  • Ops, Dev, CS, Business, Partners…
• “What do you think we can do to improve overall system quality?”
• #1 answer: We need comprehensive monitoring
  • Our system is so complex and so opaque we can’t really tell what is specifically wrong.
  • Reworking our millions of lines of code to properly and consistently log will never happen…
  • This led us to evaluate many different monitoring approaches and products
  • We settled on AppDynamics, reasons:
    • Advanced UI, very flexible
    • One application to replace a variety of other solutions
    • Aggregation of data was a huge cost saver
• #2 quality issue: Testing environment stability and viability
  • Expensive, not really “production-like”, and not highly available
  • Too elaborate for early testing and not close enough for late testing
  • Forced to mix tests, which often polluted one another
  • Infrastructure just an incredible blocker, no private or public cloud


The difference monitoring has made…

1. Like a giant debugger for production issues
   • Peer into what were previously opaque code bases
   • Where are the stress points? Also surface the really dumb stuff.
   • Identify intermittent issues that were hard to identify before
   • “Working for me…”
2. Better resource planning
   • We had lots of “over-solved” problems before
   • How do things change during spikes in traffic?
3. Rollout actually helped us identify services that needed refactoring
   • If the overhead of mature monitoring breaks your service…
4. Developers are starting to use AppDynamics to assess new designs
   • It has uncovered a few things we were happy we did not deploy!
5. Gets the whole organization in touch with operations
   • A huge DevOps goal realized…
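One way to see why intermittent issues stay hidden without monitoring is to compare an average with a high percentile. The Python sketch below uses made-up response times and a hand-written nearest-rank `percentile` helper; it is not AppDynamics output or its API, only an illustration of the kind of aggregation a monitoring tool performs.

```python
import math
from statistics import mean
from typing import List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up response times in milliseconds: 98 fast calls and 2 very slow ones.
    latencies = [20.0] * 98 + [2000.0] * 2
    print("mean:", mean(latencies))
    print("p50: ", percentile(latencies, 50))
    print("p99: ", percentile(latencies, 99))
```

With these numbers the median looks healthy and the mean looks only mildly elevated, while the 99th percentile exposes the two slow calls: the "working for me…" situation seen from the other side.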

Posted all around the organization

Automating Test

• It’s not “How many automated tests do I have?”
  • We could have easily run days’ worth of tests whenever we wanted
• It’s “I have the right tests to quickly decide if I can move forward”
• Also, BTW, “We run Jenkins to do a build every night”
  • Does != Continuous Integration…
• Can you create a viable test environment, use it, then throw it away? (before it pollutes other tests…)
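The create, use, throw-away cycle can be shown in miniature with a Python context manager. This sketch assumes the "environment" is just a temporary directory; a real one would be virtual machines or containers. `throwaway_environment` is a hypothetical name, not part of any tool mentioned in the talk.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def throwaway_environment(prefix: str = "testenv-") -> Iterator[Path]:
    """Create an isolated working directory, hand it to the test, then delete it."""
    root = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield root
    finally:
        # Always clean up, even if the test failed, so nothing leaks
        # into the next run.
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    with throwaway_environment() as env:
        (env / "settings.ini").write_text("mode=test\n")
        print("exists during test:", env.exists())
    print("exists afterwards:", env.exists())
```

Each test run gets a fresh environment and leaves nothing behind, so one test cannot pollute another.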


What DevOps and CD mean for the organization

• The whole idea of holding off changes to retain stability gets turned on its head
  • Change all the time and stay stable!
• Changes get smaller and smaller, but are constantly being deployed
  • With small changes, integration issues become fairly simple
• Environments must proliferate along with associated infrastructure
  • Ideally you need a new test environment to test every change – create/destroy
  • Are your environments captured as code?
  • Use cloud services here, even if you don’t want to for production
