Mobile 3: Launch Like a Boss!

Upload: mongodb

Post on 14-Aug-2015

Category:

Technology


TRANSCRIPT

  1. Deploy Like a Boss: Building a Mobile App with MongoDB, Part 3
  2. Deploy with Joy!
  5. Production Checklist: Proper Infrastructure, Proper Configuration, Proper Monitoring, Emergency Procedures
  6. Infrastructure: Sizing (RAM, CPU, Disk Size, I/O Bandwidth); Availability
  7. Sizing: Indexes need to be in RAM. Working set needs to be in RAM. I/O bandwidth: write load, index updates, working set migration. Example document: { _id: ObjectId(), tour: UUID, user: UUID, name: "Doug's Dogs", desc: "The best hot-dog", clues: [ "Hungry for a Coney Island?", "Ask for Dr. Frankenfurter", "Look for the hot dog stand" ], "geometry": { "type": "Point", "coordinates": [125.6, 10.1] } }
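
The sizing rule on this slide (indexes and working set should both fit in RAM) comes down to simple arithmetic. The following is a minimal sketch, assuming you already have the sizes in bytes (for example, index size from the `totalIndexSize` field of `db.collection.stats()`). The function name `fits_in_ram` and the `headroom` margin are hypothetical and are not part of the talk or of MongoDB.

```python
def fits_in_ram(index_bytes, working_set_bytes, ram_bytes, headroom=0.2):
    """Return True if indexes plus working set fit in RAM with headroom to spare.

    headroom is an assumed safety margin (fraction of RAM left for the OS,
    connections, and growth); it is not a MongoDB setting.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = ram_bytes * (1 - headroom)
    return index_bytes + working_set_bytes <= usable


GIB = 1024 ** 3

# 6 GiB of indexes + 18 GiB of working set on a 32 GiB host: 24 <= 25.6.
print(fits_in_ram(6 * GIB, 18 * GIB, 32 * GIB))
# The same data on a 24 GiB host: 24 > 19.2.
print(fits_in_ram(6 * GIB, 18 * GIB, 24 * GIB))
```

The working set is the hard number to obtain; it usually has to be estimated from the access pattern rather than read from a single counter.
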
  8. Load Testing
  12. Load Testing: Test it like you use it; benchmarks don't count. Test to failure. Instrument your code! https://github.com/breinero/Firehose https://github.com/ParsePlatform/flashback
  13. Load Testing: There's me
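
"Instrument your code!" can be as small as a timer wrapped around each database call. The sketch below is one possible shape, not the approach used by Firehose or flashback. The class name `LatencyRecorder` is hypothetical, and `time.sleep` stands in for a real database operation.

```python
import time
from contextlib import contextmanager
from statistics import mean


class LatencyRecorder:
    """Collects per-operation latencies in milliseconds."""

    def __init__(self):
        self.samples_ms = []

    @contextmanager
    def timed(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record even when the operation raises, so failures are measured too.
            self.samples_ms.append((time.perf_counter() - start) * 1000.0)

    def summary(self):
        if not self.samples_ms:
            return {"count": 0, "mean_ms": 0.0, "max_ms": 0.0}
        return {
            "count": len(self.samples_ms),
            "mean_ms": mean(self.samples_ms),
            "max_ms": max(self.samples_ms),
        }


recorder = LatencyRecorder()
for _ in range(3):
    with recorder.timed():
        time.sleep(0.01)  # stand-in for a database call
print(recorder.summary())
```

Recording the count and mean per interval gives the same two inputs that load thresholds and circuit breakers typically work from.
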
  14. Growth (chart: Load over time in 1K Ops / Second, with Warn and Saturation levels)
  15. Growth (chart: Load with Warn and Saturation levels; Memory)
  16. Growth (chart: Load with Warn and Saturation levels; Input Output)
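
The growth charts plot load against a Warn level and a Saturation level. One way to use them is to estimate how long until the trend reaches each level. The sketch below fits a straight line to recent samples by least squares. The linear-growth assumption, the function name `periods_until`, and the sample and threshold values are all illustrative and are not read off the slides.

```python
def periods_until(threshold, samples):
    """Estimate how many more periods until a linear trend reaches threshold.

    samples: load measurements taken at equal intervals, oldest first.
    Returns 0.0 if the latest sample is already at or above the threshold,
    and None if the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if samples[-1] >= threshold:
        return 0.0
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(samples) / n
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, samples))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing = (threshold - intercept) / slope  # x where the fitted line hits threshold
    return max(crossing - (n - 1), 0.0)


load_kops = [2.0, 3.0, 4.0, 5.0]      # assumed measurements, in 1K ops/second
print(periods_until(8.0, load_kops))   # periods left before the assumed Warn level
print(periods_until(10.0, load_kops))  # periods left before the assumed Saturation level
```

The saturation level itself is best taken from a test-to-failure run rather than guessed.
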
  18. Monitoring. Baseline: MongoDB Management Service (MMS), MongoDB Ops Manager, Nagios, Zenoss. Detailed, query specific: mongotop, db.currentOp(), Query Profiler, mtools
  19. Forensics
       2014-08-08T21:15:25.181-0500 [conn1026] getmore claimsPoc.claims cursorid:100012502307 ntoreturn:0 keyUpdates:0 numYields:1406953 locks(micros) r:11887558422 nreturned:289 reslen:4208149 28795759ms
       2014-08-07T15:31:51.714-0500 [conn7] command claimsPoc.$cmd command: createIndexes { createIndexes: "claims", indexes: [ { key: { Claims.ICN: 1.0 }, name: "Claims.ICN_1" } ] } keyUpdates:0 numYields:0 locks(micros) r:14476 w:25176930351 reslen:113 25176955ms
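
Both log lines on this slide end in a duration in milliseconds (roughly 8 hours for the getmore and 7 hours for the index build). The sketch below pulls those durations out with a regular expression so slow operations can be listed. The pattern is written only against the two lines shown here and is an assumption about their shape, not a general MongoDB log parser; mtools, mentioned on the monitoring slide, is the more complete option.

```python
import re

# Leading timestamp, [connection], operation type, namespace, and the
# trailing "<number>ms" duration, as they appear in the two slide lines.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<conn>[^\]]+)\] (?P<op>\S+) (?P<ns>\S+) .* (?P<ms>\d+)ms$"
)

LOG_LINES = [
    '2014-08-08T21:15:25.181-0500 [conn1026] getmore claimsPoc.claims '
    'cursorid:100012502307 ntoreturn:0 keyUpdates:0 numYields:1406953 '
    'locks(micros) r:11887558422 nreturned:289 reslen:4208149 28795759ms',
    '2014-08-07T15:31:51.714-0500 [conn7] command claimsPoc.$cmd command: '
    'createIndexes { createIndexes: "claims", indexes: [ { key: { Claims.ICN: 1.0 }, '
    'name: "Claims.ICN_1" } ] } keyUpdates:0 numYields:0 locks(micros) r:14476 '
    'w:25176930351 reslen:113 25176955ms',
]


def slow_ops(lines, threshold_ms):
    """Return (operation, namespace, duration_ms) for lines at or over the threshold."""
    found = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and int(match.group("ms")) >= threshold_ms:
            found.append((match.group("op"), match.group("ns"), int(match.group("ms"))))
    return found


for op, ns, ms in slow_ops(LOG_LINES, threshold_ms=1000):
    print(f"{op} on {ns}: {ms / 3_600_000:.1f} hours")
```
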
  20. Logging
  21. Logging: Save and rotate. Don't use --quiet. --logpath != --dbpath. Use component verbosity for debugging.
  22. Security
  23. Security: Firewall. Bind IP. Encrypt networks. Enable access control. Don't enable the REST interface. Auditing. Limit exposure and use the principle of least privilege.
  24. Tuning. Best practices: Disable transparent hugepages. Use NTP to synchronize time. Set ulimits. Use XFS or ext4. Don't use NFS. Disable NUMA. Have swap. Read the Production Notes. Tunables: Set the I/O scheduler to NOOP. Adjust readahead (MMAPv1). Avoid cgroups. SELinux (?). RAID.
  25. Availability (image: http://avstop.com/ac/flighttrainghandbook/imagel4b.jpg)
  26. Availability: Avoid critical data centers (diagram labels: P, S, S; DC1, DC2)
  27. Availability (diagram labels: P, S, S; DC1, DC2, DC3)
  28. Availability (diagram labels: P, S, S; DC1, DC2, AWS)
  29. Availability (diagram labels: P, S, Arbiter; DC1, DC2, AWS)
  30. Availability (diagram labels: P in DC1; Arbiter in AWS; S in DC2, down for maintenance)
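
"Avoid critical data centers" follows from how replica sets elect a primary: a majority of voting members must be reachable, and arbiters count as voters. The sketch below checks a placement for any single data center whose loss would leave the set without a majority. The function name and the member counts per site are assumptions for illustration; the transcript does not say which members sit in which data center.

```python
def critical_data_centers(placement):
    """Return the data centers whose loss leaves no voting majority.

    placement maps a data center name to the number of voting members
    (primary, secondaries, or arbiters) hosted there.
    """
    total = sum(placement.values())
    majority = total // 2 + 1
    return sorted(
        dc for dc, votes in placement.items() if total - votes < majority
    )


# Three voters across two sites: the site holding two of them is critical.
print(critical_data_centers({"DC1": 2, "DC2": 1}))
# One voter in each of three sites (a third data center or a cloud region):
# any single site can fail and two of three voters remain.
print(critical_data_centers({"DC1": 1, "DC2": 1, "DC3": 1}))
print(critical_data_centers({"DC1": 1, "DC2": 1, "AWS": 1}))
```

With one voter per site, taking one site down for maintenance still leaves a majority, but a second failure during that window does not.
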
  31. Emergency Procedures (image: https://spinoff.nasa.gov/spinoff2002/images/070.jpg)
  32. Emergency Procedures. Backup and recovery: file system snapshot, MMS Cloud, Ops Manager, mongodump
  33. Backups and Recovery: PERFORM DRILLS OFTEN AND ROUTINELY
  34. Emergency Procedures: Document your procedures. Include ETAs. Follow the procedures in docs.mongodb.org.
  35. Production Ready Architecture (diagram label: L.B.)
  39. Classic Failure Scenario (diagram label: L.B.): Unindexed queries. Leads to collection scans. Results in high latencies. Causes memory exhaustion.
  40. Production Ready Architecture (diagram label: L.B.): Unindexed queries. Leads to collection scans. Results in high latencies. Causes memory exhaustion. CASCADING FAILURE
  42. Circuit Breaker. Trigger conditions: Latency: stats.getMean() >= max. OpsPerSecond: stats.getN() >= max. ConcurrentOperations: stats.getN()*stats.getMean() >= max. https://github.com/breinero/Firehose
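
The three trigger conditions can be evaluated from one window of latency samples. The third follows Little's law: throughput multiplied by mean latency estimates the number of operations in flight. The sketch below is a Python rendering of the conditions as listed on the slide, not the Firehose implementation. The one-second window, the `Limits` class, and the limit values are assumptions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Limits:
    max_latency_s: float
    max_ops_per_second: int
    max_concurrent_ops: float


def tripped_conditions(latencies_s, limits):
    """Evaluate the three trigger conditions over one second of samples.

    latencies_s holds the latency, in seconds, of every operation that
    completed during the last one-second window (an assumption of this sketch).
    """
    n = len(latencies_s)                   # plays the role of stats.getN()
    avg = mean(latencies_s) if n else 0.0  # plays the role of stats.getMean()
    tripped = []
    if avg >= limits.max_latency_s:
        tripped.append("Latency")
    if n >= limits.max_ops_per_second:
        tripped.append("OpsPerSecond")
    if n * avg >= limits.max_concurrent_ops:
        tripped.append("ConcurrentOperations")
    return tripped


limits = Limits(max_latency_s=0.5, max_ops_per_second=1000, max_concurrent_ops=50)
print(tripped_conditions([0.2] * 100, limits))  # healthy window
print(tripped_conditions([0.2] * 400, limits))  # too many operations in flight
print(tripped_conditions([0.6] * 10, limits))   # slow operations
```

When any condition trips, the application would stop sending new operations for a cool-down period instead of piling more work onto a struggling database.
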
  43. Production Ready Architecture (diagram label: L.B.)
  44. Client Side: Don't use ensureIndex() in the application. Look out for connection bombs (--maxConns). DO use operation timeouts. DON'T cause socket timeouts. Lower keepalives. Avoid retry bombs.
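
A retry bomb happens when every client retries a failed operation immediately and without limit, multiplying load on a database that is already struggling. One common mitigation, sketched below, is a bounded number of attempts with exponential backoff and random jitter. The function name and its defaults are hypothetical. Real code should retry only on errors known to be transient, whereas this sketch catches every exception for brevity.

```python
import random
import time


def call_with_retries(operation, max_attempts=3, base_delay_s=0.1,
                      max_delay_s=2.0, sleep=time.sleep):
    """Run operation(), retrying a bounded number of times with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # give up instead of retrying forever
            # Exponential backoff capped at max_delay_s, with full jitter so
            # many clients do not retry in lockstep.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


calls = {"n": 0}


def flaky():
    """Stand-in for a database call that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


delays = []
print(call_with_retries(flaky, sleep=delays.append))  # sleep is stubbed out here
```

Pairing this with a server-side operation timeout keeps an abandoned attempt from continuing to run after the client has moved on.
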
  45. Requirements & Specs. Make a DevOps contract: Database Access Requirements, Database Access Fulfillment Specification, Cluster Configuration, Monitoring and Alerting Specification
  46. Monitoring: Opcounters, Memory, Page Faults, Queues, Replication Lag, Oplog Window, Background Flush Average, Disk space
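
Two of these metrics are most useful read together: a secondary whose replication lag exceeds the oplog window can no longer catch up from the oplog and needs a full resync. The sketch below classifies lag relative to the window. The function names and the 50% warning fraction are assumptions, and the timestamps are made-up inputs rather than values read from a live server.

```python
from datetime import datetime, timedelta


def oplog_window(first_entry, last_entry):
    """Time span covered by the oplog, from its oldest to its newest entry."""
    return last_entry - first_entry


def lag_alert(lag, window, warn_fraction=0.5):
    """Classify replication lag relative to the oplog window."""
    if lag >= window:
        return "critical"  # secondary can no longer catch up from the oplog
    if lag >= window * warn_fraction:
        return "warn"
    return "ok"


window = oplog_window(datetime(2015, 8, 14), datetime(2015, 8, 15))  # 24 hours
print(lag_alert(timedelta(minutes=10), window))
print(lag_alert(timedelta(hours=13), window))
print(lag_alert(timedelta(hours=25), window))
```
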
  47. Thanks! { name: Bryan Reinero, title: Developer Advocate, twitter: @blimpyacht, code: github.com/breinero }