Webinar: Operational Best Practices

Post on 25-May-2015

Category:

Technology

DESCRIPTION

This webinar will cover best practices around dev/ops and general operations for those already familiar with the basics of MongoDB. Topics will include team roles around data model design, monitoring, hardware configurations, replication, and horizontal scaling.

TRANSCRIPT

Senior Solutions Architect, 10gen

Asya Kamsky

#MongoDB

Operational Best Practices

Operational Best Practices Asya Kamsky

Best Practices == More Value

How to get more sleep while your MongoDB cluster hums along

The Agenda

•  Roles and responsibilities

•  Schema design and application performance

•  Hardware

•  Replication

•  Sharding

•  Monitoring

Roles and Responsibilities

•  Application data needs

•  Schema design

•  Read and write patterns

•  Indexing strategy

•  Hardware: RAM, CPU, disk...

•  Network, firewalls, security

•  Backups

•  Maintenance

•  Upgrades

•  Monitoring

Roles: Application Developer, Data Architect, DBA, System Admin, Network Admin

Schema Design and Application Performance

In MongoDB correct schema design is essential for optimal application performance.

DATA != SCHEMA

Schema and Performance

Multiple types of indexes supported.

Indexing is essential

Understanding actual performance

•  Monitoring

•  Measuring

•  Benchmarking

•  Optimizing

•  Logs

•  Query plan

•  Application

•  Ad-hoc testing

Hardware

•  Memory

•  Storage

•  CPU - speed

•  CPU - number of cores

Impact on performance in that order!

Replica Sets

Replica Sets and Application

Diagram: a client application and its driver, with Write and Read arrows between the driver and a replica set (one primary, two secondaries).

Replica Set – HA

Diagram: Node 1 (secondary), Node 2 (secondary), Node 3 (primary), linked by replication and heartbeat.

Replica Set – Failure

Diagram: Node 1 (secondary), Node 2 (secondary), Node 3 (no role shown); heartbeat and primary election.

Replica Set – Failover

Diagram: Node 1 (secondary), Node 2 (primary), Node 3 (no role shown); replication and heartbeat.

Replica Set – Recovery

Diagram: Node 1 (secondary), Node 2 (primary), Node 3 (recovery); replication and heartbeat.

Replica Set – Reestablished

Diagram: Node 1 (secondary), Node 2 (primary), Node 3 (secondary); replication and heartbeat.

Replica Sets

•  Primary purpose:
    –  High Availability with automatic failover
    –  Disaster Recovery
    –  No-down-time maintenance
    –  No application changes on reconfiguration
    –  Extra copies of data for "special" read workloads

•  Full benefit achieved with advance planning
    –  Determine your SLA/HA requirements
    –  Determine your DR requirements
    –  Understand impact of node, network, DC failure
    –  Understand all available RS features: priority scores, hidden, delayed, tags
    –  Monitor and proactively remedy potential problems
    –  Practice recovery from disastrous failure
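To make the replica set features named above (priority, hidden, delayed, tags) more concrete, here is a minimal sketch of a configuration document built as a plain Python dictionary. The host names, tag values, and the one-hour delay are hypothetical. The field names follow the replica set configuration document as best I know it; the delay field was called "slaveDelay" in older releases and "secondaryDelaySecs" in newer ones, so check the documentation for your version before using it.

```python
# Sketch only: hosts, tags and the delay value are made-up examples.
def build_replica_set_config(name, delay_field="slaveDelay"):
    """Return a replica set configuration document as a plain dict."""
    members = [
        # Higher priority makes this member the preferred primary.
        {"_id": 0, "host": "db1.example.net:27017", "priority": 2,
         "tags": {"dc": "east"}},
        {"_id": 1, "host": "db2.example.net:27017", "priority": 1,
         "tags": {"dc": "east"}},
        {"_id": 2, "host": "db3.example.net:27017", "priority": 1,
         "tags": {"dc": "west"}},
        # Hidden, delayed member: priority 0 (never primary), not visible
        # to clients, applies operations one hour late, does not vote.
        {"_id": 3, "host": "db4.example.net:27017", "priority": 0,
         "hidden": True, delay_field: 3600, "votes": 0,
         "tags": {"dc": "west"}},
    ]
    return {"_id": name, "members": members}


def voting_members(config):
    """Members vote unless 'votes' is explicitly set to 0."""
    return [m for m in config["members"] if m.get("votes", 1) > 0]


config = build_replica_set_config("rs0")
print(len(voting_members(config)), "voting members")
```

The non-voting delayed member keeps the number of voters odd while still providing the "insurance" copy.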

•  Best Practices for Configuration
    –  Odd number of voting replica members
    –  Size the oplog appropriately for high volume loads
    –  Use multiple Data Centers/Availability Zones
    –  Use DNS names for node configuration
    –  Add hidden delayed-replication member as "insurance"
    –  All replica set nodes should have same capacity

•  Operation
    –  Upgrade secondaries first (primary last)
    –  Maintenance on secondaries first (primary last)
    –  Use 'rs.stepDown()' command
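Two of the configuration rules above come down to simple arithmetic. The sketch below shows why an even number of voters adds no fault tolerance over the odd number below it, and gives a rough estimate of how many hours of writes an oplog can hold. The function names and the sample sizes are assumptions for illustration, and real oplog usage depends on the workload.

```python
def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voters can be lost while a majority remains."""
    return voting_members - majority(voting_members)


def oplog_window_hours(oplog_size_mb, oplog_write_mb_per_hour):
    """Rough time a secondary can be offline and still catch up."""
    return oplog_size_mb / oplog_write_mb_per_hour


for n in range(3, 8):
    print(n, "voters: majority", majority(n),
          "tolerates", fault_tolerance(n), "failures")

# Hypothetical numbers: a 10 GB oplog written at 512 MB per hour.
print(oplog_window_hours(10 * 1024, 512), "hours of oplog")
```

Four voters tolerate one failure, the same as three, which is why an odd number of voting members is recommended.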

Sharded Clusters

Diagram: three app servers, each with a mongos, in front of three shards and three config servers.

Sharding

•  Keys to successful sharding:
    –  Pick a good shard key
    –  Make config servers resilient
    –  Shard before you "have to"

•  Good shard key is essential to achieving scaling
    –  Distributes your writes across all shards
    –  Allows majority of reads to be "targeted" (not scatter-gather)
    –  Exists in every document
    –  Has sufficiently high cardinality
    –  Allows you to take advantage of advanced features - tag aware balancing
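The first shard key property, distributing writes across all shards, can be illustrated with a small simulation. This is a simplified model and not how MongoDB routes data internally: it assumes three shards, range-based chunks with two hypothetical split points, and a modulo of an MD5 digest standing in for a hashed shard key.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 3


def range_shard(key, split_points):
    """Index of the shard whose key range contains `key`."""
    return bisect.bisect_right(split_points, key)


def hashed_shard(key, num_shards=NUM_SHARDS):
    """Stand-in for a hashed shard key: spread keys by digest value."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % num_shards


# Existing data covered keys 0..2999, split evenly across three shards.
split_points = [1000, 2000]

# New inserts use a monotonically increasing key (counter, timestamp).
new_keys = range(3000, 4000)

writes_by_range = Counter(range_shard(k, split_points) for k in new_keys)
writes_by_hash = Counter(hashed_shard(k) for k in new_keys)

print("range on increasing key:", dict(writes_by_range))
print("hashed key:             ", dict(writes_by_hash))
```

With the increasing key every new write lands on the last shard, while the hashed variant spreads the same writes over all three.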

•  Config Servers
    –  Three must be available to automatically balance data
    –  All three must be "in sync"
        •  if one becomes unavailable others go read-only
    –  At least one must be available to avoid disaster
        •  without information inside config server it's not possible to determine which shards contain which ranges of data!

•  Must stop balancing during backup
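As a rough picture of why the config server metadata matters, the sketch below keeps a tiny, hypothetical chunk map (key ranges to shard names) and uses it to route queries. The shard names, the "user_id" shard key, and the range boundaries are all invented for illustration; the real metadata format is different.

```python
import bisect

# Hypothetical chunk map: chunk i covers keys from CHUNK_LOWER_BOUNDS[i]
# up to (not including) the next lower bound.
CHUNK_LOWER_BOUNDS = [float("-inf"), 1000, 2000]
CHUNK_OWNERS = ["shardA", "shardB", "shardC"]


def route(query, shard_key="user_id"):
    """Return the shards a query must visit."""
    if shard_key in query:
        # Targeted: the chunk map tells us the single owning shard.
        idx = bisect.bisect_right(CHUNK_LOWER_BOUNDS, query[shard_key]) - 1
        return [CHUNK_OWNERS[idx]]
    # Scatter-gather: no shard key in the query, so ask every shard.
    return sorted(set(CHUNK_OWNERS))


print(route({"user_id": 1500}))
print(route({"email": "someone@example.com"}))
```

Without the chunk map there is nothing to look up, so even a query that includes the shard key could not be sent to the right shard.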

•  Shard before you "have to"
    –  Balancing data is an intensive process
    –  If existing cluster is near full capacity, balancing may impact response time of application
    –  Planning to shard well in advance gives more time
        •  to provision new hardware
        •  to select a good shard key
        •  to understand advanced sharding features (tagging)

•  Other best practices
    –  Three config servers
    –  Each shard is a replica set
    –  Test what you run
        •  use the same topology in QA as in production
    –  Monitor
        •  RAM
        •  disk I/O
        •  total storage
        •  MongoDB throughput

Monitoring

• Multiple CLI and internal status commands
    –  mongostat; mongotop; db.serverStatus()

• MMS

•  Plug-ins for munin, Nagios, cacti, etc.

•  Integration via SNMP to other tools
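The status commands listed above mostly report cumulative counters, so a monitoring script usually compares two snapshots to get a rate. The sketch below does that for operation counters shaped like the "opcounters" section of db.serverStatus(). The sample numbers and the 10-second interval are invented, and no database connection is made here.

```python
def ops_per_second(before, after, interval_seconds):
    """Per-second rate for every counter present in both snapshots."""
    return {
        op: (after[op] - before[op]) / interval_seconds
        for op in after
        if op in before
    }


# Two hypothetical snapshots taken 10 seconds apart.
snapshot_1 = {"insert": 1000, "query": 5000, "update": 200,
              "delete": 10, "getmore": 0, "command": 300}
snapshot_2 = {"insert": 1500, "query": 9000, "update": 250,
              "delete": 10, "getmore": 0, "command": 400}

rates = ops_per_second(snapshot_1, snapshot_2, interval_seconds=10)
for op, rate in rates.items():
    print(f"{op:8s} {rate:8.1f}/s")
```

Storing these rates over time is what makes trends and alert thresholds possible.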

MongoDB Monitoring Service (MMS)

Free, cloud-based service for monitoring and alerts

•  Charts, custom dashboards and automated alerting

•  Tracks 100+ metrics – performance, resource utilization, availability and response times

•  10,000+ users

A Picture Speaks a Thousand Words

Symptoms

•  High CPU use

•  Similar query pattern

Diagnostics - iostat

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz     await    svctm   %util
sdp        0.00    0.00  0.50  0.00   27.86    0.00     56.00    149.58  20320.00  2010.00  100.00
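The iostat line above shows a device at 100% utilization with an average wait of over 20 seconds per request. A small parser like the sketch below can turn such a line into named fields and flag it automatically. The threshold values are assumptions chosen for illustration, not recommendations from the talk.

```python
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s "
          "avgrq-sz avgqu-sz await svctm %util")
ROW = "sdp 0.00 0.00 0.50 0.00 27.86 0.00 56.00 149.58 20320.00 2010.00 100.00"


def parse_iostat_row(header, row):
    """Return (device, {column: value}) for one extended iostat row."""
    names = header.split()[1:]          # drop the "Device:" label
    fields = row.split()
    values = [float(v) for v in fields[1:]]
    return fields[0], dict(zip(names, values))


def looks_saturated(stats, util_threshold=90.0, await_threshold_ms=100.0):
    """Assumed thresholds: busy almost all the time and slow to respond."""
    return (stats["%util"] >= util_threshold
            and stats["await"] >= await_threshold_ms)


device, stats = parse_iostat_row(HEADER, ROW)
print(device, "await:", stats["await"], "ms, util:", stats["%util"], "%")
print("saturated:", looks_saturated(stats))
```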

Monitoring

• mongostat

• mongotop
Monitoring Best Practices

•  Monitor Logs
    –  Alert, escalate
    –  Correlate

•  Disk
    –  Monitor

•  Instrument/Monitor App (including logs!)

•  Know your application and application (write) characteristics

•  Performance test/analyze system behavior

•  Load test before deployment

•  Selectively use database profiling during testing

•  Alert on abnormal states

•  High CPU is often a sign of a poorly indexed query

Best Practices Summary

Best Practices

•  Pre-deployment
    –  Learn
    –  Plan
    –  Prototype/Benchmark
    –  Execute

•  During deployment
    –  Monitor
    –  Continue planning
    –  Evolve

System provisioning

•  Capacity

•  Performance

•  Scale

•  Configuration

Logs

•  Review

•  Alert

•  Rotate and collect (per cluster)

Query/Index Analysis

•  Database Profiler

•  Run explain periodically (sampled)

•  Instrument code, generate metrics

•  Look for similar patterns to find root cause
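One way to look for similar patterns is to reduce each query to its "shape" (field names and operators, with the values removed) and count how often each shape occurs. The sketch below works on query documents given as Python dictionaries. The sample queries are invented, and how you collect them (profiler, logs, application instrumentation) is left open.

```python
from collections import Counter


def query_shape(query):
    """Replace leaf values with '?' so queries that differ only in
    their values produce the same shape."""
    if isinstance(query, dict):
        items = sorted(query.items(), key=lambda kv: kv[0])
        return tuple((k, query_shape(v)) for k, v in items)
    if isinstance(query, list):
        return tuple(query_shape(v) for v in query)
    return "?"


# Hypothetical queries captured from an application.
queries = [
    {"user_id": 1, "status": "open"},
    {"status": "closed", "user_id": 7},
    {"age": {"$gt": 30}},
    {"user_id": 3, "status": "open"},
]

shape_counts = Counter(query_shape(q) for q in queries)
for shape, count in shape_counts.most_common():
    print(count, shape)
```

The most frequent shapes are the first candidates to check with explain and for a supporting index.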

Hardware Configuration

•  Pay attention to disk configurations

•  Load testing will find some misconfigurations

•  MongoDB depends on the OS a lot

Plan/Test Rollouts

•  Rolling upgrade for Replica Set

•  Generate indexes on secondaries first

•  Name services, use redirection
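The rolling upgrade order (secondaries first, primary last, with a step-down before touching the primary) can be written down as a simple plan generator. This is only a sketch of the ordering. The host names and the member list format are hypothetical, and the actual upgrade and step-down steps would be carried out by your own tooling.

```python
def rolling_upgrade_plan(members):
    """Order the steps: every secondary first, then step down and
    upgrade whichever member is currently primary."""
    secondaries = [m["host"] for m in members if m["state"] == "SECONDARY"]
    primaries = [m["host"] for m in members if m["state"] == "PRIMARY"]

    plan = [("upgrade", host) for host in secondaries]
    for host in primaries:
        plan.append(("step down", host))  # e.g. rs.stepDown() on that node
        plan.append(("upgrade", host))
    return plan


# Hypothetical three-member replica set.
members = [
    {"host": "db1.example.net", "state": "SECONDARY"},
    {"host": "db2.example.net", "state": "PRIMARY"},
    {"host": "db3.example.net", "state": "SECONDARY"},
]

for step, host in rolling_upgrade_plan(members):
    print(step, host)
```

The same ordering applies to building indexes on secondaries first.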

More References

•  Please take a look at http://docs.mongodb.org

•  Ask questions on mongodb-user group

•  Use MMS or historic monitoring
    –  Watch for trends
    –  Create alerts
    –  Forecast capacity for provisioning

•  Utilize all available resources
    –  10gen offers paid public and on-site training & free web-based classes
    –  consulting services
    –  pre-production and production support

Senior Solutions Architect, 10gen

Asya Kamsky

#MongoSV

Thank You
