Transcript
Page 1: MongoDB and AWS: Integrations

MongoDB and AWS: Integrating with AWS Services

Partner Technical Solutions, MongoDB Inc.

Sandeep Parikh

#mongodb

Page 2: MongoDB and AWS: Integrations

Recap: Deployment and Availability

• MongoDB basics

• Deployment configurations

• Instance types

• Best practices

• Slides and recording: http://www.mongodb.com/presentations/mongodb-and-amazon-web-services-deploying-high-availability

Page 3: MongoDB and AWS: Integrations

Recap: Storage Configurations

• Storage options

• Simple recommendations

• Backup and restore

• Advanced configurations

• Slides and recording: http://www.mongodb.com/presentations/mongodb-and-amazon-web-services-storage-options-mongodb-deployments

Page 4: MongoDB and AWS: Integrations

Agenda

• Available Services

• Integrations

• Infrastructure

• Future Directions

• Questions

Page 5: MongoDB and AWS: Integrations

Available Services

Page 6: MongoDB and AWS: Integrations

AWS Services

Compute Storage Persistent IPs DNS

Hadoop Data Warehouse

Stream processing

App deployment

Orchestration

Provisioning

App services Caching

Page 7: MongoDB and AWS: Integrations

AWS Services

Compute Storage Persistent IPs DNS

Hadoop Data Warehouse

Stream processing

App deployment

Orchestration

Provisioning Security Caching

Page 8: MongoDB and AWS: Integrations

Integrations

Page 9: MongoDB and AWS: Integrations

CloudFormation

• Simplify provisioning and deployment

• JSON-based templates

• Manage like source code

• Specify all manner of AWS components

• Bootstrap for other tools like Chef or Puppet

Page 10: MongoDB and AWS: Integrations

"Parameters" : {
  "KeyPairName" : {
    "Description" : "EC2 KeyPair to enable SSH access",
    "Type" : "String"
  },
  "SecurityGroupName" : {
    "Description" : "EC2 Security Group",
    "Type" : "String"
  },
  "InstanceType" : {
    "Type" : "String",
    "Default" : "m3.large",
    "AllowedValues" : ["m3.large", "m3.xlarge", "m3.2xlarge"],
    "Description" : "EC2 instance type"
  }
},

CloudFormation Sample

Page 11: MongoDB and AWS: Integrations

"Properties" : {
  "InstanceType" : { "Ref" : "InstanceType" },
  "ImageId" : { … },
  "SecurityGroups" : [{ "Ref" : "SecurityGroupName" }],
  "KeyName" : { "Ref" : "KeyPairName" },
  "EbsOptimized" : "true",
  "BlockDeviceMappings" : [{
    "DeviceName" : "/dev/xvdf",
    "Ebs" : {
      "VolumeSize" : "200", "Iops" : "1000",
      "VolumeType" : "io1", "DeleteOnTermination" : "false"
    }
  }]
}

CloudFormation Sample
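Because the templates are plain JSON, they can also be assembled and checked in code before upload. The following is a minimal sketch, not the author's tooling: it joins the two fragments above into one template and verifies that every "Ref" points at something defined. The logical resource name "MongoDBInstance" and the "ami-EXAMPLE" value are hypothetical placeholders; the slides elide the ImageId, so the caller must supply it.

```python
import json


def build_template(image_id):
    """Assemble a single-instance template from the two slide fragments.

    image_id comes from the caller because the slides elide the ImageId value.
    """
    parameters = {
        "KeyPairName": {
            "Description": "EC2 KeyPair to enable SSH access",
            "Type": "String",
        },
        "SecurityGroupName": {
            "Description": "EC2 Security Group",
            "Type": "String",
        },
        "InstanceType": {
            "Type": "String",
            "Default": "m3.large",
            "AllowedValues": ["m3.large", "m3.xlarge", "m3.2xlarge"],
            "Description": "EC2 instance type",
        },
    }
    properties = {
        "InstanceType": {"Ref": "InstanceType"},
        "ImageId": image_id,
        "SecurityGroups": [{"Ref": "SecurityGroupName"}],
        "KeyName": {"Ref": "KeyPairName"},
        "EbsOptimized": "true",
        "BlockDeviceMappings": [{
            "DeviceName": "/dev/xvdf",
            "Ebs": {
                "VolumeSize": "200",
                "Iops": "1000",
                "VolumeType": "io1",
                "DeleteOnTermination": "false",
            },
        }],
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": parameters,
        "Resources": {
            # "MongoDBInstance" is a hypothetical logical name.
            "MongoDBInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": properties,
            }
        },
    }


def undefined_refs(template):
    """Return Ref targets that are neither a declared parameter nor a resource."""
    known = set(template["Parameters"]) | set(template["Resources"])
    missing = set()

    def walk(node):
        if isinstance(node, dict):
            if set(node) == {"Ref"} and node["Ref"] not in known:
                missing.add(node["Ref"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(template)
    return missing


template = build_template("ami-EXAMPLE")  # hypothetical placeholder AMI ID
print(json.dumps(template, indent=2))
print("undefined refs:", undefined_refs(template))
```

Note that this simple check would also flag CloudFormation pseudo parameters such as AWS::Region; a fuller version would need to whitelist them.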

Page 12: MongoDB and AWS: Integrations

CloudFormation Templates

• https://github.com/crcsmnky/aws-cfn-mongodb

• Templates to launch single-node MongoDB deployment

• Each one implements our best practices: EBS-optimized, PIOPS, ulimit, readahead

• Used to generate AWS Marketplace instances

Page 13: MongoDB and AWS: Integrations

CloudFormation Templates

Clone the repo

Upload the CF template

Instance provisioning starts

Instance clones repo

Instance runs setup script

Instance provisioned and deployed

Page 14: MongoDB and AWS: Integrations

CloudFormation Tools

• https://github.com/cloudtools/troposphere

• Python package to generate CF templates

• Next versions of our templates will leverage this

• Coming soon: Replica Sets

• Coming later: Sharded Cluster

Page 15: MongoDB and AWS: Integrations
Page 16: MongoDB and AWS: Integrations

Elastic MapReduce

• Quickly deploy and run Hadoop in AWS

• Tuned distributions to run on top of EC2

• Provision deployments with any number of nodes

• Supports Spot and Reserved pricing for savings

Page 17: MongoDB and AWS: Integrations

EMR and MongoDB

• https://github.com/mongodb/mongo-hadoop

• MongoDB-Hadoop connector: bi-directional access to/from MongoDB

• Supports MapReduce, Hive, Pig, Streaming

• Read/write from MongoDB deployments or BSON backup files
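To make "BSON backup files" concrete: a dump file is a sequence of BSON documents laid end to end, and each document begins with its own total size as a little-endian 32-bit integer. The sketch below uses only the Python standard library to split such a byte stream into raw documents. It illustrates the file layout only and is not how the connector itself is implemented; the sample bytes encode the document {"a": 1} by hand.

```python
import struct


def split_bson_stream(data):
    """Yield the raw bytes of each document in a concatenated BSON dump."""
    offset = 0
    while offset < len(data):
        if offset + 4 > len(data):
            raise ValueError("truncated length prefix at offset %d" % offset)
        # Each BSON document starts with its total size as a little-endian int32.
        (size,) = struct.unpack_from("<i", data, offset)
        if size < 5 or offset + size > len(data):
            raise ValueError("invalid document size %d at offset %d" % (size, offset))
        document = data[offset:offset + size]
        # Every BSON document ends with a single NUL byte.
        if document[-1] != 0:
            raise ValueError("document at offset %d lacks the trailing NUL" % offset)
        yield document
        offset += size


# Hand-encoded {"a": 1}: size (12), int32 type tag, key "a", value 1, terminator.
sample = b"\x0c\x00\x00\x00" + b"\x10" + b"a\x00" + b"\x01\x00\x00\x00" + b"\x00"
documents = list(split_bson_stream(sample * 3))
print(len(documents), "documents of", len(documents[0]), "bytes each")
```

Because each document carries its own length, a file can be walked without decoding any field values.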

Page 18: MongoDB and AWS: Integrations

EMR with MongoDB

[Diagram: MongoDB, BSON, S3, and a cluster of EMR nodes]

Page 19: MongoDB and AWS: Integrations

EMR Workflow

• Bootstrap script: MongoDB-Hadoop, MongoDB Java driver

• Copy resources: bootstrap script, MapReduce job

• Launch EMR: instance type, instance count, arguments

• MapReduce output: MongoDB, BSON in S3

• EMR logs: written to S3

Page 20: MongoDB and AWS: Integrations

$ elastic-mapreduce --create --jobflow ENRON000 \
    --instance-type m1.xlarge --num-instances 5 \
    --bootstrap-action s3://$S3_BUCKET/bootstrap.sh \
    --log-uri s3://$S3_BUCKET/enron_logs \
    --jar s3://$S3_BUCKET/enron-example.jar \
    --arg -D --arg mongo.job.input.format=com.mongodb.hadoop.BSONFileInputFormat \
    --arg -D --arg mapred.input.dir=s3n://mongo-test-data/messages.bson \
    --arg -D --arg mapred.output.dir=s3n://$S3_BUCKET/BSON_OUT \
    --arg -D --arg mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat

EMR Launch
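Each Hadoop property in the launch command is passed as the pair "--arg -D --arg key=value", with no spaces around the equals sign. As a rough sketch, the helper below generates that portion of the argument list from a list of properties. The function name is hypothetical, and the bucket name "my-bucket" stands in for $S3_BUCKET; nothing here invokes the CLI.

```python
def hadoop_property_args(properties):
    """Expand (key, value) pairs into the --arg -D --arg key=value form."""
    args = []
    for key, value in properties:
        # Key and value are joined without spaces so they arrive as one argument.
        args += ["--arg", "-D", "--arg", "%s=%s" % (key, value)]
    return args


bucket = "my-bucket"  # hypothetical stand-in for $S3_BUCKET
properties = [
    ("mongo.job.input.format", "com.mongodb.hadoop.BSONFileInputFormat"),
    ("mapred.input.dir", "s3n://mongo-test-data/messages.bson"),
    ("mapred.output.dir", "s3n://%s/BSON_OUT" % bucket),
    ("mongo.job.output.format", "com.mongodb.hadoop.BSONFileOutputFormat"),
]
print(" ".join(hadoop_property_args(properties)))
```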

Page 21: MongoDB and AWS: Integrations

Elastic Beanstalk

• Deploy and manage applications

• Handles provisioning, scaling, load balancing

• Built on EC2, S3, SNS, Auto Scaling

Page 22: MongoDB and AWS: Integrations

Elastic Beanstalk Architecture

App Server

App Server

App Server

Security Group

Elastic Load Balancer

Auto Scaling Group

Page 23: MongoDB and AWS: Integrations

Elastic Beanstalk with MongoDB

App Server

App Server

App Server

Security Group

Elastic Load Balancer

Auto Scaling Group

mongos

mongos

mongos

MongoDB

Page 24: MongoDB and AWS: Integrations

Elastic Beanstalk with MongoDB

• Customize and configure software that your app needs (e.g. mongos)

• Install packages

• Create files

• Execute commands (before or after app is setup)

• Control system services

• http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html
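The four customization points above correspond to the "packages", "files", "commands", and "services" keys of an .ebextensions configuration file, which may be written in YAML or JSON. The sketch below builds such a document as JSON. Everything specific to mongos is an assumption for illustration: the package name, the service name (which presumes an init script exists under that name), the /etc/mongos.conf path, the log directory, and the config server hostnames.

```python
import json


def mongos_ebextension(config_servers, package, service):
    """Build an .ebextensions document using the four customization keys."""
    return {
        # Install packages (assumes a yum repository providing the package).
        "packages": {"yum": {package: []}},
        # Create files: a minimal mongos config pointing at the config servers.
        "files": {
            "/etc/mongos.conf": {
                "mode": "000644",
                "owner": "root",
                "group": "root",
                "content": "configdb = %s\n" % ",".join(config_servers),
            }
        },
        # Execute commands before the application is set up.
        "commands": {
            "01_create_log_dir": {"command": "mkdir -p /var/log/mongodb"}
        },
        # Control system services (assumes an init script with this name).
        "services": {
            "sysvinit": {service: {"enabled": "true", "ensureRunning": "true"}}
        },
    }


config = mongos_ebextension(
    ["cfg1.example.com:27019", "cfg2.example.com:27019", "cfg3.example.com:27019"],
    package="mongodb-org-mongos",  # assumed package name
    service="mongos",              # assumed service name
)
print(json.dumps(config, indent=2))
```

The output would be saved under the application's .ebextensions directory with a .config extension.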

Page 25: MongoDB and AWS: Integrations

Infrastructure

Page 26: MongoDB and AWS: Integrations

Elastic IPs

• EC2 instances use dynamic IP addresses

• EIPs are static addresses that can be assigned to individual EC2 instances

• Unfortunately you have a limited number

Page 27: MongoDB and AWS: Integrations

Route53

• Highly available and scalable DNS service in AWS

• Hostnames can be assigned to EC2 instances, ELB instances, or S3 buckets

• DNS load balancing with weighted-round-robin

• Supports hostnames for non-AWS infrastructure

Page 28: MongoDB and AWS: Integrations

Route53 and MongoDB

• Short answer: use hostnames for all components

• With replica sets, hostnames can ease machine replacement

• With sharded clusters, hostnames can simplify config server maintenance
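A small sketch of what "use hostnames for all components" looks like in practice: both the replica set configuration document and the client connection string refer to members by DNS name, so a replaced machine only needs its DNS record updated. The hostnames and the replica set name "rs0" are hypothetical.

```python
def replica_set_config(name, hosts, port=27017):
    """Build a replica set configuration document keyed by hostname."""
    return {
        "_id": name,
        "members": [
            {"_id": index, "host": "%s:%d" % (host, port)}
            for index, host in enumerate(hosts)
        ],
    }


def connection_uri(name, hosts, port=27017):
    """Build a MongoDB connection string listing every member as a seed."""
    seeds = ",".join("%s:%d" % (host, port) for host in hosts)
    return "mongodb://%s/?replicaSet=%s" % (seeds, name)


# Hypothetical Route53 records, one per replica set member.
hosts = ["mongo1.example.com", "mongo2.example.com", "mongo3.example.com"]
print(replica_set_config("rs0", hosts))
print(connection_uri("rs0", hosts))
```

The configuration document has the shape passed to rs.initiate() in the mongo shell; neither it nor the URI contains an IP address that would change when an instance is replaced.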

Page 29: MongoDB and AWS: Integrations

VPC

• Virtual Private Cloud lets you provision a logically isolated network inside AWS

• You manage all aspects of networking, including IP address ranges, subnets, routing tables, and gateways

• Can be used as an extension to an offsite data center with Hardware VPN

Page 30: MongoDB and AWS: Integrations

VPC Public and Private

http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html

• Private subnets are hidden from the outside world

• Internet Gateway and EIPs can be used to access

• Web tier in public subnet

• Data tier in private subnet
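The public/private split can be planned with simple address arithmetic. The sketch below carves a VPC range into one public subnet for the web tier and one private subnet for the data tier, using Python's standard ipaddress module. The 10.0.0.0/16 range and /24 subnet size are assumed example values, not a recommendation from the talk.

```python
import ipaddress


def plan_subnets(vpc_cidr, new_prefix=24):
    """Carve the first two equal-sized subnets out of a VPC address range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = vpc.subnets(new_prefix=new_prefix)
    return {
        "vpc": vpc,
        "public": next(subnets),   # web tier
        "private": next(subnets),  # data tier (MongoDB)
    }


plan = plan_subnets("10.0.0.0/16")  # assumed example range
for name in ("vpc", "public", "private"):
    print(name, plan[name])
```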

Page 31: MongoDB and AWS: Integrations

ElastiCache

• Distributed in-memory cache

• Backed by Memcached or Redis

• Can be a drop-in replacement for existing cache deployments

• Supports auto-discovery and read-replicas

Page 32: MongoDB and AWS: Integrations

Future Directions

Page 33: MongoDB and AWS: Integrations

Redshift

• Fully-managed petabyte-scale data warehouse service

• MongoDB not natively supported as a data source

• … So how do you get your data in?

Page 34: MongoDB and AWS: Integrations

Data Pipeline

• Process and move data between different AWS compute and storage services

• Data Pipeline handles resources, failures, and dependencies

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/what-is-datapipeline.html

Page 35: MongoDB and AWS: Integrations

Data Pipeline with MongoDB

AWS Data Pipeline

MongoDB

S3

EMR or Redshift
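One possible staging step for the MongoDB to S3 hop is sketched below, under the assumption that downstream consumers prefer flat rows: nested sub-documents are collapsed into top-level columns and written as newline-delimited JSON. The underscore-joined column naming and the sample documents are illustrative choices only; the slides do not prescribe a format.

```python
import json


def flatten(document, prefix=""):
    """Collapse nested sub-documents into top-level keys joined by underscores."""
    flat = {}
    for key, value in document.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "_"))
        else:
            flat[name] = value
    return flat


def to_json_lines(documents):
    """Serialize documents as newline-delimited JSON, one flattened row per line."""
    # default=str stringifies values json cannot encode natively (e.g. datetime).
    return "\n".join(
        json.dumps(flatten(doc), sort_keys=True, default=str) for doc in documents
    )


# Hypothetical sample documents.
documents = [
    {"_id": 1, "headers": {"From": "a@example.com", "To": "b@example.com"}},
    {"_id": 2, "body": "hello"},
]
print(to_json_lines(documents))
```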

Page 36: MongoDB and AWS: Integrations

OpsWorks

• Complete DevOps stack

• Model and manage apps, load balancers, databases

• Uses Chef recipes

• Load or time-based scaling

• Deploying MongoDB with OpsWorks: http://blogs.aws.amazon.com/application-management/post/Tx1RB65XDMNVLUA/Deploying-MongoDB-with-OpsWorks

Page 37: MongoDB and AWS: Integrations

CloudWatch

• Monitoring for AWS resources

• Supports custom metrics

http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatch.html

Page 38: MongoDB and AWS: Integrations

aws cloudwatch put-metric-data \
    --metric-name ResidentMemory \
    --namespace MongoDB \
    --timestamp 2014-02-14T20:30:00Z \
    --value 32 \
    --unit Gigabytes

CloudWatch Custom Metrics
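As a sketch of how a value like the one above might be produced, the helper below derives resident memory from a serverStatus-style document (MongoDB reports mem.resident in megabytes) and builds the matching argument list for the same CLI call. The sample status document is made up, and the code only constructs the command; it does not run it.

```python
from datetime import datetime, timezone


def resident_memory_gb(server_status):
    """Convert serverStatus mem.resident (reported in megabytes) to gigabytes."""
    return server_status["mem"]["resident"] / 1024.0


def put_metric_command(metric_name, value, unit, timestamp, namespace="MongoDB"):
    """Build the argument list for `aws cloudwatch put-metric-data`."""
    utc = timestamp.astimezone(timezone.utc)
    return [
        "aws", "cloudwatch", "put-metric-data",
        "--metric-name", metric_name,
        "--namespace", namespace,
        "--timestamp", utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "--value", str(value),
        "--unit", unit,
    ]


status = {"mem": {"resident": 32768}}  # made-up serverStatus excerpt
command = put_metric_command(
    "ResidentMemory",
    resident_memory_gb(status),
    "Gigabytes",
    datetime(2014, 2, 14, 20, 30, tzinfo=timezone.utc),
)
print(" ".join(command))
```

The resulting list can be handed to subprocess.run from a cron job or monitoring agent on the instance.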

Page 39: MongoDB and AWS: Integrations

Questions?

Page 40: MongoDB and AWS: Integrations

MongoDB World

New York City, June 23-25

#MongoDBWorld

See what’s next in MongoDB, including:

• MongoDB 2.6

• Sharding

• Replication

• Aggregation

http://world.mongodb.com

Save 25% with discount code 25SandeepParikh

