MongoDB
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database written in C++.
Home: http://www.mongodb.org/
Support: http://www.10gen.com/
Production deployments: http://www.mongodb.org/display/DOCS/Production+Deployments
Agenda
Getting Up to Speed with MongoDB (key features)
Developing with MongoDB (start, shutdown, connect, query, DML)
Advanced Usage (index, aggregation, GridFS)
Administration (admin, replication, sharding)
MISC (BSON; internals)
Getting Up to Speed with MongoDB
Key Features of MongoDB
1. Document-oriented
2. Schema-free
Design Philosophy
Different from a relational database
• Easy and simple admin/dev: no transactions, no relations, no durability, no SQL
Different from a key-value database
• Many more functions: indexing, aggregation (MapReduce etc.), fixed-size collections, file storage, replication
Getting Up to Speed with MongoDB
Key Features of MongoDB
1. Document-oriented (multiple key/value pairs)
{ "_id" : ObjectId("4d9a2fa7640cde2b218c6f65"), "version" : 1300363306, "evenOrOdd" : 0, "siteId" : 0 }
{ "_id" : ObjectId("4d9cd50d2a98297726eeda5b"), "prefix" : "craft fl", "res" : { "sug" : [ "craft flowers" ], "categories" : [ [ 14339, "Crafts" ] ] } }
2. Instance --1:X-- database --1:X-- collection --1:X-- document
3. Schema-free
• use test
• db.test.insert({ "version" : 1300363306, "evenOrOdd" : 0, "siteId" : 0 })
• db.test.insert({"prefix" : "craft fl", "res" : { "sug" : [ "craft flowers" ], "categories" : [ [ 14339, "Crafts" ] ] } })
• db.test.insert({"name" : "binzhang"});
• db.test.ensureIndex({"name" : 1})
Getting Up to Speed with MongoDB
Design Philosophy
1. Databases are specializing - the "one size fits all" approach no longer applies. MongoDB sits between in-memory key-value stores and persistent relational databases.
2. By reducing transactional semantics the db provides, one can still solve an interesting set of problems where performance is very important, and horizontal scaling then becomes easier. The simpler, the faster.
3. The document data model (JSON/BSON) is easy to code to, easy to manage (schemaless), and yields excellent performance by grouping relevant data together internally. The trade-off is that it wastes a bit of space.
4. A non-relational approach is the best path to database solutions which scale horizontally to many machines. Scaling out is easy for applications that are not complex.
5. While there is an opportunity to relax certain capabilities for better performance, there is also a need for deeper functionality than that provided by pure key/value stores.
Getting Up to Speed with MongoDB
Different from a relational database
1. Easy and simple admin/dev
• kill -INT shuts down the instance.
2. No transactions
3. No relations
4. No durability
• kill -9 can corrupt the database, which then needs a repair at the next startup.
5. No SQL
• MongoDB comes with a JavaScript shell that allows interaction with a MongoDB instance from the command line.
Getting Up to Speed with MongoDB
Different from a key-value database
1. Datatypes: null, boolean, 32-bit integer, 64-bit integer, 64-bit floating point number, string, object id, date, regular expression, code, binary data, undefined, array, embedded document, etc.
2. Indexing: unique index, compound index, geospatial indexing, etc.
3. Aggregation (MapReduce etc.): group, distinct, etc.
4. Fixed-size collections: Capped collections are fixed in size and are useful for certain types of data, such as logs.
5. File storage: a protocol for storing large files, uses subcollections to store file metadata separately from content chunks
6. Replication: includes master-slave mode and replica set mode
7. Security: simple authorization.
Agenda
Getting Up to Speed with MongoDB (Summary: document-oriented & schema-free)
Developing with MongoDB (start, shutdown, connect, query, DML)
Advanced Usage (index, aggregation, GridFS)
Administration (admin, replication, sharding)
MISC
Developing with MongoDB
Continued:
Start MongoDB
Connect
Query
DML (create, insert, update, delete, drop)
Stop cleanly
Developing with MongoDB
Start MongoDB
mkdir /MONGO/data01
/opt/mongo/bin/mongod --logpath /MONGO/log01/server_log.txt --logappend --fork --cpu --dbpath /MONGO/data01 --replSet autocomplete
Fri Apr 1 14:37:08 [initandlisten] MongoDB starting : pid=10799 port=27017 dbpath=/MONGO/data01 64-bit
Fri Apr 1 14:37:08 [initandlisten] db version v1.8.0, pdfile version 4.5
Fri Apr 1 14:37:08 [initandlisten] git version: 9c28b1d608df0ed6ebe791f63682370082da41c0
Fri Apr 1 14:37:08 [initandlisten] build sys info: Linux bs-linux64.10gen.cc 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Fri Apr 1 14:37:08 [initandlisten] waiting for connections on port 27017
Fri Apr 1 14:37:08 [websvr] web admin interface listening on port 28017
Developing with MongoDB
Connect to mongod
$ /opt/mongo/bin/mongo
MongoDB shell version: 1.8.0
connecting to: test
autocomplete:PRIMARY> exit
Bye
usage: /opt/mongo/bin/mongo [options] [db address] [file names (ending in .js)]
Developing with MongoDB
MongoDB Shell
MongoDB comes with a JavaScript shell that allows interaction with a MongoDB instance from the command line.
Query – find()
db.c.find()
    returns everything in the collection c.
db.users.find({"age" : 27})
    returns documents where the value for "age" is 27.
db.users.find({}, {"username" : 1, "email" : 1})
    if you are interested only in the "username" and "email" keys.
db.users.find({}, {"fatal_weakness" : 0})
    if you never want to return the "fatal_weakness" key.
db.users.find({}, {"username" : 1, "_id" : 0})
db.users.find({"age" : {"$gte" : 18, "$lte" : 30}})
db.raffle.find({"ticket_no" : {"$in" : [725, 542, 390]}})
db.c.find({"z" : {"$in" : [null], "$exists" : true}})
db.users.find({"name" : /joe/i})
Developing with MongoDB
Behind find(): Cursor
The database returns results from find using a cursor. The client-side implementations of cursors generally allow you to control a great deal about the eventual output of a query.
> for(i=0; i<100; i++) {
... db.c.insert({x : i});
... }
> var cursor = db.collection.find();
> while (cursor.hasNext()) {
... obj = cursor.next();
... // do stuff
... }
> var cursor = db.people.find();
> cursor.forEach(function(x) {
... print(x.name);
... });
adam
matt
zak
Developing with MongoDB
Behind find(): Cursor (continued)
Getting consistent results
A fairly common way of processing data is to pull it out of MongoDB, change it in some way, and then save it again:
cursor = db.foo.find();
while (cursor.hasNext()) {
var doc = cursor.next();
doc = process(doc);
db.foo.save(doc);
}
If a saved document grows and has to be moved, the same cursor can return it a second time. Taking a snapshot of the query avoids this:
var cursor = db.myCollection.find({country:'uk'}).snapshot();
Developing with MongoDB
Create collection
db.foo.insert({"bar" : "baz"})
Insert
db.foo.insert({"bar" : "baz"})
Update
db.users.update({"_id" : ObjectId("4b253b067525f35f94b60a31")},
... {"$set" : {"favorite book" : "war and peace"}})
Delete
db.users.remove()
db.mailing.list.remove({"opt-out" : true})
Drop collection
db.foo.drop();
Developing with MongoDB
DML (continued): Safe Operation
1. MongoDB does not wait for a response by default when writing to the database. Use the getLastError command to ensure that operations have succeeded.
2. The getLastError command can be invoked automatically with many of the drivers when saving and updating in "safe" mode (some drivers call this "set write concern").
db.$cmd.findOne({getlasterror:1})
db.runCommand("getlasterror")
db.getLastError()
Developing with MongoDB
Stop MongoDB
kill -2 10014 (SIGINT) or kill 10014 (SIGTERM).
1. wait for any currently running operations or file preallocations to finish (this could take a moment)
2. close all open connections
3. flush all data to disk
4. halt.
Or use the shutdown command:
> use admin
switched to db admin
> db.shutdownServer();
server should be down...
Agenda
Getting Up to Speed with MongoDB (Summary: document-oriented & schema-free)
Developing with MongoDB (Summary: find())
Advanced Usage (index, aggregation, GridFS)
Administration (admin, replication, sharding)
MISC
MongoDB Advanced Usage
Advanced Usage
Index
Aggregation
MapReduce
Database commands
Capped Collections
GridFS: Storing Files
MongoDB Advanced Usage
Index
MongoDB's indexes work almost identically to typical relational database indexes; index optimization for MySQL/Oracle/SQLite will apply equally well to MongoDB.
If an index has N keys, it will make queries on any prefix of those keys fast.
Example
db.people.find({"username" : "mark"})
db.people.ensureIndex({"username" : 1})
db.people.find({"date" : date1}).sort({"date" : 1, "username" : 1})
db.people.ensureIndex({"date" : 1, "username" : 1})
db.people.find({"username" : "mark"}).explain()
MongoDB Advanced Usage
Index (continued)
Indexes can be created on keys in embedded documents in the same way that they are created on normal keys.
Indexing for sorts: indexing the sort allows MongoDB to pull the sorted data in order, allowing you to sort any amount of data without running out of memory.
Index naming rule: keyname1_dir1_keyname2_dir2_..._keynameN_dirN, where keynameX is the index's key and dirX is the index's direction (1 or -1).
db.blog.ensureIndex({"comments.date" : 1})
db.people.ensureIndex({"username" : 1}, {"unique" : true})
db.people.ensureIndex({"username" : 1}, {"unique" : true, "dropDups" : true})
autocomplete:PRIMARY> db.system.indexes.find()
{ "name" : "_id_", "ns" : "test.fs.files", "key" : { "_id" : 1 }, "v" : 0 }
{ "ns" : "test.fs.files", "key" : { "filename" : 1 }, "name" : "filename_1", "v" : 0 }
{ "name" : "_id_", "ns" : "test.fs.chunks", "key" : { "_id" : 1 }, "v" : 0 }
{ "ns" : "test.fs.chunks", "key" : { "files_id" : 1, "n" : 1 }, "name" : "files_id_1_n_1", "v" : 0 }
MongoDB Advanced Usage
Index (continued): explain()
explain will return information about the indexes used for the query (if any) and stats about timing and the number of documents scanned.
autocomplete:PRIMARY> db.users.find({"name":"user0"}).explain()
{
    "cursor" : "BtreeCursor name_1",
    "nscanned" : 1,
    "nscannedObjects" : 1,
    "n" : 1,
    "millis" : 0,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "isMultiKey" : false,
    "indexOnly" : false,
    "indexBounds" : { "name" : [ [ "user0", "user0" ] ] }
}
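The index naming rule (keyname1_dir1_..._keynameN_dirN) accounts for names such as name_1 in the explain() output and files_id_1_n_1 in system.indexes. A minimal sketch in plain JavaScript, assuming a hypothetical helper called indexName that is not part of the MongoDB shell:

```javascript
// Hypothetical helper (not a MongoDB API): derive the default index name
// from a key specification, following keyname1_dir1_..._keynameN_dirN.
function indexName(keySpec) {
  return Object.keys(keySpec)
    .map(function (key) { return key + "_" + keySpec[key]; })
    .join("_");
}

console.log(indexName({ "name": 1 }));                 // name_1
console.log(indexName({ "files_id": 1, "n": 1 }));     // files_id_1_n_1
console.log(indexName({ "date": 1, "username": -1 })); // date_1_username_-1
```

The helper relies on JavaScript keeping string keys in insertion order, which matters because the key order is part of a compound index's definition.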
MongoDB Advanced Usage
Index (continued): hint()
If you find that Mongo is using different indexes than you want it to for a query, you can force it to use a certain index by using hint.
db.c.find({"age" : 14, "username" : /.*/}).hint({"username" : 1, "age" : 1})
Index (continued): change index
db.runCommand({"dropIndexes" : "foo", "index" : "alphabet"})
db.people.ensureIndex({"username" : 1}, {"background" : true})
Using the {"background" : true} option builds the index in the background, while handling incoming requests. If you do not include the background option, the database will block all other requests while the index is being built.
MongoDB Advanced Usage
Aggregation
db.foo.count()
db.foo.count({"x" : 1})
db.runCommand({"distinct" : "people", "key" : "age"})
Group
{"day" : "2010/10/03", "time" : "10/3/2010 03:57:01 GMT-400", "price" : 4.23}
{"day" : "2010/10/04", "time" : "10/4/2010 11:28:39 GMT-400", "price" : 4.27}
{"day" : "2010/10/03", "time" : "10/3/2010 05:00:23 GMT-400", "price" : 4.10}
{"day" : "2010/10/06", "time" : "10/6/2010 05:27:58 GMT-400", "price" : 4.30}
{"day" : "2010/10/04", "time" : "10/4/2010 08:34:50 GMT-400", "price" : 4.01}
> db.runCommand({"group" : {
... "ns" : "stocks",
... "key" : "day",
... "initial" : {"time" : 0},
... "$reduce" : function(doc, prev) {
...     if (doc.time > prev.time) {
...         prev.price = doc.price;
...         prev.time = doc.time;
...     }
... }}})
MongoDB Advanced Usage
MapReduce
It is a method of aggregation that can be easily parallelized across multiple servers. It splits up a problem, sends chunks of it to different machines, and lets each machine solve its part of the problem. When all of the machines are finished, they merge all of the pieces of the solution back into a full solution.
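The map, group, and reduce flow can be tried without a server. The sketch below is plain JavaScript and assumes a hypothetical in-memory helper named mapReduce; it is not the database's mapreduce command, it runs in a single process, and it passes emit to map as an argument instead of providing it as a global the way the shell does. It counts how often each key appears across a few sample documents.

```javascript
// Hypothetical in-memory stand-in for the mapreduce command (single process,
// no sharding); it only illustrates the map -> group by key -> reduce flow.
function mapReduce(docs, map, reduce) {
  var groups = {};
  // map() calls emit(key, value); collect the emitted values per key.
  function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
  }
  // Run map once per document, with `this` bound to the document.
  docs.forEach(function (doc) { map.call(doc, emit); });
  // Merge each key's emitted values into one result.
  var results = {};
  Object.keys(groups).forEach(function (key) {
    results[key] = reduce(key, groups[key]);
  });
  return results;
}

var map = function (emit) {
  for (var key in this) { emit(key, { count: 1 }); }
};
var reduce = function (key, emits) {
  var total = 0;
  for (var i in emits) { total += emits[i].count; }
  return { count: total };
};

// Sample documents are made up for illustration.
var docs = [{ _id: 1, x: 1 }, { _id: 2, x: 2, y: 3 }];
var keyCounts = mapReduce(docs, map, reduce);
console.log(JSON.stringify(keyCounts));
// {"_id":{"count":2},"x":{"count":2},"y":{"count":1}}
```

Because reduce returns a value with the same shape as the values it receives, partial results from different machines could be reduced again, which is what makes the approach easy to parallelize.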
Example: Finding All Keys in a Collection
> map = function() {
... for (var key in this) {
... emit(key, {count : 1});
... }};
> reduce = function(key, emits) {
... total = 0;
... for (var i in emits) {
... total += emits[i].count; }
... return {"count" : total};
... }
> mr = db.runCommand({"mapreduce" : "foo", "map" : map, "reduce" : reduce})
> db[mr.result].find()
MongoDB Advanced Usage
Database commands
Commands implement all of the functionality that doesn't fit neatly into "create, read, update, delete."
Example:
> db.runCommand({"drop" : "test"});
{ "errmsg" : "ns not found", "ok" : false }
This is equivalent to querying the internal $cmd collection:
> db.$cmd.findOne({"drop" : "test"});
Show all commands:
> db.listCommands()
MongoDB Advanced Usage
Capped Collections
1. Capped collections automatically age out the oldest documents as new documents are inserted.
2. Documents cannot be removed or deleted (aside from the automatic age-out described earlier), and updates that would cause documents to move (in general updates that cause documents to grow in size) are disallowed.
3. Inserts into a capped collection are extremely fast.
4. By default, any find performed on a capped collection will always return results in insertion order.
5. Ideal for use cases like logging.
6. Replication uses a capped collection as its oplog.
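The age-out and insertion-order behavior listed above can be modeled in a few lines. This is a sketch in plain JavaScript, assuming a hypothetical CappedCollection class that is not MongoDB code; for simplicity it caps the number of documents, whereas real capped collections are sized in bytes.

```javascript
// Hypothetical in-memory model of a capped collection (not MongoDB code).
// Assumption: the cap is a document count; real capped collections are
// fixed in size in bytes.
function CappedCollection(maxDocs) {
  this.maxDocs = maxDocs;
  this.docs = [];
}

CappedCollection.prototype.insert = function (doc) {
  this.docs.push(doc);
  if (this.docs.length > this.maxDocs) {
    this.docs.shift(); // age out the oldest document
  }
};

CappedCollection.prototype.find = function () {
  return this.docs.slice(); // results come back in insertion order
};

var log = new CappedCollection(3);
for (var i = 0; i < 5; i++) {
  log.insert({ x: i });
}
console.log(JSON.stringify(log.find())); // [{"x":2},{"x":3},{"x":4}]
```

There is deliberately no remove method, mirroring the rule that documents leave a capped collection only by aging out.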
MongoDB Advanced Usage
GridFS: Storing Files
GridFS is a mechanism for storing large binary files in MongoDB.
Why use GridFS?
• Using GridFS can simplify your stack. If you're already using MongoDB, GridFS obviates the need for a separate file storage architecture.
• GridFS will leverage any existing replication or autosharding that you've set up for MongoDB, so getting failover and scale-out for file storage is easy.
• GridFS can alleviate some of the issues that certain filesystems can exhibit when being used to store user uploads. For example, GridFS does not have issues with storing large numbers of files in the same directory.
• You can get great disk locality with GridFS, because MongoDB allocates data files in 2GB chunks.
MongoDB Advanced Usage
GridFS: example
$ echo "Hello, world" > foo.txt
$ ./mongofiles put foo.txt
connected to: 127.0.0.1
added file: { _id: ObjectId('4c0d2a6c3052c25545139b88'), filename: "foo.txt", length: 13, chunkSize: 262144, uploadDate: new Date(1275931244818), md5: "a7966bf58e23583c9a5a4059383ff850" }
done!
$ ./mongofiles list
connected to: 127.0.0.1
foo.txt 13
$ rm foo.txt
$ ./mongofiles get foo.txt
connected to: 127.0.0.1
done write to: foo.txt
$ cat foo.txt
Hello, world
MongoDB Advanced Usage
GridFS: internal
The basic idea behind GridFS is that we can store large files by splitting them up into chunks and storing each chunk as a separate document.
autocomplete:PRIMARY> show collections
fs.chunks
fs.files
system.indexes
autocomplete:PRIMARY> db.fs.chunks.find()
{ "_id" : ObjectId("4db258ae05a23484714d58ad"), "files_id" : ObjectId("4db258ae39ae206d1114d6e4"), "n" : 0, "data" : BinData(0,"SGVsbG8sbW9uZ28K") }
{ "_id" : ObjectId("4db258d305a23484714d58ae"), "files_id" : ObjectId("4db258d37858d8bb53489eea"), "n" : 0, "data" : BinData(0,"SGVsbG8sbW9uZ28K") }
{ "_id" : ObjectId("4db2596d05a23484714d58af"), "files_id" : ObjectId("4db2596d4fefdd07525ef166"), "n" : 0, "data" : BinData(0,"SGVsbG8sbW9uZ28K") }
autocomplete:PRIMARY> db.fs.files.find()
{ "_id" : ObjectId("4db258ae39ae206d1114d6e4"), "filename" : "file1.txt", "chunkSize" : 262144, "uploadDate" : ISODate("2011-04-23T04:42:22.546Z"), "md5" : "c002dec1a1086442b2aa49c2b6e48884", "length" : 12 }
{ "_id" : ObjectId("4db258d37858d8bb53489eea"), "filename" : "file2.txt", "chunkSize" : 262144, "uploadDate" : ISODate("2011-04-23T04:42:59.851Z"), "md5" : "c002dec1a1086442b2aa49c2b6e48884", "length" : 12 }
{ "_id" : ObjectId("4db2596d4fefdd07525ef166"), "filename" : "file2.txt", "chunkSize" : 262144, "uploadDate" : ISODate("2011-04-23T04:45:33.771Z"), "md5" : "c002dec1a1086442b2aa49c2b6e48884", "length" : 12 }
autocomplete:PRIMARY> db.system.indexes.find()
{ "ns" : "test.fs.files", "key" : { "filename" : 1 }, "name" : "filename_1", "v" : 0 }
{ "ns" : "test.fs.chunks", "key" : { "files_id" : 1, "n" : 1 }, "name" : "files_id_1_n_1", "v" : 0 }
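The chunking idea can be sketched in plain JavaScript (Node.js). The toChunks helper below is hypothetical and is not the driver's or mongofiles' implementation; it uses a plain string in place of an ObjectId for files_id and stores the data as a base64 string. The chunk payload SGVsbG8sbW9uZ28K shown above is the base64 encoding of the 12-byte text "Hello,mongo" followed by a newline, which matches the "length" : 12 in fs.files.

```javascript
// Hypothetical sketch of GridFS-style chunking (not the driver's implementation).
// Splits a Buffer into chunk documents shaped like those in fs.chunks.
function toChunks(filesId, data, chunkSize) {
  var chunks = [];
  for (var offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: filesId, // links the chunk to its fs.files document
      n: n,              // position of the chunk within the file
      data: data.subarray(offset, offset + chunkSize).toString("base64")
    });
  }
  return chunks;
}

// A 12-byte file fits in a single 262144-byte chunk.
var small = toChunks("file1", Buffer.from("Hello,mongo\n"), 262144);
console.log(small.length, small[0].data); // 1 SGVsbG8sbW9uZ28K

// A 600000-byte file needs three chunks, numbered 0, 1, 2.
var big = toChunks("file2", Buffer.alloc(600000), 262144);
console.log(JSON.stringify(big.map(function (c) { return c.n; }))); // [0,1,2]
```

Reading a file back is the reverse: fetch the chunks for one files_id ordered by n and concatenate their data, which is why the index on { files_id, n } exists.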
MongoDB Advanced Usage
Advanced Usage review
Index: almost the same as Oracle
Aggregation
MapReduce: built-in
Database commands: db.listCommands()
Capped Collections: suitable for logs
GridFS: Storing Files: built-in, document-oriented
Others
Geospatial Indexing
Database References
Server-Side Scripting
DBA on MongoDB
Administration (admin, replication, sharding)
Monitoring
Security and Authentication
Backup and Repair
Master-Slave Replication
Replication-set
Sharding
DBA on MongoDB
Easy Monitoring
Using the Admin Interface
db.runCommand({"serverStatus" : 1})
mongostat
Third-Party Plug-Ins
Using the Admin Interface
db.runCommand({"serverStatus" : 1})
> db.runCommand({"serverStatus" : 1})
{
    "version" : "1.5.3",
    "uptime" : 166,
    "localTime" : "Thu Jun 10 2010 15:47:40 GMT-0400 (EDT)",
    "globalLock" : { "totalTime" : 165984675, "lockTime" : 91471425, "ratio" : 0.551083556358441 },
    "mem" : { "bits" : 64, "resident" : 101, "virtual" : 2824, "supported" : true, "mapped" : 336 },
    "connections" : { "current" : 141, "available" : 19859 },
    "extra_info" : { "note" : "fields vary by platform" },
    "indexCounters" : { "btree" : { "accesses" : 1563, "hits" : 1563, "misses" : 0,
    "backgroundFlushing" : { "flushes" : 2, "total_ms" : 44, "average_ms" : 22, "last_ms" : 36, "last_finished" : "Thu Jun 10 2010 15:46:54 GMT-0400 (EDT)" },
    "opcounters" : { "insert" : 38195, "query" : 8874, "update" : 4058, "delete" : 389, "getmore" : 888, "command" : 17731 },
    "asserts" : { "regular" : 0, "warning" : 0, "msg" : 0, "user" : 5054, "rollovers" : 0 },
mongostat fields
inserts - # of inserts per second
query - # of queries per second
update - # of updates per second
delete - # of deletes per second
getmore - # of get mores (cursor batch) per second
command - # of commands per second
flushes - # of fsync flushes per second
mapped - amount of data mmapped (total data size), in megabytes
vsize - virtual size of process in megabytes
res - resident size of process in megabytes
faults - # of page faults per second (Linux only)
locked - percent of time in global write lock
idx miss - percent of btree page misses (sampled)
qr|qw - queue lengths for clients waiting (read|write)
ar|aw - active clients (read|write)
netIn - network traffic in - bits
netOut - network traffic out - bits
conn - number of open connections
DBA on MongoDB
Security and Authentication
1. Each database in a MongoDB instance can have any number of users.
2. Only authenticated users of a database are able to perform read or write operations on it.
3. A user in the admin database can be thought of as a superuser.
4. MongoDB must be started with the --auth option to enable authentication.
Backup on MongoDB
1. Data file cold backup: kill -INT mongod, then copy the --dbpath directory
2. mongodump (exp) and mongorestore (imp)
3. fsync and lock
4. Slave backup
> use admin
switched to db admin
> db.runCommand({"fsync" : 1, "lock" : 1});
{
"info" : "now locked against writes, use db.$cmd.sys.unlock.findOne() to unlock",
"ok" : 1
}
Run mongodump
> db.$cmd.sys.unlock.findOne();
{ "ok" : 1, "info" : "unlock requested" }
Repair MongoDB
1. Databases need to be repaired after an unclean shutdown (kill -9):
**************
old lock file: /data/db/mongod.lock. probably means unclean shutdown
recommend removing file and running --repair
see: http://dochub.mongodb.org/core/repair for more information
*************
2. All of the documents in the database are exported and then immediately imported, ignoring any that are invalid. Then the indexes are rebuilt.
3. Repair takes a long time when the data set is humongous.
4. Repairing a database will also perform a compaction.
5. db.repairDatabase() repairs a single database.
DBA on MongoDB
Replication
Master-Slave Replication
Replica Sets
Master-Slave Replication
$ mkdir -p ~/dbs/master
$ ./mongod --dbpath ~/dbs/master --port 10000 --master
$ mkdir -p ~/dbs/slave
$ ./mongod --dbpath ~/dbs/slave --port 10001 --slave --source localhost:10000
1. Scale reads
2. Backup on the slave
3. Process data on the slave
4. DR (disaster recovery)
Master-Slave Replication
How does it work? The oplog
oplog.$main is a capped collection in the local database. Each document in it has these keys:
• ts: Timestamp for the operation. The timestamp type is an internal type used to track when operations are performed. It is composed of a 4-byte timestamp and a 4-byte incrementing counter.
• op: Type of operation performed as a 1-byte code (e.g., "i" for an insert).
• ns: Namespace (collection name) where the operation was performed.
• o: Document further specifying the operation to perform. For an insert, this would be the document to insert.
1. When a slave first starts up, it does a full sync of the data on the master node.
2. After the initial sync is complete, the slave begins querying the master's oplog and applying operations in order to stay up to date. Replication is asynchronous.
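The catch-up step can be illustrated with a small in-memory model. This is a sketch in plain JavaScript, assuming a hypothetical applyOplog function that is not MongoDB code: it handles only inserts ("i"), since that is the one op code named above, and it simplifies ts to a plain number instead of the 8-byte timestamp type. The sample entries are made up.

```javascript
// Hypothetical in-memory model of a slave applying oplog entries (not MongoDB code).
// Assumptions: only inserts ("i") are handled, and ts is a plain number.
function applyOplog(slaveData, oplog, lastAppliedTs) {
  oplog
    .filter(function (entry) { return entry.ts > lastAppliedTs; }) // only new operations
    .sort(function (a, b) { return a.ts - b.ts; })                 // apply in order
    .forEach(function (entry) {
      if (entry.op === "i") {
        // ns is the collection name, o is the document to insert
        (slaveData[entry.ns] = slaveData[entry.ns] || []).push(entry.o);
      }
      lastAppliedTs = entry.ts;
    });
  return lastAppliedTs; // the slave remembers how far it has caught up
}

var oplog = [
  { ts: 1, op: "i", ns: "test.foo", o: { _id: 1, bar: "baz" } },
  { ts: 2, op: "i", ns: "test.foo", o: { _id: 2, bar: "qux" } }
];
var slave = {};
var last = applyOplog(slave, oplog, 0); // first poll applies ts 1 and 2

oplog.push({ ts: 3, op: "i", ns: "test.foo", o: { _id: 3 } });
last = applyOplog(slave, oplog, last);  // later poll applies only ts 3
console.log(last, slave["test.foo"].length); // 3 3
```

Because the slave polls rather than being written to synchronously, it can lag behind the master by however many operations arrived since its last poll.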
Replication on MongoDB
Replication-set
1. A replica set is basically a master-slave cluster with automatic failover.
2. One master (primary) and one or more secondaries (slaves).
3. There is no fixed master: the primary is elected by the cluster, and another node may take over if the current master goes down.
![Page 44: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/44.jpg)
44
Replication on MongoDB
Setup Replication-set
1. The --replSet option gives the name of the replica set, followed by a slash and a seed host.
$ ./mongod --dbpath ~/dbs/node1 --port 10001 --replSet autocomplete/slcdbx1005:10002
We start up the other server in the same way:
$ ./mongod --dbpath ~/dbs/node2 --port 10002 --replSet autocomplete/slcdbx1006:10001
If we wanted to add a third server, we could do so with either of these commands:
$ ./mongod --dbpath ~/dbs/node3 --port 10003 --replSet autocomplete/slcdbx1005:10001
$ ./mongod --dbpath ~/dbs/node3 --port 10003 --replSet autocomplete/slcdbx1005:10001,slcdbx1006:10002
![Page 45: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/45.jpg)
45
Replication-set failover
• standard: a full copy of data & voting & ready to be primary
• passive: a full copy of data & voting, but never becomes primary
• arbiter: voting only, no data replicated
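A rough sketch of how the three member types interact during failover. It assumes every member has one vote and that a primary can only be elected while a majority of the set is reachable; ROLES and electionCandidates are illustrative names, not MongoDB APIs.

```javascript
// Sketch only: which members may become primary after a failure.
// Assumption: each member has one vote and a majority of the whole set
// must be reachable before any election can succeed.
const ROLES = {
  standard: { holdsData: true, votes: true, canBePrimary: true },
  passive: { holdsData: true, votes: true, canBePrimary: false },
  arbiter: { holdsData: false, votes: true, canBePrimary: false },
};

function electionCandidates(members) {
  const up = members.filter((m) => m.up);
  const votes = up.filter((m) => ROLES[m.type].votes).length;
  if (votes <= members.length / 2) return []; // no majority, no primary
  return up.filter((m) => ROLES[m.type].canBePrimary).map((m) => m.name);
}

// Hypothetical three-member set whose old primary (node1) has gone down.
const members = [
  { name: "node1", type: "standard", up: false },
  { name: "node2", type: "standard", up: true },
  { name: "arb", type: "arbiter", up: true },
];
console.log(electionCandidates(members));
```

The arbiter holds no data, but its vote is what lets the two surviving members form a majority here.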
![Page 46: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/46.jpg)
46
MongoDB Auto-sharding
Sharding: splitting data up and storing different portions of the data on different machines.
1. Manual sharding: The application code manages storing different data on different servers and querying against the appropriate server to get data back.
2. Auto sharding: The cluster handles splitting up data and rebalancing automatically.
![Page 47: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/47.jpg)
47
Auto sharding
When to shard?
1. You’ve run out of disk space on your current machine.
2. You want to write data faster than a single mongod can handle.
3. You want to keep a larger proportion of data in memory to improve performance.
4. Disaster recovery (DR)
5. Automatic failover
![Page 48: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/48.jpg)
48
Auto sharding
Components of MongoDB sharding
• Config server
$ mkdir -p ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000
• Mongos (router)
$ ./mongos --port 30000 --configdb localhost:20000
• Shard (usually a replication-set)
$ mkdir -p ~/dbs/shard1
$ ./mongod --dbpath ~/dbs/shard1 --port 10000
mongos> db.runCommand({addshard : "localhost:10000", allowLocal : true})
{"added" : "localhost:10000", "ok" : true}
![Page 49: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/49.jpg)
49
Pre sharding a table
Determine a shard key
1. Defines how we distribute data.
2. MongoDB's sharding is order-preserving; data adjacent by shard key tends to be on the same server.
3. The config database stores all the metadata indicating the location of data by range.
4. It should be granular enough to ensure an even distribution of data.
Chunks
1. A contiguous range of data from a particular collection.
2. Once a chunk has reached about 200MB in size, the chunk splits into two new chunks. When a particular shard has excess data, chunks will then migrate to other shards in the system.
3. The addition of a new shard will also influence the migration of chunks.
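The split rule for chunks can be illustrated with a small model. This is a sketch under stated assumptions, not how mongos actually picks split points: shard keys are plain numbers, each document carries a made-up bytes field, and the split point is simply the median key. splitIfNeeded is a hypothetical helper.

```javascript
// Sketch only: splitting one chunk into two once it grows too large.
const MAX_CHUNK_BYTES = 200 * 1024 * 1024; // "about 200MB", per the slide

// A chunk covers the shard-key range [min, max) and holds docs as
// { key, bytes }. Returns the chunk unchanged, or two new chunks.
function splitIfNeeded(chunk, maxBytes = MAX_CHUNK_BYTES) {
  const total = chunk.docs.reduce((sum, d) => sum + d.bytes, 0);
  if (total < maxBytes || chunk.docs.length < 2) return [chunk];
  const sorted = [...chunk.docs].sort((a, b) => a.key - b.key);
  const mid = sorted[Math.floor(sorted.length / 2)].key; // median key
  return [
    { min: chunk.min, max: mid, docs: sorted.filter((d) => d.key < mid) },
    { min: mid, max: chunk.max, docs: sorted.filter((d) => d.key >= mid) },
  ];
}

// Tiny threshold so the example splits without needing real data volumes.
const big = {
  min: -Infinity,
  max: Infinity,
  docs: [4, 1, 3, 2].map((key) => ({ key, bytes: 60 })),
};
const parts = splitIfNeeded(big, 200);
console.log(parts.map((c) => [c.min, c.max, c.docs.map((d) => d.key)]));
```

The two resulting ranges stay contiguous and non-overlapping, which is what lets the config database describe data location purely by range.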
![Page 50: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/50.jpg)
50
Sharding a table
Enable sharding on a database
db.runCommand({"enablesharding" : "foo"})
Enable sharding on a collection
db.runCommand({"shardcollection" : "foo.bar", "key" : {"_id" : 1}})
Show autosharding status
> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0", "host" : "localhost:10000" }
{ "_id" : "shard1", "host" : "localhost:10001" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "foo", "partitioned" : false, "primary" : "shard1" }
{ "_id" : "x", "partitioned" : false, "primary" : "shard0" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard0", "sharded" : { "test.foo" : { "key" : { "x" : 1 }, "unique" : false } } }
test.foo chunks:
{ "x" : { $minKey : 1 } } -->> { "x" : { $maxKey : 1 } } on : shard0 { "t" : 1276636243000, "i" : 1 }
![Page 51: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/51.jpg)
51
Query on sharding: assume a shard key of { x : 1 }.
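A minimal sketch of how a router could decide which shards to ask, given the shard key { x : 1 }. The chunk table and the targetShards helper are made up for illustration; only equality and $gte/$lt conditions on x are understood, and anything without x goes to every shard.

```javascript
// Sketch only: routing a query to shards by chunk range.
// Hypothetical chunk table: each chunk covers [min, max) of x.
const chunkMap = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard0" },
];

function targetShards(query, chunks) {
  const hit = (c) => {
    if (!("x" in query)) return true; // no shard key: ask every shard
    const cond = query.x;
    if (typeof cond !== "object" || cond === null) {
      return c.min <= cond && cond < c.max; // equality: exactly one chunk
    }
    const lo = "$gte" in cond ? cond.$gte : -Infinity;
    const hi = "$lt" in cond ? cond.$lt : Infinity;
    return c.min < hi && c.max > lo; // range: every overlapping chunk
  };
  return [...new Set(chunks.filter(hit).map((c) => c.shard))].sort();
}

console.log(targetShards({ x: 150 }, chunkMap));
console.log(targetShards({ x: { $gte: 50, $lt: 150 } }, chunkMap));
console.log(targetShards({ y: 1 }, chunkMap));
```

Queries that include the shard key touch only the shards owning matching ranges; queries without it have to be sent to all shards and merged.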
![Page 52: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/52.jpg)
52
Sharding machine layout
Avoid a single point of failure
![Page 53: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/53.jpg)
53
Review
Getting Up to Speed with MongoDB (document-oriented and schema-free)
Developing with MongoDB (find())
Advanced Usage (tons of features)
Administration (easy to admin, replication, sharding)
MISC (BSON; internals)
![Page 54: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/54.jpg)
54
Misc
1. BSON
2. Datafiles layout
3. Memory-Mapped Storage Engine
![Page 55: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/55.jpg)
55
Misc1 BSON (Binary JSON)
BSON is a lightweight binary format capable of representing any MongoDB document as a string of bytes.
BSON is the format in which documents are saved to disk. When a driver is given a document to insert, use as a query, and so on, it encodes that document to BSON before sending it to the server.
Goals: efficiency, traversability, performance
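To make "a string of bytes" concrete, here is a hand-rolled encoder for a tiny subset of BSON (string and 32-bit integer fields only). It is a sketch for illustration, not a replacement for a real driver; encodeBson is a made-up name and it relies on Node.js Buffer.

```javascript
// Sketch only: encodes flat documents with string and int32 values.
// Layout: int32 total length, then elements, then a trailing 0x00 byte.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a cstring
    if (typeof value === "string") {
      const str = Buffer.from(value + "\0", "utf8");
      const len = Buffer.alloc(4);
      len.writeInt32LE(str.length, 0); // byte length including the trailing NUL
      parts.push(Buffer.from([0x02]), name, len, str); // 0x02 = string
    } else if (Number.isInteger(value) && value >= -2147483648 && value <= 2147483647) {
      const num = Buffer.alloc(4);
      num.writeInt32LE(value, 0);
      parts.push(Buffer.from([0x10]), name, num); // 0x10 = int32
    } else {
      throw new TypeError("unsupported type in this sketch: " + key);
    }
  }
  const body = Buffer.concat(parts);
  const header = Buffer.alloc(4);
  header.writeInt32LE(4 + body.length + 1, 0); // header + body + terminator
  return Buffer.concat([header, body, Buffer.from([0x00])]);
}

const bytes = encodeBson({ hello: "world" });
console.log(bytes.length, bytes.toString("hex"));
```

The length prefixes are what give BSON its traversability: a reader can skip a whole document or string without scanning for a terminator.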
![Page 56: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/56.jpg)
56
Datafiles layout & Memory-Mapped Storage Engine
1. Each new numbered data file for a database is double the size of the previous one, up to a maximum file size of 2GB.
2. Data files are preallocated to ensure consistent performance.
3. Memory-mapped storage engine
4. When the server starts up, it memory-maps all its data files.
5. The OS manages flushing data to disk and paging data in and out.
6. MongoDB cannot control the order in which data is written to disk, which makes it impossible to use a write-ahead log to provide single-server durability.
7. 32-bit MongoDB servers are limited to a total of about 2GB of data per mongod. This is because all of the data must be addressable using only 32 bits.
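The doubling rule for data files is easy to tabulate. The sketch below assumes the first file is 64MB, which is not stated on the slide; dataFileSizesMB is an illustrative helper, and the 2GB cap comes from the text above.

```javascript
// Sketch only: sizes (in MB) of successive data files for one database.
// Assumption: the first file is 64MB; each later file doubles, capped at 2GB.
function dataFileSizesMB(fileCount, firstMB = 64, maxMB = 2048) {
  const sizes = [];
  let size = firstMB;
  for (let i = 0; i < fileCount; i++) {
    sizes.push(size);
    size = Math.min(size * 2, maxMB);
  }
  return sizes;
}

console.log(dataFileSizesMB(7).join(", "));
// prints "64, 128, 256, 512, 1024, 2048, 2048"
```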
![Page 57: 1 Mongo DB MongoDB (from "humongous) is a scalable, high- performance, open source, document-oriented database. Written in C++. Home:](https://reader034.vdocuments.site/reader034/viewer/2022051819/5513de3b55034679748b5306/html5/thumbnails/57.jpg)
57
Q&A
Getting Up to Speed with MongoDB ( key features)
Developing with MongoDB (start & shutdown & connect & query & DML)
Advanced Usage (index & Aggregation, GridFS)
Administration (easy admin, replication, sharding)
MISC (BSON; Memory-Mapped)