MongoDB: a gentle, friendly overview

Antonio Pintus, CRS4, 08/09/2011


DESCRIPTION

My talk @ CRS4 about a MongoDB overview

TRANSCRIPT

Page 1: MongoDB: a gentle, friendly overview

A gentle, friendly overview

Antonio Pintus

CRS4, 08/09/2011


Page 2: MongoDB: a gentle, friendly overview

NOSQL /1

• MongoDB belongs to the NoSQL databases family:

• non-relational

• document-oriented

• no predefined, rigid database schemas

• no joins

• horizontal scalability

Page 3: MongoDB: a gentle, friendly overview

NOSQL /2

• NoSQL DB family includes several DB types:

• document-oriented: MongoDB, CouchDB, ...

• Key-Value / Tuple Store: Redis, ...

• Graph databases: Neo4j, ...

• ...

Page 4: MongoDB: a gentle, friendly overview

MongoDB

• Performant: C++

• Schema-free

• Full index support

• No transactions

• Scalable: replication + sharding

• document-based queries

• Map/Reduce

• GridFS

• a JavaScript interactive shell


Page 5: MongoDB: a gentle, friendly overview

SCHEMA-FREE

• Schema-free collections = NO TABLES!

• A Mongo deployment (server) holds a set of databases

• A database holds a set of collections

• A collection holds a set of documents

• A document is a set of fields, i.e., key-value pairs (BSON)

• A key is a name (string); a value is a basic type (string, integer, float, timestamp, binary, etc.), a document, or an array of values
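The containment hierarchy above can be pictured with plain Python containers. This is only an illustrative sketch, not how MongoDB stores data; the names "test_db" and "services" and all field values are made up for the example.

```python
from datetime import datetime

# Hypothetical names throughout: "test_db", "services", and every field value.
deployment = {
    "test_db": {                      # a database holds a set of collections
        "services": [                 # a collection holds a set of documents
            {                         # a document is a set of key-value fields
                "shortname": "SampleSensor",            # string
                "counter": 3,                            # integer
                "ratio": 0.75,                           # float
                "date_created": datetime(2011, 5, 2),    # timestamp
                "owner": {"name": "User001"},            # embedded document
                "tags": ["sensor", "public"],            # array of values
            },
            # Same collection, different fields: no schema is enforced.
            {"shortname": "OtherSensor", "icon": "sens.png"},
        ]
    }
}

documents = deployment["test_db"]["services"]
field_names = [sorted(doc) for doc in documents]
```

The two documents share only the "shortname" field, which is what "schema-free" means in practice.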

Page 6: MongoDB: a gentle, friendly overview

DATA FORMAT

• document-oriented

• stores JSON-style documents: BSON (Binary JSON):

• JSON plus additional data types, e.g., a Date type and a BinData type.

• Can reference other documents

• lightweight, traversable, efficient

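A small sketch of why the extra types matter, using only the Python standard library (no MongoDB involved): plain JSON has no date type, so the standard encoder rejects a datetime value, and the usual workaround is to store the date as text. The field values are made up for the example.

```python
import json
from datetime import datetime

doc = {"shortname": "SampleSensor", "date_created": datetime(2011, 5, 2, 17, 11, 28)}

try:
    json.dumps(doc)
    json_ok = True
except TypeError:
    # Plain JSON has no date type, so the standard encoder rejects datetime.
    json_ok = False

# A common workaround in plain JSON: store the date as an ISO 8601 string.
as_text = json.dumps({**doc, "date_created": doc["date_created"].isoformat()})
```

With a native Date type, as BSON provides, the value stays a date and does not have to be parsed back from a string.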

Page 7: MongoDB: a gentle, friendly overview

BSON

{
    "_id" : ObjectId("4dcec9a0af391a0d53000003"),
    "servicetype" : "sensor",
    "description" : "it’s only rock’n’roll but I like it",
    "policy" : "PUBLIC",
    "owner" : "User001",
    "date_created" : "2011-05-02 17:11:28.874086",
    "shortname" : "SampleSensor",
    "content-type" : "text/plain",
    "icon" : "http://myserver.com/images/sens.png"
}

Page 8: MongoDB: a gentle, friendly overview

COLLECTIONS

• More or less, same concept as “table” but dynamic, schema-free

• collection of BSON documents

• documents in the same collection can have heterogeneous structures

Page 9: MongoDB: a gentle, friendly overview

QUERIES

• query by documents

• Examples (using the interactive shell):

• db.mycollection.find( {"policy" : "PUBLIC"} );

• db.mycollection.findOne({"policy" : "PUBLIC", "owner":"User001"});

• db.mycollection.find({"policy" : "PUBLIC", "owner":"User001"}).limit(2);

• db.mycollection.find( {"policy" : "PUBLIC"}, {"shortname":1} );

• db.mycollection.find({"counter": {$gt:2}});

• conditional ops: $lt (<), $lte (<=), $gt (>), $gte (>=), $and, $in, $or, $nor, ...
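To make "query by documents" concrete, here is a minimal pure-Python sketch of how a query document could be matched against stored documents. It is an illustration only, not MongoDB's implementation: it covers equality and a handful of operators, and the helper names matches and find, along with the sample data, are hypothetical.

```python
import operator

# Comparison operators supported by this sketch (a small subset of MongoDB's).
OPERATORS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, allowed: value in allowed,
}

def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for field, condition in query.items():
        if field not in doc:
            return False
        value = doc[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"counter": {"$gt": 2}}
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            # Plain form, e.g. {"policy": "PUBLIC"}, means equality
            return False
    return True

def find(collection, query, limit=None):
    """Return matching documents, optionally capped like .limit(n)."""
    results = [doc for doc in collection if matches(doc, query)]
    return results if limit is None else results[:limit]

docs = [
    {"policy": "PUBLIC", "owner": "User001", "counter": 5},
    {"policy": "PUBLIC", "owner": "User002", "counter": 1},
    {"policy": "PRIVATE", "owner": "User001", "counter": 9},
]

public_docs = find(docs, {"policy": "PUBLIC"})
busy_docs = find(docs, {"counter": {"$gt": 2}})
```

Multiple fields in one query document are combined with an implicit AND, which is why the two-field shell examples above return only documents matching both conditions.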

Page 10: MongoDB: a gentle, friendly overview

INDEXES

• Full index support: index on any attribute (including compound indexes on multiple attributes)

• indexes increase query performance

• indexes are implemented as “B-Tree” indexes

• indexes add overhead to inserts and deletes, so don’t overuse them!

• db.mycollection.ensureIndex( {"servicetype" : 1} );

• db.mycollection.ensureIndex( {"servicetype" : 1, "owner":-1} );

• db.mycollection.getIndexes()

• db.system.indexes.find()
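The trade-off described above (faster lookups, extra work on writes) can be sketched in a few lines. This toy index keeps a sorted list and uses binary search as a stand-in for a B-tree; it is an assumption-laden illustration, and the class name SortedIndex and the sample documents are hypothetical.

```python
import bisect

class SortedIndex:
    """Toy single-field index: a sorted list of (key, position) pairs."""

    def __init__(self, field):
        self.field = field
        self.entries = []

    def add(self, doc, position):
        # Extra work on every insert: this is the overhead indexes introduce.
        bisect.insort(self.entries, (doc[self.field], position))

    def lookup(self, key):
        # Binary search to the first entry for key instead of scanning everything.
        start = bisect.bisect_left(self.entries, (key, -1))
        positions = []
        for entry_key, position in self.entries[start:]:
            if entry_key != key:
                break
            positions.append(position)
        return positions

collection = []
index = SortedIndex("servicetype")
for doc in [
    {"servicetype": "sensor", "n": 1},
    {"servicetype": "actuator", "n": 2},
    {"servicetype": "sensor", "n": 3},
]:
    collection.append(doc)
    index.add(doc, len(collection) - 1)

sensor_docs = [collection[i] for i in index.lookup("sensor")]
```

Every additional index means one more structure to update on each insert or delete, which is the reason not to create indexes that queries never use.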

Page 11: MongoDB: a gentle, friendly overview

INSERTS

• Simplicity

• db.mycollection.insert({"a":"abc",...})

• var doc = {"name":"mongodb",...};

• db.mycollection.insert(doc);

Page 12: MongoDB: a gentle, friendly overview

UPDATES

1. replace entire document

2. atomic, in-place updates

• db.collection.update( criteria, objNew, upsert, multi )

• criteria: the query

• objNew: updated object or $ operators (e.g., $inc, $set) which manipulate the object

• upsert: if no matching document exists, insert one.

• multi: whether all documents matching criteria should be updated (by default only the first match is)

• db.collection.save(...): single object update with upsert

Page 13: MongoDB: a gentle, friendly overview

UPDATES /2

• atomic, in-place updates = highly efficient

• provides special operators

• db.mycollection.update( { "shortname":"Arduino" }, { $inc: { n : 1 } } );

• db.mycollection.update( { "shortname":"Arduino" }, { $set: { "shortname" : "OldArduino" } } );

• other atomic ops: $unset, $push, $pushAll, $addToSet, $pop, $pull, $rename, ...
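The update semantics (criteria, objNew, upsert, multi) and a few of the operators can be mimicked on a plain list of dicts. This is a simplified sketch rather than MongoDB's behaviour in full: criteria are matched by equality only, only four operators are handled, and the function names and sample data are hypothetical.

```python
def apply_operators(doc, changes):
    """Apply a subset of update operators ($inc, $set, $unset, $push) in place."""
    for field, amount in changes.get("$inc", {}).items():
        doc[field] = doc.get(field, 0) + amount
    for field, value in changes.get("$set", {}).items():
        doc[field] = value
    for field in changes.get("$unset", {}):
        doc.pop(field, None)
    for field, value in changes.get("$push", {}).items():
        doc.setdefault(field, []).append(value)

def update(collection, criteria, obj_new, upsert=False, multi=False):
    """Mimic update(criteria, objNew, upsert, multi); returns documents touched."""
    is_operator_update = any(key.startswith("$") for key in obj_new)
    matched = [d for d in collection
               if all(d.get(k) == v for k, v in criteria.items())]
    if not matched and upsert:
        # Nothing matched: insert a new document built from criteria and changes.
        new_doc = dict(criteria)
        if is_operator_update:
            apply_operators(new_doc, obj_new)
        else:
            new_doc = dict(obj_new)
        collection.append(new_doc)
        return 1
    targets = matched if multi else matched[:1]
    for doc in targets:
        if is_operator_update:
            apply_operators(doc, obj_new)   # in-place modification
        else:
            doc.clear()                     # whole-document replacement
            doc.update(obj_new)
    return len(targets)

coll = [{"shortname": "Arduino", "n": 1}, {"shortname": "Arduino", "n": 10}]
update(coll, {"shortname": "Arduino"}, {"$inc": {"n": 1}})                   # first match only
update(coll, {"shortname": "Arduino"}, {"$set": {"tag": "board"}}, multi=True)
update(coll, {"shortname": "Uno"}, {"$set": {"n": 0}}, upsert=True)          # inserts
```

Operator updates change only the named fields, whereas passing a plain document replaces the matched document wholesale; that is the difference between the two update styles listed on the previous slide.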

Page 14: MongoDB: a gentle, friendly overview

Mongo DISTRIBUTION

• Mac, Linux, Solaris, Win

• mongod: database server.

• By default, port=27017, store path=/data/db.

• Override with --dbpath, --port command options

• mongo: interactive JavaScript shell

• mongos: sharding controller server

Page 16: MongoDB: a gentle, friendly overview

GOOD FOR

• event logging

• high-performance small reads/writes

• Web: real-time inserts, updates, and queries. Auto-sharding (scalability) and replication are provided.

• Real-time stats/analytics

Page 17: MongoDB: a gentle, friendly overview

LESS GOOD FOR

• Systems with heavy transactional nature

• Traditional Business Intelligence

• (obviously) Systems and problems requiring SQL

Page 18: MongoDB: a gentle, friendly overview

SHARDING /1

• Horizontal scalability: MongoDB auto-sharding

• partitioning by keys

• auto-balancing

• easy addition of new servers

• no single points-of-failure

• automatic failover/replica-sets
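"Partitioning by keys" can be pictured as range-based routing: each shard owns a contiguous range of shard-key values, and a router picks the shard for a given key. The sketch below is a simplified illustration with fixed ranges (no auto-balancing or failover); the class name RangeRouter, the split points, and the shard names are hypothetical.

```python
import bisect

class RangeRouter:
    """Route keys to shards by key range (static ranges, no rebalancing)."""

    def __init__(self, split_points, shard_names):
        # split_points must be sorted; N split points define N + 1 key ranges.
        if len(shard_names) != len(split_points) + 1:
            raise ValueError("need one more shard than split points")
        self.split_points = split_points
        self.shard_names = shard_names

    def shard_for(self, key):
        # bisect_right sends a key equal to a split point to the upper range.
        return self.shard_names[bisect.bisect_right(self.split_points, key)]

router = RangeRouter(["h", "p"], ["shard0", "shard1", "shard2"])
placement = {
    owner: router.shard_for(owner)
    for owner in ["alice", "henry", "peter", "zoe"]
}
```

Adding a server then amounts to introducing a new split point and moving the documents in the affected range, which is the part auto-balancing takes care of.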

Page 19: MongoDB: a gentle, friendly overview

SHARDING /2

[Diagram: sharded cluster. Labels: Client, mongos, Config servers (mongod processes), Shards (groups of mongod processes)]

Page 21: MongoDB: a gentle, friendly overview

PyMongo

• Recommended MongoDB driver for the Python language

• An easy way to install it (Mac, Linux):

• python -m pip install pymongo

• python -m pip install --upgrade pymongo

Page 22: MongoDB: a gentle, friendly overview

QUICK-START: INSERT

• (obviously) mongod must be running ;-)

from pymongo import MongoClient

client = MongoClient() # default localhost:27017; or MongoClient('myhost', 9999)

db = client['test_db'] # gets the database

test_coll = db['testcoll'] # gets the desired collection

doc = {"name":"slides.txt", "author":"Antonio", "type":"text", "tags": ["mongodb", "python", "slides"]} # a dict

test_coll.insert_one(doc) # inserts the document into the collection

• lazy creation: collections and databases are created when the first document is inserted into them

Page 23: MongoDB: a gentle, friendly overview

QUICK-START: QUERY

res = test_coll.find_one() # gets one document

query = {"author":"Antonio"} # a query document

res = test_coll.find_one(query) # searches for one document

for doc in test_coll.find(query): # iterates a cursor over multiple docs
    print(doc)
    ...

test_coll.count_documents({}) # counts the docs in the collection

Page 24: MongoDB: a gentle, friendly overview

NOT COVERED (HERE)

• GridFS: a single document is limited to 16 MB, so GridFS transparently splits large files across multiple documents

• MapReduce: batch processing of data and aggregation operations

• GeoSpatial Indexing: two-dimensional indexing for location-based queries (e.g., retrieve the n closest restaurants to my location)
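The idea behind GridFS, splitting a large file into numbered chunk documents plus one metadata document, can be sketched without a database. This is only loosely modeled on GridFS: the function names are hypothetical, the field names are chosen for illustration, and the tiny chunk size is for demonstration only.

```python
def split_into_chunks(data, chunk_size):
    """Split bytes into numbered chunk documents plus one metadata document."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {"length": len(data), "chunkSize": chunk_size}
    return file_doc, chunks

def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)

payload = b"x" * 10 + b"y" * 5
file_doc, chunks = split_into_chunks(payload, chunk_size=4)
restored = reassemble(list(reversed(chunks)))  # order of retrieval does not matter
```

Because every chunk is an ordinary small document, no single document ever approaches the size limit, however large the file is.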

Page 25: MongoDB: a gentle, friendly overview

IN PRODUCTION (some...)


Page 26: MongoDB: a gentle, friendly overview


Page 27: MongoDB: a gentle, friendly overview


Paraimpu LOVES MongoDB

• MongoDB powers Paraimpu, our Social Web of Things tool

• great data heterogeneity

• thousands of small, real-time data inserts/queries

• performance

• horizontal scalability

• ease of use, development is fun!

Page 28: MongoDB: a gentle, friendly overview

REFERENCES

• http://www.mongodb.org/

• http://www.mongodb.org/display/DOCS/Manual

• http://www.mongodb.org/display/DOCS/Slides+and+Video

• pymongo: http://api.mongodb.org/python/

• Paraimpu: http://paraimpu.crs4.it

Page 29: MongoDB: a gentle, friendly overview


THANK YOU

Antonio Pintus

email: [email protected]

twitter : @apintux