Steam Learn: Introduction to NoSQL with MongoDB
November, 27th 2014
Introduction to NoSQL With MongoDB
by Romain Francez
Introduction To NoSQL
● About when and why to choose NoSQL
● Data modeling and simple operations with MongoDB
● Not about saying NoSQL > RDBMS
● Not about advanced operations
NoSQL
● Buzzword
● Actually means NoREL (non-relational)
○ Any database that doesn’t use the relational system
○ Many types of NoSQL databases
Types of NoSQL
● Document
● Key-value
● Column
● Graph
RDBMS vs NoSQL: Loss
● Relational databases
○ ACID (Atomicity, Consistency, Isolation, Durability)
● NoSQL
○ Sacrifices some of those principles
■ Because real-world applications require it
○ CAP (Consistency, Availability, Partition tolerance)
■ Not possible to ensure all three at the same time
○ Distributed Systems
■ Partition tolerance is required
■ So you choose either consistency or availability
RDBMS vs NoSQL: Gain
● Lots of data
○ Need to spread data across servers
● Fast key-value
● Flexible schemas
● Nested datatypes
● Don’t need atomic operations
○ Financial transactions are harder to model in NoSQL
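To make "flexible schemas" and "nested datatypes" concrete, here is a minimal sketch in plain JavaScript (no MongoDB required). The field names `sizes` and `dimensions` and the helper `totalStock` are hypothetical, chosen only for illustration; they do not come from the talk.

```javascript
// Two documents that could sit in the same collection even though
// their shapes differ. No schema forces them to share the same fields.
const products = [
  { brand: 'Chanel', price: 49.9, categories: ['red', 'blue'] },
  {
    brand: 'Nike',
    price: 89.0,
    categories: ['green'],
    // Nested values: an array of sub-documents and an embedded object.
    sizes: [{ label: 'M', stock: 12 }, { label: 'L', stock: 0 }],
    dimensions: { weightKg: 0.4 }
  }
];

// Since nothing enforces the schema, reading code must tolerate
// documents where an optional field is missing.
function totalStock(product) {
  return (product.sizes || []).reduce((sum, size) => sum + size.stock, 0);
}

const stocks = products.map(totalStock);
console.log(stocks); // [ 0, 12 ]
```

The flexibility has a cost: the checks a relational schema would do for you move into application code.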
MongoDB Overview
● Developed by 10gen, open source
● Widely used
● You choose the trade-off between consistency and availability
● Document Type Database
● BSON/JSON
● JavaScript
● Flexible Schema
Example
● E-commerce application
● Extreme scenario
● Millions of products, each with
○ Categories
○ Price
○ Brand
Example: Schema
● Nope
● That would be the first step with an RDBMS
Example: Data access
● How the data is accessed
● Some data is far more likely to be read than written
● In my own application I don’t track brands
Example: Basic Queries
db.products.find({brand: 'Chanel'}).count();
> 166713
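The basic query is an equality filter followed by a count. As a rough mental model only, the same logic can be sketched in plain JavaScript over an in-memory array. The helper `countWhere` and the sample data are hypothetical, and the sketch handles only equality on scalar fields, not MongoDB's array matching or query operators.

```javascript
// Hypothetical sample; the collection in the talk held about a million documents.
const products = [
  { brand: 'Chanel', categories: ['red'], price: 12.5 },
  { brand: 'Nike', categories: ['blue', 'green'], price: 80 },
  { brand: 'Chanel', categories: ['yellow', 'red'], price: 49.9 }
];

// Counts documents whose fields strictly equal every value in the query,
// similar in spirit to find(query).count() for simple scalar equality.
function countWhere(docs, query) {
  return docs.filter(doc =>
    Object.keys(query).every(key => doc[key] === query[key])
  ).length;
}

const chanelCount = countWhere(products, { brand: 'Chanel' });
console.log(chanelCount); // 2
```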
Example: Schema
{
brand:
categories: []
price:
}
Example: Intermediate Queries
db.products.aggregate([
{$match: {categories: {$size: 2}}},
{$group: {
_id: '$brand',
average: {$avg: '$price'},
count: {$sum: 1}
}}
]);
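The pipeline above reads as two stages: `$match` keeps the products that have exactly two categories, then `$group` computes an average price and a count per brand. A minimal plain-JavaScript sketch of the same computation follows. The function name `averagePriceByBrand` and the sample documents are hypothetical; this shows what the stages compute, not how MongoDB executes them.

```javascript
// Hypothetical sample documents following the talk's schema.
const products = [
  { brand: 'Chanel', categories: ['red', 'blue'], price: 10 },
  { brand: 'Chanel', categories: ['green', 'blue'], price: 30 },
  { brand: 'Chanel', categories: ['red'], price: 500 },
  { brand: 'Nike', categories: ['red', 'yellow'], price: 50 }
];

function averagePriceByBrand(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // $match stage: keep only documents with exactly two categories.
    if (doc.categories.length !== 2) continue;
    // $group stage: accumulate a running sum and count per brand.
    const group = groups.get(doc.brand) || { _id: doc.brand, sum: 0, count: 0 };
    group.sum += doc.price;
    group.count += 1;
    groups.set(doc.brand, group);
  }
  // $avg is the accumulated sum divided by the count.
  return Array.from(groups.values()).map(g => ({
    _id: g._id,
    average: g.sum / g.count,
    count: g.count
  }));
}

const result = averagePriceByBrand(products);
console.log(result);
// [ { _id: 'Chanel', average: 20, count: 2 },
//   { _id: 'Nike', average: 50, count: 1 } ]
```

The 500 priced Chanel product is excluded because it has a single category, which is why the Chanel average is 20.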
Example: Intermediate Queries
{
    "result" : [
        {
            "_id" : "Chanel",
            "average" : 49.92298619934285,
            "count" : 68475
        },
        {
            "_id" : "Hollister",
            "average" : 49.93522546263746,
            "count" : 68304
        },
        ...
    ],
    "ok" : 1
}
Under 1 s, 1,000,000 documents, no indices
Example: Conclusion
● Can be (very) fast for data analysis
● Advanced operations
○ Indices
○ Map/Reduce
○ References and client side joins
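"References and client side joins" means a document stores the identifier of another document instead of embedding it, and the application combines the two result sets itself. A minimal sketch in plain JavaScript, assuming a hypothetical `brands` collection and a hypothetical `brandId` field, neither of which appears in the talk:

```javascript
// Hypothetical result sets, as if fetched by two separate queries.
const brands = [
  { _id: 1, name: 'Chanel', country: 'FR' },
  { _id: 2, name: 'Nike', country: 'US' }
];
const products = [
  { _id: 'p1', brandId: 1, price: 49.9 },
  { _id: 'p2', brandId: 2, price: 89 },
  { _id: 'p3', brandId: 1, price: 12.5 }
];

// Client-side join: index the referenced documents by _id,
// then attach the matching brand to a copy of each product.
function joinBrands(productDocs, brandDocs) {
  const brandById = new Map(brandDocs.map(brand => [brand._id, brand]));
  return productDocs.map(product =>
    Object.assign({}, product, { brand: brandById.get(product.brandId) || null })
  );
}

const joined = joinBrands(products, brands);
console.log(joined[0].brand.name); // Chanel
```

The trade-off is an extra query and some application code in exchange for not duplicating brand data inside every product.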
How to choose a NoSQL DBMS
● All about your application and data
○ How the data is accessed (read/write)
○ How much data (volume)
○ Temporary vs permanent
○ Features of the DBMS
Key Points to Remember
● Design your schema around how your data is used most
(accessed and displayed, read vs. write)
● Choose the right DB provider for your problem
Appendix A: Generate Data
// Run in the mongo shell: fills db.products with 1,000,000 random documents.
var brands = ['Hollister', 'Chanel', 'Nike', 'H&M', 'Celio', 'Adidas'];
var categories = ['red', 'green', 'blue', 'yellow'];
for (var j = 0; j < 1000000; j++) {
    var item = {};
    // Pick a random brand.
    var brandIdx = Math.floor(Math.random() * brands.length) % brands.length;
    item.brand = brands[brandIdx];
    item.categories = [];
    // Make 1 to 4 random picks; duplicates are skipped, so each product
    // ends up with between 1 and 4 distinct categories.
    var numCategories = Math.floor(Math.random() * categories.length) % categories.length;
    for (var i = 0; i <= numCategories; i++) {
        var categoryIdx = Math.floor(Math.random() * categories.length) % categories.length;
        var category = categories[categoryIdx];
        if (item.categories.indexOf(category) === -1) {
            item.categories.push(category);
        }
    }
    // Random price between 0.00 and 100.00, rounded to two decimals.
    item.price = Math.round(Math.random() * 10000) / 100;
    db.products.insert(item);
}
Questions?
For online questions, please leave a comment on the article.
Join the community! (in Paris)
Social networks:
● Follow us on Twitter: https://twitter.com/steamlearn
● Like us on Facebook: https://www.facebook.com/steamlearn
SteamLearn is an Inovia initiative: inovia.fr
Would you like to be in the audience? Contact us at [email protected]