![Page 1: Some notes on NoSQL, in particular MongoDB Bettina Berendt (with thanks to Matthijs van Leeuwen for some of the slides) 8 December 2015](https://reader035.vdocuments.site/reader035/viewer/2022062523/5a4d1ae27f8b9ab0599779e9/html5/thumbnails/1.jpg)
Some notes on NoSQL, in particular MongoDB
Bettina Berendt (with thanks to Matthijs van Leeuwen for some of the slides)
8 December 2015
Overview: NoSQL
‘Not only SQL’
No tables, but storage of, e.g.
– Collections of documents (e.g. JSON, XML)
– Key-value pairs
– Columns of values
– Graphs
– Objects
– …
NoSQL
Advantages
– Horizontally scalable (as opposed to vertically)
– No static schema or data model
– Cheaper to maintain
Disadvantages
– Possibilities can be very system-specific: there is no universal query language
– Often, some coding is necessary
– Fewer and/or weaker theoretical guarantees
Example systems: NoSQL
MongoDB, the most popular system for document stores (see https://en.wikipedia.org/wiki/MongoDB and references there)
MongoDB is “schema-free”!
Understanding the MongoDB / NoSQL notion of “document”
– Good example of what computer scientists call “semi-structured data” (see previous week)
– But actually fairly structured in comparison to e.g. a textual document:
  • MongoDB's format is called BSON, a binary form of JSON
  • See https://en.wikipedia.org/wiki/JSON, https://en.wikipedia.org/wiki/BSON
– Note: JSON can be thought of as an alternative to XML, as described for example on the (certainly not disinterested) page http://www.json.org/xml.html, but not the type of XML you often see for annotating texts, as for example in the Letters of 1916 project
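To make the notion of a “document” concrete, here is a minimal sketch in plain JavaScript (runnable in Node, no MongoDB needed). It takes a sample inventory document of the kind used elsewhere in these slides, serializes it to JSON text, and reads it back. The variable names are illustrative assumptions. BSON itself is a binary encoding and is not produced here.

```javascript
// A document is a nested structure of fields, embedded documents, and arrays.
const doc = {
  item: "ABC1",
  details: { model: "14Q3", manufacturer: "XYZ Company" },
  stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
  category: "clothing"
};

// Serialize to JSON text and parse it back; the nesting survives the round trip.
const jsonText = JSON.stringify(doc);
const parsed = JSON.parse(jsonText);

// Fields are addressed by name and path, not by column position.
const manufacturer = parsed.details.manufacturer; // field of an embedded document
const secondSize = parsed.stock[1].size;          // field of an array element
console.log(manufacturer, secondSize);
```

The nesting is what makes the data “semi-structured”: each value has a name and a type, but there is no table definition that every document must follow.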
INSERT a row (SQL): insert a document (MongoDB)
db.inventory.insert(
  {
    item: "ABC1",
    details: {
      model: "14Q3",
      manufacturer: "XYZ Company"
    },
    stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
    category: "clothing"
  }
)
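What “schema-free” means for inserts can be sketched in memory. The code below is a hypothetical stand-in written in plain JavaScript, not MongoDB's implementation: `insertDocument` and the array-backed `inventory` are assumptions made for illustration. A real MongoDB insert also adds an `_id` field when none is supplied, which this sketch omits.

```javascript
// Hypothetical in-memory stand-in for a collection: simply an array of documents.
const inventory = [];

// Hypothetical helper mimicking the effect of an insert: store the document as given.
function insertDocument(collection, doc) {
  collection.push(doc);
  return collection.length; // number of documents now stored
}

insertDocument(inventory, {
  item: "ABC1",
  details: { model: "14Q3", manufacturer: "XYZ Company" },
  stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
  category: "clothing"
});

// A second document with a different set of fields is accepted without any schema change.
insertDocument(inventory, { item: "DEF2", price: 4.5 });

// List the field names of each stored document to show that they differ.
const fieldsPerDocument = inventory.map(d => Object.keys(d).sort());
console.log(fieldsPerDocument);
```

In SQL, the second insert would require either a new column or a separate table; here the two documents simply coexist.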
SELECT (SQL) find documents (MongoDB)
db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
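In SQL terms, the two queries above correspond roughly to `SELECT * FROM inventory WHERE type IN ('food', 'snacks')` and `SELECT * FROM inventory WHERE type = 'food' AND price < 9.95`. How such a query document is evaluated can be sketched with a small in-memory matcher. This is a hypothetical simplification covering only equality, `$in`, and `$lt`; the functions `matches` and `find` and the sample data are assumptions, not MongoDB code.

```javascript
// Hypothetical in-memory matcher for a small subset of query operators.
// Supports plain equality, $in, and $lt. All conditions are combined with AND.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    const isOperatorObject =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorObject) {
      return value === condition; // e.g. { type: 'food' }
    }
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$in") return operand.includes(value); // value is one of the listed values
      if (op === "$lt") return value < operand;         // value is strictly less than operand
      throw new Error("unsupported operator: " + op);
    });
  });
}

// Return every document in the collection that satisfies the query.
function find(collection, query) {
  return collection.filter(doc => matches(doc, query));
}

// Invented sample data for illustration.
const inventory = [
  { item: "apple",   type: "food",     price: 1.2 },
  { item: "truffle", type: "food",     price: 30 },
  { item: "chips",   type: "snacks",   price: 2.5 },
  { item: "shirt",   type: "clothing", price: 9.0 }
];

const foodOrSnacks = find(inventory, { type: { $in: ["food", "snacks"] } });
const cheapFood = find(inventory, { type: "food", price: { $lt: 9.95 } });
console.log(foodOrSnacks.map(d => d.item), cheapFood.map(d => d.item));
```

Note that the query is itself a document: the selection condition is expressed as data rather than as a string in a query language.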
SELECT and SORT
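In the mongo shell, selection followed by sorting is written by chaining `sort` onto `find`, for example `db.inventory.find( { type: 'food' } ).sort( { price: 1 } )`, where `1` means ascending and `-1` descending; the SQL counterpart is `ORDER BY price`. The effect can be sketched in memory as follows. The sample data and variable names are assumptions for illustration.

```javascript
// Invented sample data for illustration.
const inventory = [
  { item: "truffle", type: "food",   price: 30 },
  { item: "apple",   type: "food",   price: 1.2 },
  { item: "chips",   type: "snacks", price: 2.5 },
  { item: "bread",   type: "food",   price: 3.0 }
];

// Select: keep only the documents whose type is 'food'.
const selected = inventory.filter(doc => doc.type === "food");

// Sort: ascending by price. filter() returned a new array,
// so sorting it in place does not reorder the inventory array.
const sorted = selected.sort((a, b) => a.price - b.price);
console.log(sorted.map(d => d.item));
```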
UPDATE (SQL): update documents (MongoDB)
db.inventory.update(
  { item: "MNO2" },
  {
    $set: {
      category: "apparel",
      details: { model: "14Q3", manufacturer: "XYZ" }
    },
    $currentDate: { lastModified: true }
  }
)
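The update above resembles `UPDATE inventory SET category = 'apparel' ... WHERE item = 'MNO2'` in SQL. What `$set` and `$currentDate` do to the matched document can be sketched in memory. The function `updateFirst` and the sample data are hypothetical; the sketch handles only equality filters and changes at most one document, which mirrors the default single-document behaviour of `update`.

```javascript
// Hypothetical in-memory version of an update with $set and $currentDate.
// It modifies the first document that matches the (equality-only) filter.
function updateFirst(collection, filter, update) {
  const target = collection.find(doc =>
    Object.entries(filter).every(([field, value]) => doc[field] === value));
  if (target === undefined) return 0; // nothing matched, nothing changed

  // $set: assign each listed field; an embedded document is replaced as a whole.
  for (const [field, value] of Object.entries(update.$set || {})) {
    target[field] = value;
  }
  // $currentDate: set each listed field to the current date.
  for (const field of Object.keys(update.$currentDate || {})) {
    target[field] = new Date();
  }
  return 1; // number of documents modified
}

// Invented sample data for illustration.
const inventory = [
  { item: "MNO2", category: "clothing",
    details: { model: "10A1", manufacturer: "ABC", color: "red" } }
];

const modified = updateFirst(inventory, { item: "MNO2" }, {
  $set: {
    category: "apparel",
    details: { model: "14Q3", manufacturer: "XYZ" }
  },
  $currentDate: { lastModified: true }
});
console.log(modified, inventory[0]);
```

Because `details` is set as a whole, the old `color` field inside it is gone after the update; fields that should survive have to be part of the new embedded document.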
Other useful constructs ...
... such as GROUP BY are also available (see Wikipedia description)
... and Python interfaces exist.
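MongoDB offers grouping through its aggregation pipeline with a `$group` stage. The idea behind a query such as `SELECT category, SUM(qty) FROM inventory GROUP BY category` can be sketched in memory as follows. The helper `groupSum` and the sample data are assumptions for illustration, not a MongoDB API.

```javascript
// Invented sample data for illustration.
const inventory = [
  { item: "shirt", category: "clothing", qty: 25 },
  { item: "jeans", category: "clothing", qty: 50 },
  { item: "apple", category: "food",     qty: 100 }
];

// Hypothetical helper: group documents by keyField and sum valueField per group.
function groupSum(collection, keyField, valueField) {
  const totals = {};
  for (const doc of collection) {
    const key = doc[keyField];
    totals[key] = (totals[key] || 0) + doc[valueField];
  }
  return totals;
}

const totalQty = groupSum(inventory, "category", "qty");
console.log(totalQty);
```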
Setting indexes in MongoDB:
Usage (1): BSON structure
Given the following document in the users collection
{
  "_id" : ObjectId(...),
  "name" : "Alice",
  "age" : 27,
  "score" : 25
}
the following command creates an index on the score field:
db.users.createIndex( { "score" : 1 } )
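The `1` in the index specification requests ascending order. Why an ordered index helps with range queries and sorted output can be sketched in memory. MongoDB's indexes are B-tree based; the sketch below swaps in a plain sorted array with binary search, which shows the same idea in a much simpler form. The functions `buildIndex`, `lowerBound`, and `findAtLeast` and the sample users are hypothetical.

```javascript
// Invented sample users for illustration.
const users = [
  { name: "Alice", age: 27, score: 25 },
  { name: "Bob",   age: 31, score: 40 },
  { name: "Carol", age: 22, score: 10 },
  { name: "Dave",  age: 45, score: 33 }
];

// Hypothetical index: (key, position) pairs kept in ascending key order.
function buildIndex(collection, field) {
  return collection
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => a.key - b.key);
}

// Binary search for the first index entry whose key is >= bound.
function lowerBound(index, bound) {
  let lo = 0, hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < bound) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Range query "field >= bound": jump to the start entry, then read entries in order.
function findAtLeast(collection, index, bound) {
  return index.slice(lowerBound(index, bound)).map(e => collection[e.position]);
}

const scoreIndex = buildIndex(users, "score");
const highScorers = findAtLeast(users, scoreIndex, 30);
console.log(highScorers.map(u => u.name));
```

Without the index, answering the same query means inspecting every document; with it, the matching documents are located by a short search and come back already ordered by score.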
Usage (2)
SELECT and SORT (shown with reference to an index)
Importance for DHers?
– Certainly growing, but probably not necessary for everyone
My personal rule of thumb:
– Very useful if
  • you know the query you have (for example because you have worked it out on a small data sample, with SQL, Python, or whatever), and
  • you need to process LOTS of data
– Less useful for very exploratory analysis, since there you may need a universal query language.