Page 1: Some notes on NoSQL, in particular MongoDB Bettina Berendt (with thanks to Matthijs van Leeuwen for some of the slides) 8 December 2015

Some notes on NoSQL, in particular MongoDB

Bettina Berendt (with thanks to Matthijs van Leeuwen for some of the slides)

8 December 2015

Page 2:

Overview: NoSQL

‘Not only SQL’

No tables, but storage of, e.g.
– Collections of documents (e.g. JSON, XML)
– Key-value pairs
– Columns of values
– Graphs
– Objects
– …

Page 3:

NoSQL

Advantages
– Horizontally scalable (as opposed to vertically)
– No static schema or data model
– Cheaper in maintenance

Disadvantages
– Possibilities can be very system-specific: there is no universal query language
– Often, some coding is necessary
– Fewer and/or weaker theoretical guarantees

Page 4:

Example systems: NoSQL

Page 5:

MongoDB, the most popular system for document stores (see https://en.wikipedia.org/wiki/MongoDB and references there)

MongoDB is “schema-free”!

Page 6:

Understanding the MongoDB / NoSQL notion of “document”

– Good example of what computer scientists call “semi-structured data” (see previous week)

– But actually fairly structured in comparison to e.g. a textual document:
  • MongoDB’s format is called BSON, a binary form of JSON
  • See https://en.wikipedia.org/wiki/JSON, https://en.wikipedia.org/wiki/BSON

– Note: JSON can be thought of as an alternative to XML, as described for example on the (certainly not disinterested) http://www.json.org/xml.html, but not the type of XML you often see for annotating texts, as for example in the Letters of 1916 project
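To make the notion of a “document” concrete, here is a minimal plain-JavaScript sketch (no MongoDB involved). The field names and values are invented for illustration; the point is only that a document nests sub-documents and arrays, and that this structure survives serialisation to JSON text.

```javascript
// A "document": named fields whose values may be sub-documents or arrays.
// All names and values here are hypothetical examples.
const letter = {
  title: "Letter to a friend",
  year: 1916,
  author: { name: "A. Writer", city: "Dublin" }, // embedded sub-document
  keywords: ["family", "travel"]                 // array-valued field
};

// Serialise to JSON text, then parse the text back into an object.
const asText = JSON.stringify(letter);
const parsed = JSON.parse(asText);

console.log(asText);
console.log(parsed.author.city);     // nested fields are reached with dot notation
console.log(parsed.keywords.length); // arrays keep their elements and order
```

BSON stores the same kind of structure in binary form; that encoding step is not shown here.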

Page 7:

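INSERT a row (SQL) insert a document (MongoDB)

// Note: newer MongoDB versions prefer insertOne() over insert().
db.inventory.insert(
  {
    item: "ABC1",
    // "details" is an embedded sub-document
    details: { model: "14Q3", manufacturer: "XYZ Company" },
    // "stock" is an array of sub-documents
    stock: [ { size: "S", qty: 25 }, { size: "M", qty: 50 } ],
    category: "clothing"
  }
)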

Page 8:

SELECT (SQL) find documents (MongoDB)

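// SQL: SELECT * FROM inventory WHERE type IN ('food', 'snacks')
db.inventory.find( { type: { $in: [ 'food', 'snacks' ] } } )

// SQL: SELECT * FROM inventory WHERE type = 'food' AND price < 9.95
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )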
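The selection logic of these two queries can be sketched in plain JavaScript over an in-memory array. This is not how MongoDB executes a query; it only illustrates which documents match. The sample data is hypothetical.

```javascript
// Hypothetical sample data standing in for the inventory collection.
const inventory = [
  { item: "apple",  type: "food",     price: 0.5 },
  { item: "crisps", type: "snacks",   price: 1.2 },
  { item: "caviar", type: "food",     price: 49.0 },
  { item: "shirt",  type: "clothing", price: 9.0 }
];

// Same selection as { type: { $in: ['food', 'snacks'] } }:
// the field value must be one of the listed values.
const foodOrSnacks = inventory.filter(d => ["food", "snacks"].includes(d.type));

// Same selection as { type: 'food', price: { $lt: 9.95 } }:
// several fields in one query document are combined with AND.
const cheapFood = inventory.filter(d => d.type === "food" && d.price < 9.95);

console.log(foodOrSnacks.map(d => d.item));
console.log(cheapFood.map(d => d.item));
```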

Page 9:

SELECT and SORT
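The slide content for this page is not part of the transcript. As a stand-in, here is a small plain-JavaScript sketch of selecting and then sorting; in the MongoDB shell the sort step would be written by chaining `.sort({ price: 1 })` onto `find(...)`, where 1 means ascending. The data and the choice of `price` as sort field are assumptions.

```javascript
// Hypothetical sample data standing in for the inventory collection.
const inventory = [
  { item: "caviar", type: "food",     price: 49.0 },
  { item: "apple",  type: "food",     price: 0.5 },
  { item: "bread",  type: "food",     price: 2.0 },
  { item: "shirt",  type: "clothing", price: 9.0 }
];

// Select the food items, then order them by ascending price.
// filter() returns a new array, so sort() does not reorder the original.
const foodByPrice = inventory
  .filter(d => d.type === "food")
  .sort((a, b) => a.price - b.price);

console.log(foodByPrice.map(d => d.item));
```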

Page 10:

UPDATE (SQL) update documents (MongoDB)

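// Note: newer MongoDB versions prefer updateOne() / updateMany() over update().
db.inventory.update(
  { item: "MNO2" },                  // which document to update
  {
    $set: {                          // fields to overwrite
      category: "apparel",
      details: { model: "14Q3", manufacturer: "XYZ" }
    },
    $currentDate: { lastModified: true }   // set lastModified to the current date
  }
)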
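What this update does to a matching document can be sketched in plain JavaScript. This is an in-memory illustration only; the sample documents (including the `color` field) are hypothetical.

```javascript
// Hypothetical sample data standing in for the inventory collection.
const inventory = [
  { item: "MNO2", category: "clothing",
    details: { model: "10A1", manufacturer: "ABC", color: "red" } },
  { item: "XYZ9", category: "clothing",
    details: { model: "22B7", manufacturer: "DEF" } }
];

// Without extra options, update() changes only the first matching document.
const doc = inventory.find(d => d.item === "MNO2");
if (doc) {
  // $set overwrites the named fields and leaves all other fields alone.
  doc.category = "apparel";
  // Setting "details" as a whole replaces the embedded document,
  // so the old "color" field disappears.
  doc.details = { model: "14Q3", manufacturer: "XYZ" };
  // $currentDate stores the time of the update.
  doc.lastModified = new Date();
}

console.log(doc);
```

To change a single nested field while keeping the rest of the sub-document, MongoDB accepts dot notation inside `$set`, for example `"details.model"`.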

Page 11:

Other useful constructs ...

... such as GROUP BY are also available (see Wikipedia description)

... and Python interfaces exist.
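As an illustration of what a GROUP BY computes, here is a plain-JavaScript sketch over an in-memory array. In MongoDB itself grouping is done with the `$group` stage of the aggregation framework; the code below does not use it and only shows the idea. Data and field names are hypothetical.

```javascript
// Hypothetical sample data standing in for the inventory collection.
const inventory = [
  { item: "apple",  type: "food",     qty: 10 },
  { item: "bread",  type: "food",     qty: 5 },
  { item: "crisps", type: "snacks",   qty: 20 },
  { item: "shirt",  type: "clothing", qty: 3 }
];

// In spirit: SELECT type, COUNT(*), SUM(qty) FROM inventory GROUP BY type
const totals = {};
for (const d of inventory) {
  // Create the group the first time its key is seen.
  if (!totals[d.type]) totals[d.type] = { count: 0, qty: 0 };
  totals[d.type].count += 1;
  totals[d.type].qty += d.qty;
}

console.log(totals);
```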

Page 12:

Setting indexes in MongoDB: Usage (1): BSON structure

Given the following document in the users collection

{
  "_id" : ObjectId(...),
  "name" : "Alice",
  "age" : 27,
  "score" : 25
}

the following command creates an index on the score field:

// 1 = ascending order, -1 = descending order
db.users.createIndex( { "score" : 1 } )
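Why an index helps can be sketched in plain JavaScript: keep the indexed values in sorted order, so that a range query can jump to its starting point instead of scanning every document. MongoDB's real indexes are B-tree structures; the sorted array and binary search below are a deliberate simplification, and all users other than Alice are invented.

```javascript
// Hypothetical users collection (only Alice appears in the slides).
const users = [
  { name: "Alice", age: 27, score: 25 },
  { name: "Bob",   age: 31, score: 40 },
  { name: "Carol", age: 22, score: 10 },
  { name: "Dave",  age: 45, score: 33 }
];

// The "index": (score, position) pairs kept in ascending score order.
const scoreIndex = users
  .map((u, pos) => ({ score: u.score, pos }))
  .sort((a, b) => a.score - b.score);

// Binary search for the first index entry with score >= minScore.
function lowerBound(index, minScore) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].score < minScore) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// All users with score >= 30, already in ascending score order.
const start = lowerBound(scoreIndex, 30);
const highScorers = scoreIndex.slice(start).map(e => users[e.pos].name);

console.log(highScorers);
```

Because the index entries are already ordered, a query that both filters and sorts on `score` needs no separate sorting step.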

Page 13:

Usage (2)

Page 14:

SELECT and SORT (shown with reference to an index)

Page 15:

Importance for DHers?

– Certainly growing, but probably not necessary for everyone

My personal rule of thumb:
– Very useful if
  • you already know the query you want to run (for example because you have worked it out on a small data sample, with SQL, Python, or whatever), and
  • you need to process LOTS of data
– Less useful for very exploratory analysis, since there you may need a universal query language.

