Presentation: MongoDB & Elasticsearch & Membase
MongoDB
ElasticSearch
Couchbase
Prepared by: Shalkarbayuly A. (PhD student)
This presentation was prepared for the “Database Management” course as
part of the PhD program at Turgut Ozal University
NoSQL databases
NoSQL is sometimes read as "Not Only SQL"
NoSQL is a mechanism for storing and retrieving
data that is modeled in ways other than relational tables
Motivations are simplicity of design, horizontal
scaling, and control over availability
Problems of relational databases (RD) solved by NoSQL
★ The RD will not scale to your traffic at an
acceptable cost
★ NoSQL provides a tool to develop new
features easily.
★ NoSQL suits local data transactions which
do not have to be very durable, e.g. “liking”
items on websites.
Avoid NoSQL
➢ If the application requires run-time flexibility
➢ If the application requires ACID transactions
➢ If the application requires complicated queries
➢ If the application requires a query language
➢ If consistency is mandatory and there will
be no drastic changes in data volume,
a relational database is the better option
What is MongoDB?
★ MongoDB is cross-platform
★ MongoDB is a document-oriented database
★ MongoDB is a NoSQL database
★ MongoDB stores data in JSON-like
documents
MongoDB philosophy
Keep functionality when we can
Non-relational makes scaling horizontally
practical
Document models are good
Database technology should run anywhere
VMs, cloud, metal, etc
Use cases for MongoDB
Need for horizontal scaling:
storing data across many regular servers
Iterative development:
frequent changes to the database structure
Document-oriented logic:
the web page is more important than the data
Goal: create a web application for e-commerce
Products: there are no fixed product types; different
types of products may be sold in the web app
Problem: design the database schema
E-commerce: sample problem
Product_Book{id, name, shipping_info, price, description, ……….author,title,publisher,edition,ISBN}
Product_Media{id, name, shipping_info, price, description, ……….artist,title,track_listing,label,format}
Simple solution: create a new table for each product type
Problem: very complex code; creating tables and rewriting the app takes time and
causes errors; items of different types cannot be treated as one kind of item
Relational database approach
Set of fields with value and name:
Product{id, name, shipping_info, price, description, field1_value, field1_name,field2_value,field2_name,}
Problems: what if there are many
fields, and how do we find all books?
All types of attributes in one table:
Product{id, name, shipping_info, price, description, type,author,artist,track_listing,ISBN,}
Problems: new item types cause
changes in both code and table
Creating polymorphic tables:
Product{id, name, shipping_info, price, description, type (whether book or media)}
Book{title,author,ISBN,}
Problems: needs extra JOINs, which slow queries down
{title: "Matrix", price: 3500, details: {actors: ["Keanu Reeves", "K. Zeta Jones"]}}
{title: "Sherlock Holmes", price: 2100, details: {ISBN: "33002A", author: "Conan Doyle"}}
★ MongoDB stores data in document form
○ No schema is needed
○ Documents are stored in JSON-like form
○ Data is edited in the application
★ No JOINs
○ Data is loaded in linear time
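To make the document model above concrete, here is a minimal Java sketch that keeps heterogeneous products in a single collection. It uses plain java.util maps in place of a real MongoDB collection, so it only illustrates the idea of schema-free documents, not the MongoDB driver. The class and method names (ProductDocuments, product, withDetail, sampleCatalog) are hypothetical; the sample values come from the two documents above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProductDocuments {

    // Builds one product document; "details" holds the type-specific fields.
    public static Map<String, Object> product(String title, int price, Map<String, Object> details) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", title);
        doc.put("price", price);
        doc.put("details", details);
        return doc;
    }

    // Returns every document whose "details" sub-document contains the given field.
    public static List<Map<String, Object>> withDetail(List<Map<String, Object>> catalog, String field) {
        List<Map<String, Object>> matches = new ArrayList<>();
        for (Map<String, Object> doc : catalog) {
            Object details = doc.get("details");
            if (details instanceof Map && ((Map<?, ?>) details).containsKey(field)) {
                matches.add(doc);
            }
        }
        return matches;
    }

    // A movie and a book live side by side, each with its own set of detail fields.
    public static List<Map<String, Object>> sampleCatalog() {
        Map<String, Object> movieDetails = new LinkedHashMap<>();
        movieDetails.put("actors", Arrays.asList("Keanu Reeves", "K. Zeta Jones"));

        Map<String, Object> bookDetails = new LinkedHashMap<>();
        bookDetails.put("ISBN", "33002A");
        bookDetails.put("author", "Conan Doyle");

        List<Map<String, Object>> catalog = new ArrayList<>();
        catalog.add(product("Matrix", 3500, movieDetails));
        catalog.add(product("Sherlock Holmes", 2100, bookDetails));
        return catalog;
    }

    public static void main(String[] args) {
        // "Find all books" becomes: find every document that carries an ISBN.
        for (Map<String, Object> doc : withDetail(sampleCatalog(), "ISBN")) {
            System.out.println(doc.get("title"));
        }
    }
}
```

Adding a new product type here means adding a document with different detail fields; neither the existing documents nor the lookup code has to change.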
CRUD operations
Create, retrieve, update, and delete operations are done
on the application side
There are no SQL queries
If the software has many apps (on different platforms), this is BAD,
because you have to write the logic each time
If the software has one app, this is GOOD,
because you don’t have to mess with SQL code
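MongoDB: Java example
// Legacy MongoDB Java driver; `collection` is assumed to be a DBCollection obtained elsewhere
import com.mongodb.BasicDBObject;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;

// Create
String json = "{'name':'Mike','surname':'Smith'}";
DBObject dbObject = (DBObject) JSON.parse(json);
collection.insert(dbObject);

// Retrieve (Java form of the shell query db.products.find({'title': "The Matrix"}))
DBCursor cursor = collection.find(new BasicDBObject("title", "The Matrix"));

// Update: set "clients" to 110 on the document whose name is "Mike"
BasicDBObject newD = new BasicDBObject();
newD.append("$set", new BasicDBObject().append("clients", 110));
BasicDBObject sq = new BasicDBObject().append("name", "Mike");
collection.update(sq, newD);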
ElasticSearch
★ Real-time data
★ Real-time analytics
★ Distributed
★ High-availability
★ Multitenancy
★ Full-text search
★ Document-oriented
★ Schema-free
★ RESTful API
★ Built on top of
Apache Lucene
Elasticsearch: important features
Elasticsearch is based on Lucene
Elasticsearch is ready for all types of search
➔Elasticsearch is a search engine
➔Elasticsearch is a document database
What is Lucene? (small explanation)
Lucene is an information retrieval library, which
takes documents and makes them easily
searchable, through:
● indexing
● advanced analysis
● tokenization and stemming (e.g. indexing "mice" as "mouse")
Lucene creates an inverted
index, which maps each term to the
documents that contain it, so a search
does not have to scan every document
Indexing
Searching for terms in
documents without
an index costs on the order of
doc.size × no.documents operations
Don’t do this
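The difference between scanning and indexing can be sketched in a few lines of Java. This is a toy inverted index, not Lucene's implementation: it lowercases text, splits on non-word characters, and maps each term to the ids of the documents containing it. The class and method names (InvertedIndexSketch, add, search) are hypothetical, and real analyzers do much more (stemming, stop words, scoring).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {

    // term -> sorted ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Indexing: each document is tokenized once, when it is added.
    public void add(int docId, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Searching: one map lookup per term, with no pass over the document texts.
    public Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "The Godfather");
        index.add(2, "The Matrix");
        index.add(3, "Sherlock Holmes");
        System.out.println(index.search("the"));    // prints [1, 2]
        System.out.println(index.search("matrix")); // prints [2]
    }
}
```

The cost of reading every document is paid once at indexing time; each later search touches only the entry for the requested term.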
Elasticsearch: use case
Elasticsearch is used as:
★ a search engine
★ a search mechanism for web apps
alongside the main database
○ e.g. an e-commerce site storing its data in MongoDB, while
the search data is stored in Elasticsearch
★ a document database
Elasticsearch: REST
Elasticsearch can be
accessed through
a REST API over HTTP
# Inserting data
PUT http://localhost:9200/movies/movie/1
{"title": "The Godfather", "director": "Francis Ford Coppola"}
# Getting data
GET http://localhost:9200/movies/movie/1
# Deleting data
DELETE http://localhost:9200/movies/movie/1
# Searching data
POST http://localhost:9200/_search
{"query": {"query_string": {"query": "kill"}}}
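As a rough illustration of how an application might issue these calls, the following Java sketch builds the four requests with the JDK's java.net.http API (Java 11 or later). It only constructs the requests and does not send them, so it runs without an Elasticsearch node. The class name MovieRequests and its method names are hypothetical; the host, port, and paths are the ones used on this slide.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class MovieRequests {

    // Address taken from the slide; adjust for a real cluster.
    static final String BASE = "http://localhost:9200";

    // PUT /movies/movie/{id} with a JSON document as the body
    public static HttpRequest insert(String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // GET /movies/movie/{id}
    public static HttpRequest get(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id)).GET().build();
    }

    // DELETE /movies/movie/{id}
    public static HttpRequest delete(String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/movies/movie/" + id)).DELETE().build();
    }

    // POST /_search with a query_string query.
    // Note: `text` is not JSON-escaped here, so pass plain words only.
    public static HttpRequest search(String text) {
        String body = "{\"query\": {\"query_string\": {\"query\": \"" + text + "\"}}}";
        return HttpRequest.newBuilder(URI.create(BASE + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest[] requests = {
            insert("1", "{\"title\": \"The Godfather\", \"director\": \"Francis Ford Coppola\"}"),
            get("1"),
            delete("1"),
            search("kill")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To actually execute a request, it could be passed to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), which assumes a node is listening on localhost:9200.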
Couchbase history
Couchbase was created by combining two
NoSQL databases
Membase + CouchOne (principal players
behind CouchDB) = Couchbase
Couchbase
● Written in: Erlang & C
● Main point: memcached-compatible, but with
persistence and clustering
● Protocol: memcached + extensions
● Very fast (200k+ operations/sec) access to data by key
● Provides memcached-style in-memory
caching buckets
Couchbase
Best used for: any application where low-latency
data access, high concurrency support and
high availability are requirements.
For example: Low-latency use-cases like ad
targeting or highly-concurrent web apps like
online gaming (e.g. Zynga).
Couchbase
Couchbase stores data in key-value or in document form
Couchbase is a key-value store: every document has a
key and a value
A key can be up to 250 bytes long
Keys are unique within a bucket: a given key can appear only once
Values can be JSON, strings, numbers, binary blobs, or special
positive numbers
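The key rules above can be illustrated with a small in-memory model in Java. This is not the Couchbase SDK: BucketSketch and its add, set, and get methods are hypothetical names, loosely modeled on memcached-style operations, and the 250 limit is counted in UTF-8 bytes as an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class BucketSketch {

    static final int MAX_KEY_BYTES = 250;

    // One entry per key: the map itself enforces key uniqueness within the bucket.
    private final Map<String, Object> items = new HashMap<>();

    // Stores the value only if the key is not present yet; returns false otherwise.
    public boolean add(String key, Object value) {
        checkKey(key);
        Objects.requireNonNull(value, "value");
        return items.putIfAbsent(key, value) == null;
    }

    // Stores the value, overwriting whatever was kept under the key before.
    public void set(String key, Object value) {
        checkKey(key);
        Objects.requireNonNull(value, "value");
        items.put(key, value);
    }

    // Returns the value for the key, or null if the key is unknown.
    public Object get(String key) {
        return items.get(key);
    }

    private static void checkKey(String key) {
        int length = key.getBytes(StandardCharsets.UTF_8).length;
        if (length == 0 || length > MAX_KEY_BYTES) {
            throw new IllegalArgumentException("key must be 1.." + MAX_KEY_BYTES + " bytes");
        }
    }

    public static void main(String[] args) {
        BucketSketch bucket = new BucketSketch();
        System.out.println(bucket.add("user::1", "{\"name\": \"Mike\"}")); // prints true
        System.out.println(bucket.add("user::1", "another value"));        // prints false
        bucket.set("likes::42", 7);
        System.out.println(bucket.get("likes::42"));                       // prints 7
    }
}
```

Values are held as plain Object here to mirror the point that a value may be JSON text, a string, a number, or a binary blob.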
Key-value NoSQL databases
Performance: high
Scalability: high
Flexibility: high
Complexity: none
Advantage:
High speed of response
Disadvantage:
All logic is located in the app