nosql - what's that

Sergejus Barinovas | Microsoft MVP @sergejusb, sergejus.blogas.lt NoSQL – What’s that?


Post on 01-Nov-2014


DESCRIPTION

Overview of NoSQL in general, its types and available most pop

TRANSCRIPT

Page 1: NoSQL - what's that

Sergejus Barinovas | Microsoft MVP

@sergejusb, sergejus.blogas.lt

NoSQL – What’s that?

Page 2: NoSQL - what's that

NoSQL

Page 3: NoSQL - what's that

WHY?

Page 4: NoSQL - what's that

• Limited SQL scalability
  • Horizontal partitioning (sharding)
  • Vertical partitioning

NoSQL – Why?
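
Horizontal partitioning (sharding) can be sketched as routing each key to one of several servers by a stable hash. This is only a minimal illustration: `shard_for`, `put`, `get`, and the shard count of 4 are hypothetical names and values, not something taken from the slides, and each "server" is modelled as a plain dict.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each shard is a plain dict standing in for one database server.
shards = [dict() for _ in range(4)]

def put(key: str, value: str) -> None:
    """Store the value on the shard that owns the key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    """Read the value back from the shard that owns the key."""
    return shards[shard_for(key, len(shards))].get(key)

put("sergejusb", "profile data")
print(get("sergejusb"))
```

With plain modulo hashing, changing the number of shards remaps most keys, which is one reason manual sharding of a relational database tends to be painful to operate.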

Page 5: NoSQL - what's that

• Limited SQL availability
  • Master / slave configuration

NoSQL – Why?

Page 6: NoSQL - what's that

• SQL limitations for storing huge amounts of data
  • Key / value / type columns

NoSQL – Why?

Page 7: NoSQL - what's that

• Limited SQL speed of read/write operations
  • Multiple read replicas

NoSQL – Why?

Page 8: NoSQL - what's that

• 2009, Eric Evans

• NoSQL – open source distributed databases, not relational SQL databases

• NoSQL – not only SQL

• NoSQL → Big Data

NoSQL History

Page 9: NoSQL - what's that

• The ability to horizontally scale simple-operation throughput over many servers

NoSQL Characteristics (scalability)

Page 10: NoSQL - what's that

• A “weaker” concurrency model than the ACID transactions in most SQL systems

NoSQL Characteristics (BASE)

Page 11: NoSQL - what's that

• Efficient use of distributed indexes and RAM for data storage

NoSQL Characteristics (distributed)

Page 12: NoSQL - what's that

• The ability to dynamically define new attributes or data schema

NoSQL Characteristics (schema-less)

Page 13: NoSQL - what's that

• Atomicity – all or nothing

• Consistency – state integrity

• Isolation – no reads of uncommitted data

• Durability – recover committed transactions

ACID (transactions)
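
The "all or nothing" property can be illustrated with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `accounts` table, the names `alice` and `bob`, and the `transfer` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(amount: int) -> bool:
    """Move money from alice to bob; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# Debiting 500 from alice violates the CHECK constraint,
# so the earlier credit to bob is rolled back as well.
ok = transfer(500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```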

Page 14: NoSQL - what's that

• 2000, Eric Brewer

• It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

• Consistency

• Availability

• Partition tolerance

CAP Theorem

Page 15: NoSQL - what's that

• Basically Available – partial system failures are OK

• Soft state – inconsistency is OK

• Eventual consistency – stale data is OK

BASE (eventual consistency)
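
A toy simulation can make "stale data is OK" concrete. It assumes one primary and one secondary replica with a replication log that is applied lazily; the `Replica` class, the `put` and `sync` helpers, and the explicit version numbers are all hypothetical and do not describe how any particular product replicates.

```python
class Replica:
    """One copy of the data; keeps the highest version seen for each key."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, version, value):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

primary, secondary = Replica(), Replica()
pending = []  # replication log not yet applied to the secondary

def put(key, version, value):
    """Acknowledge the write on the primary and replicate it later."""
    primary.write(key, version, value)
    pending.append((key, version, value))

def sync():
    """Apply all outstanding writes to the secondary."""
    while pending:
        secondary.write(*pending.pop(0))

put("status", 1, "old")
sync()
put("status", 2, "new")
stale = secondary.read("status")  # the secondary has not caught up yet
sync()
fresh = secondary.read("status")  # replicas have converged
print(stale, fresh)
```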

Page 16: NoSQL - what's that
Page 17: NoSQL - what's that

NoSQL Databases

Page 18: NoSQL - what's that

• Key / value store

• Document database

• Graph database

• Columnar database

NoSQL Categories

Page 19: NoSQL - what's that

• <key, value> or Tuple<key, v1, ..., vn>

• Simple operations
  • Get
  • Put
  • Delete

Key / value store

Key (Byte[]) → Value (Byte[])
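
The three operations can be sketched as a thin wrapper around a dictionary of byte strings. `KeyValueStore` is a hypothetical in-memory class written for this illustration, not the API of any product named in this deck; the store never interprets the value, so the caller decides how to serialize it (JSON here).

```python
import json
from typing import Dict, Optional

class KeyValueStore:
    """In-memory stand-in for a key/value store; keys and values are opaque bytes."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        """Remove a key; return True if it existed."""
        return self._data.pop(key, None) is not None

store = KeyValueStore()
profile = {"name": "sergejusb", "blog": "sergejus.blogas.lt"}

# The caller serializes the object; the store only sees bytes.
store.put(b"sergejusb", json.dumps(profile).encode("utf-8"))
loaded = json.loads(store.get(b"sergejusb").decode("utf-8"))
print(loaded)
```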

Page 20: NoSQL - what's that

Key / value store

Key              Value
“current_date”   2023-04-08
“sergejusb”      Binary Object
“sergejusb”      JSON Object

Page 21: NoSQL - what's that

• Dynamo*

• Membase

• Voldemort

• Redis

• Azure Table Storage

• Riak

Key / value store

Page 22: NoSQL - what's that

Name: Dynamo

Created: 2007, Amazon (proprietary)

Implementation: ?

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: ?

Key / value store

Page 23: NoSQL - what's that

Name: Membase

Created: 2010, sponsored by Zynga

Implementation: C / C++ / Erlang

Distributed: Yes

Replication: Multiple Servers

CAP: CP

API: Memcached API, JSON

Key / value store

Page 24: NoSQL - what's that

Name: Voldemort

Created: 2008, LinkedIn

Implementation: Java

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Java

Key / value store

Page 25: NoSQL - what's that

Name: Redis

Created: 2009, sponsored by VMware

Implementation: C

Distributed: No

Replication: Master / Slave

CAP: CP

API: Various Languages

Key / value store

Page 26: NoSQL - what's that

Name: Azure Table Storage

Created: 2008, Microsoft

Implementation: ?

Distributed: Yes

Replication: Multiple Servers (DFS)

CAP: CP

API: .NET API, JSON

Key / value store

Page 27: NoSQL - what's that

Name: Riak

Created: 2008, Basho (from Akamai)

Implementation: Erlang

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: JSON

Key / value store

Page 28: NoSQL - what's that

• Document == complex object
  • XML
  • YAML
  • JSON / BSON

• Support for secondary indexes

• Schema can be defined at runtime

• Optional support for simple querying using Map / Reduce

Document database
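
A rough sketch of schema-less documents with a secondary index follows. `DocumentCollection` and its methods are hypothetical names for this illustration, and the sample documents are made up; real document databases maintain their indexes quite differently.

```python
from collections import defaultdict

class DocumentCollection:
    """Stores dict documents by id and keeps optional per-field indexes."""

    def __init__(self):
        self.docs = {}      # doc id -> document (a dict with any fields)
        self.indexes = {}   # field -> {field value -> set of doc ids}

    def create_index(self, field):
        """Build an index over documents that already contain the field."""
        index = defaultdict(set)
        for doc_id, doc in self.docs.items():
            if field in doc:
                index[doc[field]].add(doc_id)
        self.indexes[field] = index

    def insert(self, doc_id, doc):
        # Simplification: ids are write-once, so indexes never go stale.
        if doc_id in self.docs:
            raise ValueError(f"duplicate id: {doc_id}")
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].add(doc_id)

    def find(self, field, value):
        """Use the index when one exists, otherwise scan every document."""
        if field in self.indexes:
            ids = self.indexes[field].get(value, set())
            return [self.docs[i] for i in sorted(ids)]
        return [d for d in self.docs.values() if d.get(field) == value]

people = DocumentCollection()
people.insert(1, {"name": "sergejusb", "type": "person", "blog": "sergejus.blogas.lt"})
people.insert(2, {"name": "tdagys", "type": "person"})   # no "blog" field at all
people.create_index("type")
people.insert(3, {"title": "NoSQL", "type": "talk"})     # a new shape at runtime

persons = [d["name"] for d in people.find("type", "person")]
print(persons)
```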

Page 29: NoSQL - what's that

• MongoDB

• CouchDB

• RavenDB

Document database

Page 30: NoSQL - what's that

Name: MongoDB

Created: 2008, 10gen

Implementation: C++

Distributed: Yes via Shards

Replication: Master / Slave

CAP: CP

API: BSON

Document database

Page 31: NoSQL - what's that

Name: CouchDB

Created: 2005

Implementation: Erlang

Distributed: Sort of

Replication: Master / Master

CAP: AP

API: JSON

Document database

Page 32: NoSQL - what's that

Name: RavenDB

Created: 2010, Ayende Rahien

Implementation: C#

Distributed: Yes via Shards

Replication: Master / Master

CAP: AP

API: .NET API, JSON

Document database

Page 33: NoSQL - what's that

• Graph == network

• Basic constructs
  • Node
  • Edge
  • Properties

Graph database

[Figure: example graph with nodes "sergejus", "tdagys", and "sergejus.blogas.lt", and edges labeled "authors", "reads", "knows", and "knows"]
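
The three constructs can be sketched with two small data classes. The node and edge names come from the slide's figure, but the edge directions, the `kind` and `since` properties, and the `Graph` class itself are assumptions made for this illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)

class Graph:
    def __init__(self):
        self.nodes = {}   # node name -> Node
        self.edges = []   # list of directed, labeled edges

    def add_node(self, name, **properties):
        self.nodes[name] = Node(name, properties)

    def add_edge(self, source, label, target, **properties):
        self.edges.append(Edge(source, label, target, properties))

    def neighbors(self, source, label):
        """Follow outgoing edges with the given label from one node."""
        return [e.target for e in self.edges
                if e.source == source and e.label == label]

g = Graph()
g.add_node("sergejus", kind="person")
g.add_node("tdagys", kind="person")
g.add_node("sergejus.blogas.lt", kind="blog")

# Edge directions below are assumed; the figure only preserves the labels.
g.add_edge("sergejus", "authors", "sergejus.blogas.lt")
g.add_edge("tdagys", "reads", "sergejus.blogas.lt")
g.add_edge("sergejus", "knows", "tdagys", since=2010)  # hypothetical property
g.add_edge("tdagys", "knows", "sergejus")

print(g.neighbors("sergejus", "knows"))
```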

Page 34: NoSQL - what's that

• FlockDB

• Neo4j

Graph database

Page 35: NoSQL - what's that

Name: FlockDB

Created: 2010, Twitter

Implementation: Scala

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Thrift, Ruby

Graph database

Page 36: NoSQL - what's that

Name: Neo4j

Created: 2003, Neo Technology

Implementation: Java

Distributed: No

Replication: Master / Slave

CAP: CP

API: JSON, Various Languages

Graph database

Page 37: NoSQL - what's that

• For HUGE amounts of data

• Columns are added at runtime

• Great scalability
  • Horizontal
  • Vertical

Columnar database

Page 38: NoSQL - what's that

• Unusual data model
  • Key Space == Database
  • Column Family == Table
  • Columns and Super Columns
  • Super Column == array of Columns
  • Column == Tuple<Key, Value, Timestamp, TTL>

Columnar database
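
The data model can be approximated with nested dictionaries: a key space holds column families, a column family holds rows, and each row holds columns that can be added at runtime. The sketch below leaves out Super Columns; the `insert` and `read_row` helpers, the `users` column family, and the rule that a TTL is counted in seconds from the column's timestamp are assumptions for illustration.

```python
from typing import NamedTuple, Optional

class Column(NamedTuple):
    """Column == Tuple<Key, Value, Timestamp, TTL>."""
    key: str
    value: bytes
    timestamp: float
    ttl: Optional[float] = None  # seconds; None means the column never expires

    def is_expired(self, now: float) -> bool:
        return self.ttl is not None and now >= self.timestamp + self.ttl

# key space -> column family -> row key -> {column key: Column}
keyspace = {"users": {}}

def insert(family: str, row_key: str, column: Column) -> None:
    """Add or overwrite one column; rows need not share the same columns."""
    keyspace[family].setdefault(row_key, {})[column.key] = column

def read_row(family: str, row_key: str, now: float) -> dict:
    """Return the non-expired columns of a row as {column key: value}."""
    row = keyspace[family].get(row_key, {})
    return {k: c.value for k, c in row.items() if not c.is_expired(now)}

insert("users", "sergejusb", Column("blog", b"sergejus.blogas.lt", timestamp=100.0))
insert("users", "sergejusb", Column("session", b"abc", timestamp=100.0, ttl=60.0))

before = read_row("users", "sergejusb", now=120.0)  # both columns visible
after = read_row("users", "sergejusb", now=200.0)   # the session column has expired
print(before, after)
```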

Page 41: NoSQL - what's that

• BigTable*

• Cassandra

• HBase

• Hypertable

Columnar database

Page 42: NoSQL - what's that

Name: BigTable

Created: 2006, Google

Implementation: C++

Distributed: Yes

Replication: Multiple Servers (GFS)

CAP: CP

API: C++

Columnar database

Page 43: NoSQL - what's that

Name: Cassandra

Created: 2008, Facebook

Implementation: Java

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Thrift, Avro

Columnar database

Page 44: NoSQL - what's that

Name: HBase

Created: 2007, Powerset

Implementation: Java

Distributed: Yes

Replication: Multiple Servers (HDFS)

CAP: CP

API: Thrift, Java, JSON

Columnar database

Page 45: NoSQL - what's that

Name: Hypertable

Created: 2007, Zvents

Implementation: C++

Distributed: Yes

Replication: Multiple Servers

CAP: CP

API: Thrift

Columnar database

Page 46: NoSQL - what's that

• ORDER BY ?
  • “Natural Key Order”

NoSQL Limitations
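
"Natural key order" means the only ordering available is the order of the keys themselves, so the sort criterion has to be designed into the key. A minimal sketch, assuming a hypothetical `OrderedStore` that keeps its keys sorted and supports range scans:

```python
import bisect

class OrderedStore:
    """Key/value store that keeps keys sorted so ranges can be scanned in order."""

    def __init__(self):
        self._keys = []     # sorted list of keys
        self._values = {}   # key -> value

    def put(self, key: str, value: str) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: str, end: str):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]

store = OrderedStore()
store.put("user:002", "tdagys")
store.put("order:001", "book")
store.put("user:010", "guest")
store.put("user:001", "sergejusb")

# ';' is the character right after ':', so this range covers every "user:" key.
users = store.scan("user:", "user;")
print(users)
```

Zero-padding the numeric part of the key is what makes `user:002` sort before `user:010`; ordering by any other attribute would need a second set of keys.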

Page 47: NoSQL - what's that

• GROUP BY ?
  • Map / Reduce

NoSQL Limitations
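
A single-process sketch of how a GROUP BY with SUM maps onto Map / Reduce: the map step emits (group key, value) pairs and the reduce step aggregates each group. The `map_reduce` helper and the sample `posts` data are made up for the example; a real system would run the map and reduce steps on many machines.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

posts = [
    {"author": "sergejusb", "views": 120},
    {"author": "tdagys", "views": 30},
    {"author": "sergejusb", "views": 80},
]

def mapper(post):
    # Equivalent of "GROUP BY author": the author becomes the key.
    yield post["author"], post["views"]

def reducer(author, views):
    # Equivalent of "SUM(views)" within one group.
    return sum(views)

totals = map_reduce(posts, mapper, reducer)
print(totals)
```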

Page 48: NoSQL - what's that

• JOIN ?
  • Multiple Map / Reduce

NoSQL Limitations
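
One common way to express a join is a reduce-side join: each input gets its own map function that tags records with their source, and the reduce step pairs up records that share a key. The sketch below runs in one process; the `users` and `posts` data and all function names are hypothetical.

```python
from collections import defaultdict

users = [{"id": 1, "name": "sergejus"}, {"id": 2, "name": "tdagys"}]
posts = [{"user_id": 1, "title": "NoSQL"}, {"user_id": 1, "title": "CAP"}]

def map_users(user):
    # Tag each value with its source so the reducer can tell the sides apart.
    yield user["id"], ("user", user["name"])

def map_posts(post):
    yield post["user_id"], ("post", post["title"])

def shuffle(pairs):
    """Group emitted values by join key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def join_reducer(values):
    """Pair every user record with every post record that shares the key."""
    names = [v for tag, v in values if tag == "user"]
    titles = [v for tag, v in values if tag == "post"]
    return [(name, title) for name in names for title in titles]

pairs = [kv for u in users for kv in map_users(u)]
pairs += [kv for p in posts for kv in map_posts(p)]
joined = [row for values in shuffle(pairs).values() for row in join_reducer(values)]
print(joined)
```

This behaves like an inner join: a user without posts produces no rows.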

Page 49: NoSQL - what's that

• SELECT * ?
  • Multi-Machine Map / Reduce

NoSQL Limitations

Page 50: NoSQL - what's that

• Maturity

• Tooling

• Specificity

NoSQL Limitations

Page 51: NoSQL - what's that

• Choose the right tool for the task

• You can use BOTH

SQL vs. NoSQL

Page 52: NoSQL - what's that

Q & A