nosql - what's that

Post on 01-Nov-2014

Category:

Technology

DESCRIPTION

Overview of NoSQL in general, its types and available most pop

TRANSCRIPT

Sergejus Barinovas | Microsoft MVP

@sergejusb, sergejus.blogas.lt

NoSQL – What’s that?

NoSQL

WHY?

• Limited SQL scalability
  • Horizontal partitioning (sharding)
  • Vertical partitioning

NoSQL – Why?

• Limited SQL availability
  • Master / slave configuration

NoSQL – Why?

• SQL limitations for storing huge amounts of data
  • Key / value / type columns

NoSQL – Why?

• Limited SQL speed of read/write operations
  • Multiple read replicas

NoSQL – Why?

• 2009, Eric Evans

• NoSQL – open source distributed databases, not relational SQL databases

• NoSQL – not only SQL

• NoSQL → Big Data

NoSQL History

• The ability to horizontally scale simple-operation throughput over many servers

NoSQL Characteristics (scalability)
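
One way to picture horizontal scaling of simple operations is hash-based sharding: each key is routed to one of several servers. The sketch below is only an illustration; the server names and the `shard_for` helper are hypothetical, and production systems often prefer consistent hashing so that adding a server moves fewer keys.

```python
import hashlib

# Hypothetical server names, used for illustration only.
SERVERS = ["node-a", "node-b", "node-c"]

def shard_for(key: str, servers=SERVERS) -> str:
    """Pick a server for a key using a stable hash (simple modulo sharding)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# The same key always maps to the same server, so get/put traffic
# for different keys can be spread across machines.
placement = {key: shard_for(key) for key in ["sergejusb", "tdagys", "current_date"]}
```

Because the hash is deterministic, any client can compute the target server locally without asking a central coordinator.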

• A “weaker” concurrency model than the ACID transactions in most SQL systems

NoSQL Characteristics (BASE)

• Efficient use of distributed indexes and RAM for data storage

NoSQL Characteristics (distributed)

• The ability to dynamically define new attributes or data schema

NoSQL Characteristics (schema-less)

• Atomicity – all or nothing

• Consistency – state integrity

• Isolation – no reads of uncommitted data

• Durability – recover committed transactions

ACID (transactions)
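
To make "all or nothing" concrete, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the sample balances are assumptions made for illustration; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failing case the credit to the destination account has already run when the debit is rejected, and the rollback undoes it.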

• 2000, Eric Brewer

• It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

• Consistency

• Availability

• Partition tolerance

CAP Theorem

• Basically Available – partial system failures are OK

• Soft state – inconsistency is OK

• Eventual consistency – stale data is OK

BASE (eventual consistency)
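
The toy model below sketches what "stale data is OK" can look like in practice: a write is visible on the primary at once but reaches the replicas only when replication runs. The class and method names are hypothetical, and the explicit `sync()` call stands in for background replication in a real system.

```python
from collections import deque

class Replica:
    def __init__(self):
        self.data = {}

class EventuallyConsistentStore:
    """Toy model: writes hit the primary at once and reach replicas later."""

    def __init__(self, replica_count=2):
        self.primary = Replica()
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = deque()  # replication log not yet applied

    def put(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def get_from_replica(self, index, key):
        # May return stale data (or None) until sync() has run.
        return self.replicas[index].data.get(key)

    def sync(self):
        """Apply the pending log, in order, to every replica."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value
```

Between `put()` and `sync()` a replica read returns the previous value; after `sync()` all replicas agree with the primary.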

NoSQL Databases

• Key / value store

• Document database

• Graph database

• Columnar database

NoSQL Categories

• <key, value> or Tuple<key, v1, ..., vn>

• Simple operations
  • Get
  • Put
  • Delete

Key / value store
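
The three operations can be sketched as a thin wrapper around a dictionary. This is an in-memory illustration only; the `KeyValueStore` class is hypothetical and does not mirror the API of any product named in this deck.

```python
from typing import Dict, Optional

class KeyValueStore:
    """In-memory store mapping byte-string keys to opaque byte-string values."""

    def __init__(self) -> None:
        self._items: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Returns None when the key is absent.
        return self._items.get(key)

    def delete(self, key: bytes) -> bool:
        """Remove a key; return True if it existed."""
        return self._items.pop(key, None) is not None
```

The store never inspects the value, so a caller is free to put a serialized JSON document or any other binary object behind a key.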

Key      Value
Byte[]   Byte[]

Key / value store

Key              Value
“current_date”   2023-04-08
“sergejusb”      Binary Object
“sergejusb”      JSON Object

• Dynamo*

• Membase

• Voldemort

• Redis

• Azure Table Storage

• Riak

Key / value store

Name: Dynamo

Created: 2007, Amazon (proprietary)

Implementation: ?

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: ?

Key / value store

Name: Membase

Created: 2010, sponsored by Zynga

Implementation: C / C++ / Erlang

Distributed: Yes

Replication: Multiple Servers

CAP: CP

API: Memcached API, JSON

Key / value store

Name: Voldemort

Created: 2008, LinkedIn

Implementation: Java

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Java

Key / value store

Name: Redis

Created: 2009, sponsored by VMware

Implementation: C

Distributed: No

Replication: Master / Slave

CAP: CP

API: Various Languages

Key / value store

Name: Azure Table Storage

Created: 2008, Microsoft

Implementation: ?

Distributed: Yes

Replication: Multiple Servers (DFS)

CAP: CP

API: .NET API, JSON

Key / value store

Name: Riak

Created: 2008, Basho (from Akamai)

Implementation: Erlang

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: JSON

Key / value store

• Document == complex object
  • XML
  • YAML
  • JSON / BSON

• Support for secondary indexes

• Schema can be defined at runtime

• Optional support for simple querying using Map / Reduce

Document database
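
As a rough illustration of documents with differing shapes plus a secondary index, consider the sketch below. The `DocumentStore` class and its field names are hypothetical; indexed values are assumed to be hashable and document ids sortable.

```python
from collections import defaultdict

class DocumentStore:
    """Schema-less documents (dicts) with optional secondary indexes."""

    def __init__(self, indexed_fields):
        self.docs = {}
        # field name -> field value -> set of document ids
        self.indexes = {field: defaultdict(set) for field in indexed_fields}

    def insert(self, doc_id, doc):
        # Drop stale index entries if the document is being replaced.
        old = self.docs.get(doc_id)
        if old is not None:
            for field, index in self.indexes.items():
                if field in old:
                    index[old[field]].discard(doc_id)
        self.docs[doc_id] = doc
        for field, index in self.indexes.items():
            if field in doc:  # documents may omit any field
                index[doc[field]].add(doc_id)

    def find_by(self, field, value):
        """Look up documents through the secondary index, ordered by id."""
        ids = self.indexes[field].get(value, set())
        return [self.docs[doc_id] for doc_id in sorted(ids)]
```

Documents that lack the indexed field are still stored; they simply never appear in lookups on that field.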

• MongoDB

• CouchDB

• RavenDB

Document database

Name: MongoDB

Created: 2008, 10gen

Implementation: C++

Distributed: Yes via Shards

Replication: Master / Slave

CAP: CP

API: BSON

Document database

Name: CouchDB

Created: 2005

Implementation: Erlang

Distributed: Sort of

Replication: Master / Master

CAP: AP

API: JSON

Document database

Name: RavenDB

Created: 2010, Ayende Rahien

Implementation: C#

Distributed: Yes via Shards

Replication: Master / Master

CAP: AP

API: .NET API, JSON

Document database

• Graph == network

• Basic constructs
  • Node
  • Edge
  • Properties

Graph database
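
The three constructs can be modelled with two small record types and an edge list. This is a minimal sketch; the `Graph` class, the sample names and the edge labels are hypothetical, and a real graph database would index edges instead of scanning them.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)

class Graph:
    def __init__(self):
        self.nodes = {}   # node name -> Node
        self.edges = []   # directed, labelled edges

    def add_node(self, name, **properties):
        self.nodes[name] = Node(name, properties)

    def add_edge(self, source, label, target, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(Edge(source, label, target, properties))

    def neighbours(self, source, label):
        """Names of nodes reachable from source over edges with this label."""
        return [e.target for e in self.edges
                if e.source == source and e.label == label]
```

Both nodes and edges carry their own property dictionaries, which is what distinguishes this model from a plain adjacency list.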

[Figure: example graph with nodes "sergejus", "tdagys" and "sergejus.blogas.lt", connected by edges labelled "authors", "reads", "knows" and "knows"]

• FlockDB

• Neo4J

Graph database

Name: FlockDB

Created: 2010, Twitter

Implementation: Scala

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Thrift, Ruby

Graph database

Name: Neo4J

Created: 2003, Neo Technology

Implementation: Java

Distributed: No

Replication: Master / Slave

CAP: CP

API: JSON, Various Languages

Graph database

• For HUGE amounts of data

• Columns are added at runtime

• Great scalability
  • Horizontal
  • Vertical

Columnar database

• Unusual data model
  • Key Space == Database
  • Column Family == Table
  • Columns and Super Columns
  • Super Column == array of Columns
  • Column == Tuple<Key, Value, Timestamp, TTL>

Columnar database
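
The nesting described above can be sketched with plain dictionaries and a tuple type. Everything here is illustrative: the helper names are hypothetical, super columns are left out, and resolving conflicting writes by newest timestamp is an assumption rather than something stated on the slide.

```python
import time
from typing import Any, NamedTuple, Optional

class Column(NamedTuple):
    key: str
    value: Any
    timestamp: float
    ttl: Optional[float] = None  # seconds; None means no expiry

    def is_live(self, now: float) -> bool:
        return self.ttl is None or now < self.timestamp + self.ttl

# A key space is modelled as:
#   column family name -> row key -> column key -> Column

def put_column(keyspace, family, row_key, column):
    """Store a column; rows may hold different columns, added at runtime."""
    row = keyspace.setdefault(family, {}).setdefault(row_key, {})
    current = row.get(column.key)
    # Assumed conflict rule: the newest timestamp wins.
    if current is None or column.timestamp >= current.timestamp:
        row[column.key] = column

def get_column(keyspace, family, row_key, column_key, now=None):
    """Return the column value, or None if missing or past its TTL."""
    now = time.time() if now is None else now
    column = keyspace.get(family, {}).get(row_key, {}).get(column_key)
    if column is None or not column.is_live(now):
        return None
    return column.value
```

Passing `now` explicitly keeps the expiry logic easy to test; callers that omit it get the current wall-clock time.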

• BigTable*

• Cassandra

• HBase

• Hypertable

Columnar database

Name: BigTable

Created: 2006, Google

Implementation: C++

Distributed: Yes

Replication: Multiple Servers (GFS)

CAP: CP

API: C++

Columnar database

Name: Cassandra

Created: 2008, Facebook

Implementation: Java

Distributed: Yes

Replication: Multiple Servers

CAP: AP

API: Thrift, Avro

Columnar database

Name: HBase

Created: 2007, Powerset

Implementation: Java

Distributed: Yes

Replication: Multiple Servers (HDFS)

CAP: CP

API: Thrift, Java, JSON

Columnar database

Name: Hypertable

Created: 2007, Zvents

Implementation: C

Distributed: Yes

Replication: Multiple Servers

CAP: CP

API: Thrift

Columnar database

• ORDER BY?
  • “Natural Key Order”

NoSQL Limitations
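
"Natural key order" means results come back sorted by key only, so the key has to be designed around the order you will need. A minimal sketch, assuming a hypothetical `SortedKeyStore` that keeps its keys sorted and supports range scans:

```python
import bisect

class SortedKeyStore:
    """Keeps keys sorted so range scans return rows in key order."""

    def __init__(self):
        self._keys = []     # sorted list of keys
        self._values = {}   # key -> value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]
```

Sorting by anything other than the key would require reading the rows out and sorting them in the application, or maintaining a second copy keyed differently.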

• GROUP BY?
  • Map / Reduce

NoSQL Limitations
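
A GROUP BY with an aggregate can be expressed as one map step that emits (group key, value) pairs and one reduce step per group. The sketch below runs in a single process; the `map_reduce` helper and the sample `orders` data are made up for illustration.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group emitted values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical sample data.
orders = [
    {"customer": "a", "total": 10},
    {"customer": "b", "total": 5},
    {"customer": "a", "total": 7},
]

def by_customer(order):
    # Map: emit the grouping key and the value to aggregate.
    yield order["customer"], order["total"]

def sum_totals(customer, totals):
    # Reduce: collapse all values for one key into a single result.
    return sum(totals)

totals = map_reduce(orders, by_customer, sum_totals)
```

This corresponds roughly to `SELECT customer, SUM(total) ... GROUP BY customer`, with the grouping done by the framework between the two steps.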

• JOIN?
  • Multiple Map / Reduce

NoSQL Limitations
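
Without a JOIN operator, the usual workaround is to group records from both data sets under the join key and pair them up afterwards. The following is a single-process sketch of that idea (a reduce-side inner join); the function name and the sample field names are hypothetical, and a distributed store may need several Map / Reduce passes to do the same work.

```python
from collections import defaultdict

def reduce_side_join(left, right, left_key, right_key):
    """Inner-join two lists of dicts by grouping both sides under the join key."""
    # Map phase: tag each record with its side and emit it under the join key.
    groups = defaultdict(lambda: {"left": [], "right": []})
    for row in left:
        groups[row[left_key]]["left"].append(row)
    for row in right:
        groups[row[right_key]]["right"].append(row)

    # Reduce phase: within each key, pair every left row with every right row.
    joined = []
    for key in sorted(groups):
        for left_row in groups[key]["left"]:
            for right_row in groups[key]["right"]:
                # On a field name clash the right-hand value wins.
                joined.append({**left_row, **right_row})
    return joined
```

Keys that appear on only one side produce no output rows, which matches inner-join semantics.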

• SELECT *?
  • Multi-Machine Map / Reduce

NoSQL Limitations

• Maturity

• Tooling

• Specificity

NoSQL Limitations

• Choose the right tool for the task

• You can use BOTH

SQL vs. NoSQL

Q & A
