Data stores: beyond relational databases

DOTNETMÁLAGA // MalagaMakers // 5th Nov 2015

Upload: javier-garcia-magna

Post on 22-Jan-2018



• Relational vs. NoSQL

• Definitions and examples

• Other database classifications

• 9 Databases in 40 minutes!

• Polyglot Persistence

• Some statistics

• Summary

What is NoSQL?

SQL

Commercial example: Oracle | Open-source example: (Oracle) MySQL

NoSQL

“Mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.”

“Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.”

NoSQL systems are also sometimes called "Not only SQL".

SQL? ACID? Relations? Distributed?

Commercial example: DynamoDB | Open-source example: MongoDB

NewSQL

Modern relational database management systems that seek to provide the same scalable performance as NoSQL systems for online transaction processing (OLTP) read-write workloads while still maintaining the ACID guarantees of a traditional database system.

Open-source example: VoltDB


NoSQL vs. SQL vs. NewSQL

Wikipedia

No-sql.org

More Database classifications

On premises vs. Cloud “As a service” (Azure DocumentDB)

Memory / Disk vs. Only in memory (OrigoDB, Redis, SQL Server)

OLTP vs. OLAP

Databases vs. Not a database but a data store (Zookeeper, Kafka)

CAP classifications

And more…

In action…

Key-value stores (Redis)

Document stores (RavenDB …ok, MongoDB)

Wide column stores (Cassandra)

Graph DBMS (Neo4j)

Search engines (Elastic Search)

Time Series DBMS (InfluxDB)

Event Stores (Event Store)

MultiModel (OrientDB)

Relational DBMS (MS SQL Server 2016)

Use cases…

• Show latest items

• Count items

• Leaderboards

• Unique items

• Pub/Sub

• Queues

• Cache

• As the main database

Key Value

Some C# code
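The C# demo from this slide is not part of the transcript. As a rough illustration of the use cases listed for key-value stores, here is a minimal Python sketch. It uses plain in-memory structures as a hypothetical stand-in for the list, counter, set, and sorted-set data types that a store such as Redis offers; the class and method names (`TinyKeyValueStore`, `push_latest`, `top`, and so on) are invented for this example and are not the Redis API.

```python
from collections import defaultdict, deque


class TinyKeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store (not the Redis API)."""

    def __init__(self, latest_limit=3):
        self.latest = deque(maxlen=latest_limit)  # "show latest items"
        self.counters = defaultdict(int)          # "count items"
        self.unique = set()                       # "unique items"
        self.scores = {}                          # "leaderboards"

    def push_latest(self, item):
        # Newest item goes first; the oldest one falls off once the limit is hit.
        self.latest.appendleft(item)

    def increment(self, key, amount=1):
        self.counters[key] += amount
        return self.counters[key]

    def add_unique(self, item):
        # Adding the same item twice does not change the size of the set.
        self.unique.add(item)
        return len(self.unique)

    def add_score(self, player, points):
        self.scores[player] = self.scores.get(player, 0) + points

    def top(self, n):
        # Highest score first; ties are broken alphabetically by player name.
        return sorted(self.scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


store = TinyKeyValueStore(latest_limit=2)
for article in ["a", "b", "c"]:
    store.push_latest(article)
store.add_score("ana", 10)
store.add_score("ben", 30)
store.add_score("ana", 25)
print(list(store.latest), store.top(1))
```

In a real key-value store each of these operations is a single command against a named key, which is what makes such stores a good fit for caches, counters, and leaderboards.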

Use cases…

• Log data

• Product catalog

• Metadata / asset management

• CMS

• Prototyping

• As the main database

Document Store

Some Javascript (Meteor) code…
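The Meteor demo is not included in the transcript. To illustrate the product catalog use case, the following Python sketch models documents as dictionaries whose fields vary from item to item and runs a simple query by example. The `find` helper and the sample catalog are hypothetical and only mimic the idea of a document query; they are not the MongoDB API.

```python
def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Documents in the same collection do not need to share the same fields.
catalog = [
    {"_id": 1, "type": "book", "title": "Book A", "pages": 200},
    {"_id": 2, "type": "laptop", "title": "Laptop B", "ram_gb": 16},
    {"_id": 3, "type": "book", "title": "Book C"},
]

print([doc["title"] for doc in find(catalog, type="book")])
```

Even in this toy version the query only works because every document agrees on the name and meaning of the `type` field, which is an implicit schema.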

Use cases…

• Time series analytics

• Huge # writes

• As the main database (for big data storage!)

Wide Column

Some CQL + C# code…

CQL vs. Internal structure (Cassandra CLI)

cqlsh:test> SELECT * FROM tweets;

 user         | time                     | lat    | long    | tweet
--------------+--------------------------+--------+---------+---------------------
 softwaredoug | 2013-07-13 08:21:54-0400 | 38.162 | -78.549 | Having chest pain.
 softwaredoug | 2013-07-21 12:15:27-0400 | 38.093 | -78.573 | Speedo self shot.
 jnbrymn      | 2013-06-29 20:53:15-0400 | 38.092 | -78.453 | I like programming.
 jnbrymn      | 2013-07-14 22:55:45-0400 | 38.073 | -78.659 | Who likes cats?
 jnbrymn      | 2013-07-24 06:23:54-0400 | 38.073 | -78.647 | My coffee is cold.

[default@test] list tweets;
-------------------
RowKey: softwaredoug
=> (column=2013-07-13 08\:21\:54-0400:, value=, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:lat, value=4218a5e3, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:long, value=c29d1917, timestamp=1374673155373000)
=> (column=2013-07-13 08\:21\:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000)
=> (column=2013-07-21 12\:15\:27-0400:, value=, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:lat, value=42185f3b, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:long, value=c29d2560, timestamp=1374673155407000)
=> (column=2013-07-21 12\:15\:27-0400:tweet, value=53706565646f2073656c662073686f742e, timestamp=1374673155407000)
-------------------
RowKey: jnbrymn
=> (column=2013-06-29 20\:53\:15-0400:, value=, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:lat, value=42185e35, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:long, value=c29ce7f0, timestamp=1374673155419000)
=> (column=2013-06-29 20\:53\:15-0400:tweet, value=49206c696b652070726f6772616d6d696e672e, timestamp=1374673155419000)
=> (column=2013-07-14 22\:55\:45-0400:, value=, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:lat, value=42184ac1, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:long, value=c29d5168, timestamp=1374673155434000)
=> (column=2013-07-14 22\:55\:45-0400:tweet, value=57686f206c696b657320636174733f, timestamp=1374673155434000)
=> (column=2013-07-24 06\:23\:54-0400:, value=, timestamp=1374673155485000)
=> (column=2013-07-24 06\:23\:54-0400:lat, value=42184ac1,

user – partition key

time – clustering key
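The relationship between the CQL table and the internal listing can be sketched in a few lines of Python. This is a simplified model, not Cassandra's actual storage engine: every CQL row with the same partition key lands in one wide row, and each remaining field becomes a cell whose column name is prefixed with the clustering key. The function name `to_wide_rows` is hypothetical, and the sketch assumes the clustering key values sort correctly as strings.

```python
def to_wide_rows(rows, partition_key, clustering_key):
    """Group CQL-style rows into wide rows keyed by the partition key."""
    wide = {}
    # Cells inside a partition are kept ordered by the clustering key.
    for row in sorted(rows, key=lambda r: r[clustering_key]):
        cells = wide.setdefault(row[partition_key], [])
        prefix = row[clustering_key]
        # Marker cell with an empty value, as in the CLI listing.
        cells.append((f"{prefix}:", ""))
        for name in sorted(row):
            if name not in (partition_key, clustering_key):
                cells.append((f"{prefix}:{name}", row[name]))
    return wide


tweets = [
    {"user": "softwaredoug", "time": "2013-07-21 12:15:27-0400",
     "lat": 38.093, "long": -78.573, "tweet": "Speedo self shot."},
    {"user": "softwaredoug", "time": "2013-07-13 08:21:54-0400",
     "lat": 38.162, "long": -78.549, "tweet": "Having chest pain."},
    {"user": "jnbrymn", "time": "2013-06-29 20:53:15-0400",
     "lat": 38.092, "long": -78.453, "tweet": "I like programming."},
]

for row_key, cells in to_wide_rows(tweets, "user", "time").items():
    print("RowKey:", row_key)
    for column, value in cells:
        print("  =>", column, value)
```

Reading all tweets of one user is therefore a single-partition scan in time order, which is why the choice of partition and clustering keys matters so much in a wide column store.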

Use cases…

• General data management

• Network and IT operations

• Recommendation engines

• Fraud detection

• Social networks

Graph DBs
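To make the recommendation engine and social network use cases concrete, here is a small Python sketch of a "friends of friends" traversal over an adjacency map. A graph DBMS would express this as a query over nodes and relationships; the `recommend` function and the sample graph below are hypothetical and only show the shape of the traversal.

```python
from collections import Counter


def recommend(graph, person, limit=3):
    """Suggest people two hops away, ranked by number of mutual friends."""
    direct = graph.get(person, set())
    mutual = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in direct:
                mutual[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(mutual.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


friends = {
    "ana": {"ben", "carl"},
    "ben": {"ana", "dora", "eva"},
    "carl": {"ana", "dora"},
    "dora": {"ben", "carl"},
    "eva": {"ben"},
}

print(recommend(friends, "ana"))
```

In a relational schema the same question needs self-joins on a friendship table, and each extra hop adds another join; in a graph model it is a direct walk along relationships.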

Just a few slides remaining…

Some C# code…

Some C# code… log4net + ElasticSearch + Kibana

{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "LogEvent": {
      "properties": {
        "timeStamp": {
          "type": "date",
          "format": "dateOptionalTime"
        },
        "message": {
          "type": "string"
        },
        "messageObject": {
          "type": "object"
        },
        "exception": {
          "type": "object"
        },

        ….

2 ElasticSearch general purpose libraries for .NET:

• NEST – High level

• Elasticsearch.Net – Low level
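As a small illustration of what gets indexed under the LogEvent mapping, this Python sketch builds a JSON document with the four fields that are visible in the mapping fragment. The `build_log_event` helper is hypothetical, the sample values are invented, and the sketch only produces the JSON body; it does not call Elasticsearch or log4net.

```python
import json
from datetime import datetime, timezone


def build_log_event(message, when, message_object=None, exception=None):
    """Build a document with the fields shown in the LogEvent mapping."""
    return {
        # An ISO 8601 timestamp, intended to fit the "dateOptionalTime" format.
        "timeStamp": when.isoformat(),
        "message": message,
        "messageObject": message_object or {},
        "exception": exception or {},
    }


event = build_log_event(
    "Order failed",
    datetime(2015, 11, 5, 18, 30, tzinfo=timezone.utc),
    exception={"type": "TimeoutException"},  # invented sample value
)
print(json.dumps(event, indent=2))
```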

C# + InfluxDB + Grafana + … IoT?

InfluxDB + Grafana <> ElasticSearch + Kibana

Time series (metrics) <> Structured data, e.g. logs
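A typical time series query aggregates raw metric points into fixed time buckets before charting them. The following Python sketch shows that idea with a mean per bucket; the `downsample_mean` function and the sample readings are hypothetical, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict


def downsample_mean(points, bucket_seconds):
    """Average (timestamp, value) points per fixed-size time bucket."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        # Every point is assigned to the start of the bucket it falls into.
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Invented sensor readings: (seconds since start, temperature).
readings = [(0, 20.0), (30, 22.0), (60, 30.0), (119, 10.0), (120, 5.0)]
print(downsample_mean(readings, 60))
```

A time series DBMS performs this kind of grouping on the server side and over far larger volumes, which is what a dashboard tool such as Grafana relies on.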

CQRS

https://msdn.microsoft.com/en-us/library/jj591559.aspx

CQRS…

With an ORM vs. with Event Store

https://msdn.microsoft.com/en-us/library/jj591559.aspx
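The difference between the two variants is what the write side stores: an ORM saves the current state, whereas an event store appends the events that led to it and rebuilds state by replaying them. The Python sketch below shows that idea under simplifying assumptions. The `EventStream` class, the event names, and the `balance` projection are hypothetical and are not the Event Store client API.

```python
class EventStream:
    """Hypothetical append-only stream with an optimistic concurrency check."""

    def __init__(self):
        self._events = []

    def append(self, event_type, data, expected_version):
        # Reject the write if someone else appended since the caller last read.
        if expected_version != len(self._events):
            raise ValueError("concurrency conflict")
        self._events.append((event_type, data))
        return len(self._events)

    def read(self):
        return list(self._events)


def balance(events):
    """Read-side projection: fold the events into the current balance."""
    total = 0
    for event_type, data in events:
        if event_type == "Deposited":
            total += data["amount"]
        elif event_type == "Withdrawn":
            total -= data["amount"]
    return total


account = EventStream()
version = account.append("Deposited", {"amount": 100}, expected_version=0)
version = account.append("Withdrawn", {"amount": 30}, expected_version=version)
print(balance(account.read()))
```

Because the events are never updated in place, new read models can be added later by replaying the same stream with a different projection.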

Too good to be true…?

http://orientdb.com/why-orientdb/

The Beast

• SQL and NoSQL (JSON support)

• In-Memory tables

• Row level security

• Always Encrypted

• Query Store

• PolyBase (Hadoop / Azure Blob Storage)

Polyglot persistence

Any decent sized enterprise will have a variety of different data storage technologies for different kinds of data

before…

https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin

after…

https://engineering.linkedin.com/architecture/brief-history-scaling-linkedin

Some stats (from DB-Engines.com)

Key Takeaways

Always think about the schema

(even with schemaless DBs)

Best DB? “It depends”

• Prototyping?

• Domain?

• How is the data going to be used?

Most of us don’t work with “big data” but with “small or medium” data

DOTNETMÁLAGA // MálagaMakers

Docker images used

• spotify/cassandra

• balsamiq/docker-elasticsearch

• balsamiq/docker-kibana

• tutum/influxdb

• neo4j/neo4j

• wkruse/eventstore

• redis

Resources

Different DB images: https://www.thoughtworks.com/insights/blog/nosql-databases-overview

Polyglot persistence images: http://www.slideshare.net/mongodb/webinar-mongodb-and-polyglot-persistence-architecture

DATABASE NAME | AVAILABLE FOR WINDOWS? | WRITTEN IN
Redis         | Yes                    | C
MongoDB       | Yes                    | C++
Cassandra     | Yes                    | Java
Neo4j         | Yes                    | Java
ElasticSearch | Yes                    | Java
InfluxDB      | Yes                    | Go
EventStore    | Yes                    |
OrientDB      | Yes                    | Java
SQL Server    | Yes                    | C++