Polyglot Persistence

Choosing the right persistence option for the task at hand

SoftUni Team
Stamo Petkov
Software University
http://softuni.bg

Data Wizard


Stamo Petkov
Information Services JSC (Информационно обслужване АД)
“Microsoft Technologies” Department

[email protected]@gmail.com

https://github.com/stamo
http://www.stamopetkov.eu
http://bg.linkedin.com/in/stamopetkov
https://www.facebook.com/stamo.petkov
@stamo_petkov

Who am I?


1. What does “Polyglot Persistence” mean?
2. Why do we need it?
3. What options do we have?
   RDBMS
   Document stores
   Key – value pairs
   BLOB storage
   Table storage
   Graph DBs
   Message Queues
4. Conclusions

Table of Contents

What does “Polyglot Persistence” mean?


Polyglot Persistence is all about choosing the right persistence option for the task at hand.

Term attributed to Scott Leberknight, 2008
Gained popularity in 2011 with Martin Fowler’s diagram of “Retailers Web Application”
Makes more sense with rapidly emerging cloud technologies

The origins of Polyglot Persistence

Why do we need it?


“Every two days now we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data.”
Eric Schmidt, 4 August 2010

Data production


One minute on the Internet

Learn more at http://www.domo.com

Scalability and Performance
Vertical scaling – Pros and Cons
Horizontal scaling – Pros and Cons
Scalability of persistence stores

What options do we have?


Oracle, SQL Server, Azure SQL, PostgreSQL, MySQL
Relational databases have been around for over four decades, and that means something in the IT world.
Well-known language – SQL was developed in the early 1970s and standardized in 1986
Simplicity of the relational model
Solid theoretical basis and normalization rules
Widely available expertise

Relational Database Management Systems
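
To make the relational model concrete, here is a minimal sketch using SQLite through Python's standard sqlite3 module. The customers and orders tables and their columns are hypothetical and chosen only for illustration: data is split into normalized tables linked by a foreign key, then recombined with a SQL JOIN.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
""")
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(1, 20.0), (1, 5.5), (2, 12.0)])
conn.commit()


def totals_per_customer(connection):
    """Return (name, order_count, total_spent) rows, ordered by name."""
    query = """
        SELECT c.name, COUNT(o.id), SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


rows = totals_per_customer(conn)
for name, order_count, total_spent in rows:
    print(name, order_count, total_spent)
```

Each customer name is stored once and referenced by id, which is the kind of redundancy removal that normalization rules aim for.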


NoSQL is read as “Not Only SQL”
It is not an SQL slayer, but an SQL companion
Mostly open source
Horizontal scalability
Schema-less
MapReduce
Very fast for adding new data and for simple operations/queries
CAP theorem

NoSQL
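
The MapReduce idea mentioned above can be sketched in plain Python. This is a single-process illustration of the map, shuffle and reduce phases, not the API of any particular NoSQL product; the function names are made up for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # In a real cluster each document could be mapped on a different node.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["SQL and NoSQL", "NoSQL is not only SQL"])
print(counts)
```

Because each map call depends only on its own input, the work can be spread over many machines, which is why the model pairs well with horizontal scalability.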


Riak, Redis, Berkeley DB, Oracle NoSQL DB
Store associative arrays (Dictionary, Hash)
Treat the data as a single opaque collection which may have different fields for every record
Can store data in RAM or on HDD / SSD
Use far less memory in comparison with RDBMS
Ideal for cache or temporary storage
Complex consistency model

Key – value pairs
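
As a rough illustration of the cache use case, the sketch below implements a toy in-memory key-value store with optional expiry. The TTLCache class and its methods are hypothetical and do not mirror the API of Redis, Riak or any other product listed above; the clock is injectable only so the example behaves deterministically.

```python
import time


class TTLCache:
    """Toy in-memory key-value store with optional per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        """Store an opaque value; ttl is a lifetime in seconds, or None."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        """Return the stored value, or default if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # drop expired entries lazily, on read
            return default
        return value


# A fake clock stands in for real time so the expiry is reproducible.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.set("session:42", {"user": "stamo"}, ttl=30)
hit = cache.get("session:42")   # still fresh
now[0] = 31.0                   # pretend 31 seconds have passed
miss = cache.get("session:42")  # expired, so the default is returned
print(hit, miss)
```

The store never inspects the value it holds, which matches the "opaque collection" point above.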


Document stores
MongoDB, DocumentDB, CouchDB…
Store documents in JSON, XML, YAML, BSON, etc.
REST API
Designed for horizontal scaling and Big Data processing
MapReduce framework
JavaScript friendly, allow full-stack JavaScript development
Rapid development
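
The following sketch shows the core idea of a document store: records are self-describing JSON documents, and two documents in the same collection need not share the same fields. DocumentCollection is a hypothetical toy class, not the MongoDB, DocumentDB or CouchDB API.

```python
import json


class DocumentCollection:
    """Toy schema-less collection that stores serialized JSON documents."""

    def __init__(self):
        self._docs = {}   # document id -> JSON string
        self._next_id = 1

    def insert(self, document):
        """Serialize and store a document; return its generated id."""
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = json.dumps(document)
        return doc_id

    def find(self, **criteria):
        """Return documents whose fields equal all given criteria."""
        results = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if all(doc.get(field) == value for field, value in criteria.items()):
                results.append(doc)
        return results


products = DocumentCollection()
products.insert({"name": "Laptop", "category": "hardware", "ram_gb": 16})
products.insert({"name": "E-book", "category": "books", "pages": 320})
products.insert({"name": "Mouse", "category": "hardware"})

hardware = products.find(category="hardware")
print([doc["name"] for doc in hardware])
```

No schema migration was needed to give the laptop a ram_gb field that the mouse lacks, which is part of what makes development with document stores fast.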


Apache Cassandra, Azure Table Storage, Apache HBase…
Store semi-structured data that’s highly available
Flexible datasets
Designed for Big Data – store petabytes of data at reasonable cost
No single point of failure – every node in the cluster has the same role
MapReduce support
Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications
Fault-tolerant – supports replication, failover and disaster recovery

Table storage
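
A simplified sketch of how a table store can spread rows over equal nodes: each row is addressed by a partition key and a row key, rows may carry different columns, and the partition key alone decides which node holds the row. WideColumnTable is hypothetical, and the plain modulo hashing used here is an assumption made for brevity; production systems use more elaborate partitioning and also replicate data, which this sketch omits.

```python
import hashlib


class WideColumnTable:
    """Toy table whose rows are spread over several equal nodes."""

    def __init__(self, node_count):
        # Each node maps (partition_key, row_key) -> {column: value}.
        self._nodes = [{} for _ in range(node_count)]

    def _node_for(self, partition_key):
        # A stable hash of the partition key picks the node (simplified).
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._nodes)
        return self._nodes[index]

    def put(self, partition_key, row_key, **columns):
        """Insert or update a row; any set of columns is accepted."""
        node = self._node_for(partition_key)
        node.setdefault((partition_key, row_key), {}).update(columns)

    def get(self, partition_key, row_key):
        """Return the row's columns, or None if the row does not exist."""
        return self._node_for(partition_key).get((partition_key, row_key))

    def scan_partition(self, partition_key):
        """Return all rows of one partition, sorted by row key."""
        node = self._node_for(partition_key)
        rows = [(rk, cols) for (pk, rk), cols in node.items() if pk == partition_key]
        return sorted(rows, key=lambda item: item[0])


table = WideColumnTable(node_count=3)
table.put("sensor-1", "2015-10-01T10:05", temperature=21.7, humidity=40)
table.put("sensor-1", "2015-10-01T10:00", temperature=21.5)
table.put("sensor-2", "2015-10-01T10:00", status="offline")

sensor_rows = table.scan_partition("sensor-1")
print(sensor_rows)
```

All rows of a partition live on one node, so a partition scan touches a single machine, while different partitions can be served by different machines in parallel.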


Neo4j, Titan, ArangoDB, Apache Giraph…
Everything is stored in the form of either an edge, a node or an attribute
Each node and edge can have any number of attributes
Facebook used Giraph with some performance improvements to analyze one trillion edges using 200 machines in 4 minutes
Use cases: real-time recommendations, social networks, graph-based search

Graph DBs
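
The node, edge and attribute model and the recommendation use case can be illustrated with a small in-memory graph. The Graph class and the "friends of friends" ranking below are a hypothetical sketch, not the query language or API of Neo4j, Titan, ArangoDB or Giraph.

```python
from collections import Counter


class Graph:
    """Toy property graph: nodes and edges both carry attributes."""

    def __init__(self):
        self.nodes = {}  # node id -> attribute dict
        self.edges = {}  # node id -> {neighbour id -> attribute dict}

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs
        self.edges.setdefault(node_id, {})

    def add_edge(self, a, b, **attrs):
        # Undirected edge: stored in both directions.
        self.edges[a][b] = attrs
        self.edges[b][a] = attrs

    def recommend(self, node_id):
        """Rank non-neighbours by the number of neighbours shared with node_id."""
        direct = set(self.edges[node_id])
        counts = Counter()
        for friend in direct:
            for candidate in self.edges[friend]:
                if candidate != node_id and candidate not in direct:
                    counts[candidate] += 1
        # Most shared neighbours first; ties broken alphabetically.
        return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


g = Graph()
for person in ["ann", "bob", "cid", "dan", "eve"]:
    g.add_node(person, kind="person")
g.add_edge("ann", "bob", since=2012)
g.add_edge("ann", "cid", since=2014)
g.add_edge("bob", "dan", since=2013)
g.add_edge("cid", "dan", since=2015)
g.add_edge("cid", "eve", since=2015)

suggestions = g.recommend("ann")
print(suggestions)
```

The query only follows edges outward from one node, so its cost depends on the neighbourhood size rather than on the size of the whole data set.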


MongoDB (GridFS), Azure BLOB, Azure File Storage
Store petabytes of highly available data
Serve content to web or mobile applications
Power big data analytics
Stream video and audio
Perform secure backup and disaster recovery
Cost-effective

Binary Large Object storage
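
One common way to store large binary objects is to split them into fixed-size chunks plus a small metadata record, which is roughly the approach GridFS takes. The BlobStore class below is a hypothetical in-memory sketch of that idea, not the GridFS or Azure API; the chunk size is an assumption chosen to resemble a typical default.

```python
import hashlib

CHUNK_SIZE = 255 * 1024  # assumed chunk size in bytes, for illustration only


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split a bytes object into consecutive chunks of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class BlobStore:
    """Toy chunked blob store with a checksum kept in the metadata."""

    def __init__(self):
        self._chunks = {}  # (blob name, chunk index) -> bytes
        self._meta = {}    # blob name -> metadata dict

    def put(self, name, data, chunk_size=CHUNK_SIZE):
        chunks = split_into_chunks(data, chunk_size)
        for index, chunk in enumerate(chunks):
            self._chunks[(name, index)] = chunk
        self._meta[name] = {
            "length": len(data),
            "chunks": len(chunks),
            "sha256": hashlib.sha256(data).hexdigest(),
        }
        return self._meta[name]

    def get(self, name):
        """Reassemble the blob and verify it against the stored checksum."""
        meta = self._meta[name]
        data = b"".join(self._chunks[(name, i)] for i in range(meta["chunks"]))
        if hashlib.sha256(data).hexdigest() != meta["sha256"]:
            raise ValueError("checksum mismatch for blob " + name)
        return data


store = BlobStore()
payload = bytes(range(256)) * 4096  # 1 MiB of sample binary data
meta = store.put("video.bin", payload)
restored = store.get("video.bin")
print(meta["length"], meta["chunks"])
```

Chunking lets a client read or stream part of a large file without loading the whole object, which suits the video and audio streaming use case above.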


RabbitMQ, IronMQ, Azure Queue Storage
Asynchronous communications protocol
Can raise events or be directly accessed by clients
Messages may be kept in memory, written to disk, or even committed to a DBMS
Allow creating decoupled components
Azure Queue Storage can assign resources dynamically based on queue length

Message Queues
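
The decoupling that a message queue provides can be shown with Python's standard queue and threading modules. This is an in-process stand-in for a broker such as RabbitMQ, not its client API; the message texts and the worker function are made up for the example.

```python
import queue
import threading


def worker(tasks, results):
    """Consume messages until the None sentinel arrives."""
    while True:
        message = tasks.get()
        if message is None:
            tasks.task_done()
            break
        # The "processing" here is just upper-casing the message text.
        results.append(message.upper())
        tasks.task_done()


tasks = queue.Queue()
results = []
consumer = threading.Thread(target=worker, args=(tasks, results))
consumer.start()

# The producer only knows about the queue, not about the consumer.
for message in ["order created", "payment received", "order shipped"]:
    tasks.put(message)
tasks.put(None)  # tell the consumer to stop

tasks.join()     # wait until every message has been processed
consumer.join()
print(results)
```

The producer returns as soon as a message is enqueued, so a slow consumer does not block it; that is the asynchronous, decoupled behaviour described above.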


Rank                          DBMS                  Database Model     Score
Oct 2015  Sep 2015  Oct 2014                                           Oct 2015  vs Sep 2015  vs Oct 2014
1.        1.        1.        Oracle                Relational DBMS    1466.95   +3.58        -4.95
2.        2.        2.        MySQL                 Relational DBMS    1278.96   +1.21        +15.99
3.        3.        3.        Microsoft SQL Server  Relational DBMS    1123.23   +25.40       -96.37
4.        4.        5.        MongoDB               Document store     293.27    -7.30        +52.86
5.        5.        4.        PostgreSQL            Relational DBMS    282.13    -4.05        +24.41
6.        6.        6.        DB2                   Relational DBMS    206.81    -2.33        -0.86
7.        7.        7.        Microsoft Access      Relational DBMS    141.83    -4.17        +0.19
8.        8.        10.       Cassandra             Wide column store  129.01    +1.41        +43.30
9.        9.        8.        SQLite                Relational DBMS    102.67    -4.99        +7.71
10.       10.       12.       Redis                 Key-value store    98.80     -1.86        +19.42

The DB-Engines Ranking


If your data is relational in nature, use an RDBMS
If your data is relatively constant in size and fits in tables, use an RDBMS
Don’t be afraid to experiment with new persistence options, but think twice before putting them in production
Try to use in-memory data stores for temporary data
Prefer BLOB storage when you are dealing with large files
Consider using some kind of cloud infrastructure

Conclusions

Questions?


License
This course (slides, examples, labs, videos, homework, etc.) is licensed under the "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International" license


Attribution: this work may contain portions from





Free Trainings @ Software University
Software University Foundation – softuni.org
Software University – High-Quality Education, Profession and Job for Software Developers – softuni.bg
Software University @ Facebook – facebook.com/SoftwareUniversity
Software University @ YouTube – youtube.com/SoftwareUniversity
Software University Forums – forum.softuni.bg
