Polyglot persistence
Polyglot Persistence
Choosing the right persistence option for the task at hand
SoftUni Team
Stamo Petkov
Software University
http://softuni.bg
Data Wizard
Stamo Petkov
Information Services JSC (Информационно обслужване АД)
Microsoft Technologies Department
[email protected]@gmail.com
https://github.com/stamo
http://www.stamopetkov.eu
http://bg.linkedin.com/in/stamopetkov
https://www.facebook.com/stamo.petkov
@stamo_petkov
Who am I?
1. What does “Polyglot Persistence” mean?
2. Why do we need it?
3. What options do we have?
   RDBMS
   Document stores
   Key – value pairs
   BLOB storage
   Table storage
   Graph DBs
   Message Queues
4. Conclusions
Table of Contents
What does “Polyglot Persistence” mean?
Polyglot Persistence is all about choosing the right persistence option for the task at hand.
Scott Leberknight, 2008
Gained popularity in 2011 with Martin Fowler’s diagram of a “Retailer’s Web Application”
Makes more sense with rapidly emerging cloud technologies
The origins of Polyglot Persistence
Why do we need it?
“Every two days now we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data.”
Eric Schmidt, 4 August 2010
Data production
Scalability and Performance
Vertical scaling – pros and cons
Horizontal scaling – pros and cons
Scalability of persistence stores
What options do we have?
Oracle, SQL Server, Azure SQL, PostgreSQL, MySQL
Relational databases have been around for over four decades, and that means something in the IT world.
Well-known language – SQL was developed in the early 1970s and standardised in 1986
Simplicity of the relational model
Solid theoretical basis and normalization rules
Great expertise
Relational Database Management Systems
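The relational model and SQL mentioned on this slide can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database; the customers and orders tables, their columns, and the function name are made up for illustration and do not come from the talk.

```python
import sqlite3


def total_spent_per_customer():
    """Build two related tables in memory and aggregate across them with a JOIN."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount REAL NOT NULL
        );
        INSERT INTO customers (id, name) VALUES (1, 'Ana'), (2, 'Boris');
        INSERT INTO orders (customer_id, amount) VALUES (1, 10.0), (1, 5.5), (2, 7.0);
    """)
    # Normalized data is recombined at query time with a JOIN.
    rows = conn.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()
    conn.close()
    return rows
```

Customer names are stored once and referenced by key from orders, which is the kind of structure the normalization rules lead to.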
NoSQL is read as “Not Only SQL”
Not an SQL slayer, but an SQL companion
Mostly open source
Horizontal scalability
Schema-less
MapReduce
Very fast for adding new data and for simple operations/queries
CAP theorem
NoSQL
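MapReduce appears on several slides, so a rough single-process sketch of the idea may help. It is plain Python and only mimics the map, shuffle and reduce phases; the function names and the word-count task are illustrative assumptions, not the API of any particular NoSQL product.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce steps would run on many nodes in parallel; here they run sequentially just to show the data flow.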
Riak, Redis, Berkeley DB, Oracle NoSQL DB
Store associative arrays (dictionary, hash)
Treat the data as a single opaque collection which may have different fields for every record
Can store in RAM or on HDD / SSD
Use far less memory in comparison with RDBMS
Ideal for cache or temporary storage
Complex consistency model
Key – value pairs
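The "associative array used as a cache" idea can be sketched in a few lines. This is a toy in-memory store, not how Redis or Riak are implemented; the class name TinyKeyValueStore, its methods, and the injectable clock are hypothetical and exist only for this example.

```python
import time


class TinyKeyValueStore:
    """In-memory key-value store with optional per-key expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}       # key -> (value, expires_at or None)
        self._clock = clock   # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]   # lazily evict expired entries on read
            return default
        return value
```

The store never inspects the values it holds, which is what "opaque" means here: a session blob, a dictionary, or a rendered page are all just values behind a key.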
Document stores
MongoDB, DocumentDB, CouchDB…
Store documents in JSON, XML, YAML, BSON, etc.
REST API
Designed for horizontal scaling and Big Data processing
MapReduce framework
JavaScript friendly, allowing full-stack JavaScript development
Rapid development
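To make the document model more concrete, here is a minimal sketch of schema-less JSON documents queried by field. It uses only Python's json module; the sample documents and the find and to_json helpers are hypothetical and are not the query API of MongoDB, DocumentDB or CouchDB.

```python
import json

# Documents in one collection need not share the same fields.
documents = [
    {"_id": 1, "type": "book", "title": "Dune", "author": "Frank Herbert"},
    {"_id": 2, "type": "laptop", "brand": "Acme", "ram_gb": 16},
    {"_id": 3, "type": "book", "title": "Emma", "author": "Jane Austen", "tags": ["classic"]},
]


def find(collection, **criteria):
    """Return the documents whose fields equal all of the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def to_json(doc):
    """Serialize one document the way it might travel over a REST API."""
    return json.dumps(doc, sort_keys=True)
```

Adding a new field, such as tags on the third document, requires no schema change, which is part of why development with document stores tends to be rapid.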
Apache Cassandra, Azure Table Storage, Apache HBase…
Store semi-structured data that’s highly available
Flexible datasets
Designed for Big Data – store petabytes of data at reasonable cost
No single point of failure – every node in the cluster has the same role
MapReduce support
Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications
Fault-tolerant – supports replication, failover and disaster recovery
Table storage
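One way to picture why throughput can grow as machines are added is to route every row to a node by hashing its partition key. The sketch below assumes simple modulo hashing and hypothetical node names; production systems typically use more elaborate schemes such as consistent hashing, so treat it as an illustration of the idea only.

```python
import hashlib
from collections import Counter


def node_for(partition_key, nodes):
    """Pick a node by hashing the partition key (stable across runs and processes)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def distribution(keys, nodes):
    """Count how many keys land on each node."""
    return Counter(node_for(key, nodes) for key in keys)
```

Because every node is addressed the same way, no node plays a special role, and reads and writes for different keys can proceed on different machines at the same time.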
Neo4j, Titan, ArangoDB, Apache Giraph…
Everything is stored in the form of either an edge, a node or an attribute
Each node and edge can have any number of attributes
Facebook used Giraph with some performance improvements to analyze one trillion edges using 200 machines in 4 minutes
Use cases: real-time recommendations, social networks, graph-based search
Graph DBs
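The node, edge and attribute model, and the recommendation use case, can be sketched without any graph database. The data, the attribute names and the recommend_friends function below are invented for illustration; a real graph DB would express the same traversal in its own query language.

```python
from collections import Counter, defaultdict

# Nodes and edges both carry arbitrary attributes.
nodes = {
    "ana": {"city": "Sofia"},
    "boris": {"city": "Plovdiv"},
    "vera": {"city": "Varna"},
    "georgi": {"city": "Sofia"},
}
edges = [
    ("ana", "boris", {"type": "FRIEND", "since": 2012}),
    ("ana", "vera", {"type": "FRIEND", "since": 2014}),
    ("boris", "georgi", {"type": "FRIEND", "since": 2013}),
    ("vera", "georgi", {"type": "FRIEND", "since": 2015}),
]


def build_adjacency(edge_list):
    """Index undirected edges so each node's neighbours can be looked up directly."""
    adjacency = defaultdict(set)
    for source, target, _attributes in edge_list:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def recommend_friends(person, edge_list):
    """Rank friends-of-friends by the number of mutual friends."""
    adjacency = build_adjacency(edge_list)
    friends = adjacency[person]
    counts = Counter(
        candidate
        for friend in friends
        for candidate in adjacency[friend]
        if candidate != person and candidate not in friends
    )
    return counts.most_common()
```

The traversal follows edges directly instead of joining tables, which is the access pattern graph databases are built around.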
MongoDB (GridFS), Azure BLOB, Azure File Storage
Store petabytes of highly available data
Serve content to web or mobile applications
Power big data analytics
Stream video and audio
Perform secure backup and disaster recovery
Cost-effective
Binary Large Object storage
RabbitMQ, IronMQ, Azure Queue Storage
Asynchronous communications protocol
Can raise events or be directly accessed by clients
Messages may be kept in memory, written to disk, or even committed to a DBMS
Allow creating decoupled components
Azure Queue Storage can assign resources dynamically based on queue length.
Message Queues
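Decoupling through a queue can be sketched with Python's standard queue and threading modules. This is an in-process stand-in for a broker such as RabbitMQ, not its client API; the run_pipeline function and the upper-casing "work" are placeholders chosen for the example.

```python
import queue
import threading


def run_pipeline(messages):
    """Send messages through a queue to a consumer that runs on its own thread."""
    q = queue.Queue()
    processed = []
    stop = object()   # sentinel telling the consumer to shut down

    def consumer():
        while True:
            message = q.get()
            if message is stop:
                break
            processed.append(message.upper())   # placeholder for real work

    worker = threading.Thread(target=consumer)
    worker.start()
    # The producer only knows about the queue, never about the consumer.
    for message in messages:
        q.put(message)
    q.put(stop)
    worker.join()
    return processed
```

The producer and consumer share nothing but the queue, so either side could be replaced or scaled without the other noticing.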
Rank Oct 2015  Rank Sep 2015  Rank Oct 2014  DBMS                  Database Model     Score Oct 2015  Change since Sep 2015  Change since Oct 2014
1.             1.             1.             Oracle                Relational DBMS    1466.95         +3.58                  -4.95
2.             2.             2.             MySQL                 Relational DBMS    1278.96         +1.21                  +15.99
3.             3.             3.             Microsoft SQL Server  Relational DBMS    1123.23         +25.40                 -96.37
4.             4.             5.             MongoDB               Document store     293.27          -7.30                  +52.86
5.             5.             4.             PostgreSQL            Relational DBMS    282.13          -4.05                  +24.41
6.             6.             6.             DB2                   Relational DBMS    206.81          -2.33                  -0.86
7.             7.             7.             Microsoft Access      Relational DBMS    141.83          -4.17                  +0.19
8.             8.             10.            Cassandra             Wide column store  129.01          +1.41                  +43.30
9.             9.             8.             SQLite                Relational DBMS    102.67          -4.99                  +7.71
10.            10.            12.            Redis                 Key-value store    98.80           -1.86                  +19.42
The DB-Engines Ranking
If your data is relational in nature, use an RDBMS
If your data is relatively constant in size and fits in tables, use an RDBMS
Don’t be afraid to experiment with new persistence options, but think twice before putting them in production
Try to use in-memory data stores for temporary data
Prefer BLOB storage when you are dealing with large files
Consider using some kind of cloud infrastructure
Conclusions
Questions?
Polyglot Persistence
https://conf.softuni.bg/
License
This course (slides, examples, labs, videos, homework, etc.) is licensed under the "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International" license
Attribution: this work may contain portions from
Free Trainings @ Software University
Software University Foundation – softuni.org
Software University – High-Quality Education, Profession and Job for Software Developers – softuni.bg
Software University @ Facebook facebook.com/SoftwareUniversity
Software University @ YouTube youtube.com/SoftwareUniversity
Software University Forums – forum.softuni.bg