tarantool - s.conf.guru · history was born @ mail.ru group used to store sessions/profiles of ms...
Tarantool
A NoSQL DBMS, now with SQL
Kirill Yukhin

Agenda

- What is Tarantool?
- Performance
- Storage engines
- Scaling
- SQL
- Plans

History

- Born at Mail.ru Group
- Used to store sessions and profiles of millions of users

[Diagram: web servers (page loads, AJAX requests, mobile API) send more than 1,000,000 requests per second to Tarantool instances storing sessions and profiles; the figure labels show 4 instances and 8 instances.]

Must-have and mustn't-have

- No secondary keys, constraints etc.
- Schema-less
- Need a language. *QL is not a must-have
- High-speed in any sense!
- Simple
- Extensible
- Transactions
- Persistency
- Once again: it must be fast, no excuses

Tarantool: Bird's Eye View

- No need for a cache: it is in-memory
- But still a DBMS: persistency and transactions
- It respects ACID
- Single-threaded: it is lock-free
- Easy: an imperative language is on board: Lua
- It JITs
- It's easy to program business logic
- It scales: replication and sharding

DBMS + Application Server

C, Lua, SQL, Python, PHP, Go, Java, C# ...

[Diagram: a single process with threads for query handling, WAL, and network]

- Persistent in-memory and disk storage engines
- Stored procedures in C, Lua, SQL

Cooperative multitasking

Multithreading [figure label: "That is a stall"]
- Losses on cache coherency support
- Losses on locks
- Losses on long operations

Fibers and an event loop
- Thread is always busy
- Lock-free
- Single core: no coherency issues at all

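The fiber model can be illustrated with a minimal sketch. It uses Python generators as stand-ins for fibers and a round-robin loop as the event loop; `fiber` and `run_event_loop` are names made up for this illustration and are not Tarantool APIs.

```python
from collections import deque

def fiber(name, steps, log):
    """A toy fiber: does one unit of work, then yields control voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative yield point; the fiber is never preempted elsewhere

def run_event_loop(fibers):
    """Round-robin scheduler that runs every fiber on a single thread."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the fiber's next yield
            ready.append(current)  # it may have more work: back of the queue
        except StopIteration:
            pass                   # the fiber has finished

log = []
run_event_loop([fiber("a", 2, log), fiber("b", 3, log)])
print(log)
```

Because only one fiber runs at a time and switches happen only at `yield`, the shared `log` list needs no lock in this sketch.
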
Vinyl

- In-memory is OK, but not always enough
- Write-oriented: LSM tree
- Same API as memtx
- Transactions, secondary keys

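Why an LSM tree suits write-heavy loads can be shown with a toy model. This is only a sketch of the general LSM idea and not how Vinyl is implemented: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory "runs" are assumptions for illustration, and there is no compaction or disk I/O.

```python
class TinyLSM:
    """Toy LSM tree: writes go to an in-memory table, flushed as sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value  # a write never modifies older runs
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # the newest run wins
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM()
db.put("a", 1)
db.put("b", 2)   # the memtable reaches its limit and is flushed as a run
db.put("a", 10)  # a newer version of "a" shadows the flushed one
print(db.get("a"), db.get("b"), db.get("c"))
```

Writes stay cheap because they only touch the in-memory table. Reads pay for it by checking the newest data first and then older runs.
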
Horizontal
Why?
Horizontal scaling

- Replication (every node stores A, B and C): scaling computation and fault tolerance
- Sharding (A, B and C are each stored on a different node): scaling computation and data
- Replication and sharding (A, B and C are each stored on their own group of replicas): scaling computation, data and fault tolerance

Replication

Asynchronous: begin -> commit -> replicate
- Commit does not wait for replication to succeed
- Faster
- Replicas might lag or conflict

Synchronous: begin -> prepare -> replicate -> commit
- Two-phase commit: to succeed, the change must be replicated to N nodes
- More reliable
- Slower, complicated protocols

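The difference in ordering between the two modes can be sketched with in-memory objects. This is an illustration of the general idea only, not Tarantool's replication protocol: the `Replica` and `Master` classes, the `quorum` parameter, and the `up` flag that simulates an unreachable node are all assumptions of this example.

```python
class Replica:
    def __init__(self, up=True):
        self.up = up        # False simulates an unreachable node
        self.prepared = []  # received but not yet committed
        self.log = []       # committed changes

    def apply(self, change):
        if self.up:
            self.log.append(change)

    def prepare(self, change):
        if self.up:
            self.prepared.append(change)
        return self.up      # the return value plays the role of an ack

    def finish(self, change, commit):
        if change in self.prepared:
            self.prepared.remove(change)
            if commit:
                self.log.append(change)

class Master:
    def __init__(self, replicas):
        self.replicas = replicas
        self.log = []
        self.pending = []   # committed locally, not yet replicated

    def commit_async(self, change):
        """begin -> commit -> replicate: commit does not wait for replicas."""
        self.log.append(change)
        self.pending.append(change)
        return True

    def flush(self):
        """Ship pending changes later; until then the replicas lag."""
        for change in self.pending:
            for replica in self.replicas:
                replica.apply(change)
        self.pending = []

    def commit_sync(self, change, quorum):
        """begin -> prepare -> replicate -> commit: needs `quorum` acks."""
        acks = sum(1 for replica in self.replicas if replica.prepare(change))
        ok = acks >= quorum
        for replica in self.replicas:
            replica.finish(change, commit=ok)
        if ok:
            self.log.append(change)
        return ok

replicas = [Replica(), Replica(up=False)]
master = Master(replicas)

master.commit_async("x=1")
lag = len(master.log) - len(replicas[0].log)  # committed but not yet replicated
master.flush()

ok_one = master.commit_sync("y=2", quorum=1)  # one live replica is enough
ok_two = master.commit_sync("z=3", quorum=2)  # the second replica is down
print(lag, ok_one, ok_two, replicas[0].log)
```

The asynchronous commit returns immediately and leaves a window in which the replica is behind. The synchronous commit refuses to commit when too few replicas acknowledge the change.
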
Sharding: decide where to store?

Ranges (min ... max)
- Found the range where the key belongs -> found the node
- Best
- Complicated
- Usually useless

Hash
- Calculated the hash of the key -> found the node
- Good enough
- Complex resharding
- Complex queries are not fast

?

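Both lookup strategies fit in a few lines. The split points, the node names, and the choice of CRC32 as the hash are assumptions for this sketch and do not come from the talk.

```python
import bisect
import zlib

# Range sharding: each node owns keys up to and including its upper bound.
RANGE_UPPER_BOUNDS = ["g", "p", "z"]  # hypothetical split points
NODES = ["node-1", "node-2", "node-3"]

def node_by_range(key):
    """Find the range the key belongs to, which gives the node."""
    idx = bisect.bisect_left(RANGE_UPPER_BOUNDS, key[0].lower())
    return NODES[min(idx, len(NODES) - 1)]

def node_by_hash(key, nodes=NODES):
    """Hash the key, which gives the node."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

for key in ("alice", "mike", "zed"):
    print(key, node_by_range(key), node_by_hash(key))
```

Range lookup keeps neighbouring keys on the same node but needs a maintained table of split points. Hash lookup needs only the node count, which is exactly what makes changing that count painful.
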
Resharding problem

shard_id(key): key → {shard_1, shard_2, ..., shard_N}

- Changing N changes the shard function: shard_id(key) ≠ new_shard_id(key)
- Need to re-calculate the shard function for all data
- Some data might move to one of the old nodes: useless data moves

... but not in Tarantool land

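The cost of changing N can be measured with a small experiment. This sketch assumes the common "hash modulo N" shard function with CRC32 as the hash; the key names and the step from 4 to 5 shards are arbitrary choices for illustration.

```python
import zlib

def shard_id(key, n_shards):
    """Classic shard function: hash of the key modulo the number of shards."""
    return zlib.crc32(key.encode()) % n_shards

keys = [f"user:{i}" for i in range(10_000)]

# Keys whose shard changes when a fifth shard is added.
moved = [k for k in keys if shard_id(k, 4) != shard_id(k, 5)]

# Of those, keys that land on one of the four old shards (shards 0..3).
moved_between_old = [k for k in moved if shard_id(k, 5) < 4]

print(f"moved: {len(moved) / len(keys):.0%}")
print(f"moved between old shards: {len(moved_between_old) / len(moved):.0%}")
```

With a uniform hash, a key stays in place only when `hash % 4 == hash % 5`, which holds for roughly one key in five. Most keys move, and many of them move between nodes that already existed.
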
Virtual sharding

[Diagram: data (tuples) -> virtual nodes (buckets) -> physical nodes]

- shard_id(key): key → {bucket_1, bucket_2, ..., bucket_N}
- #buckets = const >> #physical nodes
- The shard function is fixed

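A sketch of the bucket idea: keys map to a fixed number of buckets, and only the bucket-to-node table changes when a node is added. The bucket count, the node names, and the simple rebalancing policy in `add_node` are assumptions for illustration, not the algorithm used by Tarantool VShard.

```python
import zlib

BUCKET_COUNT = 3000  # arbitrary for this sketch: fixed, and >> number of nodes

def bucket_id(key):
    """Fixed shard function: depends only on the key, never on the nodes."""
    return zlib.crc32(key.encode()) % BUCKET_COUNT

def assign(nodes):
    """Initial placement: spread the buckets over the nodes round-robin."""
    return {b: nodes[b % len(nodes)] for b in range(BUCKET_COUNT)}

def add_node(bucket_to_node, new_node):
    """Rebalance by moving buckets only onto the new node."""
    nodes = sorted(set(bucket_to_node.values())) + [new_node]
    target = BUCKET_COUNT // len(nodes)  # fair share for the new node
    load = {n: 0 for n in nodes}
    for n in bucket_to_node.values():
        load[n] += 1
    moved = 0
    for b, n in sorted(bucket_to_node.items()):
        if moved == target:
            break
        if load[n] > target:             # take only from overloaded nodes
            bucket_to_node[b] = new_node
            load[n] -= 1
            moved += 1
    return moved

before = assign(["a", "b", "c"])
mapping = dict(before)
moved = add_node(mapping, "d")
print(moved, "of", BUCKET_COUNT, "buckets moved")
```

Every bucket that moved went to the new node, and `bucket_id(key)` gives the same answer before and after, so no key has to be re-hashed.
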
Sharding

- Ranges
- Hashes
- Virtual buckets

Having a range or a bucket, how to find where it is stored physically?

1. Prohibit re-sharding
2. Always visit all nodes
3. Implement proxy-router!

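The third option can be sketched as a router that remembers where each bucket lives and falls back to asking every node only when its information is missing or stale. The `Node` and `Router` classes are hypothetical and only illustrate the idea; they are not a real router API.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.buckets = {}  # bucket id -> {key: value}

class Router:
    """Toy proxy-router: caches which node stores each bucket."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.route = {}  # bucket id -> Node, filled lazily

    def locate(self, bucket):
        node = self.route.get(bucket)
        if node is None or bucket not in node.buckets:
            # Unknown or stale route: visit all nodes once and remember.
            node = next((n for n in self.nodes if bucket in n.buckets), None)
            self.route[bucket] = node
        return node

n1, n2 = Node("n1"), Node("n2")
n1.buckets[7] = {"user:1": "alice"}
router = Router([n1, n2])

first = router.locate(7).name      # discovered by visiting the nodes
n2.buckets[7] = n1.buckets.pop(7)  # the bucket is moved during resharding
second = router.locate(7).name     # the stale route is detected and refreshed
print(first, second)
```

Clients talk only to the router, so buckets can move between nodes without clients having to know the physical layout.
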
Why SQL?

    CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER)

    SQL> SELECT DISTINCT(a) FROM t1, t2 WHERE t1.id = t2.id AND t2.y > 1;

Why SQL?

    CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER)

    function query()
        -- Join: for every row of t1, look up the t2 row with the same id
        local join = {}
        for _, v1 in box.space.t1:pairs({}, {iterator = 'ALL'}) do
            local v2 = box.space.t2:get(v1[1])
            -- Keep the pair only if a match exists and t2.y (field 3) > 1
            if v2 ~= nil and v2[3] > 1 then
                table.insert(join, {t1 = v1, t2 = v2})
            end
        end
        -- DISTINCT: collect the unique values of t1.a (field 2)
        local dist = {}
        for _, v in pairs(join) do
            if dist[v['t1'][2]] == nil then
                dist[v['t1'][2]] = 1
            end
        end
        local result = {}
        for k, _ in pairs(dist) do
            table.insert(result, k)
        end
        return result
    end

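The contrast between the one-line query and the hand-written loop can be reproduced outside Tarantool. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in SQL engine, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER, z INTEGER);
    INSERT INTO t1 VALUES (1, 10, 0, 0), (2, 10, 0, 0), (3, 20, 0, 0),
                          (4, 30, 0, 0), (5, 40, 0, 0);
    INSERT INTO t2 VALUES (1, 0, 5, 0), (2, 0, 7, 0), (3, 0, 4, 0), (4, 0, 1, 0);
""")

# Declarative: one statement, the engine decides how to join.
sql_result = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT(a) FROM t1, t2 WHERE t1.id = t2.id AND t2.y > 1"))

# Imperative: the same join and DISTINCT spelled out by hand.
t2_by_id = {row[0]: row for row in conn.execute("SELECT id, x, y, z FROM t2")}
distinct = set()
for row in conn.execute("SELECT id, a, b, c FROM t1"):
    match = t2_by_id.get(row[0])
    if match is not None and match[2] > 1:
        distinct.add(row[1])
manual_result = sorted(distinct)

print(sql_result, manual_result)
```

Both paths return the same values, but the imperative version fixes the join strategy in code while the SQL version leaves that choice to the engine.
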
SQL Features

- Trying to be a subset of ANSI
- Minimum overhead of the query planner
- ACID transactions, SAVEPOINTs
- left/inner/natural JOIN, UNION/EXCEPT, subqueries
- HAVING, GROUP BY, ORDER BY
- WITH RECURSIVE
- Triggers
- Views
- Constraints
- Collations

Perspectives

- Onboard sharding
- Synchronous replication
- SQL: more types, JIT, query planner

| Feature | Tarantool |
|---|---|
| Sharding | Tarantool VShard |
| Replication | Synchronous/Asynchronous |
| In-memory | memtx engine |
| Disk | vinyl engine, LSM-tree |
| Persistency | Both engines |
| SQL | ANSI |
| Stored procedures | Lua, C, SQL |
| Audit logging | Yes |
| Connectors to DBMSes | MySQL, Oracle, Memcached |
| Static build | for Linux |
| GUI | Cluster management |
| Unprecedented performance | 100,000 RPS per instance - easy! |

Thank you!

https://tarantool.io
https://github.com/tarantool/tarantool

Lag of async replicas

[Diagram: a master and a replica connected over a slow network; the replica lags]

- Re-send of lost changes: rejoin
- Complex topologies: support of arbitrary topologies

Multikey & JSON in Tarantool

Storing data in JSON format is often a more natural way to store data than rows and columns.