Sharding - patterns & antipatterns, Konstantin Osipov, Alexey...
A talk by Konstantin Osipov (Mail.Ru, Tarantool) and Alexey Rybak (Badoo)
Sharding: patterns and antipatterns
Konstantin Osipov (Mail.Ru, Tarantool)
Alexey Rybak (Badoo)
Big picture: scalable databases
● replication
● sharding and re-sharding
● distributed queries & jobs, Map/Reduce
● DDL
● will focus on sharding/re-sharding only
Contents
I. sharding function
II. routing
III. re-sharding
I. Sharding function
Selecting a good shard key
● the identified object should be small
● some data you won’t be able to shard (and have to duplicate in each shard)
● don’t store the key if you don’t have to
Good and bad shard keys
● good: user session, shopping order
● maybe: user (if user data isn’t too thick)
● bad: inventory item, order date
Garage sharding: numbers
● replication-based doubling (2, 4, 8, out of cash)
● the magic number 48 (divisible by 2, 3 and 4)
Garage sharding thru hashing
● good: remainders
o f(key) ≡ key % n_srv
o f(key) ≡ crc32(key) % n_srv
● bad: first login letter
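A minimal Python sketch of the three functions on this slide. The function names are illustrative only, and the first-letter variant assumes logins that start with an ASCII letter.

```python
import zlib


def shard_by_remainder(key: int, n_srv: int) -> int:
    """f(key) = key % n_srv, for integer keys such as numeric user ids."""
    return key % n_srv


def shard_by_crc32(key: str, n_srv: int) -> int:
    """f(key) = crc32(key) % n_srv, for string keys."""
    return zlib.crc32(key.encode("utf-8")) % n_srv


def shard_by_first_letter(login: str, n_srv: int) -> int:
    """The 'bad' variant: the shard depends only on the first login letter."""
    return (ord(login[0].lower()) - ord("a")) % n_srv
```

Real logins cluster on a handful of common first letters, so the last function loads shards unevenly, while the remainder-based ones spread keys across all servers.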
Sharding for grown-ups
● table function
● consistent hashing
Table functions
● virtual buckets: key -> bucket -> shard
o “key -> bucket” function, “bucket -> shard” table
o “key -> bucket” table, “bucket -> shard” table
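The first variant ("key -> bucket" function, "bucket -> shard" table) could look roughly like the sketch below. The bucket count, the shard names and the helper names are assumptions made for illustration, and copying the data of a moved bucket is left out.

```python
import zlib

N_BUCKETS = 48  # assumed value; chosen once and never changed afterwards

# "bucket -> shard" table; shard names are hypothetical
bucket_to_shard = {b: f"shard-{b % 2}" for b in range(N_BUCKETS)}


def key_to_bucket(key: str) -> int:
    """'key -> bucket' function: stable for the lifetime of the system."""
    return zlib.crc32(key.encode("utf-8")) % N_BUCKETS


def locate(key: str) -> str:
    """Resolve key -> bucket -> shard."""
    return bucket_to_shard[key_to_bucket(key)]


def move_bucket(bucket: int, new_shard: str) -> None:
    """Re-sharding only edits a table entry; keys never change buckets."""
    bucket_to_shard[bucket] = new_shard
```

Because the key-to-bucket step never changes, adding a server means reassigning whole buckets rather than rehashing every key.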
Consistent hashing
● Danny Lewin RIP
● Kinda ring and like... uhm... points, you know...
● Libraries: Ketama
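The "ring and points" idea can be sketched in a few lines of Python. This is a generic hash ring written for illustration, not Ketama's exact point layout; the class name, the MD5-based point function and the default of 100 points per server are all assumptions.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Place a label on the ring: first 8 bytes of its MD5 digest as an integer."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, points_per_server=100):
        self.points_per_server = points_per_server
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        """Put several points for the server on the ring."""
        for i in range(self.points_per_server):
            p = _point(f"{server}#{i}")
            self._owner[p] = server
            bisect.insort(self._points, p)

    def remove(self, server: str) -> None:
        """Drop all points that belong to the server."""
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        """Return the owner of the first point clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _point(key))
        return self._owner[self._points[i % len(self._points)]]
```

When a server is added, a key either stays where it was or moves to the new server; keys never shuffle between the old servers.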
Guava/Sumbur
● f(key, n_servers) => server_id
● strictly uniform key-to-server mapping
● recurrence formula (15 lines of code)
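To show what a function with the signature f(key, n_servers) => server_id can look like, here is a Python sketch of jump consistent hash (Lamping and Veach). It is a different published algorithm used as a stand-in, not the Guava or Sumbur code, and the function name is an assumption.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_hash(key: int, n_servers: int) -> int:
    """Map a 64-bit integer key to a server id in [0, n_servers)."""
    if n_servers < 1:
        raise ValueError("n_servers must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < n_servers:
        b = j
        # 64-bit linear congruential step drives a pseudo-random "jump"
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b
```

The function needs no ring and no table: growing from n to n + 1 servers leaves a key in place or moves it to the new server n, which is the property that makes this family of functions attractive for re-sharding.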
II. Routing
Routing types
● smart client
● coordinator
● proxy
● local proxy on every app server
● intra-database routing
Smart Client
● no extra hops
● all clients (PHP/Python/C...) should implement it
● resharding is hard
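A rough sketch of a smart client, assuming a crc32-remainder sharding function and made-up shard addresses. The class and method names are hypothetical.

```python
import zlib


class SmartClient:
    """Routing lives in the client library: key -> shard address, no extra hop."""

    def __init__(self, shard_addresses):
        self.shard_addresses = list(shard_addresses)

    def route(self, key: str) -> str:
        """Pick the shard address for a key using crc32(key) % n_srv."""
        index = zlib.crc32(key.encode("utf-8")) % len(self.shard_addresses)
        return self.shard_addresses[index]

    def reconfigure(self, shard_addresses) -> None:
        """During re-sharding, every running client must receive this update."""
        self.shard_addresses = list(shard_addresses)


# Two clients with different shard lists disagree about where keys live.
old = SmartClient(["db1:3301", "db2:3301"])
new = SmartClient(["db1:3301", "db2:3301", "db3:3301"])
disagreements = [k for k in (f"user:{i}" for i in range(100))
                 if old.route(k) != new.route(k)]
```

The `disagreements` list is why re-sharding is hard here: until every client in every language has the new configuration, some of them read and write the wrong shard.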
Proxy
● encapsulates routing logic
● extra hop, traffic
● +1 service
● SPOF
=> local proxy
Coordinator
● centralized knowledge
● SPOF
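A toy model of the coordinator pattern, with hypothetical names throughout. The `available` flag only simulates an outage so that the single point of failure becomes visible.

```python
import zlib


class Coordinator:
    """Single service that owns the 'bucket -> shard' table."""

    def __init__(self, bucket_to_shard):
        self.bucket_to_shard = dict(bucket_to_shard)
        self.available = True  # flipped to False to simulate an outage

    def locate(self, key: str) -> str:
        """Answer 'which shard holds this key?' for any client."""
        if not self.available:
            raise ConnectionError("coordinator is down")
        bucket = zlib.crc32(key.encode("utf-8")) % len(self.bucket_to_shard)
        return self.bucket_to_shard[bucket]

    def reassign(self, bucket: int, shard: str) -> None:
        """Routing changes happen in exactly one place."""
        self.bucket_to_shard[bucket] = shard
```

Clients stay simple because they hold no routing state, but no key can be located while the coordinator is unavailable.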
Intra-database routing
● too many nodes
● redundancy is high
● ad-hoc requests
III. Re-sharding
Re-sharding is a pain
● redistribution impacts:
o clients
o network performance
o consistency
=> maintenance time window
● forget about it on petabyte scale
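To put a number on the redistribution cost, the sketch below counts how many keys change servers under the plain remainder function key % n_srv when the server count changes. The helper name and the use of sequential integer keys are assumptions.

```python
def moved_fraction(n_old: int, n_new: int, n_keys: int = 100_000) -> float:
    """Fraction of integer keys whose server changes under key % n_srv."""
    moved = sum(1 for key in range(n_keys) if key % n_old != key % n_new)
    return moved / n_keys


# Adding one server to four moves most of the data; doubling moves half.
four_to_five = moved_fraction(4, 5)   # 0.8
four_to_eight = moved_fraction(4, 8)  # 0.5
```

With the remainder function, going from 4 to 5 servers relocates 80% of the keys, and even a clean doubling from 4 to 8 relocates half of them.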
Best practice: no data redistribution
● update is a move
● data expiration (new data on new servers)
● new data on selected servers
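One way to send new data to new servers without touching old data is an append-only range table. The sketch assumes object ids are allocated in increasing order; the class, method and shard names are hypothetical.

```python
import bisect


class RangeTable:
    """Append-only 'first id -> shard' table: existing ranges never change."""

    def __init__(self):
        self._starts = []  # first id of each range, in increasing order
        self._shards = []  # shard that owns the matching range

    def open_range(self, first_id: int, shard: str) -> None:
        """From first_id onwards, new objects are created on the given shard."""
        if self._starts and first_id <= self._starts[-1]:
            raise ValueError("ranges must be appended in increasing id order")
        self._starts.append(first_id)
        self._shards.append(shard)

    def locate(self, obj_id: int) -> str:
        """Find the shard whose range contains obj_id."""
        i = bisect.bisect_right(self._starts, obj_id) - 1
        if i < 0:
            raise KeyError(obj_id)
        return self._shards[i]


table = RangeTable()
table.open_range(0, "shard-a")
table.open_range(1000, "shard-b")  # a new server takes all ids from 1000 on
```

Old ids keep resolving to their original shard, so adding capacity never requires moving existing rows.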
DDL
● upgrade your app
● upgrade your database
● update your app and remove any trace of old schema
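One reading of these three steps is that the first app upgrade must work against both the old and the new schema, so that shards can be migrated one at a time. The sketch below illustrates that reading only; the column names and the function are invented for the example.

```python
def read_display_name(row: dict) -> str:
    """Step 1 app code: works before and after a shard's schema upgrade."""
    if "display_name" in row:  # hypothetical new schema
        return row["display_name"]
    # hypothetical old schema: two separate columns
    return f'{row["first_name"]} {row["last_name"]}'


old_row = {"first_name": "Ada", "last_name": "Lovelace"}
new_row = {"display_name": "Ada Lovelace"}
```

Once every shard has the new schema, the final step deletes the old-schema branch from the app.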
Thank you! Questions?