Ceph BlueStore: a new storage type in Ceph / Максим Воронцов (Maxim Vorontsov), Redsys
![Page 1: Ceph BlueStore - a new storage type in Ceph / Maxim Vorontsov (Redsys)](https://reader031.vdocuments.site/reader031/viewer/2022013106/586f8f951a28ab54768b7661/html5/thumbnails/1.jpg)
Ceph new store: BlueStore
Maxim Vorontsov
About me
● Chief engineer for computing systems
● 8 years of working with Linux
● WAS/DB2/MQ and all that
● Many different projects
About RedSys
● Business integrator
● In business for more than 20 years
● Offices in MOW, LED, OVB, GOJ, ROV, KHV
● RED = Responsibility + Efficiency + Development
● Industries: fuel and energy, defense, government, telecom, etc.
Customers
TOC
● Before Ceph
● Ceph first advent
● Ceph temptations
● BlueStore prophecy
● Ceph FileStore vs BlueStore
● Let's fight
● Results
● Awaiting Ceph second advent
Software Defined Storage
● Unlimited scalability
● Storage virtualization
● Policy-driven administration
● API services
● Support for block, file and object data types
IBM definition
«SDS in today's business context refers to IT storage that goes beyond typical array interfaces (for example, command line and graphic user) to operate within a higher architectural construct.»
Examples
● AWS S3
● EMC ScaleIO
● Ceph
● GlusterFS
● Huawei FusionStorage
● IBM ElasticStorage
● NexentaStor
Issue
● DB2 on z/OS
● XML in DB2
● Signed XML in DB2 (no way)
● You really shouldn't store blobs in a relational store
To find a way
● More money to IBM?
● More money to someone else?
● Something else?
Which one?
● AWS S3
● Ceph
● IBM ElasticStorage
● Huawei OceanStor
● Swift
Why this one?
Standing on the shoulders of giants
● CERN
● Cisco
● Deutsche Telekom
● Yahoo
● Cloudmouse.ru
● ...
Preborn
7 guests in VMware:
● 1 MON
● 3 OSD
● 1 ActiveMQ
● 1 Tomcat
● 1 ElasticStorage
Long story short
Long long story about…
What is English for «импортозамещение»?
Catch up and overtake z/OS
What is Russian for LTFS?
What is Russian for WORM?
BlueStore prophecy
Ceph Jewel Preview: a new store is coming, BlueStore
Ceph scheme
OSD scheme
FileStore scheme
BlueStore scheme
BlueStore advanced scheme
Mount directory structure
$ ls -R /var/lib/ceph/osd/ceph-0 | wc -l
FileStore: 18656
BlueStore: 16
HW test
$ sudo dd bs=1G count=1 oflag=direct \
if=/dev/zero of=zerofile
1+0 records in
1+0 records out
1073741824 bytes (1,1 GB) copied, 10,275 s, 105 MB/s
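As a quick sanity check, the rate dd reports can be recomputed from the byte count and elapsed time in the output above. This is a minimal sketch: the helper name `throughput_mb_s` is hypothetical (not from the talk), and it assumes decimal megabytes (10^6 bytes), which is the unit dd uses for `MB/s`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the talk): recompute throughput in MB/s
# (decimal megabytes, as dd reports them) from a byte count and a duration.
throughput_mb_s() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# Values taken from the dd output above; 10,275 s is written with a dot here.
throughput_mb_s 1073741824 10.275
```

The result is about 104.5 MB/s, which agrees with the rounded 105 MB/s that dd printed.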
HW test
$ iperf3 -c osd00
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 7.40 GBytes 6.35 Gbits/sec 3278 sender
[ 4] 0.00-10.00 sec 7.39 GBytes 6.35 Gbits/sec receiver
$ iperf3 -c osd00-ci
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 15.5 GBytes 13.3 Gbits/sec 64 sender
[ 4] 0.00-10.00 sec 15.5 GBytes 13.3 Gbits/sec receiver
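To compare the two links with the disk figure from the dd test, the iperf3 rates can be converted from Gbit/s to MB/s. This is a small sketch: `gbit_to_mb_s` is a hypothetical helper, and it assumes decimal units (1 Gbit = 10^9 bits, 1 MB = 10^6 bytes).

```shell
#!/bin/sh
# Hypothetical helper (not part of the talk): convert Gbit/s to MB/s,
# assuming decimal units on both sides.
gbit_to_mb_s() {
    LC_ALL=C awk -v gbit="$1" 'BEGIN { printf "%.2f\n", gbit * 1000 / 8 }'
}

gbit_to_mb_s 6.35   # rate measured against osd00
gbit_to_mb_s 13.3   # rate measured against osd00-ci
```

Under these assumptions the links carry roughly 794 MB/s and 1662 MB/s, several times the 105 MB/s measured for the disk, so the network is unlikely to be the limiting factor for a single such disk.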
Ceph tests
$ ceph osd pool create radosbench 64
$ rados bench -p radosbench 300 write \
--no-cleanup
$ rados bench -p radosbench 300 seq
$ rados bench -p radosbench 300 rand
$ rbd create fio_test --size 10G
$ fio rbd.fio
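The slides do not show the contents of `rbd.fio`. A job file for fio's `rbd` ioengine might look like the sketch below, modelled on the example that ships with fio. The pool name `rbd`, the `admin` client name, and the workload parameters (`rw`, `bs`, `iodepth`) are assumptions, not the settings used in the talk. Only the image name `fio_test` comes from the commands above.

```shell
#!/bin/sh
# Write a hypothetical rbd.fio job file. Every value except the image name
# fio_test is an assumption and should be adjusted to the cluster under test.
cat > rbd.fio <<'EOF'
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
rw=randwrite
bs=4k

[rbd_iodepth32]
iodepth=32
EOF
```

With this file in place, `fio rbd.fio` would run 4k random writes against the image through librbd rather than through a mapped kernel block device.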
Results
Not so fast
$ ceph-disk prepare --bluestore /dev/sdd /dev/sdb
$ ls /dev/disk/by-partlabel/ -l
osd-device-2-block -> ../../sdb2
osd-device-2-data -> ../../sdd1
Not so fast
$ ceph-disk prepare --bluestore /dev/sdd /dev/sdb
$ ls /dev/disk/by-partlabel/ -l
ceph%20data -> ../../sdb1
ceph%20block -> ../../sdb2
Not so fast
Here be dragons
Tech preview
CPU regression on too fast disks ;-)
Did you back up today?
● https://www.redbooks.ibm.com/abstracts/redp5121.html
● http://www.sersc.org/journals/IJMUE/vol10_no11_2015/27.pdf
● http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf
● https://ceph.com
● https://www.sebastien-han.fr/blog/
● https://cds.cern.ch/record/2015206/files/CephScaleTestMarch2015.pdf
● http://rocksdb.org