Corpus collapsum - Galera's partition tolerance,...
DESCRIPTION
Talk by Raghavendra Prabhu at HighLoad++ 2014.
TRANSCRIPT
![Page 1: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/1.jpg)
Corpus collapsum
Partition tolerance of Galera in a noisy high load environment
Highload++ 2014
Raghavendra Prabhu
Percona
randomsurfer wnohang.net rdprabhu ronin13
![Page 2: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/2.jpg)
The Title?
![Page 3: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/3.jpg)
Our Cluster
![Page 4: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/4.jpg)
Split brain
![Page 5: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/5.jpg)
Introduction
Seed quotes..
“’Network is reliable’ - a fallacy of the distributed system.”
“A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.” - Leslie Lamport
“Never attribute to malice that which is adequately explained by stupidity.” - Hanlon’s Razor
“Never attribute to Byzantine failure which can be explained by an ill node(s)” - Me
Raghavendra Prabhu (Percona) Corpus collapsum 31 October, 2014 5 / 58
![Page 9: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/9.jpg)
20000 feet view
![Page 10: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/10.jpg)
Introduction
Actors
▶ Database - WSREP/PXC
▶ Plugin - Galera
▶ Traffic control
  ♦ Traffic Control - tc
  ♦ NetEm
![Page 13: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/13.jpg)
Introduction
Actors
▶ Containers - Docker
▶ Load
  ♦ Generators - Sysbench, RQG
▶ Network
  ♦ Dnsmasq
  ♦ nsenter
![Page 15: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/15.jpg)
Introduction
Actors
▶ Jenkins
  ♦ Build flow and CI
▶ Storage
  ♦ Why
▶ “Others”
![Page 16: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/16.jpg)
Details
But why
▶ The ’P’ in CAP
▶ WAN scalability
▶ Real Reason - fun!
▶ Tolerance to latency variance
![Page 20: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/20.jpg)
Details
But why
▶ Failures in warehouses.
▶ Not quorum, but consensus.
▶ Real world networks and synchronous replication
  - Delay
  - Partition
![Page 21: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/21.jpg)
Galera
![Page 22: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/22.jpg)
Details
Galera
▶ Data-centric approach
▶ EVS
▶ Causality and Synchronous
▶ Latency
![Page 23: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/23.jpg)
![Page 24: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/24.jpg)
![Page 25: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/25.jpg)
![Page 26: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/26.jpg)
Where did it start
![Page 27: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/27.jpg)
Details
Where did it start
▶ Bug! https://bugs.launchpad.net/galera/+bug/1274192
▶ Loss of PC
▶ Crash
▶ HA goal
![Page 28: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/28.jpg)
One can bring the whole down
![Page 29: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/29.jpg)
The Flow
![Page 30: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/30.jpg)
Details
Basic Flow
Jenkins | Build images | Start Dnsmasq | Bootstrap
Load/Sysbench | SST/Others | Pre-sanity | nsenter/netem
![Page 38: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/38.jpg)
Details
Basic Flow
RR sysbench
Detach/Keep
Sanity check | Reconciliation
Post sanity | Core trace
Cleanup | Collect logs
![Page 46: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/46.jpg)
Details
Cluster Resilience
![Page 47: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/47.jpg)
Details
Parameters
▶ Sysbench
▶ Segment
▶ Reconciliation period
▶ Loss nodes
![Page 51: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/51.jpg)
Details
Parameters
▶ NetEm
▶ Detach loss
▶ Fsync
▶ Shutdown
![Page 55: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/55.jpg)
Containers!
![Page 56: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/56.jpg)
Details
Docker
▶ Why not virtualize
  ♦ Occam
  ♦ Namespaces
▶ Simplicity
  ♦ Network
  ♦ One application per node
![Page 57: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/57.jpg)
Details
Docker
▶ Portability
  - You see the same qualitative behavior that I do.
▶ Reproducibility
  - Makes it deterministic
▶ Configurable and CI
  - Byproducts
![Page 58: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/58.jpg)
Details
Docker
▶ QEMU and Docker
▶ Scalability
  ♦ Performance
  ♦ Feature
▶ Abstraction of channels
![Page 59: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/59.jpg)
Details
Container Networking
▶ Linking didn’t help
▶ Dnsmasq to rescue!
  ♦ Hosts file and volumes
  ♦ SIGHUP and refresh
▶ More elegant methods
  ♦ Swarm
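The hosts-file approach on this slide can be sketched roughly as follows. This is an illustration only: it assumes dnsmasq was started with an additional hosts file (its `--addn-hosts` option) kept on a shared volume. The function names, the file path, and the node names are hypothetical and are not taken from the pxc-docker scripts. The refresh step relies on dnsmasq re-reading its hosts files when it receives SIGHUP.

```python
import os
import signal


def render_hosts(nodes):
    """Render hosts-file lines ("IP<TAB>name") for a mapping of name -> IP."""
    return "".join(f"{ip}\t{name}\n" for name, ip in sorted(nodes.items()))


def refresh_dnsmasq(hosts_path, nodes, dnsmasq_pid=None):
    """Rewrite the hosts file and, if a PID is given, ask dnsmasq to reload."""
    with open(hosts_path, "w") as fh:
        fh.write(render_hosts(nodes))
    if dnsmasq_pid is not None:
        # SIGHUP makes dnsmasq clear its cache and re-read its hosts files.
        os.kill(dnsmasq_pid, signal.SIGHUP)


if __name__ == "__main__":
    # Hypothetical container names and addresses, for illustration only.
    print(render_hosts({"node1": "172.17.0.2", "node2": "172.17.0.3"}), end="")
```

Each time a container joins, the harness would add its name and address to the mapping and call `refresh_dnsmasq` again, so that every node can resolve every other node by name.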
![Page 60: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/60.jpg)
Details
Noise
▶ Initial setup
  - Bridge
  - Egress only
  - IFB
▶ Present state
▶ NetEm
  - tc qdisc buckets
  - packet loss, delay, corruption, duplication, reordering
  - nsenter
▶ Future
  - Docker exec
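As a rough sketch of how NetEm and nsenter fit together, the helper below only builds the command lines; it does not run them. The device name `eth0`, the PID, and the function names are assumptions for illustration, and the actual scripts in the talk may set things up differently. The `tc ... netem` and `nsenter --target ... --net` forms are the standard ones from iproute2 and util-linux.

```python
def netem_attach_command(pid, dev="eth0", loss=None, delay=None, jitter=None):
    """Build an argv list that adds a netem qdisc inside a container's netns.

    pid    -- PID of a process inside the target container (assumed input)
    loss   -- packet loss in percent, e.g. 10 for 10%
    delay  -- base delay in milliseconds
    jitter -- delay variation in milliseconds (only used together with delay)
    """
    base = ["nsenter", "--target", str(pid), "--net",
            "tc", "qdisc", "add", "dev", dev, "root", "netem"]
    opts = []
    if delay is not None:
        opts += ["delay", f"{delay}ms"]
        if jitter is not None:
            opts.append(f"{jitter}ms")
    if loss is not None:
        opts += ["loss", f"{loss}%"]
    if not opts:
        raise ValueError("no impairment requested")
    return base + opts


def netem_detach_command(pid, dev="eth0"):
    """Build an argv list that removes the root qdisc again."""
    return ["nsenter", "--target", str(pid), "--net",
            "tc", "qdisc", "del", "dev", dev, "root"]


if __name__ == "__main__":
    print(" ".join(netem_attach_command(1234, loss=10, delay=100, jitter=20)))
    print(" ".join(netem_detach_command(1234)))
```

The resulting lists can be handed to `subprocess.run` by a caller with sufficient privileges. Keeping command construction separate from execution makes the noise parameters easy to log and to test.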
![Page 61: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/61.jpg)
Testing methods
![Page 62: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/62.jpg)
Details
Method I
▶ Qdisc is detached after load
▶ Objective
  - Time for the full cluster to recover
▶ Done with a larger subset
![Page 63: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/63.jpg)
Details
Method II
▶ Qdisc is kept till the end
▶ Objective
  - Formation of primary component
▶ Comparatively smaller set
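Both objectives, time to recovery in Method I and formation of a primary component in Method II, come down to watching the cluster state over time. A minimal sketch of such a check is shown below. It assumes a caller-supplied `read_status` function (hypothetical, for example a wrapper around `SHOW STATUS LIKE 'wsrep_%'`) that returns the status variables `wsrep_cluster_status` and `wsrep_cluster_size` as a dict. It is not the harness used in the talk.

```python
import time


def wait_for_primary(read_status, expected_size, timeout=300.0, interval=1.0,
                     sleep=time.sleep):
    """Poll until a Primary component of at least expected_size is reported.

    read_status -- hypothetical callable returning a dict of wsrep status
                   variables for one node
    Returns the elapsed seconds, or None if the timeout expires first.
    """
    start = time.monotonic()
    while True:
        status = read_status()
        is_primary = status.get("wsrep_cluster_status") == "Primary"
        size = int(status.get("wsrep_cluster_size", 0))
        if is_primary and size >= expected_size:
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        sleep(interval)
```

With `expected_size` set to the full node count this measures recovery of the whole cluster. With a smaller value it only checks that some primary component has formed while the noise is still in place.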
![Page 64: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/64.jpg)
Details
Observations
▶ Post sanity types
  - Why
▶ Which method is more pertinent
▶ State transfer issues
  - Beginning
  - During re-emergence
![Page 65: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/65.jpg)
Details
Observations
▶ Direct load to affected nodes
▶ Logs
  - journalctl
  - Streaming?
![Page 66: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/66.jpg)
Details
Other noises
▶ Aim
▶ Fsync
  - libeatmydata
  - Variance
▶ Correlation with network
▶ How with Docker
![Page 67: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/67.jpg)
System Load
![Page 68: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/68.jpg)
Details
Load generation
▶ Sysbench
  - Generation
  - Reconnect on partition
▶ Sockets chosen
  - Load on affected nodes
▶ Distribution of Load
  - RR with socat
  - Native sysbench support
![Page 69: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/69.jpg)
Details
Load generation
▶ Nature of data/load
  - DDL
▶ RQG in future
  - Fuzz testing
![Page 70: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/70.jpg)
The Fix
![Page 71: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/71.jpg)
Strike Out!
![Page 72: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/72.jpg)
Details
Eviction
▶ STONITH
▶ Permanent eviction
▶ ’N’ strikes & out!
  - Timers - evs parameters
  - wsrep_evs_delayed and wsrep_evs_evict_list
![Page 73: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/73.jpg)
Details
Eviction
▶ Aim
▶ Quorum required
  - Why?
  - Not shoot each other
  - Non-PC nodes also.
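The combination of "N strikes" and a quorum requirement can be illustrated with a toy model. The class below is a simplified sketch and not Galera's EVS protocol: the class and method names are made up, strikes are plain counters rather than timer-driven, and a "strict majority of members" stands in for whatever agreement rule the real implementation uses.

```python
from collections import defaultdict


class StrikeTracker:
    """Toy model of 'N strikes and out' with a majority requirement."""

    def __init__(self, members, strikes_to_evict):
        self.members = set(members)
        self.limit = strikes_to_evict
        # strikes[reporter][suspect] -> times reporter saw suspect as delayed
        self.strikes = defaultdict(lambda: defaultdict(int))

    def report_delayed(self, reporter, suspect):
        """Record one strike; unknown nodes and self-reports are ignored."""
        if (reporter in self.members and suspect in self.members
                and reporter != suspect):
            self.strikes[reporter][suspect] += 1

    def evict_list(self):
        """Return suspects that a strict majority of members want evicted."""
        quorum = len(self.members) // 2 + 1
        evicted = []
        for suspect in sorted(self.members):
            votes = sum(1 for reporter in self.members
                        if self.strikes[reporter][suspect] >= self.limit)
            if votes >= quorum:
                evicted.append(suspect)
        return evicted


if __name__ == "__main__":
    tracker = StrikeTracker(["a", "b", "c"], strikes_to_evict=2)
    for reporter in ("a", "a", "b", "b"):
        tracker.report_delayed(reporter, "c")
    print(tracker.evict_list())
```

The majority check is what the "not shoot each other" bullet is about: a single node with a bad link may accumulate strikes against every peer, but on its own it can never reach quorum.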
![Page 75: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/75.jpg)
Details
Eviction
▶ EVS version and upgrade
▶ TODO!
  - Ingress only
  - Follow here.
▶ Credits to Teemu Ollakka, Yan Zhang and Alex Yurchenko from Codership.
![Page 76: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/76.jpg)
Details
Coredumps with Docker
▶ Breakdown of abstraction
▶ Lack of isolation
▶ What was done
  - Volumes
  - core_pattern & sysctl
  - suid and ulimit
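The settings named on this slide can be inspected from inside a container with a few lines of Python. This is only a diagnostic sketch under the assumption of a Linux host. The function names are made up, and it does not reproduce what the test scripts actually changed. `kernel.core_pattern` is a kernel-wide sysctl rather than a per-container one, which is presumably part of the isolation problem the slide mentions.

```python
import resource


def read_proc(path):
    """Return the stripped contents of a /proc file, or None if unreadable."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def coredump_settings():
    """Collect the settings that decide whether and where a core is written."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    return {
        "core_pattern": read_proc("/proc/sys/kernel/core_pattern"),
        "suid_dumpable": read_proc("/proc/sys/fs/suid_dumpable"),
        "core_soft_limit": soft,  # resource.RLIM_INFINITY means unlimited
        "core_hard_limit": hard,
    }


if __name__ == "__main__":
    for key, value in coredump_settings().items():
        print(f"{key}: {value}")
```

If `core_pattern` points at a directory, that directory has to exist inside the container as well, which is where a volume helps in getting the core file back out.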
![Page 77: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/77.jpg)
Details
WAN Segments
▶ How they work
▶ Random allocation
▶ Joiner starvation
▶ Simulates data center
▶ Donor selection
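Random allocation of nodes to segments might look like the sketch below. Galera selects a node's segment through the `gmcast.segment` provider option, which takes a value from 0 to 255. The helper names, the node names, and the use of a seed for repeatable runs are assumptions for illustration, not the logic of the actual test scripts.

```python
import random

MAX_SEGMENT = 255  # gmcast.segment accepts values from 0 to 255


def assign_segments(nodes, num_segments, seed=None):
    """Randomly map each node name to a segment id in [0, num_segments)."""
    if not 1 <= num_segments <= MAX_SEGMENT + 1:
        raise ValueError("num_segments must be between 1 and 256")
    rng = random.Random(seed)  # a seed keeps a test run reproducible
    return {node: rng.randrange(num_segments) for node in nodes}


def provider_option(segment):
    """Format the wsrep provider option that selects a segment."""
    return f"gmcast.segment={segment}"


if __name__ == "__main__":
    for node, segment in assign_segments(["node1", "node2", "node3"], 2).items():
        print(node, provider_option(segment))
```

Each node would then be started with its option string in `wsrep_provider_options`, so that nodes sharing a segment id behave like members of one simulated data center.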
![Page 78: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/78.jpg)
Epilogue
The code
▶ Github: https://github.com/percona/pxc-docker
▶ Jenkins: http://jenkins.percona.com/job/PXC-5.6-netem/
  - Demo?
▶ Contributions/testing welcome!
▶ Dependencies
  - Sysbench
![Page 79: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/79.jpg)
Epilogue
Code: todo
▶ Docker automated builds
▶ Orchestration
▶ Docker
  ♦ Injection
  ♦ Signal proxying
![Page 80: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/80.jpg)
Epilogue
Code: todo
▶ Use Hoare’s channels - Go!
▶ Run it bare - CoreOS
▶ Overlay with etcd/fleet/libswarm
![Page 81: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/81.jpg)
Future work
![Page 82: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/82.jpg)
Epilogue
Future work
▶ Fault injection
  ♦ Memory
    - Poisoned memory
  ♦ Disk
    - libeatmydata
    - Opposite: laggard!
    - ENOSPC
![Page 83: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/83.jpg)
Epilogue
Fault injection
▶ CPU
  - NUMA?
  - Hotplug
▶ More network
  - corruption, duplication, reordering, rate-limit
  - Better distribution
  - Other shaping
![Page 84: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/84.jpg)
More Chaos
![Page 85: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/85.jpg)
Epilogue
Future work
▶ Disturb cluster more!
  - Membership changes
    * Manual eviction
    * Pull the cord!
  - Corrupt nodes
▶ Consistency voting
![Page 86: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/86.jpg)
Epilogue
Further Reading
▶ Byzantine fault tolerance
  - Reaching agreement in presence of faults
▶ The Network is Reliable
▶ NetEm
▶ Latency: The New Web Performance Bottleneck
▶ Galera
▶ Auto eviction code
▶ Don’t Settle for Eventual Consistency
![Page 87: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/87.jpg)
Epilogue
About
▶ /me: Raghavendra Prabhu, Product Lead, Percona XtraDB Cluster, Percona.
▶ Slides will be at slideshare and owncloud
▶ Keybase.io: rdprabhu
▶ About.me: raghavendra.prabhu
▶ Presentation under CC BY-SA 4.0
![Page 88: Corpus collapsum - устойчивость Galera к партиционированию, Raghavendra Prabhu (Percona)](https://reader035.vdocuments.site/reader035/viewer/2022081404/5585ba34d8b42a695a8b4c5d/html5/thumbnails/88.jpg)
Epilogue
Image Credits
▶ http://galeracluster.com/documentation-webpages/
▶ http://www.thelastdragontribute.com/40th-anniversary-death-of-bruce-lee/
▶ https://upload.wikimedia.org/wikipedia/commons/6/60/Corpus_callosum.png
▶ http://www.thebarrow.org/Neurological_Services/Epilepsy/204354
▶ https://flic.kr/p/9J6GNu
▶ https://secure.flickr.com/photos/brewbooks/7780990192
▶ https://www.flickr.com/photos/kwerfeldein/2649294869
▶ https://secure.flickr.com/photos/mindmob/51951632
▶ https://secure.flickr.com/photos/arenamontanus/2227769907
▶ https://www.flickr.com/photos/markop/477199204
▶ http://galeracluster.com/wp-content/uploads/2013/10/galera_replication1.png
▶ https://www.flickr.com/photos/gcwest/281385801
▶ https://www.flickr.com/photos/opethdamna/360934079
▶ http://digital-amphetamine.deviantart.com/art/Sky-82555664
▶ http://highload.co/i/logo.png
▶ https://flic.kr/p/xTT8n
▶ https://www.flickr.com/photos/29233640@N07/13466208953
▶ https://www.flickr.com/photos/bob_in_thailand/9782777742/