Brief introduction of DRBD in SLE12 SP2
TRANSCRIPT
![Page 2: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/2.jpg)
Overview
• What is DRBD
• Development status
• How to use DRBD
• Key features of DRBD
• Packages & Environment
• State of DRBD
• Basic structure
• Metadata (MD)
• What happens when a resource starts
![Page 3: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/3.jpg)
What is DRBD?
![Page 4: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/4.jpg)
Distributed Replicated Block Device
![Page 5: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/5.jpg)
Distributed Replicated Block Device
![Page 6: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/6.jpg)
Distributed Replicated Block Device
![Page 7: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/7.jpg)
Dual primary (needs shared file system support: OCFS2/GFS)
![Page 8: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/8.jpg)
Development status
![Page 9: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/9.jpg)
DRBD & Kernel
• drbd.ko
– Already built into the kernel, but it falls behind our distribution
    Kernel 2.6.33 → 8.3.7
    Kernel 3.12 → 8.4.6 (SLE12 SP1 as KMP)
    Kernel 4.2 → 8.4.X
    Kernel 4.4 → 9.0.1 (SLE12 SP2 as KMP)
• DRBD
– Developed and maintained by Linbit. Ver 8.0~8.3.x, Ver 8.4.x, Ver 9.0.x
– Other tools like: drbd-utils, drbd-doc, drbd-test, drbdmanage
![Page 10: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/10.jpg)
How to use DRBD
![Page 11: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/11.jpg)
Demo time!!
- DRBD8 (147.2.207.59/154)
- DRBD9 (147.2.212.220/144/107)
- DRBD with HA cluster
![Page 12: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/12.jpg)
Preparation
• 1) Create or provide a block device for DRBD.
2) Distribute the DRBD configuration files to all nodes.
3) Open the ports that DRBD needs.
4) Create the meta-data.
5) Trigger the initial synchronization.
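Steps 4 and 5 can be sketched as an ordered list of commands. The helper below is only an illustration: the function name is hypothetical, nothing is executed, and it assumes the `drbdadm create-md`, `drbdadm up` and `drbdadm primary --force` subcommands as they exist in DRBD 8.4 and 9 (DRBD 8.3 used a different option to force the first promotion).

```python
from typing import List


def initial_setup_commands(resource: str, sync_source: bool) -> List[str]:
    """Return the drbdadm commands for steps 4 and 5, in order (not executed)."""
    cmds = [
        f"drbdadm create-md {resource}",  # step 4: write the meta-data
        f"drbdadm up {resource}",         # attach the backing disk and connect
    ]
    if sync_source:
        # Step 5: run on ONE node only, the one whose data should be kept.
        cmds.append(f"drbdadm primary --force {resource}")
    return cmds


if __name__ == "__main__":
    for cmd in initial_setup_commands("test", sync_source=True):
        print(cmd)
```

Running the forced promotion on a single node is what starts the initial full synchronization towards its peers; the other nodes only need the first two commands.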
![Page 13: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/13.jpg)
Configuration in DRBD8
• “test.res” in /etc/drbd.d/

    resource test {
        protocol C;
        disk {
            on-io-error pass_on;
        }
        on node-1 {
            address 147.2.207.187:7792;
            device /dev/drbd0;
            disk /dev/vdb;
            meta-disk internal;
        }
        on node-2 {
            address 147.2.207.199:7792;
            device /dev/drbd0;
            disk /dev/vdb;
            meta-disk internal;
        }
    }
![Page 14: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/14.jpg)
Configuration in DRBD9
• “test.res” in /etc/drbd.d/

    resource test {
        net {
            protocol C;
        }
        connection-mesh {
            hosts node-1 node-2 node-3;
        }
        on node-1 {
            address 10.161.155.151:7788;
            device /dev/drbd0;
            disk /dev/sdb1;
            meta-disk internal;
            node-id 0;
        }
        on node-2 {
            address 10.161.155.158:7788;
            device /dev/drbd0;
            disk /dev/sdb1;
            meta-disk internal;
            node-id 1;
        }
        on node-3 {
            address 10.161.155.159:7788;
            device /dev/drbd0;
            disk /dev/sdb1;
            meta-disk /dev/sdc1;
            node-id 2;
        }
    }
![Page 15: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/15.jpg)
Crm configuration
• crm configure

    crm(live)configure# primitive drbd_test ocf:linbit:drbd \
        params drbd_resource="test" \
        op monitor interval="29s" role="Master" \
        op monitor interval="31s" role="Slave"
    crm(live)configure# ms ms_drbd_test drbd_test \
        meta master-max="1" master-node-max="1" \
        clone-max="2" clone-node-max="1" \
        notify="true"
    crm(live)configure# commit
    crm(live)configure# exit
![Page 16: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/16.jpg)
Key features of DRBD
![Page 17: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/17.jpg)
Replication modes
•

    ...
    net {
        protocol C;
    }
    ...

Fully synchronous mode (LAN): Protocol C
Asynchronous mode (WAN): Protocol A, and the semi-synchronous (memory synchronous) Protocol B (normally used in Geo scenarios)
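The difference between the three protocols is the point at which a write on the primary is reported as completed. A minimal sketch of that rule, as I understand it from the DRBD documentation; the function and parameter names are hypothetical and this is a model, not DRBD code:

```python
from enum import Enum


class Protocol(Enum):
    A = "A"  # asynchronous
    B = "B"  # memory synchronous (semi-synchronous)
    C = "C"  # fully synchronous


def write_completed(protocol: Protocol, local_disk_done: bool,
                    in_send_buffer: bool, peer_received: bool,
                    peer_disk_done: bool) -> bool:
    """Return True once the primary may report the write as completed."""
    if not local_disk_done:
        return False              # every protocol waits for the local disk
    if protocol is Protocol.A:
        return in_send_buffer     # packet placed in the local TCP send buffer
    if protocol is Protocol.B:
        return peer_received      # packet has reached the peer node
    return peer_disk_done         # Protocol C: peer confirmed its disk write
```

For example, a write that is on the local disk and in the send buffer, but has not yet reached the peer, counts as completed under Protocol A and as still pending under Protocols B and C.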
![Page 18: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/18.jpg)
Online device verification
• DRBD permits the verification of local and peer devices in an online fashion.
DRBD doesn't move data between nodes to validate but instead moves cryptographic digests of the data (hash). In this way, a node computes a hash of a block; transfers the much smaller signature to the peer node, which also calculates the hash; and then compares them. If the hashes are the same, the blocks are properly replicated. But if the hashes differ, the out-of-date block is marked as out of sync, and subsequent synchronization ensures that the block is properly synchronized.
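The digest comparison can be sketched in a few lines. This is an illustration only: SHA-1 from Python's `hashlib` stands in for whatever digest algorithm is configured, the 4 KiB block size and both function names are assumptions, and real DRBD does this inside the kernel.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Hash each block; only these small digests would cross the network."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def out_of_sync_blocks(local: List[bytes], peer: List[bytes]) -> List[int]:
    """Return the indices of blocks whose digests differ."""
    return [i for i, (a, b) in enumerate(zip(local, peer)) if a != b]


if __name__ == "__main__":
    local = bytes(3 * BLOCK_SIZE)
    peer = bytearray(local)
    peer[BLOCK_SIZE + 10] = 1  # corrupt one byte in the second block
    print(out_of_sync_blocks(block_digests(local), block_digests(bytes(peer))))
```

Only the blocks reported here would be marked out of sync and transferred during the next resynchronization.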
![Page 19: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/19.jpg)
Automatic recovery
• Automatic resync after a node or connectivity failure; DRBD determines the direction and the amount of data to resynchronize. DRBD can also recover from a wide variety of errors, but one of the most insidious is the so-called "split brain" situation.
1) Discarding modifications made on the younger primary.
2) Discarding modifications made on the older primary.
3) Discarding modifications on the primary with fewer changes.
4) Graceful recovery from split brain if one host has had no intermediate changes. (Recommended)

    ...
    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        ...
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        ...
    }
    ...
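The four recovery methods map onto `after-sb-0pri` policy names in drbd.conf. The sketch below models how each policy picks the node whose changes are discarded (the resync target). The `NodeInfo` fields and the function name are hypothetical, ties are ignored, and this is a simplified model rather than DRBD's actual decision code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeInfo:
    name: str
    became_primary_at: float  # hypothetical timestamp of the promotion
    changed_blocks: int       # blocks modified during the split brain


def resync_target(policy: str, a: NodeInfo, b: NodeInfo) -> Optional[str]:
    """Return the node that gets overwritten, or None to stay disconnected."""
    if policy == "discard-younger-primary":
        return a.name if a.became_primary_at > b.became_primary_at else b.name
    if policy == "discard-older-primary":
        return a.name if a.became_primary_at < b.became_primary_at else b.name
    if policy == "discard-least-changes":
        return a.name if a.changed_blocks < b.changed_blocks else b.name
    if policy == "discard-zero-changes":
        if a.changed_blocks == 0:
            return a.name
        if b.changed_blocks == 0:
            return b.name
        return None  # both sides changed: no automatic recovery
    return None      # "disconnect" or unknown policy: manual recovery
```

With `discard-zero-changes`, nothing is lost: the node that is overwritten made no changes of its own, which is why the slide marks it as the recommended option.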
![Page 20: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/20.jpg)
Optimizing synchronization
• Two of the schemes that DRBD uses are activity logs and the quick-sync bitmap.
The activity log stores blocks that were recently written to and defines which blocks need to be synchronized after a failure is resolved. The quick-sync bitmap defines the blocks that are in sync (or out of sync) during a time of disconnection. When the nodes are reconnected, synchronization can use this bitmap to quickly synchronize the nodes to be exact replicas of one another.
![Page 21: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/21.jpg)
New features of DRBD9
• 1) Multi-node replication.
2) Up to 31 connections per resource, which means clusters of up to 32 nodes are supported.
3) Auto promote.
4) Transport abstraction layer, e.g. drbd_transport_tcp.ko. Introduced to allow RDMA over Ethernet/InfiniBand.
5) New management tool: drbdmanage
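The "31 connections, 32 nodes" figure follows from every node connecting to every other node. A small arithmetic sketch, with hypothetical function names:

```python
def peers_per_node(nodes: int) -> int:
    """Each node keeps one connection to every other node."""
    return nodes - 1


def mesh_links(nodes: int) -> int:
    """Total number of distinct node pairs in a full mesh."""
    return nodes * (nodes - 1) // 2


if __name__ == "__main__":
    print(peers_per_node(32))  # 31 connections per node
    print(mesh_links(3))       # 3 links for a three-node mesh
    print(mesh_links(32))      # 496 links for a full 32-node mesh
```

A three-node resource therefore has two connections per node and three links in total.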
![Page 22: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/22.jpg)
Packages & Environment
![Page 23: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/23.jpg)
DRBD Packages in SLE12SP2
• Project drbd:
    drbd (COPYING, ChangeLog)
    drbd-kmp-default (drbd.ko, drbd_transport_tcp.ko)
Project drbd-utils:
    drbd-utils (drbdadm, drbdmeta, drbdsetup, etc...)
Project yast2-drbd:
    yast2-drbd
![Page 24: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/24.jpg)
Threads of DRBD
• After the .ko is loaded:
    kthread drbd_reissue  PR: 0
Per resource, once started and connected:
    drbd<minor>_submit  PR: 0
    drbd_w(orker)_<res>  PR: 20
    drbd_r(eceiver)_<res>  PR: 20
    drbd_a(ck_receiver)_<res>  PR: -3
    drbd_s(ender)_<res>  PR: 20
![Page 25: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/25.jpg)
State
![Page 26: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/26.jpg)
Resource roles
• Primary: may be read from and written to
Secondary: normally receives updates from its peer, but may neither be read from nor written to
Unknown: It is only displayed for the peer’s resource role, and only in disconnected mode
![Page 27: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/27.jpg)
Disk states
• Diskless: No local block device has been assigned to the DRBD driver
Attaching: Reading meta data. Next → Consistent/Inconsistent/…
Failed: I/O failure reported by local block device. Next → Diskless
Consistent/Inconsistent: The node's data is consistent / the data needs to be synchronized.
UpToDate/Outdated: Decided when the connection is established.
Dunknown: Used for the peer disk if no network connection.
![Page 28: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/28.jpg)
Connection states
• StandAlone: The resource has not yet been connected.
Disconnecting: Temporary state, Next → StandAlone.
Unconnected: Temporary state, Next → WFConnection.
Timeout/NetworkFailure/ProtocolError: Connection Errors.
Teardown: Temporary state, Next → Unconnected.
WFConnection: waiting until the peer node becomes visible.
Connected: connection has been established.
Others: StartingSyncS/StartingSyncT, WFBitMapS/WFBitMapT, SyncSource/SyncTarget, PausedSyncS/PausedSyncT, VerifyS/VerifyT
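The "Next →" notes for the temporary states form a small transition table. The sketch below encodes only the transitions listed on this slide; the table and function names are hypothetical, and real DRBD has many more transitions.

```python
from typing import List

# Temporary states and the state they move to, as listed on the slide.
NEXT_STATE = {
    "Disconnecting": "StandAlone",
    "Unconnected": "WFConnection",
    "Teardown": "Unconnected",
}


def settle(state: str) -> List[str]:
    """Follow the temporary states until a non-temporary state is reached."""
    path = [state]
    while state in NEXT_STATE:
        state = NEXT_STATE[state]
        path.append(state)
    return path


if __name__ == "__main__":
    print(" -> ".join(settle("Teardown")))
```

A connection in Teardown thus passes through Unconnected and ends up waiting in WFConnection.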
![Page 29: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/29.jpg)
Basic data structure
![Page 30: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/30.jpg)
DRBD resources
• A node has a number of DRBD resources. Each such resource has a number of devices (volumes) and connections to other nodes. Each device has a unique minor device number.
This relationship is represented by the global variable drbd_resources, the drbd_resource, drbd_connection, drbd_device, and drbd_peer_device objects, and their interconnections.

| resource   | device      | … | device      |
| connection | peer_device | … | peer_device |
| …          | …           | … | …           |
| connection | peer_device | … | peer_device |

All of this is accessed in an RCU-safe way; modifications are protected by the resource->conf_update mutex.
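The matrix above can be modelled with a small data class: one column per device, one row per connection, and one peer_device at each intersection. This is a user-space illustration with hypothetical field names, not the kernel structures themselves.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Resource:
    name: str
    devices: List[int] = field(default_factory=list)      # minor numbers (volumes)
    connections: List[str] = field(default_factory=list)  # peer node names

    def peer_devices(self) -> List[Tuple[str, int]]:
        """One peer_device per (connection, device) pair."""
        return [(peer, minor)
                for peer in self.connections
                for minor in self.devices]


if __name__ == "__main__":
    res = Resource("test", devices=[0, 1], connections=["node-2", "node-3"])
    for peer, minor in res.peer_devices():
        print(f"peer_device: connection to {peer}, device minor {minor}")
```

A resource with two volumes and two connections therefore carries four peer_device objects.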
![Page 31: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/31.jpg)
Metadata
![Page 32: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/32.jpg)
Metadata includes:
• Information like size of the DRBD device
Generation Identifier
Activity Log
Quick-sync bitmap
![Page 33: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/33.jpg)
Activity log
• Consider a write that is issued to the local backing device while the same data block is being sent over the network. If the primary node fails at that moment and a fail-over is initiated, this data block is out of sync between the nodes.
The activity log keeps track of those blocks that have "recently" been written to.
So only the blocks in the activity log need to be synchronized after the connection resumes.
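A minimal sketch of the idea: the activity log behaves like a bounded, least-recently-used set of "hot" extents. The 4 MiB extent size matches what the DRBD documentation describes; the class and method names are hypothetical, and the on-disk transaction format is left out entirely.

```python
from collections import OrderedDict
from typing import List, Optional

AL_EXTENT_SIZE = 4 * 1024 * 1024  # 4 MiB per activity log extent


class ActivityLog:
    def __init__(self, max_extents: int) -> None:
        self.max_extents = max_extents
        self.hot: "OrderedDict[int, bool]" = OrderedDict()

    def record_write(self, offset: int) -> Optional[int]:
        """Mark the extent containing offset as hot; return any evicted extent."""
        extent = offset // AL_EXTENT_SIZE
        evicted = None
        if extent in self.hot:
            self.hot.move_to_end(extent)           # refresh its LRU position
        else:
            if len(self.hot) >= self.max_extents:
                evicted, _ = self.hot.popitem(last=False)  # drop the coldest
            self.hot[extent] = True
        return evicted

    def extents_to_resync(self) -> List[int]:
        """After a primary crash, only these extents need to be resynced."""
        return sorted(self.hot)
```

A larger log means fewer evictions and fewer meta-data updates during normal operation, at the cost of more data to resynchronize after a primary crash.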
![Page 34: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/34.jpg)
Quick sync bitmap (per node)
• Kept on a per-resource, per-peer basis to keep track of blocks that are out of sync.
One bit represents a 4-KiB chunk of on-disk data.
The bitmap is normally changed only in memory; it is written to disk when an extent changes out of the activity log or when the resource is about to be taken down.
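With one bit per 4-KiB chunk, the bitmap size and the bits touched by a write are simple arithmetic. A short sketch; the function names are hypothetical:

```python
BITMAP_CHUNK = 4096  # bytes of on-disk data covered by one bit


def bitmap_bytes(device_bytes: int, peers: int = 1) -> int:
    """Size of the quick-sync bitmap(s) in bytes, one bitmap per peer."""
    bits = -(-device_bytes // BITMAP_CHUNK)  # ceiling division
    return peers * (-(-bits // 8))


def dirty_bits(offset: int, length: int) -> range:
    """Bit indices to set when a write at offset cannot reach the peer."""
    first = offset // BITMAP_CHUNK
    last = (offset + length - 1) // BITMAP_CHUNK
    return range(first, last + 1)


if __name__ == "__main__":
    one_tib = 2 ** 40
    print(bitmap_bytes(one_tib) // 2 ** 20, "MiB per peer for a 1 TiB device")
    print(list(dirty_bits(4000, 200)))  # a write crossing a chunk boundary
```

A 1 TiB device needs 32 MiB of bitmap per peer, and a 200-byte write that crosses a chunk boundary marks two chunks (8 KiB) for resynchronization.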
![Page 35: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/35.jpg)
Generation Identifier
• Determining whether the two nodes are in the same cluster
Determining whether a sync is needed, and in which direction
Identifying split brain
A GI is a list consisting of:
    Current UUID
    Bitmap UUIDs
    Historical UUIDs * 2
Three main situations in which a new GI is generated:
1) The initial sync happens; both sides then use the GI of the SyncSource.
2) A Secondary is promoted to Primary while the connection state is disconnected.
3) The connection is lost: the original Primary generates a new GI, while the Secondary's stays unchanged.
Other cases, such as disconnecting during a state change...
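How the UUIDs are compared when two nodes connect can be sketched as below. This loosely follows the rules described in the DRBD user guide and is heavily simplified: it uses a single bitmap UUID instead of one per peer, treats 0 as "empty", and the class, function, and result names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GI:
    current: int
    bitmap: int = 0                      # 0 means "empty"
    history: List[int] = field(default_factory=list)


def compare_gi(local: GI, peer: GI) -> str:
    """Decide what to do on connect, from the local node's point of view."""
    if local.current == 0 and peer.current == 0:
        return "no-sync"                 # both freshly created
    if local.current == 0:
        return "full-sync-from-peer"
    if peer.current == 0:
        return "full-sync-to-peer"
    if local.current == peer.current:
        return "no-sync"                 # same data generation
    if local.bitmap == peer.current and peer.bitmap == 0:
        return "resync-to-peer"          # peer was away, local kept writing
    if peer.bitmap == local.current and local.bitmap == 0:
        return "resync-from-peer"        # local was away, peer kept writing
    if local.current in peer.history:
        return "full-sync-from-peer"     # peer moved on, bitmap not usable
    if peer.current in local.history:
        return "full-sync-to-peer"
    if local.bitmap != 0 and local.bitmap == peer.bitmap:
        return "split-brain"             # common parent, diverged data
    return "unrelated"                   # not the same cluster: disconnect
```

For instance, after a secondary failure the surviving primary holds a new current UUID and keeps the old one as its bitmap UUID, so `compare_gi(GI(2, bitmap=1), GI(1))` yields a bitmap-based resync towards the peer.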
![Page 36: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/36.jpg)
$ drbdadm up <res>: what happens?
![Page 37: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/37.jpg)
Stages:
• CFG_PREREQ
CFG_RESOURCE
CFG_DISK_PREP_DOWN/CFG_DISK_PREP_UP
CFG_NET_DISCONNECT/CFG_NET_CONNECT
CFG_NET_PREP_DOWN/CFG_NET_PREP_UP
CFG_NET_PATH
CFG_NET
…
For drbdadm up <res>, the scheduled stages are:
CFG_NET_PREP_UP
CFG_NET_PATH
CFG_NET_CONNECT
CFG_PEER_DEVICE
CFG_DISK_PREP_UP
CFG_DISK
![Page 38: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/38.jpg)
Appendices
![Page 39: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/39.jpg)
Links
• Linbit homepage: http://www.drbd.org/en/
Source code in tarball: http://www.drbd.org/en/community/download
Git repos: http://git.linbit.com/
![Page 40: brief introduction of drbd in SLE12SP2](https://reader033.vdocuments.site/reader033/viewer/2022051304/58ac2b231a28abf03a8b6885/html5/thumbnails/40.jpg)