![Page 1: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/1.jpg)
DISTRIBUTED FAILURE DETECTOR
Ziv Dayan 200388130
Tom Afek Kafka 200637247
Instructor: Ittay Eyal
![Page 2: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/2.jpg)
Reminder
- What is a failure detector?
- Our failure detector
  - Software implementation
  - Gossip style
  - Independent local unit
![Page 3: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/3.jpg)
Model
![Page 4: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/4.jpg)
Implementation
- Communication is by messages
  - Each message contains a list of heartbeats
  - Each heartbeat contains
    - IP of creator
    - Time since creation
- Each node contains its own Local Node

[Diagram: Local Node holds Net Members (a list of Node entries), Neighbors (a list of Neighbor entries), and Versions (a list of Version entries)]
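
The message layout and the Local Node can be sketched in Python as follows. This is a minimal illustration, not the project's code: the class and field names (`Heartbeat`, `Message`, `LocalNode`, `merge`) are assumptions, as is the idea that the receiver converts "time since creation" into a creation time on its own clock, which would avoid any need for synchronized clocks.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Heartbeat:
    creator_ip: str      # IP of the node that created the heartbeat
    age_seconds: float   # time elapsed since the heartbeat was created


@dataclass
class Message:
    # Each message carries a list of heartbeats.
    heartbeats: List[Heartbeat] = field(default_factory=list)


@dataclass
class LocalNode:
    ip: str
    # Hypothetical layout: member IP -> local creation time of its freshest heartbeat.
    net_members: Dict[str, float] = field(default_factory=dict)
    neighbors: List[str] = field(default_factory=list)
    versions: List[int] = field(default_factory=list)

    def merge(self, message: Message, now: float) -> None:
        """Keep, per creator, the freshest heartbeat expressed in local time."""
        for hb in message.heartbeats:
            created_local = now - hb.age_seconds
            if created_local > self.net_members.get(hb.creator_ip, float("-inf")):
                self.net_members[hb.creator_ip] = created_local
```

For example, a heartbeat that is 3 seconds old and arrives at local time 100 is recorded as created at local time 97; an older heartbeat from the same creator that arrives later does not overwrite it.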
![Page 5: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/5.jpg)
Network Construction
![Page 6: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/6.jpg)
Failure Detection Method
Repeat periodically:
- Choose the node whose threshold is closest to expiration
- Wait until the threshold has expired
- Check the local time of creation of the last heartbeat received from the suspected node:
  - If changed, the node is OK
  - Else, the suspected node has crashed
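
One round of this method might look like the sketch below. It assumes a table mapping each node's IP to the local creation time of its last heartbeat; the function name `detect_round` and the injectable `clock` and `sleep` parameters are hypothetical and exist only to make the example testable.

```python
import time
from typing import Callable, Dict, Optional


def detect_round(last_created: Dict[str, float], threshold: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Run one detection round; return the IP of a crashed node, or None."""
    if not last_created:
        return None
    # The node with the oldest heartbeat is the one closest to expiration.
    suspect = min(last_created, key=last_created.get)
    seen_before = last_created[suspect]
    # Wait until the suspect's threshold has expired.
    remaining = seen_before + threshold - clock()
    if remaining > 0:
        sleep(remaining)
    # A changed creation time means a fresher heartbeat arrived in the meantime.
    if last_created[suspect] != seen_before:
        return None
    # Unchanged: declare the suspect crashed and stop tracking it.
    del last_created[suspect]
    return suspect
```

In a multi-threaded implementation the table would be updated by a receiving thread while the detector waits, so access to it would need to be synchronized.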
![Page 7: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/7.jpg)
Thread Diagram
[Diagram: threads Computer Listener, Main, Message Handler, Message Sender, Sender, Detector]
![Page 8: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/8.jpg)
Version Handling
- A new abstract class is added: NetMessage
  - Method 1: Handle() decodes the received message using the proper version and returns a Message
  - Method 2: toString() is used for serialization

[Diagram: classes NetMessage, SHA1Message, NormalMessage, Message]
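
A rough Python rendering of this design is shown below. The slide names only the classes and the two methods, so several details are assumptions made for illustration: that SHA1Message and NormalMessage subclass NetMessage, that SHA1Message protects the payload with a SHA-1 digest, the `"<digest>:<payload>"` wire format, and the `encode` helper. `handle()` and `__str__()` stand in for Handle() and toString().

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Message:
    payload: str


class NetMessage(ABC):
    """Version-specific wire form of a Message."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    @abstractmethod
    def handle(self) -> Message:
        """Decode the received text using this version's rules."""

    def __str__(self) -> str:
        # Serialization: the text that is sent over the network.
        return self.raw


class NormalMessage(NetMessage):
    def handle(self) -> Message:
        return Message(self.raw)

    @classmethod
    def encode(cls, message: Message) -> "NormalMessage":
        return cls(message.payload)


class SHA1Message(NetMessage):
    # Hypothetical wire format: "<sha1 hex digest>:<payload>".
    def handle(self) -> Message:
        digest, _, payload = self.raw.partition(":")
        if hashlib.sha1(payload.encode()).hexdigest() != digest:
            raise ValueError("SHA-1 digest mismatch")
        return Message(payload)

    @classmethod
    def encode(cls, message: Message) -> "SHA1Message":
        digest = hashlib.sha1(message.payload.encode()).hexdigest()
        return cls(f"{digest}:{message.payload}")
```

The point of the abstraction is that the rest of the program only ever sees Message; which subclass decodes the bytes depends on the version agreed with the peer.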
![Page 9: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/9.jpg)
Version Agreement Protocol
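[Sequence diagram between initiator and responder. Labels: $addr_i, V_i$; $v_{i,r}$ (at both initiator and responder); $\mathrm{NetMsg}_{v_{i,r}}(msg)$ (in both directions)]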
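
One way such an agreement could work is sketched below. The slide does not state the selection rule, so this assumes the initiator offers its address and supported version set, written here as (addr_i, V_i), and the responder answers with the highest version both sides support, written v_{i,r}. The function names `propose` and `respond` are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

Proposal = Tuple[str, FrozenSet[int]]


def propose(addr: str, versions: Iterable[int]) -> Proposal:
    """Initiator side: offer own address and the set of supported versions."""
    return (addr, frozenset(versions))


def respond(proposal: Proposal, responder_versions: Iterable[int]) -> Optional[int]:
    """Responder side: pick the agreed version, or None if there is no overlap."""
    _addr, offered = proposal
    common = offered & set(responder_versions)
    # Assumed rule: prefer the highest version that both sides support.
    return max(common) if common else None
```

After the exchange, both sides would encode and decode subsequent messages with the NetMessage variant that matches the agreed version.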
![Page 10: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/10.jpg)
Readers Writers Problem
![Page 11: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/11.jpg)
Heartbeat Rate
- H = f(P, n, threshold)
- Assumptions required
  - Simplicity vs. efficiency
  - Full topology
  - Spread time << threshold
![Page 12: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/12.jpg)
Heartbeat Rate – Take I
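- Assumption: local information (a strong assumption)
- Reliability
  - $x$: number of messages, $x = t_{thresh} \cdot h$
  - $\left(\frac{n-2}{n-1} + \frac{1}{n-1}\,\mathrm{PLR}\right)^{x}$: probability of false detection
  - We want:

$$\left(\frac{n-2}{n-1} + \frac{1}{n-1}\,\mathrm{PLR}\right)^{x} \le 1 - P$$

- Result:

$$h = \frac{1}{t_{thresh}} \, \log_{\frac{n-2}{n-1} + \frac{1}{n-1}\mathrm{PLR}} \left(1 - P\right)$$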
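
The Take I result can be evaluated numerically with the sketch below. It reads the slide's formula as h = log_q(1 - P) / t_thresh with q = (n - 2)/(n - 1) + PLR/(n - 1), where P is the required reliability and PLR is presumably the packet loss rate; both the reading of the formula and the function names are assumptions.

```python
import math


def miss_probability(n: int, plr: float) -> float:
    """Probability q that one gossip message fails to refresh a given node:
    either another of the n - 1 peers was chosen, or the message was lost."""
    return (n - 2) / (n - 1) + plr / (n - 1)


def heartbeat_rate(n: int, plr: float, reliability: float, t_thresh: float) -> float:
    """Smallest rate h such that q ** (h * t_thresh) <= 1 - reliability."""
    if n < 3:
        raise ValueError("n must be at least 3 so that q > 0")
    if not 0 <= plr < 1:
        raise ValueError("plr must be in [0, 1)")
    if not 0 < reliability < 1:
        raise ValueError("reliability must be in (0, 1)")
    if t_thresh <= 0:
        raise ValueError("t_thresh must be positive")
    q = miss_probability(n, plr)
    # log_q(1 - P) written with natural logarithms.
    messages_needed = math.log(1 - reliability) / math.log(q)
    return messages_needed / t_thresh
```

For large n, log q is approximately -(1 - PLR)/(n - 1), so under this reading h grows roughly linearly in n, with a slope proportional to -log(1 - P).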
![Page 13: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/13.jpg)
Take I Performance
- Linear performance
- The larger P is, the steeper the slope
![Page 14: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/14.jpg)
Heartbeat Rate – Take II
- Assumptions
  - Synchrony
  - Consistency
- Calculation for the average case
![Page 15: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/15.jpg)
Take II Performance
High Performance
![Page 16: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/16.jpg)
Which Method Is Better?
- Comparison categories
  - Efficiency
  - Scalability
  - Dynamism
  - Reliability
![Page 17: Ziv Dayan200388130 Tom Afek Kafka200637247 Instructor Ittay Eyal](https://reader036.vdocuments.site/reader036/viewer/2022062421/56649d2d5503460f94a038b6/html5/thumbnails/17.jpg)
Thank you for listening