1
On Failure Recoverability of Client-Server
Applications in Mobile Wireless Environments
Ing-Ray Chen, Baoshan Gu, Sapna E. George and Sheng-Tzong Cheng
Presented by Heng Zhang & Ying Jin
2
Agenda
• Introduction
• System Model
• Analysis of Failure Recovery Probability
• Numerical Results
• Failure Recoverability vs. Cost Tradeoff Analysis
• Conclusion and Future Research
3
Introduction
• Failure in mobile hosts
• Checkpoint strategies
• Logging strategies
• Mobility handoff strategies and their effect on failure recoverability
A mobile host (MH) in a client-server application can easily fail due to:
• Low battery power
• Memory exhaustion
• Lack of resources (e.g. bandwidth)
4
Introduction (cont’)
Checkpoint
• The application takes a snapshot of its state, saving the values of its state variables in persistent storage. When a failure occurs, the MH can roll back to the state saved at the checkpoint.
Checkpoint protocols
• Coordinated (multiple MHs): a consistent global checkpoint
• Uncoordinated (one MH): the MH can independently checkpoint its local state
5
Introduction (cont’)
Write-event
• The MH receives a message or a user input that modifies the state of the application.
No-logging
• A new checkpoint is created whenever a write-event happens.
• High checkpoint cost.
Logging
• Checkpoints are created periodically.
• Write-events that occur between two consecutive checkpoints are logged.
• Lower checkpoint cost.
• Longer failure recovery time.
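The difference between the no-logging and logging strategies can be illustrated with a minimal sketch. The class and method names below are hypothetical, and for simplicity the checkpoint is triggered after a fixed number of write-events rather than on a time period as in the slides. Setting `checkpoint_every=1` mimics no-logging.

```python
import copy


class LoggingApp:
    """Toy MH application: checkpoints plus a log of write-events."""

    def __init__(self, checkpoint_every):
        # Assumption: checkpoint after this many write-events
        # (the slides checkpoint on a time period instead).
        self.checkpoint_every = checkpoint_every
        self.state = {}
        self.checkpoint = {}      # stands in for persistent storage
        self.log = []             # write-events since the last checkpoint
        self.checkpoints_taken = 0

    def write_event(self, key, value):
        """Apply a state-modifying event and log it."""
        self.state[key] = value
        self.log.append((key, value))
        if len(self.log) >= self.checkpoint_every:
            self.take_checkpoint()

    def take_checkpoint(self):
        """Save a snapshot; older log entries are no longer needed."""
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()
        self.checkpoints_taken += 1

    def recover(self):
        """Roll back to the checkpoint, then replay the logged events."""
        self.state = copy.deepcopy(self.checkpoint)
        for key, value in self.log:
            self.state[key] = value
        return len(self.log)      # number of events re-executed
```

With `checkpoint_every=1` every write-event costs a checkpoint but recovery replays nothing; with a larger value fewer checkpoints are taken and recovery has to replay the logged events.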
6
Introduction (cont’)
Mobility handoff strategies and failure recoverability
Eager:
• Always keep the logging and checkpoint information at the base station of the cell where the MH currently resides.
• Fast failure recovery.
Lazy:
• Do not move the checkpoint and log as the MH moves. A forwarding pointer is established from the current base station to the last base station.
7
System Model (1)
• Consider an MH in a client-server distributed application in a mobile wireless environment.
• The MH checkpoints periodically.
• The write-events between two consecutive checkpoints are recorded by creating log entries.
• Assume the checkpoint and log information is kept at the base stations.
• Two mobility handoff strategies are considered:
  • Eager strategy
  • Lazy strategy
8
System Model (2)
Eager strategy
• When the MH fails, the persistent information (the last checkpoint and the message logs after it) can be found at the current base station.
Lazy strategy
• When the MH moves from cell n to cell 1, a linked list of length n-1 is formed. If the MH fails, n cells are involved in failure recovery, and the persistent information is scattered across the base stations on the forwarding chain.
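A minimal sketch of the two handoff strategies follows. `BaseStation`, `lazy_handoff`, `eager_handoff`, and `collect_for_recovery` are hypothetical names, and the sketch does not model what happens to the chain when a new checkpoint is taken.

```python
class BaseStation:
    def __init__(self, name):
        self.name = name
        self.checkpoint = None   # last checkpoint, if stored here
        self.log = []            # log entries recorded while the MH was here
        self.prev = None         # forwarding pointer to the previous station


def lazy_handoff(current, new):
    """Leave the data where it is; only set up a forwarding pointer."""
    new.prev = current
    return new


def eager_handoff(current, new):
    """Move the checkpoint and log to the new base station."""
    new.checkpoint, new.log = current.checkpoint, current.log
    current.checkpoint, current.log = None, []
    return new


def collect_for_recovery(current):
    """Follow the forwarding chain and gather the recovery data.

    Returns (checkpoint, log entries in original order, stations involved).
    """
    stations = []
    node = current
    while node is not None:
        stations.append(node)
        node = node.prev
    stations.reverse()           # oldest station first
    checkpoint, log = None, []
    for station in stations:
        if station.checkpoint is not None:
            checkpoint = station.checkpoint
        log.extend(station.log)
    return checkpoint, log, len(stations)
```

After two lazy handoffs the chain has two forwarding pointers and three base stations take part in recovery; after an eager handoff only the current base station is involved.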
10
Analysis of Failure Recovery Probability
• When a checkpoint operation is performed, all log entries before the checkpoint are purged and the number of log entries N(t) is reset to 0.
• The MH only re-executes the log entries accumulated after the last checkpoint.
Eager Mobility Handoff Strategy:
13
Analysis of Failure Recovery Probability – Eager Strategy
Consider the special case:
• Log arrivals: Poisson process with arrival rate λ
• MH failure time: exponentially distributed with failure rate δ
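For this special case, the number of log entries to re-execute can be estimated with a short sketch. It assumes checkpoints are taken exactly every Tc seconds, so the time since the last checkpoint at failure is t mod Tc. The function names are hypothetical, and this is not the closed-form recovery-time distribution from the paper, only the expected log count under these assumptions.

```python
import math
import random


def mean_time_since_checkpoint(delta, Tc):
    """E[t mod Tc] for an exponential failure time t with rate delta."""
    return 1.0 / delta - Tc / math.expm1(delta * Tc)


def expected_log_entries(lam, delta, Tc):
    """Expected number of log entries accumulated at failure time."""
    return lam * mean_time_since_checkpoint(delta, Tc)


def simulate_log_entries(lam, delta, Tc, runs=20_000, seed=1):
    """Monte Carlo estimate of the same quantity, as a cross-check."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        t_prime = rng.expovariate(delta) % Tc   # time since last checkpoint
        clock = rng.expovariate(lam)            # first log arrival
        while clock < t_prime:
            total += 1
            clock += rng.expovariate(lam)       # next log arrival
    return total / runs
```

With λ = 0.1, δ = 0.0001, and Tc = 1000 s, `expected_log_entries` gives about 49 entries. Raising δ to 0.001 lowers the mean time since the last checkpoint, so fewer entries need to be replayed.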
15
Analysis of Failure Recovery Probability
Lazy Mobility Handoff Strategy
• Suppose that, before the failure occurs, the MH has visited k cells since the last checkpoint.
The recovery process:
• Transfer the last checkpoint and the log entries distributed among the k base stations to the current base station (via the wired network).
• Transfer the last checkpoint with the log entries from the current base station to the MH (via the wireless network).
• Re-execute the log entries.
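The three recovery steps can be sketched as a toy additive cost model. This is an illustration only, not the model from the paper: the per-item times are hypothetical parameters, r is read here as the ratio of wired to wireless transfer time per item, and per-hop costs along the forwarding chain are ignored.

```python
def lazy_recovery_time(k, n_logs, wireless_time_per_item, r,
                       exec_time_per_log):
    """Toy recovery time for a failure after k cell crossings.

    All cost parameters are assumptions made for illustration.
    """
    items = 1 + n_logs                       # last checkpoint + log entries
    # Step 1: gather the scattered data over the wired network
    # (nothing to gather when everything is at the current station).
    wired = r * wireless_time_per_item * items if k > 0 else 0.0
    # Step 2: send checkpoint and logs to the MH over the wireless link.
    wireless = wireless_time_per_item * items
    # Step 3: re-execute the log entries on the MH.
    replay = exec_time_per_log * n_logs
    return wired + wireless + replay
```

Calling the function with k = 0 corresponds to the eager case, where only the wireless transfer and the replay remain.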
16
Analysis of Failure Recovery Probability – Lazy Strategy
Cdf of the Recovery Time:
Random variable representing the number of base stations crossed by the MH after the last checkpoint, given that the failure time is t
17
Analysis of Failure Recovery Probability – Lazy Strategy
Let t’ = t - M·Tc
TR
r: transmission ratio between wired and wireless communication.
18
Analysis of Failure Recovery Probability – Lazy Strategy
Consider the special case:
• Log arrivals: Poisson process with arrival rate λ
• MH failure time: exponentially distributed with failure rate δ
• Residence time: exponentially distributed with rate σ
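With exponentially distributed residence times of rate σ, the number of cell crossings in an interval of length t’ is Poisson with mean σ·t’. The sketch below evaluates that distribution; the function names are hypothetical, and t’ is passed in directly rather than derived from the failure-time distribution.

```python
import math


def handoff_pmf(k, sigma, t_prime):
    """P(exactly k cell crossings in t_prime) for exponential residence times."""
    mu = sigma * t_prime
    return math.exp(-mu) * mu ** k / math.factorial(k)


def expected_handoffs(sigma, t_prime):
    """Mean number of cell crossings since the last checkpoint."""
    return sigma * t_prime
```

For example, with σ = 0.01 and t’ = 500 s the MH has crossed five cell boundaries on average since the last checkpoint, which gives a sense of how long the forwarding chain is under the lazy strategy.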
20
Numerical Results
Effects of various parameters on failure recoverability
• Strategies compared: Eager handoff + logging, Lazy handoff + logging
• Log arrival rate λ
• MH failure rate δ
• Checkpoint interval Tc
• Transmission ratio between wired and wireless communication r
• MH mobility rate σ
21
Numerical Results (cont’)
Checkpoint interval 1000s
Log arrival rate 0.1
Mobility rate 0.01
Failure rate 0.0001
Transmission ratio 0.1
• Failure recovery probability under the eager strategy is always better than that under the lazy strategy.
• Given enough recovery time (T > 0.3 in this case), the failure recoverability offered by the less costly lazy strategy is just as good as that of the more costly eager strategy.
Failure recovery probability for different recovery time
22
Numerical Results (cont’)
Checkpoint interval 1000s
Recovery time 0.24s
Mobility rate 0.01, 0.001
Failure rate 0.0001
Transmission ratio 0.1
• The system recovery probability decreases dramatically as the log arrival rate increases.
• The difference in recovery probability between the two handoff strategies is small when the log arrival rate is low.
• The effect of mobility rate on the recovery probability is marginal.
Effect of log arrival rate and mobility rate
23
Numerical Results (cont’)
Log arrival rate 0.1
Recovery time 0.24s
Mobility rate 0.01
Failure rate 0.0001
Transmission ratio 0.1
• The system recovery probability decreases dramatically as the checkpoint interval increases.
• The difference in recovery probability at a particular recovery time between the Eager and Lazy strategies becomes more significant as the checkpoint interval increases.
Effect of the checkpoint interval
24
Numerical Results (cont’)
Log arrival rate 0.1
Recovery time 0.24s
Mobility rate 0.01
Checkpoint interval 1000s
Transmission ratio 0.1
• The higher the failure rate, the higher the recovery probability.
• As the failure rate increases, the difference between the Eager and Lazy mobility handoff strategies becomes less significant.
Effect of failure rate
25
Failure Recoverability vs. Cost Tradeoff Analysis
Tradeoff: the Eager strategy spends less time on failure recovery than the Lazy strategy, but much more time maintaining the checkpoint and logs.
Objective: identify the condition under which the recovery probability gained by Eager is most effective relative to the cost invested in maintenance.
26
Failure Recoverability vs. Cost Tradeoff Analysis
Failure Recoverability versus Cost Ratio: the slope of the recovery probability gained versus the cost invested.
Cost invested by Eager
Number of checkpoints before failure:
Number of moves crossing boundary during a checkpoint period:
Number of log entries between two consecutive moves:
27
Failure Recoverability vs. Cost Tradeoff Analysis
Cost invested by Lazy
Cp: communication cost for setting up the link.
• There exists a best checkpoint interval under which the Eager strategy is most cost-effective over the Lazy strategy.
• The most cost-effective checkpoint interval for the Eager strategy increases as the recovery time increases.
28
Conclusion and Future Research
Conclusion
• Closed-form expressions for the failure recovery time distribution under both the Eager and Lazy handoff strategies.
• Extensive numerical analysis of the effect of model parameters such as log arrival rate, mobility rate, failure rate, and checkpoint interval.
• Analysis of the tradeoff between the cost required to maintain the checkpoints and logs and the recovery cost.
Future research
• Using a more sophisticated probabilistic model (SPN)
• Analyzing other checkpoint strategies