
1

On Failure Recoverability of Client-Server Applications in Mobile Wireless Environments

Ing-Ray Chen, Baoshan Gu, Sapna E. George and Sheng-Tzong Cheng

Presented by Heng Zhang & Ying Jin

2

Agenda

• Introduction
• System Model
• Analysis of Failure Recovery Probability
• Numerical Results
• Failure Recoverability vs. Cost Tradeoff Analysis
• Conclusion and Future Research

3

Introduction

• Failure in mobile host
• Checkpoint strategies
• Logging strategies
• Effect of mobility handoff strategies on failure recoverability

A mobile host (MH) in a client-server application can easily fail due to:
• Low battery power
• Memory exhaustion
• Lack of resources (e.g. bandwidth)

4

Introduction (cont’)

Checkpoint
• The application takes a snapshot of its state, saving the values of its state variables in persistent storage. When a failure occurs, the MH can roll back to the state saved at the checkpoint.

Checkpoint protocols
• Coordinated (multiple MHs): consistent global checkpoint
• Uncoordinated (one MH): the MH can independently checkpoint its local state

5

Introduction (cont’)

Write-event
• The MH receives a message or a user input that modifies the state of the application.

No-logging
• A new checkpoint is created whenever a write-event happens.
• High checkpoint cost.

Logging
• Checkpoints are created periodically.
• Write-events that occur between two consecutive checkpoints are logged.
• Lower checkpoint cost.
• Longer failure recovery time.
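The logging approach above can be illustrated with a minimal sketch. The class name `LoggedState` and its methods are hypothetical and not taken from the paper; the sketch only assumes that write-events are simple key/value updates and that persistent storage can be modelled by in-memory copies.

```python
import copy


class LoggedState:
    """Toy application state with periodic checkpoints and a write-event log.

    Hypothetical illustration: names and structure are assumptions.
    """

    def __init__(self):
        self.state = {}        # volatile application state
        self.checkpoint = {}   # last snapshot (stands in for persistent storage)
        self.log = []          # write-events since the last checkpoint

    def write(self, key, value):
        # A write-event modifies the state; with logging it is recorded
        # in the log rather than triggering a new checkpoint.
        self.log.append((key, value))
        self.state[key] = value

    def take_checkpoint(self):
        # Snapshot the state and purge the log entries that precede it.
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()

    def recover(self):
        # Roll back to the checkpoint, then re-execute the logged write-events.
        self.state = copy.deepcopy(self.checkpoint)
        for key, value in self.log:
            self.state[key] = value
        return len(self.log)   # number of re-executed log entries


app = LoggedState()
app.write("x", 1)
app.take_checkpoint()
app.write("y", 2)
app.write("x", 3)
app.state = {}                 # simulate a failure that wipes the volatile state
replayed = app.recover()
```

The number of entries replayed in `recover` grows with the time since the last checkpoint, which is why logging trades lower checkpoint cost for longer recovery time.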

6

Introduction (cont’)

Effect of mobility handoff strategies on failure recoverability

Eager:
• Always keep the logging and checkpoint information at the base station where the MH currently resides.
• Fast failure recovery.

Lazy:
• Do not move the checkpoint and log information as the MH moves.
• A forwarding pointer is established from the current base station to the last base station.

7

System Model (1)

• Consider an MH in a client-server distributed application in a mobile wireless environment.
• The MH checkpoints periodically.
• The write-events between two consecutive checkpoints are recorded by creating log entries.
• Assume the checkpoint and log information is kept at the base stations.
• Two mobility handoff strategies are considered:
  • Eager strategy
  • Lazy strategy

8

System Model (2)

Eager Strategy
• When the MH fails, the persistent information (the last checkpoint and the message logs recorded afterward) can be found at the current base station.

Lazy Strategy
• When the MH moves from cell n to cell 1, a linked list of length n-1 is formed.
• If the MH fails, n cells are involved in failure recovery, and the persistent information is scattered among the base stations on the forwarding chain.
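The difference between the two handoff strategies can be sketched as follows. The names `BaseStation`, `eager_handoff`, `lazy_handoff` and `collect_for_recovery` are hypothetical, and the sketch assumes that no new checkpoint is taken while the MH moves, so the forwarding chain is never reset.

```python
class BaseStation:
    """Hypothetical base station holding checkpoint and log data for one MH."""

    def __init__(self, name):
        self.name = name
        self.checkpoint = None  # last checkpoint, if stored here
        self.log = []           # log entries recorded while the MH was here
        self.prev = None        # forwarding pointer to the previous station (lazy)


def eager_handoff(old, new):
    # Eager: move the checkpoint and the log to the new base station.
    new.checkpoint, new.log = old.checkpoint, old.log
    old.checkpoint, old.log = None, []


def lazy_handoff(old, new):
    # Lazy: leave the data in place and only set up a forwarding pointer.
    new.prev = old


def collect_for_recovery(current):
    """Follow the forwarding chain and gather the persistent information.

    Returns (checkpoint, log entries in order, number of stations involved).
    """
    chain = []
    station = current
    while station is not None:
        chain.append(station)
        station = station.prev
    chain.reverse()  # oldest station first
    checkpoint = next((s.checkpoint for s in chain if s.checkpoint is not None), None)
    logs = [entry for s in chain for entry in s.log]
    return checkpoint, logs, len(chain)


# Lazy: the data stays scattered along bs1 -> bs2 -> bs3.
bs1, bs2, bs3 = BaseStation("bs1"), BaseStation("bs2"), BaseStation("bs3")
bs1.checkpoint, bs1.log = "ckpt", ["w1"]
lazy_handoff(bs1, bs2)
bs2.log.append("w2")
lazy_handoff(bs2, bs3)
bs3.log.append("w3")
lazy_result = collect_for_recovery(bs3)

# Eager: everything follows the MH, so only the current station is involved.
e1, e2, e3 = BaseStation("e1"), BaseStation("e2"), BaseStation("e3")
e1.checkpoint, e1.log = "ckpt", ["w1"]
eager_handoff(e1, e2)
e2.log.append("w2")
eager_handoff(e2, e3)
e3.log.append("w3")
eager_result = collect_for_recovery(e3)
```

Both strategies recover the same checkpoint and log entries; they differ in how many base stations must be contacted at recovery time and in how much data is moved at each handoff.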

9

Analysis of Failure Recovery Probability

Notations

10


When a checkpoint operation is performed, all log entries before the checkpoint are purged and the number of log entries N(t) is reset to 0.

The MH will only re-execute those log entries accumulated past the last checkpoint.

Eager Mobility Handoff Strategy:

11

TR

Cdf of the Recovery Time:

Analysis of Failure Recovery Probability – Eager Strategy

12

Let t’ = t - M·Tc


13

Consider the special case:
• Log arrivals: Poisson process with arrival rate λ
• MH failure time: exponentially distributed with failure rate δ
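To make the special case concrete, the following sketch evaluates an eager-strategy recovery probability numerically. It is not the closed-form expression from the slides, which is not reproduced here. It rests on several assumptions: M is read as the number of checkpoints completed before the failure, so t’ is the time since the last checkpoint; with an exponential failure time, t’ then has density δe^(-δx)/(1 - e^(-δTc)) on [0, Tc); and the recovery time is modelled as `t_ckpt + N(t’) * t_log`, where `t_ckpt` (time to load the checkpoint) and `t_log` (time to transfer and re-execute one log entry) are hypothetical parameters.

```python
import math


def poisson_cdf(n, mean):
    """P(N <= n) for N ~ Poisson(mean), summed term by term."""
    if n < 0:
        return 0.0
    term = math.exp(-mean)  # P(N = 0)
    total = term
    for k in range(1, n + 1):
        term *= mean / k    # P(N = k) from P(N = k - 1)
        total += term
    return min(total, 1.0)


def eager_recovery_probability(T, lam, delta, Tc, t_ckpt, t_log, steps=2000):
    """P(recovery time <= T) under the assumed model T_R = t_ckpt + N(t') * t_log."""
    if T < t_ckpt:
        return 0.0
    n_max = int((T - t_ckpt) // t_log)       # most log entries replayable within T
    norm = 1.0 - math.exp(-delta * Tc)       # normalises the density of t' on [0, Tc)
    h = Tc / steps
    prob = 0.0
    for i in range(steps):                   # midpoint rule over t' in [0, Tc)
        x = (i + 0.5) * h
        density = delta * math.exp(-delta * x) / norm
        prob += density * poisson_cdf(n_max, lam * x) * h
    return prob


# t_ckpt and t_log below are illustrative values, not taken from the slides.
p_low_failure_rate = eager_recovery_probability(
    T=0.24, lam=0.1, delta=0.0001, Tc=1000, t_ckpt=0.04, t_log=0.004)
p_high_failure_rate = eager_recovery_probability(
    T=0.24, lam=0.1, delta=0.01, Tc=1000, t_ckpt=0.04, t_log=0.004)
```

Under this model a higher failure rate concentrates t’ near 0, so fewer log entries have accumulated when the failure occurs and the recovery deadline is more likely to be met.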


14


15

Lazy Mobility Handoff Strategy

Suppose that, before the failure occurs, the number of cells visited by the MH since the last checkpoint is k. The recovery process:
• Transfer the last checkpoint and the log entries distributed among the k base stations to the current base station via the wired network.
• Transfer the last checkpoint with the log entries from the current base station to the MH via the wireless network.
• Re-execute the log entries.


16

Cdf of the Recovery Time:

Random variable representing the number of base stations crossed by the MH past the last checkpoint, given that the failure time is t

Analysis of Failure Recovery Probability – Lazy Strategy

17

Let t’ = t - M·Tc

TR

r – transmission ratio between wired and wireless communication.


18

Consider the special case:
• Log arrivals: Poisson process with arrival rate λ
• MH failure time: exponentially distributed with failure rate δ
• Residence time: exponentially distributed with rate σ
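The distributions implied by this special case can be written out as below. This is a sketch under stated assumptions, not the closed-form result from the slides: t’ is read as the time since the last checkpoint, K(t’) denotes the number of base stations crossed in that time, and log arrivals, cell crossings and the failure time are assumed mutually independent. The conditional term Pr{T_R ≤ T | n, k} depends on the transfer and re-execution times, which are left unspecified here.

```latex
\begin{align*}
f_{t'}(x) &= \frac{\delta e^{-\delta x}}{1 - e^{-\delta T_c}}, \qquad 0 \le x < T_c \\
\Pr\{N(t') = n \mid t' = x\} &= \frac{e^{-\lambda x} (\lambda x)^n}{n!} \\
\Pr\{K(t') = k \mid t' = x\} &= \frac{e^{-\sigma x} (\sigma x)^k}{k!} \\
\Pr\{T_R \le T\} &= \int_0^{T_c} \sum_{n=0}^{\infty} \sum_{k=0}^{\infty}
  \Pr\{T_R \le T \mid n, k\}\,
  \frac{e^{-\lambda x} (\lambda x)^n}{n!}\,
  \frac{e^{-\sigma x} (\sigma x)^k}{k!}\,
  f_{t'}(x)\, dx
\end{align*}
```

The first line follows from summing the exponential density over all checkpoint periods; the second and third use the fact that exponential inter-event times produce Poisson counts over an interval of length x.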


19


20

Numerical Results

Effects of various parameters on failure recoverability, for eager handoff + logging and lazy handoff + logging:
• Log arrival rate λ
• MH failure rate δ
• Checkpoint interval Tc
• Transmission ratio between wired and wireless communication r
• MH mobility rate σ

21

Numerical Results (cont’)

Checkpoint interval 1000s

Log arrival rate 0.1

Mobility rate 0.01

Failure rate 0.0001

Transmission ratio 0.1

• Failure recovery probability under the eager strategy is always better than that under the lazy strategy.

• Given enough recovery time (T > 0.3 in this case), the failure recoverability offered by the less costly lazy strategy is just as good as that of the more costly eager strategy.

Failure recovery probability for different recovery times

22



Recovery time 0.24s

Mobility rate 0.01, 0.001

Failure rate 0.0001


• The system recovery probability decreases dramatically as the log arrival rate increases.

• The difference in recovery probability between the two handoff strategies is small when the log arrival rate is low.

• The effect of mobility rate on the recovery probability is marginal.

Effect of log arrival rate and mobility rate

23



Recovery time 0.24s

Mobility rate 0.01

Failure rate 0.0001


• The system recovery probability decreases dramatically as the checkpoint interval increases.

• The difference in recovery probability at a particular recovery time between the Eager and Lazy strategies becomes more significant as the checkpoint interval increases.

Effect of the checkpoint interval

24



Recovery time 0.24s

Mobility rate 0.01



• The higher the failure rate the higher the recovery probability.

• As the failure rate increases the difference between the Eager and Lazy mobility handoff strategies becomes less significant.

Effect of failure rate

25

Failure Recoverability vs. Cost Tradeoff Analysis

Tradeoff: the Eager strategy spends less time on failure recovery, but much more time maintaining the checkpoint and logs than the Lazy strategy.

Objective: identify a condition under which the recovery probability gained by Eager is most effective considering the cost invested for maintenance.

26

Failure Recoverability vs. Cost Tradeoff Analysis

Failure Recoverability versus Cost Ratio: the slope of the recovery probability gained versus the cost invested.

Cost invested by Eager

Number of checkpoints before failure:

Number of moves crossing boundary during a checkpoint period:

Number of log entries between two consecutive moves:
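The three quantities listed above can be estimated under the special-case assumptions used earlier (Poisson log arrivals with rate λ, exponential failure time with rate δ, exponential residence time with rate σ). The sketch below computes their expected values. The function name is hypothetical, and these expressions follow from the stated assumptions; they are not necessarily the exact forms shown on the slide.

```python
import math


def eager_cost_quantities(lam, sigma, delta, Tc):
    """Expected values of the quantities that drive the eager maintenance cost.

    Assumes exponential failure time (rate delta), Poisson cell crossings
    (rate sigma) and Poisson log arrivals (rate lam).
    """
    # E[floor(T_fail / Tc)] = sum over m >= 1 of exp(-delta * m * Tc)
    #                       = 1 / (exp(delta * Tc) - 1)
    checkpoints_before_failure = 1.0 / math.expm1(delta * Tc)
    # Expected boundary crossings during one checkpoint period.
    moves_per_checkpoint_period = sigma * Tc
    # Expected log entries accumulated between two consecutive moves.
    logs_between_moves = lam / sigma
    return checkpoints_before_failure, moves_per_checkpoint_period, logs_between_moves


# Parameter values from the numerical results: Tc = 1000 s, lambda = 0.1,
# sigma = 0.01, delta = 0.0001.
checkpoints, moves, logs_per_move = eager_cost_quantities(
    lam=0.1, sigma=0.01, delta=0.0001, Tc=1000)
```

With these parameter values the MH completes roughly nine to ten checkpoints before failing, crosses about ten cell boundaries per checkpoint period, and accumulates about ten log entries between consecutive moves.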

27

Failure Recoverability vs. Cost Tradeoff Analysis

Cost invested by Lazy

Cp: communication cost for setting up the link.

• There exists a best checkpoint interval under which the Eager strategy is most cost-effective over the Lazy strategy.

• The best cost-effective checkpoint interval for the eager strategy increases as the recovery time increases.

28

Conclusion and Future Research

Conclusion
• Closed-form expressions for the failure recovery time distribution under both the Eager and Lazy handoff strategies.
• Extensive numerical analysis of the effect of model parameters such as log arrival rate, mobility rate, failure rate, and checkpoint interval.
• Analysis of the tradeoff between the cost required to maintain the checkpoints and logs and the recovery cost.

Future research
• Using a more sophisticated probabilistic model (SPN)
• Analyzing other checkpoint strategies

29

Thank you
