SRDS’03 Performance and Effectiveness Analysis of Checkpointing in Mobile Environments Xinyu Chen and Michael R. Lyu The Chinese Univ. of Hong Kong Hong Kong Florence, Italy

CUHKSRDS’03

Outline

Introduction

A mobile environment – Wireless CORBA

Performance analysis with and without checkpointing

Analytical results and comparisons

Conclusions

Introduction

Checkpointing and Rollback Recovery

Checkpointing: save the program's states during failure-free execution

Repair: bring the failed device back to normal operation

Rollback: reload the program's states saved at the most recent checkpoint

Recovery: reprocess the program, starting from the most recent checkpoint, by applying the logged messages until the point just before the failure
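The four steps above can be illustrated with a minimal sketch. The class name, the toy state (a running count and sum), and the assumption that both the checkpoint and the message log survive a failure are illustrative choices, not details taken from the slides.

```python
import copy


class CheckpointedProgram:
    """Toy program whose state is a count and running total of messages.

    Assumption: the checkpoint and the message log are kept somewhere that
    survives a failure of the device (for example, off the mobile host).
    """

    def __init__(self):
        self.state = {"count": 0, "total": 0}
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # messages received since the most recent checkpoint

    def _apply(self, value):
        self.state["count"] += 1
        self.state["total"] += value

    def receive(self, value):
        self._log.append(value)  # log the message, then process it
        self._apply(value)

    def checkpoint(self):
        # Checkpointing: save the state during failure-free execution.
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def fail(self):
        # Failure: the volatile state is lost.
        self.state = None

    def rollback(self):
        # Rollback: reload the state saved at the most recent checkpoint.
        self.state = copy.deepcopy(self._checkpoint)

    def recover(self):
        # Recovery: reprocess the logged messages up to the failure point.
        for value in self._log:
            self._apply(value)
```

A run that receives 1 and 2, checkpoints, receives 3, fails, then calls `rollback()` and `recover()` ends with the same state as a failure-free run over the three messages.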

Wireless CORBA Architecture

[Figure: Wireless CORBA architecture. Labels: Visited Domain, Home Domain, Terminal Domain, Access Bridge (ab1, ab2), Static Host, Terminal Bridge, mobile host mh1, GIOP Tunnel, GTP messages (control message, computational message)]

GIOP: General Inter-ORB Protocol

GTP: GIOP Tunnel Protocol

Wireless CORBA Architecture

[Figure: Wireless CORBA architecture. Labels: Visited Domain, Home Domain, Home Location Agent, Terminal Domain, Access Bridge (ab1, ab2), Static Host, Terminal Bridge, mobile host mh1, GIOP Tunnel]

Handoff: a mechanism for a mobile host to seamlessly change a connection from one Access Bridge to another

Program’s Termination Condition

GTP messages

Control message

Computational message: the number is not changed

A program on a mobile host terminates successfully if it continuously receives n computational messages

Formulate the expected program execution time as a function of the message number n

State Transition without Checkpointing

State 0 – normal, State 1 – repair, State 2 – handoff

[Figure: state transition diagram over states 0, 1 and 2]

Generally distributed random variables

H: handoff time

R: repair time
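The three-state model can be explored with a small Monte Carlo sketch. Everything beyond the three states is an assumption made for illustration: message arrivals, failures and handoffs are taken as Poisson processes with hypothetical rates `lam`, `gamma` and `mu`; a failure discards all progress because no checkpoint exists; a handoff only pauses the program; and no failure or handoff occurs during repair or handoff.

```python
import random


def simulate_once(n, lam, gamma, mu, repair_time, handoff_time, rng):
    """Sample the time until n messages arrive with no failure in between.

    repair_time and handoff_time are callables taking the random generator,
    so R and H may follow any distribution.
    """
    t = 0.0
    received = 0
    total_rate = lam + gamma + mu
    while received < n:
        t += rng.expovariate(total_rate)  # time in State 0 until next event
        u = rng.random() * total_rate     # pick which event occurred
        if u < lam:
            received += 1                 # computational message received
        elif u < lam + gamma:
            t += repair_time(rng)         # State 1: repair
            received = 0                  # no checkpoint: start over
        else:
            t += handoff_time(rng)        # State 2: handoff, progress kept
    return t


def mean_execution_time(n, lam, gamma, mu, repair_time, handoff_time,
                        runs=4000, seed=1):
    """Average simulate_once over many runs with a fixed seed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        total += simulate_once(n, lam, gamma, mu,
                               repair_time, handoff_time, rng)
    return total / runs
```

With both `gamma` and `mu` set to zero the estimate should settle near n / lam, which gives a quick sanity check on the sketch.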

Expected Program Execution Time

Expected repair time

Expected program execution time without checkpointing

Laplace transform for cumulative distribution function

Equi-number Checkpointing

Take checkpoints according to the number of received messages (a)

Divide the program execution into m equal intervals (m = n/a)

Equi-number checkpointing with respect to message number: the message number in each checkpointing interval is held fixed

Equi-number checkpointing with respect to checkpoint number: the checkpoint number is held fixed
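The relation m = n/a can be made concrete with a short sketch. The function names are hypothetical, the sketch assumes that a (or m) divides n exactly, and it assumes a checkpoint is taken at the end of every interval, including the last one.

```python
def intervals_by_message_number(n, a):
    """Fixed message number a per interval: return the interval count m."""
    if a <= 0 or n % a != 0:
        raise ValueError("this sketch assumes a is positive and divides n")
    return n // a


def intervals_by_checkpoint_number(n, m):
    """Fixed checkpoint number m: return the message number a per interval."""
    if m <= 0 or n % m != 0:
        raise ValueError("this sketch assumes m is positive and divides n")
    return n // m


def checkpoint_positions(n, a):
    """Message counts after which a checkpoint is taken (assumed: every a)."""
    return list(range(a, n + 1, a))
```

For n = 100, fixing a = 20 gives m = 5 intervals, while fixing m = 4 gives a = 25 messages per interval.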

State Transition in Equi-number Checkpointing

State 3 – composite repair, State 4 – composite checkpointing

[Figure: state transition diagram; surviving labels: 0, 2, 3, 4/a]

A generally distributed random variable

C: checkpointing time

Composite States

State 3 – composite repair

State 5 – repair, State 6 – rollback, State 7 – handoff

State 4 – composite checkpointing

State 8 – checkpointing, State 9 – handoff

[Figure: expansion of composite State 3 into states 5, 6 and 7, and of composite State 4 into states 8 and 9]

Expected Program Execution Time

Expected sojourn time in State 3

Expected program execution time with equi-number checkpointing

= m
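A simulation sketch can stand in for the expected execution time with equi-number checkpointing. It is a simplification, not the analytical model of the slides: arrivals, failures and handoffs are assumed Poisson with hypothetical rates `lam`, `gamma` and `mu`; repair, rollback, handoff and checkpointing take fixed times; a failure returns the program to the start of the current interval and the lost messages are received again at the normal rate (log replay is not modelled); and no handoff or failure occurs inside the composite states.

```python
import random


def simulate_equi_number(n, a, lam, gamma, mu,
                         repair, rollback, handoff, checkpoint, rng):
    """Sample one execution time with a checkpoint after every a messages."""
    if a <= 0 or n % a != 0:
        raise ValueError("this sketch assumes a is positive and divides n")
    t = 0.0
    total_rate = lam + gamma + mu
    for _ in range(n // a):                   # m = n / a intervals
        received = 0
        while received < a:
            t += rng.expovariate(total_rate)  # time in State 0
            u = rng.random() * total_rate
            if u < lam:
                received += 1                 # message received
            elif u < lam + gamma:
                t += repair + rollback        # simplified composite repair
                received = 0                  # back to the last checkpoint
            else:
                t += handoff                  # handoff, progress kept
        t += checkpoint                       # simplified composite checkpointing
    return t


def mean_time(n, a, lam, gamma, mu, repair, rollback, handoff, checkpoint,
              runs=2000, seed=7):
    """Average simulate_equi_number over many runs with a fixed seed."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        total += simulate_equi_number(n, a, lam, gamma, mu, repair,
                                      rollback, handoff, checkpoint, rng)
    return total / runs
```

Setting a = n reduces the sketch to a single interval with one final checkpoint, which allows a direct comparison with intermediate checkpoints under the same parameters.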

Average Effectiveness

Effective interval: a program produces useful work towards its completion

Wasted interval

Repair and rollback

Handoff

Checkpoint creation

Wasted computation

Average effectiveness: how much of the time a mobile host (MH) is in an effective interval during an execution
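Read as a ratio, average effectiveness is the effective time divided by the total execution time. The sketch below assumes that reading and uses hypothetical parameter names for the four kinds of wasted interval listed above.

```python
def average_effectiveness(effective, repair_rollback, handoff,
                          checkpointing, wasted_computation):
    """Fraction of the execution time spent in effective intervals.

    All arguments are durations in the same time unit.
    """
    wasted = repair_rollback + handoff + checkpointing + wasted_computation
    total = effective + wasted
    if total <= 0:
        raise ValueError("total execution time must be positive")
    return effective / total
```

For example, 80 time units of useful work alongside 5 of repair and rollback, 5 of handoff, 4 of checkpoint creation and 6 of wasted computation gives an effectiveness of 0.8.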

Optimal Checkpointing Interval

Minimize the expected program execution time or maximize the average effectiveness
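The minimization can be sketched as a search over candidate intervals. The cost model here is a stand-in, not the expression derived in the talk: it assumes exponential message arrivals (rate `lam`) and failures (rate `gamma`), ignores handoff, charges a fixed `recovery_cost` per failure and a fixed `checkpoint_cost` per interval, and restricts a to divisors of n. Under those assumptions the expected time to receive a messages with no failure in between is (1/gamma + recovery_cost) * ((1 + gamma/lam)^a - 1).

```python
def expected_time(n, a, lam, gamma, recovery_cost, checkpoint_cost):
    """Expected execution time under the simplified exponential model."""
    m = n // a  # number of checkpointing intervals
    per_interval = ((1.0 / gamma + recovery_cost)
                    * ((1.0 + gamma / lam) ** a - 1.0)
                    + checkpoint_cost)
    return m * per_interval


def optimal_interval(n, lam, gamma, recovery_cost, checkpoint_cost):
    """Return the divisor a of n that minimizes expected_time."""
    candidates = [a for a in range(1, n + 1) if n % a == 0]
    return min(candidates,
               key=lambda a: expected_time(n, a, lam, gamma,
                                           recovery_cost, checkpoint_cost))
```

With n = 100, lam = 1.0, gamma = 0.01, recovery_cost = 5.0 and checkpoint_cost = 2.0, very short intervals pay too much checkpointing overhead and very long ones lose too much work per failure, so the minimum falls at an intermediate value of a.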

Beneficial Condition

Checkpointing improves the performance
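One way to read the beneficial condition is as a comparison of the two expected execution times. The sketch below uses a simplified exponential model as an assumption (arrival rate `lam`, failure rate `gamma`, no handoff, fixed repair, rollback and checkpoint costs); it does not reproduce the condition derived in the talk.

```python
def time_without_checkpointing(n, lam, gamma, repair_cost):
    """Expected time to receive n messages with no failure in between."""
    return (1.0 / gamma + repair_cost) * ((1.0 + gamma / lam) ** n - 1.0)


def time_with_checkpointing(n, a, lam, gamma,
                            repair_cost, rollback_cost, checkpoint_cost):
    """Expected time with a checkpoint after every a messages (a divides n)."""
    per_interval = ((1.0 / gamma + repair_cost + rollback_cost)
                    * ((1.0 + gamma / lam) ** a - 1.0)
                    + checkpoint_cost)
    return (n // a) * per_interval


def is_beneficial(n, a, lam, gamma,
                  repair_cost, rollback_cost, checkpoint_cost):
    """True when checkpointing shortens the expected execution time."""
    with_cp = time_with_checkpointing(n, a, lam, gamma, repair_cost,
                                      rollback_cost, checkpoint_cost)
    return with_cp < time_without_checkpointing(n, lam, gamma, repair_cost)
```

Under this model checkpointing pays off when failures are frequent enough that the work saved per failure outweighs the checkpointing overhead; with a very small failure rate the overhead dominates and the function returns False.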

Analytical Results and Comparisons (1)

Equi-number checkpointing with respect to checkpoint number

Analytical Results and Comparisons (2)

Checkpointing vs. without checkpointing

Analytical Results and Comparisons (3)

Average effectiveness vs. message arrival rate and handoff rate

Conclusions

Introduce equi-number checkpointing strategies

Derive expectations of program execution time with and without checkpointing

Obtain average effectiveness

Identify the optimal checkpointing interval

Identify the beneficial condition

Obtain analytical results and comparisons