Checkpointing-based Rollback Recovery for Parallel Applications on the InteGrade Grid Middleware. Raphael Y. de Camargo, Andrei Goldchleger, Fabio Kon, Alfredo Goldman. Department of Computer Science, University of São Paulo, Brazil. Middleware 2004 – Toronto, Canada.
Checkpointing-based Rollback Recovery for Parallel Applications on the InteGrade Grid Middleware
Raphael Y. de Camargo, Andrei Goldchleger, Fabio Kon, Alfredo Goldman
Department of Computer Science, University of São Paulo, Brazil
Middleware 2004 – Toronto, Canada
2nd International Workshop on Grid Computing
2
Summary
- Introduction
- InteGrade Grid middleware
- BSP Computing Model
- Checkpointing-based Rollback Recovery
- Checkpointing Infrastructure
- Preliminary Experiments
- Conclusions
3
Introduction
Grid Computing:
- Grid computing allows the leveraging and integration of computing resources distributed across LANs and WANs.
- Besides dedicated computing resources, it is also possible to use idle computing power from commodity workstations (opportunistic computing).
Challenges:
- The environment is composed of shared user workstations spread across many different LANs.
- Machines may fail, become inaccessible, or switch from idle to busy very rapidly.
- A fault-tolerance mechanism is therefore a major requirement for such a system.
4
InteGrade Grid Middleware
Objectives:
- Use the idle computing power of commodity workstations (opportunistic computing).
- Allow organizations to increase their available computing power without buying extra hardware.
- Preserve the quality of service for machine owners who share their computing resources.
Implementation Status:
- Basic architecture already implemented.
- Uses CORBA distributed object technology for communication.
- Provides support for the execution of sequential, BSP, and bag-of-tasks applications.
5
InterCluster InteGrade Architecture
- GRM (Global Resource Manager): Manages the grid resources and schedules applications for execution.
- ASCT: Allows the submission and control of applications on the Grid.
- LRM (Local Resource Manager): Manages a node's resources.
- Runtime Libraries: Provide support for running parallel applications.
6
BSP Parallel Computing Model
Computation is performed using a sequence of parallel supersteps
Each superstep is composed of computation and communication, with a synchronization barrier at the end
Data communicated during a superstep becomes available to the other processes only in the next superstep
Two communication mechanisms:
- Direct Remote Memory Access (DRMA)
- Bulk Synchronous Message Passing (BSMP)
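The deferred-visibility rule of a superstep can be illustrated with a small single-process simulation. This is only a sketch and does not use a real BSP library: NPROCS, sim_put, sim_sync, and sim_shift_right are hypothetical names introduced here, and the "processes" are just array slots.

```c
#include <assert.h>
#include <string.h>

#define NPROCS 4            /* hypothetical number of simulated processes */

static int visible[NPROCS]; /* values each process can read in this superstep */
static int pending[NPROCS]; /* writes issued in this superstep, not yet visible */

/* Simulated DRMA put: buffer a write into the memory of process dest. */
void sim_put(int dest, int value) {
    assert(dest >= 0 && dest < NPROCS);
    pending[dest] = value;
}

/* Simulated barrier: end the superstep and deliver all buffered writes. */
void sim_sync(void) {
    memcpy(visible, pending, sizeof visible);
}

/* Every process p sends p + 1 to its right neighbour, then all synchronize.
   Returns what process 0 sees before the barrier; *after receives what it
   sees once the barrier has completed. */
int sim_shift_right(int *after) {
    memset(visible, 0, sizeof visible);
    memset(pending, 0, sizeof pending);
    for (int p = 0; p < NPROCS; p++)
        sim_put((p + 1) % NPROCS, p + 1);
    int before = visible[0];   /* the put addressed to process 0 is not visible yet */
    sim_sync();
    *after = visible[0];       /* value sent by process NPROCS - 1 */
    return before;
}
```

Calling sim_shift_right shows process 0 reading its old value before the barrier and the neighbour's value only afterwards, which is the property that later makes the barrier a natural checkpointing point.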
7
Checkpointing-based Rollback Recovery
Checkpointing: Consists of periodically saving the application state into a checkpoint, so that the state can later be recovered from it
Checkpointing-based Rollback Recovery: The process of restarting an application from an intermediate execution point after a failure is detected
Two approaches to checkpointing:
- System-level: The memory space and processor registers of the application are saved into the checkpoint.
- Application-level: The application is responsible for providing the data to be saved and for reconstructing its state from the checkpoint.
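A minimal sketch of the application-level approach, assuming a toy computation (summing 1..total) whose whole state fits in one struct. The names AppState, save_checkpoint, load_checkpoint, and run_sum are hypothetical and are not part of InteGrade; the binary file format is architecture dependent.

```c
#include <assert.h>
#include <stdio.h>

/* State the application itself chooses to save (hypothetical example). */
typedef struct {
    int  next_iter;  /* first iteration that still has to run */
    long sum;        /* partial result so far */
} AppState;

/* Write the state to path; returns 0 on success, -1 on error. */
int save_checkpoint(const char *path, const AppState *s) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t n = fwrite(s, sizeof *s, 1, f);
    return (fclose(f) == 0 && n == 1) ? 0 : -1;
}

/* Read the state back; returns 0 on success, -1 if no usable checkpoint exists. */
int load_checkpoint(const char *path, AppState *s) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(s, sizeof *s, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}

/* Sum 1..total, checkpointing after every period iterations.
   A failure is imitated by returning -1 when iteration stop_at is reached
   (pass 0 for no failure). On start, the last checkpoint is used if present. */
long run_sum(const char *path, int total, int period, int stop_at) {
    assert(period > 0);
    AppState s = { 1, 0 }, saved;
    if (load_checkpoint(path, &saved) == 0)
        s = saved;                       /* roll back to the last checkpoint */
    while (s.next_iter <= total) {
        if (s.next_iter == stop_at) return -1;   /* simulated crash */
        s.sum += s.next_iter;
        s.next_iter++;
        if ((s.next_iter - 1) % period == 0)
            save_checkpoint(path, &s);
    }
    return s.sum;
}
```

Running run_sum once with a simulated failure and then again without one resumes from the saved iteration instead of starting over; only the two fields the application declared relevant are written.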
8
Application-level checkpointing
Advantages:
- Semantic information about the data being saved is available, which makes it possible to generate portable checkpoints.
- Only the data needed to recover the application state has to be saved.
The application is responsible for:
- Specifying which data needs to be saved.
- Recovering its state from a previous checkpoint.
Disadvantages:
- The source code must be instrumented with checkpointing code.
- Access to the application source code is required.
- Forced checkpoints cannot be generated.
9
Checkpointing of Parallel Applications
- For parallel applications, we must consider the dependencies among application processes created by message exchanges.
- Global checkpoint: a collection containing a checkpoint from every application process. In the diagram, the global checkpoint s1 is inconsistent while global checkpoint s2 is consistent.
- BSP applications: consistency can be guaranteed by generating the checkpoints right after the synchronization phases.
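Consistency can be made concrete with the usual definition: a global checkpoint is inconsistent if some message is recorded as received by one process but not as sent by another (an orphan message). The sketch below checks that condition under an assumed model in which each process numbers its own events; Message and is_consistent are hypothetical names, not part of the InteGrade libraries.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int sender, send_time;     /* sender's local event index when sending */
    int receiver, recv_time;   /* receiver's local event index when receiving */
} Message;

/* ckp[p] is the local event index at which process p took its checkpoint.
   Returns 0 if some message was received before the receiver's checkpoint
   but sent after the sender's checkpoint (an orphan message), 1 otherwise. */
int is_consistent(const int *ckp, const Message *msgs, size_t nmsgs) {
    assert(ckp != NULL);
    for (size_t i = 0; i < nmsgs; i++) {
        const Message *m = &msgs[i];
        int sent_after_ckp   = m->send_time > ckp[m->sender];
        int recvd_before_ckp = m->recv_time <= ckp[m->receiver];
        if (sent_after_ckp && recvd_before_ckp)
            return 0;
    }
    return 1;
}
```

In a BSP application, every message of a superstep has been both sent and delivered once the barrier completes, so checkpoints taken by all processes right after the same barrier cannot contain an orphan message.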
10
Checkpointing Infrastructure
- Pre-Compiler: Instruments the source code of a C/C++ application with checkpointing code.
- Runtime libraries: Allow saving the application state into a checkpoint and recovering the data from a previous checkpoint.
- ExecutionMonitor: Keeps information about the applications running on the grid, allowing these applications to be restarted in case of failure.
11
PreCompiler
- Based on OpenC++, which lets us use compile-time reflection to instrument the application source code with checkpointing code.
- Needs to modify the application code in order to save the following data:
  - Execution stack: contains runtime data from the functions that are active at a particular moment during application execution.
  - Position counter: the current position in the program.
  - Heap: contains memory chunks allocated by commands such as malloc and new.
  - Global variables.
12
Saving and Recovering the Execution Stack State
- Execution stack state: not directly accessible from application code.
- Saving the execution stack state: save a list of the currently active functions and the values of their local variables.
- Recovering the execution stack state: call the functions in the saved list, declare their local variables, and recover the values from the checkpoint. The remaining code is skipped.
- Position counter: process state is saved only at certain points in the source code, each marked by a call to a function such as checkpoint_candidate().
[Figure: execution stack, in which each frame holds local variables, control information, and function parameters]
13
Saving Local Vars and Pointers
Local Variables:
- An auxiliary stack keeps the addresses of the local variables that are currently in scope.
- Local variable addresses are pushed onto the stack just after their declaration and removed when the variables leave scope.
- During checkpoint generation, the values stored at these addresses are saved in the checkpoint.
Pointers:
- For pointers, it is first necessary to dereference the pointer.
- When saving pointers with multiple levels of indirection, it is necessary to follow the pointer graph structure.
- Special care is needed with graphs that contain cycles and when multiple pointers reference the same memory chunk.
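The auxiliary stack of local variable addresses could look roughly like the sketch below. The names ckp_push_data and ckp_npop_data are taken from the instrumented example in these slides, but their bodies here are assumptions; CKP_MAX_VARS, CkpEntry, ckp_save_stack, and ckp_restore_stack are hypothetical. The sketch copies plain values only and does not follow pointers.

```c
#include <assert.h>
#include <string.h>

#define CKP_MAX_VARS 64   /* hypothetical fixed capacity */

typedef struct { void *addr; size_t size; } CkpEntry;

static CkpEntry ckp_stack[CKP_MAX_VARS];
static size_t   ckp_top = 0;

/* Register a local variable that has just come into scope. */
void ckp_push_data(void *addr, size_t size) {
    assert(ckp_top < CKP_MAX_VARS);
    ckp_stack[ckp_top].addr = addr;
    ckp_stack[ckp_top].size = size;
    ckp_top++;
}

/* Unregister the n most recently registered variables (they left scope). */
void ckp_npop_data(size_t n) {
    assert(n <= ckp_top);
    ckp_top -= n;
}

/* Copy the current values of all registered variables into buf.
   Returns the number of bytes written, or 0 if buf is too small. */
size_t ckp_save_stack(unsigned char *buf, size_t cap) {
    size_t used = 0;
    for (size_t i = 0; i < ckp_top; i++) {
        if (used + ckp_stack[i].size > cap) return 0;
        memcpy(buf + used, ckp_stack[i].addr, ckp_stack[i].size);
        used += ckp_stack[i].size;
    }
    return used;
}

/* Copy saved values back into the registered variables, in the same order. */
void ckp_restore_stack(const unsigned char *buf) {
    size_t used = 0;
    for (size_t i = 0; i < ckp_top; i++) {
        memcpy(ckp_stack[i].addr, buf + used, ckp_stack[i].size);
        used += ckp_stack[i].size;
    }
}
```

Because only addresses are stored, the values are read at checkpoint time rather than at registration time, so the instrumented function does not have to notify the library each time a variable changes.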
14
Saving the Heap Memory
HeapManager:
- Maintains a list of the currently allocated chunks of memory.
- Each entry includes the memory address, its size, and a flag that indicates whether the chunk has already been saved during checkpoint generation.
- The list is updated at memory allocation and deallocation calls such as malloc and free in C and new and delete in C++.
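A chunk list of this kind might be sketched as follows. Everything here is an assumption made for illustration: the wrapper names ckp_malloc and ckp_free, the helpers ckp_mark_saved, ckp_reset_saved, and ckp_chunk_count, and the choice of a singly linked list.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Chunk {
    void  *addr;
    size_t size;
    int    saved;          /* set once the chunk was written to the checkpoint */
    struct Chunk *next;
} Chunk;

static Chunk *chunks = NULL;

/* Allocation wrapper: allocate the memory and record the chunk. */
void *ckp_malloc(size_t size) {
    void  *p = malloc(size);
    Chunk *c = malloc(sizeof *c);
    if (!p || !c) { free(p); free(c); return NULL; }
    c->addr = p; c->size = size; c->saved = 0; c->next = chunks;
    chunks = c;
    return p;
}

/* Deallocation wrapper: drop the record, then release the memory. */
void ckp_free(void *p) {
    for (Chunk **link = &chunks; *link; link = &(*link)->next) {
        if ((*link)->addr == p) {
            Chunk *dead = *link;
            *link = dead->next;
            free(dead);
            break;
        }
    }
    free(p);
}

/* Called when a pointer to p is met while saving. Returns 1 if the chunk
   must be written now, 0 if it was already saved or is not managed. */
int ckp_mark_saved(void *p) {
    assert(p != NULL);
    for (Chunk *c = chunks; c; c = c->next)
        if (c->addr == p) {
            if (c->saved) return 0;
            c->saved = 1;
            return 1;
        }
    return 0;
}

/* Clear all flags so the next checkpoint starts fresh. */
void ckp_reset_saved(void) {
    for (Chunk *c = chunks; c; c = c->next) c->saved = 0;
}

/* Number of chunks currently tracked. */
size_t ckp_chunk_count(void) {
    size_t n = 0;
    for (Chunk *c = chunks; c; c = c->next) n++;
    return n;
}
```

The saved flag is what lets the checkpointing code cope with cycles and with several pointers to the same chunk: the second visit returns 0, so the chunk is written once and the traversal stops there.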
15
Classes, Structures and BSP Calls
- Structures: saved in the same way as local variables. The pointers present in the structure must be followed.
- Classes: use introspection to add methods for saving and restoring the class members.
- BSP: the standard bsp_begin and bsp_synch functions are replaced by functions from the checkpointing library. During reinitialization, calls to functions that modify the state of the BSP library (e.g., bsp_pushregister) must be re-executed.
16
Precompiler – Example of Instrumented Code
Modified Code:

    int function() {
        int lastFunctionCalled = -1;
        int localVar = 0;
        ckp_push_data(&lastFunctionCalled, sizeof(int));
        ckp_push_data(&localVar, sizeof(int));
        if (ckpRecovering == 1) {
            ckp_get_data(&lastFunctionCalled, sizeof(int));
            ckp_get_data(&localVar, sizeof(int));
            if (lastFunctionCalled == 0) goto ckp0;
        }
        // Do computations (...)
    ckp0:
        lastFunctionCalled = 0;
        functionA();
        // Do computations (...)
        ckp_npop_data(2);
        return localVar;
    }
17
Checkpointing Runtime Library
Checkpointing Library:
- Provides the functionality for maintaining a stack of local variables, managing heap state, and saving the data to a checkpoint.
- Provides a timer that applications can set to ensure a minimum time between checkpoints.
- Checkpoints are currently architecture dependent and are saved to a file in the file system.
BSP Ckp Library: provides specific functionality for checkpointing BSP applications:
- bsp_begin_ckp(): registers some addresses necessary for checkpointing coordination and initializes the timer.
- bsp_synch_ckp(): tests whether the timer has expired and, if so, signals the other processes to generate a new checkpoint.
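The timer test performed at each synchronization could be sketched as below. This is not the library's code: CkpTimer, ckp_timer_init, ckp_timer_expired, and count_checkpoints are hypothetical names, and the current time is passed in as a parameter (an assumption made so the logic can be exercised without a real clock).

```c
#include <assert.h>

typedef struct {
    double min_interval;   /* minimum seconds between two checkpoints */
    double last_ckp;       /* time at which the timer was last restarted */
} CkpTimer;

/* Start the timer, as a bsp_begin_ckp-like call might do. */
void ckp_timer_init(CkpTimer *t, double min_interval, double now) {
    assert(min_interval >= 0.0);
    t->min_interval = min_interval;
    t->last_ckp = now;
}

/* Called at the end of a superstep. Returns 1 and restarts the timer if at
   least min_interval seconds have passed since the last restart, else 0. */
int ckp_timer_expired(CkpTimer *t, double now) {
    if (now - t->last_ckp < t->min_interval) return 0;
    t->last_ckp = now;
    return 1;
}

/* Count how many checkpoints would be taken over nsteps supersteps
   that each last step seconds. */
int count_checkpoints(double min_interval, double step, int nsteps) {
    CkpTimer t;
    int n = 0;
    ckp_timer_init(&t, min_interval, 0.0);
    for (int i = 1; i <= nsteps; i++)
        n += ckp_timer_expired(&t, i * step);
    return n;
}
```

Because the test happens only at superstep boundaries, the actual spacing between checkpoints is the minimum interval rounded up to the next barrier, so long supersteps produce fewer checkpoints than the interval alone would suggest.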
18
Application Execution Monitoring and Reinitialization
Execution Monitor:
- Contains a list of the applications running on the nodes of its cluster.
- Reschedules new executions with the GRM for failed processes.
GRM:
- Detects when a node or LRM fails and notifies the Execution Monitor.
- Node failures are reported to the GRM.
LRM:
- Captures the exit status of running applications and sends it to the ExecutionMonitor.
- If a process was explicitly killed by the signals SIGTERM or SIGKILL, it is restarted.
BSP Applications:
- For BSP applications, all the processes in the application are reinitialized.
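On a POSIX system, capturing an exit status and telling a killed process from a normal exit can be done with waitpid and the status macros. The sketch below is an illustration under that assumption and is not the LRM implementation; ProcOutcome, classify_status, and run_child are hypothetical names.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef enum { PROC_FINISHED, PROC_KILLED, PROC_FAILED } ProcOutcome;

/* Classify a status value obtained from waitpid(). */
ProcOutcome classify_status(int status) {
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return PROC_FINISHED;
    if (WIFSIGNALED(status) &&
        (WTERMSIG(status) == SIGTERM || WTERMSIG(status) == SIGKILL))
        return PROC_KILLED;      /* candidate for restart from a checkpoint */
    return PROC_FAILED;
}

/* Fork a child that kills itself with sig (if sig != 0) or exits with code,
   then wait for it and classify how it ended. */
ProcOutcome run_child(int sig, int code) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        if (sig != 0) raise(sig);
        _exit(code);
    }
    int status = 0;
    pid_t r = waitpid(pid, &status, 0);
    assert(r == pid);
    return classify_status(status);
}
```

A monitor built along these lines would forward the classified outcome to the component that decides on restarts; in the scheme above that component is the Execution Monitor.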
19
Preliminary Experiments
Sequence similarity application:
- Compares two sequences of characters and finds the similarity between them using given criteria.
- Used in bioinformatics to compare sequences of DNA.
- It was parallelized using the BSP computing model.
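Experiments were performed on a cluster of ten 1.4 GHz machines connected by a 100 Mbps Fast Ethernet network.

tmin    nckp    ttotal    torig     ovh
600s    0       339.7s    339.9s    0%
60s     5       347.1s    339.9s    2.1%
10s     23      371.9s    339.9s    9.4%

tmin: minimum time between checkpoints; nckp: number of checkpoints generated; ttotal: execution time with checkpointing; torig: execution time of the original application; ovh: checkpointing overhead.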
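The overhead column appears to be the relative increase of the total time over the original time. The slides do not state the formula, so the expression below is an assumption, but it reproduces the reported percentages:

```latex
\[
\mathit{ovh} = \frac{t_{\mathrm{total}} - t_{\mathrm{orig}}}{t_{\mathrm{orig}}} \times 100\%
\]
\[
t_{\min} = 60\,\mathrm{s}:\ \frac{347.1 - 339.9}{339.9} \approx 2.1\%
\qquad
t_{\min} = 10\,\mathrm{s}:\ \frac{371.9 - 339.9}{339.9} \approx 9.4\%
\]
```

For the 600 s run no checkpoint was generated, and the same expression gives about -0.06%, which is within measurement noise and is reported as 0%.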
20
Conclusions
- We described a checkpointing-based rollback recovery mechanism for applications running on the InteGrade Grid middleware.
- This mechanism will allow better resource utilization in the Grid, since it will be possible to migrate processes between nodes.
- Preliminary experiments indicate that the checkpointing overhead can be low enough for use with long-running BSP parallel applications.
21
Ongoing Work
- Improve pre-compiler support for C++.
- Support for portable checkpoints:
  - Allows better resource utilization in heterogeneous environments.
- Robust storage system for checkpoints:
  - Data saved in a distributed way.
  - Some degree of replication to provide fault tolerance.
- Implement an efficient process migration mechanism on InteGrade:
  - Can be used for both fault tolerance and dynamic adaptation.
22
Questions?