Middleware for Active Reduction Operations in Distributed Systems
By
Nitin Bahadur
Gokul Nadathur
Department of Computer Sciences
University of Wisconsin-Madison
Spring 2000
Multicast / Reduction Trees, Spring 2000
Talk Outline
• Motivation and Goals
• General Architecture of the middleware
• Components of the middleware
• Providing reliability - handling of node failures
• Applications developed using the middleware
• Performance
• Conclusions and possible extensions
Motivation and Goals
• A middleware for applications that follow the master-worker paradigm
• Scalable framework for communication and for computing client responses (“reduction”)
• Unicast does not scale, so use multicast
• Introducing reduction operations dynamically in clients
• A general framework for communication among clients
The Big Picture...
[Figure: a Master App and several Client Apps, each running on top of ARTL, connected in a tree.]
• Master: sends queries, reduces results, hands back results to the application
• Intermediate client: executes responses to queries, forwards queries downstream, reduces incoming results, sends reduced results to master
• Leaf client: executes responses to queries, sends back results towards master
ART - Library Architecture
[Figure: ARTL layers and message flow. Layers: Network, ARTL Communication Layer, Event Handler, Application API, Application. Labels: framework for processing messages, incoming packet, ARTL-specific message, application-specific callbacks, reduction functions, outgoing message.]
ARTL messages:
1. Query from master
2. Response from downstream nodes
Communication Subsystem
• Connection setup
  – Connect nodes as a binomial tree
• Send and receive ARTL and application messages
• Detect node failure and act accordingly
• Integrate a restarted node into the current tree structure
Why use Binomial Tree
[Figure: a Master App and three Client Apps connected in two ways; edge labels give the time step at which each link carries the query.]
• Binomial tree: query propagation time = 2
• Unicast mechanism: query propagation time = 3
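The comparison above can be reproduced with a small counting sketch. It assumes one send per node per time step, which is how the edge labels in the figure read; the function names are illustrative and not part of ARTL.

```python
def unicast_rounds(num_clients: int) -> int:
    """The master sends to each client in turn: one send per time step."""
    return num_clients


def binomial_rounds(num_clients: int) -> int:
    """Every node that already holds the query forwards it in each step,
    so the number of informed nodes doubles per step."""
    informed = 1  # the master
    rounds = 0
    while informed < num_clients + 1:
        informed *= 2
        rounds += 1
    return rounds


for clients in (3, 7, 15):
    print(clients, unicast_rounds(clients), binomial_rounds(clients))
```

For the three clients on the slide this gives 3 steps for unicast and 2 for the binomial tree; the gap widens as the number of clients grows.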
Reduction
[Figure: tree of nodes 1 to 8 rooted at node 1; responses travel up the tree, with reduction at nodes 5 and 3.]
Example reduction operations: Min(), Max()
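A minimal sketch of reduction along the tree, using Min and Max as on the slide. The child lists are an assumption: they follow the standard binomial layout, which matches the node rows in the figure. The response values are made up for illustration.

```python
# Assumed child lists for the eight-node tree (standard binomial layout).
CHILDREN = {1: [5, 3, 2], 5: [7, 6], 7: [8], 3: [4],
            2: [], 4: [], 6: [], 8: []}


def reduce_up(node, local_value, op, children=CHILDREN):
    """Combine a node's own response with the reduced responses of its
    children, so each node sends a single value to its parent."""
    result = local_value[node]
    for child in children[node]:
        result = op(result, reduce_up(child, local_value, op, children))
    return result


# Illustrative per-node responses (not measured data).
responses = {1: 0.4, 2: 1.3, 3: 0.2, 4: 0.9, 5: 0.7, 6: 2.1, 7: 0.1, 8: 0.5}

print(reduce_up(1, responses, min))  # minimum over the whole tree
print(reduce_up(1, responses, max))  # maximum over the whole tree
```

Because intermediate nodes reduce before forwarding, the root receives one value per child rather than one per node.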
Tree connection setup
[Figure sequence: the tree of nodes 1 to 8 is connected over TCP in three steps: Tree Setup - Phase I, Phase II and Phase III.]
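One way to picture the three setup phases is the usual binomial-tree schedule, in which every node that is already connected opens a connection to one new node per phase. This is a sketch under that assumption, with nodes numbered from 1 and a power-of-two node count; the slides do not state which links ARTL opens in which phase.

```python
def setup_phases(num_nodes: int):
    """Return a list of phases; each phase is a list of (parent, child) links.

    Assumes num_nodes is a power of two and nodes are numbered from 1.
    """
    if num_nodes < 1 or num_nodes & (num_nodes - 1):
        raise ValueError("num_nodes must be a power of two")
    connected = [1]            # the root starts out connected
    phases = []
    distance = num_nodes // 2  # halves after every phase
    while distance >= 1:
        links = [(node, node + distance) for node in connected]
        connected += [child for _, child in links]
        phases.append(links)
        distance //= 2
    return phases


for number, links in enumerate(setup_phases(8), start=1):
    print("Phase", number, links)
```

For eight nodes this yields three phases and the parent-child pairs 1-5, 1-3, 5-7, 1-2, 5-6, 3-4 and 7-8.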
Inter node communication
• Unicast and multicast data transmission
• ARTL receives application messages for which no receive has been posted
  – these are sent to a callback function registered by the application
• ARTL receives data on behalf of the application when the application explicitly posts a receive
[Figure: a message consisting of an ARTL header and data]
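The two receive paths can be sketched as follows. The class, the method names and the `tag` used to match a message to a posted receive are all hypothetical; the slides do not show the ARTL API.

```python
from collections import defaultdict, deque


class Receiver:
    """Hypothetical dispatcher for incoming application messages."""

    def __init__(self, default_callback):
        # Callback registered by the application for unexpected messages.
        self.default_callback = default_callback
        # tag -> queue of handlers for receives the application has posted.
        self.posted = defaultdict(deque)

    def post_receive(self, tag, handler):
        """The application explicitly asks for the next message with this tag."""
        self.posted[tag].append(handler)

    def deliver(self, tag, data):
        """Called when an application message arrives from the network."""
        if self.posted[tag]:
            self.posted[tag].popleft()(data)   # a receive was posted
        else:
            self.default_callback(tag, data)   # no receive posted


expected, unexpected = [], []
receiver = Receiver(lambda tag, data: unexpected.append((tag, data)))
receiver.post_receive("file", expected.append)
receiver.deliver("file", b"abc")  # matches the posted receive
receiver.deliver("file", b"def")  # nothing posted, goes to the callback
```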
ART - Library Architecture
[Figure: the ARTL layer diagram again (Network, ARTL Communication Layer, Event Handler, Application API, Application), here with the incoming packet labeled as an ARTL encapsulated message.]
Reduction Functions
• Implemented as Shared objects
• Sent to client during Setup phase
• Each reduction function is associated with a particular response it reduces
Event Handler
[Figure: the Event Handler between Network and Application.]
• Table containing query id and callback information for currently registered queries
• Responses for the shaded table entry arrive from downstream nodes
• Run queue of reduction/response operations, with a thread pool
• Reduced response sent upstream
• Response callback into the application
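A minimal sketch of such a query table: each entry records how many downstream responses are still expected and which reduction to apply, and the reduced value is handed upstream once the last response arrives. The class, its fields and the `send_upstream` callable are hypothetical names chosen for illustration.

```python
class QueryTable:
    """Hypothetical table of currently registered queries."""

    def __init__(self):
        self.entries = {}

    def register(self, query_id, expected, reduce_fn, send_upstream):
        """Record a query that expects `expected` downstream responses."""
        self.entries[query_id] = {
            "pending": expected,      # responses still outstanding
            "value": None,            # running reduced value
            "seen": False,            # whether any response has arrived
            "reduce": reduce_fn,      # pairwise reduction function
            "send": send_upstream,    # called with the final reduced value
        }

    def on_response(self, query_id, value):
        """Fold one downstream response into the entry for query_id."""
        entry = self.entries[query_id]
        if entry["seen"]:
            entry["value"] = entry["reduce"](entry["value"], value)
        else:
            entry["value"], entry["seen"] = value, True
        entry["pending"] -= 1
        if entry["pending"] == 0:
            del self.entries[query_id]
            entry["send"](query_id, entry["value"])


sent = []
table = QueryTable()
table.register(7, expected=3, reduce_fn=max,
               send_upstream=lambda qid, value: sent.append((qid, value)))
for response in (2, 9, 4):
    table.on_response(7, response)
```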
Multithreaded Architecture
• No prior knowledge about the behavior of a reduction function
• Exploit concurrency - multiple processors per node
• Static pool of threads - creating and destroying threads is expensive (Firefly RPC)
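The fixed-pool idea can be sketched with Python's standard `concurrent.futures.ThreadPoolExecutor`, which caps the number of worker threads and reuses them across tasks. The pool size and the helper name are assumptions, and this stands in for ARTL's own thread pool, which the slides do not show.

```python
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4  # assumed size; fixed for the lifetime of the node

# Worker threads are reused across tasks instead of being created per task.
pool = ThreadPoolExecutor(max_workers=POOL_SIZE)


def submit_reduction(reduce_fn, responses):
    """Queue one reduction; an idle worker thread picks it up."""
    return pool.submit(reduce_fn, responses)


futures = [
    submit_reduction(min, [3, 1, 2]),
    submit_reduction(max, [3, 1, 2]),
    submit_reduction(sum, [3, 1, 2]),
]
results = [future.result() for future in futures]
print(results)
pool.shutdown()
```

Since nothing is known in advance about how long a reduction function runs, queuing the work keeps a slow reduction from blocking the thread that reads from the network.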
Crash Reconfiguration
[Figure sequence: the tree of nodes 1 to 8 before and after node crashes, with examples labeled "Crash Reconfiguration at depth 1" and "Crash Reconfiguration at depth 2".]
Crash Detection
• Break in TCP connection with parent/child
  – a signal is received at the other end of the connection
• Use of periodic refresh messages to inform the parent that the child is up and running
  – useful in WAN environments
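The refresh-message check can be sketched as a timeout on the last refresh seen from each child. The class name and the timeout value are assumptions; timestamps are passed in explicitly so the logic is easy to follow, whereas a real node would read a monotonic clock.

```python
class RefreshMonitor:
    """Hypothetical parent-side tracker for periodic refresh messages."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds without a refresh before a child is suspected
        self.last_seen = {}     # child id -> time of the last refresh

    def refresh(self, child, now):
        """Record a refresh message from a child."""
        self.last_seen[child] = now

    def failed_children(self, now):
        """Children whose last refresh is older than the timeout."""
        return sorted(child for child, seen in self.last_seen.items()
                      if now - seen > self.timeout)


monitor = RefreshMonitor(timeout=3.0)
monitor.refresh(5, now=0.0)
monitor.refresh(3, now=0.0)
monitor.refresh(5, now=2.0)          # node 5 keeps refreshing, node 3 goes quiet
print(monitor.failed_children(4.0))  # only node 3 has exceeded the timeout
```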
Crash Handling
• Parent of the failed node informs the master
• All nodes are informed of a node failure
• Master recomputes the tree
  – If a leaf node is down, no reconfiguration is needed
  – If an intermediate node is down, some reconfiguration is required
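The leaf versus intermediate distinction can be made concrete with a small helper that removes the failed node and reports which nodes are left without a parent. The tree layout is the assumed binomial layout for eight nodes, and the helper is illustrative only; how the master reattaches orphans is not specified on the slides.

```python
def handle_failure(children, failed):
    """Remove `failed` from the tree (in place) and return the nodes that
    lost their parent. An empty result means a leaf failed."""
    orphans = children.pop(failed, [])
    for kids in children.values():
        if failed in kids:
            kids.remove(failed)
    return orphans


# Assumed child lists for the eight-node tree.
tree = {1: [5, 3, 2], 5: [7, 6], 7: [8], 3: [4],
        2: [], 4: [], 6: [], 8: []}

leaf_orphans = handle_failure(tree, 2)   # leaf: nothing to reattach
inner_orphans = handle_failure(tree, 5)  # intermediate: 7 and 6 need a new parent
print(leaf_orphans, inner_orphans)
```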
Node Restart
• Restarted node contacts the master to tell it about the restart
• Master sends it the current state of the network and the shared object(s)
• All nodes are informed of a node restart
• Master recomputes the tree and informs the new node’s parent about its new child
• Parent and child establish connections
SysMon - A System monitor
Monitors the load average from /proc
Displays Min, Max and average loads
Per-node load is also displayed
ARTL Reduction operations : Min, Max and Average
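Min and Max combine pairwise, but an average does not: averaging two averages is only correct when both cover the same number of nodes. A common way to handle this, sketched below, is to carry a (min, max, sum, count) tuple up the tree and divide only at the master. The function names are illustrative, and the slides do not say how SysMon computes its average; the sample string follows the Linux /proc/loadavg format, whose first field is the 1-minute load average.

```python
def parse_loadavg(text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def to_partial(load):
    """Wrap one node's load as a (min, max, sum, count) partial result."""
    return (load, load, load, 1)


def combine(a, b):
    """Pairwise reduction of two partial results."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def summarize(partial):
    """Turn the final partial result into (min, max, average)."""
    lowest, highest, total, count = partial
    return lowest, highest, total / count


sample = "0.50 0.40 0.30 1/80 11206"  # example /proc/loadavg contents
partial = to_partial(parse_loadavg(sample))
for load in (1.5, 1.0):               # illustrative loads from two other nodes
    partial = combine(partial, to_partial(load))
print(summarize(partial))
```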
SysMon - A System monitor
Node failures are detected and SysMon pops up an alert
File Transfer Application
• Transfers a file from master to all clients
• File can be executed at clients (if required)
  – execution can start immediately on receiving the file
  – execution can be delayed until all nodes have received the file
File Transfer Performance
[Figure: file transfer time for a 40 MB file. x-axis: total number of nodes (2, 4, 8, 16); y-axis: time in seconds (0 to 180). Series: unicast file transfer time, multicast file transfer time, expected multicast file transfer time.]
Total Startup Time vs Number of Nodes
[Figure: total startup time. x-axis: number of nodes (2, 4, 8, 16, 32); y-axis: time in seconds (0 to 20). Series: startup time in seconds.]
Client processes started using ssh on different machines
Conclusions and Extensions
• A middleware for dynamic operations
• Support for crash detection, recovery and dynamic processes
• Demonstrated near-optimal speedup using real applications

• Making response function dynamic - active services
• Differential scheduling in thread scheduler for QoS
• Making dynamic code secure