youngil kim awalin sopan sonia ng zeng. introduction concept of the project system architecture ...
P2P Control System based on Map/Reduce
Youngil Kim
Awalin Sopan
Sonia Ng Zeng
Introduction
Concept of the Project
System architecture
Implementation – HDFS
Implementation – System Analysis
◦ System Information Logger (SIL)
◦ System Information Gatherer (SIG)
◦ Map/Reduce
Implementation – Visualization
Implementation – P2P Application
Demo
Outline
How can we know system information from many nodes?
◦ It is hard to track which node has a problem when too many nodes exist
But… HDFS and Map/Reduce make it easy!
◦ Gather system information of each node into HDFS
◦ Analyze system information using Map/Reduce
◦ A kind of network management system like HP’s OpenView
Introduction
Tool to have an overview of the nodes in the P2P network
◦ Still preserving the decentralized nature of P2P
◦ Can be run on any computer, from within the P2P network or outside of it. So, the computer running the tool is not necessarily the “master”
◦ If the tool is not running, the P2P network still remains intact
Still, one can control the P2P network from the tool
The tool will provide an interface to do both: overview and control
◦ Therefore, the user does not need to be an expert to work with a network system
Concept of the Project
System Architecture
[Diagram: four nodes, each labeled "p2p Local" and running a P2P app., joined by the P2P Network]
System Architecture
[Diagram: System Info Gatherer (Hadoop Master), four Hadoop slave nodes, and HDFS; four nodes labeled "p2p Local", each running a P2P app. and a Sys Info Logger, joined by the P2P Network]
System Architecture
[Diagram: System Info Gatherer (Hadoop Master), four Hadoop slave nodes, and HDFS; System Manager (Visualization); four nodes labeled "p2p Local", each running a P2P app. and a Sys Info Logger; links labeled System Control Network, P2P Network, and System Information]
Implemented a minimal P2P application to show how our tool works
◦ How to control the application or system on each node using visualization
◦ Has STOP/RESUME operations
Functions
◦ Response to “QUERY”
  Show active/inactive (overview)
◦ Response to “CONTROL”
  Change node status based on control argument (active/inactive)
Implementation – P2P Application
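The QUERY/CONTROL behavior described on this slide could look roughly like the Python sketch below. The slides do not give the wire format, so the message strings ("QUERY", "CONTROL active", "CONTROL inactive"), the reply strings, and the `P2PNode` class are all hypothetical, chosen only for illustration.

```python
class P2PNode:
    """Hypothetical minimal node that tracks an active/inactive status."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.active = True  # assumption: nodes start out active

    def handle(self, message):
        """Return a reply string for one incoming text message."""
        parts = message.split()
        if not parts:
            return "ERROR empty message"
        command = parts[0]
        if command == "QUERY":
            # Report the current status without changing it.
            status = "active" if self.active else "inactive"
            return f"{self.node_id} {status}"
        if command == "CONTROL" and len(parts) == 2 and parts[1] in ("active", "inactive"):
            # The control argument decides the new status (STOP/RESUME).
            self.active = parts[1] == "active"
            return f"{self.node_id} {parts[1]}"
        return "ERROR unsupported message"
```

In this sketch, stopping a node is `node.handle("CONTROL inactive")` and resuming it is `node.handle("CONTROL active")`; a later `QUERY` reflects the change.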
Hadoop for DFS & Map/Reduce Framework
◦ We use the bug cluster
◦ Master: brood00
◦ Slaves: Currently tested with 5 nodes (bug51 ~ bug55)
◦ Using each node's local storage
  Using the “/tmp” directory because the home directory is not local storage but an NFS volume
◦ Network Ports: HDFS (9000), JobTracker (9001), NameNode Interface (50070), JobTracker Interface (50030)
Implementation - HDFS
Implementation - System Analysis
mr_syslog.py
◦ Implemented in Python
◦ Saves information in both local storage and HDFS
◦ Gathers information every 10 secs
◦ Creates logfile based on time
Information of each node is saved with the following format
◦ < 20110501_2252_bug51.log >
◦ bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)
◦ bug51 1304304724: mem(75.50), cpu(1.50), disk(10.00)
◦ bug51 1304304727: mem(75.51), cpu(0.40), disk(10.00)
◦ bug51 1304304729: mem(75.51), cpu(0.50), disk(10.00)
◦ bug51 1304304732: mem(75.50), cpu(0.50), disk(10.00)
◦ bug51 1304304734: mem(75.50), cpu(0.40), disk(10.00)
System Information Logger (SIL)
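The log format above can be reproduced with a few lines of Python. This is only a sketch and not the actual mr_syslog.py: the function names are hypothetical, and the UTC-4 offset is an assumption picked because it makes the sample timestamp 1304304720 line up with the sample file name 20110501_2252_bug51.log.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the cluster clock runs at UTC-4; the slides do not state this.
CLUSTER_TZ = timezone(timedelta(hours=-4))


def logfile_name(node_id, epoch_secs):
    """Build a time-based log file name such as 20110501_2252_bug51.log."""
    stamp = datetime.fromtimestamp(epoch_secs, tz=CLUSTER_TZ).strftime("%Y%m%d_%H%M")
    return f"{stamp}_{node_id}.log"


def format_record(node_id, epoch_secs, mem, cpu, disk):
    """Format one sample in the 'node epoch: mem(..), cpu(..), disk(..)' layout."""
    return f"{node_id} {epoch_secs}: mem({mem:.2f}), cpu({cpu:.2f}), disk({disk:.2f})"
```

How the real logger measures memory, CPU, and disk usage, and how it writes to HDFS, is not shown in the slides, so those parts are left out here.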
Functions
◦ Find the current resource usage of each node using Map/Reduce
  Currently, it shows maximum values per one-minute time slot
◦ Communication gateway between nodes and visualization tool
  Send “QUERY” to each P2P application to check on the status of each node
  Send node status to visualization tool
    Node ID
    Status (in/active)
    CPU Usage
    Memory Usage
    Disk Storage
System Information Gatherer (SIG)
Map:
◦ Input – each node log file
  Key: position in file
  Value: raw data, one line per key
◦ Output
  Key: node ID
  Value: set of system information (CPU/memory/storage usage)
  Eg: < bug51, [30.0, 29.0, 12.0] >
Map/Reduce
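The map step above can be sketched in plain Python, independent of Hadoop. The regular expression assumes the SIL log line layout shown earlier in the deck, and the function name `map_line` is hypothetical. Values are emitted in the CPU/memory/storage order given on this slide, which differs from the mem/cpu/disk order in the log line.

```python
import re

# Assumed line layout: "bug51 1304304720: mem(75.50), cpu(1.00), disk(10.00)"
LINE_RE = re.compile(
    r"^(\S+) (\d+): mem\(([\d.]+)\), cpu\(([\d.]+)\), disk\(([\d.]+)\)$"
)


def map_line(offset, line):
    """Map one (file position, raw line) pair to a list of (node ID, values) pairs."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return []  # skip lines that do not look like a log record
    node_id, _timestamp, mem, cpu, disk = match.groups()
    # Reorder to [CPU, memory, storage] as listed on the slide.
    return [(node_id, [float(cpu), float(mem), float(disk)])]
```

The `offset` key is ignored, as is usual when a mapper reads text files line by line; only the line content matters.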
Reduce:
◦ Input – from Map
  Key: node ID
  Value: set of sets of system information
  Eg: < bug51, [ [30.0, 29.0, 12.0], [33.0, 40.0, 9.0], … ] >
◦ Output
  Key: node ID
  Value: Maximum values for each piece of information
  Eg: < bug51, [33.0, 40.0, 12.0] >
Map/Reduce
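The reduce step amounts to a column-wise maximum over all readings of one node. A minimal Python sketch, with the hypothetical function name `reduce_node` and no Hadoop dependency, might be:

```python
def reduce_node(node_id, readings):
    """Reduce all readings of one node to the per-metric maximum.

    `readings` is a list of equal-length lists, e.g. [CPU, memory, storage].
    """
    # zip(*readings) regroups the values by metric, so max() runs per column.
    return node_id, [max(column) for column in zip(*readings)]
```

Applied to the two readings shown on the slide, `reduce_node("bug51", [[30.0, 29.0, 12.0], [33.0, 40.0, 9.0]])` gives `("bug51", [33.0, 40.0, 12.0])`, matching the slide's output example.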
Written in Java
Used the Prefuse toolkit for a tabular visualization of the node status
Only need to use the right-click menu to control the node
Live communication with the nodes
◦ To query the node status from the SIG
◦ To send commands to the nodes in the P2P network in real-time
Implementation - Visualization
Initial view of all nodes
After stopping Bug53
Visualization
System set-up and initialization (video file)
Show namenode & jobtracker interface
Show Map/Reduce jobs
Show Visualization tool
◦ Changes of each status
◦ Control each P2P application
Demo