[OSDC 2013] Experience Sharing on Hadoop Cluster HA
Who am I
韓祖棻 (Jerry), Technical Manager at Etu
• Database Management
• Windows/Linux Application Developer
• Web Developer
• Developer of Etu
Agenda
• Background
• Facebook Namenode High Availability
• Hadoop 1.0 Namenode High Availability
• Hortonworks High Availability
• Cloudera High Availability
• Etu Appliance High Availability
• Conclusion
Background
The Hadoop Ecosystem
[Figure: Hadoop ecosystem stack. Components: Mahout, HBase, MapReduce, Pig, HiveQL, Hive Meta Store, HDFS (Hadoop Distributed File System), Zookeeper, Avro (Serialization). Layers: Data Store, Data Processing Layer. External systems: RDBMS, ETL Tools, BI Reporting.]
HDFS Architecture (Master/Slave)
An HDFS cluster consists of a single Namenode.
[Figure: HDFS architecture. Clients send metadata ops to the Namenode, which holds the metadata (Name, replicas..; /var/disk/data, 1..). Clients read and write blocks on Datanodes in Rack1 and Rack2. The Namenode sends block ops to the Datanodes, and blocks are replicated between Datanodes.]
The Namenode was a single point of failure (SPOF) in an HDFS cluster.
Facebook Namenode High Availability
AvatarNode
Hadoop 1.0 Namenode High Availability
Backup Namenode Approach
• Use case 3f:
  – Active running, Standby down for maintenance. Active dies and cannot start. Standby is started and takes over as active.
Hortonworks High Availability
HDP's Full-Stack HA Architecture
HA for HDFS NameNode Using VMware
Do not use the NameNode VM for running any other master daemon.
HA for Hadoop Using RHEL (v5.x, v6.x)
Cloudera High Availability
Shared Storage Using NFS (After CDH 4.0)
[Figure: An active NN and a standby NN share NN state with a single writer (fencing). Each NN has a FailoverController (active or standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with three ZK nodes. Three DNs are shown. The diagram carries a SPOF label.]
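The interplay of FailoverController, ZK, and fencing in this design can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not Hadoop's actual failover controller code: `Lock`, `Controller`, `tick`, and the `fenced` flag are hypothetical names standing in for the ZooKeeper lock, the FailoverController, one monitoring round, and the fencing method. It assumes that an unhealthy node gives up the lock and that a standby is promoted only after the previous active has been fenced.

```python
class Lock:
    """Hypothetical stand-in for the ZooKeeper lock held by the active side."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, name):
        if self.holder is None:
            self.holder = name
            return True
        return False

    def release(self, name):
        if self.holder == name:
            self.holder = None


class Controller:
    """Hypothetical failover controller attached to one NameNode."""

    def __init__(self, name, lock):
        self.name = name
        self.lock = lock
        self.state = "standby"
        self.healthy = True
        self.fenced = False

    def tick(self, peer):
        """One monitoring round: check health, then try to become active."""
        if not self.healthy:
            # An unhealthy node releases the lock so the peer can take over.
            self.lock.release(self.name)
            self.state = "standby"
            return
        if self.state != "active" and self.lock.try_acquire(self.name):
            # Fence the peer before promotion so that only one writer
            # can ever touch the shared NameNode state.
            if peer.state == "active" or not peer.healthy:
                peer.fenced = True
                peer.state = "standby"
            self.state = "active"


lock = Lock()
nn1, nn2 = Controller("nn1", lock), Controller("nn2", lock)

nn1.tick(nn2)
nn2.tick(nn1)            # nn1 holds the lock and is active, nn2 is standby
nn1.healthy = False      # simulate a failure of the active NameNode
nn1.tick(nn2)
nn2.tick(nn1)            # nn1 releases the lock, nn2 fences nn1 and takes over

print(nn1.state, nn2.state, nn1.fenced)
```

Fencing before promotion is what keeps the "single writer" property of the shared storage intact even if the old active is only partially failed.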
Quorum-based Storage (After CDH 4.1)
[Figure: An active NN and a standby NN each use QJM to reach a group of three Journal Nodes (JN). Each NN has a FailoverController (active or standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with three ZK nodes. Three DNs are shown.]
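Quorum-based storage relies on a majority of Journal Nodes acknowledging each edit. The arithmetic can be sketched as follows; the function names are hypothetical and chosen only for illustration.

```python
def majority(journal_nodes: int) -> int:
    """Smallest number of Journal Nodes that forms a majority."""
    return journal_nodes // 2 + 1


def edit_is_durable(acks: int, journal_nodes: int) -> bool:
    """An edit counts as written once a majority of Journal Nodes acknowledged it."""
    return acks >= majority(journal_nodes)


def tolerated_failures(journal_nodes: int) -> int:
    """Number of Journal Nodes that may fail while a majority stays reachable."""
    return (journal_nodes - 1) // 2


for n in (3, 5):
    print(n, majority(n), tolerated_failures(n))
```

With the three Journal Nodes shown on the slide, two acknowledgements are enough and one node may fail, which is why an odd number of Journal Nodes is the usual choice.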
Etu Appliance High Availability
Summary of previous solutions

| Solution | Auto Failover | HA Type | External Storage |
| --- | --- | --- | --- |
| Facebook: Avatar Node | X | Namenode | ○ |
| Apache Hadoop 1.0: Backup Namenode | X | Namenode | ○ |
| Hortonworks: VMware (*1) | ○ | Namenode | ○ |
| Hortonworks: RHEL (*2) | ○ | System-wide | ○ |
| Cloudera (Apache Hadoop 2.X): Shared Storage | ○ | Namenode (*3) | Optional |
| Cloudera (Apache Hadoop 2.X): Quorum-based Storage | ○ | Namenode (*3) | Optional |

1. 2 ESX Servers + SAN Arch. (vSphere HA Cluster)
2. RHEL Cluster HA and Power Fencing Device
3. Implementing the Fencing Method for System-wide HA.
Two Roles
• Master node
• Worker
[Figure: one Master node and three Workers.]
Services on Master and Workers

| | Master | Worker |
| --- | --- | --- |
| Hadoop Ecosystem Services | Name Node, Job Tracker, HBase Master, Zookeeper (Leader), Hive | Data Node, Task Tracker, Region Server, Zookeeper |
| System Services | MySQL/PostgreSQL, Kerberos, NTP Server, Syslog | Syslog |
HA Architecture (Active/Standby)
HA based on CDH4.0.1
[Figure: An active NN and a standby NN connected through a Synchronized File System. Each NN has a FailoverController (active or standby) that monitors the health of the NN, OS, and HW and exchanges heartbeats with three ZK nodes. Three DNs are shown.]
Data Synchronization
• Hadoop ecosystem
  – Configurations are stored in Zookeeper
  – Hive meta data is stored in PostgreSQL
• PostgreSQL
  – Using PostgreSQL Replication
• User data
• System configurations or data
  – PostgreSQL, Kerberos, NTP server, Syslog
Requirements
- HDFS Service is running in Active Master
- Zookeeper Cluster is ready
- Standby Master is ready to activate High Availability service
[Figure: Active Master, Standby Master, and Workers, with three ZK instances, one of them the ZK Leader.]
Failover Scenario
- Active Namenode service failure
- Active Namenode JVM failure
- Active ZKFC service failure
- Etu Active Master OS failure
- Etu Active Master machine power failure
- Failure of NIC cards on the Etu Active Master machine
- Network failure for the Etu Active Master machine
[Figure: Active Master, Standby Master, and three Workers.]
Design Details – Enabling HA
1. Stopping services dependent on HDFS. (JobTracker, HMaster, …)
2. Stopping Namenode and Datanode services.
3. Configuring HDFS and FC service.
4. Creating Synchronized File System.
5. Initializing Synchronized File System for shared edit logs.
6. Starting Active FC service.
7. Initializing Standby Master.
8. Starting Standby FC service.
9. Synchronizing system configurations and data.
10. Starting Active Namenode and Datanode services.
11. Starting Standby Namenode and Datanode services.
12. Checking Services Status.
13. Starting services dependent on HDFS. (JobTracker, HMaster, …)
[Figure: Active Master and Standby Master, each running Namenode and FC, with JT, HMaster, … on the Active Master. Edit logs are shared between the two Masters; Kerberos, NTP, Syslog, … run on both, and the database is kept in sync by DB Replication.]
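The ordering of the enabling steps matters: HDFS-dependent services are stopped first and started last. A minimal sketch of an ordered step runner is shown below. The step names are taken from the slide, while `run_steps`, the `action` callback, and the stop-at-first-failure behaviour are assumptions made for illustration and not the Etu implementation.

```python
ENABLE_HA_STEPS = [
    "Stopping services dependent on HDFS",
    "Stopping Namenode and Datanode services",
    "Configuring HDFS and FC service",
    "Creating Synchronized File System",
    "Initializing Synchronized File System for shared edit logs",
    "Starting Active FC service",
    "Initializing Standby Master",
    "Starting Standby FC service",
    "Synchronizing system configurations and data",
    "Starting Active Namenode and Datanode services",
    "Starting Standby Namenode and Datanode services",
    "Checking Services Status",
    "Starting services dependent on HDFS",
]


def run_steps(steps, action):
    """Run steps in order and return the numbers of the completed steps.

    `action(number, step)` is a hypothetical callback that performs one step
    and returns True on success. The run stops at the first failing step
    (an assumption of this sketch), so later steps never see a
    half-configured cluster.
    """
    completed = []
    for number, step in enumerate(steps, start=1):
        if not action(number, step):
            break
        completed.append(number)
    return completed


# Dry run: every step "succeeds" and is only printed.
def dry_run(number, step):
    print(f"{number}. {step}")
    return True


done = run_steps(ENABLE_HA_STEPS, dry_run)
```

In a real deployment each step would call the relevant service scripts; here the callback only prints, which is enough to check the order.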
Design Details - Failover
1. Fencing Active Master from Standby Master.
   a. Stopping network service.
   b. Stopping Hadoop related services.
   c. Stopping system services.
   d. Configuring network environment.
   e. Removing default services.
2. Stopping Standby FC service.
3. Stopping Standby Namenode service.
4. Removing Synchronized File System.
5. Removing DB Replication.
7. Transitioning Standby Master to Active Master.
   a. Stopping network service.
   b. Stopping system services.
   c. Configuring network environment.
   d. Configuring host information.
   e. Configuring system services.
   f. Starting network service.
   g. Starting system services.
8. Configuring Hadoop related services.
9. Starting Namenode and Datanode services.
10. Starting Hadoop related services.
[Figure: Before failover, Active Master and Standby Master each run Namenode and FC, with shared edit logs, DB Replication, and Kerberos, NTP, Syslog, … on both. After failover, the old Active Master is fenced and the former Standby Master runs as Active Master with Namenode, JT, HMaster, … and Kerberos, NTP, ….]
Use case - Active Namenode maintenance
- Stop NN
- Restart NN
[Figure: Active Master, Standby Master, and three Workers.]
Use case - Standby Master failure
- OS failure
- Power failure
- Failure of NICs
- Network failure
[Figure: Active Master, Standby Master, and three Workers.]
Use case - Cluster power failure
[Figure: Active Master, Standby Master, and three Workers.]
Use case - Cluster network failure
[Figure: Active Master, Standby Master, and three Workers.]
Demo – Non-HA (VM002)
Activating HA with One-Click
Demo – Activating (VM002 --- VM007)
Demo – Activating Done (VM002 – VM007)
Demo – Failover (VM002 -> VM007)
Demo – Failover Done (VM007)
Conclusion
• Leveraging a Synchronized File System to share Namenode edit logs and system data between Masters.
• Implementing an improved fencing method to handle failover.
• Providing system-wide high availability, not only for the Hadoop Namenode service.
Reference
• Hadoop 1.0.4 Documentation
  – http://hadoop.apache.org/docs/stable/index.html
  – https://issues.apache.org/jira/secure/attachment/12480489/NameNode%20HA_v2_1.pdf
• Hadoop 2.0.3-alpha Documentation
  – http://hadoop.apache.org/docs/r2.0.3-alpha/index.html
• Hadoop AvatarNode High Availability
  – http://hadoopblog.blogspot.tw/2010/02/hadoop-namenode-high-availability.html
• Hortonworks Data Platform
  – http://hortonworks.com/products/hortonworksdataplatform/
  – http://www.vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf
• CDH4.2.0 Documentation
  – http://www.cloudera.com/content/support/en/documentation/cdh4-documentation/cdh4-documentation-v4-latest.html