How to plan a Hadoop cluster for testing and production environments

Athemaster Co., Ltd. (炬識科技股份有限公司). HOW TO PLAN A HADOOP CLUSTER FOR TESTING AND PRODUCTION? Presented by Resource Planning, 2016/9/10. 2016 Taiwan HadoopCon, Sep. 9~10.

Upload: anna-yen

Post on 08-Jan-2017


TRANSCRIPT

Page 1: How to plan a hadoop cluster for testing and production environment

Athemaster Co., Ltd. (炬識科技股份有限公司)

HOW TO PLAN A HADOOP CLUSTER FOR TESTING AND PRODUCTION?

Presented by Resource Planning, 2016/9/10

2016 Taiwan HadoopCon, Sep. 9~10

Page 2: How to plan a hadoop cluster for testing and production environment

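www.athemaster.com

Preface

- This is not best practice; we are only sharing our experience.
- Our definitions of terms may differ from yours.
- Our cases are mainly based on CDH.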

Page 3: How to plan a hadoop cluster for testing and production environment

Hard Drive Architecture

RAID 0 vs. JBOD query performance comparison

In this test case, JBOD was roughly twice as fast as RAID 0.

Page 4: How to plan a hadoop cluster for testing and production environment

Test Environment

- Physical hosts *5
- Hardware specification
  - AVAGO MegaRAID Controller *1
  - Disk: 500GB *6
  - CPU: 6 cores *2
  - RAM: 16GB *8
- System versions
  - CentOS 6.6
  - CDH 5.4.5

Page 5: How to plan a hadoop cluster for testing and production environment

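Test Plan

- Write a program that generates data automatically.
- Load the data into HDFS with an ETL tool named "ADP".
- ADP here is a tool we developed in-house.
- Use another program to scan Impala at regular intervals.
- The scan is a select count(*), used to measure how query time changes with data volume.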
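The scan step can be sketched as follows. This is a minimal illustration, not the speakers' program: `time_count_queries`, `run_query`, and the table names are hypothetical, and `run_query` stands in for whatever Impala client call you use. A scheduler (cron or a loop with a sleep) would call it at each interval.

```python
import time
from typing import Callable, List, Tuple


def time_count_queries(
    run_query: Callable[[str], int],
    tables: List[str],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Tuple[str, int, float]]:
    """Run SELECT COUNT(*) on each table and return (table, rows, seconds)."""
    results = []
    for table in tables:
        sql = f"SELECT COUNT(*) FROM {table}"
        start = clock()
        rows = run_query(sql)  # hypothetical executor: takes SQL, returns the count
        elapsed = clock() - start
        results.append((table, rows, elapsed))
    return results


if __name__ == "__main__":
    # Stand-in for a real Impala connection (assumption, for demonstration only).
    fake_counts = {"t_0000": 6_300_000, "t_0005": 6_700_000}

    def fake_run_query(sql: str) -> int:
        return fake_counts[sql.rsplit(" ", 1)[-1]]

    for table, rows, secs in time_count_queries(fake_run_query, list(fake_counts)):
        print(f"{table}: {rows} rows in {secs:.6f}s")
```

Passing the executor and the clock in as parameters keeps the timing logic independent of any particular client library and makes it easy to test.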

Page 6: How to plan a hadoop cluster for testing and production environment

Data Size

- In this test, one table is generated every 5 minutes.
- Each table holds roughly 6.3 to 6.7 million rows.
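To get a feel for the volume these figures imply, here is a small back-of-the-envelope calculation. It assumes the generator runs around the clock, which the slide does not state; the function names are illustrative.

```python
def tables_per_day(interval_minutes: int) -> int:
    """Number of tables created per day at a fixed interval."""
    return (24 * 60) // interval_minutes


def rows_per_day(interval_minutes: int, rows_per_table: int) -> int:
    """Total rows generated per day (assumes 24-hour generation)."""
    return tables_per_day(interval_minutes) * rows_per_table


low = rows_per_day(5, 6_300_000)
high = rows_per_day(5, 6_700_000)
print(f"{tables_per_day(5)} tables/day, {low:,} to {high:,} rows/day")
```

Under that assumption the test produces 288 tables and roughly 1.8 to 1.9 billion rows per day.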

Page 7: How to plan a hadoop cluster for testing and production environment

JBOD: Query Performance Test Results

Page 8: How to plan a hadoop cluster for testing and production environment

RAID 0: Query Performance Test Results

Page 9: How to plan a hadoop cluster for testing and production environment

HDFS Block Size and Count

- File read performance
- Disk space utilization
- Memory requirement (each namespace object on the NameNode takes about 150 bytes)

Page 10: How to plan a hadoop cluster for testing and production environment

Factors (simplified)

- Input
  - Average file size
  - Number of files
- Output
  - Block size (64MB/128MB/256MB)
  - Master Node (NameNode service) memory requirement (128GB/256GB/512GB)

More information:
https://www.cloudera.com/documentation/enterprise/latest/topics/admin_nn_memory_config.html
https://martin.atlassian.net/wiki/pages/viewpage.action?pageId=26148906
The article mentions an "HDFS Scalability whitepaper" that describes the calculation in great detail.
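The relationship between these inputs and outputs can be sketched with the 150-bytes-per-object rule of thumb quoted earlier. This is a rough estimate under stated assumptions, not Cloudera's sizing formula: it counts one object per file plus one per block, ignores directories and JVM overhead, and treats every file as having the average size.

```python
import math

BYTES_PER_NAMESPACE_OBJECT = 150  # rule of thumb from the slide
MB = 1024 ** 2


def estimate_namenode_heap_bytes(
    file_count: int, avg_file_size_bytes: int, block_size_bytes: int
) -> int:
    """Estimate NameNode memory for file and block objects only."""
    # Every file needs at least one block, even when smaller than the block size.
    blocks_per_file = max(1, math.ceil(avg_file_size_bytes / block_size_bytes))
    objects = file_count * (1 + blocks_per_file)  # 1 file object + its blocks
    return objects * BYTES_PER_NAMESPACE_OBJECT


if __name__ == "__main__":
    for block_mb in (64, 128, 256):
        heap = estimate_namenode_heap_bytes(100_000_000, 200 * MB, block_mb * MB)
        print(f"block size {block_mb}MB -> about {heap / 1024 ** 3:.1f} GiB")
```

The estimate makes the trade-off visible: for the same data, a larger block size means fewer block objects and less NameNode memory, while many small files inflate the object count regardless of block size.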

Page 11: How to plan a hadoop cluster for testing and production environment

When data has to cross mountains and valleys before it reaches the Hadoop cluster... a story

Network Environment

Page 12: How to plan a hadoop cluster for testing and production environment

Testing and Production

Test and Production Environments

Page 13: How to plan a hadoop cluster for testing and production environment

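Characteristics of a Test Environment

- Rapid deployment
- Usually has to run on hardware that falls short in both specification and quantity
- High availability is usually not required (unless HA is one of the test items)
- Performance is usually not required
- Focus on integration testing with other systems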

Page 14: How to plan a hadoop cluster for testing and production environment

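Characteristics of a Production Environment

- System longevity matters (no further investment is expected in the short term)
- High availability matters
- Although Hadoop is not used for transactional systems, a certain level of performance is usually still required
- There may be backup or disaster recovery requirements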

Page 15: How to plan a hadoop cluster for testing and production environment

Poodle

Test Environment Case 1

Page 16: How to plan a hadoop cluster for testing and production environment

Test Focus

- Usability of ETL for loading structured data into HDFS
- Compatibility across different kinds of data
- Performance comparison with SQL Server

Page 17: How to plan a hadoop cluster for testing and production environment

Hardware Specification

- Node roles and counts
  - Master Node *2
  - Worker Node *3
- Server hardware
  - Dell R430 1U Rack Server *5
    - Intel Xeon E5-2620 v3 2.4GHz, 15M Cache, 8.00GT/s QPI, Turbo, HT, 6C/12T *2
    - 128GB (16 * 8GB) RDIMM, 2133 MT/s, Dual Rank
    - 4TB SAS HDD 3.5" 7200 rpm *4
    - PERC H730 Integrated RAID Controller, Size: 1024 MB
    - Hard Drive Architecture: apart from some space reserved for /boot and swap, all remaining space goes to /; LVM is not used

Page 18: How to plan a hadoop cluster for testing and production environment

Source Data

Data format    Single file size   Max file count   Total size
DB dump file   7.5GB              6                45GB
DB dump file   12.5GB             6                75GB
evtx           30~50MB            4                1GB
Txt            1.5KB              2                3KB

Page 19: How to plan a hadoop cluster for testing and production environment

Hadoop vs. SQL Server Query Performance

System     1             5        10       20
Cloudera   19 min 19 s   21 min   24 min   43 min
MS-SQL     3 min 42 s    4 min    6 min    N/A (Loading ...)

Page 20: How to plan a hadoop cluster for testing and production environment

Dahlia

Test Environment Case 2

Page 21: How to plan a hadoop cluster for testing and production environment

Test Focus

- Data input: SQL Server, Oracle, Teradata, AWS (Sqoop)
- Data output: SQL Server, Oracle, Teradata, AWS (Impala, Hive)
- High availability: NameNode HA, Cluster HA
- Security: authorization, encryption, data masking
- Capacity and performance: data compression, performance monitoring
- Not enough hardware, and the BI tool needs a dedicated host
- How do you set up Hadoop on two hosts and still verify HA?

Page 22: How to plan a hadoop cluster for testing and production environment

Hardware Specification (Details)

Qty   Item
3     System x3650 M5 ODD Cable Kit
3     Ultraslim 9.5mm SATA DVD-ROM
6     Power Cord
3     Intel Xeon Processor E5-2630 v3 8C 2.4GHz 20MB Cache 1866MHz 85W
3     System x3650 M5 Plus 8x 2.5" HS HDD Assembly Kit with Expander
3     System x 550W High Efficiency Platinum AC Power Supply for x3650 M5
3     System x3650 M5 PCIe Riser 1 (2 x8 FH/FL + 1 x8 ML2 Slots)
36    600GB 10K 12Gbps SAS 2.5in G3HS 512e HDD
48    32GB TruDDR4 Memory (4Rx4, 1.2V) PC4-17000 CL15 2133MHz LP LRDIMM
3     x3650 M5, 1 * E5-2630 v3 8C (85W) 2.4GHz 20MB Cache 1866MHz, 1 x 16GB ECC RDIMM (1.2V), 8 * 2.5" HS SAS/SATA (max 18), M5210 1GB Flash, 4 * 1Gb Ethernet, 1 * 550W RPS, 3Y

Page 23: How to plan a hadoop cluster for testing and production environment

Node Role Allocation

- PM-01 (VM01~06)
  - Hardware: CPU: 8 Core*2, MEM: 512GB, HDD: SAS 600GB*12
  - Node roles: Master Node 01, Master Node 02, Master Node 03, Worker Node 01, Worker Node 02, Worker Node 03
- PM-02 (VM07~08)
  - Hardware: CPU: 8 Core*2, MEM: 512GB, HDD: SAS 600GB*12
  - Node roles: Utility Node 01 (CM Server, CM), Edge Node 01 (Oracle)
- PM-03
  - Hardware: CPU: 8 Core*2, MEM: 512GB, HDD: SAS 600GB*12
  - Node roles: Edge Node 02 (BI Tool)

Page 24: How to plan a hadoop cluster for testing and production environment

PM-01 Cluster Service Layout

Page 25: How to plan a hadoop cluster for testing and production environment

High Availability Tests Completed

- Shut down one DataNode while a job is running
- Shut down the primary NameNode while a job is running
- Add a node while a job is running
- Shut down the primary CM DB while a job is running

Page 26: How to plan a hadoop cluster for testing and production environment

Taroko

Production Environment Case 1

Page 27: How to plan a hadoop cluster for testing and production environment

Key Requirements

- Integrate Hadoop with EDA software
- The ETL system needs an accompanying RDM
- Up to two years of data must be stored

Page 28: How to plan a hadoop cluster for testing and production environment

Hardware Specification and Node Roles

Page 29: How to plan a hadoop cluster for testing and production environment

Cluster Service Layout

Page 30: How to plan a hadoop cluster for testing and production environment

Andes

Production Environment Case 2

Page 31: How to plan a hadoop cluster for testing and production environment

Key Requirements

- High availability
- Query performance
- Permission management for system administrators and developers

Page 32: How to plan a hadoop cluster for testing and production environment

Hardware Specification

Role         Master Node (01~03)            Worker Node (01~10)
Server Qty   3                              10
Model        HP DL360 Gen9                  HP DL360 Gen9
CPU          E5-2600 v3                     E5-2600 v3
             16 cores 2.6GHz                12 cores 2.4GHz
             (dual 8-core)                  (dual 6-core)
RAM          256GB (32GB*8)                 256GB (32GB*8)
Disk         6*600GB 2.5" SAS 15Krpm        12*4TB (3.5" SAS 7.2Krpm)
                                            2*600GB (3.5" SAS 15Krpm)
                                            (supports 12Gbps RAID)
RAID         RAID-1 (OS)                    RAID-1 (OS)
             RAID-10 (data)                 JBOD (data)
NIC          10GbE*2 (LACP)                 10GbE*2 (LACP)

There are also two Edge Nodes; developers can connect to the cluster only through them.
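As a rough illustration of what the worker data disks provide, the sketch below estimates usable HDFS capacity. The replication factor of 3 is the HDFS default, and the 25% reserve for temporary and non-HDFS data is an assumption of this example; neither figure is stated in the talk.

```python
def usable_hdfs_tb(
    nodes: int,
    disks_per_node: int,
    disk_tb: float,
    replication: int = 3,            # HDFS default, assumed here
    reserve_fraction: float = 0.25,  # assumed headroom for temp / non-DFS data
) -> float:
    """Estimate usable HDFS capacity in TB from raw JBOD data disks."""
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb * (1 - reserve_fraction) / replication


if __name__ == "__main__":
    # 10 worker nodes, each with 12 * 4TB JBOD data disks
    print(f"raw: {10 * 12 * 4} TB, usable: {usable_hdfs_tb(10, 12, 4):.0f} TB")
```

Under these assumptions, 480TB of raw disk yields about 120TB of usable space before compression.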

Page 33: How to plan a hadoop cluster for testing and production environment

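Cluster Service Layout

[Diagram: services laid out across Others and Edge Node 01, Master Node 01, Master Node 02, Master Node 03, Edge Node 02, Worker Node 02]

13? 22?
Why do so many nodes show up?
Because cloudera-scm-agent is installed on these nodes and they have registered with CM.
*Edge Nodes and software licensing*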

Page 34: How to plan a hadoop cluster for testing and production environment

YARN Pending Containers

Page 35: How to plan a hadoop cluster for testing and production environment

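Advanced: CM DB HA

- PostgreSQL server HA
  - Failover: when the active pgsql fails, pgpool automatically promotes the standby pgsql to active so that service continues.
  - Recovery: copy the data from the active pgsql to the standby pgsql so that the two are consistent.
  - Failback: once repaired, the pgsql that was originally active reconnects to pgpool and returns to the active role.
- Pgpool HA
  - Failover: when the active pgpool fails, watchdog raises an alert and automatically promotes the standby pgpool to active so that service continues.
  - Failback: once repaired, the pgpool that was originally active automatically reconnects to the active pgpool.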
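The failover, recovery, and failback sequence for the PostgreSQL pair can be pictured with a toy state model. This is only an illustration of the order of events; it does not use or imitate pgpool's actual interfaces, and all class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    healthy: bool = True


class ActiveStandbyPair:
    """Toy model of an active/standby pair (not pgpool's API)."""

    def __init__(self, active_name: str, standby_name: str) -> None:
        self.active = Node(active_name)
        self.standby = Node(standby_name)
        self.original_active = active_name
        self.log: List[str] = []

    def fail_active(self) -> None:
        # Failover: the active node dies and the standby is promoted.
        self.active.healthy = False
        self.active, self.standby = self.standby, self.active
        self.log.append(f"failover: {self.active.name} promoted to active")

    def recover_standby(self) -> None:
        # Recovery: the repaired node is resynchronised from the current active.
        self.standby.healthy = True
        self.log.append(f"recovery: {self.standby.name} synced from {self.active.name}")

    def failback(self) -> None:
        # Failback: the original active node takes the active role again.
        if not self.standby.healthy:
            raise RuntimeError("standby must be recovered before failback")
        if self.standby.name == self.original_active:
            self.active, self.standby = self.standby, self.active
            self.log.append(f"failback: {self.active.name} is active again")


if __name__ == "__main__":
    pair = ActiveStandbyPair("pgsql-1", "pgsql-2")
    pair.fail_active()
    pair.recover_standby()
    pair.failback()
    print("\n".join(pair.log))
```

The model enforces the one ordering constraint that matters operationally: failback is refused until the repaired node has been brought back in sync.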

Page 36: How to plan a hadoop cluster for testing and production environment

Amazon

Production Environment Case 3

Page 37: How to plan a hadoop cluster for testing and production environment

Key Requirements

- Large bursts of incoming data (8000 EPS)
- Large daily data accumulation (over 1TB)
- Keep data for as long a retention period as possible
- Mix of old and new equipment

Page 38: How to plan a hadoop cluster for testing and production environment

Hardware Specification and Node Role Allocation

- Master Node 01~02
  - Ingestion adaptor: N/A
  - Hardware: CPU: 12 Core, MEM: 16GB*14, HDD: 2.5" 300GB*2 (RAID 1), 3.5" 4TB*12 (JBOD)
- (Old) Worker Node 01~03
  - Ingestion adaptor: Adaptor 01~03 (one per node)
  - Hardware: CPU: 12 Core, MEM: 16GB*12, HDD: 2.5" 300GB*2 (RAID 1), 3.5" 4TB*12 (JBOD)
- Worker Node 04~07
  - Ingestion adaptor: Adaptor 04~06 on Worker Node 04~06; N/A on Worker Node 07
  - Hardware: CPU: 12 Core, MEM: 16GB*12, HDD: 2.5" 300GB*2 (RAID 1), 3.5" 4TB*12 (JBOD)
- Worker Node 08
  - Ingestion adaptor: N/A
  - Hardware: CPU: 12 Core, MEM: 16GB*12, HDD: 3.5" 4TB*12 (JBOD)
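The retention requirement can be related to this hardware with some simple arithmetic. The sketch assumes a replication factor of 3 (the HDFS default), no compression, and a flat 1TB/day ingest; since the stated volume is "over 1TB", the result is an upper bound. The event count assumes the 8000 EPS burst rate were sustained all day, which is also an upper bound.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(eps: int) -> int:
    """Events per day if the given events-per-second rate were sustained."""
    return eps * SECONDS_PER_DAY


def retention_days(
    worker_nodes: int,
    disks_per_node: int,
    disk_tb: float,
    daily_ingest_tb: float,
    replication: int = 3,  # HDFS default, assumed here
) -> float:
    """Days of data that fit on the raw data disks, ignoring compression."""
    raw_tb = worker_nodes * disks_per_node * disk_tb
    return raw_tb / (daily_ingest_tb * replication)


if __name__ == "__main__":
    print(f"{events_per_day(8000):,} events/day at a sustained 8000 EPS")
    print(f"about {retention_days(8, 12, 4, 1.0):.0f} days of retention at 1TB/day")
```

With 8 worker nodes of 12 * 4TB each, this gives at most about 128 days at 1TB/day, which shows why compression and added capacity matter when the goal is to extend the retention period.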

Page 39: How to plan a hadoop cluster for testing and production environment

Cluster Service Layout

Cloudera Manager

Page 40: How to plan a hadoop cluster for testing and production environment

The Official Word

What is new?

Page 41: How to plan a hadoop cluster for testing and production environment

Cloudera 5.8 Official Recommendations

Page 42: How to plan a hadoop cluster for testing and production environment

2016 Technical Summit

Page 43: How to plan a hadoop cluster for testing and production environment

CDH Next Focus

- Improving Impala
- SQL knowledge worker experience (Hue)
- Data science knowledge worker experience (Kudu)
- Cloud: integration with major public/private cloud service providers through API

Page 44: How to plan a hadoop cluster for testing and production environment

Kudu: Columnar Store

Page 45: How to plan a hadoop cluster for testing and production environment

[email protected]

Thank you