CDH5 Latest Information #cwt2013

CDH Latest Information. Kiyoshi Mizumaru | Systems Engineer, Cloudera World Tokyo 2013

Upload: cloudera-japan

Post on 28-May-2015


DESCRIPTION

#cwt2013 We have published the CDH5 introduction slides by Cloudera's Mizumaru (@kmizumar). They cover faster HBase failure recovery, NFS support for HDFS, and more.

TRANSCRIPT

  • 1. CDH Latest Information. Kiyoshi Mizumaru | Systems Engineer, Cloudera World Tokyo 2013

  • 2. email: [email protected]; 2013/02 Cloudera; 2010/06 Hadoop; Hadoop 3
  • 3. CDH; CDH; CDH4; CDH5
  • 4. CDH
  • 5. CDH: Cloudera's Distribution including Apache Hadoop; 100%; Hadoop; Apache Hadoop; CDH4.4
  • 6. CDH
  • 7. CDH: MAPREDUCE, HIVE, PIG; SQL: CLOUDERA IMPALA; CLOUDERA SEARCH; MAHOUT, DATAFU
  • 8. CDH: Hadoop; Apache; CDH; CDH, Hadoop; RDBMS, ETL, BI, CDH; CDH 100% Apache
  • 9. CDH (timeline, 2009 to 2013)
        CDH1     Q3 2009
        CDH2     Q1 2010
        CDH3     Q2 2011
        CDH4     2012/06   NFS; MR1, MR2
        CDH4.1   2012/09   QJM
        CDH4.2   2013/02   HA; HBase
        CDH4.3   2013/05   Hue; Pig
        CDH4.4   2013/09
  • 10. CDH: CDH3: CDH3u0, CDH3u1, CDH3u2; 2013/06/20 EOM. CDH4: CDH X.Y.Z; X; Y (CDH3 update); Z; CDH4.4.0
  • 11. CDH5
  • 12. CDH 5 Release Notes: New Features in CDH5; Incompatible Changes; Known issues in CDH5. Release Notes: https://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Release-Notes/CDH5-Release-Notes.html
  • 13. CDH 5.0.0 Beta 1
        Component                    CDH 5.0.0 Beta 1           CDH 4.4.0
        Apache Avro                  avro-1.7.4+3               -
        Apache Hadoop 2.0            hadoop-2.2.0+353           2.0.0+1475
        Apache DataFu                pig-udf-datafu-0.0.4+12    0.0.4+22
        Apache Flume                 flume-ng-1.4.0+44          1.4.0+23
        Apache HBase                 hbase-0.95.2+272           0.94.6+132
        HBase Solr                   hbase-solr-1.2+16          -
        Apache Hive                  hive-0.11.0+483            0.10.0+198
        Apache Mahout                mahout-0.8+27              0.7+21
        Apache Oozie                 oozie-4.0.0+54             3.3.2+92
        Apache Pig                   pig-0.11.0+46              0.11.0+33
        Apache Sentry (incubating)   sentry-1.2.0+10            1.1.0
  • 14. CDH 5.0.0 Beta 1
        Component                    CDH 5.0.0 Beta 1           CDH 4.4.0
        Apache Solr                  solr-4.4.0+98              -
        Apache Sqoop                 sqoop-1.4.4+20             1.4.3+62
        Apache Sqoop 2               sqoop2-1.99.2+105          1.99.2+85
        Apache Whirr                 whirr-0.8.2+19             0.8.2+15
        Apache ZooKeeper             zookeeper-3.4.5+25         3.4.5+23
        Parquet                      parquet-1.0.0+7            -
        Cloudera Development Kit     cdk-0.7.0+3                -
        Cloudera Hue                 hue-3.0.0+266              2.5.0+139
        Cloudera Impala              impala-1.2.0+0             1.1.1
        Cloudera Llama               llama-1.0.0+0              -
        Cloudera Search              search-1.0.0+0             1.0.0
  • 15. Java: CDH5, Oracle JDK 1.7; CDH5, JDK 1.7; CDH5 Beta 1, JDK 1.7.0_25; CDH5, JDK 1.6; Oracle JDK 1.7
  • 16. MapReduce 2.0: YARN, MapReduce; YARN (Yet-Another-Resource-Negotiator); JobTracker, TaskTracker; JobTracker
  • 17.
      MapReduce 1.0 (architecture diagram): Job Client; SubmitJob; JobTracker; TaskTracker (Map Slot, Reduce Slot)
  • 18. YARN (architecture diagram): Client; SubmitApplication; ResourceManager; NodeManager; AppMaster; Container
  • 19. YARN: ResourceManager (RM); ApplicationMaster (AM); ResourceManager, NodeManager; NodeManager (NM). MapReduce 2.0: ResourceManager, ApplicationMaster, JobTracker; NodeManager, TaskTracker
  • 20. MapReduce 1.0: MRv1, MRv2; MRv1, YARN; CDH5, MRv1 API; Hadoop 2.0.0; CDH4 MRv1, CDH5 MRv1; CDH5: MRv1, MRv2
  • 21. Hadoop 2.0.0: Hadoop, HDFS; CDH5, MRv1; mapreduce*; CDH5, MRv1. http://archive.cloudera.com/cdh5/cdh/5/hadoop/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
  • 22. Apache Flume: Twitter Source (FLUME-2190), Twitter; HTTP Source, HTTPS (FLUME-2109)
  • 23. Apache HBase: Hadoop 2.0; MTTR; CDH4.2/HBase 0.94; REST proxy
  • 24. Apache HBase: HBase; ProtoBuf
  • 25. Apache HBase: Master, RS/Client; Master, Client; RS, Master, Master; Client, RS; .META., -ROOT-, HLog; .meta
  • 26. Apache HBase: HBASE-9373; HBASE-9158; .snapshot, .hbase_snapshots; CopyTable: startRow, stopRow; Import; MapReduce
  • 27. Apache HBase: CDH5 HBase (Apache HBase 0.95.2/0.96.x); CDH4 HBase (0.92/0.94); CDH 5 HBase; HBase; -ROOT-, .META., ZooKeeper; TotalOrderPartitioner; HFile v1
  • 28. Apache HDFS: mmap, HDFS; HDFS RW/RO; HDFS, NFSv3; HDFS; WebHdfsFileSystem; HdfsFileStatus; INodeId, INode; DistributedFileSystem, Create API; DataNode
  • 29. Apache HDFS: CDH5, copyToLocal; ID; HDFS; 1, 0
  • 30. Cloudera Hue: Sqoop App; HDFS; ZooKeeper App; Znode, ZooKeeper, Znode; Pig Editor, HBase Browser, Sqoop App; Hue Shell
  • 31. Cloudera Hue: JobTracker; HiveServer2; Beeswax; CDH5, Hue, HiveServer2; Django 1.2, 1.4; SAML
  • 32. Cloudera Hue: Hive/Impala; Hue; Hue Shell App; YARN
  • 33. Apache Hive, HCatalog: TRUNCATE, HIVE-466; LEAD/LAG/FIRST/LAST, Hive, HIVE-896; DECIMAL, HIVE-2693; ALTER VIEW AS SELECT, HIVE-3834
  • 34.
      Apache Hive, HCatalog: HIVE-3764; ORDER BY, HIVE-1402; HIVE-2206; GROUP BY, HIVE-2517; HQL, HIVE-2655
  • 35. Apache Hive: CDH5 Beta 1, Hive 0.11; Hive 0.11; Hive; CDH5, schematool; JDBC, HiveServer2; HiveServer2, CDH5; JDBC; CDH5 Hue, CDH4 HiveServer2
  • 36. Cloudera Impala: UDF; Impala, ETL/ELT; UDF, Hive UDF; C++; Java, Hive; CREATE FUNCTION, UDF; DROP FUNCTION, UDF
  • 37. Cloudera Impala: INVALIDATE METADATA, REFRESH; Impala: CREATE TABLE, ALTER TABLE, DROP TABLE, INSERT, LOAD DATA; Impala; Hive: INVALIDATE METADATA, REFRESH; catalogd
  • 38. Cloudera Impala: YARN; CDH5; Llama, YARN; Impala; EXPLAIN; EXPLAIN_LEVEL
  • 39. Cloudera Llama: Long-Lived Application MAster; YARN; Impala, Impala AM; Impala, YARN; Hadoop, Impala; Impala
  • 40. Apache Mahout: Vector, Matrix API; SGD; SVD++; Lucene, SequenceFile; k-means
  • 41. Apache MapReduce 2.0 (YARN): ResourceManager; ResourceManager, SPOF; cgroups, CPU
  • 42. Apache Oozie: Oozie; HCatalog; HCatalog; SLA; SLA; JMS; JMS, SLA
  • 43. Apache Oozie: Oozie; CDH5 Beta 1; Oozie; Oozie; CDH4.x, CDH5.x, Oozie; Oozie
  • 44.
  • 45. CDH4
        Component         CDH4.1.5      CDH4.2.2      CDH4.3.2      CDH4.4.0
        DataFu            0.0.4+14      0.0.4+17      0.0.4+20      0.0.4+22
        Flume             1.2.0+142     1.3.0+97      1.3.0+161     1.4.0+23
        Hadoop            2.0.0+573     2.0.0+968     2.0.0+1369    2.0.0+1475
        HBase             0.92.1+176    0.94.2+228    0.94.6+107    0.94.6+132
        HCatalog          -             0.4.0+219     0.5.0+11      0.5.0+13
        Hive              0.9.0+161     0.10.0+84     0.10.0+135    0.10.0+198
        Mahout            0.7+14        0.7+17        0.7+19        0.7+21
        MR1               0.20.2+1281   0.20.2+1361   0.20.2+1369   0.20.2+1475
        Oozie             3.2.0+140     3.3.0+83      3.3.2+54      3.3.2+92
        Pig               0.10.0+64     0.10.0+511    0.11.0+30     0.11.0+33
  • 46. CDH4
        Component         CDH4.1.5      CDH4.2.2      CDH4.3.2      CDH4.4.0
        Sqoop             1.4.1+60      1.4.2+61      1.4.3+36      1.4.3+62
        Sqoop2            -             1.99.1+34     1.99.1+117    1.99.2+85
        Whirr             0.8.0+24      0.8.0+27      0.8.2+13      0.8.2+15
        ZooKeeper         3.4.3+35      3.4.5+17      3.4.5+21      3.4.5+23
        Sentry            -             -             1.1.0         1.1.0
        Cloudera Hue      2.1.0+226     2.2.0+198     2.3.0+140     2.5.0+139
        Cloudera Impala   -             -             1.1.1         1.1.1
        Cloudera Search   -             -             1.0.0         1.0.0
  • 47.
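
The component tables on slides 13, 14, 45 and 46 list versions in the form `name-X.Y.Z+N` (for example `hadoop-2.2.0+353`) or just `X.Y.Z+N`. As a reading aid, the following is a minimal Python sketch that splits such a string into its parts and checks whether the upstream version differs between two CDH releases. The names `PackageVersion`, `parse_version` and `upstream_changed` are hypothetical helpers invented for this illustration, not part of any Cloudera tool. The transcript does not say what the number after `+` means, so the sketch keeps it as an opaque integer.

```python
import re
from typing import NamedTuple, Tuple


class PackageVersion(NamedTuple):
    component: str             # package prefix such as "hadoop"; "" when absent
    upstream: Tuple[int, ...]  # dotted version as integers, e.g. (2, 2, 0)
    suffix: int                # number after "+"; 0 when absent (meaning assumed opaque)


# Optional lowercase package prefix, dotted numeric version, optional "+N" suffix.
_PATTERN = re.compile(
    r"^(?:(?P<name>[a-z][a-z0-9-]*?)-)?"
    r"(?P<ver>\d+(?:\.\d+)*)"
    r"(?:\+(?P<suffix>\d+))?$"
)


def parse_version(text: str) -> PackageVersion:
    """Split a version string taken from the slide tables into its parts."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    upstream = tuple(int(part) for part in match.group("ver").split("."))
    return PackageVersion(
        component=match.group("name") or "",
        upstream=upstream,
        suffix=int(match.group("suffix") or 0),
    )


def upstream_changed(newer: str, older: str) -> bool:
    """True when the dotted upstream version differs between two releases."""
    return parse_version(newer).upstream != parse_version(older).upstream


# A few rows copied from slides 13 and 14: (CDH 5.0.0 Beta 1, CDH 4.4.0).
ROWS = {
    "Apache Hadoop": ("hadoop-2.2.0+353", "2.0.0+1475"),
    "Apache HBase": ("hbase-0.95.2+272", "0.94.6+132"),
    "Apache Pig": ("pig-0.11.0+46", "0.11.0+33"),
    "Apache ZooKeeper": ("zookeeper-3.4.5+25", "3.4.5+23"),
}

if __name__ == "__main__":
    for component, (cdh5, cdh4) in ROWS.items():
        changed = upstream_changed(cdh5, cdh4)
        status = "different upstream version" if changed else "same upstream version"
        print(f"{component}: {status}")
```

With the rows above, Hadoop and HBase are reported as having a different upstream version in CDH 5.0.0 Beta 1, while Pig and ZooKeeper keep the same upstream version and differ only in the `+N` suffix.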