May 2013 HUG: Building Common Denominator of Hadoop Distributions with Bigtop
DESCRIPTION
Bigtop is stepping up in its role as the foundation of a standard Hadoop-based data analytics stack, essentially putting most of the commercial offerings on a common footing. 6 out of 7 commercial vendors are using the Bigtop framework to power their distributions based on ASF Hadoop. Bigtop is also the must-have stabilization tool for the Hadoop platform, where any downstream application or system developer can make sure that their software will work with the next version of Hadoop. Presenter(s): Dr. Konstantin Boudnik, ASF Hadoop committer, Bigtop PMC; Director of Engineering, WANdisco. Roman Shaposhnik, VP, Apache Bigtop, IPMC member at ASF; Software engineer, Cloudera Inc.
TRANSCRIPT
Hadoop.next
Who are we?
● Hadoop downstream community
● Well, specifically:
– Roman Shaposhnik
● VP, Apache Bigtop, IPMC member at ASF
● Software engineer, Cloudera Inc.
– Dr. Konstantin Boudnik
● ASF Hadoop committer, Bigtop PMC
● Director of Engineering, WANdisco
What are we dealing with?
● Hadoop 1.x
– stable, but old
● Hadoop 2.0.x
– modern, used to be alpha, now stabilizing
● Hadoop 2.1.x
– modern, feature-driven
● Hadoop 3.x
– perpetual trunk
What are the implications?
● YARN's appeal as IaaS
● Fragmentation
● Repeat of “UNIX vendor wars”
● Cutting off vital sources of feedback
● Jaded downstream
● Confused users
● Delayed world domination
What's downstream to do?
● mvn help:all-profiles
Profile Id: hadoop_0.20.203 (active)
Profile Id: hadoop_1.0
Profile Id: hadoop_non_secure
Profile Id: hadoop_facebook
Profile Id: hadoop_0.23
Profile Id: hadoop_yarn
Profile Id: hadoop_2.0.0
Profile Id: hadoop_2.0.1
Profile Id: hadoop_2.0.2
Profile Id: hadoop_2.0.3
Profile Id: hadoop_trunk
Profile Id: hadoop_cdh4.1.2
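To make the profile sprawl concrete, here is a minimal Python sketch that reads a listing in the shape shown on the slide and reports which profile a plain build would pick up. The names `LISTING`, `parse_profiles` and `active_profiles` are hypothetical, and the line format is assumed to be exactly what the slide shows; the real maven-help-plugin output varies between plugin versions.

```python
import re

# Listing copied from the slide; real `mvn help:all-profiles` output
# may be formatted differently depending on the plugin version.
LISTING = """\
Profile Id: hadoop_0.20.203 (active)
Profile Id: hadoop_1.0
Profile Id: hadoop_non_secure
Profile Id: hadoop_facebook
Profile Id: hadoop_0.23
Profile Id: hadoop_yarn
Profile Id: hadoop_2.0.0
Profile Id: hadoop_2.0.1
Profile Id: hadoop_2.0.2
Profile Id: hadoop_2.0.3
Profile Id: hadoop_trunk
Profile Id: hadoop_cdh4.1.2
"""

# Assumed line shape: "Profile Id: <name>" optionally followed by "(active)".
PROFILE_RE = re.compile(r"^\s*Profile Id:\s*(\S+)(?:\s*\((active)\))?\s*$")


def parse_profiles(text):
    """Return a list of (profile_name, is_active) pairs, in listing order."""
    profiles = []
    for line in text.splitlines():
        match = PROFILE_RE.match(line)
        if match:
            profiles.append((match.group(1), match.group(2) is not None))
    return profiles


def active_profiles(text):
    """Return only the names of profiles marked as active."""
    return [name for name, active in parse_profiles(text) if active]


if __name__ == "__main__":
    print(len(parse_profiles(LISTING)), "profiles; active:", active_profiles(LISTING))
```

With the slide's listing, the only active profile is the oldest Hadoop line in the list, which is the point the next slide picks up.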
That active profile?
● http://mvnrepository.com/artifact/org.apache.hbase/hbase/0.94.3
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.94.3</version>
</dependency>
● Try finding Apache Giraph artifacts!
We don't have the TCK, but...
Zookeeper
HBase
Pig
Hive
Impala
Giraph
Hama
Hue
Solr
Crunch
Sqoop
Oozie
Whirr
Mahout
Flume
Apache Bigtop
“open-source software related to a system for integration, packaging, deployment and validation of a big data management software distribution based on Apache Hadoop”
Remember what Debian did to Linux?
GNU Software
Linux kernel
Bigtop is trying to do it with Hadoop
Hadoop Ecosystem (Pig, Hive, Mahout)
Hadoop (HDFS + MR)
What does Bigtop offer:
● Community focused on all of the above
● Software for:
– Integration
– Build (make, Maven)
– Packaging (RPM, DEB)
– Deployment (Puppet)
– Testing (iTest)
● A continuous integration Jenkins server
Embrace asynchronous nature
● Don't expect flag days
● Don't expect agreement on releases
● Do practice Last Known Good Builds
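[Diagram: successive build sets of components A, B, C and D, each pinned to its own version (for example A v1, B v2, C v3, D v4), with individual components moving to newer versions between sets]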
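The Last Known Good Builds practice can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bigtop's actual tooling: the class `LastKnownGood`, its `record` method, and the component names and versions are all hypothetical. The idea is that each component releases on its own schedule, and the integrator remembers the most recent combination of versions that passed integration testing.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LastKnownGood:
    """Tracks the newest component-version set that passed integration tests.

    Hypothetical helper for illustration only.
    """
    good: Optional[Dict[str, str]] = None
    history: List[Tuple[Dict[str, str], bool]] = field(default_factory=list)

    def record(self, versions: Dict[str, str], passed: bool) -> Optional[Dict[str, str]]:
        """Log one integration run; promote the set only if the run passed."""
        snapshot = dict(versions)  # copy so later caller mutations don't leak in
        self.history.append((snapshot, passed))
        if passed:
            self.good = snapshot
        return self.good


if __name__ == "__main__":
    lkg = LastKnownGood()
    # All four components integrate cleanly: this becomes the baseline.
    lkg.record({"A": "v1", "B": "v2", "C": "v3", "D": "v4"}, passed=True)
    # B ships a new version on its own schedule and the stack breaks:
    # downstream keeps building against the previous good set.
    lkg.record({"A": "v1", "B": "v22", "C": "v3", "D": "v4"}, passed=False)
    print(lkg.good)
```

No flag day is needed: a failing combination is simply never promoted, and consumers keep using the last set that worked until a newer one passes.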
Who's on-board?
● Cloudera
– CDH4 is 100% based on Bigtop (hadoop v2)
● WANdisco
● TrendMicro
● Hortonworks, EMC, eBay, Intel (partially)
● Canonical
– Ubuntu Server: Hadoop and Bigdata blueprint
● Illumos (early stages of interest)
What's happening
● A special release: Bigtop 0.3.0-incubating
– Hadoop 1.0.1
● Last stable release: Bigtop 0.5.0
– Hadoop 2.0.2-alpha
● Next stable release: Bigtop 0.6.0
– End of Mar 2013 release
– Hadoop 2.0.4-alpha
– Major focus on developers
A special note on 2.0.4-alpha
● It really will be 2.0.4.1
● First release to use Bigtop as release criteria
● A 100% community effort
● First non-vendor stabilization effort
● A stable base for 18 applications and counting!
<your idea here>