Hadoop 101 (v1) (20150730)
![Page 1: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/1.jpg)
Hadoop 101: Big Data Technology
![Page 2: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/2.jpg)
What is Big Data?
![Page 3: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/3.jpg)
Big Data is ...
- A technology capable of handling:
  - massive and complex data (petabytes+)
  - streams of data in (near) real time
  - extremely large infrastructure
![Page 5: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/5.jpg)
What is Hadoop?
- Hadoop is:
  - scalable.
  - a “Framework”.
  - not a drop-in replacement for RDBMS.
  - great for pipelining massive amounts of data to achieve the end result.
![Page 6: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/6.jpg)
Hadoop Fun Facts
- Hadoop was created by Doug Cutting and Mike Cafarella. Cutting, who was working at Yahoo! at the time, named it after his son’s toy elephant.
- According to the Apache Hadoop website, Yahoo! has the single largest Hadoop cluster in the world (4,500 nodes).
- Yes, there is a Hadoop GPU framework available!
![Page 7: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/7.jpg)
Hadoop Core Components
![Page 8: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/8.jpg)
Hadoop Core Components (Details)

Hadoop 1.x
- HDFS (storage)
  - NameNode
  - DataNode
  - Secondary NameNode*
- MapReduce (processing)
  - JobTracker
  - TaskTrackers
  - JobHistoryServer

Hadoop 2.x
- HDFS (storage)
  - NameNode
  - DataNode
  - Secondary NameNode*
- YARN (processing)
  - ResourceManager
  - ApplicationMaster
  - NodeManager
  - JobHistoryServer
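
The MapReduce processing model named in the component list can be illustrated with a small word count. This is only a sketch in plain Python that imitates the map, shuffle, and reduce steps in a single process; it does not use Hadoop, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical names chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in order over an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop handles big data"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps would run in parallel on many nodes over data stored in HDFS; here everything runs in memory only to show how the phases fit together.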
![Page 9: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/9.jpg)
Hadoop Compatible Components (1)
- Manipulating/Querying Data:
  - Apache Hive (SQL-like query)
  - Cloudera Impala (SQL-like query)
  - Apache Pig (scripting-based query)
  - MapReduce (library)
- Key-Value Storage:
  - HBase
  - Cassandra
![Page 10: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/10.jpg)
Hadoop Compatible Components (2)
- Message Queueing:
  - Kafka (similar to RabbitMQ, Pub-Sub, etc.)
- Advanced Processing:
  - Spark (up to 100x faster than MapReduce for in-memory workloads, per the Spark project's own claim)
- Scheduler/Workflow:
  - Oozie (similar to crontab)
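
To make the message queueing entry more concrete, here is a toy sketch of a publish-subscribe topic, assuming "Pub-Sub" above refers to the publish-subscribe pattern. It is loosely modeled on the idea of an append-only log where each subscriber keeps its own read position. It is not Kafka's API: the `Topic` class and its methods are hypothetical and exist only for this illustration.

```python
class Topic:
    """Toy in-memory topic: an append-only log plus one offset per subscriber."""

    def __init__(self):
        self.log = []       # every published message, in arrival order
        self.offsets = {}   # subscriber name -> index of next unread message

    def publish(self, message):
        """Append a message to the end of the log."""
        self.log.append(message)

    def subscribe(self, name):
        """Register a subscriber that starts reading from the beginning."""
        self.offsets[name] = 0

    def poll(self, name):
        """Return the messages this subscriber has not read yet."""
        start = self.offsets[name]
        messages = self.log[start:]
        self.offsets[name] = len(self.log)
        return messages


if __name__ == "__main__":
    topic = Topic()
    topic.subscribe("analytics")
    topic.subscribe("archiver")
    topic.publish("page_view")
    topic.publish("click")
    print(topic.poll("analytics"))
    topic.publish("purchase")
    print(topic.poll("analytics"))
    print(topic.poll("archiver"))
```

Because each subscriber tracks its own offset, a slow consumer does not block a fast one, and both eventually see every message.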
![Page 11: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/11.jpg)
Hadoop Compatible Components (3)
- Data Export/Import:
  - Flume (stream: text files/logs to HDFS)
  - Sqoop (RDBMS to HDFS or vice versa)

and many more... :)
![Page 12: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/12.jpg)
Most Popular Hadoop Distributions
source: datanami.com
![Page 13: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/13.jpg)
Real Example of Using Hadoop* (1)
![Page 14: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/14.jpg)
Real Example of Using Hadoop* (2)
![Page 15: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/15.jpg)
Real Example of Using Hadoop* (3)
(near) Real Time Analytics
![Page 16: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/16.jpg)
Q&A Session
Join our LinkedIn Group
Big Data Indonesia: https://www.linkedin.com/grp/home?gid=6970225
![Page 17: Hadoop 101 (v1) (20150730)](https://reader034.vdocuments.site/reader034/viewer/2022051318/58821d2e1a28ab3f4c8b709b/html5/thumbnails/17.jpg)
Hadoop 101
Thank You

Unless stated otherwise, all images used in these slides belong to their respective owners.