our hadoop ppt

12
 Introducti on to computers Group No. 5 Shwetank Mishra - 33 Rashmi – 35 Pinkesh - 41 Pratik - 43 hushan - 45

Upload: rashu-parab

Post on 06-Oct-2015

17 views

Category:

Documents


0 download

DESCRIPTION

Human Generated Data Machine Generated DataBefore 'Hadoop' was in the scene, the machine generated data was mostly ignored and not captured. Hadoop is one way of using an enormous cluster of computers to store an enormous amount of data

TRANSCRIPT

Slide 1

Introduction to computersGroup No. 5Shwetank Mishra - 33 Rashmi 35Pinkesh - 41Pratik - 43Bhushan - 45

What is this Hadoop..?Who use this..?How does it work..? The Idea behind this innovationHadoopCreated by Doug Cutting & Mike Cafarella in 2005Apache projectHadoop is an infrastructure software.

What Is Hadoop?Big Data & its IMPACT

Big Data & Hadoop

Human Generated Data Machine Generated DataBefore 'Hadoop' was in the scene, the machine generated data was mostly ignored and not captured. Hadoop is one way of using an enormous cluster of computers to store an enormous amount of dataFeatures of Big DataHow Hadoop solves problem of Big Data

Hadoop clusters scale horizontallyHadoop can handle unstructured / semi-structured dataHadoop clusters provides storage and computingHadoop provides storage for Big Data at reasonable costHow does Hadoop Work..?TWO Things to keep in mind

Stores voluminous data even bigger than the PCs capacity Processes all the data on all the nodes in unique way

MapReduce in a nutshell7This work was partially supported by the SCAPE Project.The SCAPE project is cofunded by the European Union under FP7 ICT2009.4.1 (Grant Agreement number 270137).Task1

Task 2Task 3Output dataAggregated ResultAggregated Result Sven SchlarbRDBMS V/s HadoopWorks well with structured data only.Works well with both structures & unstructured data. RDBMSHadoopRequires more Implementation time.Requires less time.Non-ReliableReliableNo-high throughputHigh throughputWhich Companies are Using Hadoop

Our suggestion (Non Users)

Companies Like;What Makes Hadoop Unique amongst all is; FAULT TOLERANCEWhy Hadoop @ Yahoo!Example (Search Data)Before HadoopAfter HadoopTime 26 days20 minutesLanguageC++PythonDevelopment time2-3 weeks2-3 days Data base for search assist is built using Hadoop

Thank you