Intro to Big Data

Post on 27-May-2015

Category:

Technology

DESCRIPTION

Introduction to Big Data.

TRANSCRIPT

Intro to Big Data On Premise

Presented by: Jon Bloom, Senior Consultant, Agile Bay, Inc.

Blog: http://www.bloomconsultingbi.com

Twitter: @sqljon

Linked-in: http://www.linkedin.com/in/BloomConsultingBI

Email: jbloom@AgileBay.com

Customers & Partners

www.agilebay.com

Session Agenda

• What is Big Data?
• What is Hadoop?
• BI vs. Hadoop
• Demo

Terms and Acronyms

Hadoop: An open source Apache project to develop software for reliable, scalable, distributed computing.

Cluster: A group of computers (nodes) linked together to perform highly available, compute-intensive work.

HDFS: A distributed file system that provides high-throughput access to application data.

YARN: A framework for job scheduling and cluster resource management.

MapReduce: A system for parallel processing of large data sets.

What is Big Data?

Volume, Velocity, Variety

What is Hadoop?

• Apache open source project
• Batch oriented
• Parallel processing across commodity servers

Ecosystem

• Ambari
• HBase
• Avro
• Cassandra
• Chukwa
• Hive
• Mahout
• Pig
• ZooKeeper

Distributed Computing & MapReduce

• Mapper
• Reducer
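To make the mapper and reducer roles concrete, here is a minimal single-machine sketch of a word count in plain Python. It is not Hadoop code: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the sample input are illustrative assumptions, and the `shuffle` step only imitates the grouping a real framework would perform across nodes between the two phases.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key (done here in memory, for illustration only)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop processes big data"]  # made-up input
    print(word_count(sample))
```

Because each call to `mapper` depends only on its own line, and each call to `reducer` only on its own key, both phases could in principle be spread across many machines, which is the property a cluster exploits.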

BI vs. Hadoop?

• Hadoop is not a replacement for BI
• Extends BI capabilities
• BI = Scale up to 100s of gigabytes
• Hadoop = From 100s of gigabytes to terabytes (1,000s of gigabytes) and petabytes (1,000,000s of gigabytes)
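As a quick check on the size arithmetic, the sketch below converts a gigabyte count into the largest fitting unit. It assumes decimal units (1 terabyte = 1,000 gigabytes, 1 petabyte = 1,000,000 gigabytes); the helper name `describe_size` is hypothetical and used only for illustration.

```python
# Decimal (SI) unit sizes, expressed in gigabytes. Assumed for this sketch.
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000


def describe_size(gigabytes):
    """Return a size given in gigabytes using the largest unit that fits."""
    if gigabytes >= GB_PER_PB:
        return f"{gigabytes / GB_PER_PB:g} PB"
    if gigabytes >= GB_PER_TB:
        return f"{gigabytes / GB_PER_TB:g} TB"
    return f"{gigabytes:g} GB"


if __name__ == "__main__":
    for size in (500, 2_000, 1_000_000):
        print(size, "GB is", describe_size(size))
```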

Demo

Thank you for attending!

Q & A

Blog: www.bloomconsultingbi.com

Twitter: @sqljon

Linked-in: http://www.linkedin.com/in/BloomConsultingBI

Email: jbloom@agilebay.com
