big data today and tomorrow
Mariusz Gil
/ ABOUT ME /
This talk is about BIG DATA
What is... BIG DATA?
VOLUME: large amounts of data
VELOCITY: needs to be analyzed quickly
VARIETY: different types of structured and unstructured data
Big Data is data that is too large, complex, and dynamic for any conventional data tools to capture, store, manage, and analyze.
30 billion pieces of content were added in the past month
more than 2 billion videos were watched yesterday
more than 58 million messages were sent yesterday
WHY?
690-node Hadoop cluster for predictions and analytics
HOW?
HBASE: COLUMNAR STORAGE
HIVE: SQL DATA WAREHOUSE ENGINE
AVRO: DATA SERIALIZATION
MAHOUT: SCALABLE MACHINE LEARNING
OOZIE: WORKFLOW ORCHESTRATION
ZOOKEEPER: DISTRIBUTED COORDINATION SERVICE
FLUME: LOG COLLECTOR
HDFS: HADOOP DISTRIBUTED FILE SYSTEM
YARN / MapReduce v2: DISTRIBUTED PROCESSING FRAMEWORK
AMBARI: PROVISIONING, MANAGING AND MONITORING CLUSTERS
WHIRR: RUNNING CLOUD SERVICES
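
As a rough illustration of the processing model behind the MapReduce entry above, here is a minimal single-process sketch of the map, shuffle, and reduce phases in plain Python. It does not use Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases sequentially over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, not data from the talk.
    print(word_count(["big data today", "big data tomorrow"]))
```

On a real cluster the map and reduce functions would run in parallel across many nodes over data stored in HDFS; only the overall shape of the computation is the same as in this sketch.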
EVOLVE
The future is not only HADOOP!
The future is low latency and REALTIME
Apache Drill
Storm
Data is the next BIG THING