
Data Analytics in Real World

Geeta Chauhan @ MUM May 2016

Master's in Computer Application

Systems Engineer

General Manager & Technical Director

Senior Consultant

Development Director

Innovation & Research Director

Chief Technology Officer

Led 13 New Products and Features across 30+ Products

Data Driven, Multi-tier, Social Media, Mobile, Cloud, Analytics

Agile, User Centered Design, Lean Startup, Mindfulness

India

USA

Challenges for Data Analytics in Real World

Technological

Rapidly evolving Technology Stack

Shift towards Open Source to contain costs

Shift from one standard way of doing things to a contextual, use-case-driven approach

Shift from on-premises solutions to Cloud and Hybrid cloud models

New types of access & usage patterns

Real Time, On-Demand, Exploratory, Internet of Things

Two different types of projects

Production - Bread & Butter

Experimental - High unknowns, don’t know what you don’t know

Organizational & Cultural

ROI - lead time for first set of outcomes

Data cleansing & ingestion take 80-90% of the effort

Lack of Domain Expertise, not asking or solving for the right questions

Learning curve - crucial for successful rollout of project

Data Driven decision making still new

Comfort level with high unknowns

Test driven approach - A/B Testing
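To make the A/B testing point concrete, here is a minimal sketch of how two variants might be compared. It assumes a simple two-proportion z-test on conversion counts; the function name and the sample numbers are illustrative and not from the talk.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns (z, p_value) for a two-sided two-proportion z-test.
    Hypothetical helper for illustration only.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No variation at all: nothing to distinguish the variants.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: 100 of 1000 users convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests the difference between variants is unlikely to be noise; the threshold to act on is a business decision.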

Architectural Patterns & Solutions

Lambda Architecture

Real-time speed layer + Batch Processing layer + Serving Layer

Edge Analytics – Internet of Things

Distributed analytics closer to source

Data Center as a Computer

Cluster computing, dynamic workloads

Blockchain

Distributed ledger, internet of value

Lambda (λ) Architecture
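The three layers (speed, batch, serving) can be sketched in a few lines. This is a toy, in-memory illustration that assumes a simple event-counting use case; the class and method names are hypothetical, and a real deployment would use separate distributed systems for each layer.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture: counts events per key."""

    def __init__(self):
        self.master_log = []         # append-only master dataset
        self.batch_view = Counter()  # recomputed from the full log
        self.speed_view = Counter()  # incremental, covers recent events only

    def ingest(self, key):
        # Every event goes to both the master dataset and the speed layer.
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        # Batch layer: recompute the view from scratch over all data.
        self.batch_view = Counter(self.master_log)
        # The batch view now covers everything, so the speed view resets.
        self.speed_view.clear()

    def query(self, key):
        # Serving layer: merge the batch view with the real-time view.
        return self.batch_view[key] + self.speed_view[key]


counts = LambdaCounter()
for page in ["home", "home", "pricing"]:
    counts.ingest(page)
before_batch = counts.query("home")   # answered by the speed layer
counts.run_batch()
counts.ingest("home")
after_batch = counts.query("home")    # batch view + speed view
```

Queries stay correct before and after the batch run because the serving step always merges both views.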

Edge Analytics

Cloudlets with Edge Analytics

Video

IoT

Automotive

Source: CMU
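As a rough illustration of analytics closer to the source, the sketch below assumes an edge node that reduces a window of raw sensor readings to a small summary before anything is sent upstream. The function name, fields, and threshold are hypothetical.

```python
from statistics import mean


def summarize_window(readings, alert_threshold):
    """Reduce a window of raw readings to a compact summary on the edge node.

    Only the summary (and any readings above the threshold) would be
    forwarded to the cloud, instead of every raw data point.
    """
    if not readings:
        return {"count": 0, "mean": None, "max": None, "alerts": []}
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "alerts": [r for r in readings if r > alert_threshold],
    }


# Made-up temperature readings from one device.
summary = summarize_window([20.0, 22.0, 30.0], alert_threshold=25.0)
```

The trade-off is lower bandwidth and latency in exchange for losing some raw detail centrally.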

Data Center as a Computer

Client Server Era

Small Apps, Big Servers

Static Partitioned

Cloud Era

Big Apps, Small Servers, Micro-services

Elastic Partitioned

Source: Andreessen Horowitz

Dynamic Workloads Resource Utilization

Distributed Systems Kernel

General Purpose dynamic shared cluster for multiple workloads

When resources become idle, can be reused by other schedulers

Source: Apache Mesos
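The idea of idle resources being reused by other schedulers can be sketched with a toy shared cluster. This is not the Apache Mesos API; the class, method names, and CPU counts are assumptions made purely for illustration.

```python
class ToyCluster:
    """Toy shared cluster that hands idle CPUs to whichever framework asks."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.allocated = {}  # framework name -> CPUs currently held

    def idle_cpus(self):
        return self.total_cpus - sum(self.allocated.values())

    def offer(self, framework, wanted):
        # Grant as much of the request as idle capacity allows.
        granted = min(wanted, self.idle_cpus())
        if granted:
            self.allocated[framework] = self.allocated.get(framework, 0) + granted
        return granted

    def release(self, framework):
        # Freed CPUs return to the shared pool for other frameworks.
        return self.allocated.pop(framework, 0)


cluster = ToyCluster(total_cpus=8)
cluster.offer("batch", 6)           # batch job takes most of the cluster
got = cluster.offer("stream", 4)    # only 2 CPUs are idle right now
cluster.release("batch")            # batch job finishes
more = cluster.offer("stream", 4)   # stream job reuses the freed CPUs
```

With static partitioning, the stream workload would have been capped at its own slice even while the batch partition sat idle.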

Blockchain

Decentralized ledger - Protocol underlying Bitcoin Cryptocurrency

Merkle Tree - Each block: Timestamp + Crypto Hash of prior block + Data

Open + Trust + Secure

Data Integration, provenance, privacy

Internet of Value

Source: Economist.com
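The block structure above (timestamp, hash of the prior block, data) can be sketched as a simple hash chain. This is a minimal illustration using SHA-256 from the standard library; it leaves out Merkle trees, consensus, and networking, and all function names and the sample data are hypothetical.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(timestamp, prev_hash, data):
    # Hash a canonical JSON encoding of the block's contents.
    payload = json.dumps(
        {"timestamp": timestamp, "prev_hash": prev_hash, "data": data},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, timestamp, data):
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    block = {
        "timestamp": timestamp,
        "prev_hash": prev_hash,
        "data": data,
        "hash": block_hash(timestamp, prev_hash, data),
    }
    chain.append(block)
    return block


def is_valid(chain):
    # Each block must link to its predecessor and match its own hash.
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["timestamp"], block["prev_hash"], block["data"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, 1, "alice pays bob 5")
append_block(ledger, 2, "bob pays carol 2")
valid_before = is_valid(ledger)
ledger[0]["data"] = "alice pays bob 500"  # tamper with history
valid_after = is_valid(ledger)
```

Because every block commits to the hash of the one before it, changing any earlier record invalidates the chain from that point on, which is what supports the provenance claim.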

Key Takeaways

Continuous Learning

Interpersonal Skills – Yes, and

Data Driven experimental approach

Contextual Use Case driven technology stack

Automation for rapid iterations and reproducible results

Meditation

Q & A

Contact: [email protected]

Resources

Lambda Architecture: http://lambda-architecture.net

Edge Analytics: https://www.cs.cmu.edu/~satya/docdir/satya-edge2015.pdf

Apache Mesos Whitepaper: https://www.cs.berkeley.edu/~alig/papers/mesos.pdf

Bitcoin Whitepaper: https://bitcoin.org/bitcoin.pdf
