Douetteau - Dataiku - Data Tuesday Open Source, 26 Feb 2013

Hal’s Headache Data Tuesday 02/25/2013 Florian Douetteau

Upload: data-tuesday

Post on 01-Jul-2015


TRANSCRIPT

Page 1:

Hal’s Headache

Data Tuesday

02/25/2013 Florian Douetteau

Page 2:

04/14/2023 2

Meet Hal Alowne

Dataiku - Data Tuesday

Big Guys
• $10B+ revenue
• 100M+ customers
• 100+ data scientists

Hal Alowne
BI Manager, Dim’s Private Showroom

European e-commerce web site
• $100M revenue
• 1 million customers
• 1 data analyst (Hal himself)

“Hey Hal! We need a big data platform like the big guys. Let’s just do as they do!”
Dim Sum, CEO & Founder, Dim’s Private Showroom

Big Data Copy Cat Project

Page 3:


CHOOSE TECHNOLOGY

Technologies: Hadoop, Ceph, Sphere, Cassandra, Spark, Scikit-Learn, Mahout, WEKA, MLBase, RapidMiner, pandas, D3, Crossfilter, InfiniDB, LucidDB, Impala, ElasticSearch, Solr, MongoDB, Riak, Membase, Pig, Hive, Cascading, Talend, R

Regions:
• Machine Learning Mystery Land
• Scalability Central
• NoSQL-Slavia
• SQL Columnar Republic
• Visualization County
• Data Clean Wasteland
• Statistician Old House

Page 4:


LEARN MACHINE LEARNING STUFF

Find people who understand machine learning and all this stuff

Try to understand myself

Page 5:


DO IT

Hadoop

SQL

D3

R

ElasticSearch

Web Logs

CRM

Open Data Storm

• Connect things together
• Pour data in
• Clean data
• Fix the leaks
• Start again

Gigabytes

Terabytes

Megabytes

Page 6:


MERIT = TIME + ROI

Targeted Newsletter

Recommender System

Dynamic Pricing

TIME: 6 MONTHS

ROI: APPS

Build a lab in 6 months (rather than 18 months)

Find the right people (6 months?)

Choose the technology (6 months?)

Make it work (6 months?)

Build the lab (6 months)

Deploy apps that actually deliver value

2013 2014

2013

• Train people
• Reuse working patterns

Page 7:


D1

Flow

Impact

Dataiku

Data Is Money

One fake customer, a few real ones

One Goal

“Help you build your data lab in less than six months”

One platform with an open source core

Shaker

Doctor

Feedback

all-in-one data scientists distribution

Manage datasets and transformations

Export Predictions

Diagnose Data

Prepare Data

Continuous Loopback

Page 8:


Thank You!
