Brendan Haire, Atlassian, Presentation at Chief Data & Analytics Officer Forum, Melbourne

Building a data lake in the sky: data lake on AWS, agile lake delivery

Upload: corinium-coriniumglobal

Post on 21-Jan-2017

TRANSCRIPT

Page 1: Brendan Haire, Atlassian, Presentation at Chief Data & Analytics Officer Forum, Melbourne

Building a data lake in the sky

DATA LAKE ON AWS

AGILE LAKE DELIVERY

Page 2

Who am I?

Brendan Haire

Data experience

Through my career I have built and managed:

• reporting platform for an Australian University
• data warehouse and BI solution for a telco in Europe
• data warehouse and real-time data integration platform for a bank

... and finally I led the Analytics and Data Integration team at Atlassian for the past year, delivering on our data strategy.

About myself

• Atlassian for over 4 years
• IT for 20 years
• Roles from developer, dev mgr, architect to project mgmt
• Software Engineering background
• Developer at heart

Page 3

Starting point

Atlassian

• Software company
• Fast growing
• Data Driven
• IPO

Data

• 200TB Data
• ~1000 users per week (~800 reporting, ~200 ad hoc)
• 30k queries per day

Context

• Team of 4
• Legacy EDW
• Multiple data silos
• Emerging problem

Page 4

The Problem

• Scale/Cost
• Data Everywhere
• Slow Analysis
• Duplication Effort

Page 5

Data lake on AWS

“A lake in the clouds”

Page 6

Vision

Principles

A data pipeline and analytic platform that:

• handles large and small data sets
• supports real-time and batch functions
• makes it easy to add raw data for immediate use
• allows value to be progressively added through stages
• supports self-service analysis and integration functions

[Side labels on slide: Enabling Analytics, Scale, Friction]
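The principle of adding value progressively through stages can be sketched as a simple naming convention for object keys in the lake. This is only an illustration: the stage names (`raw`, `cleaned`, `curated`), the `dt=` date partition, and the helper functions `stage_key` and `next_stage` are all assumptions, not taken from the slides.

```python
from datetime import date

# Hypothetical stage names; the slides only say value is added "through stages".
STAGES = ("raw", "cleaned", "curated")


def stage_key(stage, dataset, day):
    """Build an object-key prefix for one data set, at one stage, for one day."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return f"{stage}/{dataset}/dt={day.isoformat()}/"


def next_stage(stage):
    """Return the stage that follows `stage`, or None if it is the last one."""
    position = STAGES.index(stage)
    return STAGES[position + 1] if position + 1 < len(STAGES) else None


# Raw data lands first and is usable immediately; later stages refine it.
raw_prefix = stage_key("raw", "events", date(2017, 1, 21))
refined_prefix = stage_key(next_stage("raw"), "events", date(2017, 1, 21))
```

Keeping the stage as the leading path segment means raw data is queryable as soon as it lands, while refined copies can be added later without moving the original.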

Page 7

Conceptual

Source Systems

Data Applications

Business Intelligence

1 Data Lake

2 Data Stream

Page 8

Solution

Page 9

Good, Bad, Ugly

The Good

• New analytics capability
• Less ETL and moving data
• Performance
• AWS - flexibility
• Scaling - compute vs storage
• Cost - control + predictability

The Bad

• High learning curve
• New tooling
• Data Governance

The Ugly

• ‘Cutting edge’ hurts

Page 10

Agile lake delivery

“From pond to lake”

Page 11
Page 12
Page 13

by Henrik Kniberg

Page 14

Minimal Viable Product (MVP)

Page 15

Weekly Active Usage (WAU)
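As a rough illustration of how a weekly active usage figure might be derived, the sketch below counts distinct users per ISO week from a query log. The definition used here (a user is active in a week if they ran at least one query) and the sample records are assumptions; the slides do not say how WAU was calculated.

```python
from collections import defaultdict
from datetime import date


def weekly_active_users(events):
    """Count distinct users per ISO week from (user_id, day) pairs."""
    users_by_week = defaultdict(set)
    for user_id, day in events:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in sorted(users_by_week.items())}


# Hypothetical query-log sample: (user, date the query ran).
sample = [
    ("ana", date(2017, 1, 16)),
    ("ana", date(2017, 1, 17)),  # same user, same week: counted once
    ("raj", date(2017, 1, 18)),
    ("ana", date(2017, 1, 23)),  # falls in the following ISO week
]

wau = weekly_active_users(sample)
```

Here `wau` maps each `(year, week)` pair to the number of distinct users seen in that week.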

Page 16

Enabling Innovation

Hypothesis

• Problem statement
• Vision
• Research
• Talk to people

Test

• ShipIT / Hackathons
• Spikes
• Minimum Viable Product

Feedback

• User Feedback
• Usage

Page 17

Takeaways

• Self service is key in reducing friction and enabling scale
• Providing analysts access to raw data is a game changer
• Incremental delivery and feedback drive innovation
• When building a platform, usage is a great proxy for value

[Labels on slide: Incremental, Self Service, Raw Data, Usage, Feedback]

Page 18

Thank you!