Evolving Your Analytics Stack With Your Business Budapest Data Forum

Post on 08-Jun-2020


Page 1: Evolving Your Analytics Stack With Your Businessbiconsulting.hu/letoltes/2017budapestdata/keane... · Evolving Your Analytics Stack With Your Business Budapest Data Forum •Data

Evolving Your Analytics Stack With Your Business

Budapest Data Forum

Page 2: Hello! I’m Keane

• Data Scientist at Snowplow.

• Work with a number of clients from various industries. Focus on business analytics.

• Help users get set up with Snowplow and build data models.

Page 3: What is Snowplow?

• Open source event data pipeline.

• Enable users to track, process and act upon their data.

• Own your data.

Page 4: Businesses are constantly evolving…

• Your products (apps & platforms) change.

• Your questions should change too.

• It’s critical that the analytics stack can evolve with your business.

Page 5: How?

SELF-DESCRIBING DATA + EVENT DATA MODELING

EVOLVING EVENT DATA PIPELINE

Page 6: Part 1

SELF-DESCRIBING DATA

Page 7: No two companies are alike

Page 8: Define your own events and entities

Events: Article Load, Issue Open, Paywall Hit

Entities: Article, Content, Advert, Program

Events: View Recipe, Add To Basket, Rate Recipe

Entities: Recipe, Customer, Basket, Nutrition

Page 9: You then define a schema for each event and entity

{
  "description": "Schema for a nutrition context",
  "vendor": "com.gousto",
  "name": "nutrition",
  "version": "1-0-2",
  "properties": {
    "Recipe": {"type": "string"},
    "Description": {"type": "string"},
    "URL": {"type": "string"},
    "Calories": {"type": ["integer", "null"]},
    "Protein": {"type": "string"},
    "Fat": {"type": "string"}
  }
}

Page 10: You then define a schema for each event and entity

{
  "schema": "iglu:ufc/nutrition/jsonschema/1-0-2",
  "data": {
    "Recipe": "Beef Goulash",
    "Description": "Hearty beef goulash recipe",
    "Calories": "3000",
    "Protein": "13g",
    "Fat": "8g",
    "Carbohydrates": "123.5g",
    "URL": "www.gousto.com/recipes/beefgoulash"
  }
}
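The schema and the event from the two slides above can be checked against each other in a few lines of Python. This is a minimal sketch, not Snowplow's or Iglu's actual validator: `validate`, `matches` and `JSON_TYPES` are hypothetical names, only the `type` keyword is handled, and reporting undeclared fields as errors is an assumption the slides do not state.

```python
# Hand-rolled type check for illustration only (not the Snowplow/Iglu validator).
SCHEMA = {
    "description": "Schema for a nutrition context",
    "vendor": "com.gousto",
    "name": "nutrition",
    "version": "1-0-2",
    "properties": {
        "Recipe": {"type": "string"},
        "Description": {"type": "string"},
        "URL": {"type": "string"},
        "Calories": {"type": ["integer", "null"]},
        "Protein": {"type": "string"},
        "Fat": {"type": "string"},
    },
}

# The event payload, with the values exactly as they appear on the slide.
EVENT = {
    "schema": "iglu:ufc/nutrition/jsonschema/1-0-2",
    "data": {
        "Recipe": "Beef Goulash",
        "Description": "Hearty beef goulash recipe",
        "Calories": "3000",
        "Protein": "13g",
        "Fat": "8g",
        "Carbohydrates": "123.5g",
        "URL": "www.gousto.com/recipes/beefgoulash",
    },
}

# Mapping from JSON Schema type names to Python types (only those used here).
JSON_TYPES = {"string": str, "integer": int, "null": type(None)}


def matches(value, type_name):
    """Return True if value has the given JSON Schema type."""
    # bool is a subclass of int in Python, but it is not a JSON integer.
    if type_name == "integer" and isinstance(value, bool):
        return False
    return isinstance(value, JSON_TYPES[type_name])


def validate(data, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, value in data.items():
        spec = schema["properties"].get(field)
        if spec is None:
            # Assumption: undeclared fields are reported rather than ignored.
            errors.append(f"{field}: not declared in schema")
            continue
        declared = spec["type"]
        allowed = declared if isinstance(declared, list) else [declared]
        if not any(matches(value, t) for t in allowed):
            errors.append(
                f"{field}: expected {allowed}, got {type(value).__name__}"
            )
    return errors


for problem in validate(EVENT["data"], SCHEMA):
    print(problem)
```

Run as written, the sketch reports two problems: `Calories` is a string where the schema allows an integer or null, and `Carbohydrates` is not declared in the schema.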

Page 11: The schemas can then be used in a number of ways

• Validate the data (important for data quality)

• Load the data into tidy tables in your data warehouse

• Make it easy / safe to write downstream data processing applications (e.g. for real-time users)
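As an illustration of loading data into tidy tables, a schema can drive the table definition itself. The sketch below is not Snowplow's loader: `table_name`, `create_table_sql`, the `SQL_TYPES` mapping, the lower-cased column names and the `vendor_name_model` naming pattern are all assumptions made for the example.

```python
# Illustrative only: derive a warehouse table definition from a schema.
SCHEMA = {
    "vendor": "com.gousto",
    "name": "nutrition",
    "version": "1-0-2",
    "properties": {
        "Recipe": {"type": "string"},
        "Description": {"type": "string"},
        "URL": {"type": "string"},
        "Calories": {"type": ["integer", "null"]},
        "Protein": {"type": "string"},
        "Fat": {"type": "string"},
    },
}

# Assumed mapping from schema types to SQL column types.
SQL_TYPES = {"string": "VARCHAR(4096)", "integer": "BIGINT"}


def table_name(schema):
    """Build a table name from vendor, name and the first version component."""
    vendor = schema["vendor"].replace(".", "_")
    model = schema["version"].split("-")[0]
    return f"{vendor}_{schema['name']}_{model}"


def create_table_sql(schema):
    """Return a CREATE TABLE statement with one column per schema property."""
    columns = []
    for field, spec in schema["properties"].items():
        declared = spec["type"]
        types = declared if isinstance(declared, list) else [declared]
        # Use the first non-null type to pick the column type.
        base = next(t for t in types if t != "null")
        columns.append(f'    "{field.lower()}" {SQL_TYPES[base]}')
    body = ",\n".join(columns)
    return f"CREATE TABLE {table_name(schema)} (\n{body}\n);"


print(create_table_sql(SCHEMA))
```

Every event that passes validation against the schema then fits the same columns, which is what keeps the resulting table tidy.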

Page 12: Part 2

Event Data Modeling

Page 13: What is event data modeling?

• Event data modeling is the process of using business logic to aggregate over event-level data to produce 'modeled' data that is simpler to query.
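A small example of that definition: the sketch below aggregates event-level rows into one row per session. The sample events, the `sessionize` name, the output columns and the 30-minute inactivity timeout are assumptions chosen for illustration; the timeout is exactly the kind of business logic the definition refers to.

```python
from datetime import datetime, timedelta

# Hypothetical event-level data: (user_id, timestamp, event_name).
EVENTS = [
    ("u1", datetime(2017, 1, 1, 9, 0), "View Recipe"),
    ("u1", datetime(2017, 1, 1, 9, 5), "Add To Basket"),
    ("u1", datetime(2017, 1, 1, 14, 0), "View Recipe"),
    ("u2", datetime(2017, 1, 1, 9, 1), "View Recipe"),
]


def sessionize(events, timeout=timedelta(minutes=30)):
    """Aggregate events into session rows; a gap above timeout starts a new session."""
    sessions = []
    last_seen = {}  # user_id -> (timestamp of last event, current session row)
    for user, ts, name in sorted(events, key=lambda e: (e[0], e[1])):
        previous = last_seen.get(user)
        if previous is None or ts - previous[0] > timeout:
            row = {
                "user_id": user,
                "start": ts,
                "end": ts,
                "events": 0,
                "added_to_basket": False,
            }
            sessions.append(row)
        else:
            row = previous[1]
        row["end"] = ts
        row["events"] += 1
        row["added_to_basket"] = row["added_to_basket"] or name == "Add To Basket"
        last_seen[user] = (ts, row)
    return sessions


for session in sessionize(EVENTS):
    print(session["user_id"], session["events"], session["added_to_basket"])
```

The session table is easier to query than the raw events, but it embeds an opinion (the timeout) that the raw events do not.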

Page 14: Modeled vs. unmodeled data

Unmodeled data: IMMUTABLE. UNOPINIONATED. HARD TO CONSUME. NOT CONTENTIOUS

Modeled data: MUTABLE AND OPINIONATED. EASY TO CONSUME. MAY BE CONTENTIOUS

Page 15: In general, event data modeling is performed on the full event stream

• Late arriving events can change the way you understand earlier arriving events

• If we change our data models, this gives us the flexibility to recompute historical data based on the new model
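The effect of a late arriving event can be shown with a toy model that is always rebuilt from the full stream. Everything here is hypothetical: the `model_users` name, the integer timestamps and the sample events are for illustration only.

```python
def model_users(events):
    """Recompute the whole per-user table from the full event stream."""
    users = {}
    # Sort by event time, not by arrival order.
    for user, ts, name in sorted(events, key=lambda e: e[1]):
        row = users.setdefault(
            user, {"first_event": name, "first_seen": ts, "events": 0}
        )
        row["events"] += 1
    return users


# Events in arrival order: (user_id, event_time, event_name).
stream = [("u1", 10, "Article Load"), ("u1", 20, "Paywall Hit")]
before = model_users(stream)

# A late arriving event: it happened at t=5 but reached the pipeline last.
stream.append(("u1", 5, "Issue Open"))
after = model_users(stream)

print(before["u1"]["first_event"])
print(after["u1"]["first_event"])
```

Because the table is recomputed from all events rather than updated incrementally, the user's first event is corrected once the late event arrives, and the same recomputation would pick up any change to the model logic.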

Page 16: Part 3

Evolving the data pipeline

Page 17: How do we handle pipeline evolution?

BUSINESSES ARE NOT STATIC, SO EVENT PIPELINES SHOULD NOT BE EITHER

▸ Businesses change over time

  ▸ The events that occur are going to change

▸ Use of the data will change

  ▸ Insight -> more questions -> more insight -> more questions

▸ Two types of evolution: push and pull

Page 18: Push & Pull Factors

[Diagram: data sources (Web, Apps, Servers, Comms channels, Push …, Smart car / home) pass through Collection and Processing to the outputs: Data warehouse, Data exploration, Predictive modeling, Real-time dashboards, and Real-time, data-driven applications (RT Bidder, Voucher, Personalization, …)]

PUSH FACTORS: What is being tracked will change over time.

PULL FACTORS: The questions asked of the data will change over time.

Page 19: How do we handle pipeline evolution?

• If data is self-describing, it is easy to add additional sources

• Self-describing data is good for managing bad data and pipeline evolution

"I AM AN ISSUE OPEN EVENT AND I HAVE INFORMATION ABOUT THE USER AND ISSUE."
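Because each event names its own schema, a consumer can decide what to do with it without knowing the source in advance. The sketch below assumes the `iglu:vendor/name/format/version` layout of the schema reference shown earlier in the deck; `parse_schema_key`, `route`, the `com.example` vendor and the sample events are hypothetical.

```python
def parse_schema_key(uri):
    """Split 'iglu:vendor/name/format/version' into its four parts."""
    prefix = "iglu:"
    if not uri.startswith(prefix):
        raise ValueError(f"not an iglu reference: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected vendor/name/format/version: {uri!r}")
    vendor, name, fmt, version = parts
    return {"vendor": vendor, "name": name, "format": fmt, "version": version}


def route(events, known):
    """Separate events whose (vendor, name) is known from everything else."""
    good, bad = [], []
    for event in events:
        try:
            key = parse_schema_key(event.get("schema", ""))
        except ValueError:
            bad.append(event)
            continue
        if (key["vendor"], key["name"]) in known:
            good.append(event)
        else:
            bad.append(event)
    return good, bad


# Hypothetical events; only the first one describes itself with a known schema.
events = [
    {"schema": "iglu:com.example/issue_open/jsonschema/1-0-0", "data": {}},
    {"schema": "iglu:com.example/unknown_event/jsonschema/1-0-0", "data": {}},
    {"data": {"no": "schema"}},
]
good, bad = route(events, known={("com.example", "issue_open")})
print(len(good), len(bad))
```

Adding a new source then means registering a new schema in `known`; events that cannot be identified are set aside as bad data instead of breaking downstream consumers.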

Page 20: How do we handle pipeline evolution?

[Diagram: a cycle linking INSIGHT, QUESTION? and ANSWER]

Page 21: ANSWERING THE QUESTION: 3 POSSIBILITIES

1. Existing data model supports answer

2. Need to update data model

3. Need to update data model and data collection

Page 22: Self-describing data and the ability to recompute data models are essential to enable pipeline evolution

SELF-DESCRIBING DATA

• Update existing events and entities in a backward-compatible way, e.g. add optional new fields

• Update existing events and entities in a backward-incompatible way, e.g. change field types, remove fields, add compulsory fields

• Add new event and entity types

RECOMPUTE DATA MODELS ON ENTIRE DATA SET

• Add new columns to existing derived tables, e.g. add new audience segmentation

• Change the way existing derived tables are generated, e.g. change sessionization logic

• Create new derived tables
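One way to make schema changes explicit is to encode them in the schema version. The sketch below assumes the three-part MODEL-REVISION-ADDITION layout of the `1-0-2` version seen earlier in the deck. Bumping the last part for a backward-compatible change and the first part for an incompatible one is a simplification chosen for this example (the middle REVISION part is left out of scope), and `bump` is a hypothetical helper.

```python
def bump(version, change):
    """Return the next schema version for a 'compatible' or 'incompatible' change."""
    model, revision, addition = (int(part) for part in version.split("-"))
    if change == "compatible":
        # e.g. adding an optional field: existing data still validates.
        return f"{model}-{revision}-{addition + 1}"
    if change == "incompatible":
        # e.g. changing a field type, removing a field, adding a compulsory field.
        return f"{model + 1}-0-0"
    raise ValueError(f"unknown change type: {change!r}")


print(bump("1-0-2", "compatible"))
print(bump("1-0-2", "incompatible"))
```

With the version carried inside every event, data collected under old and new schemas can coexist, and data models can be recomputed over both.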

Page 23: Questions?