TRANSCRIPT
Making Agile Work for Data Teams
Writing Effective Product Backlog Items (PBIs) for Data Products
Clare Stankwitz, Mathias Eifert
excella.com | @excellaco
Roadmap
1. Types of data projects
2. What distinguishes data teams?
3. Challenges of vertical slicing
4. Strategy 1: Lean Startup and Hypothesis-Driven Development
5. Strategy 2: Decoupling the stack
6. Key takeaways
What makes agile for data challenging in your organization?
What does a data team do?
Data Integration (DI): data tables & what lives in them
• Databases
• “ETLs”
Data Science: experimentation with data
• Complex statistical models
• Machine learning
Data Visualization (Data Viz): the art of data
• Dashboards to support decisions
• Other compelling representations
How is data development different?
Hidden, Hard, Experimental, but Valuable work
Data Tools
• Less commoditized
• Higher abstraction level
• Trend towards platforms
• Low testing support
Users Developers
Data teams are different.
Agile for data teams is different.
Backlogs for data teams are different.
Vertical Slices
When your vertical slice looks more like a pyramid
• Discovery
• Governance
• Plumbing
• Data prep
• Modeling
• UI
Vertical slices are a useful construct, not an obligation
What is value, anyway?
Business Value
Information Value
Strategy 1: Lean Startup & Hypothesis-Driven Development
Pop Quiz: “Creating a new product or service under conditions of extreme uncertainty”
Are we talking about:
a) Data Science
b) Lean Startup
c) Both a) and b)
Validated Learning: A unit of progress based on important assumptions being confirmed or refuted through data-driven testing.
Series of Problems to Solve: E.g. Acquisition, Activation, Retention, Revenue, and Referral.
MVP (Minimum Viable Product): The smallest thing we can do to collect the maximum amount of validated learning with the least effort.
CRISP-DM
Grants Analytics Project
Problem | Learning | Outcome
Extract data from PDFs | Document structure and key terms | Refine models
Identify validated results | Not enough target data | PIVOT: Change approach - predictions vs. anomalies
Make data accessible | Reading PDFs is too time consuming | Create searchable database
How will data be used | Context drives decisions | Integrate with other data sources
Make data widely usable | Users with varying tech skills | PIVOT: Interactive dashboard
Information Value Business Value
Riskiest Assumption Testing (RAT)
What could possibly go wrong?
Hypothesis-Driven Development
Testable Hypothesis: We believe that [doing this] will have [this desired outcome].
We know this to be true when we observe [some measurable result].
Hypothesis-Driven Development
Testable Hypothesis: We believe that incorporating [this additional modeling approach] will improve the quality of our model.
We know this to be true when we observe precision > 70% with recall ≥ 85%.
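The measurable result in the hypothesis above could be checked automatically. What follows is only a minimal sketch, assuming binary labels (1 = positive, 0 = negative). The names Hypothesis, precision_recall, and is_validated are hypothetical and not from the talk.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """A testable hypothesis with its acceptance thresholds (illustrative)."""
    belief: str
    min_precision: float  # strict bound: precision must be > this value
    min_recall: float     # inclusive bound: recall must be >= this value


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def is_validated(hypothesis, y_true, y_pred):
    """Return True when the observed metrics meet the hypothesis thresholds."""
    precision, recall = precision_recall(y_true, y_pred)
    return precision > hypothesis.min_precision and recall >= hypothesis.min_recall


# Thresholds taken from the slide: precision > 70% with recall >= 85%.
h = Hypothesis(
    belief="The additional modeling approach improves model quality",
    min_precision=0.70,
    min_recall=0.85,
)
```

Calling is_validated(h, y_true, y_pred) on a held-out test set confirms or refutes the hypothesis, and either answer counts as validated learning.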
Strategy 2: Decoupling & Multilevel Definitions of Done
Decoupling the Stack
Reduce dependencies to reduce risk
Deliver value at each increment of the product
Early stages = Information Value
PBI 1: Data discovery
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
Use mock/sample data to work from both ends
PBI 2: Prototype based on mock data
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
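One way to work from both ends is to agree on a schema early and generate mock rows that match it, so a dashboard or model prototype can start before extraction exists. This is only a sketch under stated assumptions: the column names, value ranges, and helper names (MOCK_SCHEMA, make_mock_rows, total_by) are hypothetical and not from the talk.

```python
import random

# Hypothetical schema for illustration only: each column maps to a generator
# that takes a random number generator and the row index.
MOCK_SCHEMA = {
    "grant_id": lambda rng, i: f"G-{i:04d}",
    "agency": lambda rng, i: rng.choice(["Agency A", "Agency B", "Agency C"]),
    "award_amount": lambda rng, i: round(rng.uniform(10_000, 500_000), 2),
}


def make_mock_rows(schema, n, seed=0):
    """Generate n mock rows matching the schema; the seed keeps runs reproducible."""
    rng = random.Random(seed)
    return [{col: gen(rng, i) for col, gen in schema.items()} for i in range(n)]


def total_by(rows, key, value):
    """Aggregate a numeric column by a grouping column, as a prototype chart might."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0.0) + row[value]
    return totals


rows = make_mock_rows(MOCK_SCHEMA, 20)
totals_by_agency = total_by(rows, "agency", "award_amount")
```

Because the prototype depends only on the agreed column names, the mock rows can later be swapped for a manual extract without changing the aggregation code.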
Build up instead of out; meet in the middle
PBI 3: Manual extract; approximate business needs
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
Crude version complete! (for 1 slice of functionality)
PBI 4: Simple version; transformed extract data
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
Level 2: Elaborate & Fortify
PBI 5
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
Finish line: robust, high-value product or feature
PBI “N”
[Example layers of a data product: Foundation, Viz + Models, Extraction, Preparation]
Enforcing Discipline: A Multi-Level Definition of Done
“Doneness” level | Tables, Queries, Functions, DBs: objects work in which environment? | Data Used | Tests (meeting min coverage)
Level 1 PBIs | Development | Mock | Schema, Unit
Level 2 PBIs | Preview | Subset of Sample Data | Integration
Level 3 PBIs | Production | Full Dataset | Performance
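The levels above could be encoded as data so that a PBI's "doneness" is checked the same way every time. This is a minimal sketch with assumptions: the names DoneLevel, DEFINITION_OF_DONE, and is_done are hypothetical, and the check covers only the tests listed for a level. The slides do not say whether tests from lower levels carry over, so that is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DoneLevel:
    """One row of the multi-level Definition of Done table."""
    environment: str
    data_used: str
    required_tests: frozenset


# Values mirror the table on the slide; the structure itself is illustrative.
DEFINITION_OF_DONE = {
    1: DoneLevel("Development", "Mock", frozenset({"schema", "unit"})),
    2: DoneLevel("Preview", "Subset of Sample Data", frozenset({"integration"})),
    3: DoneLevel("Production", "Full Dataset", frozenset({"performance"})),
}


def is_done(level, environment, passed_tests):
    """Return True if the work runs in the level's environment and passes its tests."""
    target = DEFINITION_OF_DONE[level]
    return (
        environment == target.environment
        and target.required_tests <= set(passed_tests)
    )
```

For example, is_done(1, "Development", {"schema", "unit"}) holds, while the same PBI checked against Level 2 does not until it runs in the Preview environment with integration tests passing.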
Examples of Level 1 PBI Goals
Create Tables:
• Tables evaluated in a dev environment
Populate Tables:
• Test tables populated with mock data
Data Discovery:
• Find location and quality of data
Level | Environment | Data Used | Tests (meeting min coverage)
Level 1 PBIs | Development | Mock | Schema, Unit
Examples of Level 2 PBI Goals
Create Tables:
• Tables evaluated in a staging environment
Populate Tables:
• Use sample data and address data quality
Data Exploration:
• Test models & prototype dashboards
Level | Environment | Data Used | Tests (meeting min coverage)
Level 2 PBIs | Staging | Sample | Integration
Examples of Level 3 PBI Goals
Create Tables:
• Tables evaluated in prod environment
Populate Tables:
• Performant using full dataset
Data Analysis:
• Refine models and vizes using full dataset
Level | Environment | Data Used | Tests (meeting min coverage)
Level 3 PBIs | Production | Full dataset | Performance
Summary: A Multi-Level Definition of Done
“Doneness” level | Tables, Queries, Functions, DBs: objects work in which environment? | Data Used | Tests (meeting min coverage)
Level 1 PBIs | Development | Mock | Schema, Unit
Level 2 PBIs | Preview | Subset of Sample Data | Integration
Level 3 PBIs | Production | Full Dataset | All tests
• Can have more than 3 levels, or define them differently
• Can combine with HDD for data science or other areas of high uncertainty
• Simpler modeling and other development (e.g., APIs) can often have just one level of Done
Key Points
Strive for independence
• Work through the system’s pyramid in small bites
PBIs come in many flavors
• Purpose is to add value and gain feedback
Test assumptions & stay open to change
• Generate knowledge value early & often
Data teams are different from software teams
• Structural, tactical, and mission differences
Questions? Insights? What could you take back to your teams?