Metrics driven development 10.09.2014
Metrics Driven Development
How to improve visibility, communication and the feedback loop through metrics
Erno Aapa, @ernoaapa
Developer, DevOps-consultant
DevOps-Finland meetup, Helsinki, 10.09.2014
Who am I?
ERNO AAPA
Developer / Team Leader, DevOps-consultant, founder of DevOps-Finland
I work daily as a developer, am one of the tech team leaders at Avaus, and consult companies about DevOps.
In my free time I organize DevOps-Finland meetings.
11/09/14 2
Measurement is one of the keys
Everyone emphasizes the importance of it
Three Ways: Feedback
Feedback loop
M in C.A.L.M.S
Measure everything
Show the improvement
Build-MEASURE-Learn
Measure and learn
… but in reality… we need it, but we don’t have it
What we have
Business: monthly / yearly reports, financial reports, Google Analytics
Dev: ?
Ops: CPU, Mem, IO, Disk (Zabbix / Ganglia / Nagios)
QA: test coverage reports, CI server notifications, performance test results
What does that cause?
• Bad visibility
  The ops team tries to get the information from logs, load metrics, etc. Business doesn't have detailed visibility, only high-level reports.
• Devs "just do my job"
  They don't feel the "heartbeat" of the service.
• Wrong conception of the service state
  "No one complains, so it works… I guess :/"
• "Business bug": a feature is created but not used
  The service ends up full of unused legacy features.
Metrics Driven Development (MDD)
“…is a practice where metrics are used to drive the entire application development”
InfoQ / 2012
Principles of Metrics Driven Development (MDD)
Define metrics before implementation
• As in TDD, where you implement the test first, implement the metric first
• For example: include metrics in user stories
Instrumentation-as-Code
• Developers must be able to add a new metric with minimal (= one line) effort
Single Source of Truth
• Store common metrics from the app, logs, monitoring agents and other tools in a single place
• The platform should be so timely, comprehensive, and intuitive to use that everyone instinctively relies on it
Shared view for key metrics
• A shared view gives everyone the same vision
• So simple that everyone can understand it
Use metrics when making decisions
• You have powerful information, so use it when you're making decisions
Maintain and follow the metrics
• Follow the metrics: a feature that is popular now can be waste tomorrow
• Remove unneeded metrics: they are code, and code needs to be maintained
Sources: Erno Aapa / 2014, Librato Blog / 2014, InfoQ / 2012
User story
As a user I can order items in my shopping cart.
We measure the number of visiting users and the number of orders. We expect at least 20% of users to order the items in their shopping cart.
Define what to measure and what we expect in the user story
Can be defined in an EPIC or a FEATURE too
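The expectation in the user story can be checked with simple arithmetic. A minimal sketch, assuming the two counters are available as plain integers; the function names and the sample numbers are made up for illustration:

```python
def conversion_rate(visited: int, ordered: int) -> float:
    """Share of visiting users who placed an order (0.0 when nobody visited)."""
    if visited <= 0:
        return 0.0
    return ordered / visited


def meets_expectation(visited: int, ordered: int, expected: float = 0.20) -> bool:
    """True when the measured rate reaches the expectation from the user story."""
    return conversion_rate(visited, ordered) >= expected


# Made-up sample numbers, only to show the arithmetic.
rate = conversion_rate(visited=1500, ordered=330)
print(f"{rate:.0%} of visitors ordered, expectation met: {meets_expectation(1500, 330)}")
```

Keeping the expected value next to the measurement is what later makes it possible to draw both on the same graph.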
Collect required metrics
Developer adds the required code to collect the metrics

Somefile.rb
# Somewhere in your code where login happens...
statsd.increment("users.visited")

OtherFile.rb
# ...and where you handle the orders
statsd.increment("shopping.ordered")
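What such a one-line call does underneath is small. A minimal Python sketch of a counter client, assuming a StatsD daemon listening on UDP port 8125 on localhost (host, port and the `TinyStatsd` class name are assumptions, not part of the talk); a StatsD counter is sent as the text `<name>:<value>|c`:

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a StatsD counter datagram: '<name>:<value>|c'."""
    return f"{name}:{value}|c".encode("ascii")


class TinyStatsd:
    """Fire-and-forget counter client; the default host and port are assumptions."""

    def __init__(self, host: str = "127.0.0.1", port: int = 8125) -> None:
        self._address = (host, port)

    def increment(self, name: str, value: int = 1) -> None:
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
                sock.sendto(format_counter(name, value), self._address)
        except OSError:
            pass  # metrics must never break the request that emits them


statsd = TinyStatsd()
statsd.increment("users.visited")     # where login happens
statsd.increment("shopping.ordered")  # where orders are handled
```

UDP is used so that a missing or slow metrics server cannot slow down the application; in practice an existing StatsD client library would normally be used instead of hand-written code.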
Share visibility
Visualize the metric and also what we expect
Visualize what we are expecting. This makes it easy for everyone to understand what is good or bad.
Use this information when making decisions
Single place of truth
Provide a single place for the data so that it is easy to compare and analyze
Frontend Service Monitoring
Shared dashboard for key metrics
Make the most important metrics visible to everyone, all the time
In the meeting
Use the information when you’re making decisions
• Which features have gone live?
• Did we achieve what we wanted?
• Should we continue improving them?
• Or should we remove them?
• What is our overall status?
• What do we do next to achieve our goal?
Everyone benefits from it
DEVELOPER
- Gets the “feeling” of production and visibility into it
- Can drop unused features
- Can focus on the important parts
PRODUCT OWNER / STAKEHOLDERS
- Get real-time information from production
- Can make data-driven decisions
- Supports the Lean Startup way
TEST MANAGER
- Gets visibility into production
- Can scale tests to match production
SYSADMIN
- Gets visibility inside the app
- Finds the error more easily when there are problems
…just to name a few
Communication
Ops
Business
Dev
Metrics
A shared view and shared goals help communication
Benefits of MDD
Imaginary case
Simple Java web shop
My Webshop
Web shop
UI Server Graphite Grafana
The user orders items through the web UI.
After processing the order, the server sends data to Graphite.
Grafana is used to visualize the data.
By developer
1. Create counter
2. Increment counter
THAT'S ALL!
By developer
Type the metric name and… SEE THE RESULTS!
Gauges
• Queue sizes
Timers
• Query times
• Response times
Counters
• Execution counters
Histograms
• median, 75th, 90th, 95th, 98th, 99th
… when you get started
Measure time with a single @annotation
There are so many things you can measure, if you have the capability
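The timer and histogram ideas can be sketched in a few lines. This is a Python illustration of the "single annotation" approach using a decorator, not the Java library from the talk; the in-memory `timings_ms` store, the metric name and `process_order` are hypothetical:

```python
import functools
import math
import time
from collections import defaultdict

# In-memory store for the sketch; a real setup would forward these to a metrics backend.
timings_ms = defaultdict(list)


def timed(metric_name):
    """Decorator that records the wall-clock duration of each call in milliseconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                timings_ms[metric_name].append(elapsed_ms)
        return wrapper
    return decorator


def percentile(samples, pct):
    """Nearest-rank percentile (pct as an integer 1..100) of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


@timed("orders.process_time")
def process_order(order_id):  # hypothetical handler
    time.sleep(0.001)
    return order_id


for i in range(3):
    process_order(i)
print(f"median: {percentile(timings_ms['orders.process_time'], 50):.2f} ms")
```

The handler itself stays untouched; the measurement is added with one line above it, which is the point of instrumentation-as-code.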
Pitfalls
”Metrics are always powerful political ammunition, to be used for better or worse”
Metrics-driven Enterprise Software Development (book)
”Tried a solution where OPS and DEV were sharing one metrics server… it didn't work”
Mantas Klasavicius / Adform
”MONITOR ALL THE THINGS! No, you should not wrap every single method call in your application to increment a counter”
Librato blog
What tools to use? To collect, store, display? SaaS maybe?
TOOLS and WHY

COLLECT
Tools:
• StatsD (multiple languages)
• Coda Hale Metrics (Java)
• Easy to implement yourself
Why:
• Gives developers a ”one-line” way to collect any metric from the application
• The storage can be changed through configuration

STORE
Tools:
• Graphite
• InfluxDB
• … and more
Why:
• Stores time-series data
• Provides an easily accessible API
• Aggregates data on the fly
• Downsamples old data

DISPLAY
Tools:
• Grafana
• Tasseo
• … and more
Why:
• Simple to use
• Simple and clear graphs
• Possible to create multiple dashboards

SaaS
Tools:
• Librato.com
• geckoboard.com
• leftronic.com
• HostedGraphite.com
• Influxdb.com
• …and many more
Why:
• Easy to get started
• Maintained
• Free to test

Give a demo and convince others!
Graphite vs. InfluxDB: send data

Graphite (key / value; the plaintext protocol expects a Unix timestamp as the third field):
echo "dc1.server2.cpuload 5.6 $(date +%s)" | nc graphite.com 2003
echo "app.visitors 1 $(date +%s)" | nc graphite.com 2003

InfluxDB (JSON):
{
  "name": "cpu_load",
  "columns": ["value", "dc", "server"],
  "points": [[5.6, "dc1", "server2"]]
}
{
  "name": "visitors",
  "columns": ["value", "browser", "version"],
  "points": [[1, "Chrome", 37.0]]
}
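The two formats can be compared side by side in code. A minimal sketch that only builds the payloads and does not send anything; the helper names are hypothetical, the Graphite line follows the plaintext protocol (`<path> <value> <timestamp>`), and the JSON shape is the one shown on the slide:

```python
import json
import time


def graphite_line(path, value, timestamp=None):
    """One line of Graphite's plaintext protocol: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def influx_series(name, **fields):
    """One series object in the name / columns / points shape shown on the slide."""
    columns = list(fields)  # keyword arguments keep their order
    return {"name": name, "columns": columns, "points": [[fields[c] for c in columns]]}


# Fixed example timestamp so the output is reproducible.
print(graphite_line("dc1.server2.cpuload", 5.6, timestamp=1410339600), end="")
print(json.dumps(influx_series("cpu_load", value=5.6, dc="dc1", server="server2")))
```

In Graphite the datacenter and server are baked into the metric name, while in the JSON form they travel as separate columns next to the value.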
Graphite vs. InfluxDB: aggregate data

Graphite:
average(dc1.*.cpuload)
sum(app.visitors)
Not possible :(
// Top 5 servers with highest load
highestAverage(*.*.load, 5)

InfluxDB (not full SQL!):
select average(value) from cpu_load where dc = 'dc1'
select sum(value) from visitors
select sum(value) from visitors where browser = 'Chrome' and version > 35
Probably not possible :(
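Why the filtered query is possible only when extra columns are stored with each value can be shown without any database. A minimal Python sketch that emulates `select sum(value) ... where ...` over the columns / points shape; the `select_sum` helper and the sample points are made up for illustration:

```python
def select_sum(series, column, where=lambda row: True):
    """Sum one column over the rows that satisfy the predicate."""
    rows = [dict(zip(series["columns"], point)) for point in series["points"]]
    return sum(row[column] for row in rows if where(row))


visitors = {
    "name": "visitors",
    "columns": ["value", "browser", "version"],
    # Made-up sample points.
    "points": [[1, "Chrome", 37.0], [1, "Chrome", 34.0], [1, "Firefox", 32.0]],
}

total = select_sum(visitors, "value")
new_chrome = select_sum(
    visitors,
    "value",
    where=lambda r: r["browser"] == "Chrome" and r["version"] > 35,
)
print(total, new_chrome)
```

With a flat key such as `app.visitors` there is nothing to filter on, because the browser and version were never recorded alongside the count.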
Graphite vs. InfluxDB: summary

Graphite
• Older project
• Harder to install / configure
• Over 100 aggregation functions (sum, cumulative, compare, highestAverage)
• Good for collecting metrics
• Limited “WHERE” queries

InfluxDB
• Really young project (Apr 2013)
• Easy to get started
• Only basic aggregation functions, 17 in total (sum, min, max, median, percentile)
• Good for collecting events
• Limited possibilities to aggregate data
Don’t pick one, use both! (and don’t forget to check SaaS services too!)
What next? That can’t be all…
What next - Annotations
What happened at 19:27? There was an ad on TV
Add annotations automatically when something happens that can change the service state, such as deployments and other events.
• Have an easy way to add new alerts
• They don’t have to be major alerts; small “reminders” are OK too
• Notify developers by email when any page’s response time goes above 300 ms
• Notify business and developers when any feature’s usage drops below 5%
What next - Alerts
Users not using it?
Feature “favourite list” usage lower than 5%
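Alert rules like these can be kept as small pieces of data. A minimal sketch, assuming the current metric values are available as a dictionary; the metric names, the `AlertRule` class and the recipients are hypothetical, and actually sending the email is left out:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class AlertRule:
    metric: str
    triggered: Callable[[float], bool]
    notify: List[str]
    message: str


RULES = [
    AlertRule("page.response_time_ms", lambda v: v > 300,
              ["developers"], "Page response time above 300 ms"),
    AlertRule("feature.favourite_list.usage_pct", lambda v: v < 5,
              ["business", "developers"], "Feature usage below 5%"),
]


def evaluate(rules: List[AlertRule], current: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return (recipient, message) pairs for every rule whose threshold is crossed."""
    alerts = []
    for rule in rules:
        value = current.get(rule.metric)
        if value is not None and rule.triggered(value):
            text = f"{rule.message} ({rule.metric}={value})"
            alerts.extend((who, text) for who in rule.notify)
    return alerts


# Made-up current values.
for who, text in evaluate(RULES, {"page.response_time_ms": 120,
                                  "feature.favourite_list.usage_pct": 3.5}):
    print(f"notify {who}: {text}")
```

Adding a new alert is then one more entry in the list, which keeps the effort as low as adding a new metric.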
It’s waste… REMOVE IT!
Cleaner codebase
No complex UIs
Less functionality = easier to maintain
Easy to get started, so why not?
• Next time, plan how you could measure it
• Add the required metrics to the code
• Use free SaaS services to get started easily
• Create a nice dashboard, add the most important graph and convince others!
Try it today: create a demo for your team and convince them
Questions? Thank you!
Erno Aapa, @ernoaapa
Developer, DevOps-consultant
10.09.2014