Copyright © 2016 Splunk Inc.
Listen to Your Machines: DevOps Analytics for Better Feedback Loops
Andi Mann, Chief Technology Advocate, Splunk, @andimann
‘Known good practices’ for collecting, correlating, and analyzing DevOps data
Effective DevOps Practices
● Improve collaboration and sharing between dev and ops
● Build trust and accountability between teams
● Break down barriers and improve handoffs between silos
● Establish trust and transparency between Dev. and Ops.
● Streamline flow of code from idea to cash
● Create feedback loops at every stage
● Focus on impact on business goals and customer experience
FROM EVERY TOOL, EVERY PROCESS, EVERY COMPONENT, ON-PREM OR OFF
One Constant: Machine Data
Common Data Fabric
API, SDKs, UI
Server, Storage, Network
Server Virtualization
Operating Systems
Infrastructure Applications
Mobile Applications
Cloud Services
Other Tools
Ticketing/Help Desk
Custom Applications
Visibility Across the Whole Ops Environment
API Services
Common Data Fabric
API, SDKs, UI
Other Tools
Escalation/Collaboration
Visibility Across the Whole Dev Lifecycle
Plan Code Build Test/QA Stage Release Config Monitor
Important DevOps data and metrics for different DevOps teams
Computing UK’s ‘Metrics that Matter’
Source: Computing Research UK, DevOps Review 2016: Accelerating Innovation, July 2016
More DevOps Metrics that Might Matter
• Culture, e.g. retention, work hours, callouts
• Process, e.g. idea-to-cash, MTTR, delivery time
• Quality, e.g. tests passed, tests failed, best/worst
• Systems, e.g. throughput, uptime, build times
• Activity, e.g. commits, tests run, releases
• Impact, e.g. signups, checkouts, revenue
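To make a process metric such as MTTR concrete, here is a minimal sketch that averages the time from an incident opening to its resolution. It reads MTTR as mean time to restore, which is only one common expansion of the acronym. The `Incident` record, its field names, and the sample timestamps are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    opened: datetime
    resolved: datetime


def mean_time_to_restore(incidents: list[Incident]) -> timedelta:
    """Average duration from open to resolve across all incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.resolved - i.opened for i in incidents), timedelta())
    return total / len(incidents)


# Made-up sample data: one 30-minute and one 90-minute incident.
incidents = [
    Incident(datetime(2016, 7, 1, 9, 0), datetime(2016, 7, 1, 9, 30)),
    Incident(datetime(2016, 7, 2, 14, 0), datetime(2016, 7, 2, 15, 30)),
]
mttr = mean_time_to_restore(incidents)
print(mttr)
```

In practice the open and resolve timestamps would come from ticketing or alerting data rather than hard-coded values.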
Specific Metrics For Each Stakeholder
• Biz: time to deliver, idea to cash, ROI
• PMO: process times, team efficiency, unplanned work
• Dev: code volume, commit volume, release speed
• QA: test volume, code coverage, exception counts
• Sec: access attempts, remediation time, code quality
• Build: build speed, failure rates, manual exceptions
• Stage: performance, latency, scalability
• Ops: response time, uptime/availability, resource usage
• Biz: revenue, signups, satisfaction
Shared Metrics for Multiple Stakeholders
• Biz: time to deliver, scalability, ROI
• PMO: time to deliver, team efficiency, ROI
• Dev: team efficiency, scalability, release speed
• QA: remediation time, code quality, performance
• Sec: remediation time, code quality, manual exceptions
• Build: scalability, performance, manual exceptions
• Stage: release speed, code quality, scalability
• Ops: remediation time, performance, scalability
• Biz: performance, release speed, ROI
What About Just Dev and Ops
Velocity
time to deploy
time to build
‘idea to customer’
story throughput
build failure rate
Quality
number of defects
downtime per release
code coverage
response time impacts
build failures
Business impact
customer satisfaction
application usage
user signup/cancel
transaction failures
sales volumes
Human impact
employee satisfaction
team productivity
staff retention
work hours
‘work from home’ days
Source: Computing Research UK, DevOps Review 2016: Accelerating Innovation, July 2016
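A couple of the velocity and quality metrics listed above can be derived from plain build records. The sketch below computes a build failure rate and a mean time to deploy. The record layout (build id, success flag, minutes from commit to deploy) and the sample values are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical build records: (build_id, succeeded, minutes_from_commit_to_deploy)
builds = [
    ("b1", True, 42.0),
    ("b2", False, None),  # a failed build never deploys
    ("b3", True, 38.0),
    ("b4", True, 55.0),
]


def build_failure_rate(records):
    """Fraction of builds that failed."""
    failures = sum(1 for _, ok, _ in records if not ok)
    return failures / len(records)


def mean_time_to_deploy(records):
    """Mean commit-to-deploy time in minutes over successful builds."""
    return mean(t for _, ok, t in records if ok)


failure_rate = build_failure_rate(builds)
deploy_minutes = mean_time_to_deploy(builds)
print(f"build failure rate: {failure_rate:.0%}")
print(f"mean time to deploy: {deploy_minutes:.1f} min")
```

Real records would typically be pulled from CI/build server logs rather than a literal list.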
How different orgs and teams benefit from using DevOps data in multiple feedback loops
State of DevOps 2016: ‘Metrics that Matter’
Source: 2016 State of DevOps Report, DevOps Research and Assessment
Objective data enables continuous improvement
Defect Information
Capacity Planning
Quality Standards
Enhancement Requests
Integration Requirements
Acceptance Metrics
Service Levels and KPIs
Application Development Test and Acceptance Production
Plan Code Build Test/QA Stage Release Config Monitor
Infrastructure Dependencies
Shared Data Helps Find and Fix Issues Faster
● Real-time dashboards show the error rate in production and the impact of pushing new builds
● Developers can search and visualize web logs and Java logs without production access
● Alerts notify developers as soon as a problem arises
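As a rough illustration of the dashboard and alerting idea, the sketch below computes a per-build error rate from web log events and lists the builds that cross a threshold. The event layout, the build identifiers, and the `ERROR_THRESHOLD` value are all assumptions made for the example. It is not how any specific product implements alerting.

```python
from collections import Counter

# Hypothetical log events: (build_id, http_status)
events = [
    ("build-101", 200), ("build-101", 200), ("build-101", 500), ("build-101", 200),
    ("build-102", 500), ("build-102", 503), ("build-102", 200), ("build-102", 500),
]

ERROR_THRESHOLD = 0.5  # assumed alerting threshold; tune per service


def error_rate_by_build(log_events):
    """Return {build_id: fraction of requests with a 5xx status}."""
    totals, errors = Counter(), Counter()
    for build_id, status in log_events:
        totals[build_id] += 1
        if status >= 500:
            errors[build_id] += 1
    return {b: errors[b] / totals[b] for b in totals}


def builds_to_alert(rates, threshold=ERROR_THRESHOLD):
    """Builds whose error rate exceeds the threshold, sorted by id."""
    return sorted(b for b, r in rates.items() if r > threshold)


rates = error_rate_by_build(events)
alerts = builds_to_alert(rates)
print(rates, alerts)
```

Tagging each event with the build that produced it is what lets the error rate be compared before and after a push.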
Shared Data Increases App Delivery Velocity
DevOps Teams iterate with continuous insights
Product Managers identify new opportunities
Code continuously delivered to market
Auditors have visibility
Customers are happy
Shared Data Improves Code Quality
Developers check in code
White Box: Code Quality Scans, Static Security Scans
Test Fail: Return
Test Pass: Promote
Black Box: Automated Acceptance Tests, Dynamic Security Scans, “Chaos Monkey” Tests
Test Fail: Return
Test Pass: Promote to Production
Production
QA Pattern Library, QA Prod Pattern
Pattern library used for test and QA
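The promote-or-return flow on this slide can be sketched as a sequence of gated stages. This is a minimal sketch under stated assumptions: the stage grouping, the function names, and the lambda checks are placeholders, and real checks would call scanners and test runners.

```python
from typing import Callable

# Each check takes a build identifier and returns True on pass.
Check = Callable[[str], bool]


def run_stage(build: str, checks: dict[str, Check]) -> list[str]:
    """Run every check in a stage and return the names of those that failed."""
    return [name for name, check in checks.items() if not check(build)]


def promote(build: str, stages: list[tuple[str, dict[str, Check]]]) -> str:
    """Walk the stages in order; return the build to developers on first failure."""
    for stage_name, checks in stages:
        failed = run_stage(build, checks)
        if failed:
            return f"return to dev: {stage_name} failed {failed}"
    return "promote to production"


# Placeholder checks standing in for real scanners and test suites.
stages = [
    ("white box", {"code quality scan": lambda b: True,
                   "static security scan": lambda b: True}),
    ("black box", {"acceptance tests": lambda b: True,
                   "dynamic security scan": lambda b: b != "build-7",
                   "chaos tests": lambda b: True}),
]

good = promote("build-8", stages)
bad = promote("build-7", stages)
print(good)
print(bad)
```

Recording each stage result as an event alongside the build identifier is one way to share the outcome with both dev and ops.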
Shared Data Aligns DevOps With Business Impact
Some Real-World ‘Metrics That Matter’
“Developers can focus on innovation and not on building monitoring tools.”
“Web Ops can measure performance of releases in pre-prod, prod and in QA.”
“Gather all data, and it starts looking like one big system, instead of a bazillion teeny ones that hate each other.”
“We measure customer sentiment on Google Play in real time and can correlate it with code releases and app performance.”
Enable Improved DevOps Agility
“It’s like we were working without peripheral vision before and now we have it.”
- Robert Gonsalves, Web Operations
Key Customer Benefits
• Increased success rate of deployments
• Ability to detect issues before they affect broad production
• Monitoring deployment process several times per day
Deliver Better Code Quality
“Developers are now able to look for errors and troubleshoot issues five to 10 times faster by having all their event data centralized in Splunk.”
- Principal Engineer, Apollo Group
Key Customer Benefits
• Provide full visibility into QA sanity and load testing before production
• Exceed SLA thresholds with full visibility and benchmark key infrastructure metrics and errors
• Easily troubleshoot if tests do not contain the expected results
Enable Data-Driven Continuous Delivery
“Dump all the logs into Splunk, and it starts looking like one big system, instead of a bazillion teeny ones that hate each other.”
- Alison Perkins, Senior Systems Engineer
Key Customer Benefits
• Quickly validate and troubleshoot code pushes to production
• Ensure that new code does not negatively impact performance or user experience
• Reduced one application’s error rate by 2 orders of magnitude in a matter of weeks
Enable Closer Business-IT Alignment
Allows DevOps to ensure quality of releases & avoid negative impact on service performance.
Analyze which new website features are being adopted, and how, by end users.
Insight fed back into the development cycle to improve customer engagement.
Data-oriented service delivery visualization to help leaders see value in opaque DevOps processes
Data From Dev and Ops Tools
Data From Provisioning and Config
Data From Release Servers
Data From Infrastructure Systems
Data From Database Servers
Data From Automation Servers
Data From DevOps Systems
Advanced analytics to enable data-driven DevOps decisions
DevOps VSM in a Glass Table View: all good
Glass Table View: threshold exceeded
Deep Dive: Build Service Status
Deep Dive: Build Server Error Message
Machine-learning & predictive analytics in a DevOps-driven service delivery culture
Apply Advanced Algorithms to Your Data
Track and Predict Anomalies
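For a sense of what tracking anomalies can mean at its simplest, the sketch below flags values that sit far from the mean of a trailing window, using a z-score. This is a basic statistical stand-in chosen for illustration, not the algorithm used by any product shown in the talk. The function name, window size, threshold, and build-time data are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, z_threshold=3.0):
    """Flag indexes whose value is more than z_threshold standard
    deviations from the mean of the preceding `window` values."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history: no spread to compare against
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical build times in seconds; the value at index 7 is an outlier.
build_times = [120, 118, 123, 121, 119, 122, 120, 240, 121, 119]
anomalies = find_anomalies(build_times)
print(anomalies)
```

A trailing window keeps the baseline local, so a slow drift in build times shifts the baseline instead of raising a flag on every point.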
Summary
FROM EVERY TOOL, EVERY PROCESS, EVERY COMPONENT, ON-PREM OR OFF
One Constant: Machine Data
Common Data Fabric
API, SDKs, UI
Server, Storage, Network
Server Virtualization
Operating Systems
Infrastructure Applications
Mobile Applications
Cloud Services
Other Tools
Ticketing/Help Desk
Custom Applications
Visibility Across the Whole Ops Environment
API Services
Common Data Fabric
API, SDKs, UI
Other Tools
Escalation/Collaboration
Visibility Across the Whole Dev Lifecycle
Plan Code Build Test/QA Stage Release Config Monitor
Code Repository
Automation Systems
Application Monitoring
CI/Build Servers
Project & Issue Tracking
Dev/Test/Staging Servers
Infrastructure
Cloud
Explore, Visualize, Customize, Share, Analyze
Common Data Fabric
API, SDKs, UI
INCREASE APP DELIVERY VELOCITY
IMPROVE CODE QUALITY
INCREASE BUSINESS IMPACT
Improve the Impact of Application Delivery
For more, please visit Splunk.com
• Splunk for DevOps: www.splunk.com/DevOps
• Splunk DevOps Ecosystem Apps: splunkbase.splunk.com
• Splunk blogs: blogs.splunk.com
• Splunk community: www.splunk.com/community
• DevOps demo available
Sources/Additional Reading
● splunk.com/DevOps - Resources on Splunk for DevOps incl. case studies, customer stories, partners, products, videos, etc.
● dev.splunk.com - Resources for developing with or on the Splunk platform, incl. SDKs, API Docs, guides, etc.
● blogs.splunk.com - Check the ‘DevOps’ and ‘Ansible’ tags for specifics, including how to deploy Splunk w/ Ansible
● splunkbase.splunk.com - Splunk add-ons and applications incl. Ansible Tower App for Splunk and 1000+ more
● DevOps Review 2016: Accelerating Innovation, Computing Research UK, July 2016
● 2016 State of DevOps Report, DevOps Research and Assessment
● The DevOps Cookbook, John Allspaw, Patrick Debois, Damon Edwards, Jez Humble, Gene Kim, Mike Orzen, and John Willis
● The Phoenix Project, Gene Kim, Kevin Behr, George Spafford
● Data-Driven DevOps: Use Metrics to Help Guide Your Journey, Gartner Inc. 2014, Cameron Haight and Tapati Bandopadhyay
● Metrics that Matter, Mark Michaelis, IntelliTect
● DevOps and the Cost of Downtime: Fortune 1000, IDC
● DevOps Best Practice Metrics: Fortune 1000 Survey, IDC, 2014
● The Seven Habits Of Highly Effective DevOps, Forrester Research, Inc., October 2, 2014
Thank You