EMA Top 3 Enterprise Decision Guide 2021
Topic: Observability Platforms
Data-Driven Guidance for Product Evaluation in DevOps, SRE, IT Operations, and Business
Q3 2021 EMA Top 3 Report
By Torsten Volk
Table of Contents
EMA Top 3 Awards
The Five-Step Product Selection Process
Product Selection Criteria
The Cloud-Native Universe: Choice Brings Complexity
CNCF: Metrics
Learning From Real-Life Failure
Failure Impact Across all 52 Cases
Zooming Out: 41,558 Challenges From the Cloud-Native Universe
Observability as the #1 DevOps Challenge
Harnessing the Value of Technology by Mastering Complexity
Five Most Frequently Asked Questions About Observability
Market Analysis: Observability
EMA Top 3 Report | EMA Top 3 Enterprise Decision Guide 2021
EMA Top 3 Awards
EMA Top 3 Products Help Enterprises Shatter the Vicious Cycle Between Speed, Quality, Cost, and Innovation
What are the EMA Top 3 Awards?
Enterprise Management Associates (EMA) presents its EMA Top 3 Awards to software products that help enterprises reach their digital transformation goals by optimizing product quality, time to market, cost, and ability to innovate.
Personas and Perspectives
EMA derives its Top 3 product categories from today’s critical pain points experienced by software developers, DevOps professionals, site reliability engineers (SREs), IT operators, data scientists, and business staff belonging to enterprises of any size and industry.
Data-Driven Research
EMA Top 3 Awards are based on the consolidated analysis of a combination of real-life project data, data obtained through daily briefings and end-user interaction, and additional data collected from public sources to optimally understand customer requirements.
You should read this report if…
…you want to learn from the successes and failures of your peers.
…you require hard data on market trends in DevOps, IT operations, and business technology.
[Figure: cycle diagram with four numbered segments (01 to 04) labeled Speed, Quality, Cost, and Innovation.]
The Five-Step Product Selection Process
The purpose of the EMA Top 3 decision guide is to present the reader with products that address the key business requirements and pain points in 2021. The EMA selection process follows five key steps.
1. Empirical: EMA identifies the specific key customer pain points for each one of the top challenges in DevOps, SRE, IT operations, and business in 2021.
2. Strategic: EMA evaluates how each product addresses the key pain points identified in step 1 and how it aligns with today’s most relevant technology trends.
3. Innovative: This criterion rewards products for breaking with legacy constraints in order to provide customers with truly innovative solutions.
4. Customer-Centric: EMA Top 3 Awards recognize a product’s radical focus on customer requirements instead of marketing an existing product as something new.
5. Specific: EMA Top 3 Award-winning products address quantifiable customer pain points.
Please treat these EMA Top 3 vendor recommendations as a starting point to inform your product selection process and overall digital transformation strategy. While this report can provide valuable data-driven insights, it aims to inform, not replace, your own due diligence process.
1. Empirical
How well are the current product capabilities aligned with the requirements derived from the empirical dataset and the recommendations created in dialogue with customers and partners?
2. Strategic
When selecting IT vendors, there are often elements to consider that can dramatically increase the value of an IT solution by aligning with today’s key IT trends.
3. Innovative
The EMA Top 3 report aims to point out true customer-centric innovation that is unique in the marketplace. This can be due to a new architecture that frees customers from legacy constraints, or it could simply be due to a reorientation of a traditional vendor that will benefit customers in the near future.
4. Customer-Centric
EMA awards dramatic strategic realignment of vendors that required a significant leap of faith in return for “doing the right thing” for customers.
5. Specific
EMA Top 3 products must address the critical enterprise customer pain points revealed in this report.
Product Selection Criteria
Real-life use cases are the crucial link between PowerPoint and how products work in the wild. This report relentlessly focuses on identifying and clustering use cases based on direct customer observations and on the analysis of quantitative project data. While EMA aggregates and anonymizes all customer-specific data points, the EMA Top 3 evaluation process is based on actual customer problems.
Instead of exclusively relying on EMA survey data and research notes, EMA created a data framework that enables EMA analysts to directly analyze project bottlenecks and enterprise pain points by looking at real-life project artifacts. The example on the right shows an extract of the analysis of 41,558 Kubernetes-related developer problems posted to the Stack Overflow support forum. From this specific evaluation, EMA received a series of technology clusters with a high probability of being involved in production issues of different types and severity levels. By no means does this result reveal that these technology combinations should be avoided in your next project. Instead, it helps EMA define problem areas that require further examination and deserve some additional questioning by enterprises selecting product vendors.
EMA Top 3 Product Awards - Reward for Addressing the Difficult Problems
Each EMA Top 3 Award-winning product has demonstrated its direct focus on addressing today’s key pain points for software developers, DevOps teams, SREs, IT operators, and business professionals.
Example: Learning From Real-Life Challenges
This simplified heatmap shows the most common problem clusters within Kubernetes-based environments. Starting at the top, we can learn that challenges around Helm, the Kubernetes package manager, occur within the context of “ingress control” (1) and “yaml definitions” (2), with “pod management” (3), “the use of minikube” (4), and “Nginx” (5) playing a significant role. This empirical research approach helps focus the EMA Top 3 product awards on actual real-life customer pain points in 2021.
Clustering Real-Life Cloud Native Issues
This heatmap is based on the 41,558 most recent Kubernetes-related developer posts on the Stack Overflow support forum.
[Heatmap: axis tags include docker, google-kubernetes-engine, kubernetes-helm, kubernetes-ingress, kubernetes-pod, minikube, nginx, python, spring-boot, yaml, amazon-eks, amazon-web-services, azure, azure-aks, containers, google-cloud-platform, and java; color scale from 0 to 200; callouts 1 to 5 mark the clusters named in the example above.]
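A heatmap of problem clusters like the one described above rests on counting how often pairs of tags appear on the same post. The following is a minimal sketch of that counting step, assuming each post is available as a list of tags. The `count_tag_pairs` helper and the sample posts are hypothetical illustrations, not EMA's data or method.

```python
from collections import Counter
from itertools import combinations


def count_tag_pairs(posts):
    """Count how often each unordered pair of tags appears on the same post."""
    pair_counts = Counter()
    for tags in posts:
        # Deduplicate and sort so ("a", "b") and ("b", "a") count as one pair.
        for pair in combinations(sorted(set(tags)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical sample posts; real input would be the tag lists of forum posts.
sample_posts = [
    ["kubernetes-helm", "kubernetes-ingress", "yaml"],
    ["kubernetes-helm", "yaml"],
    ["kubernetes-helm", "minikube"],
    ["nginx", "kubernetes-ingress"],
]

pair_counts = count_tag_pairs(sample_posts)
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

The resulting counts can be arranged as a matrix with tags on both axes, which is the shape a heatmap needs. Pairs with high counts are candidates for the kind of further examination the report describes.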
The Cloud-Native Universe: Choice Brings Complexity
This chart shows the full extent of cloud-native choice, in the form of all 936 products from the CNCF landscape, available to product teams to assemble their applications. The fact that teams can select one or more components from each product category brings the flexibility needed to optimize developer productivity while providing DevOps engineers with the required governance and control to ensure cost-efficiency and continuous compliance. The fact that different product teams make different choices and often the same team leverages different components to create their stack for different projects creates a level of entropy that is hard to control with traditional skill-sets, processes, and tools.
[Figure: the CNCF landscape, grouped into the categories Security, Compliance, and Automation; Observability; Data, Application, and DevOps; Runtime; Orchestration and Management; and Container Platform.]
EMA Top 3 Award-winning products help customers leverage their preferred cloud-native products in a well governed, integrated, and automated manner.
CNCF: Metrics
These metrics belong to the CNCF product overview chart from the previous page and aim to provide readers with a high-level overview of the degree of choice and complexity attached to modern distributed and cloud-native applications.
Total Venture Funding: $601,889,361,536
Total GitHub Stars: 2,571,835
Total Number of Organizations Involved: 779
Total Number of Annual Code Commits: 484,198
Total Individual Contributors: 104,285
Total Number of Products: 936
Number of Development Languages Used: 28
Learning From Real-Life Failure
Failure Impact Across all 52 Cases
Forty-eight percent of this specific set of incidents were full production outages, 18% were less disruptive Kubernetes cluster outages, and the remaining 34% constituted a mix of operational inconsistencies and deployment failures.
Analyzing Failures
Drilling further into the data shows that 45% of the issues are directly related to public cloud technologies (pink), with the remaining 55% shared between security, availability, compute, data, deployment technologies, network, and performance topics.
EMA Top 3 products help enterprises address the underlying challenges of these issues in a cost-efficient manner while optimizing the end-user experience.
Examining the failures of your peers can offer valuable lessons for your own decision-making processes, without having to endure the pain of a real-life product failure. The following chart consists of quotes from 52 cloud-native application failures within an enterprise context. These incidents reach all the way from failed code deployments with minimal user impact to major production outages affecting most or all internal and external user groups.
[Figures: “Quotes on the Impact of 52 Kubernetes Production Failures” and “Technologies Involved in Kubernetes Failures”. Impact categories (axis from 0% to 50%): deploy, error, high latency, cluster outage, production outage. Technology categories: availability, compute, data, deployment, network, performance, public cloud, security. Technologies named include gcp, gke, google kubernetes engine, aws, aws eks, aws iam, aws spot instances, aws cni plugin, amazon vpc cni plugin, public aws elb, elb dynamic ips, azure, kops, kube-aws, kiam, kube2iam, weave scope, nginx, nginx ingress, haproxy, haproxy-ingress, dns, coredns, ndots:5, conntrack, snat, network interrupts, service vips, cpu limit, cpu throttling, oomkill, system oom, overload, latency, kubelet, scheduler, api server, --kube-api-qps, cluster autoscaler, hpa, notready nodes, large clusters, anti-affinity rules, configmap change, liveness probe, backend connection errors, batch jobs infrastructure, frozen cronjob, helm, golang templating, centos, container-selinux, alpine musl libc, and alias ip vpc (vpc native).]
Source: codeberg.org, hjacobs/Kubernetes-failure-stories, Aug 5, 2021
Zooming Out: 41,558 Challenges From the Cloud-Native Universe
These challenges are divided into 488 categories and 1,506 subcategories capturing the full spectrum of cloud-native developer challenges.
The tree map is based on 41,558 Kubernetes-related developer posts on the Stack Overflow support forum.
Observability as the #1 DevOps Challenge
Observability, availability, and security are the three primary DevOps challenges today. This is a direct result of the rapidly increasing complexity introduced by the adoption of mostly autonomous product teams, distributed application architecture, and a hybrid multi-cloud operating model. SREs are often scrambling to find the information required to manage operational risk, since organizations typically do not have a unified approach toward collecting, processing, and storing logs, metrics, and traces across the stack in an application-centric manner.
50% Estimated daily time spent by software engineers on overhead tasks.
[Figure: line chart with the series availability, observability, and security from 2016 to 2021; vertical axis from 0% to 1,200%.]
Harnessing the Value of Technology by Mastering Complexity
The fact that simple changes to the application stack are the number-one root cause for the failure of cloud-native applications demonstrates that organizations in 2021 are still struggling to master complexity in software development, DevOps, and IT operations. EMA Top 3 products enable enterprises to better harness existing and new software technologies in a cost-effective and compliant manner. Policy-driven software development, operations management, scalability, and unified operations of application infrastructure across data centers and public clouds are the key requirements for achieving this goal. The bottom chart quantifies the overall complexity increase in cloud-native technologies between 2017 and 2021 by showing how a 343% increase in the number of individual technologies resulted in a 527% increase in the number of technology combinations that were part of application problems.
Cloud-Native Complexity Increase Measured by the Number of Unique Technologies (Orange) and by the Number of Technology Combinations (Blue)
The number of distinct cloud-native technologies (orange) increased from 478 to 1,733 (343%) between 2017 and 2021, while the number of cloud-native technology combinations (blue) increased from 826 to 4,349 (527%) within the same timeframe.
[Figure: horizontal axis from Jan 2017 to Jan 2021 in half-year steps; vertical axis from 0 to 4,500. Data labels shown: 826, 1,678, 2,922, 3,937, 4,349 and 464, 768, 1,090, 1,371, 1,485.]
Public Cloud Complexity Increase Measured by the Number of Unique Technologies (Orange) and by the Number of Technology Combinations (Blue) Related to AWS, Azure, and GCP
The number of distinct public cloud services (blue) increased from 1,475 to 3,689 (250%) between 2015 and 2021, while the number of combinations of public cloud services (orange) increased from 2,366 to 8,129 (344%) within the same timeframe.
The top chart shows the 250% increase in the number of different public cloud technologies, which corresponds to a 344% increase in the number of overall public cloud technology combinations between June 2015 and June 2021 (source: Stackoverflow).
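The percentages in the captions above can be checked from the quoted start and end counts. The sketch below computes two common readings of such a figure: the end value as a percentage of the start value, and the relative change. The function names are illustrative assumptions. With the quoted counts, the 527%, 250%, and 344% figures match the first reading; under that same reading, the 478 to 1,733 series comes out at roughly 363%.

```python
def end_to_start_ratio_pct(start, end):
    """End value expressed as a percentage of the start value."""
    return 100.0 * end / start


def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return 100.0 * (end - start) / start


# Start and end counts as quoted in the chart captions.
series = {
    "cloud-native technologies (2017-2021)": (478, 1733),
    "cloud-native technology combinations (2017-2021)": (826, 4349),
    "public cloud services (2015-2021)": (1475, 3689),
    "public cloud service combinations (2015-2021)": (2366, 8129),
}

for name, (start, end) in series.items():
    ratio = end_to_start_ratio_pct(start, end)
    change = percent_change(start, end)
    print(f"{name}: end is {ratio:.0f}% of start, change of {change:+.0f}%")
```

Stating which of the two readings is meant avoids ambiguity: a value that grows to 527% of its starting point has increased by 427%.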
In complex distributed environments, any configuration change or change to the software code can lead to downtime and performance degradation. Anytime changes of any kind are made to a complex system, such as a microservices application, there is a risk of immediate or delayed negative impact on the application itself or on other applications.
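5 Top Causes of Application Downtime and Performance Degradation
[Figure: categories are Changes, Code Bugs, Migrations, Security, and Upgrades. Shares shown, in extraction order and not matched to categories: 31%, 11%, 37%, 16%, 5%.]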
[Figure data, public cloud chart, 2015 to 2021]
Year    Individual topics    Topic combinations
2015    1,366                2,366
2016    1,646                3,270
2017    1,861                4,510
2018    2,132                5,318
2019    2,196                5,730
2020    2,733                8,453
2021    2,849                8,129
Five Most Frequently Asked Questions About Observability
Q: What is the difference between monitoring and observability?
A: Observability tracks relationships and dependencies between applications, microservices, and infrastructure in order to enable developers, IT operations, and DevOps to understand and optimize performance, cost, and reliability of production applications.
Q: What is telemetry versus observability?
A: Telemetry data can consist of logs, traces, and metrics emitted from applications. Observability ingests, consolidates, normalizes, and analyzes this telemetry data in order to continuously optimize an application.
Q: Who is responsible for establishing observability?
A: Operators and developers need to collaboratively implement observability into the application stack. This means overcoming the still prevalent practice of both personas using different platforms for gaining application visibility.
Q: What is observability engineering?
A: Observability engineering aims at building software systems that are easy to understand, debug, optimize, operate, and enhance.
Q: How do we achieve full-stack observability?
A: Full-stack observability provides real-time visibility for development and operations personas to understand the impact of any component that is part of the application stack on application performance. Achieving full-stack observability requires the consolidation, normalization, and analysis of all logs, traces, and metrics across the application stack, including their contextual dependencies.
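The normalization and consolidation steps named in the last answer can be illustrated with a small sketch. It assumes telemetry arrives as source-specific dictionaries with inconsistent field names. The `TelemetryRecord` shape, the `normalize` and `consolidate` helpers, and the sample field names (`svc`, `ts`, and so on) are all hypothetical and do not describe any particular observability platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TelemetryRecord:
    """Common shape for a log line, metric sample, or trace span."""
    kind: str         # "log", "metric", or "trace"
    service: str      # emitting application component
    timestamp: float  # seconds on a shared clock
    payload: dict     # signal-specific fields


def normalize(kind, raw):
    """Map a raw, source-specific dict onto the common record shape."""
    service = raw.get("service") or raw.get("svc") or "unknown"
    timestamp = float(raw.get("ts", raw.get("time", 0.0)))
    payload = {key: value for key, value in raw.items()
               if key not in ("service", "svc", "ts", "time")}
    return TelemetryRecord(kind, service, timestamp, payload)


def consolidate(records):
    """Group normalized records per service, ordered by time."""
    by_service = defaultdict(list)
    for record in records:
        by_service[record.service].append(record)
    for items in by_service.values():
        items.sort(key=lambda record: record.timestamp)
    return dict(by_service)


# Hypothetical raw signals with inconsistent field names.
raw_signals = [
    ("log", {"svc": "checkout", "ts": 12.0, "message": "timeout calling payments"}),
    ("metric", {"service": "checkout", "time": 10.0, "name": "latency_ms", "value": 950}),
    ("trace", {"service": "payments", "ts": 11.0, "span": "charge", "duration_ms": 900}),
]

timeline = consolidate(normalize(kind, raw) for kind, raw in raw_signals)
for service, items in timeline.items():
    print(service, [record.kind for record in items])
```

Once logs, metrics, and traces share one shape and one timeline per service, the analysis step can correlate them, for example by relating a latency spike to the log line and trace span that follow it.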
[Figure: a distributed application (multiple clouds and data centers, multiple microservices, multiple application stacks) emits metrics, logs, traces, and events to an observability platform, which performs consolidation, normalization, analysis, and optimization actions to continuously optimize the system.]
Market Analysis: Observability
EMA Quick Take
The term “observability” was coined in the early 1960s by Rudolf Emil Kalman, a Hungarian-born mathematician seeking to explain how automated control systems work, only by observing their output. In 2021, “observability” has become one of the key pain points and requirements in cloud-native computing due to the struggles experienced by organizations when attempting to control and optimize the performance, reliability, and cost of their distributed applications.
Observability platforms enable developers, operators, DevOps engineers, SREs, and security personas to directly connect the dots between end-user experience, application performance, and the underlying code, data, and infrastructure in data centers and the public cloud. The ability to seamlessly drill down from a big picture view of the overall application and infrastructure topology into the details of how individual infrastructure components and code functions impact the end-user experience constitutes the core capability of observability platforms. The rapid adoption of a cloud-native distributed application architecture and the resulting complexity of semi-autonomous teams developing and managing individual microservices across the corporate data center and different clouds have accelerated the growth of the market for observability platforms.
Business Problems Solved
Developer, DevOps, and SRE productivity
High reliability
Cost-performance optimization
Unified management for cloud-native and monolithic apps
Rapid recovery
Minimize release and operations cost
EMA Research Facts
#1: Observability is the number-one DevOps challenge in 2021
100%: YoY increase in observability-related developer challenges
50%: Estimated daily time spent by SREs on finding decision-relevant data
Quote From the Trenches
Our SREs are often spending 50% of their time on fact-finding missions. Addressing this inefficiency is one of our key goals for 2021 and beyond.
VP Cloud Engineering, Automotive Parts Manufacturer
Market Size
Incumbents: 106
Funding: $78B
VC Funding: $1B
Revenue: $10B
Employees: 125,000
Open Jobs: 4,432
GitHub Stars: 443,890
GitHub Contr.: 12,241
Productivity Impact
DevOps and SRE
• Continuous access to all required context data
• Security and compliance across the DevOps pipeline
Software Developers
• Rapid debugging of distributed apps
• Continuous compliance and security
IT Operations
• Automatic dependency and topology tracking
• Full stack app management across data center and cloud
[Figure: gauges for Market Growth, Competition, and Outlook, with the scale labels small, slow, fast, hot, none, average, heating up, and intense; productivity impact ratings shown: Very High, Very High, High.]
Market Segment: Automatic End-to-End Observability
Changes to the application stack, code, and release pipeline are the three key reasons for performance degradation and downtime in cloud-native apps and traditional enterprise applications alike. EMA Top 3 award-winning applications in the “automatic end-to-end observability” segment capture these changes in near-real time, at full resolution, and without requiring manual instrumentation, in order to provide:
1. Business-driven production insights
2. Targeted alerts with problem context
3. Automatic root cause analysis
4. Monitoring across application environments
These components lead to better alignment between IT and business, because optimization and resolution actions can be tuned toward specific sets of business KPIs.
EMA Top 3 Award
Why IBM Observability by Instana Received the EMA Top 3 Award
Instana received the EMA Top 3 Award for the platform’s ability to automatically discover and monitor cloud-native and traditional application stacks within the context of their orchestration platform (typically Kubernetes), and the underlying data center or cloud infrastructure. Instana’s reinforcement learning models continuously learn to watch out for issues similar to the ones that were detected within a comparable context in the past. Instana automatically discovers new applications simply by developers adding standard configuration code to their Git repository that enables the platform to automatically place, configure, and manage the required agents in order to ensure comprehensive observability.
We are set in our ways of guessing what the business needs and how we prioritize developer and operator tasks to get there. Sometimes we are right and sometimes we are wrong, since in the end, we are just taking our best guess.
Development Lead, Global Financial Institution
Business Impact
Enhance developer and SRE productivity
Decrease MTTR
Lower operational risk by continuously optimizing the application stack
Proactive issues resolution and application optimization
Complete observability for traditional and cloud-native apps across data center and cloud
Empowering developers and operations staff to handle complex cloud-native applications independently of where they currently run
Automatic End-to-End Observability
Machine Learning-driven Instant Application Insights for Dev, Ops, SRE, and Security
About Enterprise Management Associates, Inc.
Founded in 1996, Enterprise Management Associates (EMA) is a leading industry analyst firm that provides deep insight across the full spectrum of IT and data management technologies. EMA analysts leverage a unique combination of practical experience, insight into industry best practices, and in-depth knowledge of current and planned vendor solutions to help EMA’s clients achieve their goals. Learn more about EMA research, analysis, and consulting services for enterprise line of business users, IT professionals, and IT vendors at www.enterprisemanagement.com. You can also follow EMA on Twitter or LinkedIn.
This report, in whole or in part, may not be duplicated, reproduced, stored in a retrieval system or retransmitted without prior written permission of Enterprise Management Associates, Inc. All opinions and estimates herein constitute our judgement as of this date and are subject to change without notice. Product names mentioned herein may be trademarks and/or registered trademarks of their respective companies. “EMA” and “Enterprise Management Associates” are trademarks of Enterprise Management Associates, Inc. in the United States and other countries.
©2021 Enterprise Management Associates, Inc. All Rights Reserved. EMA™, ENTERPRISE MANAGEMENT ASSOCIATES®, and the mobius symbol are registered trademarks or common law trademarks of Enterprise Management Associates, Inc.
1995 North 57th Court, Suite 120, Boulder, CO 80301 +1 303.543.9500 www.enterprisemanagement.com 4113.091421