Post on 20-May-2020

Three Pillars of Observability: Kubernetes with Elastic Stack

Massimo Brignoli, Principal Solutions Architect, Elastic

Kubernetes is Taking Over the Enterprise

• Custom on-prem & cloud deployments
• Public cloud fully-managed deployments
– Google Kubernetes Engine (GKE)
– Amazon Elastic Container Service for Kubernetes (EKS)
– Azure Kubernetes Service (AKS)
• Pivotal Container Service (PKS)
• Red Hat OpenShift

Kubernetes is Complicated

Container Runtime

Kubernetes Visibility Challenges

Observable Kubernetes

Elastic Stack: Three Pillars of Observability in One Platform

● Logging

● Metrics

● APM Tracing

It Comes Down to The Three Pillars of Observability

Twitter: https://blog.twitter.com/engineering/en_us/a/2013/observability-at-twitter.html

Peter Bourgon: https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html

Elastic at the Center Stage

Elastic Stack for logs

Metrics vs Logs

Logs are chronological records of events: for each event, print out what happened.

64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291

64.242.88.10 - - [07/Jan/2019:16:11:58 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352

64.242.88.10 - - [07/Jan/2019:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
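To make "automated parsing" of such events concrete, here is a minimal Python sketch that turns one access-log line of the kind shown on this slide into a structured event. The regular expression and the field names (client, method, path, status, size) are assumptions chosen for illustration; they are not the ingest pipeline that Filebeat modules ship with.

```python
import re
from datetime import datetime

# Pattern for the "common log format" used by the sample lines on the slide.
LINE_RE = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_access_line(line):
    """Turn one raw log line into a structured event (dict), or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    event = m.groupdict()
    # Convert the textual fields into typed values so they can be aggregated later.
    event["timestamp"] = datetime.strptime(event.pop("ts"), "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    event["size"] = 0 if event["size"] == "-" else int(event["size"])
    return event

sample = ('64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] '
          '"GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291')
event = parse_access_line(sample)
print(event["method"], event["path"], event["status"], event["size"])
```

Once the status code and response size are typed fields rather than text, the same events can feed both search and numeric dashboards.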

Making logging more turnkey with modules

Logging, Metrics, Security

• Turnkey experience for specific data types
• Data to dashboard in just one step
• Automated parsing and enrichment
• Default dashboards, alerts, ML jobs

Logging Modules

System

•Linux / MacOS

•Windows Events

Containers

•Docker

•Kubernetes

Databases

•MySQL

•PostgreSQL

Queues

•Kafka

•Redis

Web servers

•Apache

•Nginx

Audit data

•Filesystem

•System calls

Infrastructure Applications

WINLOGBEAT FILEBEAT AUDITBEAT

Log File Import

Automatic Structure Discovery

Ad-hoc log search and visualization

Kibana Discover, Visualize, Dashboard

Elastic Stack for metrics

Elasticsearch beginnings

Primarily used for application search

2010 Search engine: the inverted index is the primary data structure and is great for search

Elasticsearch evolves to support analytics

Columnar Store, Built on Lucene "doc values"

2012 Columnar storage: structured data storage, resulting in compact storage and faster analytics

https://www.elastic.co/blog/elasticsearch-as-a-column-store
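The contrast between an inverted index and a columnar store can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions (whitespace tokenization, in-memory lists, made-up documents); it is not how Lucene implements either structure.

```python
from collections import defaultdict

# Hypothetical documents: a text field for search and a numeric field for analytics.
docs = [
    {"message": "GET request ok", "bytes": 6291},
    {"message": "POST request failed", "bytes": 7352},
    {"message": "GET request ok", "bytes": 5253},
]

# Inverted index: term -> set of ids of the documents containing it (good for search).
inverted = defaultdict(set)
for doc_id, doc in enumerate(docs):
    for term in doc["message"].lower().split():
        inverted[term].add(doc_id)

# Column store: one dense array per field, indexed by document id (good for analytics).
bytes_column = [doc["bytes"] for doc in docs]

def search(term):
    """Return the sorted ids of documents that contain the term."""
    return sorted(inverted.get(term.lower(), set()))

def sum_bytes(doc_ids):
    """Aggregate the numeric column over a set of matching documents."""
    return sum(bytes_column[i] for i in doc_ids)

get_docs = search("GET")
print(get_docs, sum_bytes(get_docs))
```

The search step answers "which documents match?", while the column answers "what are their values?" without touching the original documents.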

Aggregation Framework

Out-of-this-world aggregations

2014 Aggregation Framework: analytics features to slice and dice data along various dimensions

https://www.elastic.co/blog/out-of-this-world-aggregations

Elasticsearch storage efficiencies

BKD Trees & Sparse Fields

2016 BKD trees and sparse fields: data structures optimized for numbers. Faster analytics, lower storage footprint

https://www.elastic.co/blog/searching-numb3rs-in-5.0

1-Dimension

2-Dimensions

Sparse Data

Rollup support for long-term retention

2018 Rollups: roll up or aggregate older data into bigger time buckets and save on disk space

Added in Elasticsearch 6.3

https://www.elastic.co/blog/data-rollups-in-elasticsearch-you-know-for-saving-space
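The idea behind rollups, replacing many raw samples with one summary per larger time bucket, can be sketched as follows. The function name, the one-hour default bucket and the summary fields are assumptions for illustration; this is not the Elasticsearch rollup API.

```python
def rollup(samples, bucket_seconds=3600):
    """Aggregate (epoch_seconds, value) samples into fixed-size time buckets.

    Returns {bucket_start: {"count", "min", "max", "sum"}}, which is enough to
    answer count/min/max/avg queries without keeping the raw samples.
    """
    buckets = {}
    for ts, value in samples:
        start = ts - (ts % bucket_seconds)  # align the timestamp to its bucket start
        b = buckets.get(start)
        if b is None:
            buckets[start] = {"count": 1, "min": value, "max": value, "sum": value}
        else:
            b["count"] += 1
            b["min"] = min(b["min"], value)
            b["max"] = max(b["max"], value)
            b["sum"] += value
    return buckets

# Four raw samples collapse into two hourly summaries.
samples = [(0, 10.0), (600, 30.0), (3599, 20.0), (3600, 50.0)]
hourly = rollup(samples)
for start, summary in sorted(hourly.items()):
    print(start, summary, "avg:", summary["sum"] / summary["count"])
```

The trade-off is visible in the data shape: the summaries are much smaller than the raw samples, but individual data points inside a bucket can no longer be recovered.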

Elasticsearch for search and numerical analytics

• Inverted Index for full-text search
• Columnar store for structured data
• BKD Trees for numerical operations
• Rollups save space

Metrics Modules

Infrastructure

System

•Linux

•MacOS

•Windows

•Perfmon

Cloud

•AWS

•GCP

•Azure

•DigitalOcean

•Alibaba

Containers

•Docker

•Kubernetes

Virtualization

•vSphere

Network

•Netflow

•Packets

•TLS Envelope

Storage

•Ceph

PACKETBEAT METRICBEAT HEARTBEAT

Infrastructure

Metrics Modules

Infrastructure

PACKETBEAT METRICBEAT HEARTBEAT

Uptime

•Heartbeat

Custom apps

•JMX/Jolokia

•PHP-FPM

•Golang

Datastores

•MySQL

•PostgreSQL

•MongoDB

•Couchbase

•Aerospike

•Graphite

Queues

•Kafka

•Redis

•RabbitMQ

Caches

•Memcached

Web servers

•Apache

•Nginx

Other

•HAProxy

•Zookeeper

Applications

Heartbeat: Uptime Monitoring

Functionbeat: Serverless data shipper

CloudWatch, CloudWatch Logs

Visualizing time series data

Time Series Visual Builder

Elastic Stack for APM

Example: Slow response or load times

Why APM?

03:43:45 Request "GET cyclops.ESProductDetailView"

03:43:57 Response "cyclops.ESProductDetailView 200 OK"

12 seconds - zZzzZZz

Example: Errors & Exceptions

Why APM?

03:43:59 Request "POST /api/checkout"

03:43:59 Response "/api/checkout 500 ERROR"

How APM works

Agents, API, and APM Server

Diagram components: browser agents, web server agents, apm-server (data processor), Elasticsearch (data storage), Kibana (UI)

Elastic APM

APM adds end-user experience and application-level monitoring to the stack

Language Support

● Python
● Node.js
● Ruby
● RUM (Real User Monitoring)
● Java
● Go
● .NET (in dev)

•Focuses on search experience on top of APM data

•Just another index in Elastic Stack

•Active roadmap to expand programming languages

Great overview and drill-down with industry-standard visualizations

Dedicated APM UI

Distributed Tracing: Single transaction

Transaction 1 (HTTP request to response): Span, Span, Span

Distributed Tracing: Multiple Services

Trace A
Transaction 1: Span, Span
Transaction 2: Span
Transaction 3: Span, Span, Span
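The trace, transaction and span hierarchy from these slides can be modeled as a small data structure. The sketch below is illustrative only: the class names follow the slide labels, but the fields, service names and timings are hypothetical and do not reflect the Elastic APM document schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Span:
    name: str
    start_ms: float
    duration_ms: float

@dataclass
class Transaction:
    service: str
    name: str
    start_ms: float
    duration_ms: float
    spans: List[Span] = field(default_factory=list)

@dataclass
class Trace:
    trace_id: str
    transactions: List[Transaction] = field(default_factory=list)

    def duration_ms(self):
        """End-to-end time from the earliest start to the latest end across services."""
        start = min(t.start_ms for t in self.transactions)
        end = max(t.start_ms + t.duration_ms for t in self.transactions)
        return end - start

    def slowest_span(self):
        """The single longest operation anywhere in the trace."""
        spans = [s for t in self.transactions for s in t.spans]
        return max(spans, key=lambda s: s.duration_ms)

# Hypothetical trace: a frontend request that calls a backend API, which queries a database.
trace = Trace("A", [
    Transaction("frontend", "GET /products", 0, 120, [
        Span("GET api/products", 10, 100),
        Span("render", 110, 5),
    ]),
    Transaction("api", "GET /api/products", 15, 90, [
        Span("SELECT products", 20, 70),
    ]),
])
print(trace.duration_ms(), trace.slowest_span().name)
```

Grouping transactions under one trace identifier is what lets a waterfall view show time spent in each service side by side.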

Combine a custom workflow with the freedom of search

Ad-hoc search in a curated UI

Need another visualization? Build a dashboard, no need to wait for your vendor

APM is just another index in Elasticsearch

Elastic Common Schema

Correlation between logs, metrics, and APM

Benefits

• Correlate data from different sources
• Ability to re-use analysis content
• Ability to re-use Elastic-provided content

Status

• Version 0.1 published: github.com/elastic/ecs
• Working with internal groups to validate
• Community feedback welcome!

Metadata processors

Enrich events with useful metadata to correlate logs, metrics & traces

add_cloud_metadata
• cloud.availability_zone
• cloud.region
• cloud.instance_id
• cloud.machine_type
• cloud.project_id
• cloud.provider

add_docker_metadata
• docker.container.id
• docker.container.image
• docker.container.name
• docker.container.labels

add_kubernetes_metadata
• kubernetes.pod.name
• kubernetes.namespace
• kubernetes.labels
• kubernetes.annotations
• kubernetes.container.name
• kubernetes.container.image
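Why shared metadata matters for correlation can be shown with a short sketch: when logs and metrics carry the same field, they can be joined on it. The events below are made up, and `cpu_pct` and `message` are hypothetical field names; only `kubernetes.pod.name` comes from the list above.

```python
from collections import defaultdict

# Hypothetical events, both enriched with the same kubernetes.pod.name field.
log_events = [
    {"kubernetes.pod.name": "nginx-1", "message": "GET /checkout 500"},
    {"kubernetes.pod.name": "nginx-2", "message": "GET / 200"},
]
metric_events = [
    {"kubernetes.pod.name": "nginx-1", "cpu_pct": 0.93},
    {"kubernetes.pod.name": "nginx-2", "cpu_pct": 0.12},
]

def group_by(events, key):
    """Index a list of events by the value of one shared field."""
    groups = defaultdict(list)
    for e in events:
        groups[e[key]].append(e)
    return groups

def correlate(logs, metrics, key="kubernetes.pod.name"):
    """Pair up logs and metrics that share the same value for `key`."""
    logs_by_key = group_by(logs, key)
    metrics_by_key = group_by(metrics, key)
    return {
        k: {"logs": logs_by_key[k], "metrics": metrics_by_key.get(k, [])}
        for k in logs_by_key
    }

joined = correlate(log_events, metric_events)
print(joined["nginx-1"])
```

Without a common field name across data sources, this join would need a per-source mapping, which is the problem a shared schema is meant to remove.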

Kubernetes deployment

Node 1

Metricbeat

Filebeat

Node 2

Metricbeat

Filebeat

Node n

Metricbeat

Filebeat

Filebeat DaemonSet

Metricbeat DaemonSet

Logging

● Cluster level logging

● Service logging (e.g. nginx, mysql)

● Custom application logging

Kubernetes Logging

• Need for a logging solution
– Kubernetes does not have a native solution
– kubectl logs is too hard to use on large clusters

• Cluster-level logging
– Logs have separate storage and a lifecycle independent of nodes, pods and containers
– Kubernetes provides no native storage solution for log data

• Application-level logging
– Complicated
– Packaged applications (e.g. nginx)
– Custom applications

Two Packaged Solutions

• Fluentd DaemonSet
– Log collection, parsing and distribution
• Fluentd + Stackdriver for GCP
• Fluentd + Elasticsearch

Better Log Collection with Filebeat

kubectl create -f filebeat-kubernetes.yaml

Filebeat Auto-Discovery

filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.container.image: "nginx"
          config:
            - module: nginx
              access:
                # For nginx access log
                prospector:
                  type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"

• A module contains

– Log file path

– Ingest pipeline

– Fields definitions

– Sample dashboards

Filebeat Modules

Simplify collection, parsing and visualization of common log formats

• Apache2 module
• Auditd module
• Icinga module
• IIS module
• Kafka module
• Logstash module
• MongoDB module
• MySQL module
• Nginx module
• Osquery module
• PostgreSQL module
• Redis module
• System module
• Traefik module

Metrics

● Metrics data sources

● Popular solutions

● Metricbeat

Kubernetes Monitoring

• What to monitor
– Cluster monitoring
– Pod monitoring
– Application monitoring

• Metrics sources
– cAdvisor & Heapster
– kube-state-metrics
– Prometheus
– APM

• Solutions
– Heapster/InfluxDB/Grafana
– Heapster/Elasticsearch
– Prometheus/Grafana
– APM: Datadog, Dynatrace
– Metricbeat with Autodiscovery

Collect: Metricbeat, Heapster, Prometheus, ...

Store: Elasticsearch, InfluxDB, ...

Analyze: Kibana, Grafana, ...

Search, Dashboard, Alerts, ...

Data Model, Metrics Sources

Comprehensive Metrics Collection: Metricbeat

• Kubernetes module

• Monitors pods and services
– Cluster, pod & container metrics
– Application metrics through auto-discovery (e.g. Nginx)

• Metrics sources - Cover them ALL
– Kubelet (heapster, cAdvisor)
– kube-state-metrics
– Kubernetes events
– Prometheus module (beta)

• Curated infra UI

• Dedicated Kibana app

Out-of-the-box Dashboards

  

Curated UI for Kubernetes

Visualize the cluster and group by nodes, namespaces, or pods

Monitor Services inside Containers with Auto-Discovery

Metricbeat Filebeat

Node n

Logs

Metrics

Nginx

metricbeat.autodiscover:
  providers:
    - type: kubernetes
      host: ${HOSTNAME}
      templates:
        - condition.contains:
            kubernetes.container.name: nginx
          config:
            - module: nginx
              period: 10s
              metricsets: ["stubstatus"]
              hosts: ["${data.host}:8080"]
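Conceptually, an autodiscover template pairs a condition on container metadata with a configuration whose placeholders are filled in from the discovered container. The Python sketch below mimics that behavior for illustration only; the function names, the template layout and the sample event are assumptions, and this is not how Beats implements autodiscovery.

```python
import re

# Matches placeholders such as ${data.host} or ${data.kubernetes.container.id}.
PLACEHOLDER = re.compile(r"\$\{data\.([\w.]+)\}")

def render(value, event):
    """Recursively substitute ${data.<field>} placeholders with values from the event."""
    if isinstance(value, str):
        return PLACEHOLDER.sub(lambda m: str(event[m.group(1)]), value)
    if isinstance(value, list):
        return [render(v, event) for v in value]
    if isinstance(value, dict):
        return {k: render(v, event) for k, v in value.items()}
    return value

def configs_for(event, templates):
    """Return the rendered config of every template whose 'contains' condition matches."""
    out = []
    for t in templates:
        field_name, needle = t["contains"]
        if needle in str(event.get(field_name, "")):
            out.append(render(t["config"], event))
    return out

# Template mirroring the Metricbeat example on this slide.
templates = [{
    "contains": ("kubernetes.container.name", "nginx"),
    "config": {"module": "nginx", "period": "10s",
               "metricsets": ["stubstatus"], "hosts": ["${data.host}:8080"]},
}]

# Hypothetical discovery event for a newly started container.
event = {"host": "10.0.0.7", "kubernetes.container.name": "nginx"}
print(configs_for(event, templates))
```

A container whose name does not contain "nginx" yields no configuration, so only matching workloads are monitored by that module.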

Metricbeat Modules

Simplify collection and visualization of common metrics

● Aerospike module
● Apache module
● Ceph module
● Couchbase module
● Docker module
● Dropwizard module
● Elasticsearch module
● Etcd module
● Golang module
● Graphite module
● HAProxy module
● HTTP module
● Jolokia module
● Kafka module
● Kibana module
● Kubernetes module
● kvm module
● Logstash module
● Memcached module
● MongoDB module
● Munin module
● MySQL module
● Nginx module
● PHP_FPM module
● PostgreSQL module
● Prometheus module
● RabbitMQ module
● Redis module
● System module
● uwsgi module
● vSphere module
● Windows module
● ZooKeeper module

Tracing

● Elastic APM

Microservices Can Be Complicated

Microservice Architecture of Uber

https://dzone.com/articles/microservice-architecture-learn-build-and-deploy-a

First Major Open Source APM Solution

Agents, Server, Dashboards

APM Tracing - Transaction Waterfall View

You can do MORE ...

• Enforce access policies with X-Pack Security

• Be notified about changes & problems with X-Pack Alerting

• Be smarter with X-Pack Machine Learning

• ...

Be Creative, the Sky is NOT even the Limit with Elastic!

Kubernetes Resources

Cloud Native Computing Foundation
• https://www.cncf.io/projects/

Resource Monitoring solutions
• https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/

Log monitoring
• https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
• https://kubernetes.io/docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/

Questions you may ask

• How long does it take you to resolve a performance issue with your application?
• How easy is it to get, find and combine logs, metrics and APM data in your current solution?
• How many monitoring systems do you need to maintain?
• Do you keep data in silos?

Questions?
