osmc 2014: from monitoringsucks to monitoringlove (and back) | kris buytaert

From #MonitoringSucks to #MonitoringLove (and back) @KrisBuytaert OSMC 2014, Nuremberg, Germany

Upload: netways

Post on 02-Jul-2015

DESCRIPTION

Back in June 2011 John Vincent ranted on Twitter that #monitoringsucks, and for a lot of us he was absolutely right. At #devopsdays Rome 2012, in November, Ulf Mansson proclaimed his newfound love for monitoring and we changed the hashtag into #monitoringlove. Based on a new era of open source tools, Ulf started loving monitoring again. And for a lot of us he was absolutely right. Over the past 5 years an enormous number of new tools and new patterns have come out of the community, sometimes tagged with #devops, and pretty much all of them open source. Do you still know what you should be using for what? And what the differences are? An opinionated overview of the open source monitoring landscape to clear up the confusion on what you should use, or make the decision even more difficult for you :)

TRANSCRIPT

Page 1: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

From #MonitoringSucks to

#MonitoringLove

(and back)

@KrisBuytaert OSMC 2014, Nuremberg, Germany

Page 2: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Kris Buytaert

● I used to be a Dev,

● Then Became an Op

● Chief Trolling Officer and Open Source Consultant @inuits.eu

● Everything is an effing DNS Problem

● Building Clouds since before the bookstore

● Organising Conferences

● Evangelizing devops

Page 3: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

An opinionated talk about the Open Source Monitoring tooling landscape

In which I hope to learn from YOU

Page 4: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#devops=~C(L)AMS

● Culture

● (Lean)

● Automation

● Monitoring and Measurement

● Sharing

● Damon Edwards and John Willis

Gene Kim

Page 5: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Monitoring is usually an afterthought

ENOBUDGET, ENOTIME

Page 6: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

A 2008 OLS Paper

● We have bloated Java tools

● Some open Core stuff

● DIY folks want traditional Nagios

● DBA Required

Page 7: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringsucks

● John Vincent (@lusis), June 2011

● A sub #devops movement

● https://github.com/monitoringsucks/

Page 8: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Why #monitoringsucks

● Manual config (gui)

● Not in sync with reality

● Hosts only

● Services sometimes

● Application never

● Chaos or out of sync with reality

● Alert Fatigue

Page 9: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Let's forget about

● Tools with no (stable) API

● Tools with strong focus on GUI

● Unless you are an SME with < 100 nodes

● Zenoss, Hyperic, GroundWork, ....

● P.S. : don't even mention proprietary software to me

Page 10: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

What we want

● Small, well-suited components

• Collect

• Transport / Mangle

• Store

• Analyse

• Act / Alert

• Visualize

Page 11: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringlove

•Ulf Mansson #devopsdays Rome 2012

•A new era of tooling

•#monitoringlove hacksessions @inuits

•#monitorama

Page 12: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 13: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 14: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 15: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Icinga

•2009 Fork

•I consider Nagios dead

•Vibrant Community (or they stalk me)

•Throw great parties in Nürnberg

•Nobody can pronounce it anyhow

•https://github.com/Inuits/puppet-icinga/

Page 16: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Stored Configs

Page 17: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

#monitoringlove But the love was about :

Page 18: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Sensu

● Awesome for non static environments

● Scaling a clustered RabbitMQ ?

● This is Europe, U no do cloud

Page 19: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Automation of #monitoring brought back

the #love

Page 20: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

I love CheckMK

●Autodetection

●Multiplexing

●Trend Forecasting

Page 21: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

I hate CheckMK

•Autodetection ?

•Service,

•Business Functionalities

•eg. vhosts etc

•Single Source of Truth

Page 22: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Monitoring a service vs

Monitoring a Service

Page 23: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

definition of done:

monitored and in production

Page 24: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

A software project is not done until your last end user is dead

Page 25: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Culture,

Automation,

Measurement : measure all the things

Sharing

Page 26: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Deploy Statistics

● Time To Deploy

● Deploy Frequency

● Lifecycle frequency

● Map to other metrics
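A minimal sketch of how the first two deploy statistics could be computed, assuming each deploy is recorded as a (started, finished) timestamp pair. The `deploy_stats` function and the sample data are hypothetical, not taken from the talk.

```python
from datetime import datetime, timedelta


def deploy_stats(deploys):
    """Return (mean time to deploy, deploys per day).

    deploys: list of (started, finished) datetime pairs, one per deploy.
    """
    durations = [finished - started for started, finished in deploys]
    mean_time_to_deploy = sum(durations, timedelta()) / len(durations)

    # Frequency over the observed window, counted in days (at least one day).
    starts = sorted(started for started, _ in deploys)
    span_days = max((starts[-1] - starts[0]).total_seconds() / 86400, 1.0)
    return mean_time_to_deploy, len(deploys) / span_days


# Hypothetical sample data: three deploys spread over four days.
deploys = [
    (datetime(2014, 11, 3, 10, 0), datetime(2014, 11, 3, 10, 12)),
    (datetime(2014, 11, 5, 15, 0), datetime(2014, 11, 5, 15, 8)),
    (datetime(2014, 11, 7, 10, 0), datetime(2014, 11, 7, 10, 10)),
]
mean_ttd, per_day = deploy_stats(deploys)
```

Once these numbers exist as time series they can be graphed next to other metrics, which is what the last bullet asks for.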

Page 27: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

CollectD all the metrics, at high intervals

Page 28: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Oldschool graphite
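For readers who have not fed Graphite by hand: carbon accepts metrics over a plaintext protocol, one `path value timestamp` line per data point, by default on TCP port 2003. The sketch below only illustrates that wire format. The host name and metric path are placeholders, and in the setup described here collectd would do the sending.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point in carbon's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_metric(line, host="graphite.example.com", port=2003):
    """Send one formatted line to a carbon listener (host is a placeholder)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric path; nothing is sent until send_metric() is called.
line = graphite_line("deploys.web.count", 1, 1416000000)
```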

Page 29: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Self Service Gdash based pipelines

Puppetized Templates (wip)

Page 30: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Gdash

Page 31: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Grafana

Page 32: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Graphite++

● Dashboards

• Grafana

● Engines :

• InfluxDB

• Cyanite

Page 33: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Triggers on Graphs

● Export Java Metrics

● JMXTrans

● Export JMXConfigs

● Configure NRPE Check

● Export NagiosCheck

● Collect JMX Exports on JMXTransNode

● Graph Em

● Collect Icinga Configs on Icinga

Page 34: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Aggregation

● Alert on streams

● Alert on aggregated metrics
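One way to read "alert on aggregated metrics" is to compare the average across a pool of hosts with a threshold, instead of paging for every individual host. A rough sketch under that assumption follows; the function name, host names and threshold are made up for illustration.

```python
from statistics import mean


def aggregate_alert(samples, threshold):
    """Return True when the mean across all hosts exceeds the threshold.

    samples: dict mapping host name to its latest value (e.g. error rate).
    """
    values = list(samples.values())
    if not values:
        return False  # no data is handled elsewhere, not as an alert
    return mean(values) > threshold


# One noisy host does not trip the alert, a pool-wide problem does.
one_noisy = {"web1": 0.12, "web2": 0.01, "web3": 0.01}
all_bad = {"web1": 0.08, "web2": 0.07, "web3": 0.09}
noisy_alert = aggregate_alert(one_noisy, threshold=0.05)
pool_alert = aggregate_alert(all_bad, threshold=0.05)
```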

Page 35: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 36: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 37: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Riemann

● I still don't get it ?

● Distributed Top

● Do you like Clojure ?

● Riemann Health plugin ?

● s/riemann-health/collectd/g;

● Output to graphite

Page 38: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Graphs to Knowledge

Skyline

•Oculus

•Creating Information out of this data

•Big data

•Machine Learning

Page 39: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

But I have log files..

Page 40: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 41: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 42: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 43: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Logs and Metrics

● Graylog2

● ELSA (Enterprise Log Search and Archive)

● ELK Stack

Page 44: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

● Collect from anywhere

● Filter

● Send anywhere

● Queueing

Page 45: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 46: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Black on White ?

Page 47: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

APM

But what about my apps ?

Half the world cheers about SAAS tools :(

Page 48: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Packetbeat

● Traffic Flow through network

● Transactions causing errors

● SQL per HTTP

● API call usage

Page 49: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

PacketBeat

Page 50: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

This new “D” hype

Page 51: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Containers are the new black

● 1 process per container

● Metric collection ?

● Service health ?

Page 52: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

So you want service registration of your healthy (containerized) applications ?

Page 53: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Enter Consul.io

● Service discovery

● Failure detection

● Using Gossip built on top of Serf

● Random node 2 node communication

● A HashiCorp project

Page 54: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Consul

● Uses monitoring_plugins for health

● Creates unhealthy dns setups

● Sensu alike

● Key-Value store

● Consul_template => fills your templates
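Monitoring plugins report state through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and Consul script checks read the same convention (0 passing, 1 warning, anything else failing). A small sketch of a threshold check in that style is shown below. The `check_value` name and the thresholds are illustrative only.

```python
# Exit codes shared by Nagios/Icinga-style monitoring plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_value(value, warn, crit):
    """Map a measured value to (exit code, status line)."""
    if value is None:
        return UNKNOWN, "UNKNOWN - no data"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value {value} >= {crit}"
    if value >= warn:
        return WARNING, f"WARNING - value {value} >= {warn}"
    return OK, f"OK - value {value}"


# Hypothetical thresholds: warn at 80, critical at 90.
code, message = check_value(85, warn=80, crit=90)
```

A wrapper script would print `message` and exit with `code`, so the same check can be run by Icinga or by a Consul agent.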

Page 55: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Everything is a freaking dns problem

Page 56: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Self Healing

● Pacemaker Corosync (ocf resource that monitors your service)

● Mesos

● Kubernetes

● Scale changes, Consensus Models change

Page 57: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

So your DC fails

Whom to alert when ?

Page 58: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
Page 59: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

'New' kids on the block

● Flapjack

● flapjack.io

● monitoring notification routing + event processing system

● OpenDuty

● github.com/szechuen/OpenDuty

● Duty management

Page 60: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

My Alerting Strategy

Is still in beta

Page 61: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

And back :(

In 2014 I'm still running the same check for

- service registration (consul)

- high availability (pacemaker/corosync)

- monitoring (icinga)

Page 62: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

But I love where Monitoring is heading

We have far fewer false positives

And we have a Maintainable Monitoring Infra

Kinda

Page 63: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Your next trip to Gent !

CfgMgmtcamp.eu February 2 and 3, 2015

CFP is Open !

Page 64: OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert

Contact

[email protected]

Further Reading

@krisbuytaert

http://www.krisbuytaert.be/blog/

http://www.inuits.eu/

Inuits

Duboistraat 50

2060 Antwerpen

Belgium

891.514.231

+32 475 961221