Grafana for Messaging Monitoring - CERN ASDF, 17-03-2016
Grafana for Messaging Monitoring
Lionel Cons – CERN IT/CM/MM
17 Mar 2016 Grafana for Messaging Monitoring 1
History
EGEE then EGI messaging brokers monitoring
• multi-site effort: AUTH, CERN and SRCE
• Nagios based
Then added metrics via PNP4Nagios (so RRD).
Then switched to Graphite.
Then used only for the CERN brokers.
Then removed Nagios but kept Graphite.
Then added Grafana.
Requirements
• fine-grained metrics (e.g. number of messages stored for client X on broker Y on host Z)
• medium refresh rate: 1 update / minute
• multi-year retention
• live/streaming access
• advanced metric-based alerts
• user controlled graphs
• easy to build dashboards
Architecture
• Production
• Transport: ActiveMQ
• Storage: Graphite
• Visualization: Grafana
• Analysis: Esper
☞ IT/TF: Advanced monitoring with complex stream processing
Numbers
• 37 brokers on 16+10 hosts
• ~100 metrics, ~8200 metric instances
• ~140 metric updates per second
• ~300 IOPS for Graphite data
• m1.large VM with Ceph io1 volume
• ~3 GB in total for 5 years' worth of data
• ~3 messages per second
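The figures above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes every metric instance is updated once per minute (the refresh rate from the requirements), that "3 GB" means 3 × 10^9 bytes, and that Whisper stores 12 bytes per data point. The retention schema is not given in the slides, so the last step is an inference, not a statement about the actual configuration.

```python
# Back-of-the-envelope check of the "Numbers" slide (assumptions noted inline).
METRIC_INSTANCES = 8200        # from the slide
UPDATE_INTERVAL_S = 60         # assumption: 1 update / minute for every instance
TOTAL_BYTES = 3 * 10**9        # assumption: "~3 GB" read as 3e9 bytes
WHISPER_POINT_BYTES = 12       # Whisper data point: 4-byte timestamp + 8-byte value

# Update rate implied by the instance count and refresh interval.
updates_per_second = METRIC_INSTANCES / UPDATE_INTERVAL_S

# Storage budget per metric instance, and how many points fit in it.
bytes_per_metric = TOTAL_BYTES / METRIC_INSTANCES
points_per_metric = bytes_per_metric / WHISPER_POINT_BYTES

# Points needed to keep 5 years at full one-minute resolution.
full_resolution_points = 5 * 365 * 24 * 60

print(f"updates per second:        {updates_per_second:.0f}")
print(f"bytes per metric:          {bytes_per_metric:.0f}")
print(f"points per metric:         {points_per_metric:.0f}")
print(f"points for 5 y at 1/min:   {full_resolution_points}")
```

The update rate comes out close to the ~140 per second quoted on the slide. The per-metric budget holds far fewer points than five years at one-minute resolution would need, which suggests (but does not prove) that older data is kept at a coarser resolution.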
Graphite
• widely used and available in EPEL
• slowly evolving project
• version 0.9.15 released on 27/11/2015
• excellent web API with many functions
• time series stored using whisper
• next generation storage started: ceres
• very slow progress there…
Graphite Web API
https://mig-graphite.cern.ch/render?target=msgbrk.received_messages.*.atlas
https://mig-graphite.cern.ch/render?target=scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1)
https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)
https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)&width=800&height=400&from=-3month&title=Received%20Messages%20(Hz)&bgcolor=white&fgcolor=black&fontSize=10
https://mig-graphite.cern.ch/render?target=highestAverage(scaleToSeconds(nonNegativeDerivative(msgbrk.received_messages.*.atlas),1),5)&width=800&height=400&from=-3month&title=Received%20Messages%20(Hz)&bgcolor=white&fgcolor=black&fontSize=10&format=json
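The same render URLs can be built and consumed from a script. The following is a minimal sketch, not the tooling used at CERN: the helper names `build_render_url` and `latest_values` are hypothetical, and the sample response (including the broker name `broker1` and its values) is made up to illustrate the shape of `format=json` output, which is a list of series each carrying `[value, timestamp]` pairs.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://mig-graphite.cern.ch/render"  # endpoint shown on the slide

def build_render_url(target, **params):
    """Build a render URL for one target plus optional render parameters.

    urlencode percent-encodes characters such as '(' and '*'; the server
    decodes them, so the request is equivalent to the hand-written URLs.
    """
    query = [("target", target)] + sorted(params.items())
    return BASE_URL + "?" + urlencode(query)

def latest_values(series_list):
    """Map each series name to its most recent non-null value (or None)."""
    latest = {}
    for series in series_list:
        values = [v for v, _ts in series["datapoints"] if v is not None]
        latest[series["target"]] = values[-1] if values else None
    return latest

target = ("highestAverage(scaleToSeconds(nonNegativeDerivative("
          "msgbrk.received_messages.*.atlas),1),5)")
# "from" is a Python keyword, so it is passed through a dict.
url = build_render_url(target, format="json", **{"from": "-3month"})

# Made-up response in the format=json shape, for illustration only.
sample = json.loads(
    '[{"target": "msgbrk.received_messages.broker1.atlas",'
    ' "datapoints": [[1.5, 1458200000], [null, 1458200060]]}]'
)
print(url)
print(latest_values(sample))
```

In real use the URL would be fetched with an HTTP client and the body passed to `json.loads` before calling `latest_values`; authentication and error handling are left out here.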
Grafana
• started as a better Graphite GUI
• Grafana was to Graphite what Kibana was to ES
• very active development
• now supports many time series databases
• production: Graphite, InfluxDB, OpenTSDB
• experimental: KairosDB, Prometheus
• ... and also Elasticsearch
• fits our requirements very well
Grafana Dashboards
Grafana Comments
• we only use a small subset of Grafana
• some features are missing (e.g. template options like stacking) but will probably come
• the dashboard editor is very good for prototyping (especially the query builder)
• however, JSON editing is often needed
• using a database to store configuration information (e.g. available data sources) unnecessarily complicates the deployment
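Since dashboards end up being edited as JSON, generating that JSON from a script is one way to keep it manageable. The sketch below is an illustration only: the helper `graphite_panel`, the data source name `graphite` and the row/panel layout are assumptions, and the exact dashboard schema depends on the Grafana version, so the output should be compared with a dashboard exported from the running instance before use.

```python
import json

def graphite_panel(title, target, datasource="graphite"):
    """Return a graph panel with a single Graphite query (hypothetical helper).

    The field names follow the general shape of Grafana dashboard JSON;
    treat them as an assumption and check them against an exported dashboard.
    """
    return {
        "type": "graph",
        "title": title,
        "datasource": datasource,  # assumed data source name
        "targets": [{"refId": "A", "target": target}],
    }

dashboard = {
    "title": "Received Messages",
    # Assumption: row-based layout; other versions put panels at top level.
    "rows": [
        {
            "panels": [
                graphite_panel(
                    "Received Messages (Hz)",
                    "highestAverage(scaleToSeconds(nonNegativeDerivative("
                    "msgbrk.received_messages.*.atlas),1),5)",
                )
            ]
        }
    ],
}

# Serialize for pasting into the JSON editor or storing under version control.
text = json.dumps(dashboard, indent=2)
print(text)
```

Keeping the query string in one place like this makes it easier to reuse the same Graphite expression across several panels.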
Summary
• for the metrics to monitor our brokers we use:
• Graphite to store and visualize (low level)
• Grafana to visualize (high-level)
• this combination fulfills all our requirements
• a single VM (with Ceph) is enough at our scale
• we are happy with this simple solution