Java application monitoring with Dropwizard Metrics and Graphite

Roberto Franchini (@robfrankie)
Bologna, April 10th, 2015

Upload: roberto-franchini

Post on 17-Jul-2015



Page 2: Java application monitoring with Dropwizard Metrics and graphite

whoami(1)

15 years of experience, proud to be a programmer
Writes software for information extraction, NLP, opinion mining (@scale), and a lot of other buzzwords
Implements scalable architectures
Plays with servers (don't say that to my sysadmin)
Member of the JUG-Torino coordination team

feedback http://lanyrd.com/sdkghq


Company


Agenda

Intro
Scenario
System monitoring
Application monitoring (dark side)
Application monitoring (light side)
Dropwizard Metrics
Dashboards


Quotes


Business value

Our code generates business value when it runs, not when we write it.

We need to know what our code does when it runs.

We can’t do this unless we measure it.

(Codahale)


SLA driven

Have an SLA for your service
Measure and report performance against the SLA

(Ben Treynor, Google Inc.)


Scenario


Infrastructure

45 bare metal servers
Nginx
Jetty (mainly embedded)
PostgreSQL
GlusterFS (28TB and growing)
Kestrel
Kafka on the horizon
Redis
Jenkins as scheduler (cron on steroids)


Software

Java shop
Home made distributed search engine
Home made little PaaS
Docker on the go
More than 120 webapps
More than 100 batch jobs
NRT stream processing jobs running 24x7


Java

Java is not dead
And is almost everywhere
The language is evolving
The JVM is the most advanced managed environment in which to run your code
Choose your style: Scala, Clojure, Groovy


Who uses it (cool side)

Twitter
Spotify
Google
Netflix
LinkedIn


Who uses it (real world)

Your bank


Systems monitoring


Collectd

Since 2012: collectd

systems: load, df, traffic
java (via JMX): heap
queues: items, size
dbms: connections, size


Collectd charts

Traffic


Collectd to Graphite

collectd writes to Graphite via the write_graphite plugin
better charts
dashboards are easy
dashboards are meaningful


Graphite dashboard

Servers load dashboard


Grafana

A beautiful frontend for Graphite
Dashboards are meaningful and BEAUTIFUL
(you can send screenshots to managers now)


Grafana dashboard


Application monitoring


Requirements

Measure behaviors
Send to Graphite
Integrate with system measures
Correlate with system measures


Repeat with me

Correlate application and system metrics


Correlate

graphite

collectd

applications

grafana


To do what?

Discover bottlenecks
Post-mortem analysis
SLA monitoring
IO impact
Network traffic
Memory


User Story

Given the application running
when the manager comes
then I want to show a big green number


The answer

42


In detail

“Application monitoring? WHAT?”
“Ok, let me explain:
What is the app doing right now?
How is the app performing right now?
And then graph it!”
“Ok, I got it!”
“Let me see”


5 minutes later

    public class PoorManJavaMetrics {

        int called;
        long totalTime;

        public void doThings() {
            final long start = System.currentTimeMillis();
            // heavy business logic
            called++;
            final long end = System.currentTimeMillis();
            final long duration = end - start;
            totalTime += duration;
        }

        public void logStats() {
            System.out.println("---stats---");
            // I can’t write that
        }
    }


DIY Java Monitoring

Maybe better with a centralized utility class (maybe…)
Thread safety?
Send measures to different backends?
Log to different logging systems?
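
The thread-safety question can be made concrete. What follows is a minimal sketch, not the speaker's code, of the same hand-rolled approach using java.util.concurrent.atomic.LongAdder so that concurrent callers do not lose updates. The class name SaferPoorManMetrics and its methods are hypothetical. It still does nothing about backends or logging systems, which is the gap a metrics library fills.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical thread-safe variant of a hand-rolled call counter and timer. */
final class SaferPoorManMetrics {
    // LongAdder accepts many concurrent writers without losing increments.
    private final LongAdder calls = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    /** Runs the given work and records one call plus its elapsed time. */
    void doThings(Runnable work) {
        final long start = System.nanoTime(); // monotonic clock, suited to elapsed time
        try {
            work.run();
        } finally {
            calls.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls.sum();
    }

    /** Mean duration per call in milliseconds, or 0 when nothing was recorded. */
    double meanMillis() {
        final long n = calls.sum();
        return n == 0 ? 0.0 : totalNanos.sum() / 1_000_000.0 / n;
    }

    public static void main(String[] args) throws InterruptedException {
        final SaferPoorManMetrics metrics = new SaferPoorManMetrics();
        final Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    metrics.doThings(() -> { });
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("calls=" + metrics.calls() + " meanMillis=" + metrics.meanMillis());
    }
}
```

With plain int and long fields, as in the first version, the four threads in main could overwrite each other's increments; with LongAdder the call count is exact.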


Java Monitoring

Measure in the code
Thread safety
Counters, gauges, meters etc.
Log metrics
Graph metrics
Export metrics


JMX

NOT only JMX
We want more
Integrate JMX metrics from third-party libs


Dropwizard Metrics

https://dropwizard.github.io/metrics/3.1.0/


Overview

Code instrumentation
meters, gauges, counters, histograms

Reporters
console, csv, slf4j, jmx

Web app instrumentation
Web app health check

Advanced reporters
graphite, ganglia


Overview

Third party libs
aspectj
influxdb
statsd
cassandra


Main parts

MetricRegistry
a collection of all the metrics for your application
usually one instance per JVM
use more in multi-WAR deployments

Names
each metric has a unique name
the registry has helper methods for creating names

    MetricRegistry.name(Queue.class, "items", "total")
    // com.example.Queue.items.total
    MetricRegistry.name(Queue.class, "size", "byte")
    // com.example.Queue.size.byte


Metrics

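Gauges
the simplest metric type: it just returns a value

Counters
an incrementing and decrementing 64-bit integer

    import com.codahale.metrics.Counter;
    import com.codahale.metrics.Gauge;
    import com.codahale.metrics.MetricRegistry;
    import java.util.HashMap;
    import java.util.Map;

    final MetricRegistry registry = new MetricRegistry();

    // Gauge: reports the current number of keys each time it is read
    final Map<String, String> keys = new HashMap<>();
    registry.register(MetricRegistry.name("gauge", "keys"), new Gauge<Integer>() {
        @Override
        public Integer getValue() {
            return keys.keySet().size();
        }
    });

    // Counter: incremented explicitly by the application
    final Counter counter = registry.counter(MetricRegistry.name("counter", "inserted"));
    counter.inc();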


Metrics

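Histograms
measure the distribution of values in a stream of data

Meters
measure the rate at which a set of events occur

    import com.codahale.metrics.Histogram;
    import com.codahale.metrics.Meter;

    // Histogram: record the size of each result set
    final Histogram resultCounts = registry.histogram(MetricRegistry.name(ProductDAO.class, "result-counts"));
    resultCounts.update(results.size());

    // Meter: mark one event each time an insert happens
    final Meter meter = registry.meter(MetricRegistry.name("meter", "inserted"));
    meter.mark();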
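
The m1, m5 and m15 values that a meter reports are moving averages of the event rate over roughly the last one, five and fifteen minutes. As a rough illustration of the idea, and not the library's actual code, here is a simplified exponentially weighted moving average updated on a fixed tick. The class MovingAverageRate, the 5-second tick used in main and the single-threaded design are all assumptions made for this sketch.

```java
/** Hypothetical, single-threaded sketch of a moving-average event rate. */
final class MovingAverageRate {
    private final double alpha;       // smoothing factor applied on each tick
    private final double tickSeconds; // length of one tick in seconds
    private long uncounted;           // events marked since the last tick
    private double rate;              // smoothed rate in events per second
    private boolean initialized;

    MovingAverageRate(double windowSeconds, double tickSeconds) {
        this.tickSeconds = tickSeconds;
        // A longer window gives a smaller alpha, so the average reacts more slowly.
        this.alpha = 1.0 - Math.exp(-tickSeconds / windowSeconds);
    }

    /** Records n events that happened during the current tick. */
    void mark(long n) {
        uncounted += n;
    }

    /** Must be called once per tick; folds the last interval into the average. */
    void tick() {
        final double instantRate = uncounted / tickSeconds;
        uncounted = 0;
        if (initialized) {
            rate += alpha * (instantRate - rate);
        } else {
            rate = instantRate; // the first tick seeds the average
            initialized = true;
        }
    }

    double ratePerSecond() {
        return rate;
    }

    public static void main(String[] args) {
        final MovingAverageRate oneMinute = new MovingAverageRate(60.0, 5.0);
        oneMinute.mark(500);
        oneMinute.tick(); // 500 events in 5 seconds
        System.out.println("after a busy tick:  " + oneMinute.ratePerSecond());
        oneMinute.tick(); // a silent tick: the rate decays instead of dropping to zero
        System.out.println("after a silent tick: " + oneMinute.ratePerSecond());
    }
}
```

The design choice worth noticing is that a quiet interval only lowers the rate gradually, which is why a one-minute rate is smoother on a dashboard than a raw per-interval count.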


Metrics

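Timers
a histogram of the duration of a type of event and a meter of the rate of its occurrence

    import com.codahale.metrics.Timer;

    final Timer timer = registry.timer(MetricRegistry.name("timer", "inserted"));
    final Timer.Context context = timer.time();
    try {
        // timed ops
    } finally {
        context.stop(); // record the duration even if the timed code throws
    }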
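
To make the "histogram of durations" half of a timer concrete, here is a small sketch that records elapsed times and answers percentile queries such as the median, p95 and p99. It is an illustration under simplifying assumptions, not the library's implementation: the class DurationSamples is hypothetical, it keeps every sample in memory (a real library would typically keep a bounded sample), it uses the nearest-rank percentile definition, and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: stores durations and computes nearest-rank percentiles. */
final class DurationSamples {
    private final List<Long> nanos = new ArrayList<>();

    /** Adds one duration, expressed in nanoseconds. */
    void record(long durationNanos) {
        nanos.add(durationNanos);
    }

    /** Times the given work with the monotonic clock and records the result. */
    void time(Runnable work) {
        final long start = System.nanoTime();
        try {
            work.run();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    int count() {
        return nanos.size();
    }

    /** Nearest-rank percentile for q in (0, 1], e.g. 0.5 for the median. */
    long percentile(double q) {
        if (q <= 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in (0, 1]");
        }
        if (nanos.isEmpty()) {
            throw new IllegalStateException("no samples recorded");
        }
        final List<Long> sorted = new ArrayList<>(nanos);
        Collections.sort(sorted);
        final int rank = (int) Math.ceil(q * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        final DurationSamples samples = new DurationSamples();
        for (int i = 0; i < 100; i++) {
            samples.time(() -> Math.sqrt(12345.678));
        }
        System.out.println("count=" + samples.count()
                + " medianNanos=" + samples.percentile(0.5)
                + " maxNanos=" + samples.percentile(1.0));
    }
}
```

Percentiles matter for SLA reporting because a mean hides the slow tail: a handful of very slow calls barely moves the average but shows up clearly in the upper percentiles.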


Reporters

JMX
exposes metrics as JMX beans

Console
periodically reports metrics to the console

CSV
appends to a set of .csv files in a given dir

SLF4J
logs metrics to a logger

Graphite
streams metrics to Graphite


Console reporter

    import com.codahale.metrics.ConsoleReporter;
    import java.util.concurrent.TimeUnit;

    final ConsoleReporter console = ConsoleReporter.forRegistry(registry)
            .outputTo(System.out)
            .convertRatesTo(TimeUnit.MINUTES)
            .build();
    console.start(10, TimeUnit.SECONDS); // print a report every 10 seconds

    4/9/15 11:45:57 PM =============================================================

    -- Gauges ----------------------------------------------------------------------
    gauge.keys
                 value = 9901

    -- Counters --------------------------------------------------------------------
    counter.inserted
                 count = 9901

    -- Meters ----------------------------------------------------------------------
    meter.inserted
                 count = 9901
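
Conceptually, a scheduled reporter walks the registered metrics at a fixed period and writes their current values somewhere. The sketch below shows that loop with a plain ScheduledExecutorService. It is a simplified illustration, not the library's reporter: the class TinyConsoleReporter, its register and render methods, and the one-line-per-metric output format are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of a reporter that prints metric values periodically. */
final class TinyConsoleReporter {
    private final Map<String, Supplier<?>> metrics = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Registers a named value that is read again on every report. */
    void register(String name, Supplier<?> value) {
        metrics.put(name, value);
    }

    /** Builds one report: metrics sorted by name, one line each. */
    String render() {
        final Map<String, Supplier<?>> sorted = new TreeMap<>(metrics);
        final StringBuilder report = new StringBuilder();
        for (Map.Entry<String, Supplier<?>> entry : sorted.entrySet()) {
            report.append(entry.getKey())
                  .append(" value = ")
                  .append(entry.getValue().get())
                  .append('\n');
        }
        return report.toString();
    }

    /** Starts printing a report at the given period. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> System.out.print(render()), period, period, unit);
    }

    /** Stops the reporting thread. */
    void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        final Map<String, String> keys = new ConcurrentHashMap<>();
        final TinyConsoleReporter reporter = new TinyConsoleReporter();
        reporter.register("gauge.keys", keys::size);
        reporter.start(1, TimeUnit.SECONDS);
        for (int i = 0; i < 30; i++) {
            keys.put("key-" + i, "value");
            Thread.sleep(100);
        }
        reporter.stop();
    }
}
```

Because values are read at report time rather than pushed by the application, the reporting period (10 seconds in the slide) sets the resolution of everything downstream.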


slf4j reporter

    import com.codahale.metrics.Slf4jReporter;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.LoggerFactory;

    final Slf4jReporter logging = Slf4jReporter.forRegistry(registry)
            .convertDurationsTo(TimeUnit.MINUTES)
            .outputTo(LoggerFactory.getLogger("metrics"))
            .build();
    logging.start(20, TimeUnit.SECONDS); // log a report every 20 seconds

    0 [metrics-logger-reporter-2-thread-1] INFO metrics - type=GAUGE, name=gauge.keys, value=901
    2 [metrics-logger-reporter-2-thread-1] INFO metrics - type=COUNTER, name=counter.inserted, count=901
    6 [metrics-logger-reporter-2-thread-1] INFO metrics - type=METER, name=meter.inserted, count=901, mean_rate=90.03794743129822, m1=81.7831205903394, m5=80.52726521433198, m15=80.30969500950305, rate_unit=events/second
    14 [metrics-logger-reporter-2-thread-1] INFO metrics - type=TIMER, name=timer.inserted, count=900, min=1.9083333333333335E-8, max=0.016671673633333335, mean=1.667999479718904E-4, stddev=0.0016585493668388946, median=7.196666666666667E-8, p75=1.3421666666666667E-7, p95=2.7838333333333335E-7, p98=7.131833333333334E-7, p99=0.01666843721666667, p999=0.016671673633333335, mean_rate=89.8720293570475, m1=81.59911170741354, m5=80.33057092356765, m15=80.11080303990207, rate_unit=events/second, duration_unit=minutes


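Graphite reporter

    import com.codahale.metrics.MetricFilter;
    import com.codahale.metrics.graphite.Graphite;
    import com.codahale.metrics.graphite.GraphiteReporter;
    import java.net.InetSocketAddress;
    import java.util.concurrent.TimeUnit;

    final Graphite graphite = new Graphite(new InetSocketAddress("graphite.example.com", 2003));
    final GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
            .prefixedWith("web1.example.com")
            .convertRatesTo(TimeUnit.SECONDS)
            .convertDurationsTo(TimeUnit.MILLISECONDS)
            .filter(MetricFilter.ALL)
            .build(graphite);
    reporter.start(1, TimeUnit.MINUTES); // send all metrics once a minute

Metrics can be prefixed
Useful to divide environment metrics: prod, test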
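
For readers wondering what actually travels to port 2003: Graphite's plaintext protocol takes one sample per line in the form "path value timestamp", with the timestamp in seconds since the epoch, and dots in the path become levels in the metric tree. The sketch below formats and optionally sends such a line using only the JDK. The class GraphiteLine and the prefix prod.web1 are hypothetical names chosen for illustration, and this is not how the reporter above is implemented internally.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds and sends Graphite plaintext-protocol lines. */
final class GraphiteLine {

    /** Formats one sample as "prefix.name value epochSeconds" plus a newline. */
    static String format(String prefix, String name, double value, long epochSeconds) {
        return prefix + "." + name + " " + value + " " + epochSeconds + "\n";
    }

    /** Opens a TCP connection, writes the line and closes the connection. */
    static void send(String host, int port, String line) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(line);
            out.flush();
        }
    }

    public static void main(String[] args) {
        final long now = System.currentTimeMillis() / 1000L;
        final String line = format("prod.web1", "accounts.authentication.password.failed", 3.0, now);
        System.out.print(line);
        // To send it for real (assumes a reachable Graphite host):
        // send("graphite.example.com", 2003, line);
    }
}
```

This also shows why the prefix matters: it is simply the leading part of the path, so a prefix per environment or host places those metrics in their own subtree.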


Metrics naming

Dot notation by getClass()
easy to create
very long name on dashboard

Maybe better to use
<namespace>.<instrumented section>.<target (noun)>.<action (past tense verb)>

Such as
accounts.authentication.password.failed

Use prefix
prod, test, dev, local
differentiate data retention on Graphite by prefix
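
The naming scheme above is easy to enforce with a tiny helper, so that names stay consistent across teams. This is one possible sketch, not part of the library: the class MetricNames, its method of, and the choice to lower-case segments and replace dots or whitespace inside a segment with underscores are assumptions made for illustration.

```java
import java.util.Locale;
import java.util.StringJoiner;

/** Hypothetical helper enforcing prefix.namespace.section.target.action names. */
final class MetricNames {

    /** Joins the five segments with dots after normalizing each one. */
    static String of(String environment, String namespace, String section,
                     String target, String action) {
        final StringJoiner joined = new StringJoiner(".");
        for (String part : new String[] {environment, namespace, section, target, action}) {
            joined.add(segment(part));
        }
        return joined.toString();
    }

    /** Lower-cases a segment and replaces dots and whitespace with underscores. */
    private static String segment(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("empty name segment");
        }
        // A dot inside a segment would silently add a level to the metric tree.
        return raw.trim().toLowerCase(Locale.ROOT).replaceAll("[.\\s]+", "_");
    }

    public static void main(String[] args) {
        System.out.println(of("prod", "accounts", "authentication", "password", "failed"));
    }
}
```

Keeping the environment as the first segment is what makes it possible to apply different retention rules per prefix on the Graphite side.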


Grafana application overview


Demo


References

https://dropwizard.github.io/metrics/3.1.0/
https://dl.dropboxusercontent.com/u/2744222/2011-04-09-Metrics-Metrics-Everywhere.pdf
http://graphite.wikidot.com/
http://grafana.org/
http://matt.aimonetti.net/posts/2013/06/26/practical-guide-to-graphite-monitoring/
https://www.usenix.org/sites/default/files/conference/protected-files/srecon15_slides_limoncelli.pdf


Thank You

http://lanyrd.com/sdkghq

@robfrankie

[email protected]
