Java application monitoring with Dropwizard Metrics and Graphite

Post on 17-Jul-2015

Java application monitoring with Dropwizard Metrics and Graphite

Roberto Franchini

@robfrankie

Bologna, April 10th, 2015

whoami(1)

15 years of experience, proud to be a programmer

Writes software for information extraction, NLP, opinion mining (@scale), and a lot of other buzzwords

Implements scalable architectures

Plays with servers (don't say that to my sysadmin)

Member of the JUG-Torino coordination team

feedback http://lanyrd.com/sdkghq

Company

Agenda

Intro

Scenario

System monitoring

Application monitoring (dark side)

Application monitoring (light side)

Dropwizard Metrics

Dashboards

Quotes

Business value

Our code generates business value when it runs, not when we write it.

We need to know what our code does when it runs.

We can’t do this unless we measure it.

(Codahale)

SLA driven

Have an SLA for your service

Measure and report performance against the SLA

(Ben Treynor, Google Inc.)

Scenario

Infrastructure

45 bare metal servers

Nginx

Jetty (mainly embedded)

PostgreSQL

GlusterFS (28TB and growing)

Kestrel

Kafka on the horizon

Redis

Jenkins as scheduler (cron on steroids)

Software

Java shop

Home made distributed search engine

Home made little PaaS

Docker on the go

More than 120 webapps

More than 100 batch jobs

NRT stream processing jobs running 24x7

Java

Java is not dead

And is almost everywhere

The language is evolving

The JVM is the most advanced managed environment in which to run your code

Choose your style: Scala, Clojure, Groovy

Who uses it (cool side)

Twitter

Spotify

Google

Netflix

LinkedIn

Who uses it (real world)

Your bank

Systems monitoring

Collectd

From 2012 Collectd

systems: load, df, traffic

java (via jmx): heap

queues: items, size

dbms: connections, size

Collectd charts

Traffic

Collectd to Graphite

collectd writes to graphite

write_graphite

better charts

dashboards are easy

dashboards are meaningful

Graphite dashboard

Servers load dashboard

Grafana

A beautiful frontend for Graphite

Dashboards are meaningful and BEAUTIFUL

(you can send screenshots to managers now)

Grafana dashboard

Application monitoring

Requirements

Measure behaviors

Send to graphite

Integrate with system measures

Correlate with system measures

Repeat with me

Correlate application and system metrics

Correlate

graphite

collectd

applications

grafana

To do what?

Discover bottlenecks

Post-mortem analysis

SLA monitoring

IO impact

Network traffic

Memory

User Story

Given the application running

when the manager comes

then I want to show a big green number

The answer

42

In detail

“Application monitoring? WHAT?”

“Ok, let me explain:

What is the app doing right now?

How is the app performing right now?

And then graph it!”

“Ok, I got it!”

“Let me see”

5 minutes later

    public class PoorManJavaMetrics {

        int called;       // number of invocations (plain field, not thread safe)
        long totalTime;   // cumulative duration in milliseconds

        public void doThings() {
            final long start = System.currentTimeMillis();
            // heavy business logic
            called++;
            final long end = System.currentTimeMillis();
            final long duration = end - start;
            totalTime += duration;
        }

        public void logStats() {
            System.out.println("---stats---");
            // I can’t write that
        }
    }

DIY Java Monitoring

Maybe better with a centralized utility class (maybe…)

thread safety?

send measures to different backends?

log to different logging systems?
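The thread-safety question is a real one: `called++` and `totalTime += duration` on plain fields are not atomic, so concurrent callers can lose updates. Below is a minimal stdlib-only sketch of one way around that, assuming `java.util.concurrent.atomic.LongAdder` is acceptable. The class name `ThreadSafeStats` and its methods are hypothetical and not taken from the talk.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical thread-safe variant of a hand-rolled stats class.
final class ThreadSafeStats {
    private final LongAdder calls = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Runs the task and records one call plus its elapsed time,
    // even when the task throws.
    void measure(Runnable task) {
        final long start = System.nanoTime();
        try {
            task.run();
        } finally {
            calls.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    long calls() {
        return calls.sum();
    }

    long totalNanos() {
        return totalNanos.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        final ThreadSafeStats stats = new ThreadSafeStats();
        final Runnable worker = () -> {
            for (int i = 0; i < 1000; i++) {
                stats.measure(() -> { });
            }
        };
        final Thread a = new Thread(worker);
        final Thread b = new Thread(worker);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("calls=" + stats.calls() + " totalNanos=" + stats.totalNanos());
    }
}
```

This still leaves the other questions on the slide open (multiple backends, logging), which is where a dedicated metrics library comes in.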

Java Monitoring

Measure in the code

Thread safety

Counters, gauges, meters etc.

Log metrics

Graph metrics

Export metrics

JMX

NOT only JMX

We want more

Integrate JMX metrics from third-party libs
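For context on what JMX already offers, the JVM publishes its own measurements (heap usage among them) as platform MXBeans, and they can be read in-process through `java.lang.management`. The sketch below wraps the used-heap figure in a gauge-like `Supplier`. The class name `HeapGauge` is hypothetical, and this is a stdlib illustration rather than anything shown in the talk.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.function.Supplier;

// Hypothetical helper exposing a JMX-backed value as a gauge-like supplier.
final class HeapGauge {

    // Each call to get() reads the current used heap, in bytes,
    // from the platform MemoryMXBean.
    static Supplier<Long> usedHeapBytes() {
        final MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return () -> memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("heap.used=" + usedHeapBytes().get());
    }
}
```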

Dropwizard Metrics

https://dropwizard.github.io/metrics/3.1.0/

Overview

Code instrumentation: meters, gauges, counters, histograms

Reporters: console, csv, slf4j, jmx

Web app instrumentation

Web app health check

Advanced reporters: graphite, ganglia

Overview

Third party libs: aspectj, influxdb, statsd, cassandra

Main parts

MetricRegistry

a collection of all the metrics for your application

usually one instance per JVM

use more than one in multi-WAR deployments

Names

each metric has a unique name

the registry has helper methods for creating names

    // the class-based helper uses the fully qualified class name
    MetricRegistry.name(Queue.class, "items", "total")
    // com.example.Queue.items.total

    MetricRegistry.name(Queue.class, "size", "byte")
    // com.example.Queue.size.byte

Metrics

Gauges

the simplest metric type: it just returns a value

Counters

an incrementing and decrementing 64-bit integer

    final Map<String, String> keys = new HashMap<>();
    // the gauge reports the current number of keys each time it is read
    registry.register(MetricRegistry.name("gauge", "keys"), new Gauge<Integer>() {
        @Override
        public Integer getValue() {
            return keys.size();
        }
    });

    final Counter counter = registry.counter(MetricRegistry.name("counter", "inserted"));
    counter.inc();

Metrics

Histograms

measure the distribution of values in a stream of data

Meters

measure the rate at which a set of events occurs

    final Histogram resultCounts = registry.histogram(MetricRegistry.name(ProductDAO.class, "result-counts"));
    resultCounts.update(results.size());

    final Meter meter = registry.meter(MetricRegistry.name("meter", "inserted"));
    meter.mark();
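To make "rate of events" concrete, here is a deliberately simplified, stdlib-only sketch of a meter that tracks a count and the mean rate since creation. It is not the library's implementation: the real `Meter` also reports 1, 5 and 15 minute moving averages (the `m1`, `m5`, `m15` fields in the logged output later in the deck). The class `SimpleMeter` and the injected clock are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

// Hypothetical, simplified meter: event count plus mean rate since creation.
final class SimpleMeter {
    private final LongAdder count = new LongAdder();
    private final LongSupplier nanoClock;
    private final long startNanos;

    // The clock is injected so the rate can be checked deterministically.
    SimpleMeter(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    // Records one event.
    void mark() {
        count.increment();
    }

    long count() {
        return count.sum();
    }

    // Events per second, averaged over the meter's whole lifetime.
    double meanRatePerSecond() {
        final long elapsed = nanoClock.getAsLong() - startNanos;
        return elapsed <= 0 ? 0.0 : count.sum() * 1_000_000_000.0 / elapsed;
    }

    public static void main(String[] args) {
        final SimpleMeter meter = new SimpleMeter(System::nanoTime);
        for (int i = 0; i < 100; i++) {
            meter.mark();
        }
        System.out.println(meter.count() + " events, mean rate " + meter.meanRatePerSecond() + "/s");
    }
}
```

A lifetime mean reacts slowly to recent changes, which is the reason moving averages are usually the more useful figures on a dashboard.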

Metrics

Timers

a histogram of the duration of a type of event and a meter of the rate of its occurrence

    final Timer timer = registry.timer(MetricRegistry.name("timer", "inserted"));
    final Timer.Context context = timer.time();
    try {
        // timed ops
    } finally {
        // always stop the context, even if the timed code throws
        context.stop();
    }
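One detail worth spelling out in the timing-context pattern: the context has to be stopped even when the timed code throws, otherwise that call is never recorded. The stdlib-only sketch below mimics the pattern with an `AutoCloseable` context so that try-with-resources does the bookkeeping. `SimpleTimer` and its nested `Context` are hypothetical stand-ins and not the library's classes, and only a count and a total duration are kept here rather than a full histogram.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a timer with a closeable timing context.
final class SimpleTimer {
    private final LongAdder count = new LongAdder();
    private final LongAdder totalNanos = new LongAdder();

    // Closing the context records one event and its duration.
    final class Context implements AutoCloseable {
        private final long start = System.nanoTime();

        @Override
        public void close() {
            count.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }

    Context time() {
        return new Context();
    }

    long count() {
        return count.sum();
    }

    long totalNanos() {
        return totalNanos.sum();
    }

    public static void main(String[] args) {
        final SimpleTimer timer = new SimpleTimer();
        try (SimpleTimer.Context ignored = timer.time()) {
            Math.sqrt(42.0); // timed ops
        }
        System.out.println("count=" + timer.count() + " totalNanos=" + timer.totalNanos());
    }
}
```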

Reporters

JMX: exposes metrics as JMX beans

Console: periodically reports metrics to the console

CSV: appends to a set of .csv files in a given directory

SLF4J: logs metrics to a logger

Graphite: streams metrics to Graphite

Console reporter

    final ConsoleReporter console = ConsoleReporter.forRegistry(registry)
            .outputTo(System.out)
            .convertRatesTo(TimeUnit.MINUTES)
            .build();
    console.start(10, TimeUnit.SECONDS);

    4/9/15 11:45:57 PM =============================================================

    -- Gauges ----------------------------------------------------------------------
    gauge.keys
                 value = 9901

    -- Counters --------------------------------------------------------------------
    counter.inserted
                 count = 9901

    -- Meters ----------------------------------------------------------------------
    meter.inserted
                 count = 9901

slf4j reporter

    final Slf4jReporter logging = Slf4jReporter.forRegistry(registry)
            .convertDurationsTo(TimeUnit.MINUTES)
            .outputTo(LoggerFactory.getILoggerFactory().getLogger("metrics"))
            .build();
    logging.start(20, TimeUnit.SECONDS);

    0 [metrics-logger-reporter-2-thread-1] INFO metrics - type=GAUGE, name=gauge.keys, value=901
    2 [metrics-logger-reporter-2-thread-1] INFO metrics - type=COUNTER, name=counter.inserted, count=901
    6 [metrics-logger-reporter-2-thread-1] INFO metrics - type=METER, name=meter.inserted, count=901, mean_rate=90.03794743129822, m1=81.7831205903394, m5=80.52726521433198, m15=80.30969500950305, rate_unit=events/second
    14 [metrics-logger-reporter-2-thread-1] INFO metrics - type=TIMER, name=timer.inserted, count=900, min=1.9083333333333335E-8, max=0.016671673633333335, mean=1.667999479718904E-4, stddev=0.0016585493668388946, median=7.196666666666667E-8, p75=1.3421666666666667E-7, p95=2.7838333333333335E-7, p98=7.131833333333334E-7, p99=0.01666843721666667, p999=0.016671673633333335, mean_rate=89.8720293570475, m1=81.59911170741354, m5=80.33057092356765, m15=80.11080303990207, rate_unit=events/second, duration_unit=minutes

Graphite reporter

    final Graphite graphite = new Graphite(new InetSocketAddress("graphite.example.com", 2003));
    final GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
            .prefixedWith("web1.example.com")
            .convertRatesTo(TimeUnit.SECONDS)
            .convertDurationsTo(TimeUnit.MILLISECONDS)
            .filter(MetricFilter.ALL)
            .build(graphite);
    reporter.start(1, TimeUnit.MINUTES);

Metrics can be prefixed

Useful to divide environment metrics: prod, test
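It can help to see what travels over the wire. Graphite's Carbon daemon accepts a plaintext protocol, by default on port 2003, in which every datapoint is one line of the form `<metric path> <value> <timestamp in epoch seconds>`. The stdlib-only sketch below formats and sends such a line by hand. The class `GraphitePlaintext`, the host name and the metric path are illustrative assumptions, and in practice the reporter does this work for you.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for Graphite's plaintext protocol.
final class GraphitePlaintext {

    // One datapoint: "<path> <value> <epoch seconds>" terminated by a newline.
    static String line(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    // Opens a TCP connection, writes a single datapoint and closes the connection.
    static void send(String host, int port, String path, double value) throws IOException {
        final long now = System.currentTimeMillis() / 1000L;
        try (Socket socket = new Socket(host, port);
             Writer writer = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            writer.write(line(path, value, now));
            writer.flush();
        }
    }

    public static void main(String[] args) {
        // Prints the line instead of sending it, so no Graphite server is needed.
        System.out.print(line("prod.web1.counter.inserted", 9901.0, System.currentTimeMillis() / 1000L));
    }
}
```

Because the metric path is just a dotted string, a prefix such as an environment or host name simply becomes the first segments of that path.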

Metrics naming

Dot notation by getClass(): easy to create, but gives very long names on the dashboard

Maybe better to use <namespace>.<instrumented section>.<target (noun)>.<action (past tense verb)>

Such as accounts.authentication.password.failed

Use a prefix: prod, test, dev, local

Differentiate data retention on Graphite by prefix
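The naming scheme can be captured in a small helper so that every metric is built the same way. This is a minimal stdlib-only sketch; the class `MetricNames` is hypothetical, and it simply joins the non-empty parts with dots, with the environment prefix as the first part.

```java
import java.util.StringJoiner;

// Hypothetical helper for building dotted metric names.
final class MetricNames {

    // Joins the non-empty parts with dots, for example
    // prefix, namespace, instrumented section, target, action.
    static String name(String... parts) {
        final StringJoiner joiner = new StringJoiner(".");
        for (String part : parts) {
            if (part != null && !part.isEmpty()) {
                joiner.add(part);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(name("prod", "accounts", "authentication", "password", "failed"));
    }
}
```

Keeping the prefix as the leading segment means retention rules on the Graphite side can match on the start of the path.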

Grafana application overview

Demo

References

https://dropwizard.github.io/metrics/3.1.0/

https://dl.dropboxusercontent.com/u/2744222/2011-04-09-Metrics-Metrics-Everywhere.pdf

http://graphite.wikidot.com/

http://grafana.org/

http://matt.aimonetti.net/posts/2013/06/26/practical-guide-to-graphite-monitoring/

https://www.usenix.org/sites/default/files/conference/protected-files/srecon15_slides_limoncelli.pdf

Thank You

http://lanyrd.com/sdkghq

@robfrankie

franchini@celi.it
