Top Node.js Metrics to Watch
TRANSCRIPT
![Page 1: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/1.jpg)
Top Node.js Metrics to Watch
Stefan Thies
![Page 2: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/2.jpg)
Agenda
- Development of node.js performance monitoring agents
- Node.js key metrics
![Page 3: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/3.jpg)
![Page 4: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/4.jpg)
Metrics Aggregation
- pre-aggregation in monitoring agent
- N measures per minute -> sum, max, min, avg, rates, percentiles
- aggregation plans in Sematext backend (Java / Hadoop)
- 1 min, 5 min, 1 hour, 1 day, 1 week, 1 month
- fast queries over long periods of time and multiple dimensions (e.g. filters for host, process/worker id)
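The pre-aggregation step can be sketched roughly as follows. This is a minimal illustration, not the agent's actual code: the `aggregate` function, its field names, and the nearest-rank percentile method are assumptions made for the example.

```javascript
'use strict';

// Hypothetical helper: reduce the raw values collected during one interval
// (e.g. one minute) to a handful of summary statistics.
function aggregate(values, intervalSeconds) {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  // Nearest-rank percentile: smallest value with at least p% of samples at or below it.
  const percentile = (p) =>
    sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
  return {
    count: sorted.length,
    sum,
    min: sorted[0],
    max: sorted[sorted.length - 1],
    avg: sum / sorted.length,
    rate: sorted.length / intervalSeconds, // measurements per second
    p50: percentile(50),
    p95: percentile(95),
  };
}

// Example: five measurements taken within one minute.
const summary = aggregate([12, 7, 30, 9, 15], 60);
console.log(summary);
```

Sending one summary object per interval instead of every raw measurement keeps the payload size independent of how busy the process is.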
![Page 5: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/5.jpg)
Store & Forward
1) Buffer metrics when the receiver is not reachable ...
2) Re-transmit metrics, stored in NeDB
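A minimal sketch of the two steps, assuming a `send` function that returns a Promise and rejects while the receiver is unreachable. The class name, the `maxBuffered` limit, and the in-memory array are assumptions for illustration; the slide states that the agent stores buffered metrics in NeDB.

```javascript
'use strict';

// Hypothetical sketch: `send` is any function returning a Promise that
// rejects when the receiver is unreachable. A plain array stands in for
// the NeDB store mentioned on the slide.
class StoreAndForward {
  constructor(send, maxBuffered = 1000) {
    this.send = send;
    this.maxBuffered = maxBuffered;
    this.buffer = [];
  }

  // Step 1: try to deliver; on failure keep the metric for later.
  async submit(metric) {
    try {
      await this.send(metric);
    } catch (err) {
      if (this.buffer.length >= this.maxBuffered) this.buffer.shift(); // drop oldest
      this.buffer.push(metric);
    }
  }

  // Step 2: re-send buffered metrics in order; stop at the first failure.
  // Returns the number of metrics still waiting.
  async flush() {
    while (this.buffer.length > 0) {
      try {
        await this.send(this.buffer[0]);
        this.buffer.shift();
      } catch (err) {
        break;
      }
    }
    return this.buffer.length;
  }
}
```

Calling `flush()` on a timer, or after the next successful `submit()`, is one possible retransmit trigger.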
![Page 6: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/6.jpg)
1 http.post(options, cb1)
3 http.post(options, cb2)
3 http.post(options, cb3)
4 http.post(options, cb4)
5 cb4 (err)
5 cb1
6 cb2
7 cb3
Java Server
Threads, Thread Pool, limited e.g. max 3
Node Client & Java Backend
async, non-blocking Main + Event Loop Thread
HTTP 500 internal server error
Luke be nice to
Node.js Client
![Page 7: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/7.jpg)
“A stupid guy called ‘Travis’, made DoS attacks!!!”
8-minute unit tests for network/storage test cases ⇒ 30 seconds :)
![Page 8: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/8.jpg)
OS Metrics limited in node.js API
- Limited Memory info os.freemem(), os.totalmem()
- a few missing CPU metrics: os.cpus()
- No Disk stats in node API
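For reference, this is roughly what the built-in `os` module offers. The helper names below are made up for the example; only `os.totalmem()`, `os.freemem()`, and `os.cpus()` come from the Node.js API. CPU usage has to be derived from two snapshots, because `os.cpus()` reports cumulative times per mode.

```javascript
'use strict';
const os = require('os');

// Memory: only total and free bytes are available.
function memorySnapshot() {
  const total = os.totalmem();
  const free = os.freemem();
  return { total, free, usedPercent: ((total - free) / total) * 100 };
}

// os.cpus() reports cumulative milliseconds per mode for each core;
// add them up across all cores.
function cpuTimes() {
  const totals = { user: 0, nice: 0, sys: 0, idle: 0, irq: 0 };
  for (const cpu of os.cpus()) {
    for (const mode of Object.keys(totals)) totals[mode] += cpu.times[mode];
  }
  return totals;
}

// CPU usage in percent between two cpuTimes() snapshots.
function cpuUsagePercent(before, after) {
  const sum = (t) => t.user + t.nice + t.sys + t.idle + t.irq;
  const totalDelta = sum(after) - sum(before);
  if (totalDelta <= 0) return 0;
  const idleDelta = after.idle - before.idle;
  return (1 - idleDelta / totalDelta) * 100;
}

console.log(memorySnapshot());
console.log(cpuTimes());
```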
![Page 9: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/9.jpg)
How to Load the Monitoring Agent?
OK - for Devs, but Ops don’t like to touch source code ...
Node 4.x to the rescue! Pre-loading modules with ‘-r’ / ‘--require’
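A sketch of what a pre-loadable agent module could look like. The file name `agent.js`, the `startAgent` function, and the one-minute interval are assumptions for illustration, not the actual agent.

```javascript
// agent.js (hypothetical file name)
'use strict';

// Report process memory at a fixed interval through the given callback.
function startAgent(report, intervalMs) {
  const timer = setInterval(() => report(process.memoryUsage()), intervalMs);
  timer.unref(); // do not keep the application alive just for monitoring
  return timer;
}

startAgent((mem) => console.log('rss bytes:', mem.rss), 60 * 1000);

// Loaded without touching the application's source code:
//   node -r ./agent.js app.js
```

Because the module is loaded before the application's entry point, the application code itself needs no `require` call for the agent.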
![Page 10: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/10.jpg)
Garbage Collection
- Incremental marking and lazy sweeping
- marking ‘stop the world’
- Incremental GC cycles (scavenge)
- Full GC cycles
- What should be measured?
- Count of GC cycles
- Rate GC cycles / time
- Sum GC Time
- Released Memory (before GC - after GC)
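The four measures above can be collected with a small accumulator. The sketch below is an assumption-laden illustration: the `GcCounters` class is made up for the example, and it is fed by the `perf_hooks` `PerformanceObserver`, which exists only in Node.js versions newer than the 4.x line discussed in this talk. Those `gc` entries carry the pause duration but no heap sizes, so released memory has to come from another source that reports heap size before and after a cycle.

```javascript
'use strict';
const { PerformanceObserver } = require('perf_hooks');

// Hypothetical accumulator for the GC measures listed above.
class GcCounters {
  constructor() {
    this.reset();
  }

  reset() {
    this.cycles = 0;
    this.totalPauseMs = 0;
    this.releasedBytes = 0;
  }

  // heapBefore/heapAfter are optional; not every source provides them.
  record(pauseMs, heapBefore, heapAfter) {
    this.cycles += 1;
    this.totalPauseMs += pauseMs;
    if (heapBefore !== undefined && heapAfter !== undefined) {
      this.releasedBytes += Math.max(0, heapBefore - heapAfter);
    }
  }

  // Summarise one interval and start the next one.
  snapshot(intervalMinutes) {
    const result = {
      cycles: this.cycles,
      cyclesPerMinute: this.cycles / intervalMinutes,
      totalPauseMs: this.totalPauseMs,
      releasedBytes: this.releasedBytes,
    };
    this.reset();
    return result;
  }
}

const counters = new GcCounters();
// Each 'gc' entry reports the pause duration in milliseconds.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) counters.record(entry.duration);
});
observer.observe({ entryTypes: ['gc'] });
```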
![Page 11: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/11.jpg)
How to get GC info?
Find GC options: node --v8-options | grep _gc
node --trace-gc --trace_gc_nvp lib/index.js
[7729:0x101804600] [I:0x101804600] 26 ms: pause=0.9 mutator=-1455940110228.4 gc=s external=0.0 mark=0.0 sweep=0.00 sweepns=0.00 sweepos=0.00 sweepcode=0.00 sweepcell=0.00 sweepmap=0.00 evacuate=0.0 new_new=0.0 root_new=0.0 old_new=0.0 compaction_ptrs=0.0 intracompaction_ptrs=0.0 misc_compaction=0.0 weak_closure=0.0 inc_weak_closure=0.0 weakcollection_process=0.0 weakcollection_clear=0.0 weakcollection_abort=0.0 total_size_before=2360232 total_size_after=2257696 holes_size_before=32 holes_size_after=32 allocated=2360232 promoted=0 semi_space_copied=929376 nodes_died_in_new=7 nodes_copied_in_new=5 nodes_promoted=0 promotion_ratio=0.0% average_survival_ratio=90.1% promotion_rate=0.0% semi_space_copy_rate=90.1% new_space_allocation_throughput=0 context_disposal_rate=0.0 steps_count=0 steps_took=0.0 scavenge_throughput=1180677
![Page 12: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/12.jpg)
NPM modules for GC info
- gc-stats
- gc-profiler
- memwatch(-next)
- missing gc times
- + leak detection
- + heap diff
Native C++ modules V8 API / NAN 1.x vs. NAN 2.x
![Page 13: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/13.jpg)
![Page 14: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/14.jpg)
NPM GC packages
![Page 15: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/15.jpg)
GC Insights as part of Node.js API?
![Page 16: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/16.jpg)
Examples for Node.js Metrics
![Page 17: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/17.jpg)
Server, Process, Application Metrics
- CPU Usage
- Memory Usage
- Disk: I/O reads/writes, space
- Process Metrics
- Application Metrics: in-process monitor
![Page 18: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/18.jpg)
GC Metrics
GC cycles / min < 50
GC Time < 20 ms / min
Released Mem. 2 MB / cycle
![Page 19: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/19.jpg)
Example - Monitoring Kibana 4.1 Node.js App
http://blog.sematext.com/2015/05/27/monitoring-kibana-4s-node-js-app/
- 2.0 Ruby server
- 3.0 HTML5 no server
- 4.0-4.2 Node Express
- > 4.3 Node Hapi.js
We run a managed ELK Stack / Logging SaaS
![Page 20: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/20.jpg)
GC cycles - out of control!
GC cycles / min: 45,000 (!)
GC Time: < 10 sec / min
???
OOM Kill
![Page 21: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/21.jpg)
Taming GC ...
GC cycles / min: 100
GC Time: < 92 ms / min
--max-old-space-size=200
GC cycles / min: 45,000 (!)
GC Time: < 10 sec / min
Update to Node.js 4.2.x
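To check which heap limit a process actually runs with, the built-in `v8` module can be queried. This is a small sketch with made-up helper names. Note that `heap_size_limit` covers more than the old space, so the reported value is expected to be near, not exactly equal to, the `--max-old-space-size` setting.

```javascript
'use strict';
const v8 = require('v8');

// Heap size limit of the current process in megabytes.
function heapLimitMb() {
  return v8.getHeapStatistics().heap_size_limit / (1024 * 1024);
}

// Share of the heap limit that is currently in use, in percent.
function heapUsedPercent() {
  const stats = v8.getHeapStatistics();
  return (stats.used_heap_size / stats.heap_size_limit) * 100;
}

console.log('heap limit (MB):', heapLimitMb().toFixed(0));
console.log('heap used (%):', heapUsedPercent().toFixed(1));
```

Started as `node --max-old-space-size=200 app.js`, the first value should drop accordingly.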
![Page 22: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/22.jpg)
Event Loop Latency
Avg. Latency < 0.5 ms
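One common way to approximate event loop latency is to schedule a timer and measure how late it fires. The sketch below assumes this timer-based approach; the function name and the 10 ms delay are made up for the example, and the talk does not say how its agent measures the value.

```javascript
'use strict';

// Hypothetical sampler: schedule a timer and report how much later than
// requested it fired. A busy event loop delays the callback.
function measureLoopLag(delayMs, callback) {
  const start = process.hrtime();
  setTimeout(() => {
    const [sec, nanosec] = process.hrtime(start);
    const elapsedMs = sec * 1e3 + nanosec / 1e6;
    callback(Math.max(0, elapsedMs - delayMs));
  }, delayMs);
}

measureLoopLag(10, (lagMs) => {
  console.log('event loop lag: ' + lagMs.toFixed(2) + ' ms');
});
```

Sampling this repeatedly and aggregating per minute yields an average like the one shown on the slide.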
![Page 23: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/23.jpg)
Event Loop Latency
Event Loop Latency < 0.5 ms
3 / 15 ms !!!
![Page 24: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/24.jpg)
Don’t Block the Event Loop ...
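One way to avoid blocking is to split long-running synchronous work into slices and yield to the event loop between them. The helper below is a hypothetical sketch of that idea using `setImmediate`; the function name and chunk size are assumptions.

```javascript
'use strict';

// Hypothetical helper: process `items` in slices of `chunkSize` and yield
// to the event loop between slices so that pending I/O callbacks can run.
function processInChunks(items, chunkSize, handle, done) {
  let index = 0;
  function next() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) handle(items[index]);
    if (index < items.length) {
      setImmediate(next);
    } else {
      done();
    }
  }
  next();
}

let total = 0;
processInChunks([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, (n) => { total += n; }, () => {
  console.log('sum:', total);
});
```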
![Page 25: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/25.jpg)
Process Memory
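Process memory can be read from inside the process with `process.memoryUsage()`. The wrapper below is a minimal sketch; the function name and the conversion to megabytes are choices made for the example.

```javascript
'use strict';

// Resident set size and V8 heap figures of the current process, in MB.
function processMemoryMb() {
  const toMb = (bytes) => Math.round((bytes / (1024 * 1024)) * 100) / 100;
  const usage = process.memoryUsage();
  return {
    rss: toMb(usage.rss),             // memory held by the whole process
    heapTotal: toMb(usage.heapTotal), // heap allocated by V8
    heapUsed: toMb(usage.heapUsed),   // heap in use by the application
  };
}

console.log(processMemoryMb());
```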
![Page 26: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/26.jpg)
Example - Process Memory
OOM Kill
![Page 27: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/27.jpg)
Number of Workers
![Page 28: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/28.jpg)
Number of Workers
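When the application uses the built-in `cluster` module, the number of workers can be read in the master process. This sketch assumes `cluster` is in use; the function name is made up.

```javascript
'use strict';
const cluster = require('cluster');

// cluster.workers is only populated in the master process;
// inside a worker it is undefined.
function workerCount() {
  return cluster.workers ? Object.keys(cluster.workers).length : 0;
}

console.log('workers:', workerCount());
```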
![Page 29: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/29.jpg)
Correlate Metrics for Different Workers
![Page 30: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/30.jpg)
HTTP Metrics
![Page 31: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/31.jpg)
HTTP Request and Error Rate
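Request and error counts can be gathered by wrapping the request handler. The sketch below is an illustration under assumptions: the `HttpCounters` class and `instrument` function are made up, and treating every status code of 400 or above as an error is one possible definition.

```javascript
'use strict';
const http = require('http');

// Hypothetical counters for requests, errors and a per-status breakdown.
class HttpCounters {
  constructor() {
    this.requests = 0;
    this.errors = 0;
    this.byStatus = {};
  }

  record(statusCode) {
    this.requests += 1;
    if (statusCode >= 400) this.errors += 1; // assumption: 4xx and 5xx are errors
    this.byStatus[statusCode] = (this.byStatus[statusCode] || 0) + 1;
  }

  errorRate() {
    return this.requests === 0 ? 0 : this.errors / this.requests;
  }
}

// Wrap a request handler so that every finished response is counted.
function instrument(handler, counters) {
  return (req, res) => {
    res.on('finish', () => counters.record(res.statusCode));
    handler(req, res);
  };
}

const httpCounters = new HttpCounters();
const server = http.createServer(
  instrument((req, res) => res.end('ok'), httpCounters)
);
// server.listen(3000); // port 3000 is an arbitrary example
```

The `byStatus` map gives the raw data for a per-status error breakdown.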
![Page 32: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/32.jpg)
Error Breakdown
![Page 33: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/33.jpg)
Get the Full Picture ...
![Page 34: Top Node.js Metrics to Watch](https://reader031.vdocuments.site/reader031/viewer/2022021922/5870a4421a28abcb078b56c7/html5/thumbnails/34.jpg)
Thank you!
www.npmjs.com/~megastef
www.npmjs.com/~sematext
@seti321 or @sematext