LinuxCon 2011: OpenVZ and Linux Kernel Testing
TRANSCRIPT
- 1. Andrew Vagin, Developer, Linux Kernel team: OpenVZ and Linux Kernel Testing
- 2. Agenda
  - Linux containers and OpenVZ
  - Ideal test lab
  - Testing techniques
  - Performance testing
  - Anecdotes
- 3. Quotes
  - Andrew Morton: "I'm curious. For the past few months, [email protected] have discovered (and fixed) an ongoing stream of obscure but serious and quite long-standing bugs. How are you discovering these bugs?"
  - Andrew added later: "hm, OK, I was visualizing some mysterious Russian bugfinding machine or something. Don't stop ;)"
  - David Miller: "This issue has existed since the very creation of the netlink code :-)"
- 4. Linux Containers (LXC)
  - Many isolated environments on top of a single kernel
  - Namespaces
  - Resource accounting
  - OpenVZ Containers
    - Better resource accounting
    - Checkpointing and live migration
    - Extra features: CPU limits, NFS inside CTs, etc.
- 5. What makes a good test lab?
  - Fully automated system with deployment service
  - A web interface for test scheduling
  - Standard test sets (combo #3, make it large)
  - A web interface for test results (comparisons, graphs, logs)
  - Integration with a bug tracking system
  - Net or serial console to collect kernel oopses
  - KVM, power switch, other goodies
- 6. How do we find bugs in the mainstream kernel?
  - Containers help us find more bugs
    - Independent life cycles
    - Precise resource accounting
  - Containers allow us to
    - Test initialization/finalization of kernel subsystems
    - Test error paths
    - Catch more leaks than regular testing does
    - Catch more race conditions by means of stress testing
- 7. Start/stop test
  - Massively parallel start/stop and suspend/resume
  - Random resource parameters
  - Helps to catch:
    - Race conditions
    - Bugs in error paths
    - Memory leaks
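
The start/stop test above could be driven by a harness along these lines. This is a minimal sketch and not the OpenVZ team's actual tooling: `start` and `stop` are hypothetical caller-supplied hooks (for example, wrappers around a container management CLI), and the parameter names and values are arbitrary illustrations.

```python
import random
import threading


def stress_start_stop(ct_ids, start, stop, rounds=3, seed=0):
    """Cycle every container `rounds` times, all containers in parallel.

    `start(ct_id, params)` and `stop(ct_id)` are hypothetical hooks supplied
    by the caller. Returns a list of (ct_id, round, exception) tuples, one
    for every cycle that raised.
    """
    failures = []
    lock = threading.Lock()

    def worker(ct_id):
        # Per-container RNG so a run can be replayed from the same seed.
        rng = random.Random(f"{seed}-{ct_id}")
        for rnd in range(rounds):
            # Random limits, some deliberately tiny, to push the kernel into
            # allocation failures and other rarely exercised error paths.
            # The values below are illustrative assumptions.
            params = {
                "kmemsize": rng.choice([64 * 1024, 1 << 20, 16 << 20]),
                "numproc": rng.randint(1, 256),
            }
            try:
                start(ct_id, params)
                stop(ct_id)
            except Exception as exc:
                with lock:
                    failures.append((ct_id, rnd, exc))

    threads = [threading.Thread(target=worker, args=(ct,)) for ct in ct_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures
```

A failed start under a tiny limit is often expected, so the caller decides which recorded failures matter. Kernel oopses and leaks have to be detected separately, for example through the console and the resource counters.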
- 8. What makes a good performance test?
  - Effective load:
    - Atomic (UnixBench)
    - Complex (LAMP, SPEC-JBB, vConsolidate)
  - Sane test environment (no random cron jobs etc.)
  - Automation (minimize human interaction)
  - Reproducible results, minimize variability
  - Understand test results, even good ones
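
One way to make "reproducible results" checkable is to repeat each benchmark several times and look at the spread before trusting the mean. The sketch below uses the coefficient of variation. The 5% threshold and the function name are assumptions chosen for illustration; they do not come from the talk.

```python
import statistics


def summarize_runs(samples, max_cv=0.05):
    """Summarize repeated benchmark runs and flag noisy result sets.

    `samples` needs at least two measurements with a non-zero mean.
    `max_cv` is an assumed acceptance threshold for the coefficient of
    variation (sample standard deviation divided by the mean).
    """
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    cv = stdev / mean
    return {"mean": mean, "stdev": stdev, "cv": cv, "stable": cv <= max_cv}
```

A result set flagged as unstable is a hint to inspect the test environment (background jobs, thermal throttling) before comparing configurations.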
- 9. Density testing
  - High density is an important feature of OpenVZ (vs VMs)
  - The test measures response time across a number of CTs, increasing the number of CTs until the response time becomes unacceptable
  - It's not a stress test
  - Produces a large resource overcommit
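
The density test loop described above can be sketched as follows. `measure_response_time` is a hypothetical callable that brings the lab up to `n` running containers and returns the observed response time. The acceptance threshold and the step size are assumptions left to the caller.

```python
def find_max_density(measure_response_time, max_time, limit, step=1):
    """Grow the container count until the response time is no longer acceptable.

    Returns (best, history): `best` is the largest count whose response time
    stayed within `max_time` (0 if even the first step failed), and `history`
    lists every (count, response_time) pair that was measured.
    """
    history = []
    best = 0
    n = step
    while n <= limit:
        t = measure_response_time(n)
        history.append((n, t))
        if t > max_time:
            break  # response time is "bad": stop adding containers
        best = n
        n += step
    return best, history
```

Keeping the full history, not just the final count, makes it possible to plot response time against density and compare kernels.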
- 10. Other useful tests
  - Week load test replays real httpd logs in real containers
  - Feature tests: isolation, CPU scheduler, checkpointing, network virtualization, second level quota, etc.
  - Third-party tests: LTP, Connectathon, vSpecJBB, vConsolidate, UnixBench, sysbench, DVD-store, Netperf
- 11. Real life stories
- 12. (1) How a Russian bug finding machine works
  - QA found a leak of 78 bytes of kernel memory
  - The developer was unable to reproduce the bug
  - He found that this was a leak of a 'struct user' object
  - He audited the kernel code that references this object
  - Found one suspicious place
  - Wrote demo code to trigger the bug, and a fix
  - ...
  - PROFIT!
- 13. (2) How resource controls prevented a DoS attack

    uid / resource      held   maxheld   barrier     limit  failcnt
    numothersocks          9       360       360       360        1

    uid / resource      held   maxheld   barrier     limit  failcnt
    kmemsize         1237973  14372344  14372700  14790164       80
    numothersocks          9       360       360       360        1

  - A simple kernel attack using socketpair(), a.k.a. CVE-2010-4249
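
A non-zero `failcnt` is the signal in the counters above: it means a container hit its limit and the kernel refused the allocation. A test harness can scan for such rows automatically. The sketch below assumes a whitespace-separated layout with the columns from the slide, where a row may be prefixed by `<uid>:`. The uid `101` in the sample is hypothetical, and the function name is an illustration only.

```python
def failed_resources(text):
    """Return {(uid, resource): failcnt} for every row with failcnt > 0."""
    failed = {}
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # A row may start with "<uid>:"; later rows belong to the same uid.
        if fields and fields[0].endswith(":") and fields[0][:-1].isdigit():
            uid = int(fields[0][:-1])
            fields = fields[1:]
        # Data rows: a resource name followed by five integer columns
        # (held, maxheld, barrier, limit, failcnt). Skip everything else.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[1:]):
            continue
        resource, failcnt = fields[0], int(fields[5])
        if failcnt > 0:
            failed[(uid, resource)] = failcnt
    return failed


# Rows taken from the slide; the uid is a made-up example value.
SAMPLE = """\
uid  resource           held   maxheld   barrier     limit  failcnt
101: kmemsize        1237973  14372344  14372700  14790164       80
     numothersocks         9       360       360       360        1
"""

if __name__ == "__main__":
    for (uid, resource), count in failed_resources(SAMPLE).items():
        print(f"CT {uid}: {resource} hit its limit {count} time(s)")
```

Comparing the result before and after a test run shows which limits the test actually exercised.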
- 14. (3) How a guy measured netns performance
  - It was a nice sunny day...
  - 5 different configurations to test
  - Unpredictable, random results
  - CPU throttling caused by overheating; adding a case fan helped!
- 15. Conclusion
  - Containers are good for kernel testing
  - Resource limits (cgroups) are also helpful
  - [Most] performance tests are a hoax
- 16. Andrew Vagin: Thank you. Questions?