Capriccio: Scalable Threads for Internet Services
Rob von Behren, Jeremy Condit, Feng Zhou, George Necula, and Eric Brewer
University of California at Berkeley
Presenter: Olusanya Soyannwo
Outline
Motivation
Background
Goals
Approach
Experiments
Results
Related work
Conclusion & Future work
EECS Advanced Operating Systems Northwestern University
Motivation
Increasing scalability demands for Internet services
Hardware improvements are limited by existing software
Current implementations are event based
Background: Event-Based Systems – Drawbacks
Event systems hide the control flow
  • Difficult to understand and debug
Programmers need to match related events
Burdens programmers
Goals: Capriccio
Support for existing thread API
Scalability to hundreds of thousands of threads
Automate application-specific customization
Approach: Capriccio
Thread package
  • Cooperative scheduling
Linked stacks
  • Address the problem of stack allocation for large numbers of threads
  • Combination of compile-time and run-time analysis
Resource-aware scheduler
Approach: User Level Thread – The Choice
POSIX API
  • (-) Complex preemption
  • (-) Bad interaction with kernel scheduler
Performance
  • Lower thread synchronization overhead
  • No kernel crossing for preemptive threading
  • More efficient memory management at user level
Flexibility
  • Decoupling user and kernel threads allows faster innovation
  • Can use new kernel thread features without changing application code
  • Scheduler tailored for applications
Approach: User Level Thread – Disadvantages
Additional overhead
  • Replacing blocking calls with non-blocking calls
Multiple-CPU synchronization
Approach: User Level Thread – Implementation
Context switches
  • Built on top of Edgar Toernig’s coroutine library
  • Fast context switches when threads voluntarily yield
I/O
  • Capriccio intercepts blocking I/O calls
  • Uses epoll for asynchronous I/O
Scheduling
  • Very much like an event-driven application
  • Events are hidden from programmers
Synchronization
  • Supports cooperative threading on single-CPU machines
    • Requires only Boolean checks
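The implementation points above can be illustrated with a small sketch. It is not Capriccio's code (Capriccio is a C library built on coroutines and epoll); it substitutes Python generators for coroutines and the standard `selectors` module for epoll. The names `run` and `reader` are hypothetical. Each "thread" yields the socket it wants to read, which stands in for an intercepted blocking call, and the scheduler resumes it once the socket is readable.

```python
import selectors
import socket
from collections import deque


def run(threads):
    """Cooperative scheduler: run each generator until it yields a socket to wait on."""
    sel = selectors.DefaultSelector()
    ready = deque(threads)
    waiting = 0
    while ready or waiting:
        while ready:
            thread = ready.popleft()
            try:
                sock = next(thread)          # runs until the next "blocking" read
            except StopIteration:
                continue                     # thread finished
            sel.register(sock, selectors.EVENT_READ, thread)
            waiting += 1
        if waiting:
            for key, _ in sel.select():      # the event loop stays hidden in here
                sel.unregister(key.fileobj)
                waiting -= 1
                ready.append(key.data)
    sel.close()


def reader(sock, log):
    """Thread body written in plain sequential style."""
    yield sock                               # stands in for an intercepted blocking read
    log.append(sock.recv(16))


a, b = socket.socketpair()
c, d = socket.socketpair()
b.send(b"ping")
d.send(b"pong")
log = []
run([reader(a, log), reader(c, log)])
```

The thread body reads top to bottom like blocking code, while the scheduler underneath behaves like an event-driven application.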
Approach: Linked Stack
The problem: fixed stacks
  • Overflow vs. wasted space
  • Limits thread numbers
The solution: linked stacks
  • Allocate space as needed
  • Compiler analysis
    • Add runtime checkpoints
    • Guarantee enough space until next check
[Figure: fixed stacks compared with a linked stack]
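A rough sketch of what a runtime checkpoint might do, under stated assumptions: the class `LinkedStack`, its methods, and the value of `MIN_CHUNK` are hypothetical names chosen for illustration, and the real mechanism works on machine stacks in C rather than Python lists. Unlinking a chunk on return is left out. The checkpoint makes sure the bytes needed before the next checkpoint are available, linking a new chunk only when the current one is too small.

```python
MIN_CHUNK = 4096  # assumed minimum chunk size in bytes


class LinkedStack:
    def __init__(self):
        self.chunks = []  # each chunk is [capacity, used]

    def checkpoint(self, needed):
        """Guarantee `needed` free bytes; return True if a new chunk was linked."""
        if self.chunks:
            capacity, used = self.chunks[-1]
            if capacity - used >= needed:
                return False
        self.chunks.append([max(needed, MIN_CHUNK), 0])
        return True

    def push_frame(self, size):
        """Consume stack space; the preceding checkpoint must have reserved it."""
        chunk = self.chunks[-1]
        assert chunk[1] + size <= chunk[0], "checkpoint bound violated"
        chunk[1] += size


stack = LinkedStack()
first = stack.checkpoint(100)    # no chunk yet, so one is linked
stack.push_frame(100)
second = stack.checkpoint(3000)  # still fits in the first chunk
stack.push_frame(3000)
third = stack.checkpoint(2000)   # does not fit, so a second chunk is linked
```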
Approach: Linked Stack
Parameters
  • MaxPath
  • MinChunk
Steps
  • Break cycles
  • Trace back
Special cases
  • Function pointers
  • External calls
[Figure: call graph with nodes weighted 5, 4, 2, 6, 3, 3, 2, 3; MaxPath = 8]
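The trace-back step can be sketched as follows. This is a simplified illustration, not the compiler pass itself: it assumes cycles have already been broken (the call graph is acyclic), that every single frame fits within MaxPath, and it ignores function pointers and external calls. The function name `place_checkpoints` and the example graph are hypothetical. Working from the leaves upward, it places a checkpoint on any call edge that would otherwise let the stack grow past `max_path` without a check.

```python
def place_checkpoints(frames, calls, max_path):
    """Return (checkpoint edges, longest unchecked path starting at each function)."""
    checkpoints = set()
    longest = {}

    def visit(fn):
        if fn in longest:
            return longest[fn]
        worst = 0
        for callee in calls.get(fn, ()):
            depth = visit(callee)
            if frames[fn] + depth > max_path:
                checkpoints.add((fn, callee))  # callee gets checked before it runs
            else:
                worst = max(worst, depth)
        longest[fn] = frames[fn] + worst
        return longest[fn]

    for fn in frames:
        visit(fn)
    return checkpoints, longest


# Hypothetical call graph: frame sizes per function and caller -> callees.
frames = {"main": 5, "a": 4, "b": 2, "c": 6}
calls = {"main": ["a", "b"], "a": ["c"]}
checkpoints, longest = place_checkpoints(frames, calls, max_path=8)
```

With these inputs, the edges `a -> c` and `main -> a` receive checkpoints, and no unchecked path exceeds 8.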
Approach: Scheduling
Advantages of event-based scheduling
  • Tailored for applications
  • With event handlers
Events provide two important pieces of information for scheduling
  • Whether a process is close to completion
  • Whether a system is overloaded
Approach: Scheduling – The Blocking Graph
[Figure: blocking graph with nodes Main, Threadcreate, Sleep, Read, Write, Write, Close]
Thread-based
  • View applications as sequence of stages, separated by blocking calls
Analogous to event-based scheduler
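One way to picture the blocking graph is to build it from traces of blocking calls. The sketch below is an assumption-laden illustration: the function `build_blocking_graph` and the trace contents are hypothetical, and nodes are plain strings rather than the program locations a real runtime would record. Each node is a blocking point, and each edge counts how often a thread moved from one blocking point to the next.

```python
from collections import Counter


def build_blocking_graph(traces):
    """Count transitions between consecutive blocking points across all threads."""
    edges = Counter()
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            edges[(src, dst)] += 1
    return edges


# Hypothetical per-thread sequences of blocking calls.
traces = [
    ["main", "threadcreate", "read", "write", "close"],
    ["main", "threadcreate", "read", "write", "write", "close"],
]
graph = build_blocking_graph(traces)
```

The stretch of code between two blocking points plays the role that an event handler plays in an event-based system.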
Approach: Resource-aware Scheduling
Track resources used along blocking graph (BG) edges
  • Memory, file descriptors, CPU
  • Predict future from the past
  • Algorithm
    • Increase use when underutilized
    • Decrease use near saturation
Advantages
  • Operate near the knee w/o thrashing
  • Automatic admission control
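A minimal sketch of the idea, with several assumptions: the function names, the smoothing factor `alpha`, the saturation threshold `high`, and the FIFO fallback are all illustrative choices, not values or policies taken from Capriccio. Past resource usage per blocking-graph node is smoothed with a moving average ("predict future from the past"), and near saturation the scheduler prefers threads whose next stage is expected to release resources.

```python
def update_average(avg, sample, alpha=0.25):
    """Exponentially weighted average of resource change observed at a node."""
    return (1 - alpha) * avg + alpha * sample


def pick_next(ready, expected_delta, utilization, high=0.9):
    """Choose the next (thread_id, node) pair to run.

    expected_delta[node] is the predicted change in resource use when a
    thread runs from that node; negative values mean resources are freed.
    """
    if utilization >= high:
        # Near saturation: favor the stage that frees the most resources.
        return min(ready, key=lambda item: expected_delta[item[1]])
    return ready[0]  # Otherwise keep simple FIFO order.


ready = [(1, "read"), (2, "close")]
expected_delta = {"read": 4.0, "close": -4.0}
busy_choice = pick_next(ready, expected_delta, utilization=0.95)
idle_choice = pick_next(ready, expected_delta, utilization=0.30)
```

Holding back threads that would consume more of a saturated resource is what yields admission control without the application asking for it.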
Experiment: Threading Microbenchmarks
SMP, two 2.4 GHz Xeon processors
1 GB memory
Two 10K RPM SCSI Ultra II hard drives
Linux 2.5.70
Compared Capriccio, LinuxThreads, and Native POSIX Threads for Linux (NPTL)
Experiment: Thread Scalability
Producer-consumer microbenchmark
LinuxThreads begins to degrade after 20 threads
NPTL degrades after 100
Capriccio scales to 32K producers and consumers (64K threads total)
Results: Thread Primitive - Latency
| Operation | Capriccio | LinuxThreads | NPTL |
| --- | --- | --- | --- |
| Thread creation | 21.5 | 21.5 | 17.7 |
| Thread context switch | 0.24 | 0.71 | 0.65 |
| Uncontended mutex lock | 0.04 | 0.14 | 0.15 |
Results: Thread Scalability
Results: I/O performance
Network performance
  • Token passing among pipes
  • Simulates the effect of slow client links
  • 10% overhead compared to epoll
  • Twice as fast as both LinuxThreads and NPTL when more than 1000 threads
Disk I/O comparable to kernel threads
Results: Runtime Overhead
Tested Apache 2.0.44
Stack linking
  • 73% slowdown for null call
  • 3-4% overall
Resource statistics
  • 2% (on all the time)
  • 0.1% (with sampling)
Stack traces
  • 8% overhead
Results: Web Server Performance
Related Work
Programming model of high concurrency
  • Event-based models are a result of poor thread implementations
User-level threads
  • Capriccio is unique
Kernel threads
  • NPTL
Application-specific optimization
  • SPIN & Exokernel
  • Burden on programmers
  • Portability
Asynchronous I/O
Stack management
  • Using heap requires a garbage collector (ML of NJ)
Related Work (cont’d)
Resource-aware scheduling
  • Several similar to Capriccio
Future Work
Threading
  • Multi-CPU support
  • Kernel interface
(Enabled) compile-time techniques
  • Variations on linked stacks
  • Static blocking graph
Scheduling
  • More sophisticated prediction
Conclusion
Capriccio simplifies high concurrency
  • Scalable & high performance
  • Control over concurrency model
    • Stack safety
    • Resource-aware scheduling
    • Enables compiler support, invariants
Issues
  • Additional burden to programmer
  • Resource controlled sched.? What hysteresis?