hrtimers and beyond transformation of the linux time(r) systemarch 2 arch 3 timekeeping tick process...
TRANSCRIPT
![Page 1: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/1.jpg)
hrtimers and beyond
transformation of the Linux time(r) system
Thomas GleixnerDouglas Niehaus
OLS 2006
![Page 2: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/2.jpg)
Original time(r) system
TOD Clock source HW
ISR Clock event source HW
TOD Clock source HW
ISR Clock event source HW
TOD Clock source HW
ISR Clock event source HW
Arch 1
Arch 2
Arch 3
Timekeeping
Tick
Process acc.
Profiling
Jiffies
Timer wheel
![Page 3: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/3.jpg)
History
● double linked list sorted by expiry time● UTIME (1996)● timer wheel (1997)● HRT (2001)● hrtimers (2006)
![Page 4: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/4.jpg)
Timer Wheel
● periodic tick necessary● O(1) insertion / deletion● recascading in bursts (can cause high
latencies)● higher tick frequencies don't scale due
to long lasting timer callbacks and increased recascading
![Page 5: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/5.jpg)
Cascading
![Page 6: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/6.jpg)
Cascading
100 250 1000 HZ[1] 256 10 4 1 ms[2] 64 2560 1024 256 ms[3] 64 164 66 16 s[4] 64 175 70 17 m[5] 64 186 75 19 h
![Page 7: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/7.jpg)
Cascading CONFIG_BASE_SMALL=y
100 250 1000 HZ[1] 64 10 4 1 ms[2] 16 640 256 64 ms[3] 16 10240 4096 1024 ms[4] 16 164 66 16 s[5] 16 44 17 4 m
![Page 8: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/8.jpg)
Cascading● array sizes have to be chosen carefully
taking tick frequency into account● rare (multiple) cascades increase latency
–use cases have to be analysed to avoid problematic cascading
● separating timers with high accuracy requirement from coarse grained timeouts will relax the situation
![Page 9: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/9.jpg)
timers vs. timeouts
timers● precise event
scheduling● accurate● likely to expire
timeouts● report error
conditions● coarser grained● likely to be
removed before expiration
![Page 10: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/10.jpg)
History of high resolution timers
● UTIME – KURTLinux – University of Kansas
● HRT – fork of UTIME– Monta Vista
● Hrtimers– Linutronix
![Page 11: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/11.jpg)
Why hrtimers ?● UTIME and HRT added a subjiffy field
– Kept jiffy ticks by design to avoid broader kernel change impact
– Modes: on top of the timer wheel or separate highresolution event list
● HRT moved high resolution timers into a separate list one tick before expiry–Suffered from timer wheel latencies
![Page 12: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/12.jpg)
hrtimers● timers inserted into a redblack tree
sorted by expiration time● separate queue for each base clock,
which allowed simplifying POSIX timers ● base code is still tick driven (softirq is
called in the timer softirq context)● time values are kept in new data type
ktime_t (using nanosecond base)
![Page 13: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/13.jpg)
ktime_t● optimizable data type for both 32 and
64 bit machines● plain nanosecond value on 64 bit CPU● (seconds, nanoseconds) pair on 32 bit
CPUs with field order allowing (depending on the endianess) 64 bit add, subtract, compare operations.
![Page 14: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/14.jpg)
hrtimer users
● nanosleep● itimer● POSIX timers● timed futex operations
![Page 15: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/15.jpg)
hrtimers
TOD Clock source HW
ISR Clock event source HW
TOD Clock source HW
ISR Clock event source HW
TOD Clock source HW
ISR Clock event source HW
Arch 1
Arch 2
Arch 3
Timekeeping
Tick
Process acc.
Profiling
Jiffies
Timer wheel
hrtimers
![Page 16: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/16.jpg)
how to get high resolution timers ?
● solve the tick (jiffy) dependency of timekeeping
● create a generic framework for next event interrupt programming
● replace the periodic tick interrupt by timers under hrtimers
![Page 17: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/17.jpg)
Timekeeping
● Make use of John Stultz's Generic Time of Day framework–architecture independent–generic framework replaces
duplicated architecture code–better decoupling from tick
![Page 18: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/18.jpg)
hrtimers + GTOD
HW
ISR Clock event source HW
HW
ISR Clock event source HW
HW
ISR Clock event source HW
Arch 1
Arch 2
Arch 3
Timekeeping
Tick
Process acc.
Profiling
Jiffies
Timer wheel
hrtimers
Clock source
TODClock synchr.
Shared HW
![Page 19: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/19.jpg)
clockevents
● Generic infrastructure to distribute timer related events–architecture independent–generic framework replaces
duplicated architecture code–allows quality based selection of
clock event hardware
![Page 20: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/20.jpg)
hrtimers + GTOD + clockevents
HW
HW
HW
HW
HW
ISR HW
Arch 1
Arch 2
Arch 3
Timekeeping
Tick
Process acc.
Profiling
Jiffies
Timer wheel
hrtimers
Clock source
TODClock synchr.
Shared HW
Clock events
ISR
Event distribution
Shared HW
![Page 21: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/21.jpg)
tick emulation
● Use a perCPU hrtimer to emulate tick–update jiffies and NTP adjustments–perCPU calls
● process accounting and profiling● Allows high resolution timers and/or
dynamic ticks
![Page 22: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/22.jpg)
hrtimers + GTOD + clockevents + tick emulation
HW
HW
HW
HW
HW
ISR HW
Arch 1
Arch 2
Arch 3
Timekeeping
Process acc.
Profiling
Jiffies
Timer wheel
hrtimers
Clock source
TODClock synchr.
Shared HW
Clock events
ISR
Event distribution
Shared HW
Next event
Dynamic tick
hrtimers
![Page 23: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/23.jpg)
high resolution performance
Kernel min max avg2.6.16 24 4042 1989 µs
2.6.16hrt 12 94 20 µs2.6.16rt 6 40 10 µs
clock_nanosleep(ABS_TIME)interval: 10ms10000 loopsno load
![Page 24: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/24.jpg)
high resolution performance
Kernel min max avg2.6.16 55 4280 2198 µs
2.6.16hrt 11 458 55 µs2.6.16rt 16 55 20 µs
clock_nanosleep(ABS_TIME)interval 10ms10000 loops100% load
![Page 25: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/25.jpg)
dynamic tick idle behaviour
● timer interrupts reduced to ~1 per second.– instrumentation to identify the timer
(ab)users to improve the idle sleep length
![Page 26: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/26.jpg)
timer wheel batching
● run the timer wheel at a lower frequency than the scheduler tick by skipping timer wheel processing for a user space configurable number of ticks
● improves interactivity
![Page 27: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/27.jpg)
things to be done
● get it merged (target is 2.6.19)● support more architectures
(prototypes for ARM and PPC available)
● tighter integration into power management
![Page 28: hrtimers and beyond transformation of the Linux time(r) systemArch 2 Arch 3 Timekeeping Tick Process acc. Profiling Jiffies Timer wheel History double linked list sorted by expiry](https://reader034.vdocuments.site/reader034/viewer/2022051908/5ffb3b0ff05d0c3157460163/html5/thumbnails/28.jpg)
Conclusions
● significant changes are necessary but the benefit is significant increases in:–architecture independent code–ease of using wide range of time
keeping and timer event hardware– increased resolution for scheduled
events when desired