soft, hard and ruby hard real time with linux

Soft, Hard and Ruby Hard Real Time Approaches with Linux

Gilad Ben-Yossef, CTO, Codefidence Ltd.

Real Time: A Definition

"A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints of the system are not met, system failure is said to have occurred." Donald Gillies, quoted in the Usenet comp.realtime FAQ


The Path to Real Time in Linux

POSIX Real Time Scheduling Domains

Priority Inversion

Locking Memory

Timer Frequency

Sources Of Latency

Scheduling Latency

Interrupt Latency

Real Time Linux Benchmarks

Linux Priorities

[Figure: the Linux priority scale.]

Real time processes (SCHED_FIFO, SCHED_RR): real time priority 99, 98, 97, ... 4, 3, 2, 1

Non real-time processes (SCHED_OTHER): real time priority 0, nice level -20, -19, -18, ... 16, 17, 18, 19

Changing Real Time Priorities

int sched_setscheduler(pid_t pid, int policy, const struct sched_param *p);

struct sched_param { int sched_priority; };

sched_setscheduler() sets the scheduling policy and priority of the process identified by pid.

Policy is SCHED_FIFO, SCHED_RR, SCHED_OTHER or SCHED_BATCH.
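A minimal sketch of the call in use, assuming a Linux system with glibc. The helper names clamp_rt_priority() and make_fifo() are made up for this example; only sched_setscheduler(), sched_get_priority_min() and sched_get_priority_max() are real API. Without sufficient privileges the call is expected to fail with EPERM.

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: clamp a requested priority into the range the
 * kernel reports as valid for the given policy. */
int clamp_rt_priority(int policy, int prio)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (prio < lo)
        return lo;
    if (prio > hi)
        return hi;
    return prio;
}

/* Hypothetical helper: move process `pid` (0 means the caller) to
 * SCHED_FIFO at the given priority.
 * Returns 0 on success, -1 on failure with errno set. */
int make_fifo(pid_t pid, int prio)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = clamp_rt_priority(SCHED_FIFO, prio);

    if (sched_setscheduler(pid, SCHED_FIFO, &sp) == -1) {
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Calling make_fifo(0, 50) early in main() would put the calling process in the real time domain, if it is allowed to.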

Thread Level Priorities

In Linux, each thread has a separate real time priority. When creating threads with pthread_create(), an attribute structure can be passed. Various thread attributes can be used to set the scheduling domain and priority. The relevant attributes can also be changed at run time.

Sched Policy

Thread Attribute: schedpolicy

Select the scheduling policy for the thread: one of SCHED_OTHER (regular, non-realtime scheduling), SCHED_RR (realtime, round-robin) or SCHED_FIFO (realtime, first-in first-out). Default value: SCHED_OTHER. The real time scheduling policies SCHED_RR and SCHED_FIFO are available only to processes with superuser privileges. The scheduling policy of a thread can be changed after creation with pthread_setschedparam(3).


Sched Param

Thread Attribute: schedparam

Contains the scheduling parameters (essentially, the scheduling priority) for the thread. Default value: priority is 0. This attribute is not significant if the scheduling policy is SCHED_OTHER; it only matters for the real time policies SCHED_RR and SCHED_FIFO. The scheduling priority of a thread can be changed after creation with pthread_setschedparam(3).


Inherit Sched

Thread Attribute: inheritsched

Indicates whether the scheduling policy and scheduling parameters for the newly created thread are determined by the values of the schedpolicy and schedparam attributes (PTHREAD_EXPLICIT_SCHED) or are inherited from the parent thread (PTHREAD_INHERIT_SCHED). Default value: POSIX says PTHREAD_EXPLICIT_SCHED, but at least some versions of Linux default to PTHREAD_INHERIT_SCHED.
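The three attributes can be combined as in the sketch below. It assumes a POSIX threads implementation such as glibc; spawn_with_policy() and report_policy() are hypothetical names used only here. The inheritsched attribute is set explicitly because, as noted, the default may be to inherit.

```c
#include <pthread.h>
#include <sched.h>

/* Demo thread body (hypothetical): store the policy this thread runs
 * under into the int that `arg` points to. */
void *report_policy(void *arg)
{
    struct sched_param sp;
    int policy = -1;

    if (pthread_getschedparam(pthread_self(), &policy, &sp) == 0)
        *(int *)arg = policy;
    return NULL;
}

/* Hypothetical helper: start `fn` in a new thread with an explicit
 * policy and priority. Returns 0 or an error number, as the pthread
 * functions themselves do. */
int spawn_with_policy(pthread_t *tid, int policy, int prio,
                      void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = prio };
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Without this, schedpolicy and schedparam may be ignored. */
    rc = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(&attr, policy);
    if (rc == 0)
        rc = pthread_attr_setschedparam(&attr, &sp);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}
```

With SCHED_FIFO or SCHED_RR and a non-zero priority, pthread_create() is expected to fail with EPERM for unprivileged callers.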


Priority Inversion

1. Low priority task takes mutex.
2. High priority task preempts low priority task.
3. High priority task blocks on mutex.
4. Medium priority task preempts low priority task and high priority task.

[Figure: task priority (axis marked 99, 50 and 3) plotted against time.]

Priority Inheritance

The common solution to priority inversion is called Priority Inheritance.

A task which holds a lock automatically inherits the priority of the highest priority task which contends for the lock. Until the lock is released, of course.

Correct implementation is hard, the performance impact is non trivial, and it is not a silver bullet. The Linux implementation was... difficult.

PI-Futex

Interface through the Fast User-space muTEX mechanism.

Linux 2.6 application mutex support code. Similar kernel functionality in PREEMPT_RT.

Supports user-space mutexes only.

Introduced in 2.6.18.

A patched Glibc is needed right now. The patch is being merged into mainline Glibc.


How fork() seems to work

A complete copy of the parent process is created. Both parent and child continue execution from the same spot.

[Figure: the parent process is copied to create the child process.]

How fork() really works

Child process gets a copy of stack and file descriptors. Copy On Write reference is used for the memory.

[Figure: the parent process with RW memory before fork(); parent process and child process both referencing the same memory as RO afterwards.]

What happens during write?

When a write is attempted, a single memory page is copied and the references are updated. This is called: breaking the CoW.

[Figure: original memory, parent process and child process, with pages marked RO and RW after the copy.]
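The visible effect of breaking the CoW can be checked from user space: a write in the child never shows up in the parent. A minimal sketch, with the hypothetical function name child_write_is_private():

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in the data segment, which parent and child share
 * copy-on-write right after fork(). */
static volatile int shared_value = 1;

/* Returns 1 if a write made by the child stayed private to the child,
 * 0 if the parent saw it, -1 on error. */
int child_write_is_private(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Write fault: the kernel copies this page for the child. */
        shared_value = 2;
        _exit(shared_value == 2 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* The parent still sees its own, untouched page. */
    return shared_value == 1;
}
```

The page copy happens at the moment of the first write, which is why a real time task can see an unexpected delay there.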


Locking Memory

int mlockall(int flags);

Disables paging for all pages mapped into the address space of the calling process. Flags are:

MCL_CURRENT: Lock all pages which are currently mapped into the address space of the process.

MCL_FUTURE: Lock all pages which will become mapped into the address space of the process in the future.

Use both for maximum effect: MCL_CURRENT|MCL_FUTURE
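A minimal sketch of the call with both flags. lock_all_memory() is a hypothetical wrapper; the call can fail, typically with EPERM or ENOMEM, when the process lacks privileges or exceeds its locked memory limit.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical wrapper: lock current and future pages of the calling
 * process in RAM. Returns 0 on success, -1 on failure with errno set. */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
        fprintf(stderr, "mlockall: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

This would normally be called once, during start-up, before the time critical part of the program begins.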


Stack Pre-Faulting

Linux user space stacks are auto expanding.

Use more stack than is allocated to the process? The kernel gets an exception and allocates more stack for you. This can turn a stack access into multiple context switches and a memory allocation with non deterministic latency.

We need to pre-fault stack pages.

Call a dummy function that allocates an automatic variable big enough for your entire future stack usage and writes to it, after you've mlocked memory.
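The dummy function could look like the sketch below. MAX_STACK_USE is an assumed figure picked for illustration; a real program has to estimate its own worst-case stack depth. prefault_stack() is a hypothetical name.

```c
#include <stddef.h>
#include <unistd.h>

#define MAX_STACK_USE (64 * 1024)   /* assumed worst-case stack usage, bytes */

/* Touch one byte in every page of a large automatic buffer so that the
 * kernel maps those stack pages now instead of later.
 * Returns the number of pages touched. */
size_t prefault_stack(void)
{
    volatile unsigned char dummy[MAX_STACK_USE];
    long page = sysconf(_SC_PAGESIZE);
    size_t touched = 0;

    if (page <= 0)
        page = 4096;                /* fallback guess if sysconf() fails */

    for (size_t i = 0; i < sizeof(dummy); i += (size_t)page) {
        dummy[i] = 0;               /* volatile write forces the access */
        touched++;
    }
    dummy[sizeof(dummy) - 1] = 0;   /* also touch the far end of the buffer */
    return touched;
}
```

Calling it after mlockall(MCL_CURRENT | MCL_FUTURE) means the touched pages are locked as they are mapped, so later stack use should not fault.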

Timer frequency

Linux timer interrupts are raised HZ times per second, that is, once every 1/HZ of a second. HZ is configurable during kernel build: 100, 250 (i386 default) or 1000. See kernel/Kconfig.hz. It is a compromise between system responsiveness and global throughput. Caution: not any value can be used. Constraints apply!
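The arithmetic behind the compromise can be sketched as follows. This is a simplified model that assumes tick-based timeouts only expire on tick boundaries, so a sleep is rounded up to a whole number of ticks; the real kernel may wait longer still. All three function names are made up for the example.

```c
/* Tick period in microseconds for a given HZ. */
unsigned long tick_period_us(unsigned int hz)
{
    return 1000000UL / hz;
}

/* Smallest whole number of ticks that covers `ms` milliseconds. */
unsigned long ms_to_ticks_round_up(unsigned int hz, unsigned long ms)
{
    return (ms * hz + 999) / 1000;
}

/* Sleep time in ms after rounding up to tick boundaries (model only). */
unsigned long rounded_sleep_ms(unsigned int hz, unsigned long ms)
{
    return ms_to_ticks_round_up(hz, ms) * 1000 / hz;
}
```

Under this model a requested 11 ms sleep becomes 20 ms with HZ=100 but stays 11 ms with HZ=1000.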

The Effect of Timer Frequency

[Figure: requested sleep time versus real sleep time, both in ms, with HZ=100.]

The Effect of Timer Frequency cont.

[Figure: requested sleep time versus real sleep time, both in ms, with HZ=1000.]

High-Res Timers

High-res timers use non RTC interrupt timer sources to deliver timer interrupts between ticks. This allows POSIX timers and nanosleep() to be as accurate as the hardware allows.

This feature is transparent. When enabled, it makes these timers much more accurate than the current HZ resolution: around 1 usec on typical hardware.

Sources Of Latency

[Figure: timeline of an interrupt: PIC, context switch (saving registers...), finding the ISR, ISR, scheduler, context switch, task. The spans are labelled Interrupt Latency and Preemption Latency.]

The Linux O(1) Scheduler

The kernel maintains 2 priority arrays: the active and the expired array. Each array contains 140 entries, each with a queue of processes of the same priority. The arrays are implemented such that it is possible to pick the queue with the highest priority runnable task in constant time, whatever the number of runnable tasks is.

Let's see how this helps us...

Choosing and Expiring Tasks

The scheduler finds the highest priority with a runnable task and executes the first task with that priority.

Non real time tasks run until they exhaust their time slice and are then moved to the expired array.

Real time tasks run until they yield the CPU [*].

This is done until there are no more tasks in the active array, then the two arrays are swapped and we start over.

Scheduling is O(1).
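One way to get a constant time pick is a bitmap with one bit per priority plus a find-first-set instruction. The sketch below is a user space illustration of that idea, not the kernel's code: the structure and function names are made up, a counter stands in for each task queue, index 0 is assumed to be the highest priority, and __builtin_ctzll is a GCC/Clang builtin.

```c
#include <stdint.h>
#include <string.h>

#define NUM_PRIO 140
#define WORDS ((NUM_PRIO + 63) / 64)

struct prio_array {
    uint64_t bitmap[WORDS];        /* bit p set: queue p is not empty */
    unsigned int count[NUM_PRIO];  /* stand-in for the per-priority queue */
};

void prio_array_init(struct prio_array *a)
{
    memset(a, 0, sizeof(*a));
}

void enqueue(struct prio_array *a, int prio)
{
    a->count[prio]++;
    a->bitmap[prio / 64] |= (uint64_t)1 << (prio % 64);
}

void dequeue(struct prio_array *a, int prio)
{
    /* Clear the bit only when the last task leaves the queue. */
    if (a->count[prio] > 0 && --a->count[prio] == 0)
        a->bitmap[prio / 64] &= ~((uint64_t)1 << (prio % 64));
}

/* Index of the best non-empty queue, or -1 if all are empty.
 * The loop runs at most WORDS times, however many tasks are queued. */
int best_prio(const struct prio_array *a)
{
    for (int w = 0; w < WORDS; w++)
        if (a->bitmap[w] != 0)
            return w * 64 + __builtin_ctzll(a->bitmap[w]);
    return -1;
}
```

Swapping the active and expired arrays is then just an exchange of two pointers, which is also constant time.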

Kernel Preemption Options

Preemption means a high priority task taking the place of a low priority one.

User space code is always preemptible.

In kernel mode (when running a system call):

PREEMPT_NONE: non preemptible kernel code.

PREEMPT_VOLUNTARY: voluntary preemption points.

PREEMPT: kernel code is preemptible outside critical sections.

Soft Real Time

Deadline exists, but glitches do not propagate.

Typically 1 millisecond cycle time.

Vanilla Linux is well suited: preemptive kernel, O(1) scheduler, high resolution timers, priority inheritance futex support.

Good hardware design helps: interrupt queues, DMA ring buffers, FIFOs.

Hard and Ruby Hard Real Time

Deadline exists, glitches are fatal.

Typically sub millisecond cycle times.

Can be done with Linux, two approaches available:

Nano-kernel: run Linux under a nano hard real time kernel.

PREEMPT_RT (RT patch): change Linux to meet hard real time needs.

Nano Kernel Approach

[Figure: layers of the nano-kernel approach: Linux and RT tasks, RT nano-kernel (RTOS), hardware.]

Nano Kernel Approach

Nano-kernels come in many flavors:

FSMLabs WindRiver RTLinux

RTAI project

Adeos I-Pipe

Jaluna VirtualLogix VLX

Problems:

Real time tasks are written in a non Linux API.

Limited Linux task support.

Linux and RT task integration can be difficult.

PREEMPT_RT

Fully preemptive kernel by Ingo Molnar:

Patched 2.6

Kernel config option PREEMPT_RT

Preemptible critical sections

Preemptible interrupt handlers

Preemptible "interrupt disable" sequences

Kernel locks priority inheritance

Deferred operations

Latency-reduction measures

Vanilla Linux Contexts

[Figure: execution contexts in vanilla Linux. Interrupt context: interrupt handlers and SoftIRQs (hi prio tasklets, net stack, timers, regular tasklets, ...). User context: process threads and kernel threads. Other labels: kernel space, user space, scheduling points.]

PREEMPT_RT Linux Contexts

[Figure: execution contexts under PREEMPT_RT. Labels: SA_NODELAY interrupt handlers, process thread, kernel threads, net stack, tasklets, timers, scheduling points, kernel space, user space.]

Interface Changes

Spinlocks and local_irq_save() no longer disable hardware interrupts.

Spinlocks no longer disable preemption.

Raw_ variants for spinlocks and local_irq_save() preserve the original meaning for SA_NODELAY interrupts.

Semaphores and spinlocks employ priority inheritance.

Deferred operations API.

Linux RT Benchmarking

The setup:

Dell PowerEdge SC420 machine with a Pentium 4 2.8 GHz CPU, 256 MB of RAM.

Kernel version for all tests: Linux 2.6.12.

Running various load generators: LMbench, LTP, flood ping, dd...

500,000 to 650,000 interrupts via parallel port.

Measure response time on another machine.

Interrupt Response Times

Technology                     Avg     Max     Min
Vanilla Linux 2.6.12 Kernel    6.8   555.6     5.6
PREEMPT_RT V0.7.51-02          7.3    70.5     5.6
Adeos/Ipipe-0.7                7.6    50.5     5.7

Numbers are the time from interrupt generation until the logger receives the answer over the parallel port (round trip), in microseconds.

Any Questions?

Gilad Ben-Yossef, CTO, Codefidence Ltd. [email protected]

Copyright Notice

2007 Codefidence Ltd. 2007 Michael Opdenacker, Free Electrons. Released under a CC-by-sa 2.0 License.
