.. SPDX-License-Identifier: GPL-2.0

=====================
Theory of operation
=====================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Preface
=======

PREEMPT_RT transforms the Linux kernel into a real-time kernel. It achieves
this by replacing locking primitives, such as spinlock_t, with a preemptible
and priority-inheritance aware implementation known as rtmutex, and by enforcing
the use of threaded interrupts. As a result, the kernel becomes fully
preemptible, with the exception of a few critical code paths, including entry
code, the scheduler, and low-level interrupt handling routines.

This transformation places the majority of kernel execution contexts under the
control of the scheduler and significantly increases the number of preemption
points. Consequently, it reduces the latency between a high-priority task
becoming runnable and its actual execution on the CPU.

Scheduling
==========

The core principles of Linux scheduling and the associated user-space API are
documented in the man page
`sched(7) <https://man7.org/linux/man-pages/man7/sched.7.html>`_.
By default, the Linux kernel uses the SCHED_OTHER scheduling policy. Under
this policy, a task is preempted when the scheduler determines that it has
consumed a fair share of CPU time relative to other runnable tasks. However,
the policy does not guarantee immediate preemption when a new SCHED_OTHER task
becomes runnable. The currently running task may continue executing.

This behavior differs from that of real-time scheduling policies such as
SCHED_FIFO. When a task with a real-time policy becomes runnable, the
scheduler immediately selects it for execution if it has a higher priority than
the currently running task. The task then continues to run until it voluntarily
yields the CPU, typically by blocking on an event, or until a task with an even
higher priority becomes runnable.

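The difference between the two policies can be observed from user space through
the interface described in sched(7). The following is a minimal sketch, not
taken from the kernel sources, that switches the calling thread to SCHED_FIFO
with ``sched_setscheduler()``. The helper name ``become_fifo()`` is an
assumption made for illustration. The call normally requires CAP_SYS_NICE or a
suitable RLIMIT_RTPRIO, so the sketch reports the error instead of assuming
success.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel or libc function): switch the calling
 * thread to SCHED_FIFO at the given priority.
 * Returns 0 on success or a positive errno value on failure.
 */
int become_fifo(int prio)
{
	struct sched_param param;
	int min = sched_get_priority_min(SCHED_FIFO);
	int max = sched_get_priority_max(SCHED_FIFO);

	/* Reject priorities outside the range reported by the kernel. */
	if (min < 0 || max < 0 || prio < min || prio > max)
		return EINVAL;

	memset(&param, 0, sizeof(param));
	param.sched_priority = prio;

	/* A PID of 0 selects the calling thread. */
	if (sched_setscheduler(0, SCHED_FIFO, &param) == -1)
		return errno;

	/* Post-condition: the policy change is visible immediately. */
	assert(sched_getscheduler(0) == SCHED_FIFO);
	return 0;
}
```

A caller would typically invoke ``become_fifo()`` once at start-up and fall
back to SCHED_OTHER behavior if it returns EPERM.
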
Sleeping spin locks
===================

The various lock types and their behavior under real-time configurations are
described in detail in Documentation/locking/locktypes.rst.
In a non-PREEMPT_RT configuration, a spinlock_t is acquired by first disabling
preemption and then actively spinning until the lock becomes available. Once
the lock is released, preemption is re-enabled. From a real-time perspective,
this approach is undesirable because disabling preemption prevents the
scheduler from switching to a higher-priority task, potentially increasing
latency.

To address this, PREEMPT_RT replaces spinning locks with sleeping spin locks
that do not disable preemption. On PREEMPT_RT, spinlock_t is implemented using
rtmutex. Instead of spinning, a task attempting to acquire a contended lock
disables CPU migration, donates its priority to the lock owner (priority
inheritance), and voluntarily schedules out while waiting for the lock to
become available.

Disabling CPU migration preserves the guarantee that disabling preemption would
otherwise provide: the task continues to run on the same CPU while it holds a
sleeping lock. Unlike disabling preemption, it still allows the task to be
preempted.

Priority inheritance
====================

Lock types such as spinlock_t and struct mutex in a PREEMPT_RT-enabled kernel
are implemented on top of rtmutex, which provides support for priority
inheritance (PI). When a task blocks on such a lock, the PI mechanism
temporarily propagates the blocked task’s scheduling parameters to the lock
owner.

For example, if a SCHED_FIFO task A blocks on a lock currently held by a
SCHED_OTHER task B, task A’s scheduling policy and priority are temporarily
inherited by task B. After this inheritance, task A is put to sleep while
waiting for the lock, and task B runs with task A’s scheduling parameters, so
it can no longer be preempted by tasks whose priority lies between its own and
task A’s. This allows B to continue executing, make progress, and eventually
release the lock.

Once B releases the lock, it reverts to its original scheduling parameters, and
task A can resume execution.

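Priority inheritance is also available to user space: a POSIX mutex that uses
the PTHREAD_PRIO_INHERIT protocol is handled by the kernel through a PI futex
when it is contended, and PI futexes are built on rtmutex. The following is a
minimal user-space sketch of requesting that protocol, not kernel code. The
helper name ``pi_mutex_init()`` is an assumption made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical helper: initialise *lock as a mutex that uses the
 * priority-inheritance protocol.
 * Returns 0 on success or a positive error number on failure.
 */
int pi_mutex_init(pthread_mutex_t *lock)
{
	pthread_mutexattr_t attr;
	int protocol = PTHREAD_PRIO_NONE;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	/* Ask for priority inheritance instead of the default protocol. */
	ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (!ret) {
		/* Read the attribute back to confirm it was accepted. */
		ret = pthread_mutexattr_getprotocol(&attr, &protocol);
		assert(ret || protocol == PTHREAD_PRIO_INHERIT);
	}
	if (!ret)
		ret = pthread_mutex_init(lock, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}
```

With such a mutex, a SCHED_FIFO thread that blocks in ``pthread_mutex_lock()``
lends its priority to the current owner in the same way as task A and task B
in the example above.
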
Threaded interrupts
===================

Interrupt handlers are another source of code that executes with preemption
disabled and outside the control of the scheduler. To bring interrupt handling
under scheduler control, PREEMPT_RT enforces threaded interrupt handlers.

With forced threading, interrupt handling is split into two stages. The first
stage, the primary handler, is executed in IRQ context with interrupts disabled.
Its sole responsibility is to wake the associated threaded handler. The second
stage, the threaded handler, is the function passed to request_irq() as the
interrupt handler. It runs in process context, scheduled by the kernel.

From waking the interrupt thread until threaded handling is completed, the
interrupt source is masked in the interrupt controller. This ensures that the
device interrupt remains pending but does not retrigger the CPU, allowing the
system to exit IRQ context and handle the interrupt in a scheduled thread.

By default, the threaded handler executes with the SCHED_FIFO scheduling policy
and a priority of 50 (MAX_RT_PRIO / 2), which is midway between the minimum and
maximum real-time priorities.

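These defaults can be checked from user space, because interrupt threads are
ordinary kernel threads (named ``irq/<nr>-<name>``) whose scheduling parameters
are readable through the sched(7) interface. The following is a minimal sketch
under that assumption. The type ``struct task_sched`` and both helper names are
hypothetical, and the PID of an interrupt thread has to be looked up separately,
for example with ps(1).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/* Hypothetical result type for this sketch. */
struct task_sched {
	int policy;	/* SCHED_OTHER, SCHED_FIFO, ... */
	int rt_prio;	/* 0 for non-real-time policies */
};

/*
 * Hypothetical helper: query the policy and real-time priority of the
 * thread with the given PID (0 means the calling thread).
 * Returns 0 on success, -1 on failure with errno set.
 */
int query_task_sched(pid_t pid, struct task_sched *out)
{
	struct sched_param param;
	int policy;

	assert(out);

	policy = sched_getscheduler(pid);
	if (policy == -1)
		return -1;
	if (sched_getparam(pid, &param) == -1)
		return -1;

	out->policy = policy;
	out->rt_prio = param.sched_priority;
	return 0;
}

/* Non-zero if the parameters match the default described above. */
int is_default_irq_thread_sched(const struct task_sched *ts)
{
	return ts->policy == SCHED_FIFO && ts->rt_prio == 50;
}
```

An administrator may change the priority of individual interrupt threads, so a
mismatch reported by ``is_default_irq_thread_sched()`` is not necessarily an
error.
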
If the threaded interrupt handler raises any soft interrupts during its
execution, those soft interrupt routines are invoked after the threaded handler
completes, within the same thread. Preemption remains enabled during the
execution of the soft interrupt handler.

Summary
=======

By using sleeping locks and forced-threaded interrupts, PREEMPT_RT
significantly reduces the sections of code in which interrupts or preemption
are disabled, allowing the scheduler to preempt the current execution context
and switch to a higher-priority task.