.. SPDX-License-Identifier: GPL-2.0

=============================================
Porting an architecture to support PREEMPT_RT
=============================================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

This list outlines the architecture-specific requirements that must be
implemented in order to enable PREEMPT_RT. Once all required features are
implemented, ARCH_SUPPORTS_RT can be selected in the architecture's Kconfig to
make PREEMPT_RT selectable.
Many prerequisites (genirq support, for example) are enforced by the common code
and are omitted here.

The optional features are not strictly required, but they are worth
considering.

Requirements
------------

Forced threaded interrupts
  CONFIG_IRQ_FORCED_THREADING must be selected. Any interrupts that must
  remain in hard-IRQ context must be marked with IRQF_NO_THREAD. This
  requirement applies for instance to clocksource event interrupts,
  perf interrupts and cascading interrupt-controller handlers.

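  As a minimal sketch of the marking described above, a handler that has to
  stay in hard-IRQ context can be requested with IRQF_NO_THREAD. The
  ``myarch_pmu_*`` names are hypothetical and only serve as an illustration.
  The fragment is meant to be read in kernel context and is not a complete
  driver:

  .. code-block:: c

     #include <linux/interrupt.h>

     /* Hypothetical handler for an architecture's PMU overflow interrupt. */
     static irqreturn_t myarch_pmu_handle_irq(int irq, void *dev_id)
     {
             /* Read and clear the overflow status, then feed perf. */
             return IRQ_HANDLED;
     }

     static int myarch_pmu_request_irq(unsigned int irq, void *dev_id)
     {
             /*
              * IRQF_NO_THREAD keeps this handler in hard-IRQ context even
              * when interrupt handlers are force-threaded.
              */
             return request_irq(irq, myarch_pmu_handle_irq, IRQF_NO_THREAD,
                                "myarch-pmu", dev_id);
     }
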
PREEMPTION support
  Kernel preemption must be supported and requires that
  CONFIG_ARCH_NO_PREEMPT remain unselected. Scheduling requests, such as those
  issued from an interrupt or other exception handler, must be processed
  immediately.

POSIX CPU timers and KVM
  POSIX CPU timers must expire from thread context rather than directly within
  the timer interrupt. This behavior is enabled by setting the configuration
  option CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK.
  When KVM is enabled, CONFIG_KVM_XFER_TO_GUEST_WORK must also be set to ensure
  that any pending work, such as POSIX timer expiration, is handled before
  transitioning into guest mode.

Hard-IRQ and Soft-IRQ stacks
  Soft interrupts are handled in the thread context in which they are raised. If
  a soft interrupt is triggered from hard-IRQ context, its execution is deferred
  to the ksoftirqd thread. Preemption is never disabled during soft interrupt
  handling, which makes soft interrupts preemptible.
  If an architecture provides a custom __do_softirq() implementation that uses a
  separate stack, it must select CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK. The
  functionality should only be enabled when CONFIG_SOFTIRQ_ON_OWN_STACK is set.

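  The guard described above could be sketched as follows.
  do_softirq_own_stack() is the hook called by the generic softirq code, while
  myarch_call_on_softirq_stack() is a hypothetical helper standing in for the
  architecture's stack-switching primitive. The fragment assumes kernel context
  and is not a complete implementation:

  .. code-block:: c

     #include <linux/interrupt.h>
     #include <asm/softirq_stack.h>

     /* Hypothetical helper: run fn() on the per-CPU softirq stack. */
     void myarch_call_on_softirq_stack(void (*fn)(void));

     #ifdef CONFIG_SOFTIRQ_ON_OWN_STACK
     static void myarch_run_softirq(void)
     {
             __do_softirq();
     }

     /* Only built when softirqs are allowed to use their own stack. */
     void do_softirq_own_stack(void)
     {
             myarch_call_on_softirq_stack(myarch_run_softirq);
     }
     #endif

  With the code guarded this way, a configuration that leaves
  CONFIG_SOFTIRQ_ON_OWN_STACK unset falls back to the generic path, which
  processes softirqs on the stack of the current task.
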
FPU and SIMD access in kernel mode
  FPU and SIMD registers are typically not used in kernel mode and are therefore
  not saved during kernel preemption. As a result, any kernel code that uses
  these registers must be enclosed within a kernel_fpu_begin() and
  kernel_fpu_end() section.
  The kernel_fpu_begin() function usually invokes local_bh_disable() to prevent
  interruptions from softirqs and to disable regular preemption. This allows the
  protected code to run safely in both thread and softirq contexts.
  On PREEMPT_RT kernels, however, kernel_fpu_begin() must not call
  local_bh_disable(). Instead, it should use preempt_disable(), since softirqs
  are always handled in thread context under PREEMPT_RT. In this case, disabling
  preemption alone is sufficient.
  The crypto subsystem operates on memory pages and requires users to "walk and
  map" these pages while processing a request. This operation must occur outside
  the kernel_fpu_begin()/kernel_fpu_end() section because it requires preemption
  to be enabled. These preemption points are generally sufficient to avoid
  excessive scheduling latency.

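  The distinction between PREEMPT_RT and non-RT kernels described above might
  be expressed like this. It is a sketch under assumptions: the
  ``myarch_fpu_*`` helpers are hypothetical placeholders for the
  architecture's state handling, and a real implementation has more to take
  care of (nesting checks, lazy restore and so on):

  .. code-block:: c

     #include <linux/bottom_half.h>
     #include <linux/kconfig.h>
     #include <linux/preempt.h>

     /* Hypothetical helpers that save/restore the user FPU/SIMD state. */
     void myarch_fpu_save_user_state(void);
     void myarch_fpu_restore_user_state(void);

     void kernel_fpu_begin(void)
     {
             /*
              * On PREEMPT_RT softirqs run in thread context, so disabling
              * preemption is enough. Otherwise block softirqs as well.
              */
             if (IS_ENABLED(CONFIG_PREEMPT_RT))
                     preempt_disable();
             else
                     local_bh_disable();

             myarch_fpu_save_user_state();
     }

     void kernel_fpu_end(void)
     {
             myarch_fpu_restore_user_state();

             if (IS_ENABLED(CONFIG_PREEMPT_RT))
                     preempt_enable();
             else
                     local_bh_enable();
     }
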
Exception handlers
  Exception handlers, such as the page fault handler, typically enable interrupts
  early, before invoking any generic code to process the exception. This is
  necessary because handling a page fault may involve operations that can sleep.
  Enabling interrupts is especially important on PREEMPT_RT, where certain
  locks, such as spinlock_t, become sleepable. For example, handling an
  invalid opcode may result in sending a SIGILL signal to the user task. A
  debug exception will send a SIGTRAP signal.
  In both cases, if the exception occurred in user space, it is safe to enable
  interrupts early. Sending a signal requires both interrupts and kernel
  preemption to be enabled.

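Taken together, the Kconfig side of the requirements above might look roughly
as follows. This is a sketch only: MYARCH is a hypothetical architecture
symbol, the file paths are illustrative, and the KVM entry assumes that the
architecture carries its own KVM Kconfig file.

.. code-block:: kconfig

   # arch/myarch/Kconfig (hypothetical)
   config MYARCH
           def_bool y
           # ARCH_NO_PREEMPT must not be selected here.
           select IRQ_FORCED_THREADING
           select HAVE_POSIX_CPU_TIMERS_TASK_WORK
           # Only if a separate softirq stack is implemented:
           select HAVE_SOFTIRQ_ON_OWN_STACK
           # Last step, once everything above is in place:
           select ARCH_SUPPORTS_RT

   # arch/myarch/kvm/Kconfig (hypothetical)
   config KVM
           tristate "Kernel-based Virtual Machine (KVM) support"
           select KVM_XFER_TO_GUEST_WORK
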
Optional features
-----------------

Timer and clocksource
  A high-resolution clocksource and clockevents device are recommended. The
  clockevents device should support the CLOCK_EVT_FEAT_ONESHOT feature for
  optimal timer behavior. In most cases, microsecond-level accuracy is
  sufficient.

Lazy preemption
  This mechanism allows an in-kernel scheduling request for non-real-time tasks
  to be delayed until the task is about to return to user space. It helps avoid
  preempting a task that holds a sleeping lock at the time of the scheduling
  request.
  With CONFIG_GENERIC_IRQ_ENTRY enabled, supporting this feature requires
  defining a bit for TIF_NEED_RESCHED_LAZY, preferably near TIF_NEED_RESCHED.

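  For illustration, the bit definition could look like this in the
  architecture's thread_info.h. The bit numbers are assumptions chosen for the
  example. An architecture has to pick bits that are free in its own TIF
  layout:

  .. code-block:: c

     /* arch/myarch/include/asm/thread_info.h (hypothetical excerpt) */
     #define TIF_NEED_RESCHED        3  /* rescheduling necessary */
     #define TIF_NEED_RESCHED_LAZY   4  /* rescheduling can wait for user return */

     #define _TIF_NEED_RESCHED       (1UL << TIF_NEED_RESCHED)
     #define _TIF_NEED_RESCHED_LAZY  (1UL << TIF_NEED_RESCHED_LAZY)

  Keeping both bits adjacent makes it more likely that a single load and mask
  can test them together on the exit path.
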
Serial console with NBCON
  With PREEMPT_RT enabled, all console output is handled by a dedicated thread
  rather than directly from the context in which printk() is invoked. This design
  allows printk() to be safely used in atomic contexts.
  However, this also means that if the kernel crashes and cannot switch to the
  printing thread, no output will be visible, preventing the system from printing
  its final messages.
  There are exceptions for immediate output, such as during panic() handling. To
  support this, the console driver must implement new-style lock handling. This
  involves setting the CON_NBCON flag in console::flags and providing
  implementations for the write_atomic, write_thread, device_lock, and
  device_unlock callbacks.
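
  A skeleton of such a console could look as follows. It is a sketch under
  assumptions, not a working driver: every ``myarch_*`` name is hypothetical,
  the device is protected by a single spinlock, and the callback signatures
  follow include/linux/console.h in recent kernels and should be checked
  against the target tree:

  .. code-block:: c

     #include <linux/console.h>
     #include <linux/spinlock.h>

     static DEFINE_SPINLOCK(myarch_uart_lock);

     /* Hypothetical helper: busy-wait and write one character. */
     void myarch_uart_putc(char c);

     static void myarch_console_emit(struct nbcon_write_context *wctxt)
     {
             unsigned int i;

             /* Hardware access is only allowed inside an unsafe section. */
             if (!nbcon_enter_unsafe(wctxt))
                     return;
             for (i = 0; i < wctxt->len; i++)
                     myarch_uart_putc(wctxt->outbuf[i]);
             nbcon_exit_unsafe(wctxt);
     }

     /* Called for immediate output, for example during panic(). */
     static void myarch_console_write_atomic(struct console *con,
                                             struct nbcon_write_context *wctxt)
     {
             myarch_console_emit(wctxt);
     }

     /* Called from the dedicated printing thread. */
     static void myarch_console_write_thread(struct console *con,
                                             struct nbcon_write_context *wctxt)
     {
             myarch_console_emit(wctxt);
     }

     static void myarch_console_device_lock(struct console *con,
                                            unsigned long *flags)
     {
             spin_lock_irqsave(&myarch_uart_lock, *flags);
     }

     static void myarch_console_device_unlock(struct console *con,
                                              unsigned long flags)
     {
             spin_unlock_irqrestore(&myarch_uart_lock, flags);
     }

     static struct console myarch_console = {
             .name          = "ttyMY",
             .write_atomic  = myarch_console_write_atomic,
             .write_thread  = myarch_console_write_thread,
             .device_lock   = myarch_console_device_lock,
             .device_unlock = myarch_console_device_unlock,
             .flags         = CON_PRINTBUFFER | CON_NBCON,
             .index         = -1,
     };
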