What:		/sys/kernel/debug/cxl/memX/inject_poison
Date:		April, 2023
KernelVersion:	v6.4
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Device Physical Address (DPA) is written to this
		attribute, the memdev driver sends an inject poison command to
		the device for the specified address. The DPA must be 64-byte
		aligned and the length of the injected poison is 64 bytes. If
		successful, the device returns poison when the address is
		accessed through the CXL.mem bus. Injecting poison adds the
		address to the device's Poison List and the error source is set
		to Injected. In addition, the device adds a poison creation
		event to its internal Informational Event log, updates the
		Event Status register, and if configured, interrupts the host.
		It is not an error to inject poison into an address that
		already has poison present and no error is returned. If the
		device returns 'Inject Poison Limit Reached', -EBUSY is
		returned to the user. The inject_poison attribute is only
		visible for devices supporting the capability.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
		poison injection can result in permanent data loss. Injected
		poison may render data permanently inaccessible even after
		clearing, as the clear operation writes zeros and does not
		recover original data.

		SYSTEM STABILITY RISK: For volatile memory, poison injection
		can cause kernel crashes, system instability, or unpredictable
		behavior if the poisoned addresses are accessed by running code
		or critical kernel structures.

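The 64-byte alignment rule for inject_poison can be checked in a test script before anything is written to the attribute. The following Python sketch is illustrative only and is not part of the ABI. The helper names format_dpa and write_dpa are hypothetical, and the sketch assumes the attribute accepts a 0x-prefixed hexadecimal string.

```python
import os

POISON_LEN = 64  # bytes covered by one inject command, per the entry above


def format_dpa(dpa):
    """Return the DPA as a hex string, rejecting negative or unaligned values."""
    if dpa < 0:
        raise ValueError("DPA must be non-negative")
    if dpa % POISON_LEN != 0:
        raise ValueError(f"DPA {dpa:#x} is not {POISON_LEN}-byte aligned")
    return f"{dpa:#x}"


def write_dpa(attr_path, dpa):
    """Write one validated DPA to a write-only attribute such as inject_poison.

    attr_path is supplied by the caller, for example the inject_poison
    file of a test device. Assumption: a 0x-prefixed value is accepted.
    """
    data = (format_dpa(dpa) + "\n").encode("ascii")
    fd = os.open(attr_path, os.O_WRONLY)
    try:
        os.write(fd, data)  # a device-side failure surfaces as OSError
    finally:
        os.close(fd)
```

A caller would pass the path of the attribute under test and catch OSError. An errno of EBUSY corresponds to the 'Inject Poison Limit Reached' case described above.
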
What:		/sys/kernel/debug/cxl/memX/clear_poison
Date:		April, 2023
KernelVersion:	v6.4
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Device Physical Address (DPA) is written to this
		attribute, the memdev driver sends a clear poison command to
		the device for the specified address. Clearing poison removes
		the address from the device's Poison List and writes 0 (zero)
		for 64 bytes starting at that address. It is not an error to
		clear poison from an address that does not have poison set. If
		the device cannot clear poison from the address, -ENXIO is
		returned. The clear_poison attribute is only visible for
		devices supporting the capability.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
		specified address range and removes the address from the poison
		list. It does NOT recover or restore original data that may have
		been present before poison injection. Any original data at the
		cleared address is permanently lost and replaced with zeros.

		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
		purposes only and should not be used as a data repair tool.
		Clearing poison is fundamentally different from data recovery
		or error correction.

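The clear semantics described above (remove the address from the Poison List, then write zeros over 64 bytes) can be illustrated with a toy in-memory model. This is a sketch for reasoning about test expectations only. ToyDevice and its methods are hypothetical and do not reflect how any device or the memdev driver is implemented.

```python
import errno

POISON_LEN = 64  # bytes affected by one inject or clear, per the entries above


class ToyDevice:
    """Illustrative model of a poison list over a flat byte array."""

    def __init__(self, size):
        self.media = bytearray(size)
        self.poison_list = set()

    def _check(self, dpa):
        in_range = 0 <= dpa <= len(self.media) - POISON_LEN
        if dpa % POISON_LEN != 0 or not in_range:
            raise ValueError(f"bad DPA {dpa:#x}")

    def inject(self, dpa):
        self._check(dpa)
        self.poison_list.add(dpa)  # injecting twice is not an error

    def clear(self, dpa):
        self._check(dpa)
        self.poison_list.discard(dpa)  # clearing an unpoisoned DPA is not an error
        self.media[dpa:dpa + POISON_LEN] = bytes(POISON_LEN)  # zeros, not old data

    def read(self, dpa, length=POISON_LEN):
        # Model a poisoned access as an I/O error.
        for base in self.poison_list:
            if base < dpa + length and dpa < base + POISON_LEN:
                raise OSError(errno.EIO, "poisoned access")
        return bytes(self.media[dpa:dpa + length])
```

In this model, data written before inject(64) reads back as 64 zero bytes after clear(64), which matches the point that clearing is not data recovery.
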
What:		/sys/kernel/debug/cxl/regionX/inject_poison
Date:		August, 2025
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Host Physical Address (HPA) is written to this
		attribute, the region driver translates it to a Device
		Physical Address (DPA) and identifies the corresponding
		memdev. It then sends an inject poison command to that memdev
		at the translated DPA. Refer to the memdev ABI entry at:
		/sys/kernel/debug/cxl/memX/inject_poison for the detailed
		behavior. This attribute is only visible if all memdevs
		participating in the region support both inject and clear
		poison commands.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
		poison injection can result in permanent data loss. Injected
		poison may render data permanently inaccessible even after
		clearing, as the clear operation writes zeros and does not
		recover original data.

		SYSTEM STABILITY RISK: For volatile memory, poison injection
		can cause kernel crashes, system instability, or unpredictable
		behavior if the poisoned addresses are accessed by running code
		or critical kernel structures.

What:		/sys/kernel/debug/cxl/regionX/clear_poison
Date:		August, 2025
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) When a Host Physical Address (HPA) is written to this
		attribute, the region driver translates it to a Device
		Physical Address (DPA) and identifies the corresponding
		memdev. It then sends a clear poison command to that memdev
		at the translated DPA. Refer to the memdev ABI entry at:
		/sys/kernel/debug/cxl/memX/clear_poison for the detailed
		behavior. This attribute is only visible if all memdevs
		participating in the region support both inject and clear
		poison commands.

		TEST-ONLY INTERFACE: This interface is intended for testing
		and validation purposes only. It is not a data repair mechanism
		and should never be used on production systems or live data.

		CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
		specified address range and removes the address from the poison
		list. It does NOT recover or restore original data that may have
		been present before poison injection. Any original data at the
		cleared address is permanently lost and replaced with zeros.

		CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
		purposes only and should not be used as a data repair tool.
		Clearing poison is fundamentally different from data recovery
		or error correction.

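Because the region attributes are only visible when every participating memdev supports both commands, a test harness can check for both files before injecting and then clear in a finally block. The Python sketch below is one possible shape for such a harness. The function names and the exercise callback are hypothetical, and it assumes the attributes accept a 0x-prefixed HPA.

```python
import os

POISON_ATTRS = ("inject_poison", "clear_poison")


def poison_attrs_visible(region_dir):
    """True if the region directory exposes both poison attributes."""
    return all(os.path.exists(os.path.join(region_dir, a)) for a in POISON_ATTRS)


def _write_hpa(path, hpa):
    # Assumption: the attribute parses a 0x-prefixed hexadecimal HPA.
    with open(path, "w") as f:
        f.write(f"{hpa:#x}\n")


def poison_round_trip(region_dir, hpa, exercise=None):
    """Inject poison at hpa, run the optional test callback, then clear.

    The clear step runs even if the callback raises, so a failed test
    does not leave the injected poison behind.
    """
    if not poison_attrs_visible(region_dir):
        raise FileNotFoundError("region does not expose inject_poison and clear_poison")
    _write_hpa(os.path.join(region_dir, "inject_poison"), hpa)
    try:
        if exercise is not None:
            exercise(hpa)
    finally:
        _write_hpa(os.path.join(region_dir, "clear_poison"), hpa)
```

The clear step still zeroes the 64 bytes at the translated DPA, so this pattern only suits memory that holds no live data.
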
What:		/sys/kernel/debug/cxl/einj_types
Date:		January, 2024
KernelVersion:	v6.9
Contact:	linux-cxl@vger.kernel.org
Description:
		(RO) Prints the CXL protocol error types made available by
		the platform in the format:

			0x<error number> <error type>

		The possible error types are (as of ACPI v6.5):

			0x1000	CXL.cache Protocol Correctable
			0x2000	CXL.cache Protocol Uncorrectable non-fatal
			0x4000	CXL.cache Protocol Uncorrectable fatal
			0x8000	CXL.mem Protocol Correctable
			0x10000	CXL.mem Protocol Uncorrectable non-fatal
			0x20000	CXL.mem Protocol Uncorrectable fatal

		The <error number> can be written to einj_inject to inject
		<error type> into a chosen dport.

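The "0x<error number> <error type>" output format lends itself to a small parser. The Python sketch below shows one way to turn the text read from einj_types into a mapping. The function name parse_einj_types is hypothetical, and the sketch assumes one entry per line with whitespace between the number and the type.

```python
from typing import Dict


def parse_einj_types(text: str) -> Dict[int, str]:
    """Map each error number to its error type description."""
    types = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)  # number, then the rest of the line
        if not parts:
            continue  # skip blank lines
        number = int(parts[0], 16)  # base 16 accepts the 0x prefix
        types[number] = parts[1].strip() if len(parts) > 1 else ""
    return types


# Sample input taken from the list of possible error types above.
SAMPLE = """\
0x1000 CXL.cache Protocol Correctable
0x2000 CXL.cache Protocol Uncorrectable non-fatal
0x4000 CXL.cache Protocol Uncorrectable fatal
0x8000 CXL.mem Protocol Correctable
0x10000 CXL.mem Protocol Uncorrectable non-fatal
0x20000 CXL.mem Protocol Uncorrectable fatal
"""
```

A platform may expose only a subset of these types, so a script should parse the actual file contents rather than rely on the sample.
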
What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
Date:		January, 2024
KernelVersion:	v6.9
Contact:	linux-cxl@vger.kernel.org
Description:
		(WO) Writing an integer to this file injects the corresponding
		CXL protocol error into $dport_dev ($dport_dev will be a device
		name from /sys/bus/pci/devices). The integer-to-type mapping
		for injection can be found by reading from einj_types. If the
		dport was enumerated in RCH mode, a CXL 1.1 error is injected;
		otherwise, a CXL 2.0 error is injected.
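
A script that injects a protocol error would typically confirm that the chosen error number is one the platform advertises before writing it. The following Python sketch shows that check. The function names, the available argument, and the root parameter are hypothetical, and the sketch assumes einj_inject accepts the value in the same 0x-prefixed form that einj_types prints.

```python
import os

DEBUGFS_CXL = "/sys/kernel/debug/cxl"


def einj_inject_path(dport_dev, root=DEBUGFS_CXL):
    """Build the einj_inject path for a dport device name."""
    if not dport_dev or "/" in dport_dev or dport_dev in (".", ".."):
        raise ValueError(f"invalid device name: {dport_dev!r}")
    return os.path.join(root, dport_dev, "einj_inject")


def inject_protocol_error(dport_dev, error_number, available, root=DEBUGFS_CXL):
    """Write error_number to einj_inject if it is listed in `available`.

    `available` is the set of error numbers read from einj_types.
    """
    if error_number not in available:
        raise ValueError(f"error type {error_number:#x} not offered by the platform")
    with open(einj_inject_path(dport_dev, root), "w") as f:
        f.write(f"{error_number:#x}\n")
```

The root parameter exists so the helper can be pointed at a scratch directory in unit tests instead of the real debugfs mount.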