[PATCH v5 0/7] CXL: Add cxl_reset sysfs attribute for PCI devices

smadhavan@nvidia.com posted 7 patches 1 month ago
From: Srirangan Madhavan <smadhavan@nvidia.com>

Hi folks!

This patch series introduces support for the CXL Reset method for CXL
Type 2 devices, implementing the reset procedure outlined in CXL Spec [1]
v3.2, Sections 8.1.3, 9.6 and 9.7.

v5 changes (from v4):
- Rebased on v7.0-rc1 and applied fixes from the v4 review.
- Added CXL DVSEC and HDM save/restore as a prerequisite series [2]
- Switched from PCI reset method to sysfs
  interface at /sys/bus/pci/devices/.../cxl_reset (Dan, Alex)
- Removed all PCI core changes - reset logic stays in CXL driver
- Use cpu_cache_invalidate_memregion() instead of arch-specific code
- Removed CONFIG_X86/CONFIG_ARM64 ifdefs
- Added ABI documentation for sysfs interface

v4 changes:
- Fix CXL reset capability check parentheses warning
- Gate CXL reset path on CONFIG_CXL_PCI reachability

v3 changes:
- Restrict CXL reset to Type 2 devices only
- Add host and device cache flushing for sibling functions and region peers
- Add region teardown and memory online detection before reset
- Add configuration state save/restore (DVSEC, HDM, IDE)
- Split the series by subsystem and functional blocks

Motivation:
-----------
- As support for Type 2 devices [6] is being introduced, more devices will
  require finer-grained reset mechanisms beyond bus-wide reset methods.

- FLR does not affect CXL.cache or CXL.mem protocols, making CXL Reset
  the preferred method in some cases.

- The CXL spec (Sections 7.2.3 Binding and Unbinding, 9.5 FLR) highlights use
  cases like function rebinding and error recovery, where CXL Reset is
  explicitly mentioned.

ABI Change reasoning (v5):
--------------------------
Previous versions (v1-v4) integrated CXL reset as a new PCI reset method
in pci_reset_methods[]. Based on feedback from Dan Williams and Alex
Williamson, v5 switches to a sysfs-based approach.

The key reasoning is that CXL Reset has a broader scope than the existing
PCI reset methods, so mixing them in the same reset infrastructure causes
problems. v5 therefore selectively exposes a cxl_reset attribute in
pci-sysfs and leaves the existing interface unaffected.

Change Description:
-------------------

Patch 1: PCI: Add CXL DVSEC reset and capability register definitions
- Add reset and cache control bit definitions to pci_regs.h

Patch 2: PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
- Export for sibling function save/restore during CXL reset

Patch 3: cxl: Add memory offlining and cache flush helpers
- Offline CXL memory regions before reset
- Flush CPU caches using cpu_cache_invalidate_memregion()

Patch 4: cxl: Add multi-function sibling coordination for CXL reset
- Identify CXL.cachemem sibling functions via Non-CXL Function Map DVSEC
- Save/disable and restore sibling PCI functions around reset

Patch 5: cxl: Add CXL DVSEC reset sequence and flow orchestration
- Implement cxl_dev_reset() to trigger reset via DVSEC
- Poll for reset completion with timeout
- cxl_do_reset() orchestrates the complete reset sequence with
  proper locking and error handling

Patch 6: cxl: Add cxl_reset sysfs interface for PCI devices
- Expose /sys/bus/pci/devices/.../cxl_reset
- Only visible for devices with Reset Capable bit set
- Write "1" to trigger reset

Patch 7: Documentation: ABI: Add CXL PCI cxl_reset sysfs attribute
- Document the new sysfs interface
- Explain scope, visibility, and error conditions

Dependencies:
-------------

This series depends on:
  [PATCH 0/5] PCI/CXL: Save and restore CXL DVSEC and HDM state across resets
  https://lore.kernel.org/linux-cxl/20260306080026.116789-1-smadhavan@nvidia.com/T/#t
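
Because of this dependency, the prerequisite series has to be applied before
this one. A minimal sketch of doing that with b4 follows. The helper name
apply_cxl_reset_series and the B4 override variable are illustrative
assumptions, not part of this series:

```shell
# Hypothetical helper, not part of this series.
# $1: message-id of the prerequisite save/restore series
# $2: message-id of this cxl_reset series
# B4 may be overridden (e.g. B4="echo b4") for a dry run.
apply_cxl_reset_series() {
    # Stop if the prerequisite does not apply, so the second
    # series is never applied on top of an incomplete tree.
    ${B4:-b4} shazam "$1" && ${B4:-b4} shazam "$2"
}
```

Run from a tree checked out at the base, the first argument would be
20260306080026.116789-1-smadhavan@nvidia.com (taken from the lore link
above). The second is the message-id of this cover letter, which is not
shown here and has to be supplied by the caller.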

cpu_cache_invalidate_memregion(), which is used for the CPU cache flush, is
currently supported on x86. ARM64 support will be addressed in a separate RFC.

Command line to test the CXL reset on a capable device:
    echo 1 > /sys/bus/pci/devices/<pci_device>/cxl_reset
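
Since the attribute is only visible for devices with the Reset Capable bit
set, a test script may want to check for it before writing. A minimal
sketch, assuming a POSIX shell; the helper name cxl_reset_dev is
illustrative and not part of this series:

```shell
# Hypothetical helper, not part of this series.
# $1: sysfs directory of the PCI device,
#     e.g. /sys/bus/pci/devices/<pci_device>
cxl_reset_dev() {
    attr="$1/cxl_reset"
    # The attribute is absent when the device is not Reset Capable.
    if [ ! -e "$attr" ]; then
        echo "cxl_reset not available under $1" >&2
        return 1
    fi
    # Writing "1" triggers the reset; a failed write is returned
    # to the caller as a non-zero exit status.
    echo 1 > "$attr"
}
```

A caller can then distinguish "attribute not present" from a failed reset
by the message on stderr, while both cases return a non-zero status.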

Basic cxl_reset testing was done on a CXL Type-2 device by writing to the
sysfs attribute, which exercised the DVSEC reset sequence, including cache
write-back and invalidate (WB+I), reset initiation, and state restore.
Further testing is in progress.

This series is based on v7.0-rc1.

Srirangan Madhavan (7):
  PCI: Add CXL DVSEC reset and capability register definitions
  PCI: Export pci_dev_save_and_disable() and pci_dev_restore()
  cxl: Add memory offlining and cache flush helpers
  cxl: Add multi-function sibling coordination for CXL reset
  cxl: Add CXL DVSEC reset sequence and flow orchestration
  cxl: Add cxl_reset sysfs interface for PCI devices
  Documentation: ABI: Add CXL PCI cxl_reset sysfs attribute

 Documentation/ABI/testing/sysfs-bus-pci |  22 +
 drivers/cxl/core/core.h                 |   2 +
 drivers/cxl/core/pci.c                  | 537 ++++++++++++++++++++++++
 drivers/cxl/core/port.c                 |   3 +
 drivers/pci/pci.c                       |  21 +-
 include/linux/pci.h                     |   3 +
 include/uapi/linux/pci_regs.h           |  14 +
 7 files changed, 600 insertions(+), 2 deletions(-)

base-commit: 6de23f81a5e0
--
2.43.0
Re: [PATCH v5 0/7] CXL: Add cxl_reset sysfs attribute for PCI devices
Posted by Dave Jiang 1 month ago

On 3/6/26 2:23 AM, smadhavan@nvidia.com wrote:
> [...]
> base-commit: 6de23f81a5e0

The base commit is v7.0-rc1, but b4 shazam fails when attempting to apply the series:

Applying: PCI: Add CXL DVSEC reset and capability register definitions
Patch failed at 0001 PCI: Add CXL DVSEC reset and capability register definitions
error: patch failed: include/uapi/linux/pci_regs.h:1349
error: include/uapi/linux/pci_regs.h: patch does not apply


Re: [PATCH v5 0/7] CXL: Add cxl_reset sysfs attribute for PCI devices
Posted by Dave Jiang 1 month ago

On 3/9/26 3:37 PM, Dave Jiang wrote:
> 
> 
> On 3/6/26 2:23 AM, smadhavan@nvidia.com wrote:
>> [...]
>> base-commit: 6de23f81a5e0
> 
> The base commit is v7.0-rc1, but b4 shazam fails when attempting to apply the series:
> 
> Applying: PCI: Add CXL DVSEC reset and capability register definitions
> Patch failed at 0001 PCI: Add CXL DVSEC reset and capability register definitions
> error: patch failed: include/uapi/linux/pci_regs.h:1349
> error: include/uapi/linux/pci_regs.h: patch does not apply
> 

Never mind. I need to apply the save/restore series first. The base-commit should not be the v7.0-rc1 commit.