[PATCH 0/2] Misc fixes on registering PCI NVMe CMB

Icenowy Zheng posted 2 patches 10 months, 1 week ago
drivers/nvme/host/pci.c | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
Posted by Icenowy Zheng 10 months, 1 week ago
Here is a small patchset that I developed while investigating a WARNING
in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502 SSD), which is
caused by the SSD's too-small CMB block (only 512KB).

The first patch fixes the error handling codepath of the PCI DMA
registration. It is only an observation-based patch, because my disk is
only NVMe 1.2 compliant and the register cleaned up here (CMBMSC) was
only added in NVMe 1.4.

The second patch is the one that actually fixes the warning, by testing
the CMB block against the memory hotplug alignment requirement (which
the CMB block of my SSD certainly cannot satisfy: the requirement is
usually 2M with SPARSEMEM_VMEMMAP enabled and even larger in other cases).

Refer to commit 6acd7d5ef264 ("libnvdimm/namespace: Enforce
memremap_compat_align()") for a similar approach in the NVDIMM subsystem.
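
To illustrate the idea behind the second patch, here is a minimal sketch
in Python of an alignment test of this kind. It is not the patch itself:
the function names, the fixed 2M alignment and the BAR address used in the
example are assumptions made for illustration (the kernel obtains the real
requirement at runtime, and it can be larger than 2M).

```python
# Sketch only: illustrates the alignment test, not the actual kernel code.
SZ_2M = 2 * 1024 * 1024  # assumed alignment, typical with SPARSEMEM_VMEMMAP


def is_aligned(value: int, align: int) -> bool:
    """Return True if value is a multiple of align (a power of two)."""
    if align <= 0 or align & (align - 1):
        raise ValueError("align must be a positive power of two")
    return (value & (align - 1)) == 0


def cmb_fits_alignment(bar_addr: int, offset: int, size: int,
                       align: int = SZ_2M) -> bool:
    """Check that both the start and the size of the CMB block are aligned.

    bar_addr, offset and size are hypothetical parameter names.
    """
    start = bar_addr + offset
    return is_aligned(start, align) and is_aligned(size, align)


# A 512KB CMB cannot satisfy a 2M requirement, whatever its address is.
# 0x80000000 is a made-up, 2M-aligned BAR address.
print(cmb_fits_alignment(0x80000000, 0, 512 * 1024))       # False
print(cmb_fits_alignment(0x80000000, 0, 4 * 1024 * 1024))  # True
```

With a check like this, a too-small or misaligned CMB can be skipped
instead of being handed to the memory hotplug code.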

Icenowy Zheng (2):
  nvme-pci: clean up CMBMSC when registering CMB fails
  nvme-pci: skip CMB blocks incompatible with PCI P2P DMA

 drivers/nvme/host/pci.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

-- 
2.48.1
Re: [PATCH 0/2] Misc fixes on registering PCI NVMe CMB
Posted by Keith Busch 9 months, 3 weeks ago
On Thu, Feb 13, 2025 at 01:04:42AM +0800, Icenowy Zheng wrote:
> Here is a small patchset that is developed during my investigation of
> a WARNING in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502
> SSD), which is because of the SSD's too-small CMB block (512KB only).
> 
> The first patch is a fix of the PCI DMA registration error handling
> codepath, which is just an observation-based patch (because my disk is
> only NVMe 1.2 compliant, and the register cleaned up here is only added
> in NVMe 1.4).
> 
> The second patch really fixes the warning by testing the CMB block
> against the memory hotplugging alignment requirement (which the CMB
> block of my SSD surely cannot satisfy -- the alignment requirement is
> usually 2M with SPARSEMEM_VMEMMAP enabled and even larger in other cases).

Applied to nvme-6.14 with the suggested changes in patch 2 folded in.
Re: [PATCH 0/2] Misc fixes on registering PCI NVMe CMB
Posted by Christoph Hellwig 10 months, 1 week ago
On Thu, Feb 13, 2025 at 01:04:42AM +0800, Icenowy Zheng wrote:
> Here is a small patchset that is developed during my investigation of
> a WARNING in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502
> SSD), which is because of the SSD's too-small CMB block (512KB only).

Hah, that's certainly an odd CMB configuration.
Re: [PATCH 0/2] Misc fixes on registering PCI NVMe CMB
Posted by Keith Busch 9 months, 3 weeks ago
On Thu, Feb 13, 2025 at 06:54:49AM +0100, Christoph Hellwig wrote:
> On Thu, Feb 13, 2025 at 01:04:42AM +0800, Icenowy Zheng wrote:
> > Here is a small patchset that is developed during my investigation of
> > a WARNING in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502
> > SSD), which is because of the SSD's too-small CMB block (512KB only).
> 
> Hah, that's certainly an odd CMB configuration.

Should be okay if it's just for submission queues. The driver has an
arbitrary requirement that the queues have at least 64 entries for CMB,
and 512k allows us to create 128 submission queues like that. That's
enough for most systems.
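
The arithmetic behind that figure can be sketched as follows. This is only
an illustration in Python: the constant and function names are made up, and
it assumes the standard 64-byte NVMe submission queue entry and ignores any
extra per-queue alignment the driver may apply.

```python
# Sketch only: names are hypothetical, not taken from the driver.
SQE_SIZE = 64        # bytes per NVMe submission queue entry
MIN_CMB_QDEPTH = 64  # minimum queue depth for CMB mentioned above


def max_cmb_sqs(cmb_size: int, depth: int = MIN_CMB_QDEPTH,
                sqe_size: int = SQE_SIZE) -> int:
    """Number of submission queues of the given depth that fit in the CMB."""
    bytes_per_queue = depth * sqe_size  # 64 * 64 = 4096 bytes
    return cmb_size // bytes_per_queue


print(max_cmb_sqs(512 * 1024))  # 128
```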
Re: [PATCH 0/2] Misc fixes on registering PCI NVMe CMB
Posted by Icenowy Zheng 9 months, 3 weeks ago
On Mon, 2025-02-24 at 17:17 -0700, Keith Busch wrote:
> On Thu, Feb 13, 2025 at 06:54:49AM +0100, Christoph Hellwig wrote:
> > On Thu, Feb 13, 2025 at 01:04:42AM +0800, Icenowy Zheng wrote:
> > > Here is a small patchset that is developed during my investigation of
> > > a WARNING in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502
> > > SSD), which is because of the SSD's too-small CMB block (512KB only).
> > 
> > Hah, that's certainly an odd CMB configuration.
> 
> Should be okay if it's just for submission queues. The driver has an
> arbitrary requirement that the queues have at least 64 entries for CMB,
> and 512k allows us to create 128 submission queues like that. That's
> enough for most systems.

Yes, but this configuration does not seem to fit the current driver
code, which relies on the PCIe P2P setup code. (Is there any driver that
could utilize this configuration now?)
Re: [PATCH 0/2] Misc fixes on registering PCI NVMe CMB
Posted by Icenowy Zheng 10 months, 1 week ago
On Thu, 2025-02-13 at 06:54 +0100, Christoph Hellwig wrote:
> On Thu, Feb 13, 2025 at 01:04:42AM +0800, Icenowy Zheng wrote:
> > Here is a small patchset that is developed during my investigation of
> > a WARNING in my boot kernel log (AMD EPYC 7K62 CPU + Intel DC D4502
> > SSD), which is because of the SSD's too-small CMB block (512KB only).
> 
> Hah, that's certainly an odd CMB configuration.
> 

Sure, maybe it's just intended for a small queue. The value of register
0x38 is 0x00000004, register 0x3c is 0x00080001, and BAR 4 is 64-bit
prefetchable memory with size=512K. I tested writing arbitrary data to
the first few words of BAR 4 and the data is retained correctly (which
means it seems to be real memory rather than registers).
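
For reference, these two values can be decoded with a short sketch like the
one below. It is written in Python for illustration only. The function
names are made up, and the field layout is my reading of the CMBLOC (0x38)
and CMBSZ (0x3c) definitions in the NVMe specification, so please
double-check it against the spec.

```python
# Sketch only: decodes CMBLOC/CMBSZ under the field layout assumed here.

def decode_cmbloc(cmbloc: int) -> dict:
    """Decode CMBLOC: BIR in bits 2:0, offset (in size units) in bits 31:12."""
    return {
        "bir": cmbloc & 0x7,
        "offset_units": (cmbloc >> 12) & 0xFFFFF,
    }


def decode_cmbsz(cmbsz: int) -> dict:
    """Decode CMBSZ: capability bits 4:0, SZU in bits 11:8, SZ in bits 31:12."""
    szu = (cmbsz >> 8) & 0xF
    unit = 1 << (12 + 4 * szu)  # SZU 0 = 4KB, 1 = 64KB, 2 = 1MB, ...
    return {
        "sqs": bool(cmbsz & (1 << 0)),    # submission queues in CMB
        "cqs": bool(cmbsz & (1 << 1)),    # completion queues in CMB
        "lists": bool(cmbsz & (1 << 2)),  # PRP/SGL lists in CMB
        "rds": bool(cmbsz & (1 << 3)),    # read data in CMB
        "wds": bool(cmbsz & (1 << 4)),    # write data in CMB
        "size_bytes": ((cmbsz >> 12) & 0xFFFFF) * unit,
    }


print(decode_cmbloc(0x00000004))  # bir=4, offset_units=0
print(decode_cmbsz(0x00080001))   # only sqs set, size_bytes=524288
```

Under that reading, the CMB sits at offset 0 of BAR 4, is 128 units of 4KB
(512KB) in size, and advertises support for submission queues only.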

I saw some mention of CMB support for submission queues in the brief of
the Intel D3700/3600; maybe this applies to the D450x as a successor?

BTW I am not sure about the relationship between the Intel D series and
P series SSDs (I only know that the D series is for dual-port
redundancy). I have a P4511 as my boot disk (the D4502 is data storage),
which comes with no CMB at all. (The P4511 and D4502 share a PCI ID, but
not a PCI subsystem ID.)