[PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads

dullfire@yahoo.com posted 2 patches 1 year, 2 months ago
[PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by dullfire@yahoo.com 1 year, 2 months ago
From: Jonathan Currier <dullfire@yahoo.com>

Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
introduces a readl() from ENTRY_VECTOR_CTRL before the writel() to
ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune
chips, the niu module, will cause an error and/or fatal trap if any MSIX
table entry is read before the corresponding ENTRY_DATA field is written
to. This patch adds an optional early writel() in msix_prepare_msi_desc().

Cc: stable@vger.kernel.org
Signed-off-by: Jonathan Currier <dullfire@yahoo.com>
---
 drivers/pci/msi/msi.c | 2 ++
 include/linux/pci.h   | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 3a45879d85db..50d87fb5e37f 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -611,6 +611,8 @@ void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc)
 	if (desc->pci.msi_attrib.can_mask) {
 		void __iomem *addr = pci_msix_desc_addr(desc);
 
+		if (dev->dev_flags & PCI_DEV_FLAGS_MSIX_TOUCH_ENTRY_DATA_FIRST)
+			writel(0, addr + PCI_MSIX_ENTRY_DATA);
 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 	}
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 37d97bef060f..b8b95b58d522 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -245,6 +245,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
 	/* Device does honor MSI masking despite saying otherwise */
 	PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
+	/* Device requires write to PCI_MSIX_ENTRY_DATA before any MSIX reads */
+	PCI_DEV_FLAGS_MSIX_TOUCH_ENTRY_DATA_FIRST = (__force pci_dev_flags_t) (1 << 13),
 };
 
 enum pci_irq_reroute_variant {
-- 
2.45.2
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Paolo Abeni 1 year, 2 months ago

On 11/18/24 00:48, dullfire@yahoo.com wrote:
> From: Jonathan Currier <dullfire@yahoo.com>
> 
> Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
> introduces a readl() from ENTRY_VECTOR_CTRL before the writel() to
> ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune
> chips, the niu module, will cause an error and/or fatal trap if any MSIX
> table entry is read before the corresponding ENTRY_DATA field is written
> to. This patch adds an optional early writel() in msix_prepare_msi_desc().

Why the issue can't be addressed into the relevant device driver? It
looks like an H/W bug, a driver specific fix looks IMHO more fitting.

A cross subsystem series, like this one, gives some extra complication
to maintainers.

Thanks,

Paolo
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Dullfire 1 year, 2 months ago
On 11/21/24 02:55, Paolo Abeni wrote:
>
> On 11/18/24 00:48, dullfire@yahoo.com wrote:
>> From: Jonathan Currier <dullfire@yahoo.com>
>>
>> Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
>> introduces a readl() from ENTRY_VECTOR_CTRL before the writel() to
>> ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune
>> chips, the niu module, will cause an error and/or fatal trap if any MSIX
>> table entry is read before the corresponding ENTRY_DATA field is written
>> to. This patch adds an optional early writel() in msix_prepare_msi_desc().
> Why the issue can't be addressed into the relevant device driver? It
> looks like an H/W bug, a driver specific fix looks IMHO more fitting.
I considered this approach, and thus asked about it in the mailing lists here:
https://lore.kernel.org/sparclinux/7de14cca-e2fa-49f7-b83e-5f8322cc9e56@yahoo.com/T/
It sounds like you are suggesting approach 1 (add the MSIX register writes to niu).
I did do a quick and dirty test of that. However it ended up requiring
msix_map_region(), and pci_msix_desc_addr(), both of are internal to the
msi code, and not exported or in pubic headers. The msix_table_size()
macro was also needed. I could either export those functions, or reproduce
their code in the niu driver. However, as Thomas' suggestion seemed very
simple and elegant to me, I decided to got with it.

If it is actually preferable to either copy those msix functions into niu,
they are not very large, or export them (GPL I would assume?) let me know
and I can make that change.
>
> A cross subsystem series, like this one, gives some extra complication
> to maintainers.
>
> Thanks,
>
> Paolo

Sincerely,
Jonathan Currier
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Paolo Abeni 1 year, 2 months ago
On 11/21/24 10:22, Dullfire wrote:
> On 11/21/24 02:55, Paolo Abeni wrote:
>> On 11/18/24 00:48, dullfire@yahoo.com wrote:
>>> From: Jonathan Currier <dullfire@yahoo.com>
>>>
>>> Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
>>> introduces a readl() from ENTRY_VECTOR_CTRL before the writel() to
>>> ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune
>>> chips, the niu module, will cause an error and/or fatal trap if any MSIX
>>> table entry is read before the corresponding ENTRY_DATA field is written
>>> to. This patch adds an optional early writel() in msix_prepare_msi_desc().
>> Why the issue can't be addressed into the relevant device driver? It
>> looks like an H/W bug, a driver specific fix looks IMHO more fitting.
>
> I considered this approach, and thus asked about it in the mailing lists here:
> https://lore.kernel.org/sparclinux/7de14cca-e2fa-49f7-b83e-5f8322cc9e56@yahoo.com/T/

I forgot about such thread, thank you for the reminder. Since the more
hackish code is IRQ specific, if Thomas is fine with that, I'll not oppose.

>> A cross subsystem series, like this one, gives some extra complication
>> to maintainers.

The niu driver is not exactly under very active development, I guess the
whole series could go via the IRQ subsystem, if Thomas agrees.

Cheers,

Paolo
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Dullfire 1 year ago
On 11/21/24 04:28, Paolo Abeni wrote:
> On 11/21/24 10:22, Dullfire wrote:
>> On 11/21/24 02:55, Paolo Abeni wrote:
>>> On 11/18/24 00:48, dullfire@yahoo.com wrote:
>>>> From: Jonathan Currier <dullfire@yahoo.com>
>>>>
>>>> Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
>>>> introduces a readl() from ENTRY_VECTOR_CTRL before the writel() to
>>>> ENTRY_DATA. This is correct, however some hardware, like the Sun Neptune
>>>> chips, the niu module, will cause an error and/or fatal trap if any MSIX
>>>> table entry is read before the corresponding ENTRY_DATA field is written
>>>> to. This patch adds an optional early writel() in msix_prepare_msi_desc().
>>> Why the issue can't be addressed into the relevant device driver? It
>>> looks like an H/W bug, a driver specific fix looks IMHO more fitting.
>>
>> I considered this approach, and thus asked about it in the mailing lists here:
>> https://lore.kernel.org/sparclinux/7de14cca-e2fa-49f7-b83e-5f8322cc9e56@yahoo.com/T/
> 
> I forgot about such thread, thank you for the reminder. Since the more
> hackish code is IRQ specific, if Thomas is fine with that, I'll not oppose.
> 
>>> A cross subsystem series, like this one, gives some extra complication
>>> to maintainers.
> 
> The niu driver is not exactly under very active development, I guess the
> whole series could go via the IRQ subsystem, if Thomas agrees.
> 
> Cheers,
> 
> Paolo
> 

Thomas, does this work for you, or is there something else you would to see in this series?


Sincerely,
Jonathan Currier
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Dullfire 10 months ago

On 1/20/25 6:38 AM, Dullfire wrote:
> 
> On 11/21/24 04:28, Paolo Abeni wrote:
[...]
>> The niu driver is not exactly under very active development, I guess the
>> whole series could go via the IRQ subsystem, if Thomas agrees.
>>
>> Cheers,
>>
>> Paolo
>>
> 
> Thomas, does this work for you, or is there something else you would to see in this series?


Since it's been a few months, just wanted to give this a bump, and see if the series is lacking anything.
Thanks,

Sincerely,
Jonathan Currier
Re: [PATCH 1/2] PCI/MSI: Add MSIX option to write to ENTRY_DATA before any reads
Posted by Thomas Gleixner 9 months, 4 weeks ago
On Mon, Apr 14 2025 at 11:22, dullfire@yahoo.com wrote:
> On 1/20/25 6:38 AM, Dullfire wrote:
>> On 11/21/24 04:28, Paolo Abeni wrote:
> [...]
>>> The niu driver is not exactly under very active development, I guess the
>>> whole series could go via the IRQ subsystem, if Thomas agrees.
>>>
>>> Cheers,
>>>
>>> Paolo
>>>
>> 
>> Thomas, does this work for you, or is there something else you would to see in this series?
>
> Since it's been a few months, just wanted to give this a bump, and see if the series is lacking anything.
> Thanks,

Sorry for letting this fall through the cracks. I put it back on my todo
list...

Thanks,

        tglx
[tip: irq/urgent] PCI/MSI: Add an option to write MSIX ENTRY_DATA before any reads
Posted by tip-bot2 for Jonathan Currier 9 months, 4 weeks ago
The following commit has been merged into the irq/urgent branch of tip:

Commit-ID:     cf761e3dacc6ad5f65a4886d00da1f9681e6805a
Gitweb:        https://git.kernel.org/tip/cf761e3dacc6ad5f65a4886d00da1f9681e6805a
Author:        Jonathan Currier <dullfire@yahoo.com>
AuthorDate:    Sun, 17 Nov 2024 17:48:42 -06:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 15 Apr 2025 08:32:18 +02:00

PCI/MSI: Add an option to write MSIX ENTRY_DATA before any reads

Commit 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") introduced a
readl() from ENTRY_VECTOR_CTRL before the writel() to ENTRY_DATA.

This is correct, however some hardware, like the Sun Neptune chips, the NIU
module, will cause an error and/or fatal trap if any MSIX table entry is
read before the corresponding ENTRY_DATA field is written to.

Add an optional early writel() in msix_prepare_msi_desc().

Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Jonathan Currier <dullfire@yahoo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241117234843.19236-2-dullfire@yahoo.com
---
 drivers/pci/msi/msi.c | 3 +++
 include/linux/pci.h   | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 6569ba3..8b88487 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -615,6 +615,9 @@ void msix_prepare_msi_desc(struct pci_dev *dev, struct msi_desc *desc)
 		void __iomem *addr = pci_msix_desc_addr(desc);
 
 		desc->pci.msi_attrib.can_mask = 1;
+		/* Workaround for SUN NIU insanity, which requires write before read */
+		if (dev->dev_flags & PCI_DEV_FLAGS_MSIX_TOUCH_ENTRY_DATA_FIRST)
+			writel(0, addr + PCI_MSIX_ENTRY_DATA);
 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 	}
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0e8e3fd..51e2bd6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -245,6 +245,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
 	/* Device does honor MSI masking despite saying otherwise */
 	PCI_DEV_FLAGS_HAS_MSI_MASKING = (__force pci_dev_flags_t) (1 << 12),
+	/* Device requires write to PCI_MSIX_ENTRY_DATA before any MSIX reads */
+	PCI_DEV_FLAGS_MSIX_TOUCH_ENTRY_DATA_FIRST = (__force pci_dev_flags_t) (1 << 13),
 };
 
 enum pci_irq_reroute_variant {