[PATCH v3] powerpc/pseries/eeh: Fix pseries_eeh_err_inject

Posted by Narayana Murty N 2 months, 3 weeks ago
VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries
due to missing implementation of err_inject eeh_ops for pseries.
This patch implements pseries_eeh_err_inject in eeh_ops/pseries
eeh_ops. Implements support for injecting MMIO load/store error
for testing from user space.

The check on PCI error type (bus type) code is moved to platform
code, since the eeh_pe_inject_err can be allowed to more error
types depending on platform requirement. Removal of the check for
'type' in eeh_pe_inject_err() doesn't impact PowerNV as
pnv_eeh_err_inject() already has an equivalent check in place.

Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>

---

Testing:
========
vfio-test [1] by Alex Williamson was forked and updated to add
support for injecting errors on a pSeries guest, and the fork [2]
was used to test this patch.

References:
===========
[1] https://github.com/awilliam/tests
[2] https://github.com/nnmwebmin/vfio-ppc-tests/tree/vfio-ppc-ex
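
For readers who want to see what the user-space side of such a test might look like, the following is a minimal sketch, not the code from [1] or [2]. It assumes a VFIO container file descriptor that has already been opened and set up for the sPAPR TCE IOMMU with EEH enabled; the helper names make_inject_op() and inject_error() are hypothetical. The type and func values are passed in by the caller because the EEH_ERR_* constants come from the powerpc <asm/eeh.h> UAPI header, which is not available on other architectures.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: fill a VFIO EEH request asking for error injection.
 * 'type' and 'func' are expected to be EEH_ERR_TYPE_* and EEH_ERR_FUNC_*
 * values; they are plain integers here so the sketch builds anywhere
 * <linux/vfio.h> is installed.
 */
static struct vfio_eeh_pe_op make_inject_op(__u32 type, __u32 func,
					    __u64 addr, __u64 mask)
{
	struct vfio_eeh_pe_op op;

	memset(&op, 0, sizeof(op));
	op.argsz = sizeof(op);
	op.op = VFIO_EEH_PE_INJECT_ERR;
	op.err.type = type;
	op.err.func = func;
	op.err.addr = addr;
	op.err.mask = mask;
	return op;
}

/*
 * Hypothetical helper: issue the request on an already configured VFIO
 * container fd. Returns the ioctl() result, i.e. -1 with errno set on
 * failure.
 */
static int inject_error(int container_fd, const struct vfio_eeh_pe_op *op)
{
	struct vfio_eeh_pe_op tmp = *op;

	return ioctl(container_fd, VFIO_EEH_PE_OP, &tmp);
}
```

Before this patch, such a request would be expected to fail on pseries because no err_inject callback was registered.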

Changelog:
==========
V2: https://lore.kernel.org/all/20240823151158.92602-1-nnmlinux@linux.ibm.com/
- Updated the patch description to explicitly mention the similar
check already in place for PowerNV, as suggested.
- Removed the eeh_pe_inject_mmio_error wrapper function, because
CONFIG_EEH is always enabled when PPC_PSERIES=y.
V1: https://lore.kernel.org/all/20240822082713.529982-1-nnmlinux@linux.ibm.com/
- Resolved build issues for ppc64|le_defconfig by moving the
pseries_eeh_err_inject() definition outside of the CONFIG_PCI_IOV
code block.
- Added a new eeh_pe_inject_mmio_error wrapper function to avoid
build failures when CONFIG_EEH is not set.
---
 arch/powerpc/include/asm/eeh.h               |  2 +-
 arch/powerpc/kernel/eeh.c                    |  9 +++--
 arch/powerpc/platforms/pseries/eeh_pseries.c | 39 +++++++++++++++++++-
 3 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
index 91a9fd53254f..317b12fc1fe4 100644
--- a/arch/powerpc/include/asm/eeh.h
+++ b/arch/powerpc/include/asm/eeh.h
@@ -308,7 +308,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option, bool include_passed);
 int eeh_pe_configure(struct eeh_pe *pe);
 int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
 		      unsigned long addr, unsigned long mask);
-
+int eeh_pe_inject_mmio_error(struct pci_dev *pdev);
 /**
  * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
  *
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index d03f17987fca..49ab11a287a3 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1537,10 +1537,6 @@ int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
 	if (!eeh_ops || !eeh_ops->err_inject)
 		return -ENOENT;
 
-	/* Check on PCI error type */
-	if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
-		return -EINVAL;
-
 	/* Check on PCI error function */
 	if (func < EEH_ERR_FUNC_MIN || func > EEH_ERR_FUNC_MAX)
 		return -EINVAL;
@@ -1851,6 +1847,11 @@ static const struct file_operations eeh_dev_break_fops = {
 	.read   = eeh_debugfs_dev_usage,
 };
 
+int eeh_pe_inject_mmio_error(struct pci_dev *pdev)
+{
+	return eeh_debugfs_break_device(pdev);
+}
+
 static ssize_t eeh_dev_can_recover(struct file *filp,
 				   const char __user *user_buf,
 				   size_t count, loff_t *ppos)
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index b1ae0c0d1187..1893f66371fa 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -784,6 +784,43 @@ static int pseries_notify_resume(struct eeh_dev *edev)
 }
 #endif
 
+/**
+ * pseries_eeh_err_inject - Inject specified error to the indicated PE
+ * @pe: the indicated PE
+ * @type: error type
+ * @func: specific error type
+ * @addr: address
+ * @mask: address mask
+ * The routine is called to inject specified error, which is
+ * determined by @type and @func, to the indicated PE
+ */
+static int pseries_eeh_err_inject(struct eeh_pe *pe, int type, int func,
+				  unsigned long addr, unsigned long mask)
+{
+	struct	eeh_dev	*pdev;
+
+	/* Check on PCI error type */
+	if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
+		return -EINVAL;
+
+	switch (func) {
+	case EEH_ERR_FUNC_LD_MEM_ADDR:
+	case EEH_ERR_FUNC_LD_MEM_DATA:
+	case EEH_ERR_FUNC_ST_MEM_ADDR:
+	case EEH_ERR_FUNC_ST_MEM_DATA:
+		/* injects a MMIO error for all pdev's belonging to PE */
+		pci_lock_rescan_remove();
+		list_for_each_entry(pdev, &pe->edevs, entry)
+			eeh_pe_inject_mmio_error(pdev->pdev);
+		pci_unlock_rescan_remove();
+		break;
+	default:
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
 static struct eeh_ops pseries_eeh_ops = {
 	.name			= "pseries",
 	.probe			= pseries_eeh_probe,
@@ -792,7 +829,7 @@ static struct eeh_ops pseries_eeh_ops = {
 	.reset			= pseries_eeh_reset,
 	.get_log		= pseries_eeh_get_log,
 	.configure_bridge       = pseries_eeh_configure_bridge,
-	.err_inject		= NULL,
+	.err_inject		= pseries_eeh_err_inject,
 	.read_config		= pseries_eeh_read_config,
 	.write_config		= pseries_eeh_write_config,
 	.next_error		= NULL,
-- 
2.45.2
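
The division of checks described in the commit message can be modelled outside the kernel. The sketch below is a simplified stand-in, not the kernel code: all demo_* names and the enum values are hypothetical, and the PE, address and mask arguments are omitted. It shows the generic layer keeping only the callback and function-range checks, while the platform callback validates the error type and accepts only the MMIO load/store functions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the EEH error type and function constants. */
enum { DEMO_TYPE_32 = 0, DEMO_TYPE_64 = 1 };
enum {
	DEMO_FUNC_MIN = 0,
	DEMO_FUNC_LD_MEM_ADDR = 0,
	DEMO_FUNC_LD_MEM_DATA = 1,
	DEMO_FUNC_LD_IO_ADDR = 2,	/* valid in general, unsupported here */
	DEMO_FUNC_ST_MEM_ADDR = 3,
	DEMO_FUNC_ST_MEM_DATA = 4,
	DEMO_FUNC_MAX = 4,
};

/* Simplified model of the per-platform operations table. */
struct demo_ops {
	int (*err_inject)(int type, int func);
};

/* Platform layer: owns the type check and the list of supported functions. */
static int demo_platform_err_inject(int type, int func)
{
	if (type != DEMO_TYPE_32 && type != DEMO_TYPE_64)
		return -EINVAL;

	switch (func) {
	case DEMO_FUNC_LD_MEM_ADDR:
	case DEMO_FUNC_LD_MEM_DATA:
	case DEMO_FUNC_ST_MEM_ADDR:
	case DEMO_FUNC_ST_MEM_DATA:
		return 0;		/* the real code would inject here */
	default:
		return -ERANGE;		/* in range, but not supported */
	}
}

/* Generic layer: checks the callback and the function range only. */
static int demo_pe_inject_err(const struct demo_ops *ops, int type, int func)
{
	if (!ops || !ops->err_inject)
		return -ENOENT;
	if (func < DEMO_FUNC_MIN || func > DEMO_FUNC_MAX)
		return -EINVAL;
	return ops->err_inject(type, func);
}
```

With this split, an unknown type is still rejected with -EINVAL, but the decision is made by the platform callback, which leaves room for a platform to accept additional types later.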
Re: [PATCH v3] powerpc/pseries/eeh: Fix pseries_eeh_err_inject
Posted by Guenter Roeck 2 months, 1 week ago
Hi,

On Mon, Sep 09, 2024 at 09:02:20AM -0500, Narayana Murty N wrote:
> VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries
> due to missing implementation of err_inject eeh_ops for pseries.
> This patch implements pseries_eeh_err_inject in eeh_ops/pseries
> eeh_ops. Implements support for injecting MMIO load/store error
> for testing from user space.
> 
> The check on PCI error type (bus type) code is moved to platform
> code, since the eeh_pe_inject_err can be allowed to more error
> types depending on platform requirement. Removal of the check for
> 'type' in eeh_pe_inject_err() doesn't impact PowerNV as
> pnv_eeh_err_inject() already has an equivalent check in place.
> 
> Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com>
> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> 
> ---
>  arch/powerpc/include/asm/eeh.h               |  2 +-
>  arch/powerpc/kernel/eeh.c                    |  9 +++--
>  arch/powerpc/platforms/pseries/eeh_pseries.c | 39 +++++++++++++++++++-
>  3 files changed, 44 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
> index 91a9fd53254f..317b12fc1fe4 100644
> --- a/arch/powerpc/include/asm/eeh.h
> +++ b/arch/powerpc/include/asm/eeh.h
> @@ -308,7 +308,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option, bool include_passed);
>  int eeh_pe_configure(struct eeh_pe *pe);
>  int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>  		      unsigned long addr, unsigned long mask);
> -
> +int eeh_pe_inject_mmio_error(struct pci_dev *pdev);
>  /**
>   * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
>   *
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index d03f17987fca..49ab11a287a3 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -1537,10 +1537,6 @@ int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>  	if (!eeh_ops || !eeh_ops->err_inject)
>  		return -ENOENT;
>  
> -	/* Check on PCI error type */
> -	if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
> -		return -EINVAL;
> -
>  	/* Check on PCI error function */
>  	if (func < EEH_ERR_FUNC_MIN || func > EEH_ERR_FUNC_MAX)
>  		return -EINVAL;
> @@ -1851,6 +1847,11 @@ static const struct file_operations eeh_dev_break_fops = {
>  	.read   = eeh_debugfs_dev_usage,
>  };
>  
> +int eeh_pe_inject_mmio_error(struct pci_dev *pdev)
> +{
> +	return eeh_debugfs_break_device(pdev);
> +}
> +

The new function, as the context suggests, is only compiled if CONFIG_DEBUG_FS=y.
However, it is called unconditionally. With CONFIG_DEBUG_FS=n, this results in

powerpc64-linux-ld: arch/powerpc/platforms/pseries/eeh_pseries.o: in function `pseries_eeh_err_inject':
/opt/buildbot/slave/qemu-ppc64/build/arch/powerpc/platforms/pseries/eeh_pseries.c:814:(.text+0x554): undefined reference to `eeh_pe_inject_mmio_error'
make[3]: *** [/opt/buildbot/slave/qemu-ppc64/build/scripts/Makefile.vmlinux:34: vmlinux] Error 1
make[2]: *** [/opt/buildbot/slave/qemu-ppc64/build/Makefile:1157: vmlinux] Error 2

I'll enable CONFIG_DEBUG_FS in my tests and won't report this further,
but you might want to consider fixing the problem at some point.

Guenter
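
The failure mode described above is the usual one for a symbol that is defined inside a conditionally compiled region but referenced unconditionally. One common way to avoid it, shown here only as an illustrative user-space sketch, is to provide a stub for the disabled configuration. DEMO_DEBUG_FS, struct pci_dev as defined here, and the demo_* functions are hypothetical stand-ins; this is not necessarily how the kernel resolves the issue, and moving the definition out of the conditional block is another option.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the kernel's struct pci_dev, reduced to one flag. */
struct pci_dev {
	int broken;
};

/* Build with -DDEMO_DEBUG_FS to model CONFIG_DEBUG_FS=y. */
#ifdef DEMO_DEBUG_FS

/* Only exists when the debug facility is compiled in. */
static int demo_debugfs_break_device(struct pci_dev *pdev)
{
	pdev->broken = 1;
	return 0;
}

static int demo_inject_mmio_error(struct pci_dev *pdev)
{
	return demo_debugfs_break_device(pdev);
}

#else

/*
 * Stub for the disabled configuration: callers still link, and they get
 * an error code instead of an undefined reference.
 */
static inline int demo_inject_mmio_error(struct pci_dev *pdev)
{
	(void)pdev;
	return -ENXIO;
}

#endif
```

Built without -DDEMO_DEBUG_FS, a call to demo_inject_mmio_error() links and returns -ENXIO without touching the device, which is the behaviour a caller such as pseries_eeh_err_inject() would need in order to build in both configurations.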
Re: [PATCH v3] powerpc/pseries/eeh: Fix pseries_eeh_err_inject
Posted by Ritesh Harjani (IBM) 2 months, 1 week ago
Guenter Roeck <linux@roeck-us.net> writes:

> Hi,
>
> On Mon, Sep 09, 2024 at 09:02:20AM -0500, Narayana Murty N wrote:
>> VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries
>> due to missing implementation of err_inject eeh_ops for pseries.
>> This patch implements pseries_eeh_err_inject in eeh_ops/pseries
>> eeh_ops. Implements support for injecting MMIO load/store error
>> for testing from user space.
>> 
>> The check on PCI error type (bus type) code is moved to platform
>> code, since the eeh_pe_inject_err can be allowed to more error
>> types depending on platform requirement. Removal of the check for
>> 'type' in eeh_pe_inject_err() doesn't impact PowerNV as
>> pnv_eeh_err_inject() already has an equivalent check in place.
>> 
>> Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com>
>> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>> 
>> ---
>>  arch/powerpc/include/asm/eeh.h               |  2 +-
>>  arch/powerpc/kernel/eeh.c                    |  9 +++--
>>  arch/powerpc/platforms/pseries/eeh_pseries.c | 39 +++++++++++++++++++-
>>  3 files changed, 44 insertions(+), 6 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/eeh.h b/arch/powerpc/include/asm/eeh.h
>> index 91a9fd53254f..317b12fc1fe4 100644
>> --- a/arch/powerpc/include/asm/eeh.h
>> +++ b/arch/powerpc/include/asm/eeh.h
>> @@ -308,7 +308,7 @@ int eeh_pe_reset(struct eeh_pe *pe, int option, bool include_passed);
>>  int eeh_pe_configure(struct eeh_pe *pe);
>>  int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>>  		      unsigned long addr, unsigned long mask);
>> -
>> +int eeh_pe_inject_mmio_error(struct pci_dev *pdev);
>>  /**
>>   * EEH_POSSIBLE_ERROR() -- test for possible MMIO failure.
>>   *
>> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>> index d03f17987fca..49ab11a287a3 100644
>> --- a/arch/powerpc/kernel/eeh.c
>> +++ b/arch/powerpc/kernel/eeh.c
>> @@ -1537,10 +1537,6 @@ int eeh_pe_inject_err(struct eeh_pe *pe, int type, int func,
>>  	if (!eeh_ops || !eeh_ops->err_inject)
>>  		return -ENOENT;
>>  
>> -	/* Check on PCI error type */
>> -	if (type != EEH_ERR_TYPE_32 && type != EEH_ERR_TYPE_64)
>> -		return -EINVAL;
>> -
>>  	/* Check on PCI error function */
>>  	if (func < EEH_ERR_FUNC_MIN || func > EEH_ERR_FUNC_MAX)
>>  		return -EINVAL;
>> @@ -1851,6 +1847,11 @@ static const struct file_operations eeh_dev_break_fops = {
>>  	.read   = eeh_debugfs_dev_usage,
>>  };
>>  
>> +int eeh_pe_inject_mmio_error(struct pci_dev *pdev)
>> +{
>> +	return eeh_debugfs_break_device(pdev);
>> +}
>> +
>
> The new function, as the context suggests, is only compiled if CONFIG_DEBUG_FS=y.
> However, it is called unconditionally. With CONFIG_DEBUG_FS=n, this results in
>
> powerpc64-linux-ld: arch/powerpc/platforms/pseries/eeh_pseries.o: in function `pseries_eeh_err_inject':
> /opt/buildbot/slave/qemu-ppc64/build/arch/powerpc/platforms/pseries/eeh_pseries.c:814:(.text+0x554): undefined reference to `eeh_pe_inject_mmio_error'
> make[3]: *** [/opt/buildbot/slave/qemu-ppc64/build/scripts/Makefile.vmlinux:34: vmlinux] Error 1
> make[2]: *** [/opt/buildbot/slave/qemu-ppc64/build/Makefile:1157: vmlinux] Error 2
>
> I'll enable CONFIG_DEBUG_FS in my tests and won't report this further,
> but you might want to consider fixing the problem at some point.
>

Yes, this is fixed and picked up in powerpc tree.

https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?h=merge&id=3af2e2f68cc6baf0a11f662d30b0bf981f77bfea

-ritesh
Re: [PATCH v3] powerpc/pseries/eeh: Fix pseries_eeh_err_inject
Posted by Michael Ellerman 2 months, 2 weeks ago
On Mon, 09 Sep 2024 09:02:20 -0500, Narayana Murty N wrote:
> VFIO_EEH_PE_INJECT_ERR ioctl is currently failing on pseries
> due to missing implementation of err_inject eeh_ops for pseries.
> This patch implements pseries_eeh_err_inject in eeh_ops/pseries
> eeh_ops. Implements support for injecting MMIO load/store error
> for testing from user space.
> 
> The check on PCI error type (bus type) code is moved to platform
> code, since the eeh_pe_inject_err can be allowed to more error
> types depending on platform requirement. Removal of the check for
> 'type' in eeh_pe_inject_err() doesn't impact PowerNV as
> pnv_eeh_err_inject() already has an equivalent check in place.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries/eeh: Fix pseries_eeh_err_inject
      https://git.kernel.org/powerpc/c/b0e2b828dfca645a228f8c89d12fbc2baecfb7ea

cheers