[PATCH v2 2/4] vfio/pci: Add an architecture specific error handler

Farhan Ali posted 4 patches 2 months, 3 weeks ago
Maintainers: Matthew Rosato <mjrosato@linux.ibm.com>, Eric Farman <farman@linux.ibm.com>, Thomas Huth <thuth@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, David Hildenbrand <david@redhat.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Halil Pasic <pasic@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Cornelia Huck <cohuck@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
[PATCH v2 2/4] vfio/pci: Add an architecture specific error handler
Posted by Farhan Ali 2 months, 3 weeks ago
Provide a architecture specific error handling callback, that can be used
by platforms to handle PCI errors for passthrough devices.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 hw/vfio/pci.c | 5 +++++
 hw/vfio/pci.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 07257d0fa0..3c71d19306 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3026,6 +3026,11 @@ static void vfio_err_notifier_handler(void *opaque)
         return;
     }
 
+    if (vdev->arch_err_handler) {
+        vdev->arch_err_handler(vdev);
+        return;
+    }
+
     /*
      * TBD. Retrieve the error details and decide what action
      * needs to be taken. One of the actions could be to pass
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 810a842f4a..45d4405e47 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -145,6 +145,7 @@ struct VFIOPCIDevice {
     EventNotifier err_notifier;
     EventNotifier req_notifier;
     int (*resetfn)(struct VFIOPCIDevice *);
+    void (*arch_err_handler)(struct VFIOPCIDevice *);
     uint32_t vendor_id;
     uint32_t device_id;
     uint32_t sub_vendor_id;
-- 
2.43.0
Re: [PATCH v2 2/4] vfio/pci: Add an architecture specific error handler
Posted by Cédric Le Goater 2 months, 2 weeks ago
On 8/25/25 23:24, Farhan Ali wrote:
> Provide a architecture specific error handling callback, that can be used
> by platforms to handle PCI errors for passthrough devices.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
>   hw/vfio/pci.c | 5 +++++
>   hw/vfio/pci.h | 1 +
>   2 files changed, 6 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 07257d0fa0..3c71d19306 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3026,6 +3026,11 @@ static void vfio_err_notifier_handler(void *opaque)
>           return;
>       }
>   
> +    if (vdev->arch_err_handler) {
> +        vdev->arch_err_handler(vdev);


I am not sure that the "architecture specific error handling"
will be implemented this way but we need to check for potential
errors.

So, please make the handler return a bool and add an extra
'Error **' parameter.


Thanks,

C.




> +        return;
> +    }
> +
>       /*
>        * TBD. Retrieve the error details and decide what action
>        * needs to be taken. One of the actions could be to pass
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index 810a842f4a..45d4405e47 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -145,6 +145,7 @@ struct VFIOPCIDevice {
>       EventNotifier err_notifier;
>       EventNotifier req_notifier;
>       int (*resetfn)(struct VFIOPCIDevice *);
> +    void (*arch_err_handler)(struct VFIOPCIDevice *);

>       uint32_t vendor_id;
>       uint32_t device_id;
>       uint32_t sub_vendor_id;
Re: [PATCH v2 2/4] vfio/pci: Add an architecture specific error handler
Posted by Farhan Ali 2 months, 1 week ago
On 9/1/2025 4:28 AM, Cédric Le Goater wrote:
> On 8/25/25 23:24, Farhan Ali wrote:
>> Provide a architecture specific error handling callback, that can be 
>> used
>> by platforms to handle PCI errors for passthrough devices.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>>   hw/vfio/pci.c | 5 +++++
>>   hw/vfio/pci.h | 1 +
>>   2 files changed, 6 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 07257d0fa0..3c71d19306 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3026,6 +3026,11 @@ static void vfio_err_notifier_handler(void 
>> *opaque)
>>           return;
>>       }
>>   +    if (vdev->arch_err_handler) {
>> +        vdev->arch_err_handler(vdev);
>
>
> I am not sure that the "architecture specific error handling"
> will be implemented this way but we need to check for potential
> errors.

Thanks for reviewing, do you have any suggestions on how to implement 
the architecture/device specific error handling?


>
> So, please make the handler return a bool and add an extra
> 'Error **' parameter.
>
>
Sure, I can change that. Are you suggesting the change to bool return 
for the handler to report any failures in the handler (through errp)?

Thanks

Farhan

>
> Thanks,
>
> C.
>
>
>
>
>> +        return;
>> +    }
>> +
>>       /*
>>        * TBD. Retrieve the error details and decide what action
>>        * needs to be taken. One of the actions could be to pass
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index 810a842f4a..45d4405e47 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -145,6 +145,7 @@ struct VFIOPCIDevice {
>>       EventNotifier err_notifier;
>>       EventNotifier req_notifier;
>>       int (*resetfn)(struct VFIOPCIDevice *);
>> +    void (*arch_err_handler)(struct VFIOPCIDevice *);
>
>>       uint32_t vendor_id;
>>       uint32_t device_id;
>>       uint32_t sub_vendor_id;
>
>

Re: [PATCH v2 2/4] vfio/pci: Add an architecture specific error handler
Posted by Cédric Le Goater 2 months ago
On 9/3/25 18:49, Farhan Ali wrote:
> 
> On 9/1/2025 4:28 AM, Cédric Le Goater wrote:
>> On 8/25/25 23:24, Farhan Ali wrote:
>>> Provide a architecture specific error handling callback, that can be used
>>> by platforms to handle PCI errors for passthrough devices.
>>>
>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>> ---
>>>   hw/vfio/pci.c | 5 +++++
>>>   hw/vfio/pci.h | 1 +
>>>   2 files changed, 6 insertions(+)
>>>
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index 07257d0fa0..3c71d19306 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -3026,6 +3026,11 @@ static void vfio_err_notifier_handler(void *opaque)
>>>           return;
>>>       }
>>>   +    if (vdev->arch_err_handler) {
>>> +        vdev->arch_err_handler(vdev);
>>
>>
>> I am not sure that the "architecture specific error handling"
>> will be implemented this way but we need to check for potential
>> errors.
> 
> Thanks for reviewing, do you have any suggestions on how to implement the architecture/device specific error handling?

The nature of VFIOPCIDevice is special, it's both a PCIDevice and
a VFIODevice and it's difficult to use QOM to customize some of
its behavior with device class handlers. I don't see a better
way for now.

I would drop the 'arch_' prefix from the name.


>> So, please make the handler return a bool and add an extra
>> 'Error **' parameter.
>>
>>
> Sure, I can change that. Are you suggesting the change to bool return for the handler to report any failures in the handler (through errp)?

yes. The error reported before the vm_stop() in vfio_err_notifier_handler()
should be updated too.

Thanks,

C.