[Qemu-devel] [PATCH v3 0/3] vfio-pci: support recovery of AER non fatal error

Cao jin posted 3 patches 7 years, 1 month ago
[Qemu-devel] [PATCH v3 0/3] vfio-pci: support recovery of AER non fatal error
Posted by Cao jin 7 years, 1 month ago
v3 changelog:
1. Address all comments from MST on patch 3, including removing the flags
   pci_aer_non_fatal & passive_reset and the boilerplate code.
   The corresponding kernel patch is v6.

Test:
Tested with func1 passed through while func0 has no user.

Cao jin (3):
  pcie aer: verify if AER functionality is available
  vfio pci: new function to init AER capability
  vfio-pci: process non fatal error of AER

 hw/pci/pcie_aer.c          |  28 ++++++
 hw/vfio/pci.c              | 243 ++++++++++++++++++++++++++++++++++++++++++++-
 hw/vfio/pci.h              |   3 +
 linux-headers/linux/vfio.h |   2 +
 4 files changed, 271 insertions(+), 5 deletions(-)

-- 
1.8.3.1




Re: [Qemu-devel] [PATCH v3 0/3] vfio-pci: support recovery of AER non fatal error
Posted by Alex Williamson 7 years, 1 month ago
On Thu, 23 Mar 2017 17:09:20 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> v3 changelog:
> 1. Address all comments from MST on patch 3, including removing the flags
>    pci_aer_non_fatal & passive_reset and the boilerplate code.
>    The corresponding kernel patch is v6.
> 
> Test:
> Tested with func1 passed through while func0 has no user.

So the slot_reset trigger really hasn't been tested at all?
 
> Cao jin (3):
>   pcie aer: verify if AER functionality is available
>   vfio pci: new function to init AER capability
>   vfio-pci: process non fatal error of AER
> 
>  hw/pci/pcie_aer.c          |  28 ++++++
>  hw/vfio/pci.c              | 243 ++++++++++++++++++++++++++++++++++++++++++++-
>  hw/vfio/pci.h              |   3 +
>  linux-headers/linux/vfio.h |   2 +
>  4 files changed, 271 insertions(+), 5 deletions(-)
> 


Re: [Qemu-devel] [PATCH v3 0/3] vfio-pci: support recovery of AER non fatal error
Posted by Cao jin 7 years ago

On 03/25/2017 06:12 AM, Alex Williamson wrote:
> On Thu, 23 Mar 2017 17:09:20 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
>> v3 changelog:
>> 1. Address all comments from MST on patch 3, including removing the flags
>>    pci_aer_non_fatal & passive_reset and the boilerplate code.
>>    The corresponding kernel patch is v6.
>>
>> Test:
>> Tested with func1 passed through while func0 has no user.
> 
> So the slot_reset trigger really hasn't been tested at all?

No, because we don't have that kind of multi-function device. IIRC, in
the real world, most multi-function devices have identical functions.

I plan to do the basic test described above before getting a Reviewed-by,
and will do the full test, as before, once the series has been reviewed.

I will consider whether we can fake a slot_reset trigger.

-- 
Sincerely,
Cao jin



Re: [Qemu-devel] [PATCH v3 0/3] vfio-pci: support recovery of AER non fatal error
Posted by Alex Williamson 7 years ago
On Tue, 28 Mar 2017 21:47:09 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> On 03/25/2017 06:12 AM, Alex Williamson wrote:
> > On Thu, 23 Mar 2017 17:09:20 +0800
> > Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> >   
> >> v3 changelog:
> >> 1. Address all comments from MST on patch 3, including removing the flags
> >>    pci_aer_non_fatal & passive_reset and the boilerplate code.
> >>    The corresponding kernel patch is v6.
> >>
> >> Test:
> >> Tested with func1 passed through while func0 has no user.
> > 
> > So the slot_reset trigger really hasn't been tested at all?  
> 
> No, because we don't have that kind of multi-function device. IIRC, in
> the real world, most multi-function devices have identical functions.

Why does that matter?  Even if the functions are identical, one can be
owned by the host and one can be owned by a completely different driver
in a guest.  The guest driver may be able to recover without a reset
while the host driver may require one for the same error.
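
As a rough illustration of that point, here is a minimal sketch of how
per-function recovery votes could combine, assuming a simplified model in
which any single "need reset" vote forces a slot reset for the whole
device. The names (ers_result, function_owner, merge_votes) are
hypothetical and only loosely mirror the kernel's pci_ers_result_t
handling; this is not kernel or QEMU code.

```c
#include <stddef.h>

/* Hypothetical, simplified recovery votes; not the kernel's real enum. */
enum ers_result {
    ERS_CAN_RECOVER,  /* driver can recover without a reset */
    ERS_NEED_RESET    /* driver requires a slot reset */
};

/* One function of a multi-function device and the driver that owns it. */
struct function_owner {
    const char *driver;    /* e.g. a host driver, or vfio-pci for a guest */
    enum ers_result vote;  /* what that driver reports for the error */
};

/*
 * Combine the votes of all functions in the slot.  In this model a
 * single ERS_NEED_RESET vote wins, so identical hardware functions can
 * still end in a slot reset when their drivers disagree.
 */
static enum ers_result merge_votes(const struct function_owner *fn, size_t n)
{
    enum ers_result result = ERS_CAN_RECOVER;

    for (size_t i = 0; i < n; i++) {
        if (fn[i].vote == ERS_NEED_RESET) {
            result = ERS_NEED_RESET;
        }
    }
    return result;
}
```

With func0 owned by a host driver voting ERS_NEED_RESET and func1 owned
by a guest driver voting ERS_CAN_RECOVER, merge_votes() yields
ERS_NEED_RESET, which is the case the slot_reset path has to handle.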

> I plan to do the basic test described above before getting a Reviewed-by,
> and will do the full test, as before, once the series has been reviewed.
> 
> I will consider whether we can fake a slot_reset trigger.

If more testing is required for a patch series, it should be explicitly
noted in the cover letter, or the series should be sent as an RFC.
Otherwise you're potentially wasting my time if I'm the first to test it,
or risking that untested code will be approved and make it upstream.  Thanks,

Alex