[PATCH v4 00/15] hw/nvme: SR-IOV with Virtualization Enhancements

Lukasz Maniak posted 15 patches 2 years, 3 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20220126171120.2939152-1-lukasz.maniak@linux.intel.com
Maintainers: Keith Busch <kbusch@kernel.org>, Hanna Reitz <hreitz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Klaus Jensen <its@irrelevant.dk>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, "Michael S. Tsirkin" <mst@redhat.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>
There is a newer version of this series
[PATCH v4 00/15] hw/nvme: SR-IOV with Virtualization Enhancements
Posted by Lukasz Maniak 2 years, 3 months ago
Changes since v3:
- Addressed comments to review on pcie: Add support for Single Root I/O
  Virtualization (SR/IOV)
- Fixed issues reported by checkpatch.pl

Knut Omang (2):
  pcie: Add support for Single Root I/O Virtualization (SR/IOV)
  pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt

Lukasz Maniak (4):
  hw/nvme: Add support for SR-IOV
  hw/nvme: Add support for Primary Controller Capabilities
  hw/nvme: Add support for Secondary Controller List
  docs: Add documentation for SR-IOV and Virtualization Enhancements

Łukasz Gieryk (9):
  pcie: Add a helper to the SR/IOV API
  pcie: Add 1.2 version token for the Power Management Capability
  hw/nvme: Implement the Function Level Reset
  hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime
  hw/nvme: Remove reg_size variable and update BAR0 size calculation
  hw/nvme: Calculate BAR attributes in a function
  hw/nvme: Initialize capability structures for primary/secondary
    controllers
  hw/nvme: Add support for the Virtualization Management command
  hw/nvme: Update the initalization place for the AER queue

 docs/pcie_sriov.txt          | 115 ++++++
 docs/system/devices/nvme.rst |  36 ++
 hw/nvme/ctrl.c               | 675 ++++++++++++++++++++++++++++++++---
 hw/nvme/ns.c                 |   2 +-
 hw/nvme/nvme.h               |  55 ++-
 hw/nvme/subsys.c             |  75 +++-
 hw/nvme/trace-events         |   6 +
 hw/pci/meson.build           |   1 +
 hw/pci/pci.c                 | 100 ++++--
 hw/pci/pcie.c                |   5 +
 hw/pci/pcie_sriov.c          | 302 ++++++++++++++++
 hw/pci/trace-events          |   5 +
 include/block/nvme.h         |  65 ++++
 include/hw/pci/pci.h         |  12 +-
 include/hw/pci/pci_ids.h     |   1 +
 include/hw/pci/pci_regs.h    |   1 +
 include/hw/pci/pcie.h        |   6 +
 include/hw/pci/pcie_sriov.h  |  77 ++++
 include/qemu/typedefs.h      |   2 +
 19 files changed, 1460 insertions(+), 81 deletions(-)
 create mode 100644 docs/pcie_sriov.txt
 create mode 100644 hw/pci/pcie_sriov.c
 create mode 100644 include/hw/pci/pcie_sriov.h

-- 
2.25.1


Re: [PATCH v4 00/15] hw/nvme: SR-IOV with Virtualization Enhancements
Posted by Klaus Jensen 2 years, 2 months ago
On Jan 26 18:11, Lukasz Maniak wrote:
> [...]

Hi Lukasz,

Back in v3 you changed this:

- Secondary controller cannot be set online unless the corresponding VF
  is enabled (sriov_numvfs set to at least the secondary controller's VF
  number)

I'm having issues getting this to work now. As I understand it, this now
requires that sriov_numvfs is set prior to onlining the devices, i.e.:

  echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs

However, this causes the kernel to reject it:

  nvme nvme1: Device not ready; aborting initialisation, CSTS=0x2
  nvme nvme1: Removing after probe failure status: -19

Is this the expected behavior? Must I manually bind the device again to
the nvme driver? Prior to v3 this worked just fine since the VF was
onlined at this point.

It would be useful if you added a small "onlining for dummies" section
to the docs ;)
Re: [PATCH v4 00/15] hw/nvme: SR-IOV with Virtualization Enhancements
Posted by Lukasz Maniak 2 years, 2 months ago
On Fri, Feb 11, 2022 at 08:26:10AM +0100, Klaus Jensen wrote:
> On Jan 26 18:11, Lukasz Maniak wrote:
> > [...]
> 
> Hi Lukasz,
> 
> Back in v3 you changed this:
> 
> - Secondary controller cannot be set online unless the corresponding VF
>   is enabled (sriov_numvfs set to at least the secondary controller's VF
>   number)
> 
> I'm having issues getting this to work now. As I understand it, this now
> requires that sriov_numvfs is set prior to onlining the devices, i.e.:
> 
>   echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs
> 
> However, this causes the kernel to reject it:
> 
>   nvme nvme1: Device not ready; aborting initialisation, CSTS=0x2
>   nvme nvme1: Removing after probe failure status: -19
> 
> Is this the expected behavior? Must I manually bind the device again to
> the nvme driver? Prior to v3 this worked just fine since the VF was
> onlined at this point.
> 
> It would be useful if you added a small "onlining for dummies" section
> to the docs ;)

Hi Klaus,

Yes, this is the expected behavior, and yes, it is less user-friendly
than in v3.

Yet, after re-examining the NVMe specification, we concluded that this
is how it should work.

This is now the minimum flow needed to bring up a functional VF-based
NVMe controller:

  # Unbind all flexible resources from the primary controller
  nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0
  nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0

  # Reset the primary controller to actually release the resources
  echo 1 > /sys/bus/pci/devices/0000:01:00.0/reset

  # Enable VF
  echo 1 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

  # Assign flexible resources to VF and set it ONLINE
  nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 21
  nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 21
  nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0

  # Bind NVMe driver for VF controller
  echo 0000:01:00.1 > /sys/bus/pci/drivers/nvme/bind
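
For anyone who wants to adapt the flow to other addresses or resource
counts, here is a rough dry-run sketch. It is not part of the series:
the function name vf_online_plan and its argument order are made up for
illustration, and the flag meanings in the comments (-r 0 = VQ, -r 1 =
VI, -a 1 = primary flexible allocation, -a 8 = secondary assign, -a 9 =
secondary online) are how I read the Virtualization Management command,
so please check them against the spec. It only prints the commands; it
does not run them.

```shell
#!/bin/sh
# Hypothetical helper: print the onlining sequence for one VF (dry run).
#   $1 PF PCI address           $2 VF PCI address
#   $3 primary controller dev   $4 secondary controller ID
#   $5 value for sriov_numvfs   $6 flexible resources (VI and VQ) to assign
vf_online_plan() {
    pf=$1; vf=$2; ctrl=$3; scid=$4; numvfs=$5; nr=$6

    # Release all flexible VI (-r 1) and VQ (-r 0) resources from the primary
    printf '%s\n' "nvme virt-mgmt $ctrl -c 0 -r 1 -a 1 -n 0"
    printf '%s\n' "nvme virt-mgmt $ctrl -c 0 -r 0 -a 1 -n 0"

    # Reset the primary controller so the release takes effect
    printf '%s\n' "echo 1 > /sys/bus/pci/devices/$pf/reset"

    # Enable the VF(s); must cover the VF that backs the secondary controller
    printf '%s\n' "echo $numvfs > /sys/bus/pci/devices/$pf/sriov_numvfs"

    # Assign VI and VQ resources to the secondary controller, then online it
    printf '%s\n' "nvme virt-mgmt $ctrl -c $scid -r 1 -a 8 -n $nr"
    printf '%s\n' "nvme virt-mgmt $ctrl -c $scid -r 0 -a 8 -n $nr"
    printf '%s\n' "nvme virt-mgmt $ctrl -c $scid -r 0 -a 9 -n 0"

    # Bind the nvme driver to the VF
    printf '%s\n' "echo $vf > /sys/bus/pci/drivers/nvme/bind"
}

# Values taken from the flow above
vf_online_plan 0000:01:00.0 0000:01:00.1 /dev/nvme0 1 1 21
```

Review the printed commands, then run them as root on the guest. The
sysfs writes in particular need a root shell, not just sudo in front of
echo.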

I will update the docs.

Thanks,
Lukasz