[libvirt PATCH] docs: Add pci-addresses.rst

Andrea Bolognani posted 1 patch 4 years ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20200414175305.348601-1-abologna@redhat.com
[libvirt PATCH] docs: Add pci-addresses.rst
Posted by Andrea Bolognani 4 years ago
This document describes the relationship between PCI addresses as
seen in the domain XML and by the guest OS, which is a topic that
people get confused by time and time again.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
---
 docs/formatdomain.html.in |   6 +-
 docs/pci-addresses.rst    | 184 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+), 1 deletion(-)
 create mode 100644 docs/pci-addresses.rst

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 6f43976815..0077666862 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -4286,7 +4286,11 @@
         element with no other attributes as an explicit request to
         assign a PCI address for the device rather than some other
         type of address that may also be appropriate for that same
-        device (e.g. virtio-mmio).
+        device (e.g. virtio-mmio).<br/>
+        The relationship between the PCI addresses configured in the domain
+        XML and those seen by the guest OS can sometimes seem confusing: a
+        separate document describes <a href="pci-addresses.html">how PCI
+        addresses work</a> in more detail.
       </dd>
       <dt><code>drive</code></dt>
       <dd>Drive addresses have the following additional
diff --git a/docs/pci-addresses.rst b/docs/pci-addresses.rst
new file mode 100644
index 0000000000..96c6466899
--- /dev/null
+++ b/docs/pci-addresses.rst
@@ -0,0 +1,184 @@
+========================================
+PCI addresses in domain XML and guest OS
+========================================
+
+.. contents::
+
+When discussing PCI addresses, it's important to understand the the
+relationship between the addresses that can be seen in the domain XML
+and those that are visible inside the guest OS.
+
+
+Simple cases
+============
+
+When the PCI topology of the VM is very simple, the PCI addresses
+will usually match.
+
+For example, the domain XML snippet
+
+::
+
+  <controller type='pci' index='0' model='pcie-root'/>
+  <controller type='pci' index='1' model='pcie-root-port'>
+    <model name='pcie-root-port'/>
+    <target chassis='1' port='0x8'/>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
+  </controller>
+  <interface type='network'>
+    <source network='default'/>
+    <model type='virtio'/>
+    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
+  </interface>
+
+will result in the PCI topology
+
+::
+
+  0000:00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
+  0000:00:01.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
+  0000:01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
+
+showing up in the guest OS.
+
+The PCI address of the ``virtio-net`` adapter, ``0000:01:00.0``, is
+the same in both cases, so there's no confusion.
+
+
+More complex cases
+==================
+
+In more complex cases, the PCI address visible in the domain XML will
+correlate to the one seen by the guest OS in a less obvious way.
+
+pcie-expander-bus
+-----------------
+
+This fairly uncommon device, which can be used with ``x86_64/q35``
+guests, will help illustrate one such scenario.
+
+For example, the domain XML snippet
+
+::
+
+  <controller type='pci' index='0' model='pcie-root'/>
+  <controller type='pci' index='1' model='pcie-expander-bus'>
+    <model name='pxb-pcie'/>
+    <target busNr='254'/>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
+  </controller>
+  <controller type='pci' index='2' model='pcie-root-port'>
+    <model name='pcie-root-port'/>
+    <target chassis='2' port='0x0'/>
+    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
+  </controller>
+  <interface type='network'>
+    <source network='default'/>
+    <model type='virtio'/>
+    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
+  </interface>
+
+will result in the PCI topology
+
+::
+
+  0000:00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
+  0000:00:01.0 Host bridge: Red Hat, Inc. QEMU PCIe Expander bridge
+  0000:fe:00.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
+  0000:ff:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
+
+showing up in the guest OS.
+
+This time the addresses don't match: this is because the ``busNr``
+property for the ``pcie-expander-bus`` controller causes it to show
+up as bus 254 (``0xfe`` in hexadecimal) instead of bus 1 as one might
+expect based on its ``index`` property.
+
+How can the domain XML shown above work at all, then? Surely the
+``pcie-root-port`` controller and the ``virtio-net`` adapter should
+use ``bus=0xfe`` and ``bus=0xff`` respectively for the configuration
+to be accepted by libvirt?
+
+As it turns out, that's not the case. The reason for this is that
+QEMU, and consequently libvirt, uses the ``bus`` property of a
+device's PCI address only to match it with the PCI controller that
+has the same ``index`` property, and not to set the actual PCI
+address, which is decided by the guest OS.
+
+So, by looking at the XML snippet above, we can see that the
+``virtio-net`` adapter plugs into the ``pcie-root-port`` controller,
+which plugs into the ``pcie-expander-bus`` controller, which plugs
+into ``pcie-root``: the guest OS sees the same topology, but assigns
+different PCI addresses to some of its components.
+
+The takeaway is that the *relationships* between controllers are the
+very same whether you look at the domain XML or at the guest OS, but
+the *actual PCI addresses* are not guaranteed to match and in fact,
+except for the very simplest cases, they usually will not.
+
+spapr-pci-host-bridge
+---------------------
+
+This device, which is unique to ``ppc64/pseries`` guests, will help
+illustrate another scenario.
+
+For example, the domain XML snippet
+
+::
+
+  <controller type='pci' index='0' model='pci-root'>
+    <model name='spapr-pci-host-bridge'/>
+    <target index='0'/>
+  </controller>
+  <controller type='pci' index='1' model='pci-root'>
+    <model name='spapr-pci-host-bridge'/>
+    <target index='1'/>
+  </controller>
+  <interface type='network'>
+    <source network='default'/>
+    <model type='virtio'/>
+    <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
+  </interface>
+
+will result in the PCI topology
+
+::
+
+  0001:00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device
+
+showing up in the guest OS. Note that the two
+``spapr-pci-host-bridge`` controllers are not listed.
+
+This time, in addition to the bus not matching just like in the
+previous example, the interesting part is that the domain doesn't
+match either: this is because each ``spapr-pci-host-bridge``
+controller creates a separate PCI domain.
+
+Once again, while the PCI addresses seen in the domain XML and those
+seen by the guest OS do not match, the relationships between the
+various devices are preserved.
+
+
+Device assignment
+=================
+
+When using VFIO to assign host devices to a guest, an additional
+caveat to keep in mind is that the guest OS will base its decisions upon
+the *target address* rather than the *source address*.
+
+For example, the domain XML snippet
+
+::
+
+  <hostdev mode='subsystem' type='pci' managed='yes'>
+    <driver name='vfio'/>
+    <source>
+      <address domain='0x0001' bus='0x08' slot='0x00' function='0x0'/>
+    </source>
+    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
+  </hostdev>
+
+will result in the device showing up as ``0000:00:01.0`` in the
+guest OS rather than as ``0001:08:00.0``.
+
+Of course, all the rules and behaviors described above still apply.
-- 
2.25.2
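
The rule described in the pcie-expander-bus section (the ``bus`` value of
an address only selects the controller whose ``index`` matches) can be
sketched in a few lines of Python. This is an illustrative sketch, not
libvirt code: the ``<devices>`` wrapper, the ``plug_chain`` helper and the
variable names are assumptions made for the example, and only the standard
library is used.

```python
import xml.etree.ElementTree as ET

# The pcie-expander-bus snippet from the document, wrapped in a
# hypothetical <devices> element so that it parses as one XML tree.
DOMAIN_XML = """
<devices>
  <controller type='pci' index='0' model='pcie-root'/>
  <controller type='pci' index='1' model='pcie-expander-bus'>
    <model name='pxb-pcie'/>
    <target busNr='254'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
  </controller>
  <controller type='pci' index='2' model='pcie-root-port'>
    <model name='pcie-root-port'/>
    <target chassis='2' port='0x0'/>
    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
  </controller>
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
  </interface>
</devices>
"""


def plug_chain(devices, device):
    """Return the controller models a device plugs into, innermost first.

    The 'bus' attribute of each address is treated purely as a reference
    to the controller with the same 'index', as the document describes.
    """
    controllers = {
        int(c.get("index")): c
        for c in devices.findall("controller[@type='pci']")
    }
    chain = []
    addr = device.find("address[@type='pci']")
    while addr is not None:
        ctrl = controllers[int(addr.get("bus"), 16)]
        chain.append(ctrl.get("model"))
        # The root controller has no <address>, which ends the walk.
        addr = ctrl.find("address[@type='pci']")
    return chain


devices = ET.fromstring(DOMAIN_XML)
chain = plug_chain(devices, devices.find("interface"))
print(" -> ".join(["virtio-net"] + chain))
```

The walk never looks at ``busNr`` or at any guest-visible bus number, which
is the point: the XML describes which device plugs into which controller,
and nothing more.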

Re: [libvirt PATCH] docs: Add pci-addresses.rst
Posted by Cornelia Huck 4 years ago
On Tue, 14 Apr 2020 19:53:05 +0200
Andrea Bolognani <abologna@redhat.com> wrote:

> This document describes the relationship between PCI addresses as
> seen in the domain XML and by the guest OS, which is a topic that
> people get confused by time and time again.
> 
> Signed-off-by: Andrea Bolognani <abologna@redhat.com>
> ---
>  docs/formatdomain.html.in |   6 +-
>  docs/pci-addresses.rst    | 184 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 189 insertions(+), 1 deletion(-)
>  create mode 100644 docs/pci-addresses.rst
> 

(...)

> diff --git a/docs/pci-addresses.rst b/docs/pci-addresses.rst
> new file mode 100644
> index 0000000000..96c6466899
> --- /dev/null
> +++ b/docs/pci-addresses.rst
> @@ -0,0 +1,184 @@
> +========================================
> +PCI addresses in domain XML and guest OS
> +========================================
> +
> +.. contents::
> +
> +When discussing PCI addresses, it's important to understand the the
> +relationship between the addresses that can be seen in the domain XML
> +and those that are visible inside the guest OS.
> +
> +
> +Simple cases
> +============

(...)

> +More complex cases
> +==================

(...)

I'm wondering whether it is worth mentioning zPCI under 'More complex
cases', or maybe under 'Completely wacky cases', as the PCI addresses a
Linux guest will generate do not have any relation to whatever
addresses are used in the XML at all, but only to the zPCI attributes?

Re: [libvirt PATCH] docs: Add pci-addresses.rst
Posted by Andrea Bolognani 4 years ago
On Wed, 2020-04-15 at 08:47 +0200, Cornelia Huck wrote:
> On Tue, 14 Apr 2020 19:53:05 +0200
> Andrea Bolognani <abologna@redhat.com> wrote:
> > +More complex cases
> > +==================
> 
> (...)
> 
> I'm wondering whether it is worth mentioning zPCI under 'More complex
> cases', or maybe under 'Completely wacky cases', as the PCI addresses a
> Linux guest will generate do not have any relation to whatever
> addresses are used in the XML at all, but only to the zPCI attributes?

It could be an interesting example, sure! Would you mind writing a
few lines about it? I don't have easy access to zPCI-capable s390x
hardware.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt PATCH] docs: Add pci-addresses.rst
Posted by Christian Ehrhardt 4 years ago
Hi Andrea,
I saw this change committed, and a later push of mine reported the
following while running the pipelines:

../docs/pci-addresses.rst:7:the the
build-aux/syntax-check.mk: doubled words
make: *** [../build-aux/syntax-check.mk:1727: sc_prohibit_doubled_word] Error 1

This is due to:
  2923e7a3 docs: Add pci-addresses.rst

Would you mind providing a fixup, as otherwise all other commits
pushed will fail on that pipeline?
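
For anyone who wants to catch this class of typo before pushing, a rough
local approximation of a doubled-word check might look like the sketch
below. This is not the actual ``sc_prohibit_doubled_word`` rule from
``syntax-check.mk``; the regular expression and the ``find_doubled_words``
helper are assumptions made for illustration.

```python
import re

# A word followed by whitespace (possibly a line break) and the same word.
DOUBLED = re.compile(r"\b([A-Za-z]+)\s+\1\b", re.IGNORECASE)


def find_doubled_words(text):
    """Return (line_number, word) for every doubled word found in text."""
    return [
        (text.count("\n", 0, m.start()) + 1, m.group(1))
        for m in DOUBLED.finditer(text)
    ]


sample = (
    "When discussing PCI addresses, it's important to understand the the\n"
    "relationship between the addresses that can be seen in the domain XML\n"
)
hits = find_doubled_words(sample)
for line, word in hits:
    print(f"line {line}: doubled word '{word}'")
```

Because ``\s+`` also matches newlines, the sketch flags a word repeated
across a line break as well, reporting the line on which the first copy
appears.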





--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd


Re: [libvirt PATCH] docs: Add pci-addresses.rst
Posted by Andrea Bolognani 4 years ago
On Wed, 2020-04-15 at 10:46 +0200, Christian Ehrhardt wrote:
> Hi Andrea,
> I saw this change committed and a latter push of mine has reported the
> following while running the pipelines:
> 
> ../docs/pci-addresses.rst:7:the the
> build-aux/syntax-check.mk: doubled words
> make: *** [../build-aux/syntax-check.mk:1727: sc_prohibit_doubled_word] Error 1
> 
> This is due to:
>   2923e7a3 docs: Add pci-addresses.rst
> 
> Would you mind to provide a fixup as otherwise all other commits
> pushed will crash on that pipeline?

Done.

  commit e767f509b2bd2ec7a927515e37ee14b10e338313
  Author: Andrea Bolognani <abologna@redhat.com>
  Date:   Wed Apr 15 10:49:42 2020 +0200

    docs: Fix word repetition in pci-addresses.rst

    Fixes: 2923e7a3dd984c46202703d390dce3ff4ea4048c
    Reported-by: Ján Tomko <jtomko@redhat.com>
    Signed-off-by: Andrea Bolognani <abologna@redhat.com>

I had not seen your message before now, sorry O:-)

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt PATCH] docs: Add pci-addresses.rst
Posted by Laine Stump 4 years ago
On 4/14/20 1:53 PM, Andrea Bolognani wrote:
> This document describes the relationship between PCI addresses as
> seen in the domain XML and by the guest OS, which is a topic that
> people get confused by time and time again.
>
> Signed-off-by: Andrea Bolognani <abologna@redhat.com>
> ---
>   docs/formatdomain.html.in |   6 +-
>   docs/pci-addresses.rst    | 184 ++++++++++++++++++++++++++++++++++++++
>   2 files changed, 189 insertions(+), 1 deletion(-)
>   create mode 100644 docs/pci-addresses.rst
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 6f43976815..0077666862 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -4286,7 +4286,11 @@
>           element with no other attributes as an explicit request to
>           assign a PCI address for the device rather than some other
>           type of address that may also be appropriate for that same
> -        device (e.g. virtio-mmio).
> +        device (e.g. virtio-mmio).<br/>
> +        The relationship between the PCI addresses configured in the domain
> +        XML and those seen by the guest OS can sometime seem confusing: a
> +        separate document describes <a href="pci-addresses.html">how PCI
> +        addresses work</a> in more detail.
>         </dd>
>         <dt><code>drive</code></dt>
>         <dd>Drive addresses have the following additional
> diff --git a/docs/pci-addresses.rst b/docs/pci-addresses.rst
> new file mode 100644
> index 0000000000..96c6466899
> --- /dev/null
> +++ b/docs/pci-addresses.rst
> @@ -0,0 +1,184 @@
> +========================================
> +PCI addresses in domain XML and guest OS
> +========================================
> +
> +.. contents::
> +
> +When discussing PCI addresses, it's important to understand the the
> +relationship between the addresses that can be seen in the domain XML
> +and those that are visible inside the guest OS.
> +
> +
> +Simple cases
> +============
> +
> +When the PCI topology of the VM is very simple, the PCI addresses
> +will usually match.
> +
> +For example, the domain XML snippet
> +
> +::
> +
> +  <controller type='pci' index='0' model='pcie-root'/>
> +  <controller type='pci' index='1' model='pcie-root-port'>
> +    <model name='pcie-root-port'/>
> +    <target chassis='1' port='0x8'/>
> +    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
> +  </controller>
> +  <interface type='network'>
> +    <source network='default'/>
> +    <model type='virtio'/>
> +    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
> +  </interface>
> +
> +will result in the PCI topology
> +
> +::
> +
> +  0000:00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +  0000:00:01.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
> +  0000:01:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
> +
> +showing up in the guest OS.
> +
> +The PCI address of the ``virtio-net`` adapter, ``0000:01:00.0``, is
> +the same in both cases, so there's no confusion.
> +
> +
> +More complex cases
> +==================
> +
> +In more complex cases, the PCI address visible in the domain XML will
> +correlate to the one seen by the guest OS in a less obvious way.
> +
> +pcie-expander-bus
> +-----------------
> +
> +This fairly uncommon device, which can be used with ``x86_64/q35``
> +guests, will help illustrate one such scenario.
> +
> +For example, the domain XML snippet
> +
> +::
> +
> +  <controller type='pci' index='0' model='pcie-root'/>
> +  <controller type='pci' index='1' model='pcie-expander-bus'>
> +    <model name='pxb-pcie'/>
> +    <target busNr='254'/>
> +    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
> +  </controller>
> +  <controller type='pci' index='2' model='pcie-root-port'>
> +    <model name='pcie-root-port'/>
> +    <target chassis='2' port='0x0'/>
> +    <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
> +  </controller>
> +  <interface type='network'>
> +    <source network='default'/>
> +    <model type='virtio'/>
> +    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
> +  </interface>
> +
> +will result in the PCI topology
> +
> +::
> +
> +  0000:00:00.0 Host bridge: Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
> +  0000:00:01.0 Host bridge: Red Hat, Inc. QEMU PCIe Expander bridge
> +  0000:fe:00.0 PCI bridge: Red Hat, Inc. QEMU PCIe Root port
> +  0000:ff:00.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
> +
> +showing up in the guest OS.
> +
> +This time the addresses don't match: this is because the ``busNr``
> +property for the ``pcie-expander-bus`` controller causes it to show
> +up as bus 254 (``0xfe`` in hexadecimal) instead of bus 1 as one might
> +expect based on its ``index`` property.
> +
> +How can the domain XML shown above work at all, then? Surely the
> +``pcie-root-port`` controller and the ``virtio-net`` adapter should
> +use ``bus=0xfe`` and ``bus=0xff`` respectively for the configuration
> +to be accepted by libvirt?
> +
> +As it turns out, that's not the case. The reason for this is that
> +QEMU, and consequently libvirt, uses the ``bus`` property of a
> +device's PCI address only to match it with the PCI controller that
> +has the same ``index`` property, and not to set the actual PCI
> +address, which is decided by the guest OS.
> +
> +So, by looking at the XML snippet above, we can see that the
> +``virtio-net`` adapter plugs into the ``pcie-root-port`` controller,
> +which plugs into the ``pcie-expander-bus`` controller, which plugs
> +into ``pcie-root``: the guest OS sees the same topology, but assigns
> +different PCI addresses to some of its component.
> +
> +The takeaway is that the *relationship* between controllers are the
> +very same whether you look at the domain XML or at the guest OS, but
> +the *actual PCI addresses* are not guaranteed to match and in fact,
> +except for the very simplest cases, they usually will not.


and it doesn't necessarily take a pcie-expander-bus to make the 
numbering appear "off". It really is 100% up to the guest OS what bus 
number is given to each bus that it discovers, and there's nothing the 
host can do about it. This is similar to the target device name for 
block devices - you can put <target dev='sde'/> in the config for a disk 
as much as you want and it's not going to make any difference - the 
guest OS will name it whatever it feels like naming it, and you'd better 
damn well like it! :-)


Sorry, that was a digression. What you've said is fine.
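
To make the point about guest-side numbering concrete, here is a toy model
in Python. It assumes one plausible policy (a depth-first walk in which
each bridge receives the next free bus number) and is not a description of
what any particular guest OS does; the ``Device`` class and the
``enumerate_bus`` helper are made up for the example. Under that
assumption it happens to reproduce both guest listings shown in the patch.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    slot: int
    is_bridge: bool = False
    children: list = field(default_factory=list)  # devices behind a bridge


def enumerate_bus(bus, devices, next_bus, out):
    """Record addresses for devices on 'bus'; number bridges depth-first.

    Returns the next unused bus number after the walk.
    """
    for dev in devices:
        out[dev.name] = f"0000:{bus:02x}:{dev.slot:02x}.0"
        if dev.is_bridge:
            secondary = next_bus
            next_bus = enumerate_bus(secondary, dev.children, secondary + 1, out)
    return next_bus


# Simple case: root port in slot 1 of bus 0, NIC behind it.
simple = {}
enumerate_bus(0, [Device("root-port", 1, True, [Device("nic", 0)])], 1, simple)

# Expander case: the expander's root bus starts at busNr=254 (0xfe).
expander = {}
enumerate_bus(254, [Device("root-port", 0, True, [Device("nic", 0)])], 255, expander)

print(simple)
print(expander)
```

In this model the only input taken from the host side is the starting bus
number of each root bus; everything below it is numbered by the walk
itself, which mirrors the observation that the host cannot dictate the
final numbers.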


> +
> +spapr-pci-host-bridge
> +---------------------
> +
> +This device, which is unique to ``ppc64/pseries`` guests, will help
> +illustrate another scenario.
> +
> +For example, the domain XML snippet
> +
> +::
> +
> +  <controller type='pci' index='0' model='pci-root'>
> +     <model name='spapr-pci-host-bridge'/>
> +     <target index='0'/>
> +   </controller>
> +   <controller type='pci' index='1' model='pci-root'>
> +     <model name='spapr-pci-host-bridge'/>
> +     <target index='1'/>
> +   </controller>
> +   <interface type='network'>
> +     <source network='default'/>
> +     <model type='virtio'/>
> +     <address type='pci' domain='0x0000' bus='0x01' slot='0x01' function='0x0'/>
> +   </interface>
> +
> +will result in the PCI topology
> +
> +::
> +
> +  0001:00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device
> +
> +showing up in the guest OS. Note that the two
> +``spapr-pci-host-bridge`` controllers are not listed.
> +
> +This time, in addition to the bus not matching just like in the
> +previous example, the interesting part is that the domain doesn't
> +match either: this is because each ``spapr-pci-host-bridge``
> +controller creates a separate PCI domain.
> +
> +Once again, while the PCI addresses seen in the domain XML and those
> +seen by the guest OS do not match, the relationships between the
> +various devices are preserved.
> +
> +
> +Device assignment
> +=================
> +
> +When using VFIO to assign host devices to a guest, an additional
> +caveat to keep in mind that the guest OS will base its decisions upon
> +the *target address* rather than the *source address*.
> +
> +For example, the domain XML snippet
> +
> +::
> +
> +  <hostdev mode='subsystem' type='pci' managed='yes'>
> +    <driver name='vfio'/>
> +    <source>
> +      <address domain='0x0001' bus='0x08' slot='0x00' function='0x0'/>
> +    </source>
> +    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
> +  </hostdev>
> +
> +will result in the device showing up as ``0000:00:01.0`` in the
> +guest OS rather than as ``0001:08:00.1``.


"... which is the address of the device *on the host*."


or something like that.


Reviewed-by: Laine Stump <laine@redhat.com>


> +
> +Of course, all the rules and behaviors described above still apply.
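
As a small illustration of the source/target distinction in the device
assignment example, the following Python sketch formats both addresses of
the ``<hostdev>`` element in the usual ``domain:bus:slot.function``
notation. The ``format_pci_address`` helper is hypothetical and only the
standard library is used; the target address is what the XML requests,
and the guest may still number things differently in more complex
topologies, as discussed above.

```python
import xml.etree.ElementTree as ET

HOSTDEV_XML = """
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0001' bus='0x08' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
</hostdev>
"""


def format_pci_address(addr):
    """Format an <address> element as DDDD:BB:SS.F."""
    domain = int(addr.get("domain"), 16)
    bus = int(addr.get("bus"), 16)
    slot = int(addr.get("slot"), 16)
    function = int(addr.get("function"), 16)
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}"


hostdev = ET.fromstring(HOSTDEV_XML)
# <source>/<address> is the device's address on the host.
host_address = format_pci_address(hostdev.find("source/address"))
# The direct child <address> is the target address requested for the guest.
target_address = format_pci_address(hostdev.find("address"))

print(f"host:   {host_address}")
print(f"target: {target_address}")
```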