[libvirt] [PATCH] virhostdev: Fix PCI devices are still attatched to stub driver bug

Wu Zongyong posted 1 patch 6 years, 2 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/libvirt tags/patchew/1535700867-86300-1-git-send-email-cordius.wu@huawei.com
Test syntax-check passed
src/util/virhostdev.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
[libvirt] [PATCH] virhostdev: Fix PCI devices are still attatched to stub driver bug
Posted by Wu Zongyong 6 years, 2 months ago
Currently, PCI devices will not be rebound to host drivers but
attached to the stub driver when:

1) use libvirt to start a virtual machine with PCI devices assigned,
then stop libvirtd process and shutdown the virtual machine. Finally,
PCI devices are still bound to the stub driver instead of host drivers
after libvirt start again.
2) use libvirt to shutdown a virtual machine wtih PCI devices assigned,
then stop libvirtd process before libvirt try to rebind PCI devices to
host drivers. Finally, PCI devices are still bound to the stub driver
after libvirt start again.

Notice that the comment on the top of virPCIDeviceDetach as follows:

activeDevs should be a list of all PCI devices currently in use by a
domain.inactiveDevs is a list of all PCI devices that libvirt has
detached from the host driver + attached to the stub driver, but
hasn't yet assigned to a domain.

It's not reasonable that libvirt filter out devices that are either not
active or not used by the current domain and driver. For devices belong
to domains that has been shutdown before libvirt start, we should put
them into inactiveDevs if then meet the condition that PCI devices have
been detached from the host driver + attached to the stub driver.

Moreover, we set orignal states of PCI devices when build PCI devices
list in virHostdevGetPCIHostDeviceList if states has been set in struct
virDomainHostdevDefPtr.

Signed-off-by: Wu Zongyong <cordius.wu@huawei.com>
---
 src/util/virhostdev.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index ca79c37..ecf95e3 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -235,6 +235,7 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs)
     for (i = 0; i < nhostdevs; i++) {
         virDomainHostdevDefPtr hostdev = hostdevs[i];
         virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci;
+        virDomainHostdevOrigStatesPtr origstates = &hostdev->origstates;
         virPCIDevicePtr pci;
 
         if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS)
@@ -262,6 +263,10 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs)
             virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_XEN);
         else
             virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_KVM);
+
+        virPCIDeviceSetUnbindFromStub(pci, origstates->states.pci.unbind_from_stub);
+        virPCIDeviceSetRemoveSlot(pci, origstates->states.pci.remove_slot);
+        virPCIDeviceSetReprobe(pci, origstates->states.pci.reprobe);
     }
 
     return pcidevs;
@@ -1008,8 +1013,19 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr,
                 continue;
             }
         } else {
-            virPCIDeviceListDel(pcidevs, pci);
-            continue;
+            int stub = virPCIDeviceGetStubDriver(pci);
+            if (stub > VIR_PCI_STUB_DRIVER_NONE &&
+                stub < VIR_PCI_STUB_DRIVER_LAST) {
+                /* The device is bound to a known stub driver: add a copy
+                 * to the inactive list */
+                VIR_DEBUG("Adding PCI device %s to inactive list",
+                          virPCIDeviceGetName(pci));
+                if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) {
+                    VIR_ERROR(_("Failed to add PCI device %s to the inactive list"),
+                              virGetLastErrorMessage());
+                    virResetLastError();
+                }
+            }
         }
 
         i++;
-- 
1.9.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] virhostdev: Fix PCI devices are still attatched to stub driver bug
Posted by John Ferlan 6 years, 2 months ago
$SUBJ:

s/attatched/attached

s/bug//

On 08/31/2018 03:34 AM, Wu Zongyong wrote:

So, first off - I believe there are two things going on in this one
patch. Even though there is "some relationship" between the issues, the
libvirtd restart is kind of a corner case, while the change to re-add
the device to inactiveDevs would affect both this libvirtd restart case
*and* the normal processing. It's taken me some time to come to that
conclusion - lots of details follow, but I'm also willing to accept the
changes are "related enough" for just one patch.

> Currently, PCI devices will not be rebound to host drivers but
> attached to the stub driver when:
> 
> 1) use libvirt to start a virtual machine with PCI devices assigned,
> then stop libvirtd process and shutdown the virtual machine. Finally,
> PCI devices are still bound to the stub driver instead of host drivers
> after libvirt start again.
> 2) use libvirt to shutdown a virtual machine wtih PCI devices assigned,

s/wtih/with

> then stop libvirtd process before libvirt try to rebind PCI devices to

Be specific about which API isn't being called - it really will help.
Making a reviewer figure it out in very specific/particular cases such
as this leads to "review delay". Is qemuHostdevReAttachDomainDevices,
called from qemuProcessStop, what you were referring to? That is,
qemuProcessStop not getting called because the monitor channel is
closed and thus the event for shutdown won't be triggered.

> host drivers. Finally, PCI devices are still bound to the stub driver
> after libvirt start again.

Since you've done the research and so that I'm sure I'm following your
logic - it seems like you're chasing corner cases based on libvirtd
restart. In one instance the order is:

    virsh start $dom
    service libvirtd stop  (or some other means?)
    <shutdown> the guest from within the $dom

and the other instance is

    virsh start $dom
    virsh destroy $dom
    service libvirtd stop

and issues as a result of the subsequent "service libvirtd start".

In the second instance it's not clear how your timing is "just right"
such that libvirtd can be stopped before the PCI devices can be returned
to their host drivers, but it's essentially the same as the first
scenario, at least with respect to the subsequent libvirtd restart and
the discovery that the guest is no longer running and needs to be
processed.

What may also have helped is if the commit message indicated the
"normal" path taken that would allow the drivers to be rebound properly,
even if it's only after the "---", as backup information to assist review.

In edge cases like this, it's far easier to cut out stuff from a commit
message than it is to put yourself into the frame of reference of the
patch submitter.

Still, what your patch does is use the virDomainHostdevOrigStatesPtr
for a domain that was shut down, but when libvirtd restarts. I had to
go a long way back to find when the OrigStates code was added - commit
d84b36263 in particular. In that case, the order was start domain,
restart libvirtd, and destroy domain.

So the slight difference to the condition for this patch is that the
domain was shut down while libvirtd was stopped.

In any case, all of the above is "one issue"...

Leaving the rest as a "second issue"

> 
> Notice that the comment on the top of virPCIDeviceDetach as follows:
> 
> activeDevs should be a list of all PCI devices currently in use by a
> domain.inactiveDevs is a list of all PCI devices that libvirt has

  * activeDevs is a list ...
  * inactiveDevs is a list ...

> detached from the host driver + attached to the stub driver, but
> hasn't yet assigned to a domain.
> 
> It's not reasonable that libvirt filter out devices that are either not
> active or not used by the current domain and driver. For devices belong

s/belong/that belong/

> to domains that has been shutdown before libvirt start, we should put

s/has/have

s/libvirt start, we should put them/libvirtd starts, they should be placed

> them into inactiveDevs if then meet the condition that PCI devices have

I'm finding it hard to read/parse/rewrite the rest of this sentence.

> been detached from the host driver + attached to the stub driver.
> 
> Moreover, we set orignal states of PCI devices when build PCI devices

s/we set orignal/set the original/

s/build/building/

> list in virHostdevGetPCIHostDeviceList if states has been set in struct
> virDomainHostdevDefPtr.


So the "second issue" (as I see it) is that ReAttach processing wasn't
handling returning devices that had stub driver attachment (at some
point in time). This issue occurs regardless of the libvirtd restart and
domain shutdown processing order. Even though it's related, it's
different and would need its own patch.

> 
> Signed-off-by: Wu Zongyong <cordius.wu@huawei.com>
> ---
>  src/util/virhostdev.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
> index ca79c37..ecf95e3 100644
> --- a/src/util/virhostdev.c
> +++ b/src/util/virhostdev.c
> @@ -235,6 +235,7 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs)
>      for (i = 0; i < nhostdevs; i++) {
>          virDomainHostdevDefPtr hostdev = hostdevs[i];
>          virDomainHostdevSubsysPCIPtr pcisrc = &hostdev->source.subsys.u.pci;
> +        virDomainHostdevOrigStatesPtr origstates = &hostdev->origstates;
>          virPCIDevicePtr pci;
>  
>          if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS)
> @@ -262,6 +263,10 @@ virHostdevGetPCIHostDeviceList(virDomainHostdevDefPtr *hostdevs, int nhostdevs)
>              virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_XEN);
>          else
>              virPCIDeviceSetStubDriver(pci, VIR_PCI_STUB_DRIVER_KVM);
> +
> +        virPCIDeviceSetUnbindFromStub(pci, origstates->states.pci.unbind_from_stub);
> +        virPCIDeviceSetRemoveSlot(pci, origstates->states.pci.remove_slot);
> +        virPCIDeviceSetReprobe(pci, origstates->states.pci.reprobe);

This looks a lot like what virHostdevUpdateActivePCIDevices is doing -
in fact I think that is the libvirtd restart "bug" you're chasing -
qemuProcessReconnect will call qemuHostdevUpdateActiveDomainDevices
after the call to qemuConnectMonitor - at least w/r/t this PCI stub
interaction.

Since the domain was stopped, connecting to the monitor will fail (in
my case in qemuMonitorOpenUnix), thus falling into qemuProcessStop, but
without having updated (or ReAttach'ed) the PCI devices. So, in your
environment without these two changes - could you check to see what
happens if the order is changed? Does that fix the issue?

That is move the call to qemuHostdevUpdateActiveDomainDevices to right
after qemuProcessPrepareAllowReboot.
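
To make that concrete, here is a rough sketch of the reordering (the
surrounding qemuProcessReconnect() logic, real signatures, and error
handling are all elided or simplified - this is illustration only, not
the actual source):

    /* Inside qemuProcessReconnect(), sketched: refresh the hostdev
     * state before trying the monitor, so a guest that died while
     * libvirtd was down still has its PCI stub/original state
     * populated by the time the failure path reaches qemuProcessStop() */
    qemuProcessPrepareAllowReboot(vm);                      /* existing */
    qemuHostdevUpdateActiveDomainDevices(driver, vm->def);  /* moved up */

    if (qemuConnectMonitor(driver, vm /* , ... */) < 0)
        goto error;  /* dead guest -> qemuProcessStop() with state set */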

Some more "detailed" thoughts I had on this while investigating...

Both virHostdevPreparePCIDevices and virHostdevReAttachPCIDevices will
call virHostdevGetPCIHostDeviceList. However, in the case of the former,
"Step 7" is "set the original states for hostdev def".

Following the call traces for virHostdevPreparePCIDevices eventually
leads back to:

  1. qemuMigrationDstPrepareAny (migration)
  2. qemuProcessStart (guest startup)
  3. qemuDomainAttachHostDevice (hotplug).

IOW: This is the "normal" running mode.

So I have a concern regarding the "order of operation" in this case,
especially since step 7 has the caveat of (actual) being true before
setting the bools.

Considering the "other path" through virHostdevReAttachPCIDevices,
callers are:

   1. libxlDomainAttachHostPCIDevice (an error: path)
   2. libxlDomainDetachHostPCIDevice (normal path)
   3. qemuHostdevReAttachPCIDevices (normal path)
   4. virHostdevReAttachDomainDevices (libxl domain close processing)

Focusing on qemuHostdevReAttachPCIDevices I find 3 callers:

   1. qemuHostdevReAttachDomainDevices (called by qemuProcessStop)
   2. qemuDomainAttachHostPCIDevice (hotplug error processing)
   3. qemuDomainRemovePCIHostDevice (hot unplug processing).

So while setting things up seems to be the right thing to do in the
ReAttach case, in the normal attach case it doesn't seem so. Perhaps
too generic.

>      }
>  
>      return pcidevs;
> @@ -1008,8 +1013,19 @@ virHostdevReAttachPCIDevices(virHostdevManagerPtr mgr,
>                  continue;
>              }
>          } else {
> -            virPCIDeviceListDel(pcidevs, pci);
> -            continue;

The above was added in commit 4cdbff3d5 by Andrea and just removed
inactiveDevs from @pcidevs, since his change added both onto @pcidevs
(at least that's my reading of it).

But with your proposed change, rather than deleting the device from
@pcidevs, you're adding it to the inactiveDevs list *if* there's a stub.

So what happens if there isn't a stub?  From this point forward @pcidevs
would contain both active and inactive devices.

Then step 2 comes along and runs the @pcidevs "stealing" to place the
device on the inactiveDevs list.  So now wouldn't we have duplicate
devices there?

Now if the new code went before the above 2 lines, then probably no big
deal - that way we're adding back to the inactive list for these stub
drivers and we're not keeping the device in @pcidevs, meaning it
wouldn't be re-added to inactive afterwards (see the sketch below).

Hopefully this makes sense.
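
For illustration, here is roughly how the hunk would look with the new
code moved before those two lines - a sketch against the posted patch
only, untested, with the '%s' quoting suggested below folded in:

    } else {
        int stub = virPCIDeviceGetStubDriver(pci);

        if (stub > VIR_PCI_STUB_DRIVER_NONE &&
            stub < VIR_PCI_STUB_DRIVER_LAST) {
            /* Bound to a known stub driver: remember the device on the
             * inactive list before dropping it from @pcidevs */
            VIR_DEBUG("Adding PCI device '%s' to inactive list",
                      virPCIDeviceGetName(pci));
            if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) {
                VIR_ERROR(_("Failed to add PCI device '%s' to the inactive list"),
                          virGetLastErrorMessage());
                virResetLastError();
            }
        }

        /* Keep the original behaviour: the device isn't part of this
         * domain, so drop it from @pcidevs and the later "stealing"
         * step can't re-add it to the inactive list */
        virPCIDeviceListDel(pcidevs, pci);
        continue;
    }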

> +            int stub = virPCIDeviceGetStubDriver(pci);
> +            if (stub > VIR_PCI_STUB_DRIVER_NONE &&
> +                stub < VIR_PCI_STUB_DRIVER_LAST) {
> +                /* The device is bound to a known stub driver: add a copy
> +                 * to the inactive list */
> +                VIR_DEBUG("Adding PCI device %s to inactive list",

s/%s/'%s'/

(both above and below)

> +                          virPCIDeviceGetName(pci));
> +                if (virPCIDeviceListAddCopy(mgr->inactivePCIHostdevs, pci) < 0) {
> +                    VIR_ERROR(_("Failed to add PCI device %s to the inactive list"),
> +                              virGetLastErrorMessage());
> +                    virResetLastError();
> +                }
> +            }

For reference, this is a copy of what's in virHostdevPreparePCIDevices
step 2. I think you need to do some extra debugging and be sure you're
not duplicating things.  IOW: Add some debug code to dump the active and
inactive lists.
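
A sketch of that kind of debugging aid (a hypothetical, local-only
helper - virPCIDeviceListCount/virPCIDeviceListGet are the existing
list accessors; locking of the lists is ignored here):

    static void
    virHostdevDebugDumpPCILists(virHostdevManagerPtr mgr)
    {
        size_t i;

        /* Dump every device currently considered in use by a domain */
        for (i = 0; i < virPCIDeviceListCount(mgr->activePCIHostdevs); i++)
            VIR_DEBUG("active[%zu]: %s", i,
                      virPCIDeviceGetName(virPCIDeviceListGet(mgr->activePCIHostdevs, i)));

        /* Dump every device detached from its host driver but not assigned */
        for (i = 0; i < virPCIDeviceListCount(mgr->inactivePCIHostdevs); i++)
            VIR_DEBUG("inactive[%zu]: %s", i,
                      virPCIDeviceGetName(virPCIDeviceListGet(mgr->inactivePCIHostdevs, i)));
    }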

Although this is really long - I hope it all makes sense.

John

>          }
>  
>          i++;
> 

Re: [libvirt] [PATCH] virhostdev: Fix PCI devices are still attatched to stub driver bug
Posted by Wuzongyong (Euler Dept) 6 years, 2 months ago
> [...]
> 
> That is move the call to qemuHostdevUpdateActiveDomainDevices to right
> after qemuProcessPrepareAllowReboot.
> 
> [...]
> 
> John
Very sorry for the late reply; I have been occupied with a new project these days.
Also thanks for correcting my poor English.

Back to the question: you suggested moving the call to qemuHostdevUpdateActiveDomainDevices
to right after qemuProcessPrepareAllowReboot.
Indeed, adjusting the order fixes the problem. So maybe we don't need to discuss the other
aspects you mentioned? Everything you raised really makes sense; I hadn't considered it that deeply.

Thanks,
Zongyong Wu



Re: [libvirt] [PATCH] virhostdev: Fix PCI devices are still attatched to stub driver bug
Posted by John Ferlan 6 years, 2 months ago

On 09/20/2018 06:35 AM, Wuzongyong (Euler Dept) wrote:
>> [...]
> Very sorry for the late reply; I have been occupied with a new project these days.
> Also thanks for correcting my poor English.

No problem... We're all busy in different ways. The only hazard in
delayed follow-ups for me is short-term memory problems ;-). The
grammar/spelling is just part of the process - even people who are
native English speakers aren't always "good" at it.

> 
> Back to the question: you suggested moving the call to qemuHostdevUpdateActiveDomainDevices
> to right after qemuProcessPrepareAllowReboot.
> Indeed, adjusting the order fixes the problem. So maybe we don't need to discuss the other
> aspects you mentioned? Everything you raised really makes sense; I hadn't considered it that deeply.
> 
> Thanks,
> Zongyong Wu
> 
> 

If moving the call works, then send a patch with just that!  I thought
"logistically" it would work, but didn't have the empirical evidence.

If the "second issue" is no longer an issue, then fine, we can just
drop it and forget about it.

John
