[PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent

Thomas Huth posted 1 patch 3 years, 9 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20200720102233.28768-1-thuth@redhat.com
src/qemu/qemu_driver.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
[PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Thomas Huth 3 years, 9 months ago
qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
However, the QEMU guest agent could also provide the device name in the
"dev" field of the response for other devices instead (well, at least after
fixing another problem in the current QEMU guest agent...). So if creating
the devAlias from the PCI information failed, let's fall back to the name
provided by the guest agent. This helps to fix the empty "Target" fields
that occur when running "virsh domfsinfo" on s390x where CCW devices are
used for the guest instead of PCI devices.

Also add a proper debug message here in case we completely failed to set the
device alias, since this problem here was very hard to debug: The only two
error messages that I've seen were "Unable to get filesystem information"
and "Unable to encode message payload" - which only indicates that something
went wrong in the RPC call. No debug message indicated the real problem, so
I had to learn the hard way why the RPC call failed (it apparently does not
like devAlias left to be NULL) and where the real problem comes from.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 src/qemu/qemu_driver.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index d185666ed8..e45c61ee20 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
         qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
         virDomainDiskDefPtr diskDef;
 
-        if (!(diskDef = virDomainDiskByAddress(vmdef,
-                                               &agentdisk->pci_controller,
-                                               agentdisk->bus,
-                                               agentdisk->target,
-                                               agentdisk->unit)))
-            continue;
-
-        ret->devAlias[i] = g_strdup(diskDef->dst);
+        diskDef = virDomainDiskByAddress(vmdef,
+                                         &agentdisk->pci_controller,
+                                         agentdisk->bus,
+                                         agentdisk->target,
+                                         agentdisk->unit);
+        if (diskDef != NULL)
+            ret->devAlias[i] = g_strdup(diskDef->dst);
+        else if (agentdisk->devnode != NULL)
+            ret->devAlias[i] = g_strdup(agentdisk->devnode);
+        else
+            VIR_DEBUG("Missing devnode name for '%s'.", ret->mountpoint);
     }
 
     return ret;
-- 
2.18.1

Re: [PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Daniel P. Berrangé 3 years, 9 months ago
On Mon, Jul 20, 2020 at 12:22:33PM +0200, Thomas Huth wrote:
> qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
> However, the QEMU guest agent could also provide the device name in the
> "dev" field of the response for other devices instead (well, at least after
> fixing another problem in the current QEMU guest agent...). So if creating
> the devAlias from the PCI information failed, let's fall back to the name
> provided by the guest agent. This helps to fix the empty "Target" fields
> that occur when running "virsh domfsinfo" on s390x where CCW devices are
> used for the guest instead of PCI devices.
> 
> Also add a proper debug message here in case we completely failed to set the
> device alias, since this problem here was very hard to debug: The only two
> error messages that I've seen were "Unable to get filesystem information"
> and "Unable to encode message payload" - which only indicates that something
> went wrong in the RPC call. No debug message indicated the real problem, so
> I had to learn the hard way why the RPC call failed (it apparently does not
> like devAlias left to be NULL) and where the real problem comes from.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  src/qemu/qemu_driver.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index d185666ed8..e45c61ee20 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
>          qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
>          virDomainDiskDefPtr diskDef;
>  
> -        if (!(diskDef = virDomainDiskByAddress(vmdef,
> -                                               &agentdisk->pci_controller,
> -                                               agentdisk->bus,
> -                                               agentdisk->target,
> -                                               agentdisk->unit)))
> -            continue;
> -
> -        ret->devAlias[i] = g_strdup(diskDef->dst);
> +        diskDef = virDomainDiskByAddress(vmdef,
> +                                         &agentdisk->pci_controller,
> +                                         agentdisk->bus,
> +                                         agentdisk->target,
> +                                         agentdisk->unit);

Hmmm, even on x86 I think the original code is broken, or at least fragile.
The libvirt "bus" number is not guaranteed to match the guest kernel PCI
"bus" number. I wonder if this has been validated for complex PCI-Express
topologies.

> +        if (diskDef != NULL)
> +            ret->devAlias[i] = g_strdup(diskDef->dst);
> +        else if (agentdisk->devnode != NULL)
> +            ret->devAlias[i] = g_strdup(agentdisk->devnode);

In the linked BZ the "devnode" field is not provided either. You mention
a related QEMU patch, which I guess fixes that issue in QEMU ? Can you point
to that fix.

> +        else
> +            VIR_DEBUG("Missing devnode name for '%s'.", ret->mountpoint);
>      }

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Thomas Huth 3 years, 9 months ago
On 20/07/2020 12.32, Daniel P. Berrangé wrote:
> On Mon, Jul 20, 2020 at 12:22:33PM +0200, Thomas Huth wrote:
>> qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
>> However, the QEMU guest agent could also provide the device name in the
>> "dev" field of the response for other devices instead (well, at least after
>> fixing another problem in the current QEMU guest agent...). So if creating
>> the devAlias from the PCI information failed, let's fall back to the name
>> provided by the guest agent. This helps to fix the empty "Target" fields
>> that occur when running "virsh domfsinfo" on s390x where CCW devices are
>> used for the guest instead of PCI devices.
>>
>> Also add a proper debug message here in case we completely failed to set the
>> device alias, since this problem here was very hard to debug: The only two
>> error messages that I've seen were "Unable to get filesystem information"
>> and "Unable to encode message payload" - which only indicates that something
>> went wrong in the RPC call. No debug message indicated the real problem, so
>> I had to learn the hard way why the RPC call failed (it apparently does not
>> like devAlias left to be NULL) and where the real problem comes from.
>>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  src/qemu/qemu_driver.c | 19 +++++++++++--------
>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index d185666ed8..e45c61ee20 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
>>          qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
>>          virDomainDiskDefPtr diskDef;
>>  
>> -        if (!(diskDef = virDomainDiskByAddress(vmdef,
>> -                                               &agentdisk->pci_controller,
>> -                                               agentdisk->bus,
>> -                                               agentdisk->target,
>> -                                               agentdisk->unit)))
>> -            continue;
>> -
>> -        ret->devAlias[i] = g_strdup(diskDef->dst);
>> +        diskDef = virDomainDiskByAddress(vmdef,
>> +                                         &agentdisk->pci_controller,
>> +                                         agentdisk->bus,
>> +                                         agentdisk->target,
>> +                                         agentdisk->unit);
> 
> Hmmm, even on x86 I think the original code is broken, or at least fragile.
> The libvirt "bus" number is not guaranteed to match the guest kernel PCI
> "bus" number. I wonder if this has been validated for complex PCI-Express
> topologies.

I also thought already that maybe I should revert the logic and try
agentdisk->devnode first and only fall back to virDomainDiskByAddress()
if it is not set? ... but that might result in different behavior in
some cases, so I did not use that approach in the this version of my patch.

>> +        if (diskDef != NULL)
>> +            ret->devAlias[i] = g_strdup(diskDef->dst);
>> +        else if (agentdisk->devnode != NULL)
>> +            ret->devAlias[i] = g_strdup(agentdisk->devnode);
> 
> In the linked BZ the "devnode" field is not provided either. You mention
> a related QEMU patch, which I guess fixes that issue in QEMU ? Can you point
> to that fix.

I just sent the related patches to qemu-devel (look for the "Allow
guest-get-fsinfo also for non-PCI devices" patch series ... I've also
put you on CC:).

 Thomas

Re: [PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Thomas Huth 3 years, 8 months ago
On 20/07/2020 12.22, Thomas Huth wrote:
> qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
> However, the QEMU guest agent could also provide the device name in the
> "dev" field of the response for other devices instead (well, at least after
> fixing another problem in the current QEMU guest agent...). So if creating
> the devAlias from the PCI information failed, let's fall back to the name
> provided by the guest agent. This helps to fix the empty "Target" fields
> that occur when running "virsh domfsinfo" on s390x where CCW devices are
> used for the guest instead of PCI devices.
> 
> Also add a proper debug message here in case we completely failed to set the
> device alias, since this problem here was very hard to debug: The only two
> error messages that I've seen were "Unable to get filesystem information"
> and "Unable to encode message payload" - which only indicates that something
> went wrong in the RPC call. No debug message indicated the real problem, so
> I had to learn the hard way why the RPC call failed (it apparently does not
> like devAlias left to be NULL) and where the real problem comes from.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  src/qemu/qemu_driver.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index d185666ed8..e45c61ee20 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
>          qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
>          virDomainDiskDefPtr diskDef;
>  
> -        if (!(diskDef = virDomainDiskByAddress(vmdef,
> -                                               &agentdisk->pci_controller,
> -                                               agentdisk->bus,
> -                                               agentdisk->target,
> -                                               agentdisk->unit)))
> -            continue;
> -
> -        ret->devAlias[i] = g_strdup(diskDef->dst);
> +        diskDef = virDomainDiskByAddress(vmdef,
> +                                         &agentdisk->pci_controller,
> +                                         agentdisk->bus,
> +                                         agentdisk->target,
> +                                         agentdisk->unit);
> +        if (diskDef != NULL)
> +            ret->devAlias[i] = g_strdup(diskDef->dst);
> +        else if (agentdisk->devnode != NULL)
> +            ret->devAlias[i] = g_strdup(agentdisk->devnode);
> +        else
> +            VIR_DEBUG("Missing devnode name for '%s'.", ret->mountpoint);
>      }
>  
>      return ret;
> 

Friendly ping!

Any more comments here? Is it ok this way or shall I change the order?

 Thomas

Re: [PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Thomas Huth 3 years, 7 months ago
On 07/08/2020 18.22, Thomas Huth wrote:
> On 20/07/2020 12.22, Thomas Huth wrote:
>> qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
>> However, the QEMU guest agent could also provide the device name in the
>> "dev" field of the response for other devices instead (well, at least after
>> fixing another problem in the current QEMU guest agent...). So if creating
>> the devAlias from the PCI information failed, let's fall back to the name
>> provided by the guest agent. This helps to fix the empty "Target" fields
>> that occur when running "virsh domfsinfo" on s390x where CCW devices are
>> used for the guest instead of PCI devices.
>>
>> Also add a proper debug message here in case we completely failed to set the
>> device alias, since this problem here was very hard to debug: The only two
>> error messages that I've seen were "Unable to get filesystem information"
>> and "Unable to encode message payload" - which only indicates that something
>> went wrong in the RPC call. No debug message indicated the real problem, so
>> I had to learn the hard way why the RPC call failed (it apparently does not
>> like devAlias left to be NULL) and where the real problem comes from.
>>
>> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>  src/qemu/qemu_driver.c | 19 +++++++++++--------
>>  1 file changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index d185666ed8..e45c61ee20 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
>>          qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
>>          virDomainDiskDefPtr diskDef;
>>  
>> -        if (!(diskDef = virDomainDiskByAddress(vmdef,
>> -                                               &agentdisk->pci_controller,
>> -                                               agentdisk->bus,
>> -                                               agentdisk->target,
>> -                                               agentdisk->unit)))
>> -            continue;
>> -
>> -        ret->devAlias[i] = g_strdup(diskDef->dst);
>> +        diskDef = virDomainDiskByAddress(vmdef,
>> +                                         &agentdisk->pci_controller,
>> +                                         agentdisk->bus,
>> +                                         agentdisk->target,
>> +                                         agentdisk->unit);
>> +        if (diskDef != NULL)
>> +            ret->devAlias[i] = g_strdup(diskDef->dst);
>> +        else if (agentdisk->devnode != NULL)
>> +            ret->devAlias[i] = g_strdup(agentdisk->devnode);
>> +        else
>> +            VIR_DEBUG("Missing devnode name for '%s'.", ret->mountpoint);
>>      }
>>  
>>      return ret;
>>
> 
> Friendly ping!
> 
> Any more comments here? Is it ok this way or shall I change the order?

Ping again! Is the patch ok, or do I have to rework it?

 Thomas

Re: [PATCH] qemu_driver.c: Fix domfsinfo for non-PCI device information from guest agent
Posted by Daniel P. Berrangé 3 years, 7 months ago
On Wed, Sep 02, 2020 at 06:26:02PM +0200, Thomas Huth wrote:
> On 07/08/2020 18.22, Thomas Huth wrote:
> > On 20/07/2020 12.22, Thomas Huth wrote:
> >> qemuAgentFSInfoToPublic() currently only sets the devAlias for PCI devices.
> >> However, the QEMU guest agent could also provide the device name in the
> >> "dev" field of the response for other devices instead (well, at least after
> >> fixing another problem in the current QEMU guest agent...). So if creating
> >> the devAlias from the PCI information failed, let's fall back to the name
> >> provided by the guest agent. This helps to fix the empty "Target" fields
> >> that occur when running "virsh domfsinfo" on s390x where CCW devices are
> >> used for the guest instead of PCI devices.
> >>
> >> Also add a proper debug message here in case we completely failed to set the
> >> device alias, since this problem here was very hard to debug: The only two
> >> error messages that I've seen were "Unable to get filesystem information"
> >> and "Unable to encode message payload" - which only indicates that something
> >> went wrong in the RPC call. No debug message indicated the real problem, so
> >> I had to learn the hard way why the RPC call failed (it apparently does not
> >> like devAlias left to be NULL) and where the real problem comes from.
> >>
> >> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1755075
> >> Signed-off-by: Thomas Huth <thuth@redhat.com>
> >> ---
> >>  src/qemu/qemu_driver.c | 19 +++++++++++--------
> >>  1 file changed, 11 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> >> index d185666ed8..e45c61ee20 100644
> >> --- a/src/qemu/qemu_driver.c
> >> +++ b/src/qemu/qemu_driver.c
> >> @@ -21935,14 +21935,17 @@ qemuAgentFSInfoToPublic(qemuAgentFSInfoPtr agent,
> >>          qemuAgentDiskInfoPtr agentdisk = agent->disks[i];
> >>          virDomainDiskDefPtr diskDef;
> >>  
> >> -        if (!(diskDef = virDomainDiskByAddress(vmdef,
> >> -                                               &agentdisk->pci_controller,
> >> -                                               agentdisk->bus,
> >> -                                               agentdisk->target,
> >> -                                               agentdisk->unit)))
> >> -            continue;
> >> -
> >> -        ret->devAlias[i] = g_strdup(diskDef->dst);
> >> +        diskDef = virDomainDiskByAddress(vmdef,
> >> +                                         &agentdisk->pci_controller,
> >> +                                         agentdisk->bus,
> >> +                                         agentdisk->target,
> >> +                                         agentdisk->unit);
> >> +        if (diskDef != NULL)
> >> +            ret->devAlias[i] = g_strdup(diskDef->dst);
> >> +        else if (agentdisk->devnode != NULL)
> >> +            ret->devAlias[i] = g_strdup(agentdisk->devnode);
> >> +        else
> >> +            VIR_DEBUG("Missing devnode name for '%s'.", ret->mountpoint);
> >>      }
> >>  
> >>      return ret;
> >>
> > 
> > Friendly ping!
> > 
> > Any more comments here? Is it ok this way or shall I change the order?
> 
> Ping again! Is the patch ok, or do I have to rework it?

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

and i'll push it shortly. The related QEMU fix is also queued for 5.2
I see.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|