[PATCH 00/21] qemu: Refactor domstats code to avoid error reporting

Peter Krempa posted 21 patches 9 months, 4 weeks ago
Failed in applying to current master (apply log)
src/ch/ch_driver.c             |   3 +-
src/conf/capabilities.c        |   9 +-
src/conf/domain_conf.c         |  69 ++-----
src/conf/numa_conf.c           |  18 +-
src/conf/virnetworkobj.c       |   3 -
src/hypervisor/domain_cgroup.c |   6 +-
src/libxl/libxl_driver.c       |   3 +-
src/libxl/xen_common.c         |   6 +-
src/qemu/qemu_command.c        |   3 +-
src/qemu/qemu_domain.c         |  10 +-
src/qemu/qemu_driver.c         | 341 ++++++++++++---------------------
src/qemu/qemu_monitor_json.c   |   5 +-
src/util/virbitmap.c           |  11 +-
src/util/vircgroup.c           |   5 +-
src/util/virperf.c             |  19 +-
src/vz/vz_sdk.c                |   3 +-
16 files changed, 178 insertions(+), 336 deletions(-)
[PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Peter Krempa 9 months, 4 weeks ago
The workers of qemuDomainGetStats should not report errors if they can't
fetch data; but rather omit the entries. Refactor the code to
de-incentivize error reporting.

Peter Krempa (21):
  qemuDomainGetStatsBlockExportHeader: Remove return value
  qemuDomainGetStatsBlockExportFrontend: Remove return value
  qemuDomainGetStatsBlockExportBackendStorage: Remove return value
  qemuDomainGetStatsOneBlockFallback: Remove return value
  qemuDomainGetStatsOneBlock: Remove return value
  qemuDomainStorageAlias: Remove NULL checks from callers
  qemuDomainGetStatsBlockExportHeader: Remove return value
  virBitmapFormat: Clarify returned values
  virDomainResctrlMonDefParse: Refactor temporary variables
  virDomainCputuneDefFormat: Refactor bitmap formatting
  virBitmapFormat: Don't check return value
  qemuDomainGetStatsCpuCgroup: Remove return value
  qemuDomainGetStatsCpuProc: Remove return value
  qemuDomainGetStatsCpuHaltPollTime: Remove return value
  qemuDomainGetStatsCpuCache: Don't error out
  virPerfReadEvent: Refactor to return -errno on failure
  qemuDomainGetStatsPerfOneEvent: Ignore erros from 'virPerfReadEvent'
  qemuDomainGetStatsIOThread: Don't error out if fetching iothread info
    fails
  qemuDomainGetStatsMemoryBandwidth: Don't error out
  qemuDomainGetStatsDirtyRate: Don't error out
  qemuDomainGetStats: Convert worker functions to void

 src/ch/ch_driver.c             |   3 +-
 src/conf/capabilities.c        |   9 +-
 src/conf/domain_conf.c         |  69 ++-----
 src/conf/numa_conf.c           |  18 +-
 src/conf/virnetworkobj.c       |   3 -
 src/hypervisor/domain_cgroup.c |   6 +-
 src/libxl/libxl_driver.c       |   3 +-
 src/libxl/xen_common.c         |   6 +-
 src/qemu/qemu_command.c        |   3 +-
 src/qemu/qemu_domain.c         |  10 +-
 src/qemu/qemu_driver.c         | 341 ++++++++++++---------------------
 src/qemu/qemu_monitor_json.c   |   5 +-
 src/util/virbitmap.c           |  11 +-
 src/util/vircgroup.c           |   5 +-
 src/util/virperf.c             |  19 +-
 src/vz/vz_sdk.c                |   3 +-
 16 files changed, 178 insertions(+), 336 deletions(-)

-- 
2.48.1
Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Ján Tomko 9 months, 4 weeks ago
On a Thursday in 2025, Peter Krempa wrote:
>The workers of qemuDomainGetStats should not report errors if they can't
>fetch data; but rather omit the entries. Refactor the code to
>de-incentivize error reporting.
>
>Peter Krempa (21):
>  qemuDomainGetStatsBlockExportHeader: Remove return value
>  qemuDomainGetStatsBlockExportFrontend: Remove return value
>  qemuDomainGetStatsBlockExportBackendStorage: Remove return value
>  qemuDomainGetStatsOneBlockFallback: Remove return value
>  qemuDomainGetStatsOneBlock: Remove return value
>  qemuDomainStorageAlias: Remove NULL checks from callers
>  qemuDomainGetStatsBlockExportHeader: Remove return value
>  virBitmapFormat: Clarify returned values
>  virDomainResctrlMonDefParse: Refactor temporary variables
>  virDomainCputuneDefFormat: Refactor bitmap formatting
>  virBitmapFormat: Don't check return value
>  qemuDomainGetStatsCpuCgroup: Remove return value
>  qemuDomainGetStatsCpuProc: Remove return value
>  qemuDomainGetStatsCpuHaltPollTime: Remove return value
>  qemuDomainGetStatsCpuCache: Don't error out
>  virPerfReadEvent: Refactor to return -errno on failure
>  qemuDomainGetStatsPerfOneEvent: Ignore erros from 'virPerfReadEvent'
>  qemuDomainGetStatsIOThread: Don't error out if fetching iothread info
>    fails
>  qemuDomainGetStatsMemoryBandwidth: Don't error out
>  qemuDomainGetStatsDirtyRate: Don't error out
>  qemuDomainGetStats: Convert worker functions to void
>
> src/ch/ch_driver.c             |   3 +-
> src/conf/capabilities.c        |   9 +-
> src/conf/domain_conf.c         |  69 ++-----
> src/conf/numa_conf.c           |  18 +-
> src/conf/virnetworkobj.c       |   3 -
> src/hypervisor/domain_cgroup.c |   6 +-
> src/libxl/libxl_driver.c       |   3 +-
> src/libxl/xen_common.c         |   6 +-
> src/qemu/qemu_command.c        |   3 +-
> src/qemu/qemu_domain.c         |  10 +-
> src/qemu/qemu_driver.c         | 341 ++++++++++++---------------------
> src/qemu/qemu_monitor_json.c   |   5 +-
> src/util/virbitmap.c           |  11 +-
> src/util/vircgroup.c           |   5 +-
> src/util/virperf.c             |  19 +-
> src/vz/vz_sdk.c                |   3 +-
> 16 files changed, 178 insertions(+), 336 deletions(-)
>

s/outputing/outputting/ in the commit messages

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano

Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Michal Prívozník 9 months, 4 weeks ago
On 2/20/25 10:01, Peter Krempa wrote:
> The workers of qemuDomainGetStats should not report errors if they can't
> fetch data; but rather omit the entries. Refactor the code to
> de-incentivize error reporting.
> 
> Peter Krempa (21):
>   qemuDomainGetStatsBlockExportHeader: Remove return value
>   qemuDomainGetStatsBlockExportFrontend: Remove return value
>   qemuDomainGetStatsBlockExportBackendStorage: Remove return value
>   qemuDomainGetStatsOneBlockFallback: Remove return value
>   qemuDomainGetStatsOneBlock: Remove return value
>   qemuDomainStorageAlias: Remove NULL checks from callers
>   qemuDomainGetStatsBlockExportHeader: Remove return value
>   virBitmapFormat: Clarify returned values
>   virDomainResctrlMonDefParse: Refactor temporary variables
>   virDomainCputuneDefFormat: Refactor bitmap formatting
>   virBitmapFormat: Don't check return value
>   qemuDomainGetStatsCpuCgroup: Remove return value
>   qemuDomainGetStatsCpuProc: Remove return value
>   qemuDomainGetStatsCpuHaltPollTime: Remove return value
>   qemuDomainGetStatsCpuCache: Don't error out
>   virPerfReadEvent: Refactor to return -errno on failure
>   qemuDomainGetStatsPerfOneEvent: Ignore erros from 'virPerfReadEvent'
>   qemuDomainGetStatsIOThread: Don't error out if fetching iothread info
>     fails
>   qemuDomainGetStatsMemoryBandwidth: Don't error out
>   qemuDomainGetStatsDirtyRate: Don't error out
>   qemuDomainGetStats: Convert worker functions to void
> 
>  src/ch/ch_driver.c             |   3 +-
>  src/conf/capabilities.c        |   9 +-
>  src/conf/domain_conf.c         |  69 ++-----
>  src/conf/numa_conf.c           |  18 +-
>  src/conf/virnetworkobj.c       |   3 -
>  src/hypervisor/domain_cgroup.c |   6 +-
>  src/libxl/libxl_driver.c       |   3 +-
>  src/libxl/xen_common.c         |   6 +-
>  src/qemu/qemu_command.c        |   3 +-
>  src/qemu/qemu_domain.c         |  10 +-
>  src/qemu/qemu_driver.c         | 341 ++++++++++++---------------------
>  src/qemu/qemu_monitor_json.c   |   5 +-
>  src/util/virbitmap.c           |  11 +-
>  src/util/vircgroup.c           |   5 +-
>  src/util/virperf.c             |  19 +-
>  src/vz/vz_sdk.c                |   3 +-
>  16 files changed, 178 insertions(+), 336 deletions(-)
> 

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

Michal
Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Daniel P. Berrangé 9 months, 3 weeks ago
On Thu, Feb 20, 2025 at 10:01:13AM +0100, Peter Krempa wrote:
> The workers of qemuDomainGetStats should not report errors if they can't
> fetch data; but rather omit the entries. Refactor the code to
> de-incentivize error reporting.

Why shouldn't they report errors ?

Can you give an example of a problem seen with error reporting ? Silently
dropping data has the problem that its hard for people to realize that
anything has gone wrong, and in some cases dropped data will be not
distinguishable from unimplemented data.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Peter Krempa 9 months, 3 weeks ago
On Mon, Feb 24, 2025 at 09:33:51 +0000, Daniel P. Berrangé wrote:
> On Thu, Feb 20, 2025 at 10:01:13AM +0100, Peter Krempa wrote:
> > The workers of qemuDomainGetStats should not report errors if they can't
> > fetch data; but rather omit the entries. Refactor the code to
> > de-incentivize error reporting.
> 
> Why shouldn't they report errors ?
> 
> Can you give an example of a problem seen with error reporting ? Silently

The API itself allows querying multiple VMs for stats. Now if fetching
just one of the stats fields fails you'll get an error instead of
returning stats for everything else.

> dropping data has the problem that its hard for people to realize that
> anything has gone wrong, and in some cases dropped data will be not
> distinguishable from unimplemented data.

The API explicitly documents this behaviour and was intended from the
beginning:

virConnectGetAllDomainStats:


 * Note that entire stats groups or individual stat fields may be missing from
 * the output in case they are not supported by the given hypervisor, are not
 * applicable for the current state of the guest domain, or their retrieval
 * was not successful.
 *
 * Using 0 for @stats returns all stats groups supported by the given
 * hypervisor.
 *
 * Specifying VIR_CONNECT_GET_ALL_DOMAINS_STATS_ENFORCE_STATS as @flags makes
 * the function return error in case some of the stat types in @stats were
 * not recognized by the daemon.  However, even with this flag, a hypervisor
 * may omit individual fields within a known group if the information is not
 * available; as an extreme example, a supported group may produce zero
 * fields for offline domains if the statistics are meaningful only for a
 * running domain.
 *
Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Peter Krempa 9 months, 3 weeks ago
On Mon, Feb 24, 2025 at 10:42:05 +0100, Peter Krempa wrote:
> On Mon, Feb 24, 2025 at 09:33:51 +0000, Daniel P. Berrangé wrote:
> > On Thu, Feb 20, 2025 at 10:01:13AM +0100, Peter Krempa wrote:
> > > The workers of qemuDomainGetStats should not report errors if they can't
> > > fetch data; but rather omit the entries. Refactor the code to
> > > de-incentivize error reporting.
> > 
> > Why shouldn't they report errors ?

N.B: The error reporting infrastructure that was present and is removed
by this patch existed solely for reporting allocation errors when adding
the results to the list and the originally implemnted workers never
reported errors when fetching actual data; just irrecoverable
daemon-side errors.

It's just the new workers that did report errors, which was very likely
based on the fact that the callback function did require an error code
to be returned.
Re: [PATCH 00/21] qemu: Refactor domstats code to avoid error reporting
Posted by Daniel P. Berrangé 9 months, 3 weeks ago
On Mon, Feb 24, 2025 at 10:42:05AM +0100, Peter Krempa wrote:
> On Mon, Feb 24, 2025 at 09:33:51 +0000, Daniel P. Berrangé wrote:
> > On Thu, Feb 20, 2025 at 10:01:13AM +0100, Peter Krempa wrote:
> > > The workers of qemuDomainGetStats should not report errors if they can't
> > > fetch data; but rather omit the entries. Refactor the code to
> > > de-incentivize error reporting.
> > 
> > Why shouldn't they report errors ?
> > 
> > Can you give an example of a problem seen with error reporting ? Silently
> 
> The API itself allows querying multiple VMs for stats. Now if fetching
> just one of the stats fields fails you'll get an error instead of
> returning stats for everything else.
> 
> > dropping data has the problem that its hard for people to realize that
> > anything has gone wrong, and in some cases dropped data will be not
> > distinguishable from unimplemented data.
> 
> The API explicitly documents this behaviour and was intended from the
> beginning:

Ok, that's good.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|