[PATCH 3/5] API: Document more possibilities for @reason field of VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event

Peter Krempa posted 5 patches 7 months, 2 weeks ago
[PATCH 3/5] API: Document more possibilities for @reason field of VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event
Posted by Peter Krempa 7 months, 2 weeks ago
The VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event allows @reason to be
reported to the user, but currently we only report 'enospc'.

Extend the documentation so that we allow the passthrough of the
verbatim error, prefixed by 'other: ' in order to prevent collisions and
note that users must not attempt to parse them.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 include/libvirt/libvirt-domain.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index 2a4b81f4df..e031b23547 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -4790,8 +4790,12 @@ typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
  * If the I/O error is known to be caused by an ENOSPC condition in
  * the host (where resizing the disk to be larger will allow the guest
  * to be resumed as if nothing happened), @reason will be "enospc".
- * Otherwise, @reason will be "", although future strings may be added
- * if determination of other error types becomes possible.
+ *
+ * Otherwise, if the hypervisor reported an error @reason will be the verbatim
+ * error message from hypervisor prefixed by "other: ". Note that this error
+ * may not be stable and thus is only really usable for human use. In case
+ * when the hypervisor doesn't report the error @reason will be an empty string
+ * "".
  *
  * Since: 0.8.1
  */
-- 
2.48.1
Re: [PATCH 3/5] API: Document more possibilities for @reason field of VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event
Posted by Daniel P. Berrangé 7 months, 2 weeks ago
On Fri, Jan 24, 2025 at 05:33:04PM +0100, Peter Krempa wrote:
> The VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event allows @reason to be
> reported to the user, but currently we only report 'enospc'.
> 
> Extend the documentation so that we allow the passthrough of the
> verbatim error, prefixed by 'other: ' in order to prevent collisions and
> note that users must not attempt to parse them.
> 
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---
>  include/libvirt/libvirt-domain.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> index 2a4b81f4df..e031b23547 100644
> --- a/include/libvirt/libvirt-domain.h
> +++ b/include/libvirt/libvirt-domain.h
> @@ -4790,8 +4790,12 @@ typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
>   * If the I/O error is known to be caused by an ENOSPC condition in
>   * the host (where resizing the disk to be larger will allow the guest
>   * to be resumed as if nothing happened), @reason will be "enospc".
> - * Otherwise, @reason will be "", although future strings may be added
> - * if determination of other error types becomes possible.
> + *
> + * Otherwise, if the hypervisor reported an error @reason will be the verbatim
> + * error message from hypervisor prefixed by "other: ". Note that this error
> + * may not be stable and thus is only really usable for human use. In case
> + * when the hypervisor doesn't report the error @reason will be an empty string
> + * "".

Hmmm, this makes me feel pretty uncomfortable.

When set, the 'reason' field has clear long term stable supported semantics,
which applications are permitted to match against.

Essentially it is an enum field as currently defined.

This is now being turned into a free-format human readable string
which applications are told not to interpret at all.

Effectively we've overloaded the field to have two completely
different sets of semantics.

I don't think we should do this.

If we want a human readable string, it should be distinct from the
the enum reason we support already.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH 3/5] API: Document more possibilities for @reason field of VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event
Posted by Peter Krempa 7 months, 2 weeks ago
On Mon, Jan 27, 2025 at 10:31:28 +0000, Daniel P. Berrangé wrote:
> On Fri, Jan 24, 2025 at 05:33:04PM +0100, Peter Krempa wrote:
> > The VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event allows @reason to be
> > reported to the user, but currently we only report 'enospc'.
> > 
> > Extend the documentation so that we allow the passthrough of the
> > verbatim error, prefixed by 'other: ' in order to prevent collisions and
> > note that users must not attempt to parse them.
> > 
> > Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> > ---
> >  include/libvirt/libvirt-domain.h | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> > index 2a4b81f4df..e031b23547 100644
> > --- a/include/libvirt/libvirt-domain.h
> > +++ b/include/libvirt/libvirt-domain.h
> > @@ -4790,8 +4790,12 @@ typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
> >   * If the I/O error is known to be caused by an ENOSPC condition in
> >   * the host (where resizing the disk to be larger will allow the guest
> >   * to be resumed as if nothing happened), @reason will be "enospc".
> > - * Otherwise, @reason will be "", although future strings may be added
> > - * if determination of other error types becomes possible.
> > + *
> > + * Otherwise, if the hypervisor reported an error @reason will be the verbatim
> > + * error message from hypervisor prefixed by "other: ". Note that this error
> > + * may not be stable and thus is only really usable for human use. In case
> > + * when the hypervisor doesn't report the error @reason will be an empty string
> > + * "".
> 
> Hmmm, this makes me feel pretty uncomfortable.
> 
> When set, the 'reason' field has clear long term stable supported semantics,
> which applications are permitted to match against.
> 
> Essentially it is an enum field as currently defined.
> 
> This is now being turned into a free-format human readable string
> which applications are told not to interpret at all.
> 
> Effectively we've overloaded the field to have two completely
> different sets of semantics.
> 
> I don't think we should do this.
> 
> If we want a human readable string, it should be distinct from the
> the enum reason we support already.

You're right about this effectively being an 'enum' while not being an
enum. A bit of history how we ended here because it was at one point in
time passtrhough of arbitrary strings.

In certain ancient downstreams this was originally a field which was
passed through from qemu verbatim and could have also other values than
just 'enospc'. The current state was done after qemu formalized the
event upstream, where the only stable field we get is 'nospace' which
we've mapped to the exact string ('enospc') the downstream version did
especially since it was used by oVirt. 

No other value did get a stable flag in the event so in turn libvirt
didn't ever codify any other value, while still keeping the 'string'
field.

Now over time this did in fact become an enum with two possible options:
"" and "enospc". At this point it felt convenient to use this field as
it isn't really an enum, to encode a catch-all/default case for the
user-originated string rather than adding yet another libvirt event:

1) Adding event is pain.
2) I'd be a new interface, thus potential users such as Kubevirt would
   need to use new libvirt instead of using existing interface.
3) It is a string when the CPU looks at it.
4) Adding events is pain!

Original addition: 34dcbbb470fb8b93232b8bd709e949f9012a7462
De-facto formalization of enum: e9392e48d4e3b29809da7883b697d5caf3a09680
Re: [PATCH 3/5] API: Document more possibilities for @reason field of VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event
Posted by Daniel P. Berrangé 7 months, 2 weeks ago
On Mon, Jan 27, 2025 at 11:55:29AM +0100, Peter Krempa wrote:
> On Mon, Jan 27, 2025 at 10:31:28 +0000, Daniel P. Berrangé wrote:
> > On Fri, Jan 24, 2025 at 05:33:04PM +0100, Peter Krempa wrote:
> > > The VIR_DOMAIN_EVENT_ID_IO_ERROR_REASON event allows @reason to be
> > > reported to the user, but currently we only report 'enospc'.
> > > 
> > > Extend the documentation so that we allow the passthrough of the
> > > verbatim error, prefixed by 'other: ' in order to prevent collisions and
> > > note that users must not attempt to parse them.
> > > 
> > > Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> > > ---
> > >  include/libvirt/libvirt-domain.h | 8 ++++++--
> > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> > > index 2a4b81f4df..e031b23547 100644
> > > --- a/include/libvirt/libvirt-domain.h
> > > +++ b/include/libvirt/libvirt-domain.h
> > > @@ -4790,8 +4790,12 @@ typedef void (*virConnectDomainEventIOErrorCallback)(virConnectPtr conn,
> > >   * If the I/O error is known to be caused by an ENOSPC condition in
> > >   * the host (where resizing the disk to be larger will allow the guest
> > >   * to be resumed as if nothing happened), @reason will be "enospc".
> > > - * Otherwise, @reason will be "", although future strings may be added
> > > - * if determination of other error types becomes possible.
> > > + *
> > > + * Otherwise, if the hypervisor reported an error @reason will be the verbatim
> > > + * error message from hypervisor prefixed by "other: ". Note that this error
> > > + * may not be stable and thus is only really usable for human use. In case
> > > + * when the hypervisor doesn't report the error @reason will be an empty string
> > > + * "".
> > 
> > Hmmm, this makes me feel pretty uncomfortable.
> > 
> > When set, the 'reason' field has clear long term stable supported semantics,
> > which applications are permitted to match against.
> > 
> > Essentially it is an enum field as currently defined.
> > 
> > This is now being turned into a free-format human readable string
> > which applications are told not to interpret at all.
> > 
> > Effectively we've overloaded the field to have two completely
> > different sets of semantics.
> > 
> > I don't think we should do this.
> > 
> > If we want a human readable string, it should be distinct from the
> > the enum reason we support already.
> 
> You're right about this effectively being an 'enum' while not being an
> enum. A bit of history how we ended here because it was at one point in
> time passtrhough of arbitrary strings.
> 
> In certain ancient downstreams this was originally a field which was
> passed through from qemu verbatim and could have also other values than
> just 'enospc'. The current state was done after qemu formalized the
> event upstream, where the only stable field we get is 'nospace' which
> we've mapped to the exact string ('enospc') the downstream version did
> especially since it was used by oVirt. 
> 
> No other value did get a stable flag in the event so in turn libvirt
> didn't ever codify any other value, while still keeping the 'string'
> field.

AFAIR, QEMU never codified machine targetted reasons, as there was
debate over how much apps ought to know about the root cause. We
have enospc, as that was the one thing with a clear use case, where
mgmt apps could take specific actions.


> Now over time this did in fact become an enum with two possible options:
> "" and "enospc". At this point it felt convenient to use this field as
> it isn't really an enum, to encode a catch-all/default case for the
> user-originated string rather than adding yet another libvirt event:
> 
> 1) Adding event is pain.
> 2) I'd be a new interface, thus potential users such as Kubevirt would
>    need to use new libvirt instead of using existing interface.

Apps should not actually be looking at these "new" human targetted
"reason" strings though, so I don't think it is actually compelling
to expose this in the events. THis is a distinct use case from the
existing "reason" which is clearly intended for applications to
toggle behaviour from.

The only valid thing apps would do with the human targetted reason
is to record it as a log message. 

> 3) It is a string when the CPU looks at it.
> 4) Adding events is pain!

We don't need to add new events IMHO, we can expose it with the existing
virDomainGetMessages API which is intended for such human targetted
messages.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|