[PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports

Vladimir Sementsov-Ogievskiy posted 1 patch 8 months, 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20240109131308.455371-1-vsementsov@yandex-team.ru
Maintainers: Markus Armbruster <armbru@redhat.com>, "Dr. David Alan Gilbert" <dave@treblig.org>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Eric Blake <eblake@redhat.com>
monitor/monitor.c    | 10 ++++++++++
qapi/block-core.json |  2 ++
2 files changed, 12 insertions(+)
[PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Vladimir Sementsov-Ogievskiy 8 months, 2 weeks ago
From: Leonid Kaplan <xeor@yandex-team.ru>

BLOCK_IO_ERROR events comes from guest, so we must throttle them.
We still want per-device throttling, so let's use device id as a key.

Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

v2: add Note: to QAPI doc

 monitor/monitor.c    | 10 ++++++++++
 qapi/block-core.json |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/monitor/monitor.c b/monitor/monitor.c
index 01ede1babd..ad0243e9d7 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
 static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
     /* Limit guest-triggerable events to 1 per second */
     [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
+    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
     [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
     [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
     [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
@@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
         hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
     }
 
+    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
+        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
+    }
+
     return hash;
 }
 
@@ -525,6 +530,11 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b)
                        qdict_get_str(evb->data, "qom-path"));
     }
 
+    if (eva->event == QAPI_EVENT_BLOCK_IO_ERROR) {
+        return !strcmp(qdict_get_str(eva->data, "device"),
+                       qdict_get_str(evb->data, "device"));
+    }
+
     return TRUE;
 }
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ca390c5700..32c2c2f030 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5559,6 +5559,8 @@
 # Note: If action is "stop", a STOP event will eventually follow the
 #     BLOCK_IO_ERROR event
 #
+# Note: This event is rate-limited.
+#
 # Since: 0.13
 #
 # Example:
-- 
2.34.1
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Kevin Wolf 2 months ago
Am 09.01.2024 um 14:13 hat Vladimir Sementsov-Ogievskiy geschrieben:
> From: Leonid Kaplan <xeor@yandex-team.ru>
> 
> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
> We still want per-device throttling, so let's use device id as a key.
> 
> Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> 
> v2: add Note: to QAPI doc
> 
>  monitor/monitor.c    | 10 ++++++++++
>  qapi/block-core.json |  2 ++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 01ede1babd..ad0243e9d7 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
>  static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
>      /* Limit guest-triggerable events to 1 per second */
>      [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
> +    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
>      [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
>      [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
>      [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
> @@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
>          hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
>      }
>  
> +    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
> +        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
> +    }

Using "device" only works with -drive, i.e. when the BlockBackend
actually has a name. In modern configurations with a -blockdev
referenced by -device, the BlockBackend doesn't have a name any more.

Maybe we should be using the qdev id (or more generally, QOM path) here,
but that's something the event doesn't even contain yet.

> +
>      return hash;
>  }
>  
> @@ -525,6 +530,11 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b)
>                         qdict_get_str(evb->data, "qom-path"));
>      }
>  
> +    if (eva->event == QAPI_EVENT_BLOCK_IO_ERROR) {
> +        return !strcmp(qdict_get_str(eva->data, "device"),
> +                       qdict_get_str(evb->data, "device"));
> +    }
> +
>      return TRUE;
>  }
>  
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index ca390c5700..32c2c2f030 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -5559,6 +5559,8 @@
>  # Note: If action is "stop", a STOP event will eventually follow the
>  #     BLOCK_IO_ERROR event
>  #
> +# Note: This event is rate-limited.
> +#
>  # Since: 0.13
>  #
>  # Example:

Kevin
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Markus Armbruster 2 months ago
Kevin Wolf <kwolf@redhat.com> writes:

> Am 09.01.2024 um 14:13 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> From: Leonid Kaplan <xeor@yandex-team.ru>
>> 
>> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
>> We still want per-device throttling, so let's use device id as a key.
>> 
>> Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> ---
>> 
>> v2: add Note: to QAPI doc
>> 
>>  monitor/monitor.c    | 10 ++++++++++
>>  qapi/block-core.json |  2 ++
>>  2 files changed, 12 insertions(+)
>> 
>> diff --git a/monitor/monitor.c b/monitor/monitor.c
>> index 01ede1babd..ad0243e9d7 100644
>> --- a/monitor/monitor.c
>> +++ b/monitor/monitor.c
>> @@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
>>  static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
>>      /* Limit guest-triggerable events to 1 per second */
>>      [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
>> +    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
>>      [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
>>      [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
>>      [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
>> @@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
>>          hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
>>      }
>>  
>> +    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
>> +        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
>> +    }
>
> Using "device" only works with -drive, i.e. when the BlockBackend
> actually has a name. In modern configurations with a -blockdev
> referenced by -device, the BlockBackend doesn't have a name any more.
>
> Maybe we should be using the qdev id (or more generally, QOM path) here,
> but that's something the event doesn't even contain yet.

Uh, does the event reliably identify the I/O error's node or not?  If
not, then that's a serious design defect.

There's @node-name.  Several commands use "either @device or @node-name"
to identify a node.  Is that sufficient here?

[...]
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Kevin Wolf 2 months ago
Am 19.07.2024 um 06:54 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 09.01.2024 um 14:13 hat Vladimir Sementsov-Ogievskiy geschrieben:
> >> From: Leonid Kaplan <xeor@yandex-team.ru>
> >> 
> >> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
> >> We still want per-device throttling, so let's use device id as a key.
> >> 
> >> Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> >> ---
> >> 
> >> v2: add Note: to QAPI doc
> >> 
> >>  monitor/monitor.c    | 10 ++++++++++
> >>  qapi/block-core.json |  2 ++
> >>  2 files changed, 12 insertions(+)
> >> 
> >> diff --git a/monitor/monitor.c b/monitor/monitor.c
> >> index 01ede1babd..ad0243e9d7 100644
> >> --- a/monitor/monitor.c
> >> +++ b/monitor/monitor.c
> >> @@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
> >>  static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
> >>      /* Limit guest-triggerable events to 1 per second */
> >>      [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
> >> +    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
> >>      [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
> >>      [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
> >>      [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
> >> @@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
> >>          hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
> >>      }
> >>  
> >> +    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
> >> +        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
> >> +    }
> >
> > Using "device" only works with -drive, i.e. when the BlockBackend
> > actually has a name. In modern configurations with a -blockdev
> > referenced by -device, the BlockBackend doesn't have a name any more.
> >
> > Maybe we should be using the qdev id (or more generally, QOM path) here,
> > but that's something the event doesn't even contain yet.
> 
> Uh, does the event reliably identify the I/O error's node or not?  If
> not, then that's a serious design defect.
> 
> There's @node-name.  Several commands use "either @device or @node-name"
> to identify a node.  Is that sufficient here?

Possibly. The QAPI event is sent by a device, not by the backend, and
the commit message claims per-device throttling. That's what made me
think that we should base it on the device id.

But it's true that the error does originate in the backend (and it's
unlikely that two devices are attached to the same node anyway), so the
node-name could be good enough if we don't have a BlockBackend name. We
should claim per-block-node rather then per-device throttling then.

Kevin
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Markus Armbruster 2 months ago
Kevin Wolf <kwolf@redhat.com> writes:

> Am 19.07.2024 um 06:54 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > Am 09.01.2024 um 14:13 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> >> From: Leonid Kaplan <xeor@yandex-team.ru>
>> >> 
>> >> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
>> >> We still want per-device throttling, so let's use device id as a key.
>> >> 
>> >> Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
>> >> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> >> ---
>> >> 
>> >> v2: add Note: to QAPI doc
>> >> 
>> >>  monitor/monitor.c    | 10 ++++++++++
>> >>  qapi/block-core.json |  2 ++
>> >>  2 files changed, 12 insertions(+)
>> >> 
>> >> diff --git a/monitor/monitor.c b/monitor/monitor.c
>> >> index 01ede1babd..ad0243e9d7 100644
>> >> --- a/monitor/monitor.c
>> >> +++ b/monitor/monitor.c
>> >> @@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
>> >>  static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
>> >>      /* Limit guest-triggerable events to 1 per second */
>> >>      [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
>> >> +    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
>> >>      [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
>> >>      [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
>> >>      [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
>> >> @@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
>> >>          hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
>> >>      }
>> >>  
>> >> +    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
>> >> +        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
>> >> +    }
>> >
>> > Using "device" only works with -drive, i.e. when the BlockBackend
>> > actually has a name. In modern configurations with a -blockdev
>> > referenced by -device, the BlockBackend doesn't have a name any more.
>> >
>> > Maybe we should be using the qdev id (or more generally, QOM path) here,
>> > but that's something the event doesn't even contain yet.
>> 
>> Uh, does the event reliably identify the I/O error's node or not?  If
>> not, then that's a serious design defect.
>> 
>> There's @node-name.  Several commands use "either @device or @node-name"
>> to identify a node.  Is that sufficient here?
>
> Possibly. The QAPI event is sent by a device, not by the backend, and
> the commit message claims per-device throttling. That's what made me
> think that we should base it on the device id.

This is an argument for having the event point at the device.  Events
that already point at the device carry a mandatory @qom-path, and some
also carry an optional qdev @id for backward compatibility.

As far as I can tell, not all devices implement this event.  If that's
true, then documentation should spell it out.

> But it's true that the error does originate in the backend (and it's
> unlikely that two devices are attached to the same node anyway), so the
> node-name could be good enough if we don't have a BlockBackend name. We
> should claim per-block-node rather then per-device throttling then.

Yes.
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Vladimir Sementsov-Ogievskiy 2 months, 3 weeks ago
ping2

On 09.01.24 16:13, Vladimir Sementsov-Ogievskiy wrote:
> From: Leonid Kaplan <xeor@yandex-team.ru>
> 
> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
> We still want per-device throttling, so let's use device id as a key.
> 
> Signed-off-by: Leonid Kaplan <xeor@yandex-team.ru>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
> 
> v2: add Note: to QAPI doc
> 
>   monitor/monitor.c    | 10 ++++++++++
>   qapi/block-core.json |  2 ++
>   2 files changed, 12 insertions(+)
> 
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 01ede1babd..ad0243e9d7 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -309,6 +309,7 @@ int error_printf_unless_qmp(const char *fmt, ...)
>   static MonitorQAPIEventConf monitor_qapi_event_conf[QAPI_EVENT__MAX] = {
>       /* Limit guest-triggerable events to 1 per second */
>       [QAPI_EVENT_RTC_CHANGE]        = { 1000 * SCALE_MS },
> +    [QAPI_EVENT_BLOCK_IO_ERROR]    = { 1000 * SCALE_MS },
>       [QAPI_EVENT_WATCHDOG]          = { 1000 * SCALE_MS },
>       [QAPI_EVENT_BALLOON_CHANGE]    = { 1000 * SCALE_MS },
>       [QAPI_EVENT_QUORUM_REPORT_BAD] = { 1000 * SCALE_MS },
> @@ -498,6 +499,10 @@ static unsigned int qapi_event_throttle_hash(const void *key)
>           hash += g_str_hash(qdict_get_str(evstate->data, "qom-path"));
>       }
>   
> +    if (evstate->event == QAPI_EVENT_BLOCK_IO_ERROR) {
> +        hash += g_str_hash(qdict_get_str(evstate->data, "device"));
> +    }
> +
>       return hash;
>   }
>   
> @@ -525,6 +530,11 @@ static gboolean qapi_event_throttle_equal(const void *a, const void *b)
>                          qdict_get_str(evb->data, "qom-path"));
>       }
>   
> +    if (eva->event == QAPI_EVENT_BLOCK_IO_ERROR) {
> +        return !strcmp(qdict_get_str(eva->data, "device"),
> +                       qdict_get_str(evb->data, "device"));
> +    }
> +
>       return TRUE;
>   }
>   
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index ca390c5700..32c2c2f030 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -5559,6 +5559,8 @@
>   # Note: If action is "stop", a STOP event will eventually follow the
>   #     BLOCK_IO_ERROR event
>   #
> +# Note: This event is rate-limited.
> +#
>   # Since: 0.13
>   #
>   # Example:

-- 
Best regards,
Vladimir
Re: [PATCH v2] block-backend: per-device throttling of BLOCK_IO_ERROR reports
Posted by Vladimir Sementsov-Ogievskiy 6 months, 3 weeks ago
On 09.01.24 16:13, Vladimir Sementsov-Ogievskiy wrote:
> From: Leonid Kaplan<xeor@yandex-team.ru>
> 
> BLOCK_IO_ERROR events comes from guest, so we must throttle them.
> We still want per-device throttling, so let's use device id as a key.
> 
> Signed-off-by: Leonid Kaplan<xeor@yandex-team.ru>
> Signed-off-by: Vladimir Sementsov-Ogievskiy<vsementsov@yandex-team.ru>

ping)

-- 
Best regards,
Vladimir