[Qemu-devel] [PATCH v2] block: don't set the same context

Denis Plotnikov posted 1 patch 5 years, 2 months ago
Test docker-clang@ubuntu failed
Test asan failed
Test docker-mingw@fedora passed
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190215130325.5294-1-dplotnikov@virtuozzo.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>
block.c | 4 ++++
1 file changed, 4 insertions(+)
[Qemu-devel] [PATCH v2] block: don't set the same context
Posted by Denis Plotnikov 5 years, 2 months ago
Adds a fast path on aio context setting preventing
unnecessary context setting routine.
Also, it prevents issues with cyclic walk of child
bds-es appeared because of registering aio walking
notifiers:

Call stack:

0  __GI_raise
1  __GI_abort
2  __assert_fail_base
3  __GI___assert_fail
4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7  block_job_attached_aio_context
8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9  bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main

This can happen because of "new" context attachment to VM disk bds.
When attaching a new context the corresponding aio context handler is
called for each of aio_notifiers registered on the VM disk bds context.
Among those handlers, there is the block_job_attached_aio_context handler
which sets a new aio context for the block job bds. When doing so,
the old context is detached from all the block job bds children and one of
them is the VM disk bds, serving as backing store for the blockjob bds,
although the VM disk bds is actually the initializer of that process.
Since the VM disk bds is protected with walking_aio_notifiers flag
from double processing in recursive calls, the assert fires.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
---
 block.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/block.c b/block.c
index adda221c2c..5742545e7a 100644
--- a/block.c
+++ b/block.c
@@ -5263,6 +5263,10 @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
     AioContext *ctx = bdrv_get_aio_context(bs);
 
+    if (ctx == new_context) {
+        return;
+    }
+
     aio_disable_external(ctx);
     bdrv_parent_drained_begin(bs, NULL, false);
     bdrv_drain(bs); /* ensure there are no in-flight requests */
-- 
2.17.0


Re: [Qemu-devel] [PATCH v2] block: don't set the same context
Posted by Eric Blake 5 years, 2 months ago
On 2/15/19 7:03 AM, Denis Plotnikov wrote:
> Adds a fast path on aio context setting preventing
> unnecessary context setting routine.
> Also, it prevents issues with cyclic walk of child
> bds-es appeared because of registering aio walking
> notifiers:

> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
> ---

Right here, after the ---, is a good place to list how v2 differs from
v1. It looks to me like the only change was a typo fix in the commit
message?  But that still doesn't address the concern on whether this
approach is right in the face of nesting (if attach/detach is always
paired, then attach-attach-detach-detach will cause the inner detach to
pair to the outer attach, any actions between the inner and outer detach
will not be against an attached context, and the outer detach needs to
be safe as a no-op).  The patch may still be right, but I'm not enough
of an expert in how aio attach/detach are supposed to work to know for
sure.  Or, we may need some form of reference counting, so that the
inner detach is a no-op if the inner attach was short-circuited, and the
outer detach then resumes being the proper pair against the outer
attach.  If you have checked why the patch works, then amending the
commit message to address my question will also help future readers that
might otherwise ask the same question. Posting a rapid-turnaround v2 for
just a typo fix, while not addressing the larger question raised as to
whether the patch is correct, is just churn.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2] block: don't set the same context
Posted by Eric Blake 5 years, 2 months ago
On 2/15/19 7:12 AM, Eric Blake wrote:
> On 2/15/19 7:03 AM, Denis Plotnikov wrote:
>> Adds a fast path on aio context setting preventing
>> unnecessary context setting routine.
>> Also, it prevents issues with cyclic walk of child
>> bds-es appeared because of registering aio walking
>> notifiers:
> 
>> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
>> ---
> 
> Right here, after the ---, is a good place to list how v2 differs from
> v1. It looks to me like the only change was a typo fix in the commit
> message?  But that still doesn't address the concern on whether this
> approach is right in the face of nesting (if attach/detach is always
> paired, then attach-attach-detach-detach will cause the inner detach to
> pair to the outer attach, any actions between the inner and outer detach
> will not be against an attached context, and the outer detach needs to
> be safe as a no-op).  The patch may still be right, but I'm not enough
> of an expert in how aio attach/detach are supposed to work to know for
> sure.  Or, we may need some form of reference counting, so that the
> inner detach is a no-op if the inner attach was short-circuited, and the
> outer detach then resumes being the proper pair against the outer
> attach.  If you have checked why the patch works, then amending the
> commit message to address my question will also help future readers that
> might otherwise ask the same question. Posting a rapid-turnaround v2 for
> just a typo fix, while not addressing the larger question raised as to
> whether the patch is correct, is just churn.

Having now re-read the patch, I see that you are patching
bdrv_set_aio_context, even though your commit message focused my
attention on the nesting:

> 
> 3  __GI___assert_fail
> 4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
> 5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
> 6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
> 7  block_job_attached_aio_context
> 8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
> 9  bdrv_set_aio_context (bs=0x55f54d65c000)

That is, you marked lines 4 and 8 as forming nested attach/detach pairs,
rather than 6 and 9 as nested set calls. If I understand the point of
this patch, your goal is to realize that setting the context to what it
already has is pointless, and by short-circuiting the second set, you
can then completely bypass the nested attach/detach.  So it looks like
the change is correct, and that it is merely that the commit message
could still be improved to do a better job of calling out that the
symptoms were a nested attach/detach, but the fix is to avoid the nested
calls by fixing the setting to be smarter on the realization that
setting can be reached more than once.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2] block: don't set the same context
Posted by Kevin Wolf 5 years, 2 months ago
Am 15.02.2019 um 14:03 hat Denis Plotnikov geschrieben:
> Adds a fast path on aio context setting preventing
> unnecessary context setting routine.
> Also, it prevents issues with cyclic walk of child
> bds-es appeared because of registering aio walking
> notifiers:
> 
> Call stack:
> 
> 0  __GI_raise
> 1  __GI_abort
> 2  __assert_fail_base
> 3  __GI___assert_fail
> 4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
> 5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
> 6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
> 7  block_job_attached_aio_context
> 8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
> 9  bdrv_set_aio_context (bs=0x55f54d65c000)
> 10 blk_set_aio_context
> 11 virtio_blk_data_plane_stop
> 12 virtio_bus_stop_ioeventfd
> 13 virtio_vmstate_change
> 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
> 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
> 16 vm_stop (state=RUN_STATE_SHUTDOWN)
> 17 main_loop_should_exit
> 18 main_loop
> 19 main
> 
> This can happen because of "new" context attachment to VM disk bds.
> When attaching a new context the corresponding aio context handler is
> called for each of aio_notifiers registered on the VM disk bds context.
> Among those handlers, there is the block_job_attached_aio_context handler
> which sets a new aio context for the block job bds. When doing so,
> the old context is detached from all the block job bds children and one of
> them is the VM disk bds, serving as backing store for the blockjob bds,
> although the VM disk bds is actually the initializer of that process.
> Since the VM disk bds is protected with walking_aio_notifiers flag
> from double processing in recursive calls, the assert fires.
> 
> Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>

Thanks, applied to the block branch.

Kevin