[PATCH 0/5] blkdebug: fix racing condition when iterating on

Emanuele Giuseppe Esposito posted 5 patches 3 years ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20210408155913.53235-1-eesposit@redhat.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
[PATCH 0/5] blkdebug: fix racing condition when iterating on
Posted by Emanuele Giuseppe Esposito 3 years ago
When qemu_coroutine_enter is executed in a loop
(even QLIST_FOREACH_SAFE), the entered coroutine can modify the list,
for example by removing an element, which causes problems when control
returns to the caller and it continues iterating over the same list.
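
To make the hazard concrete, here is a minimal standalone sketch. It does
not use QEMU's queue macros or coroutines: Req, push(), unlink_and_free(),
resume() and resume_all_restarting() are hypothetical names, and resume()
only stands in for qemu_coroutine_enter by removing elements as a side
effect. It shows one way to stay safe, restarting the walk after every
resume, which is the approach the cover letter describes for patch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a suspended request; not QEMU's type. */
typedef struct Req {
    int tag;
    struct Req *next;
} Req;

static Req *head;   /* singly linked list of suspended requests */

/* Insert a new request at the head of the list. */
static void push(int tag)
{
    Req *r = malloc(sizeof(*r));
    assert(r != NULL);
    r->tag = tag;
    r->next = head;
    head = r;
}

/* Unlink victim from the list (if present) and free it. */
static void unlink_and_free(Req *victim)
{
    for (Req **p = &head; *p != NULL; p = &(*p)->next) {
        if (*p == victim) {
            *p = victim->next;
            free(victim);
            return;
        }
    }
}

/*
 * Stand-in for entering a coroutine: the resumed code removes its own
 * request and, as a side effect, the following one if it has the same tag.
 * A *_FOREACH_SAFE loop would have cached that following element as its
 * next cursor before this call, and would then touch freed memory.
 */
static void resume(Req *r)
{
    Req *neighbour = r->next;
    int tag = r->tag;

    unlink_and_free(r);
    if (neighbour != NULL && neighbour->tag == tag) {
        unlink_and_free(neighbour);
    }
}

/* Resume every request with the given tag; returns how many were entered. */
static int resume_all_restarting(int tag)
{
    int entered = 0;

retry:
    for (Req *r = head; r != NULL; r = r->next) {
        if (r->tag == tag) {
            resume(r);      /* may change the list arbitrarily */
            entered++;
            goto retry;     /* so drop the cursor and start from the head */
        }
    }
    return entered;
}
```

The restart makes the walk quadratic in the worst case, which should be
acceptable for a short debug-only list. It terminates here only because
each resume() removes the request it was called on.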

Patch 1 solves the issue in blkdebug_debug_resume by restarting
the list walk after every qemu_coroutine_enter when the list has to be
fully iterated.
Patches 2, 3 and 4 aim to fix blkdebug_debug_event by recording
all actions requested by the rules in a counter and invoking
the respective qemu_coroutine_yield only after processing all requests.
Patch 5 adds a lock to protect rules and suspended_reqs.
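
The idea behind patches 2-4 can be sketched in isolation as a two-phase
walk: first count what the rules ask for, then act once the loop is over.
This is only an illustration under assumptions, not the code from the
series: Action, Rule, fake_yield() and process_event() are hypothetical
names, and fake_yield() merely simulates a suspension point during which
someone else empties the rule list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action kinds; the real enum in blkdebug may differ. */
typedef enum { ACT_INJECT_ERROR, ACT_SET_STATE, ACT_SUSPEND, ACT_MAX } Action;

typedef struct Rule {
    Action action;
    struct Rule *next;
} Rule;

static int yields_done;

/*
 * Stand-in for a coroutine yield. While "suspended", other code may edit
 * the rule list; simulate the worst case by dropping every rule.
 */
static void fake_yield(Rule **rules)
{
    *rules = NULL;
    yields_done++;
}

static void process_event(Rule **rules, int counts[ACT_MAX])
{
    assert(rules != NULL && counts != NULL);

    /* Phase 1: plain list walk with no suspension point inside the loop. */
    for (Rule *r = *rules; r != NULL; r = r->next) {
        counts[r->action]++;
    }

    /* Phase 2: the walk is finished, so yielding cannot stale a cursor. */
    for (int i = 0; i < counts[ACT_SUSPEND]; i++) {
        fake_yield(rules);
    }
}
```

Because the loop in phase 1 never suspends, nothing can modify the list
while a cursor into it is live; the list may change freely in phase 2.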

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>

Emanuele Giuseppe Esposito (5):
  blkdebug: refactor removal of a suspended request
  blkdebug: move post-resume handling to resume_req_by_tag
  blkdebug: track all actions
  blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  blkdebug: protect rules and suspended_reqs with a lock

 block/blkdebug.c | 113 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 79 insertions(+), 34 deletions(-)

-- 
2.30.2


Re: [PATCH 0/5] blkdebug: fix racing condition when iterating on
Posted by Paolo Bonzini 3 years ago
On 08/04/21 17:59, Emanuele Giuseppe Esposito wrote:
> When qemu_coroutine_enter is executed in a loop
> (even QLIST_FOREACH_SAFE), the entered coroutine can modify the list,
> for example by removing an element, which causes problems when control
> returns to the caller and it continues iterating over the same list.
> 
> Patch 1 solves the issue in blkdebug_debug_resume by restarting
> the list walk after every qemu_coroutine_enter when the list has to be
> fully iterated.
> Patches 2, 3 and 4 aim to fix blkdebug_debug_event by recording
> all actions requested by the rules in a counter and invoking
> the respective qemu_coroutine_yield only after processing all requests.
> Patch 5 adds a lock to protect rules and suspended_reqs.

Patch 5 is somewhat independent of the others; right now everything 
works because it's protected by the AioContext lock.

On the other hand, the scenarios in patches 1-4 are bugs even without
patch 5.  They become more obvious if you see an explicit unlock/lock
pair within QTAILQ_FOREACH_SAFE, but they can already happen with just a
qemu_coroutine_yield or qemu_coroutine_enter within the iteration.

Paolo