[Qemu-devel] [PATCH for-2.9] blockjob: avoid recursive AioContext locking

Paolo Bonzini posted 1 patch 7 years ago
Posted by Paolo Bonzini 7 years ago
Streaming or any other block job hangs when performed on a block device
that has a non-default iothread.  This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which

unfortunately are a temporary but necessary evil for iothreads at the
moment).

Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running.  It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 blockjob.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 69126af..2159df7 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -755,12 +755,16 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
 
     /* Fetch BDS AioContext again, in case it has changed */
     aio_context = blk_get_aio_context(data->job->blk);
-    aio_context_acquire(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_acquire(aio_context);
+    }
 
     data->job->deferred_to_main_loop = false;
     data->fn(data->job, data->opaque);
 
-    aio_context_release(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_release(aio_context);
+    }
 
     aio_context_release(data->aio_context);
 
-- 
1.8.3.1


Re: [Qemu-devel] [PATCH for-2.9] blockjob: avoid recursive AioContext locking
Posted by Eric Blake 7 years ago
On 03/21/2017 12:48 PM, Paolo Bonzini wrote:
> Streaming or any other block job hangs when performed on a block device
> that has a non-default iothread.  This happens because the AioContext
> is acquired twice by block_job_defer_to_main_loop_bh and then released
> only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
> 
> unfortunately are a temporary but necessary evil for iothreads at the

Why the blank line?

> moment).
> 
> Luckily, the reason for the double acquisition is simple; the function
> acquires the AioContext for both the job iothread and the BDS iothread,
> in case the BDS iothread was changed while the job was running.  It
> is therefore enough to skip the second acquisition when the two
> AioContexts are one and the same.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockjob.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 

Makes sense from the description.
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [Qemu-block] [PATCH for-2.9] blockjob: avoid recursive AioContext locking
Posted by Jeff Cody 7 years ago
On Tue, Mar 21, 2017 at 06:48:10PM +0100, Paolo Bonzini wrote:
> Streaming or any other block job hangs when performed on a block device
> that has a non-default iothread.  This happens because the AioContext
> is acquired twice by block_job_defer_to_main_loop_bh and then released
> only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
> 
> unfortunately are a temporary but necessary evil for iothreads at the
> moment).
> 
> Luckily, the reason for the double acquisition is simple; the function
> acquires the AioContext for both the job iothread and the BDS iothread,
> in case the BDS iothread was changed while the job was running.  It
> is therefore enough to skip the second acquisition when the two
> AioContexts are one and the same.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockjob.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 69126af..2159df7 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -755,12 +755,16 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
>  
>      /* Fetch BDS AioContext again, in case it has changed */
>      aio_context = blk_get_aio_context(data->job->blk);
> -    aio_context_acquire(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_acquire(aio_context);
> +    }
>  
>      data->job->deferred_to_main_loop = false;
>      data->fn(data->job, data->opaque);
>  
> -    aio_context_release(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_release(aio_context);
> +    }
>  
>      aio_context_release(data->aio_context);
>  
> -- 
> 1.8.3.1
> 
>

Reviewed-by: Jeff Cody <jcody@redhat.com>

Re: [Qemu-devel] [Qemu-block] [PATCH for-2.9] blockjob: avoid recursive AioContext locking
Posted by Jeff Cody 7 years ago
On Tue, Mar 21, 2017 at 06:48:10PM +0100, Paolo Bonzini wrote:
> Streaming or any other block job hangs when performed on a block device
> that has a non-default iothread.  This happens because the AioContext
> is acquired twice by block_job_defer_to_main_loop_bh and then released
> only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
> 
> unfortunately are a temporary but necessary evil for iothreads at the
> moment).
> 
> Luckily, the reason for the double acquisition is simple; the function
> acquires the AioContext for both the job iothread and the BDS iothread,
> in case the BDS iothread was changed while the job was running.  It
> is therefore enough to skip the second acquisition when the two
> AioContexts are one and the same.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockjob.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 69126af..2159df7 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -755,12 +755,16 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
>  
>      /* Fetch BDS AioContext again, in case it has changed */
>      aio_context = blk_get_aio_context(data->job->blk);
> -    aio_context_acquire(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_acquire(aio_context);
> +    }
>  
>      data->job->deferred_to_main_loop = false;
>      data->fn(data->job, data->opaque);
>  
> -    aio_context_release(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_release(aio_context);
> +    }
>  
>      aio_context_release(data->aio_context);
>  
> -- 
> 1.8.3.1
> 
>

Deleted the blank line in the commit message, and:


Thanks,

Applied to my block branch:

git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff

Re: [Qemu-devel] [PATCH for-2.9] blockjob: avoid recursive AioContext locking
Posted by John Snow 7 years ago

On 03/21/2017 01:48 PM, Paolo Bonzini wrote:
> Streaming or any other block job hangs when performed on a block device
> that has a non-default iothread.  This happens because the AioContext
> is acquired twice by block_job_defer_to_main_loop_bh and then released
> only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
> 
> unfortunately are a temporary but necessary evil for iothreads at the
> moment).
> 
> Luckily, the reason for the double acquisition is simple; the function
> acquires the AioContext for both the job iothread and the BDS iothread,
> in case the BDS iothread was changed while the job was running.  It
> is therefore enough to skip the second acquisition when the two
> AioContexts are one and the same.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  blockjob.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 69126af..2159df7 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -755,12 +755,16 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
>  
>      /* Fetch BDS AioContext again, in case it has changed */
>      aio_context = blk_get_aio_context(data->job->blk);
> -    aio_context_acquire(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_acquire(aio_context);
> +    }
>  
>      data->job->deferred_to_main_loop = false;
>      data->fn(data->job, data->opaque);
>  
> -    aio_context_release(aio_context);
> +    if (aio_context != data->aio_context) {
> +        aio_context_release(aio_context);
> +    }
>  
>      aio_context_release(data->aio_context);
>  
> 

Reviewed-by: John Snow <jsnow@redhat.com>