Series comparison

-[Qemu-devel] [PULL for-2.9 0/4] Block patches
+[PULL for-6.1 0/3] Block patches
-The following changes since commit 55a19ad8b2d0797e3a8fe90ab99a9bb713824059:
+The following changes since commit 3521ade3510eb5cefb2e27a101667f25dad89935:
-  Update version for v2.9.0-rc1 release (2017-03-21 17:13:29 +0000)
+  Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-07-29' into staging (2021-07-29 13:17:20 +0100)
-are available in the git repository at:
+are available in the Git repository at:
-  https://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request
+  https://gitlab.com/stefanha/qemu.git tags/block-pull-request
-for you to fetch changes up to 600ac6a0ef5c06418446ef2f37407bddcc51b21c:
+for you to fetch changes up to cc8eecd7f105a1dff5876adeb238a14696061a4a:
-  blockjob: add devops to blockjob backends (2017-03-22 13:26:27 -0400)
+  MAINTAINERS: Added myself as a reviewer for the NVMe Block Driver (2021-07-29 17:17:34 +0100)
 ----------------------------------------------------------------
-Block patches for 2.9
+Pull request
 The main fix here is for io_uring. Spurious -EAGAIN errors can happen and the
 request needs to be resubmitted.
 The MAINTAINERS changes carry no risk and we might as well include them in QEMU
 6.1.
 ----------------------------------------------------------------
-John Snow (3):
+Fabian Ebner (1):
-  blockjob: add block_job_start_shim
+  block/io_uring: resubmit when result is -EAGAIN
   block-backend: add drained_begin / drained_end ops
   blockjob: add devops to blockjob backends
-Paolo Bonzini (1):
+Philippe Mathieu-Daudé (1):
-  blockjob: avoid recursive AioContext locking
+  MAINTAINERS: Added myself as a reviewer for the NVMe Block Driver
- block/block-backend.c          | 24 ++++++++++++++--
+Stefano Garzarella (1):
- blockjob.c                     | 63 ++++++++++++++++++++++++++++++++----------
+  MAINTAINERS: add Stefano Garzarella as io_uring reviewer
- include/sysemu/block-backend.h |  8 ++++++
- 3 files changed, 79 insertions(+), 16 deletions(-)
+ MAINTAINERS      |  2 ++
  block/io_uring.c | 16 +++++++++++++++-
  2 files changed, 17 insertions(+), 1 deletion(-)
 --
-2.9.3
+2.31.1

-[Qemu-devel] [PULL for-2.9 1/4] blockjob: avoid recursive AioContext locking
+Deleted patch
-From: Paolo Bonzini <pbonzini@redhat.com>
-Streaming or any other block job hangs when performed on a block device
-that has a non-default iothread.  This happens because the AioContext
-is acquired twice by block_job_defer_to_main_loop_bh and then released
-only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
-unfortunately are a temporary but necessary evil for iothreads at the
-moment).
-Luckily, the reason for the double acquisition is simple; the function
-acquires the AioContext for both the job iothread and the BDS iothread,
-in case the BDS iothread was changed while the job was running.  It
-is therefore enough to skip the second acquisition when the two
-AioContexts are one and the same.
-Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-Reviewed-by: Eric Blake <eblake@redhat.com>
-Reviewed-by: Jeff Cody <jcody@redhat.com>
-Message-id: 1490118490-5597-1-git-send-email-pbonzini@redhat.com
-Signed-off-by: Jeff Cody <jcody@redhat.com>
----
- blockjob.c | 8 ++++++--
- 1 file changed, 6 insertions(+), 2 deletions(-)
-diff --git a/blockjob.c b/blockjob.c
-index XXXXXXX..XXXXXXX 100644
---- a/blockjob.c
-+++ b/blockjob.c
-@@ -XXX,XX +XXX,XX @@ static void block_job_defer_to_main_loop_bh(void *opaque)
-     /* Fetch BDS AioContext again, in case it has changed */
-     aio_context = blk_get_aio_context(data->job->blk);
--    aio_context_acquire(aio_context);
-+    if (aio_context != data->aio_context) {
-+        aio_context_acquire(aio_context);
-+    }
-     data->job->deferred_to_main_loop = false;
-     data->fn(data->job, data->opaque);
--    aio_context_release(aio_context);
-+    if (aio_context != data->aio_context) {
-+        aio_context_release(aio_context);
-+    }
-     aio_context_release(data->aio_context);
---
-2.9.3

-[Qemu-devel] [PULL for-2.9 4/4] blockjob: add devops to blockjob backends
+[PULL for-6.1 1/3] MAINTAINERS: add Stefano Garzarella as io_uring reviewer
-From: John Snow <jsnow@redhat.com>
+From: Stefano Garzarella <sgarzare@redhat.com>
-This lets us hook into drained_begin and drained_end requests from the
+I've been working with io_uring for a while so I'd like to help
-backend level, which is particularly useful for making sure that all
+with reviews.
 jobs associated with a particular node (whether the source or the target)
 receive a drain request.
-Suggested-by: Kevin Wolf <kwolf@redhat.com>
+Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
-Signed-off-by: John Snow <jsnow@redhat.com>
+Message-Id: <20210728131515.131045-1-sgarzare@redhat.com>
-Reviewed-by: Jeff Cody <jcody@redhat.com>
+Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
 Message-id: 20170316212351.13797-4-jsnow@redhat.com
 Signed-off-by: Jeff Cody <jcody@redhat.com>
 ---
- blockjob.c | 29 ++++++++++++++++++++++++-----
+ MAINTAINERS | 1 +
- 1 file changed, 24 insertions(+), 5 deletions(-)
+ 1 file changed, 1 insertion(+)
-diff --git a/blockjob.c b/blockjob.c
+diff --git a/MAINTAINERS b/MAINTAINERS
 index XXXXXXX..XXXXXXX 100644
---- a/blockjob.c
+--- a/MAINTAINERS
-+++ b/blockjob.c
++++ b/MAINTAINERS
-@@ -XXX,XX +XXX,XX @@ static const BdrvChildRole child_job = {
+@@ -XXX,XX +XXX,XX @@ Linux io_uring
-     .stay_at_node       = true,
+ M: Aarushi Mehta <mehta.aaru20@gmail.com>
- };
+ M: Julia Suvorova <jusual@redhat.com>
+ M: Stefan Hajnoczi <stefanha@redhat.com>
-+static void block_job_drained_begin(void *opaque)
++R: Stefano Garzarella <sgarzare@redhat.com>
-+{
+ L: qemu-block@nongnu.org
-+    BlockJob *job = opaque;
+ S: Maintained
-+    block_job_pause(job);
+ F: block/io_uring.c
 +}
 +
 +static void block_job_drained_end(void *opaque)
 +{
 +    BlockJob *job = opaque;
 +    block_job_resume(job);
 +}
 +
 +static const BlockDevOps block_job_dev_ops = {
 +    .drained_begin = block_job_drained_begin,
 +    .drained_end = block_job_drained_end,
 +};
 +
  BlockJob *block_job_next(BlockJob *job)
  {
      if (!job) {
@@ -XXX,XX +XXX,XX @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
      }
      job = g_malloc0(driver->instance_size);
 -    error_setg(&job->blocker, "block device is in use by block job: %s",
 -               BlockJobType_lookup[driver->job_type]);
 -    block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
 -    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
 -
      job->driver        = driver;
      job->id            = g_strdup(job_id);
      job->blk           = blk;
@@ -XXX,XX +XXX,XX @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
      job->paused        = true;
      job->pause_count   = 1;
      job->refcnt        = 1;
 +
 +    error_setg(&job->blocker, "block device is in use by block job: %s",
 +               BlockJobType_lookup[driver->job_type]);
 +    block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
      bs->job = job;
 +    blk_set_dev_ops(blk, &block_job_dev_ops, job);
 +    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
 +
      QLIST_INSERT_HEAD(&block_jobs, job, job_list);
      blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
 --
-2.9.3
+2.31.1

-[Qemu-devel] [PULL for-2.9 3/4] block-backend: add drained_begin / drained_end ops
+[PULL for-6.1 2/3] block/io_uring: resubmit when result is -EAGAIN
-From: John Snow <jsnow@redhat.com>
+From: Fabian Ebner <f.ebner@proxmox.com>
-Allow block backends to forward drain requests to their devices/users.
+Linux SCSI can throw spurious -EAGAIN in some corner cases in its
-The initial intended purpose for this patch is to allow BBs to forward
+completion path, which will end up being the result in the completed
-requests along to BlockJobs, which will want to pause if their associated
+io_uring request.
 BB has entered a drained region.
-Signed-off-by: John Snow <jsnow@redhat.com>
+Resubmitting such requests should allow block jobs to complete, even
-Reviewed-by: Jeff Cody <jcody@redhat.com>
+if such spurious errors are encountered.
-Message-id: 20170316212351.13797-3-jsnow@redhat.com
-Signed-off-by: Jeff Cody <jcody@redhat.com>
+Co-authored-by: Stefan Hajnoczi <stefanha@gmail.com>
 Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
 Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
 Message-id: 20210729091029.65369-1-f.ebner@proxmox.com
 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
 ---
- block/block-backend.c          | 24 ++++++++++++++++++++++--
+ block/io_uring.c | 16 +++++++++++++++-
- include/sysemu/block-backend.h |  8 ++++++++
+ 1 file changed, 15 insertions(+), 1 deletion(-)
  2 files changed, 30 insertions(+), 2 deletions(-)
-diff --git a/block/block-backend.c b/block/block-backend.c
+diff --git a/block/io_uring.c b/block/io_uring.c
 index XXXXXXX..XXXXXXX 100644
---- a/block/block-backend.c
+--- a/block/io_uring.c
-+++ b/block/block-backend.c
++++ b/block/io_uring.c
-@@ -XXX,XX +XXX,XX @@ struct BlockBackend {
+@@ -XXX,XX +XXX,XX @@ static void luring_process_completions(LuringState *s)
-     bool allow_write_beyond_eof;
+         total_bytes = ret + luringcb->total_read;
-     NotifierList remove_bs_notifiers, insert_bs_notifiers;
+         if (ret < 0) {
-+
+-            if (ret == -EINTR) {
-+    int quiesce_counter;
++            /*
- };
++             * Only writev/readv/fsync requests on regular files or host block
++             * devices are submitted. Therefore -EAGAIN is not expected but it's
- typedef struct BlockBackendAIOCB {
++             * known to happen sometimes with Linux SCSI. Submit again and hope
-@@ -XXX,XX +XXX,XX @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
++             * the request completes successfully.
-                      void *opaque)
++             *
- {
++             * For more information, see:
-     /* All drivers that use blk_set_dev_ops() are qdevified and we want to keep
++             * https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
--     * it that way, so we can assume blk->dev is a DeviceState if blk->dev_ops
++             *
--     * is set. */
++             * If the code is changed to submit other types of requests in the
-+     * it that way, so we can assume blk->dev, if present, is a DeviceState if
++             * future, then this workaround may need to be extended to deal with
-+     * blk->dev_ops is set. Non-device users may use dev_ops without device. */
++             * genuine -EAGAIN results that should not be resubmitted
-     assert(!blk->legacy_dev);
++             * immediately.
++             */
-     blk->dev_ops = ops;
++            if (ret == -EINTR || ret == -EAGAIN) {
-     blk->dev_opaque = opaque;
+                 luring_resubmit(s, luringcb);
-+
+                 continue;
-+    /* Are we currently quiesced? Should we enforce this right now? */
+             }
 +    if (blk->quiesce_counter && ops->drained_begin) {
 +        ops->drained_begin(opaque);
 +    }
  }
  /*
@@ -XXX,XX +XXX,XX @@ static void blk_root_drained_begin(BdrvChild *child)
  {
      BlockBackend *blk = child->opaque;
 +    if (++blk->quiesce_counter == 1) {
 +        if (blk->dev_ops && blk->dev_ops->drained_begin) {
 +            blk->dev_ops->drained_begin(blk->dev_opaque);
 +        }
 +    }
 +
      /* Note that blk->root may not be accessible here yet if we are just
       * attaching to a BlockDriverState that is drained. Use child instead. */
@@ -XXX,XX +XXX,XX @@ static void blk_root_drained_begin(BdrvChild *child)
  static void blk_root_drained_end(BdrvChild *child)
  {
      BlockBackend *blk = child->opaque;
 +    assert(blk->quiesce_counter);
      assert(blk->public.io_limits_disabled);
      --blk->public.io_limits_disabled;
 +
 +    if (--blk->quiesce_counter == 0) {
 +        if (blk->dev_ops && blk->dev_ops->drained_end) {
 +            blk->dev_ops->drained_end(blk->dev_opaque);
 +        }
 +    }
  }
 diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
 index XXXXXXX..XXXXXXX 100644
 --- a/include/sysemu/block-backend.h
 +++ b/include/sysemu/block-backend.h
@@ -XXX,XX +XXX,XX @@ typedef struct BlockDevOps {
       * Runs when the size changed (e.g. monitor command block_resize)
       */
      void (*resize_cb)(void *opaque);
 +    /*
 +     * Runs when the backend receives a drain request.
 +     */
 +    void (*drained_begin)(void *opaque);
 +    /*
 +     * Runs when the backend's last drain request ends.
 +     */
 +    void (*drained_end)(void *opaque);
  } BlockDevOps;
  /* This struct is embedded in (the private) BlockBackend struct and contains
 --
-2.9.3
+2.31.1

-[Qemu-devel] [PULL for-2.9 2/4] blockjob: add block_job_start_shim
+[PULL for-6.1 3/3] MAINTAINERS: Added myself as a reviewer for the NVMe Block Driver
-From: John Snow <jsnow@redhat.com>
+From: Philippe Mathieu-Daudé <philmd@redhat.com>
-The purpose of this shim is to allow us to pause pre-started jobs.
+I'm interested in following the activity around the NVMe bdrv.
 The purpose of *that* is to allow us to buffer a pause request that
 will be able to take effect before the job ever does any work, allowing
 us to create jobs during a quiescent state (under which they will be
 automatically paused), then resuming the jobs after the critical section
 in any order, either:
-(1) -block_job_start
+Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
-    -block_job_resume (via e.g. drained_end)
+Message-id: 20210728183340.2018313-1-philmd@redhat.com
 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
 ---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)
-(2) -block_job_resume (via e.g. drained_end)
+diff --git a/MAINTAINERS b/MAINTAINERS
-    -block_job_start
+index XXXXXXX..XXXXXXX 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ F: block/null.c
  NVMe Block Driver
  M: Stefan Hajnoczi <stefanha@redhat.com>
  R: Fam Zheng <fam@euphon.net>
 +R: Philippe Mathieu-Daudé <philmd@redhat.com>
  L: qemu-block@nongnu.org
  S: Supported
  F: block/nvme*
 --
 2.31.1
-The problem that requires a startup wrapper is the idea that a job must
-start in the busy=true state only its first time-- all subsequent entries
-require busy to be false, and the toggling of this state is otherwise
-handled during existing pause and yield points.
-The wrapper simply allows us to mandate that a job can "start," set busy
-to true, then immediately pause only if necessary. We could avoid
-requiring a wrapper, but all jobs would need to do it, so it's been
-factored out here.
-Signed-off-by: John Snow <jsnow@redhat.com>
-Reviewed-by: Jeff Cody <jcody@redhat.com>
-Message-id: 20170316212351.13797-2-jsnow@redhat.com
-Signed-off-by: Jeff Cody <jcody@redhat.com>
----
- blockjob.c | 26 +++++++++++++++++++-------
- 1 file changed, 19 insertions(+), 7 deletions(-)
-diff --git a/blockjob.c b/blockjob.c
-index XXXXXXX..XXXXXXX 100644
---- a/blockjob.c
-+++ b/blockjob.c
-@@ -XXX,XX +XXX,XX @@ static bool block_job_started(BlockJob *job)
-     return job->co;
- }
-+/**
-+ * All jobs must allow a pause point before entering their job proper. This
-+ * ensures that jobs can be paused prior to being started, then resumed later.
-+ */
-+static void coroutine_fn block_job_co_entry(void *opaque)
-+{
-+    BlockJob *job = opaque;
-+
-+    assert(job && job->driver && job->driver->start);
-+    block_job_pause_point(job);
-+    job->driver->start(job);
-+}
-+
- void block_job_start(BlockJob *job)
- {
-     assert(job && !block_job_started(job) && job->paused &&
--           !job->busy && job->driver->start);
--    job->co = qemu_coroutine_create(job->driver->start, job);
--    if (--job->pause_count == 0) {
--        job->paused = false;
--        job->busy = true;
--        qemu_coroutine_enter(job->co);
--    }
-+           job->driver && job->driver->start);
-+    job->co = qemu_coroutine_create(block_job_co_entry, job);
-+    job->pause_count--;
-+    job->busy = true;
-+    job->paused = false;
-+    qemu_coroutine_enter(job->co);
- }
- void block_job_ref(BlockJob *job)
---
-2.9.3

The following changes since commit 55a19ad8b2d0797e3a8fe90ab99a9bb713824059:

  Update version for v2.9.0-rc1 release (2017-03-21 17:13:29 +0000)

are available in the git repository at:

  https://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request

for you to fetch changes up to 600ac6a0ef5c06418446ef2f37407bddcc51b21c:

  blockjob: add devops to blockjob backends (2017-03-22 13:26:27 -0400)

----------------------------------------------------------------
Block patches for 2.9
----------------------------------------------------------------

John Snow (3):
  blockjob: add block_job_start_shim
  block-backend: add drained_begin / drained_end ops
  blockjob: add devops to blockjob backends

Paolo Bonzini (1):
  blockjob: avoid recursive AioContext locking

 block/block-backend.c          | 24 ++++++++++++++--
 blockjob.c                     | 63 ++++++++++++++++++++++++++++++++----------
 include/sysemu/block-backend.h |  8 ++++++
 3 files changed, 79 insertions(+), 16 deletions(-)

-- 
2.9.3

From: Paolo Bonzini <pbonzini@redhat.com>

Streaming or any other block job hangs when performed on a block device
that has a non-default iothread.  This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
unfortunately are a temporary but necessary evil for iothreads at the
moment).

Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running.  It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490118490-5597-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockjob.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index XXXXXXX..XXXXXXX 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -XXX,XX +XXX,XX @@ static void block_job_defer_to_main_loop_bh(void *opaque)
 
     /* Fetch BDS AioContext again, in case it has changed */
     aio_context = blk_get_aio_context(data->job->blk);
-    aio_context_acquire(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_acquire(aio_context);
+    }
 
     data->job->deferred_to_main_loop = false;
     data->fn(data->job, data->opaque);
 
-    aio_context_release(aio_context);
+    if (aio_context != data->aio_context) {
+        aio_context_release(aio_context);
+    }
 
     aio_context_release(data->aio_context);
 
-- 
2.9.3
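
The locking rule in the patch above can be illustrated outside QEMU. The following is a minimal sketch, not QEMU code: `Ctx`, `ctx_acquire`, `ctx_release` and `run_deferred` are hypothetical stand-ins that only count recursive lock depth, assuming the callback may later be polled by code that releases the context exactly once.

```c
#include <assert.h>

/* Hypothetical stand-in for AioContext: it only tracks recursive lock depth. */
typedef struct Ctx {
    int depth;
} Ctx;

static void ctx_acquire(Ctx *c)
{
    c->depth++;
}

static void ctx_release(Ctx *c)
{
    assert(c->depth > 0);
    c->depth--;
}

/*
 * Models the patched bottom half: take the job context, then take the
 * node context only if it is a different object. Returns the lock depth
 * of the job context at the point where the deferred callback would run.
 */
static int run_deferred(Ctx *job_ctx, Ctx *bds_ctx)
{
    int depth_in_callback;

    ctx_acquire(job_ctx);
    if (bds_ctx != job_ctx) {
        ctx_acquire(bds_ctx);
    }

    /* The deferred callback would run here. */
    depth_in_callback = job_ctx->depth;

    if (bds_ctx != job_ctx) {
        ctx_release(bds_ctx);
    }
    ctx_release(job_ctx);
    return depth_in_callback;
}
```

In this model, `run_deferred(&a, &a)` observes a depth of 1 rather than 2, so a polling loop that drops the lock once inside the callback really does release it.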

From: John Snow <jsnow@redhat.com>

The purpose of this shim is to allow us to pause pre-started jobs.
The purpose of *that* is to allow us to buffer a pause request that
will be able to take effect before the job ever does any work, allowing
us to create jobs during a quiescent state (under which they will be
automatically paused), then resuming the jobs after the critical section
in any order, either:

(1) -block_job_start
    -block_job_resume (via e.g. drained_end)

(2) -block_job_resume (via e.g. drained_end)
    -block_job_start

The problem that requires a startup wrapper is the idea that a job must
start in the busy=true state only its first time-- all subsequent entries
require busy to be false, and the toggling of this state is otherwise
handled during existing pause and yield points.

The wrapper simply allows us to mandate that a job can "start," set busy
to true, then immediately pause only if necessary. We could avoid
requiring a wrapper, but all jobs would need to do it, so it's been
factored out here.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-2-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockjob.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index XXXXXXX..XXXXXXX 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -XXX,XX +XXX,XX @@ static bool block_job_started(BlockJob *job)
     return job->co;
 }
 
+/**
+ * All jobs must allow a pause point before entering their job proper. This
+ * ensures that jobs can be paused prior to being started, then resumed later.
+ */
+static void coroutine_fn block_job_co_entry(void *opaque)
+{
+    BlockJob *job = opaque;
+
+    assert(job && job->driver && job->driver->start);
+    block_job_pause_point(job);
+    job->driver->start(job);
+}
+
 void block_job_start(BlockJob *job)
 {
     assert(job && !block_job_started(job) && job->paused &&
-           !job->busy && job->driver->start);
-    job->co = qemu_coroutine_create(job->driver->start, job);
-    if (--job->pause_count == 0) {
-        job->paused = false;
-        job->busy = true;
-        qemu_coroutine_enter(job->co);
-    }
+           job->driver && job->driver->start);
+    job->co = qemu_coroutine_create(block_job_co_entry, job);
+    job->pause_count--;
+    job->busy = true;
+    job->paused = false;
+    qemu_coroutine_enter(job->co);
 }
 
 void block_job_ref(BlockJob *job)
-- 
2.9.3
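
The ordering argument in the commit message above can be checked with a small state machine. This is a simplified sketch under stated assumptions: it uses no coroutines, and `Job`, `job_entry`, `job_start`, `job_pause`, `job_resume` and `run_scenario` are hypothetical names that only mimic the pause counter and the initial pause point.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, coroutine-free model of a job with an initial pause point. */
typedef struct Job {
    int pause_count;   /* starts at 1, mirroring job creation in the patch */
    bool started;
    bool parked;       /* stopped at the initial pause point */
    int runs;          /* how many times the job body ran */
} Job;

/* Entry wrapper: honor a pending pause before doing any work. */
static void job_entry(Job *j)
{
    if (j->pause_count > 0) {
        j->parked = true;
        return;
    }
    j->parked = false;
    j->runs++;         /* stands in for the driver's start function */
}

static void job_start(Job *j)
{
    assert(!j->started && j->pause_count > 0);
    j->started = true;
    j->pause_count--;  /* drop the creation-time pause */
    job_entry(j);
}

static void job_pause(Job *j)
{
    j->pause_count++;
}

static void job_resume(Job *j)
{
    assert(j->pause_count > 0);
    j->pause_count--;
    if (j->pause_count == 0 && j->parked) {
        job_entry(j);
    }
}

/* Create a job inside a drained section, then start and resume in either order. */
static int run_scenario(bool start_first)
{
    Job j = { .pause_count = 1 };

    job_pause(&j);     /* the drained section requests a pause */
    if (start_first) {
        job_start(&j);
        job_resume(&j);
    } else {
        job_resume(&j);
        job_start(&j);
    }
    return j.runs;
}
```

Both `run_scenario(true)` and `run_scenario(false)` end with the job body having run exactly once, which is the property the wrapper is meant to provide.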

From: John Snow <jsnow@redhat.com>

Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-3-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/block-backend.c          | 24 ++++++++++++++++++++++--
 include/sysemu/block-backend.h |  8 ++++++++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index XXXXXXX..XXXXXXX 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -XXX,XX +XXX,XX @@ struct BlockBackend {
     bool allow_write_beyond_eof;
 
     NotifierList remove_bs_notifiers, insert_bs_notifiers;
+
+    int quiesce_counter;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -XXX,XX +XXX,XX @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
                      void *opaque)
 {
     /* All drivers that use blk_set_dev_ops() are qdevified and we want to keep
-     * it that way, so we can assume blk->dev is a DeviceState if blk->dev_ops
-     * is set. */
+     * it that way, so we can assume blk->dev, if present, is a DeviceState if
+     * blk->dev_ops is set. Non-device users may use dev_ops without device. */
     assert(!blk->legacy_dev);
 
     blk->dev_ops = ops;
     blk->dev_opaque = opaque;
+
+    /* Are we currently quiesced? Should we enforce this right now? */
+    if (blk->quiesce_counter && ops->drained_begin) {
+        ops->drained_begin(opaque);
+    }
 }
 
 /*
@@ -XXX,XX +XXX,XX @@ static void blk_root_drained_begin(BdrvChild *child)
 {
     BlockBackend *blk = child->opaque;
 
+    if (++blk->quiesce_counter == 1) {
+        if (blk->dev_ops && blk->dev_ops->drained_begin) {
+            blk->dev_ops->drained_begin(blk->dev_opaque);
+        }
+    }
+
     /* Note that blk->root may not be accessible here yet if we are just
      * attaching to a BlockDriverState that is drained. Use child instead. */
 
@@ -XXX,XX +XXX,XX @@ static void blk_root_drained_begin(BdrvChild *child)
 static void blk_root_drained_end(BdrvChild *child)
 {
     BlockBackend *blk = child->opaque;
+    assert(blk->quiesce_counter);
 
     assert(blk->public.io_limits_disabled);
     --blk->public.io_limits_disabled;
+
+    if (--blk->quiesce_counter == 0) {
+        if (blk->dev_ops && blk->dev_ops->drained_end) {
+            blk->dev_ops->drained_end(blk->dev_opaque);
+        }
+    }
 }
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index XXXXXXX..XXXXXXX 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -XXX,XX +XXX,XX @@ typedef struct BlockDevOps {
      * Runs when the size changed (e.g. monitor command block_resize)
      */
     void (*resize_cb)(void *opaque);
+    /*
+     * Runs when the backend receives a drain request.
+     */
+    void (*drained_begin)(void *opaque);
+    /*
+     * Runs when the backend's last drain request ends.
+     */
+    void (*drained_end)(void *opaque);
 } BlockDevOps;
 
 /* This struct is embedded in (the private) BlockBackend struct and contains
-- 
2.9.3
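
The counting scheme in the patch above (notify on the first drain request, notify again when the last one ends, and catch up if ops are registered while already drained) can be sketched in isolation. The types and functions below are hypothetical simplifications, not the QEMU `BlockBackend` API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of BlockDevOps to the two drain callbacks. */
typedef struct DevOps {
    void (*drained_begin)(void *opaque);
    void (*drained_end)(void *opaque);
} DevOps;

/* Hypothetical reduction of BlockBackend to what the counter needs. */
typedef struct Backend {
    const DevOps *ops;
    void *opaque;
    int quiesce_counter;
} Backend;

static void backend_set_ops(Backend *b, const DevOps *ops, void *opaque)
{
    b->ops = ops;
    b->opaque = opaque;
    /* If registration happens inside a drained section, notify right away. */
    if (b->quiesce_counter && ops->drained_begin) {
        ops->drained_begin(opaque);
    }
}

static void backend_drained_begin(Backend *b)
{
    /* Only the 0 -> 1 transition is forwarded to the user. */
    if (++b->quiesce_counter == 1) {
        if (b->ops && b->ops->drained_begin) {
            b->ops->drained_begin(b->opaque);
        }
    }
}

static void backend_drained_end(Backend *b)
{
    assert(b->quiesce_counter > 0);
    /* Only the 1 -> 0 transition is forwarded to the user. */
    if (--b->quiesce_counter == 0) {
        if (b->ops && b->ops->drained_end) {
            b->ops->drained_end(b->opaque);
        }
    }
}

/* Example user that records how often each callback fired. */
typedef struct Counter {
    int begins;
    int ends;
} Counter;

static void on_begin(void *opaque)
{
    ((Counter *)opaque)->begins++;
}

static void on_end(void *opaque)
{
    ((Counter *)opaque)->ends++;
}

static const DevOps counter_ops = {
    .drained_begin = on_begin,
    .drained_end = on_end,
};
```

With nested drained sections, the user in this sketch sees one `drained_begin` and one `drained_end`, regardless of nesting depth.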

From: John Snow <jsnow@redhat.com>

This lets us hook into drained_begin and drained_end requests from the
backend level, which is particularly useful for making sure that all
jobs associated with a particular node (whether the source or the target)
receive a drain request.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-4-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockjob.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index XXXXXXX..XXXXXXX 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -XXX,XX +XXX,XX @@ static const BdrvChildRole child_job = {
     .stay_at_node       = true,
 };
 
+static void block_job_drained_begin(void *opaque)
+{
+    BlockJob *job = opaque;
+    block_job_pause(job);
+}
+
+static void block_job_drained_end(void *opaque)
+{
+    BlockJob *job = opaque;
+    block_job_resume(job);
+}
+
+static const BlockDevOps block_job_dev_ops = {
+    .drained_begin = block_job_drained_begin,
+    .drained_end = block_job_drained_end,
+};
+
 BlockJob *block_job_next(BlockJob *job)
 {
     if (!job) {
@@ -XXX,XX +XXX,XX @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     }
 
     job = g_malloc0(driver->instance_size);
-    error_setg(&job->blocker, "block device is in use by block job: %s",
-               BlockJobType_lookup[driver->job_type]);
-    block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
-    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
-
     job->driver        = driver;
     job->id            = g_strdup(job_id);
     job->blk           = blk;
@@ -XXX,XX +XXX,XX @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     job->paused        = true;
     job->pause_count   = 1;
     job->refcnt        = 1;
+
+    error_setg(&job->blocker, "block device is in use by block job: %s",
+               BlockJobType_lookup[driver->job_type]);
+    block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
     bs->job = job;
 
+    blk_set_dev_ops(blk, &block_job_dev_ops, job);
+    bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
+
     QLIST_INSERT_HEAD(&block_jobs, job, job_list);
 
     blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
-- 
2.9.3

The following changes since commit 3521ade3510eb5cefb2e27a101667f25dad89935:

  Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-07-29' into staging (2021-07-29 13:17:20 +0100)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to cc8eecd7f105a1dff5876adeb238a14696061a4a:

  MAINTAINERS: Added myself as a reviewer for the NVMe Block Driver (2021-07-29 17:17:34 +0100)

----------------------------------------------------------------
Pull request

The main fix here is for io_uring. Spurious -EAGAIN errors can happen and the
request needs to be resubmitted.

The MAINTAINERS changes carry no risk and we might as well include them in QEMU
6.1.

----------------------------------------------------------------

Fabian Ebner (1):
  block/io_uring: resubmit when result is -EAGAIN

Philippe Mathieu-Daudé (1):
  MAINTAINERS: Added myself as a reviewer for the NVMe Block Driver

Stefano Garzarella (1):
  MAINTAINERS: add Stefano Garzarella as io_uring reviewer

 MAINTAINERS      |  2 ++
 block/io_uring.c | 16 +++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

-- 
2.31.1

From: Fabian Ebner <f.ebner@proxmox.com>

Linux SCSI can throw spurious -EAGAIN in some corner cases in its
completion path, which will end up being the result in the completed
io_uring request.

Resubmitting such requests should allow block jobs to complete, even
if such spurious errors are encountered.

Co-authored-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Message-id: 20210729091029.65369-1-f.ebner@proxmox.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/io_uring.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index XXXXXXX..XXXXXXX 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -XXX,XX +XXX,XX @@ static void luring_process_completions(LuringState *s)
         total_bytes = ret + luringcb->total_read;
 
         if (ret < 0) {
-            if (ret == -EINTR) {
+            /*
+             * Only writev/readv/fsync requests on regular files or host block
+             * devices are submitted. Therefore -EAGAIN is not expected but it's
+             * known to happen sometimes with Linux SCSI. Submit again and hope
+             * the request completes successfully.
+             *
+             * For more information, see:
+             * https://lore.kernel.org/io-uring/20210727165811.284510-3-axboe@kernel.dk/T/#u
+             *
+             * If the code is changed to submit other types of requests in the
+             * future, then this workaround may need to be extended to deal with
+             * genuine -EAGAIN results that should not be resubmitted
+             * immediately.
+             */
+            if (ret == -EINTR || ret == -EAGAIN) {
                 luring_resubmit(s, luringcb);
                 continue;
             }
-- 
2.31.1
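
The retry decision in the patch above can be exercised without a kernel ring. The sketch below is an illustration only: `should_resubmit` and `complete_with_retries` are hypothetical helpers, and the scripted result array stands in for successive completions of the same resubmitted request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the patched condition: both results are treated as transient. */
static bool should_resubmit(int ret)
{
    return ret == -EINTR || ret == -EAGAIN;
}

/*
 * Hypothetical driver loop: results[] is the scripted outcome of each
 * submission attempt (n must be at least 1). Resubmits while the result
 * is transient, returns the first non-transient result (or the last
 * scripted one), and reports the number of attempts made.
 */
static int complete_with_retries(const int *results, int n, int *attempts)
{
    int ret = 0;

    assert(n >= 1);
    *attempts = 0;
    for (int i = 0; i < n; i++) {
        ret = results[i];
        (*attempts)++;
        if (!should_resubmit(ret)) {
            break;
        }
    }
    return ret;
}
```

A scripted sequence of `-EAGAIN`, `-EINTR`, then a byte count completes on the third attempt in this model, while a genuine error such as `-EIO` is returned immediately. As the code comment in the patch notes, resubmitting immediately is only reasonable because genuine `-EAGAIN` results are not expected for the request types currently submitted.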