[Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block

Deepa Srinivasan posted 1 patch 6 years, 4 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/1511364808-30171-1-git-send-email-deepa.srinivasan@oracle.com
There is a newer version of this series
Posted by Deepa Srinivasan 6 years, 4 months ago
Starting qemu with the following arguments causes qemu to segfault:
... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1

This patch fixes blk_aio_ioctl() so that it does not pass stack addresses to
blk_aio_ioctl_entry(), which may be invoked after blk_aio_ioctl() returns. More
details about the bug follow.

blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().

When blk_aio_ioctl() is executed from within a coroutine context (e.g.
iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
the current coroutine's wakeup queue. blk_aio_ioctl() then returns.

When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
....
    BlkRwCo *rwco = &acb->rwco;

    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                             rwco->qiov->iov[0].iov_base);  <--- qiov is
                                                                 invalid here
...

When blk_aio_ioctl() is called from a non-coroutine context,
blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
qemu_coroutine_yield(), blk_aio_ioctl() returns while the coroutine is still
pending. When the coroutine is resumed and blk_co_ioctl() finishes, control
returns to blk_aio_ioctl_entry() just after that call. Nothing dereferences the
stale pointers after this point, but the function is still holding on to them.
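
The ordering problem described above can be sketched in plain C. This is not
QEMU code: defer(), run_pending(), ioctl_entry() and submit() are hypothetical
stand-ins for the wakeup queue, blk_aio_ioctl_entry() and blk_aio_ioctl(). The
sketch only runs the safe variant, where the deferred entry receives memory
owned by the caller. The buggy variant is described in a comment, because
executing it would be undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-slot stand-in for a coroutine wakeup queue. */
typedef void EntryFn(void *opaque);

static EntryFn *pending_fn;
static void *pending_arg;

static int event_counter;       /* orders the two events below */
static int submit_returned_at;  /* when submit() was about to return */
static int entry_ran_at;        /* when the deferred entry actually ran */

static void defer(EntryFn *fn, void *arg)
{
    pending_fn = fn;
    pending_arg = arg;
}

static void run_pending(void)
{
    EntryFn *fn = pending_fn;

    pending_fn = NULL;
    if (fn) {
        fn(pending_arg);
    }
}

static void ioctl_entry(void *opaque)
{
    int *result = opaque;       /* must still point at live memory here */

    assert(result != NULL);
    *result = 42;
    entry_ran_at = ++event_counter;
}

static void submit(int *caller_owned)
{
    /*
     * The buggy pattern would be:
     *     int local; defer(ioctl_entry, &local);
     * 'local' dies when submit() returns, before ioctl_entry() runs.
     * Passing memory owned by the caller avoids that.
     */
    defer(ioctl_entry, caller_owned);
    submit_returned_at = ++event_counter;
}

/* Returns 1 if the entry ran after submit() returned and wrote the result. */
int demo_deferred_entry(void)
{
    int result = 0;             /* outlives both submit() and run_pending() */

    submit(&result);
    run_pending();
    return result == 42 && submit_returned_at < entry_ran_at;
}
```

Calling demo_deferred_entry() from a test or from main() exercises the whole
sequence. The point is that the entry function runs strictly after submit() has
returned, so anything it dereferences has to outlive submit().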

The fix is to allocate the QEMUIOVector and struct iovec as part of the
request struct that already contains the IO buffer. The memory for this struct
is guaranteed to remain valid until the AIO is completed.
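
As a rough illustration of that approach (again plain C, not QEMU code), the
sketch below embeds the vector in a heap-allocated request. DemoIOVector,
DemoReq, demo_submit() and demo_complete() are hypothetical names. DemoIOVector
is a much simplified stand-in for QEMUIOVector, and an int stands in for the
sg_io_hdr_t header.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>            /* struct iovec */

/* Simplified, hypothetical stand-in for QEMUIOVector. */
typedef struct DemoIOVector {
    struct iovec *iov;
    int niov;
} DemoIOVector;

/* Hypothetical request: the vector lives next to the buffer it describes. */
typedef struct DemoReq {
    int hdr;                    /* stands in for sg_io_hdr_t */
    DemoIOVector qiov;
    struct iovec iov;
} DemoReq;

static DemoIOVector *pending;   /* one-slot "AIO in flight" queue */

static void demo_submit(DemoIOVector *qiov)
{
    pending = qiov;             /* only the pointer is kept, no copy */
}

static DemoReq *demo_req_new(int hdr)
{
    DemoReq *req = calloc(1, sizeof(*req));

    assert(req != NULL);
    req->hdr = hdr;
    req->iov.iov_base = &req->hdr;
    req->iov.iov_len = 0;       /* length is not used for ioctls */
    req->qiov.iov = &req->iov;
    req->qiov.niov = 1;
    return req;
}

/* Runs "later" and extracts the buffer through iov[0].iov_base. */
static int demo_complete(void)
{
    int *hdr = pending->iov[0].iov_base;
    int value = *hdr;

    pending = NULL;
    return value;
}

/* Returns the header value read by the deferred completion. */
int demo_embedded_iovec(void)
{
    DemoReq *req = demo_req_new(7);
    int value;

    demo_submit(&req->qiov);    /* returns before the request completes */
    value = demo_complete();    /* req is still allocated, so this is safe */
    free(req);                  /* freed only after completion */
    return value;
}
```

Because the vector and the iovec share the lifetime of the request, the pointer
handed to demo_submit() stays valid until the request is freed after completion.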

Signed-off-by: Deepa Srinivasan <deepa.srinivasan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
---
 block/block-backend.c          | 13 ++-----------
 hw/block/virtio-blk.c          |  9 ++++++++-
 hw/scsi/scsi-disk.c            | 10 +++++++++-
 hw/scsi/scsi-generic.c         |  9 ++++++++-
 include/sysemu/block-backend.h |  2 +-
 5 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index baef8e7..c275827 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1472,19 +1472,10 @@ static void blk_aio_ioctl_entry(void *opaque)
     blk_aio_complete(acb);
 }
 
-BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
+BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
                           BlockCompletionFunc *cb, void *opaque)
 {
-    QEMUIOVector qiov;
-    struct iovec iov;
-
-    iov = (struct iovec) {
-        .iov_base = buf,
-        .iov_len = 0,
-    };
-    qemu_iovec_init_external(&qiov, &iov, 1);
-
-    return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, opaque);
+    return blk_aio_prwv(blk, req, 0, qiov, blk_aio_ioctl_entry, 0, cb, opaque);
 }
 
 int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 05d1440..ed9f774 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -151,6 +151,8 @@ out:
 typedef struct {
     VirtIOBlockReq *req;
     struct sg_io_hdr hdr;
+    QEMUIOVector qiov;
+    struct iovec iov;
 } VirtIOBlockIoctlReq;
 
 static void virtio_blk_ioctl_complete(void *opaque, int status)
@@ -298,7 +300,12 @@ static int virtio_blk_handle_scsi_req(VirtIOBlockReq *req)
     ioctl_req->hdr.sbp = elem->in_sg[elem->in_num - 3].iov_base;
     ioctl_req->hdr.mx_sb_len = elem->in_sg[elem->in_num - 3].iov_len;
 
-    acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->hdr,
+    ioctl_req->iov.iov_base = &ioctl_req->hdr;
+    ioctl_req->iov.iov_len = 0;
+
+    qemu_iovec_init_external(&ioctl_req->qiov, &ioctl_req->iov, 1);
+
+    acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->qiov,
                         virtio_blk_ioctl_complete, ioctl_req);
     if (!acb) {
         g_free(ioctl_req);
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 1243117..7cbe18d 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2636,6 +2636,9 @@ typedef struct SCSIBlockReq {
     SCSIDiskReq req;
     sg_io_hdr_t io_header;
 
+    QEMUIOVector qiov;
+    struct iovec iov;
+
     /* Selected bytes of the original CDB, copied into our own CDB.  */
     uint8_t cmd, cdb1, group_number;
 
@@ -2722,7 +2725,12 @@ static BlockAIOCB *scsi_block_do_sgio(SCSIBlockReq *req,
     io_header->usr_ptr = r;
     io_header->flags |= SG_FLAG_DIRECT_IO;
 
-    aiocb = blk_aio_ioctl(s->qdev.conf.blk, SG_IO, io_header, cb, opaque);
+    req->iov.iov_base = io_header;
+    req->iov.iov_len = 0;
+
+    qemu_iovec_init_external(&req->qiov, &req->iov, 1);
+
+    aiocb = blk_aio_ioctl(s->qdev.conf.blk, SG_IO, &req->qiov, cb, opaque);
     assert(aiocb != NULL);
     return aiocb;
 }
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index bd0d9ff..856af7c 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -46,6 +46,8 @@ typedef struct SCSIGenericReq {
     int buflen;
     int len;
     sg_io_hdr_t io_header;
+    QEMUIOVector qiov;
+    struct iovec iov;
 } SCSIGenericReq;
 
 static void scsi_generic_save_request(QEMUFile *f, SCSIRequest *req)
@@ -135,7 +137,12 @@ static int execute_command(BlockBackend *blk,
     r->io_header.usr_ptr = r;
     r->io_header.flags |= SG_FLAG_DIRECT_IO;
 
-    r->req.aiocb = blk_aio_ioctl(blk, SG_IO, &r->io_header, complete, r);
+    r->iov.iov_base = &r->io_header;
+    r->iov.iov_len = 0;
+
+    qemu_iovec_init_external(&r->qiov, &r->iov, 1);
+
+    r->req.aiocb = blk_aio_ioctl(blk, SG_IO, &r->qiov, complete, r);
     if (r->req.aiocb == NULL) {
         return -EIO;
     }
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index c4e52a5..32f4486 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -151,7 +151,7 @@ void blk_aio_cancel(BlockAIOCB *acb);
 void blk_aio_cancel_async(BlockAIOCB *acb);
 int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf);
-BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
+BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
                           BlockCompletionFunc *cb, void *opaque);
 int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes);
 int blk_co_flush(BlockBackend *blk);
-- 
2.7.4


Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 22/11/2017 16:33, Deepa Srinivasan wrote:
> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
>                            BlockCompletionFunc *cb, void *opaque)

I think this is not the best way to fix the bug, because it adds extra
unnecessary code in the callers.

Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
same for blk_aio_prwv's "qiov" argument?

Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
use rwco->buf.

Thanks,

Paolo

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Kevin Wolf 6 years, 4 months ago
Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
> On 22/11/2017 16:33, Deepa Srinivasan wrote:
> > -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
> > +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
> >                            BlockCompletionFunc *cb, void *opaque)
> 
> I think this is not the best way to fix the bug, because it adds extra
> unnecessary code in the callers.
> 
> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
> same for blk_aio_prwv's "qiov" argument?
> 
> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
> use rwco->buf.

But the same struct is used for read and write requests that do use an
actual QEMUIOVector and not just a linear buffer.

Kevin

Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 22/11/2017 19:06, Kevin Wolf wrote:
> Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
>> On 22/11/2017 16:33, Deepa Srinivasan wrote:
>>> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
>>> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
>>>                            BlockCompletionFunc *cb, void *opaque)
>>
>> I think this is not the best way to fix the bug, because it adds extra
>> unnecessary code in the callers.
>>
>> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
>> same for blk_aio_prwv's "qiov" argument?
>>
>> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
>> use rwco->buf.
> 
> But the same struct is used for read and write requests that do use an
> actual QEMUIOVector and not just a linear buffer.

Then let's call it "void *opaque", or make it a union (but I think
that's overkill).

The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned,
and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to
extract buf:

    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                             rwco->qiov->iov[0].iov_base);

Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much
uglier than using a void* for what is effectively a multi-type pointer.

Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Deepa Srinivasan 6 years, 4 months ago
I agree that passing in QEMUIOVector to blk_aio_ioctl() as a holder of the
void* buffer used in blk_aio_ioctl_entry() is unnecessary. But, as Kevin noted,
read and write were using the QEMUIOVector in BlkRwCo.

To avoid changes to the callers of blk_aio_ioctl(), I’ll change blk_aio_prwv()
to take a void pointer instead of QEMUIOVector* and use a union to hold the
buffer in BlkRwCo.

> On Nov 22, 2017, at 11:24 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 22/11/2017 19:06, Kevin Wolf wrote:
>> Am 22.11.2017 um 17:34 hat Paolo Bonzini geschrieben:
>>> On 22/11/2017 16:33, Deepa Srinivasan wrote:
>>>> -BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
>>>> +BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, QEMUIOVector *qiov,
>>>>                           BlockCompletionFunc *cb, void *opaque)
>>> 
>>> I think this is not the best way to fix the bug, because it adds extra
>>> unnecessary code in the callers.
>>> 
>>> Perhaps you can change BlkRwCo's "qiov" field to "void *buf" and the
>>> same for blk_aio_prwv's "qiov" argument?
>>> 
>>> Then the QEMUIOVector is not needed at all, and blk_co_ioctl can just
>>> use rwco->buf.
>> 
>> But the same struct is used for read and write requests that do use an
>> actual QEMUIOVector and not just a linear buffer.
> 
> Then let's call it "void *opaque", or make it a union (but I think
> that's overkill).
> 
> The QEMUIOVector pointer is opaque as far as blk_aio_prwv is concerned,
> and it is only created by blk_aio_ioctl for blk_aio_ioctl_entry to
> extract buf:
> 
>    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
>                             rwco->qiov->iov[0].iov_base);
> 
> Exposing the fake QEMUIOVector to the callers of blk_aio_ioctl is much
> uglier than using a void* for what is effectively a multi-type pointer.
> 
> Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 23/11/2017 03:55, Deepa Srinivasan wrote:
> I agree that passing in QEMUIOVector to blk_aio_ioctl() as a holder of
> the void* buffer used in blk_aio_ioctl_entry() is unnecessary. But, as
> Kevin noted, read and write were using the QEMUIOVector in BlkRwCo.
> 
> To avoid changes to the callers of blk_aio_ioctl(), I’ll change
> blk_aio_prwv() to take a void pointer instead of QEMUIOVector* and use a
> union to hold the buffer in BlkRwCo.

The union is unnecessary.  A QEMUIOVector* can be stored in a void* just
fine.

Paolo

[Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Deepa Srinivasan 6 years, 4 months ago
Starting qemu with the following arguments causes qemu to segfault:
... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1

This patch fixes blk_aio_ioctl() so that it does not pass stack addresses to
blk_aio_ioctl_entry(), which may be invoked after blk_aio_ioctl() returns. More
details about the bug follow.

blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().

When blk_aio_ioctl() is executed from within a coroutine context (e.g.
iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
the current coroutine's wakeup queue. blk_aio_ioctl() then returns.

When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
....
    BlkRwCo *rwco = &acb->rwco;

    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                             rwco->qiov->iov[0].iov_base);  <--- qiov is
                                                                 invalid here
...

When blk_aio_ioctl() is called from a non-coroutine context,
blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
qemu_coroutine_yield(), blk_aio_ioctl() returns while the coroutine is still
pending. When the coroutine is resumed and blk_co_ioctl() finishes, control
returns to blk_aio_ioctl_entry() just after that call. Nothing dereferences the
stale pointers after this point, but the function is still holding on to them.

The fix is to change blk_aio_prwv() to accept a void pointer for the IO buffer
rather than a QEMUIOVector pointer. blk_aio_prwv() passes this through in
BlkRwCo, and each coroutine entry function either casts it to QEMUIOVector *
or, in blk_aio_ioctl_entry(), uses the void pointer directly.
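
The idea of one void pointer interpreted per entry function can be sketched in
plain C as follows. This is an illustration under stated assumptions, not the
patched QEMU code: DemoIOVector and DemoRwCo are simplified, hypothetical
stand-ins for QEMUIOVector and BlkRwCo, and the two entry functions only model
how a read-style and an ioctl-style entry would recover their argument.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>            /* struct iovec */

/* Simplified, hypothetical stand-ins for QEMUIOVector and BlkRwCo. */
typedef struct DemoIOVector {
    struct iovec *iov;
    int niov;
    size_t size;
} DemoIOVector;

typedef struct DemoRwCo {
    void *iobuf;    /* DemoIOVector * for read/write, raw buffer for ioctl */
    long ret;
} DemoRwCo;

/* Read-style entry: knows iobuf is a vector and converts it back. */
static void demo_read_entry(void *opaque)
{
    DemoRwCo *rwco = opaque;
    DemoIOVector *qiov = rwco->iobuf;

    assert(qiov != NULL);
    rwco->ret = (long)qiov->size;
}

/* Ioctl-style entry: uses the raw buffer directly, no vector involved. */
static void demo_ioctl_entry(void *opaque)
{
    DemoRwCo *rwco = opaque;
    int *arg = rwco->iobuf;

    rwco->ret = *arg;
}

/* Returns 1 if both entries recovered the value behind iobuf. */
int demo_void_iobuf(void)
{
    char data[16];
    struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
    DemoIOVector qiov = { .iov = &iov, .niov = 1, .size = sizeof(data) };
    int ioctl_arg = 5;
    DemoRwCo rd = { .iobuf = &qiov, .ret = -1 };
    DemoRwCo io = { .iobuf = &ioctl_arg, .ret = -1 };

    demo_read_entry(&rd);
    demo_ioctl_entry(&io);
    return rd.ret == 16 && io.ret == 5;
}
```

In this sketch the ioctl path never builds a temporary vector, so there is no
stack object whose address could outlive the submitting function. The cost is
that the compiler no longer checks which type iobuf holds; each entry function
has to know it.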

Signed-off-by: Deepa Srinivasan <deepa.srinivasan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
---
 block/block-backend.c | 51 +++++++++++++++++++++++++--------------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index baef8e7..2d0d9b6 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1140,7 +1140,7 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
 typedef struct BlkRwCo {
     BlockBackend *blk;
     int64_t offset;
-    QEMUIOVector *qiov;
+    void *iobuf;
     int ret;
     BdrvRequestFlags flags;
 } BlkRwCo;
@@ -1148,17 +1148,19 @@ typedef struct BlkRwCo {
 static void blk_read_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
+    QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, rwco->qiov->size,
-                              rwco->qiov, rwco->flags);
+    rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, qiov->size,
+                              qiov, rwco->flags);
 }
 
 static void blk_write_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
+    QEMUIOVector *qiov = rwco->iobuf;
 
-    rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, rwco->qiov->size,
-                               rwco->qiov, rwco->flags);
+    rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, qiov->size,
+                               qiov, rwco->flags);
 }
 
 static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
@@ -1178,7 +1180,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
     rwco = (BlkRwCo) {
         .blk    = blk,
         .offset = offset,
-        .qiov   = &qiov,
+        .iobuf  = &qiov,
         .flags  = flags,
         .ret    = NOT_DONE,
     };
@@ -1275,7 +1277,7 @@ static void blk_aio_complete_bh(void *opaque)
 }
 
 static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
-                                QEMUIOVector *qiov, CoroutineEntry co_entry,
+                                void *iobuf, CoroutineEntry co_entry,
                                 BdrvRequestFlags flags,
                                 BlockCompletionFunc *cb, void *opaque)
 {
@@ -1287,7 +1289,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
     acb->rwco = (BlkRwCo) {
         .blk    = blk,
         .offset = offset,
-        .qiov   = qiov,
+        .iobuf  = iobuf,
         .flags  = flags,
         .ret    = NOT_DONE,
     };
@@ -1310,10 +1312,11 @@ static void blk_aio_read_entry(void *opaque)
 {
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
+    QEMUIOVector *qiov = rwco->iobuf;
 
-    assert(rwco->qiov->size == acb->bytes);
+    assert(qiov->size == acb->bytes);
     rwco->ret = blk_co_preadv(rwco->blk, rwco->offset, acb->bytes,
-                              rwco->qiov, rwco->flags);
+                              qiov, rwco->flags);
     blk_aio_complete(acb);
 }
 
@@ -1321,10 +1324,11 @@ static void blk_aio_write_entry(void *opaque)
 {
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
+    QEMUIOVector *qiov = rwco->iobuf;
 
-    assert(!rwco->qiov || rwco->qiov->size == acb->bytes);
+    assert(!qiov || qiov->size == acb->bytes);
     rwco->ret = blk_co_pwritev(rwco->blk, rwco->offset, acb->bytes,
-                               rwco->qiov, rwco->flags);
+                               qiov, rwco->flags);
     blk_aio_complete(acb);
 }
 
@@ -1453,8 +1457,10 @@ int blk_co_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
 static void blk_ioctl_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
+    QEMUIOVector *qiov = rwco->iobuf;
+
     rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
-                             rwco->qiov->iov[0].iov_base);
+                             qiov->iov[0].iov_base);
 }
 
 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
@@ -1467,24 +1473,15 @@ static void blk_aio_ioctl_entry(void *opaque)
     BlkAioEmAIOCB *acb = opaque;
     BlkRwCo *rwco = &acb->rwco;
 
-    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
-                             rwco->qiov->iov[0].iov_base);
+    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
+
     blk_aio_complete(acb);
 }
 
 BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                           BlockCompletionFunc *cb, void *opaque)
 {
-    QEMUIOVector qiov;
-    struct iovec iov;
-
-    iov = (struct iovec) {
-        .iov_base = buf,
-        .iov_len = 0,
-    };
-    qemu_iovec_init_external(&qiov, &iov, 1);
-
-    return blk_aio_prwv(blk, req, 0, &qiov, blk_aio_ioctl_entry, 0, cb, opaque);
+    return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, opaque);
 }
 
 int blk_co_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
@@ -1900,7 +1897,9 @@ int blk_truncate(BlockBackend *blk, int64_t offset, PreallocMode prealloc,
 static void blk_pdiscard_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
-    rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, rwco->qiov->size);
+    QEMUIOVector *qiov = rwco->iobuf;
+
+    rwco->ret = blk_co_pdiscard(rwco->blk, rwco->offset, qiov->size);
 }
 
 int blk_pdiscard(BlockBackend *blk, int64_t offset, int bytes)
-- 
2.7.4


Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Deepa Srinivasan 6 years, 4 months ago
blk_aio_prwv() now takes a void pointer, and the coroutine functions have been modified to cast it to a QEMUIOVector pointer where needed. It does not use a union in BlkRwCo, since that would lead to code in which blk_aio_prwv() writes to the void pointer member while coroutines sometimes read the QEMUIOVector member. Paolo also suggested not using a union.

Note that a similar issue exists in blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always creates the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does not need a QEMUIOVector. This will need to be fixed separately to keep it consistent with the AIO path.

> On Nov 23, 2017, at 8:55 AM, Deepa Srinivasan <deepa.srinivasan@oracle.com> wrote:
> 
> [...]
> 


Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 23/11/2017 18:05, Deepa Srinivasan wrote:
> blk_aio_prwv() now takes a void pointer and the coroutine functions
> have been modified to cast it to a QEMUIOVector pointer where needed.
> It does not use a union in BlkRwCo, since that would lead to code in
> which blk_aio_prwv() writes to the void pointer member while
> coroutines sometimes read the QEMUIOVector member. Paolo also
> suggested not using a union.
> 
> Note that a similar issue exists in
> blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always
> creates the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does
> not need a QEMUIOVector. This will need to be fixed separately to
> keep it consistent with the AIO path.

For that it's probably simplest to inline blk_prw into blk_ioctl and
remove all the cruft:

diff --git a/block/block-backend.c b/block/block-backend.c
index 45d9101be3..ceab3166bc 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1404,12 +1404,28 @@ static void blk_ioctl_entry(void *opaque)
 {
     BlkRwCo *rwco = opaque;
     rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
-                             rwco->qiov->iov[0].iov_base);
+                             rwco->iobuf);
 }

 int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
 {
-    return blk_prw(blk, req, buf, 0, blk_ioctl_entry, 0);
+    BlkRwCo rwco = (BlkRwCo) {
+        .blk    = blk,
+        .iobuf  = buf,
+        .offset = req,
+        .ret    = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        blk_ioctl_entry(&rwco);
+    } else {
+        Coroutine *co = qemu_coroutine_create(blk_ioctl_entry, &rwco);
+        bdrv_coroutine_enter(blk_bs(blk), co);
+        BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
+    }
+
+    return rwco.ret;
 }

 static void blk_aio_ioctl_entry(void *opaque)

Thanks,

Paolo

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Kevin Wolf 6 years, 4 months ago
Am 23.11.2017 um 18:05 hat Deepa Srinivasan geschrieben:
> blk_aio_prwv() now takes a void pointer and the coroutine functions
> have been modified to cast it to a QEMUIOVector pointer where needed.
> It does not use a union in BlkRwCo, since that would lead to code in
> which blk_aio_prwv() writes to the void pointer member while
> coroutines sometimes read the QEMUIOVector member. Paolo also
> suggested not using a union.

I don't particularly like void pointers, but I guess it's fair enough.

> Note that a similar issue exists in
> blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always creates
> the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does not need a
> QEMUIOVector. This will need to be fixed separately to keep it
> consistent with the AIO path.

I don't think there is an actual problem in the blk_ioctl() path because
the iov on the stack stays valid as long as the coroutine runs. AIO is
different because it returns before the coroutine has terminated.
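
To make the lifetime difference concrete, here is a minimal sketch. It
is not QEMU code: every demo_* name is a hypothetical stand-in, and a
plain array plays the role of the scheduler that runs the coroutine
later. The synchronous wrapper may point the request at a local because
it waits before returning; an AIO-style caller returns right after
submitting, so whatever it passes has to outlive the call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request type; stands in for the AIOCB plus BlkRwCo. */
typedef struct DemoReq {
    void *buf;      /* caller-owned; must stay valid until done is set */
    int done;
    int result;
} DemoReq;

#define DEMO_MAX_PENDING 8
static DemoReq *demo_pending[DEMO_MAX_PENDING];
static size_t demo_npending;

/* AIO-style submit: only queues the request, then returns at once. */
static int demo_aio_submit(DemoReq *req, void *buf)
{
    if (demo_npending == DEMO_MAX_PENDING) {
        return -1;
    }
    req->buf = buf;
    req->done = 0;
    req->result = 0;
    demo_pending[demo_npending++] = req;
    return 0;
}

/* Runs later, when the submitter's stack frame may already be gone. */
static void demo_run_pending(void)
{
    for (size_t i = 0; i < demo_npending; i++) {
        DemoReq *req = demo_pending[i];

        req->result = *(int *)req->buf;   /* dereferenced only here */
        req->done = 1;
    }
    demo_npending = 0;
}

/* Synchronous wrapper: locals are safe because we wait for completion. */
static int demo_sync(int value)
{
    DemoReq req;
    int local = value;

    if (demo_aio_submit(&req, &local) < 0) {
        return -1;
    }
    while (!req.done) {
        demo_run_pending();
    }
    return req.result;
}
```

demo_sync() corresponds to the blk_ioctl() shape; calling
demo_aio_submit() with the address of a local and then returning would
correspond to the blk_aio_ioctl() bug.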

Kevin

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 23/11/2017 18:29, Kevin Wolf wrote:
>> Note that a similar issue exists in
>> blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always creates
>> the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does not need a
>> QEMUIOVector. This will need to be fixed separately to keep it
>> consistent with the AIO path.
> 
> I don't think there is an actual problem in the blk_ioctl() path because
> the iov on the stack stays valid as long as the coroutine runs. AIO is
> different because it returns before the coroutine has terminated.

I agree, it's just code that is slightly ugly.

Paolo

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Kevin Wolf 6 years, 4 months ago
Am 23.11.2017 um 18:31 hat Paolo Bonzini geschrieben:
> On 23/11/2017 18:29, Kevin Wolf wrote:
> >> Note that a similar issue exists in
> >> blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always creates
> >> the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does not need a
> >> QEMUIOVector. This will need to be fixed separately to keep it
> >> consistent with the AIO path.
> > 
> > I don't think there is an actual problem in the blk_ioctl() path because
> > the iov on the stack stays valid as long as the coroutine runs. AIO is
> > different because it returns before the coroutine has terminated.
> 
> I agree, it's just code that is slightly ugly.

Slightly. Neither void pointers nor code duplication make it less ugly,
though. So in this case, I'd say: If it ain't broke, don't fix it.

Kevin

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Deepa Srinivasan 6 years, 4 months ago
> On Nov 23, 2017, at 9:29 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> 
> Am 23.11.2017 um 18:05 hat Deepa Srinivasan geschrieben:
>> blk_aio_prwv() now takes a void pointer and the coroutine functions
>> have been modified to cast it to a QEMUIOVector pointer where needed.
>> It does not use a union in BlkRwCo, since that would lead to code in
>> which blk_aio_prwv() writes to the void pointer member while
>> coroutines sometimes read the QEMUIOVector member. Paolo also
>> suggested not using a union.
> 
> I don't particularly like void pointers, but I guess it's fair enough.

Agreed, but if a union were to hold QEMUIOVector* and void* in BlkRwCo, blk_aio_prwv() would always write to void* but some coroutine functions would read from the QEMUIOVector* member. Keeping it as a void pointer is a safer option.
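
A rough sketch of the pattern, with hypothetical demo_* types standing in for QEMUIOVector and BlkRwCo (this is an illustration, not the code in block-backend.c): the submit helper stores one opaque pointer, and each entry function alone decides how to interpret it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMUIOVector; not the real definition. */
typedef struct DemoIOVector {
    size_t size;
} DemoIOVector;

/* Hypothetical stand-in for BlkRwCo with the void pointer member. */
typedef struct DemoRwCo {
    long offset;
    void *iobuf;    /* meaning is defined by the entry function */
    int ret;
} DemoRwCo;

typedef void DemoEntry(void *opaque);

/* Read-style entry: knows that iobuf points at a vector. */
static void demo_read_entry(void *opaque)
{
    DemoRwCo *rwco = opaque;
    DemoIOVector *qiov = rwco->iobuf;

    rwco->ret = (int)qiov->size;
}

/* ioctl-style entry: uses iobuf directly as the raw buffer. */
static void demo_ioctl_entry(void *opaque)
{
    DemoRwCo *rwco = opaque;
    unsigned char *buf = rwco->iobuf;

    rwco->ret = buf[0];
}

/* The helper never looks inside iobuf; it only passes it through. */
static int demo_submit(long offset, void *iobuf, DemoEntry *entry)
{
    DemoRwCo rwco = { .offset = offset, .iobuf = iobuf, .ret = -1 };

    entry(&rwco);
    return rwco.ret;
}
```

With a union, the helper would write one member and some entry functions would read the other; with a single void pointer there is only one member to write and the conversion happens in one visible place per entry function.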

> 
>> Note that a similar issue exists in
>> blk_ioctl()/blk_ioctl_entry()/blk_prw() where blk_prw() always creates
>> the QEMUIOVector even if blk_ioctl()/blk_ioctl_entry() does not need a
>> QEMUIOVector. This will need to be fixed separately to keep it
>> consistent with the AIO path.
> 
> I don't think there is an actual problem in the blk_ioctl() path because
> the iov on the stack stays valid as long as the coroutine runs. AIO is
> different because it returns before the coroutine has terminated.
> 

The problem in blk_ioctl() is not a crash, because blk_prw() waits for the coroutine to complete, as you say.

The issue is that it unnecessarily creates a QEMUIOVector for the ioctl case. I was saying, if this is to be kept consistent with the AIO patch, then it could be done in a separate patch.

> Kevin
> 


Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Deepa Srinivasan 6 years, 4 months ago
Kevin, Paolo, Stefan,

Are there any further comments on this patch? Can this patch be committed?

Thanks
Deepa

> On Nov 23, 2017, at 8:55 AM, Deepa Srinivasan <deepa.srinivasan@oracle.com> wrote:
> 
> [...]
> 


Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Stefan Hajnoczi 6 years, 4 months ago
On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote:
> Starting qemu with the following arguments causes qemu to segfault:
> ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
> iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> 
> This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
> details about the bug follow.
> 
> blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> 
> When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> 
> When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> ....
>     BlkRwCo *rwco = &acb->rwco;
> 
>     rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
>                              rwco->qiov->iov[0].iov_base);  <--- qiov is
>                                                                  invalid here
> ...
> 
> In the case when blk_aio_ioctl() is called from a non-coroutine context,
> blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> execution is complete, control returns to blk_aio_ioctl_entry() after the call
> to blk_co_ioctl(). There is no invalid reference after this point, but the
> function is still holding on to invalid pointers.
> 
> The fix is to allocate memory for the QEMUIOVector and struct iovec as part of
> the request struct which the IO buffer is part of. The memory for this struct is
> guaranteed to be valid till the AIO is completed.

Thanks for the patch!

AIO APIs currently don't require the caller to match qiov's lifetime to
the I/O request lifetime.  This patch changes that for blk_aio_ioctl()
only.  If we want to do this consistently then all aio callers need to
be audited and fixed.

The alternative is to make the API copy qiov when necessary.  That is
less efficient but avoids modifying all callers.

Either way, the lifetime of qiov must be consistent across all aio APIs,
not just blk_aio_ioctl().
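
To illustrate the copying alternative, here is a rough sketch using the
POSIX struct iovec. The DemoAioReq type and the demo_* helpers are
hypothetical and not part of any QEMU API. Only the descriptor array is
duplicated; the data buffers it points to still belong to the caller
and must stay valid until completion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical request that owns a private copy of the descriptors. */
typedef struct DemoAioReq {
    struct iovec *iov;
    int niov;
} DemoAioReq;

/* Copy the caller's iovec array so the caller may drop or reuse it. */
static DemoAioReq *demo_req_new(const struct iovec *iov, int niov)
{
    DemoAioReq *req;

    if (niov <= 0) {
        return NULL;
    }
    req = malloc(sizeof(*req));
    if (!req) {
        return NULL;
    }
    req->iov = malloc(sizeof(*iov) * (size_t)niov);
    if (!req->iov) {
        free(req);
        return NULL;
    }
    memcpy(req->iov, iov, sizeof(*iov) * (size_t)niov);
    req->niov = niov;
    return req;
}

/* Total length described by the request's own copy. */
static size_t demo_req_bytes(const DemoAioReq *req)
{
    size_t total = 0;

    for (int i = 0; i < req->niov; i++) {
        total += req->iov[i].iov_len;
    }
    return total;
}

/* Called from the completion path once the request is finished. */
static void demo_req_free(DemoAioReq *req)
{
    if (req) {
        free(req->iov);
        free(req);
    }
}
```

The extra allocation and memcpy() per request are the efficiency cost
mentioned above.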

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Kevin Wolf 6 years, 4 months ago
Am 22.11.2017 um 18:06 hat Stefan Hajnoczi geschrieben:
> On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote:
> > Starting qemu with the following arguments causes qemu to segfault:
> > ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
> > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> > 
> > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
> > details about the bug follow.
> > 
> > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> > 
> > When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> > the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> > 
> > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> > ....
> >     BlkRwCo *rwco = &acb->rwco;
> > 
> >     rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
> >                              rwco->qiov->iov[0].iov_base);  <--- qiov is
> >                                                                  invalid here
> > ...
> > 
> > In the case when blk_aio_ioctl() is called from a non-coroutine context,
> > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> > execution is complete, control returns to blk_aio_ioctl_entry() after the call
> > to blk_co_ioctl(). There is no invalid reference after this point, but the
> > function is still holding on to invalid pointers.
> > 
> > The fix is to allocate memory for the QEMUIOVector and struct iovec as part of
> > the request struct which the IO buffer is part of. The memory for this struct is
> > guaranteed to be valid till the AIO is completed.
> 
> Thanks for the patch!
> 
> AIO APIs currently don't require the caller to match qiov's lifetime to
> the I/O request lifetime.  This patch changes that for blk_aio_ioctl()
> only.  If we want to do this consistently then all aio callers need to
> be audited and fixed.
> 
> The alternative is to make the API copy qiov when necessary.  That is
> less efficient but avoids modifying all callers.
> 
> Either way, the lifetime of qiov must be consistent across all aio APIs,
> not just blk_aio_ioctl().

Don't all blk_aio_*() APIs that take a qiov pointer require that it
remains valid until the request completes? I don't think they are copied
anywhere for blk_aio_preadv/pwritev() before being passed to the block
driver.

So this does look consistent with the existing functions to me.
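
One way a caller can meet that requirement is to keep the vector inside
the request object itself, so both are freed together on completion.
The following is only a sketch under that assumption; DemoIoctlReq and
the demo_* functions are hypothetical names, not the structures used in
the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>

typedef void DemoCompletionFunc(void *opaque, int ret);

/* Hypothetical request: the iovec is embedded, not on anyone's stack. */
typedef struct DemoIoctlReq {
    struct iovec iov;
    DemoCompletionFunc *cb;
    void *opaque;
} DemoIoctlReq;

/* Allocate the request; a pointer to req->iov stays valid until free. */
static DemoIoctlReq *demo_ioctl_req_new(void *buf, size_t len,
                                        DemoCompletionFunc *cb, void *opaque)
{
    DemoIoctlReq *req = malloc(sizeof(*req));

    if (!req) {
        return NULL;
    }
    req->iov.iov_base = buf;
    req->iov.iov_len = len;
    req->cb = cb;
    req->opaque = opaque;
    return req;
}

/* Completion path: release the request, then notify the caller. */
static void demo_ioctl_req_complete(DemoIoctlReq *req, int ret)
{
    DemoCompletionFunc *cb = req->cb;
    void *opaque = req->opaque;

    free(req);
    cb(opaque, ret);
}

/* Example callback: stores the result where the caller can see it. */
static void demo_store_ret(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```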

Kevin

Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Stefan Hajnoczi 6 years, 4 months ago
On Wed, Nov 22, 2017 at 07:04:26PM +0100, Kevin Wolf wrote:
> Am 22.11.2017 um 18:06 hat Stefan Hajnoczi geschrieben:
> > On Wed, Nov 22, 2017 at 07:33:28AM -0800, Deepa Srinivasan wrote:
> > > Starting qemu with the following arguments causes qemu to segfault:
> > > ... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
> > > iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
> > > 
> > > This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
> > > blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
> > > details about the bug follow.
> > > 
> > > blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
> > > coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
> > > 
> > > When blk_aio_ioctl() is executed from within a coroutine context (e.g.
> > > iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
> > > the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
> > > 
> > > When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
> > > ....
> > >     BlkRwCo *rwco = &acb->rwco;
> > > 
> > >     rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
> > >                              rwco->qiov->iov[0].iov_base);  <--- qiov is
> > >                                                                  invalid here
> > > ...
> > > 
> > > In the case when blk_aio_ioctl() is called from a non-coroutine context,
> > > blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
> > > qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
> > > execution is complete, control returns to blk_aio_ioctl_entry() after the call
> > > to blk_co_ioctl(). There is no invalid reference after this point, but the
> > > function is still holding on to invalid pointers.
> > > 
> > > The fix is to allocate memory for the QEMUIOVector and struct iovec as part of
> > > the request struct which the IO buffer is part of. The memory for this struct is
> > > guaranteed to be valid till the AIO is completed.
> > 
> > Thanks for the patch!
> > 
> > AIO APIs currently don't require the caller to match qiov's lifetime to
> > the I/O request lifetime.  This patch changes that for blk_aio_ioctl()
> > only.  If we want to do this consistently then all aio callers need to
> > be audited and fixed.
> > 
> > The alternative is to make the API copy qiov when necessary.  That is
> > less efficient but avoids modifying all callers.
> > 
> > Either way, the lifetime of qiov must be consistent across all aio APIs,
> > not just blk_aio_ioctl().
> 
> Don't all blk_aio_*() APIs that take a qiov pointer require that it
> remains valid until the request completes? I don't think they are copied
> anywhere for blk_aio_preadv/pwritev() before being passed to the block
> driver.
> 
> So this does look consistent with the existing functions to me.

You are right.  I audited the blk_aio_preadv() callers and they all keep
qiov around until the request is complete.

Actually this makes sense because even in the simple non-coroutine case
with aio=threads the qiov hasn't necessarily been read yet when the
function returns.  The aio_worker() function executes later and only
then is qiov handed to the host kernel.

So this is a one-off bug in blk_aio_ioctl() callers.
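
The lifetime rule described above can be illustrated with a minimal sketch.
It is not QEMU code: ToyIov, ToyReq, toy_submit(), toy_run_pending() and
toy_start() are hypothetical names invented for this example, and a
single-slot queue stands in for the worker thread or re-entered coroutine.
Submission only records the vector pointer; the vector is dereferenced later,
so the caller embeds it in the heap-allocated request rather than in a local
variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for struct iovec and an AIO request (not QEMU types). */
typedef struct ToyIov {
    void *base;
    size_t len;
} ToyIov;

typedef struct ToyReq {
    ToyIov iov;          /* owned by the request, lives as long as the request */
    const ToyIov *qiov;  /* pointer the deferred worker dereferences */
    size_t done;         /* bytes "transferred", set when the worker runs */
} ToyReq;

static ToyReq *pending;  /* single-slot queue, enough for the sketch */

/* Submit: records the pointer only; qiov is NOT read here. */
static void toy_submit(ToyReq *req, const ToyIov *qiov)
{
    assert(pending == NULL);
    req->qiov = qiov;
    req->done = 0;
    pending = req;
}

/* Runs later, after the submitting function has already returned. */
static void toy_run_pending(void)
{
    ToyReq *req = pending;

    pending = NULL;
    if (req) {
        /* First dereference of qiov happens here, not at submit time. */
        memset(req->qiov->base, 0xab, req->qiov->len);
        req->done = req->qiov->len;
    }
}

/* Caller: keeps the vector inside the request instead of on its own stack. */
static ToyReq *toy_start(void *buf, size_t len)
{
    ToyReq *req = calloc(1, sizeof(*req));

    if (!req) {
        return NULL;
    }
    req->iov.base = buf;
    req->iov.len = len;
    toy_submit(req, &req->iov);
    return req;  /* this frame is gone afterwards, but req->iov is still valid */
}
```

Had toy_start() declared the ToyIov as a local variable and passed its
address to toy_submit(), toy_run_pending() would read a dead stack slot,
which is the same class of failure as in the crash report.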

Stefan
Re: [Qemu-devel] [PATCH] block: Fix qemu crash when using scsi-block
Posted by Paolo Bonzini 6 years, 4 months ago
On 23/11/2017 11:23, Stefan Hajnoczi wrote:
> You are right.  I audited the blk_aio_preadv() callers and they all keep
> qiov around until the request is complete.
> 
> Actually this makes sense because even in the simple non-coroutine case
> with aio=threads the qiov hasn't necessarily been read yet when the
> function returns.  The aio_worker() function executes later and only
> then is qiov handed to the host kernel.
> 
> So this is a one-off bug in blk_aio_ioctl() callers.

Only in blk_aio_ioctl, not in the callers.
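
As a rough illustration of fixing this inside the API rather than in every
caller, here is a minimal sketch. It is not the actual QEMU change, and all
names (ToyIoctlReq, toy_aio_ioctl(), toy_ioctl_entry(), toy_poll()) are
hypothetical. The point is only that the API function stores the caller's
buffer pointer by value in the heap-allocated request, so nothing that lives
on the API function's own stack is referenced once it returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request type; it lives on the heap until completion. */
typedef struct ToyIoctlReq {
    unsigned long cmd;
    void *buf;       /* caller's buffer pointer, copied by value */
    int ret;
    int completed;
} ToyIoctlReq;

static ToyIoctlReq *toy_pending;  /* single-slot queue for the sketch */

/* Deferred entry point: runs after toy_aio_ioctl() has returned. */
static void toy_ioctl_entry(ToyIoctlReq *req)
{
    /* Pretend the ioctl writes its command number into the buffer. */
    memcpy(req->buf, &req->cmd, sizeof(req->cmd));
    req->ret = 0;
    req->completed = 1;
}

/* API function: callers pass a plain buffer and need no changes. */
static ToyIoctlReq *toy_aio_ioctl(unsigned long cmd, void *buf)
{
    ToyIoctlReq *req = calloc(1, sizeof(*req));

    if (!req) {
        return NULL;
    }
    req->cmd = cmd;
    req->buf = buf;  /* no stack-allocated wrapper is created here */
    assert(toy_pending == NULL);
    toy_pending = req;
    return req;
}

/* Stand-in for the event loop that later runs the queued entry point. */
static void toy_poll(void)
{
    ToyIoctlReq *req = toy_pending;

    toy_pending = NULL;
    if (req) {
        toy_ioctl_entry(req);
    }
}
```

The caller still has to keep buf itself alive until completion, but that is
the requirement callers already satisfy for their I/O buffers.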

Paolo