[PATCH v3 3/7] block: add max_hw_transfer to BlockLimits

Paolo Bonzini posted 7 patches 4 years, 8 months ago
Maintainers: Kevin Wolf <kwolf@redhat.com>, Eric Blake <eblake@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Markus Armbruster <armbru@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Max Reitz <mreitz@redhat.com>, Fam Zheng <fam@euphon.net>
There is a newer version of this series
[PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Paolo Bonzini 4 years, 8 months ago
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.

In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed.  Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size.  max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-backend.c          | 12 ++++++++++++
 block/file-posix.c             |  2 +-
 block/io.c                     |  1 +
 hw/scsi/scsi-generic.c         |  2 +-
 include/block/block_int.h      |  7 +++++++
 include/sysemu/block-backend.h |  1 +
 6 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 15f1ea4288..2ea1412a54 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
     return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
 }
 
+/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    uint64_t max = INT_MAX;
+
+    if (bs) {
+        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
+    }
+    return max;
+}
+
 /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
 uint32_t blk_get_max_transfer(BlockBackend *blk)
 {
diff --git a/block/file-posix.c b/block/file-posix.c
index e3241a0dd3..f55f92d0f5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
         int ret = sg_get_max_transfer_length(s->fd);
 
         if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
-            bs->bl.max_transfer = pow2floor(ret);
+            bs->bl.max_hw_transfer = pow2floor(ret);
         }
 
         ret = sg_get_max_segments(s->fd);
diff --git a/block/io.c b/block/io.c
index 323854d063..089b99bb0c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
 {
     dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
     dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
+    dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer);
     dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
                                  src->opt_mem_alignment);
     dst->min_mem_alignment = MAX(dst->min_mem_alignment,
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 82e1e2ee79..3762dce749 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
         (r->req.cmd.buf[1] & 0x01)) {
         page = r->req.cmd.buf[2];
         if (page == 0xb0) {
-            uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
+            uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
             uint32_t max_iov = blk_get_max_iov(s->conf.blk);
 
             assert(max_transfer);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 057d88b1fc..f1a54db0f8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -695,6 +695,13 @@ typedef struct BlockLimits {
      * clamped down. */
     uint32_t max_transfer;
 
+    /* Maximal hardware transfer length in bytes.  Applies whenever
+     * transfers to the device bypass the kernel I/O scheduler, for
+     * example with SG_IO.  If larger than max_transfer or if zero,
+     * blk_get_max_hw_transfer will fall back to max_transfer.
+     */
+    uint64_t max_hw_transfer;
+
     /* memory alignment, in bytes so that no bounce buffer is needed */
     size_t min_mem_alignment;
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 5423e3d9c6..9ac5f7bbd3 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
 int blk_get_flags(BlockBackend *blk);
 uint32_t blk_get_request_alignment(BlockBackend *blk);
 uint32_t blk_get_max_transfer(BlockBackend *blk);
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
 int blk_get_max_iov(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_try_blockalign(BlockBackend *blk, size_t size);
-- 
2.31.1



Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Eric Blake 4 years, 8 months ago
On Thu, Jun 03, 2021 at 03:37:18PM +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
> 
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed.  Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size.  max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
> 

> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> +    BlockDriverState *bs = blk_bs(blk);
> +    uint64_t max = INT_MAX;

This is an unaligned value; should we instead round it down to the
request_alignment granularity?

> +
> +    if (bs) {
> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> +    }
> +    return max;
> +}
> +
>  /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
>  uint32_t blk_get_max_transfer(BlockBackend *blk)
>  {

> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>       * clamped down. */
>      uint32_t max_transfer;
>  
> +    /* Maximal hardware transfer length in bytes.  Applies whenever

Leading /* on its own line, per our style.

> +     * transfers to the device bypass the kernel I/O scheduler, for
> +     * example with SG_IO.  If larger than max_transfer or if zero,
> +     * blk_get_max_hw_transfer will fall back to max_transfer.
> +     */

Should we mandate any additional requirements on this value such as
multiple of request_alignment or even power-of-2?

> +    uint64_t max_hw_transfer;
> +
>      /* memory alignment, in bytes so that no bounce buffer is needed */
>      size_t min_mem_alignment;
>  

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Paolo Bonzini 4 years, 8 months ago
On 03/06/21 19:33, Eric Blake wrote:
>> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
>> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
>> +{
>> +    BlockDriverState *bs = blk_bs(blk);
>> +    uint64_t max = INT_MAX;
> 
> This is an unaligned value; should we instead round it down to the
> request_alignment granularity?

See below...

>> +++ b/include/block/block_int.h
>> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>>        * clamped down. */
>>       uint32_t max_transfer;
>>   
>> +    /* Maximal hardware transfer length in bytes.  Applies whenever
> 
> Leading /* on its own line, per our style.

The whole file still uses this style, I can change it if desired or do 
it later for the whole file or even the whole block subsystem.

>> +     * transfers to the device bypass the kernel I/O scheduler, for
>> +     * example with SG_IO.  If larger than max_transfer or if zero,
>> +     * blk_get_max_hw_transfer will fall back to max_transfer.
>> +     */
> 
> Should we mandate any additional requirements on this value such as
> multiple of request_alignment or even power-of-2?

Certainly not power of 2.  Multiple of request_alignment probably makes 
sense, but max_transfer doesn't have that limit.

Paolo

>> +    uint64_t max_hw_transfer;
>> +
>>       /* memory alignment, in bytes so that no bounce buffer is needed */
>>       size_t min_mem_alignment;
>>   
> 


Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Max Reitz 4 years, 7 months ago
On 03.06.21 15:37, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
>
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed.  Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size.  max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/block-backend.c          | 12 ++++++++++++
>   block/file-posix.c             |  2 +-
>   block/io.c                     |  1 +
>   hw/scsi/scsi-generic.c         |  2 +-
>   include/block/block_int.h      |  7 +++++++
>   include/sysemu/block-backend.h |  1 +
>   6 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 15f1ea4288..2ea1412a54 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
>       return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
>   }
>   
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> +    BlockDriverState *bs = blk_bs(blk);
> +    uint64_t max = INT_MAX;
> +
> +    if (bs) {
> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> +    }
> +    return max;

Both `max_hw_transfer` and `max_transfer` can be 0, so this could return 
0, contrary to what the comment above promises.

Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` 
here (like `blk_get_max_transfer()` does it)?

(As for the rest, I think aligning to the request alignment makes sense, 
but then again we don’t do that for max_transfer either, so... this at 
least wouldn’t be a new bug.

Regarding the comment, checkpatch complains about it, so it should be 
fixed so that /* is on its own line.

Speaking of checkpatch, now that I ran it, it also complains about the 
new line in bdrv_merge_limits() exceeding 80 characters, so that should 
be fixed, too.)

Max


Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Paolo Bonzini 4 years, 7 months ago
On 15/06/21 18:18, Max Reitz wrote:
>>   }
>> +/* Returns the maximum hardware transfer length, in bytes; guaranteed 
>> nonzero */
>> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
>> +{
>> +    BlockDriverState *bs = blk_bs(blk);
>> +    uint64_t max = INT_MAX;
>> +
>> +    if (bs) {
>> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
>> +    }
>> +    return max;
> 
> Both `max_hw_transfer` and `max_transfer` can be 0, so this could return 
> 0, contrary to what the comment above promises.
> 
> Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` 
> here (like `blk_get_max_transfer()` does it)?

Yes, something to that effect.

> (As for the rest, I think aligning to the request alignment makes sense, 
> but then again we don’t do that for max_transfer either, so... this at 
> least wouldn’t be a new bug.

Ok, will do.  I will also add a new patch to align max_transfer to the 
request alignment.

> Regarding the comment, checkpatch complains about it, so it should be 
> fixed so that /* is on its own line.

That makes it different from every other comment in block_int.h though. 
  Is it okay to fix all of them in a follow-up?

Paolo

> Speaking of checkpatch, now that I ran it, it also complains about the 
> new line in bdrv_merge_limits() exceeding 80 characters, so that should 
> be fixed, too.)


Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Max Reitz 4 years, 7 months ago
On 16.06.21 15:18, Paolo Bonzini wrote:
> On 15/06/21 18:18, Max Reitz wrote:
>>>   }
>>> +/* Returns the maximum hardware transfer length, in bytes; 
>>> guaranteed nonzero */
>>> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
>>> +{
>>> +    BlockDriverState *bs = blk_bs(blk);
>>> +    uint64_t max = INT_MAX;
>>> +
>>> +    if (bs) {
>>> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, 
>>> bs->bl.max_transfer);
>>> +    }
>>> +    return max;
>>
>> Both `max_hw_transfer` and `max_transfer` can be 0, so this could 
>> return 0, contrary to what the comment above promises.
>>
>> Should `max` be initialized to 0 with a `MIN_NON_ZERO(max, INT_MAX)` 
>> here (like `blk_get_max_transfer()` does it)?
>
> Yes, something to that effect.
>
>> (As for the rest, I think aligning to the request alignment makes 
>> sense, but then again we don’t do that for max_transfer either, so... 
>> this at least wouldn’t be a new bug.
>
> Ok, will do.  I will also add a new patch to align max_transfer to the 
> request alignment.
>
>> Regarding the comment, checkpatch complains about it, so it should be 
>> fixed so that /* is on its own line.
>
> That makes it different from every other comment in block_int.h 
> though.  Is it okay to fix all of them in a follow-up?

The reason it’s different is that the comment style in question was 
added to checkpatch only relatively recently. I can’t speak for others, 
but I’m a simple person. I just do what makes checkpatch happy. :)

Given that the checkpatch complaint is only a warning, I think it’s OK 
to keep the comment as it is here, and perhaps optionally fix all 
comments in block_int.h in a follow-up. I don’t think we need to fix 
existing comments, but, well, it wouldn’t be wrong.

Max


Re: [PATCH v3 3/7] block: add max_hw_transfer to BlockLimits
Posted by Paolo Bonzini 4 years, 7 months ago
On 16/06/21 15:46, Max Reitz wrote:
> Given that the checkpatch complaint is only a warning, I think it’s OK 
> to keep the comment as it is here, and perhaps optionally fix all 
> comments in block_int.h in a follow-up. I don’t think we need to fix 
> existing comments, but, well, it wouldn’t be wrong.

We don't need to, but sometimes the benefit (of not discussing 
checkpatch over and over) does seem to outweigh the churn.  Thanks for 
the review!

Paolo