[PATCH 7/7] block/rbd: change request alignment to 1 byte

Peter Lieven posted 7 patches 4 years, 10 months ago
Maintainers: Kevin Wolf <kwolf@redhat.com>, Jason Dillaman <dillaman@redhat.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
[PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Peter Lieven 4 years, 10 months ago
since we implement byte interfaces and librbd supports aio on byte granularity we can lift
the 512 byte alignment.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/rbd.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 27b4404adf..8673e8f553 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -223,8 +223,6 @@ done:
 static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRBDState *s = bs->opaque;
-    /* XXX Does RBD support AIO on less than 512-byte alignment? */
-    bs->bl.request_alignment = 512;
 #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
     bs->bl.pwrite_zeroes_alignment = s->object_size;
 #endif
-- 
2.17.1



Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Jason Dillaman 4 years, 10 months ago
On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>
> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> the 512 byte alignment.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  block/rbd.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/block/rbd.c b/block/rbd.c
> index 27b4404adf..8673e8f553 100644
> --- a/block/rbd.c
> +++ b/block/rbd.c
> @@ -223,8 +223,6 @@ done:
>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BDRVRBDState *s = bs->opaque;
> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> -    bs->bl.request_alignment = 512;

Just a suggestion, but perhaps improve discard alignment, max discard,
optimal alignment (if that's something QEMU handles internally) if not
overridden by the user.


>  #ifdef LIBRBD_SUPPORTS_WRITE_ZEROES
>      bs->bl.pwrite_zeroes_alignment = s->object_size;
>  #endif
> --
> 2.17.1
>
>


--
Jason


Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Peter Lieven 4 years, 10 months ago
Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>> the 512 byte alignment.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>>  block/rbd.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/block/rbd.c b/block/rbd.c
>> index 27b4404adf..8673e8f553 100644
>> --- a/block/rbd.c
>> +++ b/block/rbd.c
>> @@ -223,8 +223,6 @@ done:
>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>>      BDRVRBDState *s = bs->opaque;
>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>> -    bs->bl.request_alignment = 512;
> Just a suggestion, but perhaps improve discard alignment, max discard,
> optimal alignment (if that's something QEMU handles internally) if not
> overridden by the user.


Qemu supports max_discard and discard_alignment. Is there a call to get these limits

from librbd?


What do you mean by optimal_alignment? The object size?


Peter




Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Jason Dillaman 4 years, 10 months ago
On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>
> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> > On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >> the 512 byte alignment.
> >>
> >> Signed-off-by: Peter Lieven <pl@kamp.de>
> >> ---
> >>  block/rbd.c | 2 --
> >>  1 file changed, 2 deletions(-)
> >>
> >> diff --git a/block/rbd.c b/block/rbd.c
> >> index 27b4404adf..8673e8f553 100644
> >> --- a/block/rbd.c
> >> +++ b/block/rbd.c
> >> @@ -223,8 +223,6 @@ done:
> >>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>  {
> >>      BDRVRBDState *s = bs->opaque;
> >> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >> -    bs->bl.request_alignment = 512;
> > Just a suggestion, but perhaps improve discard alignment, max discard,
> > optimal alignment (if that's something QEMU handles internally) if not
> > overridden by the user.
>
>
> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>
> from librbd?
>
>
> What do you mean by optimal_alignment? The object size?

krbd does a good job of initializing defaults [1] where optimal and
discard alignment is 64KiB (can actually be 4KiB now), max IO size for
writes, discards, and write-zeroes is the object size * the stripe
count.

> Peter
>
>
>

[1] https://github.com/torvalds/linux/blob/master/drivers/block/rbd.c#L4981

-- 
Jason


Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Peter Lieven 4 years, 10 months ago
Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>> the 512 byte alignment.
>>>>
>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>> ---
>>>>  block/rbd.c | 2 --
>>>>  1 file changed, 2 deletions(-)
>>>>
>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>> index 27b4404adf..8673e8f553 100644
>>>> --- a/block/rbd.c
>>>> +++ b/block/rbd.c
>>>> @@ -223,8 +223,6 @@ done:
>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>  {
>>>>      BDRVRBDState *s = bs->opaque;
>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>> -    bs->bl.request_alignment = 512;
>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>> optimal alignment (if that's something QEMU handles internally) if not
>>> overridden by the user.
>>
>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>
>> from librbd?
>>
>>
>> What do you mean by optimal_alignment? The object size?
> krbd does a good job of initializing defaults [1] where optimal and
> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> writes, discards, and write-zeroes is the object size * the stripe
> count.


Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than

obj_size  * stripe count will librbd split it internally or will the request fail?


Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device

configuration and not from rbd? I don't have that information inside the Qemu RBD driver.


Peter



Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Jason Dillaman 4 years, 10 months ago
On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>
> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> > On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>> the 512 byte alignment.
> >>>>
> >>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>> ---
> >>>>  block/rbd.c | 2 --
> >>>>  1 file changed, 2 deletions(-)
> >>>>
> >>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>> index 27b4404adf..8673e8f553 100644
> >>>> --- a/block/rbd.c
> >>>> +++ b/block/rbd.c
> >>>> @@ -223,8 +223,6 @@ done:
> >>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>  {
> >>>>      BDRVRBDState *s = bs->opaque;
> >>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>> -    bs->bl.request_alignment = 512;
> >>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>> optimal alignment (if that's something QEMU handles internally) if not
> >>> overridden by the user.
> >>
> >> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>
> >> from librbd?
> >>
> >>
> >> What do you mean by optimal_alignment? The object size?
> > krbd does a good job of initializing defaults [1] where optimal and
> > discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> > writes, discards, and write-zeroes is the object size * the stripe
> > count.
>
>
> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>
> obj_size  * stripe count will librbd split it internally or will the request fail?

librbd will handle it as needed. My goal is really just to get the
hints down the guest OS.

> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>
> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.

librbd doesn't really have the information either. The 64KiB guess
that krbd uses was a compromise since that was the default OSD
allocation size for HDDs since Luminous. Starting with Pacific that
default is going down to 4KiB.

>
> Peter
>
>


-- 
Jason


Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Peter Lieven 4 years, 10 months ago
Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
>>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
>>>>>> the 512 byte alignment.
>>>>>>
>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>>> ---
>>>>>>  block/rbd.c | 2 --
>>>>>>  1 file changed, 2 deletions(-)
>>>>>>
>>>>>> diff --git a/block/rbd.c b/block/rbd.c
>>>>>> index 27b4404adf..8673e8f553 100644
>>>>>> --- a/block/rbd.c
>>>>>> +++ b/block/rbd.c
>>>>>> @@ -223,8 +223,6 @@ done:
>>>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>>>>>  {
>>>>>>      BDRVRBDState *s = bs->opaque;
>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
>>>>>> -    bs->bl.request_alignment = 512;
>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
>>>>> optimal alignment (if that's something QEMU handles internally) if not
>>>>> overridden by the user.
>>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
>>>>
>>>> from librbd?
>>>>
>>>>
>>>> What do you mean by optimal_alignment? The object size?
>>> krbd does a good job of initializing defaults [1] where optimal and
>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
>>> writes, discards, and write-zeroes is the object size * the stripe
>>> count.
>>
>> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
>>
>> obj_size  * stripe count will librbd split it internally or will the request fail?
> librbd will handle it as needed. My goal is really just to get the
> hints down the guest OS.
>
>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
>>
>> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> librbd doesn't really have the information either. The 64KiB guess
> that krbd uses was a compromise since that was the default OSD
> allocation size for HDDs since Luminous. Starting with Pacific that
> default is going down to 4KiB.


I will try to adjust these values as far as it is possible and makes sense.


Is there a way to check the minimum supported OSD release in the backend from librbd / librados?


Anyway, I want to sent a V2 by the end of this week.


Peter



Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Posted by Jason Dillaman 4 years, 10 months ago
On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
>
> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> > On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
> >> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> >>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>> since we implement byte interfaces and librbd supports aio on byte granularity we can lift
> >>>>>> the 512 byte alignment.
> >>>>>>
> >>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>>> ---
> >>>>>>  block/rbd.c | 2 --
> >>>>>>  1 file changed, 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>>>> index 27b4404adf..8673e8f553 100644
> >>>>>> --- a/block/rbd.c
> >>>>>> +++ b/block/rbd.c
> >>>>>> @@ -223,8 +223,6 @@ done:
> >>>>>>  static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error **errp)
> >>>>>>  {
> >>>>>>      BDRVRBDState *s = bs->opaque;
> >>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>>>> -    bs->bl.request_alignment = 512;
> >>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>>>> optimal alignment (if that's something QEMU handles internally) if not
> >>>>> overridden by the user.
> >>>> Qemu supports max_discard and discard_alignment. Is there a call to get these limits
> >>>>
> >>>> from librbd?
> >>>>
> >>>>
> >>>> What do you mean by optimal_alignment? The object size?
> >>> krbd does a good job of initializing defaults [1] where optimal and
> >>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> >>> writes, discards, and write-zeroes is the object size * the stripe
> >>> count.
> >>
> >> Okay, I will have a look at it. If qemu issues a write, discard, write_zero greater than
> >>
> >> obj_size  * stripe count will librbd split it internally or will the request fail?
> > librbd will handle it as needed. My goal is really just to get the
> > hints down the guest OS.
> >
> >> Regarding the alignment it seems that rbd_dev->opts->alloc_size is something that comes from the device
> >>
> >> configuration and not from rbd? I don't have that information inside the Qemu RBD driver.
> > librbd doesn't really have the information either. The 64KiB guess
> > that krbd uses was a compromise since that was the default OSD
> > allocation size for HDDs since Luminous. Starting with Pacific that
> > default is going down to 4KiB.
>
>
> I will try to adjust these values as far as it is possible and makes sense.
>
>
> Is there a way to check the minimum supported OSD release in the backend from librbd / librados?

It's not a minimum -- RADOS will gladly access 1 byte writes as well.
It's really just the optimal (performance and space-wise). Sadly,
there is no realistic way to query this data from the backend.

>
> Anyway, I want to sent a V2 by the end of this week.
>
>
> Peter
>
>


-- 
Jason