nbd-client: Work around server BLOCK_STATUS misalignment at EOF

[Qemu-devel] [PATCH for-4.0] nbd-client: Work around server BLOCK_STATUS misalignment at EOF

Posted by Eric Blake 6 years, 10 months ago

The NBD spec is clear that a server that advertises a minimum block
size should reply to NBD_CMD_BLOCK_STATUS with extents aligned
accordingly. However, we know that the qemu NBD server implementation
has had a corner-case bug where it is not compliant with the spec,
present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12
(and unlikely to be patched in time for 4.0). Namely, when qemu is
serving a file that is not a multiple of 512 bytes, it rounds the size
advertised over NBD up to the next sector boundary (someday, I'd like
to fix that to be byte-accurate, but it's a much bigger audit not
appropriate for this release); yet if the final sector contains data
prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole
mid-sector which qemu then reported over NBD.

We are well within our rights to hang up on a server that can't follow
the spec, but it is more useful to try and keep the connection alive
in spite of the problem. Do so by tracing a message about the problem,
and then either truncating the request back to an aligned boundary (if
it covered more than the final sector) or widening it out to the full
boundary with a forced status of data (since truncating would result
in 0 bytes, but we have to make progress, and valid since data is a
default-safe answer). And in practice, since the problem only happens
on a sector that starts with data and ends with a hole, we are going
to want to read that full sector anyway (where qemu as the server
fills in the tail beyond EOF with appropriate NUL bytes).

Easy reproduction:
$ printf %1000d 1 > file
$ qemu-nbd -f raw -t file & pid=$!
$ qemu-img map --output=json -f raw nbd://localhost:10809
qemu-img: Could not read file metadata: Invalid argument
$ kill $pid

where the patched version instead succeeds with:
[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}]

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/nbd-client.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index a3b70d14004..241cc555246 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -269,14 +269,36 @@ static int nbd_parse_blockstatus_payload(NBDClientSession *client,
     extent->length = payload_advance32(&payload);
     extent->flags = payload_advance32(&payload);

-    if (extent->length == 0 ||
-        (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
-                                                    client->info.min_block))) {
+    if (extent->length == 0) {
         error_setg(errp, "Protocol error: server sent status chunk with "
                    "invalid length");
         return -EINVAL;
     }

+    /*
+     * A server sending unaligned block status is in violation of the
+     * protocol, but as qemu-nbd 3.1 is such a server (at least for
+     * POSIX files that are not a multiple of 512 bytes, since qemu
+     * rounds files up to 512-byte multiples but lseek(SEEK_HOLE)
+     * still sees an implicit hole beyond the real EOF), it's nicer to
+     * work around the misbehaving server. If the request included
+     * more than the final unaligned block, truncate it back to an
+     * aligned result; if the request was only the final block, round
+     * up to the full block and change the status to fully-allocated
+     * (always a safe status, even if it loses information).
+     */
+    if (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
+                                                   client->info.min_block)) {
+        trace_nbd_parse_blockstatus_compliance("extent length is unaligned");
+        if (extent->length > client->info.min_block) {
+            extent->length = QEMU_ALIGN_DOWN(extent->length,
+                                             client->info.min_block);
+        } else {
+            extent->length = client->info.min_block;
+            extent->flags = 0;
+        }
+    }
+
     /*
      * We used NBD_CMD_FLAG_REQ_ONE, so the server should not have
      * sent us any more than one extent, nor should it have included
-- 
2.20.1

Re: [Qemu-devel] [PATCH for-4.0] nbd-client: Work around server BLOCK_STATUS misalignment at EOF

Posted by Eric Blake 6 years, 10 months ago

On 3/26/19 12:13 PM, Eric Blake wrote:
> The NBD spec is clear that a server that advertises a minimum block
> size should reply to NBD_CMD_BLOCK_STATUS with extents aligned
> accordingly. However, we know that the qemu NBD server implementation
> has had a corner-case bug where it is not compliant with the spec,
> present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12
> (and unlikely to be patched in time for 4.0). Namely, when qemu is
> serving a file that is not a multiple of 512 bytes, it rounds the size
> advertised over NBD up to the next sector boundary (someday, I'd like
> to fix that to be byte-accurate, but it's a much bigger audit not
> appropriate for this release); yet if the final sector contains data
> prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole
> mid-sector which qemu then reported over NBD.
> 
> We are well within our rights to hang up on a server that can't follow
> the spec, but it is more useful to try and keep the connection alive
> in spite of the problem. Do so by tracing a message about the problem,
> and then either truncating the request back to an aligned boundary (if
> it covered more than the final sector) or widening it out to the full
> boundary with a forced status of data (since truncating would result
> in 0 bytes, but we have to make progress, and valid since data is a
> default-safe answer). And in practice, since the problem only happens
> on a sector that starts with data and ends with a hole, we are going
> to want to read that full sector anyway (where qemu as the server
> fills in the tail beyond EOF with appropriate NUL bytes).
> 
> Easy reproduction:
> $ printf %1000d 1 > file
> $ qemu-nbd -f raw -t file & pid=$!
> $ qemu-img map --output=json -f raw nbd://localhost:10809
> qemu-img: Could not read file metadata: Invalid argument
> $ kill $pid
> 
> where the patched version instead succeeds with:
> [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}]
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/nbd-client.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)

I haven't seen patchew reports in a few days. But just in case it tries
to apply this,

Based-on: <20190325190104.30213-1-eblake@redhat.com>
[v2 0/2] nbd: simple errors to BLOCK_STATUS

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH for-4.0] nbd-client: Work around server BLOCK_STATUS misalignment at EOF

Posted by Vladimir Sementsov-Ogievskiy 6 years, 10 months ago

26.03.2019 20:13, Eric Blake wrote:
> The NBD spec is clear that a server that advertises a minimum block
> size should reply to NBD_CMD_BLOCK_STATUS with extents aligned
> accordingly. However, we know that the qemu NBD server implementation
> has had a corner-case bug where it is not compliant with the spec,
> present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12
> (and unlikely to be patched in time for 4.0). Namely, when qemu is
> serving a file that is not a multiple of 512 bytes, it rounds the size
> advertised over NBD up to the next sector boundary (someday, I'd like
> to fix that to be byte-accurate, but it's a much bigger audit not
> appropriate for this release); yet if the final sector contains data
> prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole
> mid-sector which qemu then reported over NBD.
> 
> We are well within our rights to hang up on a server that can't follow
> the spec, but it is more useful to try and keep the connection alive
> in spite of the problem. Do so by tracing a message about the problem,
> and then either truncating the request back to an aligned boundary (if
> it covered more than the final sector) or widening it out to the full
> boundary with a forced status of data (since truncating would result
> in 0 bytes, but we have to make progress, and valid since data is a
> default-safe answer). And in practice, since the problem only happens
> on a sector that starts with data and ends with a hole, we are going
> to want to read that full sector anyway (where qemu as the server
> fills in the tail beyond EOF with appropriate NUL bytes).
> 
> Easy reproduction:
> $ printf %1000d 1 > file
> $ qemu-nbd -f raw -t file & pid=$!
> $ qemu-img map --output=json -f raw nbd://localhost:10809
> qemu-img: Could not read file metadata: Invalid argument
> $ kill $pid

would be good to add it to iotests

> 
> where the patched version instead succeeds with:
> [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}]
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>


Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

> ---
>   block/nbd-client.c | 28 +++++++++++++++++++++++++---
>   1 file changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index a3b70d14004..241cc555246 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -269,14 +269,36 @@ static int nbd_parse_blockstatus_payload(NBDClientSession *client,
>       extent->length = payload_advance32(&payload);
>       extent->flags = payload_advance32(&payload);
> 
> -    if (extent->length == 0 ||
> -        (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
> -                                                    client->info.min_block))) {
> +    if (extent->length == 0) {
>           error_setg(errp, "Protocol error: server sent status chunk with "
>                      "invalid length");

may be improved to s/invalid/zero/

>           return -EINVAL;
>       }
> 
> +    /*
> +     * A server sending unaligned block status is in violation of the
> +     * protocol, but as qemu-nbd 3.1 is such a server (at least for
> +     * POSIX files that are not a multiple of 512 bytes, since qemu
> +     * rounds files up to 512-byte multiples but lseek(SEEK_HOLE)
> +     * still sees an implicit hole beyond the real EOF), it's nicer to
> +     * work around the misbehaving server. If the request included
> +     * more than the final unaligned block, truncate it back to an
> +     * aligned result; if the request was only the final block, round
> +     * up to the full block and change the status to fully-allocated
> +     * (always a safe status, even if it loses information).
> +     */
> +    if (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
> +                                                   client->info.min_block)) {
> +        trace_nbd_parse_blockstatus_compliance("extent length is unaligned");
> +        if (extent->length > client->info.min_block) {
> +            extent->length = QEMU_ALIGN_DOWN(extent->length,
> +                                             client->info.min_block);
> +        } else {
> +            extent->length = client->info.min_block;
> +            extent->flags = 0;
> +        }
> +    }
> +
>       /*
>        * We used NBD_CMD_FLAG_REQ_ONE, so the server should not have
>        * sent us any more than one extent, nor should it have included
> 


-- 
Best regards,
Vladimir

[Qemu-devel] [PATCH for-4.0] nbd-client: Work around server BLOCK_STATUS misalignment at EOF

Posted by Eric Blake 6 years, 10 months ago

On 3/27/19 11:56 AM, Vladimir Sementsov-Ogievskiy wrote:
> 26.03.2019 20:13, Eric Blake wrote:
>> The NBD spec is clear that a server that advertises a minimum block
>> size should reply to NBD_CMD_BLOCK_STATUS with extents aligned
>> accordingly. However, we know that the qemu NBD server implementation
>> has had a corner-case bug where it is not compliant with the spec,
>> present since the introduction of NBD_CMD_BLOCK_STATUS in qemu 2.12
>> (and unlikely to be patched in time for 4.0). Namely, when qemu is
>> serving a file that is not a multiple of 512 bytes, it rounds the size
>> advertised over NBD up to the next sector boundary (someday, I'd like
>> to fix that to be byte-accurate, but it's a much bigger audit not
>> appropriate for this release); yet if the final sector contains data
>> prior to EOF, lseek(SEEK_HOLE) will point to the implicit hole
>> mid-sector which qemu then reported over NBD.
>>

>>
>> Easy reproduction:
>> $ printf %1000d 1 > file
>> $ qemu-nbd -f raw -t file & pid=$!
>> $ qemu-img map --output=json -f raw nbd://localhost:10809
>> qemu-img: Could not read file metadata: Invalid argument
>> $ kill $pid
> 
> would be good to add it to iotests
> 

Will do in a followup; probably by reviving a v2 of this series:
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg00305.html

>>
>> where the patched version instead succeeds with:
>> [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true}]
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 

Thanks; will queue through my NBD tree for -rc2.


>> +++ b/block/nbd-client.c
>> @@ -269,14 +269,36 @@ static int nbd_parse_blockstatus_payload(NBDClientSession *client,
>>       extent->length = payload_advance32(&payload);
>>       extent->flags = payload_advance32(&payload);
>>
>> -    if (extent->length == 0 ||
>> -        (client->info.min_block && !QEMU_IS_ALIGNED(extent->length,
>> -                                                    client->info.min_block))) {
>> +    if (extent->length == 0) {
>>           error_setg(errp, "Protocol error: server sent status chunk with "
>>                      "invalid length");
> 
> may be improved to s/invalid/zero/

Will do.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org