[v6] migration/multifd: Remove multifd_send_state->pages

[PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Fabiano Rosas 2 months, 4 weeks ago

As observed by Philippe, the multifd_ram_unfill_packet() function
currently leaves the MultiFDPacket structure with mixed
endianness. This is harmless, but ultimately not very clean. Aside
from that, the packet is also written to on the recv side to ensure
the ramblock name is null-terminated.

Stop touching the received packet and do the necessary work using
stack variables instead.

While here tweak the error strings and fix the space before
semicolons. Also remove the "100 times bigger" comment because it's
just one possible explanation for a size mismatch and it doesn't even
match the code.

CC: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd-nocomp.c | 40 ++++++++++++++++----------------------
 migration/multifd.c        | 20 +++++++++----------
 2 files changed, 26 insertions(+), 34 deletions(-)

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index f294d1b0b2..a759470c9c 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -217,36 +217,32 @@ void multifd_ram_fill_packet(MultiFDSendParams *p)
 
 int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
 {
-    MultiFDPacket_t *packet = p->packet;
+    const MultiFDPacket_t *packet = p->packet;
     uint32_t page_count = multifd_ram_page_count();
     uint32_t page_size = multifd_ram_page_size();
+    uint32_t pages_per_packet = be32_to_cpu(packet->pages_alloc);
+    g_autofree const char *ramblock_name = NULL;
     int i;
 
-    packet->pages_alloc = be32_to_cpu(packet->pages_alloc);
-    /*
-     * If we received a packet that is 100 times bigger than expected
-     * just stop migration.  It is a magic number.
-     */
-    if (packet->pages_alloc > page_count) {
-        error_setg(errp, "multifd: received packet "
-                   "with size %u and expected a size of %u",
-                   packet->pages_alloc, page_count) ;
+    if (pages_per_packet > page_count) {
+        error_setg(errp, "multifd: received packet with %u pages, expected %u",
+                   pages_per_packet, page_count);
         return -1;
     }
 
     p->normal_num = be32_to_cpu(packet->normal_pages);
-    if (p->normal_num > packet->pages_alloc) {
-        error_setg(errp, "multifd: received packet "
-                   "with %u normal pages and expected maximum pages are %u",
-                   p->normal_num, packet->pages_alloc) ;
+    if (p->normal_num > pages_per_packet) {
+        error_setg(errp, "multifd: received packet with %u non-zero pages, "
+                   "which exceeds maximum expected pages %u",
+                   p->normal_num, pages_per_packet);
         return -1;
     }
 
     p->zero_num = be32_to_cpu(packet->zero_pages);
-    if (p->zero_num > packet->pages_alloc - p->normal_num) {
-        error_setg(errp, "multifd: received packet "
-                   "with %u zero pages and expected maximum zero pages are %u",
-                   p->zero_num, packet->pages_alloc - p->normal_num) ;
+    if (p->zero_num > pages_per_packet - p->normal_num) {
+        error_setg(errp,
+                   "multifd: received packet with %u zero pages, expected maximum %u",
+                   p->zero_num, pages_per_packet - p->normal_num);
         return -1;
     }
 
@@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
         return 0;
     }
 
-    /* make sure that ramblock is 0 terminated */
-    packet->ramblock[255] = 0;
-    p->block = qemu_ram_block_by_name(packet->ramblock);
+    ramblock_name = g_strndup(packet->ramblock, 255);
+    p->block = qemu_ram_block_by_name(ramblock_name);
     if (!p->block) {
-        error_setg(errp, "multifd: unknown ram block %s",
-                   packet->ramblock);
+        error_setg(errp, "multifd: unknown ram block %s", ramblock_name);
         return -1;
     }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index b89715fdc2..2a8cd9174c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -230,22 +230,20 @@ void multifd_send_fill_packet(MultiFDSendParams *p)
 
 static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
 {
-    MultiFDPacket_t *packet = p->packet;
+    const MultiFDPacket_t *packet = p->packet;
+    uint32_t magic = be32_to_cpu(packet->magic);
+    uint32_t version = be32_to_cpu(packet->version);
     int ret = 0;
 
-    packet->magic = be32_to_cpu(packet->magic);
-    if (packet->magic != MULTIFD_MAGIC) {
-        error_setg(errp, "multifd: received packet "
-                   "magic %x and expected magic %x",
-                   packet->magic, MULTIFD_MAGIC);
+    if (magic != MULTIFD_MAGIC) {
+        error_setg(errp, "multifd: received packet magic %x, expected %x",
+                   magic, MULTIFD_MAGIC);
         return -1;
     }
 
-    packet->version = be32_to_cpu(packet->version);
-    if (packet->version != MULTIFD_VERSION) {
-        error_setg(errp, "multifd: received packet "
-                   "version %u and expected version %u",
-                   packet->version, MULTIFD_VERSION);
+    if (version != MULTIFD_VERSION) {
+        error_setg(errp, "multifd: received packet version %u, expected %u",
+                   version, MULTIFD_VERSION);
         return -1;
     }
 
-- 
2.35.3

Re: [PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Peter Xu 2 months, 4 weeks ago

On Tue, Aug 27, 2024 at 02:46:05PM -0300, Fabiano Rosas wrote:
> @@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
>          return 0;
>      }
>  
> -    /* make sure that ramblock is 0 terminated */
> -    packet->ramblock[255] = 0;
> -    p->block = qemu_ram_block_by_name(packet->ramblock);
> +    ramblock_name = g_strndup(packet->ramblock, 255);

I understand we want to move to a const*, however this introduces a 256B
allocation per multifd packet, which we definitely want to avoid.. I wonder
whether that's worthwhile just to make it const. :-(

I don't worry too much on the const* and vars pointed being abused /
updated when without it - the packet struct is pretty much limited only to
be referenced in this unfill function, and then we will do the load based
on MultiFDRecvParams* later anyway.  So personally I'd rather lose the
const* v.s. one allocation.

Or we could also sanity check byte 255 to be '\0' (which, AFAIU, should
always be the case..), then we can get both benefits.

> +    p->block = qemu_ram_block_by_name(ramblock_name);
>      if (!p->block) {
> -        error_setg(errp, "multifd: unknown ram block %s",
> -                   packet->ramblock);
> +        error_setg(errp, "multifd: unknown ram block %s", ramblock_name);
>          return -1;
>      }

-- 
Peter Xu

Re: [PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Fabiano Rosas 2 months, 4 weeks ago

Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 02:46:05PM -0300, Fabiano Rosas wrote:
>> @@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
>>          return 0;
>>      }
>>  
>> -    /* make sure that ramblock is 0 terminated */
>> -    packet->ramblock[255] = 0;
>> -    p->block = qemu_ram_block_by_name(packet->ramblock);
>> +    ramblock_name = g_strndup(packet->ramblock, 255);
>
> I understand we want to move to a const*, however this introduces a 256B
> allocation per multifd packet, which we definitely want to avoid.. I wonder
> whether that's worthwhile just to make it const. :-(
>
> I don't worry too much on the const* and vars pointed being abused /
> updated when without it - the packet struct is pretty much limited only to
> be referenced in this unfill function, and then we will do the load based
> on MultiFDRecvParams* later anyway.  So personally I'd rather lose the
> const* v.s. one allocation.
>
> Or we could also sanity check byte 255 to be '\0' (which, AFAIU, should
> always be the case..), then we can get both benefits.

We can't because it breaks compat. Previous QEMUs didn't zero the
packet.

>
>> +    p->block = qemu_ram_block_by_name(ramblock_name);
>>      if (!p->block) {
>> -        error_setg(errp, "multifd: unknown ram block %s",
>> -                   packet->ramblock);
>> +        error_setg(errp, "multifd: unknown ram block %s", ramblock_name);
>>          return -1;
>>      }

Re: [PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Peter Xu 2 months, 4 weeks ago

On Tue, Aug 27, 2024 at 03:45:11PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Aug 27, 2024 at 02:46:05PM -0300, Fabiano Rosas wrote:
> >> @@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
> >>          return 0;
> >>      }
> >>  
> >> -    /* make sure that ramblock is 0 terminated */
> >> -    packet->ramblock[255] = 0;
> >> -    p->block = qemu_ram_block_by_name(packet->ramblock);
> >> +    ramblock_name = g_strndup(packet->ramblock, 255);
> >
> > I understand we want to move to a const*, however this introduces a 256B
> > allocation per multifd packet, which we definitely want to avoid.. I wonder
> > whether that's worthwhile just to make it const. :-(
> >
> > I don't worry too much on the const* and vars pointed being abused /
> > updated when without it - the packet struct is pretty much limited only to
> > be referenced in this unfill function, and then we will do the load based
> > on MultiFDRecvParams* later anyway.  So personally I'd rather lose the
> > const* v.s. one allocation.
> >
> > Or we could also sanity check byte 255 to be '\0' (which, AFAIU, should
> > always be the case..), then we can get both benefits.
> 
> We can't because it breaks compat. Previous QEMUs didn't zero the
> packet.

Ouch!

Then.. shall we still try to avoid the allocation?

-- 
Peter Xu

Re: [PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Fabiano Rosas 2 months, 4 weeks ago

Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 03:45:11PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Tue, Aug 27, 2024 at 02:46:05PM -0300, Fabiano Rosas wrote:
>> >> @@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
>> >>          return 0;
>> >>      }
>> >>  
>> >> -    /* make sure that ramblock is 0 terminated */
>> >> -    packet->ramblock[255] = 0;
>> >> -    p->block = qemu_ram_block_by_name(packet->ramblock);
>> >> +    ramblock_name = g_strndup(packet->ramblock, 255);
>> >
>> > I understand we want to move to a const*, however this introduces a 256B
>> > allocation per multifd packet, which we definitely want to avoid.. I wonder
>> > whether that's worthwhile just to make it const. :-(
>> >
>> > I don't worry too much on the const* and vars pointed being abused /
>> > updated when without it - the packet struct is pretty much limited only to
>> > be referenced in this unfill function, and then we will do the load based
>> > on MultiFDRecvParams* later anyway.  So personally I'd rather lose the
>> > const* v.s. one allocation.
>> >
>> > Or we could also sanity check byte 255 to be '\0' (which, AFAIU, should
>> > always be the case..), then we can get both benefits.
>> 
>> We can't because it breaks compat. Previous QEMUs didn't zero the
>> packet.
>
> Ouch!
>
> Then.. shall we still try to avoid the allocation?

Can I strcpy it to the stack?

char idstr[256];

strncpy(&idstr, packet->ramblock, 256);
idstr[255] = 0;

Re: [PATCH v6 18/19] migration/multifd: Stop changing the packet on recv side

Posted by Peter Xu 2 months, 4 weeks ago

On Tue, Aug 27, 2024 at 04:27:42PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Aug 27, 2024 at 03:45:11PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Tue, Aug 27, 2024 at 02:46:05PM -0300, Fabiano Rosas wrote:
> >> >> @@ -254,12 +250,10 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp)
> >> >>          return 0;
> >> >>      }
> >> >>  
> >> >> -    /* make sure that ramblock is 0 terminated */
> >> >> -    packet->ramblock[255] = 0;
> >> >> -    p->block = qemu_ram_block_by_name(packet->ramblock);
> >> >> +    ramblock_name = g_strndup(packet->ramblock, 255);
> >> >
> >> > I understand we want to move to a const*, however this introduces a 256B
> >> > allocation per multifd packet, which we definitely want to avoid.. I wonder
> >> > whether that's worthwhile just to make it const. :-(
> >> >
> >> > I don't worry too much on the const* and vars pointed being abused /
> >> > updated when without it - the packet struct is pretty much limited only to
> >> > be referenced in this unfill function, and then we will do the load based
> >> > on MultiFDRecvParams* later anyway.  So personally I'd rather lose the
> >> > const* v.s. one allocation.
> >> >
> >> > Or we could also sanity check byte 255 to be '\0' (which, AFAIU, should
> >> > always be the case..), then we can get both benefits.
> >> 
> >> We can't because it breaks compat. Previous QEMUs didn't zero the
> >> packet.
> >
> > Ouch!
> >
> > Then.. shall we still try to avoid the allocation?
> 
> Can I strcpy it to the stack?
> 
> char idstr[256];
> 
> strncpy(&idstr, packet->ramblock, 256);
> idstr[255] = 0;

Should be much better than an allocation, yes.  However personally I'd
still try to avoid that.

Multifd is a performance feature, after all, so we care about perf here
more than elsewhere.  Meanwhile this is exactly the hot path on recv
side.. so it might still be wise we leave all non-trivial cosmetic changes
for later when it's against it.

-- 
Peter Xu