[PATCH v6 19/19] migration/multifd: Add documentation for multifd methods

Fabiano Rosas posted 19 patches 2 months, 4 weeks ago
[PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Fabiano Rosas 2 months, 4 weeks ago
Add documentation clarifying the usage of the multifd methods. The
general idea is that the client code calls into multifd to trigger
send/recv of data and multifd then calls these hooks back from the
worker threads at opportune moments so the client can process a
portion of the data.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
Note that the doc is not symmetrical among send/recv because the recv
side is still wonky. It doesn't give the packet to the hooks, which
forces the p->normal, p->zero, etc. to be processed at the top level
of the threads, where no client-specific information should be.
---
 migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 70 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 13e7a88c01..ebb17bdbcf 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -229,17 +229,81 @@ typedef struct {
 } MultiFDRecvParams;
 
 typedef struct {
-    /* Setup for sending side */
+    /*
+     * The send_setup, send_cleanup, send_prepare are only called on
+     * the QEMU instance at the migration source.
+     */
+
+    /*
+     * Setup for sending side. Called once per channel during channel
+     * setup phase.
+     *
+     * Must allocate p->iov. If packets are in use (default), one
+     * extra iovec must be allocated for the packet header. Any memory
+     * allocated in this hook must be released at send_cleanup.
+     *
+     * p->write_flags may be used for passing flags to the QIOChannel.
+     *
+     * p->compression_data may be used by compression methods to store
+     * compression data.
+     */
     int (*send_setup)(MultiFDSendParams *p, Error **errp);
-    /* Cleanup for sending side */
+
+    /*
+     * Cleanup for sending side. Called once per channel during
+     * channel cleanup phase. May be empty.
+     */
     void (*send_cleanup)(MultiFDSendParams *p, Error **errp);
-    /* Prepare the send packet */
+
+    /*
+     * Prepare the send packet. Called from multifd_send(), with p
+     * pointing to the MultiFDSendParams of a channel that is
+     * currently idle.
+     *
+     * Must populate p->iov with the data to be sent, increment
+     * p->iovs_num to match the amount of iovecs used and set
+     * p->next_packet_size with the amount of data currently present
+     * in p->iov.
+     *
+     * Must indicate whether this is a compression packet by setting
+     * p->flags.
+     *
+     * As a last step, if packets are in use (default), must prepare
+     * the packet by calling multifd_send_fill_packet().
+     */
     int (*send_prepare)(MultiFDSendParams *p, Error **errp);
-    /* Setup for receiving side */
+
+    /*
+     * The recv_setup, recv_cleanup, recv are only called on the QEMU
+     * instance at the migration destination.
+     */
+
+    /*
+     * Setup for receiving side. Called once per channel during
+     * channel setup phase. May be empty.
+     *
+     * May allocate data structures for the receiving of data. May use
+     * p->iov. Compression methods may use p->compress_data.
+     */
     int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
-    /* Cleanup for receiving side */
+
+    /*
+     * Cleanup for receiving side. Called once per channel during
+     * channel cleanup phase. May be empty.
+     */
     void (*recv_cleanup)(MultiFDRecvParams *p);
-    /* Read all data */
+
+    /*
+     * Data receive method. Called from multifd_recv(), with p
+     * pointing to the MultiFDRecvParams of a channel that is
+     * currently idle. Only called if there is data available to
+     * receive.
+     *
+     * Must validate p->flags according to what was set at
+     * send_prepare.
+     *
+     * Must read the data from the QIOChannel p->c.
+     */
     int (*recv)(MultiFDRecvParams *p, Error **errp);
 } MultiFDMethods;
 
-- 
2.35.3
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Peter Xu 2 months, 4 weeks ago
On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
> Add documentation clarifying the usage of the multifd methods. The
> general idea is that the client code calls into multifd to trigger
> send/recv of data and multifd then calls these hooks back from the
> worker threads at opportune moments so the client can process a
> portion of the data.
> 
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> Note that the doc is not symmetrical among send/recv because the recv
> side is still wonky. It doesn't give the packet to the hooks, which
> forces the p->normal, p->zero, etc. to be processed at the top level
> of the threads, where no client-specific information should be.
> ---
>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 70 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 13e7a88c01..ebb17bdbcf 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -229,17 +229,81 @@ typedef struct {
>  } MultiFDRecvParams;
>  
>  typedef struct {
> -    /* Setup for sending side */
> +    /*
> +     * The send_setup, send_cleanup, send_prepare are only called on
> +     * the QEMU instance at the migration source.
> +     */
> +
> +    /*
> +     * Setup for sending side. Called once per channel during channel
> +     * setup phase.
> +     *
> +     * Must allocate p->iov. If packets are in use (default), one

Pure thoughts: wonder whether we can assert(p->iov) that after the hook
returns in code to match this line.

> +     * extra iovec must be allocated for the packet header. Any memory
> +     * allocated in this hook must be released at send_cleanup.
> +     *
> +     * p->write_flags may be used for passing flags to the QIOChannel.
> +     *
> +     * p->compression_data may be used by compression methods to store
> +     * compression data.
> +     */
>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
> -    /* Cleanup for sending side */
> +
> +    /*
> +     * Cleanup for sending side. Called once per channel during
> +     * channel cleanup phase. May be empty.

Hmm, if we require p->iov allocation per-ops, then they must free it here?
I wonder whether we leaked it in most compressors.

With that, I wonder whether we should also assert(p->iov == NULL) after
this one returns (squash in this same patch).

> +     */
>      void (*send_cleanup)(MultiFDSendParams *p, Error **errp);
> -    /* Prepare the send packet */
> +
> +    /*
> +     * Prepare the send packet. Called from multifd_send(), with p

multifd_send_thread()?

> +     * pointing to the MultiFDSendParams of a channel that is
> +     * currently idle.
> +     *
> +     * Must populate p->iov with the data to be sent, increment
> +     * p->iovs_num to match the amount of iovecs used and set
> +     * p->next_packet_size with the amount of data currently present
> +     * in p->iov.
> +     *
> +     * Must indicate whether this is a compression packet by setting
> +     * p->flags.

Sigh.. I wonder whether we could avoid mentioning this, and also we avoid
adding new flags for new compressors, relying on libvirt guarding things.
Then when we have the handshakes that's something we verify there.

> +     *
> +     * As a last step, if packets are in use (default), must prepare
> +     * the packet by calling multifd_send_fill_packet().
> +     */
>      int (*send_prepare)(MultiFDSendParams *p, Error **errp);
> -    /* Setup for receiving side */
> +
> +    /*
> +     * The recv_setup, recv_cleanup, recv are only called on the QEMU
> +     * instance at the migration destination.
> +     */
> +
> +    /*
> +     * Setup for receiving side. Called once per channel during
> +     * channel setup phase. May be empty.
> +     *
> +     * May allocate data structures for the receiving of data. May use
> +     * p->iov. Compression methods may use p->compress_data.
> +     */
>      int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
> -    /* Cleanup for receiving side */
> +
> +    /*
> +     * Cleanup for receiving side. Called once per channel during
> +     * channel cleanup phase. May be empty.
> +     */
>      void (*recv_cleanup)(MultiFDRecvParams *p);
> -    /* Read all data */
> +
> +    /*
> +     * Data receive method. Called from multifd_recv(), with p

multifd_recv_thread()?

> +     * pointing to the MultiFDRecvParams of a channel that is
> +     * currently idle. Only called if there is data available to
> +     * receive.
> +     *
> +     * Must validate p->flags according to what was set at
> +     * send_prepare.
> +     *
> +     * Must read the data from the QIOChannel p->c.
> +     */
>      int (*recv)(MultiFDRecvParams *p, Error **errp);
>  } MultiFDMethods;
>  
> -- 
> 2.35.3
> 

-- 
Peter Xu
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Fabiano Rosas 2 months, 4 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> Add documentation clarifying the usage of the multifd methods. The
>> general idea is that the client code calls into multifd to trigger
>> send/recv of data and multifd then calls these hooks back from the
>> worker threads at opportune moments so the client can process a
>> portion of the data.
>> 
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> Note that the doc is not symmetrical among send/recv because the recv
>> side is still wonky. It doesn't give the packet to the hooks, which
>> forces the p->normal, p->zero, etc. to be processed at the top level
>> of the threads, where no client-specific information should be.
>> ---
>>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>>  1 file changed, 70 insertions(+), 6 deletions(-)
>> 
>> diff --git a/migration/multifd.h b/migration/multifd.h
>> index 13e7a88c01..ebb17bdbcf 100644
>> --- a/migration/multifd.h
>> +++ b/migration/multifd.h
>> @@ -229,17 +229,81 @@ typedef struct {
>>  } MultiFDRecvParams;
>>  
>>  typedef struct {
>> -    /* Setup for sending side */
>> +    /*
>> +     * The send_setup, send_cleanup, send_prepare are only called on
>> +     * the QEMU instance at the migration source.
>> +     */
>> +
>> +    /*
>> +     * Setup for sending side. Called once per channel during channel
>> +     * setup phase.
>> +     *
>> +     * Must allocate p->iov. If packets are in use (default), one
>
> Pure thoughts: wonder whether we can assert(p->iov) that after the hook
> returns in code to match this line.

Not worth the extra instructions in my opinion. It would crash
immediately once the thread touches p->iov anyway.

>
>> +     * extra iovec must be allocated for the packet header. Any memory
>> +     * allocated in this hook must be released at send_cleanup.
>> +     *
>> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> +     *
>> +     * p->compression_data may be used by compression methods to store
>> +     * compression data.
>> +     */
>>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> -    /* Cleanup for sending side */
>> +
>> +    /*
>> +     * Cleanup for sending side. Called once per channel during
>> +     * channel cleanup phase. May be empty.
>
> Hmm, if we require p->iov allocation per-ops, then they must free it here?
> I wonder whether we leaked it in most compressors.

Sorry, this one shouldn't have that text.

>
> With that, I wonder whether we should also assert(p->iov == NULL) after
> this one returns (squash in this same patch).
>
>> +     */
>>      void (*send_cleanup)(MultiFDSendParams *p, Error **errp);
>> -    /* Prepare the send packet */
>> +
>> +    /*
>> +     * Prepare the send packet. Called from multifd_send(), with p
>
> multifd_send_thread()?

No, I meant called as a result of multifd_send(), which is the function
the client uses to trigger a send on the thread.

>
>> +     * pointing to the MultiFDSendParams of a channel that is
>> +     * currently idle.
>> +     *
>> +     * Must populate p->iov with the data to be sent, increment
>> +     * p->iovs_num to match the amount of iovecs used and set
>> +     * p->next_packet_size with the amount of data currently present
>> +     * in p->iov.
>> +     *
>> +     * Must indicate whether this is a compression packet by setting
>> +     * p->flags.
>
> Sigh.. I wonder whether we could avoid mentioning this, and also we avoid
> adding new flags for new compressors, relying on libvirt guarding things.
> Then when we have the handshakes that's something we verify there.
>

I understand that part is not in the best shape, but we must document
the current state. There's no problem changing this later.

Besides, there's the whole "the migration stream should be considered
hostile" which might mean we should really be keeping these sanity check
flags around in case something really weird happens so we don't carry on
with a bad stream.

>> +     *
>> +     * As a last step, if packets are in use (default), must prepare
>> +     * the packet by calling multifd_send_fill_packet().
>> +     */
>>      int (*send_prepare)(MultiFDSendParams *p, Error **errp);
>> -    /* Setup for receiving side */
>> +
>> +    /*
>> +     * The recv_setup, recv_cleanup, recv are only called on the QEMU
>> +     * instance at the migration destination.
>> +     */
>> +
>> +    /*
>> +     * Setup for receiving side. Called once per channel during
>> +     * channel setup phase. May be empty.
>> +     *
>> +     * May allocate data structures for the receiving of data. May use
>> +     * p->iov. Compression methods may use p->compress_data.
>> +     */
>>      int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
>> -    /* Cleanup for receiving side */
>> +
>> +    /*
>> +     * Cleanup for receiving side. Called once per channel during
>> +     * channel cleanup phase. May be empty.
>> +     */
>>      void (*recv_cleanup)(MultiFDRecvParams *p);
>> -    /* Read all data */
>> +
>> +    /*
>> +     * Data receive method. Called from multifd_recv(), with p
>
> multifd_recv_thread()?

Same as before. I'll reword this somehow.

>
>> +     * pointing to the MultiFDRecvParams of a channel that is
>> +     * currently idle. Only called if there is data available to
>> +     * receive.
>> +     *
>> +     * Must validate p->flags according to what was set at
>> +     * send_prepare.
>> +     *
>> +     * Must read the data from the QIOChannel p->c.
>> +     */
>>      int (*recv)(MultiFDRecvParams *p, Error **errp);
>>  } MultiFDMethods;
>>  
>> -- 
>> 2.35.3
>>
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Peter Xu 2 months, 4 weeks ago
On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
> >> Add documentation clarifying the usage of the multifd methods. The
> >> general idea is that the client code calls into multifd to trigger
> >> send/recv of data and multifd then calls these hooks back from the
> >> worker threads at opportune moments so the client can process a
> >> portion of the data.
> >> 
> >> Suggested-by: Peter Xu <peterx@redhat.com>
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> ---
> >> Note that the doc is not symmetrical among send/recv because the recv
> >> side is still wonky. It doesn't give the packet to the hooks, which
> >> forces the p->normal, p->zero, etc. to be processed at the top level
> >> of the threads, where no client-specific information should be.
> >> ---
> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
> >>  1 file changed, 70 insertions(+), 6 deletions(-)
> >> 
> >> diff --git a/migration/multifd.h b/migration/multifd.h
> >> index 13e7a88c01..ebb17bdbcf 100644
> >> --- a/migration/multifd.h
> >> +++ b/migration/multifd.h
> >> @@ -229,17 +229,81 @@ typedef struct {
> >>  } MultiFDRecvParams;
> >>  
> >>  typedef struct {
> >> -    /* Setup for sending side */
> >> +    /*
> >> +     * The send_setup, send_cleanup, send_prepare are only called on
> >> +     * the QEMU instance at the migration source.
> >> +     */
> >> +
> >> +    /*
> >> +     * Setup for sending side. Called once per channel during channel
> >> +     * setup phase.
> >> +     *
> >> +     * Must allocate p->iov. If packets are in use (default), one
> >
> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
> > returns in code to match this line.
> 
> Not worth the extra instructions in my opinion. It would crash
> immediately once the thread touches p->iov anyway.

It might still be good IMHO to have that assert(), not only to abort
earlier, but also as a code-styled comment.  Your call when resend.

PS: feel free to queue existing patches into your own tree without
resending the whole series!

> 
> >
> >> +     * extra iovec must be allocated for the packet header. Any memory
> >> +     * allocated in this hook must be released at send_cleanup.
> >> +     *
> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
> >> +     *
> >> +     * p->compression_data may be used by compression methods to store
> >> +     * compression data.
> >> +     */
> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
> >> -    /* Cleanup for sending side */
> >> +
> >> +    /*
> >> +     * Cleanup for sending side. Called once per channel during
> >> +     * channel cleanup phase. May be empty.
> >
> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
> > I wonder whether we leaked it in most compressors.
> 
> Sorry, this one shouldn't have that text.

I still want to double check with you: we leaked iov[] in most compressors
here, or did I overlook something?

That's definitely more important than the doc update itself..

> 
> >
> > With that, I wonder whether we should also assert(p->iov == NULL) after
> > this one returns (squash in this same patch).
> >
> >> +     */
> >>      void (*send_cleanup)(MultiFDSendParams *p, Error **errp);
> >> -    /* Prepare the send packet */
> >> +
> >> +    /*
> >> +     * Prepare the send packet. Called from multifd_send(), with p
> >
> > multifd_send_thread()?
> 
> No, I meant called as a result of multifd_send(), which is the function
> the client uses to trigger a send on the thread.

OK, but it's confusing.  Some rewords you mentioned below could work.

> 
> >
> >> +     * pointing to the MultiFDSendParams of a channel that is
> >> +     * currently idle.
> >> +     *
> >> +     * Must populate p->iov with the data to be sent, increment
> >> +     * p->iovs_num to match the amount of iovecs used and set
> >> +     * p->next_packet_size with the amount of data currently present
> >> +     * in p->iov.
> >> +     *
> >> +     * Must indicate whether this is a compression packet by setting
> >> +     * p->flags.
> >
> > Sigh.. I wonder whether we could avoid mentioning this, and also we avoid
> > adding new flags for new compressors, relying on libvirt guarding things.
> > Then when we have the handshakes that's something we verify there.
> >
> 
> I understand that part is not in the best shape, but we must document
> the current state. There's no problem changing this later.
> 
> Besides, there's the whole "the migration stream should be considered
> hostile" which might mean we should really be keeping these sanity check
> flags around in case something really weird happens so we don't carry on
> with a bad stream.

Yep, it's OK.

> 
> >> +     *
> >> +     * As a last step, if packets are in use (default), must prepare
> >> +     * the packet by calling multifd_send_fill_packet().
> >> +     */
> >>      int (*send_prepare)(MultiFDSendParams *p, Error **errp);
> >> -    /* Setup for receiving side */
> >> +
> >> +    /*
> >> +     * The recv_setup, recv_cleanup, recv are only called on the QEMU
> >> +     * instance at the migration destination.
> >> +     */
> >> +
> >> +    /*
> >> +     * Setup for receiving side. Called once per channel during
> >> +     * channel setup phase. May be empty.
> >> +     *
> >> +     * May allocate data structures for the receiving of data. May use
> >> +     * p->iov. Compression methods may use p->compress_data.
> >> +     */
> >>      int (*recv_setup)(MultiFDRecvParams *p, Error **errp);
> >> -    /* Cleanup for receiving side */
> >> +
> >> +    /*
> >> +     * Cleanup for receiving side. Called once per channel during
> >> +     * channel cleanup phase. May be empty.
> >> +     */
> >>      void (*recv_cleanup)(MultiFDRecvParams *p);
> >> -    /* Read all data */
> >> +
> >> +    /*
> >> +     * Data receive method. Called from multifd_recv(), with p
> >
> > multifd_recv_thread()?
> 
> Same as before. I'll reword this somehow.
> 
> >
> >> +     * pointing to the MultiFDRecvParams of a channel that is
> >> +     * currently idle. Only called if there is data available to
> >> +     * receive.
> >> +     *
> >> +     * Must validate p->flags according to what was set at
> >> +     * send_prepare.
> >> +     *
> >> +     * Must read the data from the QIOChannel p->c.
> >> +     */
> >>      int (*recv)(MultiFDRecvParams *p, Error **errp);
> >>  } MultiFDMethods;
> >>  
> >> -- 
> >> 2.35.3
> >> 
> 

-- 
Peter Xu
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Fabiano Rosas 2 months, 4 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> >> Add documentation clarifying the usage of the multifd methods. The
>> >> general idea is that the client code calls into multifd to trigger
>> >> send/recv of data and multifd then calls these hooks back from the
>> >> worker threads at opportune moments so the client can process a
>> >> portion of the data.
>> >> 
>> >> Suggested-by: Peter Xu <peterx@redhat.com>
>> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> >> ---
>> >> Note that the doc is not symmetrical among send/recv because the recv
>> >> side is still wonky. It doesn't give the packet to the hooks, which
>> >> forces the p->normal, p->zero, etc. to be processed at the top level
>> >> of the threads, where no client-specific information should be.
>> >> ---
>> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>> >>  1 file changed, 70 insertions(+), 6 deletions(-)
>> >> 
>> >> diff --git a/migration/multifd.h b/migration/multifd.h
>> >> index 13e7a88c01..ebb17bdbcf 100644
>> >> --- a/migration/multifd.h
>> >> +++ b/migration/multifd.h
>> >> @@ -229,17 +229,81 @@ typedef struct {
>> >>  } MultiFDRecvParams;
>> >>  
>> >>  typedef struct {
>> >> -    /* Setup for sending side */
>> >> +    /*
>> >> +     * The send_setup, send_cleanup, send_prepare are only called on
>> >> +     * the QEMU instance at the migration source.
>> >> +     */
>> >> +
>> >> +    /*
>> >> +     * Setup for sending side. Called once per channel during channel
>> >> +     * setup phase.
>> >> +     *
>> >> +     * Must allocate p->iov. If packets are in use (default), one
>> >
>> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
>> > returns in code to match this line.
>> 
>> Not worth the extra instructions in my opinion. It would crash
>> immediately once the thread touches p->iov anyway.
>
> It might still be good IMHO to have that assert(), not only to abort
> earlier, but also as a code-styled comment.  Your call when resend.
>
> PS: feel free to queue existing patches into your own tree without
> resending the whole series!
>
>> 
>> >
>> >> +     * extra iovec must be allocated for the packet header. Any memory
>> >> +     * allocated in this hook must be released at send_cleanup.
>> >> +     *
>> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> >> +     *
>> >> +     * p->compression_data may be used by compression methods to store
>> >> +     * compression data.
>> >> +     */
>> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> >> -    /* Cleanup for sending side */
>> >> +
>> >> +    /*
>> >> +     * Cleanup for sending side. Called once per channel during
>> >> +     * channel cleanup phase. May be empty.
>> >
>> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
>> > I wonder whether we leaked it in most compressors.
>> 
>> Sorry, this one shouldn't have that text.
>
> I still want to double check with you: we leaked iov[] in most compressors
> here, or did I overlook something?

They have their own send_cleanup function where p->iov is freed.
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Peter Xu 2 months, 4 weeks ago
On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
> >> >> Add documentation clarifying the usage of the multifd methods. The
> >> >> general idea is that the client code calls into multifd to trigger
> >> >> send/recv of data and multifd then calls these hooks back from the
> >> >> worker threads at opportune moments so the client can process a
> >> >> portion of the data.
> >> >> 
> >> >> Suggested-by: Peter Xu <peterx@redhat.com>
> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> >> ---
> >> >> Note that the doc is not symmetrical among send/recv because the recv
> >> >> side is still wonky. It doesn't give the packet to the hooks, which
> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
> >> >> of the threads, where no client-specific information should be.
> >> >> ---
> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
> >> >> 
> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
> >> >> index 13e7a88c01..ebb17bdbcf 100644
> >> >> --- a/migration/multifd.h
> >> >> +++ b/migration/multifd.h
> >> >> @@ -229,17 +229,81 @@ typedef struct {
> >> >>  } MultiFDRecvParams;
> >> >>  
> >> >>  typedef struct {
> >> >> -    /* Setup for sending side */
> >> >> +    /*
> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
> >> >> +     * the QEMU instance at the migration source.
> >> >> +     */
> >> >> +
> >> >> +    /*
> >> >> +     * Setup for sending side. Called once per channel during channel
> >> >> +     * setup phase.
> >> >> +     *
> >> >> +     * Must allocate p->iov. If packets are in use (default), one
> >> >
> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
> >> > returns in code to match this line.
> >> 
> >> Not worth the extra instructions in my opinion. It would crash
> >> immediately once the thread touches p->iov anyway.
> >
> > It might still be good IMHO to have that assert(), not only to abort
> > earlier, but also as a code-styled comment.  Your call when resend.
> >
> > PS: feel free to queue existing patches into your own tree without
> > resending the whole series!
> >
> >> 
> >> >
> >> >> +     * extra iovec must be allocated for the packet header. Any memory
> >> >> +     * allocated in this hook must be released at send_cleanup.
> >> >> +     *
> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
> >> >> +     *
> >> >> +     * p->compression_data may be used by compression methods to store
> >> >> +     * compression data.
> >> >> +     */
> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
> >> >> -    /* Cleanup for sending side */
> >> >> +
> >> >> +    /*
> >> >> +     * Cleanup for sending side. Called once per channel during
> >> >> +     * channel cleanup phase. May be empty.
> >> >
> >> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
> >> > I wonder whether we leaked it in most compressors.
> >> 
> >> Sorry, this one shouldn't have that text.
> >
> > I still want to double check with you: we leaked iov[] in most compressors
> > here, or did I overlook something?
> 
> They have their own send_cleanup function where p->iov is freed.

Oh, so I guess I just accidentally stumbled upon
multifd_uadk_send_cleanup() when looking..

I thought I looked a few more but now when I check most of them are indeed
there but looks like uadk is missing that.

I think it might still be a good idea to assert(iov==NULL) after the
cleanup..

-- 
Peter Xu
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Fabiano Rosas 2 months, 4 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> >> >> Add documentation clarifying the usage of the multifd methods. The
>> >> >> general idea is that the client code calls into multifd to trigger
>> >> >> send/recv of data and multifd then calls these hooks back from the
>> >> >> worker threads at opportune moments so the client can process a
>> >> >> portion of the data.
>> >> >> 
>> >> >> Suggested-by: Peter Xu <peterx@redhat.com>
>> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> >> >> ---
>> >> >> Note that the doc is not symmetrical among send/recv because the recv
>> >> >> side is still wonky. It doesn't give the packet to the hooks, which
>> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
>> >> >> of the threads, where no client-specific information should be.
>> >> >> ---
>> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
>> >> >> 
>> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
>> >> >> index 13e7a88c01..ebb17bdbcf 100644
>> >> >> --- a/migration/multifd.h
>> >> >> +++ b/migration/multifd.h
>> >> >> @@ -229,17 +229,81 @@ typedef struct {
>> >> >>  } MultiFDRecvParams;
>> >> >>  
>> >> >>  typedef struct {
>> >> >> -    /* Setup for sending side */
>> >> >> +    /*
>> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
>> >> >> +     * the QEMU instance at the migration source.
>> >> >> +     */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Setup for sending side. Called once per channel during channel
>> >> >> +     * setup phase.
>> >> >> +     *
>> >> >> +     * Must allocate p->iov. If packets are in use (default), one
>> >> >
>> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
>> >> > returns in code to match this line.
>> >> 
>> >> Not worth the extra instructions in my opinion. It would crash
>> >> immediately once the thread touches p->iov anyway.
>> >
>> > It might still be good IMHO to have that assert(), not only to abort
>> > earlier, but also as a code-styled comment.  Your call when resend.
>> >
>> > PS: feel free to queue existing patches into your own tree without
>> > resending the whole series!
>> >
>> >> 
>> >> >
>> >> >> +     * extra iovec must be allocated for the packet header. Any memory
>> >> >> +     * allocated in this hook must be released at send_cleanup.
>> >> >> +     *
>> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> >> >> +     *
>> >> >> +     * p->compression_data may be used by compression methods to store
>> >> >> +     * compression data.
>> >> >> +     */
>> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> >> >> -    /* Cleanup for sending side */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Cleanup for sending side. Called once per channel during
>> >> >> +     * channel cleanup phase. May be empty.
>> >> >
>> >> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
>> >> > I wonder whether we leaked it in most compressors.
>> >> 
>> >> Sorry, this one shouldn't have that text.
>> >
>> > I still want to double check with you: we leaked iov[] in most compressors
>> > here, or did I overlook something?
>> 
>> They have their own send_cleanup function where p->iov is freed.
>
> Oh, so I guess I just accidentally stumbled upon
> multifd_uadk_send_cleanup() when looking..

Yeah, this is a bit worrying. The reason this has not shown on valgrind
or the asan that Peter ran recently is that uadk, qpl and soon qat are
never enabled in a regular build. I have myself introduced compilation
errors in those files that I only caught by accident at a later point
(before sending to the ml).

>
> I thought I looked a few more but now when I check most of them are indeed
> there but looks like uadk is missing that.
>
> I think it might still be a good idea to assert(iov==NULL) after the
> cleanup..

Should we maybe just free p->iov at the top level then?
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Peter Xu 2 months, 4 weeks ago
On Tue, Aug 27, 2024 at 05:22:32PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> 
> >> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
> >> >> >> Add documentation clarifying the usage of the multifd methods. The
> >> >> >> general idea is that the client code calls into multifd to trigger
> >> >> >> send/recv of data and multifd then calls these hooks back from the
> >> >> >> worker threads at opportune moments so the client can process a
> >> >> >> portion of the data.
> >> >> >> 
> >> >> >> Suggested-by: Peter Xu <peterx@redhat.com>
> >> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> >> >> ---
> >> >> >> Note that the doc is not symmetrical among send/recv because the recv
> >> >> >> side is still wonky. It doesn't give the packet to the hooks, which
> >> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
> >> >> >> of the threads, where no client-specific information should be.
> >> >> >> ---
> >> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
> >> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
> >> >> >> 
> >> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
> >> >> >> index 13e7a88c01..ebb17bdbcf 100644
> >> >> >> --- a/migration/multifd.h
> >> >> >> +++ b/migration/multifd.h
> >> >> >> @@ -229,17 +229,81 @@ typedef struct {
> >> >> >>  } MultiFDRecvParams;
> >> >> >>  
> >> >> >>  typedef struct {
> >> >> >> -    /* Setup for sending side */
> >> >> >> +    /*
> >> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
> >> >> >> +     * the QEMU instance at the migration source.
> >> >> >> +     */
> >> >> >> +
> >> >> >> +    /*
> >> >> >> +     * Setup for sending side. Called once per channel during channel
> >> >> >> +     * setup phase.
> >> >> >> +     *
> >> >> >> +     * Must allocate p->iov. If packets are in use (default), one
> >> >> >
> >> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
> >> >> > returns in code to match this line.
> >> >> 
> >> >> Not worth the extra instructions in my opinion. It would crash
> >> >> immediately once the thread touches p->iov anyway.
> >> >
> >> > It might still be good IMHO to have that assert(), not only to abort
> >> > earlier, but also as a code-styled comment.  Your call when resend.
> >> >
> >> > PS: feel free to queue existing patches into your own tree without
> >> > resending the whole series!
> >> >
> >> >> 
> >> >> >
> >> >> >> +     * extra iovec must be allocated for the packet header. Any memory
> >> >> >> +     * allocated in this hook must be released at send_cleanup.
> >> >> >> +     *
> >> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
> >> >> >> +     *
> >> >> >> +     * p->compression_data may be used by compression methods to store
> >> >> >> +     * compression data.
> >> >> >> +     */
> >> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
> >> >> >> -    /* Cleanup for sending side */
> >> >> >> +
> >> >> >> +    /*
> >> >> >> +     * Cleanup for sending side. Called once per channel during
> >> >> >> +     * channel cleanup phase. May be empty.
> >> >> >
> >> >> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
> >> >> > I wonder whether we leaked it in most compressors.
> >> >> 
> >> >> Sorry, this one shouldn't have that text.
> >> >
> >> > I still want to double check with you: we leaked iov[] in most compressors
> >> > here, or did I overlook something?
> >> 
> >> They have their own send_cleanup function where p->iov is freed.
> >
> > Oh, so I guess I just accidentally stumbled upon
> > multifd_uadk_send_cleanup() when looking..
> 
> Yeah, this is a bit worrying. The reason this has not shown on valgrind
> or the asan that Peter ran recently is that uadk, qpl and soon qat are
> never enabled in a regular build. I have myself introduced compilation
> errors in those files that I only caught by accident at a later point
> (before sending to the ml).

I tried to manually install qpl and uadk just now but neither of them is
trivial to compile and install..  I hit random errors here and there in my
first shot.

OTOH, qatzip packages are around at least in Fedora repositories, so that
might be the easiest to reach.  Not sure how's that when with OpenSUSE.

Shall we perhaps draft an email and check with them? E.g., would that be
better if there's plan they would at some point provide RPMs for libraries
at some point so that we could somehow integrate that into CI routines?

> 
> >
> > I thought I looked a few more but now when I check most of them are indeed
> > there but looks like uadk is missing that.
> >
> > I think it might still be a good idea to assert(iov==NULL) after the
> > cleanup..
> 
> Should we maybe just free p->iov at the top level then?

We could, but if so it might be good to also allocate at the top level so
the hooks are paired up on these allocations/frees.

IMHO we could already always allocate iov[] to page_count+2 which is the
maximum of all compressors - currently we've got 128 pages per packet by
default, which is 128*16=2KB iov[] per channel.  Not so bad when only used
during migrations.

Or we can perhaps do send_setup(..., &iovs_needed), returns how many iovs
are needed, then multifd allocates them.

-- 
Peter Xu
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Fabiano Rosas 2 months, 3 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Tue, Aug 27, 2024 at 05:22:32PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
>> >> >> Peter Xu <peterx@redhat.com> writes:
>> >> >> 
>> >> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> >> >> >> Add documentation clarifying the usage of the multifd methods. The
>> >> >> >> general idea is that the client code calls into multifd to trigger
>> >> >> >> send/recv of data and multifd then calls these hooks back from the
>> >> >> >> worker threads at opportune moments so the client can process a
>> >> >> >> portion of the data.
>> >> >> >> 
>> >> >> >> Suggested-by: Peter Xu <peterx@redhat.com>
>> >> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> >> >> >> ---
>> >> >> >> Note that the doc is not symmetrical among send/recv because the recv
>> >> >> >> side is still wonky. It doesn't give the packet to the hooks, which
>> >> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
>> >> >> >> of the threads, where no client-specific information should be.
>> >> >> >> ---
>> >> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>> >> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
>> >> >> >> 
>> >> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
>> >> >> >> index 13e7a88c01..ebb17bdbcf 100644
>> >> >> >> --- a/migration/multifd.h
>> >> >> >> +++ b/migration/multifd.h
>> >> >> >> @@ -229,17 +229,81 @@ typedef struct {
>> >> >> >>  } MultiFDRecvParams;
>> >> >> >>  
>> >> >> >>  typedef struct {
>> >> >> >> -    /* Setup for sending side */
>> >> >> >> +    /*
>> >> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
>> >> >> >> +     * the QEMU instance at the migration source.
>> >> >> >> +     */
>> >> >> >> +
>> >> >> >> +    /*
>> >> >> >> +     * Setup for sending side. Called once per channel during channel
>> >> >> >> +     * setup phase.
>> >> >> >> +     *
>> >> >> >> +     * Must allocate p->iov. If packets are in use (default), one
>> >> >> >
>> >> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
>> >> >> > returns in code to match this line.
>> >> >> 
>> >> >> Not worth the extra instructions in my opinion. It would crash
>> >> >> immediately once the thread touches p->iov anyway.
>> >> >
>> >> > It might still be good IMHO to have that assert(), not only to abort
>> >> > earlier, but also as a code-styled comment.  Your call when resend.
>> >> >
>> >> > PS: feel free to queue existing patches into your own tree without
>> >> > resending the whole series!
>> >> >
>> >> >> 
>> >> >> >
>> >> >> >> +     * extra iovec must be allocated for the packet header. Any memory
>> >> >> >> +     * allocated in this hook must be released at send_cleanup.
>> >> >> >> +     *
>> >> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> >> >> >> +     *
>> >> >> >> +     * p->compression_data may be used by compression methods to store
>> >> >> >> +     * compression data.
>> >> >> >> +     */
>> >> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> >> >> >> -    /* Cleanup for sending side */
>> >> >> >> +
>> >> >> >> +    /*
>> >> >> >> +     * Cleanup for sending side. Called once per channel during
>> >> >> >> +     * channel cleanup phase. May be empty.
>> >> >> >
>> >> >> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
>> >> >> > I wonder whether we leaked it in most compressors.
>> >> >> 
>> >> >> Sorry, this one shouldn't have that text.
>> >> >
>> >> > I still want to double check with you: we leaked iov[] in most compressors
>> >> > here, or did I overlook something?
>> >> 
>> >> They have their own send_cleanup function where p->iov is freed.
>> >
>> > Oh, so I guess I just accidentally stumbled upon
>> > multifd_uadk_send_cleanup() when looking..
>> 
>> Yeah, this is a bit worrying. The reason this has not shown on valgrind
>> or the asan that Peter ran recently is that uadk, qpl and soon qat are
>> never enabled in a regular build. I have myself introduced compilation
>> errors in those files that I only caught by accident at a later point
>> (before sending to the ml).
>
> I tried to manually install qpl and uadk just now but neither of them is
> trivial to compile and install..  I hit random errors here and there in my
> first shot.
>
> OTOH, qatzip packages are around at least in Fedora repositories, so that
> might be the easiest to reach.  Not sure how's that when with OpenSUSE.
>
> Shall we perhaps draft an email and check with them? E.g., would that be
> better if there's plan they would at some point provide RPMs for libraries
> at some point so that we could somehow integrate that into CI routines?

We merged most of these things already. Now even if rpms show up at some
point we still have to deal with not being able to build that code until
then. Perhaps we could have a container that has all of these
pre-installed just to exercize the code a bit. But it still wouldn't
catch some issues becase we cannot run the code due to the lack of
hardware.

>
>> 
>> >
>> > I thought I looked a few more but now when I check most of them are indeed
>> > there but looks like uadk is missing that.
>> >
>> > I think it might still be a good idea to assert(iov==NULL) after the
>> > cleanup..
>> 
>> Should we maybe just free p->iov at the top level then?
>
> We could, but if so it might be good to also allocate at the top level so
> the hooks are paired up on these allocations/frees.
>
> IMHO we could already always allocate iov[] to page_count+2 which is the
> maximum of all compressors - currently we've got 128 pages per packet by
> default, which is 128*16=2KB iov[] per channel.  Not so bad when only used
> during migrations.
>
> Or we can perhaps do send_setup(..., &iovs_needed), returns how many iovs
> are needed, then multifd allocates them.

Let me play around with these a bit. I might just fix uadk and leave
everything else as it is for now.
Re: [PATCH v6 19/19] migration/multifd: Add documentation for multifd methods
Posted by Peter Xu 2 months, 3 weeks ago
On Wed, Aug 28, 2024 at 10:04:47AM -0300, Fabiano Rosas wrote:
> We merged most of these things already. Now even if rpms show up at some
> point we still have to deal with not being able to build that code until
> then. Perhaps we could have a container that has all of these
> pre-installed just to exercize the code a bit. But it still wouldn't
> catch some issues becase we cannot run the code due to the lack of
> hardware.

Yes, ultimately we may need help from the relevant people..

One last fallback plan is we can consult them for help at least to make
sure it's working at the end of each release, so it might be helpful they
help verify the code at soft-freeze for each release.  Then we can keep the
development as usual ignoring them during dev cycles.

If we find some feature broken (e.g. fail to compile..) for more than
multiple releases, it may mean that upstream has nobody using it, then we
suggest obsoletions.

-- 
Peter Xu