[PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races

Fabiano Rosas posted 6 patches 9 months, 3 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20240205194929.28963-1-farosas@suse.de
Maintainers: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
There is a newer version of this series
Posted by Fabiano Rosas 9 months, 3 weeks ago
Based-on: 20240202102857.110210-1-peterx@redhat.com
[PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
https://lore.kernel.org/r/20240202102857.110210-1-peterx@redhat.com

Hi,

In this v2 I made sure NO channel is created after the semaphores are
posted. Feel free to call me out if that's not the case.

Not much has changed, except that now both TLS and non-TLS go through
the same code, so there's a centralized place to do error handling and
release the semaphore.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1165206107
based on Peter's code: https://gitlab.com/farosas/qemu/-/pipelines/1165303276

v1:
https://lore.kernel.org/r/20240202191128.1901-1-farosas@suse.de

This contains 2 patches from my previous series addressing the
p->running misuse and the TLS thread leak and 3 new patches to fix the
cleanup-while-creating-threads race.

For p->running, I'm keeping the idea from the other series: remove
p->running and use a narrower p->thread_created flag. This flag only
indicates whether the thread has been created, so we know whether to
join it.

For the cleanup race I have moved some code around and added a
semaphore to make multifd_save_setup() only return once all channel
creation tasks have started.

The idea is that after multifd_save_setup() returns, no new creations
are in flight and the p->thread_created flags will never change again,
so they're enough to cause the cleanup code to wait for the threads to
join.
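
As an illustration only, here is a minimal sketch of that idea using
plain POSIX threads and semaphores instead of QEMU's own primitives.
The names Channel, create_channel_task(), channels_setup() and
channels_cleanup() are hypothetical and are not taken from the
patches. The only point is the ordering: setup returns after every
creation task has posted the semaphore, so cleanup can trust the
thread_created flags.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

#define NUM_CHANNELS 4          /* arbitrary count for the sketch */

typedef struct {
    pthread_t thread;
    bool thread_created;        /* written only by the creation task */
} Channel;

static Channel channels[NUM_CHANNELS];
static sem_t channels_created;  /* posted once per creation task */

/* Placeholder for the per-channel send thread. */
static void *send_thread(void *opaque)
{
    (void)opaque;
    return NULL;
}

/* Stand-in for an asynchronous channel creation task. */
static void *create_channel_task(void *opaque)
{
    Channel *c = opaque;

    if (pthread_create(&c->thread, NULL, send_thread, c) == 0) {
        c->thread_created = true;
    }
    /* Post on success and on failure so setup can never hang. */
    sem_post(&channels_created);
    return NULL;
}

/* Returns the number of creation tasks started, all of them finished. */
int channels_setup(void)
{
    int started = 0;

    sem_init(&channels_created, 0, 0);
    for (int i = 0; i < NUM_CHANNELS; i++) {
        pthread_t creator;

        if (pthread_create(&creator, NULL, create_channel_task,
                           &channels[i]) == 0) {
            pthread_detach(creator);
            started++;
        }
    }
    /* Synchronization point: one wait per creation task in flight. */
    for (int i = 0; i < started; i++) {
        while (sem_wait(&channels_created) == -1 && errno == EINTR) {
            /* retry if interrupted by a signal */
        }
    }
    return started;
}

/* Joins every thread that was created; returns how many were joined. */
int channels_cleanup(void)
{
    int joined = 0;

    for (int i = 0; i < NUM_CHANNELS; i++) {
        if (channels[i].thread_created) {
            int ret = pthread_join(channels[i].thread, NULL);

            assert(ret == 0);
            (void)ret;
            channels[i].thread_created = false;
            joined++;
        }
    }
    return joined;
}
```

Calling channels_setup() followed by channels_cleanup() joins each
thread exactly once. Without the wait loop in setup, cleanup could
read thread_created as false while a creation task was still about to
set it, and the thread would be leaked.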

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1162798843

@Peter: I can rebase this on top of your series once we decide about
it.

Fabiano Rosas (6):
  migration/multifd: Join the TLS thread
  migration/multifd: Remove p->running
  migration/multifd: Move multifd_send_setup error handling in to the
    function
  migration/multifd: Move multifd_send_setup into migration thread
  migration/multifd: Unify multifd and TLS connection paths
  migration/multifd: Add a synchronization point for channel creation

 migration/migration.c |  14 ++--
 migration/multifd.c   | 157 +++++++++++++++++++++++-------------------
 migration/multifd.h   |  11 ++-
 3 files changed, 98 insertions(+), 84 deletions(-)

-- 
2.35.3
Re: [PATCH v2 0/6] migration/multifd: Fix channel creation vs. cleanup races
Posted by Peter Xu 9 months, 3 weeks ago
On Mon, Feb 05, 2024 at 04:49:23PM -0300, Fabiano Rosas wrote:
> Based-on: 20240202102857.110210-1-peterx@redhat.com
> [PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
> https://lore.kernel.org/r/20240202102857.110210-1-peterx@redhat.com
> 
> Hi,
> 
> In this v2 I made sure NO channel is created after the semaphores are
> posted. Feel free to call me out if that's not the case.

Queued into -staging.  I plan to send a pull only right before I'm out
(Feb 9-19), so comments are still welcome.  Thanks.

-- 
Peter Xu