On Wed, Jan 31, 2024 at 06:31:11PM +0800, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
>
> Now multifd's logic is designed to have no spurious wakeup. I still
> remember a talk to Juan and he seems to agree we should drop it now, and if
> my memory was right it was there because multifd used to hit that when
> still debugging.
>
> Let's drop it and see what can explode; as long as it's not reaching
> soft-freeze.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> migration/multifd.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 0f22646f95..bd0e3ea1a5 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -766,9 +766,6 @@ static void *multifd_send_thread(void *opaque)
> p->pending_sync = false;
> qemu_mutex_unlock(&p->mutex);
> qemu_sem_post(&p->sem_sync);
> - } else {
> - qemu_mutex_unlock(&p->mutex);
> - /* sometimes there are spurious wakeups */
> }
> }
>
> --
> 2.43.0
>
While removing this is still the goal, I just noticed that _if_ something
spurious wakeup happens then this will not crash qemu, but instead it'll
cause mutex locked forever and deadlock.
A deadlock is less wanted than a crash in this case, so when I repost, I'll
make sure it crashes and does it hard, like squashing this in:
====
diff --git a/migration/multifd.c b/migration/multifd.c
index bd0e3ea1a5..89011f75d9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -751,7 +751,9 @@ static void *multifd_send_thread(void *opaque)
p->next_packet_size = 0;
p->pending_job = false;
qemu_mutex_unlock(&p->mutex);
- } else if (p->pending_sync) {
+ } else {
+ /* If not a normal job, must be a sync request */
+ assert(p->pending_sync);
p->flags = MULTIFD_FLAG_SYNC;
multifd_send_fill_packet(p);
ret = qio_channel_write_all(p->c, (void *)p->packet,
====
Fabiano, I'll keep your ACK, but let me know otherwise..
--
Peter Xu