[PATCH v2] migration/ram: Yield periodically to the main loop
Posted by Yury Kotov 4 years, 7 months ago
Usually, the incoming migration coroutine yields to the main loop
while its IO channel is waiting for data to receive. But there is a case
when RAM migration and data reception run at the same speed: a VM with
huge zeroed RAM. In this case, the IO channel read never blocks, so the
coroutine never yields; the main loop is stuck and, for instance, does
not respond to QMP commands.

For this case, yield periodically, but not too often, so as not to
affect the speed of migration.

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 migration/ram.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/migration/ram.c b/migration/ram.c
index 5078f94490..9694ee7a0b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4227,7 +4227,7 @@ static void colo_flush_ram_cache(void)
  */
 static int ram_load_precopy(QEMUFile *f)
 {
-    int flags = 0, ret = 0, invalid_flags = 0, len = 0;
+    int flags = 0, ret = 0, invalid_flags = 0, len = 0, i = 0;
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
     bool postcopy_advised = postcopy_is_advised();
     if (!migrate_use_compression()) {
@@ -4239,6 +4239,17 @@ static int ram_load_precopy(QEMUFile *f)
         void *host = NULL;
         uint8_t ch;
 
+        /*
+         * Yield periodically to let the main loop run, but an iteration
+         * of the main loop is expensive, so only do it once in a while
+         */
+        if ((i & 32767) == 0 && qemu_in_coroutine()) {
+            aio_co_schedule(qemu_get_current_aio_context(),
+                            qemu_coroutine_self());
+            qemu_coroutine_yield();
+        }
+        i++;
+
         addr = qemu_get_be64(f);
         flags = addr & ~TARGET_PAGE_MASK;
         addr &= TARGET_PAGE_MASK;
-- 
2.24.0
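
For context on the hunk above: rather than yielding directly, the patch
first queues the coroutine back onto its own AioContext with
aio_co_schedule(), and only then yields. The scheduled entry is what
re-enters the coroutine on a later main-loop iteration, after pending
events (QMP commands, timers, ...) have been dispatched. A minimal
sketch of the same idiom, using the QEMU-internal APIs that appear in
the patch; the helper name yield_to_main_loop() is illustrative, not
part of the patch:

    #include "qemu/osdep.h"
    #include "qemu/coroutine.h"
    #include "block/aio.h"

    /* Sketch of the reschedule-and-yield idiom used in the hunk above. */
    static void coroutine_fn yield_to_main_loop(void)
    {
        /*
         * Queue this coroutine on its current AioContext.  The scheduled
         * entry is what will resume us later; without it, yielding would
         * leave the coroutine parked forever.
         */
        aio_co_schedule(qemu_get_current_aio_context(),
                        qemu_coroutine_self());

        /* Return control to the main loop; we resume here on re-entry. */
        qemu_coroutine_yield();
    }

On the rate limiting: 32767 is 2^15 - 1, so (i & 32767) == 0 is
equivalent to i % 32768 == 0 for non-negative i. The coroutine therefore
yields once every 32768 pages loaded: frequent enough to keep the
monitor responsive, rare enough not to affect the speed of migration,
per the commit message.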


Re: [PATCH v2] migration/ram: Yield periodically to the main loop
Posted by Juan Quintela 4 years, 5 months ago
Yury Kotov <yury-kotov@yandex-team.ru> wrote:
> Usually, the incoming migration coroutine yields to the main loop
> while its IO channel is waiting for data to receive. But there is a case
> when RAM migration and data reception run at the same speed: a VM with
> huge zeroed RAM. In this case, the IO channel read never blocks, so the
> coroutine never yields; the main loop is stuck and, for instance, does
> not respond to QMP commands.
>
> For this case, yield periodically, but not too often, so as not to
> affect the speed of migration.
>
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>

Reviewed-by: Juan Quintela <quintela@redhat.com>


Re: [PATCH v2] migration/ram: Yield periodically to the main loop
Posted by Juan Quintela 4 years, 5 months ago
Juan Quintela <quintela@redhat.com> wrote:
> Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>> Usually, the incoming migration coroutine yields to the main loop
>> while its IO channel is waiting for data to receive. But there is a case
>> when RAM migration and data reception run at the same speed: a VM with
>> huge zeroed RAM. In this case, the IO channel read never blocks, so the
>> coroutine never yields; the main loop is stuck and, for instance, does
>> not respond to QMP commands.
>>
>> For this case, yield periodically, but not too often, so as not to
>> affect the speed of migration.
>>
>> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
>
> Reviewed-by: Juan Quintela <quintela@redhat.com>

Nack.

The idea is good.  But it makes migration-test fail from time to time
(50% of the time on my laptop).

Will investigate why this is failing.

Later, Juan.