On 07/14/2018 02:01 AM, Dr. David Alan Gilbert wrote:
> * guangrong.xiao@gmail.com (guangrong.xiao@gmail.com) wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> flush_compressed_data() needs to wait all compression threads to
>> finish their work, after that all threads are free until the
>> migration feed new request to them, reducing its call can improve
>> the throughput and use CPU resource more effectively
>>
>> We do not need to flush all threads at the end of iteration, the
>> data can be kept locally until the memory block is changed
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>> ---
>> migration/ram.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f9a8646520..0a38c1c61e 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1994,6 +1994,7 @@ static void ram_save_cleanup(void *opaque)
>> }
>>
>> xbzrle_cleanup();
>> + flush_compressed_data(*rsp);
>> compress_threads_save_cleanup();
>> ram_state_cleanup(rsp);
>> }
>
> I'm not sure why this change corresponds to the other removal.
> We should already have sent all remaining data in ram_save_complete()'s
> call to flush_compressed_data - so what is this one for?
>
This is for the error path: if any error occurs during live migration, ram_save_complete() is never reached. After switching to the lockless multithreads model, we assert that all requests have been handled before the worker threads are destroyed.
>> @@ -2690,7 +2691,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> }
>> i++;
>> }
>> - flush_compressed_data(rs);
>> rcu_read_unlock();
>
> Hmm - are we sure there's no other cases that depend on ordering of all
> of the compressed data being sent out between threads?
Err, I tried to think it over carefully but still missed the case you mention. :(
Anyway, calling flush_compressed_data() every 50ms hurts us too much.
> I think the one I'd most worry about is the case where:
>
> iteration one:
> thread 1: Save compressed page 'n'
>
> iteration two:
> thread 2: Save compressed page 'n'
>
> What guarantees that the version of page 'n'
> from thread 2 reaches the destination first without
> this flush?
>
Hmm... you are right, I missed this case. How about avoiding it by doing this
check at the beginning of ram_save_iterate():

    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
        flush_compressed_data(rs);
        rs->dirty_sync_count = ram_counters.dirty_sync_count;
    }