On 07/14/2018 02:01 AM, Dr. David Alan Gilbert wrote:
> * guangrong.xiao@gmail.com (guangrong.xiao@gmail.com) wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> flush_compressed_data() needs to wait all compression threads to
>> finish their work, after that all threads are free until the
>> migration feed new request to them, reducing its call can improve
>> the throughput and use CPU resource more effectively
>>
>> We do not need to flush all threads at the end of iteration, the
>> data can be kept locally until the memory block is changed
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>> ---
>> migration/ram.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f9a8646520..0a38c1c61e 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1994,6 +1994,7 @@ static void ram_save_cleanup(void *opaque)
>> }
>>
>> xbzrle_cleanup();
>> + flush_compressed_data(*rsp);
>> compress_threads_save_cleanup();
>> ram_state_cleanup(rsp);
>> }
>
> I'm not sure why this change corresponds to the other removal.
> We should already have sent all remaining data in ram_save_complete()'s
> call to flush_compressed_data - so what is this one for?
>
This is for the error path: if any error occurs during live migration, ram_save_complete() is never reached. After switching to the lockless multithreads model, we assert that all requests have been handled before the worker threads are destroyed.
>> @@ -2690,7 +2691,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>> }
>> i++;
>> }
>> - flush_compressed_data(rs);
>> rcu_read_unlock();
>
> Hmm - are we sure there's no other cases that depend on ordering of all
> of the compressed data being sent out between threads?
Err, I tried to think it over carefully but still missed the case you mention. :(
Anyway, calling flush_compressed_data() every 50ms hurts us too much.
> I think the one I'd most worry about is the case where:
>
> iteration one:
> thread 1: Save compressed page 'n'
>
> iteration two:
> thread 2: Save compressed page 'n'
>
> What guarantees that the version of page 'n'
> from thread 2 reaches the destination first without
> this flush?
>
Hmm... you are right, I missed this case. How about avoiding it by doing this
check at the beginning of ram_save_iterate():

    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
        flush_compressed_data(rs);
        rs->dirty_sync_count = ram_counters.dirty_sync_count;
    }