[Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero

Felipe Franciosi posted 1 patch 6 years, 11 months ago
[Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero
Posted by Felipe Franciosi 6 years, 11 months ago
The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
to ram_state.bytes_transferred which is, at this point, zero. The next
time migration_bitmap_sync() is called, an iteration has happened and
bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
has passed, so the auto converge logic will be triggered and
bytes_xfer_now will also be set to 'x' bytes.

This condition is currently masked by dirty_rate_high_cnt, which will
wait for a few iterations before throttling. It would otherwise always
assume zero bytes have been copied and therefore throttle the guest
(possibly) prematurely.

Given bytes_xfer_prev is only used by the auto convergence logic, it
makes sense to only set its value after a check has been made against
bytes_xfer_now.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
---
 migration/ram.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f59fdd4..793af39 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -670,10 +670,6 @@ static void migration_bitmap_sync(RAMState *rs)
 
     rs->bitmap_sync_count++;
 
-    if (!rs->bytes_xfer_prev) {
-        rs->bytes_xfer_prev = ram_bytes_transferred();
-    }
-
     if (!rs->time_last_bitmap_sync) {
         rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     }
-- 
1.9.5
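The behaviour the patch removes can be modelled with a small, self-contained sketch (hypothetical, simplified code; not QEMU's actual RAMState or ram.c logic). The eager init makes the second sync see a zero transfer delta, while keeping bytes_xfer_prev zero-initialised until after the check preserves the real delta:

```c
#include <stdint.h>
#include <stdbool.h>

/* Model of the auto-converge input. With the eager init, bytes_xfer_prev
 * is (re)set to the current transferred count whenever it is still zero:
 * on the first sync that count is 0, so on the second sync prev is reset
 * to now and the period delta collapses to zero. */
static uint64_t period_delta(uint64_t *prev, uint64_t now, bool eager_init)
{
    if (eager_init && *prev == 0) {
        *prev = now;            /* old behaviour: masks the real delta */
    }
    uint64_t delta = now - *prev;
    *prev = now;                /* update after the check, as the patch does */
    return delta;
}
```

With eager init the second sync reports 0 bytes copied (and would suggest throttling); with the patch's behaviour it reports the real amount.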


Re: [Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero
Posted by Peter Xu 6 years, 11 months ago
On Fri, May 19, 2017 at 10:59:02PM +0100, Felipe Franciosi wrote:
> The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
> to ram_state.bytes_transferred which is, at this point, zero. The next
> time migration_bitmap_sync() is called, an iteration has happened and
> bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
> has passed, so the auto converge logic will be triggered and
> bytes_xfer_now will also be set to 'x' bytes.
> 
> This condition is currently masked by dirty_rate_high_cnt, which will
> wait for a few iterations before throttling. It would otherwise always
> assume zero bytes have been copied and therefore throttle the guest
> (possibly) prematurely.
> 
> Given bytes_xfer_prev is only used by the auto convergence logic, it
> makes sense to only set its value after a check has been made against
> bytes_xfer_now.
> 
> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
> ~
> ---
>  migration/ram.c | 4 ----
>  1 file changed, 4 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index f59fdd4..793af39 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -670,10 +670,6 @@ static void migration_bitmap_sync(RAMState *rs)
>  
>      rs->bitmap_sync_count++;
>  
> -    if (!rs->bytes_xfer_prev) {
> -        rs->bytes_xfer_prev = ram_bytes_transferred();
> -    }
> -
>      if (!rs->time_last_bitmap_sync) {
>          rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>      }
> -- 
> 1.9.5
> 
> 

I feel like this patch wants to correctly initialize bytes_xfer_prev;
however, I still see a problem. E.g., when the user enables the
auto-converge capability during migration, in the first iteration we'll
always have a very small bytes_xfer_prev (with this patch, it'll be zero)
with a very big bytes_xfer_now (which is the ram_bytes_transferred()
value). If so, do you think squashing the change below together with the
current one would be more accurate?

diff --git a/migration/ram.c b/migration/ram.c
index f59fdd4..e01a218 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -703,7 +703,8 @@ static void migration_bitmap_sync(RAMState *rs)
                throttling */
             bytes_xfer_now = ram_bytes_transferred();
 
-            if (rs->dirty_pages_rate &&
+            /* Skip first iteration when bytes_xfer_prev not inited */
+            if (rs->bytes_xfer_prev && rs->dirty_pages_rate &&
                (rs->num_dirty_pages_period * TARGET_PAGE_SIZE >
                    (bytes_xfer_now - rs->bytes_xfer_prev) / 2) &&
                (rs->dirty_rate_high_cnt++ >= 2)) {
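In a simplified standalone model (hypothetical MiniState struct and PAGE_SIZE constant standing in for RAMState and TARGET_PAGE_SIZE), the guarded condition above behaves like this:

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint64_t bytes_xfer_prev;
    uint64_t num_dirty_pages_period;
    int dirty_rate_high_cnt;
} MiniState;

#define PAGE_SIZE 4096

static bool should_throttle(MiniState *rs, uint64_t bytes_xfer_now)
{
    /* Skip the first iteration, where bytes_xfer_prev is still zero and
     * bytes_xfer_now already holds the whole first-pass transfer. */
    if (rs->bytes_xfer_prev &&
        rs->num_dirty_pages_period * PAGE_SIZE >
            (bytes_xfer_now - rs->bytes_xfer_prev) / 2 &&
        rs->dirty_rate_high_cnt++ >= 2) {
        return true;
    }
    return false;
}
```

The first-iteration guard short-circuits before touching dirty_rate_high_cnt, so a huge bytes_xfer_now at the first sync no longer counts toward throttling.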

I am also not sure whether we could move bytes_xfer_prev out of RAMState,
since it's only used by migration_bitmap_sync().

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero
Posted by Felipe Franciosi 6 years, 11 months ago
> On 23 May 2017, at 05:27, Peter Xu <peterx@redhat.com> wrote:
> 
> On Fri, May 19, 2017 at 10:59:02PM +0100, Felipe Franciosi wrote:
>> The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
>> to ram_state.bytes_transferred which is, at this point, zero. The next
>> time migration_bitmap_sync() is called, an iteration has happened and
>> bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
>> has passed, so the auto converge logic will be triggered and
>> bytes_xfer_now will also be set to 'x' bytes.
>> 
>> This condition is currently masked by dirty_rate_high_cnt, which will
>> wait for a few iterations before throttling. It would otherwise always
>> assume zero bytes have been copied and therefore throttle the guest
>> (possibly) prematurely.
>> 
>> Given bytes_xfer_prev is only used by the auto convergence logic, it
>> makes sense to only set its value after a check has been made against
>> bytes_xfer_now.
>> 
>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
>> ~
>> ---
>> migration/ram.c | 4 ----
>> 1 file changed, 4 deletions(-)
>> 
>> diff --git a/migration/ram.c b/migration/ram.c
>> index f59fdd4..793af39 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -670,10 +670,6 @@ static void migration_bitmap_sync(RAMState *rs)
>> 
>>     rs->bitmap_sync_count++;
>> 
>> -    if (!rs->bytes_xfer_prev) {
>> -        rs->bytes_xfer_prev = ram_bytes_transferred();
>> -    }
>> -
>>     if (!rs->time_last_bitmap_sync) {
>>         rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>     }
>> -- 
>> 1.9.5
>> 
>> 
> 
> I feel like this patch wants to correctly initialize bytes_xfer_prev,
> however I still see problem. E.g., when user specify auto-convergence
> during migration, and in the first iteration we'll always have a very
> small bytes_xfer_prev (with this patch, it'll be zero) with a very big
> bytes_xfer_now (which is the ram_bytes_transferred() value).

Interesting point. Worth noting, though, that's no different from what happens today anyway (bytes_xfer_prev would be initialised to the first non-zero bytes_transferred). I therefore don't think it should stop or slow this patch's acceptance.

> If so, do
> you think squash below change together with current one would be more
> accurate?

As a matter of fact I had a different idea (below) to fix what you are describing. I was still experimenting with this code, so I haven't sent a patch yet, but I'm going to send a series soon for comments. Basically, I think some other changes are required to make sure these numbers are correct:

1) dirty_rate_high_cnt++ >= 2 should be ++dirty_rate_high_cnt >= 2
- The original commit msg from 070afca25 (Jason J. Herne) says that convergence should be triggered after two passes. Current code does it after three passes (and four passes the first time around; see number 2 below).
- I personally feel this counter should go away altogether. If the migration is not converging and the VM is going to be throttled, there's no point stressing the network any further; just start throttling straight away.

2) dirty_pages_rate should be updated before the autoconverge logic.
- Right now, we delay throttling by a further iteration, as dirty_pages_rate is set after the first pass through the autoconverge logic (it is zero the first time around).
- The "if (rs->dirty_pages_rate &&..." part of the conditional can then be removed, as it won't ever be zero.

3) bytes_xfer counters should be updated alongside dirty_pages counters (for the same period).
- This fixes the issue you described, as bytes_xfer_* will correspond to the period.
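Point 1 is easy to demonstrate in isolation (a minimal sketch, not the actual ram.c loop):

```c
#include <stdbool.h>

/* Count how many passes through the check it takes before the
 * throttle condition fires, for post- vs pre-increment. */
static int passes_until_trigger(bool pre_increment)
{
    int cnt = 0, passes = 0;
    for (;;) {
        passes++;
        int value = pre_increment ? ++cnt : cnt++;
        if (value >= 2) {
            return passes;
        }
    }
}
```

With `cnt++ >= 2` the compared values are 0, 1, 2, so throttling starts on the third pass; with `++cnt >= 2` they are 1, 2, so it starts on the second, matching the intent of commit 070afca25.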

I'll send the series shortly. Thoughts in the meantime?

Thanks,
Felipe

> 
> diff --git a/migration/ram.c b/migration/ram.c
> index f59fdd4..e01a218 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -703,7 +703,8 @@ static void migration_bitmap_sync(RAMState *rs)
>                throttling */
>             bytes_xfer_now = ram_bytes_transferred();
> 
> -            if (rs->dirty_pages_rate &&
> +            /* Skip first iteration when bytes_xfer_prev not inited */
> +            if (rs->bytes_xfer_prev && rs->dirty_pages_rate &&
>                (rs->num_dirty_pages_period * TARGET_PAGE_SIZE >
>                    (bytes_xfer_now - rs->bytes_xfer_prev) / 2) &&
>                (rs->dirty_rate_high_cnt++ >= 2)) {
> 
> Even I am not sure whether we can move bytes_xfer_prev out of RAMState
> since it's only used by migration_bitmap_sync().
> 
> Thanks,
> 
> -- 
> Peter Xu


Re: [Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero
Posted by Peter Xu 6 years, 10 months ago
On Wed, May 24, 2017 at 01:02:25PM +0000, Felipe Franciosi wrote:
> 
> > On 23 May 2017, at 05:27, Peter Xu <peterx@redhat.com> wrote:
> > 
> > On Fri, May 19, 2017 at 10:59:02PM +0100, Felipe Franciosi wrote:
> >> The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
> >> to ram_state.bytes_transferred which is, at this point, zero. The next
> >> time migration_bitmap_sync() is called, an iteration has happened and
> >> bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
> >> has passed, so the auto converge logic will be triggered and
> >> bytes_xfer_now will also be set to 'x' bytes.
> >> 
> >> This condition is currently masked by dirty_rate_high_cnt, which will
> >> wait for a few iterations before throttling. It would otherwise always
> >> assume zero bytes have been copied and therefore throttle the guest
> >> (possibly) prematurely.
> >> 
> >> Given bytes_xfer_prev is only used by the auto convergence logic, it
> >> makes sense to only set its value after a check has been made against
> >> bytes_xfer_now.
> >> 
> >> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
> >> ~
> >> ---
> >> migration/ram.c | 4 ----
> >> 1 file changed, 4 deletions(-)
> >> 
> >> diff --git a/migration/ram.c b/migration/ram.c
> >> index f59fdd4..793af39 100644
> >> --- a/migration/ram.c
> >> +++ b/migration/ram.c
> >> @@ -670,10 +670,6 @@ static void migration_bitmap_sync(RAMState *rs)
> >> 
> >>     rs->bitmap_sync_count++;
> >> 
> >> -    if (!rs->bytes_xfer_prev) {
> >> -        rs->bytes_xfer_prev = ram_bytes_transferred();
> >> -    }
> >> -
> >>     if (!rs->time_last_bitmap_sync) {
> >>         rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> >>     }
> >> -- 
> >> 1.9.5
> >> 
> >> 
> > 
> > I feel like this patch wants to correctly initialize bytes_xfer_prev,
> > however I still see problem. E.g., when user specify auto-convergence
> > during migration, and in the first iteration we'll always have a very
> > small bytes_xfer_prev (with this patch, it'll be zero) with a very big
> > bytes_xfer_now (which is the ram_bytes_transferred() value).
> 
> Interesting point. Worth noting, that's no different than what happens today anyway (bytes_xfer_prev would be initialised to the first non-zero bytes_transferred). I therefore don't think it should stop or slow this patch's acceptance.
> 
> > If so, do
> > you think squash below change together with current one would be more
> > accurate?
> 
> As a matter of fact I had a different idea (below) to fix what you are describing. I was still experimenting with this code, so haven't sent a patch yet. But I'm going to send a series soon for comments. Basically, I think some other changes are required to make sure these numbers are correct:

I agree on most points.

> 
> 1) dirty_rate_high_cnt++ >= 2 should be ++dirty_rate_high_cnt >= 2
> - The original commit msg from 070afca25 (Jason J. Herne) says that convergence should be triggered after two passes. Current code does it after three passes (and four passes the first time around; see number 2 below).
> - I personally feel this counter should go away altogether. If the migration is not converging and the VM is going to be throttled, there's no point stressing the network any further; just start throttling straight away.
> 
> 2) dirty_pages_rate should be updated before the autoconverge logic.
> - Right now, we delay throttling by a further iteration, as dirty_pages_rate is set after the first pass through the autoconverge logic (it is zero the first time around).
> - The "if (rs->dirty_pages_rate &&..." part of the conditional can then be removed, as it won't ever be zero.

For this one: why can't dirty_pages_rate be zero? But I agree with you
that the check can be removed, since even if it is zero, the next check
would fail as well (rs->num_dirty_pages_period * TARGET_PAGE_SIZE >
(bytes_xfer_now - rs->bytes_xfer_prev) / 2).
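The observation that the size comparison already subsumes the zero-rate case can be checked with a tiny standalone predicate (simplified; a hypothetical 4096 stands in for TARGET_PAGE_SIZE):

```c
#include <stdint.h>
#include <stdbool.h>

/* If nothing was dirtied in the period, num_dirty_pages_period is 0 and
 * 0 * 4096 > delta / 2 can never hold for an unsigned delta, so a
 * separate dirty_pages_rate != 0 guard adds nothing. */
static bool dirty_rate_high(uint64_t num_dirty_pages_period,
                            uint64_t bytes_xfer_now,
                            uint64_t bytes_xfer_prev)
{
    return num_dirty_pages_period * 4096 >
           (bytes_xfer_now - bytes_xfer_prev) / 2;
}
```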

> 
> 3) bytes_xfer counters should be updated alongside dirty_pages counters (for the same period).
> - This fixes the issue you described, as bytes_xfer_* will correspond to the period.
> 
> I'll send the series shortly. Thoughts in the meantime?

I'll reply to that patchset then. Thanks.

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] migration: keep bytes_xfer_prev init'd to zero
Posted by Felipe Franciosi 6 years, 10 months ago
> On 25 May 2017, at 01:36, Peter Xu <peterx@redhat.com> wrote:
> 
> On Wed, May 24, 2017 at 01:02:25PM +0000, Felipe Franciosi wrote:
>> 
>>> On 23 May 2017, at 05:27, Peter Xu <peterx@redhat.com> wrote:
>>> 
>>> On Fri, May 19, 2017 at 10:59:02PM +0100, Felipe Franciosi wrote:
>>>> The first time migration_bitmap_sync() is called, bytes_xfer_prev is set
>>>> to ram_state.bytes_transferred which is, at this point, zero. The next
>>>> time migration_bitmap_sync() is called, an iteration has happened and
>>>> bytes_xfer_prev is set to 'x' bytes. Most likely, more than one second
>>>> has passed, so the auto converge logic will be triggered and
>>>> bytes_xfer_now will also be set to 'x' bytes.
>>>> 
>>>> This condition is currently masked by dirty_rate_high_cnt, which will
>>>> wait for a few iterations before throttling. It would otherwise always
>>>> assume zero bytes have been copied and therefore throttle the guest
>>>> (possibly) prematurely.
>>>> 
>>>> Given bytes_xfer_prev is only used by the auto convergence logic, it
>>>> makes sense to only set its value after a check has been made against
>>>> bytes_xfer_now.
>>>> 
>>>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
>>>> ~
>>>> ---
>>>> migration/ram.c | 4 ----
>>>> 1 file changed, 4 deletions(-)
>>>> 
>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>> index f59fdd4..793af39 100644
>>>> --- a/migration/ram.c
>>>> +++ b/migration/ram.c
>>>> @@ -670,10 +670,6 @@ static void migration_bitmap_sync(RAMState *rs)
>>>> 
>>>>    rs->bitmap_sync_count++;
>>>> 
>>>> -    if (!rs->bytes_xfer_prev) {
>>>> -        rs->bytes_xfer_prev = ram_bytes_transferred();
>>>> -    }
>>>> -
>>>>    if (!rs->time_last_bitmap_sync) {
>>>>        rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>>>    }
>>>> -- 
>>>> 1.9.5
>>>> 
>>>> 
>>> 
>>> I feel like this patch wants to correctly initialize bytes_xfer_prev,
>>> however I still see problem. E.g., when user specify auto-convergence
>>> during migration, and in the first iteration we'll always have a very
>>> small bytes_xfer_prev (with this patch, it'll be zero) with a very big
>>> bytes_xfer_now (which is the ram_bytes_transferred() value).
>> 
>> Interesting point. Worth noting, that's no different than what happens today anyway (bytes_xfer_prev would be initialised to the first non-zero bytes_transferred). I therefore don't think it should stop or slow this patch's acceptance.
>> 
>>> If so, do
>>> you think squash below change together with current one would be more
>>> accurate?
>> 
>> As a matter of fact I had a different idea (below) to fix what you are describing. I was still experimenting with this code, so haven't sent a patch yet. But I'm going to send a series soon for comments. Basically, I think some other changes are required to make sure these numbers are correct:
> 
> I agree on most points.
> 
>> 
>> 1) dirty_rate_high_cnt++ >= 2 should be ++dirty_rate_high_cnt >= 2
>> - The original commit msg from 070afca25 (Jason J. Herne) says that convergence should be triggered after two passes. Current code does it after three passes (and four passes the first time around; see number 2 below).
>> - I personally feel this counter should go away altogether. If the migration is not converging and the VM is going to be throttled, there's no point stressing the network any further; just start throttling straight away.
>> 
>> 2) dirty_pages_rate should be updated before the autoconverge logic.
>> - Right now, we delay throttling by a further iteration, as dirty_pages_rate is set after the first pass through the autoconverge logic (it is zero the first time around).
>> - The "if (rs->dirty_pages_rate &&..." part of the conditional can then be removed, as it won't ever be zero.
> 
> For this one: why dirty_pages_rate cannot be zero?

Yeah, good point. If _nothing_ got dirtied in the period (possible, I suppose), then the dirty rate would be zero.

> But I agree with
> you that it can be removed since even if it's zero, then the next
> check would fail as well (rs->num_dirty_pages_period *
> TARGET_PAGE_SIZE > (bytes_xfer_now - rs->bytes_xfer_prev) / 2).

Exactly. I suppose that check was introduced to cover the first iteration, where the counter hadn't been initialised. But it's not needed.

> 
>> 
>> 3) bytes_xfer counters should be updated alongside dirty_pages counters (for the same period).
>> - This fixes the issue you described, as bytes_xfer_* will correspond to the period.
>> 
>> I'll send the series shortly. Thoughts in the meantime?
> 
> I'll reply to that patchset then. Thanks.

Thanks for the quick feedback on this. I've included this change in the patchset, so let's continue the conversation there.

Felipe

> 
> -- 
> Peter Xu