[PATCH v5 09/10] KVM: Disable manual dirty log when dirty ring enabled

Peter Xu posted 10 patches 4 years, 9 months ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
There is a newer version of this series
[PATCH v5 09/10] KVM: Disable manual dirty log when dirty ring enabled
Posted by Peter Xu 4 years, 9 months ago
KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
useful for KVM_GET_DIRTY_LOG.  Skip enabling it for kvm dirty ring.

More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
initially, which is against how kvm dirty ring is used - there's no way for kvm
dirty ring to re-protect a page before it's notified as being written first
with a GFN entry in the ring!  So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
with dirty ring, we'll see silent data loss after migration.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 10137b6af11..ae9393266b2 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2173,20 +2173,29 @@ static int kvm_init(MachineState *ms)
         }
     }
 
-    dirty_log_manual_caps =
-        kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
-    dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
-                              KVM_DIRTY_LOG_INITIALLY_SET);
-    s->manual_dirty_log_protect = dirty_log_manual_caps;
-    if (dirty_log_manual_caps) {
-        ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
-                                   dirty_log_manual_caps);
-        if (ret) {
-            warn_report("Trying to enable capability %"PRIu64" of "
-                        "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
-                        "Falling back to the legacy mode. ",
-                        dirty_log_manual_caps);
-            s->manual_dirty_log_protect = 0;
+    /*
+     * KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is not needed when dirty ring is
+     * enabled.  More importantly, KVM_DIRTY_LOG_INITIALLY_SET will assume no
+     * page is wr-protected initially, which is against how kvm dirty ring is
+     * used - kvm dirty ring requires all pages to be wr-protected at the very
+     * beginning.  Enabling this feature for dirty ring causes data corruption.
+     */
+    if (!s->kvm_dirty_ring_enabled) {
+        dirty_log_manual_caps =
+            kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
+        dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
+                                  KVM_DIRTY_LOG_INITIALLY_SET);
+        s->manual_dirty_log_protect = dirty_log_manual_caps;
+        if (dirty_log_manual_caps) {
+            ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
+                                    dirty_log_manual_caps);
+            if (ret) {
+                warn_report("Trying to enable capability %"PRIu64" of "
+                            "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
+                            "Falling back to the legacy mode. ",
+                            dirty_log_manual_caps);
+                s->manual_dirty_log_protect = 0;
+            }
         }
     }
 
-- 
2.26.2
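
The data-loss argument in the commit message can be illustrated with a toy
model. This is only a sketch under simplifying assumptions, not KVM or QEMU
code: every name in it (toy_vm, toy_start_tracking, toy_guest_write,
toy_lost_pages, TOY_NPAGES) is hypothetical, and the "ring" is a plain array.
It models one property only: a guest write reaches the ring only if the page
was write-protected beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NPAGES 8   /* hypothetical guest size, in pages */

/* Toy model only: none of these names exist in KVM or QEMU. */
struct toy_vm {
    bool wr_protected[TOY_NPAGES]; /* true: the next write faults */
    bool modified[TOY_NPAGES];     /* ground truth: guest wrote the page */
    uint64_t ring[TOY_NPAGES];     /* GFNs reported to userspace */
    size_t ring_len;
};

/*
 * Start dirty tracking.  With initially_set, no page is write-protected
 * up front (protection is deferred to a later clear operation).
 */
static void toy_start_tracking(struct toy_vm *vm, bool initially_set)
{
    for (size_t gfn = 0; gfn < TOY_NPAGES; gfn++) {
        vm->wr_protected[gfn] = !initially_set;
        vm->modified[gfn] = false;
    }
    vm->ring_len = 0;
}

/* Guest write: only a write-protected page faults and pushes its GFN. */
static void toy_guest_write(struct toy_vm *vm, size_t gfn)
{
    assert(gfn < TOY_NPAGES);
    vm->modified[gfn] = true;
    if (vm->wr_protected[gfn]) {
        vm->ring[vm->ring_len++] = gfn;
        vm->wr_protected[gfn] = false; /* writable until re-protected */
    }
}

/* Count pages the guest modified that were never reported in the ring. */
static size_t toy_lost_pages(const struct toy_vm *vm)
{
    size_t lost = 0;

    for (size_t gfn = 0; gfn < TOY_NPAGES; gfn++) {
        bool reported = false;

        for (size_t i = 0; i < vm->ring_len; i++) {
            if (vm->ring[i] == gfn) {
                reported = true;
            }
        }
        if (vm->modified[gfn] && !reported) {
            lost++;
        }
    }
    return lost;
}
```

In this model, calling toy_start_tracking(&vm, true) and then writing to any
page leaves the ring empty while toy_lost_pages() is non-zero, which is the
silent loss described above. With toy_start_tracking(&vm, false) the same
write produces exactly one ring entry.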


Re: [PATCH v5 09/10] KVM: Disable manual dirty log when dirty ring enabled
Posted by Keqian Zhu 4 years, 8 months ago
Hi Peter,

On 2021/3/11 4:33, Peter Xu wrote:
> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
> useful for KVM_GET_DIRTY_LOG.  Skip enabling it for kvm dirty ring.
> 
> More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
> initially, which is against how kvm dirty ring is used - there's no way for kvm
> dirty ring to re-protect a page before it's notified as being written first
> with a GFN entry in the ring!  So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
> with dirty ring, we'll see silent data loss after migration.
It is a pity that the dirty ring cannot work with KVM_DIRTY_LOG_INITIALLY_SET ...
With KVM_DIRTY_LOG_INITIALLY_SET, we can speed up the start of dirty logging. More
importantly, we can enable dirty logging gradually. For write-fault-based dirty logging,
this greatly reduces the side effects of dirty logging on the guest.

I hope we can put forward a similar optimization for dirty ring mode. :)

Thanks,
Keqian
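
As a rough illustration of the "enable dirty log gradually" point, here is a
toy model of a bitmap-based slot. It is a sketch under assumptions, not the
kernel implementation: toy_slot, toy_enable_initially_set, toy_clear_range
and toy_count_protected are hypothetical names. Enabling logging marks every
page dirty without write-protecting anything, and each later clear of a range
write-protects only the pages in that range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16  /* hypothetical slot size, in pages */

/* Toy model only: none of these names exist in KVM or QEMU. */
struct toy_slot {
    bool dirty[TOY_NPAGES];        /* dirty bitmap seen by userspace */
    bool wr_protected[TOY_NPAGES]; /* true: the next write faults */
};

/* Cheap start: report everything as dirty, protect nothing yet. */
static void toy_enable_initially_set(struct toy_slot *s)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        s->dirty[i] = true;
        s->wr_protected[i] = false;
    }
}

/*
 * Model of clearing a range of the dirty log: each dirty page in the
 * range has its bit cleared and becomes write-protected.  Returns the
 * number of pages protected by this call.
 */
static size_t toy_clear_range(struct toy_slot *s, size_t first, size_t npages)
{
    size_t n = 0;

    assert(first <= TOY_NPAGES && npages <= TOY_NPAGES - first);
    for (size_t i = first; i < first + npages; i++) {
        if (s->dirty[i]) {
            s->dirty[i] = false;
            s->wr_protected[i] = true;
            n++;
        }
    }
    return n;
}

/* Number of pages currently write-protected in the slot. */
static size_t toy_count_protected(const struct toy_slot *s)
{
    size_t n = 0;

    for (size_t i = 0; i < TOY_NPAGES; i++) {
        if (s->wr_protected[i]) {
            n++;
        }
    }
    return n;
}
```

In this model, toy_count_protected() is 0 right after
toy_enable_initially_set(), and grows only by the size of each range passed
to toy_clear_range(), so the write-fault cost is spread over time instead of
being paid for the whole slot at once.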


Re: [PATCH v5 09/10] KVM: Disable manual dirty log when dirty ring enabled
Posted by Paolo Bonzini 4 years, 8 months ago
On 22/03/21 10:17, Keqian Zhu wrote:
> Hi Peter,
> 
> On 2021/3/11 4:33, Peter Xu wrote:
>> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
>> useful for KVM_GET_DIRTY_LOG.  Skip enabling it for kvm dirty ring.
>>
>> More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
>> initially, which is against how kvm dirty ring is used - there's no way for kvm
>> dirty ring to re-protect a page before it's notified as being written first
>> with a GFN entry in the ring!  So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
>> with dirty ring, we'll see silent data loss after migration.
> It is a pity that the dirty ring cannot work with KVM_DIRTY_LOG_INITIALLY_SET ...
> With KVM_DIRTY_LOG_INITIALLY_SET, we can speed up the start of dirty logging. More
> importantly, we can enable dirty logging gradually. For write-fault-based dirty logging,
> this greatly reduces the side effects of dirty logging on the guest.
> 
> I hope we can put forward a similar optimization for dirty ring mode. :)

Indeed, perhaps (even though KVM_GET_DIRTY_LOG does not make sense with 
dirty ring) we could allow KVM_CLEAR_DIRTY_LOG.

Paolo


Re: [PATCH v5 09/10] KVM: Disable manual dirty log when dirty ring enabled
Posted by Peter Xu 4 years, 8 months ago
On Mon, Mar 22, 2021 at 02:55:44PM +0100, Paolo Bonzini wrote:
> On 22/03/21 10:17, Keqian Zhu wrote:
> > Hi Peter,
> > 
> > On 2021/3/11 4:33, Peter Xu wrote:
> > > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
> > > useful for KVM_GET_DIRTY_LOG.  Skip enabling it for kvm dirty ring.
> > > 
> > > More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
> > > initially, which is against how kvm dirty ring is used - there's no way for kvm
> > > dirty ring to re-protect a page before it's notified as being written first
> > > with a GFN entry in the ring!  So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
> > > with dirty ring, we'll see silent data loss after migration.
> > It is a pity that the dirty ring cannot work with KVM_DIRTY_LOG_INITIALLY_SET ...
> > With KVM_DIRTY_LOG_INITIALLY_SET, we can speed up the start of dirty logging. More
> > importantly, we can enable dirty logging gradually. For write-fault-based dirty logging,
> > this greatly reduces the side effects of dirty logging on the guest.
> > 
> > I hope we can put forward a similar optimization for dirty ring mode. :)
> 
> Indeed, perhaps (even though KVM_GET_DIRTY_LOG does not make sense with
> dirty ring) we could allow KVM_CLEAR_DIRTY_LOG.

Right, KVM_CLEAR_DIRTY_LOG is a good interface to reuse so as to give
userspace more flexibility to explicitly wr-protect some guest pages.  However,
that will need kernel rework - obviously, when I worked on the kernel part I
didn't notice this issue.

To make it complete, IMHO we'll also need QEMU to drop the whole dirty bitmap
in all the layers; only then would I expect the dirty ring idea as a whole to
start making more of a difference on huge VMs.  All of that just needs some
more work, which should be based on this series.

Shall we make it a "TODO" though?  E.g., I can add a comment here mentioning
this issue.  I still hope we can have the QEMU series land soon first, since
fully completing it will be a bigger project.  Paolo?

Thanks,

-- 
Peter Xu