KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring.

More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
initially, which is against how kvm dirty ring is used - there's no way for kvm
dirty ring to re-protect a page before it's notified as being written first
with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
with dirty ring, we'll see silent data loss after migration.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 37 +++++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 10137b6af11..ae9393266b2 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2173,20 +2173,29 @@ static int kvm_init(MachineState *ms)
         }
     }

-    dirty_log_manual_caps =
-        kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
-    dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
-                              KVM_DIRTY_LOG_INITIALLY_SET);
-    s->manual_dirty_log_protect = dirty_log_manual_caps;
-    if (dirty_log_manual_caps) {
-        ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
-                                dirty_log_manual_caps);
-        if (ret) {
-            warn_report("Trying to enable capability %"PRIu64" of "
-                        "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
-                        "Falling back to the legacy mode. ",
-                        dirty_log_manual_caps);
-            s->manual_dirty_log_protect = 0;
+    /*
+     * KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is not needed when dirty ring is
+     * enabled.  More importantly, KVM_DIRTY_LOG_INITIALLY_SET will assume no
+     * page is wr-protected initially, which is against how kvm dirty ring is
+     * used - kvm dirty ring requires all pages to be wr-protected at the very
+     * beginning.  Enabling this feature for dirty ring causes data corruption.
+     */
+    if (!s->kvm_dirty_ring_enabled) {
+        dirty_log_manual_caps =
+            kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
+        dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
+                                  KVM_DIRTY_LOG_INITIALLY_SET);
+        s->manual_dirty_log_protect = dirty_log_manual_caps;
+        if (dirty_log_manual_caps) {
+            ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
+                                    dirty_log_manual_caps);
+            if (ret) {
+                warn_report("Trying to enable capability %"PRIu64" of "
+                            "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
+                            "Falling back to the legacy mode. ",
+                            dirty_log_manual_caps);
+                s->manual_dirty_log_protect = 0;
+            }
         }
     }
--
2.26.2
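
As a reading aid for the hunk above, the net effect on the capability mask can be sketched as a small pure function. This is only an illustration under stated assumptions: sketch_manual_dirty_log_caps() and sketch_usage() are hypothetical names that do not exist in QEMU, and the two flag bits are redefined locally (with values assumed to match the Linux UAPI header) so that the snippet builds without <linux/kvm.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the two UAPI flag bits (values are an assumption). */
#define SKETCH_MANUAL_PROTECT_ENABLE (UINT64_C(1) << 0)
#define SKETCH_INITIALLY_SET         (UINT64_C(1) << 1)

/*
 * Hypothetical helper, not a QEMU function: given whether the dirty ring is
 * in use and the bits the kernel reports for the capability, return the bits
 * that userspace should try to enable.
 */
uint64_t sketch_manual_dirty_log_caps(bool dirty_ring_enabled,
                                      uint64_t kernel_caps)
{
    if (dirty_ring_enabled) {
        /* Dirty ring: never enable manual protect / initially-set. */
        return 0;
    }
    /* Bitmap mode: keep only the two bits userspace knows how to use. */
    return kernel_caps & (SKETCH_MANUAL_PROTECT_ENABLE | SKETCH_INITIALLY_SET);
}

/* Usage example: the dirty ring overrides whatever the kernel offers. */
void sketch_usage(void)
{
    uint64_t all = SKETCH_MANUAL_PROTECT_ENABLE | SKETCH_INITIALLY_SET;

    assert(sketch_manual_dirty_log_caps(true, all) == 0);
    assert(sketch_manual_dirty_log_caps(false, all) == all);
}
```

The real code additionally falls back to legacy mode when enabling the capability fails; that error path is left out of the sketch.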
Hi Peter,

On 2021/3/11 4:33, Peter Xu wrote:
> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
> useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring.
>
> More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
> initially, which is against how kvm dirty ring is used - there's no way for kvm
> dirty ring to re-protect a page before it's notified as being written first
> with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
> with dirty ring, we'll see silent data loss after migration.

I feel a little regret that dirty ring can not work with KVM_DIRTY_LOG_INITIALLY_SET ...
With KVM_DIRTY_LOG_INITIALLY_SET, we can speedup dirty log start. More important, we can
enable dirty log gradually. For write fault based dirty log, it greatly reduces the side
effect of dirty log over guest.

I hope we can put forward another similar optimization under dirty ring mode. :)

Thanks,
Keqian

>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> accel/kvm/kvm-all.c | 37 +++++++++++++++++++++++--------------
> 1 file changed, 23 insertions(+), 14 deletions(-)
>
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 10137b6af11..ae9393266b2 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2173,20 +2173,29 @@ static int kvm_init(MachineState *ms)
>          }
>      }
>
> -    dirty_log_manual_caps =
> -        kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
> -    dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
> -                              KVM_DIRTY_LOG_INITIALLY_SET);
> -    s->manual_dirty_log_protect = dirty_log_manual_caps;
> -    if (dirty_log_manual_caps) {
> -        ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
> -                                dirty_log_manual_caps);
> -        if (ret) {
> -            warn_report("Trying to enable capability %"PRIu64" of "
> -                        "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
> -                        "Falling back to the legacy mode. ",
> -                        dirty_log_manual_caps);
> -            s->manual_dirty_log_protect = 0;
> +    /*
> +     * KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is not needed when dirty ring is
> +     * enabled.  More importantly, KVM_DIRTY_LOG_INITIALLY_SET will assume no
> +     * page is wr-protected initially, which is against how kvm dirty ring is
> +     * used - kvm dirty ring requires all pages to be wr-protected at the very
> +     * beginning.  Enabling this feature for dirty ring causes data corruption.
> +     */
> +    if (!s->kvm_dirty_ring_enabled) {
> +        dirty_log_manual_caps =
> +            kvm_check_extension(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
> +        dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
> +                                  KVM_DIRTY_LOG_INITIALLY_SET);
> +        s->manual_dirty_log_protect = dirty_log_manual_caps;
> +        if (dirty_log_manual_caps) {
> +            ret = kvm_vm_enable_cap(s, KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2, 0,
> +                                    dirty_log_manual_caps);
> +            if (ret) {
> +                warn_report("Trying to enable capability %"PRIu64" of "
> +                            "KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 but failed. "
> +                            "Falling back to the legacy mode. ",
> +                            dirty_log_manual_caps);
> +                s->manual_dirty_log_protect = 0;
> +            }
>          }
>      }
>
>
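
The incompatibility discussed in this thread can be illustrated with a toy model. It is a deliberately simplified sketch, not kernel or QEMU code: every name (toy_vm, toy_start_tracking, toy_guest_write, toy_dirty_seen) is hypothetical, and the only behaviour it models is the one stated in the commit message, namely that a GFN reaches the ring only through a write fault on a wr-protected page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Toy VM state; all names are hypothetical and model the idea only. */
struct toy_vm {
    bool wr_protected[TOY_NPAGES]; /* would a guest write fault? */
    size_t ring[TOY_NPAGES];       /* GFNs reported through the "ring" */
    size_t ring_len;
};

/*
 * Start dirty tracking. protect_all == true models the dirty ring's
 * expectation; protect_all == false models the initially-set behaviour,
 * where no page is wr-protected when tracking starts.
 */
void toy_start_tracking(struct toy_vm *vm, bool protect_all)
{
    vm->ring_len = 0;
    for (size_t gfn = 0; gfn < TOY_NPAGES; gfn++) {
        vm->wr_protected[gfn] = protect_all;
    }
}

/* Guest write: only a fault on a protected page produces a ring entry. */
void toy_guest_write(struct toy_vm *vm, size_t gfn)
{
    assert(gfn < TOY_NPAGES);
    if (vm->wr_protected[gfn]) {
        vm->wr_protected[gfn] = false;  /* fault handled, page now writable */
        vm->ring[vm->ring_len++] = gfn; /* userspace learns about this GFN */
    }
    /* Unprotected page: the write succeeds silently, nothing is recorded. */
}

/* Dirty GFNs userspace would see after the guest writes pages 0..n-1 once. */
size_t toy_dirty_seen(bool protect_all, size_t n)
{
    struct toy_vm vm;

    toy_start_tracking(&vm, protect_all);
    for (size_t gfn = 0; gfn < n && gfn < TOY_NPAGES; gfn++) {
        toy_guest_write(&vm, gfn);
    }
    return vm.ring_len;
}
```

In this model, toy_dirty_seen(true, 5) reports all five written pages, while toy_dirty_seen(false, 5) reports none: the writes happened but were never recorded, which corresponds to the silent data loss the patch avoids.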
On 22/03/21 10:17, Keqian Zhu wrote:
> Hi Peter,
>
> On 2021/3/11 4:33, Peter Xu wrote:
>> KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
>> useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring.
>>
>> More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
>> initially, which is against how kvm dirty ring is used - there's no way for kvm
>> dirty ring to re-protect a page before it's notified as being written first
>> with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
>> with dirty ring, we'll see silent data loss after migration.
> I feel a little regret that dirty ring can not work with KVM_DIRTY_LOG_INITIALLY_SET ...
> With KVM_DIRTY_LOG_INITIALLY_SET, we can speedup dirty log start. More important, we can
> enable dirty log gradually. For write fault based dirty log, it greatly reduces the side
> effect of dirty log over guest.
>
> I hope we can put forward another similar optimization under dirty ring mode. :)

Indeed, perhaps (even though KVM_GET_DIRTY_LOG does not make sense with
dirty ring) we could allow KVM_CLEAR_DIRTY_LOG.

Paolo

On Mon, Mar 22, 2021 at 02:55:44PM +0100, Paolo Bonzini wrote:
> On 22/03/21 10:17, Keqian Zhu wrote:
> > Hi Peter,
> >
> > On 2021/3/11 4:33, Peter Xu wrote:
> > > KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is for KVM_CLEAR_DIRTY_LOG, which is only
> > > useful for KVM_GET_DIRTY_LOG. Skip enabling it for kvm dirty ring.
> > >
> > > More importantly, KVM_DIRTY_LOG_INITIALLY_SET will not wr-protect all the pages
> > > initially, which is against how kvm dirty ring is used - there's no way for kvm
> > > dirty ring to re-protect a page before it's notified as being written first
> > > with a GFN entry in the ring! So when KVM_DIRTY_LOG_INITIALLY_SET is enabled
> > > with dirty ring, we'll see silent data loss after migration.
> > I feel a little regret that dirty ring can not work with KVM_DIRTY_LOG_INITIALLY_SET ...
> > With KVM_DIRTY_LOG_INITIALLY_SET, we can speedup dirty log start. More important, we can
> > enable dirty log gradually. For write fault based dirty log, it greatly reduces the side
> > effect of dirty log over guest.
> >
> > I hope we can put forward another similar optimization under dirty ring mode. :)
>
> Indeed, perhaps (even though KVM_GET_DIRTY_LOG does not make sense with
> dirty ring) we could allow KVM_CLEAR_DIRTY_LOG.

Right, KVM_CLEAR_DIRTY_LOG is a good interface to reuse so as to grant
userspace more flexibility to explicitly wr-protect some guest pages.  However
that'll need kernel rework - obviously when I worked on the kernel part I
didn't notice this issue.

To make it complete, IMHO we'll also need QEMU to drop the whole dirty bitmap
in all the layers; then I'd expect the dirty ring idea as a whole to start
making more of a difference on huge VMs.  All of that needs some more work,
which should be based on this series.

Shall we make it a "TODO" though?  E.g., I can add a comment here mentioning
this issue.  I still hope we can have the QEMU series land soon first, since
it'll be a bigger project to fully complete it.

Paolo?

Thanks,

-- 
Peter Xu