Commit f10562799 introduced a regression: if reverting to a snapshot
fails early (such as when we refuse to revert to an external
snapshot), we lose track of the domain's current snapshot.
See: https://bugzilla.redhat.com/1738747
Signed-off-by: Eric Blake <eblake@redhat.com>
---
src/qemu/qemu_driver.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b28a26c3d6..093b15f500 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16941,8 +16941,6 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
virDomainSnapshotSetCurrent(vm->snapshots, NULL);
ret = -1;
}
- } else if (snap) {
- virDomainSnapshotSetCurrent(vm->snapshots, NULL);
}
if (ret == 0 && config && vm->persistent &&
!(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
--
2.21.0
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On 9/9/19 5:52 PM, Eric Blake wrote: > Commit f10562799 introduced a regression: if reverting to a snapshot > fails early (such as when we refuse to revert to an external > snapshot), we lose track of the domain's current snapshot. > > See: https://bugzilla.redhat.com/1738747 > Signed-off-by: Eric Blake <eblake@redhat.com> > --- I don't understand why f10562799 broke this code like this - tried looking the changes made in the commit and the "if (snap)" was there since a long time, no changes were made in the 'snap' variable as well - but this change didn't break anything else, so: Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> ps: I think adding a test case for this failure (in tests/virsh-snapshot ?) is worth considering, especially considering that we changes from from Maxiwell (adding an inactive XML to the snapshot) that can add up more complexity in the snapshot mechanics. Thanks, DHB > src/qemu/qemu_driver.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index b28a26c3d6..093b15f500 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -16941,8 +16941,6 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, > virDomainSnapshotSetCurrent(vm->snapshots, NULL); > ret = -1; > } > - } else if (snap) { > - virDomainSnapshotSetCurrent(vm->snapshots, NULL); > } > if (ret == 0 && config && vm->persistent && > !(ret = virDomainSaveConfig(cfg->configDir, driver->caps, -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote: > > > On 9/9/19 5:52 PM, Eric Blake wrote: >> Commit f10562799 introduced a regression: if reverting to a snapshot >> fails early (such as when we refuse to revert to an external >> snapshot), we lose track of the domain's current snapshot. >> >> See: https://bugzilla.redhat.com/1738747 >> Signed-off-by: Eric Blake <eblake@redhat.com> >> --- > > I don't understand why f10562799 broke this code like this - tried > looking the changes made in the commit and the "if (snap)" was > there since a long time, no changes were made in the 'snap' > variable as well - but this change didn't break anything else, so: > > Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> I'll update the commit message to give more details: Before that patch, we maintained the notion of the current snapshot in two places: vm->current_snapshot, and snap->def->current. If you have a domain with two snapshots, s1 and s2 (where s2 is current) and want to revert to s1, the old code cleared the def->current flag on s2 before noticing that reverting to s1 was not possible, but at least left vm->current_snapshot unchanged. That meant we had a _different_ bug - the code was inconsistent on whether the domain had a current snapshot (as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2 was still claimed as current; but if you restart libvirtd, the XML for s2 didn't track that it was current, so after restart you'd have no current snapshot). After that patch, the code was unconditionally changing vm->current_snapshot to NULL (but at least was consistent everywhere else that the XML never got out of sync with the notion of which snapshot was current). It also didn't help that after that patch, the code clearing the snapshot occurs twice in the function - once right after determining that the early checks have succeeded, the other unconditionally on all failure paths. So the fix is as simple as removing the unconditional clearing of s2 as the current snapshot in the cleanup code, in favor of the earlier clearing that happens only after the early checks succeed. > > > ps: I think adding a test case for this failure (in tests/virsh-snapshot ?) > is worth considering, especially considering that we changes from > from Maxiwell (adding an inactive XML to the snapshot) that can > add up more complexity in the snapshot mechanics. I tried. But the test driver doesn't forbid 'snapshot-revert' on external snapshots, and also doesn't try to write state to XML files, so it never copied the problematic code from the qemu driver that could even trigger the bug. > >> src/qemu/qemu_driver.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >> index b28a26c3d6..093b15f500 100644 >> --- a/src/qemu/qemu_driver.c >> +++ b/src/qemu/qemu_driver.c >> @@ -16941,8 +16941,6 @@ >> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >> virDomainSnapshotSetCurrent(vm->snapshots, NULL); >> ret = -1; >> } >> - } else if (snap) { >> - virDomainSnapshotSetCurrent(vm->snapshots, NULL); >> } >> if (ret == 0 && config && vm->persistent && >> !(ret = virDomainSaveConfig(cfg->configDir, driver->caps, > > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 9/10/19 3:59 PM, Eric Blake wrote: > On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote: >> >> On 9/9/19 5:52 PM, Eric Blake wrote: >>> Commit f10562799 introduced a regression: if reverting to a snapshot >>> fails early (such as when we refuse to revert to an external >>> snapshot), we lose track of the domain's current snapshot. >>> >>> See: https://bugzilla.redhat.com/1738747 >>> Signed-off-by: Eric Blake <eblake@redhat.com> >>> --- >> I don't understand why f10562799 broke this code like this - tried >> looking the changes made in the commit and the "if (snap)" was >> there since a long time, no changes were made in the 'snap' >> variable as well - but this change didn't break anything else, so: >> >> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> > I'll update the commit message to give more details: > > Before that patch, we maintained the notion of the current snapshot in > two places: vm->current_snapshot, and snap->def->current. If you have a > domain with two snapshots, s1 and s2 (where s2 is current) and want to > revert to s1, the old code cleared the def->current flag on s2 before > noticing that reverting to s1 was not possible, but at least left > vm->current_snapshot unchanged. That meant we had a _different_ bug - > the code was inconsistent on whether the domain had a current snapshot > (as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2 > was still claimed as current; but if you restart libvirtd, the XML for > s2 didn't track that it was current, so after restart you'd have no > current snapshot). After that patch, the code was unconditionally > changing vm->current_snapshot to NULL (but at least was consistent > everywhere else that the XML never got out of sync with the notion of > which snapshot was current). It also didn't help that after that patch, > the code clearing the snapshot occurs twice in the function - once right > after determining that the early checks have succeeded, the other > unconditionally on all failure paths. > > So the fix is as simple as removing the unconditional clearing of s2 as > the current snapshot in the cleanup code, in favor of the earlier > clearing that happens only after the early checks succeed. Thanks for the explanation! > >> >> ps: I think adding a test case for this failure (in tests/virsh-snapshot ?) >> is worth considering, especially considering that we changes from >> from Maxiwell (adding an inactive XML to the snapshot) that can >> add up more complexity in the snapshot mechanics. > I tried. But the test driver doesn't forbid 'snapshot-revert' on > external snapshots, and also doesn't try to write state to XML files, so > it never copied the problematic code from the qemu driver that could > even trigger the bug. I see. But now that I understood what you did and how the code is neater, I changed my mind - I don't think Maxiwell's changes are going to be too hard of a hit there. We can enhance the test_driver to behave like the regular driver in this case of external snapshots, and add a test case for this scenario, in another day. Thanks, DHB > > >>> src/qemu/qemu_driver.c | 2 -- >>> 1 file changed, 2 deletions(-) >>> >>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >>> index b28a26c3d6..093b15f500 100644 >>> --- a/src/qemu/qemu_driver.c >>> +++ b/src/qemu/qemu_driver.c >>> @@ -16941,8 +16941,6 @@ >>> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, >>> virDomainSnapshotSetCurrent(vm->snapshots, NULL); >>> ret = -1; >>> } >>> - } else if (snap) { >>> - virDomainSnapshotSetCurrent(vm->snapshots, NULL); >>> } >>> if (ret == 0 && config && vm->persistent && >>> !(ret = virDomainSaveConfig(cfg->configDir, driver->caps, >> -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2024 Red Hat, Inc.