[libvirt] [PATCH] qemu: Fix regression in snapshot-revert

Eric Blake posted 1 patch 4 years, 7 months ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20190909205242.15406-1-eblake@redhat.com
src/qemu/qemu_driver.c | 2 --
1 file changed, 2 deletions(-)
[libvirt] [PATCH] qemu: Fix regression in snapshot-revert
Posted by Eric Blake 4 years, 7 months ago
Commit f10562799 introduced a regression: if reverting to a snapshot
fails early (such as when we refuse to revert to an external
snapshot), we lose track of the domain's current snapshot.

See: https://bugzilla.redhat.com/1738747
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 src/qemu/qemu_driver.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b28a26c3d6..093b15f500 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16941,8 +16941,6 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
             virDomainSnapshotSetCurrent(vm->snapshots, NULL);
             ret = -1;
         }
-    } else if (snap) {
-        virDomainSnapshotSetCurrent(vm->snapshots, NULL);
     }
     if (ret == 0 && config && vm->persistent &&
         !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
-- 
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert
Posted by Daniel Henrique Barboza 4 years, 7 months ago

On 9/9/19 5:52 PM, Eric Blake wrote:
> Commit f10562799 introduced a regression: if reverting to a snapshot
> fails early (such as when we refuse to revert to an external
> snapshot), we lose track of the domain's current snapshot.
>
> See: https://bugzilla.redhat.com/1738747
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---

I don't understand why f10562799 broke this code like this - tried
looking the changes made in the commit and the "if (snap)" was
there since a long time, no changes were made in the 'snap'
variable as well - but this change didn't break anything else, so:

Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>


ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
is worth considering, especially considering that we changes from
from Maxiwell (adding an inactive XML to the snapshot) that can
add up more complexity in the snapshot mechanics.


Thanks,


DHB


>   src/qemu/qemu_driver.c | 2 --
>   1 file changed, 2 deletions(-)
>
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index b28a26c3d6..093b15f500 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -16941,8 +16941,6 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>               virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>               ret = -1;
>           }
> -    } else if (snap) {
> -        virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>       }
>       if (ret == 0 && config && vm->persistent &&
>           !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert
Posted by Eric Blake 4 years, 7 months ago
On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote:
> 
> 
> On 9/9/19 5:52 PM, Eric Blake wrote:
>> Commit f10562799 introduced a regression: if reverting to a snapshot
>> fails early (such as when we refuse to revert to an external
>> snapshot), we lose track of the domain's current snapshot.
>>
>> See: https://bugzilla.redhat.com/1738747
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
> 
> I don't understand why f10562799 broke this code like this - tried
> looking the changes made in the commit and the "if (snap)" was
> there since a long time, no changes were made in the 'snap'
> variable as well - but this change didn't break anything else, so:
> 
> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

I'll update the commit message to give more details:

Before that patch, we maintained the notion of the current snapshot in
two places: vm->current_snapshot, and snap->def->current.  If you have a
domain with two snapshots, s1 and s2 (where s2 is current) and want to
revert to s1, the old code cleared the def->current flag on s2 before
noticing that reverting to s1 was not possible, but at least left
vm->current_snapshot unchanged.  That meant we had a _different_ bug -
the code was inconsistent on whether the domain had a current snapshot
(as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2
was still claimed as current; but if you restart libvirtd, the XML for
s2 didn't track that it was current, so after restart you'd have no
current snapshot).  After that patch, the code was unconditionally
changing vm->current_snapshot to NULL (but at least was consistent
everywhere else that the XML never got out of sync with the notion of
which snapshot was current).  It also didn't help that after that patch,
the code clearing the snapshot occurs twice in the function - once right
after determining that the early checks have succeeded, the other
unconditionally on all failure paths.

So the fix is as simple as removing the unconditional clearing of s2 as
the current snapshot in the cleanup code, in favor of the earlier
clearing that happens only after the early checks succeed.

> 
> 
> ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
> is worth considering, especially considering that we changes from
> from Maxiwell (adding an inactive XML to the snapshot) that can
> add up more complexity in the snapshot mechanics.

I tried. But the test driver doesn't forbid 'snapshot-revert' on
external snapshots, and also doesn't try to write state to XML files, so
it never copied the problematic code from the qemu driver that could
even trigger the bug.


> 
>>   src/qemu/qemu_driver.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index b28a26c3d6..093b15f500 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -16941,8 +16941,6 @@
>> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>               virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>               ret = -1;
>>           }
>> -    } else if (snap) {
>> -        virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>       }
>>       if (ret == 0 && config && vm->persistent &&
>>           !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert
Posted by Daniel Henrique Barboza 4 years, 7 months ago

On 9/10/19 3:59 PM, Eric Blake wrote:
> On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote:
>>
>> On 9/9/19 5:52 PM, Eric Blake wrote:
>>> Commit f10562799 introduced a regression: if reverting to a snapshot
>>> fails early (such as when we refuse to revert to an external
>>> snapshot), we lose track of the domain's current snapshot.
>>>
>>> See: https://bugzilla.redhat.com/1738747
>>> Signed-off-by: Eric Blake <eblake@redhat.com>
>>> ---
>> I don't understand why f10562799 broke this code like this - tried
>> looking the changes made in the commit and the "if (snap)" was
>> there since a long time, no changes were made in the 'snap'
>> variable as well - but this change didn't break anything else, so:
>>
>> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
> I'll update the commit message to give more details:
>
> Before that patch, we maintained the notion of the current snapshot in
> two places: vm->current_snapshot, and snap->def->current.  If you have a
> domain with two snapshots, s1 and s2 (where s2 is current) and want to
> revert to s1, the old code cleared the def->current flag on s2 before
> noticing that reverting to s1 was not possible, but at least left
> vm->current_snapshot unchanged.  That meant we had a _different_ bug -
> the code was inconsistent on whether the domain had a current snapshot
> (as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2
> was still claimed as current; but if you restart libvirtd, the XML for
> s2 didn't track that it was current, so after restart you'd have no
> current snapshot).  After that patch, the code was unconditionally
> changing vm->current_snapshot to NULL (but at least was consistent
> everywhere else that the XML never got out of sync with the notion of
> which snapshot was current).  It also didn't help that after that patch,
> the code clearing the snapshot occurs twice in the function - once right
> after determining that the early checks have succeeded, the other
> unconditionally on all failure paths.
>
> So the fix is as simple as removing the unconditional clearing of s2 as
> the current snapshot in the cleanup code, in favor of the earlier
> clearing that happens only after the early checks succeed.

Thanks for the explanation!

>
>>
>> ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
>> is worth considering, especially considering that we changes from
>> from Maxiwell (adding an inactive XML to the snapshot) that can
>> add up more complexity in the snapshot mechanics.
> I tried. But the test driver doesn't forbid 'snapshot-revert' on
> external snapshots, and also doesn't try to write state to XML files, so
> it never copied the problematic code from the qemu driver that could
> even trigger the bug.

I see. But now that I understood what you did and how the code is
neater, I changed my mind - I don't think Maxiwell's changes are going to
be too hard of a hit there. We can enhance the test_driver to
behave like the regular driver in this case of external snapshots, and
add a test case for this scenario, in another day.


Thanks,


DHB

>
>
>>>    src/qemu/qemu_driver.c | 2 --
>>>    1 file changed, 2 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index b28a26c3d6..093b15f500 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -16941,8 +16941,6 @@
>>> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>>                virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>>                ret = -1;
>>>            }
>>> -    } else if (snap) {
>>> -        virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>>        }
>>>        if (ret == 0 && config && vm->persistent &&
>>>            !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
>>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list