[libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware

Peter Krempa posted 1 patch 7 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/68bb5b413b9ebc55fee368a0cd0201d424a468a1.1490261332.git.pkrempa@redhat.com
src/qemu/qemu_driver.c | 10 ++++++++++
1 file changed, 10 insertions(+)
[libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Peter Krempa 7 years, 1 month ago
If the variable store (<nvram>) file is raw qemu can't do a snapshot of
it and thus the snapshot would be incomplete. QEMU does no reject such
snapshot.

Additionally allowing to use a qcow2 variable store backing file would
solve this issue but then it would become eligible to become target of
the memory dump.

Offline internal snapshot would be incomplete too with either storage
format since libvirt does not handle the pflash file in this case.

Forbid such snapshot so that we can avoid problems.
---

Notes:
    v2:
    - changed error code to OPERATION_UNSUPPORTED (from CONFIG_UNSUPPORTED)
    - dropped mention of QEMU from the error message
    - dropped mentions of OVMF or the firmware itself altoghether, the culprit is
      the pflash device regardless of the software it contains
    - mentioned all the stuff in the commit message and comment
    
    We also will need to introduce a way to snapshot the pflash for external
    snapshots which is currently impossible as well, but fortunately does not
    have inherent drawbacks as internal snapshots.

 src/qemu/qemu_driver.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 676295208..202d190b3 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -13873,6 +13873,16 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
         goto cleanup;
     }

+    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
+     * not snapshot the variable store */
+    if (found_internal &&
+        vm->def->os.loader->type == VIR_DOMAIN_LOADER_TYPE_PFLASH) {
+        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+                       _("internal snapshots of a VM with pflash based "
+                         "firmware are not supported"));
+        goto cleanup;
+    }
+
     /* Alter flags to let later users know what we learned.  */
     if (external && !active)
         *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
-- 
2.12.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Laszlo Ersek 7 years, 1 month ago
On 03/23/17 10:29, Peter Krempa wrote:
> If the variable store (<nvram>) file is raw qemu can't do a snapshot of
> it and thus the snapshot would be incomplete. QEMU does no reject such
> snapshot.
> 
> Additionally allowing to use a qcow2 variable store backing file would
> solve this issue but then it would become eligible to become target of
> the memory dump.
> 
> Offline internal snapshot would be incomplete too with either storage
> format since libvirt does not handle the pflash file in this case.
> 
> Forbid such snapshot so that we can avoid problems.
> ---
> 
> Notes:
>     v2:
>     - changed error code to OPERATION_UNSUPPORTED (from CONFIG_UNSUPPORTED)
>     - dropped mention of QEMU from the error message
>     - dropped mentions of OVMF or the firmware itself altoghether, the culprit is
>       the pflash device regardless of the software it contains
>     - mentioned all the stuff in the commit message and comment
>     
>     We also will need to introduce a way to snapshot the pflash for external
>     snapshots which is currently impossible as well, but fortunately does not
>     have inherent drawbacks as internal snapshots.
> 
>  src/qemu/qemu_driver.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 676295208..202d190b3 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -13873,6 +13873,16 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
>          goto cleanup;
>      }
> 
> +    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> +     * not snapshot the variable store */
> +    if (found_internal &&
> +        vm->def->os.loader->type == VIR_DOMAIN_LOADER_TYPE_PFLASH) {
> +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
> +                       _("internal snapshots of a VM with pflash based "
> +                         "firmware are not supported"));
> +        goto cleanup;
> +    }
> +
>      /* Alter flags to let later users know what we learned.  */
>      if (external && !active)
>          *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
> 

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks!
Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Peter Krempa 7 years, 1 month ago
On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
> On 03/23/17 10:29, Peter Krempa wrote:
> > If the variable store (<nvram>) file is raw qemu can't do a snapshot of
> > it and thus the snapshot would be incomplete. QEMU does no reject such
> > snapshot.
> > 
> > Additionally allowing to use a qcow2 variable store backing file would
> > solve this issue but then it would become eligible to become target of
> > the memory dump.
> > 
> > Offline internal snapshot would be incomplete too with either storage
> > format since libvirt does not handle the pflash file in this case.
> > 
> > Forbid such snapshot so that we can avoid problems.
> > ---
> > 
> > Notes:
> >     v2:
> >     - changed error code to OPERATION_UNSUPPORTED (from CONFIG_UNSUPPORTED)
> >     - dropped mention of QEMU from the error message
> >     - dropped mentions of OVMF or the firmware itself altoghether, the culprit is
> >       the pflash device regardless of the software it contains
> >     - mentioned all the stuff in the commit message and comment
> >     
> >     We also will need to introduce a way to snapshot the pflash for external
> >     snapshots which is currently impossible as well, but fortunately does not
> >     have inherent drawbacks as internal snapshots.
> > 
> >  src/qemu/qemu_driver.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index 676295208..202d190b3 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -13873,6 +13873,16 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
> >          goto cleanup;
> >      }
> > 
> > +    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> > +     * not snapshot the variable store */
> > +    if (found_internal &&
> > +        vm->def->os.loader->type == VIR_DOMAIN_LOADER_TYPE_PFLASH) {
> > +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
> > +                       _("internal snapshots of a VM with pflash based "
> > +                         "firmware are not supported"));
> > +        goto cleanup;
> > +    }
> > +
> >      /* Alter flags to let later users know what we learned.  */
> >      if (external && !active)
> >          *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
> > 

I apparently forgot to commit this despite mentioning it in the commit
message:

@@ -13873,8 +13873,14 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
         goto cleanup;
     }

-    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
-     * not snapshot the variable store */
+    /* internal snapshots + pflash based loader have the following problems:
+     * - if the variable store is raw, the snapshot is incomplete
+     * - alowing a qcow2 image as the varstore would make it eligible to receive
+     *   the vmstate dump, which would make it huge
+     * - offline snapshot would not snapshot the varstore at all
+     *
+     * Avoid the issues by forbidding this completely.
+     */

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Laszlo Ersek 7 years, 1 month ago
On 03/23/17 10:54, Peter Krempa wrote:
> On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
>> On 03/23/17 10:29, Peter Krempa wrote:
>>> If the variable store (<nvram>) file is raw qemu can't do a snapshot of
>>> it and thus the snapshot would be incomplete. QEMU does no reject such
>>> snapshot.
>>>
>>> Additionally allowing to use a qcow2 variable store backing file would
>>> solve this issue but then it would become eligible to become target of
>>> the memory dump.
>>>
>>> Offline internal snapshot would be incomplete too with either storage
>>> format since libvirt does not handle the pflash file in this case.
>>>
>>> Forbid such snapshot so that we can avoid problems.
>>> ---
>>>
>>> Notes:
>>>     v2:
>>>     - changed error code to OPERATION_UNSUPPORTED (from CONFIG_UNSUPPORTED)
>>>     - dropped mention of QEMU from the error message
>>>     - dropped mentions of OVMF or the firmware itself altoghether, the culprit is
>>>       the pflash device regardless of the software it contains
>>>     - mentioned all the stuff in the commit message and comment
>>>     
>>>     We also will need to introduce a way to snapshot the pflash for external
>>>     snapshots which is currently impossible as well, but fortunately does not
>>>     have inherent drawbacks as internal snapshots.
>>>
>>>  src/qemu/qemu_driver.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>> index 676295208..202d190b3 100644
>>> --- a/src/qemu/qemu_driver.c
>>> +++ b/src/qemu/qemu_driver.c
>>> @@ -13873,6 +13873,16 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
>>>          goto cleanup;
>>>      }
>>>
>>> +    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
>>> +     * not snapshot the variable store */
>>> +    if (found_internal &&
>>> +        vm->def->os.loader->type == VIR_DOMAIN_LOADER_TYPE_PFLASH) {
>>> +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
>>> +                       _("internal snapshots of a VM with pflash based "
>>> +                         "firmware are not supported"));
>>> +        goto cleanup;
>>> +    }
>>> +
>>>      /* Alter flags to let later users know what we learned.  */
>>>      if (external && !active)
>>>          *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
>>>
> 
> I apparently forgot to commit this despite mentioning it in the commit
> message:
> 
> @@ -13873,8 +13873,14 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
>          goto cleanup;
>      }
> 
> -    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> -     * not snapshot the variable store */
> +    /* internal snapshots + pflash based loader have the following problems:
> +     * - if the variable store is raw, the snapshot is incomplete
> +     * - alowing a qcow2 image as the varstore would make it eligible to receive
> +     *   the vmstate dump, which would make it huge
> +     * - offline snapshot would not snapshot the varstore at all
> +     *
> +     * Avoid the issues by forbidding this completely.
> +     */
> 

Yeah I saw the OVMF mention in the v2 comment, despite "dropped mentions
of OVMF", but I figured I wouldn't obsess about it. :)

Anyway, if you post a v3 with the comment updated, you can keep my R-b
with it -- the new comment looks great.

Thanks!
Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Peter Krempa 7 years, 1 month ago
On Thu, Mar 23, 2017 at 11:03:02 +0100, Laszlo Ersek wrote:
> On 03/23/17 10:54, Peter Krempa wrote:
> > On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
> >> On 03/23/17 10:29, Peter Krempa wrote:
> >>> If the variable store (<nvram>) file is raw qemu can't do a snapshot of
> >>> it and thus the snapshot would be incomplete. QEMU does no reject such
> >>> snapshot.
> >>>
> >>> Additionally allowing to use a qcow2 variable store backing file would
> >>> solve this issue but then it would become eligible to become target of
> >>> the memory dump.
> >>>
> >>> Offline internal snapshot would be incomplete too with either storage
> >>> format since libvirt does not handle the pflash file in this case.
> >>>
> >>> Forbid such snapshot so that we can avoid problems.

[...]

> > @@ -13873,8 +13873,14 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
> >          goto cleanup;
> >      }
> > 
> > -    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> > -     * not snapshot the variable store */
> > +    /* internal snapshots + pflash based loader have the following problems:
> > +     * - if the variable store is raw, the snapshot is incomplete
> > +     * - alowing a qcow2 image as the varstore would make it eligible to receive
> > +     *   the vmstate dump, which would make it huge
> > +     * - offline snapshot would not snapshot the varstore at all
> > +     *
> > +     * Avoid the issues by forbidding this completely.
> > +     */

I thought about this a bit more and I think that while there are the
above problems we still can have users of snapshots + OVMF which use it
successfully. Forbiding it would create a regression for them since they
did not observe anything bad despite the problems mentioned above:

The reasons are following:

1) internal snapshots are the default in virt-manager
2) guests usually don't re-write the varstore very often, usually only
at install
3) OSes usually don't modify anything besides the boot entry
4) snapshot of an online VM carries the varstore in the memory image
5) OSes are pretty good at restoring the boot entry if it fails

Due to the facts above I think that there are users that legitimately
think that snapshots with pflash loaders work as expected. It's mostly
due to the fact that the data are pretty static and OSes don't store
anything important there and are able to self-heal some of the problems.

I think we should not disallow this to avoid usability regressions. We
can add documentation that states that it's unsafe to do snapshots.
Additionally we will need to add support for external snapshots, which
currently have similar kind of problems, although fixable.

Peter
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Laszlo Ersek 7 years, 1 month ago
On 03/23/17 15:07, Peter Krempa wrote:
> On Thu, Mar 23, 2017 at 11:03:02 +0100, Laszlo Ersek wrote:
>> On 03/23/17 10:54, Peter Krempa wrote:
>>> On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
>>>> On 03/23/17 10:29, Peter Krempa wrote:
>>>>> If the variable store (<nvram>) file is raw qemu can't do a snapshot of
>>>>> it and thus the snapshot would be incomplete. QEMU does no reject such
>>>>> snapshot.
>>>>>
>>>>> Additionally allowing to use a qcow2 variable store backing file would
>>>>> solve this issue but then it would become eligible to become target of
>>>>> the memory dump.
>>>>>
>>>>> Offline internal snapshot would be incomplete too with either storage
>>>>> format since libvirt does not handle the pflash file in this case.
>>>>>
>>>>> Forbid such snapshot so that we can avoid problems.
> 
> [...]
> 
>>> @@ -13873,8 +13873,14 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
>>>          goto cleanup;
>>>      }
>>>
>>> -    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
>>> -     * not snapshot the variable store */
>>> +    /* internal snapshots + pflash based loader have the following problems:
>>> +     * - if the variable store is raw, the snapshot is incomplete
>>> +     * - alowing a qcow2 image as the varstore would make it eligible to receive
>>> +     *   the vmstate dump, which would make it huge
>>> +     * - offline snapshot would not snapshot the varstore at all
>>> +     *
>>> +     * Avoid the issues by forbidding this completely.
>>> +     */
> 
> I thought about this a bit more and I think that while there are the
> above problems we still can have users of snapshots + OVMF which use it
> successfully. Forbiding it would create a regression for them since they
> did not observe anything bad despite the problems mentioned above:
> 
> The reasons are following:
> 
> 1) internal snapshots are the default in virt-manager
> 2) guests usually don't re-write the varstore very often, usually only
> at install
> 3) OSes usually don't modify anything besides the boot entry
> 4) snapshot of an online VM carries the varstore in the memory image
> 5) OSes are pretty good at restoring the boot entry if it fails
> 
> Due to the facts above I think that there are users that legitimately
> think that snapshots with pflash loaders work as expected. It's mostly
> due to the fact that the data are pretty static and OSes don't store
> anything important there and are able to self-heal some of the problems.
> 
> I think we should not disallow this to avoid usability regressions. We
> can add documentation that states that it's unsafe to do snapshots.
> Additionally we will need to add support for external snapshots, which
> currently have similar kind of problems, although fixable.

The tradeoff is between a seemingly working, but inherently unsafe
operation, and a clear error message that keeps things safe.

The UEFI variable store is used for more and different things than you
mention, such as (in roughly decreasing order of importance):

* Some UEFI variables (the authenticated ones) have security impact.
This covers the standardized ones used for Secure Boot (Platform Key,
Key Exchange Keys, white-listed certificates and signatures (DB) and
black-listed certificates and signatures (DBX)).

* To my knowledge, it also includes some similar security-related
variables used by shim / MokManager (where "MOK" is short for Machine
Owner Key); that is, non-volatile variables to which shim delegates the
EFI binaries' verification, from the standardized Secure Boot interfaces.

* UEFI variables can serve as the backend for the linux "pstore"
(persistent store) file system. Pstore in turn can be used to save the
last part of dmesg on a crash. The messages can be re-read at a new boot.

* Firmware uses (reads/writes) a number of variables internally at each
boot. These may not be critical. One example is a variable that helps
reduce UEFI memory map fragmentation over a series of boots.

* OVMF manipulates UEFI boot options on each boot, according to the
bootindex properties (or more directly, according to the "bootorder"
fw_cfg file). Although, admittedly, this is likely the least risky
category of varstore contents.


While I have myself successfully used -- offline only -- internal
snapshots with OVMF guests, hand-waving away the knowledge that the
varstore was never actually snapshotted, I feel real uncomfortable about
silently performing an inherently lossy operation, especially when the
varstore may well have security impact.

Users will not read the documentation (they never do), and I would
rather not field future bug reports about obscure Secure Boot misbehavior.

It is ultimately up to libvirt developers, but IMO, if we continue to
allow this unsafe operation, then the minimum would be:

* in virt-manager, pop up an extra confirmation dialog, with clear
indication that the operation will be lossy and could have security impact,

* in virsh, reject the operation with a similar error message, unless
"--force" or something similar were specified.

* And, because there are other (independent) libvirt client
applications, this would likely require a new flag on the libvirt API
level, so that libvirt itself can reject unsafe snapshotting requests,
regardless of the client application that submits it.

I agree that usability regressions are not nice and will likely generate
bug reports; however, *if* they are in direct conflict with security
improvements, then security improvements trump usability regressions. I
guess we can allow users to ignore security, but they need to be
informed on the spot, and they have to opt in.

I would prefer if we went ahead with this patch; but, again, it's up to
you in the end.

Thanks!
Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Peter Krempa 7 years, 1 month ago
On Thu, Mar 23, 2017 at 17:49:56 +0100, Laszlo Ersek wrote:
> On 03/23/17 15:07, Peter Krempa wrote:
> > On Thu, Mar 23, 2017 at 11:03:02 +0100, Laszlo Ersek wrote:
> >> On 03/23/17 10:54, Peter Krempa wrote:
> >>> On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
> >>>> On 03/23/17 10:29, Peter Krempa wrote:

[...]

> > Due to the facts above I think that there are users that legitimately
> > think that snapshots with pflash loaders work as expected. It's mostly
> > due to the fact that the data are pretty static and OSes don't store
> > anything important there and are able to self-heal some of the problems.
> > 
> > I think we should not disallow this to avoid usability regressions. We
> > can add documentation that states that it's unsafe to do snapshots.
> > Additionally we will need to add support for external snapshots, which
> > currently have similar kind of problems, although fixable.
> 
> The tradeoff is between a seemingly working, but inherently unsafe
> operation, and a clear error message that keeps things safe.
> 
> The UEFI variable store is used for more and different things than you
> mention, such as (in roughly decreasing order of importance):
> 
> * Some UEFI variables (the authenticated ones) have security impact.
> This covers the standardized ones used for Secure Boot (Platform Key,
> Key Exchange Keys, white-listed certificates and signatures (DB) and
> black-listed certificates and signatures (DBX)).
> 
> * To my knowledge, it also includes some similar security-related
> variables used by shim / MokManager (where "MOK" is short for Machine
> Owner Key); that is, non-volatile variables to which shim delegates the
> EFI binaries' verification, from the standardized Secure Boot interfaces.
> 
> * UEFI variables can serve as the backend for the linux "pstore"
> (persistent store) file system. Pstore in turn can be used to save the
> last part of dmesg on a crash. The messages can be re-read at a new boot.
> 
> * Firmware uses (reads/writes) a number of variables internally at each
> boot. These may not be critical. One example is a variable that helps
> reduce UEFI memory map fragmentation over a series of boots.
> 
> * OVMF manipulates UEFI boot options on each boot, according to the
> bootindex properties (or more directly, according to the "bootorder"
> fw_cfg file). Although, admittedly, this is likely the least risky
> category of varstore contents.

I was worried about the secure boot case too. Thanks for pointing out
that OVMF actually uses other variables too.

> While I have myself successfully used -- offline only -- internal
> snapshots with OVMF guests, hand-waving away the knowledge that the
> varstore was never actually snapshotted, I feel real uncomfortable about
> silently performing an inherently lossy operation, especially when the
> varstore may well have security impact.

I agree.

> 
> Users will not read the documentation (they never do), and I would
> rather not field future bug reports about obscure Secure Boot misbehavior.
> 
> It is ultimately up to libvirt developers, but IMO, if we continue to
> allow this unsafe operation, then the minimum would be:
> 
> * in virt-manager, pop up an extra confirmation dialog, with clear
> indication that the operation will be lossy and could have security impact,

I'll file a bug for that ...

> 
> * in virsh, reject the operation with a similar error message, unless
> "--force" or something similar were specified.

and introduce this, called --unsafe in virsh and
VIR_ERR_OPERATION_UNSAFE for notification and
VIR_DOMAIN_SNAPSHOT_CREATE_UNSAFE for overriding.

> 
> * And, because there are other (independent) libvirt client
> applications, this would likely require a new flag on the libvirt API
> level, so that libvirt itself can reject unsafe snapshotting requests,
> regardless of the client application that submits it.
> 
> I agree that usability regressions are not nice and will likely generate
> bug reports; however, *if* they are in direct conflict with security
> improvements, then security improvements trump usability regressions. I
> guess we can allow users to ignore security, but they need to be
> informed on the spot, and they have to opt in.

Agreed.

> 
> I would prefer if we went ahead with this patch; but, again, it's up to
> you in the end.

I'll post a v3 with the option to override it, if users insist that they
don't care about the state of their varstore.

Peter
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Laszlo Ersek 7 years, 1 month ago
On 03/24/17 13:46, Peter Krempa wrote:
> On Thu, Mar 23, 2017 at 17:49:56 +0100, Laszlo Ersek wrote:
>> On 03/23/17 15:07, Peter Krempa wrote:
>>> On Thu, Mar 23, 2017 at 11:03:02 +0100, Laszlo Ersek wrote:
>>>> On 03/23/17 10:54, Peter Krempa wrote:
>>>>> On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
>>>>>> On 03/23/17 10:29, Peter Krempa wrote:
> 
> [...]
> 
>>> Due to the facts above I think that there are users that legitimately
>>> think that snapshots with pflash loaders work as expected. It's mostly
>>> due to the fact that the data are pretty static and OSes don't store
>>> anything important there and are able to self-heal some of the problems.
>>>
>>> I think we should not disallow this to avoid usability regressions. We
>>> can add documentation that states that it's unsafe to do snapshots.
>>> Additionally we will need to add support for external snapshots, which
>>> currently have similar kind of problems, although fixable.
>>
>> The tradeoff is between a seemingly working, but inherently unsafe
>> operation, and a clear error message that keeps things safe.
>>
>> The UEFI variable store is used for more and different things than you
>> mention, such as (in roughly decreasing order of importance):
>>
>> * Some UEFI variables (the authenticated ones) have security impact.
>> This covers the standardized ones used for Secure Boot (Platform Key,
>> Key Exchange Keys, white-listed certificates and signatures (DB) and
>> black-listed certificates and signatures (DBX)).
>>
>> * To my knowledge, it also includes some similar security-related
>> variables used by shim / MokManager (where "MOK" is short for Machine
>> Owner Key); that is, non-volatile variables to which shim delegates the
>> EFI binaries' verification, from the standardized Secure Boot interfaces.
>>
>> * UEFI variables can serve as the backend for the linux "pstore"
>> (persistent store) file system. Pstore in turn can be used to save the
>> last part of dmesg on a crash. The messages can be re-read at a new boot.
>>
>> * Firmware uses (reads/writes) a number of variables internally at each
>> boot. These may not be critical. One example is a variable that helps
>> reduce UEFI memory map fragmentation over a series of boots.
>>
>> * OVMF manipulates UEFI boot options on each boot, according to the
>> bootindex properties (or more directly, according to the "bootorder"
>> fw_cfg file). Although, admittedly, this is likely the least risky
>> category of varstore contents.
> 
> I was worried about the secure boot case too. Thanks for pointing out
> that OVMF actually uses other variables too.
> 
>> While I have myself successfully used -- offline only -- internal
>> snapshots with OVMF guests, hand-waving away the knowledge that the
>> varstore was never actually snapshotted, I feel real uncomfortable about
>> silently performing an inherently lossy operation, especially when the
>> varstore may well have security impact.
> 
> I agree.
> 
>>
>> Users will not read the documentation (they never do), and I would
>> rather not field future bug reports about obscure Secure Boot misbehavior.
>>
>> It is ultimately up to libvirt developers, but IMO, if we continue to
>> allow this unsafe operation, then the minimum would be:
>>
>> * in virt-manager, pop up an extra confirmation dialog, with clear
>> indication that the operation will be lossy and could have security impact,
> 
> I'll file a bug for that ...
> 
>>
>> * in virsh, reject the operation with a similar error message, unless
>> "--force" or something similar were specified.
> 
> and introduce this, called --unsafe in virsh and
> VIR_ERR_OPERATION_UNSAFE for notification and
> VIR_DOMAIN_SNAPSHOT_CREATE_UNSAFE for overriding.
> 
>>
>> * And, because there are other (independent) libvirt client
>> applications, this would likely require a new flag on the libvirt API
>> level, so that libvirt itself can reject unsafe snapshotting requests,
>> regardless of the client application that submits it.
>>
>> I agree that usability regressions are not nice and will likely generate
>> bug reports; however, *if* they are in direct conflict with security
>> improvements, then security improvements trump usability regressions. I
>> guess we can allow users to ignore security, but they need to be
>> informed on the spot, and they have to opt in.
> 
> Agreed.
> 
>>
>> I would prefer if we went ahead with this patch; but, again, it's up to
>> you in the end.
> 
> I'll post a v3 with the option to override it, if users insist that they
> don't care about the state of their varstore.

Thank you, this all sounds good to me.

Could you please test v3 with both live (online) and offline OVMF VMs,
with and without --unsafe? (That is, four cases in total.)

Maybe that goes without saying :), but I figured I would ask for full
testing coverage explicitly, because from
<https://bugzilla.redhat.com/show_bug.cgi?id=1214187> it is not exactly
clear to me whether right now we'd have an actual functional failure for
live OVMF VMs, even with --unsafe.

Thanks!
Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Peter Krempa 7 years, 1 month ago
On Fri, Mar 24, 2017 at 14:04:08 +0100, Laszlo Ersek wrote:
> On 03/24/17 13:46, Peter Krempa wrote:
> > On Thu, Mar 23, 2017 at 17:49:56 +0100, Laszlo Ersek wrote:
> >> On 03/23/17 15:07, Peter Krempa wrote:

[...]

> > 
> > I'll post a v3 with the option to override it, if users insist that they
> > don't care about the state of their varstore.
> 
> Thank you, this all sounds good to me.

I've already posted v3, but forgot to CC you:

https://www.redhat.com/archives/libvir-list/2017-March/msg01151.html


> 
> Could you please test v3 with both live (online) and offline OVMF VMs,
> with and without --unsafe? (That is, four cases in total.)

Sigh. I've thought that I tested this properly, but libvirt code is not
able to detect the failure from taking the snapshot and thus thinks it
created the snapshot properly.

I've got:

{"return": "Device 'pflash1' is writable but does not support snapshots.\r\n", "id": "libvirt-37"}

This is due to the fact that 'savevm' uses HMP passthrough and the code
to detect errors sucks.

Sorry for the noise.

v3 is not necessary with all the fancy code. I'll push v2 with a better
explanation, since it fixes an actual bug :/



--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Laszlo Ersek 7 years, 1 month ago
On 03/24/17 14:23, Peter Krempa wrote:
> On Fri, Mar 24, 2017 at 14:04:08 +0100, Laszlo Ersek wrote:
>> On 03/24/17 13:46, Peter Krempa wrote:
>>> On Thu, Mar 23, 2017 at 17:49:56 +0100, Laszlo Ersek wrote:
>>>> On 03/23/17 15:07, Peter Krempa wrote:
> 
> [...]
> 
>>>
>>> I'll post a v3 with the option to override it, if users insist that they
>>> don't care about the state of their varstore.
>>
>> Thank you, this all sounds good to me.
> 
> I've already posted v3, but forgot to CC you:
> 
> https://www.redhat.com/archives/libvir-list/2017-March/msg01151.html
> 
> 
>>
>> Could you please test v3 with both live (online) and offline OVMF VMs,
>> with and without --unsafe? (That is, four cases in total.)
> 
> Sigh. I've thought that I tested this properly, but libvirt code is not
> able to detect the failure from taking the snapshot and thus thinks it
> created the snapshot properly.
> 
> I've got:
> 
> {"return": "Device 'pflash1' is writable but does not support snapshots.\r\n", "id": "libvirt-37"}
> 
> This is due to the fact that 'savevm' uses HMP passthrough and the code
> to detect errors sucks.
> 
> Sorry for the noise.
> 
> v3 is not necessary with all the fancy code. I'll push v2 with a better
> explanation, since it fixes an actual bug :/

Thanks! It's just as well that we've been thorough enough to track this
down, so that's a plus at least.

Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2] qemu: snapshot: Forbid internal snapshots with pflash firmware
Posted by Jiri Denemark 7 years, 1 month ago
On Thu, Mar 23, 2017 at 10:54:04 +0100, Peter Krempa wrote:
> On Thu, Mar 23, 2017 at 10:48:01 +0100, Laszlo Ersek wrote:
> > On 03/23/17 10:29, Peter Krempa wrote:
> > > If the variable store (<nvram>) file is raw qemu can't do a snapshot of
> > > it and thus the snapshot would be incomplete. QEMU does no reject such
> > > snapshot.
> > > 
> > > Additionally allowing to use a qcow2 variable store backing file would
> > > solve this issue but then it would become eligible to become target of
> > > the memory dump.
> > > 
> > > Offline internal snapshot would be incomplete too with either storage
> > > format since libvirt does not handle the pflash file in this case.
> > > 
> > > Forbid such snapshot so that we can avoid problems.
> > > ---
> > > 
> > > Notes:
> > >     v2:
> > >     - changed error code to OPERATION_UNSUPPORTED (from CONFIG_UNSUPPORTED)
> > >     - dropped mention of QEMU from the error message
> > >     - dropped mentions of OVMF or the firmware itself altoghether, the culprit is
> > >       the pflash device regardless of the software it contains
> > >     - mentioned all the stuff in the commit message and comment
> > >     
> > >     We also will need to introduce a way to snapshot the pflash for external
> > >     snapshots which is currently impossible as well, but fortunately does not
> > >     have inherent drawbacks as internal snapshots.
> > > 
> > >  src/qemu/qemu_driver.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > > 
> > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > > index 676295208..202d190b3 100644
> > > --- a/src/qemu/qemu_driver.c
> > > +++ b/src/qemu/qemu_driver.c
> > > @@ -13873,6 +13873,16 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
> > >          goto cleanup;
> > >      }
> > > 
> > > +    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> > > +     * not snapshot the variable store */
> > > +    if (found_internal &&
> > > +        vm->def->os.loader->type == VIR_DOMAIN_LOADER_TYPE_PFLASH) {
> > > +        virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
> > > +                       _("internal snapshots of a VM with pflash based "
> > > +                         "firmware are not supported"));
> > > +        goto cleanup;
> > > +    }
> > > +
> > >      /* Alter flags to let later users know what we learned.  */
> > >      if (external && !active)
> > >          *flags |= VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY;
> > > 
> 
> I apparently forgot to commit this despite mentioning it in the commit
> message:
> 
> @@ -13873,8 +13873,14 @@ qemuDomainSnapshotPrepare(virConnectPtr conn,
>          goto cleanup;
>      }
> 
> -    /* Internal snapshots don't work with VMs with OVMF loader since qemu does
> -     * not snapshot the variable store */
> +    /* internal snapshots + pflash based loader have the following problems:
> +     * - if the variable store is raw, the snapshot is incomplete
> +     * - alowing a qcow2 image as the varstore would make it eligible to receive
> +     *   the vmstate dump, which would make it huge
> +     * - offline snapshot would not snapshot the varstore at all
> +     *
> +     * Avoid the issues by forbidding this completely.
> +     */

ACK with this updated commit message.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list