[v1] tools: provide virt-qemu-sev-validate for SEV(-ES) launch attestation

[libvirt PATCH 12/12] docs/manpages: add checklist of problems for SEV attestation

Posted by Daniel P. Berrangé 3 years, 4 months ago

Despite efforts to make the virt-qemu-sev-validate tool friendly, it is
a certainty that almost everyone who tries it will hit false negative
results, getting a failure despite the VM being trustworthy.

Diagnosing these problems is no easy matter, especially for those not
familiar with SEV/SEV-ES in general. This extra docs text attempts to
set out a checklist of items to look at to identify what went wrong.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/manpages/virt-qemu-sev-validate.rst | 112 +++++++++++++++++++++++
 1 file changed, 112 insertions(+)

diff --git a/docs/manpages/virt-qemu-sev-validate.rst b/docs/manpages/virt-qemu-sev-validate.rst
index 7542bea9aa..e0c18f2d20 100644
--- a/docs/manpages/virt-qemu-sev-validate.rst
+++ b/docs/manpages/virt-qemu-sev-validate.rst
@@ -437,6 +437,118 @@ inject a disk password on success:
        --domain fedora34x86_64 \
        --disk-password passwd.txt
 
+COMMON MISTAKES CHECKLIST
+=========================
+
+The complexity of configuring a guest and validating its boot measurement
+means it is very likely to see the failure::
+
+   ERROR: Measurement does not match, VM is not trustworthy
+
+This error message assumes the worst, but in most cases will failure will be
+a result of either mis-configuring the guest, or passing the wrong information
+when trying to validate it. The following information is a guide for what
+items to check in order to stand the best chance of diagnosing the problem
+
+* Check the VM configuration for the DH certificate and session
+  blob in the libvirt guest XML.
+
+  The content for these fields should be in base64 format, which is
+  what ``sevctl session`` generates. Other tools may generate the files
+  in binary format, so ensure it has been correctly converted to base64.
+
+* Check the VM configuration policy value matches the session blob
+
+  The ``<policy>`` value in libvirt guest XML has to match the value
+  passed to the ``sevctl session`` command.
+
+* Check the correct TIK/TEK keypair are passed
+
+  The TIK/TEK keypair are uniquely tied to each DH cert and session
+  blob. Make sure that the TIK/TEK keypair passed to this program
+  the ones matched to the DH cert and session blob configured for
+  the libvirt guest XML. This is one of the most common mistakes.
+  Further ensure that the TIK and TEK files are not swapped.
+
+* Check the firmware binary matches the one used to boot
+
+  The firmware binary content is part of the data covered by the
+  launch measurement. Ensure that the firmware binary passed to
+  this program matches the one used to launch the guest. The
+  hypervisor host will periodically get software updates which
+  introduce a new firmware binary version.
+
+* Check the kernel, initrd and cmdline match the one used to boot
+
+  If the guest is configured to use direct kernel boot, check that
+  the kernel, initrd and cmdline passed to this program match the
+  ones used to boot the guest. In the kernel cmdline whitespace
+  must be preserved exactly, including any leading or trailing
+  spaces.
+
+* Check whether the kernel hash measurement is enabled
+
+  The ``kernelHashes`` property in the libvirt guest XML controls
+  whether hashes of the kernel, initrd and cmdline content are
+  covered by the boot measurement. If enabled, then the matching
+  content must be passed to this program. UIf disabled, then
+  the content must **NOT** be passed.
+
+* Check that the correct measurement hash is passed
+
+  The measurement hash includes a nonce, so it will be different
+  on every boot attempt. Thus when validating the measuremnt it
+  is important ensure the most recent measurement is used.
+
+* Check the correct VMSA blobs / CPU SKU values for the host are used
+
+  The VMSA blobs provide the initial register state for the
+  boot CPU and any additional CPUs. One of the registers
+  encodes the CPU SKU (family, model, stepping) of the physical
+  host CPU. Make sure that the VMSA blob used for validation
+  is one that matches the SKU of the host the guest is booted
+  on. Passing the CPU SKU values directly to the tool can
+  reduce the likelihood of using the wrong ones.
+
+* Check the CPU count is correct
+
+  When passing VMSA blobs for SEV-ES guests, the number of CPUs
+  present will influence the measurement result. Ensure that the
+  correct vCPU count is used corresponding to the guest boot
+  attempt.
+
+
+Best practice is to run this tool in completely offline mode and pass
+all information as explicit command line parameters. When debugging
+failures, however, it can be useful to tell it to connect to libvirt
+and fetch information. If connecting to a remote libvirt instance,
+it will fetch any information that can be trusted, which is the basic
+VM launch state data. It will also sanity check the XML configuration
+to identify some common mistakes. If the ``--insecure`` flag is passed
+it can extract some configuration information and use that for the
+attestation process.
+
+If the mistake still can't be identified, then this tool can be run
+on the virtualization host. In that scenario the only three command
+line parameters required are for the TIK, TEK and libvirt domain
+name. It should be able to automatically determine all the other
+information required. If it still reports a failure, this points
+very strongly to the TIK/TEK pair not maching the configured
+DH certificate and session blob.
+
+The ``--debug`` flag will display hashes and/or hex dumps for various
+pieces of information used in the attestation process. Comparing the
+``--debug`` output from running on the hypervisor host, against that
+obtained when running in offline mode can give further guidance to
+which parameter is inconsistent.
+
+As mentioned earlier in this document, bear in mind that in general
+any attestation answers obtained from running on the hypervisor host
+should not be trusted. So if a configuration mistake is identified
+it is strongly recommended to re-run the attestation in offline mode
+on a trusted machine.
+
+  
 EXIT STATUS
 ===========
 
-- 
2.37.3

Re: [libvirt PATCH 12/12] docs/manpages: add checklist of problems for SEV attestation

Posted by Cole Robinson 3 years, 3 months ago

On 10/7/22 7:43 AM, Daniel P. Berrangé wrote:
> Despite efforts to make the virt-qemu-sev-validate tool friendly, it is
> a certainty that almost everyone who tries it will hit false negative
> results, getting a failure despite the VM being trustworthy.
> 
> Diagnosing these problems is no easy matter, especially for those not
> familiar with SEV/SEV-ES in general. This extra docs text attempts to
> set out a checklist of items to look at to identify what went wrong.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  docs/manpages/virt-qemu-sev-validate.rst | 112 +++++++++++++++++++++++
>  1 file changed, 112 insertions(+)
> 
> diff --git a/docs/manpages/virt-qemu-sev-validate.rst b/docs/manpages/virt-qemu-sev-validate.rst
> index 7542bea9aa..e0c18f2d20 100644
> --- a/docs/manpages/virt-qemu-sev-validate.rst
> +++ b/docs/manpages/virt-qemu-sev-validate.rst
> @@ -437,6 +437,118 @@ inject a disk password on success:
>         --domain fedora34x86_64 \
>         --disk-password passwd.txt
>  
> +COMMON MISTAKES CHECKLIST
> +=========================
> +
> +The complexity of configuring a guest and validating its boot measurement
> +means it is very likely to see the failure::
> +
> +   ERROR: Measurement does not match, VM is not trustworthy
> +
> +This error message assumes the worst, but in most cases will failure will be
> +a result of either mis-configuring the guest, or passing the wrong information
> +when trying to validate it. The following information is a guide for what
> +items to check in order to stand the best chance of diagnosing the problem
> +
> +* Check the VM configuration for the DH certificate and session
> +  blob in the libvirt guest XML.
> +
> +  The content for these fields should be in base64 format, which is
> +  what ``sevctl session`` generates. Other tools may generate the files
> +  in binary format, so ensure it has been correctly converted to base64.
> +
> +* Check the VM configuration policy value matches the session blob
> +
> +  The ``<policy>`` value in libvirt guest XML has to match the value
> +  passed to the ``sevctl session`` command.
> +

FWIW In this case, qemu will explicitly error. From 7.0.0-6.fc36:

-accel kvm: sev_launch_start: LAUNCH_START ret=1 fw_error=11 'Bad
measurement'

I think it's worth putting some subset of that qemu error string at the
top of this section too. If users hit it, going through the checklist
here may solve their issue.

For example, If you're flailing around with sevctl like I have, some of
the sub commands will invalidate all your previous generated
session/dhCert blobs, and subsequent VM boots will fail as above.
`sevctl reset` and/or `sevctl rotate`. That's probably obscure enough to
not need documenting, but if the first item here leads to re-running
sevctl session then you'll fix your problem :)

Thanks,
Cole

Re: [libvirt PATCH 12/12] docs/manpages: add checklist of problems for SEV attestation

Posted by Daniel P. Berrangé 3 years, 3 months ago

On Sun, Oct 16, 2022 at 03:27:39PM -0400, Cole Robinson wrote:
> On 10/7/22 7:43 AM, Daniel P. Berrangé wrote:
> > Despite efforts to make the virt-qemu-sev-validate tool friendly, it is
> > a certainty that almost everyone who tries it will hit false negative
> > results, getting a failure despite the VM being trustworthy.
> > 
> > Diagnosing these problems is no easy matter, especially for those not
> > familiar with SEV/SEV-ES in general. This extra docs text attempts to
> > set out a checklist of items to look at to identify what went wrong.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > ---
> >  docs/manpages/virt-qemu-sev-validate.rst | 112 +++++++++++++++++++++++
> >  1 file changed, 112 insertions(+)
> > 
> > diff --git a/docs/manpages/virt-qemu-sev-validate.rst b/docs/manpages/virt-qemu-sev-validate.rst
> > index 7542bea9aa..e0c18f2d20 100644
> > --- a/docs/manpages/virt-qemu-sev-validate.rst
> > +++ b/docs/manpages/virt-qemu-sev-validate.rst
> > @@ -437,6 +437,118 @@ inject a disk password on success:
> >         --domain fedora34x86_64 \
> >         --disk-password passwd.txt
> >  
> > +COMMON MISTAKES CHECKLIST
> > +=========================
> > +
> > +The complexity of configuring a guest and validating its boot measurement
> > +means it is very likely to see the failure::
> > +
> > +   ERROR: Measurement does not match, VM is not trustworthy
> > +
> > +This error message assumes the worst, but in most cases will failure will be
> > +a result of either mis-configuring the guest, or passing the wrong information
> > +when trying to validate it. The following information is a guide for what
> > +items to check in order to stand the best chance of diagnosing the problem
> > +
> > +* Check the VM configuration for the DH certificate and session
> > +  blob in the libvirt guest XML.
> > +
> > +  The content for these fields should be in base64 format, which is
> > +  what ``sevctl session`` generates. Other tools may generate the files
> > +  in binary format, so ensure it has been correctly converted to base64.
> > +
> > +* Check the VM configuration policy value matches the session blob
> > +
> > +  The ``<policy>`` value in libvirt guest XML has to match the value
> > +  passed to the ``sevctl session`` command.
> > +
> 
> FWIW In this case, qemu will explicitly error. From 7.0.0-6.fc36:
> 
> -accel kvm: sev_launch_start: LAUNCH_START ret=1 fw_error=11 'Bad
> measurement'

Oh, I had forgotten that

> 
> I think it's worth putting some subset of that qemu error string at the
> top of this section too. If users hit it, going through the checklist
> here may solve their issue.
> 
> For example, If you're flailing around with sevctl like I have, some of
> the sub commands will invalidate all your previous generated
> session/dhCert blobs, and subsequent VM boots will fail as above.
> `sevctl reset` and/or `sevctl rotate`. That's probably obscure enough to
> not need documenting, but if the first item here leads to re-running
> sevctl session then you'll fix your problem :)

Hmm, yes, I'd stayed away from reset/rotate to avoid trouble :-)


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|