SGX enclaves have an attestation mechanism. An enclave might,
for instance, need to attest to its state before it is given
a special decryption key. Since SGX must trust the CPU microcode,
attestation incorporates the microcode versions of all processors
on the system and is affected by microcode updates. This enables
deployment decisions based on the microcode version.
For example, an enclave might be denied a decryption key if it
runs on a system that has old microcode without a specific mitigation.
Unfortunately, this attestation metric (called CPUSVN) is only a snapshot.
When the kernel first uses SGX (successfully executes any ENCLS instruction),
SGX inspects all CPUs in the system and incorporates a record of their
microcode versions into CPUSVN. CPUSVN is only automatically updated at reboot.
This means that, although the microcode has been updated, enclaves can never
attest to this fact. Enclaves are stuck attesting to the old version until
a reboot.
The SGX architecture has an alternative to these reboots: the ENCLS[EUPDATESVN]
instruction [1]. It allows another snapshot to be taken to update CPUSVN
after a runtime microcode update without a reboot.
Whenever a microcode update affects SGX, the SGX attestation architecture
assumes that all running enclaves and cryptographic assets (like internal
SGX encryption keys) have been compromised. To mitigate the impact of this
presumed compromise, EUPDATESVN success requires that all SGX memory to be
marked as “unused” and its contents destroyed. This requirement ensures
that no compromised enclave can survive the EUPDATESVN procedure and provides
an opportunity to generate new cryptographic assets.
Attempt to execute EUPDATESVN on the first open of sgx_(vepc)open().
If EUPDATESVN fails with any other error code than SGX_INSUFFICIENT_ENTROPY,
this is considered unexpected and the open() returns an error. This
should not happen in practice. On contrary SGX_INSUFFICIENT_ENTROPY might
happen due to a pressure on the system DRNG (RDSEED) and therefore
the open() is not blocked to allow normal enclave operation.
[1] Runtime Microcode Updates with Intel Software Guard Extensions,
https://cdrdv2.intel.com/v1/dl/getContent/648682
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
arch/x86/kernel/cpu/sgx/driver.c | 23 +++++++++++++-------
arch/x86/kernel/cpu/sgx/main.c | 36 ++++++++++++++++++++++++++++++--
arch/x86/kernel/cpu/sgx/virt.c | 16 +++++++++++---
3 files changed, 63 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index b5ffe104af4c..bde06b6755f2 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -19,10 +19,15 @@ static int sgx_open(struct inode *inode, struct file *file)
struct sgx_encl *encl;
int ret;
- sgx_inc_usage_count();
+ ret = sgx_inc_usage_count();
+ if (ret)
+ return -EBUSY;
+
encl = kzalloc(sizeof(*encl), GFP_KERNEL);
- if (!encl)
- return -ENOMEM;
+ if (!encl) {
+ ret = -ENOMEM;
+ goto err_usage_count;
+ }
kref_init(&encl->refcount);
xa_init(&encl->page_array);
@@ -32,14 +37,18 @@ static int sgx_open(struct inode *inode, struct file *file)
spin_lock_init(&encl->mm_lock);
ret = init_srcu_struct(&encl->srcu);
- if (ret) {
- kfree(encl);
- return ret;
- }
+ if (ret)
+ goto err_encl;
file->private_data = encl;
return 0;
+
+err_encl:
+ kfree(encl);
+err_usage_count:
+ sgx_dec_usage_count();
+ return ret;
}
static int sgx_release(struct inode *inode, struct file *file)
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index fd71e2ddca59..d58e0c46cbf9 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -917,6 +917,8 @@ EXPORT_SYMBOL_GPL(sgx_set_attribute);
/* Counter to count the active SGX users */
static atomic64_t sgx_usage_count;
+/* Mutex to ensure EUPDATESVN is called when EPC is empty */
+static DEFINE_MUTEX(sgx_svn_lock);
/**
* sgx_updatesvn() - Attempt to call ENCLS[EUPDATESVN]
@@ -976,8 +978,38 @@ static int sgx_update_svn(void)
int sgx_inc_usage_count(void)
{
- atomic64_inc(&sgx_usage_count);
- return 0;
+ int ret;
+
+ /*
+ * Increments from non-zero indicate EPC other
+ * active EPC users and EUPDATESVN is not attempted.
+ */
+ if (atomic64_inc_not_zero(&sgx_usage_count))
+ return 0;
+
+ /*
+ * Ensure no other concurrent threads can start
+ * touching EPC while EUPDATESVN is running.
+ */
+ guard(mutex)(&sgx_svn_lock);
+
+ if (atomic64_inc_not_zero(&sgx_usage_count))
+ return 0;
+
+ /*
+ * Attempt to call EUPDATESVN since EPC must be
+ * empty at this point.
+ */
+ ret = sgx_update_svn();
+
+ /*
+ * If EUPDATESVN failed with a non-expected error
+ * code, return failure to sgx_(vepc_)open and
+ * do not increment the sgx_usage_count.
+ */
+ if (!ret)
+ atomic64_inc(&sgx_usage_count);
+ return ret;
}
void sgx_dec_usage_count(void)
diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c
index 83de0907f32c..e6e29c09c3b9 100644
--- a/arch/x86/kernel/cpu/sgx/virt.c
+++ b/arch/x86/kernel/cpu/sgx/virt.c
@@ -262,17 +262,27 @@ static int sgx_vepc_release(struct inode *inode, struct file *file)
static int sgx_vepc_open(struct inode *inode, struct file *file)
{
struct sgx_vepc *vepc;
+ int ret;
+
+ ret = sgx_inc_usage_count();
+ if (ret)
+ return -EBUSY;
- sgx_inc_usage_count();
vepc = kzalloc(sizeof(struct sgx_vepc), GFP_KERNEL);
- if (!vepc)
- return -ENOMEM;
+ if (!vepc) {
+ ret = -ENOMEM;
+ goto err_usage_count;
+ }
mutex_init(&vepc->lock);
xa_init(&vepc->page_array);
file->private_data = vepc;
return 0;
+
+err_usage_count:
+ sgx_dec_usage_count();
+ return ret;
}
static long sgx_vepc_ioctl(struct file *file,
--
2.45.2
On Mon, May 19, 2025 at 10:24:31AM +0300, Elena Reshetova wrote: > SGX enclaves have an attestation mechanism. An enclave might, > for instance, need to attest to its state before it is given > a special decryption key. Since SGX must trust the CPU microcode, > attestation incorporates the microcode versions of all processors > on the system and is affected by microcode updates. This enables > deployment decisions based on the microcode version. > For example, an enclave might be denied a decryption key if it > runs on a system that has old microcode without a specific mitigation. > > Unfortunately, this attestation metric (called CPUSVN) is only a snapshot. > When the kernel first uses SGX (successfully executes any ENCLS instruction), > SGX inspects all CPUs in the system and incorporates a record of their > microcode versions into CPUSVN. CPUSVN is only automatically updated at reboot. > This means that, although the microcode has been updated, enclaves can never > attest to this fact. Enclaves are stuck attesting to the old version until > a reboot. > > The SGX architecture has an alternative to these reboots: the ENCLS[EUPDATESVN] > instruction [1]. It allows another snapshot to be taken to update CPUSVN > after a runtime microcode update without a reboot. > > Whenever a microcode update affects SGX, the SGX attestation architecture > assumes that all running enclaves and cryptographic assets (like internal > SGX encryption keys) have been compromised. To mitigate the impact of this > presumed compromise, EUPDATESVN success requires that all SGX memory to be > marked as “unused” and its contents destroyed. This requirement ensures > that no compromised enclave can survive the EUPDATESVN procedure and provides > an opportunity to generate new cryptographic assets. > > Attempt to execute EUPDATESVN on the first open of sgx_(vepc)open(). > If EUPDATESVN fails with any other error code than SGX_INSUFFICIENT_ENTROPY, > this is considered unexpected and the open() returns an error. This > should not happen in practice. On contrary SGX_INSUFFICIENT_ENTROPY might > happen due to a pressure on the system DRNG (RDSEED) and therefore > the open() is not blocked to allow normal enclave operation. > > [1] Runtime Microcode Updates with Intel Software Guard Extensions, > https://cdrdv2.intel.com/v1/dl/getContent/648682 I'd hope, despite having the wealth of information in it, this commit message would a go round or few of editing, and nail the gist of this commit better. Then it would be easier in future review the choices made. BR, Jarkko
> > On Mon, May 19, 2025 at 10:24:31AM +0300, Elena Reshetova wrote: > > SGX enclaves have an attestation mechanism. An enclave might, > > for instance, need to attest to its state before it is given > > a special decryption key. Since SGX must trust the CPU microcode, > > attestation incorporates the microcode versions of all processors > > on the system and is affected by microcode updates. This enables > > deployment decisions based on the microcode version. > > For example, an enclave might be denied a decryption key if it > > runs on a system that has old microcode without a specific mitigation. > > > > Unfortunately, this attestation metric (called CPUSVN) is only a snapshot. > > When the kernel first uses SGX (successfully executes any ENCLS > instruction), > > SGX inspects all CPUs in the system and incorporates a record of their > > microcode versions into CPUSVN. CPUSVN is only automatically updated at > reboot. > > This means that, although the microcode has been updated, enclaves can > never > > attest to this fact. Enclaves are stuck attesting to the old version until > > a reboot. > > > > The SGX architecture has an alternative to these reboots: the > ENCLS[EUPDATESVN] > > instruction [1]. It allows another snapshot to be taken to update CPUSVN > > after a runtime microcode update without a reboot. > > > > Whenever a microcode update affects SGX, the SGX attestation architecture > > assumes that all running enclaves and cryptographic assets (like internal > > SGX encryption keys) have been compromised. To mitigate the impact of > this > > presumed compromise, EUPDATESVN success requires that all SGX memory > to be > > marked as “unused” and its contents destroyed. This requirement ensures > > that no compromised enclave can survive the EUPDATESVN procedure and > provides > > an opportunity to generate new cryptographic assets. > > > > Attempt to execute EUPDATESVN on the first open of sgx_(vepc)open(). > > If EUPDATESVN fails with any other error code than > SGX_INSUFFICIENT_ENTROPY, > > this is considered unexpected and the open() returns an error. This > > should not happen in practice. On contrary SGX_INSUFFICIENT_ENTROPY > might > > happen due to a pressure on the system DRNG (RDSEED) and therefore > > the open() is not blocked to allow normal enclave operation. > > > > [1] Runtime Microcode Updates with Intel Software Guard Extensions, > > https://cdrdv2.intel.com/v1/dl/getContent/648682 > > I'd hope, despite having the wealth of information in it, this commit > message would a go round or few of editing, and nail the gist of this > commit better. Then it would be easier in future review the choices > made. I will try to trim the background text more. Best Regards, Elena.
* Elena Reshetova <elena.reshetova@intel.com> wrote:
> @@ -19,10 +19,15 @@ static int sgx_open(struct inode *inode, struct file *file)
> struct sgx_encl *encl;
> int ret;
>
> + ret = sgx_inc_usage_count();
> + if (ret)
> + return -EBUSY;
So if sgx_inc_usage_count() returns nonzero, it's in use already and we
return -EBUSY, right?
But:
> int sgx_inc_usage_count(void)
> {
> + int ret;
> +
> + /*
> + * Increments from non-zero indicate EPC other
> + * active EPC users and EUPDATESVN is not attempted.
> + */
> + if (atomic64_inc_not_zero(&sgx_usage_count))
> + return 0;
If sgx_usage_count is 1 here (ie. it's busy), this will return *zero*,
and sgx_open() will not run into the -EBUSY condition and will continue
assuming it has claimed the usage count, while it hasn't ...
Furthermore, in the sgx_open() error paths we can then run into
sgx_dec_usage_count(), which will merrily underflow the counter into
negative:
+void sgx_dec_usage_count(void)
+{
+ atomic64_dec(&sgx_usage_count);
+}
How is this all supposed to work?
Thanks,
Ingo
> * Elena Reshetova <elena.reshetova@intel.com> wrote:
>
> > @@ -19,10 +19,15 @@ static int sgx_open(struct inode *inode, struct file
> *file)
> > struct sgx_encl *encl;
> > int ret;
> >
> > + ret = sgx_inc_usage_count();
> > + if (ret)
> > + return -EBUSY;
>
> So if sgx_inc_usage_count() returns nonzero, it's in use already and we
> return -EBUSY, right?
I guess my selection of error code here was wrong.
The intended logic is if sgx_inc_usage_count() returns nonzero,
the *incrementing of counter failed* (due to failed EUPDATESVN)
and we want to stop and report error.
>
> But:
>
> > int sgx_inc_usage_count(void)
> > {
> > + int ret;
> > +
> > + /*
> > + * Increments from non-zero indicate EPC other
> > + * active EPC users and EUPDATESVN is not attempted.
> > + */
> > + if (atomic64_inc_not_zero(&sgx_usage_count))
> > + return 0;
>
> If sgx_usage_count is 1 here (ie. it's busy), this will return *zero*,
> and sgx_open() will not run into the -EBUSY condition and will continue
> assuming it has claimed the usage count, while it hasn't ...
Yes, meaning is different, see above.
>
> Furthermore, in the sgx_open() error paths we can then run into
What error paths? In case sgx_inc_usage_count() fails, we exit
immediately.
> sgx_dec_usage_count(), which will merrily underflow the counter into
> negative:
>
> +void sgx_dec_usage_count(void)
> +{
> + atomic64_dec(&sgx_usage_count);
> +}
>
> How is this all supposed to work?
Looks like I need more explanation on the counter and a better error
returned to userspace than -EBUSY. Maybe EIO then?
Best Regards,
Elena.
>
> Thanks,
>
> Ingo
* Reshetova, Elena <elena.reshetova@intel.com> wrote:
>
> > * Elena Reshetova <elena.reshetova@intel.com> wrote:
> >
> > > @@ -19,10 +19,15 @@ static int sgx_open(struct inode *inode, struct file
> > *file)
> > > struct sgx_encl *encl;
> > > int ret;
> > >
> > > + ret = sgx_inc_usage_count();
> > > + if (ret)
> > > + return -EBUSY;
> >
> > So if sgx_inc_usage_count() returns nonzero, it's in use already and we
> > return -EBUSY, right?
>
> I guess my selection of error code here was wrong.
> The intended logic is if sgx_inc_usage_count() returns nonzero,
> the *incrementing of counter failed* (due to failed EUPDATESVN)
> and we want to stop and report error.
>
> >
> > But:
> >
> > > int sgx_inc_usage_count(void)
> > > {
> > > + int ret;
> > > +
> > > + /*
> > > + * Increments from non-zero indicate EPC other
> > > + * active EPC users and EUPDATESVN is not attempted.
> > > + */
> > > + if (atomic64_inc_not_zero(&sgx_usage_count))
> > > + return 0;
> >
> > If sgx_usage_count is 1 here (ie. it's busy), this will return *zero*,
> > and sgx_open() will not run into the -EBUSY condition and will continue
> > assuming it has claimed the usage count, while it hasn't ...
>
> Yes, meaning is different, see above.
So that's rather convoluted:
atomic64_inc_not_zero(): returns 1 on successful increase, 0 on failure
sgx_inc_usage_count(): returns 0 on successful increase, 1 on failure
sgx_open(): returns 0 on successful increase, -EBUSY on failure
Could we at least standardize sgx_inc_usage_count() on -EBUSY in the
failure case, so it's a more obvious pattern:
+ ret = sgx_inc_usage_count();
+ if (ret < 0)
+ return ret;
Thanks,
Ingo
> > Yes, meaning is different, see above. > > So that's rather convoluted: > > atomic64_inc_not_zero(): returns 1 on successful increase, 0 on > failure > sgx_inc_usage_count(): returns 0 on successful increase, 1 on failure > sgx_open(): returns 0 on successful increase, -EBUSY on failure > > Could we at least standardize sgx_inc_usage_count() on -EBUSY in the > failure case, so it's a more obvious pattern: > > + ret = sgx_inc_usage_count(); > + if (ret < 0) > + return ret; > Yes, will rewrite accordingly. Especially since I have to return two different error codes into sgx_open() now to indicate different nature of issues with running EUDPATESVN: temporal failure due to lack of entropy (-EAGAIN) and potentially persistent problem when getting unexpected error codes (-EIO). Best Regards, Elena.
* Reshetova, Elena <elena.reshetova@intel.com> wrote: > > > Yes, meaning is different, see above. > > > > So that's rather convoluted: > > > > atomic64_inc_not_zero(): returns 1 on successful increase, 0 on > > failure > > sgx_inc_usage_count(): returns 0 on successful increase, 1 on failure > > sgx_open(): returns 0 on successful increase, -EBUSY on failure > > > > Could we at least standardize sgx_inc_usage_count() on -EBUSY in the > > failure case, so it's a more obvious pattern: > > > > + ret = sgx_inc_usage_count(); > > + if (ret < 0) > > + return ret; > > > > Yes, will rewrite accordingly. Especially since I have to return two different > error codes into sgx_open() now to indicate different nature of issues with > running EUDPATESVN: temporal failure due to lack of entropy (-EAGAIN) > and potentially persistent problem when getting unexpected error codes > (-EIO). Makes sense! Thanks, Ingo
© 2016 - 2025 Red Hat, Inc.