[PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves

Elena Reshetova posted 5 patches 2 months, 3 weeks ago
There is a newer version of this series
[PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves
Posted by Elena Reshetova 2 months, 3 weeks ago
== Background ==

ENCLS[EUPDATESVN] is a new SGX instruction [1] which allows enclave
attestation to include information about updated microcode SVN without a
reboot. Before an EUPDATESVN operation can be successful, all SGX memory
(aka. EPC) must be marked as “unused” in the SGX hardware metadata
(aka.EPCM). This requirement ensures that no compromised enclave can
survive the EUPDATESVN procedure and provides an opportunity to generate
new cryptographic assets.

== Patch Contents ==

Attempt to execute ENCLS[EUPDATESVN] every time the first file descriptor
is obtained via sgx_(vepc_)open(). In the most common case the microcode
SVN is already up-to-date, and the operation succeeds without updating SVN.
If it fails with any other error code than SGX_INSUFFICIENT_ENTROPY, this
is considered unexpected and the *open() returns an error. This should not
happen in practice. On contrary, SGX_INSUFFICIENT_ENTROPY might happen due
to a pressure on the system's DRNG (RDSEED) and therefore the *open() can
be safely retried to allow normal enclave operation.

[1] Runtime Microcode Updates with Intel Software Guard Extensions,
https://cdrdv2.intel.com/v1/dl/getContent/648682

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 37 +++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 7615d92bb1ed..c3db49e6e967 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -920,6 +920,8 @@ EXPORT_SYMBOL_GPL(sgx_set_attribute);
 
 /* Counter to count the active SGX users */
 static atomic64_t sgx_usage_count;
+/* Mutex to ensure no concurrent EPC accesses during EUPDATESVN */
+static DEFINE_MUTEX(sgx_svn_lock);
 
 /**
  * sgx_update_svn() - Attempt to call ENCLS[EUPDATESVN].
@@ -937,7 +939,7 @@ static atomic64_t sgx_usage_count;
  *  entropy in RNG.
  * -EIO: Unexpected error, retries are not advisable.
  */
-static int __maybe_unused sgx_update_svn(void)
+static int sgx_update_svn(void)
 {
 	int ret;
 
@@ -983,8 +985,37 @@ static int __maybe_unused sgx_update_svn(void)
 
 int sgx_inc_usage_count(void)
 {
-	atomic64_inc(&sgx_usage_count);
-	return 0;
+	int ret;
+
+	/*
+	 * Increments from non-zero indicate potential other
+	 * active EPC users and EUPDATESVN is not attempted.
+	 */
+	if (atomic64_inc_not_zero(&sgx_usage_count))
+		return 0;
+
+	/*
+	 * Ensure no other concurrent threads can start
+	 * touching EPC while EUPDATESVN is running.
+	 */
+	guard(mutex)(&sgx_svn_lock);
+
+	if (atomic64_inc_not_zero(&sgx_usage_count))
+		return 0;
+
+	/*
+	 * Attempt to call EUPDATESVN since EPC must be
+	 * empty at this point.
+	 */
+	ret = sgx_update_svn();
+
+	/*
+	 * If EUPDATESVN failed, return failure to sgx_(vepc_)open and
+	 * do not increment the sgx_usage_count.
+	 */
+	if (!ret)
+		atomic64_inc(&sgx_usage_count);
+	return ret;
 }
 
 void sgx_dec_usage_count(void)
-- 
2.45.2

Re: [PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves
Posted by Dave Hansen 2 months, 3 weeks ago
On 7/11/25 09:50, Elena Reshetova wrote:
> == Background ==
> 
> ENCLS[EUPDATESVN] is a new SGX instruction [1] which allows enclave
> attestation to include information about updated microcode SVN without a
> reboot. Before an EUPDATESVN operation can be successful, all SGX memory
> (aka. EPC) must be marked as “unused” in the SGX hardware metadata
> (aka.EPCM). This requirement ensures that no compromised enclave can
> survive the EUPDATESVN procedure and provides an opportunity to generate
> new cryptographic assets.
> 
> == Patch Contents ==
> 
> Attempt to execute ENCLS[EUPDATESVN] every time the first file descriptor
> is obtained via sgx_(vepc_)open(). In the most common case the microcode
> SVN is already up-to-date, and the operation succeeds without updating SVN.
> If it fails with any other error code than SGX_INSUFFICIENT_ENTROPY, this
> is considered unexpected and the *open() returns an error. This should not
> happen in practice. On contrary, SGX_INSUFFICIENT_ENTROPY might happen due
> to a pressure on the system's DRNG (RDSEED) and therefore the *open() can
> be safely retried to allow normal enclave operation.

The effort to add paragraphs would be appreciated.

>  int sgx_inc_usage_count(void)
>  {
> -	atomic64_inc(&sgx_usage_count);
> -	return 0;
> +	int ret;
...

For what it does sgx_inc_usage_count() is seriously hard to parse and
complicated. Three logical atomic ops *and* a spinlock? Wouldn't this
suffice?

Just make 'sgx_usage_count' a normal integer and always guard it with a
the existing lock:

int sgx_inc_usage_count(void)
{
	int ret;

	guard(mutex)(&sgx_svn_lock);

	if (sgx_usage_count++ == 0)
		return sgx_update_svn();

	return 0;
}

Yeah, it makes the fast path a spinlock instead of an atomic_inc, but
it's still the same number of atomic ops in the end.

Isn't that a billion times more readable? It barely even needs commenting.

What am I missing?
RE: [PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves
Posted by Reshetova, Elena 2 months, 3 weeks ago
> On 7/11/25 09:50, Elena Reshetova wrote:
> > == Background ==
> >
> > ENCLS[EUPDATESVN] is a new SGX instruction [1] which allows enclave
> > attestation to include information about updated microcode SVN without a
> > reboot. Before an EUPDATESVN operation can be successful, all SGX memory
> > (aka. EPC) must be marked as “unused” in the SGX hardware metadata
> > (aka.EPCM). This requirement ensures that no compromised enclave can
> > survive the EUPDATESVN procedure and provides an opportunity to generate
> > new cryptographic assets.
> >
> > == Patch Contents ==
> >
> > Attempt to execute ENCLS[EUPDATESVN] every time the first file descriptor
> > is obtained via sgx_(vepc_)open(). In the most common case the microcode
> > SVN is already up-to-date, and the operation succeeds without updating SVN.
> > If it fails with any other error code than SGX_INSUFFICIENT_ENTROPY, this
> > is considered unexpected and the *open() returns an error. This should not
> > happen in practice. On contrary, SGX_INSUFFICIENT_ENTROPY might happen
> due
> > to a pressure on the system's DRNG (RDSEED) and therefore the *open() can
> > be safely retried to allow normal enclave operation.
> 
> The effort to add paragraphs would be appreciated.

You mean break this paragrah into two starting from "On contrary" ? 

> 
> >  int sgx_inc_usage_count(void)
> >  {
> > -	atomic64_inc(&sgx_usage_count);
> > -	return 0;
> > +	int ret;
> ...
> 
> For what it does sgx_inc_usage_count() is seriously hard to parse and
> complicated. Three logical atomic ops *and* a spinlock? Wouldn't this
> suffice?
> 
> Just make 'sgx_usage_count' a normal integer and always guard it with a
> the existing lock:
> 
> int sgx_inc_usage_count(void)
> {
> 	int ret;
> 
> 	guard(mutex)(&sgx_svn_lock);
> 
> 	if (sgx_usage_count++ == 0)
> 		return sgx_update_svn();
> 
> 	return 0;
> }
> 
> Yeah, it makes the fast path a spinlock instead of an atomic_inc, but
> it's still the same number of atomic ops in the end.
> 
> Isn't that a billion times more readable? It barely even needs commenting.

Yes, agree, billion times more readable. 

> 
> What am I missing?

I think you put it: this would require a spinlock in the fast path and
*in theory* if we are running many many concurrent enclaves can create 
contention. Whenever this is a realistic *practical deployment problem*
I don't know. On the other hand, since we switched per Sean's suggestion
to taking a lock per sgx_open() vs. per each EPC page manipulation, the
contention is only on enclave creation, so this would require many
concurrent enclaves being constantly created, which even further limits
a problem to a particular deployment. Do we have such deployments
where people constantly create and destroy big number of enclaves?
Maybe instead of trying to find this out we go with your suggestion,
and only if there is a practical problem reported, go back to atomic version? 

Best Regards,
Elena.
Re: [PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves
Posted by Dave Hansen 2 months, 3 weeks ago
On 7/14/25 00:35, Reshetova, Elena wrote:
> I think you put it: this would require a spinlock in the fast path and
> *in theory* if we are running many many concurrent enclaves can create 
> contention

FWIW, my mental model is that spinlocks that are held for short periods
of time are pretty much the same cost as an atomic under contention.

If there are lots of users, the cost of moving the cacheline for the
atomic or the spinlock dominates everything else. It doesn't matter
whether that cacheline is an atomic_t or spinlock_t.

The only difference is that there is _visible_ spinning for a spinlock.
Re: [PATCH v7 5/5] x86/sgx: Enable automatic SVN updates for SGX enclaves
Posted by Dave Hansen 2 months, 3 weeks ago
On 7/14/25 06:54, Dave Hansen wrote:
> On 7/14/25 00:35, Reshetova, Elena wrote:
>> I think you put it: this would require a spinlock in the fast path and
>> *in theory* if we are running many many concurrent enclaves can create 
>> contention
> FWIW, my mental model is that spinlocks that are held for short periods
> of time are pretty much the same cost as an atomic under contention.
> 
> If there are lots of users, the cost of moving the cacheline for the
> atomic or the spinlock dominates everything else. It doesn't matter
> whether that cacheline is an atomic_t or spinlock_t.
> 
> The only difference is that there is _visible_ spinning for a spinlock.

Oh, and I had a brain fart on this one. You've got a mutex, not a spinlock.

But the concept still applies: for small critical sections, the cost of
moving the cacheline dominates the cost of everything else, no matter if
it's a mutex, spinlock or atomic.

Never add complexity unless you're getting actual, real-world
performance out of it. In this case, the only thing you'd _maybe_
improve with added complexity is an open()/close() loop on /dev/sgx,
which is completely unrealistic.