[PATCH v2 2/2] x86/sgx: Log information when a node lacks an EPC section

Aaron Lu posted 2 patches 1 year, 3 months ago
Posted by Aaron Lu 1 year, 3 months ago
For optimized performance, firmware typically distributes EPC sections
evenly across different NUMA nodes. However, there are scenarios where
a node may have both CPUs and memory but no EPC section configured. For
example, in an 8-socket system with a Sub-Numa-Cluster=2 setup, there
are a total of 16 nodes. Given that the maximum number of supported EPC
sections is 8, it is simply not feasible to assign one EPC section to
each node. This configuration is not incorrect - SGX will still operate
correctly; it is just not optimized from a NUMA standpoint.

For this reason, log a message when a node with both CPUs and memory
lacks an EPC section. This will provide users with a hint as to why they
might be experiencing less-than-ideal performance when running SGX
enclaves.

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
---
 arch/x86/kernel/cpu/sgx/main.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index 694fcf7a5e3a..3a79105455f1 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -848,6 +848,13 @@ static bool __init sgx_page_cache_init(void)
 		return false;
 	}
 
+	for_each_online_node(nid) {
+		if (!node_isset(nid, sgx_numa_mask) &&
+		    node_state(nid, N_MEMORY) && node_state(nid, N_CPU))
+			pr_info("node%d has both CPUs and memory but doesn't have an EPC section\n",
+				nid);
+	}
+
 	return true;
 }
 
-- 
2.45.2
Re: [PATCH v2 2/2] x86/sgx: Log information when a node lacks an EPC section
Posted by Huang, Kai 1 year, 3 months ago

On 5/09/2024 8:08 pm, Aaron Lu wrote:
> For optimized performance, firmware typically distributes EPC sections
> evenly across different NUMA nodes. However, there are scenarios where
> a node may have both CPUs and memory but no EPC section configured. For
> example, in an 8-socket system with a Sub-Numa-Cluster=2 setup, there
> are a total of 16 nodes. Given that the maximum number of supported EPC
> sections is 8, it is simply not feasible to assign one EPC section to
> each node. This configuration is not incorrect - SGX will still operate
> correctly; it is just not optimized from a NUMA standpoint.
> 
> For this reason, log a message when a node with both CPUs and memory
> lacks an EPC section. This will provide users with a hint as to why they
> might be experiencing less-than-ideal performance when running SGX
> enclaves.
> 
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>

Acked-by: Kai Huang <kai.huang@intel.com>
Re: [PATCH v2 2/2] x86/sgx: Log information when a node lacks an EPC section
Posted by Jarkko Sakkinen 1 year, 3 months ago
On Thu Sep 5, 2024 at 11:08 AM EEST, Aaron Lu wrote:
> For optimized performance, firmware typically distributes EPC sections
> evenly across different NUMA nodes. However, there are scenarios where
> a node may have both CPUs and memory but no EPC section configured. For
> example, in an 8-socket system with a Sub-Numa-Cluster=2 setup, there
> are a total of 16 nodes. Given that the maximum number of supported EPC
> sections is 8, it is simply not feasible to assign one EPC section to
> each node. This configuration is not incorrect - SGX will still operate
> correctly; it is just not optimized from a NUMA standpoint.
>
> For this reason, log a message when a node with both CPUs and memory
> lacks an EPC section. This will provide users with a hint as to why they
> might be experiencing less-than-ideal performance when running SGX
> enclaves.
>
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Aaron Lu <aaron.lu@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/main.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 694fcf7a5e3a..3a79105455f1 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -848,6 +848,13 @@ static bool __init sgx_page_cache_init(void)
>  		return false;
>  	}
>  
> +	for_each_online_node(nid) {
> +		if (!node_isset(nid, sgx_numa_mask) &&
> +		    node_state(nid, N_MEMORY) && node_state(nid, N_CPU))
> +			pr_info("node%d has both CPUs and memory but doesn't have an EPC section\n",
> +				nid);

Is this enough, or is there anything that would need to be done
automatically if this happens? With a tracepoint you could react to such
an event, but I'm totally fine with this.

> +	}
> +
>  	return true;
>  }
>  

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
Re: [PATCH v2 2/2] x86/sgx: Log information when a node lacks an EPC section
Posted by Dave Hansen 1 year, 3 months ago
On 9/5/24 07:24, Jarkko Sakkinen wrote:
>> +	for_each_online_node(nid) {
>> +		if (!node_isset(nid, sgx_numa_mask) &&
>> +		    node_state(nid, N_MEMORY) && node_state(nid, N_CPU))
>> +			pr_info("node%d has both CPUs and memory but doesn't have an EPC section\n",
>> +				nid);
> Is this enough, or is there anything that would need to be done
> automatically if this happens? With a tracepoint you could react to such
> an event, but I'm totally fine with this.

There's always the theoretical chance that there are nodes being
hotplugged or that this is all running in a VM that has some whacky
topology.

But this simple pr_info() provides good coverage for the common cases
and shouldn't be too much of a burden for the weirdos.
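
For readers who want to experiment with the condition outside the kernel, the check in the patch can be approximated in userspace. The following is only a rough sketch, not the kernel code: struct node_topology, mask_test(), report_nodes_without_epc() and example_8_socket_snc2() are hypothetical names, plain 64-bit bitmasks stand in for the kernel's nodemasks (sgx_numa_mask, N_MEMORY, N_CPU), and the placement of EPC sections on even-numbered nodes is an assumption made purely to reproduce the 16-node, 8-section example from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_NODES 64

/*
 * Hypothetical userspace stand-ins for the kernel's nodemasks.
 * Bit n set means node n has the property.
 */
struct node_topology {
	unsigned long long online;  /* nodes that are online */
	unsigned long long has_mem; /* stands in for N_MEMORY */
	unsigned long long has_cpu; /* stands in for N_CPU */
	unsigned long long has_epc; /* stands in for sgx_numa_mask */
};

static bool mask_test(unsigned long long mask, int nid)
{
	return (mask >> nid) & 1ULL;
}

/*
 * Mirror the condition from the patch: for every online node, report it
 * when it has CPUs and memory but no EPC section.
 * Returns the number of nodes reported.
 */
int report_nodes_without_epc(const struct node_topology *t)
{
	int missing = 0;

	assert(t != NULL);

	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (!mask_test(t->online, nid))
			continue;

		if (!mask_test(t->has_epc, nid) &&
		    mask_test(t->has_mem, nid) && mask_test(t->has_cpu, nid)) {
			printf("node%d has both CPUs and memory but doesn't have an EPC section\n",
			       nid);
			missing++;
		}
	}

	return missing;
}

/*
 * The commit message's example: 8 sockets with Sub-NUMA-Cluster=2 gives
 * 16 nodes, but only 8 EPC sections exist. Which nodes get a section is
 * an assumption here (even-numbered nodes).
 */
struct node_topology example_8_socket_snc2(void)
{
	struct node_topology t = {
		.online  = 0xFFFFULL, /* nodes 0..15 */
		.has_mem = 0xFFFFULL,
		.has_cpu = 0xFFFFULL,
		.has_epc = 0x5555ULL, /* nodes 0, 2, 4, ..., 14 */
	};

	return t;
}
```

Calling report_nodes_without_epc() on the topology returned by example_8_socket_snc2() prints one line for each odd-numbered node and returns 8. Like the pr_info() loop in the patch, the sketch evaluates the masks once, so it says nothing about nodes that are hotplugged later.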