[RFC PATCH 11/20] x86/virt/seamldr: Abort updates if errors occurred midway

Chao Gao posted 20 patches 7 months ago
[RFC PATCH 11/20] x86/virt/seamldr: Abort updates if errors occurred midway
Posted by Chao Gao 7 months ago
The update process is divided into multiple stages, each of which may
encounter failures. However, the current state machine for updates proceeds
to the next stage regardless of errors.

Continuing updates when errors occur midway is pointless.

Implement a mechanism that transitions directly to the final stage,
effectively aborting the update and skipping all remaining stages when an
error is detected.

This is in preparation for adding the first stage that may fail.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
---
 arch/x86/virt/vmx/tdx/seamldr.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/virt/vmx/tdx/seamldr.c b/arch/x86/virt/vmx/tdx/seamldr.c
index 01dc2b0bc4a5..9d0d37a92bfd 100644
--- a/arch/x86/virt/vmx/tdx/seamldr.c
+++ b/arch/x86/virt/vmx/tdx/seamldr.c
@@ -247,6 +247,7 @@ enum tdp_state {
 static struct {
 	enum tdp_state state;
 	atomic_t thread_ack;
+	atomic_t failed;
 } tdp_data;
 
 static void set_state(enum tdp_state state)
@@ -261,8 +262,16 @@ static void set_state(enum tdp_state state)
 /* Last one to ack a state moves to the next state. */
 static void ack_state(void)
 {
-	if (atomic_dec_and_test(&tdp_data.thread_ack))
-		set_state(tdp_data.state + 1);
+	if (atomic_dec_and_test(&tdp_data.thread_ack)) {
+		/*
+		 * If an error occurred, abort the update by skipping to
+		 * the final state
+		 */
+		if (atomic_read(&tdp_data.failed))
+			set_state(TDP_DONE);
+		else
+			set_state(tdp_data.state + 1);
+	}
 }
 
 /*
@@ -285,6 +294,9 @@ static int do_seamldr_install_module(void *params)
 			default:
 				break;
 			}
+
+			if (ret)
+				atomic_inc(&tdp_data.failed);
 			ack_state();
 		} else {
 			touch_nmi_watchdog();
@@ -314,6 +326,7 @@ static int seamldr_install_module(const u8 *data, u32 size)
 	if (IS_ERR(params))
 		return PTR_ERR(params);
 
+	atomic_set(&tdp_data.failed, 0);
 	set_state(TDP_START + 1);
 	return stop_machine(do_seamldr_install_module, params, cpu_online_mask);
 }
-- 
2.47.1
Re: [RFC PATCH 11/20] x86/virt/seamldr: Abort updates if errors occurred midway
Posted by Nikolay Borisov 6 months, 2 weeks ago

On 5/23/25 12:52, Chao Gao wrote:
> The update process is divided into multiple stages, each of which may
> encounter failures. However, the current state machine for updates proceeds
> to the next stage regardless of errors.
> 
> Continuing updates when errors occur midway is pointless.
> 
> Implement a mechanism that transitions directly to the final stage,
> effectively aborting the update and skipping all remaining stages when an
> error is detected.
> 
> This is in preparation for adding the first stage that may fail.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Farrah Chen <farrah.chen@intel.com>
> ---
>   arch/x86/virt/vmx/tdx/seamldr.c | 17 +++++++++++++++--
>   1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/virt/vmx/tdx/seamldr.c b/arch/x86/virt/vmx/tdx/seamldr.c
> index 01dc2b0bc4a5..9d0d37a92bfd 100644
> --- a/arch/x86/virt/vmx/tdx/seamldr.c
> +++ b/arch/x86/virt/vmx/tdx/seamldr.c
> @@ -247,6 +247,7 @@ enum tdp_state {
>   static struct {
>   	enum tdp_state state;
>   	atomic_t thread_ack;
> +	atomic_t failed;
>   } tdp_data;
>   
>   static void set_state(enum tdp_state state)
> @@ -261,8 +262,16 @@ static void set_state(enum tdp_state state)
>   /* Last one to ack a state moves to the next state. */
>   static void ack_state(void)
>   {
> -	if (atomic_dec_and_test(&tdp_data.thread_ack))
> -		set_state(tdp_data.state + 1);
> +	if (atomic_dec_and_test(&tdp_data.thread_ack)) {
> +		/*
> +		 * If an error occurred, abort the update by skipping to
> +		 * the final state
> +		 */
> +		if (atomic_read(&tdp_data.failed))
> +			set_state(TDP_DONE);
> +		else
> +			set_state(tdp_data.state + 1);
> +	}
>   }
>   
>   /*
> @@ -285,6 +294,9 @@ static int do_seamldr_install_module(void *params)
>   			default:
>   				break;
>   			}
> +
> +			if (ret)
> +				atomic_inc(&tdp_data.failed);

Should there be some explicit ordering requirement between setting an 
error and reading it in ack_state by a different CPU?


  < snip>
Re: [RFC PATCH 11/20] x86/virt/seamldr: Abort updates if errors occurred midway
Posted by Chao Gao 6 months, 1 week ago
>>   static void ack_state(void)
>>   {
>> -	if (atomic_dec_and_test(&tdp_data.thread_ack))
>> -		set_state(tdp_data.state + 1);
>> +	if (atomic_dec_and_test(&tdp_data.thread_ack)) {
>> +		/*
>> +		 * If an error occurred, abort the update by skipping to
>> +		 * the final state
>> +		 */
>> +		if (atomic_read(&tdp_data.failed))
>> +			set_state(TDP_DONE);
>> +		else
>> +			set_state(tdp_data.state + 1);
>> +	}
>>   }
>>   /*
>> @@ -285,6 +294,9 @@ static int do_seamldr_install_module(void *params)
>>   			default:
>>   				break;
>>   			}
>> +
>> +			if (ret)
>> +				atomic_inc(&tdp_data.failed);
>
>Should there be some explicit ordering requirement between setting an error
>and reading it in ack_state by a different CPU?

Only the last CPU that calls ack_state() will change the global state, either
advancing to the next state or setting it to TDP_DONE on error. so, we only
need to ensure that the last CPU can see the error. This is guaranteed because
the error is set before the call to ack_state().

+			if (ret)
+				atomic_inc(&tdp_data.failed);
			ack_state();