[PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init

Jan Beulich posted 1 patch 1 month, 1 week ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/12fbad10-78ad-4679-a1db-3995e34da094@suse.com
[PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Jan Beulich 1 month, 1 week ago
MCE init for APs was broken when CPU feature re-checking was added. MTRR
(re)init for the BSP looks to never have been there on the resume path.

Fixes: bb502a8ca592 ("x86: check feature flags after resume")
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Sadly we need to go by CPU number (zero vs non-zero) here. See the call
site of recheck_cpu_features() in enter_state().

--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -642,16 +642,21 @@ void identify_cpu(struct cpuinfo_x86 *c)
 			       smp_processor_id());
 	}
 
-	if (system_state == SYS_STATE_resume)
-		return;
+	if (system_state == SYS_STATE_resume) {
+		unsigned int cpu = smp_processor_id();
 
+		if (cpu)
+			mcheck_init(&cpu_data[cpu], false);
+		else /* Yes, the BSP needs to use the AP function here. */
+			mtrr_ap_init();
+	}
 	/*
 	 * On SMP, boot_cpu_data holds the common feature set between
 	 * all CPUs; so make sure that we indicate which features are
 	 * common between the CPUs.  The first time this routine gets
 	 * executed, c == &boot_cpu_data.
 	 */
-	if ( c != &boot_cpu_data ) {
+	else if (c != &boot_cpu_data) {
 		/* AND the already accumulated flags with these */
 		for ( i = 0 ; i < NCAPINTS ; i++ )
 			boot_cpu_data.x86_capability[i] &= c->x86_capability[i];
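
For readers following the control flow, the branch structure this hunk introduces can be modelled in isolation. The following is only a sketch, not Xen code: the enum names and post_identify_action() are hypothetical stand-ins, and the bool parameter models the c == &boot_cpu_data comparison. It mirrors the hunk as posted, including the BSP MTRR branch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these names exist in Xen. */
enum sys_state { STATE_BOOT, STATE_RESUME };
enum action {
    ACT_NONE,           /* nothing further to do in this branch */
    ACT_MCHECK_INIT,    /* models mcheck_init(&cpu_data[cpu], false) */
    ACT_MTRR_INIT,      /* models mtrr_ap_init() */
    ACT_MERGE_FEATURES  /* models AND-ing flags into boot_cpu_data */
};

/*
 * Model of the decision taken at the end of identify_cpu() with the
 * hunk applied. is_boot_cpu_data models "c == &boot_cpu_data".
 */
static enum action post_identify_action(enum sys_state state,
                                        unsigned int cpu,
                                        bool is_boot_cpu_data)
{
    if (state == STATE_RESUME)
        /* CPU number decides: non-zero means AP, zero means BSP. */
        return cpu ? ACT_MCHECK_INIT : ACT_MTRR_INIT;

    if (!is_boot_cpu_data)
        return ACT_MERGE_FEATURES;

    return ACT_NONE;
}
```

In this model, post_identify_action(STATE_RESUME, 3, false) yields ACT_MCHECK_INIT and post_identify_action(STATE_RESUME, 0, true) yields ACT_MTRR_INIT, while the boot-time feature merging stays reachable only outside of resume.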

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Roger Pau Monné 2 weeks, 6 days ago
On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
> MCE init for APs was broken when CPU feature re-checking was added. MTRR
> (re)init for the BSP looks to never have been there on the resume path.

I'm not sure the statement about MTRR init is correct, AFAICT
mtrr_aps_sync_end() will also re-init the MTRRs on the BSP, and hence
the added mtrr_ap_init() seems to duplicate what's already done in
mtrr_aps_sync_end().

> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
> site of recheck_cpu_features() in enter_state().
> 
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -642,16 +642,21 @@ void identify_cpu(struct cpuinfo_x86 *c)
>  			       smp_processor_id());
>  	}
>  
> -	if (system_state == SYS_STATE_resume)
> -		return;
> +	if (system_state == SYS_STATE_resume) {
> +		unsigned int cpu = smp_processor_id();
>  
> +		if (cpu)
> +			mcheck_init(&cpu_data[cpu], false);
> +		else /* Yes, the BSP needs to use the AP function here. */
> +			mtrr_ap_init();

For symmetry with the BSP path, is it really needed to init MCE so
early for the BSP by calling it directly in enter_state(), or could it
also be done here?

Thanks, Roger.

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Jan Beulich 2 weeks, 6 days ago
On 23.03.2026 12:16, Roger Pau Monné wrote:
> On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
>> MCE init for APs was broken when CPU feature re-checking was added. MTRR
>> (re)init for the BSP looks to never have been there on the resume path.
> 
> I'm not sure the statement about MTRR init is correct, AFAICT
> mtrr_aps_sync_end() will also re-init the MTRRs on the BSP, and hence
> the added mtrr_ap_init() seems to duplicate what's already done in
> mtrr_aps_sync_end().

Hmm, right you are. Had I been asked, I would have confirmed that I checked
the code past the "enable_cpu" label, but clearly I must not have, or I was
blind at that time. Let me strip that out.

>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -642,16 +642,21 @@ void identify_cpu(struct cpuinfo_x86 *c)
>>  			       smp_processor_id());
>>  	}
>>  
>> -	if (system_state == SYS_STATE_resume)
>> -		return;
>> +	if (system_state == SYS_STATE_resume) {
>> +		unsigned int cpu = smp_processor_id();
>>  
>> +		if (cpu)
>> +			mcheck_init(&cpu_data[cpu], false);
>> +		else /* Yes, the BSP needs to use the AP function here. */
>> +			mtrr_ap_init();
> 
> For symmetry with the BSP path, is it really needed to init MCE so
> early for the BSP by calling it directly in enter_state(), or could it
> also be done here?

To be honest, I would put the question the other way around: Is it really
okay to do it this late for APs (during boot also for the BSP [1])? Iirc
an #MC prior to mcheck_init() is going to be deadly to the system. Moving
it earlier may, however, be a more intrusive change.

Jan

[1] Us crashing (rebooting) during boot is perhaps less of an issue than
us doing so during S3 resume: In that latter case it may mean data loss
(or maybe even data corruption).

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Roger Pau Monné 2 weeks, 6 days ago
On Mon, Mar 23, 2026 at 12:38:48PM +0100, Jan Beulich wrote:
> On 23.03.2026 12:16, Roger Pau Monné wrote:
> > On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
> >> MCE init for APs was broken when CPU feature re-checking was added. MTRR
> >> (re)init for the BSP looks to never have been there on the resume path.
> > 
> > I'm not sure the statement about MTRR init is correct, AFAICT
> > mtrr_aps_sync_end() will also re-init the MTRRs on the BSP, and hence
> > the added mtrr_ap_init() seems to duplicate what's already done in
> > mtrr_aps_sync_end().
> 
> Hmm, right you are. Had I been asked, I would have confirmed that I checked
> the code past the "enable_cpu" label, but clearly I must not have, or I was
> blind at that time. Let me strip that out.
> 
> >> --- a/xen/arch/x86/cpu/common.c
> >> +++ b/xen/arch/x86/cpu/common.c
> >> @@ -642,16 +642,21 @@ void identify_cpu(struct cpuinfo_x86 *c)
> >>  			       smp_processor_id());
> >>  	}
> >>  
> >> -	if (system_state == SYS_STATE_resume)
> >> -		return;
> >> +	if (system_state == SYS_STATE_resume) {
> >> +		unsigned int cpu = smp_processor_id();
> >>  
> >> +		if (cpu)
> >> +			mcheck_init(&cpu_data[cpu], false);
> >> +		else /* Yes, the BSP needs to use the AP function here. */
> >> +			mtrr_ap_init();
> > 
> > For symmetry with the BSP path, is it really needed to init MCE so
> > early for the BSP by calling it directly in enter_state(), or could it
> > also be done here?
> 
> To be honest, I would put the question the other way around: Is it really
> okay to do it this late for APs (during boot also for the BSP [1])? Iirc
> an #MC prior to mcheck_init() is going to be deadly to the system. Moving
> it earlier may, however, be a more intrusive change.

We might want to at least add a note documenting this asymmetric
initialization between the BSP and the APs?

I would be perfectly happy with moving this earlier, and it needs to
be consistent between the APs and the BSP.

Thanks, Roger.

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Marek Marczykowski 1 month, 1 week ago
On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
> MCE init for APs was broken when CPU feature re-checking was added. MTRR
> (re)init for the BSP looks to never have been there on the resume path.
> 
> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
> site of recheck_cpu_features() in enter_state().

With this patch, I now see the "Thermal monitoring enabled" on resume
also for AP.
And then, the "Temperature above threshold" + "Running in modulated
clock mode" for AP too. But, I don't see matching "Temperature/speed
normal" for any of them...

My simple performance test says it's okay for now, though. I'll see how
it looks in a few hours...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Jan Beulich 1 month, 1 week ago
On 04.03.2026 15:36, Marek Marczykowski wrote:
> On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
>> MCE init for APs was broken when CPU feature re-checking was added. MTRR
>> (re)init for the BSP looks to never have been there on the resume path.
>>
>> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
>> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
>> site of recheck_cpu_features() in enter_state().
> 
> With this patch, I now see the "Thermal monitoring enabled" on resume
> also for AP.
> And then, the "Temperature above threshold" + "Running in modulated
> clock mode" for AP too. But, I don't see matching "Temperature/speed
> normal" for any of them...

Which would imply that for each CPU you see at most one such message after
resume. Can you confirm this? (Generally for every CPU they should be
alternating, but appear no more frequently than every 5 seconds. Albeit I
can't help the impression that it is possible for the current state to not
be reflected by the most recently seen message, for a potentially
indefinite period of time.)
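
One way the last message could end up stale can be shown with a toy model. This is a sketch under assumptions, not the logic of intel_thermal_interrupt(): struct thermal_log and thermal_event() are hypothetical, and the model assumes that an event arriving inside the 5 second window is simply dropped rather than deferred.

```c
#include <stdbool.h>
#include <stdint.h>

/* From the thread: messages appear no more often than every 5 seconds. */
#define REPORT_INTERVAL_S 5

/* Hypothetical per-CPU bookkeeping for the toy model. */
struct thermal_log {
    uint64_t next_report_s;   /* earliest time another message may print */
    bool     last_logged_hot; /* state named by the most recent message */
    bool     any_logged;      /* whether anything has been printed yet */
};

/*
 * Feed one thermal event into the model. Returns true if a message
 * would be printed, false if the rate limit suppresses it.
 */
static bool thermal_event(struct thermal_log *t, uint64_t now_s, bool hot)
{
    if (t->any_logged && now_s < t->next_report_s)
        return false; /* suppressed: this state change is never reported */

    t->next_report_s = now_s + REPORT_INTERVAL_S;
    t->last_logged_hot = hot;
    t->any_logged = true;
    return true;
}
```

If a "hot" event is logged at t=100 and the matching "normal" event arrives at t=102, the second call returns false and last_logged_hot stays true. Without a further event on that CPU, the log would keep saying "above threshold" indefinitely.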

> My simple performance test says it's okay for now, though. I'll see how
> it looks in a few hours...

I actually don't expect the change here to make a difference in that
regard. intel_thermal_interrupt() exists only for reporting purposes.

Jan

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Marek Marczykowski 1 month, 1 week ago
On Wed, Mar 04, 2026 at 03:47:14PM +0100, Jan Beulich wrote:
> On 04.03.2026 15:36, Marek Marczykowski wrote:
> > On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
> >> MCE init for APs was broken when CPU feature re-checking was added. MTRR
> >> (re)init for the BSP looks to never have been there on the resume path.
> >>
> >> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
> >> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
> >> site of recheck_cpu_features() in enter_state().
> > 
> > With this patch, I now see the "Thermal monitoring enabled" on resume
> > also for AP.
> > And then, the "Temperature above threshold" + "Running in modulated
> > clock mode" for AP too. But, I don't see matching "Temperature/speed
> > normal" for any of them...
> 
> Which would imply that for each CPU you see at most one such message after
> resume. Can you confirm this? 

For the current test, yes. I got the messages for CPUs 16, 6, 18, 4, 2 -
in this order. Not for 0, 8-15 or 20-21. Not sure about CPU0, but for
others it kinda looks like I got it for P cores, but not E cores? But
I'm not sure how to reliably distinguish them - I base it on the holes
in numbering due to smt=off. Specifically I have online CPUs:
0,2,4,6,8-16,18,20-21 (yeah, weird ordering...).
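
The "holes in numbering" heuristic can be made concrete with a small sketch. Everything here is hypothetical helper code, not something from Xen or the toolstack: parse_cpu_list() turns the list into a table of online CPUs, and followed_by_hole() flags an online CPU whose next number is absent. That only guesses at a P core with its SMT sibling offline; it does not reliably identify one.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 64

/*
 * Parse a list such as "0,2,4,6,8-16,18,20-21" into online[].
 * Returns the highest CPU number seen, or -1 on a parse error.
 */
static int parse_cpu_list(const char *s, bool online[MAX_CPUS])
{
    int highest = -1;

    for (int i = 0; i < MAX_CPUS; i++)
        online[i] = false;

    while (*s) {
        char *end;
        unsigned long lo = strtoul(s, &end, 10), hi;

        if (end == s)
            return -1;          /* no digits where a number was expected */
        hi = lo;
        if (*end == '-') {      /* range "lo-hi" */
            s = end + 1;
            hi = strtoul(s, &end, 10);
            if (end == s)
                return -1;
        }
        if (lo > hi || hi >= MAX_CPUS)
            return -1;
        for (unsigned long c = lo; c <= hi; c++)
            online[c] = true;
        if ((int)hi > highest)
            highest = (int)hi;
        if (*end == ',')
            end++;
        else if (*end)
            return -1;          /* unexpected separator */
        s = end;
    }
    return highest;
}

/* Heuristic only: an online CPU directly followed by a gap in numbering. */
static bool followed_by_hole(const bool online[MAX_CPUS], int cpu, int highest)
{
    return cpu >= 0 && cpu < highest && online[cpu] && !online[cpu + 1];
}
```

Applied to the list above, the flagged CPUs are 0, 2, 4, 6, 16 and 18. That set contains every CPU that reported (16, 6, 18, 4, 2) plus CPU0.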

> (Generally for every CPU they should be
> alternating, but appear no more frequently than every 5 seconds. Albeit I
> can't help the impression that it is possible for the current state to not
> be reflected by the most recently seen message, for a potentially
> indefinite period of time.)
> 
> > My simple performance test says it's okay for now, though. I'll see how
> > it looks in a few hours...
> 
> I actually don't expect the change here to make a difference in that
> regard. intel_thermal_interrupt() exists only for reporting purposes.

Yeah, it's too soon to say definitely, but just after resume the test said
a stable 6ms, and now (~30min later) it's at 12-14ms.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Jan Beulich 2 weeks, 6 days ago
On 04.03.2026 16:00, Marek Marczykowski wrote:
> On Wed, Mar 04, 2026 at 03:47:14PM +0100, Jan Beulich wrote:
>> On 04.03.2026 15:36, Marek Marczykowski wrote:
>>> On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
>>>> MCE init for APs was broken when CPU feature re-checking was added. MTRR
>>>> (re)init for the BSP looks to never have been there on the resume path.
>>>>
>>>> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
>>>> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
>>>> site of recheck_cpu_features() in enter_state().
>>>
>>> With this patch, I now see the "Thermal monitoring enabled" on resume
>>> also for AP.
>>> And then, the "Temperature above threshold" + "Running in modulated
>>> clock mode" for AP too. But, I don't see matching "Temperature/speed
>>> normal" for any of them...
>>
>> Which would imply that for each CPU you see at most one such message after
>> resume. Can you confirm this? 
> 
> For the current test, yes. I got the messages for CPUs 16, 6, 18, 4, 2 -
> in this order. Not for 0, 8-15 or 20-21. Not sure about CPU0, but for
> others it kinda looks like I got it for P cores, but not E cores? But
> I'm not sure how to reliably distinguish them - I base it on the holes
> in numbering due to smt=off. Specifically I have online CPUs:
> 0,2,4,6,8-16,18,20-21 (yeah, weird ordering...).

I wonder, btw, if this is good enough to translate into a Tested-by: for
this patch. Thoughts?

Jan

Re: [PATCH] x86/S3: restore MCE (APs) and add MTRR (BSP) init
Posted by Marek Marczykowski 2 weeks, 6 days ago
On Mon, Mar 23, 2026 at 12:21:46PM +0100, Jan Beulich wrote:
> On 04.03.2026 16:00, Marek Marczykowski wrote:
> > On Wed, Mar 04, 2026 at 03:47:14PM +0100, Jan Beulich wrote:
> >> On 04.03.2026 15:36, Marek Marczykowski wrote:
> >>> On Wed, Mar 04, 2026 at 02:39:01PM +0100, Jan Beulich wrote:
> >>>> MCE init for APs was broken when CPU feature re-checking was added. MTRR
> >>>> (re)init for the BSP looks to never have been there on the resume path.
> >>>>
> >>>> Fixes: bb502a8ca592 ("x86: check feature flags after resume")
> >>>> Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> >>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >>>> ---
> >>>> Sadly we need to go by CPU number (zero vs non-zero) here. See the call
> >>>> site of recheck_cpu_features() in enter_state().
> >>>
> >>> With this patch, I now see the "Thermal monitoring enabled" on resume
> >>> also for AP.
> >>> And then, the "Temperature above threshold" + "Running in modulated
> >>> clock mode" for AP too. But, I don't see matching "Temperature/speed
> >>> normal" for any of them...
> >>
> >> Which would imply that for each CPU you see at most one such message after
> >> resume. Can you confirm this? 
> > 
> > For the current test, yes. I got the messages for CPUs 16, 6, 18, 4, 2 -
> > in this order. Not for 0, 8-15 or 20-21. Not sure about CPU0, but for
> > others it kinda looks like I got it for P cores, but not E cores? But
> > I'm not sure how to reliably distinguish them - I base it on the holes
> > in numbering due to smt=off. Specifically I have online CPUs:
> > 0,2,4,6,8-16,18,20-21 (yeah, weird ordering...).
> 
> I wonder, btw, if this is good enough to translate into a Tested-by: for
> this patch. Thoughts?

I think so, it clearly fixes the reporting issue.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab