[PATCH] x86/mcheck: allow varying bank counts per CPU

Soham Dandapat posted 1 patch 4 days, 9 hours ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20250905165212.96843-1-Soham.Dandapat@amd.com
[PATCH] x86/mcheck: allow varying bank counts per CPU
Posted by Soham Dandapat 4 days, 9 hours ago
In mca_cap_init(), mcabanks_alloc() allocates and initializes a new
mca_banks structure ("all") for managing MCA banks, setting up its
bank map and storing the specified or default number of banks.

After this, the loop calls mcabanks_set(i, mca_allbanks) for each bank.
mcabanks_set() sets a specific bit in the bank_map of an mca_banks
structure, provided the structure, its bank_map, and the bit index
are valid.

At the end, we call
mcabanks_free(xchg(&mca_allbanks, all));
The xchg() is atomic and does the following:
   1. Exchanges the value of "mca_allbanks" with "all"
   2. Returns the old value that was previously in "mca_allbanks"
mcabanks_free() then frees the memory of that old structure.
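
The exchange-then-free step can be illustrated outside Xen with a minimal
sketch. It assumes C11 atomic_exchange() as a stand-in for Xen's xchg();
struct banks, current_banks, alloc_banks() and install_banks() are
hypothetical names used only for this illustration, not Xen code.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xen's struct mca_banks. */
struct banks {
    unsigned int num;
};

/* Global pointer playing the role of mca_allbanks. */
static struct banks *_Atomic current_banks;

/* Hypothetical helper: allocate a structure describing 'num' banks. */
static struct banks *alloc_banks(unsigned int num)
{
    struct banks *b = malloc(sizeof(*b));

    if ( b )
        b->num = num;
    return b;
}

/*
 * Install 'fresh' as the current structure and free the one it replaces.
 * atomic_exchange() stands in for xchg(): it stores the new pointer and
 * returns the previous one in a single atomic step.
 * Returns the bank count of the replaced structure, or 0 if none existed.
 */
static unsigned int install_banks(struct banks *fresh)
{
    struct banks *old = atomic_exchange(&current_banks, fresh);
    unsigned int old_num = old ? old->num : 0;

    free(old); /* free(NULL) is a no-op */
    return old_num;
}
```

Because the old pointer is returned by the exchange itself, whatever was
written into the old structure just before the exchange is discarded along
with it.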

The problem is that mcabanks_set(i, mca_allbanks) updates the old
mca_allbanks structure, which mcabanks_free() frees right afterwards.
The new instance ("all") that replaces it never gets its bank_map
updated.

As a result, when mcheck_mca_logout() collects logs,
mcabanks_test(i, bankmask) fails for every bank, so the check
"if ( !mcabanks_test(i, bankmask) )" skips each bank and no MCA logs
are collected.

Fix this by passing "all" to mcabanks_set(), so that the bits are set
in the new structure before it replaces mca_allbanks.
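
The difference between the old and the new behaviour can be reproduced
with a simplified model. This is only a sketch under stated assumptions:
the structure is reduced to a 64-bit map, the pointer swap is not atomic
(the real code uses xchg()), and all names (struct banks, allbanks,
banks_update(), banks_enabled()) are hypothetical, not the Xen ones.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified, hypothetical model of struct mca_banks (at most 64 banks). */
struct banks {
    unsigned int num;
    unsigned long long map;
};

static struct banks *allbanks; /* plays the role of mca_allbanks */

static struct banks *banks_alloc(unsigned int num)
{
    struct banks *b = num <= 64 ? calloc(1, sizeof(*b)) : NULL;

    if ( b )
        b->num = num;
    return b;
}

/* Set a bit only if the structure and the index are valid. */
static void banks_set(unsigned int bit, struct banks *b)
{
    if ( b && bit < b->num )
        b->map |= 1ULL << bit;
}

static bool banks_test(unsigned int bit, const struct banks *b)
{
    return b && bit < b->num && ((b->map >> bit) & 1);
}

/*
 * Replace allbanks with a structure covering 'nr' banks.
 * fixed == false mimics the old code (bits go into the structure that is
 * about to be freed); fixed == true mimics the patched code.
 */
static int banks_update(unsigned int nr, bool fixed)
{
    struct banks *all = banks_alloc(nr), *old;
    unsigned int i;

    if ( !all )
        return -1;
    for ( i = 0; i < nr; i++ )
        banks_set(i, fixed ? all : allbanks);
    old = allbanks; /* non-atomic swap, for illustration only */
    allbanks = all;
    free(old);
    return 0;
}

/* Count the banks a logout loop would actually visit. */
static unsigned int banks_enabled(void)
{
    unsigned int i, n = 0;

    for ( i = 0; allbanks && i < allbanks->num; i++ )
        n += banks_test(i, allbanks);
    return n;
}
```

In this model, banks_update(8, false) leaves banks_enabled() at 0 no
matter how often it is called, while banks_update(8, true) yields 8.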

Fixes: 560cf418c845 ("x86/mcheck: allow varying bank counts per CPU")
Signed-off-by: Soham Dandapat <soham.dandapat@amd.com>
---
 xen/arch/x86/cpu/mcheck/mce.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 9028ccde54..84238cd0ef 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -663,7 +663,7 @@ static int mca_cap_init(void)
         if ( !all )
             return -ENOMEM;
         for ( i = 0; i < nr; i++ )
-            mcabanks_set(i, mca_allbanks);
+            mcabanks_set(i, all);
         mcabanks_free(xchg(&mca_allbanks, all));
     }
 
-- 
2.17.1
Re: [PATCH] x86/mcheck: allow varying bank counts per CPU
Posted by Jason Andryuk 4 days, 9 hours ago

On 2025-09-05 12:52, Soham Dandapat wrote:

Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>

Maybe the patch subject should be "x86/mcheck: Fix mca bank 
initialization" to differentiate from the Fixes commit?

Thanks,
Jason
Re: [PATCH] x86/mcheck: allow varying bank counts per CPU
Posted by Jan Beulich 1 day, 16 hours ago
On 05.09.2025 19:02, Jason Andryuk wrote:
> 
> 
> On 2025-09-05 12:52, Soham Dandapat wrote:
> 
> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
> 
> Maybe the patch subject should be "x86/mcheck: Fix mca bank 
> initialization" to differentiate from the Fixes commit?

That's still more generic than wanted. How about "x86/mcheck: fix
mca_allbanks updating"? With a more concise title (which can be
adjusted while committing, so long as there's agreement):
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
Re: [PATCH] x86/mcheck: allow varying bank counts per CPU
Posted by Jason Andryuk 1 day, 12 hours ago
On 2025-09-08 05:08, Jan Beulich wrote:
> On 05.09.2025 19:02, Jason Andryuk wrote:
>>
>>
>> On 2025-09-05 12:52, Soham Dandapat wrote:
>>
>> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
>>
>> Maybe the patch subject should be "x86/mcheck: Fix mca bank
>> initialization" to differentiate from the Fixes commit?
> 
> That's still more generic than wanted. How about "x86/mcheck: fix
> mca_allbanks updating"? With a more concise title (which can be
> adjusted while committing, so long as there's agreement):
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Your suggestion sounds good to me.

Thanks,
Jason
RE: [PATCH] x86/mcheck: allow varying bank counts per CPU
Posted by Dandapat, Soham 1 day, 5 hours ago
Hi Jan, Jason,

The suggestion sounds good to me. I am OK with that.

Thanks,
Soham
