[patch 0/2] x86/cpu/amd: Fixup the topology rework fallout

Thomas Gleixner posted 2 patches 1 year, 10 months ago
[patch 0/2] x86/cpu/amd: Fixup the topology rework fallout
Posted by Thomas Gleixner 1 year, 10 months ago
Testing at Collabora unearthed two issues in the new AMD topology parser
code:

  1) The CPUID 0x80000008 parser initializes the wrong topology domain
     level.

  2) The NODEID_MSR parser uses bitfields in a union wrongly which results
     in reading out the wrong value and finally in a division by zero.
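
For readers unfamiliar with the pitfall in 2), here is a minimal sketch of
how bitfields placed directly in a union can go wrong. It is not the kernel
code: the type names, helper functions, field names, and field widths are
illustrative assumptions, and the layout assumes GCC or Clang on
little-endian x86. Each union member starts at offset zero, so every
bitfield declared directly in the union aliases the lowest bits of the raw
value. Wrapping the bitfields in an anonymous struct lays them out one after
another instead.

```c
#include <stdint.h>

/*
 * Problematic shape (hypothetical): every bitfield is its own union
 * member, so all of them overlap at bit 0 of 'msr'.
 */
union nodeid_broken {
	uint64_t node_id	:  3;
	uint64_t nodes_per_pkg	:  3;
	uint64_t unused		: 58;
	uint64_t msr;
};

/*
 * Working shape (hypothetical): the anonymous struct is a single union
 * member, and its bitfields occupy consecutive bits of 'msr'.
 */
union nodeid_fixed {
	struct {
		uint64_t node_id	:  3;
		uint64_t nodes_per_pkg	:  3;
		uint64_t unused		: 58;
	};
	uint64_t msr;
};

/* Reads bits 2:0 again instead of bits 5:3. */
unsigned int broken_nodes_per_pkg(uint64_t raw)
{
	union nodeid_broken nid = { .msr = raw };

	return (unsigned int)nid.nodes_per_pkg;
}

/* Reads bits 5:3 as intended. */
unsigned int fixed_nodes_per_pkg(uint64_t raw)
{
	union nodeid_fixed nid = { .msr = raw };

	return (unsigned int)nid.nodes_per_pkg;
}

/* Reads bits 2:0. */
unsigned int fixed_node_id(uint64_t raw)
{
	union nodeid_fixed nid = { .msr = raw };

	return (unsigned int)nid.node_id;
}
```

With a raw value of 0x11 (bits 2:0 = 1, bits 5:3 = 2), the broken helper
returns 1 while the fixed helper returns 2. The sketch only shows the wrong
readout; it does not reproduce the path that ends in the division by zero.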

Many thanks to Laura for helping to debug this issue.

       tglx
Re: [patch 0/2] x86/cpu/amd: Fixup the topology rework fallout
Posted by Laura Nao 1 year, 10 months ago
Hi Thomas,

On 4/10/24 21:45, Thomas Gleixner wrote:
> Testing at Collabora unearthed two issues in the new AMD topology parser
> code:
> 
>    1) The CPUID 0x80000008 parser initializes the wrong topology domain
>       level.
> 
>    2) The NODEID_MSR parser uses bitfields in a union wrongly which results
>       in reading out the wrong value and finally in a division by zero.
> 
> Many thanks to Laura for helping to debug this issue.
> 
>         tglx
> 
> 

Thanks a lot for investigating and solving the issue!

I confirm that with this series applied the kernel boots correctly on 
all three AMD Stoney Ridge Chromebooks that were affected by the 
regression.

I tested the patches on top of c749ce39 (culprit commit identified by
the bisection) - reference test job:
https://lava.collabora.dev/scheduler/job/13339645

The series doesn't apply directly to next, but I manually applied the
changes on top of next-20240411 and can confirm the kernel boots
correctly with this revision too - reference test job:
https://lava.collabora.dev/scheduler/job/13340321

The regression was originally reported by KernelCI, so:

Reported-by: "kernelci.org bot" <bot@kernelci.org>
Tested-by: Laura Nao <laura.nao@collabora.com>

I'll make sure to update the Regzbot tag when the series is merged.

Best,

Laura
Re: [patch 0/2] x86/cpu/amd: Fixup the topology rework fallout
Posted by Linux regression tracking (Thorsten Leemhuis) 1 year, 10 months ago
On 11.04.24 13:27, Laura Nao wrote:
> 
> On 4/10/24 21:45, Thomas Gleixner wrote:
>> Testing at Collabora unearthed two issues in the new AMD topology parser
>> code:
>>
>>    1) The CPUID 0x80000008 parser initializes the wrong topology domain
>>       level.
>>
>>    2) The NODEID_MSR parser uses bitfields in a union wrongly which results
>>       in reading out the wrong value and finally in a division by zero.
>>
>> Many thanks to Laura for helping to debug this issue.
>>
>>         tglx
>>
>>
> 
> Thanks a lot for investigating and solving the issue!
> [...]
> 
> The regression was originally reported by KernelCI, so:
> 
> Reported-by: "kernelci.org bot" <bot@kernelci.org>
> Tested-by: Laura Nao <laura.nao@collabora.com>
> 
> I'll make sure to update the Regzbot tag when the series is merged.

No need to wait, we can do that now:

#regzbot fix: x86/cpu/amd: Make the NODEID_MSR union actually work

But ideally Thomas would add a Link: or Closes: tag to the patch
description (e.g.

 Closes:
https://lore.kernel.org/all/20240322175210.124416-1-laura.nao@collabora.com/

) just like Linus asked him to do a while ago already[1], as then this
would not be necessary at all. ;) (SCNR)

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

[1]
https://lore.kernel.org/all/CAHk-=wjMmSZzMJ3Xnskdg4+GGz=5p5p+GSYyFBTh0f-DgvdBWg@mail.gmail.com/
Re: [patch 0/2] x86/cpu/amd: Fixup the topology rework fallout
Posted by Thomas Gleixner 1 year, 10 months ago
On Thu, Apr 11 2024 at 13:37, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 11.04.24 13:27, Laura Nao wrote:
> No need to wait, we can do that now:
>
> #regzbot fix: x86/cpu/amd: Make the NODEID_MSR union actually work
>
> But ideally Thomas would add a Link: or Closes: tag to the patch
> description (e.g.
>
>  Closes:
> https://lore.kernel.org/all/20240322175210.124416-1-laura.nao@collabora.com/
>
> ) just like Linus asked him to do a while ago already[1], as then this
> would not be necessary at all. ;) (SCNR)

Will do when applying them, and I'll try to remember that Closes thing, but
you know, at my age ....

Thanks,

        tglx