The __mcheck_cpu_init_early() function was introduced so that some
vendor-specific features are detected before the first MCA polling event
done in __mcheck_cpu_init_generic().
Currently, __mcheck_cpu_init_early() is only used on AMD-based systems and
additional code will be needed to support various system configurations.
However, the current and future vendor-specific code should be done during
vendor init. This keeps all the vendor code in a common location and
simplifies the generic init flow.
Move all the __mcheck_cpu_init_early() code into mce_amd_feature_init().
Also, move __mcheck_cpu_init_generic() after
__mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first
MCA polling event.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Notes:
Link:
https://lore.kernel.org/r/20221206173607.1185907-3-yazen.ghannam@amd.com
v1->v2:
* New in v2, but based on old patch (see link).
* Changed cpu_has() to cpu_feature_enabled().
arch/x86/kernel/cpu/mce/amd.c | 4 ++++
arch/x86/kernel/cpu/mce/core.c | 20 +++-----------------
2 files changed, 7 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index df8984aac1c6..302a310d0630 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -656,6 +656,10 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
u32 low = 0, high = 0, address = 0;
int offset = -1;
+ mce_flags.overflow_recov = cpu_feature_enabled(X86_FEATURE_OVERFLOW_RECOV);
+ mce_flags.succor = cpu_feature_enabled(X86_FEATURE_SUCCOR);
+ mce_flags.smca = cpu_feature_enabled(X86_FEATURE_SMCA);
+ mce_flags.amd_threshold = 1;
for (bank = 0; bank < this_cpu_read(mce_num_banks); ++bank) {
if (mce_flags.smca)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d85bd861ecca..7fbf1c8291b8 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -2029,19 +2029,6 @@ static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
return false;
}
-/*
- * Init basic CPU features needed for early decoding of MCEs.
- */
-static void __mcheck_cpu_init_early(struct cpuinfo_x86 *c)
-{
- if (c->x86_vendor == X86_VENDOR_AMD || c->x86_vendor == X86_VENDOR_HYGON) {
- mce_flags.overflow_recov = !!cpu_has(c, X86_FEATURE_OVERFLOW_RECOV);
- mce_flags.succor = !!cpu_has(c, X86_FEATURE_SUCCOR);
- mce_flags.smca = !!cpu_has(c, X86_FEATURE_SMCA);
- mce_flags.amd_threshold = 1;
- }
-}
-
static void mce_centaur_feature_init(struct cpuinfo_x86 *c)
{
struct mca_config *cfg = &mca_cfg;
@@ -2281,10 +2268,9 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
mca_cfg.initialized = 1;
- __mcheck_cpu_init_early(c);
- __mcheck_cpu_init_generic();
__mcheck_cpu_init_vendor(c);
__mcheck_cpu_init_prepare_banks();
+ __mcheck_cpu_init_generic();
__mcheck_cpu_setup_timer();
}
@@ -2450,9 +2436,9 @@ static void mce_syscore_shutdown(void)
*/
static void mce_syscore_resume(void)
{
- __mcheck_cpu_init_generic();
__mcheck_cpu_init_vendor(raw_cpu_ptr(&cpu_info));
__mcheck_cpu_init_prepare_banks();
+ __mcheck_cpu_init_generic();
}
static struct syscore_ops mce_syscore_ops = {
@@ -2469,8 +2455,8 @@ static void mce_cpu_restart(void *data)
{
if (!mce_available(raw_cpu_ptr(&cpu_info)))
return;
- __mcheck_cpu_init_generic();
__mcheck_cpu_init_prepare_banks();
+ __mcheck_cpu_init_generic();
__mcheck_cpu_init_timer();
}
--
2.43.0
On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote:
> Also, move __mcheck_cpu_init_generic() after
> __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first
> MCA polling event.
The reason being?
Precaution?
It was this way since forever, why are you moving it now? Any particular
reason?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote: > On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote: > > Also, move __mcheck_cpu_init_generic() after > > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first > > MCA polling event. > > The reason being? > > Precaution? > > It was this way since forever, why are you moving it now? Any particular > reason? > 1) To read/clear old errors before turning on MCA. The updated __mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL registers. This patch does this for the MCG_CTL register too. 2) To ensure that vendor-specific setup is finished beforehand also. Thanks, Yazen
On February 27, 2025 5:31:48 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: >On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote: >> On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote: >> > Also, move __mcheck_cpu_init_generic() after >> > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first >> > MCA polling event. >> >> The reason being? >> >> Precaution? >> >> It was this way since forever, why are you moving it now? Any particular >> reason? >> > >1) To read/clear old errors before turning on MCA. The updated >__mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL >registers. This patch does this for the MCG_CTL register too. > >2) To ensure that vendor-specific setup is finished beforehand also. That doesn't answer my question. All of the above gets done even without shuffling the order... -- Sent from a small device: formatting sucks and brevity is inevitable.
On Thu, Feb 27, 2025 at 08:33:19PM +0100, Borislav Petkov wrote: > On February 27, 2025 5:31:48 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: > >On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote: > >> On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote: > >> > Also, move __mcheck_cpu_init_generic() after > >> > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first > >> > MCA polling event. > >> > >> The reason being? > >> > >> Precaution? > >> > >> It was this way since forever, why are you moving it now? Any particular > >> reason? > >> > > > >1) To read/clear old errors before turning on MCA. The updated > >__mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL > >registers. This patch does this for the MCG_CTL register too. > > > >2) To ensure that vendor-specific setup is finished beforehand also. > > That doesn't answer my question. All of the above gets done even without shuffling the order... > > MCA banks can start logging errors once MCG_CTL is set. The AMD docs say "The operating system must initialize the MCA_CONFIG registers prior to initialization of the MCA_CTL registers." "The MCA_CTL registers must be initialized prior to enabling the error reporting banks in MCG_CTL". However, the Intel docs "Machine-Check Initialization Pseudocode" say MCG_CTL first then MCi_CTL. But both agree that CR4.MCE should be set last. We have an old thread on the topic that led to this patch. https://lore.kernel.org/all/YqJHwXkg3Ny9fI3s@yaz-fattaah/ And it seemed okay at the time. https://lore.kernel.org/all/YrnTMmwl5TrHwT9J@zn.tnic/ I don't think anything much has changed since then, so I included the old patch again in this set. Thanks, Yazen
On February 27, 2025 8:59:33 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: >On Thu, Feb 27, 2025 at 08:33:19PM +0100, Borislav Petkov wrote: >> On February 27, 2025 5:31:48 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: >> >On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote: >> >> On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote: >> >> > Also, move __mcheck_cpu_init_generic() after >> >> > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first >> >> > MCA polling event. >> >> >> >> The reason being? >> >> >> >> Precaution? >> >> >> >> It was this way since forever, why are you moving it now? Any particular >> >> reason? >> >> >> > >> >1) To read/clear old errors before turning on MCA. The updated >> >__mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL >> >registers. This patch does this for the MCG_CTL register too. >> > >> >2) To ensure that vendor-specific setup is finished beforehand also. >> >> That doesn't answer my question. All of the above gets done even without shuffling the order... >> >> > >MCA banks can start logging errors once MCG_CTL is set. The AMD docs say >"The operating system must initialize the MCA_CONFIG registers prior to >initialization of the MCA_CTL registers." > >"The MCA_CTL registers must be initialized prior to enabling the error >reporting banks in MCG_CTL". > >However, the Intel docs "Machine-Check Initialization Pseudocode" say >MCG_CTL first then MCi_CTL. > >But both agree that CR4.MCE should be set last. > >We have an old thread on the topic that led to this patch. >https://lore.kernel.org/all/YqJHwXkg3Ny9fI3s@yaz-fattaah/ > >And it seemed okay at the time. >https://lore.kernel.org/all/YrnTMmwl5TrHwT9J@zn.tnic/ > >I don't think anything much has changed since then, so I included the >old patch again in this set. > >Thanks, >Yazen This is exactly what needs to be in the commit message - why is the change being done. -- Sent from a small device: formatting sucks and brevity is inevitable.
On Thu, Feb 27, 2025 at 09:48:01PM +0100, Borislav Petkov wrote: > On February 27, 2025 8:59:33 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: > >On Thu, Feb 27, 2025 at 08:33:19PM +0100, Borislav Petkov wrote: > >> On February 27, 2025 5:31:48 PM GMT+01:00, Yazen Ghannam <yazen.ghannam@amd.com> wrote: > >> >On Thu, Feb 27, 2025 at 04:25:00PM +0100, Borislav Petkov wrote: > >> >> On Thu, Feb 13, 2025 at 04:45:55PM +0000, Yazen Ghannam wrote: > >> >> > Also, move __mcheck_cpu_init_generic() after > >> >> > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first > >> >> > MCA polling event. > >> >> > >> >> The reason being? > >> >> > >> >> Precaution? > >> >> > >> >> It was this way since forever, why are you moving it now? Any particular > >> >> reason? > >> >> > >> > > >> >1) To read/clear old errors before turning on MCA. The updated > >> >__mcheck_cpu_init_prepare_banks() function does this for the MCi_CTL > >> >registers. This patch does this for the MCG_CTL register too. > >> > > >> >2) To ensure that vendor-specific setup is finished beforehand also. > >> > >> That doesn't answer my question. All of the above gets done even without shuffling the order... > >> > >> > > > >MCA banks can start logging errors once MCG_CTL is set. The AMD docs say > >"The operating system must initialize the MCA_CONFIG registers prior to > >initialization of the MCA_CTL registers." > > > >"The MCA_CTL registers must be initialized prior to enabling the error > >reporting banks in MCG_CTL". > > > >However, the Intel docs "Machine-Check Initialization Pseudocode" say > >MCG_CTL first then MCi_CTL. > > > >But both agree that CR4.MCE should be set last. > > > >We have an old thread on the topic that led to this patch. > >https://lore.kernel.org/all/YqJHwXkg3Ny9fI3s@yaz-fattaah/ > > > >And it seemed okay at the time. > >https://lore.kernel.org/all/YrnTMmwl5TrHwT9J@zn.tnic/ > > > >I don't think anything much has changed since then, so I included the > >old patch again in this set. > > > >Thanks, > >Yazen > > This is exactly what needs to be in the commit message - why is the change being done. > -- Sure thing. I'll update the commit message. Thanks, Yazen
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> Sent: Friday, February 14, 2025 12:46 AM
> To: x86@kernel.org; Luck, Tony <tony.luck@intel.com>
> Cc: linux-kernel@vger.kernel.org; linux-edac@vger.kernel.org;
> Smita.KoralahalliChannabasappa@amd.com; Yazen Ghannam
> <yazen.ghannam@amd.com>
> Subject: [PATCH v2 06/16] x86/mce: Remove __mcheck_cpu_init_early()
>
> The __mcheck_cpu_init_early() function was introduced so that some
> vendor-specific features are detected before the first MCA polling event done
> in __mcheck_cpu_init_generic().
>
> Currently, __mcheck_cpu_init_early() is only used on AMD-based systems
> and additional code will be needed to support various system configurations.
>
> However, the current and future vendor-specific code should be done during
> vendor init. This keeps all the vendor code in a common location and
> simplifies the generic init flow.
>
> Move all the __mcheck_cpu_init_early() code into mce_amd_feature_init().
> Also, move __mcheck_cpu_init_generic() after
> __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first
> MCA polling event.
Maybe providing a bit more information about the first MCA polling event,
as shown below, would be clearer:
"... so that MCA is enabled after the first MCA polling event for the machine
checks left over from the previous reset."
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Other than that,
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
On Tue, Feb 18, 2025 at 03:00:12AM +0000, Zhuo, Qiuxu wrote: > > From: Yazen Ghannam <yazen.ghannam@amd.com> > > Sent: Friday, February 14, 2025 12:46 AM > > To: x86@kernel.org; Luck, Tony <tony.luck@intel.com> > > Cc: linux-kernel@vger.kernel.org; linux-edac@vger.kernel.org; > > Smita.KoralahalliChannabasappa@amd.com; Yazen Ghannam > > <yazen.ghannam@amd.com> > > Subject: [PATCH v2 06/16] x86/mce: Remove __mcheck_cpu_init_early() > > > > The __mcheck_cpu_init_early() function was introduced so that some > > vendor-specific features are detected before the first MCA polling event done > > in __mcheck_cpu_init_generic(). > > > > Currently, __mcheck_cpu_init_early() is only used on AMD-based systems > > and additional code will be needed to support various system configurations. > > > > However, the current and future vendor-specific code should be done during > > vendor init. This keeps all the vendor code in a common location and > > simplifies the generic init flow. > > > > Move all the __mcheck_cpu_init_early() code into mce_amd_feature_init(). > > Also, move __mcheck_cpu_init_generic() after > > __mcheck_cpu_init_prepare_banks() so that MCA is enabled after the first > > MCA polling event. > > Maybe providing a bit more information about the first MCA polling event, > as shown below, would be clearer: > > "... so that MCA is enabled after the first MCA polling event for the machine > checks left over from the previous reset." > Sure thing. > > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> > > Other than that, > > Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > Thanks, Yazen
© 2016 - 2025 Red Hat, Inc.