arch/x86/include/asm/mce.h | 4 +- arch/x86/kernel/cpu/mce/amd.c | 18 +-- arch/x86/kernel/cpu/mce/core.c | 230 +++++++++++++++------------ arch/x86/kernel/cpu/mce/dev-mcelog.c | 11 +- arch/x86/kernel/cpu/mce/genpool.c | 9 +- arch/x86/kernel/cpu/mce/inject.c | 2 +- arch/x86/kernel/cpu/mce/intel.c | 11 +- arch/x86/kernel/cpu/mce/threshold.c | 2 +- 8 files changed, 149 insertions(+), 138 deletions(-)
Clean up some x86/mce code as below. No functional changes intended.
- Simplify some code.
- Remove some unnecessary code.
- Improve readability for some code.
- Convert some family/model mixed checks to VFM-based checks.
- Fix some typos.
Pass the following basic tests:
- Tested on an Intel Sapphire Rapids server.
- Compile test.
- System boot test.
- Correctable/uncorrectable memory errors can be notified via CMCI/MCE interrupts.
- Correctable/uncorrectable memory errors can be dispatched to the mcelog daemon and the EDAC driver.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Drop the "x86/mce: Remove the redundant zeroing assignments" patch.
- 0003: Rename mce_notify_irq() to mce_notify_user().
- 0005: Move the 'int ret' variable along with the other int variables.
- 0006: New patch. Break up __mcheck_cpu_apply_quirks().
- 0007: New patch. Convert family/model mixed checks to VFM-based checks.
- 0009: Remove the variable names from the commit message.
- 0010: Remove the detail typos from the commit message.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
- 0009: Add the missing variable names to the commit message.
This series is based on v6.12-rc4.
Thanks Tony, Dave, Sohil and Nikolay for your review and discussion on this series.
Qiuxu Zhuo (9):
x86/mce/dev-mcelog: Use xchg() to get and clear the flags
x86/mce/intel: Use MCG_BANKCNT_MASK instead of 0xff
x86/mce: Make several functions return bool and rename a function
x86/mce/threshold: Remove the redundant this_cpu_dec_return()
x86/mce/genpool: Make mce_gen_pool_create() return explicit error codes
x86/mce: Convert family/model mixed checks to VFM-based checks
x86/mce: Remove the unnecessary {}
x86/mce/amd: Remove unnecessary NULL pointer initializations
x86/mce: Fix typos
Tony Luck (1):
x86/mce: Break up __mcheck_cpu_apply_quirks()
arch/x86/include/asm/mce.h | 4 +-
arch/x86/kernel/cpu/mce/amd.c | 18 +--
arch/x86/kernel/cpu/mce/core.c | 230 +++++++++++++++------------
arch/x86/kernel/cpu/mce/dev-mcelog.c | 11 +-
arch/x86/kernel/cpu/mce/genpool.c | 9 +-
arch/x86/kernel/cpu/mce/inject.c | 2 +-
arch/x86/kernel/cpu/mce/intel.c | 11 +-
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
8 files changed, 149 insertions(+), 138 deletions(-)
base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
--
2.17.1
Clean up some x86/mce code as below. No functional changes intended.
- Simplify some code.
- Remove some unnecessary code.
- Improve readability for some code.
- Convert some family/model mixed checks to VFM-based checks.
- Fix some typos.
Pass the following basic tests:
- Tested on an Intel Sapphire Rapids server.
- Compile test.
- System boot test.
- Correctable/uncorrectable memory errors can be notified via CMCI/MCE interrupts.
- Correctable/uncorrectable memory errors can be dispatched to the mcelog daemon and the EDAC driver.
Changes in v4:
- Drop the first two patches as they've landed in the TIP ras/core branch.
- Drop "Make mce_gen_pool_create() return explicit error codes" patch.
- 0001: Don't rename mce_notify_irq().
- 0003: New patch. Make four functions return bool.
- 0004: Add necessary blank lines and directly use 'mca_cfg'.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Drop the "x86/mce: Remove the redundant zeroing assignments" patch.
- 0003: Rename mce_notify_irq() to mce_notify_user().
- 0005: Move the 'int ret' variable along with the other int variables.
- 0006: New patch. Break up __mcheck_cpu_apply_quirks().
- 0007: New patch. Convert family/model mixed checks to VFM-based checks.
- 0009: Remove the variables' names from the commit message.
- 0010: Remove the detail typos from the commit message.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
- Update the commit message of patch 9 to include the names of all
variables that don't need NULL pointer initializations.
This series is based on the TIP branch ras/core with top commit 612c2addff36.
Thanks Thomas, Boris, Tony, Dave, Sohil and Nikolay for your review and discussion on this series.
Qiuxu Zhuo (7):
x86/mce: Make several functions return bool
x86/mce/threshold: Remove the redundant this_cpu_dec_return()
x86/mce: Make four functions return bool
x86/mce: Convert family/model mixed checks to VFM-based checks
x86/mce: Remove the unnecessary {}
x86/mce/amd: Remove unnecessary NULL pointer initializations
x86/mce: Fix typos
Tony Luck (1):
x86/mce: Break up __mcheck_cpu_apply_quirks()
arch/x86/include/asm/mce.h | 4 +-
arch/x86/kernel/cpu/mce/amd.c | 18 +--
arch/x86/kernel/cpu/mce/core.c | 234 +++++++++++++++-------------
arch/x86/kernel/cpu/mce/genpool.c | 29 ++--
arch/x86/kernel/cpu/mce/intel.c | 9 +-
arch/x86/kernel/cpu/mce/internal.h | 4 +-
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
7 files changed, 158 insertions(+), 142 deletions(-)
base-commit: 612c2addff367ee461dc99ffca2bc786f105d2ec
--
2.17.1
Clean up some x86/mce code as below. No functional changes intended.
- Simplify some code.
- Remove some unnecessary code.
- Improve readability for some code.
- Convert some family/model mixed checks to VFM-based checks.
Pass the following basic tests:
- Tested on an Intel Sapphire Rapids server.
- Compile test.
- System boot test.
- Correctable/uncorrectable memory errors can be notified via CMCI/MCE interrupts.
- Correctable/uncorrectable memory errors can be dispatched to the mcelog daemon and the EDAC driver.
Changes in v5:
- Collect "Reviewed-by:" from Sohil & Yazen.
- Drop "Fix typos" patch.
- 0003: Update the commit message with mention the polarities of the return values are flipped.
- 0005: Reduce 'if (mce_num_banks > 0)' to 'if (mce_num_banks)'.
- 0006: Combine AMD and HYGON feature initialization and remove mce_hygon_feature_init().
Changes in v4:
- Drop the first two patches as they've landed in the TIP ras/core branch.
- Drop "Make mce_gen_pool_create() return explicit error codes" patch.
- 0001: Don't rename mce_notify_irq().
- 0003: New patch. Make four functions return bool.
- 0004: Add necessary blank lines and directly use 'mca_cfg'.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Drop the "x86/mce: Remove the redundant zeroing assignments" patch.
- 0003: Rename mce_notify_irq() to mce_notify_user().
- 0005: Move the 'int ret' variable along with the other int variables.
- 0006: New patch. Break up __mcheck_cpu_apply_quirks().
- 0007: New patch. Convert family/model mixed checks to VFM-based checks.
- 0009: Remove the variables' names from the commit message.
- 0010: Remove the detail typos from the commit message.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
- Update the commit message of patch 9 to include the names of all
variables that don't need NULL pointer initializations.
This series is based on v6.13-rc2.
Thanks Thomas, Boris, Tony, Dave, Sohil, Yazen, and Nikolay for your review and discussion on this series.
Qiuxu Zhuo (6):
x86/mce: Make several functions return bool
x86/mce/threshold: Remove the redundant this_cpu_dec_return()
x86/mce: Make four functions return bool
x86/mce: Convert family/model mixed checks to VFM-based checks
x86/mce: Remove the redundant mce_hygon_feature_init()
x86/mce/amd: Remove unnecessary NULL pointer initializations
Tony Luck (1):
x86/mce: Break up __mcheck_cpu_apply_quirks()
arch/x86/include/asm/mce.h | 6 +-
arch/x86/kernel/cpu/mce/amd.c | 18 +--
arch/x86/kernel/cpu/mce/core.c | 237 +++++++++++++++-------------
arch/x86/kernel/cpu/mce/genpool.c | 29 ++--
arch/x86/kernel/cpu/mce/intel.c | 9 +-
arch/x86/kernel/cpu/mce/internal.h | 4 +-
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
7 files changed, 158 insertions(+), 147 deletions(-)
base-commit: fac04efc5c793dccbd07e2d59af9f90b7fc0dca4
--
2.17.1
Make several functions that return 0 or 1 return a boolean value for
better readability.
No functional changes are intended.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- No changes.
Changes in v4:
- Don't rename mce_notify_irq() - Boris.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Rename mce_notify_irq() to mce_notify_user() (Sohil).
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/include/asm/mce.h | 4 ++--
arch/x86/kernel/cpu/mce/amd.c | 10 +++++-----
arch/x86/kernel/cpu/mce/core.c | 22 +++++++++++-----------
arch/x86/kernel/cpu/mce/intel.c | 9 +++++----
4 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 4543cf2eb5e8..ea9ca7689f6b 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -276,7 +276,7 @@ static inline void cmci_rediscover(void) {}
static inline void cmci_recheck(void) {}
#endif
-int mce_available(struct cpuinfo_x86 *c);
+bool mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
bool mce_is_correctable(struct mce *m);
bool mce_usable_address(struct mce *m);
@@ -296,7 +296,7 @@ enum mcp_flags {
void machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
-int mce_notify_irq(void);
+bool mce_notify_irq(void);
DECLARE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6ca80fff1fea..018874b554cb 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -381,7 +381,7 @@ static bool lvt_interrupt_supported(unsigned int bank, u32 msr_high_bits)
return msr_high_bits & BIT(28);
}
-static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
+static bool lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
{
int msr = (hi & MASK_LVTOFF_HI) >> 20;
@@ -389,7 +389,7 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
if (apic != msr) {
@@ -399,15 +399,15 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
* was set is reserved. Return early here:
*/
if (mce_flags.smca)
- return 0;
+ return false;
pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n",
b->cpu, apic, b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
- return 1;
+ return true;
};
/* Reprogram MCx_MISC MSR behind this threshold bank. */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7fb5556a0b53..167965bd2ac0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -492,10 +492,10 @@ static noinstr void mce_gather_info(struct mce_hw_err *err, struct pt_regs *regs
}
}
-int mce_available(struct cpuinfo_x86 *c)
+bool mce_available(struct cpuinfo_x86 *c)
{
if (mca_cfg.disabled)
- return 0;
+ return false;
return cpu_has(c, X86_FEATURE_MCE) && cpu_has(c, X86_FEATURE_MCA);
}
@@ -1778,7 +1778,7 @@ static void mce_timer_delete_all(void)
* Can be called from interrupt context, but not from machine check/NMI
* context.
*/
-int mce_notify_irq(void)
+bool mce_notify_irq(void)
{
/* Not more than two messages every minute */
static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
@@ -1789,9 +1789,9 @@ int mce_notify_irq(void)
if (__ratelimit(&ratelimit))
pr_info(HW_ERR "Machine check events logged\n");
- return 1;
+ return true;
}
- return 0;
+ return false;
}
EXPORT_SYMBOL_GPL(mce_notify_irq);
@@ -2015,25 +2015,25 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
return 0;
}
-static int __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
{
if (c->x86 != 5)
- return 0;
+ return false;
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
intel_p5_mcheck_init(c);
mce_flags.p5 = 1;
- return 1;
+ return true;
case X86_VENDOR_CENTAUR:
winchip_mcheck_init(c);
mce_flags.winchip = 1;
- return 1;
+ return true;
default:
- return 0;
+ return false;
}
- return 0;
+ return false;
}
/*
diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b3cd2c61b11d..f863df0ff42c 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -75,12 +75,12 @@ static u16 cmci_threshold[MAX_NR_BANKS];
*/
#define CMCI_STORM_THRESHOLD 32749
-static int cmci_supported(int *banks)
+static bool cmci_supported(int *banks)
{
u64 cap;
if (mca_cfg.cmci_disabled || mca_cfg.ignore_ce)
- return 0;
+ return false;
/*
* Vendor check is not strictly needed, but the initial
@@ -89,10 +89,11 @@ static int cmci_supported(int *banks)
*/
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN)
- return 0;
+ return false;
if (!boot_cpu_has(X86_FEATURE_APIC) || lapic_get_maxlvt() < 6)
- return 0;
+ return false;
+
rdmsrl(MSR_IA32_MCG_CAP, cap);
*banks = min_t(unsigned, MAX_NR_BANKS, cap & MCG_BANKCNT_MASK);
return !!(cap & MCG_CMCI_P);
--
2.17.1
On Thu, Dec 12, 2024 at 10:00:57PM +0800, Qiuxu Zhuo wrote: > Make several functions that return 0 or 1 return a boolean value for > better readability. > > No functional changes are intended. > > Reviewed-by: Tony Luck <tony.luck@intel.com> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> > Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Thanks, Yazen
> From: Yazen Ghannam <yazen.ghannam@amd.com> > [...] > Subject: Re: [PATCH v5 1/7] x86/mce: Make several functions return bool > [...] > > Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Hi Yazen, Thanks for reviewing this patch series. - Qiuxu
The following commit has been merged into the ras/core branch of tip:
Commit-ID: c845cb8dbd2e1a804babfd13648026c3a7cfbc0b
Gitweb: https://git.kernel.org/tip/c845cb8dbd2e1a804babfd13648026c3a7cfbc0b
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:00:57 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 19:05:50 +01:00
x86/mce: Make several functions return bool
Make several functions that return 0 or 1 return a boolean value for
better readability.
No functional changes are intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-2-qiuxu.zhuo@intel.com
---
arch/x86/include/asm/mce.h | 4 ++--
arch/x86/kernel/cpu/mce/amd.c | 10 +++++-----
arch/x86/kernel/cpu/mce/core.c | 22 +++++++++++-----------
arch/x86/kernel/cpu/mce/intel.c | 9 +++++----
4 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 4543cf2..ea9ca76 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -276,7 +276,7 @@ static inline void cmci_rediscover(void) {}
static inline void cmci_recheck(void) {}
#endif
-int mce_available(struct cpuinfo_x86 *c);
+bool mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
bool mce_is_correctable(struct mce *m);
bool mce_usable_address(struct mce *m);
@@ -296,7 +296,7 @@ enum mcp_flags {
void machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
-int mce_notify_irq(void);
+bool mce_notify_irq(void);
DECLARE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6ca80ff..018874b 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -381,7 +381,7 @@ static bool lvt_interrupt_supported(unsigned int bank, u32 msr_high_bits)
return msr_high_bits & BIT(28);
}
-static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
+static bool lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
{
int msr = (hi & MASK_LVTOFF_HI) >> 20;
@@ -389,7 +389,7 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
if (apic != msr) {
@@ -399,15 +399,15 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
* was set is reserved. Return early here:
*/
if (mce_flags.smca)
- return 0;
+ return false;
pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n",
b->cpu, apic, b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
- return 1;
+ return true;
};
/* Reprogram MCx_MISC MSR behind this threshold bank. */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7fb5556..167965b 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -492,10 +492,10 @@ static noinstr void mce_gather_info(struct mce_hw_err *err, struct pt_regs *regs
}
}
-int mce_available(struct cpuinfo_x86 *c)
+bool mce_available(struct cpuinfo_x86 *c)
{
if (mca_cfg.disabled)
- return 0;
+ return false;
return cpu_has(c, X86_FEATURE_MCE) && cpu_has(c, X86_FEATURE_MCA);
}
@@ -1778,7 +1778,7 @@ static void mce_timer_delete_all(void)
* Can be called from interrupt context, but not from machine check/NMI
* context.
*/
-int mce_notify_irq(void)
+bool mce_notify_irq(void)
{
/* Not more than two messages every minute */
static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
@@ -1789,9 +1789,9 @@ int mce_notify_irq(void)
if (__ratelimit(&ratelimit))
pr_info(HW_ERR "Machine check events logged\n");
- return 1;
+ return true;
}
- return 0;
+ return false;
}
EXPORT_SYMBOL_GPL(mce_notify_irq);
@@ -2015,25 +2015,25 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
return 0;
}
-static int __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
{
if (c->x86 != 5)
- return 0;
+ return false;
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
intel_p5_mcheck_init(c);
mce_flags.p5 = 1;
- return 1;
+ return true;
case X86_VENDOR_CENTAUR:
winchip_mcheck_init(c);
mce_flags.winchip = 1;
- return 1;
+ return true;
default:
- return 0;
+ return false;
}
- return 0;
+ return false;
}
/*
diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b3cd2c6..f863df0 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -75,12 +75,12 @@ static u16 cmci_threshold[MAX_NR_BANKS];
*/
#define CMCI_STORM_THRESHOLD 32749
-static int cmci_supported(int *banks)
+static bool cmci_supported(int *banks)
{
u64 cap;
if (mca_cfg.cmci_disabled || mca_cfg.ignore_ce)
- return 0;
+ return false;
/*
* Vendor check is not strictly needed, but the initial
@@ -89,10 +89,11 @@ static int cmci_supported(int *banks)
*/
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN)
- return 0;
+ return false;
if (!boot_cpu_has(X86_FEATURE_APIC) || lapic_get_maxlvt() < 6)
- return 0;
+ return false;
+
rdmsrl(MSR_IA32_MCG_CAP, cap);
*banks = min_t(unsigned, MAX_NR_BANKS, cap & MCG_BANKCNT_MASK);
return !!(cap & MCG_CMCI_P);
The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the
'stormy_bank_count' field through the 'storm' to avoid calling
this_cpu_*() on the same per-CPU variable twice.
This minor optimization reduces the text size by 16 bytes.
$ size threshold.o.*
text data bss dec hex filename
1395 1664 0 3059 bf3 threshold.o.old
1379 1664 0 3043 be3 threshold.o.new
No functional changes intended.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- No changes.
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index 89e31e1e5c9c..f4a007616468 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -90,7 +90,7 @@ void cmci_storm_end(unsigned int bank)
storm->banks[bank].in_storm_mode = false;
/* If no banks left in storm mode, stop polling. */
- if (!this_cpu_dec_return(storm_desc.stormy_bank_count))
+ if (!--storm->stormy_bank_count)
mce_timer_kick(false);
}
--
2.17.1
On Thu, Dec 12, 2024 at 10:00:58PM +0800, Qiuxu Zhuo wrote: > The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the > 'stormy_bank_count' field through the 'storm' to avoid calling > this_cpu_*() on the same per-CPU variable twice. > > This minor optimization reduces the text size by 16 bytes. > > $ size threshold.o.* > text data bss dec hex filename > 1395 1664 0 3059 bf3 threshold.o.old > 1379 1664 0 3043 be3 threshold.o.new > > No functional changes intended. > > Reviewed-by: Tony Luck <tony.luck@intel.com> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> > Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Thanks, Yazen
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 64a668fbea1b6ec06ddca66d09cc49352f063342
Gitweb: https://git.kernel.org/tip/64a668fbea1b6ec06ddca66d09cc49352f063342
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:00:58 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 19:45:03 +01:00
x86/mce/threshold: Remove the redundant this_cpu_dec_return()
The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the
'stormy_bank_count' field through the 'storm' to avoid calling
this_cpu_*() on the same per-CPU variable twice.
This minor optimization reduces the text size by 16 bytes.
$ size threshold.o.*
text data bss dec hex filename
1395 1664 0 3059 bf3 threshold.o.old
1379 1664 0 3043 be3 threshold.o.new
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-3-qiuxu.zhuo@intel.com
---
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index 89e31e1..f4a0076 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -90,7 +90,7 @@ void cmci_storm_end(unsigned int bank)
storm->banks[bank].in_storm_mode = false;
/* If no banks left in storm mode, stop polling. */
- if (!this_cpu_dec_return(storm_desc.stormy_bank_count))
+ if (!--storm->stormy_bank_count)
mce_timer_kick(false);
}
Make those functions whose callers only care about success or failure
return a boolean value for better readability. Also, update the call
sites accordingly as the polarities of all the return values have been
flipped.
No functional changes.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- Collect "Reviewed-by:" from Sohil.
- Mention the polarities of return values are flipped in the commit message.
Changes in v4:
- New patch.
arch/x86/kernel/cpu/mce/core.c | 12 ++++++------
arch/x86/kernel/cpu/mce/genpool.c | 29 ++++++++++++++---------------
arch/x86/kernel/cpu/mce/internal.h | 4 ++--
3 files changed, 22 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 167965bd2ac0..ce6fe5e20805 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -151,7 +151,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
void mce_log(struct mce_hw_err *err)
{
- if (!mce_gen_pool_add(err))
+ if (mce_gen_pool_add(err))
irq_work_queue(&mce_irq_work);
}
EXPORT_SYMBOL_GPL(mce_log);
@@ -1911,14 +1911,14 @@ static void __mcheck_cpu_check_banks(void)
}
/* Add per CPU specific workarounds here */
-static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
pr_info("unknown CPU type - not enabling MCE support\n");
- return -EOPNOTSUPP;
+ return false;
}
/* This should be disabled by the BIOS, but isn't always */
@@ -2012,7 +2012,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
if (cfg->bootlog != 0)
cfg->panic_timeout = 30;
- return 0;
+ return true;
}
static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
@@ -2279,12 +2279,12 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
__mcheck_cpu_cap_init();
- if (__mcheck_cpu_apply_quirks(c) < 0) {
+ if (!__mcheck_cpu_apply_quirks(c)) {
mca_cfg.disabled = 1;
return;
}
- if (mce_gen_pool_init()) {
+ if (!mce_gen_pool_init()) {
mca_cfg.disabled = 1;
pr_emerg("Couldn't allocate MCE records pool!\n");
return;
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index d0be6dda0c14..3ca9c007a666 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -94,64 +94,63 @@ bool mce_gen_pool_empty(void)
return llist_empty(&mce_event_llist);
}
-int mce_gen_pool_add(struct mce_hw_err *err)
+bool mce_gen_pool_add(struct mce_hw_err *err)
{
struct mce_evt_llist *node;
if (filter_mce(&err->m))
- return -EINVAL;
+ return false;
if (!mce_evt_pool)
- return -EINVAL;
+ return false;
node = (void *)gen_pool_alloc(mce_evt_pool, sizeof(*node));
if (!node) {
pr_warn_ratelimited("MCE records pool full!\n");
- return -ENOMEM;
+ return false;
}
memcpy(&node->err, err, sizeof(*err));
llist_add(&node->llnode, &mce_event_llist);
- return 0;
+ return true;
}
-static int mce_gen_pool_create(void)
+static bool mce_gen_pool_create(void)
{
int mce_numrecords, mce_poolsz, order;
struct gen_pool *gpool;
- int ret = -ENOMEM;
void *mce_pool;
order = order_base_2(sizeof(struct mce_evt_llist));
gpool = gen_pool_create(order, -1);
if (!gpool)
- return ret;
+ return false;
mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU);
mce_poolsz = mce_numrecords * (1 << order);
mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
if (!mce_pool) {
gen_pool_destroy(gpool);
- return ret;
+ return false;
}
- ret = gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1);
- if (ret) {
+
+ if (gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1)) {
gen_pool_destroy(gpool);
kfree(mce_pool);
- return ret;
+ return false;
}
mce_evt_pool = gpool;
- return ret;
+ return true;
}
-int mce_gen_pool_init(void)
+bool mce_gen_pool_init(void)
{
/* Just init mce_gen_pool once. */
if (mce_evt_pool)
- return 0;
+ return true;
return mce_gen_pool_create();
}
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 84f810598231..95a504ece43e 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -31,8 +31,8 @@ struct mce_evt_llist {
void mce_gen_pool_process(struct work_struct *__unused);
bool mce_gen_pool_empty(void);
-int mce_gen_pool_add(struct mce_hw_err *err);
-int mce_gen_pool_init(void);
+bool mce_gen_pool_add(struct mce_hw_err *err);
+bool mce_gen_pool_init(void);
struct llist_node *mce_gen_pool_prepare_records(void);
int mce_severity(struct mce *a, struct pt_regs *regs, char **msg, bool is_excp);
--
2.17.1
On Thu, Dec 12, 2024 at 10:00:59PM +0800, Qiuxu Zhuo wrote:
> Make those functions whose callers only care about success or failure
> return a boolean value for better readability. Also, update the call
> sites accordingly as the polarities of all the return values have been
> flipped.
>
> No functional changes.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
But one thought below...
[...]
> -static int mce_gen_pool_create(void)
> +static bool mce_gen_pool_create(void)
> {
> int mce_numrecords, mce_poolsz, order;
> struct gen_pool *gpool;
> - int ret = -ENOMEM;
> void *mce_pool;
>
> order = order_base_2(sizeof(struct mce_evt_llist));
> gpool = gen_pool_create(order, -1);
> if (!gpool)
> - return ret;
> + return false;
>
> mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU);
> mce_poolsz = mce_numrecords * (1 << order);
> mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
> if (!mce_pool) {
> gen_pool_destroy(gpool);
> - return ret;
> + return false;
> }
> - ret = gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1);
> - if (ret) {
> +
> + if (gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1)) {
> gen_pool_destroy(gpool);
> kfree(mce_pool);
> - return ret;
> + return false;
> }
>
> mce_evt_pool = gpool;
>
> - return ret;
> + return true;
> }
>
It seems like this segment could make use of the helpers in
<linux/cleanup.h> (maybe as future work).
Thanks,
Yazen
The following commit has been merged into the ras/core branch of tip:
Commit-ID: c46945c9cac8437a674edb9d8fbe71511fb4acee
Gitweb: https://git.kernel.org/tip/c46945c9cac8437a674edb9d8fbe71511fb4acee
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:00:59 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Mon, 30 Dec 2024 22:06:36 +01:00
x86/mce: Make four functions return bool
Make those functions whose callers only care about success or failure return
a boolean value for better readability. Also, update the call sites
accordingly as the polarities of all the return values have been flipped.
No functional changes.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-4-qiuxu.zhuo@intel.com
---
arch/x86/kernel/cpu/mce/core.c | 12 ++++++------
arch/x86/kernel/cpu/mce/genpool.c | 29 ++++++++++++++---------------
arch/x86/kernel/cpu/mce/internal.h | 4 ++--
3 files changed, 22 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 167965b..ce6fe5e 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -151,7 +151,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
void mce_log(struct mce_hw_err *err)
{
- if (!mce_gen_pool_add(err))
+ if (mce_gen_pool_add(err))
irq_work_queue(&mce_irq_work);
}
EXPORT_SYMBOL_GPL(mce_log);
@@ -1911,14 +1911,14 @@ static void __mcheck_cpu_check_banks(void)
}
/* Add per CPU specific workarounds here */
-static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
pr_info("unknown CPU type - not enabling MCE support\n");
- return -EOPNOTSUPP;
+ return false;
}
/* This should be disabled by the BIOS, but isn't always */
@@ -2012,7 +2012,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
if (cfg->bootlog != 0)
cfg->panic_timeout = 30;
- return 0;
+ return true;
}
static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
@@ -2279,12 +2279,12 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
__mcheck_cpu_cap_init();
- if (__mcheck_cpu_apply_quirks(c) < 0) {
+ if (!__mcheck_cpu_apply_quirks(c)) {
mca_cfg.disabled = 1;
return;
}
- if (mce_gen_pool_init()) {
+ if (!mce_gen_pool_init()) {
mca_cfg.disabled = 1;
pr_emerg("Couldn't allocate MCE records pool!\n");
return;
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index d0be6dd..3ca9c00 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -94,64 +94,63 @@ bool mce_gen_pool_empty(void)
return llist_empty(&mce_event_llist);
}
-int mce_gen_pool_add(struct mce_hw_err *err)
+bool mce_gen_pool_add(struct mce_hw_err *err)
{
struct mce_evt_llist *node;
if (filter_mce(&err->m))
- return -EINVAL;
+ return false;
if (!mce_evt_pool)
- return -EINVAL;
+ return false;
node = (void *)gen_pool_alloc(mce_evt_pool, sizeof(*node));
if (!node) {
pr_warn_ratelimited("MCE records pool full!\n");
- return -ENOMEM;
+ return false;
}
memcpy(&node->err, err, sizeof(*err));
llist_add(&node->llnode, &mce_event_llist);
- return 0;
+ return true;
}
-static int mce_gen_pool_create(void)
+static bool mce_gen_pool_create(void)
{
int mce_numrecords, mce_poolsz, order;
struct gen_pool *gpool;
- int ret = -ENOMEM;
void *mce_pool;
order = order_base_2(sizeof(struct mce_evt_llist));
gpool = gen_pool_create(order, -1);
if (!gpool)
- return ret;
+ return false;
mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU);
mce_poolsz = mce_numrecords * (1 << order);
mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
if (!mce_pool) {
gen_pool_destroy(gpool);
- return ret;
+ return false;
}
- ret = gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1);
- if (ret) {
+
+ if (gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1)) {
gen_pool_destroy(gpool);
kfree(mce_pool);
- return ret;
+ return false;
}
mce_evt_pool = gpool;
- return ret;
+ return true;
}
-int mce_gen_pool_init(void)
+bool mce_gen_pool_init(void)
{
/* Just init mce_gen_pool once. */
if (mce_evt_pool)
- return 0;
+ return true;
return mce_gen_pool_create();
}
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 84f8105..95a504e 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -31,8 +31,8 @@ struct mce_evt_llist {
void mce_gen_pool_process(struct work_struct *__unused);
bool mce_gen_pool_empty(void);
-int mce_gen_pool_add(struct mce_hw_err *err);
-int mce_gen_pool_init(void);
+bool mce_gen_pool_add(struct mce_hw_err *err);
+bool mce_gen_pool_init(void);
struct llist_node *mce_gen_pool_prepare_records(void);
int mce_severity(struct mce *a, struct pt_regs *regs, char **msg, bool is_excp);
From: Tony Luck <tony.luck@intel.com>
Split each vendor specific part into its own helper function.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- Collect "Reviewed-by:" from Sohil.
Changes in v4:
- Add necessary blank lines in apply_quirks_amd() (Yazen).
- Use 'mca_cfg' instead of 'cfg' in apply_quirks_*(). (Yazen).
Changes in v3:
- New patch.
arch/x86/kernel/cpu/mce/core.c | 192 ++++++++++++++++++---------------
1 file changed, 104 insertions(+), 88 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index ce6fe5e20805..3855ec2ed0e0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1910,101 +1910,117 @@ static void __mcheck_cpu_check_banks(void)
}
}
+static void apply_quirks_amd(struct cpuinfo_x86 *c)
+{
+ struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+
+ /* This should be disabled by the BIOS, but isn't always */
+ if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
+ /*
+ * disable GART TBL walk error reporting, which
+ * trips off incorrectly with the IOMMU & 3ware
+ * & Cerberus:
+ */
+ clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
+ }
+
+ if (c->x86 < 0x11 && mca_cfg.bootlog < 0) {
+ /*
+ * Lots of broken BIOS around that don't clear them
+ * by default and leave crap in there. Don't log:
+ */
+ mca_cfg.bootlog = 0;
+ }
+
+ /*
+ * Various K7s with broken bank 0 around. Always disable
+ * by default.
+ */
+ if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].ctl = 0;
+
+ /*
+ * overflow_recov is supported for F15h Models 00h-0fh
+ * even though we don't have a CPUID bit for it.
+ */
+ if (c->x86 == 0x15 && c->x86_model <= 0xf)
+ mce_flags.overflow_recov = 1;
+
+ if (c->x86 >= 0x17 && c->x86 <= 0x1A)
+ mce_flags.zen_ifu_quirk = 1;
+}
+
+static void apply_quirks_intel(struct cpuinfo_x86 *c)
+{
+ struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+
+ /*
+ * SDM documents that on family 6 bank 0 should not be written
+ * because it aliases to another special BIOS controlled
+ * register.
+ * But it's not aliased anymore on model 0x1a+
+ * Don't ignore bank 0 completely because there could be a
+ * valid event later, merely don't write CTL0.
+ */
+ if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].init = false;
+
+ /*
+ * All newer Intel systems support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
+ mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
+
+ /*
+ * There are also broken BIOSes on some Pentium M and
+ * earlier systems:
+ */
+ if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ mca_cfg.bootlog = 0;
+
+ if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
+ mce_flags.snb_ifu_quirk = 1;
+
+ /*
+ * Skylake, Cascacde Lake and Cooper Lake require a quirk on
+ * rep movs.
+ */
+ if (c->x86_vfm == INTEL_SKYLAKE_X)
+ mce_flags.skx_repmov_quirk = 1;
+}
+
+static void apply_quirks_zhaoxin(struct cpuinfo_x86 *c)
+{
+ /*
+ * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
+ if (mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
+ }
+}
+
/* Add per CPU specific workarounds here */
static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
- struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
- if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
+ switch (c->x86_vendor) {
+ case X86_VENDOR_UNKNOWN:
pr_info("unknown CPU type - not enabling MCE support\n");
return false;
- }
-
- /* This should be disabled by the BIOS, but isn't always */
- if (c->x86_vendor == X86_VENDOR_AMD) {
- if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
- /*
- * disable GART TBL walk error reporting, which
- * trips off incorrectly with the IOMMU & 3ware
- * & Cerberus:
- */
- clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
- }
- if (c->x86 < 0x11 && cfg->bootlog < 0) {
- /*
- * Lots of broken BIOS around that don't clear them
- * by default and leave crap in there. Don't log:
- */
- cfg->bootlog = 0;
- }
- /*
- * Various K7s with broken bank 0 around. Always disable
- * by default.
- */
- if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].ctl = 0;
-
- /*
- * overflow_recov is supported for F15h Models 00h-0fh
- * even though we don't have a CPUID bit for it.
- */
- if (c->x86 == 0x15 && c->x86_model <= 0xf)
- mce_flags.overflow_recov = 1;
-
- if (c->x86 >= 0x17 && c->x86 <= 0x1A)
- mce_flags.zen_ifu_quirk = 1;
-
- }
-
- if (c->x86_vendor == X86_VENDOR_INTEL) {
- /*
- * SDM documents that on family 6 bank 0 should not be written
- * because it aliases to another special BIOS controlled
- * register.
- * But it's not aliased anymore on model 0x1a+
- * Don't ignore bank 0 completely because there could be a
- * valid event later, merely don't write CTL0.
- */
-
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].init = false;
-
- /*
- * All newer Intel systems support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
-
- /*
- * There are also broken BIOSes on some Pentium M and
- * earlier systems:
- */
- if (c->x86 == 6 && c->x86_model <= 13 && cfg->bootlog < 0)
- cfg->bootlog = 0;
-
- if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
- mce_flags.snb_ifu_quirk = 1;
-
- /*
- * Skylake, Cascacde Lake and Cooper Lake require a quirk on
- * rep movs.
- */
- if (c->x86_vfm == INTEL_SKYLAKE_X)
- mce_flags.skx_repmov_quirk = 1;
- }
-
- if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
- /*
- * All newer Zhaoxin CPUs support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
- if (cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
- }
+ case X86_VENDOR_AMD:
+ apply_quirks_amd(c);
+ break;
+ case X86_VENDOR_INTEL:
+ apply_quirks_intel(c);
+ break;
+ case X86_VENDOR_ZHAOXIN:
+ apply_quirks_zhaoxin(c);
+ break;
}
if (cfg->monarch_timeout < 0)
--
2.17.1
On Thu, Dec 12, 2024 at 10:01:00PM +0800, Qiuxu Zhuo wrote: > From: Tony Luck <tony.luck@intel.com> > > Split each vendor specific part into its own helper function. > > Signed-off-by: Tony Luck <tony.luck@intel.com> > Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Thanks, Yazen
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 51a12c28bb9a043e9444db5bd214b00ec161a639
Gitweb: https://git.kernel.org/tip/51a12c28bb9a043e9444db5bd214b00ec161a639
Author: Tony Luck <tony.luck@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:01:00 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 31 Dec 2024 11:07:05 +01:00
x86/mce: Break up __mcheck_cpu_apply_quirks()
Split each vendor specific part into its own helper function.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://lore.kernel.org/r/20241212140103.66964-5-qiuxu.zhuo@intel.com
---
arch/x86/kernel/cpu/mce/core.c | 168 +++++++++++++++++---------------
1 file changed, 92 insertions(+), 76 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index ce6fe5e..3855ec2 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1910,101 +1910,117 @@ static void __mcheck_cpu_check_banks(void)
}
}
-/* Add per CPU specific workarounds here */
-static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+static void apply_quirks_amd(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
- struct mca_config *cfg = &mca_cfg;
-
- if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
- pr_info("unknown CPU type - not enabling MCE support\n");
- return false;
- }
/* This should be disabled by the BIOS, but isn't always */
- if (c->x86_vendor == X86_VENDOR_AMD) {
- if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
- /*
- * disable GART TBL walk error reporting, which
- * trips off incorrectly with the IOMMU & 3ware
- * & Cerberus:
- */
- clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
- }
- if (c->x86 < 0x11 && cfg->bootlog < 0) {
- /*
- * Lots of broken BIOS around that don't clear them
- * by default and leave crap in there. Don't log:
- */
- cfg->bootlog = 0;
- }
+ if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
/*
- * Various K7s with broken bank 0 around. Always disable
- * by default.
+ * disable GART TBL walk error reporting, which
+ * trips off incorrectly with the IOMMU & 3ware
+ * & Cerberus:
*/
- if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].ctl = 0;
+ clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
+ }
+ if (c->x86 < 0x11 && mca_cfg.bootlog < 0) {
/*
- * overflow_recov is supported for F15h Models 00h-0fh
- * even though we don't have a CPUID bit for it.
+ * Lots of broken BIOS around that don't clear them
+ * by default and leave crap in there. Don't log:
*/
- if (c->x86 == 0x15 && c->x86_model <= 0xf)
- mce_flags.overflow_recov = 1;
+ mca_cfg.bootlog = 0;
+ }
- if (c->x86 >= 0x17 && c->x86 <= 0x1A)
- mce_flags.zen_ifu_quirk = 1;
+ /*
+ * Various K7s with broken bank 0 around. Always disable
+ * by default.
+ */
+ if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].ctl = 0;
- }
+ /*
+ * overflow_recov is supported for F15h Models 00h-0fh
+ * even though we don't have a CPUID bit for it.
+ */
+ if (c->x86 == 0x15 && c->x86_model <= 0xf)
+ mce_flags.overflow_recov = 1;
- if (c->x86_vendor == X86_VENDOR_INTEL) {
- /*
- * SDM documents that on family 6 bank 0 should not be written
- * because it aliases to another special BIOS controlled
- * register.
- * But it's not aliased anymore on model 0x1a+
- * Don't ignore bank 0 completely because there could be a
- * valid event later, merely don't write CTL0.
- */
+ if (c->x86 >= 0x17 && c->x86 <= 0x1A)
+ mce_flags.zen_ifu_quirk = 1;
+}
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].init = false;
+static void apply_quirks_intel(struct cpuinfo_x86 *c)
+{
+ struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
- /*
- * All newer Intel systems support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
+ /*
+ * SDM documents that on family 6 bank 0 should not be written
+ * because it aliases to another special BIOS controlled
+ * register.
+ * But it's not aliased anymore on model 0x1a+
+ * Don't ignore bank 0 completely because there could be a
+ * valid event later, merely don't write CTL0.
+ */
+ if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].init = false;
- /*
- * There are also broken BIOSes on some Pentium M and
- * earlier systems:
- */
- if (c->x86 == 6 && c->x86_model <= 13 && cfg->bootlog < 0)
- cfg->bootlog = 0;
+ /*
+ * All newer Intel systems support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
+ mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
+
+ /*
+ * There are also broken BIOSes on some Pentium M and
+ * earlier systems:
+ */
+ if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ mca_cfg.bootlog = 0;
- if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
- mce_flags.snb_ifu_quirk = 1;
+ if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
+ mce_flags.snb_ifu_quirk = 1;
- /*
- * Skylake, Cascacde Lake and Cooper Lake require a quirk on
- * rep movs.
- */
- if (c->x86_vfm == INTEL_SKYLAKE_X)
- mce_flags.skx_repmov_quirk = 1;
+ /*
+ * Skylake, Cascacde Lake and Cooper Lake require a quirk on
+ * rep movs.
+ */
+ if (c->x86_vfm == INTEL_SKYLAKE_X)
+ mce_flags.skx_repmov_quirk = 1;
+}
+
+static void apply_quirks_zhaoxin(struct cpuinfo_x86 *c)
+{
+ /*
+ * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
+ if (mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
}
+}
- if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
- /*
- * All newer Zhaoxin CPUs support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
- if (cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
- }
+/* Add per CPU specific workarounds here */
+static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+{
+ struct mca_config *cfg = &mca_cfg;
+
+ switch (c->x86_vendor) {
+ case X86_VENDOR_UNKNOWN:
+ pr_info("unknown CPU type - not enabling MCE support\n");
+ return false;
+ case X86_VENDOR_AMD:
+ apply_quirks_amd(c);
+ break;
+ case X86_VENDOR_INTEL:
+ apply_quirks_intel(c);
+ break;
+ case X86_VENDOR_ZHAOXIN:
+ apply_quirks_zhaoxin(c);
+ break;
}
if (cfg->monarch_timeout < 0)
Convert family/model mixed checks to VFM-based checks to make the code
more compact. Also, as 'mce_num_banks' is an unsigned int, simplify the
check from 'if (this_cpu_read(mce_num_banks) > 0)' to
'if (this_cpu_read(mce_num_banks))'.
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- Collect "Reviewed-by:" from Sohil & Yazen.
- Reduce 'if (mce_num_banks > 0)' to 'if (mce_num_banks)' - Yazen.
Changes in v4:
- No changes but rebased.
Changes in v3:
- Newly added.
arch/x86/kernel/cpu/mce/core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3855ec2ed0e0..f90cbcb31a62 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1936,7 +1936,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
* Various K7s with broken bank 0 around. Always disable
* by default.
*/
- if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
+ if (c->x86 == 6 && this_cpu_read(mce_num_banks))
mce_banks[0].ctl = 0;
/*
@@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+ /* Older CPUs (prior to family 6) don't need quirks. */
+ if (c->x86_vfm < INTEL_PENTIUM_PRO)
+ return;
+
/*
* SDM documents that on family 6 bank 0 should not be written
* because it aliases to another special BIOS controlled
@@ -1962,22 +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
* Don't ignore bank 0 completely because there could be a
* valid event later, merely don't write CTL0.
*/
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ if (c->x86_vfm < INTEL_NEHALEM_EP && this_cpu_read(mce_num_banks))
mce_banks[0].init = false;
/*
* All newer Intel systems support MCE broadcasting. Enable
* synchronization with a one second timeout.
*/
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- mca_cfg.monarch_timeout < 0)
+ if (c->x86_vfm >= INTEL_CORE_YONAH && mca_cfg.monarch_timeout < 0)
mca_cfg.monarch_timeout = USEC_PER_SEC;
/*
* There are also broken BIOSes on some Pentium M and
* earlier systems:
*/
- if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ if (c->x86_vfm < INTEL_CORE_YONAH && mca_cfg.bootlog < 0)
mca_cfg.bootlog = 0;
if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
--
2.17.1
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 359d7a98e3e3f88dbf45411427b284bb3bbbaea5
Gitweb: https://git.kernel.org/tip/359d7a98e3e3f88dbf45411427b284bb3bbbaea5
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:01:01 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 31 Dec 2024 11:11:08 +01:00
x86/mce: Convert family/model mixed checks to VFM-based checks
Convert family/model mixed checks to VFM-based checks to make the code
more compact. Simplify.
[ bp: Drop the "what" from the commit message - it should be visible from
the diff alone. ]
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-6-qiuxu.zhuo@intel.com
---
arch/x86/kernel/cpu/mce/core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3855ec2..f90cbcb 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1936,7 +1936,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
* Various K7s with broken bank 0 around. Always disable
* by default.
*/
- if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
+ if (c->x86 == 6 && this_cpu_read(mce_num_banks))
mce_banks[0].ctl = 0;
/*
@@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+ /* Older CPUs (prior to family 6) don't need quirks. */
+ if (c->x86_vfm < INTEL_PENTIUM_PRO)
+ return;
+
/*
* SDM documents that on family 6 bank 0 should not be written
* because it aliases to another special BIOS controlled
@@ -1962,22 +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
* Don't ignore bank 0 completely because there could be a
* valid event later, merely don't write CTL0.
*/
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ if (c->x86_vfm < INTEL_NEHALEM_EP && this_cpu_read(mce_num_banks))
mce_banks[0].init = false;
/*
* All newer Intel systems support MCE broadcasting. Enable
* synchronization with a one second timeout.
*/
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- mca_cfg.monarch_timeout < 0)
+ if (c->x86_vfm >= INTEL_CORE_YONAH && mca_cfg.monarch_timeout < 0)
mca_cfg.monarch_timeout = USEC_PER_SEC;
/*
* There are also broken BIOSes on some Pentium M and
* earlier systems:
*/
- if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ if (c->x86_vfm < INTEL_CORE_YONAH && mca_cfg.bootlog < 0)
mca_cfg.bootlog = 0;
if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
Get HYGON to directly call mce_amd_feature_init() and remove the
redundant mce_hygon_feature_init().
Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- New patch.
arch/x86/include/asm/mce.h | 2 --
arch/x86/kernel/cpu/mce/core.c | 8 ++------
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index ea9ca7689f6b..eb2db07ef39c 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -386,8 +386,6 @@ static inline bool amd_mce_is_memory_error(struct mce *m) { return false; };
static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
#endif
-static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }
-
unsigned long copy_mc_fragile_handle_tail(char *to, char *from, unsigned len);
#endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f90cbcb31a62..0dc00c9894c7 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -2118,14 +2118,10 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
break;
- case X86_VENDOR_AMD: {
+ case X86_VENDOR_AMD:
+ case X86_VENDOR_HYGON:
mce_amd_feature_init(c);
break;
- }
-
- case X86_VENDOR_HYGON:
- mce_hygon_feature_init(c);
- break;
case X86_VENDOR_CENTAUR:
mce_centaur_feature_init(c);
--
2.17.1
On Thu, Dec 12, 2024 at 10:01:02PM +0800, Qiuxu Zhuo wrote:
> Get HYGON to directly call mce_amd_feature_init() and remove the
> redundant mce_hygon_feature_init().
>
> Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
> Changes in v5:
> - New patch.
>
> arch/x86/include/asm/mce.h | 2 --
> arch/x86/kernel/cpu/mce/core.c | 8 ++------
> 2 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
> index ea9ca7689f6b..eb2db07ef39c 100644
> --- a/arch/x86/include/asm/mce.h
> +++ b/arch/x86/include/asm/mce.h
> @@ -386,8 +386,6 @@ static inline bool amd_mce_is_memory_error(struct mce *m) { return false; };
> static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
> #endif
>
> -static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }
> -
Another cleanup idea: move functions from <asm/mce.h> to "internal.h".
Thanks,
Yazen
On 12/12/2024 6:01 AM, Qiuxu Zhuo wrote: > Get HYGON to directly call mce_amd_feature_init() and remove the > redundant mce_hygon_feature_init(). > > Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > --- Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 053d18057e6292462f1b3f9460dd0c1e34609f67
Gitweb: https://git.kernel.org/tip/053d18057e6292462f1b3f9460dd0c1e34609f67
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:01:02 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 31 Dec 2024 11:12:45 +01:00
x86/mce: Remove the redundant mce_hygon_feature_init()
Get HYGON to directly call mce_amd_feature_init() and remove the redundant
mce_hygon_feature_init().
Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-7-qiuxu.zhuo@intel.com
---
arch/x86/include/asm/mce.h | 2 --
arch/x86/kernel/cpu/mce/core.c | 8 ++------
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index ea9ca76..eb2db07 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -386,8 +386,6 @@ static inline bool amd_mce_is_memory_error(struct mce *m) { return false; };
static inline void mce_amd_feature_init(struct cpuinfo_x86 *c) { }
#endif
-static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c) { return mce_amd_feature_init(c); }
-
unsigned long copy_mc_fragile_handle_tail(char *to, char *from, unsigned len);
#endif /* _ASM_X86_MCE_H */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index f90cbcb..0dc00c9 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -2118,13 +2118,9 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
break;
- case X86_VENDOR_AMD: {
- mce_amd_feature_init(c);
- break;
- }
-
+ case X86_VENDOR_AMD:
case X86_VENDOR_HYGON:
- mce_hygon_feature_init(c);
+ mce_amd_feature_init(c);
break;
case X86_VENDOR_CENTAUR:
Remove unnecessary NULL pointer initializations from variables that
are already initialized before use.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v5:
- Collect "Reviewed-by:" Yazen.
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" Nikolay & Sohil.
- Remove the variables' names from the commit message (Sohil).
arch/x86/kernel/cpu/mce/amd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 018874b554cb..c79a82912d38 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -921,8 +921,8 @@ static void log_and_reset_block(struct threshold_block *block)
*/
static void amd_threshold_interrupt(void)
{
- struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
struct threshold_bank **bp = this_cpu_read(threshold_banks);
+ struct threshold_block *first_block, *block, *tmp;
unsigned int bank, cpu = smp_processor_id();
/*
@@ -1201,8 +1201,7 @@ static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb
static int __threshold_add_blocks(struct threshold_bank *b)
{
struct list_head *head = &b->blocks->miscj;
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
int err = 0;
err = kobject_add(&b->blocks->kobj, b->kobj, b->blocks->kobj.name);
@@ -1312,8 +1311,7 @@ static void deallocate_threshold_blocks(struct threshold_bank *bank)
static void __threshold_remove_blocks(struct threshold_bank *b)
{
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
kobject_put(b->kobj);
--
2.17.1
The following commit has been merged into the ras/core branch of tip:
Commit-ID: 0a6c5391f8812323300057f4dbb89b8bf886ee8e
Gitweb: https://git.kernel.org/tip/0a6c5391f8812323300057f4dbb89b8bf886ee8e
Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
AuthorDate: Thu, 12 Dec 2024 22:01:03 +08:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Tue, 31 Dec 2024 11:14:36 +01:00
x86/mce/amd: Remove unnecessary NULL pointer initializations
Remove unnecessary NULL pointer initializations from variables that
are already initialized before use.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://lore.kernel.org/r/20241212140103.66964-8-qiuxu.zhuo@intel.com
---
arch/x86/kernel/cpu/mce/amd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 018874b..c79a829 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -921,8 +921,8 @@ static void log_and_reset_block(struct threshold_block *block)
*/
static void amd_threshold_interrupt(void)
{
- struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
struct threshold_bank **bp = this_cpu_read(threshold_banks);
+ struct threshold_block *first_block, *block, *tmp;
unsigned int bank, cpu = smp_processor_id();
/*
@@ -1201,8 +1201,7 @@ out_free:
static int __threshold_add_blocks(struct threshold_bank *b)
{
struct list_head *head = &b->blocks->miscj;
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
int err = 0;
err = kobject_add(&b->blocks->kobj, b->kobj, b->blocks->kobj.name);
@@ -1312,8 +1311,7 @@ static void deallocate_threshold_blocks(struct threshold_bank *bank)
static void __threshold_remove_blocks(struct threshold_bank *b)
{
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
kobject_put(b->kobj);
On Wed, Jan 01, 2025 at 11:44:03AM -0000, tip-bot2 for Qiuxu Zhuo wrote:
> The following commit has been merged into the ras/core branch of tip:
>
> Commit-ID: 0a6c5391f8812323300057f4dbb89b8bf886ee8e
> Gitweb: https://git.kernel.org/tip/0a6c5391f8812323300057f4dbb89b8bf886ee8e
> Author: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> AuthorDate: Thu, 12 Dec 2024 22:01:03 +08:00
> Committer: Borislav Petkov (AMD) <bp@alien8.de>
> CommitterDate: Tue, 31 Dec 2024 11:14:36 +01:00
>
> x86/mce/amd: Remove unnecessary NULL pointer initializations
>
> Remove unnecessary NULL pointer initializations from variables that
> are already initialized before use.
>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
> Link: https://lore.kernel.org/r/20241212140103.66964-8-qiuxu.zhuo@intel.com
Zapping this one in favor of:
https://lore.kernel.org/r/20241206161210.163701-2-yazen.ghannam@amd.com
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Make several functions that return 0 or 1 return a boolean value for
better readability.
No functional changes are intended.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- Don't rename mce_notify_irq() (Boris).
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Rename mce_notify_irq() to mce_notify_user() (Sohil).
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/include/asm/mce.h | 4 ++--
arch/x86/kernel/cpu/mce/amd.c | 10 +++++-----
arch/x86/kernel/cpu/mce/core.c | 22 +++++++++++-----------
arch/x86/kernel/cpu/mce/intel.c | 9 +++++----
4 files changed, 23 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 4543cf2eb5e8..ea9ca7689f6b 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -276,7 +276,7 @@ static inline void cmci_rediscover(void) {}
static inline void cmci_recheck(void) {}
#endif
-int mce_available(struct cpuinfo_x86 *c);
+bool mce_available(struct cpuinfo_x86 *c);
bool mce_is_memory_error(struct mce *m);
bool mce_is_correctable(struct mce *m);
bool mce_usable_address(struct mce *m);
@@ -296,7 +296,7 @@ enum mcp_flags {
void machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
-int mce_notify_irq(void);
+bool mce_notify_irq(void);
DECLARE_PER_CPU(struct mce, injectm);
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6ca80fff1fea..018874b554cb 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -381,7 +381,7 @@ static bool lvt_interrupt_supported(unsigned int bank, u32 msr_high_bits)
return msr_high_bits & BIT(28);
}
-static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
+static bool lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
{
int msr = (hi & MASK_LVTOFF_HI) >> 20;
@@ -389,7 +389,7 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
pr_err(FW_BUG "cpu %d, failed to setup threshold interrupt "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n", b->cpu,
b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
if (apic != msr) {
@@ -399,15 +399,15 @@ static int lvt_off_valid(struct threshold_block *b, int apic, u32 lo, u32 hi)
* was set is reserved. Return early here:
*/
if (mce_flags.smca)
- return 0;
+ return false;
pr_err(FW_BUG "cpu %d, invalid threshold interrupt offset %d "
"for bank %d, block %d (MSR%08X=0x%x%08x)\n",
b->cpu, apic, b->bank, b->block, b->address, hi, lo);
- return 0;
+ return false;
}
- return 1;
+ return true;
};
/* Reprogram MCx_MISC MSR behind this threshold bank. */
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 7fb5556a0b53..167965bd2ac0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -492,10 +492,10 @@ static noinstr void mce_gather_info(struct mce_hw_err *err, struct pt_regs *regs
}
}
-int mce_available(struct cpuinfo_x86 *c)
+bool mce_available(struct cpuinfo_x86 *c)
{
if (mca_cfg.disabled)
- return 0;
+ return false;
return cpu_has(c, X86_FEATURE_MCE) && cpu_has(c, X86_FEATURE_MCA);
}
@@ -1778,7 +1778,7 @@ static void mce_timer_delete_all(void)
* Can be called from interrupt context, but not from machine check/NMI
* context.
*/
-int mce_notify_irq(void)
+bool mce_notify_irq(void)
{
/* Not more than two messages every minute */
static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2);
@@ -1789,9 +1789,9 @@ int mce_notify_irq(void)
if (__ratelimit(&ratelimit))
pr_info(HW_ERR "Machine check events logged\n");
- return 1;
+ return true;
}
- return 0;
+ return false;
}
EXPORT_SYMBOL_GPL(mce_notify_irq);
@@ -2015,25 +2015,25 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
return 0;
}
-static int __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
{
if (c->x86 != 5)
- return 0;
+ return false;
switch (c->x86_vendor) {
case X86_VENDOR_INTEL:
intel_p5_mcheck_init(c);
mce_flags.p5 = 1;
- return 1;
+ return true;
case X86_VENDOR_CENTAUR:
winchip_mcheck_init(c);
mce_flags.winchip = 1;
- return 1;
+ return true;
default:
- return 0;
+ return false;
}
- return 0;
+ return false;
}
/*
diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index b3cd2c61b11d..f863df0ff42c 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -75,12 +75,12 @@ static u16 cmci_threshold[MAX_NR_BANKS];
*/
#define CMCI_STORM_THRESHOLD 32749
-static int cmci_supported(int *banks)
+static bool cmci_supported(int *banks)
{
u64 cap;
if (mca_cfg.cmci_disabled || mca_cfg.ignore_ce)
- return 0;
+ return false;
/*
* Vendor check is not strictly needed, but the initial
@@ -89,10 +89,11 @@ static int cmci_supported(int *banks)
*/
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL &&
boot_cpu_data.x86_vendor != X86_VENDOR_ZHAOXIN)
- return 0;
+ return false;
if (!boot_cpu_has(X86_FEATURE_APIC) || lapic_get_maxlvt() < 6)
- return 0;
+ return false;
+
rdmsrl(MSR_IA32_MCG_CAP, cap);
*banks = min_t(unsigned, MAX_NR_BANKS, cap & MCG_BANKCNT_MASK);
return !!(cap & MCG_CMCI_P);
--
2.17.1
The 'storm' variable points to this_cpu_ptr(&storm_desc). Access the
'stormy_bank_count' field through the 'storm' to avoid calling
this_cpu_*() on the same per-CPU variable twice.
This minor optimization reduces the text size by 16 bytes.
$ size threshold.o.*
text data bss dec hex filename
1395 1664 0 3059 bf3 threshold.o.old
1379 1664 0 3043 be3 threshold.o.new
No functional changes intended.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/kernel/cpu/mce/threshold.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/mce/threshold.c b/arch/x86/kernel/cpu/mce/threshold.c
index 89e31e1e5c9c..f4a007616468 100644
--- a/arch/x86/kernel/cpu/mce/threshold.c
+++ b/arch/x86/kernel/cpu/mce/threshold.c
@@ -90,7 +90,7 @@ void cmci_storm_end(unsigned int bank)
storm->banks[bank].in_storm_mode = false;
/* If no banks left in storm mode, stop polling. */
- if (!this_cpu_dec_return(storm_desc.stormy_bank_count))
+ if (!--storm->stormy_bank_count)
mce_timer_kick(false);
}
--
2.17.1
Make those functions whose callers only care about success or failure
return a boolean value for better readability. Also, update the call
sites accordingly.
No functional changes.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- New patch.
arch/x86/kernel/cpu/mce/core.c | 12 ++++++------
arch/x86/kernel/cpu/mce/genpool.c | 29 ++++++++++++++---------------
arch/x86/kernel/cpu/mce/internal.h | 4 ++--
3 files changed, 22 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 167965bd2ac0..ce6fe5e20805 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -151,7 +151,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
void mce_log(struct mce_hw_err *err)
{
- if (!mce_gen_pool_add(err))
+ if (mce_gen_pool_add(err))
irq_work_queue(&mce_irq_work);
}
EXPORT_SYMBOL_GPL(mce_log);
@@ -1911,14 +1911,14 @@ static void __mcheck_cpu_check_banks(void)
}
/* Add per CPU specific workarounds here */
-static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
+static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
pr_info("unknown CPU type - not enabling MCE support\n");
- return -EOPNOTSUPP;
+ return false;
}
/* This should be disabled by the BIOS, but isn't always */
@@ -2012,7 +2012,7 @@ static int __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
if (cfg->bootlog != 0)
cfg->panic_timeout = 30;
- return 0;
+ return true;
}
static bool __mcheck_cpu_ancient_init(struct cpuinfo_x86 *c)
@@ -2279,12 +2279,12 @@ void mcheck_cpu_init(struct cpuinfo_x86 *c)
__mcheck_cpu_cap_init();
- if (__mcheck_cpu_apply_quirks(c) < 0) {
+ if (!__mcheck_cpu_apply_quirks(c)) {
mca_cfg.disabled = 1;
return;
}
- if (mce_gen_pool_init()) {
+ if (!mce_gen_pool_init()) {
mca_cfg.disabled = 1;
pr_emerg("Couldn't allocate MCE records pool!\n");
return;
diff --git a/arch/x86/kernel/cpu/mce/genpool.c b/arch/x86/kernel/cpu/mce/genpool.c
index d0be6dda0c14..3ca9c007a666 100644
--- a/arch/x86/kernel/cpu/mce/genpool.c
+++ b/arch/x86/kernel/cpu/mce/genpool.c
@@ -94,64 +94,63 @@ bool mce_gen_pool_empty(void)
return llist_empty(&mce_event_llist);
}
-int mce_gen_pool_add(struct mce_hw_err *err)
+bool mce_gen_pool_add(struct mce_hw_err *err)
{
struct mce_evt_llist *node;
if (filter_mce(&err->m))
- return -EINVAL;
+ return false;
if (!mce_evt_pool)
- return -EINVAL;
+ return false;
node = (void *)gen_pool_alloc(mce_evt_pool, sizeof(*node));
if (!node) {
pr_warn_ratelimited("MCE records pool full!\n");
- return -ENOMEM;
+ return false;
}
memcpy(&node->err, err, sizeof(*err));
llist_add(&node->llnode, &mce_event_llist);
- return 0;
+ return true;
}
-static int mce_gen_pool_create(void)
+static bool mce_gen_pool_create(void)
{
int mce_numrecords, mce_poolsz, order;
struct gen_pool *gpool;
- int ret = -ENOMEM;
void *mce_pool;
order = order_base_2(sizeof(struct mce_evt_llist));
gpool = gen_pool_create(order, -1);
if (!gpool)
- return ret;
+ return false;
mce_numrecords = max(MCE_MIN_ENTRIES, num_possible_cpus() * MCE_PER_CPU);
mce_poolsz = mce_numrecords * (1 << order);
mce_pool = kmalloc(mce_poolsz, GFP_KERNEL);
if (!mce_pool) {
gen_pool_destroy(gpool);
- return ret;
+ return false;
}
- ret = gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1);
- if (ret) {
+
+ if (gen_pool_add(gpool, (unsigned long)mce_pool, mce_poolsz, -1)) {
gen_pool_destroy(gpool);
kfree(mce_pool);
- return ret;
+ return false;
}
mce_evt_pool = gpool;
- return ret;
+ return true;
}
-int mce_gen_pool_init(void)
+bool mce_gen_pool_init(void)
{
/* Just init mce_gen_pool once. */
if (mce_evt_pool)
- return 0;
+ return true;
return mce_gen_pool_create();
}
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index 84f810598231..95a504ece43e 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -31,8 +31,8 @@ struct mce_evt_llist {
void mce_gen_pool_process(struct work_struct *__unused);
bool mce_gen_pool_empty(void);
-int mce_gen_pool_add(struct mce_hw_err *err);
-int mce_gen_pool_init(void);
+bool mce_gen_pool_add(struct mce_hw_err *err);
+bool mce_gen_pool_init(void);
struct llist_node *mce_gen_pool_prepare_records(void);
int mce_severity(struct mce *a, struct pt_regs *regs, char **msg, bool is_excp);
--
2.17.1
On 11/10/2024 10:04 PM, Qiuxu Zhuo wrote:
> Make those functions whose callers only care about success or failure
> return a boolean value for better readability. Also, update the call
> sites accordingly.
>
> No functional changes.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Changes in v4:
> - New patch.
>
> arch/x86/kernel/cpu/mce/core.c | 12 ++++++------
> arch/x86/kernel/cpu/mce/genpool.c | 29 ++++++++++++++---------------
> arch/x86/kernel/cpu/mce/internal.h | 4 ++--
> 3 files changed, 22 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 167965bd2ac0..ce6fe5e20805 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -151,7 +151,7 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm);
>
> void mce_log(struct mce_hw_err *err)
> {
> - if (!mce_gen_pool_add(err))
> + if (mce_gen_pool_add(err))
> irq_work_queue(&mce_irq_work);
> }
The polarities of all the return values have been flipped. They all look
correct but took me a few minutes to figure out. Might be useful to make
a mention in the commit message if you end up doing another version (Not
needed otherwise).
> EXPORT_SYMBOL_GPL(mce_log);
From: Tony Luck <tony.luck@intel.com>
Split each vendor specific part into its own helper function.
Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- Add necessary blank lines in apply_quirks_amd() (Yazen).
- Use 'mca_cfg' instead of 'cfg' in apply_quirks_*(). (Yazen).
Changes in v3:
- New patch.
arch/x86/kernel/cpu/mce/core.c | 192 ++++++++++++++++++---------------
1 file changed, 104 insertions(+), 88 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index ce6fe5e20805..3855ec2ed0e0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1910,101 +1910,117 @@ static void __mcheck_cpu_check_banks(void)
}
}
+static void apply_quirks_amd(struct cpuinfo_x86 *c)
+{
+ struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+
+ /* This should be disabled by the BIOS, but isn't always */
+ if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
+ /*
+ * disable GART TBL walk error reporting, which
+ * trips off incorrectly with the IOMMU & 3ware
+ * & Cerberus:
+ */
+ clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
+ }
+
+ if (c->x86 < 0x11 && mca_cfg.bootlog < 0) {
+ /*
+ * Lots of broken BIOS around that don't clear them
+ * by default and leave crap in there. Don't log:
+ */
+ mca_cfg.bootlog = 0;
+ }
+
+ /*
+ * Various K7s with broken bank 0 around. Always disable
+ * by default.
+ */
+ if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].ctl = 0;
+
+ /*
+ * overflow_recov is supported for F15h Models 00h-0fh
+ * even though we don't have a CPUID bit for it.
+ */
+ if (c->x86 == 0x15 && c->x86_model <= 0xf)
+ mce_flags.overflow_recov = 1;
+
+ if (c->x86 >= 0x17 && c->x86 <= 0x1A)
+ mce_flags.zen_ifu_quirk = 1;
+}
+
+static void apply_quirks_intel(struct cpuinfo_x86 *c)
+{
+ struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+
+ /*
+ * SDM documents that on family 6 bank 0 should not be written
+ * because it aliases to another special BIOS controlled
+ * register.
+ * But it's not aliased anymore on model 0x1a+
+ * Don't ignore bank 0 completely because there could be a
+ * valid event later, merely don't write CTL0.
+ */
+ if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ mce_banks[0].init = false;
+
+ /*
+ * All newer Intel systems support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
+ mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
+
+ /*
+ * There are also broken BIOSes on some Pentium M and
+ * earlier systems:
+ */
+ if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ mca_cfg.bootlog = 0;
+
+ if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
+ mce_flags.snb_ifu_quirk = 1;
+
+ /*
+ * Skylake, Cascacde Lake and Cooper Lake require a quirk on
+ * rep movs.
+ */
+ if (c->x86_vfm == INTEL_SKYLAKE_X)
+ mce_flags.skx_repmov_quirk = 1;
+}
+
+static void apply_quirks_zhaoxin(struct cpuinfo_x86 *c)
+{
+ /*
+ * All newer Zhaoxin CPUs support MCE broadcasting. Enable
+ * synchronization with a one second timeout.
+ */
+ if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
+ if (mca_cfg.monarch_timeout < 0)
+ mca_cfg.monarch_timeout = USEC_PER_SEC;
+ }
+}
+
/* Add per CPU specific workarounds here */
static bool __mcheck_cpu_apply_quirks(struct cpuinfo_x86 *c)
{
- struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
struct mca_config *cfg = &mca_cfg;
- if (c->x86_vendor == X86_VENDOR_UNKNOWN) {
+ switch (c->x86_vendor) {
+ case X86_VENDOR_UNKNOWN:
pr_info("unknown CPU type - not enabling MCE support\n");
return false;
- }
-
- /* This should be disabled by the BIOS, but isn't always */
- if (c->x86_vendor == X86_VENDOR_AMD) {
- if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
- /*
- * disable GART TBL walk error reporting, which
- * trips off incorrectly with the IOMMU & 3ware
- * & Cerberus:
- */
- clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
- }
- if (c->x86 < 0x11 && cfg->bootlog < 0) {
- /*
- * Lots of broken BIOS around that don't clear them
- * by default and leave crap in there. Don't log:
- */
- cfg->bootlog = 0;
- }
- /*
- * Various K7s with broken bank 0 around. Always disable
- * by default.
- */
- if (c->x86 == 6 && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].ctl = 0;
-
- /*
- * overflow_recov is supported for F15h Models 00h-0fh
- * even though we don't have a CPUID bit for it.
- */
- if (c->x86 == 0x15 && c->x86_model <= 0xf)
- mce_flags.overflow_recov = 1;
-
- if (c->x86 >= 0x17 && c->x86 <= 0x1A)
- mce_flags.zen_ifu_quirk = 1;
-
- }
-
- if (c->x86_vendor == X86_VENDOR_INTEL) {
- /*
- * SDM documents that on family 6 bank 0 should not be written
- * because it aliases to another special BIOS controlled
- * register.
- * But it's not aliased anymore on model 0x1a+
- * Don't ignore bank 0 completely because there could be a
- * valid event later, merely don't write CTL0.
- */
-
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
- mce_banks[0].init = false;
-
- /*
- * All newer Intel systems support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
-
- /*
- * There are also broken BIOSes on some Pentium M and
- * earlier systems:
- */
- if (c->x86 == 6 && c->x86_model <= 13 && cfg->bootlog < 0)
- cfg->bootlog = 0;
-
- if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
- mce_flags.snb_ifu_quirk = 1;
-
- /*
- * Skylake, Cascacde Lake and Cooper Lake require a quirk on
- * rep movs.
- */
- if (c->x86_vfm == INTEL_SKYLAKE_X)
- mce_flags.skx_repmov_quirk = 1;
- }
-
- if (c->x86_vendor == X86_VENDOR_ZHAOXIN) {
- /*
- * All newer Zhaoxin CPUs support MCE broadcasting. Enable
- * synchronization with a one second timeout.
- */
- if (c->x86 > 6 || (c->x86_model == 0x19 || c->x86_model == 0x1f)) {
- if (cfg->monarch_timeout < 0)
- cfg->monarch_timeout = USEC_PER_SEC;
- }
+ case X86_VENDOR_AMD:
+ apply_quirks_amd(c);
+ break;
+ case X86_VENDOR_INTEL:
+ apply_quirks_intel(c);
+ break;
+ case X86_VENDOR_ZHAOXIN:
+ apply_quirks_zhaoxin(c);
+ break;
}
if (cfg->monarch_timeout < 0)
--
2.17.1
On 11/10/2024 10:04 PM, Qiuxu Zhuo wrote: > From: Tony Luck <tony.luck@intel.com> > > Split each vendor specific part into its own helper function. > > Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > Signed-off-by: Tony Luck <tony.luck@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > --- Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Convert family/model mixed checks to VFM-based checks to make
the code more compact.
Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- No changes but rebased.
Changes in v3:
- Newly added.
arch/x86/kernel/cpu/mce/core.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 3855ec2ed0e0..d288cc7390f6 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
{
struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
+ /* Older CPUs (prior to family 6) don't need quirks. */
+ if (c->x86_vfm < INTEL_PENTIUM_PRO)
+ return;
+
/*
* SDM documents that on family 6 bank 0 should not be written
* because it aliases to another special BIOS controlled
@@ -1962,22 +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
* Don't ignore bank 0 completely because there could be a
* valid event later, merely don't write CTL0.
*/
- if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
+ if (c->x86_vfm < INTEL_NEHALEM_EP && this_cpu_read(mce_num_banks) > 0)
mce_banks[0].init = false;
/*
* All newer Intel systems support MCE broadcasting. Enable
* synchronization with a one second timeout.
*/
- if ((c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xe)) &&
- mca_cfg.monarch_timeout < 0)
+ if (c->x86_vfm >= INTEL_CORE_YONAH && mca_cfg.monarch_timeout < 0)
mca_cfg.monarch_timeout = USEC_PER_SEC;
/*
* There are also broken BIOSes on some Pentium M and
* earlier systems:
*/
- if (c->x86 == 6 && c->x86_model <= 13 && mca_cfg.bootlog < 0)
+ if (c->x86_vfm < INTEL_CORE_YONAH && mca_cfg.bootlog < 0)
mca_cfg.bootlog = 0;
if (c->x86_vfm == INTEL_SANDYBRIDGE_X)
--
2.17.1
On Mon, Nov 11, 2024 at 02:04:25PM +0800, Qiuxu Zhuo wrote:
> Convert family/model mixed checks to VFM-based checks to make
> the code more compact.
>
> Suggested-by: Sohil Mehta <sohil.mehta@intel.com>
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
> Changes in v4:
> - No changes but rebased.
>
> Changes in v3:
> - Newly added.
>
> arch/x86/kernel/cpu/mce/core.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 3855ec2ed0e0..d288cc7390f6 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
> {
> struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
>
> + /* Older CPUs (prior to family 6) don't need quirks. */
> + if (c->x86_vfm < INTEL_PENTIUM_PRO)
> + return;
> +
Is it possible for pre-"family 6" to get here?
Family 5 is "ancient" which has its own MCE init path. And I assume
anything older doesn't support MCE/MCA. Is this correct?
> /*
> * SDM documents that on family 6 bank 0 should not be written
> * because it aliases to another special BIOS controlled
> @@ -1962,22 +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
> * Don't ignore bank 0 completely because there could be a
> * valid event later, merely don't write CTL0.
> */
> - if (c->x86 == 6 && c->x86_model < 0x1A && this_cpu_read(mce_num_banks) > 0)
> + if (c->x86_vfm < INTEL_NEHALEM_EP && this_cpu_read(mce_num_banks) > 0)
The "> 0" is not needed, since mce_num_banks is unsigned int.
Otherwise, looks good.
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Thanks,
Yazen
Hi Yazen,
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [...]
> > @@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct
> > cpuinfo_x86 *c) {
> > struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
> >
> > + /* Older CPUs (prior to family 6) don't need quirks. */
> > + if (c->x86_vfm < INTEL_PENTIUM_PRO)
> > + return;
> > +
>
> Is it possible for pre-"family 6" to get here?
>
> Family 5 is "ancient" which has its own MCE init path. And I assume anything
> older doesn't support MCE/MCA. Is this correct?
Yes, there is an early return in __mcheck_cpu_ancient_init() for Family 5.
However, this code explicitly indicates that "prior to families 6 don't need quirks"
and addresses concerns like:
https://lore.kernel.org/all/dcfdba92-7004-413d-8011-12771636d11f@intel.com/
> > /*
> > * SDM documents that on family 6 bank 0 should not be written
> > * because it aliases to another special BIOS controlled @@ -1962,22
> > +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
> > * Don't ignore bank 0 completely because there could be a
> > * valid event later, merely don't write CTL0.
> > */
> > - if (c->x86 == 6 && c->x86_model < 0x1A &&
> this_cpu_read(mce_num_banks) > 0)
> > + if (c->x86_vfm < INTEL_NEHALEM_EP &&
> this_cpu_read(mce_num_banks) >
> > +0)
>
> The "> 0" is not needed, since mce_num_banks is unsigned int.
I don't get your point here.
But it needs to check for the case where mce_num_banks == 0.
> Otherwise, looks good.
>
> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Thanks!
-Qiuxu
On Wed, Nov 13, 2024 at 12:10:31PM +0000, Zhuo, Qiuxu wrote:
> Hi Yazen,
>
> > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > [...]
> > > @@ -1954,6 +1954,10 @@ static void apply_quirks_intel(struct
> > > cpuinfo_x86 *c) {
> > > struct mce_bank *mce_banks = this_cpu_ptr(mce_banks_array);
> > >
> > > + /* Older CPUs (prior to family 6) don't need quirks. */
> > > + if (c->x86_vfm < INTEL_PENTIUM_PRO)
> > > + return;
> > > +
> >
> > Is it possible for pre-"family 6" to get here?
> >
> > Family 5 is "ancient" which has its own MCE init path. And I assume anything
> > older doesn't support MCE/MCA. Is this correct?
>
> Yes, there is an early return in __mcheck_cpu_ancient_init() for Family 5.
> However, this code explicitly indicates that "prior to families 6 don't need quirks"
> and addresses concerns like:
>
> https://lore.kernel.org/all/dcfdba92-7004-413d-8011-12771636d11f@intel.com/
>
Right, but my point is that this check would never be executed, since
the older systems would not get here during init. So this seems like
dead code.
> > > /*
> > > * SDM documents that on family 6 bank 0 should not be written
> > > * because it aliases to another special BIOS controlled @@ -1962,22
> > > +1966,21 @@ static void apply_quirks_intel(struct cpuinfo_x86 *c)
> > > * Don't ignore bank 0 completely because there could be a
> > > * valid event later, merely don't write CTL0.
> > > */
> > > - if (c->x86 == 6 && c->x86_model < 0x1A &&
> > this_cpu_read(mce_num_banks) > 0)
> > > + if (c->x86_vfm < INTEL_NEHALEM_EP &&
> > this_cpu_read(mce_num_banks) >
> > > +0)
> >
> > The "> 0" is not needed, since mce_num_banks is unsigned int.
>
> I don't get your point here.
> But it needs to check for the case where mce_num_banks == 0.
>
The check is "mce_num_banks > 0", and mce_num_banks is an unsigned int.
Therefore, the check is reduced to "mce_num_banks != 0". In this case,
you can just do "if (mce_num_banks)" to the same effect.
Thanks,
Yazen
Hi Yazen, > From: Yazen Ghannam <yazen.ghannam@amd.com> > [...] > > > > + if (c->x86_vfm < INTEL_NEHALEM_EP && > > > this_cpu_read(mce_num_banks) > > > > > +0) > > > > > > The "> 0" is not needed, since mce_num_banks is unsigned int. > > > > I don't get your point here. > > But it needs to check for the case where mce_num_banks == 0. > > > > The check is "mce_num_banks > 0", and mce_num_banks is an unsigned int. > Therefore, the check is reduced to "mce_num_banks != 0". In this case, you > can just do "if (mce_num_banks)" to the same effect. I got you. OK, if nobody else objects, I'll update it in the next version. [ Hope others won't blame this as over-optimization. ] -Qiuxu
On 11/10/2024 10:04 PM, Qiuxu Zhuo wrote: > Convert family/model mixed checks to VFM-based checks to make > the code more compact. > > Suggested-by: Sohil Mehta <sohil.mehta@intel.com> > Suggested-by: Dave Hansen <dave.hansen@intel.com> > Reviewed-by: Tony Luck <tony.luck@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > --- Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Remove the unnecessary {} from the case statement.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/kernel/cpu/mce/core.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d288cc7390f6..0f0c6e9d9183 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -2118,10 +2118,9 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
mce_intel_feature_init(c);
break;
- case X86_VENDOR_AMD: {
+ case X86_VENDOR_AMD:
mce_amd_feature_init(c);
break;
- }
case X86_VENDOR_HYGON:
mce_hygon_feature_init(c);
--
2.17.1
On Mon, Nov 11, 2024 at 02:04:26PM +0800, Qiuxu Zhuo wrote:
> Remove the unnecessary {} from the case statement.
>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
But please see note below.
> ---
> Changes in v4:
> - No changes.
>
> Changes in v3:
> - Collect "Reviewed-by:" from Nikolay & Sohil.
>
> Changes in v2:
> - Collect "Reviewed-by:" from Tony.
>
> arch/x86/kernel/cpu/mce/core.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index d288cc7390f6..0f0c6e9d9183 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -2118,10 +2118,9 @@ static void __mcheck_cpu_init_vendor(struct cpuinfo_x86 *c)
> mce_intel_feature_init(c);
> break;
>
> - case X86_VENDOR_AMD: {
> + case X86_VENDOR_AMD:
> mce_amd_feature_init(c);
> break;
> - }
>
> case X86_VENDOR_HYGON:
> mce_hygon_feature_init(c);
> --
I think this could be a bit more substantive if you also combine the AMD
and HYGON cases. And remove mce_hygon_feature_init() which just calls
mce_amd_feature_init() anyway.
Thanks,
Yazen
Hi Yazen,
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [...]
>
> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Thanks!
> But please see note below.
> [...]
> > - case X86_VENDOR_AMD: {
> > + case X86_VENDOR_AMD:
> > mce_amd_feature_init(c);
> > break;
> > - }
> >
> > case X86_VENDOR_HYGON:
> > mce_hygon_feature_init(c);
> > --
>
> I think this could be a bit more substantive if you also combine the AMD and
> HYGON cases. And remove mce_hygon_feature_init() which just calls
> mce_amd_feature_init() anyway.
How about a separate patch for this?
If It's OK for you, I'll follow up on it after the current patch series has settled.
-Qiuxu
On Wed, Nov 13, 2024 at 01:32:08PM +0000, Zhuo, Qiuxu wrote:
> How about a separate patch for this?
Can you drop this micro-change per patch? Just do a single patch here which
fixes up everything mentioned during review in that area and be done with it.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Hi Boris, > From: Borislav Petkov <bp@alien8.de> > [...] > > Can you drop this micro-change per patch? Just do a single patch here which > fixes up everything mentioned during review in that area and be done with it. OK. I'll drop this one and replace it with Yazen's suggestion in next version. -Qiuxu
On Wed, Nov 13, 2024 at 01:32:08PM +0000, Zhuo, Qiuxu wrote:
> Hi Yazen,
>
> > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > [...]
> >
> > Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
>
> Thanks!
>
> > But please see note below.
> > [...]
> > > - case X86_VENDOR_AMD: {
> > > + case X86_VENDOR_AMD:
> > > mce_amd_feature_init(c);
> > > break;
> > > - }
> > >
> > > case X86_VENDOR_HYGON:
> > > mce_hygon_feature_init(c);
> > > --
> >
> > I think this could be a bit more substantive if you also combine the AMD and
> > HYGON cases. And remove mce_hygon_feature_init() which just calls
> > mce_amd_feature_init() anyway.
>
> How about a separate patch for this?
> If It's OK for you, I'll follow up on it after the current patch series has settled.
>
Sure thing. No problem.
Thanks,
Yazen
Remove unnecessary NULL pointer initializations from variables that
are already initialized before use.
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" Nikolay & Sohil.
- Remove the variables' names from the commit message (Sohil).
arch/x86/kernel/cpu/mce/amd.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 018874b554cb..c79a82912d38 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -921,8 +921,8 @@ static void log_and_reset_block(struct threshold_block *block)
*/
static void amd_threshold_interrupt(void)
{
- struct threshold_block *first_block = NULL, *block = NULL, *tmp = NULL;
struct threshold_bank **bp = this_cpu_read(threshold_banks);
+ struct threshold_block *first_block, *block, *tmp;
unsigned int bank, cpu = smp_processor_id();
/*
@@ -1201,8 +1201,7 @@ static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb
static int __threshold_add_blocks(struct threshold_bank *b)
{
struct list_head *head = &b->blocks->miscj;
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
int err = 0;
err = kobject_add(&b->blocks->kobj, b->kobj, b->blocks->kobj.name);
@@ -1312,8 +1311,7 @@ static void deallocate_threshold_blocks(struct threshold_bank *bank)
static void __threshold_remove_blocks(struct threshold_bank *b)
{
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
+ struct threshold_block *pos, *tmp;
kobject_put(b->kobj);
--
2.17.1
On Mon, Nov 11, 2024 at 02:04:27PM +0800, Qiuxu Zhuo wrote: > Remove unnecessary NULL pointer initializations from variables that > are already initialized before use. > > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> > Reviewed-by: Sohil Mehta <sohil.mehta@intel.com> > Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> > --- Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Thanks, Yazen
Fix typos in comments.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
---
Changes in v4:
- No changes.
Changes in v3:
- Collect "Reviewed-by:" from Nikolay & Sohil.
- Remove the detail typos from the commit message (Sohil).
Changes in v2:
- Collect "Reviewed-by:" from Tony.
arch/x86/kernel/cpu/mce/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 0f0c6e9d9183..6e194ccffc7c 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1144,7 +1144,7 @@ static noinstr int mce_start(int *no_way_out)
} else {
/*
* Subject: Now start the scanning loop one by one in
- * the original callin order.
+ * the original calling order.
* This way when there are any shared banks it will be
* only seen by one CPU before cleared, avoiding duplicates.
*/
@@ -1917,7 +1917,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
/* This should be disabled by the BIOS, but isn't always */
if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
/*
- * disable GART TBL walk error reporting, which
+ * disable GART TLB walk error reporting, which
* trips off incorrectly with the IOMMU & 3ware
* & Cerberus:
*/
--
2.17.1
On Mon, Nov 11, 2024 at 02:04:28PM +0800, Qiuxu Zhuo wrote:
> Fix typos in comments.
>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
> Changes in v4:
> - No changes.
>
> Changes in v3:
> - Collect "Reviewed-by:" from Nikolay & Sohil.
> - Remove the detail typos from the commit message (Sohil).
>
> Changes in v2:
> - Collect "Reviewed-by:" from Tony.
>
> arch/x86/kernel/cpu/mce/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 0f0c6e9d9183..6e194ccffc7c 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1144,7 +1144,7 @@ static noinstr int mce_start(int *no_way_out)
> } else {
> /*
> * Subject: Now start the scanning loop one by one in
> - * the original callin order.
> + * the original calling order.
I don't think this is a typo. It seems to refer to the mce_callin
variable/idea.
For example, each CPU "calls in" when ready. This is independent of when
each CPU is "called" to do something.
CPUs are called in this order 0, 1, 2.
CPUs "call in" in this order 1, 0, 2.
When a CPU is called can be different from when it responds.
Maybe I'm reading too much into this. :/
> * This way when there are any shared banks it will be
> * only seen by one CPU before cleared, avoiding duplicates.
> */
> @@ -1917,7 +1917,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
> /* This should be disabled by the BIOS, but isn't always */
> if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
> /*
> - * disable GART TBL walk error reporting, which
> + * disable GART TLB walk error reporting, which
This also is not a typo. TBL -> table
From old AMD K8 BKDG document:
10 GartTblWkEn GART Table Walk Error Reporting Enable R/W 0
Thanks,
Yazen
Hi Yazen,
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [...]
> > @@ -1144,7 +1144,7 @@ static noinstr int mce_start(int *no_way_out)
> > } else {
> > /*
> > * Subject: Now start the scanning loop one by one in
> > - * the original callin order.
> > + * the original calling order.
>
> I don't think this is a typo. It seems to refer to the mce_callin variable/idea.
>
> For example, each CPU "calls in" when ready. This is independent of when
> each CPU is "called" to do something.
>
> CPUs are called in this order 0, 1, 2.
> CPUs "call in" in this order 1, 0, 2.
>
> When a CPU is called can be different from when it responds.
>
> Maybe I'm reading too much into this. :/
Too finicky to me :(
But I appreciate you sharing your reading and thoughts. 😊
-Qiuxu
On 11/12/2024 7:38 AM, Yazen Ghannam wrote:
>
>> * This way when there are any shared banks it will be
>> * only seen by one CPU before cleared, avoiding duplicates.
>> */
>> @@ -1917,7 +1917,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
>> /* This should be disabled by the BIOS, but isn't always */
>> if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
>> /*
>> - * disable GART TBL walk error reporting, which
>> + * disable GART TLB walk error reporting, which
>
> This also is not a typo. TBL -> table
>
> From old AMD K8 BKDG document:
> 10 GartTblWkEn GART Table Walk Error Reporting Enable R/W 0
>
There is another comment in init_amd_gh() that seems to be related to
the same thing and similarly worded. That seems to refer to TLB instead
of TBL(table).
/*
* Disable GART TLB Walk Errors on Fam10h. We do this here because this
* is always needed when GART is enabled, even in a kernel which has no
* MCE support built in. BIOS should disable GartTlbWlk Errors already.
* If it doesn't, we do it here as suggested by the BKDG.
*
* Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33012
*/
msr_set_bit(MSR_AMD64_MCx_MASK(4), 10);
On Tue, Nov 12, 2024 at 02:35:32PM -0800, Sohil Mehta wrote:
> On 11/12/2024 7:38 AM, Yazen Ghannam wrote:
> >
> >> * This way when there are any shared banks it will be
> >> * only seen by one CPU before cleared, avoiding duplicates.
> >> */
> >> @@ -1917,7 +1917,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
> >> /* This should be disabled by the BIOS, but isn't always */
> >> if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
> >> /*
> >> - * disable GART TBL walk error reporting, which
> >> + * disable GART TLB walk error reporting, which
> >
> > This also is not a typo. TBL -> table
> >
> > From old AMD K8 BKDG document:
> > 10 GartTblWkEn GART Table Walk Error Reporting Enable R/W 0
> >
>
> There is another comment in init_amd_gh() that seems to be related to
> the same thing and similarly worded. That seems to refer to TLB instead
> of TBL(table).
>
> /*
> * Disable GART TLB Walk Errors on Fam10h. We do this here because this
> * is always needed when GART is enabled, even in a kernel which has no
> * MCE support built in. BIOS should disable GartTlbWlk Errors already.
> * If it doesn't, we do it here as suggested by the BKDG.
> *
> * Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33012
> */
> msr_set_bit(MSR_AMD64_MCx_MASK(4), 10);
>
>
Now I think *that* is a typo, since it doesn't match the documentation.
:)
Thanks,
Yazen
> From: Yazen Ghannam <yazen.ghannam@amd.com> > [...] > > >> - * disable GART TBL walk error reporting, which > > >> + * disable GART TLB walk error reporting, which > > > > > > This also is not a typo. TBL -> table > > > > > > From old AMD K8 BKDG document: > > > 10 GartTblWkEn GART Table Walk Error Reporting Enable R/W 0 > > > > > > > There is another comment in init_amd_gh() that seems to be related to > > the same thing and similarly worded. That seems to refer to TLB > > instead of TBL(table). > > > > /* > > * Disable GART TLB Walk Errors on Fam10h. We do this here because > > this > > * is always needed when GART is enabled, even in a kernel which has > > no > > * MCE support built in. BIOS should disable GartTlbWlk Errors already. > > * If it doesn't, we do it here as suggested by the BKDG. > > * > > * Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=33012 > > */ Thanks Sohil for this reference. > > msr_set_bit(MSR_AMD64_MCx_MASK(4), 10); > > > > > > Now I think *that* is a typo, since it doesn't match the documentation. > :) As Yazen has the documentation that says TBL (table) is the intended meaning. I'll drop this patch in next version. -Qiuxu
On 11/10/2024 10:04 PM, Qiuxu Zhuo wrote:
> Fix typos in comments.
>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
> ---
It might have been fine to merge this with patch 6/8. They both touch
the same file and the changes are mostly superficial. Probably consider
it if you do another revision but not needed otherwise.
> arch/x86/kernel/cpu/mce/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 0f0c6e9d9183..6e194ccffc7c 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1144,7 +1144,7 @@ static noinstr int mce_start(int *no_way_out)
> } else {
> /*
> * Subject: Now start the scanning loop one by one in
> - * the original callin order.
> + * the original calling order.
> * This way when there are any shared banks it will be
> * only seen by one CPU before cleared, avoiding duplicates.
> */
> @@ -1917,7 +1917,7 @@ static void apply_quirks_amd(struct cpuinfo_x86 *c)
> /* This should be disabled by the BIOS, but isn't always */
> if (c->x86 == 15 && this_cpu_read(mce_num_banks) > 4) {
> /*
> - * disable GART TBL walk error reporting, which
> + * disable GART TLB walk error reporting, which
> * trips off incorrectly with the IOMMU & 3ware
> * & Cerberus:
> */
© 2016 - 2026 Red Hat, Inc.