[RFC PATCH 0/5] Print CPU at segfault time

ira.weiny@intel.com posted 5 patches 6 days, 5 hours ago
From: Ira Weiny <ira.weiny@intel.com>


Rik reported that knowing which CPUs are seeing faults can help in
determining which CPUs are failing in a large data center.[1]

Storing the CPU at exception entry time allows the segfault message to report
the CPU which actually took the exception.  This still may not be the CPU
which is failing but it should be closer.

Dave and Boris recognized that the auxiliary pt_regs work I did for the PKS
series could help to store this value and avoid passing the CPU throughout the
fault handler call stack.
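
As a rough userspace illustration of the idea (not the code in this series),
the sketch below places an auxiliary area directly next to the saved
registers, so that a handler holding only a pt_regs pointer can recover the
CPU recorded at entry.  All names with a _sketch suffix, the helper functions,
and the message format are hypothetical and chosen only for this example; the
actual layout and helpers in the patches may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the saved register frame. */
struct pt_regs_sketch {
	unsigned long ip;
	unsigned long sp;
};

/* Hypothetical auxiliary area filled in at exception entry. */
struct pt_regs_auxiliary_sketch {
	int cpu;
};

/* Assumed layout: auxiliary data sits immediately before the registers. */
struct pt_regs_extended_sketch {
	struct pt_regs_auxiliary_sketch aux;
	struct pt_regs_sketch pt_regs;
};

/* Recover the auxiliary area from a plain register pointer. */
static struct pt_regs_auxiliary_sketch *to_aux_sketch(struct pt_regs_sketch *regs)
{
	struct pt_regs_extended_sketch *ext;

	ext = (struct pt_regs_extended_sketch *)
		((char *)regs - offsetof(struct pt_regs_extended_sketch, pt_regs));
	return &ext->aux;
}

/* Entry path: record the CPU once, before anything can migrate the task. */
static void entry_save_cpu_sketch(struct pt_regs_sketch *regs, int cpu)
{
	to_aux_sketch(regs)->cpu = cpu;
}

/* Fault path: no CPU argument is passed down; it is read back from regs. */
static int format_segfault_sketch(struct pt_regs_sketch *regs, char *buf,
				  size_t len)
{
	return snprintf(buf, len, "segfault at ip %lx sp %lx likely on CPU %d",
			regs->ip, regs->sp, to_aux_sketch(regs)->cpu);
}
```

The point of the layout is that only the entry code needs to know the CPU;
everything further down the fault path keeps its existing pt_regs-only
signature.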

I'm posting this RFC for a few reasons.

1) I've left in arch_restore_aux_pt_regs().  This is called on exception exit
   and is not needed for this use case but I believe it is better to leave it
   for symmetry within the generic entry code.  Because the restore call is
   not needed here, patch 1/5 could also be dropped completely.

2) I want to see if 0day has any issues with the Kconfig option changes I made
   which may creep in from a 32-bit build.

3) The final patch could be squashed with Rik's but it seemed better to leave
   them split for authorship clarity.

Compile tested only.

[1] https://lore.kernel.org/all/20220805101644.2e674553@imladris.surriel.com/

Ira Weiny (4):
  entry: Pass pt_regs to irqentry_exit_cond_resched()
  entry: Add calls for save/restore auxiliary pt_regs
  x86/entry: Add auxiliary pt_regs space
  x86/entry: Store CPU info on exception entry

Rik van Riel (1):
  x86,mm: print likely CPU at segfault time

 arch/arm64/include/asm/preempt.h    |  2 +-
 arch/arm64/kernel/entry-common.c    |  4 +--
 arch/x86/Kconfig                    |  4 +++
 arch/x86/entry/calling.h            | 19 ++++++++++++++
 arch/x86/entry/common.c             |  2 +-
 arch/x86/entry/entry_64.S           | 22 ++++++++++++++++
 arch/x86/entry/entry_64_compat.S    |  6 +++++
 arch/x86/include/asm/entry-common.h | 12 +++++++++
 arch/x86/include/asm/ptrace.h       | 19 ++++++++++++++
 arch/x86/kernel/asm-offsets_64.c    | 15 +++++++++++
 arch/x86/kernel/head_64.S           |  6 +++++
 arch/x86/mm/fault.c                 | 10 ++++++++
 include/linux/entry-common.h        | 25 +++++++++++++-----
 kernel/entry/common.c               | 29 ++++++++++++++++-----
 kernel/sched/core.c                 | 40 ++++++++++++++---------------
 15 files changed, 178 insertions(+), 37 deletions(-)


base-commit: b2a88c212e652e94f1e4b635910972ac57ba4e97
-- 
2.35.3