The only "different than everything else" thing about virtio-mem on s390
is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
kdump kernel must detect memory ranges of the crashed kernel to
include via PT_LOAD in the vmcore.

On other architectures, all RAM regions (boot + hotplugged) can easily be
observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
the elfcore hdr.

On s390, information about "ordinary" memory (heh, "storage") can be
obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
that information is stored early during boot in the "physmem" memblock
data structure.

But virtio-mem memory is always detected by a device driver, which is
usually built as a module. So in the crash kernel, this memory can only be
properly detected once the virtio-mem driver has started up.

The virtio-mem driver already supports the "kdump mode", where it won't
hotplug any memory but instead queries the device to implement the
pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
the vmcore.
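
To illustrate the pfn_is_ram() callback idea described above, here is a
rough userspace model, not the kernel code. All sketch_* names, the fixed-size
callback table and the per-PFN "plugged" array are assumptions made up for
this illustration: a driver registers a callback, and a PFN counts as RAM
unless some registered callback says it is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CBS 4

/* Hypothetical callback object a driver would register. */
struct sketch_vmcore_cb {
	bool (*pfn_is_ram)(struct sketch_vmcore_cb *cb, unsigned long pfn);
	void *priv;		/* driver-private data */
};

static struct sketch_vmcore_cb *sketch_cbs[SKETCH_MAX_CBS];
static size_t sketch_nr_cbs;

static void sketch_register_cb(struct sketch_vmcore_cb *cb)
{
	assert(sketch_nr_cbs < SKETCH_MAX_CBS);
	sketch_cbs[sketch_nr_cbs++] = cb;
}

/* A PFN is treated as RAM unless a registered callback rejects it. */
static bool sketch_pfn_is_ram(unsigned long pfn)
{
	for (size_t i = 0; i < sketch_nr_cbs; i++) {
		struct sketch_vmcore_cb *cb = sketch_cbs[i];

		if (cb->pfn_is_ram && !cb->pfn_is_ram(cb, pfn))
			return false;
	}
	return true;
}

/* Hypothetical device covering PFNs [start_pfn, start_pfn + nr_pfns). */
struct sketch_dev {
	unsigned long start_pfn;
	unsigned long nr_pfns;
	const bool *plugged;	/* one entry per PFN of the device range */
};

static bool sketch_dev_pfn_is_ram(struct sketch_vmcore_cb *cb,
				  unsigned long pfn)
{
	const struct sketch_dev *dev = cb->priv;

	/* PFNs outside the device range are not ours to judge. */
	if (pfn < dev->start_pfn || pfn >= dev->start_pfn + dev->nr_pfns)
		return true;
	return dev->plugged[pfn - dev->start_pfn];
}
```

In this model, a reader of the dump would call sketch_pfn_is_ram() before
touching a page and skip (or zero-fill) anything reported as a hole.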
With this series, if the virtio-mem driver is included in the kdump
initrd -- which dracut already takes care of under Fedora/RHEL -- it will
now detect the device RAM ranges on s390 once it probes the devices, to add
them to the vmcore using the same callback mechanism we already have for
pfn_is_ram().

To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
add new PT_LOAD entries that describe these memory ranges, and update
all offsets + the vmcore size so it is all consistent.
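
As a simplified illustration of "patching the vmcore", the sketch below
appends one load segment per device RAM range and keeps the file offsets and
the total size consistent. It is a userspace model under stated assumptions,
not the code from this series: struct sketch_phdr is a cut-down stand-in for
Elf64_Phdr, and sketch_add_device_ram() and its parameters are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for Elf64_Phdr: only the fields this sketch needs. */
struct sketch_phdr {
	uint64_t p_offset;	/* where the segment data starts in the file */
	uint64_t p_paddr;	/* physical start address of the RAM range */
	uint64_t p_filesz;	/* bytes of that range present in the file */
};

/* One device RAM range as a driver might report it (hypothetical). */
struct sketch_ram_range {
	uint64_t paddr;
	uint64_t size;
};

/*
 * Append one load segment per range to phdrs[] and place its data at the
 * current end of the file, growing *file_size accordingly.
 * Returns 0 on success, -1 if the header table has no room left; in that
 * case nothing is modified.
 */
static int sketch_add_device_ram(struct sketch_phdr *phdrs, size_t *nr_phdrs,
				 size_t max_phdrs, uint64_t *file_size,
				 const struct sketch_ram_range *ranges,
				 size_t nr_ranges)
{
	assert(*nr_phdrs <= max_phdrs);

	if (nr_ranges > max_phdrs - *nr_phdrs)
		return -1;

	for (size_t i = 0; i < nr_ranges; i++) {
		struct sketch_phdr *ph = &phdrs[(*nr_phdrs)++];

		ph->p_offset = *file_size;
		ph->p_paddr = ranges[i].paddr;
		ph->p_filesz = ranges[i].size;
		*file_size += ranges[i].size;
	}
	return 0;
}
```

Appending at the end of the file is a choice made for the sketch: existing
segments keep their offsets, and only the new entries and the total size
need to be computed.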
My testing when creating+analyzing crash dumps with hotplugged virtio-mem
memory (incl. holes) did not reveal any surprises.

Patch #1 -- #7 are vmcore preparations and cleanups
Patch #8 adds the infrastructure for drivers to report device RAM
Patch #9 + #10 are virtio-mem preparations
Patch #11 implements virtio-mem support to report device RAM
Patch #12 activates it for s390, implementing a new function to fill
          PT_LOAD entry for device RAM

v1 -> v2:
* "fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutex"
 -> Extend patch description
* "fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutex"
 -> Extend patch description
* "fs/proc/vmcore: disallow vmcore modifications while the vmcore is open"
 -> Disallow modifications only if it is currently open, but warn if it
    was already open and got closed again.
 -> Track vmcore_open vs. vmcore_opened
 -> Extend patch description
* "fs/proc/vmcore: prefix all pr_* with "vmcore:""
 -> Added
* "fs/proc/vmcore: move vmcore definitions out of kcore.h"
 -> Call it "vmcore_range"
 -> Place vmcoredd_node into vmcore.c
 -> Adjust patch subject + description
* "fs/proc/vmcore: factor out allocating a vmcore range and adding it to a
   list"
 -> Adjust to "vmcore_range"
* "fs/proc/vmcore: factor out freeing a list of vmcore ranges"
 -> Adjust to "vmcore_range"
* "fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM
   ranges in 2nd kernel"
 -> Drop PROVIDE_PROC_VMCORE_DEVICE_RAM for now
 -> Simplify Kconfig a bit
 -> Drop "Kdump:" from warnings/errors
 -> Perform Elf64 check first
 -> Add regions also if the vmcore was opened, but got closed again. But
    warn in any case, because it is unexpected.
 -> Adjust patch description
* "virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAM"
 -> "depends on VIRTIO_MEM" for PROC_VMCORE_DEVICE_RAM
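
The "currently open" vs. "was opened before" distinction from the changelog
above can be modelled roughly as follows. This is a userspace sketch, not the
code from the patches: the sketch_* names and the three-valued result are
made up for illustration, and locking is left out entirely (the series
serializes such state with vmcore_mutex).

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_result {
	SKETCH_OK,	/* modification allowed */
	SKETCH_OK_WARN,	/* allowed, but unexpected: dump was opened before */
	SKETCH_BUSY,	/* refused: dump is currently open */
};

static unsigned int sketch_open_count;	/* number of current openers */
static bool sketch_opened;		/* true once it was opened at all */

static void sketch_open(void)
{
	sketch_open_count++;
	sketch_opened = true;
}

static void sketch_release(void)
{
	assert(sketch_open_count > 0);
	sketch_open_count--;
}

/* Decide whether the dump layout may be modified right now. */
static enum sketch_result sketch_try_modify(void)
{
	if (sketch_open_count)
		return SKETCH_BUSY;
	if (sketch_opened)
		return SKETCH_OK_WARN;
	return SKETCH_OK;
}
```

The "opened" flag is sticky on purpose: once a reader has seen the old
layout, a later change still goes through but is worth a warning.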
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
David Hildenbrand (12):
  fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutex
  fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutex
  fs/proc/vmcore: disallow vmcore modifications while the vmcore is open
  fs/proc/vmcore: prefix all pr_* with "vmcore:"
  fs/proc/vmcore: move vmcore definitions out of kcore.h
  fs/proc/vmcore: factor out allocating a vmcore range and adding it to
    a list
  fs/proc/vmcore: factor out freeing a list of vmcore ranges
  fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM
    ranges in 2nd kernel
  virtio-mem: mark device ready before registering callbacks in kdump
    mode
  virtio-mem: remember usable region size
  virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAM
  s390/kdump: virtio-mem kdump support (CONFIG_PROC_VMCORE_DEVICE_RAM)

 arch/s390/Kconfig             |   1 +
 arch/s390/kernel/crash_dump.c |  39 ++++-
 drivers/virtio/virtio_mem.c   | 103 ++++++++++++-
 fs/proc/Kconfig               |  19 +++
 fs/proc/vmcore.c              | 283 ++++++++++++++++++++++++++--------
 include/linux/crash_dump.h    |  41 +++++
 include/linux/kcore.h         |  13 --
 7 files changed, 407 insertions(+), 92 deletions(-)
base-commit: feffde684ac29a3b7aec82d2df850fbdbdee55e4
--
2.47.1
On Wed, Dec 04, 2024 at 01:54:31PM +0100, David Hildenbrand wrote:
> The only "different than everything else" thing about virtio-mem on s390
> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> kdump kernel must detect memory ranges of the crashed kernel to
> include via PT_LOAD in the vmcore.
>
> On other architectures, all RAM regions (boot + hotplugged) can easily be
> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> the elfcore hdr.
>
> On s390, information about "ordinary" memory (heh, "storage") can be
> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> that information is stored early during boot in the "physmem" memblock
> data structure.
>
> But virtio-mem memory is always detected by a device driver, which is
> usually built as a module. So in the crash kernel, this memory can only be
> properly detected once the virtio-mem driver has started up.
>
> The virtio-mem driver already supports the "kdump mode", where it won't
> hotplug any memory but instead queries the device to implement the
> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> the vmcore.
>
> With this series, if the virtio-mem driver is included in the kdump
> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> now detect the device RAM ranges on s390 once it probes the devices, to add
> them to the vmcore using the same callback mechanism we already have for
> pfn_is_ram().
>
> To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
> add new PT_LOAD entries that describe these memory ranges, and update
> all offsets + the vmcore size so it is all consistent.
>
> My testing when creating+analyzing crash dumps with hotplugged virtio-mem
> memory (incl. holes) did not reveal any surprises.
>
> Patch #1 -- #7 are vmcore preparations and cleanups
> Patch #8 adds the infrastructure for drivers to report device RAM
> Patch #9 + #10 are virtio-mem preparations
> Patch #11 implements virtio-mem support to report device RAM
> Patch #12 activates it for s390, implementing a new function to fill
> PT_LOAD entry for device RAM
Who is merging this?
virtio parts:
Acked-by: Michael S. Tsirkin <mst@redhat.com>
On Wed, Jan 08, 2025 at 07:04:23AM -0500, Michael S. Tsirkin wrote:
> Who is merging this?
> virtio parts:
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
I guess this series should go via Andrew Morton. Andrew?
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
On 08.01.25 13:10, Heiko Carstens wrote:
> On Wed, Jan 08, 2025 at 07:04:23AM -0500, Michael S. Tsirkin wrote:
>> Who is merging this?
>> virtio parts:
>>
>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> I guess this series should go via Andrew Morton. Andrew?
>
> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
>
Yes, it's in mm-unstable already for quite a while.
Thanks for the acks!
--
Cheers,
David / dhildenb