This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
virtio-mem support on s390.

The only "different than everything else" thing about virtio-mem on s390
is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
crash kernel must detect memory ranges of the crashed/panicked kernel to
include via PT_LOAD in the vmcore.

On other architectures, all RAM regions (boot + hotplugged) can easily be
observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
the elfcore hdr.

On s390, information about "ordinary" memory (heh, "storage") can be
obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
that information is stored early during boot in the "physmem" memblock
data structure.

But virtio-mem memory is always detected by a device driver, which is
usually built as a module. So in the crash kernel, this memory can only be
properly detected once the virtio-mem driver has started up.

The virtio-mem driver already supports the "kdump mode", where it won't
hotplug any memory but instead queries the device to implement the
pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
the vmcore.

With this series, if the virtio-mem driver is included in the kdump
initrd -- which dracut already takes care of under Fedora/RHEL -- it will
now detect the device RAM ranges on s390 once it probes the devices, and add
them to the vmcore using the same callback mechanism we already have for
pfn_is_ram().

To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
add new PT_LOAD entries that describe these memory ranges, and update
all offsets and the vmcore size so it is all consistent.

Note that makedumpfile is shaky with v6.12-rcX; I made the "obvious" things
(e.g., free page detection) work again while testing, as documented in [2].

Creating the dumps using makedumpfile seems to work fine, and the
dump regions (PT_LOAD) are as expected. I have yet to check in more detail
whether the created dumps are good (IOW, whether the right memory was
dumped), but it looks like makedumpfile reads the right memory when
interpreting the kernel data structures, which is promising.

Patch #1 -- #6 are vmcore preparations and cleanups
Patch #7 adds the infrastructure for drivers to report device RAM
Patch #8 + #9 are virtio-mem preparations
Patch #10 implements virtio-mem support to report device RAM
Patch #11 activates it for s390, implementing a new function to fill
          PT_LOAD entries for device RAM

[1] https://lkml.kernel.org/r/20241025141453.1210600-1-david@redhat.com
[2] https://github.com/makedumpfile/makedumpfile/issues/16

Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Eugenio Pérez" <eperezma@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>

David Hildenbrand (11):
  fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutex
  fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutex
  fs/proc/vmcore: disallow vmcore modifications after the vmcore was
    opened
  fs/proc/vmcore: move vmcore definitions from kcore.h to crash_dump.h
  fs/proc/vmcore: factor out allocating a vmcore memory node
  fs/proc/vmcore: factor out freeing a list of vmcore ranges
  fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM
    ranges in 2nd kernel
  virtio-mem: mark device ready before registering callbacks in kdump
    mode
  virtio-mem: remember usable region size
  virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAM
  s390/kdump: virtio-mem kdump support (CONFIG_PROC_VMCORE_DEVICE_RAM)

 arch/s390/Kconfig             |   1 +
 arch/s390/kernel/crash_dump.c |  39 +++--
 drivers/virtio/Kconfig        |   1 +
 drivers/virtio/virtio_mem.c   | 103 +++++++++++++-
 fs/proc/Kconfig               |  25 ++++
 fs/proc/vmcore.c              | 258 +++++++++++++++++++++++++---------
 include/linux/crash_dump.h    |  47 +++++++
 include/linux/kcore.h         |  13 --
 8 files changed, 396 insertions(+), 91 deletions(-)

--
2.46.1
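For context, the pfn_is_ram() mechanism the cover letter refers to is the
vmcore callback API from <linux/crash_dump.h>. Below is a minimal sketch of
how a driver hooks into it in kdump mode: struct vmcore_cb,
register_vmcore_cb() and is_kdump_kernel() are the existing kernel API,
while my_dev_pfn_is_plugged() is a hypothetical stand-in for the driver's
actual device query.

#include <linux/crash_dump.h>
#include <linux/module.h>

/* Hypothetical device query; a real driver would ask its device. */
static bool my_dev_pfn_is_plugged(unsigned long pfn)
{
        return true;
}

/*
 * Consulted when /proc/vmcore is read: report unplugged device memory
 * as "not RAM" so the hole is skipped instead of being read.
 */
static bool my_vmcore_pfn_is_ram(struct vmcore_cb *cb, unsigned long pfn)
{
        return my_dev_pfn_is_plugged(pfn);
}

static struct vmcore_cb my_vmcore_cb = {
        .pfn_is_ram = my_vmcore_pfn_is_ram,
};

static int __init my_driver_init(void)
{
        /*
         * In kdump mode, don't hotplug any memory; only register the
         * callback that fs/proc/vmcore consults while dumping.
         */
        if (is_kdump_kernel())
                register_vmcore_cb(&my_vmcore_cb);
        return 0;
}
module_init(my_driver_init);
MODULE_LICENSE("GPL");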
On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> virtio-mem support on s390.
>
> The only "different than everything else" thing about virtio-mem on s390
> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> crash kernel must detect memory ranges of the crashed/panicked kernel to
> include via PT_LOAD in the vmcore.
>
> On other architectures, all RAM regions (boot + hotplugged) can easily be
> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> the elfcore hdr.
>
> On s390, information about "ordinary" memory (heh, "storage") can be
> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> that information is stored early during boot in the "physmem" memblock
> data structure.
>
> But virtio-mem memory is always detected by a device driver, which is
> usually built as a module. So in the crash kernel, this memory can only be
                                       ~~~~~~~~~~~~
Is it 1st kernel or 2nd kernel? Usually we call the 1st kernel the
panicked or crashed kernel, and the 2nd kernel the kdump kernel.

> properly detected once the virtio-mem driver has started up.
>
> The virtio-mem driver already supports the "kdump mode", where it won't
> hotplug any memory but instead queries the device to implement the
> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> the vmcore.
>
> With this series, if the virtio-mem driver is included in the kdump
> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> now detect the device RAM ranges on s390 once it probes the devices, and add
> them to the vmcore using the same callback mechanism we already have for
> pfn_is_ram().

Do you mean that on s390 the virtio-mem memory regions will be detected
and added to the vmcore in the kdump kernel when the virtio-mem driver is
initialized? Not sure if I understand it correctly.

> To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
> add new PT_LOAD entries that describe these memory ranges, and update
> all offsets and the vmcore size so it is all consistent.
>
> Note that makedumpfile is shaky with v6.12-rcX; I made the "obvious" things
> (e.g., free page detection) work again while testing, as documented in [2].
>
> Creating the dumps using makedumpfile seems to work fine, and the
> dump regions (PT_LOAD) are as expected. I have yet to check in more detail
> whether the created dumps are good (IOW, whether the right memory was
> dumped), but it looks like makedumpfile reads the right memory when
> interpreting the kernel data structures, which is promising.
>
> Patch #1 -- #6 are vmcore preparations and cleanups
> Patch #7 adds the infrastructure for drivers to report device RAM
> Patch #8 + #9 are virtio-mem preparations
> Patch #10 implements virtio-mem support to report device RAM
> Patch #11 activates it for s390, implementing a new function to fill
>           PT_LOAD entries for device RAM
>
> [1] https://lkml.kernel.org/r/20241025141453.1210600-1-david@redhat.com
> [2] https://github.com/makedumpfile/makedumpfile/issues/16
>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Vasily Gorbik <gor@linux.ibm.com>
> Cc: Alexander Gordeev <agordeev@linux.ibm.com>
> Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Cc: "Eugenio Pérez" <eperezma@redhat.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: Dave Young <dyoung@redhat.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: Janosch Frank <frankja@linux.ibm.com>
> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Cc: Eric Farman <farman@linux.ibm.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
>
> David Hildenbrand (11):
>   fs/proc/vmcore: convert vmcore_cb_lock into vmcore_mutex
>   fs/proc/vmcore: replace vmcoredd_mutex by vmcore_mutex
>   fs/proc/vmcore: disallow vmcore modifications after the vmcore was
>     opened
>   fs/proc/vmcore: move vmcore definitions from kcore.h to crash_dump.h
>   fs/proc/vmcore: factor out allocating a vmcore memory node
>   fs/proc/vmcore: factor out freeing a list of vmcore ranges
>   fs/proc/vmcore: introduce PROC_VMCORE_DEVICE_RAM to detect device RAM
>     ranges in 2nd kernel
>   virtio-mem: mark device ready before registering callbacks in kdump
>     mode
>   virtio-mem: remember usable region size
>   virtio-mem: support CONFIG_PROC_VMCORE_DEVICE_RAM
>   s390/kdump: virtio-mem kdump support (CONFIG_PROC_VMCORE_DEVICE_RAM)
>
>  arch/s390/Kconfig             |   1 +
>  arch/s390/kernel/crash_dump.c |  39 +++--
>  drivers/virtio/Kconfig        |   1 +
>  drivers/virtio/virtio_mem.c   | 103 +++++++++++++-
>  fs/proc/Kconfig               |  25 ++++
>  fs/proc/vmcore.c              | 258 +++++++++++++++++++++++++---------
>  include/linux/crash_dump.h    |  47 +++++++
>  include/linux/kcore.h         |  13 --
>  8 files changed, 396 insertions(+), 91 deletions(-)
>
> --
> 2.46.1
>
On 15.11.24 09:46, Baoquan He wrote:
> On 10/25/24 at 05:11pm, David Hildenbrand wrote:
>> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
>> virtio-mem support on s390.
>>
>> The only "different than everything else" thing about virtio-mem on s390
>> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
>> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
>> crash kernel must detect memory ranges of the crashed/panicked kernel to
>> include via PT_LOAD in the vmcore.
>>
>> On other architectures, all RAM regions (boot + hotplugged) can easily be
>> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
>> the elfcore hdr.
>>
>> On s390, information about "ordinary" memory (heh, "storage") can be
>> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
>> that information is stored early during boot in the "physmem" memblock
>> data structure.
>>
>> But virtio-mem memory is always detected by a device driver, which is
>> usually built as a module. So in the crash kernel, this memory can only be
>                                        ~~~~~~~~~~~~
> Is it 1st kernel or 2nd kernel? Usually we call the 1st kernel the
> panicked or crashed kernel, and the 2nd kernel the kdump kernel.

It should have been called "kdump (2nd) kernel" here indeed.

>> properly detected once the virtio-mem driver has started up.
>>
>> The virtio-mem driver already supports the "kdump mode", where it won't
>> hotplug any memory but instead queries the device to implement the
>> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
>> the vmcore.
>>
>> With this series, if the virtio-mem driver is included in the kdump
>> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
>> now detect the device RAM ranges on s390 once it probes the devices, and add
>> them to the vmcore using the same callback mechanism we already have for
>> pfn_is_ram().
>
> Do you mean that on s390 the virtio-mem memory regions will be detected
> and added to the vmcore in the kdump kernel when the virtio-mem driver is
> initialized? Not sure if I understand it correctly.

Yes exactly. In the kdump kernel, the driver gets probed and registers
the vmcore callbacks. From there, we detect and add the device regions.

Thanks!

--
Cheers,

David / dhildenb
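To make the flow just described concrete (probe in the kdump kernel,
register callbacks, report device regions), here is a hedged sketch of the
reporting side, extending the earlier sketch. The get_device_ram callback,
the vmcore_alloc_add_range() helper, and the my_dev_*() accessors are
assumptions modeled on the cover letter's description of patches #7 and
#10, not confirmed signatures from the series.

/*
 * Assumed shape of the patch #7 extension: a second vmcore callback
 * that lets the probed driver report its device RAM ranges, which
 * fs/proc/vmcore then describes with new PT_LOAD entries. All names
 * and signatures here are assumptions, not the series' actual code.
 */
static int my_vmcore_get_device_ram(struct vmcore_cb *cb,
                                    struct list_head *list)
{
        /* Hypothetical device parameters: base address and usable size. */
        unsigned long long addr = my_dev_base_addr();
        unsigned long long size = my_dev_usable_region_size();

        /* Assumed helper: allocate a range entry and append it to the list. */
        return vmcore_alloc_add_range(list, addr, size);
}

static struct vmcore_cb my_vmcore_cb = {
        .pfn_is_ram     = my_vmcore_pfn_is_ram,
        .get_device_ram = my_vmcore_get_device_ram, /* assumed member */
};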
On 11/15/24 at 09:55am, David Hildenbrand wrote:
> On 15.11.24 09:46, Baoquan He wrote:
> > On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> > > This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> > > virtio-mem support on s390.
> > >
> > > The only "different than everything else" thing about virtio-mem on s390
> > > is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> > > during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> > > crash kernel must detect memory ranges of the crashed/panicked kernel to
> > > include via PT_LOAD in the vmcore.
> > >
> > > On other architectures, all RAM regions (boot + hotplugged) can easily be
> > > observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> > > the elfcore hdr.
> > >
> > > On s390, information about "ordinary" memory (heh, "storage") can be
> > > obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> > > that information is stored early during boot in the "physmem" memblock
> > > data structure.
> > >
> > > But virtio-mem memory is always detected by a device driver, which is
> > > usually built as a module. So in the crash kernel, this memory can only be
> >                                         ~~~~~~~~~~~~
> > Is it 1st kernel or 2nd kernel? Usually we call the 1st kernel the
> > panicked or crashed kernel, and the 2nd kernel the kdump kernel.
>
> It should have been called "kdump (2nd) kernel" here indeed.
>
> > > properly detected once the virtio-mem driver has started up.
> > >
> > > The virtio-mem driver already supports the "kdump mode", where it won't
> > > hotplug any memory but instead queries the device to implement the
> > > pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> > > the vmcore.
> > >
> > > With this series, if the virtio-mem driver is included in the kdump
> > > initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> > > now detect the device RAM ranges on s390 once it probes the devices, and add
> > > them to the vmcore using the same callback mechanism we already have for
> > > pfn_is_ram().
> >
> > Do you mean that on s390 the virtio-mem memory regions will be detected
> > and added to the vmcore in the kdump kernel when the virtio-mem driver is
> > initialized? Not sure if I understand it correctly.
>
> Yes exactly. In the kdump kernel, the driver gets probed and registers
> the vmcore callbacks. From there, we detect and add the device regions.

I see now, thanks for your confirmation.
On 10/25/24 at 05:11pm, David Hildenbrand wrote:
> This is based on "[PATCH v3 0/7] virtio-mem: s390 support" [1], which adds
> virtio-mem support on s390.
>
> The only "different than everything else" thing about virtio-mem on s390
> is kdump: The crash (2nd) kernel allocates+prepares the elfcore hdr
> during fs_init()->vmcore_init()->elfcorehdr_alloc(). Consequently, the
> crash kernel must detect memory ranges of the crashed/panicked kernel to
> include via PT_LOAD in the vmcore.
>
> On other architectures, all RAM regions (boot + hotplugged) can easily be
> observed on the old (to crash) kernel (e.g., using /proc/iomem) to create
> the elfcore hdr.
>
> On s390, information about "ordinary" memory (heh, "storage") can be
> obtained by querying the hypervisor/ultravisor via SCLP/diag260, and
> that information is stored early during boot in the "physmem" memblock
> data structure.
>
> But virtio-mem memory is always detected by a device driver, which is
> usually built as a module. So in the crash kernel, this memory can only be
> properly detected once the virtio-mem driver has started up.
>
> The virtio-mem driver already supports the "kdump mode", where it won't
> hotplug any memory but instead queries the device to implement the
> pfn_is_ram() callback, to avoid reading unplugged memory holes when reading
> the vmcore.
>
> With this series, if the virtio-mem driver is included in the kdump
> initrd -- which dracut already takes care of under Fedora/RHEL -- it will
> now detect the device RAM ranges on s390 once it probes the devices, and add
> them to the vmcore using the same callback mechanism we already have for
> pfn_is_ram().
>
> To add these device RAM ranges to the vmcore ("patch the vmcore"), we will
> add new PT_LOAD entries that describe these memory ranges, and update
> all offsets and the vmcore size so it is all consistent.
>
> Note that makedumpfile is shaky with v6.12-rcX; I made the "obvious" things
> (e.g., free page detection) work again while testing, as documented in [2].
>
> Creating the dumps using makedumpfile seems to work fine, and the
> dump regions (PT_LOAD) are as expected. I have yet to check in more detail
> whether the created dumps are good (IOW, whether the right memory was
> dumped), but it looks like makedumpfile reads the right memory when
> interpreting the kernel data structures, which is promising.
>
> Patch #1 -- #6 are vmcore preparations and cleanups

Thanks for CC'ing me. I will review patches 1-6, the vmcore part, next week.