Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
generalizes the crash hotplug support to allow architectures to update
multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
Therefore, update the relevant kernel documentation to reflect the same.
No functional change.
Cc: Petr Tesarik <petr@tesarici.cz>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
Changelog:
Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
- Update crash_hotplug sysfs document as suggested by Petr T
- Update the error message in the crash_handle_hotplug_event() and
crash_check_hotplug_support() functions.
---
.../ABI/testing/sysfs-devices-memory | 6 ++--
.../ABI/testing/sysfs-devices-system-cpu | 6 ++--
.../admin-guide/mm/memory-hotplug.rst | 5 +--
Documentation/core-api/cpu_hotplug.rst | 10 +++---
kernel/crash_core.c | 33 +++++++++++--------
5 files changed, 35 insertions(+), 25 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
index a95e0f17c35a..cec65827e602 100644
--- a/Documentation/ABI/testing/sysfs-devices-memory
+++ b/Documentation/ABI/testing/sysfs-devices-memory
@@ -115,6 +115,6 @@ What: /sys/devices/system/memory/crash_hotplug
Date: Aug 2023
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
- (RO) indicates whether or not the kernel directly supports
- modifying the crash elfcorehdr for memory hot un/plug and/or
- on/offline changes.
+ (RO) indicates whether or not the kernel updates relevant kexec
+ segments on memory hot un/plug and/or on/offline events, avoiding the
+ need to reload kdump kernel.
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 325873385b71..1a31b7c71676 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -703,9 +703,9 @@ What: /sys/devices/system/cpu/crash_hotplug
Date: Aug 2023
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description:
- (RO) indicates whether or not the kernel directly supports
- modifying the crash elfcorehdr for CPU hot un/plug and/or
- on/offline changes.
+ (RO) indicates whether or not the kernel updates relevant kexec
+ segments on CPU hot un/plug and/or on/offline events, avoiding the
+ need to reload kdump kernel.
What: /sys/devices/system/cpu/enabled
Date: Nov 2022
diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
index 098f14d83e99..cb2c080f400c 100644
--- a/Documentation/admin-guide/mm/memory-hotplug.rst
+++ b/Documentation/admin-guide/mm/memory-hotplug.rst
@@ -294,8 +294,9 @@ The following files are currently defined:
``crash_hotplug`` read-only: when changes to the system memory map
occur due to hot un/plug of memory, this file contains
'1' if the kernel updates the kdump capture kernel memory
- map itself (via elfcorehdr), or '0' if userspace must update
- the kdump capture kernel memory map.
+ map itself (via elfcorehdr and other relevant kexec
+ segments), or '0' if userspace must update the kdump
+ capture kernel memory map.
Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
configuration option.
diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
index dcb0e379e5e8..a21dbf261be7 100644
--- a/Documentation/core-api/cpu_hotplug.rst
+++ b/Documentation/core-api/cpu_hotplug.rst
@@ -737,8 +737,9 @@ can process the event further.
When changes to the CPUs in the system occur, the sysfs file
/sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
-updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
-or '0' if userspace must update the kdump capture kernel list of CPUs.
+updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
+other relevant kexec segments), or '0' if userspace must update the kdump
+capture kernel list of CPUs.
The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
option.
@@ -750,8 +751,9 @@ file can be used in a udev rule as follows:
SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
For a CPU hot un/plug event, if the architecture supports kernel updates
-of the elfcorehdr (which contains the list of CPUs), then the rule skips
-the unload-then-reload of the kdump capture kernel.
+of the elfcorehdr (which contains the list of CPUs) and other relevant
+kexec segments, then the rule skips the unload-then-reload of the kdump
+capture kernel.
Kernel Inline Documentations Reference
======================================
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 63cf89393c6e..c1048893f4b6 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
crash_hotplug_lock();
/* Obtain lock while reading crash information */
if (!kexec_trylock()) {
- pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
+ pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
crash_hotplug_unlock();
return 0;
}
@@ -520,18 +520,25 @@ int crash_check_hotplug_support(void)
}
/*
- * To accurately reflect hot un/plug changes of cpu and memory resources
- * (including onling and offlining of those resources), the elfcorehdr
- * (which is passed to the crash kernel via the elfcorehdr= parameter)
- * must be updated with the new list of CPUs and memories.
+ * To accurately reflect hot un/plug changes of CPU and Memory resources
+ * (including onlining and offlining of those resources), the relevant
+ * kexec segments must be updated with the latest CPU and Memory resources.
*
- * In order to make changes to elfcorehdr, two conditions are needed:
- * First, the segment containing the elfcorehdr must be large enough
- * to permit a growing number of resources; the elfcorehdr memory size
- * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.
- * Second, purgatory must explicitly exclude the elfcorehdr from the
- * list of segments it checks (since the elfcorehdr changes and thus
- * would require an update to purgatory itself to update the digest).
+ * Architectures must ensure two things for all segments that need
+ * updating during hotplug events:
+ *
+ * 1. Segments must be large enough to accommodate a growing number of
+ * resources.
+ * 2. Exclude the segments from SHA verification.
+ *
+ * For example, on most architectures, the elfcorehdr (which is passed
+ * to the crash kernel via the elfcorehdr= parameter) must include the
+ * new list of CPUs and memory. To make changes to the elfcorehdr, it
+ * should be large enough to permit a growing number of CPU and Memory
+ * resources. One can estimate the elfcorehdr memory size based on
+ * NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. The elfcorehdr is
+ * excluded from SHA verification by default if the architecture
+ * supports crash hotplug.
*/
static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
{
@@ -540,7 +547,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
crash_hotplug_lock();
/* Obtain lock while changing crash information */
if (!kexec_trylock()) {
- pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
+ pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
crash_hotplug_unlock();
return;
}
--
2.45.2
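For anyone wanting to consume the crash_hotplug attributes documented above from
a program rather than a udev rule, here is a minimal userspace sketch. It is not
part of the patch. The function names parse_crash_hotplug() and
read_crash_hotplug() are made up for illustration; the only things taken from
the documentation are the sysfs paths and the meaning of '1' and '0'.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Interpret the contents of a crash_hotplug sysfs attribute.
 * Returns 1 if the kernel updates the kdump image itself, 0 if userspace
 * must reload the kdump kernel, and -1 if the contents are not understood.
 * (Hypothetical helper, not a kernel or library API.)
 */
int parse_crash_hotplug(const char *contents)
{
	assert(contents != NULL);

	/* Accept exactly "0" or "1", optionally followed by a newline. */
	if ((contents[0] == '0' || contents[0] == '1') &&
	    (contents[1] == '\0' || contents[1] == '\n'))
		return contents[0] - '0';

	return -1;
}

/*
 * Read an attribute such as /sys/devices/system/cpu/crash_hotplug or
 * /sys/devices/system/memory/crash_hotplug.
 * Returns -1 if the file cannot be opened or read, for example when the
 * corresponding hotplug config option is not enabled.
 */
int read_crash_hotplug(const char *path)
{
	char buf[8] = "";
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}

	fclose(f);
	return parse_crash_hotplug(buf);
}
```

A caller would reload the kdump kernel on a hotplug event unless
read_crash_hotplug("/sys/devices/system/cpu/crash_hotplug") returns 1, which
mirrors what the udev rule in cpu_hotplug.rst does.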
Add Jonathan and Andrew.
On 08/12/24 at 09:46am, Sourabh Jain wrote:
> Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
> generalizes the crash hotplug support to allow architectures to update
> multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
> Therefore, update the relevant kernel documentation to reflect the same.
Hi Jonathan and Andrew,
Could any of you pick this into your tree?
Thanks
Baoquan
On 08/12/24 at 09:46am, Sourabh Jain wrote:
......
> ---
>
> Changelog:
>
> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
> - Update crash_hotplug sysfs document as suggested by Petr T
> - Update an error message in crash_handle_hotplug_event and
> crash_check_hotplug_support function.
>
> ---
......
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 63cf89393c6e..c1048893f4b6 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
> crash_hotplug_lock();
> /* Obtain lock while reading crash information */
> if (!kexec_trylock()) {
> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
Wondering why this needs to be updated.
> crash_hotplug_unlock();
> return 0;
> }
> @@ -520,18 +520,25 @@ int crash_check_hotplug_support(void)
> }
>
> /*
> - * To accurately reflect hot un/plug changes of cpu and memory resources
> - * (including onling and offlining of those resources), the elfcorehdr
> - * (which is passed to the crash kernel via the elfcorehdr= parameter)
> - * must be updated with the new list of CPUs and memories.
> + * To accurately reflect hot un/plug changes of CPU and Memory resources
> + * (including onling and offlining of those resources), the relevant
> + * kexec segments must be updated with latest CPU and Memory resources.
> *
> - * In order to make changes to elfcorehdr, two conditions are needed:
> - * First, the segment containing the elfcorehdr must be large enough
> - * to permit a growing number of resources; the elfcorehdr memory size
> - * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.
> - * Second, purgatory must explicitly exclude the elfcorehdr from the
> - * list of segments it checks (since the elfcorehdr changes and thus
> - * would require an update to purgatory itself to update the digest).
> + * Architectures must ensure two things for all segments that need
> + * updating during hotplug events:
> + *
> + * 1. Segments must be large enough to accommodate a growing number of
> + * resources.
> + * 2. Exclude the segments from SHA verification.
> + *
> + * For example, on most architectures, the elfcorehdr (which is passed
> + * to the crash kernel via the elfcorehdr= parameter) must include the
> + * new list of CPUs and memory. To make changes to the elfcorehdr, it
> + * should be large enough to permit a growing number of CPU and Memory
> + * resources. One can estimate the elfcorehdr memory size based on
> + * NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. The elfcorehdr is
> + * excluded from SHA verification by default if the architecture
> + * supports crash hotplug.
> */
> static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
> {
> @@ -540,7 +547,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
> crash_hotplug_lock();
> /* Obtain lock while changing crash information */
> if (!kexec_trylock()) {
> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
> crash_hotplug_unlock();
> return;
> }
> --
> 2.45.2
>
Hello Baoquan,
On 13/08/24 10:34, Baoquan He wrote:
> On 08/12/24 at 09:46am, Sourabh Jain wrote:
> ......
>> ---
>>
>> Changelog:
>>
>> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
>> - Update crash_hotplug sysfs document as suggested by Petr T
>> - Update an error message in crash_handle_hotplug_event and
>> crash_check_hotplug_support function.
>>
>> ---
> ......
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 63cf89393c6e..c1048893f4b6 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
>> crash_hotplug_lock();
>> /* Obtain lock while reading crash information */
>> if (!kexec_trylock()) {
>> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
> Wondering why this need be updated.
On some architectures, additional kexec segments also become stale after a
hotplug event, so reporting only `elfcorehdr may be inaccurate` is not
sufficient. The message has therefore been generalized to refer to the
kdump image.
Thanks,
Sourabh Jain
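As the updated comment in crash_core.c notes, any segment that is rewritten on
hotplug has to be sized for growth when it is loaded. The sketch below gives a
rough userspace estimate for the elfcorehdr case. It is only an illustration:
the layout (one ELF header, one program header per CPU and per memory range,
plus two extra program headers) is an assumption, not the formula any
particular architecture uses, and both function names are hypothetical. The
arguments stand in for NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/*
 * Estimate how many bytes an elfcorehdr needs so that it can be updated
 * in place as CPUs and memory ranges are added.
 *
 * Assumed layout (illustrative only): one Elf64_Ehdr, one program header
 * per possible CPU, one per memory range, and two extra program headers
 * reserved for other entries.
 */
size_t estimate_elfcorehdr_size(unsigned int nr_cpus,
				unsigned int max_mem_ranges)
{
	size_t nr_phdrs;

	assert(nr_cpus > 0);	/* there is always at least the boot CPU */

	nr_phdrs = 2 + (size_t)nr_cpus + (size_t)max_mem_ranges;
	return sizeof(Elf64_Ehdr) + nr_phdrs * sizeof(Elf64_Phdr);
}

/* Return nonzero if a segment of segment_size bytes can hold the estimate. */
int elfcorehdr_fits(size_t segment_size, unsigned int nr_cpus,
		    unsigned int max_mem_ranges)
{
	return segment_size >= estimate_elfcorehdr_size(nr_cpus, max_mem_ranges);
}
```

The same reasoning applies to any other segment an architecture updates on
hotplug: if the segment was not over-allocated at load time, an in-place
update cannot accommodate new resources and the kdump image has to be reloaded.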
On 08/13/24 at 10:58am, Sourabh Jain wrote:
> Hello Baoquan,
>
> On 13/08/24 10:34, Baoquan He wrote:
> > On 08/12/24 at 09:46am, Sourabh Jain wrote:
> > ......
> > > ---
> > >
> > > Changelog:
> > >
> > > Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
> > > - Update crash_hotplug sysfs document as suggested by Petr T
> > > - Update an error message in crash_handle_hotplug_event and
> > > crash_check_hotplug_support function.
> > >
> > > ---
> > ......
> > > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > > index 63cf89393c6e..c1048893f4b6 100644
> > > --- a/kernel/crash_core.c
> > > +++ b/kernel/crash_core.c
> > > @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
> > > crash_hotplug_lock();
> > > /* Obtain lock while reading crash information */
> > > if (!kexec_trylock()) {
> > > - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> > > + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
> > Wondering why this need be updated.
>
> In some architectures, additional kexec segments become obsolete during a
> hotplug event,
> so simply calling out the `elfcorehdr may be inaccurate` may not be
> sufficient.
> Therefore, it has been generalized with the kdump image.
OK, I forgot the case in ppc, makes sense to me, thx.
Acked-by: Baoquan He <bhe@redhat.com>
Hello Baoquan,
On 13/08/24 14:47, Baoquan He wrote:
> On 08/13/24 at 10:58am, Sourabh Jain wrote:
>> Hello Baoquan,
>>
>> On 13/08/24 10:34, Baoquan He wrote:
>>> On 08/12/24 at 09:46am, Sourabh Jain wrote:
>>> ......
>>>> ---
>>>>
>>>> Changelog:
>>>>
>>>> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
>>>> - Update crash_hotplug sysfs document as suggested by Petr T
>>>> - Update an error message in crash_handle_hotplug_event and
>>>> crash_check_hotplug_support function.
>>>>
>>>> ---
>>> ......
>>>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>>>> index 63cf89393c6e..c1048893f4b6 100644
>>>> --- a/kernel/crash_core.c
>>>> +++ b/kernel/crash_core.c
>>>> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
>>>> crash_hotplug_lock();
>>>> /* Obtain lock while reading crash information */
>>>> if (!kexec_trylock()) {
>>>> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>>>> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
>>> Wondering why this need be updated.
>> In some architectures, additional kexec segments become obsolete during a
>> hotplug event,
>> so simply calling out the `elfcorehdr may be inaccurate` may not be
>> sufficient.
>> Therefore, it has been generalized with the kdump image.
> OK, I forgot the case in ppc, makes sense to me, thx.
>
> Acked-by: Baoquan He <bhe@redhat.com>
Do we know who will be applying this patch and how it will be merged
into Linus’s tree?
Thanks,
Sourabh Jain
Hello Baoquan,
On 13/08/24 14:47, Baoquan He wrote:
> On 08/13/24 at 10:58am, Sourabh Jain wrote:
>> Hello Baoquan,
>>
>> On 13/08/24 10:34, Baoquan He wrote:
>>> On 08/12/24 at 09:46am, Sourabh Jain wrote:
>>> ......
>>>> ---
>>>>
>>>> Changelog:
>>>>
>>>> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
>>>> - Update crash_hotplug sysfs document as suggested by Petr T
>>>> - Update an error message in crash_handle_hotplug_event and
>>>> crash_check_hotplug_support function.
>>>>
>>>> ---
>>> ......
>>>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>>>> index 63cf89393c6e..c1048893f4b6 100644
>>>> --- a/kernel/crash_core.c
>>>> +++ b/kernel/crash_core.c
>>>> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
>>>> crash_hotplug_lock();
>>>> /* Obtain lock while reading crash information */
>>>> if (!kexec_trylock()) {
>>>> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>>>> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
>>> Wondering why this need be updated.
>> In some architectures, additional kexec segments become obsolete during a
>> hotplug event,
>> so simply calling out the `elfcorehdr may be inaccurate` may not be
>> sufficient.
>> Therefore, it has been generalized with the kdump image.
> OK, I forgot the case in ppc, makes sense to me, thx.
>
> Acked-by: Baoquan He <bhe@redhat.com>
Thanks for the Ack!
- Sourabh Jain
On Mon, 12 Aug 2024 09:46:51 +0530
Sourabh Jain <sourabhjain@linux.ibm.com> wrote:
> Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
> generalizes the crash hotplug support to allow architectures to update
> multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
> Therefore, update the relevant kernel documentation to reflect the same.
>
> No functional change.
>
> Cc: Petr Tesarik <petr@tesarici.cz>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: x86@kernel.org
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
It's perfect now.
Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Petr T
> ---
>
> Changelog:
>
> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
> - Update crash_hotplug sysfs document as suggested by Petr T
> - Update an error message in crash_handle_hotplug_event and
> crash_check_hotplug_support function.
>
> ---
> .../ABI/testing/sysfs-devices-memory | 6 ++--
> .../ABI/testing/sysfs-devices-system-cpu | 6 ++--
> .../admin-guide/mm/memory-hotplug.rst | 5 +--
> Documentation/core-api/cpu_hotplug.rst | 10 +++---
> kernel/crash_core.c | 33 +++++++++++--------
> 5 files changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
> index a95e0f17c35a..cec65827e602 100644
> --- a/Documentation/ABI/testing/sysfs-devices-memory
> +++ b/Documentation/ABI/testing/sysfs-devices-memory
> @@ -115,6 +115,6 @@ What: /sys/devices/system/memory/crash_hotplug
> Date: Aug 2023
> Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
> Description:
> - (RO) indicates whether or not the kernel directly supports
> - modifying the crash elfcorehdr for memory hot un/plug and/or
> - on/offline changes.
> + (RO) indicates whether or not the kernel updates relevant kexec
> + segments on memory hot un/plug and/or on/offline events, avoiding the
> + need to reload kdump kernel.
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 325873385b71..1a31b7c71676 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -703,9 +703,9 @@ What: /sys/devices/system/cpu/crash_hotplug
> Date: Aug 2023
> Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
> Description:
> - (RO) indicates whether or not the kernel directly supports
> - modifying the crash elfcorehdr for CPU hot un/plug and/or
> - on/offline changes.
> + (RO) indicates whether or not the kernel updates relevant kexec
> + segments on memory hot un/plug and/or on/offline events, avoiding the
> + need to reload kdump kernel.
>
> What: /sys/devices/system/cpu/enabled
> Date: Nov 2022
> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
> index 098f14d83e99..cb2c080f400c 100644
> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
> @@ -294,8 +294,9 @@ The following files are currently defined:
> ``crash_hotplug`` read-only: when changes to the system memory map
> occur due to hot un/plug of memory, this file contains
> '1' if the kernel updates the kdump capture kernel memory
> - map itself (via elfcorehdr), or '0' if userspace must update
> - the kdump capture kernel memory map.
> + map itself (via elfcorehdr and other relevant kexec
> + segments), or '0' if userspace must update the kdump
> + capture kernel memory map.
>
> Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
> configuration option.
> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
> index dcb0e379e5e8..a21dbf261be7 100644
> --- a/Documentation/core-api/cpu_hotplug.rst
> +++ b/Documentation/core-api/cpu_hotplug.rst
> @@ -737,8 +737,9 @@ can process the event further.
>
> When changes to the CPUs in the system occur, the sysfs file
> /sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
> -updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
> -or '0' if userspace must update the kdump capture kernel list of CPUs.
> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
> +other relevant kexec segment), or '0' if userspace must update the kdump
> +capture kernel list of CPUs.
>
> The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
> option.
> @@ -750,8 +751,9 @@ file can be used in a udev rule as follows:
> SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>
> For a CPU hot un/plug event, if the architecture supports kernel updates
> -of the elfcorehdr (which contains the list of CPUs), then the rule skips
> -the unload-then-reload of the kdump capture kernel.
> +of the elfcorehdr (which contains the list of CPUs) and other relevant
> +kexec segments, then the rule skips the unload-then-reload of the kdump
> +capture kernel.
>
> Kernel Inline Documentations Reference
> ======================================
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 63cf89393c6e..c1048893f4b6 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
> crash_hotplug_lock();
> /* Obtain lock while reading crash information */
> if (!kexec_trylock()) {
> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
> crash_hotplug_unlock();
> return 0;
> }
> @@ -520,18 +520,25 @@ int crash_check_hotplug_support(void)
> }
>
> /*
> - * To accurately reflect hot un/plug changes of cpu and memory resources
> - * (including onling and offlining of those resources), the elfcorehdr
> - * (which is passed to the crash kernel via the elfcorehdr= parameter)
> - * must be updated with the new list of CPUs and memories.
> + * To accurately reflect hot un/plug changes of CPU and Memory resources
> + * (including onling and offlining of those resources), the relevant
> + * kexec segments must be updated with latest CPU and Memory resources.
> *
> - * In order to make changes to elfcorehdr, two conditions are needed:
> - * First, the segment containing the elfcorehdr must be large enough
> - * to permit a growing number of resources; the elfcorehdr memory size
> - * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.
> - * Second, purgatory must explicitly exclude the elfcorehdr from the
> - * list of segments it checks (since the elfcorehdr changes and thus
> - * would require an update to purgatory itself to update the digest).
> + * Architectures must ensure two things for all segments that need
> + * updating during hotplug events:
> + *
> + * 1. Segments must be large enough to accommodate a growing number of
> + * resources.
> + * 2. Exclude the segments from SHA verification.
> + *
> + * For example, on most architectures, the elfcorehdr (which is passed
> + * to the crash kernel via the elfcorehdr= parameter) must include the
> + * new list of CPUs and memory. To make changes to the elfcorehdr, it
> + * should be large enough to permit a growing number of CPU and Memory
> + * resources. One can estimate the elfcorehdr memory size based on
> + * NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. The elfcorehdr is
> + * excluded from SHA verification by default if the architecture
> + * supports crash hotplug.
> */
> static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
> {
> @@ -540,7 +547,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
> crash_hotplug_lock();
> /* Obtain lock while changing crash information */
> if (!kexec_trylock()) {
> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
> crash_hotplug_unlock();
> return;
> }
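The sizing note in the crash_handle_hotplug_event() comment above can be made concrete with a little arithmetic. What follows is only a rough userspace sketch, not the kernel's code: estimate_elfcorehdr_size() is a hypothetical helper, and the assumed layout (one ELF header, one program header per possible CPU, one for VMCOREINFO, one for the kernel mapping, one per memory range) is an assumption about a typical ELF64 elfcorehdr. The two parameters stand in for NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): estimate how many bytes an
 * ELF64 elfcorehdr needs so that it can absorb later hotplug growth.
 *
 * Assumed layout: one ELF header, one PT_NOTE program header per possible
 * CPU, one PT_NOTE for VMCOREINFO, one PT_LOAD for the kernel mapping,
 * and one PT_LOAD per memory range.
 */
size_t estimate_elfcorehdr_size(size_t max_cpus, size_t max_mem_ranges)
{
	size_t nr_phdrs;

	/* A system always has at least one possible CPU. */
	assert(max_cpus > 0);

	/* 2 = VMCOREINFO note + kernel mapping (assumed fixed entries). */
	nr_phdrs = 2 + max_cpus + max_mem_ranges;

	return sizeof(Elf64_Ehdr) + nr_phdrs * sizeof(Elf64_Phdr);
}
```

Sizing the segment for the maximum counts up front, rather than for the resources present at load time, is what lets the kernel rewrite the header in place on a hotplug event.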
On 12/08/24 11:11, Petr Tesarik wrote:
> On Mon, 12 Aug 2024 09:46:51 +0530
> Sourabh Jain <sourabhjain@linux.ibm.com> wrote:
>
>> Commit 79365026f869 ("crash: add a new kexec flag for hotplug support")
>> generalizes the crash hotplug support to allow architectures to update
>> multiple kexec segments on CPU/Memory hotplug and not just elfcorehdr.
>> Therefore, update the relevant kernel documentation to reflect the same.
>>
>> No functional change.
>>
>> Cc: Petr Tesarik <petr@tesarici.cz>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: kexec@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: x86@kernel.org
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> It's perfect now.
>
> Reviewed-by: Petr Tesarik <ptesarik@suse.com>
Thank you for the review, Petr.
- Sourabh Jain
> Petr T
>
>> ---
>>
>> Changelog:
>>
>> Since v1: https://lore.kernel.org/all/20240805050829.297171-1-sourabhjain@linux.ibm.com/
>> - Update crash_hotplug sysfs document as suggested by Petr T
>> - Update an error message in crash_handle_hotplug_event and
>> crash_check_hotplug_support function.
>>
>> ---
>> .../ABI/testing/sysfs-devices-memory | 6 ++--
>> .../ABI/testing/sysfs-devices-system-cpu | 6 ++--
>> .../admin-guide/mm/memory-hotplug.rst | 5 +--
>> Documentation/core-api/cpu_hotplug.rst | 10 +++---
>> kernel/crash_core.c | 33 +++++++++++--------
>> 5 files changed, 35 insertions(+), 25 deletions(-)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-devices-memory b/Documentation/ABI/testing/sysfs-devices-memory
>> index a95e0f17c35a..cec65827e602 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-memory
>> +++ b/Documentation/ABI/testing/sysfs-devices-memory
>> @@ -115,6 +115,6 @@ What: /sys/devices/system/memory/crash_hotplug
>> Date: Aug 2023
>> Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
>> Description:
>> - (RO) indicates whether or not the kernel directly supports
>> - modifying the crash elfcorehdr for memory hot un/plug and/or
>> - on/offline changes.
>> + (RO) indicates whether or not the kernel updates relevant kexec
>> + segments on memory hot un/plug and/or on/offline events, avoiding the
>> + need to reload kdump kernel.
>> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> index 325873385b71..1a31b7c71676 100644
>> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
>> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
>> @@ -703,9 +703,9 @@ What: /sys/devices/system/cpu/crash_hotplug
>> Date: Aug 2023
>> Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
>> Description:
>> - (RO) indicates whether or not the kernel directly supports
>> - modifying the crash elfcorehdr for CPU hot un/plug and/or
>> - on/offline changes.
>> + (RO) indicates whether or not the kernel updates relevant kexec
>> + segments on CPU hot un/plug and/or on/offline events, avoiding the
>> + need to reload kdump kernel.
>>
>> What: /sys/devices/system/cpu/enabled
>> Date: Nov 2022
>> diff --git a/Documentation/admin-guide/mm/memory-hotplug.rst b/Documentation/admin-guide/mm/memory-hotplug.rst
>> index 098f14d83e99..cb2c080f400c 100644
>> --- a/Documentation/admin-guide/mm/memory-hotplug.rst
>> +++ b/Documentation/admin-guide/mm/memory-hotplug.rst
>> @@ -294,8 +294,9 @@ The following files are currently defined:
>> ``crash_hotplug`` read-only: when changes to the system memory map
>> occur due to hot un/plug of memory, this file contains
>> '1' if the kernel updates the kdump capture kernel memory
>> - map itself (via elfcorehdr), or '0' if userspace must update
>> - the kdump capture kernel memory map.
>> + map itself (via elfcorehdr and other relevant kexec
>> + segments), or '0' if userspace must update the kdump
>> + capture kernel memory map.
>>
>> Availability depends on the CONFIG_MEMORY_HOTPLUG kernel
>> configuration option.
>> diff --git a/Documentation/core-api/cpu_hotplug.rst b/Documentation/core-api/cpu_hotplug.rst
>> index dcb0e379e5e8..a21dbf261be7 100644
>> --- a/Documentation/core-api/cpu_hotplug.rst
>> +++ b/Documentation/core-api/cpu_hotplug.rst
>> @@ -737,8 +737,9 @@ can process the event further.
>>
>> When changes to the CPUs in the system occur, the sysfs file
>> /sys/devices/system/cpu/crash_hotplug contains '1' if the kernel
>> -updates the kdump capture kernel list of CPUs itself (via elfcorehdr),
>> -or '0' if userspace must update the kdump capture kernel list of CPUs.
>> +updates the kdump capture kernel list of CPUs itself (via elfcorehdr and
>> +other relevant kexec segments), or '0' if userspace must update the kdump
>> +capture kernel list of CPUs.
>>
>> The availability depends on the CONFIG_HOTPLUG_CPU kernel configuration
>> option.
>> @@ -750,8 +751,9 @@ file can be used in a udev rule as follows:
>> SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
>>
>> For a CPU hot un/plug event, if the architecture supports kernel updates
>> -of the elfcorehdr (which contains the list of CPUs), then the rule skips
>> -the unload-then-reload of the kdump capture kernel.
>> +of the elfcorehdr (which contains the list of CPUs) and other relevant
>> +kexec segments, then the rule skips the unload-then-reload of the kdump
>> +capture kernel.
>>
>> Kernel Inline Documentations Reference
>> ======================================
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 63cf89393c6e..c1048893f4b6 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -505,7 +505,7 @@ int crash_check_hotplug_support(void)
>> crash_hotplug_lock();
>> /* Obtain lock while reading crash information */
>> if (!kexec_trylock()) {
>> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
>> crash_hotplug_unlock();
>> return 0;
>> }
>> @@ -520,18 +520,25 @@ int crash_check_hotplug_support(void)
>> }
>>
>> /*
>> - * To accurately reflect hot un/plug changes of cpu and memory resources
>> - * (including onling and offlining of those resources), the elfcorehdr
>> - * (which is passed to the crash kernel via the elfcorehdr= parameter)
>> - * must be updated with the new list of CPUs and memories.
>> + * To accurately reflect hot un/plug changes of CPU and Memory resources
>> + * (including onlining and offlining of those resources), the relevant
>> + * kexec segments must be updated with the latest CPU and Memory resources.
>> *
>> - * In order to make changes to elfcorehdr, two conditions are needed:
>> - * First, the segment containing the elfcorehdr must be large enough
>> - * to permit a growing number of resources; the elfcorehdr memory size
>> - * is based on NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES.
>> - * Second, purgatory must explicitly exclude the elfcorehdr from the
>> - * list of segments it checks (since the elfcorehdr changes and thus
>> - * would require an update to purgatory itself to update the digest).
>> + * Architectures must ensure two things for all segments that need
>> + * updating during hotplug events:
>> + *
>> + * 1. Segments must be large enough to accommodate a growing number of
>> + * resources.
>> + * 2. Exclude the segments from SHA verification.
>> + *
>> + * For example, on most architectures, the elfcorehdr (which is passed
>> + * to the crash kernel via the elfcorehdr= parameter) must include the
>> + * new list of CPUs and memory. To make changes to the elfcorehdr, it
>> + * should be large enough to permit a growing number of CPU and Memory
>> + * resources. One can estimate the elfcorehdr memory size based on
>> + * NR_CPUS_DEFAULT and CRASH_MAX_MEMORY_RANGES. The elfcorehdr is
>> + * excluded from SHA verification by default if the architecture
>> + * supports crash hotplug.
>> */
>> static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu, void *arg)
>> {
>> @@ -540,7 +547,7 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
>> crash_hotplug_lock();
>> /* Obtain lock while changing crash information */
>> if (!kexec_trylock()) {
>> - pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>> + pr_info("kexec_trylock() failed, kdump image may be inaccurate\n");
>> crash_hotplug_unlock();
>> return;
>> }
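For anyone who wants to consume the crash_hotplug attributes described in the documentation hunks above from a program rather than a udev rule, here is a minimal userspace sketch. read_crash_hotplug() is a hypothetical helper, not an existing API; the only things taken from the patch are the two sysfs paths and the meaning of '1' and '0'.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper: read a crash_hotplug sysfs attribute.
 *
 * Returns 1 if the kernel updates the kdump image itself, 0 if userspace
 * must reload the kdump kernel, and -1 if the file is missing, unreadable,
 * or holds anything other than 0 or 1.
 */
int read_crash_hotplug(const char *path)
{
	FILE *f;
	int value = -1;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;	/* e.g. hotplug support not configured */

	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);

	return (value == 0 || value == 1) ? value : -1;
}
```

A kdump service could call this with /sys/devices/system/cpu/crash_hotplug and /sys/devices/system/memory/crash_hotplug and skip the unload-then-reload step only when the relevant attribute reads 1.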