Let's check for fatal signals only. That looks cleaner and still keeps
the documented use case for manual user-space triggered memory offlining
working. From Documentation/admin-guide/mm/memory-hotplug.rst:

  % timeout $TIMEOUT offline_block | failure_handling

In fact, we even document there: "the offlining context can be terminated
by sending a fatal signal".

Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/memory_hotplug.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 8e0fa209d533..0d2151df4ee1 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1879,7 +1879,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	do {
 		pfn = start_pfn;
 		do {
-			if (signal_pending(current)) {
+			if (fatal_signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
 				goto failed_removal_isolated;
--
2.40.1
On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> Let's check for fatal signals only. That looks cleaner and still keeps
> the documented use case for manual user-space triggered memory offlining
> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
>
> % timeout $TIMEOUT offline_block | failure_handling
>
> In fact, we even document there: "the offlining context can be terminated
> by sending a fatal signal".
We should be fixing documentation instead. This could break users who do
have a SIGALRM signal handler installed.
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> mm/memory_hotplug.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 8e0fa209d533..0d2151df4ee1 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1879,7 +1879,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>  	do {
>  		pfn = start_pfn;
>  		do {
> -			if (signal_pending(current)) {
> +			if (fatal_signal_pending(current)) {
>  				ret = -EINTR;
>  				reason = "signal backoff";
>  				goto failed_removal_isolated;
> --
> 2.40.1
--
Michal Hocko
SUSE Labs
On 27.06.23 14:34, Michal Hocko wrote:
> On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
>> Let's check for fatal signals only. That looks cleaner and still keeps
>> the documented use case for manual user-space triggered memory offlining
>> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
>>
>> % timeout $TIMEOUT offline_block | failure_handling
>>
>> In fact, we even document there: "the offlining context can be terminated
>> by sending a fatal signal".
>
> We should be fixing documentation instead. This could break users who do
> have a SIGALRM signal handler installed.

You mean because timeout will send a SIGALRM, which is not considered fatal
in case a signal handler is installed?

At least the "traditional" tools I am aware of don't set a timeout at all
(crossing fingers that they never end up stuck):
* chmem
* QEMU guest agent
* powerpc-utils

libdaxctl also doesn't seem to implement an easy-to-spot timeout for memory
offlining, but it also doesn't configure SIGALRM.

Of course, that doesn't mean that there isn't somewhere a program that does
that; I merely assume that it would be pretty unlikely to find such a
program.

But no strong opinion: we can also keep it like that, update the doc and add
a comment why this one here is different than most other signal backoff
checks.

Thanks!

--
Cheers,

David / dhildenb
On Tue 27-06-23 15:28:29, David Hildenbrand wrote:
> On 27.06.23 14:34, Michal Hocko wrote:
> > On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> > > Let's check for fatal signals only. That looks cleaner and still keeps
> > > the documented use case for manual user-space triggered memory offlining
> > > working. From Documentation/admin-guide/mm/memory-hotplug.rst:
> > >
> > > % timeout $TIMEOUT offline_block | failure_handling
> > >
> > > In fact, we even document there: "the offlining context can be terminated
> > > by sending a fatal signal".
> >
> > We should be fixing documentation instead. This could break users who do
> > have a SIGALRM signal handler installed.
>
> You mean because timeout will send a SIGALRM, which is not considered fatal
> in case a signal handler is installed?

Correct.

> At least the "traditional" tools I am aware of don't set a timeout at all
> (crossing fingers that they never end up stuck):
> * chmem
> * QEMU guest agent
> * powerpc-utils
>
> libdaxctl also doesn't seem to implement an easy-to-spot timeout for memory
> offlining, but it also doesn't configure SIGALRM.
>
> Of course, that doesn't mean that there isn't somewhere a program that does
> that; I merely assume that it would be pretty unlikely to find such a
> program.
>
> But no strong opinion: we can also keep it like that, update the doc and add
> a comment why this one here is different than most other signal backoff
> checks.

Well, the existing signal handling approach is there for way too long to
be sure. I personally would prefer fatal_signal_pending as that reflects
more what we do elsewhere but here we are. Historical baggage...

--
Michal Hocko
SUSE Labs
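
For illustration, the user-space pattern this thread worries about might look
roughly like the sketch below: a program installs its own SIGALRM handler
(without SA_RESTART), arms a timer, and relies on the blocking system call
failing with EINTR once the handler has run. This is only a sketch under stated
assumptions, not code from any tool named in the thread. The function and
handler names (blocking_read_with_alarm, on_alarm) are hypothetical, and a
read() on an empty pipe stands in for the blocking write to a memory block's
sysfs state file so the example runs without root or hot-pluggable memory.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Set by the handler so the caller can tell the timer really fired. */
static volatile sig_atomic_t got_alarm;

/* Hypothetical handler: because it exists, SIGALRM is not fatal. */
static void on_alarm(int signo)
{
	(void)signo;
	got_alarm = 1;
}

/*
 * Block in read() on an empty pipe while a timer delivers SIGALRM.
 * Returns the errno of the interrupted read(), 0 if read() unexpectedly
 * succeeded, or -1 if the setup failed. The pipe is a stand-in for the
 * sysfs write that triggers offlining (an assumption of this sketch).
 */
int blocking_read_with_alarm(long timeout_usec)
{
	struct sigaction sa, old_sa;
	struct itimerval timer, disarm;
	int fds[2];
	int result;
	char byte;
	ssize_t n;

	assert(timeout_usec > 0);

	if (pipe(fds) != 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;	/* no SA_RESTART: read() must fail with EINTR */
	if (sigaction(SIGALRM, &sa, &old_sa) != 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	got_alarm = 0;
	memset(&timer, 0, sizeof(timer));
	timer.it_value.tv_sec = timeout_usec / 1000000;
	timer.it_value.tv_usec = timeout_usec % 1000000;
	/* Repeat the timer so a signal arriving before read() is not lost. */
	timer.it_interval = timer.it_value;
	if (setitimer(ITIMER_REAL, &timer, NULL) != 0) {
		sigaction(SIGALRM, &old_sa, NULL);
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* Blocks: the write end stays open and nothing is ever written. */
	n = read(fds[0], &byte, 1);
	result = (n < 0) ? errno : 0;

	/* Disarm the timer before restoring the previous disposition. */
	memset(&disarm, 0, sizeof(disarm));
	setitimer(ITIMER_REAL, &disarm, NULL);
	sigaction(SIGALRM, &old_sa, NULL);
	close(fds[0]);
	close(fds[1]);
	return result;
}
```

Calling blocking_read_with_alarm(20000) should return EINTR after roughly 20 ms.
In the offlining case the blocking call would instead be the write of "offline"
to /sys/devices/system/memory/memoryXXX/state. With the current
signal_pending() check, the handled SIGALRM makes that write back off with
-EINTR. With fatal_signal_pending(), a caught SIGALRM would presumably no
longer abort the offlining, which is the breakage raised in the thread.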