[PATCH] mm/kfence: fix potential deadlock in reboot notifier

Breno Leitao posted 1 patch 3 weeks, 1 day ago
mm/kfence/core.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
[PATCH] mm/kfence: fix potential deadlock in reboot notifier
Posted by Breno Leitao 3 weeks, 1 day ago
The reboot notifier callback can deadlock when calling
cancel_delayed_work_sync() if toggle_allocation_gate() is blocked
in wait_event_idle() waiting for allocations that might never happen
on the shutdown path.

The issue is that cancel_delayed_work_sync() waits for the work to
complete, but the work is waiting for kfence_allocation_gate > 0
which requires allocations to happen (each allocation is increated by 1)
- allocations that may have stopped during shutdown.

Fix this by:
1. Using cancel_delayed_work() (non-sync) to avoid blocking, so the
   callback completes and returns.
2. Adding wake_up() to unblock any waiting toggle_allocation_gate()
3. Adding !kfence_enabled to the wait condition so the wake succeeds

The static_branch_disable() IPI will still execute after the wake,
but at this early point in shutdown (reboot notifier runs with
INT_MAX priority), the system is still functional and CPUs can
respond to IPIs.
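
For readers who want to see the wait/wake pattern in isolation, here is a
rough userspace analogy of the fix, not kernel code. It assumes a pthread
condition variable standing in for the wait queue, and the names
gate_worker() and simulate_reboot() are hypothetical. The worker waits on
"gate > 0 || !enabled", and the shutdown path clears the flag and wakes
the waiter instead of waiting for the worker synchronously:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions) for the kernel objects in the patch. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t allocation_wait = PTHREAD_COND_INITIALIZER;
static atomic_int allocation_gate;	/* analog of kfence_allocation_gate */
static atomic_bool enabled = true;	/* analog of kfence_enabled */

/* Analog of toggle_allocation_gate() blocking in wait_event_idle(). */
static void *gate_worker(void *arg)
{
	bool *woke_by_shutdown = arg;

	pthread_mutex_lock(&lock);
	/* Sleep until an "allocation" happens or the feature is disabled. */
	while (!(atomic_load(&allocation_gate) > 0 || !atomic_load(&enabled)))
		pthread_cond_wait(&allocation_wait, &lock);
	*woke_by_shutdown = !atomic_load(&enabled);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/*
 * Analog of the reboot callback: clear the flag, then wake the waiter.
 * No allocation ever happens here, so the gate stays at 0 throughout.
 * Returns true if the worker left its wait because of the cleared flag.
 */
static bool simulate_reboot(void)
{
	pthread_t worker;
	bool woke_by_shutdown = false;
	int ret;

	atomic_store(&enabled, true);
	atomic_store(&allocation_gate, 0);
	ret = pthread_create(&worker, NULL, gate_worker, &woke_by_shutdown);
	assert(ret == 0);

	/* Update the flag under the lock so the wakeup cannot be lost. */
	pthread_mutex_lock(&lock);
	atomic_store(&enabled, false);
	pthread_cond_broadcast(&allocation_wait);
	pthread_mutex_unlock(&lock);

	/* Joining is safe only because the wait condition checks the flag. */
	pthread_join(worker, NULL);
	return woke_by_shutdown;
}
```

Calling simulate_reboot() from a small main() returns true. If the
"!enabled" clause were dropped from the worker's wait condition, the
pthread_join() would never return, which is the userspace counterpart of
the deadlock described above.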

Reported-by: Chris Mason <clm@meta.com>
Closes: https://lore.kernel.org/all/20260113140234.677117-1-clm@meta.com/
Fixes: ce2bba89566b ("mm/kfence: add reboot notifier to disable KFENCE on shutdown")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 mm/kfence/core.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 577a1699c553..da0f5b6f5744 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -823,6 +823,9 @@ static struct notifier_block kfence_check_canary_notifier = {
 static struct delayed_work kfence_timer;
 
 #ifdef CONFIG_KFENCE_STATIC_KEYS
+/* Wait queue to wake up allocation-gate timer task. */
+static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
+
 static int kfence_reboot_callback(struct notifier_block *nb,
 				  unsigned long action, void *data)
 {
@@ -832,7 +835,12 @@ static int kfence_reboot_callback(struct notifier_block *nb,
 	 */
 	WRITE_ONCE(kfence_enabled, false);
 	/* Cancel any pending timer work */
-	cancel_delayed_work_sync(&kfence_timer);
+	cancel_delayed_work(&kfence_timer);
+	/*
+	 * Wake up any blocked toggle_allocation_gate() so it can complete
+	 * early while the system is still able to handle IPIs.
+	 */
+	wake_up(&allocation_wait);
 
 	return NOTIFY_OK;
 }
@@ -842,9 +850,6 @@ static struct notifier_block kfence_reboot_notifier = {
 	.priority = INT_MAX, /* Run early to stop timers ASAP */
 };
 
-/* Wait queue to wake up allocation-gate timer task. */
-static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
-
 static void wake_up_kfence_timer(struct irq_work *work)
 {
 	wake_up(&allocation_wait);
@@ -873,7 +878,9 @@ static void toggle_allocation_gate(struct work_struct *work)
 	/* Enable static key, and await allocation to happen. */
 	static_branch_enable(&kfence_allocation_key);
 
-	wait_event_idle(allocation_wait, atomic_read(&kfence_allocation_gate) > 0);
+	wait_event_idle(allocation_wait,
+			atomic_read(&kfence_allocation_gate) > 0 ||
+			!READ_ONCE(kfence_enabled));
 
 	/* Disable static key and reset timer. */
 	static_branch_disable(&kfence_allocation_key);

---
base-commit: 983d014aafb14ee5e4915465bf8948e8f3a723b5
change-id: 20260116-kfence_fix-9905b284f1cc

Best regards,
--  
Breno Leitao <leitao@debian.org>
Re: [PATCH] mm/kfence: fix potential deadlock in reboot notifier
Posted by Marco Elver 2 weeks, 6 days ago
On Fri, 16 Jan 2026 at 16:49, Breno Leitao <leitao@debian.org> wrote:
>
> The reboot notifier callback can deadlock when calling
> cancel_delayed_work_sync() if toggle_allocation_gate() is blocked
> in wait_event_idle() waiting for allocations, that might not happen on
> shutdown path.
>
> The issue is that cancel_delayed_work_sync() waits for the work to
> complete, but the work is waiting for kfence_allocation_gate > 0
> which requires allocations to happen (each allocation is increated by 1)

increated -> increased

> - allocations that may have stopped during shutdown.
>
> Fix this by:
> 1. Using cancel_delayed_work() (non-sync) to avoid blocking. Now the
>    callback succeeds and return.
> 2. Adding wake_up() to unblock any waiting toggle_allocation_gate()
> 3. Adding !kfence_enabled to the wait condition so the wake succeeds
>
> The static_branch_disable() IPI will still execute after the wake,
> but at this early point in shutdown (reboot notifier runs with
> INT_MAX priority), the system is still functional and CPUs can
> respond to IPIs.
>
> Reported-by: Chris Mason <clm@meta.com>
> Closes: https://lore.kernel.org/all/20260113140234.677117-1-clm@meta.com/
> Fixes: ce2bba89566b ("mm/kfence: add reboot notifier to disable KFENCE on shutdown")
> Signed-off-by: Breno Leitao <leitao@debian.org>

Reviewed-by: Marco Elver <elver@google.com>

> ---
>  mm/kfence/core.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 577a1699c553..da0f5b6f5744 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -823,6 +823,9 @@ static struct notifier_block kfence_check_canary_notifier = {
>  static struct delayed_work kfence_timer;
>
>  #ifdef CONFIG_KFENCE_STATIC_KEYS
> +/* Wait queue to wake up allocation-gate timer task. */
> +static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
> +
>  static int kfence_reboot_callback(struct notifier_block *nb,
>                                   unsigned long action, void *data)
>  {
> @@ -832,7 +835,12 @@ static int kfence_reboot_callback(struct notifier_block *nb,
>          */
>         WRITE_ONCE(kfence_enabled, false);
>         /* Cancel any pending timer work */
> -       cancel_delayed_work_sync(&kfence_timer);
> +       cancel_delayed_work(&kfence_timer);
> +       /*
> +        * Wake up any blocked toggle_allocation_gate() so it can complete
> +        * early while the system is still able to handle IPIs.
> +        */
> +       wake_up(&allocation_wait);
>
>         return NOTIFY_OK;
>  }
> @@ -842,9 +850,6 @@ static struct notifier_block kfence_reboot_notifier = {
>         .priority = INT_MAX, /* Run early to stop timers ASAP */
>  };
>
> -/* Wait queue to wake up allocation-gate timer task. */
> -static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
> -
>  static void wake_up_kfence_timer(struct irq_work *work)
>  {
>         wake_up(&allocation_wait);
> @@ -873,7 +878,9 @@ static void toggle_allocation_gate(struct work_struct *work)
>         /* Enable static key, and await allocation to happen. */
>         static_branch_enable(&kfence_allocation_key);
>
> -       wait_event_idle(allocation_wait, atomic_read(&kfence_allocation_gate) > 0);
> +       wait_event_idle(allocation_wait,
> +                       atomic_read(&kfence_allocation_gate) > 0 ||
> +                       !READ_ONCE(kfence_enabled));
>
>         /* Disable static key and reset timer. */
>         static_branch_disable(&kfence_allocation_key);
>
> ---
> base-commit: 983d014aafb14ee5e4915465bf8948e8f3a723b5
> change-id: 20260116-kfence_fix-9905b284f1cc
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
Re: [PATCH] mm/kfence: fix potential deadlock in reboot notifier
Posted by Breno Leitao 2 weeks, 5 days ago
Hello Marco, Andrew,

On Mon, Jan 19, 2026 at 08:00:00AM +0100, Marco Elver wrote:
> On Fri, 16 Jan 2026 at 16:49, Breno Leitao <leitao@debian.org> wrote:
>
> > The issue is that cancel_delayed_work_sync() waits for the work to
> > complete, but the work is waiting for kfence_allocation_gate > 0
> > which requires allocations to happen (each allocation is increated by 1)
> 
> increated -> increased

[...]

> Reviewed-by: Marco Elver <elver@google.com>

Thanks for reviewing this patch.

Andrew,

Please let me know if you want me to send a v2 with the typo above
fixed, or if you can fix it in your own tree.

Thanks
--breno