From: Qiliang Yuan
To: Ingo Molnar, Qiliang Yuan, Andrew Morton, Li Huafei, Thorsten Blum,
	Jinchao Wang, Yicong Yang, Petr Mladek, Pingfan Liu, Lecopzer Chen,
	Douglas Anderson
Cc: linux-watchdog@vger.kernel.org, mm-commits@vger.kernel.org,
	Shouxin Sun, Junnan Zhang, Qiliang Yuan, Song Liu,
	stable@vger.kernel.org, "Yury Norov (NVIDIA)",
	linux-kernel@vger.kernel.org
Subject: [PATCH v5] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Date: Mon, 26 Jan 2026 21:22:24 -0500
Message-ID: <20260127022238.1182079-1-realwujing@gmail.com>

The hardlockup detector's probe path (watchdog_hardlockup_probe()) can be
executed in a non-pinned context, such as during the asynchronous retry
mechanism (lockup_detector_delay_init) which runs in a standard unbound
workqueue.

In this context, the existing implementation of
hardlockup_detector_event_create() suffers from a race condition due to
potential task migration. It relies on is_percpu_thread() to ensure
CPU-locality, but worker threads in a global workqueue do not carry the
PF_PERCPU_THREAD flag, causing the WARN_ON() to trigger and violating the
assumption of stable per-cpu access.

If the task migrates during the probe:
1. It might set 'watchdog_ev' on one CPU but fail to clear it if the
   subsequent migration causes the cleanup logic to run on a different
   CPU.
2. This leaves a stale pointer to a freed perf_event in the original
   CPU's 'watchdog_ev' variable, leading to a use-after-free (UAF) when
   the watchdog is later enabled or reconfigured.

While this issue was prominently observed in downstream kernels (like
openEuler 4.19) where initialization timings are shifted to a post-SMP
phase, it represents a latent bug in the mainline asynchronous
initialization path.

Refactor hardlockup_detector_event_create() to be stateless by returning
the created perf_event pointer instead of directly modifying the per-cpu
'watchdog_ev' variable. This allows the probe logic to safely manage the
temporary event. Use cpu_hotplug_disable() during the probe to ensure the
target CPU remains valid throughout the check.

Fixes: 930d8f8dbab9 ("watchdog/perf: adapt the watchdog_perf interface for async model")
Signed-off-by: Shouxin Sun
Signed-off-by: Junnan Zhang
Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
Cc: Song Liu
Cc: Douglas Anderson
Cc: Jinchao Wang
Cc:
---
v5:
- Refine description: clarify it identifies a latent bug in the mainline
  asynchronous retry path where worker threads lack PF_PERCPU_THREAD.
v4:
- Add cpu_hotplug_disable() in watchdog_hardlockup_probe() to stabilize
  the probe CPU.
- Update description to explain the relevance of 4.19 logs.
v3:
- Refactor hardlockup_detector_event_create() to be stateless.
v2:
- Add Cc stable.
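
Illustration only, not part of the patch: the migration race described in
the commit message can be modelled in user space. The sketch below is a
simplified model under stated assumptions, not kernel code. struct
fake_event, the watchdog_ev[] array, current_cpu, reset_model(),
stateful_probe(), stateless_create(), stateless_probe() and
count_stale_slots() are all hypothetical stand-ins invented for this
example; migration is simulated by assigning current_cpu, and "freeing" an
event only sets a flag so the stale slot can be inspected safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS   2
#define POOL_SIZE 8

/* Hypothetical stand-in for struct perf_event. */
struct fake_event {
	bool released;	/* set once the event has been "freed" */
};

/* Stand-in for the per-CPU 'watchdog_ev' variable: one slot per CPU. */
struct fake_event *watchdog_ev[NR_CPUS];

/* CPU the unpinned worker is currently running on (assumption of the model). */
int current_cpu;

/* Static backing storage, so released events remain valid to inspect. */
static struct fake_event pool[POOL_SIZE];
static size_t pool_next;

void reset_model(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		watchdog_ev[cpu] = NULL;
	for (size_t i = 0; i < POOL_SIZE; i++)
		pool[i].released = false;
	pool_next = 0;
	current_cpu = 0;
}

static struct fake_event *alloc_event(void)
{
	return pool_next < POOL_SIZE ? &pool[pool_next++] : NULL;
}

/* Old shape: creation and cleanup both act on "this CPU's" slot. */
int stateful_probe(int migrate_to)
{
	struct fake_event *evt = alloc_event();

	assert(migrate_to >= 0 && migrate_to < NR_CPUS);
	if (!evt)
		return -1;
	watchdog_ev[current_cpu] = evt;		/* like this_cpu_write(watchdog_ev, evt) */

	evt = watchdog_ev[current_cpu];		/* like this_cpu_read(watchdog_ev) */
	evt->released = true;			/* like perf_event_release_kernel() */
	current_cpu = migrate_to;		/* simulated task migration */
	watchdog_ev[current_cpu] = NULL;	/* clears the slot of the new CPU */
	return 0;
}

/* New shape: creation returns the event and never touches watchdog_ev[]. */
struct fake_event *stateless_create(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return alloc_event();
}

int stateless_probe(int migrate_to)
{
	struct fake_event *evt = stateless_create(current_cpu);

	assert(migrate_to >= 0 && migrate_to < NR_CPUS);
	if (!evt)
		return -1;
	current_cpu = migrate_to;	/* migration no longer affects cleanup */
	evt->released = true;		/* release through the local pointer */
	return 0;
}

/* Number of per-CPU slots still pointing at an already released event. */
int count_stale_slots(void)
{
	int stale = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (watchdog_ev[cpu] && watchdog_ev[cpu]->released)
			stale++;
	return stale;
}
```

In this model, calling reset_model() and then stateful_probe(1) leaves
slot 0 pointing at a released event, while stateless_probe(1) leaves no
stale slot because the temporary event is only ever reachable through a
local pointer.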

 kernel/watchdog_perf.c | 56 +++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index d3ca70e3c256..887b61c65c1b 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -118,18 +119,11 @@ static void watchdog_overflow_callback(struct perf_event *event,
 	watchdog_hardlockup_check(smp_processor_id(), regs);
 }
 
-static int hardlockup_detector_event_create(void)
+static struct perf_event *hardlockup_detector_event_create(unsigned int cpu)
 {
-	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
-	/*
-	 * Preemption is not disabled because memory will be allocated.
-	 * Ensure CPU-locality by calling this in per-CPU kthread.
-	 */
-	WARN_ON(!is_percpu_thread());
-	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
@@ -143,14 +137,7 @@ static int hardlockup_detector_event_create(void)
 						       watchdog_overflow_callback, NULL);
 	}
 
-	if (IS_ERR(evt)) {
-		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
-			 PTR_ERR(evt));
-		return PTR_ERR(evt);
-	}
-	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
-	this_cpu_write(watchdog_ev, evt);
-	return 0;
+	return evt;
 }
 
 /**
@@ -159,17 +146,26 @@ static int hardlockup_detector_event_create(void)
  */
 void watchdog_hardlockup_enable(unsigned int cpu)
 {
+	struct perf_event *evt;
+
 	WARN_ON_ONCE(cpu != smp_processor_id());
 
-	if (hardlockup_detector_event_create())
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
+		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
+			 PTR_ERR(evt));
 		return;
+	}
 
 	/* use original value for check */
 	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
+	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
+	this_cpu_write(watchdog_ev, evt);
+
 	watchdog_init_timestamp();
-	perf_event_enable(this_cpu_read(watchdog_ev));
+	perf_event_enable(evt);
 }
 
 /**
@@ -263,19 +259,35 @@ bool __weak __init arch_perf_nmi_is_available(void)
  */
 int __init watchdog_hardlockup_probe(void)
 {
+	struct perf_event *evt;
+	unsigned int cpu;
 	int ret;
 
 	if (!arch_perf_nmi_is_available())
 		return -ENODEV;
 
-	ret = hardlockup_detector_event_create();
+	if (!hw_nmi_get_sample_period(watchdog_thresh))
+		return -EINVAL;
 
-	if (ret) {
+	/*
+	 * Test hardware PMU availability by creating a temporary perf event.
+	 * The requested CPU is arbitrary; preemption is not disabled, so
+	 * raw_smp_processor_id() is used. Surround with cpu_hotplug_disable()
+	 * to ensure the arbitrarily chosen CPU remains online during the check.
+	 * The event is released immediately.
+	 */
+	cpu_hotplug_disable();
+	cpu = raw_smp_processor_id();
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
 		pr_info("Perf NMI watchdog permanently disabled\n");
+		ret = PTR_ERR(evt);
 	} else {
-		perf_event_release_kernel(this_cpu_read(watchdog_ev));
-		this_cpu_write(watchdog_ev, NULL);
+		perf_event_release_kernel(evt);
+		ret = 0;
 	}
+	cpu_hotplug_enable();
+
 	return ret;
 }
 
-- 
2.51.0