From: Qiliang Yuan
To: dianders@chromium.org, akpm@linux-foundation.org
Cc: lihuafei1@huawei.com, linux-kernel@vger.kernel.org, mingo@kernel.org,
 realwujing@gmail.com, song@kernel.org, stable@vger.kernel.org,
 sunshx@chinatelecom.cn, thorsten.blum@linux.dev, wangjinchao600@gmail.com,
 yangyicong@hisilicon.com, yuanql9@chinatelecom.cn, zhangjn11@chinatelecom.cn,
 mm-commits@vger.kernel.org
Subject: [PATCH v3] watchdog/hardlockup: Fix UAF in perf event cleanup due to migration race
Date: Fri, 23 Jan 2026 01:34:07 -0500
Message-ID: <20260123063407.248775-1-realwujing@gmail.com>

During the early initialization of the hardlockup detector, the
hardlockup_detector_perf_init() function probes for PMU hardware
availability. It originally used hardlockup_detector_event_create(), which
interacts with the per-cpu 'watchdog_ev' variable. If the initializing
task migrates to another CPU during this probe phase, two issues arise:
1. The 'watchdog_ev' pointer on the original CPU is set but not cleared,
   leaving a stale pointer to a freed perf event.
2. The 'watchdog_ev' pointer on the new CPU might be incorrectly cleared.

This race condition was observed in console logs (captured by adding debug
printks):

[23.038376] hardlockup_detector_perf_init 313 cur_cpu=2
...
[23.076385] hardlockup_detector_event_create 203 cpu(cur)=2 set watchdog_ev
...
[23.095788] perf_event_release_kernel 4623 cur_cpu=2
...
[23.116963] lockup_detector_reconfigure 577 cur_cpu=3

The log shows the task started on CPU 2, set watchdog_ev on CPU 2,
released the event on CPU 2, but then migrated to CPU 3 before the
cleanup logic (which would clear watchdog_ev) could run. This left
watchdog_ev on CPU 2 pointing to a freed event.

Later, when the watchdog is enabled/disabled on CPU 2, this stale pointer
leads to a Use-After-Free (UAF) in perf_event_disable(), as detected by
KASAN:

[26.539140] ==================================================================
[26.540732] BUG: KASAN: use-after-free in perf_event_ctx_lock_nested.isra.72+0x6b/0x140
[26.542442] Read of size 8 at addr ff110006b360d718 by task kworker/2:1/94
[26.543954]
[26.544744] CPU: 2 PID: 94 Comm: kworker/2:1 Not tainted 4.19.90-debugkasan #11
[26.546505] Hardware name: GoStack Foundation OpenStack Nova, BIOS 1.16.3-3.ctl3 04/01/2014
[26.548256] Workqueue: events smp_call_on_cpu_callback
[26.549267] Call Trace:
[26.549936]  dump_stack+0x8b/0xbb
[26.550731]  print_address_description+0x6a/0x270
[26.551688]  kasan_report+0x179/0x2c0
[26.552519]  ? perf_event_ctx_lock_nested.isra.72+0x6b/0x140
[26.553654]  ? watchdog_disable+0x80/0x80
[26.553657]  perf_event_ctx_lock_nested.isra.72+0x6b/0x140
[26.556951]  ? dump_stack+0xa0/0xbb
[26.564006]  ? watchdog_disable+0x80/0x80
[26.564886]  perf_event_disable+0xa/0x30
[26.565746]  hardlockup_detector_perf_disable+0x1b/0x60
[26.566776]  watchdog_disable+0x51/0x80
[26.567624]  softlockup_stop_fn+0x11/0x20
[26.568499]  smp_call_on_cpu_callback+0x5b/0xb0
[26.569443]  process_one_work+0x389/0x770
[26.570311]  worker_thread+0x57/0x5a0
[26.571124]  ? process_one_work+0x770/0x770
[26.572031]  kthread+0x1ae/0x1d0
[26.572810]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[26.573821]  ret_from_fork+0x1f/0x40
[26.574638]
[26.575178] Allocated by task 1:
[26.575990]  kasan_kmalloc+0xa0/0xd0
[26.576814]  kmem_cache_alloc_trace+0xf3/0x1e0
[26.577732]  perf_event_alloc.part.89+0xb5/0x12b0
[26.578700]  perf_event_create_kernel_counter+0x1e/0x1d0
[26.579728]  hardlockup_detector_event_create+0x4e/0xc0
[26.580744]  hardlockup_detector_perf_init+0x2f/0x60
[26.581746]  lockup_detector_init+0x85/0xdc
[26.582645]  kernel_init_freeable+0x34d/0x40e
[26.583568]  kernel_init+0xf/0x130
[26.584428]  ret_from_fork+0x1f/0x40
[26.584429]
[26.584430] Freed by task 0:
[26.584433]  __kasan_slab_free+0x130/0x180
[26.584436]  kfree+0x90/0x1a0
[26.589641]  rcu_process_callbacks+0x2cb/0x6e0
[26.590935]  __do_softirq+0x119/0x3a2
[26.591965]
[26.592630] The buggy address belongs to the object at ff110006b360d500
[26.592630]  which belongs to the cache kmalloc-2048 of size 2048
[26.592633] The buggy address is located 536 bytes inside of
[26.592633]  2048-byte region [ff110006b360d500, ff110006b360dd00)
[26.592634] The buggy address belongs to the page:
[26.592637] page:ffd400001acd8200 count:1 mapcount:0 mapping:ff11000107c0e800 index:0x0 compound_mapcount: 0
[26.600959] flags: 0x17ffffc0010200(slab|head)
[26.601891] raw: 0017ffffc0010200 dead000000000100 dead000000000200 ff11000107c0e800
[26.603541] raw: 0000000000000000 00000000800f000f 00000001ffffffff 0000000000000000
[26.605546] page dumped because: kasan: bad access detected
[26.606788]
[26.607351] Memory state around the buggy address:
[26.608556]  ff110006b360d600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[26.610565]  ff110006b360d680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[26.610567] >ff110006b360d700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[26.610568]                             ^
[26.610570]  ff110006b360d780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[26.610573]  ff110006b360d800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[26.618955] ==================================================================

Fix this by refactoring hardlockup_detector_event_create() to return the
created perf event instead of directly assigning it to the per-cpu
variable. This allows the probe logic to reuse the creation code
(including fallback logic) without affecting the global state, ensuring
that task migration during probe no longer leaves stale pointers in
'watchdog_ev'.

Signed-off-by: Shouxin Sun
Signed-off-by: Junnan Zhang
Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
Cc: Song Liu
Cc: Douglas Anderson
Cc: Jinchao Wang
Cc: Wang Jinchao
Cc:
---
v3: Refactor creation logic to return event pointer; restores PMU cycle fallback and unifies paths.
v2: Add Cc: .
v1: Avoid 'watchdog_ev' in probe path by manually creating and releasing a local perf event.

 kernel/watchdog_perf.c | 51 ++++++++++++++++++++++++------------------
 1 file changed, 29 insertions(+), 22 deletions(-)

diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index d3ca70e3c256..d045b92bc514 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -118,18 +118,11 @@ static void watchdog_overflow_callback(struct perf_event *event,
 	watchdog_hardlockup_check(smp_processor_id(), regs);
 }
 
-static int hardlockup_detector_event_create(void)
+static struct perf_event *hardlockup_detector_event_create(unsigned int cpu)
 {
-	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
-	/*
-	 * Preemption is not disabled because memory will be allocated.
-	 * Ensure CPU-locality by calling this in per-CPU kthread.
-	 */
-	WARN_ON(!is_percpu_thread());
-	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
@@ -143,14 +136,7 @@ static int hardlockup_detector_event_create(void)
 						       watchdog_overflow_callback, NULL);
 	}
 
-	if (IS_ERR(evt)) {
-		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
-			 PTR_ERR(evt));
-		return PTR_ERR(evt);
-	}
-	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
-	this_cpu_write(watchdog_ev, evt);
-	return 0;
+	return evt;
 }
 
 /**
@@ -159,17 +145,26 @@ static int hardlockup_detector_event_create(void)
  */
 void watchdog_hardlockup_enable(unsigned int cpu)
 {
+	struct perf_event *evt;
+
 	WARN_ON_ONCE(cpu != smp_processor_id());
 
-	if (hardlockup_detector_event_create())
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
+		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
+			 PTR_ERR(evt));
 		return;
+	}
 
 	/* use original value for check */
 	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
+	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
+	this_cpu_write(watchdog_ev, evt);
+
 	watchdog_init_timestamp();
-	perf_event_enable(this_cpu_read(watchdog_ev));
+	perf_event_enable(evt);
 }
 
 /**
@@ -263,19 +258,31 @@ bool __weak __init arch_perf_nmi_is_available(void)
  */
 int __init watchdog_hardlockup_probe(void)
 {
+	struct perf_event *evt;
+	unsigned int cpu;
 	int ret;
 
 	if (!arch_perf_nmi_is_available())
 		return -ENODEV;
 
-	ret = hardlockup_detector_event_create();
+	if (!hw_nmi_get_sample_period(watchdog_thresh))
+		return -EINVAL;
 
-	if (ret) {
+	/*
+	 * Test hardware PMU availability by creating a temporary perf event.
+	 * Allow migration during the check as any successfully created per-cpu
+	 * event validates PMU support. The event is released immediately.
+	 */
+	cpu = raw_smp_processor_id();
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
 		pr_info("Perf NMI watchdog permanently disabled\n");
+		ret = PTR_ERR(evt);
 	} else {
-		perf_event_release_kernel(this_cpu_read(watchdog_ev));
-		this_cpu_write(watchdog_ev, NULL);
+		perf_event_release_kernel(evt);
+		ret = 0;
 	}
+
 	return ret;
 }
 
-- 
2.51.0
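
The ownership change described in the commit message can be modelled in
user space. The following is only a sketch under stated assumptions, not
kernel code: struct event, the static pool, current_cpu, event_release(),
probe_old() and probe_new() are hypothetical stand-ins, the watchdog_ev
array merely imitates the per-cpu variable, and task migration is
simulated by reassigning current_cpu between two steps.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS   4
#define POOL_SIZE 8

/* Hypothetical stand-in for struct perf_event. */
struct event {
	unsigned int cpu;
	int live;		/* 1 while allocated, 0 once released */
};

/* User-space model of the per-cpu 'watchdog_ev' slots. */
static struct event *watchdog_ev[NR_CPUS];

/* Models the CPU the probing task is currently running on. */
static unsigned int current_cpu;

/* Static pool so a released event can still be inspected safely. */
static struct event pool[POOL_SIZE];
static size_t pool_next;

/* Creation only: returns the event (or NULL) and publishes nothing. */
static struct event *event_create(unsigned int cpu)
{
	struct event *evt;

	assert(cpu < NR_CPUS);
	if (pool_next >= POOL_SIZE)
		return NULL;
	evt = &pool[pool_next++];
	evt->cpu = cpu;
	evt->live = 1;
	return evt;
}

/* Models releasing the event; the memory is only marked dead here. */
static void event_release(struct event *evt)
{
	evt->live = 0;
}

/*
 * Old probe pattern: publish into the current CPU's slot, release the
 * event, then clear "this CPU's" slot. 'migrate_to' models a migration
 * that happens between the release and the clear.
 */
static void probe_old(unsigned int migrate_to)
{
	struct event *evt = event_create(current_cpu);

	if (!evt)
		return;
	watchdog_ev[current_cpu] = evt;		/* set on the original CPU */
	event_release(evt);
	current_cpu = migrate_to;		/* simulated migration */
	watchdog_ev[current_cpu] = NULL;	/* clears the wrong slot */
}

/* New probe pattern: the event only ever lives in a local variable. */
static int probe_new(unsigned int migrate_to)
{
	struct event *evt = event_create(current_cpu);

	current_cpu = migrate_to;		/* migration is harmless here */
	if (!evt)
		return -1;
	event_release(evt);
	return 0;
}
```

With current_cpu set to 2, probe_old(3) leaves watchdog_ev[2] pointing at
a released event, which corresponds to the stale pointer seen in the log,
while probe_new(3) leaves every slot untouched.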