From: Qiliang Yuan
To: Qiliang Yuan, Li Huafei, Andrew Morton, Ingo Molnar, Jinchao Wang,
	Yicong Yang, Thorsten Blum
Cc: linux-watchdog@vger.kernel.org, mm-commits@vger.kernel.org,
	Shouxin Sun, Junnan Zhang, Qiliang Yuan, Song Liu, Douglas Anderson,
	linux-kernel@vger.kernel.org
Subject: [PATCH v6] watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency
Date: Tue, 27 Jan 2026 21:31:22 -0500
Message-ID: <20260128023136.1691973-1-realwujing@gmail.com>
X-Mailer: git-send-email 2.51.0

Simplify the hardlockup detector's probe path and remove its implicit
dependency on pinned per-cpu execution.

Refactor hardlockup_detector_event_create() to be stateless. Return the
created perf_event pointer to the caller instead of directly modifying
the per-cpu 'watchdog_ev' variable. This allows the probe path to safely
manage a temporary event without the risk of leaving stale pointers
should task migration occur.

Use cpu_hotplug_disable() during the probe to ensure the target CPU
remains stable throughout the availability check.

Signed-off-by: Shouxin Sun
Signed-off-by: Junnan Zhang
Signed-off-by: Qiliang Yuan
Cc: Song Liu
Cc: Douglas Anderson
Cc: Jinchao Wang
Reviewed-by: Douglas Anderson
---
v6:
- Change title to "simplify/cleanup" and remove "Fixes" tag since the
  issue is not reproducible on mainline.
- Rewrite commit message in imperative mood.
- Clarify that mainline is safe while this improves robustness.
- v5 link: https://lore.kernel.org/all/20260127022238.1182079-1-realwujing@gmail.com/

v5:
- Refine description: clarify that the retry path uses worker threads
  without PF_PERCPU_THREAD (though mainline is safe due to system_percpu_wq).
- v4 link: https://lore.kernel.org/all/20260124070814.806828-1-realwujing@gmail.com/

v4:
- Add cpu_hotplug_disable() in watchdog_hardlockup_probe() to stabilize
  the probe CPU.
- Update description to explain the relevance of 4.19 logs.

v3:
- Refactor hardlockup_detector_event_create() to be stateless by returning
  the event pointer instead of directly assigning to per-cpu variables.
- Restore PMU cycle fallback and unify the enable/probe paths.

v2:
- Add Cc: stable@vger.kernel.org.

v1:
- Avoid 'watchdog_ev' in probe path by manually creating and releasing a
  local perf event.

 kernel/watchdog_perf.c | 56 +++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 22 deletions(-)

diff --git a/kernel/watchdog_perf.c b/kernel/watchdog_perf.c
index d3ca70e3c256..887b61c65c1b 100644
--- a/kernel/watchdog_perf.c
+++ b/kernel/watchdog_perf.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -118,18 +119,11 @@ static void watchdog_overflow_callback(struct perf_event *event,
 	watchdog_hardlockup_check(smp_processor_id(), regs);
 }
 
-static int hardlockup_detector_event_create(void)
+static struct perf_event *hardlockup_detector_event_create(unsigned int cpu)
 {
-	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
-	/*
-	 * Preemption is not disabled because memory will be allocated.
-	 * Ensure CPU-locality by calling this in per-CPU kthread.
-	 */
-	WARN_ON(!is_percpu_thread());
-	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
@@ -143,14 +137,7 @@ static int hardlockup_detector_event_create(void)
 						       watchdog_overflow_callback, NULL);
 	}
 
-	if (IS_ERR(evt)) {
-		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
-			 PTR_ERR(evt));
-		return PTR_ERR(evt);
-	}
-	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
-	this_cpu_write(watchdog_ev, evt);
-	return 0;
+	return evt;
 }
 
 /**
@@ -159,17 +146,26 @@ static int hardlockup_detector_event_create(void)
  */
 void watchdog_hardlockup_enable(unsigned int cpu)
 {
+	struct perf_event *evt;
+
 	WARN_ON_ONCE(cpu != smp_processor_id());
 
-	if (hardlockup_detector_event_create())
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
+		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
+			 PTR_ERR(evt));
 		return;
+	}
 
 	/* use original value for check */
 	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
+	WARN_ONCE(this_cpu_read(watchdog_ev), "unexpected watchdog_ev leak");
+	this_cpu_write(watchdog_ev, evt);
+
 	watchdog_init_timestamp();
-	perf_event_enable(this_cpu_read(watchdog_ev));
+	perf_event_enable(evt);
 }
 
 /**
@@ -263,19 +259,35 @@ bool __weak __init arch_perf_nmi_is_available(void)
  */
 int __init watchdog_hardlockup_probe(void)
 {
+	struct perf_event *evt;
+	unsigned int cpu;
 	int ret;
 
 	if (!arch_perf_nmi_is_available())
 		return -ENODEV;
 
-	ret = hardlockup_detector_event_create();
+	if (!hw_nmi_get_sample_period(watchdog_thresh))
+		return -EINVAL;
 
-	if (ret) {
+	/*
+	 * Test hardware PMU availability by creating a temporary perf event.
+	 * The requested CPU is arbitrary; preemption is not disabled, so
+	 * raw_smp_processor_id() is used. Surround with cpu_hotplug_disable()
+	 * to ensure the arbitrarily chosen CPU remains online during the check.
+	 * The event is released immediately.
+	 */
+	cpu_hotplug_disable();
+	cpu = raw_smp_processor_id();
+	evt = hardlockup_detector_event_create(cpu);
+	if (IS_ERR(evt)) {
 		pr_info("Perf NMI watchdog permanently disabled\n");
+		ret = PTR_ERR(evt);
 	} else {
-		perf_event_release_kernel(this_cpu_read(watchdog_ev));
-		this_cpu_write(watchdog_ev, NULL);
+		perf_event_release_kernel(evt);
+		ret = 0;
 	}
+	cpu_hotplug_enable();
+
 	return ret;
 }
 
-- 
2.51.0