From nobody Mon Jun 15 09:38:24 2026
From: Zhongqiu Han
To: rafael@kernel.org, viresh.kumar@linaro.org
Cc: venkatesh.pallipadi@intel.com, davej@redhat.com, trenn@suse.de,
 linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
 zhongqiu.han@oss.qualcomm.com
Subject: [PATCH v2 1/2] cpufreq: governor: Fix race between sysfs store and
 dbs work handler
Date: Thu, 9 Apr 2026 19:14:06 +0800
Message-ID: <20260409111407.9775-2-zhongqiu.han@oss.qualcomm.com>
In-Reply-To: <20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com>
References: <20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

gov_update_cpu_data() resets per-CPU prev_cpu_idle and prev_cpu_nice for
every CPU in the governed domain. It is called from sysfs store callbacks
(e.g. ignore_nice_load_store) which run under attr_set->update_lock, held
by the surrounding governor_store().

Concurrently, dbs_work_handler() calls gov->gov_dbs_update() (which calls
dbs_update()) under policy_dbs->update_mutex. dbs_update() both reads and
writes the same prev_cpu_idle / prev_cpu_nice fields.

The potential race path is:

Path A (sysfs write, holds attr_set->update_lock only):

  governor_store()
    mutex_lock(&attr_set->update_lock)
    ignore_nice_load_store()
      dbs_data->ignore_nice_load = input
      gov_update_cpu_data(dbs_data)
        list_for_each_entry(policy_dbs, ...)
          for_each_cpu(j, ...)
            j_cdbs->prev_cpu_idle = get_cpu_idle_time(...)  /* write */
            j_cdbs->prev_cpu_nice = kcpustat_field(...)     /* write */
    mutex_unlock(&attr_set->update_lock)

Path B (work queue, holds policy_dbs->update_mutex only):

  dbs_work_handler()
    mutex_lock(&policy_dbs->update_mutex)
    gov->gov_dbs_update(policy)
      dbs_update()
        for_each_cpu(j, policy->cpus)
          idle_time = cur - j_cdbs->prev_cpu_idle           /* read  */
          j_cdbs->prev_cpu_idle = cur_idle_time             /* write */
          idle_time += cur_nice - j_cdbs->prev_cpu_nice     /* read  */
          j_cdbs->prev_cpu_nice = cur_nice                  /* write */
    mutex_unlock(&policy_dbs->update_mutex)

Because attr_set->update_lock and policy_dbs->update_mutex are two
completely independent locks, the two paths are not mutually exclusive.
This results in a data race on cpu_dbs_info.prev_cpu_idle and
cpu_dbs_info.prev_cpu_nice.

Fix this by also acquiring policy_dbs->update_mutex in
gov_update_cpu_data() for each policy, so that path A participates in the
mutual exclusion already established by dbs_work_handler(). Also update
the function comment to accurately reflect the two-level locking contract.

Additionally, cpufreq_dbs_governor_start() initializes prev_cpu_idle and
prev_cpu_nice without holding policy_dbs->update_mutex. After
cpufreq_dbs_governor_init() returns, the new policy is already visible in
attr_set->policy_list and sysfs attributes are accessible. A concurrent
sysfs write can therefore call gov_update_cpu_data() and race with the
initialization loop on the same u64 fields. Fix this by holding
policy_dbs->update_mutex around the initialization loop in
cpufreq_dbs_governor_start() as well.

The root of this race dates back to the original ondemand/conservative
governors. Before commit ee88415caf73 ("[CPUFREQ] Cleanup locking in
conservative governor") and commit 5a75c82828e7 ("[CPUFREQ] Cleanup
locking in ondemand governor"), all accesses to prev_cpu_idle and
prev_cpu_nice in cpufreq_governor_dbs() (path X),
store_ignore_nice_load() (path Y), and do_dbs_timer() (path Z) were
serialised by the same dbs_mutex, so no race existed.
Those two commits switched do_dbs_timer() from dbs_mutex to a
per-policy/per-cpu timer_mutex to reduce lock contention, but left
store_ignore_nice_load() still holding dbs_mutex. As a result, path Y
(store) and path Z (do_dbs_timer) no longer shared a common lock,
introducing a potential race on prev_cpu_idle/prev_cpu_nice between
store_ignore_nice_load() and dbs_check_cpu().

Commit 326c86deaed54a ("[CPUFREQ] Remove unneeded locks") then removed
dbs_mutex from store_ignore_nice_load() entirely, introducing an
additional potential race between store_ignore_nice_load() (path Y, now
lockless) and cpufreq_governor_dbs() (path X, still holding dbs_mutex),
while the race between path Y and path Z remained.

Fixes: ee88415caf736b ("[CPUFREQ] Cleanup locking in conservative governor")
Fixes: 5a75c82828e7c0 ("[CPUFREQ] Cleanup locking in ondemand governor")
Fixes: 326c86deaed54a ("[CPUFREQ] Remove unneeded locks")
Signed-off-by: Zhongqiu Han
---
 drivers/cpufreq/cpufreq_governor.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 86f35e451914..c0d419c95609 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -90,7 +90,8 @@ EXPORT_SYMBOL_GPL(sampling_rate_store);
  * (that may be a single policy or a bunch of them if governor tunables are
  * system-wide).
  *
- * Call under the @dbs_data mutex.
+ * Call under the @dbs_data->attr_set.update_lock. The per-policy
+ * update_mutex is acquired and released internally for each policy.
  */
 void gov_update_cpu_data(struct dbs_data *dbs_data)
 {
@@ -99,6 +100,7 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
 	list_for_each_entry(policy_dbs, &dbs_data->attr_set.policy_list, list) {
 		unsigned int j;
 
+		mutex_lock(&policy_dbs->update_mutex);
 		for_each_cpu(j, policy_dbs->policy->cpus) {
 			struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 
@@ -107,6 +109,7 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
 			if (dbs_data->ignore_nice_load)
 				j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
 		}
+		mutex_unlock(&policy_dbs->update_mutex);
 	}
 }
 EXPORT_SYMBOL_GPL(gov_update_cpu_data);
@@ -529,6 +532,7 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
 	ignore_nice = dbs_data->ignore_nice_load;
 	io_busy = dbs_data->io_is_busy;
 
+	mutex_lock(&policy_dbs->update_mutex);
 	for_each_cpu(j, policy->cpus) {
 		struct cpu_dbs_info *j_cdbs = &per_cpu(cpu_dbs, j);
 
@@ -541,6 +545,7 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
 		if (ignore_nice)
 			j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
 	}
+	mutex_unlock(&policy_dbs->update_mutex);
 
 	gov->start(policy);
 
-- 
2.43.0

From nobody Mon Jun 15 09:38:24 2026
From: Zhongqiu Han
To: rafael@kernel.org, viresh.kumar@linaro.org
Cc: venkatesh.pallipadi@intel.com, davej@redhat.com, trenn@suse.de,
 linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
 zhongqiu.han@oss.qualcomm.com
Subject: [PATCH v2 2/2] cpufreq: governor: Fix stale prev_cpu_nice spike when
 enabling ignore_nice_load
Date: Thu, 9 Apr 2026 19:14:07 +0800
Message-ID: <20260409111407.9775-3-zhongqiu.han@oss.qualcomm.com>
In-Reply-To: <20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com>
References: <20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When ignore_nice_load is toggled from 0 to 1 via sysfs, dbs_update() may
run concurrently and observe the new tunable value while prev_cpu_nice
still holds a stale baseline, producing a spurious massive idle_time that
results in an incorrect CPU load value.

The root cause is that prev_cpu_nice is only updated inside dbs_update()
when ignore_nice is true. While ignore_nice is false, prev_cpu_nice is
never advanced, so it accumulates an unbounded debt of nice CPU time.
The moment ignore_nice is flipped to 1, the very next dbs_update() call
computes:

  idle_time += cur_nice - j_cdbs->prev_cpu_nice

where prev_cpu_nice is stale (possibly 0 if never updated since boot),
making idle_time artificially large.

The race can be illustrated with two concurrent paths:

Path A (sysfs write, holds attr_set->update_lock):

  governor_store()
    mutex_lock(&attr_set->update_lock)
    ignore_nice_load_store()
      dbs_data->ignore_nice_load = 1                    /* (A1) */
      gov_update_cpu_data(dbs_data)
        mutex_lock(&policy_dbs->update_mutex)           /* (A2) */
        j_cdbs->prev_cpu_nice = kcpustat_field(...)
        mutex_unlock(&policy_dbs->update_mutex)
    mutex_unlock(&attr_set->update_lock)

Path B (work queue, wins the race between A1 and A2):

  dbs_work_handler()
    mutex_lock(&policy_dbs->update_mutex)               /* acquired before A2 */
    dbs_update()
      ignore_nice = dbs_data->ignore_nice_load          /* sees new value: 1 */
      cur_nice = kcpustat_field(...)
      idle_time += cur_nice - j_cdbs->prev_cpu_nice     /* stale */
      j_cdbs->prev_cpu_nice = cur_nice
    mutex_unlock(&policy_dbs->update_mutex)

Note that even without the race, the anomaly occurs deterministically on
the very first dbs_update() call after ignore_nice_load is enabled,
because prev_cpu_nice has never been updated while ignore_nice was 0. The
race only widens the window in which the stale read can happen.

Fix this by unconditionally sampling cur_nice and advancing prev_cpu_nice
in dbs_update() on every call, regardless of ignore_nice. With
prev_cpu_nice always reflecting the most recent sample, enabling
ignore_nice_load can never produce a stale-baseline spike: the delta will
always be the nice time accumulated in the last sampling interval, not
since boot.

As a consequence of always tracking prev_cpu_nice:

- gov_update_cpu_data() no longer needs to reset prev_cpu_nice when
  ignore_nice_load changes; remove that conditional.

- cpufreq_dbs_governor_start() must unconditionally initialize
  prev_cpu_nice so the very first dbs_update() has a valid baseline;
  remove the ignore_nice guard and the now-unused ignore_nice variable.

- ignore_nice_load_store() no longer needs to call gov_update_cpu_data()
  at all (prev_cpu_nice is always current); remove that call.

Fixes: ee88415caf736b ("[CPUFREQ] Cleanup locking in conservative governor")
Fixes: 5a75c82828e7c0 ("[CPUFREQ] Cleanup locking in ondemand governor")
Signed-off-by: Zhongqiu Han
---
 drivers/cpufreq/cpufreq_conservative.c |  3 ---
 drivers/cpufreq/cpufreq_governor.c     | 28 +++++++++++++++++---------
 drivers/cpufreq/cpufreq_ondemand.c     |  3 ---
 3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..5c316d2d3ddd 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -213,9 +213,6 @@ static ssize_t ignore_nice_load_store(struct gov_attr_set *attr_set,
 
 	dbs_data->ignore_nice_load = input;
 
-	/* we need to re-evaluate prev_cpu_idle */
-	gov_update_cpu_data(dbs_data);
-
 	return count;
 }
 
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index c0d419c95609..cfbfa5d8bb36 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -92,6 +92,12 @@ EXPORT_SYMBOL_GPL(sampling_rate_store);
  *
  * Call under the @dbs_data->attr_set.update_lock. The per-policy
  * update_mutex is acquired and released internally for each policy.
+ *
+ * Note: prev_cpu_nice is intentionally not reset here. dbs_update() tracks
+ * prev_cpu_nice unconditionally on every sample, so it is always current.
+ * Resetting it here is therefore unnecessary and would only introduce a
+ * one-sample spike if a concurrent dbs_update() ran between the reset and
+ * the next sample.
  */
 void gov_update_cpu_data(struct dbs_data *dbs_data)
 {
@@ -106,8 +112,6 @@ void gov_update_cpu_data(struct dbs_data *dbs_data)
 
 			j_cdbs->prev_cpu_idle = get_cpu_idle_time(j, &j_cdbs->prev_update_time,
 								  dbs_data->io_is_busy);
-			if (dbs_data->ignore_nice_load)
-				j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
 		}
 		mutex_unlock(&policy_dbs->update_mutex);
 	}
@@ -167,12 +171,18 @@ unsigned int dbs_update(struct cpufreq_policy *policy)
 
 		j_cdbs->prev_cpu_idle = cur_idle_time;
 
-		if (ignore_nice) {
-			u64 cur_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
+		/*
+		 * Always sample cur_nice and advance prev_cpu_nice, regardless
+		 * of ignore_nice. This keeps prev_cpu_nice current so that
+		 * enabling ignore_nice_load via sysfs never produces a
+		 * stale-baseline spike (the delta will be at most one sampling
+		 * interval of accumulated nice time, not since boot).
+		 */
+		u64 cur_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
 
+		if (ignore_nice)
 			idle_time += div_u64(cur_nice - j_cdbs->prev_cpu_nice, NSEC_PER_USEC);
-			j_cdbs->prev_cpu_nice = cur_nice;
-		}
+		j_cdbs->prev_cpu_nice = cur_nice;
 
 		if (unlikely(!time_elapsed)) {
 			/*
@@ -519,7 +529,7 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
 	struct dbs_governor *gov = dbs_governor_of(policy);
 	struct policy_dbs_info *policy_dbs = policy->governor_data;
 	struct dbs_data *dbs_data = policy_dbs->dbs_data;
-	unsigned int sampling_rate, ignore_nice, j;
+	unsigned int sampling_rate, j;
 	unsigned int io_busy;
 
 	if (!policy->cur)
@@ -529,7 +539,6 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
 	policy_dbs->rate_mult = 1;
 
 	sampling_rate = dbs_data->sampling_rate;
-	ignore_nice = dbs_data->ignore_nice_load;
 	io_busy = dbs_data->io_is_busy;
 
 	mutex_lock(&policy_dbs->update_mutex);
@@ -542,8 +551,7 @@ int cpufreq_dbs_governor_start(struct cpufreq_policy *policy)
 		 */
 		j_cdbs->prev_load = 0;
 
-		if (ignore_nice)
-			j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
+		j_cdbs->prev_cpu_nice = kcpustat_field(&kcpustat_cpu(j), CPUTIME_NICE, j);
 	}
 	mutex_unlock(&policy_dbs->update_mutex);
 
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 9942dbb38dae..d8d843183c21 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -261,9 +261,6 @@ static ssize_t ignore_nice_load_store(struct gov_attr_set *attr_set,
 	}
 	dbs_data->ignore_nice_load = input;
 
-	/* we need to re-evaluate prev_cpu_idle */
-	gov_update_cpu_data(dbs_data);
-
 	return count;
 }
 
-- 
2.43.0
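
The stale-baseline effect described in the 2/2 commit message can be
reproduced outside the kernel. The following is a minimal user-space sketch,
not kernel code: struct nice_baseline, nice_delta_conditional(),
nice_delta_always() and first_delta_after_enable() are hypothetical names
introduced only for illustration, and the sample counts and per-sample nice
time are assumed values. It compares the delta credited to idle time by the
first sample after ignore_nice is enabled when the baseline is advanced only
under ignore_nice (the pre-patch logic) with the delta when it is advanced on
every sample (the logic this patch proposes).

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU nice-time baseline. */
struct nice_baseline {
	uint64_t prev_cpu_nice;	/* cumulative nice time at the last sample, ns */
};

/* Mirrors the pre-patch logic: baseline read and advanced only if ignore_nice. */
uint64_t nice_delta_conditional(struct nice_baseline *b, uint64_t cur_nice,
				int ignore_nice)
{
	uint64_t delta = 0;

	if (ignore_nice) {
		delta = cur_nice - b->prev_cpu_nice;
		b->prev_cpu_nice = cur_nice;
	}
	return delta;
}

/* Mirrors the proposed logic: baseline advanced on every sample. */
uint64_t nice_delta_always(struct nice_baseline *b, uint64_t cur_nice,
			   int ignore_nice)
{
	uint64_t delta = ignore_nice ? cur_nice - b->prev_cpu_nice : 0;

	b->prev_cpu_nice = cur_nice;
	return delta;
}

typedef uint64_t (*nice_delta_fn)(struct nice_baseline *, uint64_t, int);

/*
 * Take disabled_samples samples with ignore_nice off, then one sample with
 * it on, and return the delta produced by that first enabled sample.
 */
uint64_t first_delta_after_enable(nice_delta_fn fn,
				  unsigned int disabled_samples,
				  uint64_t nice_per_sample)
{
	struct nice_baseline b = { .prev_cpu_nice = 0 };
	uint64_t cur_nice = 0;
	unsigned int i;

	for (i = 0; i < disabled_samples; i++) {
		cur_nice += nice_per_sample;
		fn(&b, cur_nice, 0);
	}

	cur_nice += nice_per_sample;
	return fn(&b, cur_nice, 1);
}
```

With the assumed inputs of 1000 disabled samples and 1000000 ns of nice time
per sample, first_delta_after_enable(nice_delta_conditional, 1000, 1000000)
returns 1001000000 (all nice time since the start), whereas the same call
with nice_delta_always returns 1000000 (one sampling interval).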