From: Zhongqiu Han
Subject: [PATCH] timers: Optimize get_timer_cpu_base() to reduce potentially redundant per_cpu_ptr() calls
Date: Tue, 31 Dec 2024 23:01:15 +0800
Message-ID: <20241231150115.1978342-1-quic_zhonhan@quicinc.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

If the timer is deferrable and NO_HZ_COMMON is enabled,
get_timer_cpu_base() calls per_cpu_ptr() twice: once for the local or
global base, and once more for the deferrable base. Select the base
index first and call per_cpu_ptr() only once, so that the potentially
redundant call is avoided. get_timer_this_cpu_base() gets the same
treatment for this_cpu_ptr().

One of the call paths into get_timer_cpu_base() goes through
lock_timer_base(), which contains a loop. Within this loop,
get_timer_base() is called, which in turn calls get_timer_cpu_base().
On this path get_timer_cpu_base() is a hot spot: it is called
approximately 13,000 times in 12 seconds on test x86 KVM machines.

lock_timer_base() {
	for (;;) {
		...
		--> get_timer_base() [inline]
			--> get_timer_cpu_base() [inline]
		...
	}
}

With the patch, the assembly code executed in the loop is reduced on
both x86 and ARM64. In comparative tests on x86 KVM virtual machines,
the distribution of runtime (in nanoseconds) shifts toward shorter
intervals after the optimization: samples below 100 ns rise from 954 of
13031 (about 7%) to 8366 of 13156 (about 64%).

Before                    After
[0-19]:          0        [0-19]:          0
[20-39]:         6        [20-39]:      1014
[40-59]:        41        [40-59]:      2198
[60-79]:        93        [60-79]:      2073
[80-99]:       814        [80-99]:      3081
[100-119]:    5262        [100-119]:    3268
[120-139]:    4510        [120-139]:     671
[140-159]:    2202        [140-159]:     468
[160-179]:      81        [160-179]:     158
[180-199]:      15        [180-199]:     160
[200-219]:       3        [200-219]:      54
[220-239]:       2        [220-239]:       7
[240-259]:       2        [240-259]:       3
[260-279]:       0        [260-279]:       0
[280-299]:       0        [280-299]:       1
[300-319]:       0        [300-319]:       0
total:       13031        total:       13156

Signed-off-by: Zhongqiu Han
Reviewed-by: Frederic Weisbecker
---
 kernel/time/timer.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index a5860bf6d16f..40706cb36920 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -956,33 +956,29 @@ static int detach_if_pending(struct timer_list *timer, struct timer_base *base,
 static inline struct timer_base *get_timer_cpu_base(u32 tflags, u32 cpu)
 {
 	int index = tflags & TIMER_PINNED ? BASE_LOCAL : BASE_GLOBAL;
-	struct timer_base *base;
-
-	base = per_cpu_ptr(&timer_bases[index], cpu);
 
 	/*
 	 * If the timer is deferrable and NO_HZ_COMMON is set then we need
 	 * to use the deferrable base.
 	 */
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && (tflags & TIMER_DEFERRABLE))
-		base = per_cpu_ptr(&timer_bases[BASE_DEF], cpu);
-	return base;
+		index = BASE_DEF;
+
+	return per_cpu_ptr(&timer_bases[index], cpu);
 }
 
 static inline struct timer_base *get_timer_this_cpu_base(u32 tflags)
 {
 	int index = tflags & TIMER_PINNED ? BASE_LOCAL : BASE_GLOBAL;
-	struct timer_base *base;
-
-	base = this_cpu_ptr(&timer_bases[index]);
 
 	/*
 	 * If the timer is deferrable and NO_HZ_COMMON is set then we need
 	 * to use the deferrable base.
 	 */
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && (tflags & TIMER_DEFERRABLE))
-		base = this_cpu_ptr(&timer_bases[BASE_DEF]);
-	return base;
+		index = BASE_DEF;
+
+	return this_cpu_ptr(&timer_bases[index]);
 }
 
 static inline struct timer_base *get_timer_base(u32 tflags)
-- 
2.25.1