From: zihan zhou <15645113830zzh@gmail.com>
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com
Cc: linux-kernel@vger.kernel.org, zihan zhou <15645113830zzh@gmail.com>
Subject: [PATCH V1] sched: Reduce the default slice to avoid tasks getting an extra tick
Date: Tue, 21 Jan 2025 10:49:31 +0800
Message-Id: <20250121024929.110399-1-15645113830zzh@gmail.com>

The old default value for slice is 0.75 msec * (1 + ilog(ncpus)), which
means that we have a default slice (in msec) of

  0.75 for 1 cpu
  1.50 up to 3 cpus
  2.25 up to 7 cpus
  3.00 for 8 cpus and above.

For HZ=250 and HZ=100, the tick period (4ms and 10ms) is longer than the
slice, so the runtime of tasks is far higher than their slice.

For HZ=1000 with 8 cpus or more, the accuracy of the tick is already
satisfactory, but there is still an issue: tasks get an extra tick
because the tick often arrives a little earlier than expected. In this
case, the task has to wait until the next tick before it is considered
to have reached its deadline, and so runs 1ms longer.

vruntime + sysctl_sched_base_slice = deadline
        |-----------|-----------|-----------|-----------|
             1ms         1ms         1ms         1ms
                    ^           ^           ^           ^
                  tick1       tick2       tick3       tick4(nearly 4ms)

There are two reasons for tick error: clockevent precision and
CONFIG_IRQ_TIME_ACCOUNTING/CONFIG_PARAVIRT_TIME_ACCOUNTING. With
CONFIG_IRQ_TIME_ACCOUNTING every tick will be less than 1ms, but even
without it, because of clockevent precision, a tick is still often less
than 1ms.

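The effect described above can be checked with a small userspace model.
This is only a sketch under stated assumptions, not kernel code:
ilog2_u32(), default_slice_ns() and ticks_until_expiry() are hypothetical
helpers, the cap at 8 CPUs is taken from the table above, and the model
assumes that the slice is only checked at ticks and that every tick
accounts the same amount of runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not the kernel's ilog2()): floor(log2(n)), n >= 1. */
unsigned int ilog2_u32(unsigned int n)
{
	unsigned int log = 0;

	assert(n >= 1);
	while ((n >>= 1) != 0)
		log++;
	return log;
}

/*
 * Default slice in nanoseconds: base_ns * (1 + ilog(ncpus)).
 * Assumption: ncpus is capped at 8, matching "8 cpus and above" in the
 * table above.
 */
uint64_t default_slice_ns(uint64_t base_ns, unsigned int ncpus)
{
	unsigned int cpus = ncpus < 8 ? ncpus : 8;

	return base_ns * (1 + ilog2_u32(cpus));
}

/*
 * Simplified model: the slice is only checked at ticks, and each tick
 * accounts tick_ns of runtime. Returns the number of the first tick at
 * which the accounted runtime reaches slice_ns (a ceiling division).
 */
unsigned int ticks_until_expiry(uint64_t slice_ns, uint64_t tick_ns)
{
	assert(tick_ns > 0);
	return (unsigned int)((slice_ns + tick_ns - 1) / tick_ns);
}
```

In this model, a 3.00 msec slice (default_slice_ns(750000, 8)) expires on
the third tick when each tick accounts a full 1000000 ns, but only on the
fourth tick when each tick accounts 990000 ns, an illustrative value for
a tick that is slightly short of 1ms. A slice a little under 3 msec, for
example 2800000 ns, expires on the third tick in both cases.
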
In order to make scheduling more precise, we change 0.75 to 0.70.
Using 0.70 instead of 0.75 should not change much for other configs
and would fix this issue:

  0.70 for 1 cpu
  1.40 up to 3 cpus
  2.10 up to 7 cpus
  2.80 for 8 cpus and above.

This does not guarantee that tasks run for exactly their slice every
time, but occasionally running an extra tick has little impact.

Signed-off-by: zihan zhou <15645113830zzh@gmail.com>
---
 kernel/sched/fair.c | 47 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26958431deb7..754b0785eaa0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -71,10 +71,49 @@ unsigned int sysctl_sched_tunable_scaling = SCHED_TUNABLESCALING_LOG;
 /*
  * Minimal preemption granularity for CPU-bound tasks:
  *
- * (default: 0.75 msec * (1 + ilog(ncpus)), units: nanoseconds)
- */
-unsigned int sysctl_sched_base_slice			= 750000ULL;
-static unsigned int normalized_sysctl_sched_base_slice	= 750000ULL;
+ * (default: 0.70 msec * (1 + ilog(ncpus)), units: nanoseconds)
+ *
+ * The old default value for slice is 0.75 msec * (1 + ilog(ncpus)), which
+ * means that we have a default slice (in msec) of
+ *   0.75 for 1 cpu
+ *   1.50 up to 3 cpus
+ *   2.25 up to 7 cpus
+ *   3.00 for 8 cpus and above.
+ *
+ * For HZ=250 and HZ=100, the tick period (4ms and 10ms) is longer than the
+ * slice, so the runtime of tasks is far higher than their slice.
+ * For HZ=1000 with 8 cpus or more, the accuracy of the tick is already
+ * satisfactory, but there is still an issue: tasks get an extra tick
+ * because the tick often arrives a little earlier than expected. In this
+ * case, the task has to wait until the next tick before it is considered
+ * to have reached its deadline, and so runs 1ms longer.
+ *
+ * vruntime + sysctl_sched_base_slice = deadline
+ *         |-----------|-----------|-----------|-----------|
+ *              1ms         1ms         1ms         1ms
+ *                     ^           ^           ^           ^
+ *                   tick1       tick2       tick3       tick4(nearly 4ms)
+ *
+ * There are two reasons for tick error: clockevent precision and
+ * CONFIG_IRQ_TIME_ACCOUNTING/CONFIG_PARAVIRT_TIME_ACCOUNTING.
+ * With CONFIG_IRQ_TIME_ACCOUNTING every tick will be less than 1ms, but even
+ * without it, because of clockevent precision, a tick is still often less
+ * than 1ms.
+ *
+ * In order to make scheduling more precise, we change 0.75 to 0.70.
+ * Using 0.70 instead of 0.75 should not change much for other configs
+ * and would fix this issue:
+ *   0.70 for 1 cpu
+ *   1.40 up to 3 cpus
+ *   2.10 up to 7 cpus
+ *   2.80 for 8 cpus and above.
+ *
+ * This does not guarantee that tasks run for exactly their slice every
+ * time, but occasionally running an extra tick has little impact.
+ *
+ */
+unsigned int sysctl_sched_base_slice			= 700000ULL;
+static unsigned int normalized_sysctl_sched_base_slice	= 700000ULL;
 
 const_debug unsigned int sysctl_sched_migration_cost	= 500000UL;
 
-- 
2.33.0