Date: Tue, 09 Dec 2025 11:04:33 -1000
Message-ID: <286e6f7787a81239e1ce2989b52391ce@kernel.org>
From: Tejun Heo
To: David Vernet, Andrea Righi, Changwoo Min
Cc: Chris Mason, sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH sched_ext/for-6.19-fixes] sched_ext: Fix bypass depth leak on scx_enable() failure

scx_enable() calls scx_bypass(true) to initialize in bypass mode and then
scx_bypass(false) on success to exit. If scx_enable() fails during task
initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error -
it jumps to err_disable while bypass is still active. scx_disable_workfn()
then calls scx_bypass(true/false) for its own bypass, leaving the bypass
depth at 1 instead of 0. This causes the system to remain permanently in
bypass mode after a failed scx_enable().

Failures after task initialization is complete - e.g. scx_tryset_enable_state()
at the end - already call scx_bypass(false) before reaching the error path and
are not affected. This only affects a subset of failure modes.

Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and
having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This
is a temporary measure as the bypass depth will be moved into the sched
instance, which will make this tracking unnecessary.
Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Chris Mason
Signed-off-by: Tejun Heo
Reviewed-by: Emil Tsalapatis
---
 kernel/sched/ext.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -41,6 +41,13 @@ static bool scx_init_task_enabled;
 static bool scx_switching_all;
 DEFINE_STATIC_KEY_FALSE(__scx_switched_all);
 
+/*
+ * Tracks whether scx_enable() called scx_bypass(true). Used to balance bypass
+ * depth on enable failure. Will be removed when bypass depth is moved into the
+ * sched instance.
+ */
+static bool scx_bypassed_for_enable;
+
 static atomic_long_t scx_nr_rejected = ATOMIC_LONG_INIT(0);
 static atomic_long_t scx_hotplug_seq = ATOMIC_LONG_INIT(0);
 
@@ -4318,6 +4325,11 @@ static void scx_disable_workfn(struct kt
 	scx_dsp_max_batch = 0;
 	free_kick_syncs();
 
+	if (scx_bypassed_for_enable) {
+		scx_bypassed_for_enable = false;
+		scx_bypass(false);
+	}
+
 	mutex_unlock(&scx_enable_mutex);
 
 	WARN_ON_ONCE(scx_set_enable_state(SCX_DISABLED) != SCX_DISABLING);
@@ -4970,6 +4982,7 @@ static int scx_enable(struct sched_ext_o
 	 * Init in bypass mode to guarantee forward progress.
 	 */
 	scx_bypass(true);
+	scx_bypassed_for_enable = true;
 
 	for (i = SCX_OPI_NORMAL_BEGIN; i < SCX_OPI_NORMAL_END; i++)
 		if (((void (**)(void))ops)[i])
@@ -5067,6 +5080,7 @@ static int scx_enable(struct sched_ext_o
 	scx_task_iter_stop(&sti);
 	percpu_up_write(&scx_fork_rwsem);
 
+	scx_bypassed_for_enable = false;
 	scx_bypass(false);
 
 	if (!scx_tryset_enable_state(SCX_ENABLED, SCX_ENABLING)) {
--
tejun