From: Doug Berger
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Valentin Schneider, Daniel Bristot de Oliveira, Florian Fainelli,
    linux-kernel@vger.kernel.org, Doug Berger
Subject: [RFC PATCH] sched/topology: clear freecpu bit on detach
Date: Tue, 14 Jan 2025 15:04:04 -0800
Message-Id: <20250114230404.661569-1-opendmb@gmail.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

There is a hazard in the deadline scheduler where an offlined CPU
can have its free_cpus bit left set in the def_root_domain when
the schedutil cpufreq governor is used. This can allow a deadline
thread to be pushed to the runqueue of a powered-down CPU, which
breaks scheduling. The details can be found here:
https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com

The free_cpus mask is expected to be cleared by set_rq_offline();
however, the hazard occurs before the root domain is made online
during CPU hotplug, so that function is not invoked for the CPU
that is being made active.

This commit works around the issue by ensuring the free_cpus bit
for a CPU is always cleared when the CPU is removed from a
root_domain. This likely makes the call of cpudl_clear_freecpu()
in rq_offline_dl() fully redundant, but I have not removed it
here because I am not certain of all flows.

It seems likely that a better solution is possible from someone
more familiar with the scheduler implementation, but this
approach is minimally invasive from someone who is not.

Fixes: 120455c514f7 ("sched: Fix hotplug vs CPU bandwidth control")
Signed-off-by: Doug Berger
---
 kernel/sched/topology.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index da33ec9e94ab..3cbc14953c36 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -499,6 +499,7 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
 			set_rq_offline(rq);
 
 		cpumask_clear_cpu(rq->cpu, old_rd->span);
+		cpudl_clear_freecpu(&old_rd->cpudl, rq->cpu);
 
 		/*
 		 * If we don't want to free the old_rd yet then
-- 
2.34.1
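
For readers less familiar with the free_cpus mask, the hazard and the
one-line change can be modeled in userspace. This is only a toy sketch
and not kernel code: every toy_* name below is hypothetical, the masks
are plain integers rather than struct cpumask, and the way the free bit
gets set is simplified. It shows just the bookkeeping: detaching a CPU
from the span without clearing its free bit leaves it looking like a
valid push target, while clearing both bits on detach does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for a root domain: one bit per CPU in each mask. */
struct toy_root_domain {
	uint32_t span;      /* CPUs currently attached to this domain */
	uint32_t free_cpus; /* CPUs a deadline task may be pushed to */
};

/* Bit for a CPU index; asserts that the index is within the toy range. */
static inline uint32_t toy_cpu_bit(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	return UINT32_C(1) << cpu;
}

/* Add a CPU to the domain's span. */
static void toy_attach_cpu(struct toy_root_domain *rd, int cpu)
{
	rd->span |= toy_cpu_bit(cpu);
}

/* Mark a CPU as free for deadline pushes (simplified; assumption). */
static void toy_mark_free(struct toy_root_domain *rd, int cpu)
{
	rd->free_cpus |= toy_cpu_bit(cpu);
}

/* Models a detach where the offline path was skipped: only span changes. */
static void toy_detach_cpu_unpatched(struct toy_root_domain *rd, int cpu)
{
	rd->span &= ~toy_cpu_bit(cpu);
}

/* Models the proposed change: the free bit is always cleared on detach. */
static void toy_detach_cpu_patched(struct toy_root_domain *rd, int cpu)
{
	rd->span &= ~toy_cpu_bit(cpu);
	rd->free_cpus &= ~toy_cpu_bit(cpu);
}

/* True if the free bit still advertises this CPU as a push target. */
static bool toy_cpu_is_push_candidate(const struct toy_root_domain *rd, int cpu)
{
	return (rd->free_cpus & toy_cpu_bit(cpu)) != 0;
}
```

Attaching CPU 3, marking it free, and then calling
toy_detach_cpu_unpatched() leaves toy_cpu_is_push_candidate() true for
CPU 3 even though it is no longer in the span; the same sequence with
toy_detach_cpu_patched() leaves it false.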