From: Juri Lelli
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Waiman Long, Tejun Heo, Johannes Weiner, Michal Koutný, Qais Yousef, Sebastian Andrzej Siewior, Swapnil Sapkal, Shrikanth Hegde, Phil Auld, luca.abeni@santannapisa.it, tommaso.cucinotta@santannapisa.it, Jon Hunter
Subject: [PATCH 4/5] sched/deadline: Rebuild root domain accounting after every update
Date: Tue, 4 Mar 2025 08:40:44 +0000
Message-ID: <20250304084045.62554-5-juri.lelli@redhat.com>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250304084045.62554-1-juri.lelli@redhat.com>
References: <20250304084045.62554-1-juri.lelli@redhat.com>

Rebuilding of root domain accounting information (total_bw) is
currently broken in some cases, e.g. suspend/resume on aarch64. The
problem is that the way we keep track of domain changes and try to add
bandwidth back is convoluted and fragile.

Fix it by simplifying things: make sure bandwidth accounting is cleared
and completely restored after root domain changes (once root domains
are stable again).

Reported-by: Jon Hunter
Fixes: 53916d5fd3c0 ("sched/deadline: Check bandwidth overflow earlier for hotplug")
Signed-off-by: Juri Lelli
---
 include/linux/sched/deadline.h |  4 ++++
 include/linux/sched/topology.h |  2 ++
 kernel/cgroup/cpuset.c         | 16 +++++++++-------
 kernel/sched/deadline.c        | 16 ++++++++++------
 kernel/sched/topology.c        |  1 +
 5 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 6ec578600b24..a780068aa1a5 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -34,6 +34,10 @@ static inline bool dl_time_before(u64 a, u64 b)
 struct root_domain;
 extern void dl_add_task_root_domain(struct task_struct *p);
 extern void dl_clear_root_domain(struct root_domain *rd);
+extern void dl_clear_root_domain_cpu(int cpu);
+
+extern u64 dl_cookie;
+extern bool dl_bw_visited(int cpu, u64 gen);
 
 #endif /* CONFIG_SMP */
 
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 7f3dbafe1817..1622232bd08b 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -166,6 +166,8 @@ static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
 	return to_cpumask(sd->span);
 }
 
+extern void dl_rebuild_rd_accounting(void);
+
 extern void partition_sched_domains_locked(int ndoms_new,
 					   cpumask_var_t doms_new[],
 					   struct sched_domain_attr *dattr_new);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f87526edb2a4..f66b2aefdc04 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -954,10 +954,12 @@ static void dl_update_tasks_root_domain(struct cpuset *cs)
 	css_task_iter_end(&it);
 }
 
-static void dl_rebuild_rd_accounting(void)
+void dl_rebuild_rd_accounting(void)
 {
 	struct cpuset *cs = NULL;
 	struct cgroup_subsys_state *pos_css;
+	int cpu;
+	u64 cookie = ++dl_cookie;
 
 	lockdep_assert_held(&cpuset_mutex);
 	lockdep_assert_cpus_held();
@@ -965,11 +967,12 @@ static void dl_rebuild_rd_accounting(void)
 
 	rcu_read_lock();
 
-	/*
-	 * Clear default root domain DL accounting, it will be computed again
-	 * if a task belongs to it.
-	 */
-	dl_clear_root_domain(&def_root_domain);
+	for_each_possible_cpu(cpu) {
+		if (dl_bw_visited(cpu, cookie))
+			continue;
+
+		dl_clear_root_domain_cpu(cpu);
+	}
 
 	cpuset_for_each_descendant_pre(cs, pos_css, &top_cpuset) {
 
@@ -996,7 +999,6 @@ partition_and_rebuild_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
 {
 	sched_domains_mutex_lock();
 	partition_sched_domains_locked(ndoms_new, doms_new, dattr_new);
-	dl_rebuild_rd_accounting();
 	sched_domains_mutex_unlock();
 }
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 339434271cba..17b040c92885 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -166,7 +166,7 @@ static inline unsigned long dl_bw_capacity(int i)
 	}
 }
 
-static inline bool dl_bw_visited(int cpu, u64 cookie)
+bool dl_bw_visited(int cpu, u64 cookie)
 {
 	struct root_domain *rd = cpu_rq(cpu)->rd;
 
@@ -207,7 +207,7 @@ static inline unsigned long dl_bw_capacity(int i)
 	return SCHED_CAPACITY_SCALE;
 }
 
-static inline bool dl_bw_visited(int cpu, u64 cookie)
+bool dl_bw_visited(int cpu, u64 cookie)
 {
 	return false;
 }
@@ -2981,18 +2981,22 @@ void dl_clear_root_domain(struct root_domain *rd)
 	rd->dl_bw.total_bw = 0;
 
 	/*
-	 * dl_server bandwidth is only restored when CPUs are attached to root
-	 * domains (after domains are created or CPUs moved back to the
-	 * default root doamin).
+	 * dl_servers are not tasks. Since dl_add_task_root_domain ignores
+	 * them, we need to account for them here explicitly.
 	 */
 	for_each_cpu(i, rd->span) {
 		struct sched_dl_entity *dl_se = &cpu_rq(i)->fair_server;
 
 		if (dl_server(dl_se) && cpu_active(i))
-			rd->dl_bw.total_bw += dl_se->dl_bw;
+			__dl_add(&rd->dl_bw, dl_se->dl_bw, dl_bw_cpus(i));
 	}
 }
 
+void dl_clear_root_domain_cpu(int cpu)
+{
+	dl_clear_root_domain(cpu_rq(cpu)->rd);
+}
+
 #endif /* CONFIG_SMP */
 
 static void switched_from_dl(struct rq *rq, struct task_struct *p)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b70d6002bb93..bdfda0ef1bd9 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2796,6 +2796,7 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
 	ndoms_cur = ndoms_new;
 
 	update_sched_domain_debugfs();
+	dl_rebuild_rd_accounting();
 }
 
 /*
-- 
2.48.1
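
Illustration (not part of the patch): the per-CPU loop added to dl_rebuild_rd_accounting() uses a generation cookie so that a root domain shared by several CPUs is cleared only once per rebuild pass. The following is a minimal user-space sketch of that pattern under stated assumptions, not kernel code. Every toy_* name, the fixed four-CPU layout and the bandwidth values are hypothetical, and the server contribution is reduced to a plain multiplication instead of going through __dl_add().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for a root domain; not the kernel's struct. */
struct toy_root_domain {
	uint64_t visit_cookie;	/* generation in which it was last visited */
	uint64_t total_bw;	/* accumulated bandwidth */
	uint64_t server_bw;	/* per-CPU server bandwidth (assumed equal) */
	int nr_cpus;		/* number of CPUs spanned by this domain */
};

/* Hypothetical CPU -> root domain mapping. */
static struct toy_root_domain *toy_cpu_rd[TOY_NR_CPUS];
static uint64_t toy_cookie;

/* Return 1 if this CPU's domain was already seen in this generation;
 * otherwise stamp it with the current cookie and return 0. */
static int toy_visited(int cpu, uint64_t cookie)
{
	struct toy_root_domain *rd;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	rd = toy_cpu_rd[cpu];
	if (rd->visit_cookie == cookie)
		return 1;
	rd->visit_cookie = cookie;
	return 0;
}

/* Reset the accounting, then re-add the per-CPU server share, which
 * no task walk would restore. */
static void toy_clear(struct toy_root_domain *rd)
{
	rd->total_bw = rd->server_bw * (uint64_t)rd->nr_cpus;
}

/* One rebuild pass: bump the generation and clear each distinct domain
 * exactly once. Returns how many domains were cleared. */
static int toy_rebuild(void)
{
	uint64_t cookie = ++toy_cookie;
	int cpu, cleared = 0;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (toy_visited(cpu, cookie))
			continue;
		toy_clear(toy_cpu_rd[cpu]);
		cleared++;
	}
	return cleared;
}

/* CPUs 0-2 share domain a, CPU 3 is alone in domain b. Returns the
 * number of clears, or -1 if the rebuilt totals are not as expected. */
int toy_demo(void)
{
	static struct toy_root_domain a = { .total_bw = 999, .server_bw = 10, .nr_cpus = 3 };
	static struct toy_root_domain b = { .total_bw = 555, .server_bw = 10, .nr_cpus = 1 };
	int cleared;

	toy_cpu_rd[0] = toy_cpu_rd[1] = toy_cpu_rd[2] = &a;
	toy_cpu_rd[3] = &b;

	cleared = toy_rebuild();
	if (a.total_bw != 30 || b.total_bw != 10)
		return -1;
	return cleared;
}
```

Calling toy_demo() from a main() returns the number of distinct domains cleared in one pass. With the layout above, four CPUs map to two domains, so only two clears happen and the stale totals are replaced by the server share alone. In the real code the task bandwidth would then be added back by the cpuset walk that follows the loop.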