Date: Fri, 29 Jul 2022 13:10:16 -1000
From: Tejun Heo
To: cgroups@vger.kernel.org, Zefan Li, Michal Koutný, Christian Brauner
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com, Namhyung Kim, Pablo Neira Ayuso
Subject: [PATCH v3 cgroup/for-5.20] cgroup: Replace cgroup->ancestor_ids[] with ->ancestors[]
X-Mailing-List: linux-kernel@vger.kernel.org

Every cgroup knows all its ancestors through its ->ancestor_ids[]. There's
no advantage to remembering the IDs instead of the pointers directly and
this makes the array useless for finding an actual ancestor cgroup forcing
cgroup_ancestor() to iteratively walk up the hierarchy instead. Let's
replace cgroup->ancestor_ids[] with ->ancestors[] and remove the walking-up
from cgroup_ancestor().

While at it, improve comments around cgroup_root->cgrp_ancestor_storage.

This patch shouldn't cause user-visible behavior differences.

v2: Update cgroup_ancestor() to use ->ancestors[].

v3: cgroup_root->cgrp_ancestor_storage's type is updated to match
    cgroup->ancestors[]. Better comments.

Signed-off-by: Tejun Heo
Acked-by: Namhyung Kim
---
 include/linux/cgroup-defs.h                 | 16 ++++++++++------
 include/linux/cgroup.h                      |  8 +++-----
 kernel/cgroup/cgroup.c                      |  7 +++----
 net/netfilter/nft_socket.c                  |  9 +++++----
 tools/perf/util/bpf_skel/bperf_cgroup.bpf.c |  2 +-
 5 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 63bf43c7ca3b..52a3c47c89bc 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -379,7 +379,7 @@ struct cgroup {
 	/*
 	 * The depth this cgroup is at. The root is at depth zero and each
 	 * step down the hierarchy increments the level. This along with
-	 * ancestor_ids[] can determine whether a given cgroup is a
+	 * ancestors[] can determine whether a given cgroup is a
 	 * descendant of another without traversing the hierarchy.
 	 */
 	int level;
@@ -499,8 +499,8 @@ struct cgroup {
 	/* Used to store internal freezer state */
 	struct cgroup_freezer_state freezer;
 
-	/* ids of the ancestors at each level including self */
-	u64 ancestor_ids[];
+	/* All ancestors including self */
+	struct cgroup *ancestors[];
 };
 
 /*
@@ -517,11 +517,15 @@ struct cgroup_root {
 	/* Unique id for this hierarchy. */
 	int hierarchy_id;
 
-	/* The root cgroup. Root is destroyed on its release. */
+	/*
+	 * The root cgroup. The containing cgroup_root will be destroyed on its
+	 * release. cgrp->ancestors[0] will be used overflowing into the
+	 * following field. cgrp_ancestor_storage must immediately follow.
+	 */
 	struct cgroup cgrp;
 
-	/* for cgrp->ancestor_ids[0] */
-	u64 cgrp_ancestor_id_storage;
+	/* must follow cgrp for cgrp->ancestors[0], see above */
+	struct cgroup *cgrp_ancestor_storage;
 
 	/* Number of cgroups in the hierarchy, used only for /proc/cgroups */
 	atomic_t nr_cgrps;
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index ed53bfe7c46c..4d143729b246 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -574,7 +574,7 @@ static inline bool cgroup_is_descendant(struct cgroup *cgrp,
 {
 	if (cgrp->root != ancestor->root || cgrp->level < ancestor->level)
 		return false;
-	return cgrp->ancestor_ids[ancestor->level] == cgroup_id(ancestor);
+	return cgrp->ancestors[ancestor->level] == ancestor;
 }
 
 /**
@@ -591,11 +591,9 @@ static inline bool cgroup_is_descendant(struct cgroup *cgrp,
 static inline struct cgroup *cgroup_ancestor(struct cgroup *cgrp,
 					     int ancestor_level)
 {
-	if (cgrp->level < ancestor_level)
+	if (ancestor_level < 0 || ancestor_level > cgrp->level)
 		return NULL;
-	while (cgrp && cgrp->level > ancestor_level)
-		cgrp = cgroup_parent(cgrp);
-	return cgrp;
+	return cgrp->ancestors[ancestor_level];
 }
 
 /**
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 85fa4c8587a8..ce587fe43dab 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2047,7 +2047,7 @@ int cgroup_setup_root(struct cgroup_root *root, u16 ss_mask)
 	}
 	root_cgrp->kn = kernfs_root_to_node(root->kf_root);
 	WARN_ON_ONCE(cgroup_ino(root_cgrp) != 1);
-	root_cgrp->ancestor_ids[0] = cgroup_id(root_cgrp);
+	root_cgrp->ancestors[0] = root_cgrp;
 
 	ret = css_populate_dir(&root_cgrp->self);
 	if (ret)
@@ -5391,8 +5391,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 	int ret;
 
 	/* allocate the cgroup and its ID, 0 is reserved for the root */
-	cgrp = kzalloc(struct_size(cgrp, ancestor_ids, (level + 1)),
-		       GFP_KERNEL);
+	cgrp = kzalloc(struct_size(cgrp, ancestors, (level + 1)), GFP_KERNEL);
 	if (!cgrp)
 		return ERR_PTR(-ENOMEM);
 
@@ -5444,7 +5443,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
 
 	spin_lock_irq(&css_set_lock);
 	for (tcgrp = cgrp; tcgrp; tcgrp = cgroup_parent(tcgrp)) {
-		cgrp->ancestor_ids[tcgrp->level] = cgroup_id(tcgrp);
+		cgrp->ancestors[tcgrp->level] = tcgrp;
 
 		if (tcgrp != cgrp) {
 			tcgrp->nr_descendants++;
diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
index 05ae5a338b6f..d982a7c22a77 100644
--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -40,16 +40,17 @@ static noinline bool
 nft_sock_get_eval_cgroupv2(u32 *dest, struct sock *sk, const struct nft_pktinfo *pkt, u32 level)
 {
 	struct cgroup *cgrp;
+	u64 cgid;
 
 	if (!sk_fullsock(sk))
 		return false;
 
-	cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
-	if (level > cgrp->level)
+	cgrp = cgroup_ancestor(sock_cgroup_ptr(&sk->sk_cgrp_data), level);
+	if (!cgrp)
 		return false;
 
-	memcpy(dest, &cgrp->ancestor_ids[level], sizeof(u64));
-
+	cgid = cgroup_id(cgrp);
+	memcpy(dest, &cgid, sizeof(u64));
 	return true;
 }
 #endif
diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
index 292c430768b5..bd6a420acc8f 100644
--- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
+++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
@@ -68,7 +68,7 @@ static inline int get_cgroup_v1_idx(__u32 *cgrps, int size)
 			break;
 
 		// convert cgroup-id to a map index
-		cgrp_id = BPF_CORE_READ(cgrp, ancestor_ids[i]);
+		cgrp_id = BPF_CORE_READ(cgrp, ancestors[i], kn, id);
 		elem = bpf_map_lookup_elem(&cgrp_idx, &cgrp_id);
 		if (!elem)
 			continue;
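
Illustration only, not part of the patch: a minimal userspace sketch of the
per-cgroup ancestors[] array and the two lookups it enables. struct
model_cgroup and the model_* helpers are hypothetical names made up for this
sketch, only the level and ancestors[] fields are modelled, and the
same-root comparison done by cgroup_is_descendant() is left out. The create
helper copies the parent's array, whereas cgroup_create() in the patch walks
up with cgroup_parent(); the resulting array should be the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cgroup; only the fields used here. */
struct model_cgroup {
	int level;				/* the root is at level 0 */
	struct model_cgroup *ancestors[];	/* ancestors[level] is self */
};

/* Allocate a child of @parent (NULL creates a root) and fill ancestors[]. */
static struct model_cgroup *model_cgroup_create(struct model_cgroup *parent)
{
	int level = parent ? parent->level + 1 : 0;
	struct model_cgroup *cgrp;
	int i;

	/* One pointer slot per level from the root down to this cgroup. */
	cgrp = calloc(1, sizeof(*cgrp) +
			 (size_t)(level + 1) * sizeof(cgrp->ancestors[0]));
	if (!cgrp)
		return NULL;

	cgrp->level = level;
	/* Assumption: copy the parent's chain instead of walking up. */
	for (i = 0; i < level; i++)
		cgrp->ancestors[i] = parent->ancestors[i];
	cgrp->ancestors[level] = cgrp;
	return cgrp;
}

/* Same shape as the patched cgroup_ancestor(): range check, then one load. */
static struct model_cgroup *model_cgroup_ancestor(struct model_cgroup *cgrp,
						  int ancestor_level)
{
	assert(cgrp != NULL);
	if (ancestor_level < 0 || ancestor_level > cgrp->level)
		return NULL;
	return cgrp->ancestors[ancestor_level];
}

/* Same shape as the patched cgroup_is_descendant(), minus the root check. */
static bool model_cgroup_is_descendant(struct model_cgroup *cgrp,
				       struct model_cgroup *ancestor)
{
	if (cgrp->level < ancestor->level)
		return false;
	return cgrp->ancestors[ancestor->level] == ancestor;
}
```

With a root, a child and a grandchild created through model_cgroup_create(),
model_cgroup_ancestor(grandchild, 1) returns the child without walking parent
pointers, and an out-of-range level returns NULL.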