Date: Tue, 27 Aug 2024 23:07:38 +0000
In-Reply-To:
 <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
Message-ID: <20240827230753.2073580-2-kinseyho@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.46.0.295.g3b9ea8a38a-goog
Subject: [PATCH mm-unstable v3 1/5] cgroup: clarify css sibling linkage is
 protected by cgroup_mutex or RCU
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

Explicitly document that css sibling/descendant linkage is protected by
cgroup_mutex or RCU. Also, document in css_next_descendant_pre() and
similar functions that it isn't necessary to hold a ref on @pos.

The following changes in this patchset rely on this clarification for
simplification in memcg iteration code.
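
As a rough illustration only (this is not the kernel implementation), the
userspace C sketch below shows why a pre-order walk can be resumed from @pos
alone: the successor is derived purely from the child, sibling and parent
links of the current node. struct node and next_descendant_pre() are
hypothetical names made up for this example. There is no RCU or cgroup_mutex
here; in the kernel it is that locking which keeps the links safe to follow.

```c
#include <stddef.h>

/* Hypothetical stand-in for a css: only the linkage needed for the walk. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/*
 * Return the pre-order successor of @pos inside @root's subtree, or NULL
 * when the walk is complete. Passing pos == NULL starts the walk at @root.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	if (!pos)
		return root;

	/* Visit children before siblings. */
	if (pos->first_child)
		return pos->first_child;

	/* No children: climb until some ancestor has a next sibling. */
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}

	return NULL;
}
```

Because no state is kept between calls, a caller could in principle drop and
re-enter its critical section between two steps, which is the property the
comment changes in this patch spell out.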
Suggested-by: Yosry Ahmed
Reviewed-by: Michal Koutný
Signed-off-by: Kinsey Ho
---
 include/linux/cgroup-defs.h |  6 +++++-
 kernel/cgroup/cgroup.c      | 16 +++++++++-------
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 7fc2d0195f56..ca7e912b8355 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -172,7 +172,11 @@ struct cgroup_subsys_state {
 	/* reference count - access via css_[try]get() and css_put() */
 	struct percpu_ref refcnt;
 
-	/* siblings list anchored at the parent's ->children */
+	/*
+	 * siblings list anchored at the parent's ->children
+	 *
+	 * linkage is protected by cgroup_mutex or RCU
+	 */
 	struct list_head sibling;
 	struct list_head children;
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0a97cb2ef124..ece2316e2bca 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4602,8 +4602,9 @@ struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
  *
  * While this function requires cgroup_mutex or RCU read locking, it
  * doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct next descendant as long
- * as both @pos and @root are accessible and @pos is a descendant of @root.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct next descendant as long as both @pos
+ * and @root are accessible and @pos is a descendant of @root.
  *
  * If a subsystem synchronizes ->css_online() and the start of iteration, a
  * css which finished ->css_online() is guaranteed to be visible in the
@@ -4651,8 +4652,9 @@ EXPORT_SYMBOL_GPL(css_next_descendant_pre);
  *
  * While this function requires cgroup_mutex or RCU read locking, it
  * doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct rightmost descendant as
- * long as @pos is accessible.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct rightmost descendant as long as @pos
+ * is accessible.
  */
 struct cgroup_subsys_state *
 css_rightmost_descendant(struct cgroup_subsys_state *pos)
@@ -4696,9 +4698,9 @@ css_leftmost_descendant(struct cgroup_subsys_state *pos)
  *
  * While this function requires cgroup_mutex or RCU read locking, it
  * doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct next descendant as long
- * as both @pos and @cgroup are accessible and @pos is a descendant of
- * @cgroup.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct next descendant as long as both @pos
+ * and @cgroup are accessible and @pos is a descendant of @cgroup.
  *
  * If a subsystem synchronizes ->css_online() and the start of iteration, a
  * css which finished ->css_online() is guaranteed to be visible in the
-- 
2.46.0.295.g3b9ea8a38a-goog

Date: Tue, 27 Aug 2024 23:07:39 +0000
In-Reply-To: <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
Message-ID: <20240827230753.2073580-3-kinseyho@google.com>
Subject: [PATCH mm-unstable v3 2/5] mm: don't hold css->refcnt during traversal
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

To obtain the pointer to the next memcg position, mem_cgroup_iter()
currently holds css->refcnt during memcg traversal only to
put css->refcnt at the end of the routine. This isn't necessary as an
rcu_read_lock is already held throughout the function. The use of the RCU
read lock with css_next_descendant_pre() guarantees that sibling linkage is
safe without holding a ref on the passed-in @css. Remove css->refcnt usage
during traversal by leveraging RCU.

Signed-off-by: Kinsey Ho
Reviewed-by: T.J. Mercier
---
 mm/memcontrol.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 35431035e782..67b1994377b7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1013,20 +1013,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		else if (reclaim->generation != iter->generation)
 			goto out_unlock;
 
-		while (1) {
-			pos = READ_ONCE(iter->position);
-			if (!pos || css_tryget(&pos->css))
-				break;
-			/*
-			 * css reference reached zero, so iter->position will
-			 * be cleared by ->css_released. However, we should not
-			 * rely on this happening soon, because ->css_released
-			 * is called from a work queue, and by busy-waiting we
-			 * might block it. So we clear iter->position right
-			 * away.
-			 */
-			(void)cmpxchg(&iter->position, pos, NULL);
-		}
+		pos = READ_ONCE(iter->position);
 	} else if (prev) {
 		pos = prev;
 	}
@@ -1067,9 +1054,6 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		 */
 		(void)cmpxchg(&iter->position, pos, memcg);
 
-		if (pos)
-			css_put(&pos->css);
-
 		if (!memcg)
 			iter->generation++;
 	}
-- 
2.46.0.295.g3b9ea8a38a-goog

Date: Tue, 27 Aug 2024 23:07:40 +0000
In-Reply-To: <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
Message-ID: <20240827230753.2073580-4-kinseyho@google.com>
Subject: [PATCH mm-unstable v3 3/5] mm: increment gen # before restarting traversal
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

The generation number in struct mem_cgroup_reclaim_iter should be
incremented on every round-trip. Currently, it is possible for a
concurrent reclaimer to jump in at the end of the hierarchy, causing a
traversal restart (resetting the iteration position) without incrementing
the generation number.

By resetting the position without incrementing the generation, it's
possible for another ongoing mem_cgroup_iter() thread to walk the tree
twice.

Move the traversal restart such that the generation number is
incremented before the restart.

Signed-off-by: Kinsey Ho
Reviewed-by: T.J. Mercier
---
 mm/memcontrol.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 67b1994377b7..51b194a4c375 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -997,7 +997,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		root = root_mem_cgroup;
 
 	rcu_read_lock();
-
+restart:
 	if (reclaim) {
 		struct mem_cgroup_per_node *mz;
 
@@ -1024,14 +1024,6 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 	for (;;) {
 		css = css_next_descendant_pre(css, &root->css);
 		if (!css) {
-			/*
-			 * Reclaimers share the hierarchy walk, and a
-			 * new one might jump in right at the end of
-			 * the hierarchy - make sure they see at least
-			 * one group and restart from the beginning.
-			 */
-			if (!prev)
-				continue;
 			break;
 		}
 
@@ -1054,8 +1046,18 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		 */
 		(void)cmpxchg(&iter->position, pos, memcg);
 
-		if (!memcg)
+		if (!memcg) {
 			iter->generation++;
+
+			/*
+			 * Reclaimers share the hierarchy walk, and a
+			 * new one might jump in right at the end of
+			 * the hierarchy - make sure they see at least
+			 * one group and restart from the beginning.
+			 */
+			if (!prev)
+				goto restart;
+		}
 	}
 
 out_unlock:
-- 
2.46.0.295.g3b9ea8a38a-goog

Date: Tue, 27 Aug 2024 23:07:41 +0000
In-Reply-To: <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
Message-ID: <20240827230753.2073580-5-kinseyho@google.com>
Subject: [PATCH mm-unstable v3 4/5] mm: restart if multiple traversals raced
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

Currently, if multiple reclaimers raced on the same position, the
reclaimers which detect the race will still reclaim from the same memcg.
Instead, the reclaimers which detect the race should move on to the next
memcg in the hierarchy.

So, in the case where multiple traversals race, jump back to the start of
the mem_cgroup_iter() function to find the next memcg in the hierarchy to
reclaim from.

Signed-off-by: Kinsey Ho
Acked-by: Kinsey Ho
Reported-by: syzbot+e099d407346c45275ce9@syzkaller.appspotmail.com
Reviewed-by: T.J. Mercier
---
 include/linux/memcontrol.h |  4 ++--
 mm/memcontrol.c            | 22 ++++++++++++++--------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index fe05fdb92779..2ef94c74847d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -57,7 +57,7 @@ enum memcg_memory_event {
 
 struct mem_cgroup_reclaim_cookie {
 	pg_data_t *pgdat;
-	unsigned int generation;
+	int generation;
 };
 
 #ifdef CONFIG_MEMCG
@@ -78,7 +78,7 @@ struct lruvec_stats;
 struct mem_cgroup_reclaim_iter {
 	struct mem_cgroup *position;
 	/* scan generation, increased every round-trip */
-	unsigned int generation;
+	atomic_t generation;
 };
 
 /*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 51b194a4c375..33bd379c738b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -986,7 +986,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 				   struct mem_cgroup_reclaim_cookie *reclaim)
 {
 	struct mem_cgroup_reclaim_iter *iter;
-	struct cgroup_subsys_state *css = NULL;
+	struct cgroup_subsys_state *css;
 	struct mem_cgroup *memcg = NULL;
 	struct mem_cgroup *pos = NULL;
 
@@ -999,18 +999,20 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 	rcu_read_lock();
 restart:
 	if (reclaim) {
+		int gen;
 		struct mem_cgroup_per_node *mz;
 
 		mz = root->nodeinfo[reclaim->pgdat->node_id];
 		iter = &mz->iter;
+		gen = atomic_read(&iter->generation);
 
 		/*
 		 * On start, join the current reclaim iteration cycle.
 		 * Exit when a concurrent walker completes it.
 		 */
 		if (!prev)
-			reclaim->generation = iter->generation;
-		else if (reclaim->generation != iter->generation)
+			reclaim->generation = gen;
+		else if (reclaim->generation != gen)
 			goto out_unlock;
 
 		pos = READ_ONCE(iter->position);
@@ -1018,8 +1020,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		pos = prev;
 	}
 
-	if (pos)
-		css = &pos->css;
+	css = pos ? &pos->css : NULL;
 
 	for (;;) {
 		css = css_next_descendant_pre(css, &root->css);
@@ -1033,21 +1034,26 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		 * and kicking, and don't take an extra reference.
 		 */
 		if (css == &root->css || css_tryget(css)) {
-			memcg = mem_cgroup_from_css(css);
 			break;
 		}
 	}
 
+	memcg = mem_cgroup_from_css(css);
+
 	if (reclaim) {
 		/*
 		 * The position could have already been updated by a competing
 		 * thread, so check that the value hasn't changed since we read
 		 * it to avoid reclaiming from the same cgroup twice.
 		 */
-		(void)cmpxchg(&iter->position, pos, memcg);
+		if (cmpxchg(&iter->position, pos, memcg) != pos) {
+			if (css && css != &root->css)
+				css_put(css);
+			goto restart;
+		}
 
 		if (!memcg) {
-			iter->generation++;
+			atomic_inc(&iter->generation);
 
 			/*
 			 * Reclaimers share the hierarchy walk, and a
-- 
2.46.0.295.g3b9ea8a38a-goog

Date: Tue, 27 Aug 2024 23:07:42 +0000
In-Reply-To: <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
Message-ID: <20240827230753.2073580-6-kinseyho@google.com>
Subject: [PATCH mm-unstable v3 5/5] mm: clean up mem_cgroup_iter()
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

A clean up to make variable names more clear and to improve code
readability. No functional change.

Signed-off-by: Kinsey Ho
Reviewed-by: T.J. Mercier
---
 mm/memcontrol.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 33bd379c738b..2bdad7c29ac0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -987,8 +987,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 {
 	struct mem_cgroup_reclaim_iter *iter;
 	struct cgroup_subsys_state *css;
-	struct mem_cgroup *memcg = NULL;
-	struct mem_cgroup *pos = NULL;
+	struct mem_cgroup *pos;
+	struct mem_cgroup *next = NULL;
 
 	if (mem_cgroup_disabled())
 		return NULL;
@@ -1000,10 +1000,9 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 restart:
 	if (reclaim) {
 		int gen;
-		struct mem_cgroup_per_node *mz;
+		int nid = reclaim->pgdat->node_id;
 
-		mz = root->nodeinfo[reclaim->pgdat->node_id];
-		iter = &mz->iter;
+		iter = &root->nodeinfo[nid]->iter;
 		gen = atomic_read(&iter->generation);
 
 		/*
@@ -1016,29 +1015,22 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 			goto out_unlock;
 
 		pos = READ_ONCE(iter->position);
-	} else if (prev) {
+	} else
 		pos = prev;
-	}
 
 	css = pos ? &pos->css : NULL;
 
-	for (;;) {
-		css = css_next_descendant_pre(css, &root->css);
-		if (!css) {
-			break;
-		}
-
+	while ((css = css_next_descendant_pre(css, &root->css))) {
 		/*
 		 * Verify the css and acquire a reference. The root
 		 * is provided by the caller, so we know it's alive
 		 * and kicking, and don't take an extra reference.
 		 */
-		if (css == &root->css || css_tryget(css)) {
+		if (css == &root->css || css_tryget(css))
 			break;
-		}
 	}
 
-	memcg = mem_cgroup_from_css(css);
+	next = mem_cgroup_from_css(css);
 
 	if (reclaim) {
 		/*
@@ -1046,13 +1038,13 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		 * thread, so check that the value hasn't changed since we read
 		 * it to avoid reclaiming from the same cgroup twice.
*/ - if (cmpxchg(&iter->position, pos, memcg) !=3D pos) { + if (cmpxchg(&iter->position, pos, next) !=3D pos) { if (css && css !=3D &root->css) css_put(css); goto restart; } =20 - if (!memcg) { + if (!next) { atomic_inc(&iter->generation); =20 /* @@ -1071,7 +1063,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup = *root, if (prev && prev !=3D root) css_put(&prev->css); =20 - return memcg; + return next; } =20 /** --=20 2.46.0.295.g3b9ea8a38a-goog
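
The "no functional change" claim for the loop rewrite can be checked outside the kernel with a small user-space sketch. Everything below is a hypothetical stand-in, not kernel code: struct node plays the role of a css in pre-order, next_pre() mimics the NULL-starts-at-root convention of css_next_descendant_pre(), and the alive flag stands in for css_tryget() succeeding. The sketch only compares the old for (;;) shape against the new while shape; it makes no attempt to model reference counting or the cmpxchg() on iter->position.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a css: one node in a pre-order chain. */
struct node {
	struct node *next;	/* next node in pre-order, NULL at the end */
	int alive;		/* stand-in for css_tryget() succeeding */
};

/* Hypothetical stand-in for css_next_descendant_pre(): a NULL pos starts at root. */
static struct node *next_pre(struct node *pos, struct node *root)
{
	return pos ? pos->next : root;
}

/* Old loop shape: for (;;) with an explicit NULL check and braces. */
static struct node *walk_old(struct node *pos, struct node *root)
{
	struct node *n = pos;

	for (;;) {
		n = next_pre(n, root);
		if (!n) {
			break;
		}
		if (n == root || n->alive) {
			break;
		}
	}
	return n;
}

/* New loop shape: the assignment doubles as the loop condition. */
static struct node *walk_new(struct node *pos, struct node *root)
{
	struct node *n = pos;

	while ((n = next_pre(n, root))) {
		if (n == root || n->alive)
			break;
	}
	return n;
}

/* Compare both shapes from every start position; returns the number of cases checked. */
static int check_fixture(void)
{
	struct node c = { NULL, 1 };
	struct node b = { &c, 0 };	/* "dying" node that must be skipped */
	struct node a = { &b, 1 };	/* a is the root */
	struct node *starts[] = { NULL, &a, &b, &c };
	int checked = 0;

	for (size_t i = 0; i < sizeof(starts) / sizeof(starts[0]); i++) {
		assert(walk_old(starts[i], &a) == walk_new(starts[i], &a));
		checked++;
	}
	return checked;
}
```

Calling check_fixture() from a main() runs both shapes over the four start positions (NULL, root, a skipped node, and the last node) and asserts that they return the same node each time, including the NULL result at the end of the walk.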