From nobody Thu Sep 18 07:17:04 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F585C4332F for ; Thu, 8 Dec 2022 19:58:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229781AbiLHT6G (ORCPT ); Thu, 8 Dec 2022 14:58:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229737AbiLHT6A (ORCPT ); Thu, 8 Dec 2022 14:58:00 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 64DC85F6C4 for ; Thu, 8 Dec 2022 11:57:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670529421; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=s03T6k1fAEZ5y4Sx1FORStjPoY8HDaI7yoRA/+1yXuc=; b=XfWZ/qzTnR9dT7BdPeNYYBVdx+V6PR4Ef/QBvk6MZ/aH6dzPDHO9T5Cnq/7D6b4MbilvtD N+xjSl51gHlqle9w61yG4O5C4HcCBilayHHI3GE1D6Ng9ReQslJfR6i5Qii7OL5SJUq4yT F3CtIyH66wFn7Wo5xI6Ag599E1bSg8s= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-417-he_nquLLOEqRq0Qhs5-jNg-1; Thu, 08 Dec 2022 14:57:00 -0500 X-MC-Unique: he_nquLLOEqRq0Qhs5-jNg-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DAF63858F13; Thu, 8 Dec 2022 19:56:59 +0000 (UTC) Received: from llong.com (dhcp-17-153.bos.redhat.com [10.18.17.153]) by smtp.corp.redhat.com (Postfix) with ESMTP id A9D45492B05; Thu, 8 Dec 2022 19:56:59 +0000 (UTC) From: Waiman Long To: Tejun Heo , Zefan Li , Johannes Weiner Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Waiman Long Subject: [PATCH 2/2] cgroup/cpuset: Make percpu cpuset_rwsem operation depending on DYNMODS state Date: Thu, 8 Dec 2022 14:56:34 -0500 Message-Id: <20221208195634.2604362-3-longman@redhat.com> In-Reply-To: <20221208195634.2604362-1-longman@redhat.com> References: <20221208195634.2604362-1-longman@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" With commit 6a010a49b63a ("cgroup: Make !percpu threadgroup_rwsem operations optional"), users can determine if they favor optimizing for efficiently moving processes between cgroups frequently or for a more static usage pattern where moving processes among cgroups are relatively rare. The percpu cpuset_rwsem is in the same boat as percpu_threadgroup_rwsem since moving processes among cpusets will have the same latency impact depending on whether percpu operation in cpuset_rwsem is disabled or not. Ideally cpuset_bind() is the best place to check if the cpuset_rwsem should have its reader fast path disabled like percpu_threadgroup_rwsem so that it gets to be re-evaluated every time the cpuset is rebound. Unfortunately, cgroup_favor_dynmods() that sets the CGRP_ROOT_FAVOR_DYNMODS flag is called after the bind() method call. Instead, the newly added cpuset_check_dynmods() function is called at the first cpuset_css_online() call after a cpuset_bind() call when the first child cpuset is created. Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 800c65de5daa..daf8ca948176 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -255,6 +255,7 @@ typedef enum { CS_SCHED_LOAD_BALANCE, CS_SPREAD_PAGE, CS_SPREAD_SLAB, + CS_FAVOR_DYNMODS, /* top_cpuset only */ } cpuset_flagbits_t; =20 /* convenient tests for these bits */ @@ -3049,6 +3050,27 @@ static struct cftype dfl_files[] =3D { { } /* terminate */ }; =20 +static bool dynmods_checked __read_mostly; +static void cpuset_check_dynmods(struct cgroup_root *root) +{ + bool favor_dynmods; + + lockdep_assert_held(&cgroup_mutex); + percpu_rwsem_assert_held(&cpuset_rwsem); + + /* + * Check the CGRP_ROOT_FAVOR_DYNMODS of the cgroup root to find out + * if we need to enable or disable reader fast path of cpuset_rwsem. + */ + favor_dynmods =3D test_bit(CS_FAVOR_DYNMODS, &top_cpuset.flags); + if (favor_dynmods && !(root->flags & CGRP_ROOT_FAVOR_DYNMODS)) { + rcu_sync_exit(&cpuset_rwsem.rss); + clear_bit(CS_FAVOR_DYNMODS, &top_cpuset.flags); + } else if (!favor_dynmods && (root->flags & CGRP_ROOT_FAVOR_DYNMODS)) { + rcu_sync_enter(&cpuset_rwsem.rss); + set_bit(CS_FAVOR_DYNMODS, &top_cpuset.flags); + } +} =20 /* * cpuset_css_alloc - allocate a cpuset css @@ -3099,6 +3121,14 @@ static int cpuset_css_online(struct cgroup_subsys_st= ate *css) cpus_read_lock(); percpu_down_write(&cpuset_rwsem); =20 + /* + * Check dynmod state on the first css_online() call. + */ + if (unlikely(!dynmods_checked)) { + cpuset_check_dynmods(cpuset_cgrp_subsys.root); + dynmods_checked =3D true; + } + set_bit(CS_ONLINE, &cs->flags); if (is_spread_page(parent)) set_bit(CS_SPREAD_PAGE, &cs->flags); @@ -3201,6 +3231,12 @@ static void cpuset_css_free(struct cgroup_subsys_sta= te *css) =20 static void cpuset_bind(struct cgroup_subsys_state *root_css) { + /* + * Reset dynmods_checked to be evaluated again in the next + * cpuset_css_online() + */ + dynmods_checked =3D false; + percpu_down_write(&cpuset_rwsem); spin_lock_irq(&callback_lock); =20 --=20 2.31.1