From nobody Thu Apr 9 03:15:32 2026
From: Leonardo Bras
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Johannes Weiner,
	Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song,
	Andrew Morton, Frederic Weisbecker, Leonardo Bras, Phil Auld,
	Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v1 1/3] sched/isolation: Add housekeeping_any_cpu_from()
Date: Tue, 1 Nov 2022 23:02:41 -0300
Message-Id: <20221102020243.522358-2-leobras@redhat.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221102020243.522358-1-leobras@redhat.com>
References: <20221102020243.522358-1-leobras@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Today, housekeeping_any_cpu() returns a housekeeping cpu near the
current one. This is useful for delegating tasks to other cpus when the
current one is isolated. It also prefers cpus in the same NUMA node as
the current cpu, so any memory activity could be faster on NUMA systems.

However, there is no equivalent function to find housekeeping cpus in
the same NUMA node as another, arbitrary CPU.

Change housekeeping_any_cpu() into housekeeping_any_cpu_from(), so it
accepts a cpu_start parameter and can find cpus in the same NUMA node as
any given CPU.

Also, reimplement housekeeping_any_cpu() as an inline function that
calls housekeeping_any_cpu_from() with cpu_start = current cpu.

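For illustration only, here is a user-space sketch of the intended
relationship between the two helpers. It is not kernel code: the toy_*
names, the 8-CPU topology and the housekeeping mask are all hypothetical,
and the "same node first" search is a simplification of what
sched_numa_find_closest() does in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Hypothetical topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1. */
static const int toy_node_of[TOY_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Hypothetical housekeeping mask: only CPUs 0 and 5 do housekeeping. */
static const bool toy_housekeeping[TOY_NR_CPUS] = {
	true, false, false, false, false, true, false, false
};

/*
 * Toy stand-in for housekeeping_any_cpu_from(): prefer a housekeeping CPU
 * on the same node as cpu_start, then any housekeeping CPU, and finally
 * fall back to cpu_start itself.
 */
static int toy_housekeeping_any_cpu_from(int cpu_start)
{
	int cpu;

	assert(cpu_start >= 0 && cpu_start < TOY_NR_CPUS);

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_housekeeping[cpu] &&
		    toy_node_of[cpu] == toy_node_of[cpu_start])
			return cpu;

	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_housekeeping[cpu])
			return cpu;

	return cpu_start;
}

/* The old entry point becomes a thin wrapper over the new one. */
static int toy_housekeeping_any_cpu(int current_cpu)
{
	return toy_housekeeping_any_cpu_from(current_cpu);
}
```

With this toy mask, starting from CPU 2 selects CPU 0 and starting from
CPU 7 selects CPU 5, regardless of which CPU the caller runs on.
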
Signed-off-by: Leonardo Bras
---
 include/linux/sched/isolation.h | 11 ++++++++---
 kernel/sched/isolation.c        |  8 ++++----
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index 8c15abd67aed9..95b65be44f19f 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -20,7 +20,7 @@ enum hk_type {
 
 #ifdef CONFIG_CPU_ISOLATION
 DECLARE_STATIC_KEY_FALSE(housekeeping_overridden);
-extern int housekeeping_any_cpu(enum hk_type type);
+extern int housekeeping_any_cpu_from(enum hk_type type, int cpu_start);
 extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
 extern bool housekeeping_enabled(enum hk_type type);
 extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
@@ -29,9 +29,9 @@ extern void __init housekeeping_init(void);
 
 #else
 
-static inline int housekeeping_any_cpu(enum hk_type type)
+static inline int housekeeping_any_cpu_from(enum hk_type type, int cpu_start)
 {
-	return smp_processor_id();
+	return cpu_start;
 }
 
 static inline const struct cpumask *housekeeping_cpumask(enum hk_type type)
@@ -58,4 +58,9 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type type)
 	return true;
 }
 
+static inline int housekeeping_any_cpu(enum hk_type type)
+{
+	return housekeeping_any_cpu_from(type, smp_processor_id());
+}
+
 #endif /* _LINUX_SCHED_ISOLATION_H */
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 373d42c707bc5..6ebeac11bb350 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -36,22 +36,22 @@ bool housekeeping_enabled(enum hk_type type)
 }
 EXPORT_SYMBOL_GPL(housekeeping_enabled);
 
-int housekeeping_any_cpu(enum hk_type type)
+int housekeeping_any_cpu_from(enum hk_type type, int cpu_start)
 {
 	int cpu;
 
 	if (static_branch_unlikely(&housekeeping_overridden)) {
 		if (housekeeping.flags & BIT(type)) {
-			cpu = sched_numa_find_closest(housekeeping.cpumasks[type], smp_processor_id());
+			cpu = sched_numa_find_closest(housekeeping.cpumasks[type], cpu_start);
 			if (cpu < nr_cpu_ids)
 				return cpu;
 
 			return cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
 		}
 	}
-	return smp_processor_id();
+	return cpu_start;
 }
-EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
+EXPORT_SYMBOL_GPL(housekeeping_any_cpu_from);
 
 const struct cpumask *housekeeping_cpumask(enum hk_type type)
 {
-- 
2.38.1

From nobody Thu Apr 9 03:15:32 2026
From: Leonardo Bras
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Johannes Weiner,
	Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song,
	Andrew Morton, Frederic Weisbecker, Leonardo Bras, Phil Auld,
	Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v1 2/3] mm/memcontrol: Change stock_lock type from
	local_lock_t to spinlock_t
Date: Tue, 1 Nov 2022 23:02:42 -0300
Message-Id: <20221102020243.522358-3-leobras@redhat.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221102020243.522358-1-leobras@redhat.com>
References: <20221102020243.522358-1-leobras@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

In this context, since the code uses per-cpu variables, changing from
local_lock to spinlock should not have much impact on performance, and
it allows operations such as stock draining to happen on remote cpus.

Why performance would probably not be impacted:

1 - Since the lock is in the same cache line as the information that is
    used next, there is no extra memory access for caching the lock.
2 - Since it's a percpu struct, there should be no other cpu sharing
    this cacheline, so there is no need for cacheline invalidation, and
    writing to the lock should be as fast as writing to the next struct
    members.
3 - Even the write in (2) could be pipelined and batched with following
    writes to the cacheline (such as the nr_pages member), further
    decreasing the impact of this change.

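For illustration only, here is a user-space sketch of why a lock embedded
in the per-cpu struct makes remote draining possible. It is not kernel
code: the toy_* names are hypothetical, the array stands in for per-cpu
storage, and a C11 atomic_flag stands in for spinlock_t (no IRQ handling
is modelled).

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_NR_CPUS 4

/* Toy per-CPU stock: the lock sits next to the data it protects. */
struct toy_stock {
	atomic_flag lock;	/* hypothetical stand-in for spinlock_t */
	unsigned int nr_pages;
};

#define TOY_STOCK_INIT { .lock = ATOMIC_FLAG_INIT, .nr_pages = 0 }

static struct toy_stock toy_stocks[TOY_NR_CPUS] = {
	TOY_STOCK_INIT, TOY_STOCK_INIT, TOY_STOCK_INIT, TOY_STOCK_INIT
};

static void toy_lock(struct toy_stock *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void toy_unlock(struct toy_stock *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Add pages to the stock of the given CPU. */
static void toy_refill(int cpu, unsigned int nr_pages)
{
	struct toy_stock *s;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	s = &toy_stocks[cpu];

	toy_lock(s);
	s->nr_pages += nr_pages;
	toy_unlock(s);
}

/*
 * Drain the stock of any CPU, local or remote, and return how many pages
 * were cached. Because the lock is addressed through the struct pointer,
 * the caller does not need to be running on that CPU.
 */
static unsigned int toy_drain(int cpu)
{
	struct toy_stock *s;
	unsigned int drained;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	s = &toy_stocks[cpu];

	toy_lock(s);
	drained = s->nr_pages;
	s->nr_pages = 0;
	toy_unlock(s);

	return drained;
}
```

A local_lock only serializes against code running on the owning CPU, so
the toy_drain(cpu) pattern above would not be safe with it for a remote
cpu index.
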
Suggested-by: Marcelo Tosatti
Signed-off-by: Leonardo Bras
---
 mm/memcontrol.c | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2d8549ae1b300..add46da2e6df1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2167,7 +2167,7 @@ void unlock_page_memcg(struct page *page)
 }
 
 struct memcg_stock_pcp {
-	local_lock_t stock_lock;
+	spinlock_t stock_lock; /* Protects the percpu struct */
 	struct mem_cgroup *cached; /* this never be root cgroup */
 	unsigned int nr_pages;
 
@@ -2184,7 +2184,7 @@ struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE	0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
-	.stock_lock = INIT_LOCAL_LOCK(stock_lock),
+	.stock_lock = __SPIN_LOCK_UNLOCKED(stock_lock),
 };
 static DEFINE_MUTEX(percpu_charge_mutex);
 
@@ -2229,15 +2229,15 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 	if (nr_pages > MEMCG_CHARGE_BATCH)
 		return ret;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
-
 	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
+
 	if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
 		stock->nr_pages -= nr_pages;
 		ret = true;
 	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 
 	return ret;
 }
@@ -2274,14 +2274,14 @@ static void drain_local_stock(struct work_struct *dummy)
 	 * drain_stock races is that we always operate on local CPU stock
 	 * here with IRQ disabled
 	 */
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
-
 	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
+
 	old = drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 	if (old)
 		obj_cgroup_put(old);
 }
@@ -2309,10 +2309,12 @@ static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
+	struct memcg_stock_pcp *stock;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
+	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
 	__refill_stock(memcg, nr_pages);
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 }
 
 /*
@@ -3157,8 +3159,8 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	unsigned long flags;
 	int *bytes;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
 
 	/*
 	 * Save vmstat data in stock and skip vmstat array update unless
@@ -3210,7 +3212,7 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	if (nr)
 		mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 	if (old)
 		obj_cgroup_put(old);
 }
@@ -3221,15 +3223,15 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	unsigned long flags;
 	bool ret = false;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
-
 	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
+
 	if (objcg == stock->cached_objcg && stock->nr_bytes >= nr_bytes) {
 		stock->nr_bytes -= nr_bytes;
 		ret = true;
 	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 
 	return ret;
 }
@@ -3319,9 +3321,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 	unsigned long flags;
 	unsigned int nr_pages = 0;
 
-	local_lock_irqsave(&memcg_stock.stock_lock, flags);
-
 	stock = this_cpu_ptr(&memcg_stock);
+	spin_lock_irqsave(&stock->stock_lock, flags);
+
 	if (stock->cached_objcg != objcg) { /* reset if necessary */
 		old = drain_obj_stock(stock);
 		obj_cgroup_get(objcg);
@@ -3337,7 +3339,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
+	spin_unlock_irqrestore(&stock->stock_lock, flags);
 	if (old)
 		obj_cgroup_put(old);
 
-- 
2.38.1

From nobody Thu Apr 9 03:15:32 2026
From: Leonardo Bras
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Johannes Weiner,
	Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song,
	Andrew Morton, Frederic Weisbecker, Leonardo Bras, Phil Auld,
	Marcelo Tosatti
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v1 3/3] mm/memcontrol: Add drain_remote_stock(), avoid
	drain_stock on isolated cpus
Date: Tue, 1 Nov 2022 23:02:43 -0300
Message-Id: <20221102020243.522358-4-leobras@redhat.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221102020243.522358-1-leobras@redhat.com>
References: <20221102020243.522358-1-leobras@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

When drain_all_stock() is called, some CPUs will be required to have
their per-CPU caches drained. This currently happens by scheduling a
call to drain_local_stock() to run on each affected CPU.

As a consequence, work may end up being scheduled on CPUs that are
isolated, and that should therefore be interrupted as little as
possible.

To avoid this, make drain_all_stock() able to detect isolated CPUs and
schedule the draining of their per-CPU stock on another, non-isolated
CPU.

Since the current implementation only allows the drain to happen on the
local CPU, implement a function to drain the stock of a remote CPU:
drain_remote_stock().

Given that drain_local_stock() and drain_remote_stock() do almost the
same work, implement an inline drain_stock_helper() that is called by
both.

Also, since drain_stock() will now be able to run on a remote CPU,
protect memcg_hotplug_cpu_dead() with stock_lock.

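For illustration only, here is a user-space sketch of the dispatch
decision drain_all_stock() is meant to make for each CPU. It is not
kernel code: the toy_* names and the 4-CPU housekeeping mask are
hypothetical, and picking the first housekeeping CPU is a simplification
of the NUMA-aware search done by housekeeping_any_cpu_from().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical mask: CPUs 0 and 1 are housekeeping, 2 and 3 are isolated. */
static const bool toy_hk[TOY_NR_CPUS] = { true, true, false, false };

enum toy_drain_kind {
	TOY_DRAIN_INLINE,	/* drain directly, no work item queued */
	TOY_DRAIN_LOCAL_WORK,	/* queue work on the CPU owning the stock */
	TOY_DRAIN_REMOTE_WORK,	/* queue work on a housekeeping CPU instead */
};

struct toy_drain_plan {
	enum toy_drain_kind kind;
	int run_on;	/* CPU that executes the drain */
	int target;	/* CPU whose stock gets drained */
};

/* Simplified stand-in: first housekeeping CPU, else cpu_start itself. */
static int toy_any_hk_cpu_from(int cpu_start)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_hk[cpu])
			return cpu;
	return cpu_start;
}

/* Decide where the drain of cpu's stock runs when called from curcpu. */
static struct toy_drain_plan toy_plan_drain(int cpu, int curcpu)
{
	struct toy_drain_plan plan = { TOY_DRAIN_INLINE, cpu, cpu };

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);

	if (cpu == curcpu)
		return plan;		/* already here: drain inline */

	if (toy_hk[cpu]) {
		plan.kind = TOY_DRAIN_LOCAL_WORK;
		return plan;		/* housekeeping CPU drains itself */
	}

	/* Isolated CPU: hand the drain to a housekeeping CPU. */
	plan.kind = TOY_DRAIN_REMOTE_WORK;
	plan.run_on = toy_any_hk_cpu_from(cpu);
	return plan;
}
```

In this toy setup, draining CPU 3 from CPU 0 yields a remote plan that
runs on CPU 0 and targets CPU 3, so CPU 3 is never interrupted.
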
Signed-off-by: Leonardo Bras
---
 mm/memcontrol.c | 47 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index add46da2e6df1..7ad6e4f4b79ef 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -2263,7 +2264,7 @@ static void drain_stock(struct memcg_stock_pcp *stock)
 	stock->cached = NULL;
 }
 
-static void drain_local_stock(struct work_struct *dummy)
+static inline void drain_stock_helper(int cpu)
 {
 	struct memcg_stock_pcp *stock;
 	struct obj_cgroup *old = NULL;
@@ -2271,10 +2272,9 @@ static void drain_local_stock(struct work_struct *dummy)
 
 	/*
 	 * The only protection from cpu hotplug (memcg_hotplug_cpu_dead) vs.
-	 * drain_stock races is that we always operate on local CPU stock
-	 * here with IRQ disabled
+	 * drain_stock races is stock_lock, a percpu spinlock.
 	 */
-	stock = this_cpu_ptr(&memcg_stock);
+	stock = per_cpu_ptr(&memcg_stock, cpu);
 	spin_lock_irqsave(&stock->stock_lock, flags);
 
 	old = drain_obj_stock(stock);
@@ -2286,6 +2286,16 @@ static void drain_local_stock(struct work_struct *dummy)
 		obj_cgroup_put(old);
 }
 
+static void drain_remote_stock(struct work_struct *work)
+{
+	drain_stock_helper(atomic_long_read(&work->data));
+}
+
+static void drain_local_stock(struct work_struct *dummy)
+{
+	drain_stock_helper(smp_processor_id());
+}
+
 /*
  * Cache charges(val) to local per_cpu area.
  * This will be consumed by consume_stock() function, later.
@@ -2352,10 +2362,16 @@ static void drain_all_stock(struct mem_cgroup *root_memcg)
 
 		if (flush &&
 		    !test_and_set_bit(FLUSHING_CACHED_CHARGE, &stock->flags)) {
-			if (cpu == curcpu)
+			if (cpu == curcpu) {
 				drain_local_stock(&stock->work);
-			else
+			} else if (housekeeping_cpu(cpu, HK_TYPE_WQ)) {
 				schedule_work_on(cpu, &stock->work);
+			} else {
+				int hkcpu = housekeeping_any_cpu_from(HK_TYPE_WQ, cpu);
+
+				atomic_long_set(&stock->work.data, cpu);
+				schedule_work_on(hkcpu, &stock->work);
+			}
 		}
 	}
 	migrate_enable();
@@ -2367,7 +2383,9 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
 	struct memcg_stock_pcp *stock;
 
 	stock = &per_cpu(memcg_stock, cpu);
+	spin_lock(&stock->stock_lock);
 	drain_stock(stock);
+	spin_unlock(&stock->stock_lock);
 
 	return 0;
 }
@@ -7272,9 +7290,20 @@ static int __init mem_cgroup_init(void)
 	cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL,
 				  memcg_hotplug_cpu_dead);
 
-	for_each_possible_cpu(cpu)
-		INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
-			  drain_local_stock);
+	/*
+	 * CPUs that are isolated should not spend cpu time for stock draining,
+	 * so allow them to export this task to the nearest housekeeping enabled
+	 * cpu available.
+	 */
+	for_each_possible_cpu(cpu) {
+		if (housekeeping_cpu(cpu, HK_TYPE_WQ)) {
+			INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
+				  drain_local_stock);
+		} else {
+			INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work,
+				  drain_remote_stock);
+		}
+	}
 
 	for_each_node(node) {
 		struct mem_cgroup_tree_per_node *rtpn;
-- 
2.38.1