From nobody Wed Feb 11 13:06:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C1B1C6FD1D for ; Wed, 15 Mar 2023 08:50:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231799AbjCOIuC (ORCPT ); Wed, 15 Mar 2023 04:50:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229487AbjCOItz (ORCPT ); Wed, 15 Mar 2023 04:49:55 -0400 Received: from mail-pf1-x433.google.com (mail-pf1-x433.google.com [IPv6:2607:f8b0:4864:20::433]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CA0B5D468 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pf1-x433.google.com with SMTP id c4so11279628pfl.0 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YkxChdLDa4UFijk1Jheg8QY/La3zPetPy6md4h64POE=; b=QjRMzeMPM7bEXlB6ydwS837/NFUyQ8FTlay7FyNeoNs0VDwU6tSXowOh5PEijcS3y8 sD6nxAC6h4K1cRCQD9qu0IFALoN8EolNk7Bid2nRlhUXSKUTeDvUpAvWMnwt+HPHUQx/ KyyxJcLdPcRikCzCFdYvFjT+Mqx43i8pHeTDS+TQTsjpdDOaSGXHR1WhZ00ymwBzz/6L oUBDFOeHVrVNwCUzNaMl/ntSSHv31y6BC2N0XzgLSBSKpat6+edLAyQmzHqhhsxZ7a1M lj1RUSkyW5Ia0Sp/kwGQHE1me5OM6WQIgiCDHeaI+wyfJ3J//4uGOBLUlhHOul1FoJei V5fQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YkxChdLDa4UFijk1Jheg8QY/La3zPetPy6md4h64POE=; b=uWZxx9xya5r0b9RfPfAt+WfiC1V1+4/nKB84LI4sVuS1csB/mVdWzWe4MiWGYJKTG0 JCtFrC6wRneIi2u0P1b3S27AtKknC4WJ2t595C4PjWJLGiaw3A7SL8tNsdBfh++8XftO 9Ehi1TTL2/UpU7fHAMNzHgtkYD+PcnDKe1XyofKyHD81ODeLYv0JXG4VkYzxnIWE+iqX vj39/qGDmN2XxYFrv/1VK3CVui6HEN4SJAqPTCPJnLzWkiq+fFvk9hJfJWN/qbxyQ1EY zCRSUKZDB7vFwfe1e9Ocx4WiuacJjGTCF5XUwp3OeYmLOmxoCcUN0WLM3DWOGsDjubo9 paCw== X-Gm-Message-State: AO0yUKV2X7pxHUUv/miXAE7PCM715sm6uqsGiFD5prWZudrqoxIL30JY /bVTl4fFyV6Na3PxP7tpNmVu1LKk2QSta1EjwWU= X-Google-Smtp-Source: AK7set+PAG3sniqSn+uEJTES+NuHK2ZCXd+V9ulDUeNZd08I0VS+GAnsdTPkVoBU/h/tKlRjkhWWSg== X-Received: by 2002:aa7:8ec1:0:b0:625:8217:15b9 with SMTP id b1-20020aa78ec1000000b00625821715b9mr3081112pfr.2.1678870191951; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id i17-20020aa787d1000000b0058d9058fe8asm2966024pfo.103.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeS-2N; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6L-0D; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 1/4] cpumask: introduce for_each_cpu_or Date: Wed, 15 Mar 2023 19:49:35 +1100 Message-Id: <20230315084938.2544737-2-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dave Chinner Equivalent of for_each_cpu_and, except it ORs the two masks together so it iterates all the CPUs present in either mask. Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong --- include/linux/cpumask.h | 17 +++++++++++++++++ include/linux/find.h | 37 +++++++++++++++++++++++++++++++++++++ lib/find_bit.c | 9 +++++++++ 3 files changed, 63 insertions(+) diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h index d4901ca8883c..ca736b05ec7b 100644 --- a/include/linux/cpumask.h +++ b/include/linux/cpumask.h @@ -350,6 +350,23 @@ unsigned int __pure cpumask_next_wrap(int n, const str= uct cpumask *mask, int sta #define for_each_cpu_andnot(cpu, mask1, mask2) \ for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_= cpumask_bits) =20 +/** + * for_each_cpu_or - iterate over every cpu present in either mask + * @cpu: the (optionally unsigned) integer iterator + * @mask1: the first cpumask pointer + * @mask2: the second cpumask pointer + * + * This saves a temporary CPU mask in many places. It is equivalent to: + * struct cpumask tmp; + * cpumask_or(&tmp, &mask1, &mask2); + * for_each_cpu(cpu, &tmp) + * ... + * + * After the loop, cpu is >=3D nr_cpu_ids. + */ +#define for_each_cpu_or(cpu, mask1, mask2) \ + for_each_or_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), small_cpum= ask_bits) + /** * cpumask_any_but - return a "random" in a cpumask, but not this one. * @mask: the cpumask to search diff --git a/include/linux/find.h b/include/linux/find.h index 4647864a5ffd..5e4f39ef2e72 100644 --- a/include/linux/find.h +++ b/include/linux/find.h @@ -14,6 +14,8 @@ unsigned long _find_next_and_bit(const unsigned long *add= r1, const unsigned long unsigned long nbits, unsigned long start); unsigned long _find_next_andnot_bit(const unsigned long *addr1, const unsi= gned long *addr2, unsigned long nbits, unsigned long start); +unsigned long _find_next_or_bit(const unsigned long *addr1, const unsigned= long *addr2, + unsigned long nbits, unsigned long start); unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long= nbits, unsigned long start); extern unsigned long _find_first_bit(const unsigned long *addr, unsigned l= ong size); @@ -127,6 +129,36 @@ unsigned long find_next_andnot_bit(const unsigned long= *addr1, } #endif =20 +#ifndef find_next_or_bit +/** + * find_next_or_bit - find the next set bit in either memory regions + * @addr1: The first address to base the search on + * @addr2: The second address to base the search on + * @size: The bitmap size in bits + * @offset: The bitnumber to start searching at + * + * Returns the bit number for the next set bit + * If no bits are set, returns @size. + */ +static inline +unsigned long find_next_or_bit(const unsigned long *addr1, + const unsigned long *addr2, unsigned long size, + unsigned long offset) +{ + if (small_const_nbits(size)) { + unsigned long val; + + if (unlikely(offset >=3D size)) + return size; + + val =3D (*addr1 | *addr2) & GENMASK(size - 1, offset); + return val ? __ffs(val) : size; + } + + return _find_next_or_bit(addr1, addr2, size, offset); +} +#endif + #ifndef find_next_zero_bit /** * find_next_zero_bit - find the next cleared bit in a memory region @@ -536,6 +568,11 @@ unsigned long find_next_bit_le(const void *addr, unsig= ned (bit) =3D find_next_andnot_bit((addr1), (addr2), (size), (bit)), (bi= t) < (size);\ (bit)++) =20 +#define for_each_or_bit(bit, addr1, addr2, size) \ + for ((bit) =3D 0; \ + (bit) =3D find_next_or_bit((addr1), (addr2), (size), (bit)), (bit) <= (size);\ + (bit)++) + /* same as for_each_set_bit() but use bit as value to start with */ #define for_each_set_bit_from(bit, addr, size) \ for (; (bit) =3D find_next_bit((addr), (size), (bit)), (bit) < (size); (b= it)++) diff --git a/lib/find_bit.c b/lib/find_bit.c index c10920e66788..32f99e9a670e 100644 --- a/lib/find_bit.c +++ b/lib/find_bit.c @@ -182,6 +182,15 @@ unsigned long _find_next_andnot_bit(const unsigned lon= g *addr1, const unsigned l EXPORT_SYMBOL(_find_next_andnot_bit); #endif =20 +#ifndef find_next_or_bit +unsigned long _find_next_or_bit(const unsigned long *addr1, const unsigned= long *addr2, + unsigned long nbits, unsigned long start) +{ + return FIND_NEXT_BIT(addr1[idx] | addr2[idx], /* nop */, nbits, start); +} +EXPORT_SYMBOL(_find_next_or_bit); +#endif + #ifndef find_next_zero_bit unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long= nbits, unsigned long start) --=20 2.39.2 From nobody Wed Feb 11 13:06:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47C79C61DA4 for ; Wed, 15 Mar 2023 08:50:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231835AbjCOIuJ (ORCPT ); Wed, 15 Mar 2023 04:50:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231465AbjCOIt4 (ORCPT ); Wed, 15 Mar 2023 04:49:56 -0400 Received: from mail-pf1-x429.google.com (mail-pf1-x429.google.com [IPv6:2607:f8b0:4864:20::429]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8EE4658B40 for ; Wed, 15 Mar 2023 01:49:53 -0700 (PDT) Received: by mail-pf1-x429.google.com with SMTP id n16so5097773pfa.12 for ; Wed, 15 Mar 2023 01:49:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tkFj6yYLyA565/QhOvX6PgrnENIUFczCaPlTlQYQGRk=; b=P8lBfpG7IED6FPgo2ZB7apKwy3nUll9jUKZ8XzlrGpjDtYG8AbCcwo7zvIxIedMAi2 H6VZUAxroyRe59ECAAvynvkq8QSQkYykDwiNYazLkZA+lH3wCju9gPC+hT8a43f2BmoA j9RpQWC41i0mNupI+oXacSSvjKAGJ4+vgvsFXxD4G/haDpYnbt0PZ0RSkuEWQjljHrLG hGZcIKjas3JKkHHKlgv2t0oVs/hWgaL7pp9XDcpTbRs3hXgx0O3YC0Xiy7NF2n3Ej7tU RslGq9jpYn6NsD498EG4rurP0SM0c4G4pWSHhohSvWR1DWUZXDA7LXMdnh2+YmirVJdi 9A6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tkFj6yYLyA565/QhOvX6PgrnENIUFczCaPlTlQYQGRk=; b=diglQXSkc3wFwncCbAr2Jt1LV+daWJsEAlBr+83J5sVHOsInM+H+PFIgjYvyB8PjZs tC0fw6mynlt7XY0WXM2uET/lAletqRHxktNwD1HnKPNnojISIbTWhnqowhVdfJeQyUbq ccTvNDxLcCxqDbPeasiiblu8eY+qufQnAOq702MA9x8nfFXiKuRTLTv370BD0jeWV+45 7vM0XMzdWERgKWNnlA5+yHqvXt8Ol8CHlrh7JgAPXO3H+z5SKltiVyBU5wp4N5f96ocR Q+oZSrnDc+6knoIWJt6HpwO7gT3TcE1A56701RSTbHQa5DEyi7/V3X8fLB0lxRZuG+Fp SNXw== X-Gm-Message-State: AO0yUKXBnc7nTGeXrhrRTUA8RPN33JcBb7bm1Rf788H/4PCV5lt2u/jh nK1hfuKs6vaGr7IAQ5KXeOm23sibUdNSi3BYK7c= X-Google-Smtp-Source: AK7set9m56zOUn9/VDRZB2SidWtzW+u6ZCwpdH8KJIhu6v/dpLkJCqq28QRPyVmDBf4S/L86vLXNpQ== X-Received: by 2002:a05:6a00:41:b0:625:5560:696e with SMTP id i1-20020a056a00004100b006255560696emr5343794pfk.16.1678870192212; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id h11-20020a62b40b000000b0062505afff9fsm2971980pfn.126.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeT-3D; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6P-0I; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 2/4] pcpcntrs: fix dying cpu summation race Date: Wed, 15 Mar 2023 19:49:36 +1100 Message-Id: <20230315084938.2544737-3-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dave Chinner In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") a race condition between a cpu dying and percpu_counter_sum() iterating online CPUs was identified. The solution was to iterate all possible CPUs for summation via percpu_counter_sum_all(). We recently had a percpu_counter_sum() call in XFS trip over this same race condition and it fired a debug assert because the filesystem was unmounting and the counter *should* be zero just before we destroy it. That was reported here: https://lore.kernel.org/linux-kernel/20230314090649.326642-1-yebin@huaweicl= oud.com/ likely as a result of running generic/648 which exercises filesystems in the presence of CPU online/offline events. The solution to use percpu_counter_sum_all() is an awful one. We use percpu counters and percpu_counter_sum() for accurate and reliable threshold detection for space management, so a summation race condition during these operations can result in overcommit of available space and that may result in filesystem shutdowns. As percpu_counter_sum_all() iterates all possible CPUs rather than just those online or even those present, the mask can include CPUs that aren't even installed in the machine, or in the case of machines that can hot-plug CPU capable nodes, even have physical sockets present in the machine. Fundamentally, this race condition is caused by the CPU being offlined being removed from the cpu_online_mask before the notifier that cleans up per-cpu state is run. Hence percpu_counter_sum() will not sum the count for a cpu currently being taken offline, regardless of whether the notifier has run or not. This is the root cause of the bug. The percpu counter notifier iterates all the registered counters, locks the counter and moves the percpu count to the global sum. This is serialised against other operations that move the percpu counter to the global sum as well as percpu_counter_sum() operations that sum the percpu counts while holding the counter lock. Hence the notifier is safe to run concurrently with sum operations, and the only thing we actually need to care about is that percpu_counter_sum() iterates dying CPUs. That's trivial to do, and when there are no CPUs dying, it has no addition overhead except for a cpumask_or() operation. This change makes percpu_counter_sum() always do the right thing in the presence of CPU hot unplug events and makes percpu_counter_sum_all() unnecessary. This, in turn, means that filesystems like XFS, ext4, and btrfs don't have to work out when they should use percpu_counter_sum() vs percpu_counter_sum_all() in their space accounting algorithms Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong --- lib/percpu_counter.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index dba56c5c1837..0e096311e0c0 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -131,7 +131,7 @@ static s64 __percpu_counter_sum_mask(struct percpu_coun= ter *fbc, =20 raw_spin_lock_irqsave(&fbc->lock, flags); ret =3D fbc->count; - for_each_cpu(cpu, cpu_mask) { + for_each_cpu_or(cpu, cpu_online_mask, cpu_mask) { s32 *pcount =3D per_cpu_ptr(fbc->counters, cpu); ret +=3D *pcount; } @@ -141,11 +141,20 @@ static s64 __percpu_counter_sum_mask(struct percpu_co= unter *fbc, =20 /* * Add up all the per-cpu counts, return the result. This is a more accur= ate - * but much slower version of percpu_counter_read_positive() + * but much slower version of percpu_counter_read_positive(). + * + * We use the cpu mask of (cpu_online_mask | cpu_dying_mask) to capture su= ms + * from CPUs that are in the process of being taken offline. Dying cpus ha= ve + * been removed from the online mask, but may not have had the hotplug dead + * notifier called to fold the percpu count back into the global counter s= um. + * By including dying CPUs in the iteration mask, we avoid this race condi= tion + * so __percpu_counter_sum() just does the right thing when CPUs are being= taken + * offline. */ s64 __percpu_counter_sum(struct percpu_counter *fbc) { - return __percpu_counter_sum_mask(fbc, cpu_online_mask); + + return __percpu_counter_sum_mask(fbc, cpu_dying_mask); } EXPORT_SYMBOL(__percpu_counter_sum); =20 --=20 2.39.2 From nobody Wed Feb 11 13:06:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A9E2C6FD1D for ; Wed, 15 Mar 2023 08:50:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231740AbjCOIt7 (ORCPT ); Wed, 15 Mar 2023 04:49:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231789AbjCOIty (ORCPT ); Wed, 15 Mar 2023 04:49:54 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 810C75BD9D for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id x11so19240915pln.12 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3PiH3qSD87Gy2R0sEveTfXmyJns2RRHbdlIcrLyx7vY=; b=dGXEAZWGtw3hjf4evG9X6AHe1yTsnbPEjFBkPoP86wCPr8QN6TVnbPOULBMveWDVoT jiIftWXhqKYSzp0qmhThWBeyCKsD4Tm6Xp+7PMte7cJF7wgzVLSiYssGL15uSOr3lvfV rQd66fF+t9UQq3jhdoFYBw6GCsPIQuVOxn5UolXc0160hrbGsy9zlV4+B/qhV9yCDS7F H/CMxjUw1Z+tGw+1u0CeYT7yhOpK5oiWVK6GvR3bPJpgJzz75iWIvVxMm4iFmXRulZ4+ jvTJP5PpwGUCDBuh+DdFDKIgLX1jwAvPcxNGWCCVoZa7qwZqgDG/cKCvCt3mmTsN4n4r PO9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3PiH3qSD87Gy2R0sEveTfXmyJns2RRHbdlIcrLyx7vY=; b=NP3I6nxvDXaOVRBo+YZChppC+Nlw1O3H5vyIOxgt7aRMwWkGxt//PwDIPzJM6YCdMZ vo0IzsMOCRsrt+FYCuzlBMkGr2GxT2NjsRRWKyD3y2gQeBqMu6Ri5VTaDy9b+N352rAU SilaIbs8i/H7b4jPZtyTzlyKwqPznJGACuxWSlZ5EZl8HJZQ69lPmysDiSygPttQfW6f GLG1BWyCn2LpBPDxLa4sZs1gZvGA7WOcDVFdZAtLV6Am40yY27JdxmjQ5ByChi2UTccV z//pCx4R6AMzuh3lY0N08U7Qpx6efqx1oddsZ/4x+FRG1jIop+BxsAL9PKcrsMACgeCw AJcg== X-Gm-Message-State: AO0yUKUVOklWGN3GAjqmyqHRTh2DwmZh4tYm9zKhrM34yIG7sqHwIvm3 jWlTz6pz7oJY7saeDHc3QEtI6JOviemjq2S+vNY= X-Google-Smtp-Source: AK7set/d5shADEggOumzehf9o4A4WvFVUPs23yylPGUIObnfMjjp0YKBaZtfOr+A8wcBYihNDIQpPQ== X-Received: by 2002:a17:902:e541:b0:19a:70f9:affb with SMTP id n1-20020a170902e54100b0019a70f9affbmr2478057plf.2.1678870191825; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id kk5-20020a170903070500b0019a928a8982sm3096849plb.118.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeU-45; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6T-0N; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 3/4] fork: remove use of percpu_counter_sum_all Date: Wed, 15 Mar 2023 19:49:37 +1100 Message-Id: <20230315084938.2544737-4-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dave Chinner This effectively reverts the change made in commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") as the race condition percpu_counter_sum_all() was invented to avoid is now handled directly in percpu_counter_sum() and nobody needs to care about summing racing with cpu unplug anymore. Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong --- kernel/fork.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index d8cda4c6de6c..c0257cbee093 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -755,11 +755,6 @@ static void check_mm(struct mm_struct *mm) for (i =3D 0; i < NR_MM_COUNTERS; i++) { long x =3D percpu_counter_sum(&mm->rss_stat[i]); =20 - if (likely(!x)) - continue; - - /* Making sure this is not due to race with CPU offlining. */ - x =3D percpu_counter_sum_all(&mm->rss_stat[i]); if (unlikely(x)) pr_alert("BUG: Bad rss-counter state mm:%p type:%s val:%ld\n", mm, resident_page_types[i], x); --=20 2.39.2 From nobody Wed Feb 11 13:06:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4341C7618E for ; Wed, 15 Mar 2023 08:49:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231578AbjCOIt4 (ORCPT ); Wed, 15 Mar 2023 04:49:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231770AbjCOIty (ORCPT ); Wed, 15 Mar 2023 04:49:54 -0400 Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 30DA45BCA1 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) Received: by mail-pj1-x1030.google.com with SMTP id h12-20020a17090aea8c00b0023d1311fab3so1150053pjz.1 for ; Wed, 15 Mar 2023 01:49:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fromorbit-com.20210112.gappssmtp.com; s=20210112; t=1678870191; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DnnMb4Et7WhpEe76RPSGsI6qWidW0Ymzw0QUU6G4yao=; b=MldnmFi+QRjEcfycD9KY0bZvSKUHkvSL3UtA0U/PWvEzo8AB2GYhsOMJ9Yna93phsr 3pmBXGrj79AYyoeIfCO7JsUKNVP8tmqlS8mY/FTf5EtAxhPN1xd1Fm9/pzGq/ZdxfhQz eU2tuyfglF9o3N9mm4hhAzPHlUfBjcvnSq9Nb2s84Jjh61ufX2fH3XB7ShgyRL4j80Ax QyS9p698XnwvyGRgs0e4BrNF4ilCCsUmxrRDQcMuw618JIheaRe7R9Qmg/xhef7MwYkb GmW1c4Xe7Xs5vN1n3/NKD+0FvXsYvic5EsCMGBEC3z5N+9iBqt2u48HdxYZQHcZIPME0 idug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678870191; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DnnMb4Et7WhpEe76RPSGsI6qWidW0Ymzw0QUU6G4yao=; b=TOCfPVzWCZst/cKGnOi7IDqEWlI3YVQcbrEmcmo4kE3UDFLVpK32+H90PI4l3/8Wif mtGdA9CKA0IjXFaK3MqDuSSF12lWZDrOk8BTIBG/9ZfRU+ZJrCY8cvkToJDFPtP6p7Dc LdXEyXUii5N9ctvEOIiaTFKSxbjzsJA65o+WrqAO2IpKMtW6ngIRmpxCwWquZH7qfDka FuKmg2i8NH76Ni5XF/w6ZVn7G5SkpI7Br/QlVS97Oruvyn2zDsyNYBHMmJluGb1tVTDG ilqmJNTSOALxY1tyQ6c5f6LXUs3TH8HH2ffUlBUICut90mDFupJUABbzbDJe/KzQ/oUo +Sug== X-Gm-Message-State: AO0yUKWVsgE78ScfeIuwHSFGDzzE/TG3yZsNu5DPfsqxRBVG2PBkeR4F FDONjth5NueT6tvgaUhdeWtQzUWUcjQ+T7TeqOQ= X-Google-Smtp-Source: AK7set8ONEq6R2DicXybExxV1KAqb5LJDX736BB8nt9+Drt0g4Wcxf32rD3Zg0NHbWPaIIoqz/KYqA== X-Received: by 2002:a05:6a20:7d88:b0:cc:32a8:323d with SMTP id v8-20020a056a207d8800b000cc32a8323dmr15446982pzj.28.1678870191517; Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from dread.disaster.area (pa49-186-4-237.pa.vic.optusnet.com.au. [49.186.4.237]) by smtp.gmail.com with ESMTPSA id e29-20020a63501d000000b0050be183459bsm593527pgb.34.2023.03.15.01.49.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Mar 2023 01:49:51 -0700 (PDT) Received: from [192.168.253.23] (helo=devoid.disaster.area) by dread.disaster.area with esmtp (Exim 4.92.3) (envelope-from ) id 1pcMpQ-008zeV-50; Wed, 15 Mar 2023 19:49:48 +1100 Received: from dave by devoid.disaster.area with local (Exim 4.96) (envelope-from ) id 1pcMpQ-00Ag6X-0S; Wed, 15 Mar 2023 19:49:48 +1100 From: Dave Chinner To: linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org Cc: linux-mm@vger.kernel.org, linux-fsdevel@vger.kernel.org, yebin10@huawei.com Subject: [PATCH 4/4] pcpcntr: remove percpu_counter_sum_all() Date: Wed, 15 Mar 2023 19:49:38 +1100 Message-Id: <20230315084938.2544737-5-david@fromorbit.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230315084938.2544737-1-david@fromorbit.com> References: <20230315084938.2544737-1-david@fromorbit.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dave Chinner percpu_counter_sum_all() is now redundant as the race condition it was invented to handle is now dealt with by percpu_counter_sum() directly and all users of percpu_counter_sum_all() have been removed. Remove it. This effectively reverts the changes made in f689054aace2 ("percpu_counter: add percpu_counter_sum_all interface") except for the cpumask iteration that fixes percpu_counter_sum() made earlier in this series. Signed-off-by: Dave Chinner Reviewed-by: Darrick J. Wong --- include/linux/percpu_counter.h | 6 ----- lib/percpu_counter.c | 40 ++++++++++------------------------ 2 files changed, 11 insertions(+), 35 deletions(-) diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h index 521a733e21a9..75b73c83bc9d 100644 --- a/include/linux/percpu_counter.h +++ b/include/linux/percpu_counter.h @@ -45,7 +45,6 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 a= mount); void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch); s64 __percpu_counter_sum(struct percpu_counter *fbc); -s64 percpu_counter_sum_all(struct percpu_counter *fbc); int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batc= h); void percpu_counter_sync(struct percpu_counter *fbc); =20 @@ -196,11 +195,6 @@ static inline s64 percpu_counter_sum(struct percpu_cou= nter *fbc) return percpu_counter_read(fbc); } =20 -static inline s64 percpu_counter_sum_all(struct percpu_counter *fbc) -{ - return percpu_counter_read(fbc); -} - static inline bool percpu_counter_initialized(struct percpu_counter *fbc) { return true; diff --git a/lib/percpu_counter.c b/lib/percpu_counter.c index 0e096311e0c0..5004463c4f9f 100644 --- a/lib/percpu_counter.c +++ b/lib/percpu_counter.c @@ -122,23 +122,6 @@ void percpu_counter_sync(struct percpu_counter *fbc) } EXPORT_SYMBOL(percpu_counter_sync); =20 -static s64 __percpu_counter_sum_mask(struct percpu_counter *fbc, - const struct cpumask *cpu_mask) -{ - s64 ret; - int cpu; - unsigned long flags; - - raw_spin_lock_irqsave(&fbc->lock, flags); - ret =3D fbc->count; - for_each_cpu_or(cpu, cpu_online_mask, cpu_mask) { - s32 *pcount =3D per_cpu_ptr(fbc->counters, cpu); - ret +=3D *pcount; - } - raw_spin_unlock_irqrestore(&fbc->lock, flags); - return ret; -} - /* * Add up all the per-cpu counts, return the result. This is a more accur= ate * but much slower version of percpu_counter_read_positive(). @@ -153,22 +136,21 @@ static s64 __percpu_counter_sum_mask(struct percpu_co= unter *fbc, */ s64 __percpu_counter_sum(struct percpu_counter *fbc) { + s64 ret; + int cpu; + unsigned long flags; =20 - return __percpu_counter_sum_mask(fbc, cpu_dying_mask); + raw_spin_lock_irqsave(&fbc->lock, flags); + ret =3D fbc->count; + for_each_cpu_or(cpu, cpu_online_mask, cpu_dying_mask) { + s32 *pcount =3D per_cpu_ptr(fbc->counters, cpu); + ret +=3D *pcount; + } + raw_spin_unlock_irqrestore(&fbc->lock, flags); + return ret; } EXPORT_SYMBOL(__percpu_counter_sum); =20 -/* - * This is slower version of percpu_counter_sum as it traverses all possib= le - * cpus. Use this only in the cases where accurate data is needed in the - * presense of CPUs getting offlined. - */ -s64 percpu_counter_sum_all(struct percpu_counter *fbc) -{ - return __percpu_counter_sum_mask(fbc, cpu_possible_mask); -} -EXPORT_SYMBOL(percpu_counter_sum_all); - int __percpu_counter_init(struct percpu_counter *fbc, s64 amount, gfp_t gf= p, struct lock_class_key *key) { --=20 2.39.2