From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68367C61DA4 for ; Thu, 9 Feb 2023 15:34:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231269AbjBIPeu (ORCPT ); Thu, 9 Feb 2023 10:34:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42614 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230308AbjBIPeq (ORCPT ); Thu, 9 Feb 2023 10:34:46 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68B105C493 for ; Thu, 9 Feb 2023 07:33:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956804; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=7C1hmy1E7+0MQkK/mZzmUZ/gKoFVIBd7d2YZY/3tmcQ=; b=iWySdEmx9Bx5S4mWfCEJ+U4sCFcQiw7bYuXnyMszuy5GGxgyAKppQYYct13B/KSfvwLmVA psIiOVFdj3Iy4EwC45ecHCK7y0JX1B5/+r5YxlkWlhM3zQCH6z3X1IO7sN/4J1RauEU73X w6R4tMo3pPZ1+fQ78BAfXb3ztLEJk84= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-532-6n2BmqIWMciXOAiOSVn0Kg-1; Thu, 09 Feb 2023 10:33:17 -0500 X-MC-Unique: 6n2BmqIWMciXOAiOSVn0Kg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C0E6E802C1D; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 8DF9A2026D4B; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 6390E403CC06B; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.656996515@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:51 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 01/11] mm/vmstat: remove remote node draining References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Draining of pages from the local pcp for a remote zone was necessary since: "Note that remote node draining is a somewhat esoteric feature that is required on large NUMA systems because otherwise significant portions of system memory can become trapped in pcp queues. The number of pcp is determined by the number of processors and nodes in a system. A system with 4 processors and 2 nodes has 8 pcps which is okay. But a system with 1024 processors and 512 nodes has 512k pcps with a high potential for large amount of memory being caught in them." Since commit 443c2accd1b6679a1320167f8f56eed6536b806e ("mm/page_alloc: remotely drain per-cpu lists"), drain_all_pages() is able=20 to remotely free those pages when necessary. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/include/linux/mmzone.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D Acked-by: David Hildenbrand Suggested-by: Vlastimil Babka --- linux-vmstat-remote.orig/include/linux/mmzone.h +++ linux-vmstat-remote/include/linux/mmzone.h @@ -577,9 +577,6 @@ struct per_cpu_pages { int high; /* high watermark, emptying needed */ int batch; /* chunk size for buddy add/remove */ short free_factor; /* batch scaling factor during free */ -#ifdef CONFIG_NUMA - short expire; /* When 0, remote pagesets are drained */ -#endif =20 /* Lists of pages, one per migrate type stored on the pcp-lists */ struct list_head lists[NR_PCP_LISTS]; Index: linux-vmstat-remote/mm/vmstat.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/mm/vmstat.c +++ linux-vmstat-remote/mm/vmstat.c @@ -803,7 +803,7 @@ static int fold_diff(int *zone_diff, int * * The function returns the number of global counters updated. */ -static int refresh_cpu_vm_stats(bool do_pagesets) +static int refresh_cpu_vm_stats(void) { struct pglist_data *pgdat; struct zone *zone; @@ -814,9 +814,6 @@ static int refresh_cpu_vm_stats(bool do_ =20 for_each_populated_zone(zone) { struct per_cpu_zonestat __percpu *pzstats =3D zone->per_cpu_zonestats; -#ifdef CONFIG_NUMA - struct per_cpu_pages __percpu *pcp =3D zone->per_cpu_pageset; -#endif =20 for (i =3D 0; i < NR_VM_ZONE_STAT_ITEMS; i++) { int v; @@ -826,44 +823,8 @@ static int refresh_cpu_vm_stats(bool do_ =20 atomic_long_add(v, &zone->vm_stat[i]); global_zone_diff[i] +=3D v; -#ifdef CONFIG_NUMA - /* 3 seconds idle till flush */ - __this_cpu_write(pcp->expire, 3); -#endif } } -#ifdef CONFIG_NUMA - - if (do_pagesets) { - cond_resched(); - /* - * Deal with draining the remote pageset of this - * processor - * - * Check if there are pages remaining in this pageset - * if not then there is nothing to expire. - */ - if (!__this_cpu_read(pcp->expire) || - !__this_cpu_read(pcp->count)) - continue; - - /* - * We never drain zones local to this processor. - */ - if (zone_to_nid(zone) =3D=3D numa_node_id()) { - __this_cpu_write(pcp->expire, 0); - continue; - } - - if (__this_cpu_dec_return(pcp->expire)) - continue; - - if (__this_cpu_read(pcp->count)) { - drain_zone_pages(zone, this_cpu_ptr(pcp)); - changes++; - } - } -#endif } =20 for_each_online_pgdat(pgdat) { @@ -1864,7 +1825,7 @@ int sysctl_stat_interval __read_mostly =3D #ifdef CONFIG_PROC_FS static void refresh_vm_stats(struct work_struct *work) { - refresh_cpu_vm_stats(true); + refresh_cpu_vm_stats(); } =20 int vmstat_refresh(struct ctl_table *table, int write, @@ -1928,7 +1889,7 @@ int vmstat_refresh(struct ctl_table *tab =20 static void vmstat_update(struct work_struct *w) { - if (refresh_cpu_vm_stats(true)) { + if (refresh_cpu_vm_stats()) { /* * Counters were updated so we expect more updates * to occur in the future. Keep on running the @@ -1991,7 +1952,7 @@ void quiet_vmstat(void) * it would be too expensive from this path. * vmstat_shepherd will take care about that for us. */ - refresh_cpu_vm_stats(false); + refresh_cpu_vm_stats(); } =20 /* From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0222AC61DA4 for ; Thu, 9 Feb 2023 15:35:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231411AbjBIPfB (ORCPT ); Thu, 9 Feb 2023 10:35:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42494 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231129AbjBIPet (ORCPT ); Thu, 9 Feb 2023 10:34:49 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EEA165ACFC for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956802; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=iUvNynje4NLXuVViJqGVXz34mfwfY1r+EmC29iXiBlQ=; b=EwDBUPVWdyTl9lxExTL6/sPyvIxpIzni24vXNIPqEc517DE12Un9e1rZeaa6hW+mwYH/MV PTeoZnhPT5i5gxCmkt7IF/kLMyGnehOhk4UmVn7xZg5wM7lKvsMtcekBxwzhf/5BqVK2xz Sp2kHFDZxBkQxqoUhNqtPtU1ifTNGes= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-556-Qjur93AHP26bSv2h0FSKcg-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: Qjur93AHP26bSv2h0FSKcg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CC7EC85CCE9; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A2D7E40398A2; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 6718A403CC06F; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.683821550@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:52 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 02/11] this_cpu_cmpxchg: ARM64: switch this_cpu_cmpxchg to locked, add _local function References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Goal is to have vmstat_shepherd to transfer from per-CPU counters to global counters remotely. For this,=20 an atomic this_cpu_cmpxchg is necessary. Following the kernel convention for cmpxchg/cmpxchg_local, change ARM's this_cpu_cmpxchg_ helpers to be atomic, and add this_cpu_cmpxchg_local_ helpers which are not atomic. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/arch/arm64/include/asm/percpu.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/arch/arm64/include/asm/percpu.h +++ linux-vmstat-remote/arch/arm64/include/asm/percpu.h @@ -232,13 +232,23 @@ PERCPU_RET_OP(add, add, ldadd) _pcp_protect_return(xchg_relaxed, pcp, val) =20 #define this_cpu_cmpxchg_1(pcp, o, n) \ - _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) + _pcp_protect_return(cmpxchg, pcp, o, n) #define this_cpu_cmpxchg_2(pcp, o, n) \ - _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) + _pcp_protect_return(cmpxchg, pcp, o, n) #define this_cpu_cmpxchg_4(pcp, o, n) \ - _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) + _pcp_protect_return(cmpxchg, pcp, o, n) #define this_cpu_cmpxchg_8(pcp, o, n) \ + _pcp_protect_return(cmpxchg, pcp, o, n) + +#define this_cpu_cmpxchg_local_1(pcp, o, n) \ _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) +#define this_cpu_cmpxchg_local_2(pcp, o, n) \ + _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) +#define this_cpu_cmpxchg_local_4(pcp, o, n) \ + _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) +#define this_cpu_cmpxchg_local_8(pcp, o, n) \ + _pcp_protect_return(cmpxchg_relaxed, pcp, o, n) + =20 #ifdef __KVM_NVHE_HYPERVISOR__ extern unsigned long __hyp_per_cpu_offset(unsigned int cpu); From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92F36C61DA4 for ; Thu, 9 Feb 2023 15:34:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231367AbjBIPeY (ORCPT ); Thu, 9 Feb 2023 10:34:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42490 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230470AbjBIPeJ (ORCPT ); Thu, 9 Feb 2023 10:34:09 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95B6C38B55 for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=1qMXUw9dsgBVhjNMR0OfieQvkRg7GOetq5N/8/ayAmg=; b=O9IyU574Fv4knMpc7dsljjE9ZyEP581ejvihoL3kF99WyEJqOQ3qBcbWlCccyfoC0KNjmF mEp7GrkzzMr5Oi+V9p72IrW8Lac0c9PkkLL1VcUT600EBYzmymt3t2piJa576KBMx7t4AL fXGXJSzfyWIPBNA7GiG2/64uO/FNE5I= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-556-gM0BUYDHOl-4dDHK3bUZLg-1; Thu, 09 Feb 2023 10:33:17 -0500 X-MC-Unique: gM0BUYDHOl-4dDHK3bUZLg-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D00932807D6A; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9FE17440BC; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 6B645403CC071; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.711672794@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:53 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 03/11] this_cpu_cmpxchg: loongarch: switch this_cpu_cmpxchg to locked, add _local function References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Goal is to have vmstat_shepherd to transfer from per-CPU counters to global counters remotely. For this, an atomic this_cpu_cmpxchg is necessary. Following the kernel convention for cmpxchg/cmpxchg_local, add this_cpu_cmpxchg_local helpers to Loongarch. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/arch/loongarch/include/asm/percpu.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/arch/loongarch/include/asm/percpu.h +++ linux-vmstat-remote/arch/loongarch/include/asm/percpu.h @@ -150,6 +150,16 @@ static inline unsigned long __percpu_xch } =20 /* this_cpu_cmpxchg */ +#define _protect_cmpxchg(pcp, o, n) \ +({ \ + typeof(*raw_cpu_ptr(&(pcp))) __ret; \ + preempt_disable_notrace(); \ + __ret =3D cmpxchg(raw_cpu_ptr(&(pcp)), o, n); \ + preempt_enable_notrace(); \ + __ret; \ +}) + +/* this_cpu_cmpxchg_local */ #define _protect_cmpxchg_local(pcp, o, n) \ ({ \ typeof(*raw_cpu_ptr(&(pcp))) __ret; \ @@ -222,10 +232,15 @@ do { \ #define this_cpu_xchg_4(pcp, val) _percpu_xchg(pcp, val) #define this_cpu_xchg_8(pcp, val) _percpu_xchg(pcp, val) =20 -#define this_cpu_cmpxchg_1(ptr, o, n) _protect_cmpxchg_local(ptr, o, n) -#define this_cpu_cmpxchg_2(ptr, o, n) _protect_cmpxchg_local(ptr, o, n) -#define this_cpu_cmpxchg_4(ptr, o, n) _protect_cmpxchg_local(ptr, o, n) -#define this_cpu_cmpxchg_8(ptr, o, n) _protect_cmpxchg_local(ptr, o, n) +#define this_cpu_cmpxchg_local_1(ptr, o, n) _protect_cmpxchg_local(ptr, o,= n) +#define this_cpu_cmpxchg_local_2(ptr, o, n) _protect_cmpxchg_local(ptr, o,= n) +#define this_cpu_cmpxchg_local_4(ptr, o, n) _protect_cmpxchg_local(ptr, o,= n) +#define this_cpu_cmpxchg_local_8(ptr, o, n) _protect_cmpxchg_local(ptr, o,= n) + +#define this_cpu_cmpxchg_1(ptr, o, n) _protect_cmpxchg(ptr, o, n) +#define this_cpu_cmpxchg_2(ptr, o, n) _protect_cmpxchg(ptr, o, n) +#define this_cpu_cmpxchg_4(ptr, o, n) _protect_cmpxchg(ptr, o, n) +#define this_cpu_cmpxchg_8(ptr, o, n) _protect_cmpxchg(ptr, o, n) =20 #include From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0E8AC636D6 for ; Thu, 9 Feb 2023 15:34:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231389AbjBIPe4 (ORCPT ); Thu, 9 Feb 2023 10:34:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42606 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230435AbjBIPet (ORCPT ); Thu, 9 Feb 2023 10:34:49 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D53D5C48F for ; Thu, 9 Feb 2023 07:33:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956804; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=FePX2+VFBr01iZwaVbBM1HjdLAtX/eNoAgkcl35m848=; b=DueGOE/WgGjuikn8AERpx5fogV0gjbbI5u+9lzu2vp86IdhcyEcZKBkMeRtDD2iZgVnelP NewLAoEBMw/s44r2qQMyvQBALYDm0gKeCYT+jJRh8qQ17ZPtDgSbN0H2nmDrL9aoNk5tBb iAzVJas9iO3TD9GZhAYHoghwQ4dWfq0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-467-f84dd5UROTGCJx2buXnEzg-1; Thu, 09 Feb 2023 10:33:20 -0500 X-MC-Unique: f84dd5UROTGCJx2buXnEzg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CFC68104F0D0; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A28292026D76; Thu, 9 Feb 2023 15:33:14 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 6F68A403CC073; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.737772842@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:54 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 04/11] this_cpu_cmpxchg: S390: switch this_cpu_cmpxchg to locked, add _local function References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Goal is to have vmstat_shepherd to transfer from per-CPU counters to global counters remotely. For this, an atomic this_cpu_cmpxchg is necessary. Following the kernel convention for cmpxchg/cmpxchg_local, add S390's this_cpu_cmpxchg_local. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/arch/s390/include/asm/percpu.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/arch/s390/include/asm/percpu.h +++ linux-vmstat-remote/arch/s390/include/asm/percpu.h @@ -148,6 +148,11 @@ #define this_cpu_cmpxchg_4(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, ova= l, nval) #define this_cpu_cmpxchg_8(pcp, oval, nval) arch_this_cpu_cmpxchg(pcp, ova= l, nval) =20 +#define this_cpu_cmpxchg_local_1(pcp, oval, nval) arch_this_cpu_cmpxchg(pc= p, oval, nval) +#define this_cpu_cmpxchg_local_2(pcp, oval, nval) arch_this_cpu_cmpxchg(pc= p, oval, nval) +#define this_cpu_cmpxchg_local_4(pcp, oval, nval) arch_this_cpu_cmpxchg(pc= p, oval, nval) +#define this_cpu_cmpxchg_local_8(pcp, oval, nval) arch_this_cpu_cmpxchg(pc= p, oval, nval) + #define arch_this_cpu_xchg(pcp, nval) \ ({ \ typeof(pcp) *ptr__; \ From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38837C636D6 for ; Thu, 9 Feb 2023 15:34:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230292AbjBIPeN (ORCPT ); Thu, 9 Feb 2023 10:34:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230022AbjBIPeH (ORCPT ); Thu, 9 Feb 2023 10:34:07 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68196144B7 for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=hjd5gtUB1FFqfcJWUv9qrW5kcd1bszAAtI+X6kwFBXk=; b=G6qcjjajdVmVDzxZqne2LsGRPlIf+l8TNlOG7Y30iM3hvqULTMLiqQ5QrxRvdpE1OBfzzJ kUOyDwy7sonVNsw0x7Msv8C8uagxvow5rj9//6W8H37q7HoT7l4eEOXBmXdV24vIFIEgYf nVP1DS2jApkOXNyOcB84t2B3Bmwiy0g= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-311-unXOuOmGPT2X0gACJlUVeQ-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: unXOuOmGPT2X0gACJlUVeQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 98BE93C10EDA; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 4973F40398A0; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 72D73403CC077; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.765612949@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:55 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 05/11] this_cpu_cmpxchg: x86: switch this_cpu_cmpxchg to locked, add _local function References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Goal is to have vmstat_shepherd to transfer from per-CPU counters to global counters remotely. For this, an atomic this_cpu_cmpxchg is necessary. Following the kernel convention for cmpxchg/cmpxchg_local, change x86's this_cpu_cmpxchg_ helpers to be atomic. and add this_cpu_cmpxchg_local_ helpers which are not atomic. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/arch/x86/include/asm/percpu.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/arch/x86/include/asm/percpu.h +++ linux-vmstat-remote/arch/x86/include/asm/percpu.h @@ -197,11 +197,11 @@ do { \ * cmpxchg has no such implied lock semantics as a result it is much * more efficient for cpu local operations. */ -#define percpu_cmpxchg_op(size, qual, _var, _oval, _nval) \ +#define percpu_cmpxchg_op(size, qual, _var, _oval, _nval, lockp) \ ({ \ __pcpu_type_##size pco_old__ =3D __pcpu_cast_##size(_oval); \ __pcpu_type_##size pco_new__ =3D __pcpu_cast_##size(_nval); \ - asm qual (__pcpu_op2_##size("cmpxchg", "%[nval]", \ + asm qual (__pcpu_op2_##size(lockp "cmpxchg", "%[nval]", \ __percpu_arg([var])) \ : [oval] "+a" (pco_old__), \ [var] "+m" (_var) \ @@ -279,16 +279,20 @@ do { \ #define raw_cpu_add_return_1(pcp, val) percpu_add_return_op(1, , pcp, val) #define raw_cpu_add_return_2(pcp, val) percpu_add_return_op(2, , pcp, val) #define raw_cpu_add_return_4(pcp, val) percpu_add_return_op(4, , pcp, val) -#define raw_cpu_cmpxchg_1(pcp, oval, nval) percpu_cmpxchg_op(1, , pcp, ova= l, nval) -#define raw_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, , pcp, ova= l, nval) -#define raw_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, , pcp, ova= l, nval) +#define raw_cpu_cmpxchg_1(pcp, oval, nval) percpu_cmpxchg_op(1, , pcp, ova= l, nval, "") +#define raw_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, , pcp, ova= l, nval, "") +#define raw_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, , pcp, ova= l, nval, "") =20 #define this_cpu_add_return_1(pcp, val) percpu_add_return_op(1, volatile,= pcp, val) #define this_cpu_add_return_2(pcp, val) percpu_add_return_op(2, volatile,= pcp, val) #define this_cpu_add_return_4(pcp, val) percpu_add_return_op(4, volatile,= pcp, val) -#define this_cpu_cmpxchg_1(pcp, oval, nval) percpu_cmpxchg_op(1, volatile,= pcp, oval, nval) -#define this_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, volatile,= pcp, oval, nval) -#define this_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, volatile,= pcp, oval, nval) +#define this_cpu_cmpxchg_local_1(pcp, oval, nval) percpu_cmpxchg_op(1, vol= atile, pcp, oval, nval, "") +#define this_cpu_cmpxchg_local_2(pcp, oval, nval) percpu_cmpxchg_op(2, vol= atile, pcp, oval, nval, "") +#define this_cpu_cmpxchg_local_4(pcp, oval, nval) percpu_cmpxchg_op(4, vol= atile, pcp, oval, nval, "") + +#define this_cpu_cmpxchg_1(pcp, oval, nval) percpu_cmpxchg_op(1, volatile,= pcp, oval, nval, LOCK_PREFIX) +#define this_cpu_cmpxchg_2(pcp, oval, nval) percpu_cmpxchg_op(2, volatile,= pcp, oval, nval, LOCK_PREFIX) +#define this_cpu_cmpxchg_4(pcp, oval, nval) percpu_cmpxchg_op(4, volatile,= pcp, oval, nval, LOCK_PREFIX) =20 #ifdef CONFIG_X86_CMPXCHG64 #define percpu_cmpxchg8b_double(pcp1, pcp2, o1, o2, n1, n2) \ @@ -319,16 +323,17 @@ do { \ #define raw_cpu_or_8(pcp, val) percpu_to_op(8, , "or", (pcp), val) #define raw_cpu_add_return_8(pcp, val) percpu_add_return_op(8, , pcp, val) #define raw_cpu_xchg_8(pcp, nval) raw_percpu_xchg_op(pcp, nval) -#define raw_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, , pcp, ova= l, nval) +#define raw_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, , pcp, ova= l, nval, "") =20 -#define this_cpu_read_8(pcp) percpu_from_op(8, volatile, "mov", pcp) -#define this_cpu_write_8(pcp, val) percpu_to_op(8, volatile, "mov", (pcp)= , val) -#define this_cpu_add_8(pcp, val) percpu_add_op(8, volatile, (pcp), val) -#define this_cpu_and_8(pcp, val) percpu_to_op(8, volatile, "and", (pcp), = val) -#define this_cpu_or_8(pcp, val) percpu_to_op(8, volatile, "or", (pcp), v= al) -#define this_cpu_add_return_8(pcp, val) percpu_add_return_op(8, volatile,= pcp, val) -#define this_cpu_xchg_8(pcp, nval) percpu_xchg_op(8, volatile, pcp, nval) -#define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, volatile,= pcp, oval, nval) +#define this_cpu_read_8(pcp) percpu_from_op(8, volatile, "mov", pcp) +#define this_cpu_write_8(pcp, val) percpu_to_op(8, volatile, "mov", (pcp= ), val) +#define this_cpu_add_8(pcp, val) percpu_add_op(8, volatile, (pcp), val) +#define this_cpu_and_8(pcp, val) percpu_to_op(8, volatile, "and", (pcp),= val) +#define this_cpu_or_8(pcp, val) percpu_to_op(8, volatile, "or", (pcp), = val) +#define this_cpu_add_return_8(pcp, val) percpu_add_return_op(8, volatile= , pcp, val) +#define this_cpu_xchg_8(pcp, nval) percpu_xchg_op(8, volatile, pcp, nval) +#define this_cpu_cmpxchg_local_8(pcp, oval, nval) percpu_cmpxchg_op(8, vol= atile, pcp, oval, nval, "") +#define this_cpu_cmpxchg_8(pcp, oval, nval) percpu_cmpxchg_op(8, volatile= , pcp, oval, nval, LOCK_PREFIX) =20 /* * Pretty complex macro to generate cmpxchg16 instruction. The instruction From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29755C636D6 for ; Thu, 9 Feb 2023 15:34:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230463AbjBIPeJ (ORCPT ); Thu, 9 Feb 2023 10:34:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229487AbjBIPeD (ORCPT ); Thu, 9 Feb 2023 10:34:03 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58ECB4226 for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=xttH7836oXgQ2gV7qXbyD6NGWlFVj+CDeiD3JQB+5E4=; b=VUbEwTbq2UObIG4CHg4wlLVgxr8/amvl3aOU8tb0z5RGuUbueIETdcFCoKeKxTvXV7C57Y /ykt3gal6iKncmBEm57gMCx1Zh+OkRJotOzKXd+jy/1PBJD/6CJ+4uB1bxX0rM453IZXz1 G3aOnAhZ0jCOjzehyCARFn8u0kvXzog= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-590-HxuIk-9sP12Q212hs8jtxA-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: HxuIk-9sP12Q212hs8jtxA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 937E12807D6E; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6690740398A2; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 77799403CC079; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.791818567@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:56 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 06/11] this_cpu_cmpxchg: asm-generic: switch this_cpu_cmpxchg to locked, add _local function References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Goal is to have vmstat_shepherd to transfer from per-CPU counters to global counters remotely. For this, an atomic this_cpu_cmpxchg is necessary. Add this_cpu_cmpxchg_local_ helpers to asm-generic/percpu.h. Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/include/asm-generic/percpu.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/include/asm-generic/percpu.h +++ linux-vmstat-remote/include/asm-generic/percpu.h @@ -424,6 +424,23 @@ do { \ this_cpu_generic_cmpxchg(pcp, oval, nval) #endif =20 +#ifndef this_cpu_cmpxchg_local_1 +#define this_cpu_cmpxchg_local_1(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef this_cpu_cmpxchg_local_2 +#define this_cpu_cmpxchg_local_2(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef this_cpu_cmpxchg_local_4 +#define this_cpu_cmpxchg_local_4(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif +#ifndef this_cpu_cmpxchg_local_8 +#define this_cpu_cmpxchg_local_8(pcp, oval, nval) \ + this_cpu_generic_cmpxchg(pcp, oval, nval) +#endif + #ifndef this_cpu_cmpxchg_double_1 #define this_cpu_cmpxchg_double_1(pcp1, pcp2, oval1, oval2, nval1, nval2) \ this_cpu_generic_cmpxchg_double(pcp1, pcp2, oval1, oval2, nval1, nval2) From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6E89C61DA4 for ; Thu, 9 Feb 2023 15:34:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231341AbjBIPeP (ORCPT ); Thu, 9 Feb 2023 10:34:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230401AbjBIPeI (ORCPT ); Thu, 9 Feb 2023 10:34:08 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E115639BAC for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=ofXkymEHUHbiccFSxfE3RKhTo0FLU9PUEYKbx6pPdBY=; b=V3kBSbGWUdC6ZSVgMu5UlD4aG2eeN3M3L0+1CCd0MR5QIc7V8damQWm6BKn4Y3BA0KjJ1h 2qoyJVFNCtjWQcwul+hX15yJszISlozQOtk6Z0Y4IUfzHKPRWRHaOvSr+W+IY4rVej5AYG 6cpGmJYDf2v47D7uhka4yVOVYXBDRsU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-311-w-j-N1OhMzS0AJa82rbJ1A-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: w-j-N1OhMzS0AJa82rbJ1A-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 799E13C10ED3; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 4C3921121314; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 7BB57403CC07A; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.818774692@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:57 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 07/11] convert this_cpu_cmpxchg users to this_cpu_cmpxchg_local References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" this_cpu_cmpxchg was modified to atomic version, which=20 can be more costly than non-atomic version. Switch users of this_cpu_cmpxchg to this_cpu_cmpxchg_local (which preserves pre-non-atomic this_cpu_cmpxchg behaviour). Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/kernel/fork.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D Acked-by: Peter Xu --- linux-vmstat-remote.orig/kernel/fork.c +++ linux-vmstat-remote/kernel/fork.c @@ -203,7 +203,7 @@ static bool try_release_thread_stack_to_ unsigned int i; =20 for (i =3D 0; i < NR_CACHED_STACKS; i++) { - if (this_cpu_cmpxchg(cached_stacks[i], NULL, vm) !=3D NULL) + if (this_cpu_cmpxchg_local(cached_stacks[i], NULL, vm) !=3D NULL) continue; return true; } Index: linux-vmstat-remote/kernel/scs.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/kernel/scs.c +++ linux-vmstat-remote/kernel/scs.c @@ -79,7 +79,7 @@ void scs_free(void *s) */ =20 for (i =3D 0; i < NR_CACHED_SCS; i++) - if (this_cpu_cmpxchg(scs_cache[i], 0, s) =3D=3D NULL) + if (this_cpu_cmpxchg_local(scs_cache[i], 0, s) =3D=3D NULL) return; =20 kasan_unpoison_vmalloc(s, SCS_SIZE, KASAN_VMALLOC_PROT_NORMAL); From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54346C61DA4 for ; Thu, 9 Feb 2023 15:35:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231339AbjBIPf3 (ORCPT ); Thu, 9 Feb 2023 10:35:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230259AbjBIPfY (ORCPT ); Thu, 9 Feb 2023 10:35:24 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34A4563136 for ; Thu, 9 Feb 2023 07:34:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956866; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Ij2M2yRJo325miWQNsr652YwVVg5UqnJuterQU1PBfY=; b=FuEccYoIeox7nFs80+bRAra07rjQkPUehucw87hF4voLhTcCeSgjDiH6bkB2zHoA6+Z6af yGtkkfdBPC3FupBn2/g0ySyTPa8pRAbAt3myvlFa0UpWZy3vqUmZLHcuIqiZd1McJPpCjc HlmctTCigEkJv0x9QKaKvWC9N50WDRM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-2-l-XLnhCQM8S0UIYM4PgIYQ-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: l-XLnhCQM8S0UIYM4PgIYQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 95BA5104F0A8; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 4556518EC5; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 7ECF5403CC07B; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.846239718@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:58 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 08/11] mm/vmstat: switch counter modification to cmpxchg References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation to switch vmstat shepherd to flush per-CPU counters remotely, switch all functions that modify the counters to use cmpxchg. To test the performance difference, a page allocator microbenchmark: https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/benc= h/page_bench01.c=20 with loops=3D1000000 was used, on Intel Core i7-11850H @ 2.50GHz. For the single_page_alloc_free test, which does /** Loop to measure **/ for (i =3D 0; i < rec->loops; i++) { my_page =3D alloc_page(gfp_mask); if (unlikely(my_page =3D=3D NULL)) return 0; __free_page(my_page); } Unit is cycles. Vanilla Patched Diff 159 165 3.7% Signed-off-by: Marcelo Tosatti Index: linux-vmstat-remote/mm/vmstat.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-vmstat-remote.orig/mm/vmstat.c +++ linux-vmstat-remote/mm/vmstat.c @@ -334,6 +334,188 @@ void set_pgdat_percpu_threshold(pg_data_ } } =20 +#ifdef CONFIG_HAVE_CMPXCHG_LOCAL +/* + * If we have cmpxchg_local support then we do not need to incur the overh= ead + * that comes with local_irq_save/restore if we use this_cpu_cmpxchg. + * + * mod_state() modifies the zone counter state through atomic per cpu + * operations. + * + * Overstep mode specifies how overstep should handled: + * 0 No overstepping + * 1 Overstepping half of threshold + * -1 Overstepping minus half of threshold + */ +static inline void mod_zone_state(struct zone *zone, enum zone_stat_item i= tem, + long delta, int overstep_mode) +{ + struct per_cpu_zonestat __percpu *pcp =3D zone->per_cpu_zonestats; + s8 __percpu *p =3D pcp->vm_stat_diff + item; + long o, n, t, z; + + do { + z =3D 0; /* overflow to zone counters */ + + /* + * The fetching of the stat_threshold is racy. We may apply + * a counter threshold to the wrong the cpu if we get + * rescheduled while executing here. However, the next + * counter update will apply the threshold again and + * therefore bring the counter under the threshold again. + * + * Most of the time the thresholds are the same anyways + * for all cpus in a zone. + */ + t =3D this_cpu_read(pcp->stat_threshold); + + o =3D this_cpu_read(*p); + n =3D delta + o; + + if (abs(n) > t) { + int os =3D overstep_mode * (t >> 1); + + /* Overflow must be added to zone counters */ + z =3D n + os; + n =3D -os; + } + } while (this_cpu_cmpxchg(*p, o, n) !=3D o); + + if (z) + zone_page_state_add(z, zone, item); +} + +void mod_zone_page_state(struct zone *zone, enum zone_stat_item item, + long delta) +{ + mod_zone_state(zone, item, delta, 0); +} +EXPORT_SYMBOL(mod_zone_page_state); + +void __mod_zone_page_state(struct zone *zone, enum zone_stat_item item, + long delta) +{ + mod_zone_state(zone, item, delta, 0); +} +EXPORT_SYMBOL(__mod_zone_page_state); + +void inc_zone_page_state(struct page *page, enum zone_stat_item item) +{ + mod_zone_state(page_zone(page), item, 1, 1); +} +EXPORT_SYMBOL(inc_zone_page_state); + +void __inc_zone_page_state(struct page *page, enum zone_stat_item item) +{ + mod_zone_state(page_zone(page), item, 1, 1); +} +EXPORT_SYMBOL(__inc_zone_page_state); + +void dec_zone_page_state(struct page *page, enum zone_stat_item item) +{ + mod_zone_state(page_zone(page), item, -1, -1); +} +EXPORT_SYMBOL(dec_zone_page_state); + +void __dec_zone_page_state(struct page *page, enum zone_stat_item item) +{ + mod_zone_state(page_zone(page), item, -1, -1); +} +EXPORT_SYMBOL(__dec_zone_page_state); + +static inline void mod_node_state(struct pglist_data *pgdat, + enum node_stat_item item, + int delta, int overstep_mode) +{ + struct per_cpu_nodestat __percpu *pcp =3D pgdat->per_cpu_nodestats; + s8 __percpu *p =3D pcp->vm_node_stat_diff + item; + long o, n, t, z; + + if (vmstat_item_in_bytes(item)) { + /* + * Only cgroups use subpage accounting right now; at + * the global level, these items still change in + * multiples of whole pages. Store them as pages + * internally to keep the per-cpu counters compact. + */ + VM_WARN_ON_ONCE(delta & (PAGE_SIZE - 1)); + delta >>=3D PAGE_SHIFT; + } + + do { + z =3D 0; /* overflow to node counters */ + + /* + * The fetching of the stat_threshold is racy. We may apply + * a counter threshold to the wrong the cpu if we get + * rescheduled while executing here. However, the next + * counter update will apply the threshold again and + * therefore bring the counter under the threshold again. + * + * Most of the time the thresholds are the same anyways + * for all cpus in a node. + */ + t =3D this_cpu_read(pcp->stat_threshold); + + o =3D this_cpu_read(*p); + n =3D delta + o; + + if (abs(n) > t) { + int os =3D overstep_mode * (t >> 1); + + /* Overflow must be added to node counters */ + z =3D n + os; + n =3D -os; + } + } while (this_cpu_cmpxchg(*p, o, n) !=3D o); + + if (z) + node_page_state_add(z, pgdat, item); +} + +void mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item it= em, + long delta) +{ + mod_node_state(pgdat, item, delta, 0); +} +EXPORT_SYMBOL(mod_node_page_state); + +void __mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item = item, + long delta) +{ + mod_node_state(pgdat, item, delta, 0); +} +EXPORT_SYMBOL(__mod_node_page_state); + +void inc_node_state(struct pglist_data *pgdat, enum node_stat_item item) +{ + mod_node_state(pgdat, item, 1, 1); +} + +void inc_node_page_state(struct page *page, enum node_stat_item item) +{ + mod_node_state(page_pgdat(page), item, 1, 1); +} +EXPORT_SYMBOL(inc_node_page_state); + +void __inc_node_page_state(struct page *page, enum node_stat_item item) +{ + mod_node_state(page_pgdat(page), item, 1, 1); +} +EXPORT_SYMBOL(__inc_node_page_state); + +void dec_node_page_state(struct page *page, enum node_stat_item item) +{ + mod_node_state(page_pgdat(page), item, -1, -1); +} +EXPORT_SYMBOL(dec_node_page_state); + +void __dec_node_page_state(struct page *page, enum node_stat_item item) +{ + mod_node_state(page_pgdat(page), item, -1, -1); +} +EXPORT_SYMBOL(__dec_node_page_state); +#else /* * For use when we know that interrupts are disabled, * or when we know that preemption is disabled and that @@ -541,149 +723,6 @@ void __dec_node_page_state(struct page * } EXPORT_SYMBOL(__dec_node_page_state); =20 -#ifdef CONFIG_HAVE_CMPXCHG_LOCAL -/* - * If we have cmpxchg_local support then we do not need to incur the overh= ead - * that comes with local_irq_save/restore if we use this_cpu_cmpxchg. - * - * mod_state() modifies the zone counter state through atomic per cpu - * operations. - * - * Overstep mode specifies how overstep should handled: - * 0 No overstepping - * 1 Overstepping half of threshold - * -1 Overstepping minus half of threshold -*/ -static inline void mod_zone_state(struct zone *zone, - enum zone_stat_item item, long delta, int overstep_mode) -{ - struct per_cpu_zonestat __percpu *pcp =3D zone->per_cpu_zonestats; - s8 __percpu *p =3D pcp->vm_stat_diff + item; - long o, n, t, z; - - do { - z =3D 0; /* overflow to zone counters */ - - /* - * The fetching of the stat_threshold is racy. We may apply - * a counter threshold to the wrong the cpu if we get - * rescheduled while executing here. However, the next - * counter update will apply the threshold again and - * therefore bring the counter under the threshold again. - * - * Most of the time the thresholds are the same anyways - * for all cpus in a zone. - */ - t =3D this_cpu_read(pcp->stat_threshold); - - o =3D this_cpu_read(*p); - n =3D delta + o; - - if (abs(n) > t) { - int os =3D overstep_mode * (t >> 1) ; - - /* Overflow must be added to zone counters */ - z =3D n + os; - n =3D -os; - } - } while (this_cpu_cmpxchg(*p, o, n) !=3D o); - - if (z) - zone_page_state_add(z, zone, item); -} - -void mod_zone_page_state(struct zone *zone, enum zone_stat_item item, - long delta) -{ - mod_zone_state(zone, item, delta, 0); -} -EXPORT_SYMBOL(mod_zone_page_state); - -void inc_zone_page_state(struct page *page, enum zone_stat_item item) -{ - mod_zone_state(page_zone(page), item, 1, 1); -} -EXPORT_SYMBOL(inc_zone_page_state); - -void dec_zone_page_state(struct page *page, enum zone_stat_item item) -{ - mod_zone_state(page_zone(page), item, -1, -1); -} -EXPORT_SYMBOL(dec_zone_page_state); - -static inline void mod_node_state(struct pglist_data *pgdat, - enum node_stat_item item, int delta, int overstep_mode) -{ - struct per_cpu_nodestat __percpu *pcp =3D pgdat->per_cpu_nodestats; - s8 __percpu *p =3D pcp->vm_node_stat_diff + item; - long o, n, t, z; - - if (vmstat_item_in_bytes(item)) { - /* - * Only cgroups use subpage accounting right now; at - * the global level, these items still change in - * multiples of whole pages. Store them as pages - * internally to keep the per-cpu counters compact. - */ - VM_WARN_ON_ONCE(delta & (PAGE_SIZE - 1)); - delta >>=3D PAGE_SHIFT; - } - - do { - z =3D 0; /* overflow to node counters */ - - /* - * The fetching of the stat_threshold is racy. We may apply - * a counter threshold to the wrong the cpu if we get - * rescheduled while executing here. However, the next - * counter update will apply the threshold again and - * therefore bring the counter under the threshold again. - * - * Most of the time the thresholds are the same anyways - * for all cpus in a node. - */ - t =3D this_cpu_read(pcp->stat_threshold); - - o =3D this_cpu_read(*p); - n =3D delta + o; - - if (abs(n) > t) { - int os =3D overstep_mode * (t >> 1) ; - - /* Overflow must be added to node counters */ - z =3D n + os; - n =3D -os; - } - } while (this_cpu_cmpxchg(*p, o, n) !=3D o); - - if (z) - node_page_state_add(z, pgdat, item); -} - -void mod_node_page_state(struct pglist_data *pgdat, enum node_stat_item it= em, - long delta) -{ - mod_node_state(pgdat, item, delta, 0); -} -EXPORT_SYMBOL(mod_node_page_state); - -void inc_node_state(struct pglist_data *pgdat, enum node_stat_item item) -{ - mod_node_state(pgdat, item, 1, 1); -} - -void inc_node_page_state(struct page *page, enum node_stat_item item) -{ - mod_node_state(page_pgdat(page), item, 1, 1); -} -EXPORT_SYMBOL(inc_node_page_state); - -void dec_node_page_state(struct page *page, enum node_stat_item item) -{ - mod_node_state(page_pgdat(page), item, -1, -1); -} -EXPORT_SYMBOL(dec_node_page_state); -#else /* * Use interrupt disable to serialize counter updates */ From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B13DAC61DA4 for ; Thu, 9 Feb 2023 15:34:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230007AbjBIPeG (ORCPT ); Thu, 9 Feb 2023 10:34:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42494 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229647AbjBIPeD (ORCPT ); Thu, 9 Feb 2023 10:34:03 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EC28947087 for ; Thu, 9 Feb 2023 07:33:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=LGWVlVnJpdpZsY64OC4/IsULi3iGJlzsOgwi06Ffb5I=; b=OMwp7q2g3Dx2cqRDfgY5KaO/WyAGXlNOXbTW38puODzE3dYtaVxft27AR+bcs+6QqSpNoG YKC0exqqGahS7qaQOMjHTQI9W+gkGI81YEF18sppXRi260tC6NsElMJgaZE2ay+t5xTlzF h/kzPmmckhSn0pPSovfya6mlgNBQjBk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-205-LPxsHX8gOi6p7L8Li5eumw-1; Thu, 09 Feb 2023 10:33:18 -0500 X-MC-Unique: LPxsHX8gOi6p7L8Li5eumw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 902DD1C29D4F; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 625EB40C83B6; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 86BD4400E5701; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.873999366@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:01:59 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 09/11] mm/vmstat: use cmpxchg loop in cpu_vm_stats_fold References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation to switch vmstat shepherd to flush per-CPU counters remotely, use a cmpxchg loop=20 instead of a pair of read/write instructions. Signed-off-by: Marcelo Tosatti Index: linux-2.6/mm/vmstat.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-2.6.orig/mm/vmstat.c +++ linux-2.6/mm/vmstat.c @@ -885,7 +885,7 @@ static int refresh_cpu_vm_stats(void) } =20 /* - * Fold the data for an offline cpu into the global array. + * Fold the data for a cpu into the global array. * There cannot be any access by the offline cpu and therefore * synchronization is simplified. */ @@ -906,8 +906,9 @@ void cpu_vm_stats_fold(int cpu) if (pzstats->vm_stat_diff[i]) { int v; =20 - v =3D pzstats->vm_stat_diff[i]; - pzstats->vm_stat_diff[i] =3D 0; + do { + v =3D pzstats->vm_stat_diff[i]; + } while (!try_cmpxchg(&pzstats->vm_stat_diff[i], &v, 0)); atomic_long_add(v, &zone->vm_stat[i]); global_zone_diff[i] +=3D v; } @@ -917,8 +918,9 @@ void cpu_vm_stats_fold(int cpu) if (pzstats->vm_numa_event[i]) { unsigned long v; =20 - v =3D pzstats->vm_numa_event[i]; - pzstats->vm_numa_event[i] =3D 0; + do { + v =3D pzstats->vm_numa_event[i]; + } while (!try_cmpxchg(&pzstats->vm_numa_event[i], &v, 0)); zone_numa_event_add(v, zone, i); } } @@ -934,8 +936,9 @@ void cpu_vm_stats_fold(int cpu) if (p->vm_node_stat_diff[i]) { int v; =20 - v =3D p->vm_node_stat_diff[i]; - p->vm_node_stat_diff[i] =3D 0; + do { + v =3D p->vm_node_stat_diff[i]; + } while (!try_cmpxchg(&p->vm_node_stat_diff[i], &v, 0)); atomic_long_add(v, &pgdat->vm_stat[i]); global_node_diff[i] +=3D v; } From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46C67C636D6 for ; Thu, 9 Feb 2023 15:34:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229551AbjBIPeS (ORCPT ); Thu, 9 Feb 2023 10:34:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbjBIPeI (ORCPT ); Thu, 9 Feb 2023 10:34:08 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB9BF423A for ; Thu, 9 Feb 2023 07:33:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956799; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=Djq1nafk+T9HY7/d7T0y7Sk074zesK3iaSlQcrfIOBk=; b=cV/Hwo44u6YXekoUiUU2St1FfZ6Fm+ts1zorxxBdILv1sIVHqszH88+SgOJslpNbjSKP5X CO8k6NHFGbY+9y0AT2sEueV5xYFd7QCE3F+HpG452pKvcin7jJoFrvphrhb4J5Njyn7/zD Y0hYAAHYyml0CEQoPKjpMYMhjAgW6uU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-49-YtM2vNv5O6i_tviMBYIBvA-1; Thu, 09 Feb 2023 10:33:17 -0500 X-MC-Unique: YtM2vNv5O6i_tviMBYIBvA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 94220858F0E; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 668772026D4B; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 8B2A4403CC07D; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.901518530@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:02:00 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 10/11] mm/vmstat: switch vmstat shepherd to flush per-CPU counters remotely References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.4 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that the counters are modified via cmpxchg both CPU locally (via the account functions), and remotely (via cpu_vm_stats_fold), its possible to switch vmstat_shepherd to perform the per-CPU=20 vmstats folding remotely. This fixes the following two problems: 1. A customer provided some evidence which indicates that the idle tick was stopped; albeit, CPU-specific vmstat counters still remained populated. Thus one can only assume quiet_vmstat() was not invoked on return to the idle loop. If I understand correctly, I suspect this divergence might erroneously prevent a reclaim attempt by kswapd. If the number of zone specific free pages are below their per-cpu drift value then zone_page_state_snapshot() is used to compute a more accurate view of the aforementioned statistic. Thus any task blocked on the NUMA node specific pfmemalloc_wait queue will be unable to make significant progress via direct reclaim unless it is killed after being woken up by kswapd (see throttle_direct_reclaim()) 2. With a SCHED_FIFO task that busy loops on a given CPU, and kworker for that CPU at SCHED_OTHER priority, queuing work to sync per-vmstats will either cause that work to never execute, or stalld (i.e. stall daemon) boosts kworker priority which causes a latency violation Signed-off-by: Marcelo Tosatti Index: linux-2.6/mm/vmstat.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-2.6.orig/mm/vmstat.c +++ linux-2.6/mm/vmstat.c @@ -2007,6 +2007,23 @@ static void vmstat_shepherd(struct work_ =20 static DECLARE_DEFERRABLE_WORK(shepherd, vmstat_shepherd); =20 +#ifdef CONFIG_HAVE_CMPXCHG_LOCAL +/* Flush counters remotely if CPU uses cmpxchg to update its per-CPU count= ers */ +static void vmstat_shepherd(struct work_struct *w) +{ + int cpu; + + cpus_read_lock(); + for_each_online_cpu(cpu) { + cpu_vm_stats_fold(cpu); + cond_resched(); + } + cpus_read_unlock(); + + schedule_delayed_work(&shepherd, + round_jiffies_relative(sysctl_stat_interval)); +} +#else static void vmstat_shepherd(struct work_struct *w) { int cpu; @@ -2026,6 +2043,7 @@ static void vmstat_shepherd(struct work_ schedule_delayed_work(&shepherd, round_jiffies_relative(sysctl_stat_interval)); } +#endif =20 static void __init start_shepherd_timer(void) { From nobody Fri Sep 12 13:06:33 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AB69C61DA4 for ; Thu, 9 Feb 2023 15:34:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231342AbjBIPex (ORCPT ); Thu, 9 Feb 2023 10:34:53 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230470AbjBIPet (ORCPT ); Thu, 9 Feb 2023 10:34:49 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 12F5D5ACE9 for ; Thu, 9 Feb 2023 07:33:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675956804; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=0xpraGtvv+g3M3VM6y/rVH56xF8i4HGSirYhvlOx/70=; b=eFu4e4GvG0Uo7DgVB9GriV6abROJVR//AuPCtuhBgvAO5oaU+TzMaOfj9D9S4w2UA+WA05 YB+QCLZ2UhWicRIoHb5rrFsmtRSmutM1Et0/gZoWHGP0ViWs3aF+FnENcddF2dsnlqmOv2 huhVmeYeeujtsxognLot9EEyXDJKgd4= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-311-HYnLzgtyPPihlf6ibXjY4Q-1; Thu, 09 Feb 2023 10:33:19 -0500 X-MC-Unique: HYnLzgtyPPihlf6ibXjY4Q-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 963A72807D72; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: from tpad.localdomain (ovpn-112-3.gru2.redhat.com [10.97.112.3]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6C910492B00; Thu, 9 Feb 2023 15:33:16 +0000 (UTC) Received: by tpad.localdomain (Postfix, from userid 1000) id 90B32403CC181; Thu, 9 Feb 2023 12:32:51 -0300 (-03) Message-ID: <20230209153204.929333982@redhat.com> User-Agent: quilt/0.67 Date: Thu, 09 Feb 2023 12:02:01 -0300 From: Marcelo Tosatti To: Christoph Lameter Cc: Aaron Tomlin , Frederic Weisbecker , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Marcelo Tosatti Subject: [PATCH v2 11/11] mm/vmstat: refresh stats remotely instead of via work item References: <20230209150150.380060673@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Refresh per-CPU stats remotely, instead of queueing=20 work items, for the stat_refresh procfs method. This fixes sosreport hang (which uses vmstat_refresh) with spinning SCHED_FIFO process. Signed-off-by: Marcelo Tosatti Index: linux-2.6/mm/vmstat.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- linux-2.6.orig/mm/vmstat.c +++ linux-2.6/mm/vmstat.c @@ -1865,11 +1865,21 @@ static DEFINE_PER_CPU(struct delayed_wor int sysctl_stat_interval __read_mostly =3D HZ; =20 #ifdef CONFIG_PROC_FS + +#ifdef CONFIG_HAVE_CMPXCHG_LOCAL +static int refresh_all_vm_stats(void); +#else static void refresh_vm_stats(struct work_struct *work) { refresh_cpu_vm_stats(); } =20 +static int refresh_all_vm_stats(void) +{ + return schedule_on_each_cpu(refresh_vm_stats); +} +#endif + int vmstat_refresh(struct ctl_table *table, int write, void *buffer, size_t *lenp, loff_t *ppos) { @@ -1889,7 +1899,7 @@ int vmstat_refresh(struct ctl_table *tab * transiently negative values, report an error here if any of * the stats is negative, so we know to go looking for imbalance. */ - err =3D schedule_on_each_cpu(refresh_vm_stats); + err =3D refresh_all_vm_stats(); if (err) return err; for (i =3D 0; i < NR_VM_ZONE_STAT_ITEMS; i++) { @@ -2009,7 +2019,7 @@ static DECLARE_DEFERRABLE_WORK(shepherd, =20 #ifdef CONFIG_HAVE_CMPXCHG_LOCAL /* Flush counters remotely if CPU uses cmpxchg to update its per-CPU count= ers */ -static void vmstat_shepherd(struct work_struct *w) +static int refresh_all_vm_stats(void) { int cpu; =20 @@ -2019,7 +2029,12 @@ static void vmstat_shepherd(struct work_ cond_resched(); } cpus_read_unlock(); + return 0; +} =20 +static void vmstat_shepherd(struct work_struct *w) +{ + refresh_all_vm_stats(); schedule_delayed_work(&shepherd, round_jiffies_relative(sysctl_stat_interval)); }