From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.515572025@infradead.org>
Date: Sat, 22 Oct 2022 13:14:04 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
References: <20221022111403.531902164@infradead.org>

Improve the comment.

Suggested-by: Matthew Wilcox
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Gerald Schaefer # s390
Acked-by: Peter Zijlstra (Intel)
---
 include/linux/pgtable.h | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -260,15 +260,12 @@ static inline pte_t ptep_get(pte_t *ptep
 
 #ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
 /*
- * WARNING: only to be used in the get_user_pages_fast() implementation.
- *
- * With get_user_pages_fast(), we walk down the pagetables without taking any
- * locks. For this we would like to load the pointers atomically, but sometimes
- * that is not possible (e.g. without expensive cmpxchg8b on x86_32 PAE). What
- * we do have is the guarantee that a PTE will only either go from not present
- * to present, or present to not present or both -- it will not switch to a
- * completely different present page without a TLB flush in between; something
- * that we are blocking by holding interrupts off.
+ * For walking the pagetables without holding any locks. Some architectures
+ * (eg x86-32 PAE) cannot load the entries atomically without using expensive
+ * instructions. We are guaranteed that a PTE will only either go from not
+ * present to present, or present to not present -- it will not switch to a
+ * completely different present page without a TLB flush inbetween; which we
+ * are blocking by holding interrupts off.
  *
  * Setting ptes from not present to present goes:
  *

From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.580310787@infradead.org>
Date: Sat, 22 Oct 2022 13:14:05 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 02/13] x86/mm/pae: Make pmd_t similar to pte_t
References: <20221022111403.531902164@infradead.org>

Instead of mucking about with at least 2 different ways of fudging it,
do the same thing we do for pte_t.

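As an illustration only (not part of this patch), the resulting layout can
be sketched in userspace C. The sketch assumes a little-endian machine, and
every demo_* name is hypothetical. It only shows how two 32-bit halves alias
one 64-bit value, which is what lets code name pmd_low and pmd_high directly
instead of casting the entry to a u32 pointer.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a PAE-style pmd_t: two 32-bit
 * halves that alias one 64-bit value (little-endian assumed).
 */
typedef union {
	struct {
		uint32_t pmd_low, pmd_high;
	};
	uint64_t pmd;
} demo_pmd_t;

/* Build an entry from a full 64-bit value via a designated initializer. */
static inline demo_pmd_t demo_make_pmd(uint64_t val)
{
	return (demo_pmd_t) { .pmd = val };
}

/*
 * Clear the low half first, then the high half. The kernel helper in
 * this patch also issues smp_wmb() between the two stores; that barrier
 * is omitted in this single-threaded sketch.
 */
static inline void demo_pmd_clear(demo_pmd_t *pmdp)
{
	pmdp->pmd_low = 0;
	pmdp->pmd_high = 0;
}
```

Because both halves are named members, the accessors need no pointer casts,
and the designated initializer keeps working whichever member comes first in
the union.
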
Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/pgtable-3level.h       | 42 +++++++++-------------------
 arch/x86/include/asm/pgtable-3level_types.h |  7 ++++
 arch/x86/include/asm/pgtable_64_types.h     |  1
 arch/x86/include/asm/pgtable_types.h        |  4 --
 4 files changed, 23 insertions(+), 31 deletions(-)

--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -87,7 +87,7 @@ static inline pmd_t pmd_read_atomic(pmd_
 		ret |= ((pmdval_t)*(tmp + 1)) << 32;
 	}
 
-	return (pmd_t) { ret };
+	return (pmd_t) { .pmd = ret };
 }
 
 static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
@@ -121,12 +121,11 @@ static inline void native_pte_clear(stru
 	ptep->pte_high = 0;
 }
 
-static inline void native_pmd_clear(pmd_t *pmd)
+static inline void native_pmd_clear(pmd_t *pmdp)
 {
-	u32 *tmp = (u32 *)pmd;
-	*tmp = 0;
+	pmdp->pmd_low = 0;
 	smp_wmb();
-	*(tmp + 1) = 0;
+	pmdp->pmd_high = 0;
 }
 
 static inline void native_pud_clear(pud_t *pudp)
@@ -162,25 +161,17 @@ static inline pte_t native_ptep_get_and_
 #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp)
 #endif
 
-union split_pmd {
-	struct {
-		u32 pmd_low;
-		u32 pmd_high;
-	};
-	pmd_t pmd;
-};
-
 #ifdef CONFIG_SMP
 static inline pmd_t native_pmdp_get_and_clear(pmd_t *pmdp)
 {
-	union split_pmd res, *orig = (union split_pmd *)pmdp;
+	pmd_t res;
 
 	/* xchg acts as a barrier before setting of the high bits */
-	res.pmd_low = xchg(&orig->pmd_low, 0);
-	res.pmd_high = orig->pmd_high;
-	orig->pmd_high = 0;
+	res.pmd_low = xchg(&pmdp->pmd_low, 0);
+	res.pmd_high = READ_ONCE(pmdp->pmd_high);
+	WRITE_ONCE(pmdp->pmd_high, 0);
 
-	return res.pmd;
+	return res;
 }
 #else
 #define native_pmdp_get_and_clear(xp) native_local_pmdp_get_and_clear(xp)
@@ -199,17 +190,12 @@ static inline pmd_t pmdp_establish(struc
 	 * anybody.
 	 */
 	if (!(pmd_val(pmd) & _PAGE_PRESENT)) {
-		union split_pmd old, new, *ptr;
-
-		ptr = (union split_pmd *)pmdp;
-
-		new.pmd = pmd;
-
 		/* xchg acts as a barrier before setting of the high bits */
-		old.pmd_low = xchg(&ptr->pmd_low, new.pmd_low);
-		old.pmd_high = ptr->pmd_high;
-		ptr->pmd_high = new.pmd_high;
-		return old.pmd;
+		old.pmd_low = xchg(&pmdp->pmd_low, pmd.pmd_low);
+		old.pmd_high = READ_ONCE(pmdp->pmd_high);
+		WRITE_ONCE(pmdp->pmd_high, pmd.pmd_high);
+
+		return old;
 	}
 
 	do {
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -18,6 +18,13 @@ typedef union {
 	};
 	pteval_t pte;
 } pte_t;
+
+typedef union {
+	struct {
+		unsigned long pmd_low, pmd_high;
+	};
+	pmdval_t pmd;
+} pmd_t;
 #endif /* !__ASSEMBLY__ */
 
 #define SHARED_KERNEL_PMD (!static_cpu_has(X86_FEATURE_PTI))
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -19,6 +19,7 @@ typedef unsigned long pgdval_t;
 typedef unsigned long pgprotval_t;
 
 typedef struct { pteval_t pte; } pte_t;
+typedef struct { pmdval_t pmd; } pmd_t;
 
 #ifdef CONFIG_X86_5LEVEL
 extern unsigned int __pgtable_l5_enabled;
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -381,11 +381,9 @@ static inline pudval_t native_pud_val(pu
 #endif
 
 #if CONFIG_PGTABLE_LEVELS > 2
-typedef struct { pmdval_t pmd; } pmd_t;
-
 static inline pmd_t native_make_pmd(pmdval_t val)
 {
-	return (pmd_t) { val };
+	return (pmd_t) { .pmd = val };
 }
 
 static inline pmdval_t native_pmd_val(pmd_t pmd)

From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.645657294@infradead.org>
Date: Sat, 22 Oct 2022 13:14:06 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 03/13] sh/mm: Make pmd_t similar to pte_t
References: <20221022111403.531902164@infradead.org>

Just like 64bit pte_t, have a low/high split in pmd_t.

Signed-off-by: Peter Zijlstra (Intel)
---
 arch/sh/include/asm/pgtable-3level.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/arch/sh/include/asm/pgtable-3level.h
+++ b/arch/sh/include/asm/pgtable-3level.h
@@ -28,9 +28,15 @@
 #define pmd_ERROR(e) \
 	printk("%s:%d: bad pmd %016llx.\n", __FILE__, __LINE__, pmd_val(e))
 
-typedef struct { unsigned long long pmd; } pmd_t;
+typedef union {
+	struct {
+		unsigned long pmd_low;
+		unsigned long pmd_high;
+	};
+	unsigned long long pmd;
+} pmd_t;
 #define pmd_val(x) ((x).pmd)
-#define __pmd(x) ((pmd_t) { (x) } )
+#define __pmd(x) ((pmd_t) { .pmd = (x) } )
 
 static inline pmd_t *pud_pgtable(pud_t pud)
 {

From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.711181252@infradead.org>
Date: Sat, 22 Oct 2022 13:14:07 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 04/13] mm: Fix pmd_read_atomic()
References: <20221022111403.531902164@infradead.org>

AFAICT there's no reason to do anything different than what we do for
PTEs. Make it so (also affects SH).

Signed-off-by: Peter Zijlstra (Intel)
Suggested-by: Linus Torvalds
---
 arch/x86/include/asm/pgtable-3level.h | 56 ----------------------------------
 include/linux/pgtable.h               | 49 +++++++++++++++++++++++------
 2 files changed, 39 insertions(+), 66 deletions(-)

--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -34,62 +34,6 @@ static inline void native_set_pte(pte_t
 	ptep->pte_low = pte.pte_low;
 }
 
-#define pmd_read_atomic pmd_read_atomic
-/*
- * pte_offset_map_lock() on 32-bit PAE kernels was reading the pmd_t with
- * a "*pmdp" dereference done by GCC. Problem is, in certain places
- * where pte_offset_map_lock() is called, concurrent page faults are
- * allowed, if the mmap_lock is hold for reading. An example is mincore
- * vs page faults vs MADV_DONTNEED. On the page fault side
- * pmd_populate() rightfully does a set_64bit(), but if we're reading the
- * pmd_t with a "*pmdp" on the mincore side, a SMP race can happen
- * because GCC will not read the 64-bit value of the pmd atomically.
- *
- * To fix this all places running pte_offset_map_lock() while holding the
- * mmap_lock in read mode, shall read the pmdp pointer using this
- * function to know if the pmd is null or not, and in turn to know if
- * they can run pte_offset_map_lock() or pmd_trans_huge() or other pmd
- * operations.
- *
- * Without THP if the mmap_lock is held for reading, the pmd can only
- * transition from null to not null while pmd_read_atomic() runs. So
- * we can always return atomic pmd values with this function.
- *
- * With THP if the mmap_lock is held for reading, the pmd can become
- * trans_huge or none or point to a pte (and in turn become "stable")
- * at any time under pmd_read_atomic(). We could read it truly
- * atomically here with an atomic64_read() for the THP enabled case (and
- * it would be a whole lot simpler), but to avoid using cmpxchg8b we
- * only return an atomic pmdval if the low part of the pmdval is later
- * found to be stable (i.e. pointing to a pte). We are also returning a
- * 'none' (zero) pmdval if the low part of the pmd is zero.
- *
- * In some cases the high and low part of the pmdval returned may not be
- * consistent if THP is enabled (the low part may point to previously
- * mapped hugepage, while the high part may point to a more recently
- * mapped hugepage), but pmd_none_or_trans_huge_or_clear_bad() only
- * needs the low part of the pmd to be read atomically to decide if the
- * pmd is unstable or not, with the only exception when the low part
- * of the pmd is zero, in which case we return a 'none' pmd.
- */
-static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
-{
-	pmdval_t ret;
-	u32 *tmp = (u32 *)pmdp;
-
-	ret = (pmdval_t) (*tmp);
-	if (ret) {
-		/*
-		 * If the low part is null, we must not read the high part
-		 * or we can end up with a partial pmd.
-		 */
-		smp_rmb();
-		ret |= ((pmdval_t)*(tmp + 1)) << 32;
-	}
-
-	return (pmd_t) { .pmd = ret };
-}
-
 static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
 {
 	set_64bit((unsigned long long *)(ptep), native_pte_val(pte));
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -258,6 +258,13 @@ static inline pte_t ptep_get(pte_t *ptep
 }
 #endif
 
+#ifndef __HAVE_ARCH_PMDP_GET
+static inline pmd_t pmdp_get(pmd_t *pmdp)
+{
+	return READ_ONCE(*pmdp);
+}
+#endif
+
 #ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
 /*
  * For walking the pagetables without holding any locks. Some architectures
@@ -302,15 +309,42 @@ static inline pte_t ptep_get_lockless(pt
 
 	return pte;
 }
-#else /* CONFIG_GUP_GET_PTE_LOW_HIGH */
+#define ptep_get_lockless ptep_get_lockless
+
+#if CONFIG_PGTABLE_LEVELS > 2
+static inline pmd_t pmdp_get_lockless(pmd_t *pmdp)
+{
+	pmd_t pmd;
+
+	do {
+		pmd.pmd_low = pmdp->pmd_low;
+		smp_rmb();
+		pmd.pmd_high = pmdp->pmd_high;
+		smp_rmb();
+	} while (unlikely(pmd.pmd_low != pmdp->pmd_low));
+
+	return pmd;
+}
+#define pmdp_get_lockless pmdp_get_lockless
+#endif /* CONFIG_PGTABLE_LEVELS > 2 */
+#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */
+
 /*
  * We require that the PTE can be read atomically.
  */
+#ifndef ptep_get_lockless
 static inline pte_t ptep_get_lockless(pte_t *ptep)
 {
 	return ptep_get(ptep);
 }
-#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */
+#endif
+
+#ifndef pmdp_get_lockless
+static inline pmd_t pmdp_get_lockless(pmd_t *pmdp)
+{
+	return pmdp_get(pmdp);
+}
+#endif
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #ifndef __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
@@ -1211,17 +1247,10 @@ static inline int pud_trans_unstable(pud
 #endif
 }
 
-#ifndef pmd_read_atomic
 static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
 {
-	/*
-	 * Depend on compiler for an atomic pmd read. NOTE: this is
-	 * only going to work, if the pmdval_t isn't larger than
-	 * an unsigned long.
-	 */
-	return *pmdp;
+	return pmdp_get_lockless(pmdp);
 }
-#endif
 
 #ifndef arch_needs_pgtable_deposit
 #define arch_needs_pgtable_deposit() (false)

From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.776404066@infradead.org>
Date: Sat, 22 Oct 2022 13:14:08 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 05/13] mm: Rename GUP_GET_PTE_LOW_HIGH
References: <20221022111403.531902164@infradead.org>

Since it no longer applies to only PTEs, rename it to PXX.

Suggested-by: Linus Torvalds
Signed-off-by: Peter Zijlstra (Intel)
---
 arch/mips/Kconfig       | 2 +-
 arch/sh/Kconfig         | 2 +-
 arch/x86/Kconfig        | 2 +-
 include/linux/pgtable.h | 4 ++--
 mm/Kconfig              | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -46,7 +46,7 @@ config MIPS
 	select GENERIC_SCHED_CLOCK if !CAVIUM_OCTEON_SOC
 	select GENERIC_SMP_IDLE_THREAD
 	select GENERIC_TIME_VSYSCALL
-	select GUP_GET_PTE_LOW_HIGH if CPU_MIPS32 && PHYS_ADDR_T_64BIT
+	select GUP_GET_PXX_LOW_HIGH if CPU_MIPS32 && PHYS_ADDR_T_64BIT
 	select HAVE_ARCH_COMPILER_H
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_KGDB if MIPS_FP_SUPPORT
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -24,7 +24,7 @@ config SUPERH
 	select GENERIC_PCI_IOMAP if PCI
 	select GENERIC_SCHED_CLOCK
 	select GENERIC_SMP_IDLE_THREAD
-	select GUP_GET_PTE_LOW_HIGH if X2TLB
+	select GUP_GET_PXX_LOW_HIGH if X2TLB
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_SECCOMP_FILTER
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -157,7 +157,7 @@ config X86
 	select GENERIC_TIME_VSYSCALL
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_VDSO_TIME_NS
-	select GUP_GET_PTE_LOW_HIGH if X86_PAE
+	select GUP_GET_PXX_LOW_HIGH if X86_PAE
 	select HARDIRQS_SW_RESEND
 	select HARDLOCKUP_CHECK_TIMESTAMP if X86_64
 	select HAVE_ACPI_APEI if ACPI
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -305,7 +305,7 @@ static inline pmd_t pmdp_get(pmd_t *pmdp
 }
 #endif
 
-#ifdef CONFIG_GUP_GET_PTE_LOW_HIGH
+#ifdef CONFIG_GUP_GET_PXX_LOW_HIGH
 /*
  * For walking the pagetables without holding any locks. Some architectures
  * (eg x86-32 PAE) cannot load the entries atomically without using expensive
@@ -365,7 +365,7 @@ static inline pmd_t pmdp_get_lockless(pm
 }
 #define pmdp_get_lockless pmdp_get_lockless
 #endif /* CONFIG_PGTABLE_LEVELS > 2 */
-#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */
+#endif /* CONFIG_GUP_GET_PXX_LOW_HIGH */
 
 /*
  * We require that the PTE can be read atomically.
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1044,7 +1044,7 @@ config GUP_TEST
 comment "GUP_TEST needs to have DEBUG_FS enabled"
 	depends on !GUP_TEST && !DEBUG_FS
 
-config GUP_GET_PTE_LOW_HIGH
+config GUP_GET_PXX_LOW_HIGH
 	bool
 
 config ARCH_HAS_PTE_SPECIAL

From nobody Wed Apr 8 04:56:48 2026
Message-ID: <20221022114424.841277397@infradead.org>
Date: Sat, 22 Oct 2022 13:14:09 +0200
From: Peter Zijlstra
To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org,
 aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de,
 ubizjak@gmail.com
Subject: [PATCH 06/13] mm: Rename pmd_read_atomic()
References: <20221022111403.531902164@infradead.org>

There's no point in having the identical routines for PTE/PMD have
different names.

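For readers following along, a rough userspace analogue of the read that
both helpers perform on split-entry configurations is sketched here. It is
not kernel code: C11 acquire loads stand in for the smp_rmb() calls, and
the demo_* names are made up for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit entry stored as two independently loaded halves. */
struct demo_entry {
	_Atomic uint32_t low;
	_Atomic uint32_t high;
};

/*
 * Read low, then high, then re-check low; retry if low changed in
 * between. The acquire loads keep the three loads in program order,
 * which is the role smp_rmb() plays in the kernel helper.
 */
static inline uint64_t demo_get_lockless(struct demo_entry *e)
{
	uint32_t low, high;

	do {
		low = atomic_load_explicit(&e->low, memory_order_acquire);
		high = atomic_load_explicit(&e->high, memory_order_acquire);
	} while (low != atomic_load_explicit(&e->low, memory_order_relaxed));

	return ((uint64_t)high << 32) | low;
}
```

The loop retries only when the low half changed while the high half was
being read, so it relies on the transition rules spelled out in the
ptep_get_lockless() comment rather than on a truly atomic 64-bit load.
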
Signed-off-by: Peter Zijlstra (Intel)
---
 include/linux/pgtable.h    | 9 ++-------
 mm/hmm.c                   | 2 +-
 mm/khugepaged.c            | 2 +-
 mm/mapping_dirty_helpers.c | 2 +-
 mm/mprotect.c              | 2 +-
 mm/userfaultfd.c           | 2 +-
 mm/vmscan.c                | 4 ++--
 7 files changed, 9 insertions(+), 14 deletions(-)

--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1352,11 +1352,6 @@ static inline int pud_trans_unstable(pud
 #endif
 }
 
-static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
-{
-	return pmdp_get_lockless(pmdp);
-}
-
 #ifndef arch_needs_pgtable_deposit
 #define arch_needs_pgtable_deposit() (false)
 #endif
@@ -1383,13 +1378,13 @@ static inline pmd_t pmd_read_atomic(pmd_
  */
 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd)
 {
-	pmd_t pmdval = pmd_read_atomic(pmd);
+	pmd_t pmdval = pmdp_get_lockless(pmd);
 	/*
 	 * The barrier will stabilize the pmdval in a register or on
 	 * the stack so that it will stop changing under the code.
 	 *
 	 * When CONFIG_TRANSPARENT_HUGEPAGE=y on x86 32bit PAE,
-	 * pmd_read_atomic is allowed to return a not atomic pmdval
+	 * pmdp_get_lockless is allowed to return a not atomic pmdval
 	 * (for example pointing to an hugepage that has never been
 	 * mapped in the pmd). The below checks will only care about
 	 * the low part of the pmd with 32bit PAE x86 anyway, with the
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -361,7 +361,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		 * huge or device mapping one and compute corresponding pfn
 		 * values.
*/ - pmd =3D pmd_read_atomic(pmdp); + pmd =3D pmdp_get_lockless(pmdp); barrier(); if (!pmd_devmap(pmd) && !pmd_trans_huge(pmd)) goto again; --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -862,7 +862,7 @@ static int find_pmd_or_thp_or_none(struc if (!*pmd) return SCAN_PMD_NULL; =20 - pmde =3D pmd_read_atomic(*pmd); + pmde =3D pmdp_get_lockless(*pmd); =20 #ifdef CONFIG_TRANSPARENT_HUGEPAGE /* See comments in pmd_none_or_trans_huge_or_clear_bad() */ --- a/mm/mapping_dirty_helpers.c +++ b/mm/mapping_dirty_helpers.c @@ -126,7 +126,7 @@ static int clean_record_pte(pte_t *pte, static int wp_clean_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned lon= g end, struct mm_walk *walk) { - pmd_t pmdval =3D pmd_read_atomic(pmd); + pmd_t pmdval =3D pmdp_get_lockless(pmd); =20 if (!pmd_trans_unstable(&pmdval)) return 0; --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -292,7 +292,7 @@ static unsigned long change_pte_range(st */ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd) { - pmd_t pmdval =3D pmd_read_atomic(pmd); + pmd_t pmdval =3D pmdp_get_lockless(pmd); =20 /* See pmd_none_or_trans_huge_or_clear_bad for info on barrier */ #ifdef CONFIG_TRANSPARENT_HUGEPAGE --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -613,7 +613,7 @@ static __always_inline ssize_t __mcopy_a break; } =20 - dst_pmdval =3D pmd_read_atomic(dst_pmd); + dst_pmdval =3D pmdp_get_lockless(dst_pmd); /* * If the dst_pmd is mapped as THP don't * override it and just be strict. 
--- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4039,9 +4039,9 @@ static void walk_pmd_range(pud_t *pud, u /* walk_pte_range() may call get_next_vma() */ vma =3D args->vma; for (i =3D pmd_index(start), addr =3D start; addr !=3D end; i++, addr =3D= next) { - pmd_t val =3D pmd_read_atomic(pmd + i); + pmd_t val =3D pmdp_get_lockless(pmd + i); =20 - /* for pmd_read_atomic() */ + /* for pmdp_get_lockless() */ barrier(); =20 next =3D pmd_addr_end(addr, end); From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44487C433FE for ; Sat, 22 Oct 2022 11:49:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230182AbiJVLtg (ORCPT ); Sat, 22 Oct 2022 07:49:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43620 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230024AbiJVLtA (ORCPT ); Sat, 22 Oct 2022 07:49:00 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 672C4251D7C for ; Sat, 22 Oct 2022 04:48:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=ZxLGyKaonAFGosmEJEXCJIuUc/p2pYnEnlwCFC//QMs=; b=d5RZJQiWmMamzGxaIKQnEOtUwD T9H/sWSomE2By/o8fhYFxcJnvZOZCOoyYsj4DPqrDTAEiVXHR6xyiOPOuV3LpQeQUbDuoAPcmVKGs DJSaXerbzdFXHI/CeMHGBW7cMxLNnTrKYaXjJZ3DmnuM+KA59Jo1ENMyp0WIE4NfWdcSJlN6hSA3v I0C+dfhtEKU3yRXt+wguStxjqcWSysKILzObipzuaBHkU3xB0q+bVaL3s37Udbo8ewcAHquPYv+98 QYAVXxD+sKkG/ZpZypq3uRMpr18Br1YlM8YBXYyqaiaLhGv+xeybW3Binc8LI7NIXuDe907gPs0vd qMBpHPRw==; 
Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzM-005XdG-1V; Sat, 22 Oct 2022 11:48:28 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id BDB6F302D82; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 33F9C28B8E514; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114424.906110403@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:10 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 07/13] mm/gup: Fix the lockless PMD access References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" On architectures where the PTE/PMD is larger than the native word size (i386-PAE for example), READ_ONCE() can do the wrong thing. Use pmdp_get_lockless() just like we use ptep_get_lockless(). 
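For readers unfamiliar with the problem, here is a minimal userspace sketch of the low/high/re-check read pattern that the lockless getters use when an entry is wider than the native word. It is an illustration only, not the kernel implementation: struct split_entry and split_entry_get_lockless() are hypothetical names, C11 atomics stand in for the kernel's smp_rmb(), and the pattern only yields a usable value when writers follow the matching store ordering.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit entry stored as two 32-bit halves. */
struct split_entry {
	_Atomic uint32_t low;
	_Atomic uint32_t high;
};

/*
 * Read low, then high, then check that low did not change in between.
 * The acquire loads keep the three reads in program order, roughly the
 * role smp_rmb() plays in the kernel. Assumption: writers update the
 * halves in an order that makes a changed low half detectable.
 */
static uint64_t split_entry_get_lockless(struct split_entry *e)
{
	uint32_t low, high;

	do {
		low = atomic_load_explicit(&e->low, memory_order_acquire);
		high = atomic_load_explicit(&e->high, memory_order_acquire);
		/* Retry if the low half was rewritten while we read high. */
	} while (low != atomic_load_explicit(&e->low, memory_order_acquire));

	return ((uint64_t)high << 32) | low;
}
```

A plain READ_ONCE() of the whole 64-bit object gives no such re-check, which is why it can return a mix of an old and a new half on a 32-bit machine.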
Signed-off-by: Peter Zijlstra (Intel) --- kernel/events/core.c | 2 +- mm/gup.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7186,7 +7186,7 @@ static u64 perf_get_pgtable_size(struct return pud_leaf_size(pud); =20 pmdp =3D pmd_offset_lockless(pudp, pud, addr); - pmd =3D READ_ONCE(*pmdp); + pmd =3D pmdp_get_lockless(pmdp); if (!pmd_present(pmd)) return 0; =20 --- a/mm/gup.c +++ b/mm/gup.c @@ -2507,7 +2507,7 @@ static int gup_pmd_range(pud_t *pudp, pu =20 pmdp =3D pmd_offset_lockless(pudp, pud, addr); do { - pmd_t pmd =3D READ_ONCE(*pmdp); + pmd_t pmd =3D pmdp_get_lockless(pmdp); =20 next =3D pmd_addr_end(addr, end); if (!pmd_present(pmd)) From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB110C433FE for ; Sat, 22 Oct 2022 11:49:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229765AbiJVLtV (ORCPT ); Sat, 22 Oct 2022 07:49:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43418 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229914AbiJVLsv (ORCPT ); Sat, 22 Oct 2022 07:48:51 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA2B424FEC2 for ; Sat, 22 Oct 2022 04:48:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=rlvYdOzEA1PYmv3xpXNUozerfLc4suXD0vdv08sNG6k=; b=gfnL7eWmkMJzCbyJLYVMNiLqdN Yy/XAKxgWJ1eol3mJOSAcMtqvfGvqoxIJjpKemQp81MaKf1+ntLPHeb8Tem7q/EMim9uN0vxFMDUB 
eXbMhDOcxXURog4p2Dz9JSV7iQMaS+NvXu2N+TIdLF2D1tj3LS6LRHUMXcL2NpUST9QLYcdvQ+L6m 0tSzURodRdHPX/bvpcDRKMHpJhdmSmNMA8PjoBlbBQTsxsSBpfIXNc3inYvAKWS+p0g8i/fZ3WPsc CtjByrke9HLUT+6Qs8yFq1wAEjWRrJtLKO2twxmWKmHgpKCQcsNt+RZg5B7IuCgeZG9FXhnWrtHSL GKlrjMlQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzR-00Dtgt-IR; Sat, 22 Oct 2022 11:48:33 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id BDB42302D80; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 37BBE28B8E515; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114424.971450128@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:11 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 08/13] x86/mm/pae: Dont (ab)use atomic64 References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" PAE implies CX8, write readable code. 
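As a rough illustration of the loop shape used below, here is a userspace sketch of a 64-bit exchange built from compare-and-exchange. It assumes C11 atomics rather than the kernel's try_cmpxchg64(), and xchg64_via_cmpxchg() is a hypothetical name, not a kernel helper.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically replace *p with newval and return the previous value.
 * Like try_cmpxchg64(), a failed compare-exchange refreshes 'old'
 * with the current contents, so the loop body can stay empty.
 */
static uint64_t xchg64_via_cmpxchg(_Atomic uint64_t *p, uint64_t newval)
{
	uint64_t old = atomic_load_explicit(p, memory_order_relaxed);

	while (!atomic_compare_exchange_weak(p, &old, newval))
		;	/* 'old' now holds the latest value; try again */

	return old;
}
```

The initial load only seeds the first attempt; if it is stale, the first compare-exchange fails and supplies the current value.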
Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/pgtable-3level.h | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -2,8 +2,6 @@ #ifndef _ASM_X86_PGTABLE_3LEVEL_H #define _ASM_X86_PGTABLE_3LEVEL_H =20 -#include - /* * Intel Physical Address Extension (PAE) Mode - three-level page * tables on PPro+ CPUs. @@ -95,11 +93,12 @@ static inline void pud_clear(pud_t *pudp #ifdef CONFIG_SMP static inline pte_t native_ptep_get_and_clear(pte_t *ptep) { - pte_t res; + pte_t old =3D *ptep; =20 - res.pte =3D (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0); + do { + } while (!try_cmpxchg64(&ptep->pte, &old.pte, 0ULL)); =20 - return res; + return old; } #else #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp) From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2174C433FE for ; Sat, 22 Oct 2022 11:49:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230168AbiJVLtd (ORCPT ); Sat, 22 Oct 2022 07:49:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43524 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229945AbiJVLs4 (ORCPT ); Sat, 22 Oct 2022 07:48:56 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D9C9E251D4F for ; Sat, 22 Oct 2022 04:48:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; 
bh=21hb22ZOLPug1Y7rU1MItuB/yDHg1vIxzPuc1LZH5hI=; b=PdKGQbC12QBugaGBHXI+1OdOwW jIeT7sZU8f/cCzbk9SkQJaFP1b540Ku1IBJuuSIHSLzg9x2w8Wmc8o276uAHPOpXzXLG3d6TZq0tF ufch0SOHGlt6rJ9GjVJv7If3PI2CCLHZEcs/plOsipgGDnY0U9T2MARB4MNF2md4sW8rQWFDyYM2z ftKN36vAn858VDWdBe2+QF71E9pc0xQfcgKilAvXMfbC3SKqEsFNp+XKGyHGw+FG2kU27DviLQ2k2 Zgve+Gbb5vgj55/AffmFiNzVSPKfRJE8lA9PCBRfZ7JGWZYdwkJx6h4LWZv/Y1jswpVYaPfpTSW4b 7CjP2Teg==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzM-005XdH-1s; Sat, 22 Oct 2022 11:48:28 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id BDB8D302D91; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 3B49F28B8E516; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114425.038102604@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:12 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 09/13] x86/mm/pae: Use WRITE_ONCE() References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Disallow write-tearing, that would be really unfortunate. 
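To make the intent concrete, here is a small userspace sketch of a clear routine that writes each half with a single volatile store. The names write_once_u32(), struct split_words and split_words_clear() are hypothetical, and the release fence is only a stand-in for smp_wmb(); the kernel's WRITE_ONCE() is a type-generic macro, not this helper.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical two-word entry, low half first. */
struct split_words {
	uint32_t low;
	uint32_t high;
};

/*
 * Store through a volatile-qualified pointer so the compiler emits
 * exactly one 32-bit store and does not split or repeat the write.
 */
static inline void write_once_u32(uint32_t *p, uint32_t val)
{
	*(volatile uint32_t *)p = val;
}

/* Clear the low half first, then the high half, in that order. */
static void split_words_clear(struct split_words *e)
{
	write_once_u32(&e->low, 0);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	write_once_u32(&e->high, 0);
}
```

With plain assignments the compiler is free to tear or merge the two stores; the volatile access removes that freedom for each half.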
Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/pgtable-3level.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -27,9 +27,9 @@ */ static inline void native_set_pte(pte_t *ptep, pte_t pte) { - ptep->pte_high =3D pte.pte_high; + WRITE_ONCE(ptep->pte_high, pte.pte_high); smp_wmb(); - ptep->pte_low =3D pte.pte_low; + WRITE_ONCE(ptep->pte_low, pte.pte_low); } =20 static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) @@ -58,16 +58,16 @@ static inline void native_set_pud(pud_t static inline void native_pte_clear(struct mm_struct *mm, unsigned long ad= dr, pte_t *ptep) { - ptep->pte_low =3D 0; + WRITE_ONCE(ptep->pte_low, 0); smp_wmb(); - ptep->pte_high =3D 0; + WRITE_ONCE(ptep->pte_high, 0); } =20 static inline void native_pmd_clear(pmd_t *pmdp) { - pmdp->pmd_low =3D 0; + WRITE_ONCE(pmdp->pmd_low, 0); smp_wmb(); - pmdp->pmd_high =3D 0; + WRITE_ONCE(pmdp->pmd_high, 0); } =20 static inline void native_pud_clear(pud_t *pudp) From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72887C04A95 for ; Sat, 22 Oct 2022 11:48:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229875AbiJVLsw (ORCPT ); Sat, 22 Oct 2022 07:48:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229506AbiJVLst (ORCPT ); Sat, 22 Oct 2022 07:48:49 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A80624FEC7 for ; Sat, 22 Oct 2022 04:48:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; 
c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=ooIZw4CypGPCb0zPum6oOMgggH2beVobr6aR5c3g6lI=; b=K1NZC28rbCpDx1lOpuyqqeOlu0 u3HYYfJW0hm7LNd8SPY5tyqEMpHWPxr11iKs9l58bdoPnVU6wHHL2BsM1n5/JdXsM0nZfADRbHFqz v8V6qBdgbc9YK9ivVKEpDExRf6QDkCdPgacqCcnSoo9/LIGBwfOTjWoso36OcEgniZ48URn3tfWKv 9W3APtgZ2AOVMvmrrjyx1tGJ6tYWebVeBHPYVBttdim6OSimso7Cn3gFaTTlAa6QuNq13n8bEp1WD JUaT8tW6KC84dQ4vZgwnBJuYXBP8WtAmVqc1uEfDt3bFBi0IyCgb1WGcB4XpIzmQjGYlCvhnqr0X6 FlQvWLug==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzM-005XdS-Vs; Sat, 22 Oct 2022 11:48:29 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id C79DA302E97; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 3EEA728B8E517; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114425.103392961@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:13 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 10/13] x86/mm/pae: Be consistent with pXXp_get_and_clear() References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable 
Content-Type: text/plain; charset="utf-8" Given that ptep_get_and_clear() uses cmpxchg8b, and that should be by far the most common case, there's no point in having an optimized variant for pmd/pud. Introduce the pxx_xchg64() helper to implement the common logic once. Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/pgtable-3level.h | 67 ++++++++---------------------= ----- 1 file changed, 17 insertions(+), 50 deletions(-) --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -90,34 +90,33 @@ static inline void pud_clear(pud_t *pudp */ } =20 + +#define pxx_xchg64(_pxx, _ptr, _val) ({ \ + _pxx##val_t *_p =3D (_pxx##val_t *)_ptr; \ + _pxx##val_t _o =3D *_p; \ + do { } while (!try_cmpxchg64(_p, &_o, (_val))); \ + native_make_##_pxx(_o); \ +}) + #ifdef CONFIG_SMP static inline pte_t native_ptep_get_and_clear(pte_t *ptep) { - pte_t old =3D *ptep; - - do { - } while (!try_cmpxchg64(&ptep->pte, &old.pte, 0ULL)); - - return old; + return pxx_xchg64(pte, ptep, 0ULL); } -#else -#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp) -#endif =20 -#ifdef CONFIG_SMP static inline pmd_t native_pmdp_get_and_clear(pmd_t *pmdp) { - pmd_t res; - - /* xchg acts as a barrier before setting of the high bits */ - res.pmd_low =3D xchg(&pmdp->pmd_low, 0); - res.pmd_high =3D READ_ONCE(pmdp->pmd_high); - WRITE_ONCE(pmdp->pmd_high, 0); + return pxx_xchg64(pmd, pmdp, 0ULL); +} =20 - return res; +static inline pud_t native_pudp_get_and_clear(pud_t *pudp) +{ + return pxx_xchg64(pud, pudp, 0ULL); } #else +#define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp) #define native_pmdp_get_and_clear(xp) native_local_pmdp_get_and_clear(xp) +#define native_pudp_get_and_clear(xp) native_local_pudp_get_and_clear(xp) #endif =20 #ifndef pmdp_establish @@ -141,40 +140,8 @@ static inline pmd_t pmdp_establish(struc return old; } =20 - do { - old =3D *pmdp; - } while (cmpxchg64(&pmdp->pmd, old.pmd, pmd.pmd) !=3D old.pmd); - - 
return old; -} -#endif - -#ifdef CONFIG_SMP -union split_pud { - struct { - u32 pud_low; - u32 pud_high; - }; - pud_t pud; -}; - -static inline pud_t native_pudp_get_and_clear(pud_t *pudp) -{ - union split_pud res, *orig =3D (union split_pud *)pudp; - -#ifdef CONFIG_PAGE_TABLE_ISOLATION - pti_set_user_pgtbl(&pudp->p4d.pgd, __pgd(0)); -#endif - - /* xchg acts as a barrier before setting of the high bits */ - res.pud_low =3D xchg(&orig->pud_low, 0); - res.pud_high =3D orig->pud_high; - orig->pud_high =3D 0; - - return res.pud; + return pxx_xchg64(pmd, pmdp, pmd.pmd); } -#else -#define native_pudp_get_and_clear(xp) native_local_pudp_get_and_clear(xp) #endif =20 /* Encode and de-code a swap entry */ From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67DD4C3A59D for ; Sat, 22 Oct 2022 11:49:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230003AbiJVLtA (ORCPT ); Sat, 22 Oct 2022 07:49:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229765AbiJVLst (ORCPT ); Sat, 22 Oct 2022 07:48:49 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 317C52505C3 for ; Sat, 22 Oct 2022 04:48:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=kfJUOBrrmjy4JsdmvOUplpu2MGBZGnH8F3TsabZggcA=; b=TZtejNuXM7kl3MOXWKy8aaU/tk 
e118lDcxQsCXKrtcjfNaX3hhzfxtkXwSWz763aZgaWRoymxnOariKlXTCOTFQ1zOXOH9zc1OTUFRv 8azjLpfaOOQEKL6SRwBny2sFE0p0l69+LTKxpFmQ79VJ+ySU5VFdBl2ClPmUNW6KfJP7IiEb2xjC9 tsNtJRffbNAX/xOBgv9twdHzjXxpXTWw7dPoKQlb47QCW08TuhULthxwrfwPLHr8YayH5Eek3Idiz D6G/kMsRd5f8hFWCtPSwW5VwwMGGMZa4EtEoRiAGDeP8Ppb2+SjV6sDe0sQYqyLmn55ceyQHrM481 m9ajaoKQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzM-005XdT-Vx; Sat, 22 Oct 2022 11:48:29 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id C85DC3030FD; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 4299A28B8E512; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114425.168036718@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:14 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 11/13] x86_64: Remove pointless set_64bit() usage References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The use of set_64bit() in X86_64 only code is pretty pointless, seeing how it's a direct assignment. Remove all this nonsense. Additionally, since x86_64 unconditionally has HAVE_CMPXCHG_DOUBLE, there is no point in even having that fallback. 
Signed-off-by: Peter Zijlstra (Intel) --- arch/um/include/asm/pgtable-3level.h | 8 -------- arch/x86/include/asm/cmpxchg_64.h | 5 ----- drivers/iommu/intel/irq_remapping.c | 10 ++-------- 3 files changed, 2 insertions(+), 21 deletions(-) --- a/arch/um/include/asm/pgtable-3level.h +++ b/arch/um/include/asm/pgtable-3level.h @@ -58,11 +58,7 @@ #define pud_populate(mm, pud, pmd) \ set_pud(pud, __pud(_PAGE_TABLE + __pa(pmd))) =20 -#ifdef CONFIG_64BIT -#define set_pud(pudptr, pudval) set_64bit((u64 *) (pudptr), pud_val(pudval= )) -#else #define set_pud(pudptr, pudval) (*(pudptr) =3D (pudval)) -#endif =20 static inline int pgd_newpage(pgd_t pgd) { @@ -71,11 +67,7 @@ static inline int pgd_newpage(pgd_t pgd) =20 static inline void pgd_mkuptodate(pgd_t pgd) { pgd_val(pgd) &=3D ~_PAGE_NE= WPAGE; } =20 -#ifdef CONFIG_64BIT -#define set_pmd(pmdptr, pmdval) set_64bit((u64 *) (pmdptr), pmd_val(pmdval= )) -#else #define set_pmd(pmdptr, pmdval) (*(pmdptr) =3D (pmdval)) -#endif =20 static inline void pud_clear (pud_t *pud) { --- a/arch/x86/include/asm/cmpxchg_64.h +++ b/arch/x86/include/asm/cmpxchg_64.h @@ -2,11 +2,6 @@ #ifndef _ASM_X86_CMPXCHG_64_H #define _ASM_X86_CMPXCHG_64_H =20 -static inline void set_64bit(volatile u64 *ptr, u64 val) -{ - *ptr =3D val; -} - #define arch_cmpxchg64(ptr, o, n) \ ({ \ BUILD_BUG_ON(sizeof(*(ptr)) !=3D 8); \ --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -173,7 +173,6 @@ static int modify_irte(struct irq_2_iomm index =3D irq_iommu->irte_index + irq_iommu->sub_handle; irte =3D &iommu->ir_table->base[index]; =20 -#if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) if ((irte->pst =3D=3D 1) || (irte_modified->pst =3D=3D 1)) { bool ret; =20 @@ -187,11 +186,6 @@ static int modify_irte(struct irq_2_iomm * same as the old value. 
*/ WARN_ON(!ret); - } else -#endif - { - set_64bit(&irte->low, irte_modified->low); - set_64bit(&irte->high, irte_modified->high); } __iommu_flush_cache(iommu, irte, sizeof(*irte)); =20 @@ -249,8 +243,8 @@ static int clear_entries(struct irq_2_io end =3D start + (1 << irq_iommu->irte_mask); =20 for (entry =3D start; entry < end; entry++) { - set_64bit(&entry->low, 0); - set_64bit(&entry->high, 0); + WRITE_ONCE(entry->low, 0); + WRITE_ONCE(entry->high, 0); } bitmap_release_region(iommu->ir_table->bitmap, index, irq_iommu->irte_mask); From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53B26C04A95 for ; Sat, 22 Oct 2022 11:49:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230097AbiJVLtZ (ORCPT ); Sat, 22 Oct 2022 07:49:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43416 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229912AbiJVLsv (ORCPT ); Sat, 22 Oct 2022 07:48:51 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA4922505E0 for ; Sat, 22 Oct 2022 04:48:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=vmo888klJOYTK6NXdYxj1ylLTVw6+U7xUyde2QpBL2o=; b=bYr1reyz4HyDuluM2y+c4SgJM0 PBJN8uYkZxEcRHT3GePK91yBuUwfsfegkY9kOgusA3DzeACZX107P9rwRsGS+B5vmcl2jCFpRqPPq dY+JhH8JW06eGjLRNrjQfsUO/3WvAUW0Mnr0U8/5MItJ1KB06PnuMLz9AKqkVjcZ+3ImKa4B4PPbx MXfbYcCiAe9q7O28LJ1ssPhDG1tuqtRmAMRil+8BvHeMkUevTn5g569pTJJaTks3FEbOAnHuEgDx5 
13zre+JP7+tEWJSrresAW1tKhUL7Mnjcjod2QJFzkI1sVeA4Id2JzSbvBWj8QCFumXkoCwc6YQj9L ka4hjhhw==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzR-00Dtgv-LC; Sat, 22 Oct 2022 11:48:33 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id CA46F303106; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 46C8E28B8E519; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114425.233481884@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:15 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 12/13] x86/mm/pae: Get rid of set_64bit() References: <20221022111403.531902164@infradead.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Recognise that set_64bit() is a special case of our previously introduced pxx_xchg64(), so use that and get rid of set_64bit(). Signed-off-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/cmpxchg_32.h | 28 ---------------------------- arch/x86/include/asm/pgtable-3level.h | 23 ++++++++++++----------- 2 files changed, 12 insertions(+), 39 deletions(-) --- a/arch/x86/include/asm/cmpxchg_32.h +++ b/arch/x86/include/asm/cmpxchg_32.h @@ -7,34 +7,6 @@ * you need to test for the feature in boot_cpu_data. 
*/ =20 -/* - * CMPXCHG8B only writes to the target if we had the previous - * value in registers, otherwise it acts as a read and gives us the - * "new previous" value. That is why there is a loop. Preloading - * EDX:EAX is a performance optimization: in the common case it means - * we need only one locked operation. - * - * A SIMD/3DNOW!/MMX/FPU 64-bit store here would require at the very - * least an FPU save and/or %cr0.ts manipulation. - * - * cmpxchg8b must be used with the lock prefix here to allow the - * instruction to be executed atomically. We need to have the reader - * side to see the coherent 64bit value. - */ -static inline void set_64bit(volatile u64 *ptr, u64 value) -{ - u32 low =3D value; - u32 high =3D value >> 32; - u64 prev =3D *ptr; - - asm volatile("\n1:\t" - LOCK_PREFIX "cmpxchg8b %0\n\t" - "jnz 1b" - : "=3Dm" (*ptr), "+A" (prev) - : "b" (low), "c" (high) - : "memory"); -} - #ifdef CONFIG_X86_CMPXCHG64 #define arch_cmpxchg64(ptr, o, n) \ ((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \ --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -19,7 +19,15 @@ pr_err("%s:%d: bad pgd %p(%016Lx)\n", \ __FILE__, __LINE__, &(e), pgd_val(e)) =20 -/* Rules for using set_pte: the pte being assigned *must* be +#define pxx_xchg64(_pxx, _ptr, _val) ({ \ + _pxx##val_t *_p =3D (_pxx##val_t *)_ptr; \ + _pxx##val_t _o =3D *_p; \ + do { } while (!try_cmpxchg64(_p, &_o, (_val))); \ + native_make_##_pxx(_o); \ +}) + +/* + * Rules for using set_pte: the pte being assigned *must* be * either not present or in a state where the hardware will * not attempt to update the pte. 
In places where this is * not possible, use pte_get_and_clear to obtain the old pte @@ -34,12 +42,12 @@ static inline void native_set_pte(pte_t =20 static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) { - set_64bit((unsigned long long *)(ptep), native_pte_val(pte)); + pxx_xchg64(pte, ptep, native_pte_val(pte)); } =20 static inline void native_set_pmd(pmd_t *pmdp, pmd_t pmd) { - set_64bit((unsigned long long *)(pmdp), native_pmd_val(pmd)); + pxx_xchg64(pmd, pmdp, native_pmd_val(pmd)); } =20 static inline void native_set_pud(pud_t *pudp, pud_t pud) @@ -47,7 +55,7 @@ static inline void native_set_pud(pud_t #ifdef CONFIG_PAGE_TABLE_ISOLATION pud.p4d.pgd =3D pti_set_user_pgtbl(&pudp->p4d.pgd, pud.p4d.pgd); #endif - set_64bit((unsigned long long *)(pudp), native_pud_val(pud)); + pxx_xchg64(pud, pudp, native_pud_val(pud)); } =20 /* @@ -91,13 +99,6 @@ static inline void pud_clear(pud_t *pudp } =20 =20 -#define pxx_xchg64(_pxx, _ptr, _val) ({ \ - _pxx##val_t *_p =3D (_pxx##val_t *)_ptr; \ - _pxx##val_t _o =3D *_p; \ - do { } while (!try_cmpxchg64(_p, &_o, (_val))); \ - native_make_##_pxx(_o); \ -}) - #ifdef CONFIG_SMP static inline pte_t native_ptep_get_and_clear(pte_t *ptep) { From nobody Wed Apr 8 04:56:48 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12599C433FE for ; Sat, 22 Oct 2022 11:48:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229928AbiJVLsy (ORCPT ); Sat, 22 Oct 2022 07:48:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229576AbiJVLst (ORCPT ); Sat, 22 Oct 2022 07:48:49 -0400 Received: from desiato.infradead.org (desiato.infradead.org [IPv6:2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 82B6D24F785 for ; Sat, 22 Oct 2022 04:48:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Type:MIME-Version:References: Subject:Cc:To:From:Date:Message-ID:Sender:Reply-To:Content-Transfer-Encoding: Content-ID:Content-Description:In-Reply-To; bh=RY5OJVKm7PHKfP6ch20ecx7TX/MGByzkwaP5zQXVje4=; b=OJUNT+dm8Xm8G4HKPUJTT88+5b tS7rS4Xte2tScbrsFmj1YbzEc81Fk6njpuY3V7B/fQSSRiowYqw06hkfp4K7lil+cYoEilAlK626c skvA6gh56x7NxdcMn/+qHQbgcePeOyPqteTa64FhrQD2XdvJRo7DyDyAcFyvMhNpo/ScLaXybnWgd N35tVDujd87+4UpkzlXOj/vEoKlK+9YprxpUkKPWJwSBL6UY46tD9qx15+E/rcO3cvGY6EFN4XohB 8Ot3mLfMet0JZV34mz6DbGldZ+616MqADT10saDPFs+Rw1ii0z/QnTUVeBnufCvLBhWNZ13DQWN5F 7HXKoxeQ==; Received: from j130084.upc-j.chello.nl ([24.132.130.84] helo=noisy.programming.kicks-ass.net) by desiato.infradead.org with esmtpsa (Exim 4.94.2 #2 (Red Hat Linux)) id 1omCzM-005XdU-WF; Sat, 22 Oct 2022 11:48:29 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id D6086303109; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 0) id 518E428B8E518; Sat, 22 Oct 2022 13:48:26 +0200 (CEST) Message-ID: <20221022114425.298833095@infradead.org> User-Agent: quilt/0.66 Date: Sat, 22 Oct 2022 13:14:16 +0200 From: Peter Zijlstra To: x86@kernel.org, willy@infradead.org, torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, linux-mm@kvack.org, aarcange@redhat.com, kirill.shutemov@linux.intel.com, jroedel@suse.de, ubizjak@gmail.com Subject: [PATCH 13/13] mm: Remove pointless barrier() after pmdp_get_lockless() References: <20221022111403.531902164@infradead.org> 
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

pmdp_get_lockless() should itself imply any ordering required.

Signed-off-by: Peter Zijlstra (Intel)
---
 mm/hmm.c    | 1 -
 mm/vmscan.c | 3 ---
 2 files changed, 4 deletions(-)

--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -362,7 +362,6 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	 * values.
 	 */
 	pmd =3D pmdp_get_lockless(pmdp);
-	barrier();
 	if (!pmd_devmap(pmd) && !pmd_trans_huge(pmd))
 		goto again;
=20
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4041,9 +4041,6 @@ static void walk_pmd_range(pud_t *pud, u
 	for (i =3D pmd_index(start), addr =3D start; addr !=3D end; i++, addr =3D=
 next) {
 		pmd_t val =3D pmdp_get_lockless(pmd + i);
=20
-		/* for pmdp_get_lockless() */
-		barrier();
-
 		next =3D pmd_addr_end(addr, end);
=20
 		if (!pmd_present(val) || is_huge_zero_pmd(val)) {