From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDF941448FF for ; Wed, 7 Aug 2024 19:48:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060102; cv=none; b=IeO69GHuZ/EiLxYaimMTmluzUCCmTiAREZkGcpw/3RErZGHUqX08AbQbg77d7a45a7lRkMmuAAxwCEzqxg304EFgIKKqdcx1bAg2f0EwEZziSI7AKLdqI6ceamnPyVVJtDuShF+P2vqr2WZVrCOhj63Pv5qLN5yXL/P9tiwckvI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060102; c=relaxed/simple; bh=XL7oTSSTB/8YDtl+/bwPSSNlnKCeA31jQyxrFRfpnLw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gNeq5y0VrB86jOr9xG5MTa+a0IT53ew2PwaFyaiofmR/hQWZMOQ5nwXckQi87T0SKENzbVT6FypoFZ1HbZ/IuMc5WGXtQlEZmJ029KUNCmWFY/G65gfu0FNWLtbhhoAO7KoS1Xe9KBw5BsxHC9Abkif0uQYLq+ddacB00f1LWlM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=XVpcwFW1; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XVpcwFW1" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060099; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9Kq72BfB1Q9PD26Udwsuz2Gt7rCvKvwunl3vkpU7op8=; b=XVpcwFW1S7v+Z2qL7qm9Ot6NiZFICCdE9T1xZUptlXwv1bTZ7errLgfcHnn8FrVyolJs2s Z2vYP91jsLadMuJuJl34M4jWTWdDTpNLRQo1nb+DCvXGB5AHauTEIKxsdhLXCE1mIs2CX8 5kzTkIvc2pQyIJdDgg7jQyNQILmKgvE= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-562-rfhsveJVOXW7rEfvh9HXtg-1; Wed, 07 Aug 2024 15:48:18 -0400 X-MC-Unique: rfhsveJVOXW7rEfvh9HXtg-1 Received: by mail-qv1-f70.google.com with SMTP id 6a1803df08f44-6b95c005dbcso498486d6.1 for ; Wed, 07 Aug 2024 12:48:18 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060098; x=1723664898; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9Kq72BfB1Q9PD26Udwsuz2Gt7rCvKvwunl3vkpU7op8=; b=lppUPqLS9cudKwKApeN1sCQMQ97n1GcW/NOe87Ju/a6oxjG18wHECUR4Us3iXI2L1W pKZa3LZNO4afibOcn6p/XUidknILK7XBJ1lr89sjvt+MZ91/90sELfRI59VVGVip8i40 Z1slDuHO0/GvFuYMhcVM7SCey2hXhHOV2B6cpNxyMJYVuMvsK2S13Ewct3miuDe3neP7 YYSaNeTcxFCIBg5lnI9ytrPh0shh2T1UUmRSyMZyYhTlLhgpi8CAFwwe8ZZfBAFxdt6Z I1mjhMW9ZWVJvWP2UmB/PFzldGJM02AZIIg7pQ4sH8TAGhSgwMXI74oOh35AokgdUzE0 AaKg== X-Gm-Message-State: AOJu0YwO7lSdAzb9GWLOlbpS+F03lVBSoXYyhEl2VSut1HSf/prc1Zf0 PqxZXf+thmYM9F5O3Fx0qsTxMUGW9Oj7aJzFeva8GHTtvjIUx6Cuoiflhs4C3suQgCCLYlqgi4H 
Yma7I/hd1UdNGkwPpJBfocRhS6Mw3D+L+heEjWNDWoleZgdDX+Iu2CDk7ygtvzR179xqc9LgpLR jriT4jogzmPGADqglIDmK2feipfCxUxTKxJ86T1cNlZPc= X-Received: by 2002:ad4:5f89:0:b0:6b2:b5b5:124e with SMTP id 6a1803df08f44-6bb98201fe5mr130641606d6.0.1723060097570; Wed, 07 Aug 2024 12:48:17 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE2kwiMDYWwjAOmnvJrXFpiL3DqhuBYXy1c4XYYU13it34Yz8NGLRxRmyIIbMBrd7nnJuhzgA== X-Received: by 2002:ad4:5f89:0:b0:6b2:b5b5:124e with SMTP id 6a1803df08f44-6bb98201fe5mr130641186d6.0.1723060096990; Wed, 07 Aug 2024 12:48:16 -0700 (PDT) Received: from x1n.redhat.com (pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:16 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 1/7] mm/dax: Dump start address in fault handler Date: Wed, 7 Aug 2024 15:48:05 -0400 Message-ID: <20240807194812.819412-2-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently the dax fault handler dumps the vma range when dynamic debugging enabled. That's mostly not useful. Dump the (aligned) address instead with the order info. Acked-by: David Hildenbrand Signed-off-by: Peter Xu --- drivers/dax/device.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 2051e4f73c8a..9c1a729cd77e 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -235,9 +235,9 @@ static vm_fault_t dev_dax_huge_fault(struct vm_fault *v= mf, unsigned int order) int id; struct dev_dax *dev_dax =3D filp->private_data; =20 - dev_dbg(&dev_dax->dev, "%s: %s (%#lx - %#lx) order:%d\n", current->comm, - (vmf->flags & FAULT_FLAG_WRITE) ? "write" : "read", - vmf->vma->vm_start, vmf->vma->vm_end, order); + dev_dbg(&dev_dax->dev, "%s: op=3D%s addr=3D%#lx order=3D%d\n", current->c= omm, + (vmf->flags & FAULT_FLAG_WRITE) ? 
"write" : "read", + vmf->address & ~((1UL << (order + PAGE_SHIFT)) - 1), order); =20 id =3D dax_read_lock(); if (order =3D=3D 0) --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24320145354 for ; Wed, 7 Aug 2024 19:48:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060104; cv=none; b=URqcacpimxHr4nSthZnn3gQSPMIjheSBpSZ0V74D/VfS2Bnc5mvg0yz7rrZHtu/U8V8DzgSjeJRePEmwymYEbqFo3kQb1+ASz7lD/Z4QW3c5FbhlZDNrMSyQu0ppEuJg+ZcFTbtLqYqX+Ix+Sd7sUyEFGInS7GFPpuDXsgGNrV0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060104; c=relaxed/simple; bh=7RUkptmdOEoTaWI5G/XfwfT/YiV+5T3HhVmyBbiSmpw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Vv6+N6rEfAyJRj36Hn9S+ZgsdQTfMX0mIPzZrVnwyRUmjorZ19sWEHFwYEj9kyNEAUcIukYnz6p0++gIQlsPXllJgVhuX8FVlWBQQdD4VRFUWDtZvk/8vtSaNMsBlXdiVMhz5ptmCW99EnrkM+s7K5VPli6djMbbTj7nQa6vWf0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Q2ucMBhz; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Q2ucMBhz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060102; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6aC2RF+//p0GJfjgH/iZrlW9ATLJl4mVfe8Rdc4jHgk=; b=Q2ucMBhz/XgfvpP6HrkE37VzE11I5NvndIEIUubfXNm9HWgjGEbnItvvaM/gQxtt9pnhdI tp4bikG7BkGUOHvPehBXpEDFnrwgiv/fYv6O6fJEV4oTFCnqzDwejIOrlXYxzTAu8J3ko9 1uywXexZN6uOEi39FGwc9ugxhb0BQXI= Received: from mail-ua1-f70.google.com (mail-ua1-f70.google.com [209.85.222.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-501-JeHkmjBuMOyURhaG7k-nOQ-1; Wed, 07 Aug 2024 15:48:20 -0400 X-MC-Unique: JeHkmjBuMOyURhaG7k-nOQ-1 Received: by mail-ua1-f70.google.com with SMTP id a1e0cc1a2514c-81fb65857e0so15699241.3 for ; Wed, 07 Aug 2024 12:48:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060099; x=1723664899; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6aC2RF+//p0GJfjgH/iZrlW9ATLJl4mVfe8Rdc4jHgk=; b=lWLwBfIlzGbfAgEW588xVGGxB9dYMG3k8mkc1aZwQ4bDumV4VzqZYFiz7YgVrW7iat XqGvqp2HGu07S+lQ20jolG2YFwHj4xZtEokoa+AkXfQGSQuxxTj/dRQ2vjWPAYvkocZF av6OAeSf6xbXpUx6h8+/QbmZ+8Yeo+QoIwGinlFTOlDC7Czi4U8ttQrcEQQorQRNtVmR 38JQqMAkM1xUN5XYNxcmKDSrL+zHzASWuPNcxJWRNXxFtuN0M/tlRsfBs4YGhnngkigq zh7k8Nk7R5s8Aqn88h4P5jhH/jGpc4F4FiuwPFbOZtKuBZC+RwhoU/QyGmha69h+jt1q 9kPg== X-Gm-Message-State: 
From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying , kvm@vger.kernel.org, Sean Christopherson , Paolo Bonzini , David Rientjes Subject: [PATCH v4 2/7] mm/mprotect: Push mmu notifier to PUDs Date: Wed, 7 Aug 2024 15:48:06 -0400 Message-ID: <20240807194812.819412-3-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" mprotect() currently invokes mmu notifiers at the PMD level, and has done so since the 2014 commit a5338093bfb4 ("mm: move mmu notifier call from change_protection to change_pmd_range"). At that time, the issue was that NUMA balancing could be applied to a huge range of VM memory even when nothing was populated; the notification can be skipped in that case if no valid pmd is detected, where "valid" means either a THP or a PTE pgtable page. Now, to pave the way for PUD handling, this isn't enough. We need to generate mmu notifications properly on PUD entries as well. mprotect() is currently broken on PUDs (e.g., one can already trigger a kernel error with dax 1G mappings); this is the start of fixing that. To fix it, this patch pushes such notifications up to the PUD layer. There is a risk of regressing the problem Rik wanted to resolve before, but I think it shouldn't really happen, and I still chose this solution for a few reasons: 1) Consider a large VM that should definitely contain more than GBs of memory: it's highly likely that the PUDs are also none, in which case there is no regression. 2) KVM has evolved a lot over the years to get rid of rmap walks, which might have been the major cause of the previous soft lockup. At least the TDP MMU already got rid of rmap as long as it isn't nested (which should be the major use case, IIUC), so the TDP MMU pgtable walker will simply see an empty VM pgtable (e.g.
EPT on x86), the invalidation of a full empty region in most cases could be pretty fast now, comparing to 2014. 3) KVM has explicit code paths now to even give way for mmu notifiers just like this one, e.g. in commit d02c357e5bfa ("KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing"). It'll also avoid contentions that may also contribute to a soft-lockup. 4) Stick with PMD layer simply don't work when PUD is there... We need one way or another to fix PUD mappings on mprotect(). Pushing it to PUD should be the safest approach as of now, e.g. there's yet no sign of huge P4D coming on any known archs. Cc: kvm@vger.kernel.org Cc: Sean Christopherson Cc: Paolo Bonzini Cc: David Rientjes Cc: Rik van Riel Signed-off-by: Peter Xu --- mm/mprotect.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/mm/mprotect.c b/mm/mprotect.c index 37cf8d249405..d423080e6509 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -363,9 +363,6 @@ static inline long change_pmd_range(struct mmu_gather *= tlb, unsigned long next; long pages =3D 0; unsigned long nr_huge_updates =3D 0; - struct mmu_notifier_range range; - - range.start =3D 0; =20 pmd =3D pmd_offset(pud, addr); do { @@ -383,14 +380,6 @@ static inline long change_pmd_range(struct mmu_gather = *tlb, if (pmd_none(*pmd)) goto next; =20 - /* invoke the mmu notifier if the pmd is populated */ - if (!range.start) { - mmu_notifier_range_init(&range, - MMU_NOTIFY_PROTECTION_VMA, 0, - vma->vm_mm, addr, end); - mmu_notifier_invalidate_range_start(&range); - } - _pmd =3D pmdp_get_lockless(pmd); if (is_swap_pmd(_pmd) || pmd_trans_huge(_pmd) || pmd_devmap(_pmd)) { if ((next - addr !=3D HPAGE_PMD_SIZE) || @@ -431,9 +420,6 @@ static inline long change_pmd_range(struct mmu_gather *= tlb, cond_resched(); } while (pmd++, addr =3D next, addr !=3D end); =20 - if (range.start) - mmu_notifier_invalidate_range_end(&range); - if (nr_huge_updates) count_vm_numa_events(NUMA_HUGE_PTE_UPDATES, nr_huge_updates); return pages; @@ -443,22 +429,36 @@ static inline long change_pud_range(struct mmu_gather= *tlb, struct vm_area_struct *vma, p4d_t *p4d, unsigned long addr, unsigned long end, pgprot_t newprot, unsigned long cp_flags) { + struct mmu_notifier_range range; pud_t *pud; unsigned long next; long pages =3D 0, ret; =20 + range.start =3D 0; + pud =3D pud_offset(p4d, addr); do { next =3D pud_addr_end(addr, end); ret =3D change_prepare(vma, pud, pmd, addr, cp_flags); - if (ret) - return ret; + if (ret) { + pages =3D ret; + break; + } if (pud_none_or_clear_bad(pud)) continue; + if (!range.start) { + mmu_notifier_range_init(&range, + MMU_NOTIFY_PROTECTION_VMA, 0, + vma->vm_mm, addr, end); + mmu_notifier_invalidate_range_start(&range); + } pages +=3D change_pmd_range(tlb, vma, pud, addr, next, newprot, cp_flags); } while (pud++, addr =3D next, addr !=3D end); =20 + if (range.start) + mmu_notifier_invalidate_range_end(&range); + return pages; } =20 --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B6E53145B39 for ; Wed, 7 Aug 2024 19:48:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060107; cv=none; 
b=eDaGNFHClKzQrwZyR3Ohr71uQRVETgK5MTs+KXi637yyVdhaIprZwT81covEhkt4J2m7yR0qDaugpPqLsMM7oz8QSr1pw0+z1sa1bH8W6Rcmq4/88k9mzI445y1upKMLZ0c1A6zgJrflhsX9jYCit+d7MkSKKh+zwCo7gyrg5JQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060107; c=relaxed/simple; bh=lGBhFBkbaz7EzTpAgSm8PUxosOm03SfTjgCaLHkY1V8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=E/G7TRGKKlVnatbw0KO0pJqCeyAgK1BPwwXqrFOE/5/dHwsEne0PPUugTWG9EbURNirlFqc6RS5WwYJIPBG7KKCSN5+X3zxsgF5IoWcD9NfSXFZGEiI783ZZCFbcunOvBmFlYl6t1/W0cwGoyeyg9lO4luTM2FGG9QGeMmoxJZ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CS138z7p; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CS138z7p" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060104; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DPSgdrK+/9txvlBVeWURx0wHDXuvLt3nB8aMlUhEkV8=; b=CS138z7pYDjFBV+cBxOm3z4S8IBLvToYUprABE4Nel8vyYv3gqaETSs4W63Elf5kqhTu7e t2t+mkcrSNYRpXUKrNBXDeivsEFnq7cS7FTd3jqxFFKk2k+h5a2f/0ENe+9xkYWNEpHnM3 dAaTtZRFOkaCm2oqoIroq1iTmdGEVKc= Received: from mail-ot1-f72.google.com (mail-ot1-f72.google.com [209.85.210.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-457-oO-GHluyOs2CxcVt_IntPg-1; Wed, 07 Aug 2024 15:48:23 -0400 X-MC-Unique: oO-GHluyOs2CxcVt_IntPg-1 Received: by mail-ot1-f72.google.com with SMTP id 46e09a7af769-70b29ad14e5so26970a34.3 for ; Wed, 07 Aug 2024 12:48:23 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060102; x=1723664902; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DPSgdrK+/9txvlBVeWURx0wHDXuvLt3nB8aMlUhEkV8=; b=O0c+FzV6NuzGBIWztDHYPanGJQTHW44csppRWuAfj6sqyXWNvQLTbdPsq8GyU5GPxR bYjYvSfUbIH9qrjK/3yQA0IE9UHa7GOjEvh9Fz3SHdIdGwMy/0xQrlIG82wjzn5nwXGY 4Qfh4C+yyxQDzcVwAOoh8cYtWMngDYnmIvF1PNbTiuhWXwbotUZKPfsRaLaf+/pBN07V F1BYYkg/DLVGiW4STBQI/9jISczHZdMMTbzU82UulQZ1DYHUtkp3ARd+3EOQr25nRJqT NODugM57Q7/A7FXbZK0Po1Ni0CYSHUGebvx/JnFjXmrTA1dTAcLh+QrIAnXF64jriXnr eXzw== X-Gm-Message-State: AOJu0YxWPpsKReBD619kdFOw7ZstvzmOuymXR/Pvcm3VEqS84j0/SbtT O9j5X6+RwwN+dv73TbHWxrbyG79uuZlA/osCER+t7fh2umb0ZkNrivd21W7fzLqppbEER1JGnzB xirdv5t6I92MOD86A1Y3nSBo2XZNFE6nzVN6igig9QEzHZJq0Rw7ZT2Gp7S5vxxGnAAfFEBywrq A4GTObLlVPN6IdqTI/l2A31+orn6nYk5FRyZcEMFsCjmo= X-Received: by 2002:a05:6358:a096:b0:1aa:c73d:5a95 with SMTP id e5c5f4694b2df-1af3b89cddemr1270645155d.0.1723060101746; Wed, 07 Aug 2024 12:48:21 -0700 (PDT) X-Google-Smtp-Source: AGHT+IH//9GjXWmqzA+Y0OZdeiBZvOf2mh0xJt9rLdJnCDcBWQbkH3n97H6Shuw3uIhtGYCg2pjapA== X-Received: by 2002:a05:6358:a096:b0:1aa:c73d:5a95 with SMTP id e5c5f4694b2df-1af3b89cddemr1270641255d.0.1723060101168; Wed, 07 Aug 2024 12:48:21 -0700 (PDT) Received: from x1n.redhat.com 
(pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:20 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 3/7] mm/powerpc: Add missing pud helpers Date: Wed, 7 Aug 2024 15:48:07 -0400 Message-ID: <20240807194812.819412-4-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" These new helpers will be needed for pud entry updates soon. Introduce them by referencing the pmd ones. Namely: - pudp_invalidate() - pud_modify() Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: linuxppc-dev@lists.ozlabs.org Cc: Aneesh Kumar K.V Signed-off-by: Peter Xu --- arch/powerpc/include/asm/book3s/64/pgtable.h | 3 +++ arch/powerpc/mm/book3s64/pgtable.c | 20 ++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/in= clude/asm/book3s/64/pgtable.h index 519b1743a0f4..5da92ba68a45 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -1124,6 +1124,7 @@ extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgpr= ot); extern pud_t pfn_pud(unsigned long pfn, pgprot_t pgprot); extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot); extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot); +extern pud_t pud_modify(pud_t pud, pgprot_t newprot); extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pmd_t pmd); extern void set_pud_at(struct mm_struct *mm, unsigned long addr, @@ -1384,6 +1385,8 @@ static inline pgtable_t pgtable_trans_huge_withdraw(s= truct mm_struct *mm, #define __HAVE_ARCH_PMDP_INVALIDATE extern pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long add= ress, pmd_t *pmdp); +extern pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long add= ress, + pud_t *pudp); =20 #define pmd_move_must_withdraw pmd_move_must_withdraw struct spinlock; diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/= pgtable.c index f4d8d3c40e5c..5a4a75369043 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -176,6 +176,17 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsi= gned long address, return __pmd(old_pmd); } =20 +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, + pud_t *pudp) +{ + unsigned long old_pud; + + VM_WARN_ON_ONCE(!pud_present(*pudp)); + old_pud =3D pud_hugepage_update(vma->vm_mm, address, pudp, _PAGE_PRESENT,= _PAGE_INVALID); + flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE); + return __pud(old_pud); +} + pmd_t pmdp_huge_get_and_clear_full(struct 
vm_area_struct *vma, unsigned long addr, pmd_t *pmdp, int full) { @@ -259,6 +270,15 @@ pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) pmdv &=3D _HPAGE_CHG_MASK; return pmd_set_protbits(__pmd(pmdv), newprot); } + +pud_t pud_modify(pud_t pud, pgprot_t newprot) +{ + unsigned long pudv; + + pudv =3D pud_val(pud); + pudv &=3D _HPAGE_CHG_MASK; + return pud_set_protbits(__pud(pudv), newprot); +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ =20 /* For use by kexec, called with MMU off */ --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EDB2145FF8 for ; Wed, 7 Aug 2024 19:48:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060108; cv=none; b=cAiVH6JV4TAKBBLU/dCemJRc3uMWGjSYy2cDHCbm5Ay4DxCvDgir29kaSAr0GZnOj7Xgbxw2zPfaQ9qML19H0Dw2X4DXuLUAXqe2oS0Y7XNt1TrKnPQ5KweABfvroBT6A9v00ME8OlnnBZKJTXBqfqAJUGIjikwJtG4PtGGaYbA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060108; c=relaxed/simple; bh=H+ha3vatbPGARaB174w66no3cvkqymG1Sz3/aRXqIBw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WR1e15IY5PFjZAEmstqnWXmgr/SJb+HZgksZOS/IY74c9PmnLDyl8pxU4gwndKE4hGCDTybQq+F2wy9p9ZV+MEY4eTRzNmSv3y0i5jJG4cZhZQAHVE5Ux6Zsw6GwOfFNBuYg8HEr7mF8+roEkegrnymLHJS0sHPqtT+neSQ+OhM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=ZDinerC7; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="ZDinerC7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060106; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0ts+Sj64QZG1FGRDLVoa9nQimNVqUC6U/k8fGPxCUtI=; b=ZDinerC7VKau45MhLovYyeAQugFNd3cWy0DNGtaM4ot2etjA2oNtJdHmXz+MEBkWJAlph4 vEwW01BF3HG/DOK9qSt0cZ72l6lqBL7Q9CWw1a0zvaldmothBjfPTCxSUVyQSuKjeS1Ysk BpzeWdEHq+Qpm8Tfss7B0krTDVhq6G8= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-594-W1Z9KMFdOHyoDAzDjSNAFg-1; Wed, 07 Aug 2024 15:48:25 -0400 X-MC-Unique: W1Z9KMFdOHyoDAzDjSNAFg-1 Received: by mail-qv1-f69.google.com with SMTP id 6a1803df08f44-6b7ad98c1f8so389516d6.1 for ; Wed, 07 Aug 2024 12:48:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060104; x=1723664904; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0ts+Sj64QZG1FGRDLVoa9nQimNVqUC6U/k8fGPxCUtI=; 
b=N8qliLU4qkdSdpiB+5s9V8DU6+GhmNmr7+/YNH4Zmr889Kpep87MGdZffhddPFN2z+ ME3OzeI3WEAdYh2ym462HKyXwd1q4WObS7zSWTK4cLVr4WrnBCUegNybcNJY6xHeA/Og d11RpcQ76vzpqInaEF0yb5mF2r+Wm5/bTQRTsz0IbIcEibgE9h/052vOwT6ySer6Op0A 8bCvavAc0LO2bvIs1cBndR7Dz5N1AmWIYDnlAoGPXjlZu4waBG2e9IilrWJnJvfg7ibS UVYtassrgd17gbdh6CCNfQvDQolkjcs+GNKlUz3/Gr1OJxWsIA3lEjITd4W0MmBYj5sz QzqA== X-Gm-Message-State: AOJu0Yze6HH0NP/Q+k6tWzrwLLjHQcxb6zzM+NqSs9UIIXAjNROZ5ult 4iOw/9mpN0whhGx+syTg7p5RzUa6fuhtIA2zxOR0u1jVe/oGOBYohRBXkitggYTEJBSLEXA4iAS xVCLeRjp9xhOXmkc8IfiXBnljHMVz6tc8vqMNoz3u5irczJMRkbrgHFGA3PbViu0+VJ6lcabevD jx5lQqZ3GU47kT49oTR+ujWwEUk0kedyWjnwteJMqteKg= X-Received: by 2002:ad4:5dca:0:b0:6b5:e3bc:af9a with SMTP id 6a1803df08f44-6bb9832d2dbmr147037786d6.2.1723060103845; Wed, 07 Aug 2024 12:48:23 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEOqKDpzCy2XKs/l7qwihRy4ZY0EbFICMSu3L2U6zTcNQWVR9cvlOSLvEsobqbXC66x3CvadA== X-Received: by 2002:ad4:5dca:0:b0:6b5:e3bc:af9a with SMTP id 6a1803df08f44-6bb9832d2dbmr147037356d6.2.1723060103178; Wed, 07 Aug 2024 12:48:23 -0700 (PDT) Received: from x1n.redhat.com (pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:22 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 4/7] mm/x86: Make pud_leaf() only care about PSE bit Date: Wed, 7 Aug 2024 15:48:08 -0400 Message-ID: <20240807194812.819412-5-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" An entry should be reported as PUD leaf even if it's PROT_NONE, in which case PRESENT bit isn't there. I hit bad pud without this when testing dax 1G on zapping a PROT_NONE PUD. 
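To illustrate (this just restates the one-line hunk below under x86's _PAGE_* semantics; it is not additional code in the patch): a PROT_NONE huge PUD keeps _PAGE_PSE but has _PAGE_PRESENT cleared, so the old check failed to report it as a leaf while the new one still does:

	/* before: misses PROT_NONE huge PUDs because _PAGE_PRESENT is clear */
	return (pud_val(pud) & (_PAGE_PSE | _PAGE_PRESENT)) ==
	       (_PAGE_PSE | _PAGE_PRESENT);

	/* after: a huge entry is a leaf regardless of the protection bits */
	return pud_val(pud) & _PAGE_PSE;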
Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: x86@kernel.org Acked-by: Dave Hansen Reviewed-by: David Hildenbrand Signed-off-by: Peter Xu --- arch/x86/include/asm/pgtable.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index e39311a89bf4..a2a3bd4c1bda 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1078,8 +1078,7 @@ static inline pmd_t *pud_pgtable(pud_t pud) #define pud_leaf pud_leaf static inline bool pud_leaf(pud_t pud) { - return (pud_val(pud) & (_PAGE_PSE | _PAGE_PRESENT)) =3D=3D - (_PAGE_PSE | _PAGE_PRESENT); + return pud_val(pud) & _PAGE_PSE; } =20 static inline int pud_bad(pud_t pud) --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 75585146A62 for ; Wed, 7 Aug 2024 19:48:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060111; cv=none; b=GSeAyZ9G6yV/0J7eKyXlkcBQDLo5enaupD9urBXZgfxVbLNe8T5csBWPAhqlWpEROpP0W51029dRzsUrug2nGcZclEo5hDQryQCs9j6iq6lclmMnXI0gNS/YRK2X4HQSvOS3u4K8LI5i41aNSzLjn9G+kOL/7wXus1Q59/DETUg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060111; c=relaxed/simple; bh=/VcNNtJJw0P8ycOBIOS5hkIVFjnZ5JtuXWiPTpXW6DM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BxVCP6tCB4fS4P95IC7xvL1mTjIH6vqeQlTtPZYLcjwSuf9OYvokhvblPA7sK+4qGC5UVvusTPlyZFKfJtA4e75k4Ejio1Uf+bHMkq7u+WI6tk2kzYQuzDJ+KCy3Mc1VvtEwyb8/KU6O1R94UDJfAmA4f++6OSsUtg/csJZH6Ws= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=QLhg6LK5; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="QLhg6LK5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060108; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FAGFj0FHVAq5pVRnc576bB34mI1HflD3zt6t4lJRJgM=; b=QLhg6LK5r1mF0Hm0CnKTWQZqP7VuKLmJuAxCfWWZhhYwtoWp+/8vBR4b9pz25qOLNPZqlQ 7d/+rPgXEzfdiVnpNGKndgklUl48UmzMeIeq3G7gCp0N5VJcfIrvMXH31uOQguPAQVtwGV Kpqx9/+nvM1rAWo/8Fn+UsbbqHG9/Bs= Received: from mail-oo1-f72.google.com (mail-oo1-f72.google.com [209.85.161.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-489-HJcXAUJCPvu-dyWVt5bIwA-1; Wed, 07 Aug 2024 15:48:27 -0400 X-MC-Unique: HJcXAUJCPvu-dyWVt5bIwA-1 Received: by mail-oo1-f72.google.com with SMTP id 006d021491bc7-5d5cc01aee6so12873eaf.0 for ; Wed, 07 Aug 2024 12:48:27 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060106; x=1723664906; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FAGFj0FHVAq5pVRnc576bB34mI1HflD3zt6t4lJRJgM=; b=DTyiGO4ToXZVdNVUYtIok9rAjFGriGPISE+OV0JoRhgbF2Zs7b7LPzyxbi5dI7stK8 efRsx3h7TgJoKnSVE+KpOXYMQel7ZyWrMzNZjm5Kj9HbON8V4NXHqmkWxB29o/IawV1i 7g4lrqMa1udbCRLVfyhSpfF2NC9l27KnFuclMN44d2U6NioV5bLiT/BzA58cznD6kGyV nyTv1yv5oIkJvhm77ZjXQQSdZ+2Q5SkTHKEflHi7rL1ilcmL2pARwmzlYL4j5srCtQZH 8K5c2cap4KfFKIdgu9CUIbhCpQ9250UZlGjm9Yx8dEdK868fm/5Z3BoH53OzPQMUautM FflA== X-Gm-Message-State: AOJu0YxgPenEVYD8uAlTjtEOLwnKlVS1KBrJjlwm0n3y4wIwgNNmdbO4 o3Aqgy4slIhW00UIfsTT2Aj+NDrA0YC9hz8polWKKO4dso1P2Tsuj+tRudYJpqvp1RAT7sdiBYw islaxGm+WdmQhHmF/3Fl1gjjUxcyh8c6wWIcFGL5wBaq3EyGBxjZb015RUQZuzRnz+C+jNvWotf y7vNl3dI658eIc7SDQTGO39nxygwlH5wIa/obSqQqWaUc= X-Received: by 2002:a4a:b90c:0:b0:5cd:920:de44 with SMTP id 006d021491bc7-5d6636caa6bmr12611344eaf.2.1723060106166; Wed, 07 Aug 2024 12:48:26 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE55jmIe/WZbCw3l3NTgyyno7WLqIqKlQdXAtW30hkxt69+d9woKPh92QMpZkY/maA+61+r6A== X-Received: by 2002:a4a:b90c:0:b0:5cd:920:de44 with SMTP id 006d021491bc7-5d6636caa6bmr12611307eaf.2.1723060105619; Wed, 07 Aug 2024 12:48:25 -0700 (PDT) Received: from x1n.redhat.com (pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:24 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 5/7] mm/x86: arch_check_zapped_pud() Date: Wed, 7 Aug 2024 15:48:09 -0400 Message-ID: <20240807194812.819412-6-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce arch_check_zapped_pud() to sanity check shadow stack on PUD zaps. It has the same logic of the PMD helper. One thing to mention is, it might be a good idea to use page_table_check in the future for trapping wrong setups of shadow stack pgtable entries [1]. That is left for the future as a separate effort. 
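As a quick sketch of how the pieces fit together (illustration only; it mirrors the hunks in the diff below rather than adding anything new): the PUD zap path hands the just-cleared entry to the arch hook, and the x86 implementation warns if a shadow-stack PUD is being zapped from a VMA that is not VM_SHADOW_STACK, with the same logic as the PMD helper:

	orig_pud = pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm);
	arch_check_zapped_pud(vma, orig_pud);

	/* x86 arch_check_zapped_pud(); see note in arch_check_zapped_pte() */
	VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) && pud_shstk(pud));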
[1] https://lore.kernel.org/all/59d518698f664e07c036a5098833d7b56b953305.ca= mel@intel.com Cc: "Edgecombe, Rick P" Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: x86@kernel.org Acked-by: David Hildenbrand Signed-off-by: Peter Xu --- arch/x86/include/asm/pgtable.h | 10 ++++++++++ arch/x86/mm/pgtable.c | 7 +++++++ include/linux/pgtable.h | 7 +++++++ mm/huge_memory.c | 4 +++- 4 files changed, 27 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a2a3bd4c1bda..fdb8ac9e7030 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -174,6 +174,13 @@ static inline int pud_young(pud_t pud) return pud_flags(pud) & _PAGE_ACCESSED; } =20 +static inline bool pud_shstk(pud_t pud) +{ + return cpu_feature_enabled(X86_FEATURE_SHSTK) && + (pud_flags(pud) & (_PAGE_RW | _PAGE_DIRTY | _PAGE_PSE)) =3D=3D + (_PAGE_DIRTY | _PAGE_PSE); +} + static inline int pte_write(pte_t pte) { /* @@ -1667,6 +1674,9 @@ void arch_check_zapped_pte(struct vm_area_struct *vma= , pte_t pte); #define arch_check_zapped_pmd arch_check_zapped_pmd void arch_check_zapped_pmd(struct vm_area_struct *vma, pmd_t pmd); =20 +#define arch_check_zapped_pud arch_check_zapped_pud +void arch_check_zapped_pud(struct vm_area_struct *vma, pud_t pud); + #ifdef CONFIG_XEN_PV #define arch_has_hw_nonleaf_pmd_young arch_has_hw_nonleaf_pmd_young static inline bool arch_has_hw_nonleaf_pmd_young(void) diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index f5931499c2d6..d4b3ccf90236 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -926,3 +926,10 @@ void arch_check_zapped_pmd(struct vm_area_struct *vma,= pmd_t pmd) VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) && pmd_shstk(pmd)); } + +void arch_check_zapped_pud(struct vm_area_struct *vma, pud_t pud) +{ + /* See note in arch_check_zapped_pte() */ + VM_WARN_ON_ONCE(!(vma->vm_flags & VM_SHADOW_STACK) && + pud_shstk(pud)); +} diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 2a6a3cccfc36..2289e9f7aa1b 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -447,6 +447,13 @@ static inline void arch_check_zapped_pmd(struct vm_are= a_struct *vma, } #endif =20 +#ifndef arch_check_zapped_pud +static inline void arch_check_zapped_pud(struct vm_area_struct *vma, + pud_t pud) +{ +} +#endif + #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long address, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 0024266dea0a..81c5da0708ed 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2293,12 +2293,14 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_= area_struct *vma, pud_t *pud, unsigned long addr) { spinlock_t *ptl; + pud_t orig_pud; =20 ptl =3D __pud_trans_huge_lock(pud, vma); if (!ptl) return 0; =20 - pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm); + orig_pud =3D pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm); + arch_check_zapped_pud(vma, orig_pud); tlb_remove_pud_tlb_entry(tlb, pud, addr); if (vma_is_special_huge(vma)) { spin_unlock(ptl); --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 802161482FD for ; Wed, 7 Aug 2024 19:48:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060114; cv=none; b=WTU+BUCFwQGpa0d+nGjOA0TARV+RRFikVlyYQxdrlGg8JL/qQ48guXMvQ9QvWOMGDtXI3SfwaUo/+T3JJ7fNMtABxZK+iTXtox8ffGB4fUdo/L8Ef/KqBGxt4uFEUoif9sskgjKJ2trJN2VNd2tfl0uhjzS2pMwdEKHyg1Rsot0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060114; c=relaxed/simple; bh=jlZBQTergZikBzedosCmwjPZuxq7DHRCY1RQ0v7m73s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TSqcIfK86lVuk/Dt04YY4HdY5h7UdCfIdmFv0xQSubWcUxUuF9x4sLzr5zUbuGh2VfCBMbxtd6E6IcqoyKM/jGuUzXOv4KFeQnbdKQ1+uPwFKmwUecOYmjSWe33yvbFrHewA4uFwIb6UM3meZmoImz0AwP/6VHk5etTkGfrFfzE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=TdO+HUA2; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TdO+HUA2" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060110; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DDb5b0nOIqaqJ1HBArfkZKBp6ttQDN26GIh6owjfgNA=; b=TdO+HUA2uoAY7CqjdXmDAfm9Kd1zGVSx5rBwFDFs1Q86HNAapg0itjjHQ5NFivO+WKSJpD 9v9ignnnq+kFe59kJyKyeXol9m0w2x6VKAy9XBWMnS6ELUzjfOcTRN6gpS9iMDbkqiD+rz KJofRxDWbm//GsWctR/DUoo1o5qfd5I= Received: from mail-oi1-f198.google.com (mail-oi1-f198.google.com [209.85.167.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-473-3DBBQLP8O6aAO2zcupSXnQ-1; Wed, 07 Aug 2024 15:48:29 -0400 X-MC-Unique: 3DBBQLP8O6aAO2zcupSXnQ-1 Received: by mail-oi1-f198.google.com with SMTP id 5614622812f47-3db15ea38baso53633b6e.3 for ; Wed, 07 Aug 2024 12:48:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060108; x=1723664908; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DDb5b0nOIqaqJ1HBArfkZKBp6ttQDN26GIh6owjfgNA=; b=OLZVCEvf+gpkF3Sh921JB1ceNCRadtQ1NMoWYyt4L4RlAEXHSRbQx8V3V0MCasLRUy f/APBsgiMD6/PoGUZ+tkT2uYCyrPPQzyKJDn0s0W2H3E1cUiCz5vDBqilHwTRyaie7Ll SvgIxpHuDmZbemItXAbspFZgLJufDzGwU+4UZIUtp+i3qo5XM0VHRboHe3EcDE8aokL/ H9wyhcHyzY7YFb1clU1c9tI7wNiDviwWYleOZXr3Xac+9yZXEkySV9j70Z0SZEWIZXYJ RjUfQtRGsKBOIwrWkR2hpwm5fQXgD3oCs2hu3HA/Vj/UJ5DW/ZQxvqeuSynppZkPr2An wkTw== X-Gm-Message-State: AOJu0Yzlop6rZ8RhP69+84gRqt9x3L1HkBNWjFiLJeg3l5X9p+MH9PS5 xyhk84fIuDZEFYe7Fwok9d1aOl91txftfpt/NzyHtnnjhLJaad+9BZGDqJQB6iILjnZX1DvQEyT hJUhgnMK/svNhNQvVKBHQOhWFpU7K/iaid3v+0zRn8/i8KPhy96FaF7VnitXo80uJrH+hRwUpPh T6znEpZDP1kp9lbsYQkDuXpVMcxXYQz5ZSnDxFj/kTQwQ= X-Received: by 2002:a05:6358:e49f:b0:1ac:f6e3:dbcd with SMTP id e5c5f4694b2df-1af3babf83fmr1244373255d.3.1723060108372; Wed, 07 Aug 2024 12:48:28 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGF6PD4SDnTvqRZFf7QOZAHAr54Efqd3zzz8coMqAlruM+AKE9ZSLZCO533yzyaTsEnxMNyzw== X-Received: by 2002:a05:6358:e49f:b0:1ac:f6e3:dbcd with SMTP id 
e5c5f4694b2df-1af3babf83fmr1244369055d.3.1723060107772; Wed, 07 Aug 2024 12:48:27 -0700 (PDT) Received: from x1n.redhat.com (pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:27 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 6/7] mm/x86: Add missing pud helpers Date: Wed, 7 Aug 2024 15:48:10 -0400 Message-ID: <20240807194812.819412-7-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" These new helpers will be needed for pud entry updates soon. Introduce these helpers by referencing the pmd ones. Namely: - pudp_invalidate() - pud_modify() Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: x86@kernel.org Signed-off-by: Peter Xu --- arch/x86/include/asm/pgtable.h | 55 +++++++++++++++++++++++++++++----- arch/x86/mm/pgtable.c | 12 ++++++++ 2 files changed, 59 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index fdb8ac9e7030..a7c1e9cfea41 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -787,6 +787,12 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd) __pgprot(pmd_flags(pmd) & ~(_PAGE_PRESENT|_PAGE_PROTNONE))); } =20 +static inline pud_t pud_mkinvalid(pud_t pud) +{ + return pfn_pud(pud_pfn(pud), + __pgprot(pud_flags(pud) & ~(_PAGE_PRESENT|_PAGE_PROTNONE))); +} + static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask); =20 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot) @@ -834,14 +840,8 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t new= prot) pmd_result =3D __pmd(val); =20 /* - * To avoid creating Write=3D0,Dirty=3D1 PMDs, pte_modify() needs to avoi= d: - * 1. Marking Write=3D0 PMDs Dirty=3D1 - * 2. Marking Dirty=3D1 PMDs Write=3D0 - * - * The first case cannot happen because the _PAGE_CHG_MASK will filter - * out any Dirty bit passed in newprot. Handle the second case by - * going through the mksaveddirty exercise. Only do this if the old - * value was Write=3D1 to avoid doing this on Shadow Stack PTEs. + * Avoid creating shadow stack PMD by accident. See comment in + * pte_modify(). 
*/ if (oldval & _PAGE_RW) pmd_result =3D pmd_mksaveddirty(pmd_result); @@ -851,6 +851,29 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t new= prot) return pmd_result; } =20 +static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) +{ + pudval_t val =3D pud_val(pud), oldval =3D val; + pud_t pud_result; + + val &=3D _HPAGE_CHG_MASK; + val |=3D check_pgprot(newprot) & ~_HPAGE_CHG_MASK; + val =3D flip_protnone_guard(oldval, val, PHYSICAL_PUD_PAGE_MASK); + + pud_result =3D __pud(val); + + /* + * Avoid creating shadow stack PUD by accident. See comment in + * pte_modify(). + */ + if (oldval & _PAGE_RW) + pud_result =3D pud_mksaveddirty(pud_result); + else + pud_result =3D pud_clear_saveddirty(pud_result); + + return pud_result; +} + /* * mprotect needs to preserve PAT and encryption bits when updating * vm_page_prot @@ -1389,10 +1412,26 @@ static inline pmd_t pmdp_establish(struct vm_area_s= truct *vma, } #endif =20 +static inline pud_t pudp_establish(struct vm_area_struct *vma, + unsigned long address, pud_t *pudp, pud_t pud) +{ + page_table_check_pud_set(vma->vm_mm, pudp, pud); + if (IS_ENABLED(CONFIG_SMP)) { + return xchg(pudp, pud); + } else { + pud_t old =3D *pudp; + WRITE_ONCE(*pudp, pud); + return old; + } +} + #define __HAVE_ARCH_PMDP_INVALIDATE_AD extern pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); =20 +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, + pud_t *pudp); + /* * Page table pages are page-aligned. The lower half of the top * level is used for userspace and the top half for the kernel. diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index d4b3ccf90236..9fc2dabf8427 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -641,6 +641,18 @@ pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, u= nsigned long address, } #endif =20 +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \ + defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD) +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, + pud_t *pudp) +{ + VM_WARN_ON_ONCE(!pud_present(*pudp)); + pud_t old =3D pudp_establish(vma, address, pudp, pud_mkinvalid(*pudp)); + flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE); + return old; +} +#endif + /** * reserve_top_address - reserves a hole in the top of kernel address space * @reserve - size of hole to reserve --=20 2.45.0 From nobody Sat Feb 7 18:20:09 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14603145335 for ; Wed, 7 Aug 2024 19:48:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060115; cv=none; b=o9UpJQzOMXuB5IgiFY9rbOAtDp2A4KT0FrAsU36IVJ1aJyHUGNQtzO7Xi45GB53vz1QUW41iu4rPzeiXcUtjQ1G0iF39h0lMbNktUHaaHi40B4OfQC4X/kLvlE9ckvkqsd6CfNBFQ6KlJM39EzEPztXWZXNStN7Ty2zrSuqjSIM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1723060115; c=relaxed/simple; bh=egk112t/6BSuiTyKSVUoJ65Hf838PJyUoM/Z74+luwA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=X4o0SfjdckQh1sy/zMPpI0vsuE8Kt3ncZkcPsFKLEcQQ/Vl27uM+iH+c1BieDc/r2epyyqfIATKPi7EafhqgZEVklmVREQXy2Krdrp2WZRya2pmmmZ4W/zTYGzQ+tiQOhHNsSxOdhRjefdyuXsf4sAerXanqEKFUmeziZb8HaJs= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=bEBNCaFW; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="bEBNCaFW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1723060113; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A/6PEtzQTa+drbYKkEnodTZuUMYm4tSA4vStMaJG0Ro=; b=bEBNCaFWkhNqkgYgioe+QAykpqrs4ItH0zPmZs/kitG5LMOh35RYXkjYyjcznhPRb+EEHN 4XOavuHY22dodEpRJDtVFDBfmL+6XC59DRz1GCz2vqpDWM491tb8bU5dCBNdCmC3V2bTKp Az56uvWuEfLY9lYlhfOW8X16wqpInGw= Received: from mail-qv1-f71.google.com (mail-qv1-f71.google.com [209.85.219.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-73-uE9-s9MzOdmlYmkBR4LhZw-1; Wed, 07 Aug 2024 15:48:32 -0400 X-MC-Unique: uE9-s9MzOdmlYmkBR4LhZw-1 Received: by mail-qv1-f71.google.com with SMTP id 6a1803df08f44-6b7ad98c1f8so389826d6.1 for ; Wed, 07 Aug 2024 12:48:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1723060111; x=1723664911; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A/6PEtzQTa+drbYKkEnodTZuUMYm4tSA4vStMaJG0Ro=; b=bLGhJMZJjepWxMmbawT/qn684rfr/Pgh5CuhAL07Jl1BqPksAj2jxWaE6MmaSMymYV 54YqLB8DlbQvY79tvuNnivdwwMNvAJDpzsfszsq27lEbPj+GWaExlR+QUDgQpSnofshb EOFAa0VD5GbKAgnDXtVRvJ13rvSM16gIiclhXtctFpDWTCygoQugkkoKgtR6ndnwu1PM DDRvYY68Tu2TKulRJfmlO6AfmfhB8U0tYGYnJZ+GDORN3U7Uzy1er5IoFvItW5lwOfwA ogma2R3t47clgifEv/njhJVUbovgCti6/pR50VjaUTazqyfdCxNpZiFeVznyTG2DMfTy 4Emg== X-Gm-Message-State: AOJu0YzsURQfKXFbhzoNRl90CyI+SMbnldUdpDdKyzPqN3Odgc0vKemc zLGuBrB0wh9ShFmAVdURBc9p3r+ALfY07kjEjhcnXPgluK7RTneeIZYr+fXn5QFrKQMHob+MzvF UnGpTj343YvFvHRBZzEkFV4zUtYhAnQQBN8j0YPBtW3YoA3vSgRhaxdXTZ3+gjXL4trvR/o1E9a F+a5NDT3OdGYxJLYrKxZEVTXSWbOkkZNHQE3GTYKgqQrA= X-Received: by 2002:a05:6214:e64:b0:6b0:8202:5c4e with SMTP id 6a1803df08f44-6bb983f0fa6mr145729416d6.5.1723060110884; Wed, 07 Aug 2024 12:48:30 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFjs/meRFYsBYhQNSwMvbRLZa9/TlQmQfQAUysM61b8jRz23PUki/2rojCFyrJzg3cPY3hNlw== X-Received: by 2002:a05:6214:e64:b0:6b0:8202:5c4e with SMTP id 6a1803df08f44-6bb983f0fa6mr145728896d6.5.1723060110042; Wed, 07 Aug 2024 12:48:30 -0700 (PDT) Received: from x1n.redhat.com (pool-99-254-121-117.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6bb9c78ae4asm59853256d6.33.2024.08.07.12.48.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Aug 2024 12:48:29 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: "Aneesh Kumar K . V" , Michael Ellerman , Oscar Salvador , Dan Williams , James Houghton , Matthew Wilcox , Nicholas Piggin , Rik van Riel , Dave Jiang , Andrew Morton , x86@kernel.org, Ingo Molnar , Rick P Edgecombe , "Kirill A . 
Shutemov" , peterx@redhat.com, linuxppc-dev@lists.ozlabs.org, Mel Gorman , Hugh Dickins , Borislav Petkov , David Hildenbrand , Thomas Gleixner , Vlastimil Babka , Dave Hansen , Christophe Leroy , Huang Ying Subject: [PATCH v4 7/7] mm/mprotect: fix dax pud handlings Date: Wed, 7 Aug 2024 15:48:11 -0400 Message-ID: <20240807194812.819412-8-peterx@redhat.com> X-Mailer: git-send-email 2.45.0 In-Reply-To: <20240807194812.819412-1-peterx@redhat.com> References: <20240807194812.819412-1-peterx@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This is only relevant to the two archs that support PUD dax, aka, x86_64 and ppc64. PUD THPs do not yet exist elsewhere, and hugetlb PUDs do not count in this case. DAX have had PUD mappings for years, but change protection path never worked. When the path is triggered in any form (a simple test program would be: call mprotect() on a 1G dev_dax mapping), the kernel will report "bad pud". This patch should fix that. The new change_huge_pud() tries to keep everything simple. For example, it doesn't optimize write bit as that will need even more PUD helpers. It's not too bad anyway to have one more write fault in the worst case once for 1G range; may be a bigger thing for each PAGE_SIZE, though. Neither does it support userfault-wp bits, as there isn't such PUD mappings that is supported; file mappings always need a split there. The same to TLB shootdown: the pmd path (which was for x86 only) has the trick of using _ad() version of pmdp_invalidate*() which can avoid one redundant TLB, but let's also leave that for later. Again, the larger the mapping, the smaller of such effect. There's some difference on handling "retry" for change_huge_pud() (where it can return 0): it isn't like change_huge_pmd(), as the pmd version is safe with all conditions handled in change_pte_range() later, thanks to Hugh's new pte_offset_map_lock(). In short, change_pte_range() is simply smarter. For that, change_pud_range() will need proper retry if it races with something else when a huge PUD changed from under us. The last thing to mention is currently the PUD path ignores the huge pte numa counter (NUMA_HUGE_PTE_UPDATES), not only because DAX is not applicable to NUMA, but also that it's ambiguous on its own to decide how to account pud in this case. In one earlier version of this patchset I proposed to remove the counter as it doesn't even look right to do the accounting as of now [1], but then a further discussion suggests we can leave that for later, as that doesn't block this series if we choose to ignore that counter. That's what this patch does, by ignoring it. When at it, touch up the comment in pgtable_split_needed() to make it generic to either pmd or pud file THPs. [1] https://lore.kernel.org/all/20240715192142.3241557-3-peterx@redhat.com/ [2] https://lore.kernel.org/r/added2d0-b8be-4108-82ca-1367a388d0b1@redhat.c= om Cc: Dan Williams Cc: Matthew Wilcox Cc: Dave Jiang Cc: Hugh Dickins Cc: Kirill A. 
Shutemov Cc: Vlastimil Babka Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Michael Ellerman Cc: Aneesh Kumar K.V Cc: Oscar Salvador Cc: x86@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Fixes: a00cc7d9dd93 ("mm, x86: add support for PUD-sized transparent hugepa= ges") Fixes: 27af67f35631 ("powerpc/book3s64/mm: enable transparent pud hugepage") Signed-off-by: Peter Xu --- include/linux/huge_mm.h | 24 +++++++++++++++++++ mm/huge_memory.c | 52 +++++++++++++++++++++++++++++++++++++++++ mm/mprotect.c | 39 ++++++++++++++++++++++++------- 3 files changed, 107 insertions(+), 8 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index ce44caa40eed..6370026689e0 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -342,6 +342,17 @@ void split_huge_pmd_address(struct vm_area_struct *vma= , unsigned long address, void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, unsigned long address); =20 +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD +int change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, + pud_t *pudp, unsigned long addr, pgprot_t newprot, + unsigned long cp_flags); +#else +static inline int +change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, + pud_t *pudp, unsigned long addr, pgprot_t newprot, + unsigned long cp_flags) { return 0; } +#endif + #define split_huge_pud(__vma, __pud, __address) \ do { \ pud_t *____pud =3D (__pud); \ @@ -585,6 +596,19 @@ static inline int next_order(unsigned long *orders, in= t prev) { return 0; } + +static inline void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, + unsigned long address) +{ +} + +static inline int change_huge_pud(struct mmu_gather *tlb, + struct vm_area_struct *vma, pud_t *pudp, + unsigned long addr, pgprot_t newprot, + unsigned long cp_flags) +{ + return 0; +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ =20 static inline int split_folio_to_list_to_order(struct folio *folio, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 81c5da0708ed..0aafd26d7a53 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2114,6 +2114,53 @@ int change_huge_pmd(struct mmu_gather *tlb, struct v= m_area_struct *vma, return ret; } =20 +/* + * Returns: + * + * - 0: if pud leaf changed from under us + * - 1: if pud can be skipped + * - HPAGE_PUD_NR: if pud was successfully processed + */ +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD +int change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, + pud_t *pudp, unsigned long addr, pgprot_t newprot, + unsigned long cp_flags) +{ + struct mm_struct *mm =3D vma->vm_mm; + pud_t oldpud, entry; + spinlock_t *ptl; + + tlb_change_page_size(tlb, HPAGE_PUD_SIZE); + + /* NUMA balancing doesn't apply to dax */ + if (cp_flags & MM_CP_PROT_NUMA) + return 1; + + /* + * Huge entries on userfault-wp only works with anonymous, while we + * don't have anonymous PUDs yet. + */ + if (WARN_ON_ONCE(cp_flags & MM_CP_UFFD_WP_ALL)) + return 1; + + ptl =3D __pud_trans_huge_lock(pudp, vma); + if (!ptl) + return 0; + + /* + * Can't clear PUD or it can race with concurrent zapping. See + * change_huge_pmd(). 
+ */ + oldpud =3D pudp_invalidate(vma, addr, pudp); + entry =3D pud_modify(oldpud, newprot); + set_pud_at(mm, addr, pudp, entry); + tlb_flush_pud_range(tlb, addr, HPAGE_PUD_SIZE); + + spin_unlock(ptl); + return HPAGE_PUD_NR; +} +#endif + #ifdef CONFIG_USERFAULTFD /* * The PT lock for src_pmd and dst_vma/src_vma (for reading) are locked by @@ -2344,6 +2391,11 @@ void __split_huge_pud(struct vm_area_struct *vma, pu= d_t *pud, spin_unlock(ptl); mmu_notifier_invalidate_range_end(&range); } +#else +void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, + unsigned long address) +{ +} #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ =20 static void __split_huge_zero_page_pmd(struct vm_area_struct *vma, diff --git a/mm/mprotect.c b/mm/mprotect.c index d423080e6509..446f8e5f10d9 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -302,8 +302,9 @@ pgtable_split_needed(struct vm_area_struct *vma, unsign= ed long cp_flags) { /* * pte markers only resides in pte level, if we need pte markers, - * we need to split. We cannot wr-protect shmem thp because file - * thp is handled differently when split by erasing the pmd so far. + * we need to split. For example, we cannot wr-protect a file thp + * (e.g. 2M shmem) because file thp is handled differently when + * split by erasing the pmd so far. */ return (cp_flags & MM_CP_UFFD_WP) && !vma_is_anonymous(vma); } @@ -430,31 +431,53 @@ static inline long change_pud_range(struct mmu_gather= *tlb, unsigned long end, pgprot_t newprot, unsigned long cp_flags) { struct mmu_notifier_range range; - pud_t *pud; + pud_t *pudp, pud; unsigned long next; long pages =3D 0, ret; =20 range.start =3D 0; =20 - pud =3D pud_offset(p4d, addr); + pudp =3D pud_offset(p4d, addr); do { +again: next =3D pud_addr_end(addr, end); - ret =3D change_prepare(vma, pud, pmd, addr, cp_flags); + ret =3D change_prepare(vma, pudp, pmd, addr, cp_flags); if (ret) { pages =3D ret; break; } - if (pud_none_or_clear_bad(pud)) + + pud =3D READ_ONCE(*pudp); + if (pud_none(pud)) continue; + if (!range.start) { mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_VMA, 0, vma->vm_mm, addr, end); mmu_notifier_invalidate_range_start(&range); } - pages +=3D change_pmd_range(tlb, vma, pud, addr, next, newprot, + + if (pud_leaf(pud)) { + if ((next - addr !=3D PUD_SIZE) || + pgtable_split_needed(vma, cp_flags)) { + __split_huge_pud(vma, pudp, addr); + goto again; + } else { + ret =3D change_huge_pud(tlb, vma, pudp, + addr, newprot, cp_flags); + if (ret =3D=3D 0) + goto again; + /* huge pud was handled */ + if (ret =3D=3D HPAGE_PUD_NR) + pages +=3D HPAGE_PUD_NR; + continue; + } + } + + pages +=3D change_pmd_range(tlb, vma, pudp, addr, next, newprot, cp_flags); - } while (pud++, addr =3D next, addr !=3D end); + } while (pudp++, addr =3D next, addr !=3D end); =20 if (range.start) mmu_notifier_invalidate_range_end(&range); --=20 2.45.0
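For completeness, here is a hedged sketch of the reproducer described in patch 7's commit message: fault in a 1G dev_dax mapping and mprotect() it, which used to make the kernel report "bad pud". The device path /dev/dax0.0 and its 1G alignment are assumptions about the test setup, not something mandated by the series.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	/* Assumes a dev_dax instance configured with 1G alignment. */
	const size_t len = 1UL << 30;
	int fd = open("/dev/dax0.0", O_RDWR);

	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* dev_dax picks a suitably aligned address for the mapping. */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fault in the 1G PUD mapping... */
	*(volatile char *)p = 1;

	/* ...then change protection; pre-series kernels hit "bad pud" here. */
	if (mprotect(p, len, PROT_READ))
		perror("mprotect");

	munmap(p, len);
	close(fd);
	return 0;
}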