From nobody Sun Feb 8 05:30:24 2026
From: Ajay Kaher
To: stable@vger.kernel.org, gregkh@linuxfoundation.org
Cc: dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, ajay.kaher@broadcom.com, alexey.makhalov@broadcom.com,
    vamsi-krishna.brahmajosyula@broadcom.com, yin.ding@broadcom.com,
    tapas.kundu@broadcom.com, Ma Wupeng,
    syzbot+5f488e922d047d8f00cc@syzkaller.appspotmail.com,
    Alexander Ofitserov
Subject: [PATCH v6.1 1/2] x86/mm/pat: clear VM_PAT if copy_p4d_range failed
Date: Wed, 24 Dec 2025 10:24:31 +0000
Message-Id: <20251224102432.923410-2-ajay.kaher@broadcom.com>
In-Reply-To: <20251224102432.923410-1-ajay.kaher@broadcom.com>
References: <20251224102432.923410-1-ajay.kaher@broadcom.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ma Wupeng

[ Upstream commit d155df53f31068c3340733d586eb9b3ddfd70fc5 ]

Syzbot reports a warning in untrack_pfn(). Digging into the root cause, we
found that it comes from a memory allocation failure in pmd_alloc_one(),
injected by failslab.

In copy_page_range(), the memory allocation for a pmd failed. During the
error handling in copy_page_range(), mmput() is called to remove all vmas.
While untracking the pfn of this empty vma, the warning happens. Here's a
simplified flow:

  dup_mm
    dup_mmap
      copy_page_range
        copy_p4d_range
          copy_pud_range
            copy_pmd_range
              pmd_alloc
                __pmd_alloc
                  pmd_alloc_one
                    page = alloc_pages(gfp, 0);
                    if (!page)
                      return NULL;
  mmput
    exit_mmap
      unmap_vmas
        unmap_single_vma
          untrack_pfn
            follow_phys
              WARN_ON_ONCE(1);

Since this vma was not set up successfully, we can clear the VM_PAT flag.
In this case, untrack_pfn() will not be called while cleaning up this vma.

Function untrack_pfn_moved() has also been renamed to fit the new logic.

Link: https://lkml.kernel.org/r/20230217025615.1595558-1-mawupeng1@huawei.com
Signed-off-by: Ma Wupeng
Reported-by: syzbot+5f488e922d047d8f00cc@syzkaller.appspotmail.com
Signed-off-by: Andrew Morton
Signed-off-by: Alexander Ofitserov
Cc: stable@vger.kernel.org
[ Ajay: Modified to apply on v6.1 ]
Signed-off-by: Ajay Kaher
---
 arch/x86/mm/pat/memtype.c | 12 ++++++++----
 include/linux/pgtable.h   |  7 ++++---
 mm/memory.c               |  1 +
 mm/mremap.c               |  2 +-
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index d6fe9093e..1ad881017 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -1137,11 +1137,15 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 }
 
 /*
- * untrack_pfn_moved is called, while mremapping a pfnmap for a new region,
- * with the old vma after its pfnmap page table has been removed. The new
- * vma has a new pfnmap to the same pfn & cache type with VM_PAT set.
+ * untrack_pfn_clear is called if the following situation fits:
+ *
+ * 1) while mremapping a pfnmap for a new region, with the old vma after
+ *    its pfnmap page table has been removed. The new vma has a new pfnmap
+ *    to the same pfn & cache type with VM_PAT set.
+ * 2) while duplicating vm area, the new vma fails to copy the pgtable from
+ *    old vma.
  */
-void untrack_pfn_moved(struct vm_area_struct *vma)
+void untrack_pfn_clear(struct vm_area_struct *vma)
 {
 	vma->vm_flags &= ~VM_PAT;
 }
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 82d78cba7..500a612ff 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1214,9 +1214,10 @@ static inline void untrack_pfn(struct vm_area_struct *vma,
 }
 
 /*
- * untrack_pfn_moved is called while mremapping a pfnmap for a new region.
+ * untrack_pfn_clear is called while mremapping a pfnmap for a new region
+ * or fails to copy pgtable during duplicate vm area.
  */
-static inline void untrack_pfn_moved(struct vm_area_struct *vma)
+static inline void untrack_pfn_clear(struct vm_area_struct *vma)
 {
 }
 #else
@@ -1228,7 +1229,7 @@ extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
-extern void untrack_pfn_moved(struct vm_area_struct *vma);
+extern void untrack_pfn_clear(struct vm_area_struct *vma);
 #endif
 
 #ifdef CONFIG_MMU
diff --git a/mm/memory.c b/mm/memory.c
index 454d91844..41a03adcf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1335,6 +1335,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 			continue;
 		if (unlikely(copy_p4d_range(dst_vma, src_vma, dst_pgd, src_pgd,
 					    addr, next))) {
+			untrack_pfn_clear(dst_vma);
 			ret = -ENOMEM;
 			break;
 		}
diff --git a/mm/mremap.c b/mm/mremap.c
index 930f65c31..6ed28eeae 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -682,7 +682,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 
 	/* Tell pfnmap has moved from this vma */
 	if (unlikely(vma->vm_flags & VM_PFNMAP))
-		untrack_pfn_moved(vma);
+		untrack_pfn_clear(vma);
 
 	if (unlikely(!err && (flags & MREMAP_DONTUNMAP))) {
 		/* We always clear VM_LOCKED[ONFAULT] on the old vma */
-- 
2.40.4
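
The failure mode fixed above needs a VM_PFNMAP mapping to be mid-copy when
fork() runs out of memory. The following is a minimal userspace sketch of
that scenario, not the syzbot reproducer: it assumes a pfn-mapping device
such as /dev/mem (any driver whose mmap sets VM_PFNMAP would do) and assumes
pmd allocation failures are injected separately, e.g. via failslab, while
the fork loop runs.

  /* Hedged sketch: map a pfnmap region, then fork repeatedly so that an
   * injected pmd allocation failure aborts copy_page_range() mid-way. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  int main(void)
  {
          int fd = open("/dev/mem", O_RDONLY);   /* example VM_PFNMAP source */
          if (fd < 0) {
                  perror("open");
                  return 1;
          }
          void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
          if (p == MAP_FAILED) {
                  perror("mmap");
                  return 1;
          }
          /* Fault injection decides which fork() fails mid-copy. */
          for (int i = 0; i < 1000; i++) {
                  pid_t pid = fork();
                  if (pid == 0)
                          _exit(0);
                  if (pid > 0)
                          waitpid(pid, NULL, 0);
          }
          return 0;
  }

On a kernel with this patch applied, an aborted fork() no longer hits the
untrack_pfn() warning because the half-built dst vma no longer carries VM_PAT.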
From nobody Sun Feb 8 05:30:24 2026
From: Ajay Kaher
To: stable@vger.kernel.org, gregkh@linuxfoundation.org
Cc: dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, ajay.kaher@broadcom.com, alexey.makhalov@broadcom.com,
    vamsi-krishna.brahmajosyula@broadcom.com, yin.ding@broadcom.com,
    tapas.kundu@broadcom.com, xingwei lee, yuxin wang, Marius Fleischer,
    David Hildenbrand, Ingo Molnar, Rik van Riel, Linus Torvalds, Sasha Levin
Subject: [PATCH v6.1 2/2] x86/mm/pat: Fix VM_PAT handling when fork() fails in copy_page_range()
Date: Wed, 24 Dec 2025 10:24:32 +0000
Message-Id: <20251224102432.923410-3-ajay.kaher@broadcom.com>
In-Reply-To: <20251224102432.923410-1-ajay.kaher@broadcom.com>
References: <20251224102432.923410-1-ajay.kaher@broadcom.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: David Hildenbrand

[ Upstream commit dc84bc2aba85a1508f04a936f9f9a15f64ebfb31 ]

If track_pfn_copy() fails, we already added the dst VMA to the maple
tree. As fork() fails, we'll cleanup the maple tree, and stumble over
the dst VMA for which we neither performed any reservation nor copied
any page tables.

Consequently untrack_pfn() will see VM_PAT and try obtaining the PAT
information from the page table -- which fails because the page table
was not copied.

The easiest fix would be to simply clear the VM_PAT flag of the dst VMA
if track_pfn_copy() fails. However, the whole thing about "simply"
clearing the VM_PAT flag is shaky as well: if we passed track_pfn_copy()
and performed a reservation, but copying the page tables fails, we'll
simply clear the VM_PAT flag, not properly undoing the reservation ...
which is also wrong.

So let's fix it properly: set the VM_PAT flag only if the reservation
succeeded (leaving it clear initially), and undo the reservation if
anything goes wrong while copying the page tables: clearing the VM_PAT
flag after undoing the reservation.

Note that any copied page table entries will get zapped when the VMA will
get removed later, after copy_page_range() succeeded; as VM_PAT is not set
then, we won't try cleaning VM_PAT up once more and untrack_pfn() will be
happy. Note that leaving these page tables in place without a reservation
is not a problem, as we are aborting fork(); this process will never run.

A reproducer can trigger this usually at the first try:

  https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/pat_fork.c

  WARNING: CPU: 26 PID: 11650 at arch/x86/mm/pat/memtype.c:983 get_pat_info+0xf6/0x110
  Modules linked in: ...
  CPU: 26 UID: 0 PID: 11650 Comm: repro3 Not tainted 6.12.0-rc5+ #92
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
  RIP: 0010:get_pat_info+0xf6/0x110
  ...
  Call Trace:
  ...
   untrack_pfn+0x52/0x110
   unmap_single_vma+0xa6/0xe0
   unmap_vmas+0x105/0x1f0
   exit_mmap+0xf6/0x460
   __mmput+0x4b/0x120
   copy_process+0x1bf6/0x2aa0
   kernel_clone+0xab/0x440
   __do_sys_clone+0x66/0x90
   do_syscall_64+0x95/0x180

Likely this case was missed in:

  d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed")

... and instead of undoing the reservation we simply cleared the VM_PAT flag.

Keep the documentation of these functions in include/linux/pgtable.h, one
place is more than sufficient -- we should clean that up for the other
functions like track_pfn_remap/untrack_pfn separately.

Fixes: d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed")
Fixes: 2ab640379a0a ("x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3")
Reported-by: xingwei lee
Reported-by: yuxin wang
Reported-by: Marius Fleischer
Signed-off-by: David Hildenbrand
Signed-off-by: Ingo Molnar
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Rik van Riel
Cc: "H. Peter Anvin"
Cc: Linus Torvalds
Cc: Andrew Morton
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250321112323.153741-1-david@redhat.com
Closes: https://lore.kernel.org/lkml/CABOYnLx_dnqzpCW99G81DmOr+2UzdmZMk=T3uxwNxwz+R1RAwg@mail.gmail.com/
Closes: https://lore.kernel.org/lkml/CAJg=8jwijTP5fre8woS4JVJQ8iUA6v+iNcsOgtj9Zfpc3obDOQ@mail.gmail.com/
Signed-off-by: Sasha Levin
Cc: stable@vger.kernel.org
[ Ajay: Modified to apply on v6.1 ]
Signed-off-by: Ajay Kaher
---
 arch/x86/mm/pat/memtype.c | 52 +++++++++++++++++++++------------------
 include/linux/pgtable.h   | 28 ++++++++++++++++-----
 kernel/fork.c             |  4 +++
 mm/memory.c               | 11 +++------
 4 files changed, 58 insertions(+), 37 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index 1ad881017..67438ed59 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -1029,29 +1029,42 @@ static int get_pat_info(struct vm_area_struct *vma, resource_size_t *paddr,
 	return -EINVAL;
 }
 
-/*
- * track_pfn_copy is called when vma that is covering the pfnmap gets
- * copied through copy_page_range().
- *
- * If the vma has a linear pfn mapping for the entire range, we get the prot
- * from pte and reserve the entire vma range with single reserve_pfn_range call.
- */
-int track_pfn_copy(struct vm_area_struct *vma)
+int track_pfn_copy(struct vm_area_struct *dst_vma,
+		struct vm_area_struct *src_vma, unsigned long *pfn)
 {
+	const unsigned long vma_size = src_vma->vm_end - src_vma->vm_start;
 	resource_size_t paddr;
-	unsigned long vma_size = vma->vm_end - vma->vm_start;
 	pgprot_t pgprot;
+	int rc;
 
-	if (vma->vm_flags & VM_PAT) {
-		if (get_pat_info(vma, &paddr, &pgprot))
-			return -EINVAL;
-		/* reserve the whole chunk covered by vma. */
-		return reserve_pfn_range(paddr, vma_size, &pgprot, 1);
-	}
+	if (!(src_vma->vm_flags & VM_PAT))
+		return 0;
+
+	/*
+	 * Duplicate the PAT information for the dst VMA based on the src
+	 * VMA.
+	 */
+	if (get_pat_info(src_vma, &paddr, &pgprot))
+		return -EINVAL;
+	rc = reserve_pfn_range(paddr, vma_size, &pgprot, 1);
+	if (rc)
+		return rc;
 
+	/* Reservation for the destination VMA succeeded. */
+	dst_vma->vm_flags |= VM_PAT;
+	*pfn = PHYS_PFN(paddr);
 	return 0;
 }
 
+void untrack_pfn_copy(struct vm_area_struct *dst_vma, unsigned long pfn)
+{
+	untrack_pfn(dst_vma, pfn, dst_vma->vm_end - dst_vma->vm_start);
+	/*
+	 * Reservation was freed, any copied page tables will get cleaned
+	 * up later, but without getting PAT involved again.
+	 */
+}
+
 /*
  * prot is passed in as a parameter for the new mapping. If the vma has
  * a linear pfn mapping for the entire range, or no vma is provided,
@@ -1136,15 +1149,6 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 	vma->vm_flags &= ~VM_PAT;
 }
 
-/*
- * untrack_pfn_clear is called if the following situation fits:
- *
- * 1) while mremapping a pfnmap for a new region, with the old vma after
- *    its pfnmap page table has been removed. The new vma has a new pfnmap
- *    to the same pfn & cache type with VM_PAT set.
- * 2) while duplicating vm area, the new vma fails to copy the pgtable from
- *    old vma.
- */
 void untrack_pfn_clear(struct vm_area_struct *vma)
 {
 	vma->vm_flags &= ~VM_PAT;
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 500a612ff..943c47c95 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1195,14 +1195,25 @@ static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 }
 
 /*
- * track_pfn_copy is called when vma that is covering the pfnmap gets
- * copied through copy_page_range().
+ * track_pfn_copy is called when a VM_PFNMAP VMA is about to get the page
+ * tables copied during copy_page_range(). On success, stores the pfn to be
+ * passed to untrack_pfn_copy().
  */
-static inline int track_pfn_copy(struct vm_area_struct *vma)
+static inline int track_pfn_copy(struct vm_area_struct *dst_vma,
+		struct vm_area_struct *src_vma, unsigned long *pfn)
 {
 	return 0;
 }
 
+/*
+ * untrack_pfn_copy is called when a VM_PFNMAP VMA failed to copy during
+ * copy_page_range(), but after track_pfn_copy() was already called.
+ */
+static inline void untrack_pfn_copy(struct vm_area_struct *dst_vma,
+		unsigned long pfn)
+{
+}
+
 /*
  * untrack_pfn is called while unmapping a pfnmap for a region.
  * untrack can be called for a specific region indicated by pfn and size or
@@ -1214,8 +1225,10 @@ static inline void untrack_pfn(struct vm_area_struct *vma,
 }
 
 /*
- * untrack_pfn_clear is called while mremapping a pfnmap for a new region
- * or fails to copy pgtable during duplicate vm area.
+ * untrack_pfn_clear is called in the following cases on a VM_PFNMAP VMA:
+ *
+ * 1) During mremap() on the src VMA after the page tables were moved.
+ * 2) During fork() on the dst VMA, immediately after duplicating the src VMA.
  */
 static inline void untrack_pfn_clear(struct vm_area_struct *vma)
 {
@@ -1226,7 +1239,10 @@ extern int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 			   unsigned long size);
 extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 			     pfn_t pfn);
-extern int track_pfn_copy(struct vm_area_struct *vma);
+extern int track_pfn_copy(struct vm_area_struct *dst_vma,
+		struct vm_area_struct *src_vma, unsigned long *pfn);
+extern void untrack_pfn_copy(struct vm_area_struct *dst_vma,
+		unsigned long pfn);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
 extern void untrack_pfn_clear(struct vm_area_struct *vma);
diff --git a/kernel/fork.c b/kernel/fork.c
index cbd68079c..992068b7f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -476,6 +476,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 		*new = data_race(*orig);
 		INIT_LIST_HEAD(&new->anon_vma_chain);
 		dup_anon_vma_name(orig, new);
+
+		/* track_pfn_copy() will later take care of copying internal state. */
+		if (unlikely(new->vm_flags & VM_PFNMAP))
+			untrack_pfn_clear(new);
 	}
 	return new;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 41a03adcf..38e08d378 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1278,12 +1278,12 @@ int
 copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 {
 	pgd_t *src_pgd, *dst_pgd;
-	unsigned long next;
 	unsigned long addr = src_vma->vm_start;
 	unsigned long end = src_vma->vm_end;
 	struct mm_struct *dst_mm = dst_vma->vm_mm;
 	struct mm_struct *src_mm = src_vma->vm_mm;
 	struct mmu_notifier_range range;
+	unsigned long next, pfn;
 	bool is_cow;
 	int ret;
 
@@ -1294,11 +1294,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 		return copy_hugetlb_page_range(dst_mm, src_mm, dst_vma, src_vma);
 
 	if (unlikely(src_vma->vm_flags & VM_PFNMAP)) {
-		/*
-		 * We do not free on error cases below as remove_vma
-		 * gets called on error from higher level routine
-		 */
-		ret = track_pfn_copy(src_vma);
+		ret = track_pfn_copy(dst_vma, src_vma, &pfn);
 		if (ret)
 			return ret;
 	}
@@ -1335,7 +1331,6 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 			continue;
 		if (unlikely(copy_p4d_range(dst_vma, src_vma, dst_pgd, src_pgd,
 					    addr, next))) {
-			untrack_pfn_clear(dst_vma);
 			ret = -ENOMEM;
 			break;
 		}
@@ -1345,6 +1340,8 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 		raw_write_seqcount_end(&src_mm->write_protect_seq);
 		mmu_notifier_invalidate_range_end(&range);
 	}
+	if (ret && unlikely(src_vma->vm_flags & VM_PFNMAP))
+		untrack_pfn_copy(dst_vma, pfn);
 	return ret;
 }
-- 
2.40.4
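
Taken together, the two patches enforce one ownership rule: the dst VMA
carries VM_PAT only while a PAT reservation actually backs it. The toy
program below models that rule outside the kernel; reserve_range(),
unreserve_range(), copy_page_tables() and FLAG_OWNS_RESERVATION are
hypothetical stand-ins for reserve_pfn_range(), untrack_pfn(),
copy_p4d_range() and VM_PAT, shown only to illustrate the ordering the fix
relies on, not the kernel implementation itself.

  #include <stdbool.h>
  #include <stdio.h>

  struct vma { unsigned int flags; };
  #define FLAG_OWNS_RESERVATION 0x1u

  static bool reserve_range(void)    { return true;  }  /* pretend it worked */
  static void unreserve_range(void)  { puts("reservation undone"); }
  static bool copy_page_tables(void) { return false; }  /* pretend it failed */

  int main(void)
  {
          struct vma dst = { .flags = 0 };        /* starts clear, like the dst VMA */

          if (!reserve_range())
                  return 1;                       /* nothing to undo */
          dst.flags |= FLAG_OWNS_RESERVATION;     /* set only after success */

          if (!copy_page_tables()) {
                  unreserve_range();              /* undo the reservation first ... */
                  dst.flags &= ~FLAG_OWNS_RESERVATION;  /* ... then clear the flag */
          }

          /* Teardown touches the reservation only if the flag is still set,
           * so it never sees a flag without a backing reservation. */
          if (dst.flags & FLAG_OWNS_RESERVATION)
                  unreserve_range();
          return 0;
  }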