From nobody Sun Feb 8 05:20:10 2026
Date: Mon, 2 Feb 2026 19:41:23 +0000
In-Reply-To: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.53.0.rc1.225.gd81095ad13-goog
Message-ID: <20260202194125.2191216-2-jiaqiyan@google.com>
Subject: [PATCH v4 1/3] mm/page_alloc: only free healthy pages in high-order has_hwpoisoned folio
From: Jiaqi Yan
To: jackmanb@google.com, hannes@cmpxchg.org, linmiaohe@huawei.com,
	ziy@nvidia.com, harry.yoo@oracle.com, willy@infradead.org
Cc: nao.horiguchi@gmail.com, david@redhat.com, lorenzo.stoakes@oracle.com,
	william.roche@oracle.com, tony.luck@intel.com,
	wangkefeng.wang@huawei.com, jane.chu@oracle.com,
	akpm@linux-foundation.org, osalvador@suse.de, muchun.song@linux.dev,
	rientjes@google.com, duenwen@google.com, jthoughton@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, boudewijn@delta-utec.com,
	Jiaqi Yan
Content-Type: text/plain; charset="utf-8"

At the end of dissolve_free_hugetlb_folio(), a free HugeTLB folio
becomes non-HugeTLB and is released to the buddy allocator as a
high-order folio, e.g. a folio that contains 262144 pages if the folio
was a 1G HugeTLB hugepage.

This is problematic if the HugeTLB hugepage contained HWPoison
subpages. In that case, since the buddy allocator does not check
HWPoison for non-zero-order folios, a raw HWPoison page can be given
out together with its buddy page and be re-used by either the kernel
or userspace.

Memory failure recovery (MFR) in the kernel does attempt to take the
raw HWPoison page off the buddy allocator after
dissolve_free_hugetlb_folio(). However, there is always a time window
between dissolve_free_hugetlb_folio() freeing a HWPoison high-order
folio to the buddy allocator and MFR taking the raw HWPoison page off
the buddy allocator.

Another similar situation is when a transparent huge page (THP) runs
into memory failure but splitting fails. Such a THP will eventually be
released to the buddy allocator when the owning userspace processes
are gone, but with certain subpages still having HWPoison.

One obvious way to avoid both problems is to add page sanity checks in
the page allocation or free path. However, that goes against past
efforts to reduce sanity check overhead [1,2,3].

Introduce free_has_hwpoisoned() to free only the healthy pages and to
exclude the HWPoison ones in the high-order folio. The idea is to
iterate through the subpages of the folio to identify contiguous
ranges of healthy pages. Instead of freeing pages one by one,
decompose each healthy range into the largest possible blocks of
different orders. Every block meets the requirements to be freed via
__free_one_page().

free_has_hwpoisoned() has linear time complexity with respect to the
number of pages in the folio. While the power-of-two decomposition
ensures that the number of calls to the buddy allocator is logarithmic
for each contiguous healthy range, the mandatory linear scan of pages
to check PageHWPoison() determines the overall time complexity. For a
1G hugepage having several HWPoison pages, free_has_hwpoisoned() takes
around 2ms on average.

Since free_has_hwpoisoned() has nontrivial overhead, it is added to
free_pages_prepare() as a shortcut and is only taken if
PG_has_hwpoisoned indicates that a HWPoison page exists and after all
checks and preparations have succeeded.
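
The scan-and-decompose idea above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the kernel
code in this patch: struct block, lowest_bit(), highest_bit(),
decompose_range(), free_healthy() and MAX_BLOCKS are hypothetical
names, a bool array stands in for PageHWPoison(), and blocks are
recorded in an array instead of being handed to the buddy allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 64	/* hypothetical cap on recorded blocks */

/* One naturally aligned block: 2^order pages starting at pfn. */
struct block {
	unsigned long pfn;
	unsigned int order;
};

/* Index of the lowest set bit of x; x must be nonzero. */
static unsigned int lowest_bit(unsigned long x)
{
	unsigned int n = 0;

	assert(x != 0);
	while (!(x & 1UL)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Index of the highest set bit of x; x must be nonzero. */
static unsigned int highest_bit(unsigned long x)
{
	unsigned int n = 0;

	assert(x != 0);
	while ((x >>= 1) != 0)
		n++;
	return n;
}

/*
 * Split [pfn, pfn + nr_pages) into blocks whose order is the minimum
 * of the alignment of the current pfn and the largest power of two
 * that fits in the remaining pages. Returns the number of blocks
 * written to out (at most max).
 */
static size_t decompose_range(unsigned long pfn, unsigned long nr_pages,
			      struct block *out, size_t max)
{
	const unsigned long end = pfn + nr_pages;
	size_t n = 0;

	while (pfn < end && n < max) {
		unsigned int order = highest_bit(end - pfn);

		/* PFN 0 is aligned to every order; only size limits it. */
		if (pfn != 0 && lowest_bit(pfn) < order)
			order = lowest_bit(pfn);

		out[n].pfn = pfn;
		out[n].order = order;
		n++;
		pfn += 1UL << order;
	}
	return n;
}

/*
 * Linear scan: poisoned[i] describes page base_pfn + i. Each maximal
 * run of healthy pages is decomposed; each poisoned page is skipped
 * and counted in *nr_poisoned.
 */
static size_t free_healthy(unsigned long base_pfn, const bool *poisoned,
			   unsigned long nr_pages, struct block *out,
			   size_t max, unsigned long *nr_poisoned)
{
	unsigned long i = 0;
	size_t n = 0;

	*nr_poisoned = 0;
	while (i < nr_pages) {
		unsigned long start = i;

		while (i < nr_pages && !poisoned[i])
			i++;
		n += decompose_range(base_pfn + start, i - start,
				     out + n, max - n);
		if (i < nr_pages) {	/* stopped on a poisoned page */
			(*nr_poisoned)++;
			i++;
		}
	}
	return n;
}
```

For 16 pages starting at PFN 0x1000 with only the page at index 5
poisoned, this sketch yields four blocks: (0x1000, order 2),
(0x1004, order 0), (0x1006, order 1) and (0x1008, order 3), i.e. 15
healthy pages in blocks and 1 page excluded.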

[1] https://lore.kernel.org/linux-mm/1460711275-1130-15-git-send-email-mgorman@techsingularity.net
[2] https://lore.kernel.org/linux-mm/1460711275-1130-16-git-send-email-mgorman@techsingularity.net
[3] https://lore.kernel.org/all/20230216095131.17336-1-vbabka@suse.cz

Signed-off-by: Jiaqi Yan
---
 mm/page_alloc.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 131 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cbf758e27aa2c..d6883f1b17d95 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -242,6 +242,7 @@ gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
 unsigned int pageblock_order __read_mostly;
 #endif
 
+static void free_has_hwpoisoned(struct page *page, unsigned int order);
 static void __free_pages_ok(struct page *page, unsigned int order,
 			    fpi_t fpi_flags);
 
@@ -1340,14 +1341,30 @@ static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr)
 
 #endif /* CONFIG_MEM_ALLOC_PROFILING */
 
-__always_inline bool free_pages_prepare(struct page *page,
-			unsigned int order)
+/*
+ * Returns
+ * - true: checks and preparations all good, caller can proceed freeing.
+ * - false: do not proceed freeing for one of the two reasons:
+ *   1. Some check failed so it is not safe to proceed freeing.
+ *   2. A compound page having some HWPoison pages. The healthy pages
+ *      are already safely freed, and HWPoison ones isolated.
+ */
+__always_inline bool free_pages_prepare(struct page *page, unsigned int order)
 {
 	int bad = 0;
 	bool skip_kasan_poison = should_skip_kasan_poison(page);
 	bool init = want_init_on_free();
 	bool compound = PageCompound(page);
 	struct folio *folio = page_folio(page);
+	/*
+	 * When dealing with compound page, PG_has_hwpoisoned is cleared
+	 * with PAGE_FLAGS_SECOND. So the check must be done first.
+	 *
+	 * Note we can't exclude PG_has_hwpoisoned from PAGE_FLAGS_SECOND.
+	 * Because PG_has_hwpoisoned == PG_active, free_page_is_bad() will
+	 * be confused and complain that the first tail page is still active.
+	 */
+	bool should_fhh = compound && folio_test_has_hwpoisoned(folio);
 
 	VM_BUG_ON_PAGE(PageTail(page), page);
 
@@ -1470,6 +1487,16 @@ __always_inline bool free_pages_prepare(struct page *page,
 
 	debug_pagealloc_unmap_pages(page, 1 << order);
 
+	/*
+	 * After breaking down compound page and dealing with page metadata
+	 * (e.g. page owner and page alloc tags), take a shortcut if this
+	 * was a compound page containing certain HWPoison subpages.
+	 */
+	if (should_fhh) {
+		free_has_hwpoisoned(page, order);
+		return false;
+	}
+
 	return true;
 }
 
@@ -2953,6 +2980,108 @@ static bool free_frozen_page_commit(struct zone *zone,
 	return ret;
 }
 
+/*
+ * Given a range of physically contiguous pages, efficiently free them
+ * block by block. Block order is chosen to meet the PFN alignment
+ * requirement in __free_one_page().
+ */
+static void free_contiguous_pages(struct page *curr,
+				  unsigned long nr_pages)
+{
+	unsigned int order;
+	unsigned int align_order;
+	unsigned int size_order;
+	unsigned long remaining;
+	unsigned long pfn = page_to_pfn(curr);
+	const unsigned long end_pfn = pfn + nr_pages;
+	struct zone *zone = page_zone(curr);
+
+	/*
+	 * This decomposition algorithm at every iteration chooses the
+	 * order to be the minimum of two constraints:
+	 * - Alignment: the largest power-of-two that divides current pfn.
+	 * - Size: the largest power-of-two that fits in the current
+	 *   remaining number of pages.
+	 */
+	while (pfn < end_pfn) {
+		remaining = end_pfn - pfn;
+		align_order = ffs(pfn) - 1;
+		size_order = fls_long(remaining) - 1;
+		order = min(align_order, size_order);
+
+		free_one_page(zone, curr, pfn, order, FPI_NONE);
+		curr += (1UL << order);
+		pfn += (1UL << order);
+	}
+
+	VM_WARN_ON(pfn != end_pfn);
+}
+
+/*
+ * Given a high-order compound page containing certain number of HWPoison
+ * pages, free only the healthy ones assuming FPI_NONE.
+ *
+ * Pages must have passed free_pages_prepare(). Even if having HWPoison
+ * pages, breaking down compound page and updating metadata (e.g. page
+ * owner, alloc tag) can be done together during free_pages_prepare(),
+ * which simplifies the splitting here: unlike __split_unmapped_folio(),
+ * there is no need to turn split pages into a compound page or to carry
+ * metadata.
+ *
+ * It calls free_one_page() O(2^order) times and causes nontrivial overhead.
+ * So only use this when the compound page really contains HWPoison.
+ *
+ * This implementation doesn't work in memdesc world.
+ */
+static void free_has_hwpoisoned(struct page *page, unsigned int order)
+{
+	struct page *curr = page;
+	struct page *next;
+	unsigned long nr_pages;
+	/*
+	 * Don't assume end points to a valid page. It is only used
+	 * here for pointer arithmetic.
+	 */
+	struct page *end = page + (1 << order);
+	unsigned long total_freed = 0;
+	unsigned long total_hwp = 0;
+
+	VM_WARN_ON(order == 0);
+	VM_WARN_ON(page->flags.f & PAGE_FLAGS_CHECK_AT_PREP);
+
+	while (curr < end) {
+		next = curr;
+		nr_pages = 0;
+
+		while (next < end && !PageHWPoison(next)) {
+			++next;
+			++nr_pages;
+		}
+
+		if (next != end && PageHWPoison(next)) {
+			/*
+			 * Avoid accounting error when the page is freed
+			 * by unpoison_memory().
+			 */
+			clear_page_tag_ref(next);
+			++total_hwp;
+		}
+
+		free_contiguous_pages(curr, nr_pages);
+		total_freed += nr_pages;
+
+		if (next == end)
+			break;
+
+		VM_WARN_ON(!PageHWPoison(next));
+		curr = next + 1;
+	}
+
+	VM_WARN_ON(total_freed + total_hwp != (1 << order));
+	pr_info("Freed %#lx pages, excluded %lu hwpoison pages\n",
+		total_freed, total_hwp);
+}
+
 /*
  * Free a pcp page
  */
-- 
2.53.0.rc2.204.g2597b5adb4-goog

From nobody Sun Feb 8 05:20:10 2026
Date: Mon, 2 Feb 2026 19:41:24 +0000
In-Reply-To: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.53.0.rc1.225.gd81095ad13-goog
Message-ID: <20260202194125.2191216-3-jiaqiyan@google.com>
Subject: [PATCH v4 2/3] mm/memory-failure: set has_hwpoisoned flags on dissolved HugeTLB folio
From: Jiaqi Yan
To: jackmanb@google.com, hannes@cmpxchg.org, linmiaohe@huawei.com,
	ziy@nvidia.com, harry.yoo@oracle.com, willy@infradead.org
Cc: nao.horiguchi@gmail.com, david@redhat.com, lorenzo.stoakes@oracle.com,
	william.roche@oracle.com, tony.luck@intel.com,
	wangkefeng.wang@huawei.com, jane.chu@oracle.com,
	akpm@linux-foundation.org, osalvador@suse.de, muchun.song@linux.dev,
	rientjes@google.com, duenwen@google.com, jthoughton@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, boudewijn@delta-utec.com,
	Jiaqi Yan
Content-Type: text/plain; charset="utf-8"

When a free HWPoison HugeTLB folio is dissolved, it becomes
non-HugeTLB and is released to the buddy allocator as a high-order
folio.

Set the has_hwpoisoned flag on the high-order folio so that the buddy
allocator can tell that it contains HWPoison page(s) and can handle it
specially with free_has_hwpoisoned().

Signed-off-by: Jiaqi Yan
---
 include/linux/page-flags.h | 2 +-
 mm/memory-failure.c        | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index f7a0e4af0c734..d13835e265952 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -904,7 +904,7 @@ static inline int PageTransCompound(const struct page *page)
 TESTPAGEFLAG_FALSE(TransCompound, transcompound)
 #endif
 
-#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+#if defined(CONFIG_MEMORY_FAILURE) && (defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLB_PAGE))
 /*
  * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
  * compound page.
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index c80c2907da333..529a83a325740 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1952,6 +1952,7 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio)
 	if (folio_test_hugetlb_vmemmap_optimized(folio))
 		return;
 	folio_clear_hwpoison(folio);
+	folio_set_has_hwpoisoned(folio);
 	folio_free_raw_hwp(folio, true);
 }
 
-- 
2.53.0.rc2.204.g2597b5adb4-goog

From nobody Sun Feb 8 05:20:10 2026
Date: Mon, 2 Feb 2026 19:41:25 +0000
In-Reply-To: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260202194125.2191216-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.53.0.rc1.225.gd81095ad13-goog
Message-ID: <20260202194125.2191216-4-jiaqiyan@google.com>
Subject: [PATCH v4 3/3] mm/memory-failure: skip take_page_off_buddy after dissolving HWPoison HugeTLB page
From: Jiaqi Yan
To: jackmanb@google.com, hannes@cmpxchg.org, linmiaohe@huawei.com,
	ziy@nvidia.com, harry.yoo@oracle.com, willy@infradead.org
Cc: nao.horiguchi@gmail.com, david@redhat.com, lorenzo.stoakes@oracle.com,
	william.roche@oracle.com, tony.luck@intel.com,
	wangkefeng.wang@huawei.com, jane.chu@oracle.com,
	akpm@linux-foundation.org, osalvador@suse.de, muchun.song@linux.dev,
	rientjes@google.com, duenwen@google.com, jthoughton@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, boudewijn@delta-utec.com,
	Jiaqi Yan
Content-Type: text/plain; charset="utf-8"

Now that HWPoison subpage(s) within a HugeTLB page are rejected by the
buddy allocator during dissolve_free_hugetlb_folio(), there is no need
to call drain_all_pages() and take_page_off_buddy() anymore. In fact,
calling take_page_off_buddy() after dissolve_free_hugetlb_folio() has
succeeded returns false, making the caller think
__page_handle_poison() failed.

Add __hugepage_handle_poison() and use it in place of
__page_handle_poison() at the HugeTLB-specific call sites. The HugeTLB
page being handled either is free at the moment of
try_memory_failure_hugetlb(), or becomes free at the moment of
me_huge_page().

Signed-off-by: Jiaqi Yan
---
 mm/memory-failure.c | 36 ++++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 529a83a325740..58b34f5d2c05d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -163,6 +163,30 @@ static struct rb_root_cached pfn_space_itree = RB_ROOT_CACHED;
 static DEFINE_MUTEX(pfn_space_lock);
 
 /*
+ * Only for a HugeTLB page being handled by memory_failure(). The key
+ * difference from soft_offline() is that no HWPoison subpage will make it
+ * into the buddy allocator after a successful dissolve_free_hugetlb_folio(),
+ * so take_page_off_buddy() is unnecessary.
+ */
+static int __hugepage_handle_poison(struct page *page)
+{
+	struct folio *folio = page_folio(page);
+
+	VM_WARN_ON_FOLIO(!folio_test_hwpoison(folio), folio);
+
+	/*
+	 * Can't use dissolve_free_hugetlb_folio() without a reliable
+	 * raw_hwp_list telling which subpage is HWPoison.
+	 */
+	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
+		/* raw_hwp_list becomes unreliable when kmalloc() fails. */
+		return -ENOMEM;
+
+	return dissolve_free_hugetlb_folio(folio);
+}
+
+/*
+ * Only for a free or HugeTLB page being handled by soft_offline().
  * Return values:
  *   1:   the page is dissolved (if needed) and taken off from buddy,
  *   0:   the page is dissolved (if needed) and not taken off from buddy,
@@ -1174,11 +1198,11 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * subpages.
 		 */
 		folio_put(folio);
-		if (__page_handle_poison(p) > 0) {
+		if (__hugepage_handle_poison(p)) {
+			res = MF_FAILED;
+		} else {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
-		} else {
-			res = MF_FAILED;
 		}
 	}
 
@@ -2067,11 +2091,11 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	 */
 	if (res == 0) {
 		folio_unlock(folio);
-		if (__page_handle_poison(p) > 0) {
+		if (__hugepage_handle_poison(p)) {
+			res = MF_FAILED;
+		} else {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
-		} else {
-			res = MF_FAILED;
 		}
 		return action_result(pfn, MF_MSG_FREE_HUGE, res);
 	}
-- 
2.53.0.rc2.204.g2597b5adb4-goog