Date: Mon, 12 Jan 2026 00:49:23 +0000
In-Reply-To: <20260112004923.888429-1-jiaqiyan@google.com>
References: <20260112004923.888429-1-jiaqiyan@google.com>
Message-ID: <20260112004923.888429-4-jiaqiyan@google.com>
Subject: [PATCH v3 3/3] mm/memory-failure: refactor page_handle_poison()
From: Jiaqi Yan
To: jackmanb@google.com, hannes@cmpxchg.org, linmiaohe@huawei.com,
	ziy@nvidia.com, harry.yoo@oracle.com, willy@infradead.org
Cc: nao.horiguchi@gmail.com, david@redhat.com, lorenzo.stoakes@oracle.com,
	william.roche@oracle.com, tony.luck@intel.com,
	wangkefeng.wang@huawei.com, jane.chu@oracle.com,
	akpm@linux-foundation.org, osalvador@suse.de, muchun.song@linux.dev,
	rientjes@google.com, duenwen@google.com, jthoughton@google.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
	surenb@google.com, mhocko@suse.com, Jiaqi Yan

Now that HWPoison pages within a HugeTLB page are rejected by the buddy
allocator during dissolve_free_hugetlb_folio(), there is no need to call
drain_all_pages() and take_page_off_buddy() afterwards. In fact, calling
take_page_off_buddy() after dissolve_free_hugetlb_folio() has succeeded
returns false, making the caller think page_handle_poison() failed.

On the other hand, for hardware-corrupted pages in the buddy allocator,
take_page_off_buddy() is still required.

Since hugepages and free buddy pages need to be treated differently,
refactor page_handle_poison() and __page_handle_poison():

- __page_handle_poison() is unwound into page_handle_poison().

- Callers of page_handle_poison() must now explicitly state whether the
  page is a HugeTLB hugepage or a free buddy page.

- Add the helper hugepage_handle_poison() for the existing
  HugeTLB-specific call sites.

Signed-off-by: Jiaqi Yan
---
 mm/memory-failure.c | 84 ++++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 43 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index d204de6c9792a..1fdaee1e48bb8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -162,54 +162,48 @@ static struct rb_root_cached pfn_space_itree = RB_ROOT_CACHED;
 
 static DEFINE_MUTEX(pfn_space_lock);
 
-/*
- * Return values:
- *   1:   the page is dissolved (if needed) and taken off from buddy,
- *   0:   the page is dissolved (if needed) and not taken off from buddy,
- *   < 0: failed to dissolve.
+/**
+ * Handle the HugeTLB hugepage that @page belongs to. Return values:
+ *   = 0: the hugepage is free hugepage and is dissolved.
+ *   < 0: hugepage is in-use or failed to dissolve.
  */
-static int __page_handle_poison(struct page *page)
+static int hugepage_handle_poison(struct page *page)
 {
-	int ret;
+	return dissolve_free_hugetlb_folio(page_folio(page));
+}
+
+/**
+ * Helper at the end of handling @page having hardware errors.
+ * @huge: @page is part of a HugeTLB hugepage.
+ * @free: @page is free buddy page.
+ * @release: memory-failure module should release a pending refcount.
+ */
+static bool page_handle_poison(struct page *page, bool huge, bool free,
+			       bool release)
+{
+	int ret = 0;
 
 	/*
-	 * zone_pcp_disable() can't be used here. It will
-	 * hold pcp_batch_high_lock and dissolve_free_hugetlb_folio() might hold
-	 * cpu_hotplug_lock via static_key_slow_dec() when hugetlb vmemmap
-	 * optimization is enabled. This will break current lock dependency
-	 * chain and leads to deadlock.
-	 * Disabling pcp before dissolving the page was a deterministic
-	 * approach because we made sure that those pages cannot end up in any
-	 * PCP list. Draining PCP lists expels those pages to the buddy system,
-	 * but nothing guarantees that those pages do not get back to a PCP
-	 * queue if we need to refill those.
+	 * Buddy allocator will exclude the HWPoison page after hugepage
+	 * is successfully dissolved.
 	 */
-	ret = dissolve_free_hugetlb_folio(page_folio(page));
-	if (!ret) {
+	if (huge)
+		ret = hugepage_handle_poison(page);
+
+	if (free) {
 		drain_all_pages(page_zone(page));
-		ret = take_page_off_buddy(page);
+		ret = take_page_off_buddy(page) ? 0 : -1;
 	}
 
-	return ret;
-}
-
-static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
-{
-	if (hugepage_or_freepage) {
+	if ((huge || free) && ret < 0)
 		/*
-		 * Doing this check for free pages is also fine since
-		 * dissolve_free_hugetlb_folio() returns 0 for non-hugetlb folios as well.
+		 * We could fail to take off the target page from buddy
+		 * for example due to racy page allocation, but that's
+		 * acceptable because soft-offlined page is not broken
+		 * and if someone really want to use it, they should
+		 * take it.
 		 */
-		if (__page_handle_poison(page) <= 0)
-			/*
-			 * We could fail to take off the target page from buddy
-			 * for example due to racy page allocation, but that's
-			 * acceptable because soft-offlined page is not broken
-			 * and if someone really want to use it, they should
-			 * take it.
-			 */
-			return false;
-	}
+		return false;
 
 	SetPageHWPoison(page);
 	if (release)
@@ -1174,7 +1168,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * subpages.
 		 */
 		folio_put(folio);
-		if (__page_handle_poison(p) > 0) {
+		if (!hugepage_handle_poison(p)) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		} else {
@@ -2067,7 +2061,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	 */
 	if (res == 0) {
 		folio_unlock(folio);
-		if (__page_handle_poison(p) > 0) {
+		if (!hugepage_handle_poison(p)) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		} else {
@@ -2815,7 +2809,7 @@ static int soft_offline_in_use_page(struct page *page)
 
 	if (ret) {
 		pr_info("%#lx: invalidated\n", pfn);
-		page_handle_poison(page, false, true);
+		page_handle_poison(page, false, false, true);
 		return 0;
 	}
 
@@ -2836,7 +2830,7 @@ static int soft_offline_in_use_page(struct page *page)
 	if (!ret) {
 		bool release = !huge;
 
-		if (!page_handle_poison(page, huge, release))
+		if (!page_handle_poison(page, huge, false, release))
 			ret = -EBUSY;
 	} else {
 		if (!list_empty(&pagelist))
@@ -2884,6 +2878,8 @@ int soft_offline_page(unsigned long pfn, int flags)
 {
 	int ret;
 	bool try_again = true;
+	bool huge;
+	bool free;
 	struct page *page;
 
 	if (!pfn_valid(pfn)) {
@@ -2929,7 +2925,9 @@ int soft_offline_page(unsigned long pfn, int flags)
 	if (ret > 0) {
 		ret = soft_offline_in_use_page(page);
 	} else if (ret == 0) {
-		if (!page_handle_poison(page, true, false)) {
+		huge = folio_test_hugetlb(page_folio(page));
+		free = is_free_buddy_page(page);
+		if (!page_handle_poison(page, huge, free, false)) {
 			if (try_again) {
 				try_again = false;
 				flags &= ~MF_COUNT_INCREASED;
-- 
2.52.0.457.g6b5491de43-goog
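
For readers who want to see the return-value change in isolation, here
is a rough userspace sketch of the old and new decision logic. It is not
kernel code: struct model_page, model_dissolve(), model_take_off_buddy()
and model_demo() are hypothetical stand-ins, and the "poisoned page never
reaches the buddy free list after dissolve" behaviour is an assumption
taken from the commit message above. The old flow models only the
hugepage_or_freepage == true case, and the @release handling is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of a page; not a kernel structure. */
struct model_page {
	bool is_hugetlb;      /* belongs to a HugeTLB folio */
	bool hugetlb_in_use;  /* that folio is not free */
	bool in_buddy;        /* sits on a buddy free list */
	bool hwpoison;        /* stands in for SetPageHWPoison() */
};

/*
 * Models dissolve_free_hugetlb_folio(): 0 on success (also for
 * non-HugeTLB pages), negative on failure. Assumption from the commit
 * message: after a successful dissolve the poisoned page is rejected
 * by the buddy allocator, so in_buddy stays false.
 */
static int model_dissolve(struct model_page *p)
{
	if (!p->is_hugetlb)
		return 0;
	if (p->hugetlb_in_use)
		return -EBUSY;
	p->is_hugetlb = false;
	return 0;
}

/* Models take_page_off_buddy(): true only if the page was on a free list. */
static bool model_take_off_buddy(struct model_page *p)
{
	if (!p->in_buddy)
		return false;
	p->in_buddy = false;
	return true;
}

/* Old flow: dissolve, then always try to take the page off buddy. */
static bool old_page_handle_poison(struct model_page *p)
{
	int ret = model_dissolve(p);

	if (!ret)
		ret = model_take_off_buddy(p);
	if (ret <= 0)
		return false;
	p->hwpoison = true;
	return true;
}

/* New flow: the caller states which kind of page it is handing over. */
static bool new_page_handle_poison(struct model_page *p, bool huge, bool free)
{
	int ret = 0;

	if (huge)
		ret = model_dissolve(p);
	if (free)
		ret = model_take_off_buddy(p) ? 0 : -1;
	if ((huge || free) && ret < 0)
		return false;
	p->hwpoison = true;
	return true;
}

/* Self-check contrasting the two flows on a free HugeTLB page. */
static void model_demo(void)
{
	struct model_page a = { .is_hugetlb = true };
	struct model_page b = { .is_hugetlb = true };

	assert(!old_page_handle_poison(&a));             /* spurious failure */
	assert(new_page_handle_poison(&b, true, false)); /* reported as handled */
	assert(b.hwpoison);
}
```

In this model a free HugeTLB page is reported as a failure by the old
flow, because the buddy lookup after a successful dissolve finds nothing,
while the new flow skips that lookup when only huge is set. A free buddy
page is handled the same way by both flows.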