From nobody Tue Apr 28 08:59:06 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id ABBADC433EF
 for ; Thu, 2 Jun 2022 05:07:29 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S229949AbiFBFH1 (ORCPT ); Thu, 2 Jun 2022 01:07:27 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36910 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S229539AbiFBFHI (ORCPT ); Thu, 2 Jun 2022 01:07:08 -0400
Received: from out1.migadu.com (out1.migadu.com [91.121.223.63])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A30ABC8A
 for ; Wed, 1 Jun 2022 22:07:05 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
 t=1654146423;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=fvjgKpSpd6vLLDTvBjr5eSnjiZxFu3lgcpJpXhpnkYU=;
 b=huhz5I1gFAwH66TksJg8V8xQm8U25z83Rs/uca4PVrJXeak9ugEQpvHMJZUIbkZ5K9kKaZ
 TcQ6mseCkGfF5IlXQbeAHprjP7LCR5gnbRs0PQ5uqZxjRiVMVW0upMK5hCgytyLGy/v7Wx
 +ugdNGgrtZZI+PRpMWhGGbrbyJIB6Bw=
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v1 1/5] mm, hwpoison, hugetlb: introduce SUBPAGE_INDEX_HWPOISON to save raw error page
Date: Thu, 2 Jun 2022 14:06:27 +0900
Message-Id: <20220602050631.771414-2-naoya.horiguchi@linux.dev>
In-Reply-To: <20220602050631.771414-1-naoya.horiguchi@linux.dev>
References:
 <20220602050631.771414-1-naoya.horiguchi@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
X-Migadu-Auth-User: linux.dev
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Naoya Horiguchi

When handling a memory error on a hugetlb page, the error handler tries to
dissolve it and turn it into 4kB pages. If it is successfully dissolved, the
PageHWPoison flag is moved to the raw error page, so that's all right.
However, dissolve sometimes fails, and then the error page is left as a
hwpoisoned hugepage. It would be useful to retry the dissolve later to save
the healthy pages, but that's not possible now because the information about
where the raw error page is located is lost.

Use the private field of a tail page to keep that information. The code
path that shrinks the hugepage pool uses this info to try a delayed dissolve.
This only keeps one hwpoison page for now, which might be OK because it's
simple and multiple hwpoison pages in a hugepage should be rare. But it can
be extended in the future.
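
For illustration only (not part of the patch): a minimal userspace sketch of the idea described above, assuming a simplified stand-in for struct page. The names model_page, MODEL_SUBPAGE_INDEX_HWPOISON and the model_* helpers are hypothetical; the real helpers are hugetlb_page_hwpoison() and hugetlb_set_page_hwpoison() in the diff, and the slot index value used here is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct page; only the fields this sketch needs. */
struct model_page {
	bool hwpoison;		/* stands in for the PG_hwpoison flag */
	uintptr_t private_data;	/* stands in for page->private (unsigned long in the kernel) */
};

/* Hypothetical slot index; the patch adds SUBPAGE_INDEX_HWPOISON for this. */
enum { MODEL_SUBPAGE_INDEX_HWPOISON = 3 };

/* Remember which subpage actually took the error, in a tail page's private field. */
static void model_set_raw_error(struct model_page *hpage, struct model_page *raw)
{
	hpage[MODEL_SUBPAGE_INDEX_HWPOISON].private_data = (uintptr_t)raw;
}

/* Read the saved pointer back; NULL means nothing was recorded. */
static struct model_page *model_raw_error(struct model_page *hpage)
{
	return (struct model_page *)hpage[MODEL_SUBPAGE_INDEX_HWPOISON].private_data;
}

/* When the hugepage is finally freed, move the poison flag to the raw error page. */
static void model_move_poison_on_free(struct model_page *hpage)
{
	if (hpage->hwpoison) {
		struct model_page *raw = model_raw_error(hpage);

		if (raw && raw != hpage) {
			raw->hwpoison = true;
			hpage->hwpoison = false;
		}
	}
}
```

In this model, a caller would mark the head page as poisoned, call model_set_raw_error() with the subpage that took the error, and later call model_move_poison_on_free() so that only that subpage stays poisoned. Like the patch, it remembers a single raw error page per hugepage.
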
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
ChangeLog since previous post on 4/27:
- fixed typo in patch description (by Miaohe)
- fixed config value in #ifdef statement (by Miaohe)
- added sentences about "multiple hwpoison pages" scenario in patch
  description
---
 include/linux/hugetlb.h | 24 ++++++++++++++++++++++++
 mm/hugetlb.c            |  9 +++++++++
 mm/memory-failure.c     |  2 ++
 3 files changed, 35 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ac2a1d758a80..a5341a3a0d4b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -42,6 +42,9 @@ enum {
 	SUBPAGE_INDEX_CGROUP,		/* reuse page->private */
 	SUBPAGE_INDEX_CGROUP_RSVD,	/* reuse page->private */
 	__MAX_CGROUP_SUBPAGE_INDEX = SUBPAGE_INDEX_CGROUP_RSVD,
+#endif
+#ifdef CONFIG_MEMORY_FAILURE
+	SUBPAGE_INDEX_HWPOISON,
 #endif
 	__NR_USED_SUBPAGE,
 };
@@ -784,6 +787,27 @@ extern int dissolve_free_huge_page(struct page *page);
 extern int dissolve_free_huge_pages(unsigned long start_pfn,
 				    unsigned long end_pfn);
 
+#ifdef CONFIG_MEMORY_FAILURE
+/*
+ * pointer to raw error page is located in hpage[SUBPAGE_INDEX_HWPOISON].private
+ */
+static inline struct page *hugetlb_page_hwpoison(struct page *hpage)
+{
+	return (void *)page_private(hpage + SUBPAGE_INDEX_HWPOISON);
+}
+
+static inline void hugetlb_set_page_hwpoison(struct page *hpage,
+					struct page *page)
+{
+	set_page_private(hpage + SUBPAGE_INDEX_HWPOISON, (unsigned long)page);
+}
+#else
+static inline struct page *hugetlb_page_hwpoison(struct page *hpage)
+{
+	return NULL;
+}
+#endif
+
 #ifdef CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
 #ifndef arch_hugetlb_migration_supported
 static inline bool arch_hugetlb_migration_supported(struct hstate *h)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8e048b939c7..6867ea8345d1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1547,6 +1547,15 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 		return;
 	}
 
+	if (unlikely(PageHWPoison(page))) {
+		struct page *raw_error = hugetlb_page_hwpoison(page);
+
+		if (raw_error && raw_error != page) {
+			SetPageHWPoison(raw_error);
+			ClearPageHWPoison(page);
+		}
+	}
+
 	for (i = 0; i < pages_per_huge_page(h);
 	     i++, subpage = mem_map_next(subpage, page, i)) {
 		subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 66edaa7e5092..056dbb2050f8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1534,6 +1534,8 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 		goto out;
 	}
 
+	hugetlb_set_page_hwpoison(head, page);
+
 	return ret;
 out:
 	if (count_increased)
-- 
2.25.1

From nobody Tue Apr 28 08:59:06 2026
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v1 2/5] mm,hwpoison: set PG_hwpoison for busy hugetlb pages
Date: Thu, 2 Jun 2022 14:06:28 +0900
Message-Id: <20220602050631.771414-3-naoya.horiguchi@linux.dev>
In-Reply-To: <20220602050631.771414-1-naoya.horiguchi@linux.dev>
References: <20220602050631.771414-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

If memory_failure() fails to grab the page refcount on a hugetlb page
because it's busy, it returns without setting PG_hwpoison on it.
This not only loses a chance of error containment, but also breaks the rule
that action_result() should be called only when memory_failure() does some
handling work (even if that's just setting PG_hwpoison).
This inconsistency could harm code maintainability.

So set PG_hwpoison and call hugetlb_set_page_hwpoison() for such a case.
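
For illustration only (not part of the patch): a rough userspace model of the retry logic, assuming a page that is always busy. It shows how keeping the "already retried" state in the flags word lets the callee see that this is the final attempt and record the poison anyway. All model_* and MODEL_* names are hypothetical; only MF_NO_RETRY and its value come from the diff.

```c
#include <assert.h>

enum model_mf_flags {
	MODEL_MF_NO_RETRY = 1 << 5,	/* mirrors MF_NO_RETRY in the patch */
};

#define MODEL_EBUSY 16	/* stand-in for the kernel's EBUSY */

/*
 * Hypothetical stand-in for __get_huge_page_for_hwpoison() on a page that
 * is always busy. Without NO_RETRY it bails out early; with NO_RETRY it
 * still records the poison before reporting -EBUSY.
 */
static int model_get_huge_page(int flags, int *poisoned)
{
	if (!(flags & MODEL_MF_NO_RETRY))
		return -MODEL_EBUSY;	/* first pass: the caller may retry */
	*poisoned = 1;			/* final pass: set the flag anyway */
	return -MODEL_EBUSY;
}

/* Retry once, carrying the retry state in flags instead of a local bool. */
static int model_try_memory_failure(int flags, int *poisoned, int *attempts)
{
	int res;

retry:
	(*attempts)++;
	res = model_get_huge_page(flags, poisoned);
	if (res == -MODEL_EBUSY && !(flags & MODEL_MF_NO_RETRY)) {
		flags |= MODEL_MF_NO_RETRY;
		goto retry;
	}
	return res;
}
```

With a local bool, the callee has no way to tell the first attempt from the last one; passing the state through flags is what allows the second pass to set the poison state on a busy page, as the description above requires.
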
Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
 include/linux/mm.h  | 1 +
 mm/memory-failure.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d446e834a3e5..04de0c3e4f9f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3187,6 +3187,7 @@ enum mf_flags {
 	MF_MUST_KILL = 1 << 2,
 	MF_SOFT_OFFLINE = 1 << 3,
 	MF_UNPOISON = 1 << 4,
+	MF_NO_RETRY = 1 << 5,
 };
 extern int memory_failure(unsigned long pfn, int flags);
 extern void memory_failure_queue(unsigned long pfn, int flags);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 056dbb2050f8..fe6a7961dc66 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1526,7 +1526,8 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags)
 			count_increased = true;
 	} else {
 		ret = -EBUSY;
-		goto out;
+		if (!(flags & MF_NO_RETRY))
+			goto out;
 	}
 
 	if (TestSetPageHWPoison(head)) {
@@ -1556,7 +1557,6 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	struct page *p = pfn_to_page(pfn);
 	struct page *head;
 	unsigned long page_flags;
-	bool retry = true;
 
 	*hugetlb = 1;
 retry:
@@ -1572,8 +1572,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		}
 		return res;
 	} else if (res == -EBUSY) {
-		if (retry) {
-			retry = false;
+		if (!(flags & MF_NO_RETRY)) {
+			flags |= MF_NO_RETRY;
 			goto retry;
 		}
 		action_result(pfn, MF_MSG_UNKNOWN, MF_IGNORED);
-- 
2.25.1

From nobody Tue Apr 28 08:59:06 2026
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v1 3/5] mm, hwpoison: make __page_handle_poison returns int
Date: Thu, 2 Jun 2022 14:06:29 +0900
Message-Id: <20220602050631.771414-4-naoya.horiguchi@linux.dev>
In-Reply-To: <20220602050631.771414-1-naoya.horiguchi@linux.dev>
References: <20220602050631.771414-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

__page_handle_poison() currently returns a bool that shows whether
take_page_off_buddy() has succeeded or not.
But we will want to distinguish another case, "dissolve has passed but
taking off failed", by its return value. So change the type of the return
value to int. No functional change.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
 mm/memory-failure.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index fe6a7961dc66..f149a7864c81 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -68,7 +68,13 @@ int sysctl_memory_failure_recovery __read_mostly = 1;
 
 atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
-static bool __page_handle_poison(struct page *page)
+/*
+ * Return values:
+ *   1:   the page is dissolved (if needed) and taken off from buddy,
+ *   0:   the page is dissolved (if needed) and not taken off from buddy,
+ *   < 0: failed to dissolve.
+ */
+static int __page_handle_poison(struct page *page)
 {
 	int ret;
 
@@ -78,7 +84,7 @@ static bool __page_handle_poison(struct page *page)
 		ret = take_page_off_buddy(page);
 	zone_pcp_enable(page_zone(page));
 
-	return ret > 0;
+	return ret;
 }
 
 static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
@@ -88,7 +94,7 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 		 * Doing this check for free pages is also fine since dissolve_free_huge_page
 		 * returns 0 for non-hugetlb pages as well.
 		 */
-		if (!__page_handle_poison(page))
+		if (__page_handle_poison(page) <= 0)
 			/*
 			 * We could fail to take off the target page from buddy
 			 * for example due to racy page allocation, but that's
@@ -1045,7 +1051,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * save healthy subpages.
 		 */
 		put_page(hpage);
-		if (__page_handle_poison(p)) {
+		if (__page_handle_poison(p) > 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		}
@@ -1595,8 +1601,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	 */
 	if (res == 0) {
 		unlock_page(head);
-		res = MF_FAILED;
-		if (__page_handle_poison(p)) {
+		if (__page_handle_poison(p) > 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
 		}
-- 
2.25.1

From nobody Tue Apr 28 08:59:06 2026
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v1 4/5] mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage
Date: Thu, 2 Jun 2022 14:06:30 +0900
Message-Id: <20220602050631.771414-5-naoya.horiguchi@linux.dev>
In-Reply-To: <20220602050631.771414-1-naoya.horiguchi@linux.dev>
References: <20220602050631.771414-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

Currently, if memory_failure() (modified to remove the blocking code) is
called on a page in a 1GB hugepage, memory error handling returns failure
and the raw error page is left in an undesirable state.

The impact is small on production systems (just a single leaked 4kB page),
but this limits test efficiency because unpoison doesn't work for it.
So we can no longer create a 1GB hugepage on a 1GB physical address range
that contains such hwpoison pages, which could be an issue when testing on
small systems.

When a hwpoison page in a 1GB hugepage is handled, it's caught by the
PageHWPoison check in free_pages_prepare() because the hugepage is broken
down to the raw error page, whose order is 0:

        if (unlikely(PageHWPoison(page)) && !order) {
                ...
                return false;
        }

Then, the page is not sent to buddy and the page refcount is left at 0.

Originally this check was supposed to work when the error page is freed from
page_handle_poison() (that is called from soft-offline), but now we are
opening another path to call it, so the callers of __page_handle_poison()
need to handle the case by considering the return value 0 as success. Then
the page refcount for hwpoison is properly incremented and unpoison works.

Signed-off-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
---
 mm/memory-failure.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f149a7864c81..babeb34f7477 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1043,7 +1043,6 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
 		unlock_page(hpage);
 	} else {
-		res = MF_FAILED;
 		unlock_page(hpage);
 		/*
 		 * migration entry prevents later access on error anonymous
@@ -1051,9 +1050,11 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		 * save healthy subpages.
 		 */
 		put_page(hpage);
-		if (__page_handle_poison(p) > 0) {
+		if (__page_handle_poison(p) >= 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
+		} else {
+			res = MF_FAILED;
 		}
 	}
 
@@ -1601,9 +1602,11 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 	 */
 	if (res == 0) {
 		unlock_page(head);
-		if (__page_handle_poison(p) > 0) {
+		if (__page_handle_poison(p) >= 0) {
 			page_ref_inc(p);
 			res = MF_RECOVERED;
+		} else {
+			res = MF_FAILED;
 		}
 		action_result(pfn, MF_MSG_FREE_HUGE, res);
 		return res == MF_RECOVERED ? 0 : -EBUSY;
-- 
2.25.1

From nobody Tue Apr 28 08:59:06 2026
From: Naoya Horiguchi
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Mike Kravetz, Miaohe Lin,
 Liu Shixin, Yang Shi, Oscar Salvador, Muchun Song, Naoya Horiguchi,
 linux-kernel@vger.kernel.org
Subject: [PATCH v1 5/5] mm, hwpoison: enable memory error handling on 1GB hugepage
Date: Thu, 2 Jun 2022 14:06:31 +0900
Message-Id: <20220602050631.771414-6-naoya.horiguchi@linux.dev>
In-Reply-To: <20220602050631.771414-1-naoya.horiguchi@linux.dev>
References:
 <20220602050631.771414-1-naoya.horiguchi@linux.dev>

From: Naoya Horiguchi

Now that the error handling code is prepared, remove the blocking code and
enable memory error handling on 1GB hugepages.

Signed-off-by: Naoya Horiguchi
---
 include/linux/mm.h      |  1 -
 include/ras/ras_event.h |  1 -
 mm/memory-failure.c     | 16 ----------------
 3 files changed, 18 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 04de0c3e4f9f..58a6aa916e4f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3238,7 +3238,6 @@ enum mf_action_page_type {
 	MF_MSG_DIFFERENT_COMPOUND,
 	MF_MSG_HUGE,
 	MF_MSG_FREE_HUGE,
-	MF_MSG_NON_PMD_HUGE,
 	MF_MSG_UNMAP_FAILED,
 	MF_MSG_DIRTY_SWAPCACHE,
 	MF_MSG_CLEAN_SWAPCACHE,
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index d0337a41141c..cbd3ddd7c33d 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -360,7 +360,6 @@ TRACE_EVENT(aer_event,
 	EM ( MF_MSG_DIFFERENT_COMPOUND, "different compound page after locking" )	\
 	EM ( MF_MSG_HUGE, "huge page" )	\
 	EM ( MF_MSG_FREE_HUGE, "free huge page" )	\
-	EM ( MF_MSG_NON_PMD_HUGE, "non-pmd-sized huge page" )	\
 	EM ( MF_MSG_UNMAP_FAILED, "unmapping failed page" )	\
 	EM ( MF_MSG_DIRTY_SWAPCACHE, "dirty swapcache page" )	\
 	EM ( MF_MSG_CLEAN_SWAPCACHE, "clean swapcache page" )	\
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index babeb34f7477..ced033a99e19 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -725,7 +725,6 @@ static const char * const action_page_types[] = {
 	[MF_MSG_DIFFERENT_COMPOUND]	= "different compound page after locking",
 	[MF_MSG_HUGE]			= "huge page",
 	[MF_MSG_FREE_HUGE]		= "free huge page",
-	[MF_MSG_NON_PMD_HUGE]		= "non-pmd-sized huge page",
 	[MF_MSG_UNMAP_FAILED]		= "unmapping failed page",
 	[MF_MSG_DIRTY_SWAPCACHE]	= "dirty swapcache page",
 	[MF_MSG_CLEAN_SWAPCACHE]	= "clean swapcache page",
@@ -1614,21 +1613,6 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 
 	page_flags = head->flags;
 
-	/*
-	 * TODO: hwpoison for pud-sized hugetlb doesn't work right now, so
-	 * simply disable it. In order to make it work properly, we need
-	 * make sure that:
-	 *  - conversion of a pud that maps an error hugetlb into hwpoison
-	 *    entry properly works, and
-	 *  - other mm code walking over page table is aware of pud-aligned
-	 *    hwpoison entries.
-	 */
-	if (huge_page_size(page_hstate(head)) > PMD_SIZE) {
-		action_result(pfn, MF_MSG_NON_PMD_HUGE, MF_IGNORED);
-		res = -EBUSY;
-		goto out;
-	}
-
 	if (!hwpoison_user_mappings(p, pfn, flags, head)) {
 		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
 		res = -EBUSY;
-- 
2.25.1
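
For illustration only (not part of the series): the return-value convention that patch 3/5 documents for __page_handle_poison(), and the way patch 4/5 changes how me_huge_page() interprets it, can be modelled in userspace roughly as below. The model_* and MODEL_* names are hypothetical, and the sketch covers only the mapping from return value to result, not the dissolve or buddy handling itself.

```c
#include <assert.h>

enum model_mf_result { MODEL_MF_FAILED, MODEL_MF_RECOVERED };

/*
 * ret follows the convention documented in patch 3/5:
 *     1: dissolved (if needed) and taken off from buddy,
 *     0: dissolved (if needed) but not taken off from buddy,
 *   < 0: failed to dissolve.
 */

/* Caller-side check as in me_huge_page() after patch 3/5: only 1 is success. */
static enum model_mf_result model_result_after_patch3(int ret)
{
	return ret > 0 ? MODEL_MF_RECOVERED : MODEL_MF_FAILED;
}

/* Caller-side check after patch 4/5: 0 is also treated as success. */
static enum model_mf_result model_result_after_patch4(int ret)
{
	return ret >= 0 ? MODEL_MF_RECOVERED : MODEL_MF_FAILED;
}
```

The only input whose outcome differs between the two checks is 0, which is the case patch 4/5 describes: the raw error page was caught by the PageHWPoison check while being freed, so it never reached buddy and could not be taken off it.
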