From: Wupeng Ma
Subject: [PATCH v3] mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison
Date: Wed, 20 May 2026 10:01:28 +0800
Message-ID: <20260520020128.3506168-1-mawupeng1@huawei.com>
X-Mailer: git-send-email 2.43.0
X-Mailing-List: linux-kernel@vger.kernel.org

madvise(MADV_HWPOISON) can trigger a recursive spinlock self-deadlock
(AA deadlock) on hugetlb_lock due to a race with concurrent folio
unmapping.

The race scenario:

  Thread 1 (madvise MADV_HWPOISON)        Thread 2 (unmap)
  -------------------------------         -----------------
  madvise_inject_error()
    get_user_pages_fast()  <- refcount++
    memory_failure(MF_COUNT_INCREASED)
      get_huge_page_for_hwpoison()
        spin_lock_irq(&hugetlb_lock)
        // refcount == 2 (gup + map)
        // MF_COUNT_INCREASED path:
        count_increased = true
                                          zap_pte_range()
                                            page_remove_rmap()
                                            put_page()  <- drops map ref
                                            // refcount: 2 -> 1
        hugetlb_update_hwpoison()
          -> MF_HUGETLB_FOLIO_PRE_POISONED
          -> goto out
      out:
        folio_put(folio)  <- drops gup ref
        // refcount: 1 -> 0
        free_huge_folio()
          spin_lock_irq(&hugetlb_lock)  <- AA DEADLOCK

When Thread 2's put_page() drops the mapping reference while Thread 1
holds hugetlb_lock, the folio refcount drops to 1. The subsequent
folio_put() at the out: label then drops the last reference and frees
the folio, and free_huge_folio() attempts to re-acquire the
non-recursive hugetlb_lock on the same CPU, resulting in an AA
self-deadlock.

The same deadlock can also occur on the folio_try_get() path: when a
migratable folio is found and folio_try_get() succeeds (raising the
refcount by one), a concurrent unmap combined with
hugetlb_update_hwpoison() returning the pre-poisoned status lands at
out:, where folio_put() may again free the folio under hugetlb_lock.

Fix this by removing the hugetlb_lock wrapper from hugetlb.c and moving
the lock acquisition directly into get_huge_page_for_hwpoison()
(formerly __get_huge_page_for_hwpoison) in memory-failure.c. Place
spin_unlock_irq() before folio_put() at the out: label so that the
folio is always released outside the lock, preventing any recursive
lock acquisition.

Remove the now-incorrect "Called from hugetlb code with hugetlb_lock
held" comment, and remove the stale __get_huge_page_for_hwpoison
declarations from include/linux/mm.h.

Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Wupeng Ma
---
 include/linux/hugetlb.h |  8 --------
 include/linux/mm.h      |  8 --------
 mm/hugetlb.c            | 11 -----------
 mm/memory-failure.c     |  8 ++++----
 4 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 65910437be1ca..aa3eb42e0a01a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -153,8 +153,6 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 						long freed);
 bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list);
 int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison);
-int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
-				bool *migratable_cleared);
 void folio_putback_hugetlb(struct folio *folio);
 void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason);
 void hugetlb_fix_reserve_counts(struct inode *inode);
@@ -422,12 +420,6 @@ static inline int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb,
 	return 0;
 }
 
-static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
-					bool *migratable_cleared)
-{
-	return 0;
-}
-
 static inline void folio_putback_hugetlb(struct folio *folio)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index abb4963c1f064..46e5936dabaa8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4602,8 +4602,6 @@ extern int soft_offline_page(unsigned long pfn, int flags);
  */
 extern const struct attribute_group memory_failure_attr_group;
 extern void memory_failure_queue(unsigned long pfn, int flags);
-extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
-					bool *migratable_cleared);
 void num_poisoned_pages_inc(unsigned long pfn);
 void num_poisoned_pages_sub(unsigned long pfn, long i);
 #else
@@ -4611,12 +4609,6 @@ static inline void memory_failure_queue(unsigned long pfn, int flags)
 {
 }
 
-static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
-					bool *migratable_cleared)
-{
-	return 0;
-}
-
 static inline void num_poisoned_pages_inc(unsigned long pfn)
 {
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 327eaa4074d39..4c99bb868ad08 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7170,17 +7170,6 @@ int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison
 	return ret;
 }
 
-int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
-				bool *migratable_cleared)
-{
-	int ret;
-
-	spin_lock_irq(&hugetlb_lock);
-	ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared);
-	spin_unlock_irq(&hugetlb_lock);
-	return ret;
-}
-
 /**
  * folio_putback_hugetlb - unisolate a hugetlb folio
  * @folio: the isolated hugetlb folio
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ee42d43613097..28522180cf7f8 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1966,10 +1966,7 @@ void folio_clear_hugetlb_hwpoison(struct folio *folio)
 	folio_free_raw_hwp(folio, true);
 }
 
-/*
- * Called from hugetlb code with hugetlb_lock held.
- */
-int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
+static int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 				 bool *migratable_cleared)
 {
 	struct page *page = pfn_to_page(pfn);
@@ -1977,6 +1974,7 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 	bool count_increased = false;
 	int ret, rc;
 
+	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(folio)) {
 		ret = MF_HUGETLB_NON_HUGEPAGE;
 		goto out;
@@ -2013,8 +2011,10 @@ int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 		*migratable_cleared = true;
 	}
 
+	spin_unlock_irq(&hugetlb_lock);
 	return ret;
 out:
+	spin_unlock_irq(&hugetlb_lock);
 	if (count_increased)
 		folio_put(folio);
 	return ret;
-- 
2.43.0
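
For illustration only, the ordering rule the patch enforces (unlock
first, then drop the reference) can be reproduced in userspace. The
sketch below is not kernel code: pool_lock, struct obj, obj_put() and
the two put_*() helpers are hypothetical stand-ins for hugetlb_lock,
the folio, folio_put() and the out: path. It assumes a POSIX
error-checking mutex, so the self-relock is reported as EDEADLK instead
of hanging the way a kernel spinlock would.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for hugetlb_lock. */
static pthread_mutex_t pool_lock;
/* Result of the lock attempt made by the free path. */
static int relock_error;

/* Hypothetical stand-in for a refcounted folio. */
struct obj {
	int refcount;
};

void pool_lock_init(void)
{
	pthread_mutexattr_t attr;
	int rc;

	pthread_mutexattr_init(&attr);
	/* ERRORCHECK turns a self-relock into EDEADLK instead of a hang. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	rc = pthread_mutex_init(&pool_lock, &attr);
	assert(rc == 0);
	(void)rc;
	pthread_mutexattr_destroy(&attr);
}

/* Free path: like free_huge_folio(), it needs pool_lock itself. */
static void obj_free(struct obj *o)
{
	(void)o;
	relock_error = pthread_mutex_lock(&pool_lock);
	if (relock_error == 0)
		pthread_mutex_unlock(&pool_lock);
}

/* Drop one reference; the last put runs the free path. */
static void obj_put(struct obj *o)
{
	if (--o->refcount == 0)
		obj_free(o);
}

/* Old ordering: the reference is dropped while pool_lock is held. */
int put_under_lock(struct obj *o)
{
	relock_error = 0;
	pthread_mutex_lock(&pool_lock);
	obj_put(o);		/* may be the last reference */
	pthread_mutex_unlock(&pool_lock);
	return relock_error;
}

/* New ordering: unlock first, then drop the reference. */
int put_after_unlock(struct obj *o)
{
	relock_error = 0;
	pthread_mutex_lock(&pool_lock);
	/* ... inspect the object under the lock ... */
	pthread_mutex_unlock(&pool_lock);
	obj_put(o);		/* free path can take pool_lock safely */
	return relock_error;
}
```

After a single pool_lock_init(), put_under_lock() on an object holding
one reference returns EDEADLK, while put_after_unlock() on such an
object returns 0. In the kernel there is no error return to catch this:
the second spin_lock_irq() simply never succeeds.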