From nobody Thu Dec 18 23:20:46 2025
From: Luiz Capitulino
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, david@redhat.com,
 yuzhao@google.com, pasha.tatashin@soleen.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, muchun.song@linux.dev,
 luizcap@redhat.com
Subject: [PATCH 1/4] mm: page_ext: add an iteration API for page extensions
Date: Tue, 18 Feb 2025 21:17:47 -0500
Message-ID: <3f0e058aef3951b39cf6bb4259c247352d4fe736.1739931468.git.luizcap@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The page extension implementation assumes that all page extensions of
a given page order are stored in the same memory section. The function
page_ext_next() relies on this assumption by adding an offset to the
current object to return the next adjacent page extension.

This behavior works as expected for flatmem but fails for sparsemem when
using 1G pages. Commit cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for
gigantic folios") exposes this issue, making a crash possible when using
the page_owner or page_table_check page extensions.

The problem is that for 1G pages, the page extensions may span memory
section boundaries and be stored in different memory sections. This issue
was not visible before commit cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP
for gigantic folios") because alloc_contig_pages() never passed more than
MAX_PAGE_ORDER to post_alloc_hook(). However, the series introducing the
mentioned commit changed this behavior, allowing the full 1G page order
to be passed.

Reproducer:

 1. Build the kernel with CONFIG_SPARSEMEM=y and table extensions
    support
 2. Pass 'default_hugepagesz=1 page_owner=on' in the kernel command-line
 3. Reserve one 1G page at run-time, this should crash (backtrace below)

To address this issue, this commit introduces a new API for iterating
through page extensions. The main iteration loops are for_each_page_ext()
and for_each_page_ext_order(). Both must be called with the RCU read
lock taken. Here's a usage example:

"""
struct page_ext_iter iter;
struct page_ext *page_ext;

...

rcu_read_lock();
for_each_page_ext_order(page, order, page_ext, iter) {
	struct my_page_ext *obj = get_my_page_ext_obj(page_ext);
	...
}
rcu_read_unlock();
"""

Both loop constructs use page_ext_iter_next(), which checks to see if we
have crossed sections in the iteration. In this case, page_ext_iter_next()
retrieves the next page_ext object from another section.
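
For readers less familiar with sparsemem, the section-crossing logic can
be illustrated with a minimal userspace sketch. This is a toy model, not
kernel code: every toy_* name and the constants TOY_PAGES_PER_SECTION and
TOY_NR_SECTIONS are hypothetical and exist only for this illustration.
Each "section" is a separately allocated array, so stepping a pointer
past the end of one array does not land in the next one, which mirrors
why page_ext_iter_next() has to do a fresh lookup at a section boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy values, chosen small so boundaries are easy to hit. */
#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS 3UL

/* Stand-in for struct page_ext; it only remembers which pfn it describes. */
struct toy_ext {
	unsigned long pfn;
};

/* One separately allocated array per section, as with sparsemem. */
static struct toy_ext *toy_sections[TOY_NR_SECTIONS];

/* Allocate and fill every section; returns 0 on success, -1 on failure. */
int toy_init(void)
{
	for (unsigned long s = 0; s < TOY_NR_SECTIONS; s++) {
		toy_sections[s] = calloc(TOY_PAGES_PER_SECTION,
					 sizeof(struct toy_ext));
		if (!toy_sections[s])
			return -1;
		for (unsigned long i = 0; i < TOY_PAGES_PER_SECTION; i++)
			toy_sections[s][i].pfn = s * TOY_PAGES_PER_SECTION + i;
	}
	return 0;
}

/* Slow path: full lookup by pfn; NULL if the pfn has no section. */
struct toy_ext *toy_lookup(unsigned long pfn)
{
	unsigned long s = pfn / TOY_PAGES_PER_SECTION;

	if (s >= TOY_NR_SECTIONS || !toy_sections[s])
		return NULL;
	return &toy_sections[s][pfn % TOY_PAGES_PER_SECTION];
}

struct toy_iter {
	unsigned long pfn;
	struct toy_ext *ext;
};

struct toy_ext *toy_iter_begin(struct toy_iter *it, unsigned long pfn)
{
	it->pfn = pfn;
	it->ext = toy_lookup(pfn);
	return it->ext;
}

struct toy_ext *toy_iter_next(struct toy_iter *it)
{
	if (!it->ext)
		return NULL;

	it->pfn++;

	if (it->pfn % TOY_PAGES_PER_SECTION)
		it->ext = it->ext + 1;		/* fast path: same section */
	else
		it->ext = toy_lookup(it->pfn);	/* crossed a section boundary */

	return it->ext;
}

/* Visit up to count entries from start; return how many had the right pfn. */
unsigned long toy_walk(unsigned long start, unsigned long count)
{
	struct toy_iter it;
	struct toy_ext *ext = toy_iter_begin(&it, start);
	unsigned long ok = 0;

	for (unsigned long i = 0; ext && i < count;
	     ext = toy_iter_next(&it), i++)
		if (ext->pfn == start + i)
			ok++;

	return ok;
}
```

In this sketch, after a successful toy_init(), a call such as
toy_walk(2, 8) crosses two section boundaries and still visits all eight
entries, while a walk that runs past the last section stops early because
the lookup returns NULL.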

Thanks to David Hildenbrand for helping identify the root cause and
providing suggestions on how to fix and optimize the solution (final
implementation and bugs are all mine though).

Lastly, here's the backtrace; without KASAN you can get random crashes:

[   76.052526] BUG: KASAN: slab-out-of-bounds in __update_page_owner_handle+0x238/0x298
[   76.060283] Write of size 4 at addr ffff07ff96240038 by task tee/3598
[   76.066714]
[   76.068203] CPU: 88 UID: 0 PID: 3598 Comm: tee Kdump: loaded Not tainted 6.13.0-rep1 #3
[   76.076202] Hardware name: WIWYNN Mt.Jade Server System B81.030Z1.0007/Mt.Jade Motherboard, BIOS 2.10.20220810 (SCP: 2.10.20220810) 2022/08/10
[   76.088972] Call trace:
[   76.091411]  show_stack+0x20/0x38 (C)
[   76.095073]  dump_stack_lvl+0x80/0xf8
[   76.098733]  print_address_description.constprop.0+0x88/0x398
[   76.104476]  print_report+0xa8/0x278
[   76.108041]  kasan_report+0xa8/0xf8
[   76.111520]  __asan_report_store4_noabort+0x20/0x30
[   76.116391]  __update_page_owner_handle+0x238/0x298
[   76.121259]  __set_page_owner+0xdc/0x140
[   76.125173]  post_alloc_hook+0x190/0x1d8
[   76.129090]  alloc_contig_range_noprof+0x54c/0x890
[   76.133874]  alloc_contig_pages_noprof+0x35c/0x4a8
[   76.138656]  alloc_gigantic_folio.isra.0+0x2c0/0x368
[   76.143616]  only_alloc_fresh_hugetlb_folio.isra.0+0x24/0x150
[   76.149353]  alloc_pool_huge_folio+0x11c/0x1f8
[   76.153787]  set_max_huge_pages+0x364/0xca8
[   76.157961]  __nr_hugepages_store_common+0xb0/0x1a0
[   76.162829]  nr_hugepages_store+0x108/0x118
[   76.167003]  kobj_attr_store+0x3c/0x70
[   76.170745]  sysfs_kf_write+0xfc/0x188
[   76.174492]  kernfs_fop_write_iter+0x274/0x3e0
[   76.178927]  vfs_write+0x64c/0x8e0
[   76.182323]  ksys_write+0xf8/0x1f0
[   76.185716]  __arm64_sys_write+0x74/0xb0
[   76.189630]  invoke_syscall.constprop.0+0xd8/0x1e0
[   76.194412]  do_el0_svc+0x164/0x1e0
[   76.197891]  el0_svc+0x40/0xe0
[   76.200939]  el0t_64_sync_handler+0x144/0x168
[   76.205287]  el0t_64_sync+0x1ac/0x1b0

Fixes: cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for gigantic folios")
Signed-off-by: Luiz Capitulino
---
 include/linux/page_ext.h | 66 ++++++++++++++++++++++++++++++++++++++++
 mm/page_ext.c            | 41 +++++++++++++++++++++++++
 2 files changed, 107 insertions(+)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index e4b48a0dda244..a99da12e59fa7 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -3,6 +3,7 @@
 #define __LINUX_PAGE_EXT_H
 
 #include
+#include
 #include
 
 struct pglist_data;
@@ -69,12 +70,26 @@ extern void page_ext_init(void);
 static inline void page_ext_init_flatmem_late(void)
 {
 }
+
+static inline bool page_ext_iter_next_fast_possible(unsigned long next_pfn)
+{
+	/*
+	 * page_ext is allocated per memory section. Once we cross a
+	 * memory section, we have to fetch the new pointer.
+	 */
+	return next_pfn % PAGES_PER_SECTION;
+}
 #else
 extern void page_ext_init_flatmem(void);
 extern void page_ext_init_flatmem_late(void);
 static inline void page_ext_init(void)
 {
 }
+
+static inline bool page_ext_iter_next_fast_possible(unsigned long next_pfn)
+{
+	return true;
+}
 #endif
 
 extern struct page_ext *page_ext_get(const struct page *page);
@@ -93,6 +108,57 @@ static inline struct page_ext *page_ext_next(struct page_ext *curr)
 	return next;
 }
 
+struct page_ext_iter {
+	unsigned long pfn;
+	unsigned long index;
+	struct page_ext *page_ext;
+};
+
+struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter, struct page *page);
+struct page_ext *page_ext_iter_next(struct page_ext_iter *iter);
+
+/**
+ * page_ext_iter_get() - Get current page extension
+ * @iter: page extension iterator.
+ *
+ * Return: NULL if no page_ext exists for this iterator.
+ */
+static inline struct page_ext *page_ext_iter_get(const struct page_ext_iter *iter)
+{
+	return iter->page_ext;
+}
+
+/**
+ * for_each_page_ext(): iterate through page_ext objects.
+ * @__page: the page we're interested in
+ * @__pgcount: how many pages to iterate through
+ * @__page_ext: struct page_ext pointer where the current page_ext
+ *              object is returned
+ * @__iter: struct page_ext_iter object (defined in the stack)
+ *
+ * IMPORTANT: must be called with RCU read lock taken.
+ */
+#define for_each_page_ext(__page, __pgcount, __page_ext, __iter) \
+	__page_ext = page_ext_iter_begin(&__iter, __page); \
+	for (__iter.index = 0; \
+		__page_ext && __iter.index < __pgcount; \
+		__page_ext = page_ext_iter_next(&__iter), \
+		__iter.index++)
+
+/**
+ * for_each_page_ext_order(): iterate through page_ext objects
+ *                            for a given page order
+ * @__page: the page we're interested in
+ * @__order: page order to iterate through
+ * @__page_ext: struct page_ext pointer where the current page_ext
+ *              object is returned
+ * @__iter: struct page_ext_iter object (defined in the stack)
+ *
+ * IMPORTANT: must be called with RCU read lock taken.
+ */
+#define for_each_page_ext_order(__page, __order, __page_ext, __iter) \
+	for_each_page_ext(__page, (1UL << __order), __page_ext, __iter)
+
 #else /* !CONFIG_PAGE_EXTENSION */
 struct page_ext;
 
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 641d93f6af4c1..508deb04d5ead 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -549,3 +549,44 @@ void page_ext_put(struct page_ext *page_ext)
 
 	rcu_read_unlock();
 }
+
+/**
+ * page_ext_iter_begin() - Prepare for iterating through page extensions.
+ * @iter: page extension iterator.
+ * @page: The page we're interested in.
+ *
+ * Must be called with RCU read lock taken.
+ *
+ * Return: NULL if no page_ext exists for this page.
+ */
+struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter, struct page *page)
+{
+	iter->pfn = page_to_pfn(page);
+	iter->page_ext = lookup_page_ext(page);
+
+	return iter->page_ext;
+}
+
+/**
+ * page_ext_iter_next() - Get next page extension
+ * @iter: page extension iterator.
+ *
+ * Must be called with RCU read lock taken.
+ *
+ * Return: NULL if no next page_ext exists.
+ */
+struct page_ext *page_ext_iter_next(struct page_ext_iter *iter)
+{
+	if (WARN_ON_ONCE(!iter->page_ext))
+		return NULL;
+
+	iter->pfn++;
+
+	if (page_ext_iter_next_fast_possible(iter->pfn)) {
+		iter->page_ext = page_ext_next(iter->page_ext);
+	} else {
+		iter->page_ext = lookup_page_ext(pfn_to_page(iter->pfn));
+	}
+
+	return iter->page_ext;
+}
-- 
2.48.1

From nobody Thu Dec 18 23:20:46 2025
From: Luiz Capitulino
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, david@redhat.com,
 yuzhao@google.com, pasha.tatashin@soleen.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, muchun.song@linux.dev,
 luizcap@redhat.com
Subject: [PATCH 2/4] mm: page_table_check: use new iteration API
Date: Tue, 18 Feb 2025 21:17:48 -0500
Message-ID: <85f11743d259d5e4a1f47456fbcda82ff6db9ab3.1739931468.git.luizcap@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The page_ext_next() function assumes that page extension objects for a
page order allocation always reside in the same memory section, which
may not be true and could lead to crashes. Use the new page_ext
iteration API instead.

Fixes: cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for gigantic folios")
Signed-off-by: Luiz Capitulino
Acked-by: David Hildenbrand
---
 mm/page_table_check.c | 39 ++++++++++++---------------------------
 1 file changed, 12 insertions(+), 27 deletions(-)

diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 509c6ef8de400..b52e04d31c809 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -62,24 +62,20 @@ static struct page_table_check *get_page_table_check(struct page_ext *page_ext)
  */
 static void page_table_check_clear(unsigned long pfn, unsigned long pgcnt)
 {
+	struct page_ext_iter iter;
 	struct page_ext *page_ext;
 	struct page *page;
-	unsigned long i;
 	bool anon;
 
 	if (!pfn_valid(pfn))
 		return;
 
 	page = pfn_to_page(pfn);
-	page_ext = page_ext_get(page);
-
-	if (!page_ext)
-		return;
-
 	BUG_ON(PageSlab(page));
 	anon = PageAnon(page);
 
-	for (i = 0; i < pgcnt; i++) {
+	rcu_read_lock();
+	for_each_page_ext(page, pgcnt, page_ext, iter) {
 		struct page_table_check *ptc = get_page_table_check(page_ext);
 
 		if (anon) {
@@ -89,9 +85,8 @@ static void page_table_check_clear(unsigned long pfn, unsigned long pgcnt)
 			BUG_ON(atomic_read(&ptc->anon_map_count));
 			BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
 		}
-		page_ext = page_ext_next(page_ext);
 	}
-	page_ext_put(page_ext);
+	rcu_read_unlock();
 }
 
 /*
@@ -102,24 +97,20 @@ static void page_table_check_clear(unsigned long pfn, unsigned long pgcnt)
 static void page_table_check_set(unsigned long pfn, unsigned long pgcnt,
 				 bool rw)
 {
+	struct page_ext_iter iter;
 	struct page_ext *page_ext;
 	struct page *page;
-	unsigned long i;
 	bool anon;
 
 	if (!pfn_valid(pfn))
 		return;
 
 	page = pfn_to_page(pfn);
-	page_ext = page_ext_get(page);
-
-	if (!page_ext)
-		return;
-
 	BUG_ON(PageSlab(page));
 	anon = PageAnon(page);
 
-	for (i = 0; i < pgcnt; i++) {
+	rcu_read_lock();
+	for_each_page_ext(page, pgcnt, page_ext, iter) {
 		struct page_table_check *ptc = get_page_table_check(page_ext);
 
 		if (anon) {
@@ -129,9 +120,8 @@ static void page_table_check_set(unsigned long pfn, unsigned long pgcnt,
 			BUG_ON(atomic_read(&ptc->anon_map_count));
 			BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0);
 		}
-		page_ext = page_ext_next(page_ext);
 	}
-	page_ext_put(page_ext);
+	rcu_read_unlock();
 }
 
 /*
@@ -140,24 +130,19 @@ static void page_table_check_set(unsigned long pfn, unsigned long pgcnt,
  */
 void __page_table_check_zero(struct page *page, unsigned int order)
 {
+	struct page_ext_iter iter;
 	struct page_ext *page_ext;
-	unsigned long i;
 
 	BUG_ON(PageSlab(page));
 
-	page_ext = page_ext_get(page);
-
-	if (!page_ext)
-		return;
-
-	for (i = 0; i < (1ul << order); i++) {
+	rcu_read_lock();
+	for_each_page_ext_order(page, order, page_ext, iter) {
 		struct page_table_check *ptc = get_page_table_check(page_ext);
 
 		BUG_ON(atomic_read(&ptc->anon_map_count));
 		BUG_ON(atomic_read(&ptc->file_map_count));
-		page_ext = page_ext_next(page_ext);
 	}
-	page_ext_put(page_ext);
+	rcu_read_unlock();
 }
 
 void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte)
-- 
2.48.1

From nobody Thu Dec 18 23:20:46 2025
From: Luiz Capitulino
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, david@redhat.com,
 yuzhao@google.com, pasha.tatashin@soleen.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, muchun.song@linux.dev,
 luizcap@redhat.com
Subject: [PATCH 3/4] mm: page_owner: use new iteration API
Date: Tue, 18 Feb 2025 21:17:49 -0500
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The page_ext_next() function assumes that page extension objects for a
page order allocation always reside in the same memory section, which
may not be true and could lead to crashes. Use the new page_ext
iteration API instead.

Fixes: cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for gigantic folios")
Signed-off-by: Luiz Capitulino
---
 mm/page_owner.c | 61 +++++++++++++++++++++++--------------------------
 1 file changed, 29 insertions(+), 32 deletions(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 2d6360eaccbb6..9afc62a882813 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -229,17 +229,19 @@ static void dec_stack_record_count(depot_stack_handle_t handle,
 			handle);
 }
 
-static inline void __update_page_owner_handle(struct page_ext *page_ext,
+static inline void __update_page_owner_handle(struct page *page,
 					      depot_stack_handle_t handle,
 					      unsigned short order,
 					      gfp_t gfp_mask,
 					      short last_migrate_reason, u64 ts_nsec,
 					      pid_t pid, pid_t tgid, char *comm)
 {
-	int i;
+	struct page_ext_iter iter;
+	struct page_ext *page_ext;
 	struct page_owner *page_owner;
 
-	for (i = 0; i < (1 << order); i++) {
+	rcu_read_lock();
+	for_each_page_ext_order(page, order, page_ext, iter) {
 		page_owner = get_page_owner(page_ext);
 		page_owner->handle = handle;
 		page_owner->order = order;
@@ -252,20 +254,22 @@ static inline void __update_page_owner_handle(struct page_ext *page_ext,
 			sizeof(page_owner->comm));
 		__set_bit(PAGE_EXT_OWNER, &page_ext->flags);
 		__set_bit(PAGE_EXT_OWNER_ALLOCATED, &page_ext->flags);
-		page_ext = page_ext_next(page_ext);
 	}
+	rcu_read_unlock();
 }
 
-static inline void __update_page_owner_free_handle(struct page_ext *page_ext,
+static inline void __update_page_owner_free_handle(struct page *page,
 						   depot_stack_handle_t handle,
 						   unsigned short order,
 						   pid_t pid, pid_t tgid,
 						   u64 free_ts_nsec)
 {
-	int i;
+	struct page_ext_iter iter;
+	struct page_ext *page_ext;
 	struct page_owner *page_owner;
 
-	for (i = 0; i < (1 << order); i++) {
+	rcu_read_lock();
+	for_each_page_ext_order(page, order, page_ext, iter) {
 		page_owner = get_page_owner(page_ext);
 		/* Only __reset_page_owner() wants to clear the bit */
 		if (handle) {
@@ -275,8 +279,8 @@ static inline void __update_page_owner_free_handle(struct page_ext *page_ext,
 		page_owner->free_ts_nsec = free_ts_nsec;
 		page_owner->free_pid = current->pid;
 		page_owner->free_tgid = current->tgid;
-		page_ext = page_ext_next(page_ext);
 	}
+	rcu_read_unlock();
 }
 
 void __reset_page_owner(struct page *page, unsigned short order)
@@ -293,11 +297,11 @@ void __reset_page_owner(struct page *page, unsigned short order)
 
 	page_owner = get_page_owner(page_ext);
 	alloc_handle = page_owner->handle;
+	page_ext_put(page_ext);
 
 	handle = save_stack(GFP_NOWAIT | __GFP_NOWARN);
-	__update_page_owner_free_handle(page_ext, handle, order, current->pid,
+	__update_page_owner_free_handle(page, handle, order, current->pid,
 					current->tgid, free_ts_nsec);
-	page_ext_put(page_ext);
 
 	if (alloc_handle != early_handle)
 		/*
@@ -313,19 +317,13 @@ void __reset_page_owner(struct page *page, unsigned short order)
 noinline void __set_page_owner(struct page *page, unsigned short order,
 					gfp_t gfp_mask)
 {
-	struct page_ext *page_ext;
 	u64 ts_nsec = local_clock();
 	depot_stack_handle_t handle;
 
 	handle = save_stack(gfp_mask);
-
-	page_ext = page_ext_get(page);
-	if (unlikely(!page_ext))
-		return;
-	__update_page_owner_handle(page_ext, handle, order, gfp_mask, -1,
+	__update_page_owner_handle(page, handle, order, gfp_mask, -1,
 				   ts_nsec, current->pid, current->tgid,
 				   current->comm);
-	page_ext_put(page_ext);
 	inc_stack_record_count(handle, gfp_mask, 1 << order);
 }
 
@@ -344,26 +342,24 @@ void __set_page_owner_migrate_reason(struct page *page, int reason)
 
 void __split_page_owner(struct page *page, int old_order, int new_order)
 {
-	int i;
-	struct page_ext *page_ext = page_ext_get(page);
+	struct page_ext_iter iter;
+	struct page_ext *page_ext;
 	struct page_owner *page_owner;
 
-	if (unlikely(!page_ext))
-		return;
-
-	for (i = 0; i < (1 << old_order); i++) {
+	rcu_read_lock();
+	for_each_page_ext_order(page, old_order, page_ext, iter) {
 		page_owner = get_page_owner(page_ext);
 		page_owner->order = new_order;
-		page_ext = page_ext_next(page_ext);
 	}
-	page_ext_put(page_ext);
+	rcu_read_unlock();
 }
 
 void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 {
-	int i;
 	struct page_ext *old_ext;
 	struct page_ext *new_ext;
+	struct page_ext *page_ext;
+	struct page_ext_iter iter;
 	struct page_owner *old_page_owner;
 	struct page_owner *new_page_owner;
 	depot_stack_handle_t migrate_handle;
@@ -381,7 +377,7 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 	old_page_owner = get_page_owner(old_ext);
 	new_page_owner = get_page_owner(new_ext);
 	migrate_handle = new_page_owner->handle;
-	__update_page_owner_handle(new_ext, old_page_owner->handle,
+	__update_page_owner_handle(&newfolio->page, old_page_owner->handle,
 				   old_page_owner->order, old_page_owner->gfp_mask,
 				   old_page_owner->last_migrate_reason,
 				   old_page_owner->ts_nsec, old_page_owner->pid,
@@ -391,7 +387,7 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 	 * will be freed after migration. Keep them until then as they may be
 	 * useful.
 	 */
-	__update_page_owner_free_handle(new_ext, 0, old_page_owner->order,
+	__update_page_owner_free_handle(&newfolio->page, 0, old_page_owner->order,
 					old_page_owner->free_pid,
 					old_page_owner->free_tgid,
 					old_page_owner->free_ts_nsec);
@@ -400,11 +396,12 @@ void __folio_copy_owner(struct folio *newfolio, struct folio *old)
 	 * for the new one and the old folio otherwise there will be an imbalance
 	 * when subtracting those pages from the stack.
 	 */
-	for (i = 0; i < (1 << new_page_owner->order); i++) {
+	rcu_read_lock();
+	for_each_page_ext_order(&old->page, new_page_owner->order, page_ext, iter) {
+		old_page_owner = get_page_owner(page_ext);
 		old_page_owner->handle = migrate_handle;
-		old_ext = page_ext_next(old_ext);
-		old_page_owner = get_page_owner(old_ext);
 	}
+	rcu_read_unlock();
 
 	page_ext_put(new_ext);
 	page_ext_put(old_ext);
@@ -813,7 +810,7 @@ static void init_pages_in_zone(pg_data_t *pgdat, struct zone *zone)
 				goto ext_put_continue;
 
 			/* Found early allocated page */
-			__update_page_owner_handle(page_ext, early_handle, 0, 0,
+			__update_page_owner_handle(page, early_handle, 0, 0,
 						   -1, local_clock(), current->pid,
 						   current->tgid, current->comm);
 			count++;
-- 
2.48.1

From nobody Thu Dec 18 23:20:46 2025
From: Luiz Capitulino
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, david@redhat.com,
 yuzhao@google.com, pasha.tatashin@soleen.com
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, muchun.song@linux.dev,
 luizcap@redhat.com
Subject: [PATCH 4/4] mm: page_ext: make page_ext_next() private to page_ext
Date: Tue, 18 Feb 2025 21:17:50 -0500
Message-ID: <5794ff5b322febd376728c8e22c802c15827dcc8.1739931468.git.luizcap@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Previous commits removed page_ext_next() use from page_ext clients. Make
it private.

Fixes: cf54f310d0d3 ("mm/hugetlb: use __GFP_COMP for gigantic folios")
Signed-off-by: Luiz Capitulino
---
 include/linux/page_ext.h | 7 -------
 mm/page_ext.c            | 7 +++++++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
index a99da12e59fa7..3a4f0f825aa59 100644
--- a/include/linux/page_ext.h
+++ b/include/linux/page_ext.h
@@ -101,13 +101,6 @@ static inline void *page_ext_data(struct page_ext *page_ext,
 	return (void *)(page_ext) + ops->offset;
 }
 
-static inline struct page_ext *page_ext_next(struct page_ext *curr)
-{
-	void *next = curr;
-	next += page_ext_size;
-	return next;
-}
-
 struct page_ext_iter {
 	unsigned long pfn;
 	unsigned long index;
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 508deb04d5ead..f9e515d353005 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -567,6 +567,13 @@ struct page_ext *page_ext_iter_begin(struct page_ext_iter *iter, struct page *pa
 	return iter->page_ext;
 }
 
+static struct page_ext *page_ext_next(struct page_ext *curr)
+{
+	void *next = curr;
+	next += page_ext_size;
+	return next;
+}
+
 /**
  * page_ext_iter_next() - Get next page extension
  * @iter: page extension iterator.
-- 
2.48.1