From nobody Sun Apr 26 09:31:05 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 64C0DC43334
	for ; Sun, 19 Jun 2022 15:12:07 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S233498AbiFSPMF (ORCPT ); Sun, 19 Jun 2022 11:12:05 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50336 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231438AbiFSPLv (ORCPT ); Sun, 19 Jun 2022 11:11:51 -0400
Received: from casper.infradead.org (casper.infradead.org
	[IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix)
	with ESMTPS id E86B6AE56; Sun, 19 Jun 2022 08:11:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=casper.20170209;
	h=Content-Transfer-Encoding:MIME-Version:
	References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:
	Content-Type:Content-ID:Content-Description;
	bh=bindIO2lTFDesoOeQ2oViKfEfCZlyJ8TfNVBigsijEU=;
	b=LFYmATCu1fX3l78jDNbK+xhNAo
	DcgucQf7ONWKGyPqRpVHlCUmhmynjxiGS9VGyA6neq2L8jTOMjkb8NmhEArLoabLJyWsGC/xSVtED
	xy+pBQ314fj/6ws20wk/YlffE786PNkPua4pe8DEjyJZ9tnsH9+7u7aeI5n/RU5fi3XX14U8m8nm4
	vEZTIuOjcuFbBJ2gYgAGkq0Z7nv2OnusSgNwE5KLogbSvK26zxBKPyr7N+Gi29o2dOlu+YI9Wh+da
	umaqTxE5cd3nfTrioNl1gqhwSBpRuJQ/I1gnYIgkaB6BL0re7iafEHA8ZSMPizsETgpK4pM3L2LJz
	W5Bd5m+w==;
Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2
	(Red Hat Linux)) id 1o2waZ-004QOq-3d; Sun, 19 Jun 2022 15:11:47 +0000
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" , linux-kernel@vger.kernel.org, Yu Kuai ,
	Kent Overstreet
Subject: [PATCH 1/3] filemap: Correct the conditions for marking a folio as accessed
Date: Sun, 19 Jun 2022 16:11:41 +0100
Message-Id:
 <20220619151143.1054746-2-willy@infradead.org>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220619151143.1054746-1-willy@infradead.org>
References: <20220619151143.1054746-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

We had an off-by-one error which meant that we never marked the first page
in a read as accessed.  This was visible as a slowdown when re-reading
a file as pages were being evicted from cache too soon.  In reviewing
this code, we noticed a second bug where a multi-page folio would be
marked as accessed multiple times when doing reads that were less than
the size of the folio.

Abstract the comparison of whether two file positions are in the same
folio into a new function, fixing both of these bugs.

Reported-by: Yu Kuai
Reviewed-by: Kent Overstreet
Signed-off-by: Matthew Wilcox (Oracle)
Reported-by: kernel test robot
---
 mm/filemap.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index ac3775c1ce4c..577068868449 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2629,6 +2629,13 @@ static int filemap_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 	return err;
 }
 
+static inline bool pos_same_folio(loff_t pos1, loff_t pos2, struct folio *folio)
+{
+	unsigned int shift = folio_shift(folio);
+
+	return (pos1 >> shift == pos2 >> shift);
+}
+
 /**
  * filemap_read - Read data from the page cache.
  * @iocb: The iocb to read.
@@ -2700,11 +2707,11 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
 		writably_mapped = mapping_writably_mapped(mapping);
 
 		/*
-		 * When a sequential read accesses a page several times, only
+		 * When a read accesses the same folio several times, only
 		 * mark it as accessed the first time.
 		 */
-		if (iocb->ki_pos >> PAGE_SHIFT !=
-		    ra->prev_pos >> PAGE_SHIFT)
+		if (!pos_same_folio(iocb->ki_pos, ra->prev_pos - 1,
+				    fbatch.folios[0]))
 			folio_mark_accessed(fbatch.folios[0]);
 
 		for (i = 0; i < folio_batch_count(&fbatch); i++) {
-- 
2.35.1

From nobody Sun Apr 26 09:31:05 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 626B5C43334
	for ; Sun, 19 Jun 2022 15:12:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S231159AbiFSPMA (ORCPT ); Sun, 19 Jun 2022 11:12:00 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50334 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231346AbiFSPLv (ORCPT ); Sun, 19 Jun 2022 11:11:51 -0400
Received: from casper.infradead.org (casper.infradead.org
	[IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix)
	with ESMTPS id CC7F3AE53; Sun, 19 Jun 2022 08:11:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=casper.20170209;
	h=Content-Transfer-Encoding:MIME-Version:
	References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:
	Content-Type:Content-ID:Content-Description;
	bh=HEdDzWGE0/UHZ2jiI+06ZjvEQpyio5BCGMu+xcj8GF0=;
	b=sZmaruFh2/gF/L/HuQXhGK6qZb
	eTdqebfkQ7+xES9EIjJjUsRQ52aTC7z4Y4sZtIWXSUFhBfZ01esJ1pdMxtGgxPhdUKxQGTNJ1/TVN
	gL1525wV6ogGepJWkseXjFu3pBixV56lh1VRFx94jSN7Y32FQwrARLA0Rn0u8Z3fZ77zbHJeIJBeh
	OFPCrvTO+HA4Qth8rF7ewBPjeI5uHbaYh6MxARGxO2MOEyY2iFyPK0kRWD4dhYUy/8Q3heargL3Z0
	qCVn7BDm/o3v0koG61EZFrqWTBtXIh61XPKfbsUWMc7LKaCUgpyJtvABSQ2q0E6gMrtplQz5oratY
	cG//Gk2g==;
Received: from willy by casper.infradead.org with local (Exim 4.94.2 #2
	(Red Hat Linux)) id 1o2waZ-004QOs-5Y; Sun, 19 Jun 2022 15:11:47 +0000
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" , linux-kernel@vger.kernel.org, Dave Chinner ,
	Brian Foster , stable@vger.kernel.org
Subject: [PATCH 2/3] filemap: Handle sibling entries in filemap_get_read_batch()
Date: Sun, 19 Jun 2022 16:11:42 +0100
Message-Id: <20220619151143.1054746-3-willy@infradead.org>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220619151143.1054746-1-willy@infradead.org>
References: <20220619151143.1054746-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

If a read races with an invalidation followed by another read, it is
possible for a folio to be replaced with a higher-order folio.  If that
happens, we'll see a sibling entry for the new folio in the next iteration
of the loop.  This manifests as a NULL pointer dereference while holding
the RCU read lock.

Handle this by simply returning.  The next call will find the new folio
and handle it correctly.  The other ways of handling this rare race are
more complex and it's just not worth it.

Reported-by: Dave Chinner
Reported-by: Brian Foster
Debugged-by: Brian Foster
Tested-by: Brian Foster
Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle)
Reviewed-by: Brian Foster
---
 mm/filemap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 577068868449..ffdfbc8b0e3c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2385,6 +2385,8 @@ static void filemap_get_read_batch(struct address_space *mapping,
 			continue;
 		if (xas.xa_index > max || xa_is_value(folio))
 			break;
+		if (xa_is_sibling(folio))
+			break;
 		if (!folio_try_get_rcu(folio))
 			goto retry;
 
-- 
2.35.1

From nobody Sun Apr 26 09:31:05 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A025FC43334
	for ; Sun, 19 Jun 2022 15:11:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232417AbiFSPL6 (ORCPT ); Sun, 19 Jun 2022 11:11:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50326 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S231159AbiFSPLv (ORCPT ); Sun, 19 Jun 2022 11:11:51 -0400
Received: from casper.infradead.org (casper.infradead.org
	[IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix)
	with ESMTPS id 34352AE52; Sun, 19 Jun 2022 08:11:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
	d=infradead.org; s=casper.20170209;
	h=Content-Transfer-Encoding:MIME-Version:
	References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To:
	Content-Type:Content-ID:Content-Description;
	bh=rkDLWdA9qK6dJptJf/M5HrXtYhb5ttvBn9HErKKNMDI=;
	b=tmmoPedVacsHoBJuv77FbS1OIE
	7SC30HkfVjoJBDeZpX3j7OGOSuyv1LGvOS/awWV3auug2b3c7I/ens/EYBToPmtCdz0zIoLOHXJ39
	IiuJtdrJGFnx+KkF5jQywbM61r/8FFO2ymoz76cwqrNek1jICCAwHYhdwMnftyKJIx6WhCWFXO6X4
	UPqS0jgn1qeU7fRXgM4aAHBpgepX3EKHVoptrxqr2klZDX2Kb23EgkmjmnC1f5JOkXehYnd5JqR/U
	HvvcFM5kV8inFsizinujjrGoSQKLwq8X5M6qOWFW7PT2j3uJI6D7kHzUvq6OtHqlfo271eisrXDZI
	sYqmpVPQ==;
Received: from willy by casper.infradead.org with
 local (Exim 4.94.2 #2 (Red Hat Linux)) id 1o2waZ-004QOu-7l;
	Sun, 19 Jun 2022 15:11:47 +0000
From: "Matthew Wilcox (Oracle)"
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" , linux-kernel@vger.kernel.org, Xiubo Li
Subject: [PATCH 3/3] mm: Clear page->private when splitting or migrating a page
Date: Sun, 19 Jun 2022 16:11:43 +0100
Message-Id: <20220619151143.1054746-4-willy@infradead.org>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220619151143.1054746-1-willy@infradead.org>
References: <20220619151143.1054746-1-willy@infradead.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

In our efforts to remove uses of PG_private, we have found folios with
the private flag clear and folio->private not-NULL.  That is the root
cause behind 642d51fb0775 ("ceph: check folio PG_private bit instead
of folio->private").  It can also affect a few other filesystems that
haven't yet reported a problem.

compaction_alloc() can return a page with uninitialised page->private,
and rather than checking all the callers of migrate_pages(), just zero
page->private after calling get_new_page().  Similarly, the tail pages
from split_huge_page() may also have an uninitialised page->private.

Reported-by: Xiubo Li
Signed-off-by: Matthew Wilcox (Oracle)
Tested-by: Xiubo Li
---
 mm/huge_memory.c | 1 +
 mm/migrate.c     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f7248002dad9..9b31a50217b5 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2377,6 +2377,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
 			page_tail);
 	page_tail->mapping = head->mapping;
 	page_tail->index = head->index + tail;
+	page_tail->private = 0;
 
 	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();
diff --git a/mm/migrate.c b/mm/migrate.c
index e51588e95f57..6c1ea61f39d8 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1106,6 +1106,7 @@ static int unmap_and_move(new_page_t get_new_page,
 	if (!newpage)
 		return -ENOMEM;
 
+	newpage->private = 0;
 	rc = __unmap_and_move(page, newpage, force, mode);
 	if (rc == MIGRATEPAGE_SUCCESS)
 		set_page_owner_migrate_reason(newpage, reason);
-- 
2.35.1
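
A note on patch 1/3, since the arithmetic behind the two bugs is easy to
misread: the user-space sketch below models the comparison only and is not
kernel code.  struct toy_folio, toy_folio_shift(), pos_same_page_old() and
the 4 KiB PAGE_SHIFT are assumptions made for illustration; only
pos_same_folio() mirrors the helper added by the diff.  The sketch also
assumes that prev_pos holds the offset just past the previous read, which
would explain why the patch compares against ra->prev_pos - 1.

```c
#include <stdbool.h>

/* Assumption: 4 KiB base pages. */
#define PAGE_SHIFT 12

/* Hypothetical stand-in for struct folio; only the order is modelled. */
struct toy_folio {
	unsigned int order;	/* the folio covers 2^order base pages */
};

/* Models folio_shift(): log2 of the folio size in bytes. */
static unsigned int toy_folio_shift(const struct toy_folio *folio)
{
	return PAGE_SHIFT + folio->order;
}

/* The pre-patch test: are both positions in the same base page? */
static bool pos_same_page_old(long long pos, long long prev_pos)
{
	return (pos >> PAGE_SHIFT) == (prev_pos >> PAGE_SHIFT);
}

/* Mirrors the new helper: are both positions in the same folio? */
static bool pos_same_folio(long long pos1, long long pos2,
			   const struct toy_folio *folio)
{
	unsigned int shift = toy_folio_shift(folio);

	return (pos1 >> shift) == (pos2 >> shift);
}
```

With sequential 4 KiB reads, the second read starts at offset 4096 and
prev_pos is also 4096, so pos_same_page_old(4096, 4096) is true and the
newly reached page is not marked accessed.  Comparing against the last
byte actually read, pos_same_folio(4096, 4095, ...) on an order-0 folio is
false, so the page is marked.  On an order-2 (16 KiB) folio the same call
is true, so a folio that is still being read is not marked a second time.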