From nobody Sun Jun 14 02:31:57 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 15CA5378D71; Mon, 4 May 2026 13:21:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777900888; cv=none; b=tDzY43VFzodLtmMH4WgLfseW6tm4R1jG5ZVo87MRP6AWh8ROURDBegDrk0fGdBDZO+KsFilb1+Z0EmO7sgz7VpwiT/NViGjqORKztywnLXT6lJA5e0O9UvYqBAjocm7Q4p9ryWwpGbGFfeTMq3crephHSrAi5WgboT7G4WKlETQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777900888; c=relaxed/simple; bh=2aFCrdi086X8qaY+nrQYNcsSeV5HvIWfW7LHiwbQchQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=taiYiqx1/F8qC5UvkZMGh/JjlnSFwvUKTD1hA7bJhH3TyoK2BKlyuX++S23G4zjZnvkP/EqOip2KggoyP7X9sGAUZM/WoLml1bJYd3/CV9X5c7AnT1xVz9mz6eVEVTHb8+X9kOEaEkr55OM/ov4xI5vt+jOjb94PBvb01Dw0VIY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=cmW1qlBQ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="cmW1qlBQ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 34274C4AF09; Mon, 4 May 2026 13:21:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777900887; bh=2aFCrdi086X8qaY+nrQYNcsSeV5HvIWfW7LHiwbQchQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=cmW1qlBQq1HOb3cm9DFTA9YWVwxGTYZ/7TDKbbF+u0tf0t0+vmPZw2hBkA+K8gxOI aFjZC9Igr0Kf7wNgdzLmEu0yIwOhPPkmhQ8RFdE3H2G2BEGqdpjlihBLjHCyZz113T PiFHPjWRTNQX64T+kzdNYo+BBay/WjHds/NBI2nIcopTuune73KPhKIKkdH8mnSt8/ 
mWaYsErFDGhpJx8I0j/uDTXjt6v4duueKgk3bKYpmzwD3JCDga8OjnPWjxEiVA1RkD 2finvn/5LM7g6+uSv+LuaUVfnrwlvhMv1DLRHIM4l+29nGQUfcQTbU6d/ysSVMKGXH Ujiab4Xt/9Ikg== From: Jeff Layton Date: Mon, 04 May 2026 15:20:49 +0200 Subject: [PATCH v5 1/2] mm: track DONTCACHE dirty pages per bdi_writeback Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260504-dontcache-v5-1-4103e58bb377@kernel.org> References: <20260504-dontcache-v5-0-4103e58bb377@kernel.org> In-Reply-To: <20260504-dontcache-v5-0-4103e58bb377@kernel.org> To: Alexander Viro , Christian Brauner , Jan Kara , "Matthew Wilcox (Oracle)" , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Mike Snitzer , Jens Axboe , Ritesh Harjani , Chuck Lever Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=3721; i=jlayton@kernel.org; h=from:subject:message-id; bh=2aFCrdi086X8qaY+nrQYNcsSeV5HvIWfW7LHiwbQchQ=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBp+J1NRZKbyI7QhGLiFF2oAvXgvp74opb+1ER2w V1/zzRaInuJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCafidTQAKCRAADmhBGVaC FVfRD/43QS9dZ/TF2zC6q4G6CqPoyA3ARP6obj1Xj+LbLpvyEmgsXxz+3d+U/FfnrA8BRHtb9QL qr+DcoUZk2qB+URtNgCPYPwKBDtqkgJ+SisKUHuXzqjw3qMTh6c4znnSG633tJttJopxbpR8YII s7t5olXjXb4eEmwPXIRAKzfrnxHBIj9WN9NZpWIKHDuQP529fBhbZEVyOnxorJuYKUebcLYerxm ha+rmiV4D17PNqIyUeX5rYbvKeeD3bUuXCQ1H+qI580RjMfbgUlisrWaK9vh7m+wB90snPQvcYQ dS9oFMbTspr49VuAUf/VEl6sFBgEllf+GwuujzQBWomgFJDCCM/xpeb1pRfc28ciMhTMzRD6lxq nLBiUtrwSR1UfOoruRdfA4PYDPsB6vFq5hNZMxgWhyJXgTrJhUif2Vzg+ADwlUMZfL49Xa1hYbu TS6q9Ph5QR+x1q7QaWGrax44oyb25sS7eRWljcso6CLFZU9EOrc6hoxEHFQyZpzVimCExJVFSkD 
BA8N0u5a9Wb27puAE8r2v0xFcGhZqf2qBiMZuTWNiiJ6csZBaeuvziWe+ESXIgabtB82LDJjkIm 8DUcz0KpgvQYgTixhjkclRiF/6xqWnRG0G1EZ/CFfKBjt11N2Fd9F7enrpPODrT2CWcHbqoQuKr zLenDJ+Xa6OU7GQ==
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215

Add a per-wb WB_DONTCACHE_DIRTY counter that tracks the number of dirty
pages with the dropbehind flag set (i.e., pages dirtied via
RWF_DONTCACHE writes).

Increment the counter alongside WB_RECLAIMABLE in
folio_account_dirtied() when the folio has the dropbehind flag set, and
decrement it in folio_clear_dirty_for_io() and folio_account_cleaned().
Also decrement it when a non-DONTCACHE lookup clears the dropbehind flag
on a dirty folio in __filemap_get_folio_mpol(), using proper writeback
domain locking.

The counter will be used by the writeback flusher to determine how many
pages to write back when expediting writeback for IOCB_DONTCACHE writes,
without flushing the entire BDI's dirty pages.

Suggested-by: Jan Kara
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Jan Kara
Signed-off-by: Jeff Layton
---
 include/linux/backing-dev-defs.h |  1 +
 mm/filemap.c                     | 13 ++++++++++++-
 mm/page-writeback.c              |  6 ++++++
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index a06b93446d10..cb660dd37286 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -33,6 +33,7 @@ enum wb_stat_item {
 	WB_WRITEBACK,
 	WB_DIRTIED,
 	WB_WRITTEN,
+	WB_DONTCACHE_DIRTY,
 	NR_WB_STAT_ITEMS
 };
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 4e636647100c..1c9c0d5f495f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2052,8 +2052,19 @@ struct folio *__filemap_get_folio_mpol(struct address_space *mapping,
 	if (!folio)
 		return ERR_PTR(-ENOENT);
 	/* not an uncached lookup, clear uncached if set */
-	if (folio_test_dropbehind(folio) && !(fgp_flags & FGP_DONTCACHE))
+	if (folio_test_dropbehind(folio) && !(fgp_flags & FGP_DONTCACHE)) {
+		if (folio_test_dirty(folio)) {
+			struct inode *inode = mapping->host;
+			struct bdi_writeback *wb;
+			struct wb_lock_cookie cookie = {};
+
+			wb = unlocked_inode_to_wb_begin(inode, &cookie);
+			wb_stat_mod(wb, WB_DONTCACHE_DIRTY,
+				    -folio_nr_pages(folio));
+			unlocked_inode_to_wb_end(inode, &cookie);
+		}
 		folio_clear_dropbehind(folio);
+	}
 	return folio;
 }
 EXPORT_SYMBOL(__filemap_get_folio_mpol);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 88cd53d4ba09..8e520717d1f6 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2630,6 +2630,8 @@ static void folio_account_dirtied(struct folio *folio,
 		wb = inode_to_wb(inode);
 
 		lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, nr);
+		if (folio_test_dropbehind(folio))
+			wb_stat_mod(wb, WB_DONTCACHE_DIRTY, nr);
 		__zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, nr);
 		__node_stat_mod_folio(folio, NR_DIRTIED, nr);
 		wb_stat_mod(wb, WB_RECLAIMABLE, nr);
@@ -2651,6 +2653,8 @@ void folio_account_cleaned(struct folio *folio, struct bdi_writeback *wb)
 	long nr = folio_nr_pages(folio);
 
 	lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, -nr);
+	if (folio_test_dropbehind(folio))
+		wb_stat_mod(wb, WB_DONTCACHE_DIRTY, -nr);
 	zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
 	wb_stat_mod(wb, WB_RECLAIMABLE, -nr);
 	task_io_account_cancelled_write(nr * PAGE_SIZE);
@@ -2920,6 +2924,8 @@ bool folio_clear_dirty_for_io(struct folio *folio)
 		if (folio_test_clear_dirty(folio)) {
 			long nr = folio_nr_pages(folio);
 			lruvec_stat_mod_folio(folio, NR_FILE_DIRTY, -nr);
+			if (folio_test_dropbehind(folio))
+				wb_stat_mod(wb, WB_DONTCACHE_DIRTY, -nr);
 			zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr);
 			wb_stat_mod(wb, WB_RECLAIMABLE, -nr);
 			ret = true;
-- 
2.54.0

From nobody Sun Jun 14 02:31:57 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix)
with ESMTPS id 284D33D6CAC; Mon, 4 May 2026 13:21:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777900893; cv=none; b=EzxjqM8lPjbpSqRltv5pxU+NF/uzEz0u8gRBktxyww4Jjq2a/35Twp9LVS9pFbdYHFhkVDjui24rAqQvzYjzk3Xk6dyzdQhLvs60iNsyeulxD2MKIeyx9w8il3bRfLP3tqkm5ODqqTF+gZXiQNWARSQngUxA9epqKxKu16lhxcQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777900893; c=relaxed/simple; bh=K5q1zRIUANtP5+ezPR77QL8VR3Y+17PrUYBZelKYwQA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ipf4MV4/0zlOxiAEb52fCKRjk80NxZGz/XpWybtthXB3U6VWOqIoT3wvndYM59N7ZU+q3yL63izXkiLLCNM3UP766Xeoz1G5xmKwM+e4fxUimJcMR483tUuVeF1r2Hh24d6fVNwqkz4KGi+dIGYN7iHoAFjaEIXVU3k9Q8VHHO8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=j54ynDpx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="j54ynDpx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 62EEFC2BCB9; Mon, 4 May 2026 13:21:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777900893; bh=K5q1zRIUANtP5+ezPR77QL8VR3Y+17PrUYBZelKYwQA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=j54ynDpx41SgtpTVwV13AhhL34YKqxMGpZizKJZcLZ+2+lrgjuxzi3Zk5lCbe4zOj 6XTe/latR5gEu258dSfB0RATqtjUNpF9v+RQKNcgwGwNTzzkIg8p9JqGbjNJtPLYr/ oYoEV24VejBaNO75lVqN4Xn1pR3DIjsXA+F5+nHZs0zRRdXiVDJPJWTgkySqU2tdnN nQEyBH4d6Kr1W3NiQ/+O8XvrSkKKS/fY6I/KqFbmMkZdzo4v0shUYt9lN4T7u1/xhj lMpu+qOjNXn4CSMhwSMIJCkqEdLdAhhQ4ycUeyND4Lj5LsRbn20ljQfk8Z0JH86A+6 jTEtqphdH50Mw== From: Jeff Layton Date: Mon, 04 May 2026 15:20:50 +0200 Subject: [PATCH v5 2/2] mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260504-dontcache-v5-2-4103e58bb377@kernel.org> References: <20260504-dontcache-v5-0-4103e58bb377@kernel.org> In-Reply-To: <20260504-dontcache-v5-0-4103e58bb377@kernel.org> To: Alexander Viro , Christian Brauner , Jan Kara , "Matthew Wilcox (Oracle)" , Andrew Morton , David Hildenbrand , Lorenzo Stoakes , "Liam R. Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Mike Snitzer , Jens Axboe , Ritesh Harjani , Chuck Lever Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=9241; i=jlayton@kernel.org; h=from:subject:message-id; bh=K5q1zRIUANtP5+ezPR77QL8VR3Y+17PrUYBZelKYwQA=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBp+J1NiIvi0vMSfHfxkcBTX763n3T4fxh6pexT8 YCbsfLYCRmJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCafidTQAKCRAADmhBGVaC FVOaEACVZZ3bkMrZlVY1320n9egVK7t2tt9rIjgPzUaEXprs9pjMWgMRRQBIeC/Rwy3a94SfMUN koHfEaSesiOsNg72wBd2wlkTFYmsjM78M+aRz3jMrlAXQsSrGaOr1f3LJ4qXKmd+C711vkWKkRv 5q85XNGvIWF2mXApReoSsgxQy4jcJSQu9Qwjsl2s7CKyCiX3HhA/Cuv7SPAAcUcmbqKc3EzugtS OgSgiHHhdxTGoXK/FOXFhz1M/E5a3sHr17qUwo6DsyjKyvSiFlsLiWYuEgKIwxW6J1itILeGi8a X0be5F2+6gEBRou/yF5fvH8jZC9+lCeFdNR+lXwC+n/OAVltkPu6IrAv4JA6Bk2mfFUlOSvsrEQ Ncxjpww+Q0YBtE1NAu8Qjx+a2lht0xI+/7NxkO85sAfU93F4egCAvh5If+Nw3e2in2GVfNmPbuj KSFEhBsa17IXZoIeQ5ZG4hseRVWRlneHq9x3d0G12HOyjseq6MSgUjzvBTP5ywDe921aWQu1hTI /+3Iog762IfDxLhzPA76ZAPhPQu4ZSDdjfbPaqzYu7toyaRBJDea8311y7NLM2Uer5E2ug2Vd2+ ZJVieEQ3C74DSbEN8cGc/KbZZvI6XhA2tTaJV8y0iVfcRGspgZu1u680cvRQwl3m8EkLnkYe/gU 6IX0WNnHJ9rfE5Q== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 The IOCB_DONTCACHE writeback path in generic_write_sync() calls filemap_flush_range() on every 
write, submitting writeback inline in the writer's context. Perf lock
contention profiling shows the performance problem is not lock
contention but the writeback submission work itself: walking the page
tree and submitting I/O blocks the writer for milliseconds, inflating
p99.9 latency from 23ms (buffered) to 93ms (dontcache).

Replace the inline filemap_flush_range() call with a flusher kick that
drains dirty pages in the background. This moves writeback submission
completely off the writer's hot path.

To avoid flushing unrelated buffered dirty data, add a dedicated
WB_start_dontcache bit and wb_check_start_dontcache() handler that uses
the per-wb WB_DONTCACHE_DIRTY counter to determine how many pages to
write back. The flusher writes back that many pages from the oldest
dirty inodes (not restricted to dontcache-specific inodes). This helps
preserve I/O batching while limiting the scope of expedited writeback.

Like WB_start_all, the WB_start_dontcache bit coalesces multiple
DONTCACHE writes into a single flusher wakeup without per-write
allocations.

Also add WB_REASON_DONTCACHE as a new writeback reason for tracing
visibility, and target the correct cgroup writeback domain via
unlocked_inode_to_wb_begin().

dontcache-bench results (same host, T6F_SKL_1920GBF, 251 GiB RAM,
xfs on NVMe, fio io_uring):

Buffered and direct I/O paths are unaffected by this patchset. All
improvements are confined to the dontcache path:

Single-stream throughput (MB/s):
                          Before    After    Change
  seq-write/dontcache        298      897     +201%
  rand-write/dontcache       131      236      +80%

Tail latency improvements (seq-write/dontcache):
  p99:    135,266 us -> 23,986 us  (-82%)
  p99.9:  8,925,479 us -> 28,443 us  (-99.7%)

Multi-writer (4 jobs, sequential write):
                              Before    After    Change
  dontcache aggregate (MB/s)   2,529    4,532      +79%
  dontcache p99 (us)           8,553    1,002      -88%
  dontcache p99.9 (us)       109,314    1,057      -99%

Dontcache multi-writer throughput now matches buffered (4,532 vs
4,616 MB/s).

32-file write (Axboe test):
                              Before    After    Change
  dontcache aggregate (MB/s)   1,548    3,499     +126%
  dontcache p99 (us)          10,170      602      -94%
  Peak dirty pages (MB)        1,837      213      -88%

Dontcache now reaches 81% of buffered throughput (was 35%).

Competing writers (dontcache vs buffered, separate files):
                      Before    After
  buffered writer        868      433  MB/s
  dontcache writer       415      433  MB/s
  Aggregate            1,284      866  MB/s

Previously the buffered writer starved the dontcache writer 2:1. With
per-bdi_writeback tracking, both writers now receive equal bandwidth.
The aggregate matches the buffered-vs-buffered baseline (863 MB/s),
indicating fair sharing regardless of I/O mode.

The dontcache writer's p99.9 latency collapsed from 119 ms to 33 ms
(-73%), eliminating the severe periodic stalls seen in the baseline.
Both writers now share identical latency profiles, matching the
buffered-vs-buffered pattern.

The per-bdi_writeback dirty tracking dramatically reduces peak dirty
pages in dontcache workloads, with the 32-file test dropping from
1.8 GB to 213 MB. Dontcache sequential write throughput triples and
multi-writer throughput reaches parity with buffered I/O, with tail
latencies collapsing by 1-2 orders of magnitude.

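For readers unfamiliar with the coalescing trick, here is a minimal
userspace sketch of the pattern described above: a cheap flag test
followed by an atomic test-and-set, so that only the first writer after
the flusher re-arms the flag issues a wakeup, while the flusher sizes
its work from a dirty-page counter. This is an illustration under
stated assumptions, not the kernel code. struct toy_wb, toy_kick() and
toy_flush() are hypothetical names, C11 atomics stand in for the kernel
bitops and per-wb stat counters, and draining the counter in one step
is a simplification (in the patch the counter is decremented as folios
are cleaned, and the flusher may write non-dontcache pages too).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct bdi_writeback. */
struct toy_wb {
	atomic_bool start_dontcache;	/* models the WB_start_dontcache bit */
	atomic_long dontcache_dirty;	/* models the WB_DONTCACHE_DIRTY counter */
	atomic_long wakeups;		/* number of flusher wakeups issued */
};

/*
 * Writer side: account nr newly dirtied pages, then kick the flusher.
 * Returns true only if this call actually issued a wakeup.
 */
static bool toy_kick(struct toy_wb *wb, long nr)
{
	atomic_fetch_add(&wb->dontcache_dirty, nr);

	/*
	 * Plain load first to avoid a read-modify-write when the flag is
	 * already set; the exchange then picks exactly one winner.
	 */
	if (atomic_load(&wb->start_dontcache) ||
	    atomic_exchange(&wb->start_dontcache, true))
		return false;

	atomic_fetch_add(&wb->wakeups, 1);
	return true;
}

/*
 * Flusher side: take the page budget from the counter, then re-arm the
 * flag. Returns the number of pages this pass was asked to write back.
 * Assumption: the whole budget is drained in one step.
 */
static long toy_flush(struct toy_wb *wb)
{
	long nr;

	if (!atomic_load(&wb->start_dontcache))
		return 0;

	nr = atomic_exchange(&wb->dontcache_dirty, 0);
	assert(nr >= 0);
	atomic_store(&wb->start_dontcache, false);
	return nr;
}
```

In this model, three back-to-back toy_kick() calls produce a single
wakeup, and the following toy_flush() receives the combined page count
of all three writes, which is the batching effect the WB_start_dontcache
bit is meant to provide.
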
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Jan Kara
Reviewed-by: Jens Axboe
Signed-off-by: Jeff Layton
---
 fs/fs-writeback.c                | 65 ++++++++++++++++++++++++++++++++++++++++
 include/linux/backing-dev-defs.h |  2 ++
 include/linux/fs.h               |  6 ++--
 include/trace/events/writeback.h |  3 +-
 4 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index a65694cbfe68..ebef485b2f8b 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1334,6 +1334,18 @@ static void wb_start_writeback(struct bdi_writeback *wb, enum wb_reason reason)
 	wb_wakeup(wb);
 }
 
+static void wb_start_dontcache_writeback(struct bdi_writeback *wb)
+{
+	if (!wb_has_dirty_io(wb))
+		return;
+
+	if (test_bit(WB_start_dontcache, &wb->state) ||
+	    test_and_set_bit(WB_start_dontcache, &wb->state))
+		return;
+
+	wb_wakeup(wb);
+}
+
 /**
  * wb_start_background_writeback - start background writeback
  * @wb: bdi_writback to write from
@@ -2373,6 +2385,28 @@ static long wb_check_start_all(struct bdi_writeback *wb)
 	return nr_pages;
 }
 
+static long wb_check_start_dontcache(struct bdi_writeback *wb)
+{
+	long nr_pages;
+
+	if (!test_bit(WB_start_dontcache, &wb->state))
+		return 0;
+
+	nr_pages = wb_stat(wb, WB_DONTCACHE_DIRTY);
+	if (nr_pages) {
+		struct wb_writeback_work work = {
+			.nr_pages = nr_pages,
+			.sync_mode = WB_SYNC_NONE,
+			.range_cyclic = 1,
+			.reason = WB_REASON_DONTCACHE,
+		};
+
+		nr_pages = wb_writeback(wb, &work);
+	}
+
+	clear_bit(WB_start_dontcache, &wb->state);
+	return nr_pages;
+}
 
 /*
  * Retrieve work items and do the writeback they describe
@@ -2394,6 +2428,11 @@ static long wb_do_writeback(struct bdi_writeback *wb)
 	 */
 	wrote += wb_check_start_all(wb);
 
+	/*
+	 * Check for dontcache writeback request
+	 */
+	wrote += wb_check_start_dontcache(wb);
+
 	/*
 	 * Check for periodic writeback, kupdated() style
 	 */
@@ -2468,6 +2507,32 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 	rcu_read_unlock();
 }
 
+/**
+ * filemap_dontcache_kick_writeback - kick flusher for IOCB_DONTCACHE writes
+ * @mapping: address_space that was just written to
+ *
+ * Kick the writeback flusher thread to expedite writeback of dontcache dirty
+ * pages. Queue writeback for the inode's wb for as many pages as there are
+ * dontcache pages, but don't restrict writeback to dontcache pages only.
+ *
+ * This significantly improves performance over either writing all wb's pages
+ * or writing only dontcache pages. Although it doesn't guarantee quick
+ * writeback and reclaim of dontcache pages, it keeps the amount of dirty pages
+ * in check. Over longer term dontcache pages get written and reclaimed by
+ * background writeback even with this rough heuristic.
+ */
+void filemap_dontcache_kick_writeback(struct address_space *mapping)
+{
+	struct inode *inode = mapping->host;
+	struct bdi_writeback *wb;
+	struct wb_lock_cookie cookie = {};
+
+	wb = unlocked_inode_to_wb_begin(inode, &cookie);
+	wb_start_dontcache_writeback(wb);
+	unlocked_inode_to_wb_end(inode, &cookie);
+}
+EXPORT_SYMBOL_GPL(filemap_dontcache_kick_writeback);
+
 /*
  * Wakeup the flusher threads to start writeback of all currently dirty pages
  */
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index cb660dd37286..4f1084937315 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -26,6 +26,7 @@ enum wb_state {
 	WB_writeback_running,	/* Writeback is in progress */
 	WB_has_dirty_io,	/* Dirty inodes on ->b_{dirty|io|more_io} */
 	WB_start_all,		/* nr_pages == 0 (all) work pending */
+	WB_start_dontcache,	/* dontcache writeback pending */
 };
 
 enum wb_stat_item {
@@ -56,6 +57,7 @@ enum wb_reason {
 	 */
 	WB_REASON_FORKER_THREAD,
 	WB_REASON_FOREIGN_FLUSH,
+	WB_REASON_DONTCACHE,
 
 	WB_REASON_MAX,
 };
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 11559c513dfb..df72b42a9e9b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2624,6 +2624,7 @@ extern int __must_check file_write_and_wait_range(struct file *file,
 						loff_t start, loff_t end);
 int filemap_flush_range(struct address_space *mapping, loff_t start,
 		loff_t end);
+void filemap_dontcache_kick_writeback(struct address_space *mapping);
 
 static inline int file_write_and_wait(struct file *file)
 {
@@ -2657,10 +2658,7 @@ static inline ssize_t generic_write_sync(struct kiocb *iocb, ssize_t count)
 		if (ret)
 			return ret;
 	} else if (iocb->ki_flags & IOCB_DONTCACHE) {
-		struct address_space *mapping = iocb->ki_filp->f_mapping;
-
-		filemap_flush_range(mapping, iocb->ki_pos - count,
-				    iocb->ki_pos - 1);
+		filemap_dontcache_kick_writeback(iocb->ki_filp->f_mapping);
 	}
 
 	return count;
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index bdac0d685a98..13ee076ccd16 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -44,7 +44,8 @@
 	EM( WB_REASON_PERIODIC,			"periodic")		\
 	EM( WB_REASON_FS_FREE_SPACE,		"fs_free_space")	\
 	EM( WB_REASON_FORKER_THREAD,		"forker_thread")	\
-	EMe(WB_REASON_FOREIGN_FLUSH,		"foreign_flush")
+	EM( WB_REASON_FOREIGN_FLUSH,		"foreign_flush")	\
+	EMe(WB_REASON_DONTCACHE,		"dontcache")
 
 WB_WORK_REASON
 
-- 
2.54.0