On 20/03/2026 7.34 am, JP Kobryn (Meta) wrote:
> We're finding that while under memory pressure, direct reclaim is kicking
> in during compressed readahead. This puts the associated task into D-state.
> Then shrink_lruvec() disables interrupts when acquiring the LRU lock. Under
> heavy pressure, reclaim can run long enough that the CPU becomes prone to
> CSD lock stalls since it cannot service incoming IPIs. Although the CSD
> lock stalls are the worst-case scenario, we have found many more subtle
> occurrences of this latency on the order of seconds, over a minute in some
> cases.
>
> Prevent direct reclaim during compressed readahead. This is achieved by
> using different GFP flags whenever the bio is marked for readahead. The
> flags are similar to GFP_NOFS but stripped of __GFP_DIRECT_RECLAIM. Also,
> __GFP_NOWARN is added since these allocations are allowed to fail. Demand
> reads still use full GFP_NOFS and will enter reclaim if needed.
This seems like a sensible change to me. Readahead is speculative, so it's
better for it to fail than to cause problems elsewhere.
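
For anyone following along, the flag selection described in the quoted
text could be sketched in userspace roughly as below. This is only an
illustration under stated assumptions, not the code from the patch: the
bit values are placeholders, and the helper name compressed_read_gfp()
is made up for this example. The only parts taken from the kernel are
the flag names and the way GFP_NOFS is composed from them.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Placeholder bit values for illustration; the real ones live in the kernel's gfp headers. */
#define __GFP_IO             0x01u
#define __GFP_DIRECT_RECLAIM 0x02u
#define __GFP_KSWAPD_RECLAIM 0x04u
#define __GFP_NOWARN         0x08u

/* Same composition as in the kernel: GFP_NOFS permits both kinds of reclaim. */
#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)

static_assert((GFP_NOFS & __GFP_DIRECT_RECLAIM) != 0,
	      "GFP_NOFS is expected to allow direct reclaim");

/*
 * Hypothetical helper (the name is not from the patch): pick the
 * allocation flags for a compressed read.
 */
static gfp_t compressed_read_gfp(bool readahead)
{
	/*
	 * Readahead is speculative: drop direct reclaim so the allocation
	 * fails fast, and silence the failure warning. kswapd may still
	 * be woken in the background.
	 */
	if (readahead)
		return (GFP_NOFS & ~__GFP_DIRECT_RECLAIM) | __GFP_NOWARN;

	/* Demand reads keep full GFP_NOFS and may enter direct reclaim. */
	return GFP_NOFS;
}
```

With flags like these, a readahead allocation can still wake kswapd but
never blocks the caller in direct reclaim, while a demand read behaves
as before.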
> There has been some previous work done to reduce the frequency of calling
> add_ra_bio_pages() [0]. This patch is complementary in that it reduces the
> latency associated with those calls.
>
> [0] https://lore.kernel.org/linux-btrfs/656838ec1232314a2657716e59f4f15a8eadba64.1751492111.git.boris@bur.io/
>
> JP Kobryn (Meta) (2):
> btrfs: additional gfp api for allocating compressed folios
> btrfs: prevent direct reclaim during compressed readahead
>
> fs/btrfs/compression.c | 44 ++++++++++++++++++++++++++++++++++--------
> fs/btrfs/compression.h | 1 +
> 2 files changed, 37 insertions(+), 8 deletions(-)
>