[RFC PATCH 0/2] Use high-order folios in mmap sync RA

Anatoly Stepanov posted 2 patches 2 months ago
[RFC PATCH 0/2] Use high-order folios in mmap sync RA
Posted by Anatoly Stepanov 2 months ago
When "fault around" is enabled, order-0 folios might significantly
slow down filemap_map_pages().

For example, when async RA is unable to start,
we might end up with a large mmap'ed file backed entirely by order-0 folios.

Imagine an access pattern where we access the file chunk by chunk,
with each chunk equal in size to the RA window,
until every chunk of the file is loaded into the page cache.

In this case, we never touch an RA-marked page, so async RA never kicks
in, and we end up with order-0 folios covering the whole file.
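
A minimal userspace sketch of the access pattern described above, not a
verified reproducer. The helper name touch_file_by_chunks() is hypothetical,
and the chunk size is an assumption standing in for the RA window (the real
window depends on the backing device's read_ahead_kb). Whether the RA-marked
page is really avoided depends on the kernel's mmap read-around heuristics.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the file read-only and read one byte at the
 * start of every `chunk`-sized piece, so each fault lands in a new chunk.
 * `chunk` is assumed to equal the RA window; pass e.g. 128 * 1024.
 * Returns the number of chunks touched, or -1 on error.
 */
long touch_file_by_chunks(const char *path, size_t chunk)
{
	const volatile unsigned char *p;
	struct stat st;
	long touched = 0;
	size_t len, off;
	int fd;

	if (chunk == 0)
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	len = (size_t)st.st_size;

	p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return -1;

	for (off = 0; off < len; off += chunk) {
		(void)p[off];	/* volatile read: forces one fault per chunk */
		touched++;
	}

	munmap((void *)p, len);
	return touched;
}
```

To observe the effect, the file's pages would need to be out of the page
cache before the run, and the resulting folio orders checked afterwards.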

Let's resolve this by starting sync RA with a high-order folio.

(procfs smaps patch is just for showing contpte coverage improvement for arm64)

Based on linux-7.0-rc5

Anatoly Stepanov (2):
  procfs: add contpte info into smaps
  filemap: use high-order folios in filemap sync RA

 fs/proc/task_mmu.c      | 20 +++++++++++++++++---
 include/linux/pagemap.h |  1 +
 mm/filemap.c            |  1 +
 mm/internal.h           |  1 +
 mm/memory.c             |  2 +-
 mm/readahead.c          |  5 +++--
 6 files changed, 24 insertions(+), 6 deletions(-)

-- 
2.34.1
Re: [RFC PATCH 0/2] Use high-order folios in mmap sync RA
Posted by Matthew Wilcox 2 months ago
On Thu, Apr 16, 2026 at 03:28:51AM +0800, Anatoly Stepanov wrote:
> When "fault around" is enabled, order-0 folios might significantly
> slow down filemap_map_pages().

There's a lot of "might" in this patchset.  I'd like to know that there
is a real workload that benefits from this, and if so by how much.

You raise an interesting point that faultaround may be slow, and maybe
we should start out with 0 faultaround until we've determined (somehow)
that faultaround would be beneficial for this particular mapping.  Like
we adjust the readahead window.

> For example, when async RA is unable to start,
> we might end up with a large mmap'ed file backed entirely by order-0 folios.

That is a feature, not a bug.  If access is random, then we don't want
to do any async readahead because we don't know where the next access
will be.  We just end up occupying large chunks of memory with
never-used data.
Re: [RFC PATCH 0/2] Use high-order folios in mmap sync RA
Posted by Stepanov Anatoly 2 months ago
On 4/15/2026 4:18 PM, Matthew Wilcox wrote:
> On Thu, Apr 16, 2026 at 03:28:51AM +0800, Anatoly Stepanov wrote:
>> When "fault around" is enabled, order-0 folios might significantly
>> slow down filemap_map_pages().
> 
> There's a lot of "might" in this patchset.  I'd like to know that there
> is a real workload that benefits from this, and if so by how much.
> 
Actually, there is no real workload at the moment.
The intention is to highlight the filemap_map_pages() issue,
which I found during my experiments with the page cache.

> You raise an interesting point that faultaround may be slow, and maybe
> we should start out with 0 faultaround until we've determined (somehow)
> that faultaround would be beneficial for this particular mapping.  Like
> we adjust the readahead window.
> 
Sounds nice.
It looks like there should be some kind of "virtual readahead" or something like this.

BTW, for the benchmark I posted, if fault_around is disabled (4K),
the throughput is even higher.


>> For example, when async RA is unable to start,
>> we might end up with a large mmap'ed file backed entirely by order-0 folios.
> 
> That is a feature, not a bug.  If access is random, then we don't want
> to do any async readahead because we don't know where the next access
> will be.  We just end up occupying large chunks of memory with
> never-used data.
> 
> 
Yes, I understand the logic behind this; what I mean is that it can actually happen.


-- 
Anatoly Stepanov, Huawei