When a task inside a memcg triggers readahead of file pages,
page_cache_ra_unbounded() tries to read ahead nr_to_read pages. Even
when a newly allocated page fails to be charged to the memcg,
page_cache_ra_unbounded() still moves on to the next page. This leads
to excessive memory reclaim.

Stop readahead when mem_cgroup_charge() fails, i.e. when
filemap_add_folio() returns -ENOMEM.
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
mm/readahead.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/mm/readahead.c b/mm/readahead.c
index 23620c57c1225..cc4abb67eb223 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -228,6 +228,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
*/
for (i = 0; i < nr_to_read; i++) {
struct folio *folio = xa_load(&mapping->i_pages, index + i);
+	int ret;

	if (folio && !xa_is_value(folio)) {
/*
@@ -247,9 +248,12 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
folio = filemap_alloc_folio(gfp_mask, 0);
if (!folio)
break;
- if (filemap_add_folio(mapping, folio, index + i,
- gfp_mask) < 0) {
+
+ ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
+ if (ret < 0) {
folio_put(folio);
+ if (ret == -ENOMEM)
+ break;
read_pages(ractl);
ractl->_index++;
i = ractl->_index + ractl->_nr_pages - index - 1;
--
2.25.1
On Thu, Feb 01, 2024 at 06:08:34PM +0800, Liu Shixin wrote:
> @@ -247,9 +248,12 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> folio = filemap_alloc_folio(gfp_mask, 0);
> if (!folio)
> break;
> - if (filemap_add_folio(mapping, folio, index + i,
> - gfp_mask) < 0) {
> +
> + ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
> + if (ret < 0) {
> folio_put(folio);
> + if (ret == -ENOMEM)
> + break;
No, that's too early. You've still got a batch of pages which were
successfully added; you have to read them. You were only off by one
line though ;-)
> read_pages(ractl);
> ractl->_index++;
> i = ractl->_index + ractl->_nr_pages - index - 1;
> --
> 2.25.1
>
>
On Thu 01-02-24 13:47:03, Matthew Wilcox wrote:
> On Thu, Feb 01, 2024 at 06:08:34PM +0800, Liu Shixin wrote:
> > @@ -247,9 +248,12 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> > folio = filemap_alloc_folio(gfp_mask, 0);
> > if (!folio)
> > break;
> > - if (filemap_add_folio(mapping, folio, index + i,
> > - gfp_mask) < 0) {
> > +
> > + ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
> > + if (ret < 0) {
> > folio_put(folio);
> > + if (ret == -ENOMEM)
> > + break;
>
> No, that's too early. You've still got a batch of pages which were
> successfully added; you have to read them. You were only off by one
> line though ;-)
There's a read_pages() call just outside of the loop so this break is
actually fine AFAICT.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
On Thu, Feb 01, 2024 at 02:52:31PM +0100, Jan Kara wrote:
> On Thu 01-02-24 13:47:03, Matthew Wilcox wrote:
> > On Thu, Feb 01, 2024 at 06:08:34PM +0800, Liu Shixin wrote:
> > > @@ -247,9 +248,12 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> > > folio = filemap_alloc_folio(gfp_mask, 0);
> > > if (!folio)
> > > break;
> > > - if (filemap_add_folio(mapping, folio, index + i,
> > > - gfp_mask) < 0) {
> > > +
> > > + ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
> > > + if (ret < 0) {
> > > folio_put(folio);
> > > + if (ret == -ENOMEM)
> > > + break;
> >
> > No, that's too early. You've still got a batch of pages which were
> > successfully added; you have to read them. You were only off by one
> > line though ;-)
>
> There's a read_pages() call just outside of the loop so this break is
> actually fine AFAICT.
Oh, good point! I withdraw my criticism.
On Thu 01-02-24 18:08:34, Liu Shixin wrote:
> When a task inside a memcg triggers readahead of file pages,
> page_cache_ra_unbounded() tries to read ahead nr_to_read pages. Even
> when a newly allocated page fails to be charged to the memcg,
> page_cache_ra_unbounded() still moves on to the next page. This leads
> to excessive memory reclaim.
>
> Stop readahead when mem_cgroup_charge() fails, i.e. when
> filemap_add_folio() returns -ENOMEM.
>
> Signed-off-by: Liu Shixin <liushixin2@huawei.com>
> Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Makes sense. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> mm/readahead.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 23620c57c1225..cc4abb67eb223 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -228,6 +228,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> */
> for (i = 0; i < nr_to_read; i++) {
> struct folio *folio = xa_load(&mapping->i_pages, index + i);
> + int ret;
>
> if (folio && !xa_is_value(folio)) {
> /*
> @@ -247,9 +248,12 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
> folio = filemap_alloc_folio(gfp_mask, 0);
> if (!folio)
> break;
> - if (filemap_add_folio(mapping, folio, index + i,
> - gfp_mask) < 0) {
> +
> + ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
> + if (ret < 0) {
> folio_put(folio);
> + if (ret == -ENOMEM)
> + break;
> read_pages(ractl);
> ractl->_index++;
> i = ractl->_index + ractl->_nr_pages - index - 1;
> --
> 2.25.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR