Page reclaim will not scan anon LRU if no swap space, however MADV_PAGEOUT
can still split shmem large folios even without a swap device. Thus add
swap available space validation before spliting shmem large folio to
avoid redundant split.
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/vmscan.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31d13462571e..796f65781f4f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
}
} else if (folio_test_swapbacked(folio) &&
folio_test_large(folio)) {
+
+ /*
+ * Do not split shmem folio if no swap memory
+ * available.
+ */
+ if (!total_swap_pages)
+ goto activate_locked;
+
/* Split shmem folio */
if (split_folio_to_list(folio, folio_list))
goto keep_locked;
--
2.39.3
On 07.08.24 09:31, Baolin Wang wrote:
> Page reclaim will not scan anon LRU if no swap space, however MADV_PAGEOUT
> can still split shmem large folios even without a swap device. Thus add
> swap available space validation before spliting shmem large folio to
> avoid redundant split.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> mm/vmscan.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 31d13462571e..796f65781f4f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> }
> } else if (folio_test_swapbacked(folio) &&
> folio_test_large(folio)) {
> +
> + /*
> + * Do not split shmem folio if no swap memory
> + * available.
> + */
> + if (!total_swap_pages)
> + goto activate_locked;
> +
> /* Split shmem folio */
> if (split_folio_to_list(folio, folio_list))
> goto keep_locked;
Reminds me of
commit 9a976f0c847b67d22ed694556a3626ed92da0422
Author: Luis Chamberlain <mcgrof@kernel.org>
Date: Thu Mar 9 15:05:43 2023 -0800
shmem: skip page split if we're not reclaiming
In theory when info->flags & VM_LOCKED we should not be getting
shem_writepage() called so we should be verifying this with a
WARN_ON_ONCE(). Since we should not be swapping then best to ensure we
also don't do the folio split earlier too. So just move the check early
to avoid folio splits in case its a dubious call.
We also have a similar early bail when !total_swap_pages so just move that
earlier to avoid the possible folio split in the same situation.
But indeed, pageout() -> writepage() is called *after* the split in the vmscan path.
In that "noswap" context, I wonder if we also want to skip folios part of shmem
with disabled swapping?
But now I am wondering under which circumstances we end up calling
shmem_writepage() with a large folio. And I think the answer is the comment of
folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
... so if shmem_writepage() handles+checks that, could we do
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a332cb80e928..7dfa3d6e8ba7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
goto activate_locked_split;
}
}
- } else if (folio_test_swapbacked(folio) &&
- folio_test_large(folio)) {
- /* Split shmem folio */
- if (split_folio_to_list(folio, folio_list))
- goto keep_locked;
}
/*
instead?
--
Cheers,
David / dhildenb
On 2024/8/7 23:53, David Hildenbrand wrote:
> On 07.08.24 09:31, Baolin Wang wrote:
>> Page reclaim will not scan anon LRU if no swap space, however
>> MADV_PAGEOUT
>> can still split shmem large folios even without a swap device. Thus add
>> swap available space validation before spliting shmem large folio to
>> avoid redundant split.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>> mm/vmscan.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 31d13462571e..796f65781f4f 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
>> list_head *folio_list,
>> }
>> } else if (folio_test_swapbacked(folio) &&
>> folio_test_large(folio)) {
>> +
>> + /*
>> + * Do not split shmem folio if no swap memory
>> + * available.
>> + */
>> + if (!total_swap_pages)
>> + goto activate_locked;
>> +
>> /* Split shmem folio */
>> if (split_folio_to_list(folio, folio_list))
>> goto keep_locked;
>
> Reminds me of
>
> commit 9a976f0c847b67d22ed694556a3626ed92da0422
> Author: Luis Chamberlain <mcgrof@kernel.org>
> Date: Thu Mar 9 15:05:43 2023 -0800
>
> shmem: skip page split if we're not reclaiming
> In theory when info->flags & VM_LOCKED we should not be getting
> shem_writepage() called so we should be verifying this with a
> WARN_ON_ONCE(). Since we should not be swapping then best to
> ensure we
> also don't do the folio split earlier too. So just move the check
> early
> to avoid folio splits in case its a dubious call.
> We also have a similar early bail when !total_swap_pages so just
> move that
> earlier to avoid the possible folio split in the same situation.
>
>
> But indeed, pageout() -> writepage() is called *after* the split in the
> vmscan path.
>
> In that "noswap" context, I wonder if we also want to skip folios part
> of shmem
> with disabled swapping?
Yes, I think so.
>
> But now I am wondering under which circumstances we end up calling
> shmem_writepage() with a large folio. And I think the answer is the
> comment of
> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>
>
> ... so if shmem_writepage() handles+checks that, could we do
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a332cb80e928..7dfa3d6e8ba7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
> list_head *folio_list,
> goto
> activate_locked_split;
> }
> }
> - } else if (folio_test_swapbacked(folio) &&
> - folio_test_large(folio)) {
> - /* Split shmem folio */
> - if (split_folio_to_list(folio, folio_list))
> - goto keep_locked;
> }
>
> /*
>
> instead?
Seems reasonable to me. But we should pass the 'folio_list' to
shmem_writepage() to list the subpages of the large folio. Let me try.
Thanks.
On Thu, Aug 08, 2024 at 10:36:23AM +0800, Baolin Wang wrote:
> On 2024/8/7 23:53, David Hildenbrand wrote:
> > But now I am wondering under which circumstances we end up calling
> > shmem_writepage() with a large folio. And I think the answer is the
> > comment of
> > folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
> >
> >
> > ... so if shmem_writepage() handles+checks that, could we do
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a332cb80e928..7dfa3d6e8ba7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
> > list_head *folio_list,
> > goto
> > activate_locked_split;
> > }
> > }
> > - } else if (folio_test_swapbacked(folio) &&
> > - folio_test_large(folio)) {
> > - /* Split shmem folio */
> > - if (split_folio_to_list(folio, folio_list))
> > - goto keep_locked;
> > }
> >
> > /*
> >
> > instead?
>
> Seems reasonable to me. But we should pass the 'folio_list' to
> shmem_writepage() to list the subpages of the large folio. Let me try.
> Thanks.
We should be trying to remove shmem_writepage(), not make it more
complex. We're making good progress removing instances of ->writepage;
just ceph, ecryptfs, f2fs, gfs2, hostfs, nilfs2, orangefs, vboxsf, shmem
& swap are left. gfs2 patches are out for review.
As you can see from previous patches, the approach is to use
->writepages instead of ->writepage. There should be no need to
handle a folio split list as splitting a folio leaves the folios in the
page cache and they'll naturally be found by subsequent iterations.
On 2024/8/8 20:35, Matthew Wilcox wrote:
> On Thu, Aug 08, 2024 at 10:36:23AM +0800, Baolin Wang wrote:
>> On 2024/8/7 23:53, David Hildenbrand wrote:
>>> But now I am wondering under which circumstances we end up calling
>>> shmem_writepage() with a large folio. And I think the answer is the
>>> comment of
>>> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>>>
>>>
>>> ... so if shmem_writepage() handles+checks that, could we do
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a332cb80e928..7dfa3d6e8ba7 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
>>> list_head *folio_list,
>>> goto
>>> activate_locked_split;
>>> }
>>> }
>>> - } else if (folio_test_swapbacked(folio) &&
>>> - folio_test_large(folio)) {
>>> - /* Split shmem folio */
>>> - if (split_folio_to_list(folio, folio_list))
>>> - goto keep_locked;
>>> }
>>>
>>> /*
>>>
>>> instead?
>>
>> Seems reasonable to me. But we should pass the 'folio_list' to
>> shmem_writepage() to list the subpages of the large folio. Let me try.
>> Thanks.
>
> We should be trying to remove shmem_writepage(), not make it more
> complex. We're making good progress removing instances of ->writepage;
> just ceph, ecryptfs, f2fs, gfs2, hostfs, nilfs2, orangefs, vboxsf, shmem
> & swap are left. gfs2 patches are out for review.
I am afraid shmem is a bit special. IIUC, ->writepages() is used to
write back some dirty pages from the mapping by the writeback flusher
thread, but shmem cannot be written back (mapping_can_writeback() will
return false). Therefore, shmem can only be reclaimed through direct
reclaim or kswapd if a swap device is set up (if no swap, shmem should
always be kept in memory). So currently, we should still keep
shmem_writepage() to reclaim shmem pages.
> As you can see from previous patches, the approach is to use
> ->writepages instead of ->writepage. There should be no need to
> handle a folio split list as splitting a folio leaves the folios in the
> page cache and they'll naturally be found by subsequent iterations.
Again, shmem is special. If shmem folio is reclaimable (when a swap
device is set up), we need to allocate contiguous swap entries for large
folios. However, if there is significant fragmentation of swap entries
(there is already a topic to talk about this issue), we will not able to
allocate contiguous swap entries for large shmem folios. Therefore, in
this case, it is necessary to split the large shmem folio in order to
try to allocate a singe swap entry for reclaiming shmem.
On 08.08.24 04:36, Baolin Wang wrote:
>
>
> On 2024/8/7 23:53, David Hildenbrand wrote:
>> On 07.08.24 09:31, Baolin Wang wrote:
>>> Page reclaim will not scan anon LRU if no swap space, however
>>> MADV_PAGEOUT
>>> can still split shmem large folios even without a swap device. Thus add
>>> swap available space validation before spliting shmem large folio to
>>> avoid redundant split.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> ---
>>> mm/vmscan.c | 8 ++++++++
>>> 1 file changed, 8 insertions(+)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 31d13462571e..796f65781f4f 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
>>> list_head *folio_list,
>>> }
>>> } else if (folio_test_swapbacked(folio) &&
>>> folio_test_large(folio)) {
>>> +
>>> + /*
>>> + * Do not split shmem folio if no swap memory
>>> + * available.
>>> + */
>>> + if (!total_swap_pages)
>>> + goto activate_locked;
>>> +
>>> /* Split shmem folio */
>>> if (split_folio_to_list(folio, folio_list))
>>> goto keep_locked;
>>
>> Reminds me of
>>
>> commit 9a976f0c847b67d22ed694556a3626ed92da0422
>> Author: Luis Chamberlain <mcgrof@kernel.org>
>> Date: Thu Mar 9 15:05:43 2023 -0800
>>
>> shmem: skip page split if we're not reclaiming
>> In theory when info->flags & VM_LOCKED we should not be getting
>> shem_writepage() called so we should be verifying this with a
>> WARN_ON_ONCE(). Since we should not be swapping then best to
>> ensure we
>> also don't do the folio split earlier too. So just move the check
>> early
>> to avoid folio splits in case its a dubious call.
>> We also have a similar early bail when !total_swap_pages so just
>> move that
>> earlier to avoid the possible folio split in the same situation.
>>
>>
>> But indeed, pageout() -> writepage() is called *after* the split in the
>> vmscan path.
>>
>> In that "noswap" context, I wonder if we also want to skip folios part
>> of shmem
>> with disabled swapping?
>
> Yes, I think so.
>
>>
>> But now I am wondering under which circumstances we end up calling
>> shmem_writepage() with a large folio. And I think the answer is the
>> comment of
>> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>>
>>
>> ... so if shmem_writepage() handles+checks that, could we do
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index a332cb80e928..7dfa3d6e8ba7 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
>> list_head *folio_list,
>> goto
>> activate_locked_split;
>> }
>> }
>> - } else if (folio_test_swapbacked(folio) &&
>> - folio_test_large(folio)) {
>> - /* Split shmem folio */
>> - if (split_folio_to_list(folio, folio_list))
>> - goto keep_locked;
>> }
>>
>> /*
>>
>> instead?
>
> Seems reasonable to me. But we should pass the 'folio_list' to
> shmem_writepage() to list the subpages of the large folio. Let me try.
Ah, yes, good point. Alternatively, we'd have to split and try writing
all subpages in there. I wonder what to do if we fail to write some, and
if we could handle that transparently, without the folio_list.
--
Cheers,
David / dhildenb
On 2024/8/8 16:51, David Hildenbrand wrote:
> On 08.08.24 04:36, Baolin Wang wrote:
>>
>>
>> On 2024/8/7 23:53, David Hildenbrand wrote:
>>> On 07.08.24 09:31, Baolin Wang wrote:
>>>> Page reclaim will not scan anon LRU if no swap space, however
>>>> MADV_PAGEOUT
>>>> can still split shmem large folios even without a swap device. Thus add
>>>> swap available space validation before spliting shmem large folio to
>>>> avoid redundant split.
>>>>
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---
>>>> mm/vmscan.c | 8 ++++++++
>>>> 1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>> index 31d13462571e..796f65781f4f 100644
>>>> --- a/mm/vmscan.c
>>>> +++ b/mm/vmscan.c
>>>> @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
>>>> list_head *folio_list,
>>>> }
>>>> } else if (folio_test_swapbacked(folio) &&
>>>> folio_test_large(folio)) {
>>>> +
>>>> + /*
>>>> + * Do not split shmem folio if no swap memory
>>>> + * available.
>>>> + */
>>>> + if (!total_swap_pages)
>>>> + goto activate_locked;
>>>> +
>>>> /* Split shmem folio */
>>>> if (split_folio_to_list(folio, folio_list))
>>>> goto keep_locked;
>>>
>>> Reminds me of
>>>
>>> commit 9a976f0c847b67d22ed694556a3626ed92da0422
>>> Author: Luis Chamberlain <mcgrof@kernel.org>
>>> Date: Thu Mar 9 15:05:43 2023 -0800
>>>
>>> shmem: skip page split if we're not reclaiming
>>> In theory when info->flags & VM_LOCKED we should not be getting
>>> shem_writepage() called so we should be verifying this with a
>>> WARN_ON_ONCE(). Since we should not be swapping then best to
>>> ensure we
>>> also don't do the folio split earlier too. So just move the check
>>> early
>>> to avoid folio splits in case its a dubious call.
>>> We also have a similar early bail when !total_swap_pages so just
>>> move that
>>> earlier to avoid the possible folio split in the same situation.
>>>
>>>
>>> But indeed, pageout() -> writepage() is called *after* the split in the
>>> vmscan path.
>>>
>>> In that "noswap" context, I wonder if we also want to skip folios part
>>> of shmem
>>> with disabled swapping?
>>
>> Yes, I think so.
>>
>>>
>>> But now I am wondering under which circumstances we end up calling
>>> shmem_writepage() with a large folio. And I think the answer is the
>>> comment of
>>> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>>>
>>>
>>> ... so if shmem_writepage() handles+checks that, could we do
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a332cb80e928..7dfa3d6e8ba7 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
>>> list_head *folio_list,
>>> goto
>>> activate_locked_split;
>>> }
>>> }
>>> - } else if (folio_test_swapbacked(folio) &&
>>> - folio_test_large(folio)) {
>>> - /* Split shmem folio */
>>> - if (split_folio_to_list(folio, folio_list))
>>> - goto keep_locked;
>>> }
>>>
>>> /*
>>>
>>> instead?
>>
>> Seems reasonable to me. But we should pass the 'folio_list' to
>> shmem_writepage() to list the subpages of the large folio. Let me try.
>
> Ah, yes, good point. Alternatively, we'd have to split and try writing
> all subpages in there. I wonder what to do if we fail to write some, and
> if we could handle that transparently, without the folio_list.
After some investigation, I prefer to pass 'folio_list' to
shmem_writepage() via 'struct writeback_control', which could simplify
the logic a lot. Otherwise, we need to handle each subpage's
writeback/reclaim/dirty state, as well as tracking each subpage's write
result, which seems more complicated.
I made a quick change by passing 'folio_list', and it looks simple and
works as expected.
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 75196b0f894f..10100e22d5c6 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -80,6 +80,9 @@ struct writeback_control {
*/
struct swap_iocb **swap_plug;
+ /* Target list for splitting a large folio */
+ struct list_head *list;
+
/* internal fields used by the ->writepages implementation: */
struct folio_batch fbatch;
pgoff_t index;
diff --git a/mm/shmem.c b/mm/shmem.c
index 05525e9e7423..0a5a68f7d0a0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page,
struct writeback_control *wbc)
* and its shmem_writeback() needs them to be split when swapping.
*/
if (wbc->split_large_folio && folio_test_large(folio)) {
+try_split:
/* Ensure the subpages are still dirty */
folio_test_set_dirty(folio);
- if (split_huge_page(page) < 0)
+ if (split_huge_page_to_list_to_order(page, wbc->list, 0))
goto redirty;
folio = page_folio(page);
folio_clear_dirty(folio);
@@ -1540,8 +1541,12 @@ static int shmem_writepage(struct page *page,
struct writeback_control *wbc)
}
swap = folio_alloc_swap(folio);
- if (!swap.val)
+ if (!swap.val) {
+ if (nr_pages > 1)
+ goto try_split;
+
goto redirty;
+ }
/*
* Add inode to shmem_unuse()'s list of swapped-out inodes,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 277571815789..cf982cf2454f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -628,7 +628,7 @@ typedef enum {
* Calls ->writepage().
*/
static pageout_t pageout(struct folio *folio, struct address_space
*mapping,
- struct swap_iocb **plug)
+ struct swap_iocb **plug, struct list_head
*folio_list)
{
/*
* If the folio is dirty, only perform writeback if that write
@@ -676,6 +676,11 @@ static pageout_t pageout(struct folio *folio,
struct address_space *mapping,
.swap_plug = plug,
};
+ if (shmem_mapping(mapping)) {
+ wbc.list = folio_list;
+ wbc.split_large_folio =
!IS_ENABLED(CONFIG_THP_SWAP);
+ }
+
folio_set_reclaim(folio);
res = mapping->a_ops->writepage(&folio->page, &wbc);
if (res < 0)
@@ -1259,23 +1264,6 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
goto activate_locked_split;
}
}
- } else if (folio_test_swapbacked(folio) &&
- folio_test_large(folio)) {
-
- /*
- * Do not split shmem folio if no swap memory
- * available.
- */
- if (!total_swap_pages)
- goto activate_locked;
-
- /*
- * Only split shmem folio when CONFIG_THP_SWAP
- * is not enabled.
- */
- if (!IS_ENABLED(CONFIG_THP_SWAP) &&
- split_folio_to_list(folio, folio_list))
- goto keep_locked;
}
/*
@@ -1377,18 +1365,21 @@ static unsigned int shrink_folio_list(struct
list_head *folio_list,
* starts and then write it out here.
*/
try_to_unmap_flush_dirty();
-try_pageout:
- switch (pageout(folio, mapping, &plug)) {
+ switch (pageout(folio, mapping, &plug,
folio_list)) {
case PAGE_KEEP:
goto keep_locked;
case PAGE_ACTIVATE:
- if (shmem_mapping(mapping) &&
folio_test_large(folio) &&
- !split_folio_to_list(folio,
folio_list)) {
+ /* Shmem can be split when writeback to
swap */
+ if ((nr_pages > 1) &&
!folio_test_large(folio)) {
+ sc->nr_scanned -= (nr_pages - 1);
nr_pages = 1;
- goto try_pageout;
}
goto activate_locked;
case PAGE_SUCCESS:
+ if ((nr_pages > 1) &&
!folio_test_large(folio)) {
+ sc->nr_scanned -= (nr_pages - 1);
+ nr_pages = 1;
+ }
stat->nr_pageout += nr_pages;
if (folio_test_writeback(folio)) {
On Thu, Aug 08, 2024 at 05:34:50PM GMT, Baolin Wang wrote:
>
>
> On 2024/8/8 16:51, David Hildenbrand wrote:
> > On 08.08.24 04:36, Baolin Wang wrote:
> > >
> > >
> > > On 2024/8/7 23:53, David Hildenbrand wrote:
> > > > On 07.08.24 09:31, Baolin Wang wrote:
> > > > > Page reclaim will not scan anon LRU if no swap space, however
> > > > > MADV_PAGEOUT
> > > > > can still split shmem large folios even without a swap device. Thus add
> > > > > swap available space validation before spliting shmem large folio to
> > > > > avoid redundant split.
> > > > >
> > > > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > > > > ---
> > > > > mm/vmscan.c | 8 ++++++++
> > > > > 1 file changed, 8 insertions(+)
> > > > >
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 31d13462571e..796f65781f4f 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
> > > > > list_head *folio_list,
> > > > > }
> > > > > } else if (folio_test_swapbacked(folio) &&
> > > > > folio_test_large(folio)) {
> > > > > +
> > > > > + /*
> > > > > + * Do not split shmem folio if no swap memory
> > > > > + * available.
> > > > > + */
> > > > > + if (!total_swap_pages)
> > > > > + goto activate_locked;
> > > > > +
> > > > > /* Split shmem folio */
> > > > > if (split_folio_to_list(folio, folio_list))
> > > > > goto keep_locked;
> > > >
> > > > Reminds me of
> > > >
> > > > commit 9a976f0c847b67d22ed694556a3626ed92da0422
> > > > Author: Luis Chamberlain <mcgrof@kernel.org>
> > > > Date: Thu Mar 9 15:05:43 2023 -0800
> > > >
> > > > shmem: skip page split if we're not reclaiming
> > > > In theory when info->flags & VM_LOCKED we should not be getting
> > > > shem_writepage() called so we should be verifying this with a
> > > > WARN_ON_ONCE(). Since we should not be swapping then best to
> > > > ensure we
> > > > also don't do the folio split earlier too. So just move the check
> > > > early
> > > > to avoid folio splits in case its a dubious call.
> > > > We also have a similar early bail when !total_swap_pages so just
> > > > move that
> > > > earlier to avoid the possible folio split in the same situation.
> > > >
> > > >
> > > > But indeed, pageout() -> writepage() is called *after* the split in the
> > > > vmscan path.
> > > >
> > > > In that "noswap" context, I wonder if we also want to skip folios part
> > > > of shmem
> > > > with disabled swapping?
> > >
> > > Yes, I think so.
> > >
> > > >
> > > > But now I am wondering under which circumstances we end up calling
> > > > shmem_writepage() with a large folio. And I think the answer is the
> > > > comment of
> > > > folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
> > > >
> > > >
> > > > ... so if shmem_writepage() handles+checks that, could we do
> > > >
> > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > index a332cb80e928..7dfa3d6e8ba7 100644
> > > > --- a/mm/vmscan.c
> > > > +++ b/mm/vmscan.c
> > > > @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
> > > > list_head *folio_list,
> > > > goto
> > > > activate_locked_split;
> > > > }
> > > > }
> > > > - } else if (folio_test_swapbacked(folio) &&
> > > > - folio_test_large(folio)) {
> > > > - /* Split shmem folio */
> > > > - if (split_folio_to_list(folio, folio_list))
> > > > - goto keep_locked;
> > > > }
> > > >
> > > > /*
> > > >
> > > > instead?
> > >
> > > Seems reasonable to me. But we should pass the 'folio_list' to
> > > shmem_writepage() to list the subpages of the large folio. Let me try.
> >
> > Ah, yes, good point. Alternatively, we'd have to split and try writing
> > all subpages in there. I wonder what to do if we fail to write some, and
> > if we could handle that transparently, without the folio_list.
>
> After some investigation, I prefer to pass 'folio_list' to shmem_writepage()
> via 'struct writeback_control', which could simplify the logic a lot.
> Otherwise, we need to handle each subpage's writeback/reclaim/dirty state,
> as well as tracking each subpage's write result, which seems more
> complicated.
>
> I made a quick change by passing 'folio_list', and it looks simple and works
> as expected.
>
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index 75196b0f894f..10100e22d5c6 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -80,6 +80,9 @@ struct writeback_control {
> */
> struct swap_iocb **swap_plug;
>
> + /* Target list for splitting a large folio */
> + struct list_head *list;
> +
> /* internal fields used by the ->writepages implementation: */
> struct folio_batch fbatch;
> pgoff_t index;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 05525e9e7423..0a5a68f7d0a0 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page, struct
> writeback_control *wbc)
> * and its shmem_writeback() needs them to be split when swapping.
> */
> if (wbc->split_large_folio && folio_test_large(folio)) {
> +try_split:
> /* Ensure the subpages are still dirty */
> folio_test_set_dirty(folio);
> - if (split_huge_page(page) < 0)
> + if (split_huge_page_to_list_to_order(page, wbc->list, 0))
We check for split_large_folio, but we still send the wbc->list for i915
writepage() case. Previously, we were sending a NULL list. Shouldn't we address
that case too?
> goto redirty;
> folio = page_folio(page);
> folio_clear_dirty(folio);
> @@ -1540,8 +1541,12 @@ static int shmem_writepage(struct page *page, struct
> writeback_control *wbc)
> }
>
> swap = folio_alloc_swap(folio);
> - if (!swap.val)
> + if (!swap.val) {
> + if (nr_pages > 1)
> + goto try_split;
> +
> goto redirty;
> + }
>
> /*
> * Add inode to shmem_unuse()'s list of swapped-out inodes,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 277571815789..cf982cf2454f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -628,7 +628,7 @@ typedef enum {
> * Calls ->writepage().
> */
> static pageout_t pageout(struct folio *folio, struct address_space
> *mapping,
> - struct swap_iocb **plug)
> + struct swap_iocb **plug, struct list_head
> *folio_list)
> {
> /*
> * If the folio is dirty, only perform writeback if that write
> @@ -676,6 +676,11 @@ static pageout_t pageout(struct folio *folio, struct
> address_space *mapping,
> .swap_plug = plug,
> };
>
> + if (shmem_mapping(mapping)) {
> + wbc.list = folio_list;
> + wbc.split_large_folio =
> !IS_ENABLED(CONFIG_THP_SWAP);
> + }
> +
> folio_set_reclaim(folio);
> res = mapping->a_ops->writepage(&folio->page, &wbc);
> if (res < 0)
> @@ -1259,23 +1264,6 @@ static unsigned int shrink_folio_list(struct
> list_head *folio_list,
> goto activate_locked_split;
> }
> }
> - } else if (folio_test_swapbacked(folio) &&
> - folio_test_large(folio)) {
> -
> - /*
> - * Do not split shmem folio if no swap memory
> - * available.
> - */
> - if (!total_swap_pages)
> - goto activate_locked;
> -
> - /*
> - * Only split shmem folio when CONFIG_THP_SWAP
> - * is not enabled.
> - */
> - if (!IS_ENABLED(CONFIG_THP_SWAP) &&
> - split_folio_to_list(folio, folio_list))
> - goto keep_locked;
> }
>
> /*
> @@ -1377,18 +1365,21 @@ static unsigned int shrink_folio_list(struct
> list_head *folio_list,
> * starts and then write it out here.
> */
> try_to_unmap_flush_dirty();
> -try_pageout:
> - switch (pageout(folio, mapping, &plug)) {
> + switch (pageout(folio, mapping, &plug, folio_list))
> {
> case PAGE_KEEP:
> goto keep_locked;
> case PAGE_ACTIVATE:
> - if (shmem_mapping(mapping) &&
> folio_test_large(folio) &&
> - !split_folio_to_list(folio, folio_list))
> {
> + /* Shmem can be split when writeback to swap
> */
> + if ((nr_pages > 1) &&
> !folio_test_large(folio)) {
> + sc->nr_scanned -= (nr_pages - 1);
> nr_pages = 1;
> - goto try_pageout;
> }
> goto activate_locked;
> case PAGE_SUCCESS:
> + if ((nr_pages > 1) &&
> !folio_test_large(folio)) {
> + sc->nr_scanned -= (nr_pages - 1);
> + nr_pages = 1;
> + }
> stat->nr_pageout += nr_pages;
>
> if (folio_test_writeback(folio)) {
On Thu, Aug 08, 2024 at 12:48:52PM GMT, Daniel Gomez wrote:
> On Thu, Aug 08, 2024 at 05:34:50PM GMT, Baolin Wang wrote:
> >
> >
> > On 2024/8/8 16:51, David Hildenbrand wrote:
> > > On 08.08.24 04:36, Baolin Wang wrote:
> > > >
> > > >
> > > > On 2024/8/7 23:53, David Hildenbrand wrote:
> > > > > On 07.08.24 09:31, Baolin Wang wrote:
> > > > > > Page reclaim will not scan anon LRU if no swap space, however
> > > > > > MADV_PAGEOUT
> > > > > > can still split shmem large folios even without a swap device. Thus add
> > > > > > swap available space validation before spliting shmem large folio to
> > > > > > avoid redundant split.
> > > > > >
> > > > > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> > > > > > ---
> > > > > > mm/vmscan.c | 8 ++++++++
> > > > > > 1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > index 31d13462571e..796f65781f4f 100644
> > > > > > --- a/mm/vmscan.c
> > > > > > +++ b/mm/vmscan.c
> > > > > > @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
> > > > > > list_head *folio_list,
> > > > > > }
> > > > > > } else if (folio_test_swapbacked(folio) &&
> > > > > > folio_test_large(folio)) {
> > > > > > +
> > > > > > + /*
> > > > > > + * Do not split shmem folio if no swap memory
> > > > > > + * available.
> > > > > > + */
> > > > > > + if (!total_swap_pages)
> > > > > > + goto activate_locked;
> > > > > > +
> > > > > > /* Split shmem folio */
> > > > > > if (split_folio_to_list(folio, folio_list))
> > > > > > goto keep_locked;
> > > > >
> > > > > Reminds me of
> > > > >
> > > > > commit 9a976f0c847b67d22ed694556a3626ed92da0422
> > > > > Author: Luis Chamberlain <mcgrof@kernel.org>
> > > > > Date: Thu Mar 9 15:05:43 2023 -0800
> > > > >
> > > > > shmem: skip page split if we're not reclaiming
> > > > > In theory when info->flags & VM_LOCKED we should not be getting
> > > > > shem_writepage() called so we should be verifying this with a
> > > > > WARN_ON_ONCE(). Since we should not be swapping then best to
> > > > > ensure we
> > > > > also don't do the folio split earlier too. So just move the check
> > > > > early
> > > > > to avoid folio splits in case its a dubious call.
> > > > > We also have a similar early bail when !total_swap_pages so just
> > > > > move that
> > > > > earlier to avoid the possible folio split in the same situation.
> > > > >
> > > > >
> > > > > But indeed, pageout() -> writepage() is called *after* the split in the
> > > > > vmscan path.
> > > > >
> > > > > In that "noswap" context, I wonder if we also want to skip folios part
> > > > > of shmem
> > > > > with disabled swapping?
> > > >
> > > > Yes, I think so.
> > > >
> > > > >
> > > > > But now I am wondering under which circumstances we end up calling
> > > > > shmem_writepage() with a large folio. And I think the answer is the
> > > > > comment of
> > > > > folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
> > > > >
> > > > >
> > > > > ... so if shmem_writepage() handles+checks that, could we do
> > > > >
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index a332cb80e928..7dfa3d6e8ba7 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
> > > > > list_head *folio_list,
> > > > > goto
> > > > > activate_locked_split;
> > > > > }
> > > > > }
> > > > > - } else if (folio_test_swapbacked(folio) &&
> > > > > - folio_test_large(folio)) {
> > > > > - /* Split shmem folio */
> > > > > - if (split_folio_to_list(folio, folio_list))
> > > > > - goto keep_locked;
> > > > > }
> > > > >
> > > > > /*
> > > > >
> > > > > instead?
> > > >
> > > > Seems reasonable to me. But we should pass the 'folio_list' to
> > > > shmem_writepage() to list the subpages of the large folio. Let me try.
> > >
> > > Ah, yes, good point. Alternatively, we'd have to split and try writing
> > > all subpages in there. I wonder what to do if we fail to write some, and
> > > if we could handle that transparently, without the folio_list.
> >
> > After some investigation, I prefer to pass 'folio_list' to shmem_writepage()
> > via 'struct writeback_control', which could simplify the logic a lot.
> > Otherwise, we need to handle each subpage's writeback/reclaim/dirty state,
> > as well as tracking each subpage's write result, which seems more
> > complicated.
> >
> > I made a quick change by passing 'folio_list', and it looks simple and works
> > as expected.
> >
> > diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> > index 75196b0f894f..10100e22d5c6 100644
> > --- a/include/linux/writeback.h
> > +++ b/include/linux/writeback.h
> > @@ -80,6 +80,9 @@ struct writeback_control {
> > */
> > struct swap_iocb **swap_plug;
> >
> > + /* Target list for splitting a large folio */
> > + struct list_head *list;
> > +
> > /* internal fields used by the ->writepages implementation: */
> > struct folio_batch fbatch;
> > pgoff_t index;
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 05525e9e7423..0a5a68f7d0a0 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page, struct
> > writeback_control *wbc)
> > * and its shmem_writeback() needs them to be split when swapping.
> > */
> > if (wbc->split_large_folio && folio_test_large(folio)) {
> > +try_split:
> > /* Ensure the subpages are still dirty */
> > folio_test_set_dirty(folio);
> > - if (split_huge_page(page) < 0)
> > + if (split_huge_page_to_list_to_order(page, wbc->list, 0))
>
> We check for split_large_folio, but we still send the wbc->list for i915
> writepage() case. Previously, we were sending a NULL list. Shouldn't we address
> that case too?
I guess I was missing wbc initialization snippet:
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index fe69f2c8527d..174b95a9a988 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -308,6 +308,7 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
.range_start = 0,
.range_end = LLONG_MAX,
.for_reclaim = 1,
+ .list = NULL,
};
unsigned long i;
> > goto redirty;
> > folio = page_folio(page);
> > folio_clear_dirty(folio);
> > @@ -1540,8 +1541,12 @@ static int shmem_writepage(struct page *page, struct
> > writeback_control *wbc)
> > }
> >
> > swap = folio_alloc_swap(folio);
> > - if (!swap.val)
> > + if (!swap.val) {
> > + if (nr_pages > 1)
> > + goto try_split;
> > +
> > goto redirty;
> > + }
> >
> > /*
> > * Add inode to shmem_unuse()'s list of swapped-out inodes,
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 277571815789..cf982cf2454f 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -628,7 +628,7 @@ typedef enum {
> > * Calls ->writepage().
> > */
> > static pageout_t pageout(struct folio *folio, struct address_space
> > *mapping,
> > - struct swap_iocb **plug)
> > + struct swap_iocb **plug, struct list_head
> > *folio_list)
> > {
> > /*
> > * If the folio is dirty, only perform writeback if that write
> > @@ -676,6 +676,11 @@ static pageout_t pageout(struct folio *folio, struct
> > address_space *mapping,
> > .swap_plug = plug,
> > };
> >
> > + if (shmem_mapping(mapping)) {
> > + wbc.list = folio_list;
> > + wbc.split_large_folio =
> > !IS_ENABLED(CONFIG_THP_SWAP);
> > + }
> > +
> > folio_set_reclaim(folio);
> > res = mapping->a_ops->writepage(&folio->page, &wbc);
> > if (res < 0)
> > @@ -1259,23 +1264,6 @@ static unsigned int shrink_folio_list(struct
> > list_head *folio_list,
> > goto activate_locked_split;
> > }
> > }
> > - } else if (folio_test_swapbacked(folio) &&
> > - folio_test_large(folio)) {
> > -
> > - /*
> > - * Do not split shmem folio if no swap memory
> > - * available.
> > - */
> > - if (!total_swap_pages)
> > - goto activate_locked;
> > -
> > - /*
> > - * Only split shmem folio when CONFIG_THP_SWAP
> > - * is not enabled.
> > - */
> > - if (!IS_ENABLED(CONFIG_THP_SWAP) &&
> > - split_folio_to_list(folio, folio_list))
> > - goto keep_locked;
> > }
> >
> > /*
> > @@ -1377,18 +1365,21 @@ static unsigned int shrink_folio_list(struct
> > list_head *folio_list,
> > * starts and then write it out here.
> > */
> > try_to_unmap_flush_dirty();
> > -try_pageout:
> > - switch (pageout(folio, mapping, &plug)) {
> > + switch (pageout(folio, mapping, &plug, folio_list))
> > {
> > case PAGE_KEEP:
> > goto keep_locked;
> > case PAGE_ACTIVATE:
> > - if (shmem_mapping(mapping) &&
> > folio_test_large(folio) &&
> > - !split_folio_to_list(folio, folio_list))
> > {
> > + /* Shmem can be split when writeback to swap
> > */
> > + if ((nr_pages > 1) &&
> > !folio_test_large(folio)) {
> > + sc->nr_scanned -= (nr_pages - 1);
> > nr_pages = 1;
> > - goto try_pageout;
> > }
> > goto activate_locked;
> > case PAGE_SUCCESS:
> > + if ((nr_pages > 1) &&
> > !folio_test_large(folio)) {
> > + sc->nr_scanned -= (nr_pages - 1);
> > + nr_pages = 1;
> > + }
> > stat->nr_pageout += nr_pages;
> >
> > if (folio_test_writeback(folio)) {
On 2024/8/8 18:57, Daniel Gomez wrote:
> On Thu, Aug 08, 2024 at 12:48:52PM GMT, Daniel Gomez wrote:
>> On Thu, Aug 08, 2024 at 05:34:50PM GMT, Baolin Wang wrote:
>>>
>>>
>>> On 2024/8/8 16:51, David Hildenbrand wrote:
>>>> On 08.08.24 04:36, Baolin Wang wrote:
>>>>>
>>>>>
>>>>> On 2024/8/7 23:53, David Hildenbrand wrote:
>>>>>> On 07.08.24 09:31, Baolin Wang wrote:
>>>>>>> Page reclaim will not scan anon LRU if no swap space, however
>>>>>>> MADV_PAGEOUT
>>>>>>> can still split shmem large folios even without a swap device. Thus add
>>>>>>> swap available space validation before spliting shmem large folio to
>>>>>>> avoid redundant split.
>>>>>>>
>>>>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>>>>> ---
>>>>>>> mm/vmscan.c | 8 ++++++++
>>>>>>> 1 file changed, 8 insertions(+)
>>>>>>>
>>>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>>>> index 31d13462571e..796f65781f4f 100644
>>>>>>> --- a/mm/vmscan.c
>>>>>>> +++ b/mm/vmscan.c
>>>>>>> @@ -1259,6 +1259,14 @@ static unsigned int shrink_folio_list(struct
>>>>>>> list_head *folio_list,
>>>>>>> }
>>>>>>> } else if (folio_test_swapbacked(folio) &&
>>>>>>> folio_test_large(folio)) {
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Do not split shmem folio if no swap memory
>>>>>>> + * available.
>>>>>>> + */
>>>>>>> + if (!total_swap_pages)
>>>>>>> + goto activate_locked;
>>>>>>> +
>>>>>>> /* Split shmem folio */
>>>>>>> if (split_folio_to_list(folio, folio_list))
>>>>>>> goto keep_locked;
>>>>>>
>>>>>> Reminds me of
>>>>>>
>>>>>> commit 9a976f0c847b67d22ed694556a3626ed92da0422
>>>>>> Author: Luis Chamberlain <mcgrof@kernel.org>
>>>>>> Date: Thu Mar 9 15:05:43 2023 -0800
>>>>>>
>>>>>> shmem: skip page split if we're not reclaiming
>>>>>> In theory when info->flags & VM_LOCKED we should not be getting
>>>>>> shem_writepage() called so we should be verifying this with a
>>>>>> WARN_ON_ONCE(). Since we should not be swapping then best to
>>>>>> ensure we
>>>>>> also don't do the folio split earlier too. So just move the check
>>>>>> early
>>>>>> to avoid folio splits in case its a dubious call.
>>>>>> We also have a similar early bail when !total_swap_pages so just
>>>>>> move that
>>>>>> earlier to avoid the possible folio split in the same situation.
>>>>>>
>>>>>>
>>>>>> But indeed, pageout() -> writepage() is called *after* the split in the
>>>>>> vmscan path.
>>>>>>
>>>>>> In that "noswap" context, I wonder if we also want to skip folios part
>>>>>> of shmem
>>>>>> with disabled swapping?
>>>>>
>>>>> Yes, I think so.
>>>>>
>>>>>>
>>>>>> But now I am wondering under which circumstances we end up calling
>>>>>> shmem_writepage() with a large folio. And I think the answer is the
>>>>>> comment of
>>>>>> folio_test_large(): via drivers/gpu/drm/i915/gem/i915_gem_shmem.c.
>>>>>>
>>>>>>
>>>>>> ... so if shmem_writepage() handles+checks that, could we do
>>>>>>
>>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>>> index a332cb80e928..7dfa3d6e8ba7 100644
>>>>>> --- a/mm/vmscan.c
>>>>>> +++ b/mm/vmscan.c
>>>>>> @@ -1257,11 +1257,6 @@ static unsigned int shrink_folio_list(struct
>>>>>> list_head *folio_list,
>>>>>> goto
>>>>>> activate_locked_split;
>>>>>> }
>>>>>> }
>>>>>> - } else if (folio_test_swapbacked(folio) &&
>>>>>> - folio_test_large(folio)) {
>>>>>> - /* Split shmem folio */
>>>>>> - if (split_folio_to_list(folio, folio_list))
>>>>>> - goto keep_locked;
>>>>>> }
>>>>>>
>>>>>> /*
>>>>>>
>>>>>> instead?
>>>>>
>>>>> Seems reasonable to me. But we should pass the 'folio_list' to
>>>>> shmem_writepage() to list the subpages of the large folio. Let me try.
>>>>
>>>> Ah, yes, good point. Alternatively, we'd have to split and try writing
>>>> all subpages in there. I wonder what to do if we fail to write some, and
>>>> if we could handle that transparently, without the folio_list.
>>>
>>> After some investigation, I prefer to pass 'folio_list' to shmem_writepage()
>>> via 'struct writeback_control', which could simplify the logic a lot.
>>> Otherwise, we need to handle each subpage's writeback/reclaim/dirty state,
>>> as well as tracking each subpage's write result, which seems more
>>> complicated.
>>>
>>> I made a quick change by passing 'folio_list', and it looks simple and works
>>> as expected.
>>>
>>> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
>>> index 75196b0f894f..10100e22d5c6 100644
>>> --- a/include/linux/writeback.h
>>> +++ b/include/linux/writeback.h
>>> @@ -80,6 +80,9 @@ struct writeback_control {
>>> */
>>> struct swap_iocb **swap_plug;
>>>
>>> + /* Target list for splitting a large folio */
>>> + struct list_head *list;
>>> +
>>> /* internal fields used by the ->writepages implementation: */
>>> struct folio_batch fbatch;
>>> pgoff_t index;
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 05525e9e7423..0a5a68f7d0a0 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -1496,9 +1496,10 @@ static int shmem_writepage(struct page *page, struct
>>> writeback_control *wbc)
>>> * and its shmem_writeback() needs them to be split when swapping.
>>> */
>>> if (wbc->split_large_folio && folio_test_large(folio)) {
>>> +try_split:
>>> /* Ensure the subpages are still dirty */
>>> folio_test_set_dirty(folio);
>>> - if (split_huge_page(page) < 0)
>>> + if (split_huge_page_to_list_to_order(page, wbc->list, 0))
>>
>> We check for split_large_folio, but we still send the wbc->list for i915
>> writepage() case. Previously, we were sending a NULL list. Shouldn't we address
>> that case too?
>
> I guess I was missing wbc initialization snippet:
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index fe69f2c8527d..174b95a9a988 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -308,6 +308,7 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
> .range_start = 0,
> .range_end = LLONG_MAX,
> .for_reclaim = 1,
> + .list = NULL,
> };
> unsigned long i;
>
IMO, we don't need an explicit initialization, and 'list' will be
initialized as NULL. Please see:
https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
© 2016 - 2025 Red Hat, Inc.