From: Chen Ridong <chenridong@huawei.com>
In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
updated by adding `nr_pages` regardless of whether `nr_pages` is greater
than 0 or less than 0. To simplify this function, add a check for
`nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
actions.
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
mm/memcontrol.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index da6e4e9bd0fa..f977e0be1c04 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1280,15 +1280,14 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
unsigned long *lru_size;
long size;
- if (mem_cgroup_disabled())
+ if (mem_cgroup_disabled() || !nr_pages)
return;
mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
lru_size = &mz->lru_zone_size[zid][lru];
- if (nr_pages < 0)
- *lru_size += nr_pages;
+ *lru_size += nr_pages;
size = *lru_size;
if (WARN_ONCE(size < 0,
"%s(%p, %d, %d): lru_size %ld\n",
@@ -1296,9 +1295,6 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
VM_BUG_ON(1);
*lru_size = 0;
}
-
- if (nr_pages > 0)
- *lru_size += nr_pages;
}
/**
--
2.34.1
On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote: > > From: Chen Ridong <chenridong@huawei.com> > > In the `mem_cgroup_update_lru_size` function, the `lru_size` should be > updated by adding `nr_pages` regardless of whether `nr_pages` is greater > than 0 or less than 0. To simplify this function, add a check for > `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same > actions. > > Signed-off-by: Chen Ridong <chenridong@huawei.com> NAK. The commit that added that clearly explains why it was done that way.
On 2024/12/6 13:33, Yu Zhao wrote:
> On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
>> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
>> than 0 or less than 0. To simplify this function, add a check for
>> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
>> actions.
>>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>
> NAK.
>
> The commit that added that clearly explains why it was done that way.
Thank you for your reply.
I have read the commit message for ca707239e8a7 ("mm: update_lru_size
warn and reset bad lru_size") before sending my patch. However, I did
not quite understand why we need to deal with the difference between
'nr_pages > 0' and 'nr_pages < 0'.
The 'lru_zone_size' can only be modified in the
'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
adding 'nr_pages' in a way that causes an overflow can make the size < 0.
For case 1, subtracting 'nr_pages', we should issue a warning if the
size goes below 0. For case 2, when adding 'nr_pages' results in an
overflow, there will be no warning and the size won't be reset to 0 the
first time it occurs . It might be that a warning will be issued the
next time 'mem_cgroup_update_lru_size' is called to modify the
'lru_zone_size'. However, as the commit message said, "the first
occurrence is the most informative," and it seems we have missed that
first occurrence.
As the commit message said: "and then the vast unsigned long size draws
page reclaim into a loop of repeatedly", I think that a warning should
be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
for the first time, whether from subtracting or adding 'nr_pages'.
I am be grateful if you can explain more details, it has confused me for
a while. Thank you very much.
Best regards,
Ridong
On Fri, 6 Dec 2024, Chen Ridong wrote:
> On 2024/12/6 13:33, Yu Zhao wrote:
> > On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
> >> From: Chen Ridong <chenridong@huawei.com>
> >>
> >> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
> >> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
> >> than 0 or less than 0. To simplify this function, add a check for
> >> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
> >> actions.
> >>
> >> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> >
> > NAK.
> >
> > The commit that added that clearly explains why it was done that way.
Thanks Yu: I did spot this going by, but was indeed hoping that someone
else would NAK it, with more politeness than I could muster at the time!
>
> Thank you for your reply.
>
> I have read the commit message for ca707239e8a7 ("mm: update_lru_size
> warn and reset bad lru_size") before sending my patch. However, I did
> not quite understand why we need to deal with the difference between
> 'nr_pages > 0' and 'nr_pages < 0'.
>
>
> The 'lru_zone_size' can only be modified in the
> 'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
> adding 'nr_pages' in a way that causes an overflow can make the size < 0.
>
> For case 1, subtracting 'nr_pages', we should issue a warning if the
> size goes below 0. For case 2, when adding 'nr_pages' results in an
> overflow, there will be no warning and the size won't be reset to 0 the
> first time it occurs . It might be that a warning will be issued the
> next time 'mem_cgroup_update_lru_size' is called to modify the
> 'lru_zone_size'. However, as the commit message said, "the first
> occurrence is the most informative," and it seems we have missed that
> first occurrence.
>
> As the commit message said: "and then the vast unsigned long size draws
> page reclaim into a loop of repeatedly", I think that a warning should
> be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
> for the first time, whether from subtracting or adding 'nr_pages'.
One thing, not obvious, but important to understand, is that WARN_ONCE()
only issues a warning the first time its condition is found true, but
it returns the true or false of the condition every time. So lru_size
is forced to 0 each time an inconsistency is detected.
(But IIRC VM_WARN_ON_ONCE() does not behave in that useful way; or maybe
I've got that macro name wrong - we have so many slightly differing.)
Perhaps understanding that will help you to make better sense of the
order of events in this function.
Another thing to understand: it's called before adding folio to list,
but after removing folio from list: when it can usefully compare whether
the emptiness of the list correctly matches lru_size 0. It cannot do so
when adding if you "simplify" it in the way that you did.
You might be focusing too much on the "size < 0" aspect of it, or you
might be worrying more than I did about size growing larger and larger
until it wraps to negative (not likely on 64-bit, unless corrupted).
I hope these remarks help, but you need to think through it again
for yourself.
Hugh
>
> I am be grateful if you can explain more details, it has confused me for
> a while. Thank you very much.
>
> Best regards,
> Ridong
On Fri, Dec 06, 2024 at 12:24:54AM -0800, Hugh Dickins wrote: [...] > Another thing to understand: it's called before adding folio to list, > but after removing folio from list: when it can usefully compare whether > the emptiness of the list correctly matches lru_size 0. I think one source of confusion might be that this "emptiness" check has been removed by commit b4536f0c829c because of maintaining the list size per-zone and actual list is shared between zones of a node. > It cannot do so > when adding if you "simplify" it in the way that you did. >
On 2024/12/7 5:20, Shakeel Butt wrote: > On Fri, Dec 06, 2024 at 12:24:54AM -0800, Hugh Dickins wrote: > [...] >> Another thing to understand: it's called before adding folio to list, >> but after removing folio from list: when it can usefully compare whether >> the emptiness of the list correctly matches lru_size 0. > > I think one source of confusion might be that this "emptiness" check has > been removed by commit b4536f0c829c because of maintaining the list size > per-zone and actual list is shared between zones of a node. > Agree. Maybe it doesn't have to distinguish between "size > 0" and "size < 0" now? Thanks, Ridong >> It cannot do so >> when adding if you "simplify" it in the way that you did. >>
On 2024/12/6 16:24, Hugh Dickins wrote:
> On Fri, 6 Dec 2024, Chen Ridong wrote:
>> On 2024/12/6 13:33, Yu Zhao wrote:
>>> On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>>> From: Chen Ridong <chenridong@huawei.com>
>>>>
>>>> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
>>>> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
>>>> than 0 or less than 0. To simplify this function, add a check for
>>>> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
>>>> actions.
>>>>
>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>>>
>>> NAK.
>>>
>>> The commit that added that clearly explains why it was done that way.
>
> Thanks Yu: I did spot this going by, but was indeed hoping that someone
> else would NAK it, with more politeness than I could muster at the time!
>
>>
>> Thank you for your reply.
>>
>> I have read the commit message for ca707239e8a7 ("mm: update_lru_size
>> warn and reset bad lru_size") before sending my patch. However, I did
>> not quite understand why we need to deal with the difference between
>> 'nr_pages > 0' and 'nr_pages < 0'.
>>
>>
>> The 'lru_zone_size' can only be modified in the
>> 'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
>> adding 'nr_pages' in a way that causes an overflow can make the size < 0.
>>
>> For case 1, subtracting 'nr_pages', we should issue a warning if the
>> size goes below 0. For case 2, when adding 'nr_pages' results in an
>> overflow, there will be no warning and the size won't be reset to 0 the
>> first time it occurs . It might be that a warning will be issued the
>> next time 'mem_cgroup_update_lru_size' is called to modify the
>> 'lru_zone_size'. However, as the commit message said, "the first
>> occurrence is the most informative," and it seems we have missed that
>> first occurrence.
>>
>> As the commit message said: "and then the vast unsigned long size draws
>> page reclaim into a loop of repeatedly", I think that a warning should
>> be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
>> for the first time, whether from subtracting or adding 'nr_pages'.
>
> One thing, not obvious, but important to understand, is that WARN_ONCE()
> only issues a warning the first time its condition is found true, but
> it returns the true or false of the condition every time. So lru_size
> is forced to 0 each time an inconsistency is detected.
>
Thank you for your explanation.
My patch does not change this logic.
> (But IIRC VM_WARN_ON_ONCE() does not behave in that useful way; or maybe
> I've got that macro name wrong - we have so many slightly differing.)
>
> Perhaps understanding that will help you to make better sense of the
> order of events in this function.
>
> Another thing to understand: it's called before adding folio to list,
> but after removing folio from list: when it can usefully compare whether
> the emptiness of the list correctly matches lru_size 0. It cannot do so
> when adding if you "simplify" it in the way that you did.
>
The commit b4536f0c829c ("mm, memcg: fix the active list aging for
lowmem requests when memcg is enabled") has removed the emptiness check.
What is meaningful is that we can determine whether the size is smaller
than 0 before adding `nr_pages`. However, as I mentioned, the
`lru_zone_size` can only be modified in the `mem_cgroup_update_lru_size`
function, which means that a warning must have been issued if the size
was smaller than 0 before adding `nr_pages` when `nr_pages` is greater
than 0. In this case, it will not issue a warning again.
Perhaps "when it can usefully compare whether the emptiness of the list
correctly matches lru_size 0" is not useful now.
> You might be focusing too much on the "size < 0" aspect of it, or you
> might be worrying more than I did about size growing larger and larger
> until it wraps to negative (not likely on 64-bit, unless corrupted).> I hope these remarks help, but you need to think through it again
> for yourself.
>
> Hugh
Thank you very much for your patience.
Best regards,
Ridong
>
>>
>> I am be grateful if you can explain more details, it has confused me for
>> a while. Thank you very much.
>>
>> Best regards,
>> Ridong
© 2016 - 2025 Red Hat, Inc.