[PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()

Qi Zheng posted 4 patches 1 week, 1 day ago
There is a newer version of this series
[PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Qi Zheng 1 week, 1 day ago
From: Qi Zheng <zhengqi.arch@bytedance.com>

In memcg_state_val_in_pages(), if the passed val is negative, the
expression val * unit / PAGE_SIZE could be implicitly converted to a
massive positive number when compared with 1UL in the max() macro.
This leads to returning an incorrect massive positive value.

Fix this by using abs(val) to calculate the magnitude first, and then
restoring the sign of the value before returning the result. Additionally,
use mult_frac() to prevent potential overflow during the multiplication of
val and unit.

Reported-by: Harry Yoo (Oracle) <harry@kernel.org>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 mm/memcontrol.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 04076a139dbe3..0c249255ebefb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -787,11 +787,14 @@ static int memcg_page_state_unit(int item);
 static long memcg_state_val_in_pages(int idx, long val)
 {
 	int unit = memcg_page_state_unit(idx);
+	long res;
 
 	if (!val || unit == PAGE_SIZE)
 		return val;
-	else
-		return max(val * unit / PAGE_SIZE, 1UL);
+
+	res = max(mult_frac(abs(val), unit, PAGE_SIZE), 1UL);
+
+	return val < 0 ? -res : res;
 }
 
 #ifdef CONFIG_MEMCG_V1
-- 
2.20.1
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Lorenzo Stoakes (Oracle) 1 week ago
On Wed, Mar 25, 2026 at 10:13:25PM +0800, Qi Zheng wrote:
> From: Qi Zheng <zhengqi.arch@bytedance.com>
>
> In memcg_state_val_in_pages(), if the passed val is negative, the
> expression val * unit / PAGE_SIZE could be implicitly converted to a
> massive positive number when compared with 1UL in the max() macro.
> This leads to returning an incorrect massive positive value.
>
> Fix this by using abs(val) to calculate the magnitude first, and then
> restoring the sign of the value before returning the result. Additionally,
> use mult_frac() to prevent potential overflow during the multiplication of
> val and unit.
>
> Reported-by: Harry Yoo (Oracle) <harry@kernel.org>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

The logic is correct, but I think this needs rework for better
understanding, and obviously this should be squashed into 2/4 as per
Andrew.

With the below change applied:

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

> ---
>  mm/memcontrol.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 04076a139dbe3..0c249255ebefb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -787,11 +787,14 @@ static int memcg_page_state_unit(int item);
>  static long memcg_state_val_in_pages(int idx, long val)
>  {
>  	int unit = memcg_page_state_unit(idx);
> +	long res;
>
>  	if (!val || unit == PAGE_SIZE)
>  		return val;
> -	else
> -		return max(val * unit / PAGE_SIZE, 1UL);

Hm this was already fairly horrid, because we're comparing an unsigned long
value of 1 vs. a ULONG_MAX - abs(val) so this was intended to make 0 -> 1UL
but not what you'd mathematically think this was which was to make negative
values (logically < 1) -> 1.

Of course before it was just broken and would promote (val * unit /
PAGE_SIZE) to unsigned long first (thus massive number) and return that :)

> +
> +	res = max(mult_frac(abs(val), unit, PAGE_SIZE), 1UL);

This is way too compressed into one line and retains the confusing
behaviour.

Could we split this out and explain what we're doing (sign-extension,
integer promotion and all of this stuff is confusing - so let's just accept
that and spell it out):

	/* Get the absolute value of (val * unit / PAGE_SIZE). */
	res = mult_frac(abs(val), unit, PAGE_SIZE);
	/* Round up zero values. */
	res = res ?: 1;
	/* Retain sign. */
	return val < 0 ? -res : res;

This is functionally identical, but a lot more readable, I think.

> +
> +	return val < 0 ? -res : res;
>  }
>
>  #ifdef CONFIG_MEMCG_V1
> --
> 2.20.1
>

Cheers, Lorenzo
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Qi Zheng 1 week ago

On 3/26/26 5:16 PM, Lorenzo Stoakes (Oracle) wrote:
> On Wed, Mar 25, 2026 at 10:13:25PM +0800, Qi Zheng wrote:
>> From: Qi Zheng <zhengqi.arch@bytedance.com>
>>
>> In memcg_state_val_in_pages(), if the passed val is negative, the
>> expression val * unit / PAGE_SIZE could be implicitly converted to a
>> massive positive number when compared with 1UL in the max() macro.
>> This leads to returning an incorrect massive positive value.
>>
>> Fix this by using abs(val) to calculate the magnitude first, and then
>> restoring the sign of the value before returning the result. Additionally,
>> use mult_frac() to prevent potential overflow during the multiplication of
>> val and unit.
>>
>> Reported-by: Harry Yoo (Oracle) <harry@kernel.org>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> 
> The logic is correct, but I think this needs rework for better
> understanding, and obviously this should be squashed into 2/4 as per
> Andrew.
> 
> With the below change applied:
> 
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> 
>> ---
>>   mm/memcontrol.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 04076a139dbe3..0c249255ebefb 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -787,11 +787,14 @@ static int memcg_page_state_unit(int item);
>>   static long memcg_state_val_in_pages(int idx, long val)
>>   {
>>   	int unit = memcg_page_state_unit(idx);
>> +	long res;
>>
>>   	if (!val || unit == PAGE_SIZE)
>>   		return val;
>> -	else
>> -		return max(val * unit / PAGE_SIZE, 1UL);
> 
> Hm this was already fairly horrid, because we're comparing an unsigned long
> value of 1 vs. a ULONG_MAX - abs(val) so this was intended to make 0 -> 1UL
> but not what you'd mathematically think this was which was to make negative
> values (logically < 1) -> 1.
> 
> Of course before it was just broken and would promote (val * unit /
> PAGE_SIZE) to unsigned long first (thus massive number) and return that :)
> 
>> +
>> +	res = max(mult_frac(abs(val), unit, PAGE_SIZE), 1UL);
> 
> This is way too compressed into one line and retains the confusing
> behaviour.
> 
> Could we split this out and explain what we're doing (sign-extension,
> integer promotion and all of this stuff is confusing - so let's just accept
> that and spell it out):
> 
> 	/* Get the absolute value of (val * unit / PAGE_SIZE). */
> 	res = mult_frac(abs(val), unit, PAGE_SIZE);
> 	/* Round up zero values. */
> 	res = res ?: 1;
> 	/* Retain sign. */
> 	return val < 0 ? -res : res;
> 
> This is functionally identical, but a lot more readable, I think.

Make sense, I will update to v3.

If Andrew needs me to merge this patchset into "[PATCH v6 00/33] 
Eliminate Dying Memory Cgroup" [1], then I will develop and send v7.

[1]. 
https://lore.kernel.org/all/cover.1772711148.git.zhengqi.arch@bytedance.com/

Thanks,
Qi

> 
>> +
>> +	return val < 0 ? -res : res;
>>   }
>>
>>   #ifdef CONFIG_MEMCG_V1
>> --
>> 2.20.1
>>
> 
> Cheers, Lorenzo
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Lorenzo Stoakes (Oracle) 1 week ago
On Thu, Mar 26, 2026 at 05:32:05PM +0800, Qi Zheng wrote:
>
>
> On 3/26/26 5:16 PM, Lorenzo Stoakes (Oracle) wrote:
> > On Wed, Mar 25, 2026 at 10:13:25PM +0800, Qi Zheng wrote:
> > > ---
> > >   mm/memcontrol.c | 7 +++++--
> > >   1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > > index 04076a139dbe3..0c249255ebefb 100644
> > > --- a/mm/memcontrol.c
> > > +++ b/mm/memcontrol.c
> > > @@ -787,11 +787,14 @@ static int memcg_page_state_unit(int item);
> > >   static long memcg_state_val_in_pages(int idx, long val)
> > >   {
> > >   	int unit = memcg_page_state_unit(idx);
> > > +	long res;
> > >
> > >   	if (!val || unit == PAGE_SIZE)
> > >   		return val;
> > > -	else
> > > -		return max(val * unit / PAGE_SIZE, 1UL);
> >
> > Hm this was already fairly horrid, because we're comparing an unsigned long
> > value of 1 vs. a ULONG_MAX - abs(val) so this was intended to make 0 -> 1UL
> > but not what you'd mathematically think this was which was to make negative
> > values (logically < 1) -> 1.
> >
> > Of course before it was just broken and would promote (val * unit /
> > PAGE_SIZE) to unsigned long first (thus massive number) and return that :)
> >
> > > +
> > > +	res = max(mult_frac(abs(val), unit, PAGE_SIZE), 1UL);
> >
> > This is way too compressed into one line and retains the confusing
> > behaviour.
> >
> > Could we split this out and explain what we're doing (sign-extension,
> > integer promotion and all of this stuff is confusing - so let's just accept
> > that and spell it out):
> >
> > 	/* Get the absolute value of (val * unit / PAGE_SIZE). */
> > 	res = mult_frac(abs(val), unit, PAGE_SIZE);
> > 	/* Round up zero values. */
> > 	res = res ?: 1;
> > 	/* Retain sign. */
> > 	return val < 0 ? -res : res;
> >
> > This is functionally identical, but a lot more readable, I think.
>
> Make sense, I will update to v3.

Thanks!

>
> If Andrew needs me to merge this patchset into "[PATCH v6 00/33] Eliminate
> Dying Memory Cgroup" [1], then I will develop and send v7.
>
> [1].
> https://lore.kernel.org/all/cover.1772711148.git.zhengqi.arch@bytedance.com/

Oh that's your series too :)

That would be ideal unless that's already in mm-stable, as the series ordering
will give us strict ordering on patches.

Anyway let's wait for Andrew on that!

Cheers, Lorenzo
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Andrew Morton 1 week ago
On Thu, 26 Mar 2026 09:38:02 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:

> >
> > If Andrew needs me to merge this patchset into "[PATCH v6 00/33] Eliminate
> > Dying Memory Cgroup" [1], then I will develop and send v7.
> >
> > [1].
> > https://lore.kernel.org/all/cover.1772711148.git.zhengqi.arch@bytedance.com/
> 
> Oh that's your series too :)
> 
> That would be ideal unless that's already in mm-stable, as the series ordering
> will give us strict ordering on patches.
> 
> Anyway let's wait for Andrew on that!

<trying to catch up here>

Gee, I'd rather not churn that 33 patch series.  Could of course do so
in the normal fashion, if that's considered particularly desirable.

As I understand it, Qi will be preparing a v3 and I should stage that
ahead of "Eliminate ..." to avoid a bisection hole?

If so, that works.

<checks>

Well dang, this series ("fix unexpected type conversions and potential
overflows") is at least textually dependent on "Eliminate ...".

Options:

a) Redo this "fix unexpected ..." series on top of "Eliminate ..."
   and tolerate the bisection hole (easiest).

   Am I correct in believing that the concern here is a runtime
   bisection hole?  And that the bug is pretty unlikely to hit even if
   our unlucky bisector happens to hit that hole?  If so, we can live
   with that, surely.  Every darn hotfix we do creates a runtime
   bisecton hole!

b) Redo this series on top of mm-stable or mainline or whatever then
   I stage this series ahead of "Eliminate ..." and fix up the merge
   issues (probably OK)

c) Qi redoes everything as a single series.  That's OK.

If we choose c) then please lmk and I'll drop both series to give Qi a
clean run at mm-unstable.
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Qi Zheng 6 days, 22 hours ago

On 3/27/26 8:06 AM, Andrew Morton wrote:
> On Thu, 26 Mar 2026 09:38:02 +0000 "Lorenzo Stoakes (Oracle)" <ljs@kernel.org> wrote:
> 
>>>
>>> If Andrew needs me to merge this patchset into "[PATCH v6 00/33] Eliminate
>>> Dying Memory Cgroup" [1], then I will develop and send v7.
>>>
>>> [1].
>>> https://lore.kernel.org/all/cover.1772711148.git.zhengqi.arch@bytedance.com/
>>
>> Oh that's your series too :)
>>
>> That would be ideal unless that's already in mm-stable, as the series ordering
>> will give us strict ordering on patches.
>>
>> Anyway let's wait for Andrew on that!
> 
> <trying to catch up here>
> 
> Gee, I'd rather not churn that 33 patch series.  Could of course do so

Agree.

> in the normal fashion, if that's considered particularly desirable.
> 
> As I understand it, Qi will be preparing a v3 and I should stage that
> ahead of "Eliminate ..." to avoid a bisection hole?
> 
> If so, that works.
> 
> <checks>
> 
> Well dang, this series ("fix unexpected type conversions and potential
> overflows") is at least textually dependent on "Eliminate ...".
> 
> Options:
> 
> a) Redo this "fix unexpected ..." series on top of "Eliminate ..."
>     and tolerate the bisection hole (easiest).

 From my personal perspective, I prefer this approach. The issues fixed
by this patchset aren't too critical; it's just that the counter might
overflow (and only with CONFIG_MEMCG_V1). If it can be included in
v7.1-rcX along with "Eliminate ...", I think it's acceptable.

Thanks,
Qi

> 
>     Am I correct in believing that the concern here is a runtime
>     bisection hole?  And that the bug is pretty unlikely to hit even if
>     our unlucky bisector happens to hit that hole?  If so, we can live
>     with that, surely.  Every darn hotfix we do creates a runtime
>     bisecton hole!
> 
> b) Redo this series on top of mm-stable or mainline or whatever then
>     I stage this series ahead of "Eliminate ..." and fix up the merge
>     issues (probably OK)
> 
> c) Qi redoes everything as a single series.  That's OK.
> 
> If we choose c) then please lmk and I'll drop both series to give Qi a
> clean run at mm-unstable.
>
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Andrew Morton 6 days, 22 hours ago
On Fri, 27 Mar 2026 10:42:52 +0800 Qi Zheng <qi.zheng@linux.dev> wrote:

> > a) Redo this "fix unexpected ..." series on top of "Eliminate ..."
> >     and tolerate the bisection hole (easiest).
> 
>  From my personal perspective, I prefer this approach. The issues fixed
> by this patchset aren't too critical; it's just that the counter might
> overflow (and only with CONFIG_MEMCG_V1). If it can be included in
> v7.1-rcX along with "Eliminate ...", I think it's acceptable.

Cool, thanks.

I've dropped "fix unexpected type conversions and potential overflows
v2" from mm.git to give you a clean mm-unstable to work against.  I
expect to be pushing this a couple of hours from now.
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Lorenzo Stoakes (Oracle) 6 days, 17 hours ago
On Thu, Mar 26, 2026 at 08:13:49PM -0700, Andrew Morton wrote:
> On Fri, 27 Mar 2026 10:42:52 +0800 Qi Zheng <qi.zheng@linux.dev> wrote:
>
> > > a) Redo this "fix unexpected ..." series on top of "Eliminate ..."
> > >     and tolerate the bisection hole (easiest).
> >
> >  From my personal perspective, I prefer this approach. The issues fixed
> > by this patchset aren't too critical; it's just that the counter might
> > overflow (and only with CONFIG_MEMCG_V1). If it can be included in
> > v7.1-rcX along with "Eliminate ...", I think it's acceptable.
>
> Cool, thanks.
>
> I've dropped "fix unexpected type conversions and potential overflows
> v2" from mm.git to give you a clean mm-unstable to work against.  I
> expect to be pushing this a couple of hours from now.
>

Yeah this works for me too, I think given low chance of hitting we can tolerate
a small bisection hole!
Re: [PATCH v2 4/4] mm: memcontrol: fix unexpected massive positive number in memcg_state_val_in_pages()
Posted by Lorenzo Stoakes (Oracle) 1 week ago
On Thu, Mar 26, 2026 at 09:16:50AM +0000, Lorenzo Stoakes (Oracle) wrote:
> 	/* Get the absolute value of (val * unit / PAGE_SIZE). */
> 	res = mult_frac(abs(val), unit, PAGE_SIZE);
> 	/* Round up zero values. */
> 	res = res ?: 1;
> 	/* Retain sign. */
> 	return val < 0 ? -res : res;

(If you think the comments are too much, you can drop them, and the code
essentially spells it out anyway)

Cheers, Lorenzo