[PATCH] sched/fair: Fix potential memory corruption in child_cfs_rq_on_list

Zecheng Li posted 1 patch 11 months, 1 week ago
There is a newer version of this series
[PATCH] sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Posted by Zecheng Li 11 months, 1 week ago
child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
making the conversion invalid and potentially leading to memory
corruption. Depending on the relative positions of leaf_cfs_rq_list and
the task group (tg) pointer within the struct, this can cause a memory
fault or access garbage data.

The issue arises in list_add_leaf_cfs_rq, where both
cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.

This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
conditional in child_cfs_rq_on_list. This ensures that the container_of
operation will convert a correct cfs_rq struct.

This check is sufficient because only cfs_rqs on the same CPU are added
to the list, so verifying the 'prev' pointer against the current rq's list
head is enough.

This fixes a potential memory corruption issue that, due to the current
struct layout, might not manifest as a crash but could lead to
unpredictable behavior if the layout changes.

Signed-off-by: Zecheng Li <zecheng@google.com>
---
 kernel/sched/fair.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 857808da23d8..9dafb374d76d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4061,15 +4061,17 @@ static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
 {
 	struct cfs_rq *prev_cfs_rq;
 	struct list_head *prev;
+	struct rq *rq = rq_of(cfs_rq);
 
 	if (cfs_rq->on_list) {
 		prev = cfs_rq->leaf_cfs_rq_list.prev;
 	} else {
-		struct rq *rq = rq_of(cfs_rq);
-
 		prev = rq->tmp_alone_branch;
 	}
 
+	if (prev == &rq->leaf_cfs_rq_list)
+		return false;
+
 	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
 
 	return (prev_cfs_rq->tg->parent == cfs_rq->tg);

base-commit: 7ab02bd36eb444654183ad6c5b15211ddfa32a8f
-- 
2.48.1
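
The problem described in the patch can be modelled outside the kernel. The sketch below is a user-space illustration only: the struct layouts, the padding field, and the helper bogus_tg_offset_in_rq() are hypothetical and heavily reduced, and rq_of() is replaced by a plain back-pointer. It shows the guarded helper, plus the offset arithmetic that decides where inside struct rq a bogus cfs_rq->tg would be read from if container_of() were applied to the list head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel's list primitives. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Hypothetical, reduced layouts; NOT the real kernel structs. */
struct task_group { struct task_group *parent; };

struct rq {
	long pad[4];				/* assumed filler */
	struct list_head leaf_cfs_rq_list;	/* list head (sentinel) */
	struct list_head *tmp_alone_branch;
};

struct cfs_rq {
	struct rq *rq;				/* stands in for rq_of() */
	struct task_group *tg;
	int on_list;
	struct list_head leaf_cfs_rq_list;	/* list node */
};

/* Model of the patched helper: never run container_of() on the head. */
static bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
{
	struct rq *rq = cfs_rq->rq;
	struct cfs_rq *prev_cfs_rq;
	struct list_head *prev;

	if (cfs_rq->on_list)
		prev = cfs_rq->leaf_cfs_rq_list.prev;
	else
		prev = rq->tmp_alone_branch;

	if (prev == &rq->leaf_cfs_rq_list)
		return false;

	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
	return prev_cfs_rq->tg->parent == cfs_rq->tg;
}

/*
 * Byte offset within struct rq that an unguarded container_of() on the
 * list head would treat as cfs_rq->tg. Pure arithmetic, no dereference.
 */
static ptrdiff_t bogus_tg_offset_in_rq(void)
{
	return (ptrdiff_t)offsetof(struct rq, leaf_cfs_rq_list)
	     - (ptrdiff_t)offsetof(struct cfs_rq, leaf_cfs_rq_list)
	     + (ptrdiff_t)offsetof(struct cfs_rq, tg);
}
```

In this model the value returned by bogus_tg_offset_in_rq() depends only on the two struct layouts, which is why the patch description says the symptom can change when the layout changes.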
Re: [PATCH] sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Posted by Vincent Guittot 11 months, 1 week ago
On Tue, 4 Mar 2025 at 22:40, Zecheng Li <zecheng@google.com> wrote:
>
> child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
> This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
> making the conversion invalid and potentially leading to memory
> corruption. Depending on the relative positions of leaf_cfs_rq_list and
> the task group (tg) pointer within the struct, this can cause a memory
> fault or access garbage data.
>
> The issue arises in list_add_leaf_cfs_rq, where both
> cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
> leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.
>
> This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
> conditional in child_cfs_rq_on_list. This ensures that the container_of
> operation will convert a correct cfs_rq struct.
>
> This check is sufficient because only cfs_rqs on the same CPU are added
> to the list, so verifying the 'prev' pointer against the current rq's list
> head is enough.
>
> This fixes a potential memory corruption issue that, due to the current
> struct layout, might not manifest as a crash but could lead to
> unpredictable behavior if the layout changes.

It would be good to add a Fixes tag:
Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling")

>
> Signed-off-by: Zecheng Li <zecheng@google.com>
> ---
>  kernel/sched/fair.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 857808da23d8..9dafb374d76d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4061,15 +4061,17 @@ static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
>  {
>         struct cfs_rq *prev_cfs_rq;
>         struct list_head *prev;
> +       struct rq *rq = rq_of(cfs_rq);
>
>         if (cfs_rq->on_list) {
>                 prev = cfs_rq->leaf_cfs_rq_list.prev;
>         } else {
> -               struct rq *rq = rq_of(cfs_rq);
> -
>                 prev = rq->tmp_alone_branch;
>         }
>
> +       if (prev == &rq->leaf_cfs_rq_list)
> +               return false;

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>


> +
>         prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
>
>         return (prev_cfs_rq->tg->parent == cfs_rq->tg);
>
> base-commit: 7ab02bd36eb444654183ad6c5b15211ddfa32a8f
> --
> 2.48.1
>
Re: [PATCH] sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Posted by K Prateek Nayak 11 months, 1 week ago
Hello Li,

On 3/5/2025 3:10 AM, Zecheng Li wrote:
> child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
> This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
> making the conversion invalid and potentially leading to memory
> corruption. Depending on the relative positions of leaf_cfs_rq_list and
> the task group (tg) pointer within the struct, this can cause a memory
> fault or access garbage data.
> 
> The issue arises in list_add_leaf_cfs_rq, where both
> cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
> leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.
> 
> This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
> conditional in child_cfs_rq_on_list. This ensures that the container_of
> operation will convert a correct cfs_rq struct.
> 
> This check is sufficient because only cfs_rqs on the same CPU are added
> to the list, so verifying the 'prev' pointer against the current rq's list
> head is enough.
> 
> This fixes a potential memory corruption issue that, due to the current
> struct layout, might not manifest as a crash but could lead to
> unpredictable behavior if the layout changes.
> 
> Signed-off-by: Zecheng Li <zecheng@google.com>
> ---
>   kernel/sched/fair.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 857808da23d8..9dafb374d76d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4061,15 +4061,17 @@ static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
>   {
>   	struct cfs_rq *prev_cfs_rq;
>   	struct list_head *prev;
> +	struct rq *rq = rq_of(cfs_rq);
>   
>   	if (cfs_rq->on_list) {
>   		prev = cfs_rq->leaf_cfs_rq_list.prev;
>   	} else {
> -		struct rq *rq = rq_of(cfs_rq);
> -
>   		prev = rq->tmp_alone_branch;
>   	}

A "SCHED_WARN_ON(prev == &rq->leaf_cfs_rq_list)" here is easily tripped
during early boot on my setup before this fix.

The only nit is that the return could perhaps go into the else clause
above, since the "cfs_rq->on_list" case will guarantee a "leaf_cfs_rq_list"
pointer that is embedded in a valid cfs_rq struct, but I have no strong
feelings.
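
For comparing the two possible placements of the check, here is a small user-space sketch. It is an illustration under assumptions, not kernel code: the structs are reduced to the fields involved, rq_of() is replaced by a back-pointer, and pick_prev() and prev_is_list_head() are hypothetical helper names. It only reports which branch yields the list head as 'prev'. In a generic circular list with a sentinel head, the first node's .prev is the head itself, so in this model the on_list branch can also produce the head; whether the scheduler can reach that state when calling child_cfs_rq_on_list is not established by this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *new, struct list_head *head)
{
	new->prev = head->prev;
	new->next = head;
	head->prev->next = new;
	head->prev = new;
}

/* Hypothetical, reduced layouts; NOT the real kernel structs. */
struct rq {
	struct list_head leaf_cfs_rq_list;	/* list head (sentinel) */
	struct list_head *tmp_alone_branch;
};

struct cfs_rq {
	struct rq *rq;				/* stands in for rq_of() */
	int on_list;
	struct list_head leaf_cfs_rq_list;	/* list node */
};

/* Same selection of 'prev' as in the patched function. */
static struct list_head *pick_prev(const struct cfs_rq *cfs_rq)
{
	return cfs_rq->on_list ? cfs_rq->leaf_cfs_rq_list.prev
			       : cfs_rq->rq->tmp_alone_branch;
}

/* True when 'prev' is the rq's list head rather than a cfs_rq node. */
static bool prev_is_list_head(const struct cfs_rq *cfs_rq)
{
	return pick_prev(cfs_rq) == &cfs_rq->rq->leaf_cfs_rq_list;
}
```

Calling prev_is_list_head() on the first node added to an otherwise empty list returns true in this model, while a node added after it returns false. A check placed after the if/else covers both branches regardless of which one selected 'prev'.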

Feel free to add:

Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

-- 
Thanks and Regards,
Prateek

>   
> +	if (prev == &rq->leaf_cfs_rq_list)
> +		return false;
> +
>   	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
>   
>   	return (prev_cfs_rq->tg->parent == cfs_rq->tg);
> 
> base-commit: 7ab02bd36eb444654183ad6c5b15211ddfa32a8f
[tip: sched/urgent] sched/fair: Fix potential memory corruption in child_cfs_rq_on_list
Posted by tip-bot2 for Zecheng Li 11 months, 1 week ago
The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     3b4035ddbfc8e4521f85569998a7569668cccf51
Gitweb:        https://git.kernel.org/tip/3b4035ddbfc8e4521f85569998a7569668cccf51
Author:        Zecheng Li <zecheng@google.com>
AuthorDate:    Tue, 04 Mar 2025 21:40:31 
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 05 Mar 2025 17:30:54 +01:00

sched/fair: Fix potential memory corruption in child_cfs_rq_on_list

child_cfs_rq_on_list attempts to convert a 'prev' pointer to a cfs_rq.
This 'prev' pointer can originate from struct rq's leaf_cfs_rq_list,
making the conversion invalid and potentially leading to memory
corruption. Depending on the relative positions of leaf_cfs_rq_list and
the task group (tg) pointer within the struct, this can cause a memory
fault or access garbage data.

The issue arises in list_add_leaf_cfs_rq, where both
cfs_rq->leaf_cfs_rq_list and rq->leaf_cfs_rq_list are added to the same
leaf list. Also, rq->tmp_alone_branch can be set to rq->leaf_cfs_rq_list.

This adds a check `if (prev == &rq->leaf_cfs_rq_list)` after the main
conditional in child_cfs_rq_on_list. This ensures that the container_of
operation will convert a correct cfs_rq struct.

This check is sufficient because only cfs_rqs on the same CPU are added
to the list, so verifying the 'prev' pointer against the current rq's list
head is enough.

This fixes a potential memory corruption issue that, due to the current
struct layout, might not manifest as a crash but could lead to
unpredictable behavior if the layout changes.

Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling")
Signed-off-by: Zecheng Li <zecheng@google.com>
Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20250304214031.2882646-1-zecheng@google.com
---
 kernel/sched/fair.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1c0ef43..c798d27 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4045,15 +4045,17 @@ static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
 {
 	struct cfs_rq *prev_cfs_rq;
 	struct list_head *prev;
+	struct rq *rq = rq_of(cfs_rq);
 
 	if (cfs_rq->on_list) {
 		prev = cfs_rq->leaf_cfs_rq_list.prev;
 	} else {
-		struct rq *rq = rq_of(cfs_rq);
-
 		prev = rq->tmp_alone_branch;
 	}
 
+	if (prev == &rq->leaf_cfs_rq_list)
+		return false;
+
 	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);
 
 	return (prev_cfs_rq->tg->parent == cfs_rq->tg);