[RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask

Xuewen Yan posted 1 patch 1 year, 9 months ago
kernel/sched/core.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
[RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask
Posted by Xuewen Yan 1 year, 9 months ago
Commit 851a723e45d1c ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
makes CPU offline/online produce different results for a task outside the
top cpuset, depending on whether the task is running or sleeping when its
CPUs go offline.
For example:

If the task is running when its CPUs are taken offline, its user mask is
cleared.

unisoc:/ # while true; do sleep 600; done&
[1] 6786
unisoc:/ # echo 6786 > /dev/cpuset/top-app/tasks
unisoc:/ # cat /dev/cpuset/top-app/cpus
0-7
unisoc:/ # cat /proc/6786/status | grep Cpus
Cpus_allowed:   ff
Cpus_allowed_list:      0-7

unisoc:/ # taskset -p c0 6786
pid 6786's current affinity mask: ff
pid 6786's new affinity mask: c0
unisoc:/ # cat /proc/6786/status | grep Cpus
Cpus_allowed:   c0
Cpus_allowed_list:      6-7

After taking cpu7 and cpu6 offline, the user mask is cleared:

unisoc:/ # echo 0 > /sys/devices/system/cpu/cpu7/online
unisoc:/ # cat /proc/6786/status | grep Cpus
Cpus_allowed:   40
Cpus_allowed_list:      6
ums9621_1h10:/ # echo 0 > /sys/devices/system/cpu/cpu6/online
ums9621_1h10:/ # cat /proc/6786/status | grep Cpus
Cpus_allowed:   3f
Cpus_allowed_list:      0-5

After bringing cpu6 and cpu7 back online, the user mask is not restored:

unisoc:/ # echo 1 > /sys/devices/system/cpu/cpu6/online
unisoc:/ # echo 1 > /sys/devices/system/cpu/cpu7/online
unisoc:/ # cat /proc/6786/status | grep Cpus
Cpus_allowed:   ff
Cpus_allowed_list:      0-7

However, if the CPUs are taken offline while the task is sleeping, the
fallback path (select_fallback_rq()) is not invoked at that time, so the
user mask is not cleared.

unisoc:/ # while true; do sleep 600; done&
[1] 5990
unisoc:/ # echo 5990 > /dev/cpuset/top-app/tasks
unisoc:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   ff
Cpus_allowed_list:      0-7

unisoc:/ # taskset -p c0 5990
pid 5990's current affinity mask: ff
pid 5990's new affinity mask: c0
unisoc:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   c0
Cpus_allowed_list:      6-7

unisoc:/ # echo 0 > /sys/devices/system/cpu/cpu6/online
unisoc:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   80
Cpus_allowed_list:      7
unisoc:/ # echo 0 > /sys/devices/system/cpu/cpu7/online
unisoc:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   3f
Cpus_allowed_list:      0-5

After 10 minutes the task wakes up and still keeps its user mask:
ums9621_1h10:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   3f
Cpus_allowed_list:      0-5

And when cpu6 and cpu7 are brought back online, the user mask is restored:
unisoc:/ # echo 1 > /sys/devices/system/cpu/cpu6/online
unisoc:/ # echo 1 > /sys/devices/system/cpu/cpu7/online
unisoc:/ # cat /proc/5990/status | grep Cpus
Cpus_allowed:   c0
Cpus_allowed_list:      6-7

There is no need to clear user_cpus_ptr when it intersects new_mask.
So check whether the two masks intersect, and clear user_cpus_ptr only
when they do not.
In this way, the problem above is also solved.

Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
previous discussion:
 https://lore.kernel.org/all/e402d623-1875-47a2-9db3-8299a54502ef@redhat.com/
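
For readers who want to try the condition outside the kernel, here is a
minimal userspace sketch. It assumes at most 64 CPUs so that a mask fits in
a uint64_t. masks_intersect(), should_clear_user_mask() and check_examples()
are names made up for this note; they only model cpumask_intersects() and the
new if-condition and are not kernel code. The mask values are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for cpumask_intersects(): one bit per CPU (assumption). */
static bool masks_intersect(uint64_t a, uint64_t b)
{
	return (a & b) != 0;
}

/*
 * Models the condition added to do_set_cpus_allowed():
 * has_user_mask stands for "p->user_cpus_ptr != NULL".
 * Returns true when SCA_USER would be set, i.e. the user mask is dropped.
 */
static bool should_clear_user_mask(bool has_user_mask, uint64_t user_mask,
				   uint64_t new_mask)
{
	return has_user_mask && !masks_intersect(user_mask, new_mask);
}

/* A few illustrative cases with arbitrary masks. */
static void check_examples(void)
{
	/* user mask 6-7, new mask 0-7: they overlap, so the user mask is kept */
	assert(!should_clear_user_mask(true, 0xc0, 0xff));
	/* user mask 6-7, new mask 0-3: no overlap, so the user mask is cleared */
	assert(should_clear_user_mask(true, 0xc0, 0x0f));
	/* no user mask was ever set: nothing to clear */
	assert(!should_clear_user_mask(false, 0, 0x0f));
}
```

Calling check_examples() from a small main() is enough to exercise the three
cases.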
---
 kernel/sched/core.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6..bbb8e88949f4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2796,21 +2796,24 @@ __do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
 }
 
 /*
- * Used for kthread_bind() and select_fallback_rq(), in both cases the user
- * affinity (if any) should be destroyed too.
+ * Used for kthread_bind() and select_fallback_rq().
+ * Destroy user affinity if no intersection with the new_mask.
  */
 void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 {
 	struct affinity_context ac = {
 		.new_mask  = new_mask,
 		.user_mask = NULL,
-		.flags     = SCA_USER,	/* clear the user requested mask */
+		.flags     = 0,
 	};
 	union cpumask_rcuhead {
 		cpumask_t cpumask;
 		struct rcu_head rcu;
 	};
 
+	if (p->user_cpus_ptr && !cpumask_intersects(p->user_cpus_ptr, new_mask))
+		ac.flags = SCA_USER;	/* clear the user requested mask */
+
 	__do_set_cpus_allowed(p, &ac);
 
 	/*
-- 
2.25.1
Re: [RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask
Posted by Xuewen Yan 1 year, 6 months ago
Friendly ping..

Could anyone please give some comments?

Thanks!

On Tue, May 7, 2024 at 3:23 PM Xuewen Yan <xuewen.yan@unisoc.com> wrote:
> [...]
Re: [RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask
Posted by Saket Kumar Bhaskar 1 year, 8 months ago
On Tue, May 07, 2024 at 03:22:42PM +0800, Xuewen Yan wrote:
> [...]

Hi Xuewen,

I have two questions:

1. The example for a running task uses "while true; do sleep 600; done&",
   which is a sleeping task. How is a running task emulated here?

2. I tried without the patch (6.9.0-rc4) on a 64-CPU system. The results are:
   
   For a task that is running:

   # stress-ng -l 100 --cpu 1
   stress-ng: info:  [2307] defaulting to a 86400 second (1 day, 0.00 secs) run per stressor
   stress-ng: info:  [2307] dispatching hogs: 1 cpu
   
   # taskset -p c0 2308
   pid 2308's current affinity mask: ffffffffffffffff
   pid 2308's new affinity mask: c0
   
   # cat /proc/2308/status |grep Cpus
   Cpus_allowed:	00000000,000000c0
   Cpus_allowed_list:	6-7
   
   # chcpu -d 6,7
   CPU 6 disabled
   CPU 7 disabled
   
   After disabling CPUs 6 and 7:

   # cat /proc/2308/status |grep Cpus
   Cpus_allowed:	ffffffff,ffffffff
   Cpus_allowed_list:	0-63

   After enabling CPUs 6 and 7:

   # chcpu -e 6,7
   CPU 6 enabled
   CPU 7 enabled
   
   # cat /proc/2308/status |grep Cpus
   Cpus_allowed:	ffffffff,ffffffff
   Cpus_allowed_list:	0-63

   From the above output, after disabling CPUs 6 and 7, all the CPUs in the
   system are shown, rather than only the remaining online CPUs (as in your
   example).

   For a task that is sleeping:

   # while true; do sleep 60; done&
   [1] 2541
   
   # taskset -p c0 2541
   pid 2541's current affinity mask: ffffffffffffffff
   pid 2541's new affinity mask: c0
   
   # cat /proc/2541/status |grep Cpus
   Cpus_allowed:	00000000,000000c0
   Cpus_allowed_list:	6-7
 
   After disabling CPUs 6 and 7:
  
   # chcpu -d 6,7
   CPU 6 disabled
   CPU 7 disabled

   # cat /proc/2541/status |grep Cpus
   Cpus_allowed:	00000000,000000c0
   Cpus_allowed_list:	6-7
   
   After 1 minute:

   # cat /proc/2541/status |grep Cpus
   Cpus_allowed:	ffffffff,ffffffff
   Cpus_allowed_list:	0-63
   
   # chcpu -e 6,7
   CPU 6 enabled
   CPU 7 enabled
   
   # cat /proc/2541/status |grep Cpus
   Cpus_allowed:	ffffffff,ffffffff
   Cpus_allowed_list:	0-63

   From the above output, after CPUs 6 and 7 were disabled, the task woke up
   after 1 minute and Cpus_allowed_list changed to 0-63 (contrary to your
   example).

   So the behaviour I see without the patch deviates from what was reported,
   or am I missing something?
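
When comparing runs like the ones above, it can help to decode the
Cpus_allowed hex value by hand and check it against Cpus_allowed_list. Below
is a minimal sketch, assuming at most 64 CPUs and the comma-separated 32-bit
group format seen in the output above. parse_cpus_allowed() and
format_cpu_list() are hypothetical helper names, not existing tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a Cpus_allowed value such as "c0" or "00000000,000000c0".
 * Returns 0 on success, -1 on an empty string, an invalid character,
 * or a value that does not fit in 64 bits.
 */
static int parse_cpus_allowed(const char *s, uint64_t *mask)
{
	uint64_t v = 0;
	int digits = 0;

	assert(s != NULL && mask != NULL);
	for (; *s; s++) {
		unsigned int d;

		if (*s == ',')
			continue;	/* separator between 32-bit groups */
		if (*s >= '0' && *s <= '9')
			d = (unsigned int)(*s - '0');
		else if (*s >= 'a' && *s <= 'f')
			d = (unsigned int)(*s - 'a') + 10;
		else if (*s >= 'A' && *s <= 'F')
			d = (unsigned int)(*s - 'A') + 10;
		else
			return -1;
		if (v >> 60)
			return -1;	/* shifting would drop set bits */
		v = (v << 4) | d;
		digits++;
	}
	if (!digits)
		return -1;
	*mask = v;
	return 0;
}

/* Write the mask as a range list such as "0-5" or "6" into buf. */
static void format_cpu_list(uint64_t mask, char *buf, size_t len)
{
	size_t off = 0;
	int cpu = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	while (cpu < 64) {
		int start, end, n;

		if (!(mask & (UINT64_C(1) << cpu))) {
			cpu++;
			continue;
		}
		start = cpu;
		while (cpu < 64 && (mask & (UINT64_C(1) << cpu)))
			cpu++;
		end = cpu - 1;
		if (start == end)
			n = snprintf(buf + off, len - off, "%s%d",
				     off ? "," : "", start);
		else
			n = snprintf(buf + off, len - off, "%s%d-%d",
				     off ? "," : "", start, end);
		if (n < 0 || (size_t)n >= len - off)
			return;		/* output truncated, stop here */
		off += (size_t)n;
	}
}
```

For instance, parsing "00000000,000000c0" and formatting the result should
give the same 6-7 list that the status file reports.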

Thanks and regards,
Saket Kumar Bhaskar
Re: [RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask
Posted by Xuewen Yan 1 year, 8 months ago
Hi Saket

On Fri, Jun 7, 2024 at 6:02 PM Saket Kumar Bhaskar <skb99@linux.ibm.com> wrote:
>
> On Tue, May 07, 2024 at 03:22:42PM +0800, Xuewen Yan wrote:
> > [...]
>
> Hi Xuewen,
>
> I have a query here:
>
> 1. From the example where task is running, " while true; do sleep 600; done& "
>    is being used, which is a sleeping task. How running task is emulated here?
>
> 2. Tried without patch (6.9.0-rc4) on a 64 CPUs system. Results are:
>
>    For a task that is running:
>
>    # stress-ng -l 100 --cpu 1
>    stress-ng: info:  [2307] defaulting to a 86400 second (1 day, 0.00 secs) run per stressor
>    stress-ng: info:  [2307] dispatching hogs: 1 cpu
>
>    #taskset -p c0 2308
>    pid 2308's current affinity mask: ffffffffffffffff
>    pid 2308's new affinity mask: c0
>
>    # cat /proc/2308/status |grep Cpus
>    Cpus_allowed:        00000000,000000c0
>    Cpus_allowed_list:   6-7
>
>    #chcpu -d 6,7
>    CPU 6 disabled
>    CPU 7 disabled
>
>    After disabling CPUs 6 and 7:
>
>    # cat /proc/2308/status |grep Cpus
>    Cpus_allowed:        ffffffff,ffffffff
>    Cpus_allowed_list:   0-63
>
>    After enabling CPUs 6 and 7:
>
>    ## chcpu -e 6,7
>    CPU 6 enabled
>    CPU 7 enabled
>
>    # cat /proc/2308/status |grep Cpus
>    Cpus_allowed:        ffffffff,ffffffff
>    Cpus_allowed_list:   0-63
>
>    From the above output, after disabling CPUs 6 and 7, all the CPUs in the
>    system are displayed rather than showing only remaining online CPUs(as
>    shown in above example).
>
>    For a task that is sleeping:
>
>    # while true; do sleep 60; done&
>    [1] 2541
>
>    # taskset -p c0 2541
>    pid 2541's current affinity mask: ffffffffffffffff
>    pid 2541's new affinity mask: c0
>
>    # cat /proc/2541/status |grep Cpus
>    Cpus_allowed:        00000000,000000c0
>    Cpus_allowed_list:   6-7
>
>    After disabling CPUs 6 and 7:
>
>    # chcpu -d 6,7
>    CPU 6 disabled
>    CPU 7 disabled
>
>    # cat /proc/2541/status |grep Cpus
>    Cpus_allowed:        00000000,000000c0
>    Cpus_allowed_list:   6-7
>
>    After 1 minute:
>
>    # cat /proc/2541/status |grep Cpus
>    Cpus_allowed:        ffffffff,ffffffff
>    Cpus_allowed_list:   0-63
>
>    # chcpu -e 6,7
>    CPU 6 enabled
>    CPU 7 enabled
>
>    # cat /proc/2541/status |grep Cpus
>    Cpus_allowed:        ffffffff,ffffffff
>    Cpus_allowed_list:   0-63
>
>    From the above output, after disabling CPUs 6 and 7, it waked up after
>    1 minute and Cpus_allowed_list got changed to 0-63 (which is contrary
>    to the above example).
>
>    So, there is some deviation in behaviour seen without the patch,
>    than reported or am I missing something?

Does the thread you are testing belong to the top cpuset? If so, I
think that behaviour is expected.
See commit 3fb906e7fabbb ("cgroup/cpuset: Don't filter offline CPUs in
cpuset_cpus_allowed() for top cpuset tasks").

BR
---
xuewen
>
> Thanks and regards,
> Saket Kumar Bhaskar
Re: [RFC PATCH] sched: Clear user_cpus_ptr only when no intersection with the new mask
Posted by Xuewen Yan 1 year, 8 months ago
Friendly ping...

On Tue, May 7, 2024 at 3:23 PM Xuewen Yan <xuewen.yan@unisoc.com> wrote:
> [...]