Hi,

as discussed in [1], this is a manual backport of the remaining two
patches to let the io worker threads respect the affinities defined by
the cgroup of the process.

In 6.1 one worker is created per NUMA node, while in da64d6db3bd3
("io_uring: One wqe per wq") this is changed to only have a single worker.
As this patch is pretty invasive, Jens and I agreed not to backport it.

Instead we now limit the workers' cpuset to the CPUs that are in the
intersection between what the cgroup allows and what the NUMA node has.
This leaves the question of what to do in case the intersection is empty:
To be backwards compatible, we allow this case, but restrict the cpumask
of the poller to the cpuset defined by the cgroup. We further believe
this is a reasonable decision, as da64d6db3bd3 drops the NUMA awareness
anyway.

[1] https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk

Best regards,
Felix Moessbauer
Siemens AG

Felix Moessbauer (2):
  io_uring/io-wq: do not allow pinning outside of cpuset
  io_uring/io-wq: inherit cpuset of cgroup in io worker

 io_uring/io-wq.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

-- 
2.39.2
On Wed, Sep 11, 2024 at 06:23:14PM +0200, Felix Moessbauer wrote:
> Hi,
>
> as discussed in [1], this is a manual backport of the remaining two
> patches to let the io worker threads respect the affinities defined by
> the cgroup of the process.
>
> In 6.1 one worker is created per NUMA node, while in da64d6db3bd3
> ("io_uring: One wqe per wq") this is changed to only have a single worker.
> As this patch is pretty invasive, Jens and I agreed not to backport it.
>
> Instead we now limit the workers' cpuset to the CPUs that are in the
> intersection between what the cgroup allows and what the NUMA node has.
> This leaves the question of what to do in case the intersection is empty:
> To be backwards compatible, we allow this case, but restrict the cpumask
> of the poller to the cpuset defined by the cgroup. We further believe
> this is a reasonable decision, as da64d6db3bd3 drops the NUMA awareness
> anyway.
>
> [1] https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk

Why was neither of these actually tagged for inclusion in a stable tree?
Why just 6.1.y? Please submit them for all relevant kernel versions.

thanks,

greg k-h
On Mon, 2024-09-30 at 21:15 +0200, Greg KH wrote:
> On Wed, Sep 11, 2024 at 06:23:14PM +0200, Felix Moessbauer wrote:
> > [...]
>
> Why was neither of these actually tagged for inclusion in a stable
> tree?

This is a manual backport of these patches for 6.1, as the subsystem
changed significantly between 6.1 and 6.2, making an automated backport
impossible. This has been agreed on with Jens in
https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk/

> Why just 6.1.y? Please submit them for all relevant kernel versions.

The original patch was tagged stable and got accepted in 6.6, 6.10 and
6.11.

Felix

>
> thanks,
>
> greg k-h

-- 
Siemens AG, Technology
Linux Expert Center
On Tue, Oct 01, 2024 at 07:32:42AM +0000, MOESSBAUER, Felix wrote:
> On Mon, 2024-09-30 at 21:15 +0200, Greg KH wrote:
> > On Wed, Sep 11, 2024 at 06:23:14PM +0200, Felix Moessbauer wrote:
> > > [...]
> >
> > Why was neither of these actually tagged for inclusion in a stable
> > tree?
>
> This is a manual backport of these patches for 6.1, as the subsystem
> changed significantly between 6.1 and 6.2, making an automated backport
> impossible. This has been agreed on with Jens in
> https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk/
>
> > Why just 6.1.y? Please submit them for all relevant kernel versions.
>
> The original patch was tagged stable and got accepted in 6.6, 6.10 and
> 6.11.

No, they were not at all. Please properly tag them in the future as per
the documentation if you wish to have things applied to the stable
trees:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

thanks,

greg k-h
On 10/1/24 1:50 AM, gregkh@linuxfoundation.org wrote:
> On Tue, Oct 01, 2024 at 07:32:42AM +0000, MOESSBAUER, Felix wrote:
>> On Mon, 2024-09-30 at 21:15 +0200, Greg KH wrote:
>>> On Wed, Sep 11, 2024 at 06:23:14PM +0200, Felix Moessbauer wrote:
>>>> [...]
>>>
>>> Why was neither of these actually tagged for inclusion in a stable
>>> tree?
>>
>> This is a manual backport of these patches for 6.1, as the subsystem
>> changed significantly between 6.1 and 6.2, making an automated backport
>> impossible. This has been agreed on with Jens in
>> https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk/
>>
>>> Why just 6.1.y? Please submit them for all relevant kernel versions.
>>
>> The original patch was tagged stable and got accepted in 6.6, 6.10 and
>> 6.11.
>
> No, they were not at all. Please properly tag them in the future as per
> the documentation if you wish to have things applied to the stable
> trees:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html

That's my bad, I missed that one of them did not get marked for stable;
the sqpoll one did.

-- 
Jens Axboe
On 9/11/24 10:23 AM, Felix Moessbauer wrote:
> Hi,
>
> as discussed in [1], this is a manual backport of the remaining two
> patches to let the io worker threads respect the affinities defined by
> the cgroup of the process.
>
> In 6.1 one worker is created per NUMA node, while in da64d6db3bd3
> ("io_uring: One wqe per wq") this is changed to only have a single worker.
> As this patch is pretty invasive, Jens and I agreed not to backport it.
>
> Instead we now limit the workers' cpuset to the CPUs that are in the
> intersection between what the cgroup allows and what the NUMA node has.
> This leaves the question of what to do in case the intersection is empty:
> To be backwards compatible, we allow this case, but restrict the cpumask
> of the poller to the cpuset defined by the cgroup. We further believe
> this is a reasonable decision, as da64d6db3bd3 drops the NUMA awareness
> anyway.
>
> [1] https://lore.kernel.org/lkml/ec01745a-b102-4f6e-abc9-abd636d36319@kernel.dk

The upstream patches are staged for 6.12 and marked for a backport, so
they should go upstream next week. Once they are upstream, I'll make
sure to check in on these on the stable front.

-- 
Jens Axboe