[PATCH 11/27] block: Protect against concurrent isolated cpuset change

Frederic Weisbecker posted 27 patches 3 months, 2 weeks ago
There is a newer version of this series
[PATCH 11/27] block: Protect against concurrent isolated cpuset change
Posted by Frederic Weisbecker 3 months, 2 weeks ago
The block subsystem prevents running the workqueue to isolated CPUs,
including those defined by cpuset isolated partitions. Since
HK_TYPE_DOMAIN will soon contain both and be subject to runtime
modifications, synchronize against housekeeping using the relevant lock.

For full support of cpuset changes, the block subsystem may need to
propagate changes to isolated cpumask through the workqueue in the
future.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 block/blk-mq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4806b867e37d..ece3369825fe 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -4237,12 +4237,16 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 
 		/*
 		 * Rule out isolated CPUs from hctx->cpumask to avoid
-		 * running block kworker on isolated CPUs
+		 * running block kworker on isolated CPUs.
+		 * FIXME: cpuset should propagate further changes to isolated CPUs
+		 * here.
 		 */
+		housekeeping_lock();
 		for_each_cpu(cpu, hctx->cpumask) {
 			if (cpu_is_isolated(cpu))
 				cpumask_clear_cpu(cpu, hctx->cpumask);
 		}
+		housekeeping_unlock();
 
 		/*
 		 * Initialize batch roundrobin counts
-- 
2.48.1
Re: [PATCH 11/27] block: Protect against concurrent isolated cpuset change
Posted by Christoph Hellwig 3 months, 2 weeks ago
On Fri, Jun 20, 2025 at 05:22:52PM +0200, Frederic Weisbecker wrote:
> +		 * running block kworker on isolated CPUs.
> +		 * FIXME: cpuset should propagate further changes to isolated CPUs
> +		 * here.

I have no idea what this comments means.  Can you explain it, or help
fixing it?  Or at least send the entire series to all affected
subsystems as there's no way to review it without the context.

If nothing changes please at leat avoid the overly long line.
Re: [PATCH 11/27] block: Protect against concurrent isolated cpuset change
Posted by Frederic Weisbecker 3 months, 2 weeks ago
Le Sun, Jun 22, 2025 at 10:46:57PM -0700, Christoph Hellwig a écrit :
> On Fri, Jun 20, 2025 at 05:22:52PM +0200, Frederic Weisbecker wrote:
> > +		 * running block kworker on isolated CPUs.
> > +		 * FIXME: cpuset should propagate further changes to isolated CPUs
> > +		 * here.
> 
> I have no idea what this comments means.  Can you explain it, or help
> fixing it?  Or at least send the entire series to all affected
> subsystems as there's no way to review it without the context.
> 
> If nothing changes please at leat avoid the overly long line.

That's definetly confusing.
I'll try to clarify that on the next iteration, or even try to fix
it myself.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs
Re: [PATCH 11/27] block: Protect against concurrent isolated cpuset change
Posted by Bart Van Assche 3 months, 2 weeks ago
On 6/20/25 8:22 AM, Frederic Weisbecker wrote:
> The block subsystem prevents running the workqueue to isolated CPUs,
> including those defined by cpuset isolated partitions. Since
> HK_TYPE_DOMAIN will soon contain both and be subject to runtime
> modifications, synchronize against housekeeping using the relevant lock.
> 
> For full support of cpuset changes, the block subsystem may need to
> propagate changes to isolated cpumask through the workqueue in the
> future.
> 
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>   block/blk-mq.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4806b867e37d..ece3369825fe 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -4237,12 +4237,16 @@ static void blk_mq_map_swqueue(struct request_queue *q)
>   
>   		/*
>   		 * Rule out isolated CPUs from hctx->cpumask to avoid
> -		 * running block kworker on isolated CPUs
> +		 * running block kworker on isolated CPUs.
> +		 * FIXME: cpuset should propagate further changes to isolated CPUs
> +		 * here.
>   		 */
> +		housekeeping_lock();
>   		for_each_cpu(cpu, hctx->cpumask) {
>   			if (cpu_is_isolated(cpu))
>   				cpumask_clear_cpu(cpu, hctx->cpumask);
>   		}
> +		housekeeping_unlock();
>   
>   		/*
>   		 * Initialize batch roundrobin counts

Isn't it expected that function names have the subsystem name as a
prefix? The function name "housekeeping_lock" is not a good name because
that name does not make it clear what subsystem that function affects.
Additionally, "housekeeping" is very vague. Please choose a better name.

Thanks,

Bart.
Re: [PATCH 11/27] block: Protect against concurrent isolated cpuset change
Posted by Frederic Weisbecker 3 months, 2 weeks ago
Le Fri, Jun 20, 2025 at 08:59:58AM -0700, Bart Van Assche a écrit :
> On 6/20/25 8:22 AM, Frederic Weisbecker wrote:
> > The block subsystem prevents running the workqueue to isolated CPUs,
> > including those defined by cpuset isolated partitions. Since
> > HK_TYPE_DOMAIN will soon contain both and be subject to runtime
> > modifications, synchronize against housekeeping using the relevant lock.
> > 
> > For full support of cpuset changes, the block subsystem may need to
> > propagate changes to isolated cpumask through the workqueue in the
> > future.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >   block/blk-mq.c | 6 +++++-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 4806b867e37d..ece3369825fe 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -4237,12 +4237,16 @@ static void blk_mq_map_swqueue(struct request_queue *q)
> >   		/*
> >   		 * Rule out isolated CPUs from hctx->cpumask to avoid
> > -		 * running block kworker on isolated CPUs
> > +		 * running block kworker on isolated CPUs.
> > +		 * FIXME: cpuset should propagate further changes to isolated CPUs
> > +		 * here.
> >   		 */
> > +		housekeeping_lock();
> >   		for_each_cpu(cpu, hctx->cpumask) {
> >   			if (cpu_is_isolated(cpu))
> >   				cpumask_clear_cpu(cpu, hctx->cpumask);
> >   		}
> > +		housekeeping_unlock();
> >   		/*
> >   		 * Initialize batch roundrobin counts
> 
> Isn't it expected that function names have the subsystem name as a
> prefix? The function name "housekeeping_lock" is not a good name because
> that name does not make it clear what subsystem that function affects.
> Additionally, "housekeeping" is very vague. Please choose a better name.

Perhaps. "housekeeping_" doesn't match "isolation.c" but there is
already a whole set of APIs with the housekeeping prefix.

Anyway, this will likely disappear and be replaced by RCU instead.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs