spin_lock_irqsave() in autofs_write() is bogus

[PATCH] spin_lock_irqsave() in autofs_write() is bogus

Posted by Al Viro 1 month, 2 weeks ago

	That function should never be (and never is) called with irqs
disabled - we have an explicit mutex_lock() in there, if nothing else.
Which makes spin_lock_irqsave() use in there pointless - we do need to
disable irqs for ->siglock, but that should be spin_lock_irq().

	The history is interesting - it goes all way back to 2.1.68pre1,
and that obviously was a tree-wide work.  Might be interesting to look
for other places with just-in-case spin_lock_irqsave()...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index 33dd4660d82f..4dc226e86360 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -46,7 +46,7 @@ void autofs_catatonic_mode(struct autofs_sb_info *sbi)
 static int autofs_write(struct autofs_sb_info *sbi,
 			struct file *file, const void *addr, int bytes)
 {
-	unsigned long sigpipe, flags;
+	unsigned long sigpipe;
 	const char *data = (const char *)addr;
 	ssize_t wr = 0;
 
@@ -66,10 +66,10 @@ static int autofs_write(struct autofs_sb_info *sbi,
 	 * SIGPIPE unless it was already supposed to get one
 	 */
 	if (wr == -EPIPE && !sigpipe) {
-		spin_lock_irqsave(&current->sighand->siglock, flags);
+		spin_lock_irq(&current->sighand->siglock);
 		sigdelset(&current->pending.signal, SIGPIPE);
 		recalc_sigpending();
-		spin_unlock_irqrestore(&current->sighand->siglock, flags);
+		spin_unlock_irq(&current->sighand->siglock);
 	}
 
 	/* if 'wr' returned 0 (impossible) we assume -EIO (safe) */

Re: [PATCH] spin_lock_irqsave() in autofs_write() is bogus

Posted by Paul Menzel 1 month, 2 weeks ago

Dear Al,


Am 17.08.25 um 18:36 schrieb Al Viro:
> 	That function should never be (and never is) called with irqs
> disabled - we have an explicit mutex_lock() in there, if nothing else.
> Which makes spin_lock_irqsave() use in there pointless - we do need to
> disable irqs for ->siglock, but that should be spin_lock_irq().
> 
> 	The history is interesting - it goes all way back to 2.1.68pre1,
> and that obviously was a tree-wide work.  Might be interesting to look
> for other places with just-in-case spin_lock_irqsave()...
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
> diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
> index 33dd4660d82f..4dc226e86360 100644
> --- a/fs/autofs/waitq.c
> +++ b/fs/autofs/waitq.c
> @@ -46,7 +46,7 @@ void autofs_catatonic_mode(struct autofs_sb_info *sbi)
>   static int autofs_write(struct autofs_sb_info *sbi,
>   			struct file *file, const void *addr, int bytes)
>   {
> -	unsigned long sigpipe, flags;
> +	unsigned long sigpipe;
>   	const char *data = (const char *)addr;
>   	ssize_t wr = 0;
>   
> @@ -66,10 +66,10 @@ static int autofs_write(struct autofs_sb_info *sbi,
>   	 * SIGPIPE unless it was already supposed to get one
>   	 */
>   	if (wr == -EPIPE && !sigpipe) {
> -		spin_lock_irqsave(&current->sighand->siglock, flags);
> +		spin_lock_irq(&current->sighand->siglock);
>   		sigdelset(&current->pending.signal, SIGPIPE);
>   		recalc_sigpending();
> -		spin_unlock_irqrestore(&current->sighand->siglock, flags);
> +		spin_unlock_irq(&current->sighand->siglock);
>   	}
>   
>   	/* if 'wr' returned 0 (impossible) we assume -EIO (safe) */
> 

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>


Kind regards,

Paul

Re: [PATCH] spin_lock_irqsave() in autofs_write() is bogus

Posted by Linus Torvalds 1 month, 2 weeks ago

On Sun, 17 Aug 2025 at 09:36, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
>         That function should never be (and never is) called with irqs
> disabled - we have an explicit mutex_lock() in there, if nothing else.
> Which makes spin_lock_irqsave() use in there pointless - we do need to
> disable irqs for ->siglock, but that should be spin_lock_irq().

I think we basically did the irqsave/restore version as the default
when not wanting to think about the context.

Your patch looks fine, but I doubt it's measurable outside of "it
makes the code a few bytes smaller".

So ACK on it, but I'm not convinced it's worth spending time actively
_looking_ for these kinds of things.

            Linus

Re: [PATCH] spin_lock_irqsave() in autofs_write() is bogus

Posted by David Laight 1 month, 1 week ago

On Sun, 17 Aug 2025 09:50:25 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sun, 17 Aug 2025 at 09:36, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> >         That function should never be (and never is) called with irqs
> > disabled - we have an explicit mutex_lock() in there, if nothing else.
> > Which makes spin_lock_irqsave() use in there pointless - we do need to
> > disable irqs for ->siglock, but that should be spin_lock_irq().  
> 
> I think we basically did the irqsave/restore version as the default
> when not wanting to think about the context.
> 
> Your patch looks fine, but I doubt it's measurable outside of "it
> makes the code a few bytes smaller".

There is a separate problem with all non-irqsave spin locks.
The lock hold time for spin locks is expected to be short,
perhaps only a few instruction.
(If the hold time is long it shouldn't be a spin lock...)

So disabling interrupts won't have a significant effect on interrupt
latency.

OTOH if an interrupt happens while the spin lock is help the lock
hold time is extended until the interrupt completes.
Get caught by an ethernet rx interrupt and it can be a very long
time (easily several milliseconds) before the hardware interrupt
and soft-int code returns.
During that time any other code attempting to acquire the lock
will spin.

So you may want to disable interrupts on all spin locks.
The downside is that the disable/enable isn't cheap on a lot
of systems.
I'm sure something could be done with the 'pre-empt disable'
counter to conditionally enable/disable interrupts.

There have also been systems that don't actually disable interrupts,
they just set a flag.
When an interrupt happens 'magic' is done so that the ISR can return
and the interrupt processed when (eg) the spin lock is released.

	David

Re: [PATCH] spin_lock_irqsave() in autofs_write() is bogus

Posted by Al Viro 1 month, 2 weeks ago

On Sun, Aug 17, 2025 at 09:50:25AM -0700, Linus Torvalds wrote:
> On Sun, 17 Aug 2025 at 09:36, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> >         That function should never be (and never is) called with irqs
> > disabled - we have an explicit mutex_lock() in there, if nothing else.
> > Which makes spin_lock_irqsave() use in there pointless - we do need to
> > disable irqs for ->siglock, but that should be spin_lock_irq().
> 
> I think we basically did the irqsave/restore version as the default
> when not wanting to think about the context.
>
> Your patch looks fine, but I doubt it's measurable outside of "it
> makes the code a few bytes smaller".
>
> So ACK on it, but I'm not convinced it's worth spending time actively
> _looking_ for these kinds of things.

It's obviously not a hot path of any sort; the only cost is head-scratching
of later readers.  OTOH, seeing that nobody had looked at it that one in
what, 28 years...

For another head-scratcher in that area: I started to look at __rcu sparse
noise, and ->sighand->siglock accesses had been quite a part of that.
current->sighand is obviously stable and should need no rcu_dereference(),
right?  So what the hell is
        rcu_read_lock();
	sighand = rcu_dereference(current->sighand);
	spin_lock_irqsave(&sighand->siglock, flags);
	recalc_sigpending();
	spin_unlock_irqrestore(&sighand->siglock, flags);
	rcu_read_unlock();
in net/sunrpc/svc.c:svc_unregister() about?  Is there something subtle I'm
missing there?  AFAICS, that came from 00a87e5d1d67 "SUNRPC: Address
RCU warning in net/sunrpc/svc.c" and commit message in there contains
no explanation beyond "sparse is complaining, let's make it STFU"...

Looking for ->sighand stores gives this:
fs/exec.c:1070:         rcu_assign_pointer(me->sighand, newsighand);
	store to current->sighand, called from begin_new_exec(), with
'me' set to current.
kernel/exit.c:215:      tsk->sighand = NULL;
	__exit_signal(), from release_task(), and if it's ever done
asynchronously to the current thread, we are really fucked (note that
we do *not* check that sighand is non-NULL in that snippet).
kernel/fork.c:1608:     RCU_INIT_POINTER(tsk->sighand, sig);
	copy_sighand(), from copy_process(), done to the child task
that is nowhere near running at that point.

So nothing weird has happened and that code looks very much like
a pointless head-scratcher.  Sure, the cost in execution time is
not going to be measurable, but the cost for readers is non-trivial...

FWIW, I suspect that the right way would be something like
static inline struct sighand_struct *current_sighand(void)
{
	return unrcu_pointer(current->sighand);
}
if unrcu_pointer is idiomatic in such case, plus conversion of open-coded
instances to it...