Syzkaller reports the leak [1]. It is reproducible. The following patch fixes the leak. It was proposed by Takeshi Misawa and tested by Syzbot. In other places of the code the waitqueue is freed when its wait_ctr becomes zero (see autofs_wait_release). So I think it is not actually supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to zero. Please correct me if I'm wrong. Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() instead of wake_up_interruptible', shouldn't wake_up_interruptible() inside autofs_catatonic_mode() be replaced with wake_up()? [1] https://syzkaller.appspot.com/bug?id=a9412f636e2d733130f8def7975897d0b57f6e37 [2] https://www.spinics.net/lists/autofs/msg01875.html
On 12/2/23 03:59, Fedor Pchelkin wrote: > Syzkaller reports the leak [1]. It is reproducible. > > The following patch fixes the leak. It was proposed by Takeshi Misawa and > tested by Syzbot. > > In other places of the code the waitqueue is freed when its wait_ctr > becomes zero (see autofs_wait_release). So I think it is not actually > supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to > zero. Please correct me if I'm wrong. This is a bit had to read but I think your saying there's an assumption that wait_ctr can't become zero in autofs_catatonic_mode(). That's correct, the case of a waiting process getting sent a signal is not accounted for and this can (as you observed) lead to the wait not being freed and also not being freed at umount. I think the change here should be sufficient to resolve the leak and I can't think of any cases where this could cause a further problem. > > Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() > instead of wake_up_interruptible', shouldn't wake_up_interruptible() > inside autofs_catatonic_mode() be replaced with wake_up()? This does imply that [2] should have been applied to autofs_catatonic_mode() as well, I'm still trying to grok if that change would cause side effects for the change here but I think not. Ian
On 13/2/23 12:27, Ian Kent wrote: > On 12/2/23 03:59, Fedor Pchelkin wrote: >> Syzkaller reports the leak [1]. It is reproducible. >> >> The following patch fixes the leak. It was proposed by Takeshi Misawa >> and >> tested by Syzbot. >> >> In other places of the code the waitqueue is freed when its wait_ctr >> becomes zero (see autofs_wait_release). So I think it is not actually >> supposed that inside autofs_catatonic_mode wait_ctr cannot be >> decreased to >> zero. Please correct me if I'm wrong. > > This is a bit had to read but I think your saying there's an assumption > > that wait_ctr can't become zero in autofs_catatonic_mode(). > > > That's correct, the case of a waiting process getting sent a signal is > > not accounted for and this can (as you observed) lead to the wait not > > being freed and also not being freed at umount. > > > I think the change here should be sufficient to resolve the leak and > > I can't think of any cases where this could cause a further problem. > > >> >> Also, looking at the discussion [2] of the '[PATCH] autofs4: use >> wake_up() >> instead of wake_up_interruptible', shouldn't wake_up_interruptible() >> inside autofs_catatonic_mode() be replaced with wake_up()? > > This does imply that [2] should have been applied to > autofs_catatonic_mode() > > as well, I'm still trying to grok if that change would cause side effects > > for the change here but I think not. I was going to Ack the patch but I wondering if we should wait a little while and perhaps (probably) include the wake up call change as well. In any case we need Al to accept it (cc'd). Hopefully Al will offer his opinion on the changes too. > > > Ian >
On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote: > > I was going to Ack the patch but I wondering if we should wait a little > > while and perhaps (probably) include the wake up call change as well. > Hmm, those would be separate patches? An interesting thing is that the code itself supposes the wake up calls from autofs_wait_release() and autofs_catatonic_mode() to be related in some way (see autofs_wait fragment): /* * wq->name.name is NULL iff the lock is already released * or the mount has been made catatonic. */ wait_event_killable(wq->queue, wq->name.name == NULL); status = wq->status; It seems 'the lock is already released' refers to autofs_wait_release() as there is no alternative except the call to catatonic function where wq->name.name is NULL. So apparently the wake up calls should be the same (although I don't know if autofs_catatonic_mode has some different behaviour in such case, but probably it doesn't differ here). It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and currently it just decrements the wait_ctr's and it is not clear to me where the waitqueues are eventually freed in such case. Only if autofs_wait_release() or autofs_wait() are called? I'm not sure whether they are definitely called after that or not. [1] https://www.spinics.net/lists/autofs/msg01878.html > > In any case we need Al to accept it (cc'd). > > Hopefully Al will offer his opinion on the changes too. > It would be very nice if probably Al would make it more clear. At the moment I think that the leak issue should be fixed with the currenly discussed patch and the wake up call issue should be fixed like in [1], but perhaps I'm missing something.
On 11/3/23 01:56, Fedor Pchelkin wrote: > On Mon, Feb 13, 2023 at 12:37:16PM +0800, Ian Kent wrote: >> I was going to Ack the patch but I wondering if we should wait a little >> >> while and perhaps (probably) include the wake up call change as well. >> > Hmm, those would be separate patches? > > An interesting thing is that the code itself supposes the wake up calls > from autofs_wait_release() and autofs_catatonic_mode() to be related in > some way (see autofs_wait fragment): > > /* > * wq->name.name is NULL iff the lock is already released > * or the mount has been made catatonic. > */ > wait_event_killable(wq->queue, wq->name.name == NULL); > status = wq->status; > > It seems 'the lock is already released' refers to autofs_wait_release() > as there is no alternative except the call to catatonic function where > wq->name.name is NULL. So apparently the wake up calls should be the same > (although I don't know if autofs_catatonic_mode has some different > behaviour in such case, but probably it doesn't differ here). I think that, because there are processes waiting, they will always go via the tail of autofs_wait() so the wait will be freed at that point. Alternately autofs_wait_release() will be called from user space daemon to tell the kernel it's done with the current notification. I think there was an order of execution problem at some point between autofs_wait() and autofs_wait_release() hence the code there. The same may be the case for autofs_catatonic_mode() which is what the patch implies. These mount points can be left mounted after the user space daemon exits with the processes still blocked so umounting the mount should trigger the freeing of the name or they may be set catatonic by the daemon at exit, again freeing the name, and in both cases unblocking the processes to free the wait. So I didn't think there was a memory leak here but SyZkaller says there is. > > It's also strange that autofs_kill_sb() calls autofs_catatonic_mode() and > currently it just decrements the wait_ctr's and it is not clear to me > where the waitqueues are eventually freed in such case. Only if > autofs_wait_release() or autofs_wait() are called? I'm not sure whether > they are definitely called after that or not. > > [1] https://www.spinics.net/lists/autofs/msg01878.html >> In any case we need Al to accept it (cc'd). >> >> Hopefully Al will offer his opinion on the changes too. >> > It would be very nice if probably Al would make it more clear. > > At the moment I think that the leak issue should be fixed with the > currenly discussed patch and the wake up call issue should be fixed like > in [1], but perhaps I'm missing something. The question I have is, is it possible a process waiting on the wait queue gets unblocked after the wait is freed in autofs_catatonic_mode? Ian
On 12/2/23 03:59, Fedor Pchelkin wrote: > Syzkaller reports the leak [1]. It is reproducible. > > The following patch fixes the leak. It was proposed by Takeshi Misawa and > tested by Syzbot. > > In other places of the code the waitqueue is freed when its wait_ctr > becomes zero (see autofs_wait_release). So I think it is not actually > supposed that inside autofs_catatonic_mode wait_ctr cannot be decreased to > zero. Please correct me if I'm wrong. Clearly there's a problem here but I'll need to think about what's going a bit more myself. > > Also, looking at the discussion [2] of the '[PATCH] autofs4: use wake_up() > instead of wake_up_interruptible', shouldn't wake_up_interruptible() > inside autofs_catatonic_mode() be replaced with wake_up()? Yes, I think so but that also deserves a bit of thought. Ian
© 2016 - 2025 Red Hat, Inc.