[PATCH] pps: Don't allow PPS_KC_BIND on removed devices

Calvin Owens posted 1 patch 1 week, 2 days ago
There is a newer version of this series
drivers/pps/kc.c           | 20 +++++++++++---------
include/linux/pps_kernel.h |  1 +
2 files changed, 12 insertions(+), 9 deletions(-)
[PATCH] pps: Don't allow PPS_KC_BIND on removed devices
Posted by Calvin Owens 1 week, 2 days ago
If userspace holds its file descriptor open, it can call PPS_KC_BIND on
a device which has been unplugged, leaving pps_kc_hardpps_dev as a
dangling pointer after close().

After that sequence, PPS_KC_BIND is broken until the system is rebooted,
because the pointer comparison in pps_kc_bind() can never be true.

    calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1081
    initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 811 usecs
    pps pps0: bound kernel consumer: edge=0x1
    pps pps0: unbound kernel consumer on device removal
    pps pps0: bound kernel consumer: edge=0x1
    calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1085
    initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 340 usecs
    pps pps0: another kernel consumer is already bound

Here is a short reproducer, which uses rmmod of the pps-ktimer testcase
to simulate a device being unplugged:

    #include <stdlib.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <linux/pps.h>
    #include <errno.h>
    #include <err.h>

    int main(void)
    {
        while (1) {
            int fd;

            if (system("insmod ./pps-ktimer.ko"))
                err(1, "insmod failed");

            fd = open("/dev/pps0", O_RDWR);
            if (fd == -1)
                err(1, "open failed");

            struct pps_bind_args args = {
                .tsformat = PPS_TSFMT_TSPEC,
                .edge = PPS_CAPTUREASSERT,
                .consumer = PPS_KC_HARDPPS,
            };

            if (ioctl(fd, PPS_KC_BIND, &args))
                err(1, "first PPS_KC_BIND failed");

            if (system("rmmod pps-ktimer"))
                err(1, "rmmod failed");

            if (ioctl(fd, PPS_KC_BIND, &args)) {
                if (errno != ENODEV)
                    err(1, "second PPS_KC_BIND failed");
                else
                    puts("Got ENODEV, kernel is patched");
            }

            close(fd);
        }
    }

Fix this by setting a flag when the device is unplugged, returning
-ENODEV from PPS_KC_BIND if the flag is set.

For userspace to encounter this new behavior, it must do something which
breaks the interface today, so this fix shouldn't cause any observable
behavior change for working programs.

Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://sashiko.dev/#/patchset/cover.1779733602.git.calvin%40wbinvd.org?part=1
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
---
I generally hate "just add a flag" as a way to fix this sort of thing,
but I think this really is specific to KC_BIND and not a more general
lifetime problem.

 drivers/pps/kc.c           | 20 +++++++++++---------
 include/linux/pps_kernel.h |  1 +
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/pps/kc.c b/drivers/pps/kc.c
index fbd23295afd7..5f9058f663c8 100644
--- a/drivers/pps/kc.c
+++ b/drivers/pps/kc.c
@@ -36,17 +36,21 @@ static int pps_kc_hardpps_mode;		/* mode bits for kernel consumer */
 int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
 {
 	/* Check if another consumer is already bound */
-	spin_lock_irq(&pps_kc_hardpps_lock);
+	guard(spinlock_irq)(&pps_kc_hardpps_lock);
+
+	/*
+	 * Don't allow PPS_KC_BIND on a removed device.
+	 */
+	if (pps->kc_removed)
+		return -ENODEV;
 
 	if (bind_args->edge == 0)
 		if (pps_kc_hardpps_dev == pps) {
 			pps_kc_hardpps_mode = 0;
 			pps_kc_hardpps_dev = NULL;
-			spin_unlock_irq(&pps_kc_hardpps_lock);
 			dev_info(&pps->dev, "unbound kernel"
 					" consumer\n");
 		} else {
-			spin_unlock_irq(&pps_kc_hardpps_lock);
 			dev_err(&pps->dev, "selected kernel consumer"
 					" is not bound\n");
 			return -EINVAL;
@@ -56,11 +60,9 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
 				pps_kc_hardpps_dev == pps) {
 			pps_kc_hardpps_mode = bind_args->edge;
 			pps_kc_hardpps_dev = pps;
-			spin_unlock_irq(&pps_kc_hardpps_lock);
 			dev_info(&pps->dev, "bound kernel consumer: "
 				"edge=0x%x\n", bind_args->edge);
 		} else {
-			spin_unlock_irq(&pps_kc_hardpps_lock);
 			dev_err(&pps->dev, "another kernel consumer"
 					" is already bound\n");
 			return -EINVAL;
@@ -78,15 +80,15 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
  */
 void pps_kc_remove(struct pps_device *pps)
 {
-	spin_lock_irq(&pps_kc_hardpps_lock);
+	guard(spinlock_irq)(&pps_kc_hardpps_lock);
+
+	pps->kc_removed = true;
 	if (pps == pps_kc_hardpps_dev) {
 		pps_kc_hardpps_mode = 0;
 		pps_kc_hardpps_dev = NULL;
-		spin_unlock_irq(&pps_kc_hardpps_lock);
 		dev_info(&pps->dev, "unbound kernel consumer"
 				" on device removal\n");
-	} else
-		spin_unlock_irq(&pps_kc_hardpps_lock);
+	}
 }
 
 /* pps_kc_event - call hardpps() on PPS event
diff --git a/include/linux/pps_kernel.h b/include/linux/pps_kernel.h
index aab0aebb529e..3c2979f8a5ac 100644
--- a/include/linux/pps_kernel.h
+++ b/include/linux/pps_kernel.h
@@ -60,6 +60,7 @@ struct pps_device {
 	struct device dev;
 	struct fasync_struct *async_queue;	/* fasync method */
 	spinlock_t lock;
+	bool kc_removed;
 };
 
 /*
-- 
2.47.3
Re: [PATCH] pps: Don't allow PPS_KC_BIND on removed devices
Posted by Rodolfo Giometti 1 week, 1 day ago
On 30/05/2026 04:46, Calvin Owens wrote:
> If userspace holds its file descriptor open, it can call PPS_KC_BIND on
> a device which has been unplugged, leaving pps_kc_hardpps_dev as a
> dangling pointer after close().
> 
> After that sequence, PPS_KC_BIND is broken until the system is rebooted,
> because the pointer comparison in pps_kc_bind() can never be true.
> 
>      calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1081
>      initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 811 usecs
>      pps pps0: bound kernel consumer: edge=0x1
>      pps pps0: unbound kernel consumer on device removal
>      pps pps0: bound kernel consumer: edge=0x1
>      calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1085
>      initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 340 usecs
>      pps pps0: another kernel consumer is already bound
> 
> Here is a short reproducer, which uses rmmod of the pps-ktimer testcase
> to simulate a device being unplugged:
> 
>      #include <stdlib.h>
>      #include <stdio.h>
>      #include <unistd.h>
>      #include <fcntl.h>
>      #include <sys/ioctl.h>
>      #include <linux/pps.h>
>      #include <errno.h>
>      #include <err.h>
> 
>      int main(void)
>      {
>          while (1) {
>              int fd;
> 
>              if (system("insmod ./pps-ktimer.ko"))
>                  err(1, "insmod failed");
> 
>              fd = open("/dev/pps0", O_RDWR);
>              if (fd == -1)
>                  err(1, "open failed");
> 
>              struct pps_bind_args args = {
>                  .tsformat = PPS_TSFMT_TSPEC,
>                  .edge = PPS_CAPTUREASSERT,
>                  .consumer = PPS_KC_HARDPPS,
>              };
> 
>              if (ioctl(fd, PPS_KC_BIND, &args))
>                  err(1, "first PPS_KC_BIND failed");
> 
>              if (system("rmmod pps-ktimer"))
>                  err(1, "rmmod failed");
> 
>              if (ioctl(fd, PPS_KC_BIND, &args)) {
>                  if (errno != ENODEV)
>                      err(1, "second PPS_KC_BIND failed");
>                  else
>                      puts("Got ENODEV, kernel is patched");
>              }
> 
>              close(fd);
>          }
>      }
> 
> Fix this by setting a flag when the device is unplugged, returning
> -ENODEV from PPS_KC_BIND if the flag is set.
> 
> For userspace to encounter this new behavior, it must do something which
> breaks the interface today, so this fix shouldn't cause any observable
> behavior change for working programs.
> 
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Closes: https://sashiko.dev/#/patchset/cover.1779733602.git.calvin%40wbinvd.org?part=1
> Signed-off-by: Calvin Owens <calvin@wbinvd.org>
> ---
> I generally hate "just add a flag" as a way to fix this sort of thing,
> but I think this really is specific to KC_BIND and not a more general
> lifetime problem.
> 
>   drivers/pps/kc.c           | 20 +++++++++++---------
>   include/linux/pps_kernel.h |  1 +
>   2 files changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/pps/kc.c b/drivers/pps/kc.c
> index fbd23295afd7..5f9058f663c8 100644
> --- a/drivers/pps/kc.c
> +++ b/drivers/pps/kc.c
> @@ -36,17 +36,21 @@ static int pps_kc_hardpps_mode;		/* mode bits for kernel consumer */
>   int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
>   {
>   	/* Check if another consumer is already bound */
> -	spin_lock_irq(&pps_kc_hardpps_lock);
> +	guard(spinlock_irq)(&pps_kc_hardpps_lock);
> +
> +	/*
> +	 * Don't allow PPS_KC_BIND on a removed device.
> +	 */
> +	if (pps->kc_removed)
> +		return -ENODEV;
>   
>   	if (bind_args->edge == 0)
>   		if (pps_kc_hardpps_dev == pps) {
>   			pps_kc_hardpps_mode = 0;
>   			pps_kc_hardpps_dev = NULL;
> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>   			dev_info(&pps->dev, "unbound kernel"
>   					" consumer\n");
>   		} else {
> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>   			dev_err(&pps->dev, "selected kernel consumer"
>   					" is not bound\n");
>   			return -EINVAL;
> @@ -56,11 +60,9 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
>   				pps_kc_hardpps_dev == pps) {
>   			pps_kc_hardpps_mode = bind_args->edge;
>   			pps_kc_hardpps_dev = pps;
> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>   			dev_info(&pps->dev, "bound kernel consumer: "
>   				"edge=0x%x\n", bind_args->edge);
>   		} else {
> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>   			dev_err(&pps->dev, "another kernel consumer"
>   					" is already bound\n");
>   			return -EINVAL;

I strongly dislike the idea of putting dev_info() and dev_err() inside a spinlock.

The original code was designed to release the lock before calling dev_info(). 
Using guard() here accidentally breaks that good design by keeping the lock held 
during the logging calls.

> @@ -78,15 +80,15 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
>    */
>   void pps_kc_remove(struct pps_device *pps)
>   {
> -	spin_lock_irq(&pps_kc_hardpps_lock);
> +	guard(spinlock_irq)(&pps_kc_hardpps_lock);
> +
> +	pps->kc_removed = true;
>   	if (pps == pps_kc_hardpps_dev) {
>   		pps_kc_hardpps_mode = 0;
>   		pps_kc_hardpps_dev = NULL;
> -		spin_unlock_irq(&pps_kc_hardpps_lock);
>   		dev_info(&pps->dev, "unbound kernel consumer"
>   				" on device removal\n");
> -	} else
> -		spin_unlock_irq(&pps_kc_hardpps_lock);
> +	}
>   }
>   
>   /* pps_kc_event - call hardpps() on PPS event
> diff --git a/include/linux/pps_kernel.h b/include/linux/pps_kernel.h
> index aab0aebb529e..3c2979f8a5ac 100644
> --- a/include/linux/pps_kernel.h
> +++ b/include/linux/pps_kernel.h
> @@ -60,6 +60,7 @@ struct pps_device {
>   	struct device dev;
>   	struct fasync_struct *async_queue;	/* fasync method */
>   	spinlock_t lock;
> +	bool kc_removed;
>   };
>   
>   /*

Ciao,

Rodolfo

-- 
GNU/Linux Solutions                  e-mail: giometti@enneenne.com
Linux Device Driver                          giometti@linux.it
Embedded Systems                     phone:  +39 349 2432127
UNIX programming
Re: [PATCH] pps: Don't allow PPS_KC_BIND on removed devices
Posted by Calvin Owens 1 week, 1 day ago
On Saturday 05/30 at 14:01 +0200, Rodolfo Giometti wrote:
> On 30/05/2026 04:46, Calvin Owens wrote:
> > If userspace holds its file descriptor open, it can call PPS_KC_BIND on
> > a device which has been unplugged, leaving pps_kc_hardpps_dev as a
> > dangling pointer after close().
> > 
> > After that sequence, PPS_KC_BIND is broken until the system is rebooted,
> > because the pointer comparison in pps_kc_bind() can never be true.
> > 
> >      calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1081
> >      initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 811 usecs
> >      pps pps0: bound kernel consumer: edge=0x1
> >      pps pps0: unbound kernel consumer on device removal
> >      pps pps0: bound kernel consumer: edge=0x1
> >      calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1085
> >      initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 340 usecs
> >      pps pps0: another kernel consumer is already bound
> > 
> > Here is a short reproducer, which uses rmmod of the pps-ktimer testcase
> > to simulate a device being unplugged:
> > 
> >      #include <stdlib.h>
> >      #include <stdio.h>
> >      #include <unistd.h>
> >      #include <fcntl.h>
> >      #include <sys/ioctl.h>
> >      #include <linux/pps.h>
> >      #include <errno.h>
> >      #include <err.h>
> > 
> >      int main(void)
> >      {
> >          while (1) {
> >              int fd;
> > 
> >              if (system("insmod ./pps-ktimer.ko"))
> >                  err(1, "insmod failed");
> > 
> >              fd = open("/dev/pps0", O_RDWR);
> >              if (fd == -1)
> >                  err(1, "open failed");
> > 
> >              struct pps_bind_args args = {
> >                  .tsformat = PPS_TSFMT_TSPEC,
> >                  .edge = PPS_CAPTUREASSERT,
> >                  .consumer = PPS_KC_HARDPPS,
> >              };
> > 
> >              if (ioctl(fd, PPS_KC_BIND, &args))
> >                  err(1, "first PPS_KC_BIND failed");
> > 
> >              if (system("rmmod pps-ktimer"))
> >                  err(1, "rmmod failed");
> > 
> >              if (ioctl(fd, PPS_KC_BIND, &args)) {
> >                  if (errno != ENODEV)
> >                      err(1, "second PPS_KC_BIND failed");
> >                  else
> >                      puts("Got ENODEV, kernel is patched");
> >              }
> > 
> >              close(fd);
> >          }
> >      }
> > 
> > Fix this by setting a flag when the device is unplugged, returning
> > -ENODEV from PPS_KC_BIND if the flag is set.
> > 
> > For userspace to encounter this new behavior, it must do something which
> > breaks the interface today, so this fix shouldn't cause any observable
> > behavior change for working programs.
> > 
> > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > Closes: https://sashiko.dev/#/patchset/cover.1779733602.git.calvin%40wbinvd.org?part=1
> > Signed-off-by: Calvin Owens <calvin@wbinvd.org>
> > ---
> > I generally hate "just add a flag" as a way to fix this sort of thing,
> > but I think this really is specific to KC_BIND and not a more general
> > lifetime problem.
> > 
> >   drivers/pps/kc.c           | 20 +++++++++++---------
> >   include/linux/pps_kernel.h |  1 +
> >   2 files changed, 12 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/pps/kc.c b/drivers/pps/kc.c
> > index fbd23295afd7..5f9058f663c8 100644
> > --- a/drivers/pps/kc.c
> > +++ b/drivers/pps/kc.c
> > @@ -36,17 +36,21 @@ static int pps_kc_hardpps_mode;		/* mode bits for kernel consumer */
> >   int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
> >   {
> >   	/* Check if another consumer is already bound */
> > -	spin_lock_irq(&pps_kc_hardpps_lock);
> > +	guard(spinlock_irq)(&pps_kc_hardpps_lock);
> > +
> > +	/*
> > +	 * Don't allow PPS_KC_BIND on a removed device.
> > +	 */
> > +	if (pps->kc_removed)
> > +		return -ENODEV;
> >   	if (bind_args->edge == 0)
> >   		if (pps_kc_hardpps_dev == pps) {
> >   			pps_kc_hardpps_mode = 0;
> >   			pps_kc_hardpps_dev = NULL;
> > -			spin_unlock_irq(&pps_kc_hardpps_lock);
> >   			dev_info(&pps->dev, "unbound kernel"
> >   					" consumer\n");
> >   		} else {
> > -			spin_unlock_irq(&pps_kc_hardpps_lock);
> >   			dev_err(&pps->dev, "selected kernel consumer"
> >   					" is not bound\n");
> >   			return -EINVAL;
> > @@ -56,11 +60,9 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
> >   				pps_kc_hardpps_dev == pps) {
> >   			pps_kc_hardpps_mode = bind_args->edge;
> >   			pps_kc_hardpps_dev = pps;
> > -			spin_unlock_irq(&pps_kc_hardpps_lock);
> >   			dev_info(&pps->dev, "bound kernel consumer: "
> >   				"edge=0x%x\n", bind_args->edge);
> >   		} else {
> > -			spin_unlock_irq(&pps_kc_hardpps_lock);
> >   			dev_err(&pps->dev, "another kernel consumer"
> >   					" is already bound\n");
> >   			return -EINVAL;
> 
> I strongly dislike the idea of putting dev_info() and dev_err() inside a spinlock.
> 
> The original code was designed to release the lock before calling
> dev_info(). Using guard() here accidentally breaks that good design by
> keeping the lock held during the logging calls.

Generally with RT and NBCON upstream, my assumption is that pepole who
care about this have the tools to deal with it. It's all preemptible on
an RT kernel nowadays.

But I don't feel strongly about this, I'm just curious if you or anyone
else on the list have any thoughts about that. If nobody chimes in, I'll
send a v2 with the guard changes dropped in a few days.

> > @@ -78,15 +80,15 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
> >    */
> >   void pps_kc_remove(struct pps_device *pps)
> >   {
> > -	spin_lock_irq(&pps_kc_hardpps_lock);
> > +	guard(spinlock_irq)(&pps_kc_hardpps_lock);
> > +
> > +	pps->kc_removed = true;
> >   	if (pps == pps_kc_hardpps_dev) {
> >   		pps_kc_hardpps_mode = 0;
> >   		pps_kc_hardpps_dev = NULL;
> > -		spin_unlock_irq(&pps_kc_hardpps_lock);
> >   		dev_info(&pps->dev, "unbound kernel consumer"
> >   				" on device removal\n");
> > -	} else
> > -		spin_unlock_irq(&pps_kc_hardpps_lock);
> > +	}
> >   }
> >   /* pps_kc_event - call hardpps() on PPS event
> > diff --git a/include/linux/pps_kernel.h b/include/linux/pps_kernel.h
> > index aab0aebb529e..3c2979f8a5ac 100644
> > --- a/include/linux/pps_kernel.h
> > +++ b/include/linux/pps_kernel.h
> > @@ -60,6 +60,7 @@ struct pps_device {
> >   	struct device dev;
> >   	struct fasync_struct *async_queue;	/* fasync method */
> >   	spinlock_t lock;
> > +	bool kc_removed;
> >   };
> >   /*
> 
> Ciao,
> 
> Rodolfo
> 
> -- 
> GNU/Linux Solutions                  e-mail: giometti@enneenne.com
> Linux Device Driver                          giometti@linux.it
> Embedded Systems                     phone:  +39 349 2432127
> UNIX programming
Re: [PATCH] pps: Don't allow PPS_KC_BIND on removed devices
Posted by Rodolfo Giometti 6 days, 23 hours ago
On 30/05/2026 17:00, Calvin Owens wrote:
> On Saturday 05/30 at 14:01 +0200, Rodolfo Giometti wrote:
>> On 30/05/2026 04:46, Calvin Owens wrote:
>>> If userspace holds its file descriptor open, it can call PPS_KC_BIND on
>>> a device which has been unplugged, leaving pps_kc_hardpps_dev as a
>>> dangling pointer after close().
>>>
>>> After that sequence, PPS_KC_BIND is broken until the system is rebooted,
>>> because the pointer comparison in pps_kc_bind() can never be true.
>>>
>>>       calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1081
>>>       initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 811 usecs
>>>       pps pps0: bound kernel consumer: edge=0x1
>>>       pps pps0: unbound kernel consumer on device removal
>>>       pps pps0: bound kernel consumer: edge=0x1
>>>       calling  pps_ktimer_init+0x0/0x1000 [pps_ktimer] @ 1085
>>>       initcall pps_ktimer_init+0x0/0x1000 [pps_ktimer] returned 0 after 340 usecs
>>>       pps pps0: another kernel consumer is already bound
>>>
>>> Here is a short reproducer, which uses rmmod of the pps-ktimer testcase
>>> to simulate a device being unplugged:
>>>
>>>       #include <stdlib.h>
>>>       #include <stdio.h>
>>>       #include <unistd.h>
>>>       #include <fcntl.h>
>>>       #include <sys/ioctl.h>
>>>       #include <linux/pps.h>
>>>       #include <errno.h>
>>>       #include <err.h>
>>>
>>>       int main(void)
>>>       {
>>>           while (1) {
>>>               int fd;
>>>
>>>               if (system("insmod ./pps-ktimer.ko"))
>>>                   err(1, "insmod failed");
>>>
>>>               fd = open("/dev/pps0", O_RDWR);
>>>               if (fd == -1)
>>>                   err(1, "open failed");
>>>
>>>               struct pps_bind_args args = {
>>>                   .tsformat = PPS_TSFMT_TSPEC,
>>>                   .edge = PPS_CAPTUREASSERT,
>>>                   .consumer = PPS_KC_HARDPPS,
>>>               };
>>>
>>>               if (ioctl(fd, PPS_KC_BIND, &args))
>>>                   err(1, "first PPS_KC_BIND failed");
>>>
>>>               if (system("rmmod pps-ktimer"))
>>>                   err(1, "rmmod failed");
>>>
>>>               if (ioctl(fd, PPS_KC_BIND, &args)) {
>>>                   if (errno != ENODEV)
>>>                       err(1, "second PPS_KC_BIND failed");
>>>                   else
>>>                       puts("Got ENODEV, kernel is patched");
>>>               }
>>>
>>>               close(fd);
>>>           }
>>>       }
>>>
>>> Fix this by setting a flag when the device is unplugged, returning
>>> -ENODEV from PPS_KC_BIND if the flag is set.
>>>
>>> For userspace to encounter this new behavior, it must do something which
>>> breaks the interface today, so this fix shouldn't cause any observable
>>> behavior change for working programs.
>>>
>>> Reported-by: Sashiko <sashiko-bot@kernel.org>
>>> Closes: https://sashiko.dev/#/patchset/cover.1779733602.git.calvin%40wbinvd.org?part=1
>>> Signed-off-by: Calvin Owens <calvin@wbinvd.org>
>>> ---
>>> I generally hate "just add a flag" as a way to fix this sort of thing,
>>> but I think this really is specific to KC_BIND and not a more general
>>> lifetime problem.
>>>
>>>    drivers/pps/kc.c           | 20 +++++++++++---------
>>>    include/linux/pps_kernel.h |  1 +
>>>    2 files changed, 12 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/pps/kc.c b/drivers/pps/kc.c
>>> index fbd23295afd7..5f9058f663c8 100644
>>> --- a/drivers/pps/kc.c
>>> +++ b/drivers/pps/kc.c
>>> @@ -36,17 +36,21 @@ static int pps_kc_hardpps_mode;		/* mode bits for kernel consumer */
>>>    int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
>>>    {
>>>    	/* Check if another consumer is already bound */
>>> -	spin_lock_irq(&pps_kc_hardpps_lock);
>>> +	guard(spinlock_irq)(&pps_kc_hardpps_lock);
>>> +
>>> +	/*
>>> +	 * Don't allow PPS_KC_BIND on a removed device.
>>> +	 */
>>> +	if (pps->kc_removed)
>>> +		return -ENODEV;
>>>    	if (bind_args->edge == 0)
>>>    		if (pps_kc_hardpps_dev == pps) {
>>>    			pps_kc_hardpps_mode = 0;
>>>    			pps_kc_hardpps_dev = NULL;
>>> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>>>    			dev_info(&pps->dev, "unbound kernel"
>>>    					" consumer\n");
>>>    		} else {
>>> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>>>    			dev_err(&pps->dev, "selected kernel consumer"
>>>    					" is not bound\n");
>>>    			return -EINVAL;
>>> @@ -56,11 +60,9 @@ int pps_kc_bind(struct pps_device *pps, struct pps_bind_args *bind_args)
>>>    				pps_kc_hardpps_dev == pps) {
>>>    			pps_kc_hardpps_mode = bind_args->edge;
>>>    			pps_kc_hardpps_dev = pps;
>>> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>>>    			dev_info(&pps->dev, "bound kernel consumer: "
>>>    				"edge=0x%x\n", bind_args->edge);
>>>    		} else {
>>> -			spin_unlock_irq(&pps_kc_hardpps_lock);
>>>    			dev_err(&pps->dev, "another kernel consumer"
>>>    					" is already bound\n");
>>>    			return -EINVAL;
>>
>> I strongly dislike the idea of putting dev_info() and dev_err() inside a spinlock.
>>
>> The original code was designed to release the lock before calling
>> dev_info(). Using guard() here accidentally breaks that good design by
>> keeping the lock held during the logging calls.
> 
> Generally with RT and NBCON upstream, my assumption is that pepole who
> care about this have the tools to deal with it. It's all preemptible on
> an RT kernel nowadays.
> 
> But I don't feel strongly about this, I'm just curious if you or anyone
> else on the list have any thoughts about that. If nobody chimes in, I'll
> send a v2 with the guard changes dropped in a few days.

The approach of using a kc_removed flag to return -ENODEV is a correct and 
robust fix for the core problem, improving both stability and userspace API 
predictability.

I suggest keeping the kc_removed check as an early exit, but reverting to 
explicit spin_lock_irq/spin_unlock_irq calls to ensure logging happens outside 
the spinlock protected region, as in the original code. This balances the 
benefit of your fix with best practices for spinlock usage.

Could you please prepare a v2 patch following this approach?

Ciao,

Rodolfo

-- 
GNU/Linux Solutions                  e-mail: giometti@enneenne.com
Linux Device Driver                          giometti@linux.it
Embedded Systems                     phone:  +39 349 2432127
UNIX programming