[PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths

Stepan Ionichev posted 1 patch 6 days, 19 hours ago
drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
[PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Stepan Ionichev 6 days, 19 hours ago
bm1390_trigger_handler() returns from three error paths without
calling iio_trigger_notify_done(). The success path at the end
does, so on a single transient regmap or read failure the trigger
use_count is never decremented, and the !atomic_read(&trig->use_count)
guard in iio_trigger_poll_chained() drops every subsequent dispatch.
The buffered-data flow stays wedged until the trigger is detached.

Funnel all returns through a single done label that calls
iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().

Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
Cc: stable@vger.kernel.org
Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
---
v2:
- Use a bool and IRQ_RETVAL() instead of irqreturn_t (Andy)

v1: https://lore.kernel.org/all/20260517160801.269-1-sozdayvek@gmail.com/

 drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/iio/pressure/rohm-bm1390.c b/drivers/iio/pressure/rohm-bm1390.c
index 08146ca0f..81368e578 100644
--- a/drivers/iio/pressure/rohm-bm1390.c
+++ b/drivers/iio/pressure/rohm-bm1390.c
@@ -626,12 +626,15 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
 	struct iio_poll_func *pf = p;
 	struct iio_dev *idev = pf->indio_dev;
 	struct bm1390_data *data = iio_priv(idev);
+	bool handled = true;
 	int ret, status;
 
 	/* DRDY is acked by reading status reg */
 	ret = regmap_read(data->regmap, BM1390_REG_STATUS, &status);
-	if (ret || !status)
-		return IRQ_NONE;
+	if (ret || !status) {
+		handled = false;
+		goto done;
+	}
 
 	dev_dbg(data->dev, "DRDY trig status 0x%x\n", status);
 
@@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
 		ret = bm1390_pressure_read(data, &data->buf.pressure);
 		if (ret) {
 			dev_warn(data->dev, "sample read failed %d\n", ret);
-			return IRQ_NONE;
+			handled = false;
+			goto done;
 		}
 	}
 
@@ -648,15 +652,16 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
 				       &data->buf.temp, sizeof(data->buf.temp));
 		if (ret) {
 			dev_warn(data->dev, "temp read failed %d\n", ret);
-			return IRQ_HANDLED;
+			goto done;
 		}
 	}
 
 	iio_push_to_buffers_with_ts(idev, &data->buf, sizeof(data->buf),
 				    data->timestamp);
+done:
 	iio_trigger_notify_done(idev->trig);
 
-	return IRQ_HANDLED;
+	return IRQ_RETVAL(handled);
 }
 
 /* Get timestamps and wake the thread if we need to read data */
-- 
2.43.0
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Jonathan Cameron 6 days, 13 hours ago
On Mon, 18 May 2026 14:42:38 +0500
Stepan Ionichev <sozdayvek@gmail.com> wrote:

> bm1390_trigger_handler() returns from three error paths without
> calling iio_trigger_notify_done(). The success path at the end
> does, so on a single transient regmap or read failure the trigger
> use_count is never decremented, and the !atomic_read(&trig->use_count)
> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
> The buffered-data flow stays wedged until the trigger is detached.
> 
> Funnel all returns through a single done label that calls
> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().
> 
> Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
> Cc: stable@vger.kernel.org
> Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>

These error path 'fixes' are fixes for hardware failure - so if anything
they are hardending  against a possible error condition. I don't mind
that bit it's not a bug to not do this so fixes tag an stable are not
appropriate for any of these.

Note however that hardening against these conditions is not this simple.
It takes careful analysis of exactly how the hardware behaves and what
each error condition 'might' mean.  Whilst they are probably harmless
I'm also very dubious about taking them without comprehensive testing
on the particular device.

> ---
> v2:
> - Use a bool and IRQ_RETVAL() instead of irqreturn_t (Andy)
> 
> v1: https://lore.kernel.org/all/20260517160801.269-1-sozdayvek@gmail.com/
> 
>  drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iio/pressure/rohm-bm1390.c b/drivers/iio/pressure/rohm-bm1390.c
> index 08146ca0f..81368e578 100644
> --- a/drivers/iio/pressure/rohm-bm1390.c
> +++ b/drivers/iio/pressure/rohm-bm1390.c
> @@ -626,12 +626,15 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>  	struct iio_poll_func *pf = p;
>  	struct iio_dev *idev = pf->indio_dev;
>  	struct bm1390_data *data = iio_priv(idev);
> +	bool handled = true;
>  	int ret, status;
>  
>  	/* DRDY is acked by reading status reg */
>  	ret = regmap_read(data->regmap, BM1390_REG_STATUS, &status);
So question 1.
- What actually is device state if this read fails?  We have no idea.
  It might have failed on the 'to device' path in which case the device
  didn't see the read.  Or it might have failed on the 'from device path'.

Gets more complex...

> -	if (ret || !status)
> -		return IRQ_NONE;

The trigger in use might well be the dataready trigger provided by this driver
(though I note this device has no validate callbacks so we do allow other
triggers - that may or may not be a bug!)  I really dislike read to clear
register designs as they make this stuff more complex.

Anyhow question 2:
- What happens if we don't clear it and do acknowledge the interrupt plus
ack the trigger (which is what iio_trigger_done() is doing?
  Two obvious options - wedged device, it re interrupts immediately.
If we are wedged, then meh device dead. Without adding retry loops
(don't) recovery path is reset the driver by unbinding and rebinding.

Fun follow up is what happens if having acked the data ready trigger
by this read, we get another read before getting to iio_trigger_notify_done()?

Quite possibly we wedge. This drivers trigger may be missing a reenable() callback
(which would typically reread the status register to clear any such interrupt).

Whether it does is again a device implementation specific thing.


> +	if (ret || !status) {
> +		handled = false;
> +		goto done;
> +	}
>  
>  	dev_dbg(data->dev, "DRDY trig status 0x%x\n", status);
>  
> @@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>  		ret = bm1390_pressure_read(data, &data->buf.pressure);
>  		if (ret) {
>  			dev_warn(data->dev, "sample read failed %d\n", ret);
> -			return IRQ_NONE;
> +			handled = false;
> +			goto done;

Hopefully all this stuff is unrelated to the trigger.  For these it is fair to
ack the trigger and the interrupt.  Curiously the driver does it partly for the
next one (IRQ_HANDLED).

>  		}
>  	}
>  
> @@ -648,15 +652,16 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>  				       &data->buf.temp, sizeof(data->buf.temp));
>  		if (ret) {
>  			dev_warn(data->dev, "temp read failed %d\n", ret);
> -			return IRQ_HANDLED;
> +			goto done;
>  		}
>  	}
>  
>  	iio_push_to_buffers_with_ts(idev, &data->buf, sizeof(data->buf),
>  				    data->timestamp);
> +done:
>  	iio_trigger_notify_done(idev->trig);
>  
> -	return IRQ_HANDLED;
> +	return IRQ_RETVAL(handled);
If we are doing this Andy's suggestion of a helper is neater.

Anyhow, upshot is to get this stuff right requires device specific knowledge.
Ideally the author tests injecting errors at each point to verify if the
data capture survives.  However, it's up to a driver author to decide if they
care.  There are normally dozens of paths in a driver that will result in needing
a reset (unbind/bind for most IIO drivers) - that's expensive, complex, fragile
handling code to maintain, so personally I consider it optional.

Jonathan


>  }
>  
>  /* Get timestamps and wake the thread if we need to read data */
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Matti Vaittinen 5 days, 23 hours ago
Thanks Jonathan,

Your post give me something to think about ;)

On 18/05/2026 18:15, Jonathan Cameron wrote:
> On Mon, 18 May 2026 14:42:38 +0500
> Stepan Ionichev <sozdayvek@gmail.com> wrote:
> 
>> bm1390_trigger_handler() returns from three error paths without
>> calling iio_trigger_notify_done(). The success path at the end
>> does, so on a single transient regmap or read failure the trigger
>> use_count is never decremented, and the !atomic_read(&trig->use_count)
>> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
>> The buffered-data flow stays wedged until the trigger is detached.
>>
>> Funnel all returns through a single done label that calls
>> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().
>>
>> Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
> 
> These error path 'fixes' are fixes for hardware failure - so if anything
> they are hardending  against a possible error condition. I don't mind
> that bit it's not a bug to not do this so fixes tag an stable are not
> appropriate for any of these.
> 
> Note however that hardening against these conditions is not this simple.
> It takes careful analysis of exactly how the hardware behaves and what
> each error condition 'might' mean.  Whilst they are probably harmless
> I'm also very dubious about taking them without comprehensive testing
> on the particular device.
> 
>> ---
>> v2:
>> - Use a bool and IRQ_RETVAL() instead of irqreturn_t (Andy)
>>
>> v1: https://lore.kernel.org/all/20260517160801.269-1-sozdayvek@gmail.com/
>>
>>   drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
>>   1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iio/pressure/rohm-bm1390.c b/drivers/iio/pressure/rohm-bm1390.c
>> index 08146ca0f..81368e578 100644
>> --- a/drivers/iio/pressure/rohm-bm1390.c
>> +++ b/drivers/iio/pressure/rohm-bm1390.c
>> @@ -626,12 +626,15 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>>   	struct iio_poll_func *pf = p;
>>   	struct iio_dev *idev = pf->indio_dev;
>>   	struct bm1390_data *data = iio_priv(idev);
>> +	bool handled = true;
>>   	int ret, status;
>>   
>>   	/* DRDY is acked by reading status reg */
>>   	ret = regmap_read(data->regmap, BM1390_REG_STATUS, &status);
> So question 1.
> - What actually is device state if this read fails?  We have no idea.
>    It might have failed on the 'to device' path in which case the device
>    didn't see the read.  Or it might have failed on the 'from device path'.
> 
> Gets more complex...
> 
>> -	if (ret || !status)
>> -		return IRQ_NONE;
> 
> The trigger in use might well be the dataready trigger provided by this driver
> (though I note this device has no validate callbacks so we do allow other
> triggers - that may or may not be a bug!)  I really dislike read to clear
> register designs as they make this stuff more complex.

I have a strong feeling it should be the dataready. Still, I have no 
idea about actual systems using this driver, so I am a bit cautious 
adding new restrictions.

> Anyhow question 2:
> - What happens if we don't clear it and do acknowledge the interrupt plus
> ack the trigger (which is what iio_trigger_done() is doing?
>    Two obvious options - wedged device, it re interrupts immediately.
> If we are wedged, then meh device dead. Without adding retry loops
> (don't) recovery path is reset the driver by unbinding and rebinding.

The BM1390 keeps the IRQ pin asserted.

> Fun follow up is what happens if having acked the data ready trigger
> by this read, we get another read before getting to iio_trigger_notify_done()?
> 
> Quite possibly we wedge.

I see. This isn't fun at all. Even more so if the trigger use-count now 
prevents us from calling the handler, and returning further IRQ_NONEs, 
preventing the safety-mechanism intended to disable the offending IRQ. I 
have a feeling there is IRQF_ONESHOT set though, so perhaps we are safe 
from this (when no error path is taken in the handler).

> This drivers trigger may be missing a reenable() callback
> (which would typically reread the status register to clear any such interrupt).

Which works for case where we "get another read before getting to 
iio_trigger_notify_done()" - but not for a case where we might have the 
bus stuck, causing read errors.

> Whether it does is again a device implementation specific thing.
> 
> 
>> +	if (ret || !status) {
>> +		handled = false;
>> +		goto done;
>> +	}
>>   
>>   	dev_dbg(data->dev, "DRDY trig status 0x%x\n", status);
>>   
>> @@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>>   		ret = bm1390_pressure_read(data, &data->buf.pressure);
>>   		if (ret) {
>>   			dev_warn(data->dev, "sample read failed %d\n", ret);
>> -			return IRQ_NONE;
>> +			handled = false;
>> +			goto done;
> 
> Hopefully all this stuff is unrelated to the trigger.  For these it is fair to
> ack the trigger and the interrupt.  Curiously the driver does it partly for the
> next one (IRQ_HANDLED).

I would keep the IRQ_NONE here because, if we keep constantly failing 
the reads, then the bus is likely to be unerliable - and disabling the 
useless IRQ is probably very sane thing to do. It should help debugging. 
What comes to acking the trigger - I am starting to agree with Stepan, 
we should probably ack the trigger in any case. If we don't ack the 
trigger, then the IRQ_NONE does not serve the purpose it is intended for.

>>   		}
>>   	}
>>   
>> @@ -648,15 +652,16 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>>   				       &data->buf.temp, sizeof(data->buf.temp));
>>   		if (ret) {
>>   			dev_warn(data->dev, "temp read failed %d\n", ret);
>> -			return IRQ_HANDLED;
>> +			goto done;
>>   		}
>>   	}
>>   
>>   	iio_push_to_buffers_with_ts(idev, &data->buf, sizeof(data->buf),
>>   				    data->timestamp);
>> +done:
>>   	iio_trigger_notify_done(idev->trig);
>>   
>> -	return IRQ_HANDLED;
>> +	return IRQ_RETVAL(handled);
> If we are doing this Andy's suggestion of a helper is neater.
> 
> Anyhow, upshot is to get this stuff right requires device specific knowledge.

And time... :)

> Ideally the author tests injecting errors at each point to verify if the
> data capture survives.  However, it's up to a driver author to decide if they
> care.  There are normally dozens of paths in a driver that will result in needing
> a reset (unbind/bind for most IIO drivers) - that's expensive, complex, fragile
> handling code to maintain, so personally I consider it optional.

I am not going to try adding any such recovery code in driver. I am 
afraid it would be way too complex for me to maintain (with my memory, 
code I've seen last month is new Today) for the added benefit. If we 
have such a delicate system where this type of 'failure recovery w/o 
reset' is required, then such code should (in my opinion) be system 
specific and not generic. Most of the device users will never benefit 
from it, but will need to look at it...

What I DO care is the IRQ gets disabled (from host side) if it can't be 
acked (from device side). That shouldn't be so complex (although, it 
seems it is more complex I thought when I wrote this driver).

After all this babbling I've done - if I understood it right, omitting 
the call to iio_trigger_notify_done() will prevent further returns of 
the IRQ_NONE, even if the IRQ stays asserted. So yes, I would definitely 
like to see this fix getting in.

Thanks guys for giving me a lesson again!

> 
>>   }
>>   
>>   /* Get timestamps and wake the thread if we need to read data */
> 

Yours,
	-- Matti

---
Matti Vaittinen
Linux kernel developer at ROHM Semiconductors
Oulu Finland

~~ When things go utterly wrong vim users can always type :help! ~~
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Jonathan Cameron 4 days, 18 hours ago
On Tue, 19 May 2026 08:48:13 +0300
Matti Vaittinen <mazziesaccount@gmail.com> wrote:

> Thanks Jonathan,
> 
> Your post give me something to think about ;)

This is a can of worms.  More below.

I'm unconcerned as long as (and ideally someone should check it)
we can get of being stuck by unbind/rebind of driver.  Anything
else is best effort.


> 
> On 18/05/2026 18:15, Jonathan Cameron wrote:
> > On Mon, 18 May 2026 14:42:38 +0500
> > Stepan Ionichev <sozdayvek@gmail.com> wrote:
> >   
> >> bm1390_trigger_handler() returns from three error paths without
> >> calling iio_trigger_notify_done(). The success path at the end
> >> does, so on a single transient regmap or read failure the trigger
> >> use_count is never decremented, and the !atomic_read(&trig->use_count)
> >> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
> >> The buffered-data flow stays wedged until the trigger is detached.
> >>
> >> Funnel all returns through a single done label that calls
> >> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().
> >>
> >> Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
> >> Cc: stable@vger.kernel.org
> >> Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>  
> > 
> > These error path 'fixes' are fixes for hardware failure - so if anything
> > they are hardending  against a possible error condition. I don't mind
> > that bit it's not a bug to not do this so fixes tag an stable are not
> > appropriate for any of these.
> > 
> > Note however that hardening against these conditions is not this simple.
> > It takes careful analysis of exactly how the hardware behaves and what
> > each error condition 'might' mean.  Whilst they are probably harmless
> > I'm also very dubious about taking them without comprehensive testing
> > on the particular device.
> >   
> >> ---
> >> v2:
> >> - Use a bool and IRQ_RETVAL() instead of irqreturn_t (Andy)
> >>
> >> v1: https://lore.kernel.org/all/20260517160801.269-1-sozdayvek@gmail.com/
> >>
> >>   drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
> >>   1 file changed, 10 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/iio/pressure/rohm-bm1390.c b/drivers/iio/pressure/rohm-bm1390.c
> >> index 08146ca0f..81368e578 100644
> >> --- a/drivers/iio/pressure/rohm-bm1390.c
> >> +++ b/drivers/iio/pressure/rohm-bm1390.c
> >> @@ -626,12 +626,15 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
> >>   	struct iio_poll_func *pf = p;
> >>   	struct iio_dev *idev = pf->indio_dev;
> >>   	struct bm1390_data *data = iio_priv(idev);
> >> +	bool handled = true;
> >>   	int ret, status;
> >>   
> >>   	/* DRDY is acked by reading status reg */
> >>   	ret = regmap_read(data->regmap, BM1390_REG_STATUS, &status);  
> > So question 1.
> > - What actually is device state if this read fails?  We have no idea.
> >    It might have failed on the 'to device' path in which case the device
> >    didn't see the read.  Or it might have failed on the 'from device path'.
> > 
> > Gets more complex...
> >   
> >> -	if (ret || !status)
> >> -		return IRQ_NONE;  
> > 
> > The trigger in use might well be the dataready trigger provided by this driver
> > (though I note this device has no validate callbacks so we do allow other
> > triggers - that may or may not be a bug!)  I really dislike read to clear
> > register designs as they make this stuff more complex.  
> 
> I have a strong feeling it should be the dataready. Still, I have no 
> idea about actual systems using this driver, so I am a bit cautious 
> adding new restrictions.

If we can show it is broken with any of the other triggers then
we can add the restriction without it being a potential regression. Not sure
if that is doable - easiest way would be to just try it and see.


> 
> > Anyhow question 2:
> > - What happens if we don't clear it and do acknowledge the interrupt plus
> > ack the trigger (which is what iio_trigger_done() is doing?
> >    Two obvious options - wedged device, it re interrupts immediately.
> > If we are wedged, then meh device dead. Without adding retry loops
> > (don't) recovery path is reset the driver by unbinding and rebinding.  
> 
> The BM1390 keeps the IRQ pin asserted.
> 
> > Fun follow up is what happens if having acked the data ready trigger
> > by this read, we get another read before getting to iio_trigger_notify_done()?
> > 
> > Quite possibly we wedge.  
> 
> I see. This isn't fun at all. Even more so if the trigger use-count now 
> prevents us from calling the handler, and returning further IRQ_NONEs, 
> preventing the safety-mechanism intended to disable the offending IRQ. I 
> have a feeling there is IRQF_ONESHOT set though, so perhaps we are safe 
> from this (when no error path is taken in the handler).

Good point.

You are right (I think!) that saves us if using the trigger in this driver
because the trigger is fired by iio_trigger_poll_nested()/handle_nested_irq()
which runs the thread_fn()s for (pollfunc threads) in the thread belonging
to the trigger interrupt.

If we had a trigger that was doing in the top half (iio_trigger_poll()) then
it would get messier but I think that could only slew the data by a sample
which isn't too bad (as we don't need this interrupt to happen). 

If we return an error because we know it's not our interrupt then we should
be fine anyway because our interrupt will turn up later (as long as that
is in the trigger itself).

> 
> > This drivers trigger may be missing a reenable() callback
> > (which would typically reread the status register to clear any such interrupt).  
> 
> Which works for case where we "get another read before getting to 
> iio_trigger_notify_done()" - but not for a case where we might have the 
> bus stuck, causing read errors.

If bus is stuck, I'm of the view recovery unlikely and if it really is a once
in a blue moon thing, unbind and rebind the driver.

> 
> > Whether it does is again a device implementation specific thing.
> > 
> >   
> >> +	if (ret || !status) {
> >> +		handled = false;
> >> +		goto done;
> >> +	}
> >>   
> >>   	dev_dbg(data->dev, "DRDY trig status 0x%x\n", status);
> >>   
> >> @@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
> >>   		ret = bm1390_pressure_read(data, &data->buf.pressure);
> >>   		if (ret) {
> >>   			dev_warn(data->dev, "sample read failed %d\n", ret);
> >> -			return IRQ_NONE;
> >> +			handled = false;
> >> +			goto done;  
> > 
> > Hopefully all this stuff is unrelated to the trigger.  For these it is fair to
> > ack the trigger and the interrupt.  Curiously the driver does it partly for the
> > next one (IRQ_HANDLED).  
> 
> I would keep the IRQ_NONE here because, if we keep constantly failing 
> the reads, then the bus is likely to be unerliable - and disabling the 
> useless IRQ is probably very sane thing to do. It should help debugging. 
> What comes to acking the trigger - I am starting to agree with Stepan, 
> we should probably ack the trigger in any case. If we don't ack the 
> trigger, then the IRQ_NONE does not serve the purpose it is intended for.

The interrupt that we'd get spurious detection on here would not be the device
one it would be the software emulated one deep in the iio trigger stuff.

Might still be useful for debug. Anyone fancy hacking an error in and reporting
back what we actually get from the debug hardware?  (with that trigger acked
as you suggest?)


> 
> >>   		}
> >>   	}
> >>   
> >> @@ -648,15 +652,16 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
> >>   				       &data->buf.temp, sizeof(data->buf.temp));
> >>   		if (ret) {
> >>   			dev_warn(data->dev, "temp read failed %d\n", ret);
> >> -			return IRQ_HANDLED;
> >> +			goto done;
> >>   		}
> >>   	}
> >>   
> >>   	iio_push_to_buffers_with_ts(idev, &data->buf, sizeof(data->buf),
> >>   				    data->timestamp);
> >> +done:
> >>   	iio_trigger_notify_done(idev->trig);
> >>   
> >> -	return IRQ_HANDLED;
> >> +	return IRQ_RETVAL(handled);  
> > If we are doing this Andy's suggestion of a helper is neater.
> > 
> > Anyhow, upshot is to get this stuff right requires device specific knowledge.  
> 
> And time... :)

Absolutely - I see it as value add, so its a business decision to work
through all these or not.

> 
> > Ideally the author tests injecting errors at each point to verify if the
> > data capture survives.  However, it's up to a driver author to decide if they
> > care.  There are normally dozens of paths in a driver that will result in needing
> > a reset (unbind/bind for most IIO drivers) - that's expensive, complex, fragile
> > handling code to maintain, so personally I consider it optional.  
> 
> I am not going to try adding any such recovery code in driver. I am 
> afraid it would be way too complex for me to maintain (with my memory, 
> code I've seen last month is new Today) for the added benefit. If we 
> have such a delicate system where this type of 'failure recovery w/o 
> reset' is required, then such code should (in my opinion) be system 
> specific and not generic. Most of the device users will never benefit 
> from it, but will need to look at it...
> 
> What I DO care is the IRQ gets disabled (from host side) if it can't be 
> acked (from device side). That shouldn't be so complex (although, it 
> seems it is more complex I thought when I wrote this driver).
> 
> After all this babbling I've done - if I understood it right, omitting 
> the call to iio_trigger_notify_done() will prevent further returns of 
> the IRQ_NONE, even if the IRQ stays asserted. So yes, I would definitely 
> like to see this fix getting in.

I think you are right on this, but I'd kind of like someone to hammer
some hardware to verify it. I'm not set up to do that today (even in
emulation) but could get to it at some point. 

The bit that makes me a bit doubtful is if this is a level interrupt
and the clear is in the trigger handler I think we get an interrupt
storm anyway - even with IRQ_NONE because that IRQ_NONE is for the
nested interrupt.  We keep return IRQ_HANDLED from the main thread irq
despite not actually doing anything.

That's hard behavior to fix in the core as that skip is intended for
annoying free running edge triggers - which are only ones where this race
'should' happen. Those can run faster than we can actually read data
and so we want to drop a scan.   Maybe we could cap the number
that are skipped so we eventually report a problem from
iio_trigger_poll_nested() or push that decision up to the caller?
This would rely on not call iio_trigger_poll_done() in IRQ_NONE on
the 'software' interrupts in that pollfuncs come from.

bool iio_trigger_poll_nested(struct iio_trigger *trig)
{
	int i;

	if (!atomic_read(&trig->use_count)) {
		atomic_set(&trig->use_count, CONFIG_IIO_CONSUMERS_PER_TRIGGER);

		for (i = 0; i < CONFIG_IIO_CONSUMERS_PER_TRIGGER; i++) {
			if (trig->subirqs[i].enabled)
				handle_nested_irq(trig->subirq_base + i);
			else
				iio_trigger_notify_done(trig);
		}
	} else {
		return false;
	}
	return true;
}

Then check that in the caller. That will tell us someone didn't finish
the previous trigger. Note it's still not fun because if we let other consumers
use the trigger, one of those might be running slow and do we want to throttle
the capture to every other interrupt (which incidentally means the
clear should be in the trigger irq handler, not the one grabbing the data).
Ah - that's a good reason to add a validate_device to the trigger.  It is
broken anyway for other devices using this trigger.

This feels like yet another case where we need some docs on best
practice and acceptable options.

Gah. I hate analysing error paths in complex flows. Lunch time!

Jonathan

> 
> Thanks guys for giving me a lesson again!
> 
> >   
> >>   }
> >>   
> >>   /* Get timestamps and wake the thread if we need to read data */  
> >   
> 
> Yours,
> 	-- Matti
> 
> ---
> Matti Vaittinen
> Linux kernel developer at ROHM Semiconductors
> Oulu Finland
> 
> ~~ When things go utterly wrong vim users can always type :help! ~~
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Matti Vaittinen 2 days, 16 hours ago
On 20/05/2026 14:08, Jonathan Cameron wrote:
> On Tue, 19 May 2026 08:48:13 +0300
> Matti Vaittinen <mazziesaccount@gmail.com> wrote:
> 
>> Thanks Jonathan,
>>
>> Your post give me something to think about ;)
> 
> This is a can of worms.  More below.
> 
> I'm unconcerned as long as (and ideally someone should check it)
> we can get of being stuck by unbind/rebind of driver.  Anything
> else is best effort.
> 
> 
>>
>> On 18/05/2026 18:15, Jonathan Cameron wrote:
>>> On Mon, 18 May 2026 14:42:38 +0500
>>> Stepan Ionichev <sozdayvek@gmail.com> wrote:
>>>    
>>>> bm1390_trigger_handler() returns from three error paths without
>>>> calling iio_trigger_notify_done(). The success path at the end
>>>> does, so on a single transient regmap or read failure the trigger
>>>> use_count is never decremented, and the !atomic_read(&trig->use_count)
>>>> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
>>>> The buffered-data flow stays wedged until the trigger is detached.
>>>>
>>>> Funnel all returns through a single done label that calls
>>>> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().
>>>>
>>>> Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
>>>> Cc: stable@vger.kernel.org
>>>> Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
>>>
>>> These error path 'fixes' are fixes for hardware failure - so if anything
>>> they are hardending  against a possible error condition. I don't mind
>>> that bit it's not a bug to not do this so fixes tag an stable are not
>>> appropriate for any of these.
>>>
>>> Note however that hardening against these conditions is not this simple.
>>> It takes careful analysis of exactly how the hardware behaves and what
>>> each error condition 'might' mean.  Whilst they are probably harmless
>>> I'm also very dubious about taking them without comprehensive testing
>>> on the particular device.
>>>    
>>>> ---

//snip

>>>>    
>>>> @@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>>>>    		ret = bm1390_pressure_read(data, &data->buf.pressure);
>>>>    		if (ret) {
>>>>    			dev_warn(data->dev, "sample read failed %d\n", ret);
>>>> -			return IRQ_NONE;
>>>> +			handled = false;
>>>> +			goto done;
>>>
>>> Hopefully all this stuff is unrelated to the trigger.  For these it is fair to
>>> ack the trigger and the interrupt.  Curiously the driver does it partly for the
>>> next one (IRQ_HANDLED).
>>
>> I would keep the IRQ_NONE here because, if we keep constantly failing
>> the reads, then the bus is likely to be unerliable - and disabling the
>> useless IRQ is probably very sane thing to do. It should help debugging.
>> What comes to acking the trigger - I am starting to agree with Stepan,
>> we should probably ack the trigger in any case. If we don't ack the
>> trigger, then the IRQ_NONE does not serve the purpose it is intended for.
> 
> The interrupt that we'd get spurious detection on here would not be the device
> one it would be the software emulated one deep in the iio trigger stuff.
> 
> Might still be useful for debug. Anyone fancy hacking an error in and reporting
> back what we actually get from the debug hardware?  (with that trigger acked
> as you suggest?)

No promises but I'll see if I can try out something next week...

Yours,
	-- Matti

-- 
---
Matti Vaittinen
Linux kernel developer at ROHM Semiconductors
Oulu Finland

~~ When things go utterly wrong vim users can always type :help! ~~
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Matti Vaittinen 6 days, 16 hours ago
On 18/05/2026 12:42, Stepan Ionichev wrote:
> bm1390_trigger_handler() returns from three error paths without
> calling iio_trigger_notify_done(). The success path at the end
> does, so on a single transient regmap or read failure the trigger
> use_count is never decremented, and the !atomic_read(&trig->use_count)
> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
> The buffered-data flow stays wedged until the trigger is detached.
> 

I still believe the use-count should be decremented by the IIO, after it 
has called trigger handlers. (Unless there is an use-case where the 
use-count is not decremented.) Well, let's wait for a little while so 
Jonathan & others have time to comment. I have been wrong at times ;)

> Funnel all returns through a single done label that calls
> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().
> 
> Fixes: 81ca5979b6ed ("iio: pressure: Support ROHM BU1390")
> Cc: stable@vger.kernel.org
> Signed-off-by: Stepan Ionichev <sozdayvek@gmail.com>
> ---
> v2:
> - Use a bool and IRQ_RETVAL() instead of irqreturn_t (Andy)
> 
> v1: https://lore.kernel.org/all/20260517160801.269-1-sozdayvek@gmail.com/
> 
>   drivers/iio/pressure/rohm-bm1390.c | 15 ++++++++++-----
>   1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iio/pressure/rohm-bm1390.c b/drivers/iio/pressure/rohm-bm1390.c
> index 08146ca0f..81368e578 100644
> --- a/drivers/iio/pressure/rohm-bm1390.c
> +++ b/drivers/iio/pressure/rohm-bm1390.c
> @@ -626,12 +626,15 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>   	struct iio_poll_func *pf = p;
>   	struct iio_dev *idev = pf->indio_dev;
>   	struct bm1390_data *data = iio_priv(idev);
> +	bool handled = true;

I would inverse the logic. At this point, the IRQ is _not_ handled. 
Hence I'd default this false and only toggled it to true when the IRQ is 
indeed successfully acked and data is read. That should allow you to 
touch the 'handled' only once after the initialization.

>   	int ret, status;
>   
>   	/* DRDY is acked by reading status reg */
>   	ret = regmap_read(data->regmap, BM1390_REG_STATUS, &status);
> -	if (ret || !status)
> -		return IRQ_NONE;
> +	if (ret || !status) {
> +		handled = false;
> +		goto done;
> +	}
>   
>   	dev_dbg(data->dev, "DRDY trig status 0x%x\n", status);
>   
> @@ -639,7 +642,8 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>   		ret = bm1390_pressure_read(data, &data->buf.pressure);
>   		if (ret) {
>   			dev_warn(data->dev, "sample read failed %d\n", ret);
> -			return IRQ_NONE;
> +			handled = false;
> +			goto done;
>   		}
>   	}
>   
> @@ -648,15 +652,16 @@ static irqreturn_t bm1390_trigger_handler(int irq, void *p)
>   				       &data->buf.temp, sizeof(data->buf.temp));
>   		if (ret) {
>   			dev_warn(data->dev, "temp read failed %d\n", ret);
> -			return IRQ_HANDLED;
> +			goto done;
>   		}
>   	}
>   
>   	iio_push_to_buffers_with_ts(idev, &data->buf, sizeof(data->buf),
>   				    data->timestamp);
> +done:
>   	iio_trigger_notify_done(idev->trig);
>   
> -	return IRQ_HANDLED;
> +	return IRQ_RETVAL(handled);
>   }
>   
>   /* Get timestamps and wake the thread if we need to read data */


Yours,
	-- Matti

---
Matti Vaittinen
Linux kernel developer at ROHM Semiconductors
Oulu Finland

~~ When things go utterly wrong vim users can always type :help! ~~
Re: [PATCH v2] iio: pressure: rohm-bm1390: notify trigger on all error paths
Posted by Andy Shevchenko 6 days, 18 hours ago
On Mon, May 18, 2026 at 02:42:38PM +0500, Stepan Ionichev wrote:
> bm1390_trigger_handler() returns from three error paths without
> calling iio_trigger_notify_done(). The success path at the end
> does, so on a single transient regmap or read failure the trigger
> use_count is never decremented, and the !atomic_read(&trig->use_count)
> guard in iio_trigger_poll_chained() drops every subsequent dispatch.
> The buffered-data flow stays wedged until the trigger is detached.
> 
> Funnel all returns through a single done label that calls
> iio_trigger_notify_done() and reports the outcome via IRQ_RETVAL().

...

> +done:
>  	iio_trigger_notify_done(idev->trig);


Maybe it's better to make this as an implementation and wrap it in something like

handle_trigger_irq()
{
	bool result;

	// that returns boolean and doesn't have notify call
	result = this_old_function(...);

	iio_trigger_notify_done(idev->trig);
	return IRQ_RETVAL(result);
}


-- 
With Best Regards,
Andy Shevchenko