Even though hwrng provides a `wait` parameter, it doesn't work very well
when waiting for a long time. There are numerous deadlocks that emerge
related to shutdown. Work around this API limitation by waiting for a
shorter amount of time and erroring more frequently. This commit also
prevents hwrng from splatting messages to dmesg when there's a timeout
and switches to using schedule_timeout_interruptible(), so that the
kthread can be stopped.
Reported-by: Gregory Erwin <gregerwin256@gmail.com>
Tested-by: Gregory Erwin <gregerwin256@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Rui Salvaterra <rsalvaterra@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: stable@vger.kernel.org
Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
Link: https://bugs.archlinux.org/task/75138
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
drivers/char/hw_random/core.c | 5 +++--
drivers/net/wireless/ath/ath9k/rng.c | 20 +++-----------------
2 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 16f227b995e8..9987f0642285 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -513,8 +513,9 @@ static int hwrng_fillfn(void *unused)
break;
if (rc <= 0) {
- pr_warn("hwrng: no data available\n");
- msleep_interruptible(10000);
+ if (kthread_should_stop())
+ break;
+ schedule_timeout_interruptible(HZ * 10);
continue;
}
diff --git a/drivers/net/wireless/ath/ath9k/rng.c b/drivers/net/wireless/ath/ath9k/rng.c
index cb5414265a9b..757603d1949d 100644
--- a/drivers/net/wireless/ath/ath9k/rng.c
+++ b/drivers/net/wireless/ath/ath9k/rng.c
@@ -52,20 +52,6 @@ static int ath9k_rng_data_read(struct ath_softc *sc, u32 *buf, u32 buf_size)
return j << 2;
}
-static u32 ath9k_rng_delay_get(u32 fail_stats)
-{
- u32 delay;
-
- if (fail_stats < 100)
- delay = 10;
- else if (fail_stats < 105)
- delay = 1000;
- else
- delay = 10000;
-
- return delay;
-}
-
static int ath9k_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
{
struct ath_softc *sc = container_of(rng, struct ath_softc, rng_ops);
@@ -80,10 +66,10 @@ static int ath9k_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
bytes_read += max & 3UL;
memzero_explicit(&word, sizeof(word));
}
- if (!wait || !max || likely(bytes_read) || fail_stats > 110)
+ if (!wait || !max || likely(bytes_read) || ++fail_stats >= 100 ||
+ ((current->flags & PF_KTHREAD) && kthread_should_stop()) ||
+ schedule_timeout_interruptible(HZ / 20))
break;
-
- msleep_interruptible(ath9k_rng_delay_get(++fail_stats));
}
if (wait && !bytes_read && max)
--
2.35.1
"Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> Even though hwrng provides a `wait` parameter, it doesn't work very well
> when waiting for a long time. There are numerous deadlocks that emerge
> related to shutdown. Work around this API limitation by waiting for a
> shorter amount of time and erroring more frequently. This commit also
> prevents hwrng from splatting messages to dmesg when there's a timeout
> and switches to using schedule_timeout_interruptible(), so that the
> kthread can be stopped.
>
> Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: stable@vger.kernel.org
> Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> Link: https://bugs.archlinux.org/task/75138
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
On Mon, Jun 27, 2022 at 02:07:35PM +0200, Jason A. Donenfeld wrote:
> Even though hwrng provides a `wait` parameter, it doesn't work very well
> when waiting for a long time. There are numerous deadlocks that emerge
> related to shutdown. Work around this API limitation by waiting for a
> shorter amount of time and erroring more frequently. This commit also
> prevents hwrng from splatting messages to dmesg when there's a timeout
> and switches to using schedule_timeout_interruptible(), so that the
> kthread can be stopped.
>
> Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: stable@vger.kernel.org
> Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> Link: https://bugs.archlinux.org/task/75138
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> drivers/char/hw_random/core.c | 5 +++--
> drivers/net/wireless/ath/ath9k/rng.c | 20 +++-----------------
> 2 files changed, 6 insertions(+), 19 deletions(-)
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
"Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> Even though hwrng provides a `wait` parameter, it doesn't work very well
> when waiting for a long time. There are numerous deadlocks that emerge
> related to shutdown. Work around this API limitation by waiting for a
> shorter amount of time and erroring more frequently. This commit also
> prevents hwrng from splatting messages to dmesg when there's a timeout
> and switches to using schedule_timeout_interruptible(), so that the
> kthread can be stopped.
>
> Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> Cc: Kalle Valo <kvalo@kernel.org>
> Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: stable@vger.kernel.org
> Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> Link: https://bugs.archlinux.org/task/75138
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Gregory, care to take this version for a spin as well to double-check
that it still resolves the issue? :)
-Toke
On Mon, Jun 27, 2022 at 5:18 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
>
> > Even though hwrng provides a `wait` parameter, it doesn't work very well
> > when waiting for a long time. There are numerous deadlocks that emerge
> > related to shutdown. Work around this API limitation by waiting for a
> > shorter amount of time and erroring more frequently. This commit also
> > prevents hwrng from splatting messages to dmesg when there's a timeout
> > and switches to using schedule_timeout_interruptible(), so that the
> > kthread can be stopped.
> >
> > Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> > Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> > Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> > Cc: Kalle Valo <kvalo@kernel.org>
> > Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > Cc: stable@vger.kernel.org
> > Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> > Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> > Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> > Link: https://bugs.archlinux.org/task/75138
> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
>
> Gregory, care to take this version for a spin as well to double-check
> that it still resolves the issue? :)
>
> -Toke
>
With patch v6, reads from /dev/hwrng block for 5-6s, during which 'ip link set
wlan0 down' will also block. Otherwise, 'ip link set wlan0 down' returns
immediately. Similarly, wiphy_suspend() consistently returns in under 10ms.
Tested-by: Gregory Erwin <gregerwin256@gmail.com>
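A rough check on the 5-6s figure, as a sketch rather than a measurement: in the patched ath9k_rng_read() loop, each empty read sleeps for HZ / 20 jiffies and the loop gives up once fail_stats reaches 100. Assuming fail_stats starts at 0 (the initialisation is not shown in the diff) and the device keeps returning no data, the sleeps alone add up to just under 5 seconds. The helper name worst_case_wait_ms() and its parameters are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/*
 * Worst-case time spent sleeping, in milliseconds, for a retry loop
 * shaped like the patched ath9k_rng_read(): sleep hz / divisor jiffies
 * after each empty read and stop on the max_fail-th empty read.
 * Hypothetical helper for illustration; not part of the kernel.
 */
static unsigned int worst_case_wait_ms(unsigned int hz,
				       unsigned int max_fail,
				       unsigned int divisor)
{
	unsigned int jiffies_per_sleep;
	unsigned int sleeps;

	assert(hz > 0 && max_fail > 0 && divisor > 0);

	/* Integer division, as in the kernel expression HZ / 20. */
	jiffies_per_sleep = hz / divisor;

	/* The last failure breaks out of the loop before sleeping. */
	sleeps = max_fail - 1;

	return sleeps * jiffies_per_sleep * 1000 / hz;
}
```

For HZ=1000 this gives worst_case_wait_ms(1000, 100, 20) == 4950, which is in line with the reported 5-6s once the time spent in the reads themselves is added. The removed ath9k_rng_delay_get() used 10ms sleeps for the first 100 failures, so the early part of the old loop was shorter even though its later sleeps were much longer.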
Hi Gregory,
On Tue, Jun 28, 2022 at 3:39 AM Gregory Erwin <gregerwin256@gmail.com> wrote:
>
> On Mon, Jun 27, 2022 at 5:18 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> >
> > > Even though hwrng provides a `wait` parameter, it doesn't work very well
> > > when waiting for a long time. There are numerous deadlocks that emerge
> > > related to shutdown. Work around this API limitation by waiting for a
> > > shorter amount of time and erroring more frequently. This commit also
> > > prevents hwrng from splatting messages to dmesg when there's a timeout
> > > and switches to using schedule_timeout_interruptible(), so that the
> > > kthread can be stopped.
> > >
> > > Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> > > Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> > > Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> > > Cc: Kalle Valo <kvalo@kernel.org>
> > > Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> > > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > > Cc: stable@vger.kernel.org
> > > Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> > > Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> > > Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> > > Link: https://bugs.archlinux.org/task/75138
> > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >
> > Gregory, care to take this version for a spin as well to double-check
> > that it still resolves the issue? :)
> >
> > -Toke
> >
>
> With patch v6, reads from /dev/hwrng block for 5-6s, during which 'ip link set
> wlan0 down' will also block. Otherwise, 'ip link set wlan0 down' returns
> immediately. Similarly, wiphy_suspend() consistently returns in under 10ms.
>
> Tested-by: Gregory Erwin <gregerwin256@gmail.com>
Oh 5-6s... so it's actually worse. Yikes. Sounds like v4 might have
been the better patch, then?
Jason
On Tue, Jun 28, 2022 at 12:46 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Gregory,
>
> On Tue, Jun 28, 2022 at 3:39 AM Gregory Erwin <gregerwin256@gmail.com> wrote:
> >
> > On Mon, Jun 27, 2022 at 5:18 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > >
> > > "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> > >
> > > > Even though hwrng provides a `wait` parameter, it doesn't work very well
> > > > when waiting for a long time. There are numerous deadlocks that emerge
> > > > related to shutdown. Work around this API limitation by waiting for a
> > > > shorter amount of time and erroring more frequently. This commit also
> > > > prevents hwrng from splatting messages to dmesg when there's a timeout
> > > > and switches to using schedule_timeout_interruptible(), so that the
> > > > kthread can be stopped.
> > > >
> > > > Reported-by: Gregory Erwin <gregerwin256@gmail.com>
> > > > Tested-by: Gregory Erwin <gregerwin256@gmail.com>
> > > > Cc: Toke Høiland-Jørgensen <toke@redhat.com>
> > > > Cc: Kalle Valo <kvalo@kernel.org>
> > > > Cc: Rui Salvaterra <rsalvaterra@gmail.com>
> > > > Cc: Herbert Xu <herbert@gondor.apana.org.au>
> > > > Cc: stable@vger.kernel.org
> > > > Fixes: fcd09c90c3c5 ("ath9k: use hw_random API instead of directly dumping into random.c")
> > > > Link: https://lore.kernel.org/all/CAO+Okf6ZJC5-nTE_EJUGQtd8JiCkiEHytGgDsFGTEjs0c00giw@mail.gmail.com/
> > > > Link: https://lore.kernel.org/lkml/CAO+Okf5k+C+SE6pMVfPf-d8MfVPVq4PO7EY8Hys_DVXtent3HA@mail.gmail.com/
> > > > Link: https://bugs.archlinux.org/task/75138
> > > > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > >
> > > Gregory, care to take this version for a spin as well to double-check
> > > that it still resolves the issue? :)
> > >
> > > -Toke
> > >
> >
> > With patch v6, reads from /dev/hwrng block for 5-6s, during which 'ip link set
> > wlan0 down' will also block. Otherwise, 'ip link set wlan0 down' returns
> > immediately. Similarly, wiphy_suspend() consistently returns in under 10ms.
> >
> > Tested-by: Gregory Erwin <gregerwin256@gmail.com>
>
> Oh 5-6s... so it's actually worse. Yikes. Sounds like v4 might have
> been the better patch, then?
$ curl https://lore.kernel.org/lkml/20220627104955.534013-1-Jason@zx2c4.com/raw
| git am
That one, if you want to give it a spin and see if that 5-6s is back
to ~1s or less.
Jason
On Tue, Jun 28, 2022 at 12:48:50PM +0200, Jason A. Donenfeld wrote:
>
> $ curl https://lore.kernel.org/lkml/20220627104955.534013-1-Jason@zx2c4.com/raw
> | git am
>
> That one, if you want to give it a spin and see if that 5-6s is back
> to ~1s or less.
Whatever caused kthread_should_stop to return true should have
woken the thread and caused schedule_timeout to return.
If it's not waking the thread up then we should find out why.
Oh wait you're checking kthread_should_stop before the schedule
call instead of afterwards, that would do it.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Hi Herbert,
On Tue, Jun 28, 2022 at 12:52 PM Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Tue, Jun 28, 2022 at 12:48:50PM +0200, Jason A. Donenfeld wrote:
> >
> > $ curl https://lore.kernel.org/lkml/20220627104955.534013-1-Jason@zx2c4.com/raw
> > | git am
> >
> > That one, if you want to give it a spin and see if that 5-6s is back
> > to ~1s or less.
>
> Whatever caused kthread_should_stop to return true should have
> woken the thread and caused schedule_timeout to return.
>
> If it's not waking the thread up then we should find out why.
>
> Oh wait you're checking kthread_should_stop before the schedule
> call instead of afterwards, that would do it.
Oh, that's a really good observation, thank you! I'll send a patch that
does it right. I have to check kthread_should_stop() before, because the
wake up might have been consumed by a sleep inside of the hwrng
->read_data() function, causing it to return early. So we have to check
it before going to sleep again. And after, I need to be checking the
return value of schedule_timeout_interruptible(), which I'm not in this
latest. So v+1 coming up.
By the way, this thread might interest you:
https://lore.kernel.org/lkml/20220627145716.641185-1-Jason@zx2c4.com/
Jason
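The ordering discussed in this exchange (check for a stop request before sleeping, then treat an interrupted sleep as a reason to bail out) can be sketched in userspace. This is only a model of the loop shape from the ath9k hunk of the diff, not kernel code: struct model, model_should_stop(), model_sleep(), model_read_once() and model_read() are hypothetical stand-ins, with model_sleep() imitating the way schedule_timeout_interruptible() returns the remaining time when woken early and 0 when the full timeout elapses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel primitives. */
struct model {
	bool stop_requested;	/* models kthread_should_stop() */
	int wake_on_sleep;	/* 1-based sleep that gets interrupted, 0 = never */
	int data_on_attempt;	/* 1-based read that yields data, 0 = never */
	int sleeps;		/* number of times the loop went to sleep */
	int attempts;		/* number of read attempts made */
};

static bool model_should_stop(const struct model *m)
{
	return m->stop_requested;
}

/* Returns nonzero "remaining time" if woken early, 0 on full timeout. */
static long model_sleep(struct model *m, long timeout)
{
	m->sleeps++;
	if (m->wake_on_sleep && m->sleeps == m->wake_on_sleep) {
		m->stop_requested = true;	/* a stop request woke us */
		return timeout / 2;
	}
	return 0;
}

/* Pretend hardware read: 4 bytes on the chosen attempt, else nothing. */
static size_t model_read_once(struct model *m)
{
	m->attempts++;
	return (m->data_on_attempt && m->attempts == m->data_on_attempt) ? 4 : 0;
}

/* Same short-circuit order as the patched loop in the diff. */
static size_t model_read(struct model *m, bool wait, int max_fail)
{
	size_t bytes_read;
	int fail = 0;

	assert(max_fail > 0);
	for (;;) {
		bytes_read = model_read_once(m);
		if (!wait || bytes_read || ++fail >= max_fail ||
		    model_should_stop(m) ||	/* checked before sleeping */
		    model_sleep(m, 50))		/* woken early: give up */
			break;
	}
	return bytes_read;
}
```

With no data and no stop request, model_read(&m, true, 100) makes 100 attempts and sleeps 99 times. A stop request that is already pending ends the loop after one attempt with no sleep at all, and one that arrives during a sleep ends the loop as soon as that sleep returns.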
Hi again,
On Tue, Jun 28, 2022 at 12:55:57PM +0200, Jason A. Donenfeld wrote:
> > Oh wait you're checking kthread_should_stop before the schedule
> > call instead of afterwards, that would do it.
>
> Oh, that's a really good observation, thank you!
Wait, no. I already am checking kthread_should_stop() afterwards. That
"continue" goes to the top of the loop, which checks it in the while()
statement. So something else is amiss. I guess I'll investigate.
Jason
Hi Herbert,
On Tue, Jun 28, 2022 at 02:05:34PM +0200, Jason A. Donenfeld wrote:
> Hi again,
>
> On Tue, Jun 28, 2022 at 12:55:57PM +0200, Jason A. Donenfeld wrote:
> > > Oh wait you're checking kthread_should_stop before the schedule
> > > call instead of afterwards, that would do it.
> >
> > Oh, that's a really good observation, thank you!
>
> Wait, no. I already am checking kthread_should_stop() afterwards. That
> "continue" goes to the top of the loop, which checks it in the while()
> statement. So something else is amiss. I guess I'll investigate.
I worked out a solution for the "larger problem" now. It's a bit
involved, but not too bad. I'll post a patch shortly.
Jason