drivers/net/wireless/realtek/rtlwifi/efuse.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc5) we face another type of hung
task that comes from the same reproducer [1]. By investigating that, we
could narrow it to the following path:
(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.
(b) During the probe of rtl8192cu, the driver ends-up performing an efuse
read procedure (which is related to EEPROM load IIUC), and here lies the
issue: the function read_efuse() calls read_efuse_byte() many times, as
loop iterations depending on the efuse size (in our example, 512 in total).
This procedure for reading efuse bytes relies in a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds each. As a
consequence, we have the driver stuck in its probe routine for big time,
exposing a stack trace like below if we attempt to reboot the system, for
example:
task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
__schedule+0xe22/0xeb6
schedule_timeout+0xe7/0x132
__wait_for_common+0xb5/0x12e
usb_start_wait_urb+0xc5/0x1ef
? usb_alloc_urb+0x95/0xa4
usb_control_msg+0xff/0x184
_usbctrl_vendorreq_sync+0xa0/0x161
_usb_read_sync+0xb3/0xc5
read_efuse_byte+0x13c/0x146
read_efuse+0x351/0x5f0
efuse_read_all_map+0x42/0x52
rtl_efuse_shadow_map_update+0x60/0xef
rtl_get_hwinfo+0x5d/0x1c2
rtl92cu_read_eeprom_info+0x10a/0x8d5
? rtl92c_read_chip_version+0x14f/0x17e
rtl_usb_probe+0x323/0x851
usb_probe_interface+0x278/0x34b
really_probe+0x202/0x4a4
__driver_probe_device+0x166/0x1b2
driver_probe_device+0x2f/0xd8
[...]
We propose hereby to drastically reduce the attempts of doing the I/O reads
in case of failures, restricted to USB devices (given that they're inherently
slower than PCIe ones). By retrying up to 10 times (instead of 10000), we
got reponsiveness in the reproducer, while seems reasonable to believe
that there's no sane USB device implementation in the field requiring this
amount of retries at every I/O read in order to properly work. Based on
that assumption, it'd be good to have it backported to stable but maybe not
since driver implementation (the 10k number comes from day 0), perhaps up
to 6.x series makes sense.
[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race").
[1] A note about that: this syzkaller report presents multiple reproducers
that differs by the type of emulated USB device. For this specific case,
check the entry from 2024/08/08 06:23 in the list of crashes; the C repro
is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.
Cc: stable@vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---
V3:
- Switch variable declaration to reverse xmas tree order
(thanks Ping-Ke Shih for the suggestion).
V2:
- Restrict the change to USB device only (thanks Ping-Ke Shih).
- Tested in 2 USB devices by Bitterblue Smith - thanks a lot!
V1: https://lore.kernel.org/lkml/20241025150226.896613-1-gpiccoli@igalia.com/
drivers/net/wireless/realtek/rtlwifi/efuse.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c b/drivers/net/wireless/realtek/rtlwifi/efuse.c
index 82cf5fb5175f..0ff553f650f9 100644
--- a/drivers/net/wireless/realtek/rtlwifi/efuse.c
+++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c
@@ -162,9 +162,19 @@ void efuse_write_1byte(struct ieee80211_hw *hw, u16 address, u8 value)
void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
{
struct rtl_priv *rtlpriv = rtl_priv(hw);
+ u16 retry, max_attempts;
u32 value32;
u8 readbyte;
- u16 retry;
+
+ /*
+ * In case of USB devices, transfer speeds are limited, hence
+ * efuse I/O reads could be (way) slower. So, decrease (a lot)
+ * the read attempts in case of failures.
+ */
+ if (rtlpriv->rtlhal.interface == INTF_PCI)
+ max_attempts = 10000;
+ else
+ max_attempts = 10;
rtl_write_byte(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL] + 1,
(_offset & 0xff));
@@ -178,7 +188,7 @@ void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
retry = 0;
value32 = rtl_read_dword(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL]);
- while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10000)) {
+ while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < max_attempts)) {
value32 = rtl_read_dword(rtlpriv,
rtlpriv->cfg->maps[EFUSE_CTRL]);
retry++;
--
2.46.2
Guilherme G. Piccoli <gpiccoli@igalia.com> wrote: > diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c > b/drivers/net/wireless/realtek/rtlwifi/efuse.c > index 82cf5fb5175f..0ff553f650f9 100644 > --- a/drivers/net/wireless/realtek/rtlwifi/efuse.c > +++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c > @@ -162,9 +162,19 @@ void efuse_write_1byte(struct ieee80211_hw *hw, u16 address, u8 value) > void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf) > { > struct rtl_priv *rtlpriv = rtl_priv(hw); > + u16 retry, max_attempts; > u32 value32; > u8 readbyte; > - u16 retry; > + > + /* > + * In case of USB devices, transfer speeds are limited, hence > + * efuse I/O reads could be (way) slower. So, decrease (a lot) > + * the read attempts in case of failures. > + */ > + if (rtlpriv->rtlhal.interface == INTF_PCI) > + max_attempts = 10000; > + else > + max_attempts = 10; As your comment, setting max_atttempts to 10 under condition of INTF_USB would be more reasonable, like u16 max_attempts = 10000; if (rtlpriv->rtlhal.interface == INTF_USB) max_attempts = 10;
>> [...] >> struct rtl_priv *rtlpriv = rtl_priv(hw); >> + u16 retry, max_attempts; >> u32 value32; >> u8 readbyte; >> - u16 retry; >> + >> + /* >> + * In case of USB devices, transfer speeds are limited, hence >> + * efuse I/O reads could be (way) slower. So, decrease (a lot) >> + * the read attempts in case of failures. >> + */ >> + if (rtlpriv->rtlhal.interface == INTF_PCI) >> + max_attempts = 10000; >> + else >> + max_attempts = 10; > > As your comment, setting max_atttempts to 10 under condition of INTF_USB would > be more reasonable, like > > u16 max_attempts = 10000; > > if (rtlpriv->rtlhal.interface == INTF_USB) > max_attempts = 10; > OK, thanks. V4 just sent: https://lore.kernel.org/r/20241101193412.1390391-1-gpiccoli@igalia.com
> > [0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race"). > > [1] A note about that: this syzkaller report presents multiple reproducers > that differs by the type of emulated USB device. For this specific case, > check the entry from 2024/08/08 06:23 in the list of crashes; the C repro > is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000. Have you run checkpatch before submission? Above two information can be in formal form as suggestions of checkpatch. WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?) #55: in case of failures, restricted to USB devices (given that they're inherently WARNING: Reported-by: should be immediately followed by Closes: with a URL to the report #72: Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> WARNING: The commit message has 'syzkaller', perhaps it also needs a 'Fixes:' tag? total: 0 errors, 3 warnings, 0 checks, 28 lines checked > > Cc: stable@vger.kernel.org # v6.1+ > Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
On 31/10/2024 23:26, Ping-Ke Shih wrote: >> [...] > Have you run checkpatch before submission? Above two information can be in > formal form as suggestions of checkpatch. Yes, I always do that! Happens that checkpatch sometimes provides good advice (or even point to genuine errors), but sometimes...it's OK to ignore the warnings, specially if we have a reason. Below I detail more about my reasons: > > WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?) > #55: It's like 76 long line, and helped readability on the commit message. In any case, I'll refactor this one in V4, but notice that it'll continue complaining because of point "[0]". That line has 80 or 81 chars, don't recall, but if I reduce it by removing the word "Commit", for example, checkpatch complains I'm writing a commit reference out of preferred format heh So, lose-lose scenario, I can't make the tool fully happy here xD > WARNING: Reported-by: should be immediately followed by Closes: with a URL to the report > #72: This advice would make sense..weren't it for the fact that this Syzkaller issue is already closed heh As I mentioned in the commit message, the main issue with this reproducer is a race in the uevent path vs driver shutdown, addressed by other commit, hence the Syzkaller report is fixed and closed. But...this patch fixes a secondary issue there, and pointing to the Syzkaller issue is useful first to give it credit, since both issues were found by the tool, but also to point to the reproducer, so I kept the Reported tag, but not the Closes one =) > WARNING: The commit message has 'syzkaller', perhaps it also needs a 'Fixes:' tag? > The fixes tag would point to the driver addition, ~10y ago. Stable bots would attempt to backport it for all releases, which I think maybe is not necessary. This is a small issue restricted to (old?) USB devices and emulated devices (in reproducers), so in my commit message instead of adding a Fixes tag, I've added Cc to stable with my suggestion of the versions to backport (6.1.y and 6.6.y basically). That decision is up to you and other maintainers, so feel free to chime in if you prefer to backport to all releases or even *not* backport it at all, I just provided a suggestion based on my understanding about the issue and the relevance of this fix =] Cheers, Guilherme
© 2016 - 2024 Red Hat, Inc.