[PATCH v2] wifi: rtw89: phy: increase RF calibration timeouts for USB transport

Louis Kotze posted 1 patch 2 months ago
There is a newer version of this series
[PATCH v2] wifi: rtw89: phy: increase RF calibration timeouts for USB transport
Posted by Louis Kotze 2 months ago
USB transport adds significant latency to H2C/C2H round-trips used
by RF calibration. The existing timeout values were designed for PCIe
and are too tight for USB, causing "failed to wait RF DACK",
"failed to wait RF TSSI" and similar errors on USB adapters.

Apply a 4x timeout multiplier when the device uses USB transport.
The multiplier is applied in rtw89_phy_rfk_report_wait() so all
calibrations benefit without changing any call sites or PCIe
timeout values.

The 4x multiplier was chosen based on measured data from two
independent testers (RTL8922AU, 6GHz MLO and 2.4/5GHz):

  Calibration   PCIe timeout   Max measured (USB)   4x timeout
  PRE_NTFY           5ms              1ms              20ms
  DACK              58ms             72ms             232ms
  RX_DCK           128ms            374ms             512ms
  TSSI normal       20ms             24ms              80ms
  TSSI scan          6ms             14ms              24ms
  TXGAPK            54ms             18ms             216ms
  IQK               84ms             53ms             336ms
  DPK               34ms             30ms             136ms

Tested with RTL8922AU on 6GHz MLO (5GHz + 6GHz simultaneous):
25 connect/disconnect cycles with zero failures.
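
The choice of 4x can be sanity-checked against the first table with a
little arithmetic. The following is only a sketch: the struct and
function names are hypothetical and not part of the driver, and the
numbers are copied from the table above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical names; the figures are copied from the first table. */
struct rfk_budget {
	const char *name;
	int pcie_ms;     /* stock PCIe timeout */
	int max_usb_ms;  /* max measured over USB */
};

static const struct rfk_budget rfk_budgets[] = {
	{ "PRE_NTFY",      5,   1 },
	{ "DACK",         58,  72 },
	{ "RX_DCK",      128, 374 },
	{ "TSSI normal",  20,  24 },
	{ "TSSI scan",     6,  14 },
	{ "TXGAPK",       54,  18 },
	{ "IQK",          84,  53 },
	{ "DPK",          34,  30 },
};

#define RFK_BUDGET_COUNT (sizeof(rfk_budgets) / sizeof(rfk_budgets[0]))

/*
 * Smallest integer multiplier that puts every scaled timeout strictly
 * above its measured maximum.
 */
int min_multiplier_needed(void)
{
	int need = 1;

	for (size_t i = 0; i < RFK_BUDGET_COUNT; i++) {
		int m = rfk_budgets[i].max_usb_ms / rfk_budgets[i].pcie_ms + 1;

		if (m > need)
			need = m;
	}
	return need;
}

/*
 * Worst-case headroom in ms (scaled timeout minus measured maximum).
 * A negative result means at least one calibration would time out.
 */
int worst_headroom_ms(int mult)
{
	int worst = INT_MAX;

	assert(mult > 0);
	for (size_t i = 0; i < RFK_BUDGET_COUNT; i++) {
		int headroom = rfk_budgets[i].pcie_ms * mult -
			       rfk_budgets[i].max_usb_ms;

		if (headroom < worst)
			worst = headroom;
	}
	return worst;
}
```

For these figures, a 2x multiplier leaves RX_DCK and TSSI scan over
budget, 3x is the smallest integer multiplier that covers every row,
and 4x leaves a worst-case headroom of 10ms (TSSI scan).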

In response to review feedback on v1, the 4x multiplier was also
re-verified under adverse host conditions on 5GHz: 5 cycles per
scenario with stress-ng as the load generator. The table shows the
max observed time per calibration, in ms:

  Calibration  PCIe  4x   Baseline  CPU stress  Mem stress  Combined
  PRE_NTFY       5   20     0         0           0           1
  DACK          58  232    71 (!)    71 (!)      71 (!)      71 (!)
  RX_DCK       128  512    23        22          22          23
  IQK           84  336    53        53          53          53
  DPK           34  136    23        23          26          23
  TSSI          20   80     6         9          14           9
  TXGAPK        54  216    16        16          16          16

Legend: (!) = exceeds PCIe budget but within 4x budget.

Two observations from that matrix:

1. DACK exceeds the stock PCIe budget (58ms) in baseline on 5GHz
   on this hardware. Without the 4x multiplier, DACK fails with
   -ETIMEDOUT deterministically on every connect, with no stress
   needed. This is the specific bug the patch fixes.

2. Calibration times are I/O bound (USB H2C/C2H round-trip
   latency), not CPU or memory bound. DACK stays at 71ms across
   all four scenarios. Host-side stress has essentially zero
   effect on observed calibration duration, so raising the
   multiplier above 4x would not address any failure mode that
   this stress matrix produces.

Signed-off-by: Louis Kotze <loukot@gmail.com>
---
Changes since v1:
- Fix comment style per Ping-Ke Shih review (first line '/*' on its
  own line, kernel-standard format).
- Add stress-test verification table to the commit message. The 4x
  multiplier was re-measured on 5GHz under CPU stress, memory stress,
  and combined stress using stress-ng. DACK max is 71ms in all four
  scenarios, confirming calibration times are I/O bound (USB H2C/C2H
  round-trip) and not affected by host-side load.
- Drop v1 patch 2/2 ("make RF calibration timeouts non-fatal on USB").
  As Ping-Ke noted, the return code from rtw89_phy_rfk_*_and_wait() is
  discarded by all 8922a callers, making the non-fatal change a no-op
  for 8922a. Worse, the one 8922d caller that does check the return
  code (rtw8922d_rfk_tssi) uses it to fall back to non-TSSI mode on
  calibration failure, so patch 2/2 would have silently broken that
  fallback. With patch 1/2's multiplier alone, 25 connect/disconnect
  cycles complete with zero failures, and the new stress matrix above
  confirms the margin.
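
To illustrate why the dropped patch 2/2 would have broken the TSSI
fallback, here is a minimal model of the control flow. All names are
hypothetical stand-ins, not the driver's real functions or types; the
only behaviour taken from the discussion above is that the caller
falls back to non-TSSI mode when the wait reports failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the driver's real types or functions. */
enum tssi_mode { TSSI_MODE_ON, TSSI_MODE_OFF };

/*
 * Models a calibration wait: 0 on success, -ETIMEDOUT when the
 * calibration took longer than the budget. With non_fatal set, the
 * timeout is swallowed and reported as success.
 */
int rfk_wait(unsigned int elapsed_ms, unsigned int budget_ms, bool non_fatal)
{
	if (elapsed_ms <= budget_ms)
		return 0;
	return non_fatal ? 0 : -ETIMEDOUT;
}

/* Models a caller that falls back to non-TSSI mode on failure. */
enum tssi_mode tssi_calibrate(unsigned int elapsed_ms,
			      unsigned int budget_ms, bool non_fatal)
{
	int ret = rfk_wait(elapsed_ms, budget_ms, non_fatal);

	return ret ? TSSI_MODE_OFF : TSSI_MODE_ON;
}
```

In this model a 24ms calibration against a 20ms budget selects the
fallback when timeouts are fatal, but never does once they are made
non-fatal. Raising the budget to 80ms avoids the timeout and leaves
the fallback path intact.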

 drivers/net/wireless/realtek/rtw89/phy.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtw89/phy.c b/drivers/net/wireless/realtek/rtw89/phy.c
index e70d0e283..1f249c297 100644
--- a/drivers/net/wireless/realtek/rtw89/phy.c
+++ b/drivers/net/wireless/realtek/rtw89/phy.c
@@ -3956,6 +3956,14 @@ int rtw89_phy_rfk_report_wait(struct rtw89_dev *rtwdev, const char *rfk_name,
 	struct rtw89_rfk_wait_info *wait = &rtwdev->rfk_wait;
 	unsigned long time_left;
 
+	/*
+	 * USB transport adds latency to H2C/C2H round-trips, so RF
+	 * calibrations take longer than on PCIe. Apply a 4x multiplier
+	 * to avoid spurious timeouts.
+	 */
+	if (rtwdev->hci.type == RTW89_HCI_TYPE_USB)
+		ms *= 4;
+
 	/* Since we can't receive C2H event during SER, use a fixed delay. */
 	if (test_bit(RTW89_FLAG_SER_HANDLING, rtwdev->flags)) {
 		fsleep(1000 * ms / 2);
-- 
2.53.0

RE: [PATCH v2] wifi: rtw89: phy: increase RF calibration timeouts for USB transport
Posted by Ping-Ke Shih 2 months ago
Louis Kotze <loukot@gmail.com> wrote:
> USB transport adds significant latency to H2C/C2H round-trips used
> by RF calibration. The existing timeout values were designed for PCIe
> and are too tight for USB, causing "failed to wait RF DACK",
> "failed to wait RF TSSI" and similar errors on USB adapters.
> 
> Apply a 4x timeout multiplier when the device uses USB transport.
> The multiplier is applied in rtw89_phy_rfk_report_wait() so all
> calibrations benefit without changing any call sites or PCIe
> timeout values.
> 
> The 4x multiplier was chosen based on measured data from two
> independent testers (RTL8922AU, 6GHz MLO and 2.4/5GHz):
> 
>   Calibration   PCIe timeout   Max measured (USB)   4x timeout
>   PRE_NTFY           5ms              1ms              20ms
>   DACK              58ms             72ms             232ms
>   RX_DCK           128ms            374ms             512ms
>   TSSI normal       20ms             24ms              80ms
>   TSSI scan          6ms             14ms              24ms
>   TXGAPK            54ms             18ms             216ms
>   IQK               84ms             53ms             336ms
>   DPK               34ms             30ms             136ms
> 
> Tested with RTL8922AU on 6GHz MLO (5GHz + 6GHz simultaneous):
> 25 connect/disconnect cycles with zero failures.
> 
> In response to review feedback on v1, 

Can we remove this phrase? No need to mention v1 in commit message. 

> the 4x multiplier was also
> re-verified under adverse host conditions on 5GHz: 5 cycles per
> scenario with stress-ng as the load generator. The table shows the
> max observed time per calibration, in ms:
> 
>   Calibration  PCIe  4x   Baseline  CPU stress  Mem stress  Combined
>   PRE_NTFY       5   20     0         0           0           1
>   DACK          58  232    71 (!)    71 (!)      71 (!)      71 (!)
>   RX_DCK       128  512    23        22          22          23
>   IQK           84  336    53        53          53          53
>   DPK           34  136    23        23          26          23
>   TSSI          20   80     6         9          14           9
>   TXGAPK        54  216    16        16          16          16
> 
> Legend: (!) = exceeds PCIe budget but within 4x budget.
> 
> Two observations from that matrix:
> 
> 1. DACK exceeds the stock PCIe budget (58ms) in baseline on 5GHz
>    on this hardware. Without the 4x multiplier, DACK fails with
>    -ETIMEDOUT deterministically on every connect, with no stress
>    needed. This is the specific bug the patch fixes.

I'm not sure this should be called a "bug", as Bitterblue has not adjusted
these timeouts in an earlier version.

> 
> 2. Calibration times are I/O bound (USB H2C/C2H round-trip
>    latency), 

I'm also not sure if this is correct. The calibration time of DACK might
rely on WiFi hardware and external components, not only I/O speed. 

> not CPU or memory bound. DACK stays at 71ms across
>    all four scenarios. Host-side stress has essentially zero
>    effect on observed calibration duration, so raising the
>    multiplier above 4x would not address any failure mode that
>    this stress matrix produces.
> 
> Signed-off-by: Louis Kotze <loukot@gmail.com>

Otherwise, patch content looks good to me.

Acked-by: Ping-Ke Shih <pkshih@realtek.com>


Re: [PATCH v2] wifi: rtw89: phy: increase RF calibration timeouts for USB transport
Posted by Louis Kotze 2 months ago
> Can we remove this phrase? No need to mention v1 in commit message.

Done -- reworded to stand alone without referencing prior versions.

> I'm not sure this should be called a "bug", as Bitterblue has not
> adjusted these timeouts in an earlier version.

Fair point -- the timeouts were correct for PCIe; USB was not in
scope yet. Changed to "condition" instead of "bug".

> I'm also not sure if this is correct. The calibration time of DACK
> might rely on WiFi hardware and external components, not only I/O
> speed.

You're right, I overclaimed. Reworded to note that transport
round-trip latency appears to dominate under these test conditions,
but hardware and external component factors may also contribute.

All three changes are applied in v3. I also added Tested-by tags from
Devin Wittmayer covering RTL8922AU/8852AU/8852BU/8852CU (Framework 13
and Raspberry Pi 5), and a Reported-by tag with a link to his xHCI
hard lockup evidence.

Thank you for the review.