[PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)

Simon Schippers posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
drivers/net/usb/usbnet.c | 8 ++++++++
1 file changed, 8 insertions(+)
[PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Simon Schippers 1 month, 2 weeks ago
During recent testing, I observed significant latency spikes when using
Quectel 5G modems under load. Investigation revealed that the issue was
caused by bufferbloat in the usbnet driver.

In the current implementation, usbnet uses a fixed tx_qlen of:

USB2: 60 * 1518 bytes = 91.08 KB
USB3: 60 * 5 * 1518 bytes = 455.40 KB

Such large transmit queues can be problematic, especially for cellular
modems. For example, with a typical cellular link speed of 10 Mbit/s, a
fully occupied USB3 transmit queue results in:

455.40 KB / (10 Mbit/s / 8 bit/byte) = 364.32 ms

of additional latency.
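
The arithmetic above can be rechecked with a small Python sketch. The
function and constant names are illustrative only and are not taken from
usbnet.c; the factors (60 frames, 5x for USB3, 1518 bytes per frame) are
the ones stated above. With them, the USB3 queue comes out at roughly
455 KB, or about 364 ms at 10 Mbit/s.

```python
ETH_FRAME_BYTES = 1518   # frame size used in the figures above
BASE_FRAMES = 60         # frames' worth of memory queued at USB2 speed
USB3_FACTOR = 5          # multiplier applied at USB3 speed


def queue_bytes(usb3: bool) -> int:
    """Return the fixed transmit queue size in bytes."""
    factor = USB3_FACTOR if usb3 else 1
    return BASE_FRAMES * factor * ETH_FRAME_BYTES


def drain_time_ms(num_bytes: int, link_mbit_per_s: float) -> float:
    """Time needed to drain num_bytes over the given link rate, in ms."""
    bytes_per_s = link_mbit_per_s * 1_000_000 / 8
    return num_bytes / bytes_per_s * 1000


if __name__ == "__main__":
    for name, usb3 in (("USB2", False), ("USB3", True)):
        size = queue_bytes(usb3)
        print(f"{name}: {size / 1000:.2f} KB, "
              f"{drain_time_ms(size, 10):.2f} ms at 10 Mbit/s")
```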

To address this issue, this patch introduces support for
Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
limits the amount of data queued in the driver, effectively reducing
latency without impacting throughput.
This implementation was successfully tested on several devices as
described in the commit.
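
To illustrate the idea of limiting queued bytes rather than queued
packets, here is a toy model in Python. It is only a sketch under
simplifying assumptions: the class and method names are hypothetical, the
limit is fixed, and the dynamic adaptation of the limit that the kernel's
BQL performs at completion time is left out. In a real driver the
accounting is reported through netdev_sent_queue() and
netdev_completed_queue().

```python
class ByteLimitedQueue:
    """Toy model of byte-based TX accounting (fixed limit, no adaptation)."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.inflight = 0       # bytes handed to hardware, not yet completed
        self.stopped = False    # True while the stack should not send more

    def sent(self, nbytes: int) -> None:
        """Account for a packet handed to the hardware."""
        self.inflight += nbytes
        if self.inflight > self.limit_bytes:
            self.stopped = True

    def completed(self, nbytes: int) -> None:
        """Account for a finished transmission; reopen once within limit."""
        self.inflight -= nbytes
        if self.inflight <= self.limit_bytes:
            self.stopped = False


if __name__ == "__main__":
    q = ByteLimitedQueue(limit_bytes=3000)
    for _ in range(3):
        q.sent(1500)
        print(f"inflight={q.inflight} stopped={q.stopped}")
    q.completed(1500)
    print(f"inflight={q.inflight} stopped={q.stopped}")
```

Note that the queue only stops after the limit has been exceeded, so one
large packet can still overshoot the limit in this model.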



Future work

Due to offloading, TCP often produces SKBs up to 64 KB in size. To
further decrease bufferbloat, I tried disabling TSO, GSO and LRO, but
this did not have the intended effect in my tests. The only dirty
workaround I found so far was to call netif_stop_queue() whenever BQL
sets __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue
would be desirable.

I also plan to publish a scientific paper on this topic in the near
future.

Thanks,
Simon

[1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
[2] https://lwn.net/Articles/469652/

Simon Schippers (1):
  usbnet: Add support for Byte Queue Limits (BQL)

 drivers/net/usb/usbnet.c | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.43.0
Re: [PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Eric Dumazet 1 month, 2 weeks ago
On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
<simon.schippers@tu-dortmund.de> wrote:
>
> During recent testing, I observed significant latency spikes when using
> Quectel 5G modems under load. Investigation revealed that the issue was
> caused by bufferbloat in the usbnet driver.
>
> In the current implementation, usbnet uses a fixed tx_qlen of:
>
> USB2: 60 * 1518 bytes = 91.08 KB
> USB3: 60 * 5 * 1518 bytes = 454.80 KB
>
> Such large transmit queues can be problematic, especially for cellular
> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
> fully occupied USB3 transmit queue results in:
>
> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
>
> of additional latency.

Doesn't 5G need to push more packets to the driver to get good aggregation?

>
> To address this issue, this patch introduces support for
> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
> limits the amount of data queued in the driver, effectively reducing
> latency without impacting throughput.
> This implementation was successfully tested on several devices as
> described in the commit.
>
>
>
> Future work
>
> Due to offloading, TCP often produces SKBs up to 64 KB in size.

Only for rates > 500 Mbit/s. Since BQL was introduced, the stack has
gained many more improvements.
https://lwn.net/Articles/564978/


> To
> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
> did not have the intended effect in my tests. The only dirty workaround I
> found so far was to call netif_stop_queue() whenever BQL sets
> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
> be desirable.
>
> I also plan to publish a scientific paper on this topic in the near
> future.
>
> Thanks,
> Simon
>
> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
> [2] https://lwn.net/Articles/469652/
>
> Simon Schippers (1):
>   usbnet: Add support for Byte Queue Limits (BQL)
>
>  drivers/net/usb/usbnet.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> --
> 2.43.0
>
[PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Simon Schippers 1 month, 1 week ago
On 11/4/25 18:02, Eric Dumazet wrote:
> On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
> <simon.schippers@tu-dortmund.de> wrote:
>>
>> During recent testing, I observed significant latency spikes when using
>> Quectel 5G modems under load. Investigation revealed that the issue was
>> caused by bufferbloat in the usbnet driver.
>>
>> In the current implementation, usbnet uses a fixed tx_qlen of:
>>
>> USB2: 60 * 1518 bytes = 91.08 KB
>> USB3: 60 * 5 * 1518 bytes = 454.80 KB
>>
>> Such large transmit queues can be problematic, especially for cellular
>> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
>> fully occupied USB3 transmit queue results in:
>>
>> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
>>
>> of additional latency.
> 
> Doesn't 5G need to push more packets to the driver to get good aggregation ?
> 

Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
aggregate enough packets for a frame but not of several hundred ms as
calculated in my example. And yes, there are situations where 5G,
especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
required. But the dynamic queue limit approach of BQL should be well
suited for these varying speeds.
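
To put rough numbers on "a few ms", the following Python sketch computes
how many bytes correspond to a given queueing delay at a given link rate.
The 5 ms target and the function name are assumptions chosen purely for
illustration, not measured values.

```python
def queue_bytes_for_delay(rate_mbit_per_s: float, delay_ms: float) -> int:
    """Bytes that take delay_ms to transmit at rate_mbit_per_s."""
    bytes_per_s = rate_mbit_per_s * 1_000_000 / 8
    return round(bytes_per_s * delay_ms / 1000)


if __name__ == "__main__":
    for rate in (10, 100, 1000):
        size = queue_bytes_for_delay(rate, delay_ms=5)
        print(f"{rate:>5} Mbit/s, 5 ms of queue: {size} bytes")
```

Under this assumption, a fixed queue of several hundred KB only matches a
few ms of delay at rates near the Gbit/s range, which is why a limit that
follows the actual rate seems preferable to a static one.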

>>
>> To address this issue, this patch introduces support for
>> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
>> limits the amount of data queued in the driver, effectively reducing
>> latency without impacting throughput.
>> This implementation was successfully tested on several devices as
>> described in the commit.
>>
>>
>>
>> Future work
>>
>> Due to offloading, TCP often produces SKBs up to 64 KB in size.
> 
> Only for rates > 500 Mbit. After BQL, we had many more improvements in
> the stack.
> https://lwn.net/Articles/564978/
> 
> 

I also saw these large SKBs with, for example, my USB2 Android tethering,
which advertises a network speed of < 500 Mbit/s.
I observed them by reading the following file:

cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight

For UDP-only traffic, inflight always maxed out at MTU size.
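
For repeated sampling, the same directory can be read from a script. This
is a minimal sketch: the function name and the sysfs_root parameter are
my own additions (the latter only so that the function can be pointed at
a test directory), and it simply reads whatever integer-valued files are
present instead of assuming a fixed set of attributes.

```python
from pathlib import Path


def read_bql(interface: str, queue: int = 0,
             sysfs_root: str = "/sys/class/net") -> dict[str, int]:
    """Read the integer BQL attributes of one TX queue."""
    bql_dir = (Path(sysfs_root) / interface / "queues"
               / f"tx-{queue}" / "byte_queue_limits")
    values: dict[str, int] = {}
    for entry in sorted(bql_dir.iterdir()):
        if not entry.is_file():
            continue
        try:
            text = entry.read_text().strip()
        except OSError:
            continue  # attribute not readable, skip it
        if text.isdigit():
            values[entry.name] = int(text)
    return values


if __name__ == "__main__":
    import sys
    # Usage: python3 read_bql.py INTERFACE
    print(read_bql(sys.argv[1]))
```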

Thank you for your replies!

>> To
>> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
>> did not have the intended effect in my tests. The only dirty workaround I
>> found so far was to call netif_stop_queue() whenever BQL sets
>> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
>> be desirable.
>>
>> I also plan to publish a scientific paper on this topic in the near
>> future.
>>
>> Thanks,
>> Simon
>>
>> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
>> [2] https://lwn.net/Articles/469652/
>>
>> Simon Schippers (1):
>>   usbnet: Add support for Byte Queue Limits (BQL)
>>
>>  drivers/net/usb/usbnet.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> --
>> 2.43.0
>>
Re: [PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Eric Dumazet 1 month, 1 week ago
On Wed, Nov 5, 2025 at 2:40 AM Simon Schippers
<simon.schippers@tu-dortmund.de> wrote:
>
> On 11/4/25 18:02, Eric Dumazet wrote:
> > On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
> > <simon.schippers@tu-dortmund.de> wrote:
> >>
> >> During recent testing, I observed significant latency spikes when using
> >> Quectel 5G modems under load. Investigation revealed that the issue was
> >> caused by bufferbloat in the usbnet driver.
> >>
> >> In the current implementation, usbnet uses a fixed tx_qlen of:
> >>
> >> USB2: 60 * 1518 bytes = 91.08 KB
> >> USB3: 60 * 5 * 1518 bytes = 454.80 KB
> >>
> >> Such large transmit queues can be problematic, especially for cellular
> >> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
> >> fully occupied USB3 transmit queue results in:
> >>
> >> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
> >>
> >> of additional latency.
> >
> > Doesn't 5G need to push more packets to the driver to get good aggregation ?
> >
>
> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
> aggregate enough packets for a frame but not of several hundred ms as
> calculated in my example. And yes, there are situations where 5G,
> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
> required. But the dynamic queue limit approach of BQL should be well
> suited for these varying speeds.
>
> >>
> >> To address this issue, this patch introduces support for
> >> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
> >> limits the amount of data queued in the driver, effectively reducing
> >> latency without impacting throughput.
> >> This implementation was successfully tested on several devices as
> >> described in the commit.
> >>
> >>
> >>
> >> Future work
> >>
> >> Due to offloading, TCP often produces SKBs up to 64 KB in size.
> >
> > Only for rates > 500 Mbit. After BQL, we had many more improvements in
> > the stack.
> > https://lwn.net/Articles/564978/
> >
> >
>
> I also saw these large SKBs, for example, for my USB2 Android tethering,
> which advertises a network speed of < 500 Mbit/s.
> I saw these large SKBs by looking at the file:

TCP does not sense the underlying network speed. This would be moot if
a link is shared by one thousand flows...
The rate is determined by CWND * MSS / RTT.
Some congestion control algorithms tend to inflate CWND to a very large
value, hence bufferbloat.
One of BBR's goals is to avoid bufferbloat.
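
The CWND * MSS / RTT relation can be evaluated with a short Python
sketch. The function name and the example values (MSS of 1448 bytes, RTT
of 50 ms, extra delay of 364 ms) are illustrative assumptions only.

```python
def tcp_rate_mbit_per_s(cwnd_segments: int, mss_bytes: int,
                        rtt_ms: float) -> float:
    """Approximate sending rate CWND * MSS / RTT, in Mbit/s."""
    bytes_per_s = cwnd_segments * mss_bytes / (rtt_ms / 1000)
    return bytes_per_s * 8 / 1_000_000


if __name__ == "__main__":
    base = tcp_rate_mbit_per_s(cwnd_segments=10, mss_bytes=1448, rtt_ms=50)
    # Same window, but with extra queueing delay added to the RTT.
    bloated = tcp_rate_mbit_per_s(cwnd_segments=10, mss_bytes=1448,
                                  rtt_ms=50 + 364)
    print(f"base RTT:    {base:.3f} Mbit/s")
    print(f"bloated RTT: {bloated:.3f} Mbit/s")
```

For a fixed window, a larger RTT gives a lower rate, so a sender that
wants to keep its rate while queues fill has to grow CWND further.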

BQL is only a part of the solution.

Disabling TSO/GSO is certainly not part of the solution, you can trust
me on this.

>
> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
>
> For UDP-only traffic, inflight always maxed out at MTU size.
>
> Thank you for your replies!
>
> >> To
> >> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
> >> did not have the intended effect in my tests. The only dirty workaround I
> >> found so far was to call netif_stop_queue() whenever BQL sets
> >> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
> >> be desirable.
> >>
> >> I also plan to publish a scientific paper on this topic in the near
> >> future.
> >>
> >> Thanks,
> >> Simon
> >>
> >> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
> >> [2] https://lwn.net/Articles/469652/
> >>
> >> Simon Schippers (1):
> >>   usbnet: Add support for Byte Queue Limits (BQL)
> >>
> >>  drivers/net/usb/usbnet.c | 8 ++++++++
> >>  1 file changed, 8 insertions(+)
> >>
> >> --
> >> 2.43.0
> >>
Re: [PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Daniele Palmas 1 month, 1 week ago
Hello Simon,

On Wed, 5 Nov 2025 at 11:40, Simon Schippers
<simon.schippers@tu-dortmund.de> wrote:
>
> On 11/4/25 18:02, Eric Dumazet wrote:
> > On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
> > <simon.schippers@tu-dortmund.de> wrote:
> >>
> >> During recent testing, I observed significant latency spikes when using
> >> Quectel 5G modems under load. Investigation revealed that the issue was
> >> caused by bufferbloat in the usbnet driver.
> >>
> >> In the current implementation, usbnet uses a fixed tx_qlen of:
> >>
> >> USB2: 60 * 1518 bytes = 91.08 KB
> >> USB3: 60 * 5 * 1518 bytes = 454.80 KB
> >>
> >> Such large transmit queues can be problematic, especially for cellular
> >> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
> >> fully occupied USB3 transmit queue results in:
> >>
> >> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
> >>
> >> of additional latency.
> >
> > Doesn't 5G need to push more packets to the driver to get good aggregation ?
> >
>
> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
> aggregate enough packets for a frame but not of several hundred ms as
> calculated in my example. And yes, there are situations where 5G,
> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
> required. But the dynamic queue limit approach of BQL should be well
> suited for these varying speeds.
>

Out of curiosity, regarding the test with the Quectel 5G modem: did you
test with aggregation enabled through QMAP (kernel module rmnet), or
simply with qmi_wwan raw_ip?

Regards,
Daniele

> >>
> >> To address this issue, this patch introduces support for
> >> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
> >> limits the amount of data queued in the driver, effectively reducing
> >> latency without impacting throughput.
> >> This implementation was successfully tested on several devices as
> >> described in the commit.
> >>
> >>
> >>
> >> Future work
> >>
> >> Due to offloading, TCP often produces SKBs up to 64 KB in size.
> >
> > Only for rates > 500 Mbit. After BQL, we had many more improvements in
> > the stack.
> > https://lwn.net/Articles/564978/
> >
> >
>
> I also saw these large SKBs, for example, for my USB2 Android tethering,
> which advertises a network speed of < 500 Mbit/s.
> I saw these large SKBs by looking at the file:
>
> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
>
> For UDP-only traffic, inflight always maxed out at MTU size.
>
> Thank you for your replies!
>
> >> To
> >> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
> >> did not have the intended effect in my tests. The only dirty workaround I
> >> found so far was to call netif_stop_queue() whenever BQL sets
> >> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
> >> be desirable.
> >>
> >> I also plan to publish a scientific paper on this topic in the near
> >> future.
> >>
> >> Thanks,
> >> Simon
> >>
> >> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
> >> [2] https://lwn.net/Articles/469652/
> >>
> >> Simon Schippers (1):
> >>   usbnet: Add support for Byte Queue Limits (BQL)
> >>
> >>  drivers/net/usb/usbnet.c | 8 ++++++++
> >>  1 file changed, 8 insertions(+)
> >>
> >> --
> >> 2.43.0
> >>
>
[PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Simon Schippers 1 month, 1 week ago
On 11/5/25 11:35, Daniele Palmas wrote:
> Hello Simon,
> 
> Il giorno mer 5 nov 2025 alle ore 11:40 Simon Schippers
> <simon.schippers@tu-dortmund.de> ha scritto:
>>
>> On 11/4/25 18:02, Eric Dumazet wrote:
>>> On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
>>> <simon.schippers@tu-dortmund.de> wrote:
>>>>
>>>> During recent testing, I observed significant latency spikes when using
>>>> Quectel 5G modems under load. Investigation revealed that the issue was
>>>> caused by bufferbloat in the usbnet driver.
>>>>
>>>> In the current implementation, usbnet uses a fixed tx_qlen of:
>>>>
>>>> USB2: 60 * 1518 bytes = 91.08 KB
>>>> USB3: 60 * 5 * 1518 bytes = 454.80 KB
>>>>
>>>> Such large transmit queues can be problematic, especially for cellular
>>>> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
>>>> fully occupied USB3 transmit queue results in:
>>>>
>>>> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
>>>>
>>>> of additional latency.
>>>
>>> Doesn't 5G need to push more packets to the driver to get good aggregation ?
>>>
>>
>> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
>> aggregate enough packets for a frame but not of several hundred ms as
>> calculated in my example. And yes, there are situations where 5G,
>> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
>> required. But the dynamic queue limit approach of BQL should be well
>> suited for these varying speeds.
>>
> 
> out of curiosity, related to the test with 5G Quectel, did you test
> enabling aggregation through QMAP (kernel module rmnet) or simply
> qmi_wwan raw_ip ?
> 
> Regards,
> Daniele
> 

Hi Daniele,

I simply used qmi_wwan. I have actually never touched rmnet before.
Is aggregation through QMAP what you and Eric mean by aggregation?
If so, I misunderstood: I was thinking about aggregating enough (and not
too many) packets in the usbnet queue.

Thanks

>>>>
>>>> To address this issue, this patch introduces support for
>>>> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
>>>> limits the amount of data queued in the driver, effectively reducing
>>>> latency without impacting throughput.
>>>> This implementation was successfully tested on several devices as
>>>> described in the commit.
>>>>
>>>>
>>>>
>>>> Future work
>>>>
>>>> Due to offloading, TCP often produces SKBs up to 64 KB in size.
>>>
>>> Only for rates > 500 Mbit. After BQL, we had many more improvements in
>>> the stack.
>>> https://lwn.net/Articles/564978/
>>>
>>>
>>
>> I also saw these large SKBs, for example, for my USB2 Android tethering,
>> which advertises a network speed of < 500 Mbit/s.
>> I saw these large SKBs by looking at the file:
>>
>> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
>>
>> For UDP-only traffic, inflight always maxed out at MTU size.
>>
>> Thank you for your replies!
>>
>>>> To
>>>> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
>>>> did not have the intended effect in my tests. The only dirty workaround I
>>>> found so far was to call netif_stop_queue() whenever BQL sets
>>>> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
>>>> be desirable.
>>>>
>>>> I also plan to publish a scientific paper on this topic in the near
>>>> future.
>>>>
>>>> Thanks,
>>>> Simon
>>>>
>>>> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
>>>> [2] https://lwn.net/Articles/469652/
>>>>
>>>> Simon Schippers (1):
>>>>   usbnet: Add support for Byte Queue Limits (BQL)
>>>>
>>>>  drivers/net/usb/usbnet.c | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> --
>>>> 2.43.0
>>>>
>>
Re: [PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Daniele Palmas 1 month, 1 week ago
Hi Simon,

On Wed, 5 Nov 2025 at 12:05, Simon Schippers
<simon.schippers@tu-dortmund.de> wrote:
>
> On 11/5/25 11:35, Daniele Palmas wrote:
> > Hello Simon,
> >
> > Il giorno mer 5 nov 2025 alle ore 11:40 Simon Schippers
> > <simon.schippers@tu-dortmund.de> ha scritto:
> >>
> >> On 11/4/25 18:02, Eric Dumazet wrote:
> >>> On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
> >>> <simon.schippers@tu-dortmund.de> wrote:
> >>>>
> >>>> During recent testing, I observed significant latency spikes when using
> >>>> Quectel 5G modems under load. Investigation revealed that the issue was
> >>>> caused by bufferbloat in the usbnet driver.
> >>>>
> >>>> In the current implementation, usbnet uses a fixed tx_qlen of:
> >>>>
> >>>> USB2: 60 * 1518 bytes = 91.08 KB
> >>>> USB3: 60 * 5 * 1518 bytes = 454.80 KB
> >>>>
> >>>> Such large transmit queues can be problematic, especially for cellular
> >>>> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
> >>>> fully occupied USB3 transmit queue results in:
> >>>>
> >>>> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
> >>>>
> >>>> of additional latency.
> >>>
> >>> Doesn't 5G need to push more packets to the driver to get good aggregation ?
> >>>
> >>
> >> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
> >> aggregate enough packets for a frame but not of several hundred ms as
> >> calculated in my example. And yes, there are situations where 5G,
> >> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
> >> required. But the dynamic queue limit approach of BQL should be well
> >> suited for these varying speeds.
> >>
> >
> > out of curiosity, related to the test with 5G Quectel, did you test
> > enabling aggregation through QMAP (kernel module rmnet) or simply
> > qmi_wwan raw_ip ?
> >
> > Regards,
> > Daniele
> >
>
> Hi Daniele,
>
> I simply used qmi_wwan. I actually never touched rmnet before.
> Is the aggregation through QMAP what you and Eric mean with aggregation?
> Because then I misunderstood it, because I was thinking about aggregating
> enough (and not too many) packets in the usbnet queue.
>

I can't speak for Eric, but yes, that is what I meant by aggregation.
This is the common way those high-category modems are used. It's not
clear to me whether the change you are proposing could have any impact
when rmnet is used; that's why I was asking about the test conditions.

Thanks,
Daniele

> Thanks
>
> >>>>
> >>>> To address this issue, this patch introduces support for
> >>>> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
> >>>> limits the amount of data queued in the driver, effectively reducing
> >>>> latency without impacting throughput.
> >>>> This implementation was successfully tested on several devices as
> >>>> described in the commit.
> >>>>
> >>>>
> >>>>
> >>>> Future work
> >>>>
> >>>> Due to offloading, TCP often produces SKBs up to 64 KB in size.
> >>>
> >>> Only for rates > 500 Mbit. After BQL, we had many more improvements in
> >>> the stack.
> >>> https://lwn.net/Articles/564978/
> >>>
> >>>
> >>
> >> I also saw these large SKBs, for example, for my USB2 Android tethering,
> >> which advertises a network speed of < 500 Mbit/s.
> >> I saw these large SKBs by looking at the file:
> >>
> >> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
> >>
> >> For UDP-only traffic, inflight always maxed out at MTU size.
> >>
> >> Thank you for your replies!
> >>
> >>>> To
> >>>> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
> >>>> did not have the intended effect in my tests. The only dirty workaround I
> >>>> found so far was to call netif_stop_queue() whenever BQL sets
> >>>> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
> >>>> be desirable.
> >>>>
> >>>> I also plan to publish a scientific paper on this topic in the near
> >>>> future.
> >>>>
> >>>> Thanks,
> >>>> Simon
> >>>>
> >>>> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
> >>>> [2] https://lwn.net/Articles/469652/
> >>>>
> >>>> Simon Schippers (1):
> >>>>   usbnet: Add support for Byte Queue Limits (BQL)
> >>>>
> >>>>  drivers/net/usb/usbnet.c | 8 ++++++++
> >>>>  1 file changed, 8 insertions(+)
> >>>>
> >>>> --
> >>>> 2.43.0
> >>>>
> >>
[PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Simon Schippers 1 month, 1 week ago
On 11/6/25 09:38, Daniele Palmas wrote:
> Hi Simon,
> 
> Il giorno mer 5 nov 2025 alle ore 12:05 Simon Schippers
> <simon.schippers@tu-dortmund.de> ha scritto:
>>
>> On 11/5/25 11:35, Daniele Palmas wrote:
>>> Hello Simon,
>>>
>>> Il giorno mer 5 nov 2025 alle ore 11:40 Simon Schippers
>>> <simon.schippers@tu-dortmund.de> ha scritto:
>>>>
>>>> On 11/4/25 18:02, Eric Dumazet wrote:
>>>>> On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
>>>>> <simon.schippers@tu-dortmund.de> wrote:
>>>>>>
>>>>>> During recent testing, I observed significant latency spikes when using
>>>>>> Quectel 5G modems under load. Investigation revealed that the issue was
>>>>>> caused by bufferbloat in the usbnet driver.
>>>>>>
>>>>>> In the current implementation, usbnet uses a fixed tx_qlen of:
>>>>>>
>>>>>> USB2: 60 * 1518 bytes = 91.08 KB
>>>>>> USB3: 60 * 5 * 1518 bytes = 454.80 KB
>>>>>>
>>>>>> Such large transmit queues can be problematic, especially for cellular
>>>>>> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
>>>>>> fully occupied USB3 transmit queue results in:
>>>>>>
>>>>>> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
>>>>>>
>>>>>> of additional latency.
>>>>>
>>>>> Doesn't 5G need to push more packets to the driver to get good aggregation ?
>>>>>
>>>>
>>>> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
>>>> aggregate enough packets for a frame but not of several hundred ms as
>>>> calculated in my example. And yes, there are situations where 5G,
>>>> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
>>>> required. But the dynamic queue limit approach of BQL should be well
>>>> suited for these varying speeds.
>>>>
>>>
>>> out of curiosity, related to the test with 5G Quectel, did you test
>>> enabling aggregation through QMAP (kernel module rmnet) or simply
>>> qmi_wwan raw_ip ?
>>>
>>> Regards,
>>> Daniele
>>>
>>
>> Hi Daniele,
>>
>> I simply used qmi_wwan. I actually never touched rmnet before.
>> Is the aggregation through QMAP what you and Eric mean with aggregation?
>> Because then I misunderstood it, because I was thinking about aggregating
>> enough (and not too many) packets in the usbnet queue.
>>
> 
> I can't speak for Eric, but, yes, that is what I meant for
> aggregation, this is the common way those high-cat modems are used:

Hi Daniele,

I think I *really* have to take a look at rmnet and aggregation through
QMAP for future projects :)

> it's not clear to me if the change you are proposing could have any
> impact when rmnet is used, that's why I was asking the test
> conditions.
> 
> Thanks,
> Daniele
> 

This patch has an impact on the underlying USB physical transport of
rmnet. From my understanding, the call stack is as follows:

rmnet_map_tx_aggregate or rmnet_send_skb

|
| Calling dev_queue_xmit(skb)
V

qmi_wwan used for USB modem

|
|  ndo_start_xmit(skb, net) is called
V

usbnet_start_xmit is executed, where this patch dynamically adjusts the
size of the internal queue using the Byte Queue Limits algorithm.

Correct me if I am wrong, but I think in the end usbnet is used.

Thanks,
Simon

>> Thanks
>>
>>>>>>
>>>>>> To address this issue, this patch introduces support for
>>>>>> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
>>>>>> limits the amount of data queued in the driver, effectively reducing
>>>>>> latency without impacting throughput.
>>>>>> This implementation was successfully tested on several devices as
>>>>>> described in the commit.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Future work
>>>>>>
>>>>>> Due to offloading, TCP often produces SKBs up to 64 KB in size.
>>>>>
>>>>> Only for rates > 500 Mbit. After BQL, we had many more improvements in
>>>>> the stack.
>>>>> https://lwn.net/Articles/564978/
>>>>>
>>>>>
>>>>
>>>> I also saw these large SKBs, for example, for my USB2 Android tethering,
>>>> which advertises a network speed of < 500 Mbit/s.
>>>> I saw these large SKBs by looking at the file:
>>>>
>>>> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
>>>>
>>>> For UDP-only traffic, inflight always maxed out at MTU size.
>>>>
>>>> Thank you for your replies!
>>>>
>>>>>> To
>>>>>> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
>>>>>> did not have the intended effect in my tests. The only dirty workaround I
>>>>>> found so far was to call netif_stop_queue() whenever BQL sets
>>>>>> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
>>>>>> be desirable.
>>>>>>
>>>>>> I also plan to publish a scientific paper on this topic in the near
>>>>>> future.
>>>>>>
>>>>>> Thanks,
>>>>>> Simon
>>>>>>
>>>>>> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
>>>>>> [2] https://lwn.net/Articles/469652/
>>>>>>
>>>>>> Simon Schippers (1):
>>>>>>   usbnet: Add support for Byte Queue Limits (BQL)
>>>>>>
>>>>>>  drivers/net/usb/usbnet.c | 8 ++++++++
>>>>>>  1 file changed, 8 insertions(+)
>>>>>>
>>>>>> --
>>>>>> 2.43.0
>>>>>>
>>>>
Re: [PATCH net-next v1 0/1] usbnet: Add support for Byte Queue Limits (BQL)
Posted by Daniele Palmas 1 month, 1 week ago
Hi Simon,

On Thu, 6 Nov 2025 at 11:00, Simon Schippers
<simon.schippers@tu-dortmund.de> wrote:
>
> On 11/6/25 09:38, Daniele Palmas wrote:
> > Hi Simon,
> >
> > Il giorno mer 5 nov 2025 alle ore 12:05 Simon Schippers
> > <simon.schippers@tu-dortmund.de> ha scritto:
> >>
> >> On 11/5/25 11:35, Daniele Palmas wrote:
> >>> Hello Simon,
> >>>
> >>> Il giorno mer 5 nov 2025 alle ore 11:40 Simon Schippers
> >>> <simon.schippers@tu-dortmund.de> ha scritto:
> >>>>
> >>>> On 11/4/25 18:02, Eric Dumazet wrote:
> >>>>> On Tue, Nov 4, 2025 at 8:14 AM Simon Schippers
> >>>>> <simon.schippers@tu-dortmund.de> wrote:
> >>>>>>
> >>>>>> During recent testing, I observed significant latency spikes when using
> >>>>>> Quectel 5G modems under load. Investigation revealed that the issue was
> >>>>>> caused by bufferbloat in the usbnet driver.
> >>>>>>
> >>>>>> In the current implementation, usbnet uses a fixed tx_qlen of:
> >>>>>>
> >>>>>> USB2: 60 * 1518 bytes = 91.08 KB
> >>>>>> USB3: 60 * 5 * 1518 bytes = 454.80 KB
> >>>>>>
> >>>>>> Such large transmit queues can be problematic, especially for cellular
> >>>>>> modems. For example, with a typical celluar link speed of 10 Mbit/s, a
> >>>>>> fully occupied USB3 transmit queue results in:
> >>>>>>
> >>>>>> 454.80 KB / (10 Mbit/s / 8 bit/byte) = 363.84 ms
> >>>>>>
> >>>>>> of additional latency.
> >>>>>
> >>>>> Doesn't 5G need to push more packets to the driver to get good aggregation ?
> >>>>>
> >>>>
> >>>> Yes, but not 455 KB for low speeds. 5G requires a queue of a few ms to
> >>>> aggregate enough packets for a frame but not of several hundred ms as
> >>>> calculated in my example. And yes, there are situations where 5G,
> >>>> especially FR2 mmWave, reaches Gbit/s speeds where a big queue is
> >>>> required. But the dynamic queue limit approach of BQL should be well
> >>>> suited for these varying speeds.
> >>>>
> >>>
> >>> out of curiosity, related to the test with 5G Quectel, did you test
> >>> enabling aggregation through QMAP (kernel module rmnet) or simply
> >>> qmi_wwan raw_ip ?
> >>>
> >>> Regards,
> >>> Daniele
> >>>
> >>
> >> Hi Daniele,
> >>
> >> I simply used qmi_wwan. I actually never touched rmnet before.
> >> Is the aggregation through QMAP what you and Eric mean with aggregation?
> >> Because then I misunderstood it, because I was thinking about aggregating
> >> enough (and not too many) packets in the usbnet queue.
> >>
> >
> > I can't speak for Eric, but, yes, that is what I meant for
> > aggregation, this is the common way those high-cat modems are used:
>
> Hi Daniele,
>
> I think I *really* have to take a look at rmnet and aggregation through
> QMAP for future projects :)
>
> > it's not clear to me if the change you are proposing could have any
> > impact when rmnet is used, that's why I was asking the test
> > conditions.
> >
> > Thanks,
> > Daniele
> >
>
> This patch has an impact on the underlying USB physical transport of
> rmnet. From my understanding, the call stack is as follows:
>
> rmnet_map_tx_aggregate or rmnet_send_skb
>
> |
> | Calling dev_queue_xmit(skb)
> V
>
> qmi_wwan used for USB modem
>
> |
> |  ndo_start_xmit(skb, net) is called
> V
>
> usbnet_start_xmit is executed where the size of the internal queue is
> dynamically changed using the Byte Queue Limits algorithm by this patch.
>
> Correct me if I am wrong, but I think in the end usbnet is used.
>

Exactly, I was just wondering if this patch had any effect on the
overall throughput performance once the aggregation is enabled.

Hopefully I'll be able to perform some tests once the patch is merged.

Thanks,
Daniele

> Thanks,
> Simon
>
> >> Thanks
> >>
> >>>>>>
> >>>>>> To address this issue, this patch introduces support for
> >>>>>> Byte Queue Limits (BQL) [1][2] in the usbnet driver. BQL dynamically
> >>>>>> limits the amount of data queued in the driver, effectively reducing
> >>>>>> latency without impacting throughput.
> >>>>>> This implementation was successfully tested on several devices as
> >>>>>> described in the commit.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Future work
> >>>>>>
> >>>>>> Due to offloading, TCP often produces SKBs up to 64 KB in size.
> >>>>>
> >>>>> Only for rates > 500 Mbit. After BQL, we had many more improvements in
> >>>>> the stack.
> >>>>> https://lwn.net/Articles/564978/
> >>>>>
> >>>>>
> >>>>
> >>>> I also saw these large SKBs, for example, for my USB2 Android tethering,
> >>>> which advertises a network speed of < 500 Mbit/s.
> >>>> I saw these large SKBs by looking at the file:
> >>>>
> >>>> cat /sys/class/net/INTERFACE/queues/tx-0/byte_queue_limits/inflight
> >>>>
> >>>> For UDP-only traffic, inflight always maxed out at MTU size.
> >>>>
> >>>> Thank you for your replies!
> >>>>
> >>>>>> To
> >>>>>> further decrease buffer bloat, I tried to disable TSO, GSO and LRO but it
> >>>>>> did not have the intended effect in my tests. The only dirty workaround I
> >>>>>> found so far was to call netif_stop_queue() whenever BQL sets
> >>>>>> __QUEUE_STATE_STACK_XOFF. However, a proper solution to this issue would
> >>>>>> be desirable.
> >>>>>>
> >>>>>> I also plan to publish a scientific paper on this topic in the near
> >>>>>> future.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Simon
> >>>>>>
> >>>>>> [1] https://medium.com/@tom_84912/byte-queue-limits-the-unauthorized-biography-61adc5730b83
> >>>>>> [2] https://lwn.net/Articles/469652/
> >>>>>>
> >>>>>> Simon Schippers (1):
> >>>>>>   usbnet: Add support for Byte Queue Limits (BQL)
> >>>>>>
> >>>>>>  drivers/net/usb/usbnet.c | 8 ++++++++
> >>>>>>  1 file changed, 8 insertions(+)
> >>>>>>
> >>>>>> --
> >>>>>> 2.43.0
> >>>>>>
> >>>>