[PATCH V2] taprio: Set the value of picos_per_byte before fill sched_entry

jianghaoran posted 1 patch 1 year, 6 months ago
There is a newer version of this series
net/sched/sch_taprio.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Posted by jianghaoran 1 year, 6 months ago
If picos_per_byte is set after the sched_entry list is filled in, the
min_duration computed by length_to_duration() is 0. The validity of the
input interval therefore cannot be checked, and intervals too small to
allow any packet to be transmitted are accepted. This is the same
problem described in commit b5b73b26b3ca ("taprio: Fix allowing too
small intervals"); this patch is a further fix for it.

Example configuration that will not be able to transmit any packets:

tc qdisc replace dev enp5s0f0 parent root handle 100 taprio \
              num_tc 3 \
              map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
              queues 1@0 1@1 2@2 \
              base-time  1528743495910289987 \
              sched-entry S 01 9 \
              sched-entry S 02 9 \
              sched-entry S 04 9 \
              clockid CLOCK_TAI

Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")
Signed-off-by: jianghaoran <jianghaoran@kylinos.cn>
---
v2:
1. Add an explanation of what the example configuration demonstrates.
2. Add a Fixes tag pointing to the first commit
where the issue was present.
---
 net/sched/sch_taprio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 86675a79da1e..d95ec2250f24 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1507,6 +1507,8 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 		goto free_sched;
 	}
 
+	taprio_set_picos_per_byte(dev, q);
+
 	err = parse_taprio_schedule(q, tb, new_admin, extack);
 	if (err < 0)
 		goto free_sched;
@@ -1521,8 +1523,6 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
 	if (err < 0)
 		goto free_sched;
 
-	taprio_set_picos_per_byte(dev, q);
-
 	if (mqprio) {
 		err = netdev_set_num_tc(dev, mqprio->num_tc);
 		if (err)
-- 
2.25.1
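
To illustrate why the ordering matters, here is a minimal userspace sketch
of the interval check. It is not the kernel code: struct sched_state and
picos_per_byte_for() are hypothetical stand-ins, and the check is assumed
to compare each interval against the wire time of ETH_ZLEN bytes, as the
commit message describes.

```c
#include <stdint.h>
#include <stdbool.h>

#define ETH_ZLEN      60    /* minimum Ethernet frame length, without FCS */
#define PSEC_PER_NSEC 1000

/* Hypothetical stand-in for the relevant part of the qdisc private data. */
struct sched_state {
	int64_t picos_per_byte; /* 0 until the link speed has been read */
};

/* Convert a length in bytes to a duration in ns at the current speed. */
static int64_t length_to_duration(const struct sched_state *q, int len)
{
	return (len * q->picos_per_byte) / PSEC_PER_NSEC;
}

/* Assumed check: an interval must fit at least a minimum-sized frame. */
static bool interval_is_valid(const struct sched_state *q, uint32_t interval_ns)
{
	int64_t min_duration = length_to_duration(q, ETH_ZLEN);

	return (int64_t)interval_ns >= min_duration;
}

/* Hypothetical helper: picoseconds per byte for a link speed in Mbps. */
static int64_t picos_per_byte_for(int speed_mbps)
{
	return (1000000LL * 8) / speed_mbps;
}
```

With picos_per_byte still 0, min_duration is 0 and a 9 ns interval passes
the check. Once picos_per_byte is set for a 1 Gbps link, min_duration is
480 ns and the same interval is rejected.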
Re: [PATCH V2] taprio: Set the value of picos_per_byte before fill sched_entry
Posted by Jakub Kicinski 1 year, 6 months ago
On Sat,  1 Oct 2022 16:06:26 +0800 jianghaoran wrote:
> Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")

Please note that whenever you put a Fixes tag in a patch you should CC
the authors of the commit in question. get_maintainer will point them
out to you (when run on the patch).
Re: [PATCH V2] taprio: Set the value of picos_per_byte before fill sched_entry
Posted by Vladimir Oltean 1 year, 6 months ago
Hi Jianghao,

On Sat, Oct 01, 2022 at 04:06:26PM +0800, jianghaoran wrote:
> If picos_per_byte is set after the sched_entry list is filled in, the
> min_duration computed by length_to_duration() is 0. The validity of the
> input interval therefore cannot be checked, and intervals too small to
> allow any packet to be transmitted are accepted. This is the same
> problem described in commit b5b73b26b3ca ("taprio: Fix allowing too
> small intervals"); this patch is a further fix for it.
> 
> Example configuration that will not be able to transmit any packets:
> 
> tc qdisc replace dev enp5s0f0 parent root handle 100 taprio \
>               num_tc 3 \
>               map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>               queues 1@0 1@1 2@2 \
>               base-time  1528743495910289987 \
>               sched-entry S 01 9 \
>               sched-entry S 02 9 \
>               sched-entry S 04 9 \
>               clockid CLOCK_TAI
> 
> Fixes: b5b73b26b3ca ("taprio: Fix allowing too small intervals")
> Signed-off-by: jianghaoran <jianghaoran@kylinos.cn>
> ---

I think this is just a symptomatic treatment of a bigger problem with
the solution Vinicius tried to implement.

One can still change the qdisc on an interface whose link is down, and
the determination logic will still be bypassed, thereby allowing the 9
ns schedule intervals to be accepted as valid.

Is your problem that the 9 ns intervals will kill the kernel due to the
frequent hrtimers, or that no packets will be dequeued from the qdisc?

If the latter, I was working on a feature called queueMaxSDU, where one
can limit the MTU per traffic class. Packets exceeding the max MTU are
dropped at the enqueue() level (therefore, before being accepted into
the Qdisc queues). The problem here, really, is that we accept packets
in enqueue() which will never be eligible in dequeue(). We have the
exact same problem with gates which are forever closed (in your own
example, that would be gates 3 and higher).
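
The "forever closed" case can be sketched in userspace C as follows. The
helper names are hypothetical and not part of taprio; the sketch assumes
each schedule entry carries a gate mask with one bit per traffic class
(fewer than 32 classes).

```c
#include <stdint.h>

/* OR together the gate masks of all schedule entries in the cycle. */
static uint32_t gates_ever_open(const uint32_t *gate_masks, int num_entries)
{
	uint32_t open = 0;

	for (int i = 0; i < num_entries; i++)
		open |= gate_masks[i];

	return open;
}

/* Nonzero if the gate for traffic class tc is closed in every entry. */
static int gate_forever_closed(const uint32_t *gate_masks, int num_entries,
			       int tc)
{
	uint32_t open = gates_ever_open(gate_masks, num_entries);

	return !(open & (UINT32_C(1) << tc));
}
```

For the quoted schedule (masks 01, 02, 04) the union is 0x07, so gates 0
to 2 open at some point in the cycle and gate 3 never does.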

Currently, I only added support for user space to input queueMaxSDU into
the kernel over netlink, as well as for the basic qdisc_drop() mechanism
based on skb->len. But I was thinking that the kernel should have a
mechanism to automatically reduce the queueMaxSDU to an even lower value
than specified by the user, if the gate intervals don't accept MTU sized
packets. The "operational" queueMaxSDU is determined by the current link
speed and the smallest contiguous interval corresponding to each traffic
class.
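
A rough userspace sketch of how such an "operational" limit could be
derived is shown below. All names are hypothetical, and several
simplifications are assumed: a user value of 0 means "no limit",
preamble, inter-frame gap and FCS overhead are ignored, and the smallest
contiguous interval for the traffic class is already known.

```c
#include <stdint.h>

#define PSEC_PER_NSEC 1000

/* Largest frame length (bytes) that fits in interval_ns at this speed.
 * Returns UINT32_MAX when the speed is unknown (picos_per_byte == 0).
 */
static uint32_t max_len_for_interval(uint64_t interval_ns,
				     uint64_t picos_per_byte)
{
	uint64_t len;

	if (picos_per_byte == 0)
		return UINT32_MAX;

	len = (interval_ns * PSEC_PER_NSEC) / picos_per_byte;

	return len > UINT32_MAX ? UINT32_MAX : (uint32_t)len;
}

/* The lower of the user's limit (0 = none) and what the gates can fit. */
static uint32_t operational_max_sdu(uint32_t user_max_sdu,
				    uint64_t min_interval_ns,
				    uint64_t picos_per_byte)
{
	uint32_t fits = max_len_for_interval(min_interval_ns, picos_per_byte);

	if (user_max_sdu == 0 || fits < user_max_sdu)
		return fits;

	return user_max_sdu;
}

/* Enqueue-time decision: drop packets that could never be dequeued. */
static int should_drop(uint32_t pkt_len, uint32_t max_sdu)
{
	return pkt_len > max_sdu;
}
```

At 1 Gbps (8000 ps per byte), a 9 ns interval fits a single byte, so
every real frame would be dropped at enqueue, while a 12000 ns interval
fits 1500 bytes.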

In fact, if you search for vsc9959_tas_guard_bands_update(), you'll see
most of the logic already being written, but just for an offloading
device driver. I was thinking I should generalize this logic and push it
into taprio.

If your problem is the former (9 ns hrtimers kill the kernel, how do we
avoid them?), then it's pretty hard to make a judgement that works for
all link speeds: taprio will still accept the interval as valid for a
100Gbps interface, because theoretically, the transmission time of
ETH_ZLEN bytes is still below 9 ns. I don't know how one can realistically
deal with that in a generic way.
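
The arithmetic behind the 100Gbps remark can be checked with a small
userspace sketch. min_interval_ns_at() is a hypothetical helper; it
assumes ETH_ZLEN is 60 bytes and that the duration is truncated to whole
nanoseconds.

```c
#include <stdint.h>

#define ETH_ZLEN      60    /* minimum Ethernet frame length, without FCS */
#define PSEC_PER_NSEC 1000

/* Wire time of ETH_ZLEN bytes, in ns, for a link speed given in Mbps. */
static int64_t min_interval_ns_at(int speed_mbps)
{
	int64_t picos_per_byte = (1000000LL * 8) / speed_mbps;

	return (ETH_ZLEN * picos_per_byte) / PSEC_PER_NSEC;
}
```

This gives 480 ns at 1 Gbps and 48 ns at 10 Gbps, but only 4 ns at
100 Gbps (4.8 ns before truncation), so a 9 ns interval would pass a
check of this kind on a 100 Gbps link.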

Given that it's so easy to bypass taprio's restriction by having the
link down, I don't think it makes much sense to keep pretending that it
works, and submit this as a bug fix :)

I was going to move vsc9959_tas_guard_bands_update() into taprio anyway,
although I'm not sure whether that will happen in this kernel development
cycle. If you're interested, I can keep you on CC.