net/sched/sch_dualpi2.c | 3 +++ 1 file changed, 3 insertions(+)
dualpi2_change() uses a trim loop to enforce the new queue limit after a
configuration change. The loop calls qdisc_dequeue_internal(sch, true)
which only dequeues from the C-queue (sch->q) and the requeue list
(sch->gso_skb). It does not dequeue from the L-queue (q->l_queue).
However, the loop continuation condition checks qdisc_qlen(sch), which
reflects the total packet count across both queues because
dualpi2_enqueue_skb() manually increments sch->q.qlen for L-queue
packets (line 418). Similarly, q->memory_used accounts for memory from
both queues.
When all packets reside in the L-queue and the C-queue is empty, the
loop condition remains true but qdisc_dequeue_internal() returns NULL.
The subsequent skb->truesize dereference causes a NULL pointer oops.
An unprivileged user can trigger this from a user namespace:
1. unshare(CLONE_NEWUSER | CLONE_NEWNET)
2. Create a dummy device and attach dualpi2 qdisc
3. Send ECT(1)-marked packets to fill the L-queue
4. Reduce the qdisc limit via RTM_NEWQDISC
[ 17.521319] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001a: 0000 [#1] SMP KASAN NOPTI
[ 17.525206] KASAN: null-ptr-deref in range [0x00000000000000d0-0x00000000000000d7]
[ 17.527710] CPU: 3 UID: 1000 PID: 171 Comm: poc Not tainted 7.0.0-rc7-next-20260410 #10 PREEMPTLAZY
[ 17.530795] Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 17.533301] RIP: 0010:dualpi2_change+0xd09/0x1c00
[ 17.535472] Code: ef 83 e7 07 83 c7 03 44 38 cf 7c 09 45 84 c9 0f 85 fb 06 00 00 4c 8d 96 d0 00 00 00 44 8b 8b 5c 02 00 00 4c 89 d7 48 c1 ef 03 <0f> b6 3c 2f 40 84 ff 74 0a 40 80 ff 03 0f 8e fc 06 00 00 4c 89 c7
[ 17.540294] RSP: 0018:ffffc90000bb7360 EFLAGS: 00000202
[ 17.542574] RAX: 0000000000014c3a RBX: ffff88800fe18000 RCX: ffffed1001fc3010
[ 17.543461] RDX: ffff88800fe1825c RSI: 0000000000000000 RDI: 000000000000001a
[ 17.544145] RBP: dffffc0000000000 R08: 0000000000000028 R09: 0000000000171240
[ 17.546982] R10: 00000000000000d0 R11: ffff88800fe18080 R12: ffff88800fe180d0
[ 17.549652] R13: ffff88800fe1825c R14: ffff88800fe18014 R15: ffff88800fe180f8
[ 17.552566] FS: 000000002b3533c0(0000) GS:ffff8880e260f000(0000) knlGS:0000000000000000
[ 17.555942] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.556472] CR2: dffffc000000001a CR3: 000000000fb01000 CR4: 00000000003006f0
[ 17.559321] Call Trace:
[ 17.560392] <TASK>
[ 17.560993] ? __asan_memset+0x23/0x50
[ 17.562609] ? __pfx_dualpi2_change+0x10/0x10
[ 17.564265] ? mutex_lock+0x7e/0xd0
[ 17.565628] ? __pfx_mutex_lock+0x10/0x10
[ 17.566886] ? nla_strcmp+0x20/0x100
[ 17.568354] tc_modify_qdisc+0x4ee/0x1d60
[ 17.570211] ? __pfx_tc_modify_qdisc+0x10/0x10
[ 17.571293] ? __pfx_stack_trace_save+0x10/0x10
[ 17.572894] ? mutex_lock+0x7e/0xd0
[ 17.573548] ? __pfx_mutex_lock+0x10/0x10
[ 17.574583] ? security_capable+0x80/0x110
[ 17.576051] rtnetlink_rcv_msg+0x548/0xc10
[ 17.577902] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 17.579170] netlink_rcv_skb+0x12a/0x390
[ 17.580902] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 17.581817] ? __pfx_netlink_rcv_skb+0x10/0x10
[ 17.583064] ? __kasan_slab_alloc+0x89/0x90
[ 17.584753] netlink_unicast+0x5b8/0x980
[ 17.585677] ? __pfx_netlink_unicast+0x10/0x10
[ 17.587312] ? rpm_suspend+0x492/0xe70
[ 17.588612] ? __pfx___alloc_skb+0x10/0x10
[ 17.590473] ? __check_object_size+0x45e/0x650
[ 17.592327] ? rpm_suspend+0x492/0xe70
[ 17.593589] netlink_sendmsg+0x722/0xbb0
[ 17.594452] ? __pfx_netlink_sendmsg+0x10/0x10
[ 17.595192] ? __import_iovec+0x33d/0x5b0
[ 17.596582] ? __pfx_netlink_sendmsg+0x10/0x10
[ 17.598003] ____sys_sendmsg+0x8cf/0xb30
[ 17.599168] ? __pfx_____sys_sendmsg+0x10/0x10
[ 17.599921] ? __pfx_copy_msghdr_from_user+0x10/0x10
[ 17.601616] ? update_cfs_rq_load_avg+0x5a/0x560
[ 17.602924] ___sys_sendmsg+0x104/0x190
[ 17.604318] ? update_irq_load_avg+0xbd/0x18b0
[ 17.604977] ? __pfx____sys_sendmsg+0x10/0x10
[ 17.606505] __sys_sendmsg+0x124/0x1c0
[ 17.607877] ? __pfx___sys_sendmsg+0x10/0x10
[ 17.609867] ? __pfx_handle_softirqs+0x10/0x10
[ 17.611017] do_syscall_64+0x64/0x680
[ 17.612277] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 17.614334] RIP: 0033:0x429d6b
[ 17.615451] Code: 48 89 e5 48 83 ec 20 89 55 ec 48 89 75 f0 89 7d f8 e8 99 e5 02 00 8b 55 ec 48 8b 75 f0 41 89 c0 8b 7d f8 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 45 f8 e8 f1 e5 02 00 48 8b
[ 17.619777] RSP: 002b:00007fff81a32560 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[ 17.621686] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000429d6b
[ 17.623171] RDX: 0000000000000000 RSI: 00007fff81a325e0 RDI: 0000000000000003
[ 17.625433] RBP: 00007fff81a32580 R08: 0000000000000000 R09: 00007fff81a32797
[ 17.627317] R10: 0000000000000000 R11: 0000000000000293 R12: 00007fff81a32a38
[ 17.629562] R13: 00007fff81a32a48 R14: 00000000004c4848 R15: 0000000000000001
[ 17.630409] </TASK>
[ 17.631205] Modules linked in:
[ 17.633251] ---[ end trace 0000000000000000 ]---
[ 17.634487] RIP: 0010:dualpi2_change+0xd09/0x1c00
[ 17.636511] Code: ef 83 e7 07 83 c7 03 44 38 cf 7c 09 45 84 c9 0f 85 fb 06 00 00 4c 8d 96 d0 00 00 00 44 8b 8b 5c 02 00 00 4c 89 d7 48 c1 ef 03 <0f> b6 3c 2f 40 84 ff 74 0a 40 80 ff 03 0f 8e fc 06 00 00 4c 89 c7
[ 17.641850] RSP: 0018:ffffc90000bb7360 EFLAGS: 00000202
[ 17.643001] RAX: 0000000000014c3a RBX: ffff88800fe18000 RCX: ffffed1001fc3010
[ 17.645928] RDX: ffff88800fe1825c RSI: 0000000000000000 RDI: 000000000000001a
[ 17.647603] RBP: dffffc0000000000 R08: 0000000000000028 R09: 0000000000171240
[ 17.649912] R10: 00000000000000d0 R11: ffff88800fe18080 R12: ffff88800fe180d0
[ 17.652899] R13: ffff88800fe1825c R14: ffff88800fe18014 R15: ffff88800fe180f8
[ 17.655310] FS: 000000002b3533c0(0000) GS:ffff8880e260f000(0000) knlGS:0000000000000000
[ 17.656751] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.659213] CR2: dffffc000000001a CR3: 000000000fb01000 CR4: 00000000003006f0
[ 17.661915] Kernel panic - not syncing: Fatal exception in interrupt
[ 17.665688] Kernel Offset: disabled
[ 17.665980] Rebooting in 1 seconds..
Fix this by adding a NULL check after qdisc_dequeue_internal(). When
the C-queue is exhausted but L-queue packets keep qdisc_qlen(sch) above
the limit, the loop breaks safely. Remaining excess L-queue packets will
be drained by the normal dequeue path.
Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2 qdisc")
Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
---
net/sched/sch_dualpi2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index fe6f5e889625..746c0e506024 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -870,6 +870,9 @@ static int dualpi2_change(struct Qdisc *sch, struct nlattr *opt,
q->memory_used > q->memory_limit) {
struct sk_buff *skb = qdisc_dequeue_internal(sch, true);
+ if (!skb)
+ break;
+
q->memory_used -= skb->truesize;
qdisc_qstats_backlog_dec(sch, skb);
rtnl_qdisc_drop(skb, sch);
--
2.43.0
On Mon, Apr 13, 2026 at 03:57:40PM +0800, Kito Xu (veritas501) wrote:
> dualpi2_change() uses a trim loop to enforce the new queue limit after a
> configuration change. The loop calls qdisc_dequeue_internal(sch, true)
> which only dequeues from the C-queue (sch->q) and the requeue list
> (sch->gso_skb). It does not dequeue from the L-queue (q->l_queue).
>
> However, the loop continuation condition checks qdisc_qlen(sch), which
> reflects the total packet count across both queues because
> dualpi2_enqueue_skb() manually increments sch->q.qlen for L-queue
> packets (line 418). Similarly, q->memory_used accounts for memory from
> both queues.
>
> When all packets reside in the L-queue and the C-queue is empty, the
> loop condition remains true but qdisc_dequeue_internal() returns NULL.
> The subsequent skb->truesize dereference causes a NULL pointer oops.
>
> An unprivileged user can trigger this from a user namespace:
>
> 1. unshare(CLONE_NEWUSER | CLONE_NEWNET)
> 2. Create a dummy device and attach dualpi2 qdisc
> 3. Send ECT(1)-marked packets to fill the L-queue
> 4. Reduce the qdisc limit via RTM_NEWQDISC
...
> Fix this by adding a NULL check after qdisc_dequeue_internal(). When
> the C-queue is exhausted but L-queue packets keep qdisc_qlen(sch) above
> the limit, the loop breaks safely. Remaining excess L-queue packets will
> be drained by the normal dequeue path.
>
> Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2 qdisc")
> Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
On Tue, Apr 14, 2026 at 07:31:32PM +0100, Simon Horman wrote:
> On Mon, Apr 13, 2026 at 03:57:40PM +0800, Kito Xu (veritas501) wrote:
> > dualpi2_change() uses a trim loop to enforce the new queue limit after a
> > configuration change. The loop calls qdisc_dequeue_internal(sch, true)
> > which only dequeues from the C-queue (sch->q) and the requeue list
> > (sch->gso_skb). It does not dequeue from the L-queue (q->l_queue).
> >
> > However, the loop continuation condition checks qdisc_qlen(sch), which
> > reflects the total packet count across both queues because
> > dualpi2_enqueue_skb() manually increments sch->q.qlen for L-queue
> > packets (line 418). Similarly, q->memory_used accounts for memory from
> > both queues.
> >
> > When all packets reside in the L-queue and the C-queue is empty, the
> > loop condition remains true but qdisc_dequeue_internal() returns NULL.
> > The subsequent skb->truesize dereference causes a NULL pointer oops.
> >
> > An unprivileged user can trigger this from a user namespace:
> >
> > 1. unshare(CLONE_NEWUSER | CLONE_NEWNET)
> > 2. Create a dummy device and attach dualpi2 qdisc
> > 3. Send ECT(1)-marked packets to fill the L-queue
> > 4. Reduce the qdisc limit via RTM_NEWQDISC
>
> ...
>
> > Fix this by adding a NULL check after qdisc_dequeue_internal(). When
> > the C-queue is exhausted but L-queue packets keep qdisc_qlen(sch) above
> > the limit, the loop breaks safely. Remaining excess L-queue packets will
> > be drained by the normal dequeue path.
> >
> > Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2 qdisc")
> > Signed-off-by: Kito Xu (veritas501) <hxzene@gmail.com>
>
> Reviewed-by: Simon Horman <horms@kernel.org>
Sorry, I now see that a more comprehensive fix for this code path
is available from the original author of the code.
- [PATCH v1 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue
https://lore.kernel.org/all/20260413163711.56191-1-chia-yu.chang@nokia-bell-labs.com/
From: "Kito Xu (veritas501)" <hxzene@gmail.com>
Hi Simon,
Thanks for the review and for pointing out the alternative patch
from Chia-Yu. I agree that the more comprehensive fix is the better
choice for this code path.
Since I independently discovered and reported this issue, would it
be possible to add a Reported-by tag to Chia-Yu's patch?
Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com>
Either way, thanks for handling this!
Best regards,
Kito
On Wed, Apr 15, 2026 at 10:31:58AM +0800, veritas501 wrote: > From: "Kito Xu (veritas501)" <hxzene@gmail.com> > > Hi Simon, > > Thanks for the review and for pointing out the alternative patch > from Chia-Yu. I agree that the more comprehensive fix is the better > choice for this code path. > > Since I independently discovered and reported this issue, would it > be possible to add a Reported-by tag to Chia-Yu's patch? > > Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com> That sounds reasonable to me. But it might be best to bring up in a response to Chia-Yu's patch. (I can't make it happen myself.) > > Either way, thanks for handling this! > > Best regards, > Kito
© 2016 - 2026 Red Hat, Inc.