When deleting an xt_CT rule, its per-rule template conntrack is freed via
nf_ct_destroy() -> nf_ct_tmpl_free(). If an expectation was created with
that template as its master, the expectation's timeout/flush later calls
nf_ct_unlink_expect_report() and dereferences exp->master, which now points
to freed memory, leading to a NULL/poison deref and crash.
Move nf_ct_remove_expectations(ct) before the template early-return in
nf_ct_destroy() so that any expectations attached to a template are removed
(and their timers cancelled) before the template's extensions are torn down.
Signed-off-by: Qingjie Xing <xqjcool@gmail.com>
---
net/netfilter/nf_conntrack_core.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 344f88295976..7f6b95404907 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -577,6 +577,13 @@ void nf_ct_destroy(struct nf_conntrack *nfct)
WARN_ON(refcount_read(&nfct->use) != 0);
+ /* Expectations will have been removed in clean_from_lists,
+ * except TFTP can create an expectation on the first packet,
+ * before connection is in the list, so we need to clean here,
+ * too.
+ */
+ nf_ct_remove_expectations(ct);
+
if (unlikely(nf_ct_is_template(ct))) {
nf_ct_tmpl_free(ct);
return;
@@ -585,13 +592,6 @@ void nf_ct_destroy(struct nf_conntrack *nfct)
if (unlikely(nf_ct_protonum(ct) == IPPROTO_GRE))
destroy_gre_conntrack(ct);
- /* Expectations will have been removed in clean_from_lists,
- * except TFTP can create an expectation on the first packet,
- * before connection is in the list, so we need to clean here,
- * too.
- */
- nf_ct_remove_expectations(ct);
-
if (ct->master)
nf_ct_put(ct->master);
base-commit: 01792bc3e5bdafa171dd83c7073f00e7de93a653
--
2.25.1
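
The reordering proposed in the patch can be pictured with a small userspace model. This is only a sketch under stated assumptions: `model_ct`, `model_expect` and all `model_*` functions are hypothetical stand-ins invented for illustration, not kernel structures or APIs, and the timer is reduced to a boolean flag. It shows the one property the patch is after: expectations are removed before the owning object is freed, whether or not it is a template.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel's structs. */
struct model_expect {
	struct model_expect *next;
	bool timer_pending;
};

struct model_ct {
	bool is_template;
	struct model_expect *expectations;	/* expectations naming this ct as master */
};

static struct model_ct *model_ct_new(bool is_template)
{
	struct model_ct *ct = calloc(1, sizeof(*ct));

	if (ct)
		ct->is_template = is_template;
	return ct;
}

/* Attach one pending expectation to ct; returns NULL on allocation failure. */
static struct model_expect *model_expect_add(struct model_ct *ct)
{
	struct model_expect *exp = calloc(1, sizeof(*exp));

	if (!exp)
		return NULL;
	exp->timer_pending = true;
	exp->next = ct->expectations;
	ct->expectations = exp;
	return exp;
}

/* Cancel and free every expectation owned by ct; returns how many were removed. */
static int model_remove_expectations(struct model_ct *ct)
{
	int removed = 0;

	while (ct->expectations) {
		struct model_expect *exp = ct->expectations;

		ct->expectations = exp->next;
		exp->timer_pending = false;	/* stands in for cancelling the timer */
		free(exp);
		removed++;
	}
	return removed;
}

/* Patched order: expectations go first, for templates and regular entries alike. */
static int model_destroy(struct model_ct *ct)
{
	int removed;

	assert(ct != NULL);
	removed = model_remove_expectations(ct);
	free(ct);	/* no expectation can reference ct past this point */
	return removed;
}
```

Calling `model_destroy()` on a template that owns two expectations removes both before the template itself is released. Whether a template can legitimately own expectations at all is a separate question that the model does not answer.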
Qingjie Xing <xqjcool@gmail.com> wrote:
> When deleting an xt_CT rule, its per-rule template conntrack is freed via
> nf_ct_destroy() -> nf_ct_tmpl_free(). If an expectation was created with
> that template as its master,

Uhm. How can that happen? A template isn't a connection, so it should
not be able to create an expectation.
With an iptables-configured TFTP helper in place, a UDP packet
(10.65.41.36:1069 → 10.65.36.2:69, TFTP RRQ) triggered creation of an expectation.
Later, iptables changes removed the rule’s per-rule template nf_conn.
When the expectation’s timer expired, nf_ct_unlink_expect_report()
ran and dereferenced the freed master, causing a crash.

The detailed system logs are as follows:
--------------------------------------------------------------------------------
//create
[ 1978.316487] nf_conntrack: [nf_ct_tmpl_alloc:580] nf_conn:ffff8881391e3800 ext:0
//insert
[ 2131.989389] [nf_ct_expect_insert:417] exp:ffff88823aac8008 master:ffff8881391e3800 ext:ffff888286a3c500 jiffies:4296796140 timeout:300 expires:4297096140
[ 2140.352649] nf_conntrack: [nf_ct_tmpl_alloc:580] nf_conn:ffff88813ae58e00 ext:0
[ 2140.352657] nf_conntrack: [nf_ct_tmpl_alloc:580] nf_conn:ffff88813ae59a00 ext:0
[ 2140.352661] nf_conntrack: [nf_ct_tmpl_alloc:580] nf_conn:ffff88813ae5d600 ext:0
[ 2140.352664] nf_conntrack: [nf_ct_tmpl_alloc:580] nf_conn:ffff88813ae58800 ext:0
[ 2140.352735] nf_conntrack: [nf_ct_tmpl_free:594] nf_conn:ffff8881391e3200 ext:6b6b6b6b6b6b6b6b
[ 2140.352738] CPU: 0 PID: 4691 Comm: netd Kdump: loaded Tainted: G W O 6.1 #16
[ 2140.352740] Hardware name: Supermicro SYS-2049P-TN8R-FI005/X11QPL, BIOS 3.3 02/19/2020
[ 2140.352741] Call Trace:
[ 2140.352742] <TASK>
[ 2140.352743] nf_ct_tmpl_free+0x4f/0x60
[ 2140.352749] nf_ct_destroy+0xce/0x290
[ 2140.352752] xt_ct_tg_destroy+0x78/0xc0
[ 2140.352756] xt_ct_tg_destroy_v1+0x12/0x20
[ 2140.352758] cleanup_entry+0x115/0x1b0
[ 2140.352761] __do_replace+0x3ab/0x530
[ 2140.352763] ? do_ipt_set_ctl+0x5ef/0x6c0
[ 2140.352765] do_ipt_set_ctl+0x5ef/0x6c0
[ 2140.352767] nf_setsockopt+0x1a8/0x2e0
[ 2140.352769] raw_setsockopt+0x7b/0x120
[ 2140.352771] sock_common_setsockopt+0x18/0x30
[ 2140.352773] __sys_setsockopt+0xb9/0x130
[ 2140.352775] __x64_sys_setsockopt+0x21/0x30
[ 2140.352777] do_syscall_64+0x49/0xa0
[ 2140.352780] ? irqentry_exit+0x12/0x40
[ 2140.352782] entry_SYSCALL_64_after_hwframe+0x64/0xce
[ 2140.352785] RIP: 0033:0x7f621d3f49aa
[ 2140.352787] Code: ff ff ff c3 0f 1f 40 00 48 8b 15 69 b4 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 39 b4 0c 00 f7
//free
[ 2140.352889] nf_conntrack: [nf_ct_tmpl_free:594] nf_conn:ffff8881391e3800 ext:6b6b6b6b6b6b6b6b
[ 2140.352891] CPU: 0 PID: 4691 Comm: netd Kdump: loaded Tainted: G W O 6.1 #16
[ 2140.352892] Hardware name: Supermicro SYS-2049P-TN8R-FI005/X11QPL, BIOS 3.3 02/19/2020
[ 2140.352893] Call Trace:
[ 2140.352893] <TASK>
[ 2140.352894] nf_ct_tmpl_free+0x4f/0x60
[ 2140.352896] nf_ct_destroy+0xce/0x290
[ 2140.352898] xt_ct_tg_destroy+0x78/0xc0
[ 2140.352900] xt_ct_tg_destroy_v1+0x12/0x20
[ 2140.352902] cleanup_entry+0x115/0x1b0
[ 2140.352904] __do_replace+0x3ab/0x530
[ 2140.352906] ? do_ipt_set_ctl+0x5ef/0x6c0
[ 2140.352907] do_ipt_set_ctl+0x5ef/0x6c0
[ 2140.352909] nf_setsockopt+0x1a8/0x2e0
[ 2140.352911] raw_setsockopt+0x7b/0x120
[ 2140.352912] sock_common_setsockopt+0x18/0x30
[ 2140.352913] __sys_setsockopt+0xb9/0x130
[ 2140.352915] __x64_sys_setsockopt+0x21/0x30
[ 2140.352917] do_syscall_64+0x49/0xa0
[ 2140.352919] ? irqentry_exit+0x12/0x40
[ 2140.352920] entry_SYSCALL_64_after_hwframe+0x64/0xce
[ 2140.352923] RIP: 0033:0x7f621d3f49aa
[ 2140.352924] Code: ff ff ff c3 0f 1f 40 00 48 8b 15 69 b4 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 06 c3 0f 1f 44 00 00 48 8b 15 39 b4 0c 00 f7
//expectation timeout
[ 2433.066066] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] SMP NOPTI
[ 2433.187797] CPU: 10 PID: 66 Comm: ksoftirqd/10 Kdump: loaded Tainted: G W O 6.1 #16
[ 2433.293977] Hardware name: Supermicro SYS-2049P-TN8R-FI005/X11QPL, BIOS 3.3 02/19/2020
[ 2433.306651] nf_conntrack: [__nf_conntrack_alloc:1729] nf_conn:ffff8882a9268440 jiffies:4297097457
[ 2433.388722] RIP: 0010:nf_ct_unlink_expect_report+0x2d/0x1f0
[ 2433.388730] Code: 00 00 55 48 89 e5 41 56 53 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 45 e8 48 8b 4f 70 4c 8b 81 e8 00 00 00 4d 85 c0 74 39 <41> 0f b7 00 48 85 c0 74 30 41 83 78 1c 00 75 11 4c 01 c0 48 8b 99
[ 2433.388732] RSP: 0018:ffffc9000ce0fce0 EFLAGS: 00010202
[ 2433.848812] RAX: a79bfdc906a58200 RBX: ffff88823aac8088 RCX: ffff8881391e3800
[ 2433.934200] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88823aac8008
[ 2434.019584] RBP: ffffc9000ce0fd08 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000
[ 2434.104964] R10: ffff8897e0f1cc00 R11: ffffffff80ee7e00 R12: ffffffff80ee7e00
[ 2434.190349] R13: 0000000000000000 R14: ffff88823aac8008 R15: ffff88823aac8088
[ 2434.275728] FS: 0000000000000000(0000) GS:ffff8897e0f00000(0000) knlGS:0000000000000000
[ 2434.372555] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2434.441296] CR2: 00007f980bc97000 CR3: 0000000107734003 CR4: 00000000007706e0
[ 2434.526684] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2434.612066] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2434.697449] PKRU: 55555554
[ 2434.729791] Call Trace:
[ 2434.759017] <TASK>
[ 2434.784079] ? __die_body+0x82/0x130
[ 2434.826826] ? die_addr+0xaa/0xe0
[ 2434.866446] ? exc_general_protection+0x13a/0x1e0
[ 2434.922711] ? asm_exc_general_protection+0x27/0x30
[ 2434.981054] ? nf_ct_expect_dst_hash+0x120/0x120
[ 2435.036276] ? nf_ct_expect_dst_hash+0x120/0x120
[ 2435.091503] ? nf_ct_unlink_expect_report+0x2d/0x1f0
[ 2435.150885] nf_ct_expectation_timed_out+0x2b/0x90
[ 2435.208189] ? nf_ct_expect_dst_hash+0x120/0x120
[ 2435.263415] call_timer_fn+0x2f/0x110
[ 2435.307195] run_timer_softirq+0x616/0x700
[ 2435.356179] ? newidle_balance+0x299/0x320
[ 2435.405166] __do_softirq+0xdc/0x2ab
[ 2435.447904] run_ksoftirqd+0x1c/0x30
[ 2435.490649] smpboot_thread_fn+0xe8/0x1b0
[ 2435.538595] kthread+0x269/0x2a0
[ 2435.577179] ? __smpboot_create_thread+0x220/0x220
[ 2435.634479] ? kthreadd+0x380/0x380
[ 2435.676187] ret_from_fork+0x1f/0x30
[ 2435.718930] </TASK>
-------------------------------------------------------------------------------
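
The timing in the log above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper names `implied_hz` and `ticks_past_expiry` are hypothetical, and the tick rate is inferred from the logged `jiffies`, `timeout` and `expires` values rather than read from the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks per second implied by an expectation's insert and expiry stamps. */
static uint64_t implied_hz(uint64_t insert_jiffies, uint64_t expires_jiffies,
			   uint64_t timeout_secs)
{
	assert(timeout_secs != 0 && expires_jiffies >= insert_jiffies);
	return (expires_jiffies - insert_jiffies) / timeout_secs;
}

/* Ticks by which a later jiffies stamp trails the expiry. */
static uint64_t ticks_past_expiry(uint64_t expires_jiffies, uint64_t now_jiffies)
{
	assert(now_jiffies >= expires_jiffies);
	return now_jiffies - expires_jiffies;
}
```

With the logged values, `implied_hz(4296796140, 4297096140, 300)` gives 1000 ticks per second, and the `__nf_conntrack_alloc` stamp `4297097457` lies 1317 ticks after `expires`. The printk timestamps agree: the fault at 2433.066066 comes about 301 seconds after the insert at 2131.989389, which fits the 300 second expectation timer firing on a master that was freed at 2140.352889.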
Qingjie Xing <xqjcool@gmail.com> wrote:
> With an iptables-configured TFTP helper in place, a UDP packet
> (10.65.41.36:1069 → 10.65.36.2:69, TFTP RRQ) triggered creation of an expectation.
> Later, iptables changes removed the rule’s per-rule template nf_conn.
> When the expectation’s timer expired, nf_ct_unlink_expect_report()
> ran and dereferenced the freed master, causing a crash.

Sorry, I do not see the problem.
A template should never be listed as exp->master.

Can you make a reproducer/selftest for this bug?

I worry we paper over a different bug.
Florian Westphal <fw@strlen.de> wrote:
> Qingjie Xing <xqjcool@gmail.com> wrote:
> > With an iptables-configured TFTP helper in place, a UDP packet
> > (10.65.41.36:1069 → 10.65.36.2:69, TFTP RRQ) triggered creation of an expectation.
> > Later, iptables changes removed the rule’s per-rule template nf_conn.
> > When the expectation’s timer expired, nf_ct_unlink_expect_report()
> > ran and dereferenced the freed master, causing a crash.
>
> Sorry, I do not see the problem.
> A template should never be listed as exp->master.
>
> Can you make a reproducer/selftest for this bug?
>
> I worry we paper over a different bug.
Or maybe this will provide a clue (not even compile tested).
@@ -299,6 +302,9 @@ struct nf_conntrack_expect *nf_ct_expect_alloc(struct nf_conn *me)
{
struct nf_conntrack_expect *new;
+ if (WARN_ON_ONCE(nf_ct_is_template(me)))
+ return NULL;
+
new = kmem_cache_alloc(nf_ct_expect_cachep, GFP_ATOMIC);
if (!new)
I added a panic() in nf_ct_expect_insert(). After reproducing, the crash dump
(via crash) shows the nf_conntrack involved is a template (used as the master),
and the expectation insertion was triggered by a TFTP packet.
The detailed information is as follows:
---------------------------------------------------
crash> sys
KERNEL: vmlinux [TAINTED]
DUMPFILE: coredump-2025-08-15-00_40-8.0.0-B3 [PARTIAL DUMP]
CPUS: 64
DATE: Thu Aug 14 17:40:23 PDT 2025
UPTIME: 00:01:36
LOAD AVERAGE: 4.39, 1.37, 0.48
TASKS: 1115
NODENAME: MYNODE
RELEASE: 6.1
VERSION: #15 SMP Thu Aug 14 17:02:44 PDT 2025
MACHINE: x86_64 (2800 Mhz)
MEMORY: 382.7 GB
PANIC: "Kernel panic - not syncing: [nf_ct_expect_insert:417] exp:ffff88822ed78008 master:ffff888136fcd000 ext:ffff8881087bf500 jiffies:4294761886 timeout:300 expires:4295061886"
crash> bt
PID: 8605 TASK: ffff888139140040 CPU: 4 COMMAND: "cli"
#0 [ffffc9001762b7f8] machine_kexec at ffffffff80279d53
#1 [ffffc9001762b878] __crash_kexec at ffffffff8038b7b7
#2 [ffffc9001762b948] panic at ffffffff802973e0
#3 [ffffc9001762b9c8] nf_ct_expect_related_report at ffffffff80ee7b27
#4 [ffffc9001762ba40] tftp_help at ffffffff80f001ea
#5 [ffffc9001762ba98] nf_confirm at ffffffff80eeaa77
#6 [ffffc9001762bac8] ipv4_confirm at ffffffff80eeafa9
#7 [ffffc9001762baf8] nf_hook_slow at ffffffff80ed24db
#8 [ffffc9001762bb40] ip_output at ffffffff80fe85a5
#9 [ffffc9001762bbc8] udp_send_skb at ffffffff81033372
#10 [ffffc9001762bc18] udp_sendmsg at ffffffff81032cb2
#11 [ffffc9001762bd90] inet_sendmsg at ffffffff810488a1
#12 [ffffc9001762bdb8] __sys_sendto at ffffffff80dcdda7
#13 [ffffc9001762bf08] __x64_sys_sendto at ffffffff80dcde46
#14 [ffffc9001762bf18] do_syscall_64 at ffffffff811eab09
#15 [ffffc9001762bf50] entry_SYSCALL_64_after_hwframe at ffffffff812000dc
RIP: 00007fde06cdb8f3 RSP: 00007ffc80d56358 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 000055eb6f07fc03 RCX: 00007fde06cdb8f3
RDX: 000000000000001d RSI: 00007fde047666c0 RDI: 0000000000000007
RBP: 000055eb6fc941c3 R8: 00007fde047665a0 R9: 0000000000000010
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
R13: 000000000000000b R14: 00007fde030f4908 R15: 0000000000000006
ORIG_RAX: 000000000000002c CS: 0033 SS: 002b
crash> nf_conn.status -x ffff888136fcd000
status = 0x808,
----------------------------------------------------
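
The `status = 0x808` value above can be decoded by hand. A minimal sketch follows, using the bit positions of `IPS_CONFIRMED_BIT` (3) and `IPS_TEMPLATE_BIT` (11) from `enum ip_conntrack_status` in the kernel UAPI header `include/uapi/linux/netfilter/nf_conntrack_common.h`; the helper functions `status_has_bit` and `status_other_bits` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions from enum ip_conntrack_status in the kernel UAPI header
 * include/uapi/linux/netfilter/nf_conntrack_common.h. */
#define IPS_CONFIRMED_BIT	3
#define IPS_TEMPLATE_BIT	11

/* True if the given status bit is set. */
static bool status_has_bit(unsigned long status, unsigned int bit)
{
	assert(bit < 8 * sizeof(status));
	return (status >> bit) & 1UL;
}

/* Bits left over after masking off CONFIRMED and TEMPLATE. */
static unsigned long status_other_bits(unsigned long status)
{
	return status & ~((1UL << IPS_CONFIRMED_BIT) | (1UL << IPS_TEMPLATE_BIT));
}
```

For `0x808`, both `status_has_bit(0x808, IPS_TEMPLATE_BIT)` and `status_has_bit(0x808, IPS_CONFIRMED_BIT)` hold and no other bit is set, which is why the dumped `nf_conn` is read as a template.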
Qingjie Xing <xqjcool@gmail.com> wrote:
> I added a panic() in nf_ct_expect_insert(). After reproducing, the crash dump
> (via crash) shows the nf_conntrack involved is a template (used as the master),
> and the expectation insertion was triggered by a TFTP packet.
The tftp packet should be associated with a conntrack entry, not a
template.
> #3 [ffffc9001762b9c8] nf_ct_expect_related_report at ffffffff80ee7b27
> #4 [ffffc9001762ba40] tftp_help at ffffffff80f001ea
> #5 [ffffc9001762ba98] nf_confirm at ffffffff80eeaa77
> #6 [ffffc9001762bac8] ipv4_confirm at ffffffff80eeafa9
> #7 [ffffc9001762baf8] nf_hook_slow at ffffffff80ed24db
> #8 [ffffc9001762bb40] ip_output at ffffffff80fe85a5
> #9 [ffffc9001762bbc8] udp_send_skb at ffffffff81033372
> #10 [ffffc9001762bc18] udp_sendmsg at ffffffff81032cb2
> #11 [ffffc9001762bd90] inet_sendmsg at ffffffff810488a1
How can this happen?
1. -t raw assigns skb->_nfct to the template.
2. at OUTPUT, nf_conntrack_in is called:
unsigned int
nf_conntrack_in(struct sk_buff *skb, const struct nf_hook_state *state)
{
	enum ip_conntrack_info ctinfo;
	struct nf_conn *ct, *tmpl;
	u_int8_t protonum;
	int dataoff, ret;

	tmpl = nf_ct_get(skb, &ctinfo);
	if (tmpl || ctinfo == IP_CT_UNTRACKED) {
		/* Previously seen (loopback or untracked)? Ignore. */
		if ((tmpl && !nf_ct_is_template(tmpl)) ||
		    ctinfo == IP_CT_UNTRACKED)
			return NF_ACCEPT;
		skb->_nfct = 0; // HERE
	}
... and that will *clear* the template again.
3. nf_conntrack_in assigns skb->_nfct to a newly allocated
conntrack (not a template).
The backtrace you quote should be impossible.
You need to figure out why skb->_nfct was not cleared by
nf_conntrack_in().
You did not mention anything about timing, does this only
happen at the start, i.e. do we have a race where nf_confirm
was just registered with nf_hook_slow for the first time but
ipv4_confirm wasn't set up yet?
If so, please fix nf_confirm() to return early if the skb
has a template attached.
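
The three steps above, plus the suggested early return, can be sketched as a userspace model. Everything here is an assumption made for illustration: `model_ct`, `model_skb`, `model_conntrack_in` and `model_confirm_should_skip` are hypothetical names, the real function derives the new entry from the packet instead of taking it as a parameter, and the guard is only one reading of the suggestion, not an existing kernel change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct nf_conn and struct sk_buff. */
struct model_ct {
	bool is_template;
};

struct model_skb {
	struct model_ct *ct;	/* stands in for skb->_nfct */
	bool untracked;		/* stands in for ctinfo == IP_CT_UNTRACKED */
};

enum model_verdict { MODEL_ACCEPT_EARLY, MODEL_TRACKED };

/* Mirrors the quoted check: a template is dropped, then a real entry is attached. */
static enum model_verdict model_conntrack_in(struct model_skb *skb,
					     struct model_ct *fresh)
{
	struct model_ct *tmpl;

	assert(skb != NULL && fresh != NULL && !fresh->is_template);
	tmpl = skb->ct;
	if (tmpl || skb->untracked) {
		/* Previously seen (loopback or untracked): leave the skb alone. */
		if ((tmpl && !tmpl->is_template) || skb->untracked)
			return MODEL_ACCEPT_EARLY;
		skb->ct = NULL;	/* step 2: the template is cleared */
	}
	skb->ct = fresh;	/* step 3: a non-template entry is attached */
	return MODEL_TRACKED;
}

/* Suggested guard: the confirm hook does nothing while a template is attached. */
static bool model_confirm_should_skip(const struct model_skb *skb)
{
	return skb->ct != NULL && skb->ct->is_template;
}
```

In this model a helper can only see a template if `model_conntrack_in()` never ran for the packet, which is the situation the guard is meant to catch.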
Thanks for the careful review and the pointers.

I dug deeper and found the root cause on my side: there was leftover/out-of-tree
code in my local tree that could attach the per-rule template to skb->_nfct.
After cleaning up those remnants, upstream behavior matches your description:
nf_conntrack_in() clears any template, tftp_help() sees a real conntrack, and
I can no longer reproduce the crash.

Apologies for the noise and for any time this cost you. I’ll withdraw the patch
as it was addressing a problem introduced by my local changes.

Thanks again for the guidance.