[PATCH net-next] netfilter: conntrack: Reduce cond_resched frequency in gc_worker

lirongqing posted 1 patch 2 months ago
From: Li RongQing <lirongqing@baidu.com>

The current implementation calls cond_resched() in every iteration
of the garbage collection loop. This adds measurable overhead when
processing large conntrack tables with millions of entries, since
each cond_resched() invocation involves scheduler operations.

To reduce this overhead, implement a time-based throttling mechanism
that calls cond_resched() at most once per millisecond. This maintains
system responsiveness while minimizing scheduler contention.

Benchmarking gc_worker() with hashsize=10000 shows a measurable improvement:

Before: 7114.274us
After:  5993.518us (15.8% reduction)

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 net/netfilter/nf_conntrack_core.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 344f882..779ca03 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1513,7 +1513,7 @@ static bool gc_worker_can_early_drop(const struct nf_conn *ct)
 static void gc_worker(struct work_struct *work)
 {
 	unsigned int i, hashsz, nf_conntrack_max95 = 0;
-	u32 end_time, start_time = nfct_time_stamp;
+	u32 end_time, resched_time, start_time = nfct_time_stamp;
 	struct conntrack_gc_work *gc_work;
 	unsigned int expired_count = 0;
 	unsigned long next_run;
@@ -1536,6 +1536,7 @@ static void gc_worker(struct work_struct *work)
 	count = gc_work->count;
 
 	end_time = start_time + GC_SCAN_MAX_DURATION;
+	resched_time = nfct_time_stamp;
 
 	do {
 		struct nf_conntrack_tuple_hash *h;
@@ -1615,7 +1616,10 @@ static void gc_worker(struct work_struct *work)
 		 * we will just continue with next hash slot.
 		 */
 		rcu_read_unlock();
-		cond_resched();
+		if (nfct_time_stamp - resched_time > msecs_to_jiffies(1)) {
+			cond_resched();
+			resched_time = nfct_time_stamp;
+		}
 		i++;
 
 		delta_time = nfct_time_stamp - end_time;
-- 
2.9.4
Re: [PATCH net-next] netfilter: conntrack: Reduce cond_resched frequency in gc_worker
Posted by Florian Westphal 2 months ago
lirongqing <lirongqing@baidu.com> wrote:
> From: Li RongQing <lirongqing@baidu.com>
> 
> The current implementation calls cond_resched() in every iteration
> of the garbage collection loop. This adds measurable overhead when
> processing large conntrack tables with millions of entries, since
> each cond_resched() invocation involves scheduler operations.
> 
> To reduce this overhead, implement a time-based throttling mechanism
> that calls cond_resched() at most once per millisecond. This maintains
> system responsiveness while minimizing scheduler contention.
> 
> Benchmarking gc_worker() with hashsize=10000 shows a measurable improvement:
> 
> Before: 7114.274us
> After:  5993.518us (15.8% reduction)

I dislike this; I have never seen this pattern before.

The whole point of cond_resched() is to let the scheduler decide.

Maybe it would be better to move gc_worker off to its own
workqueue (create_workqueue()) instead of reusing the system wq,
so one can tune the priority?
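
The dedicated-workqueue idea could look roughly like the following
kernel-side sketch (hypothetical: the name gc_wq and the flag choice are
illustrative, and current kernels would use alloc_workqueue() rather
than the legacy create_workqueue() wrapper):

```c
/* Hypothetical sketch -- names and flags are illustrative,
 * not from any posted patch. */
static struct workqueue_struct *gc_wq;

static int nf_conntrack_gc_wq_init(void)
{
	/* WQ_UNBOUND leaves CPU placement to the scheduler; WQ_SYSFS
	 * exposes the queue under /sys/devices/virtual/workqueue/ so
	 * its attributes can be tuned from userspace. */
	gc_wq = alloc_workqueue("nf_ct_gc", WQ_UNBOUND | WQ_SYSFS, 1);
	if (!gc_wq)
		return -ENOMEM;
	return 0;
}

/* gc_worker() rescheduling would then queue onto gc_wq instead of a
 * system workqueue, e.g.:
 *	queue_delayed_work(gc_wq, &gc_work->dwork, next_run);
 */
```

This keeps the per-iteration cond_resched() untouched and instead lets
an administrator adjust how the gc work itself is scheduled.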