From: Li RongQing <lirongqing@baidu.com>

gc_worker() currently calls cond_resched() in every iteration of the
garbage collection loop. On large conntrack tables this adds measurable
overhead, as each cond_resched() invocation goes through the scheduler.

To reduce this overhead, add a time-based throttle so that
cond_resched() is called at most once per millisecond. This keeps the
system responsive while cutting the per-iteration scheduling cost.

gc_worker() with hashsize=10000 shows measurable improvement:

Before: 7114.274us
After: 5993.518us (15.8% reduction)

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
net/netfilter/nf_conntrack_core.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 344f882..779ca03 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1513,7 +1513,7 @@ static bool gc_worker_can_early_drop(const struct nf_conn *ct)
 static void gc_worker(struct work_struct *work)
 {
 	unsigned int i, hashsz, nf_conntrack_max95 = 0;
-	u32 end_time, start_time = nfct_time_stamp;
+	u32 end_time, resched_time, start_time = nfct_time_stamp;
 	struct conntrack_gc_work *gc_work;
 	unsigned int expired_count = 0;
 	unsigned long next_run;
@@ -1536,6 +1536,7 @@ static void gc_worker(struct work_struct *work)
 	count = gc_work->count;
 
 	end_time = start_time + GC_SCAN_MAX_DURATION;
+	resched_time = nfct_time_stamp;
 
 	do {
 		struct nf_conntrack_tuple_hash *h;
@@ -1615,7 +1616,10 @@ static void gc_worker(struct work_struct *work)
 		 * we will just continue with next hash slot.
 		 */
 		rcu_read_unlock();
-		cond_resched();
+		if (nfct_time_stamp - resched_time > msecs_to_jiffies(1)) {
+			cond_resched();
+			resched_time = nfct_time_stamp;
+		}
 		i++;
 
 		delta_time = nfct_time_stamp - end_time;
--
2.9.4
lirongqing <lirongqing@baidu.com> wrote:
> From: Li RongQing <lirongqing@baidu.com>
>
> gc_worker() currently calls cond_resched() in every iteration of the
> garbage collection loop. On large conntrack tables this adds measurable
> overhead, as each cond_resched() invocation goes through the scheduler.
>
> To reduce this overhead, add a time-based throttle so that
> cond_resched() is called at most once per millisecond. This keeps the
> system responsive while cutting the per-iteration scheduling cost.
>
> gc_worker() with hashsize=10000 shows measurable improvement:
>
> Before: 7114.274us
> After: 5993.518us (15.8% reduction)

I dislike this, I have never seen this pattern.

The whole point of cond_resched() is to let the scheduler decide.

Maybe it would be better to move gc_worker off to its own work queue
(create_workqueue()) instead of reusing the system wq, so one can tune
its priority?
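
For illustration, here is a minimal, untested sketch of that alternative:
a dedicated workqueue driving a deferrable gc-style work item. None of
this is the actual conntrack code; the names (gc_wq, gc_fn, gc_example)
are made up, and alloc_workqueue() is used rather than the legacy
create_workqueue() wrapper.

/* Hypothetical sketch only: periodic gc-style worker on its own
 * workqueue instead of a system wq.
 */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct workqueue_struct *gc_wq;
static struct delayed_work gc_dwork;

static void gc_fn(struct work_struct *work)
{
	/* ... scan part of the table, cond_resched() between buckets ... */

	/* re-arm ourselves; once per second in this sketch */
	queue_delayed_work(gc_wq, &gc_dwork, HZ);
}

static int __init gc_example_init(void)
{
	/* unbound, at most one work item active at a time */
	gc_wq = alloc_workqueue("gc_example", WQ_UNBOUND | WQ_SYSFS, 1);
	if (!gc_wq)
		return -ENOMEM;

	INIT_DEFERRABLE_WORK(&gc_dwork, gc_fn);
	queue_delayed_work(gc_wq, &gc_dwork, HZ);
	return 0;
}

static void __exit gc_example_exit(void)
{
	cancel_delayed_work_sync(&gc_dwork);
	destroy_workqueue(gc_wq);
}

module_init(gc_example_init);
module_exit(gc_example_exit);
MODULE_LICENSE("GPL");

With WQ_UNBOUND | WQ_SYSFS the queue shows up under
/sys/devices/virtual/workqueue/gc_example/, where its nice level and
cpumask can be adjusted from userspace without touching the gc code.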