[PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

Judy Hsiao posted 1 patch 2 years ago
net/core/neighbour.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
[PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by Judy Hsiao 2 years ago
We are seeing cases where neigh_cleanup_and_release() is called by
neigh_forced_gc() many times in a row with preemption turned off.
When running on a low powered CPU at a low CPU frequency, this has
been measured to keep preemption off for ~10 ms. That's not great on a
system with HZ=1000 which expects tasks to be able to schedule in
with ~1ms latency.

Suggested-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Judy Hsiao <judyhsiao@chromium.org>

---

Changes in v2:
- Use ktime_get_ns() for timeout calculation instead of jiffies.

 net/core/neighbour.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index df81c1f0a570..552719c3bbc3 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
 {
 	int max_clean = atomic_read(&tbl->gc_entries) -
 			READ_ONCE(tbl->gc_thresh2);
+	u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
 	unsigned long tref = jiffies - 5 * HZ;
 	struct neighbour *n, *tmp;
 	int shrunk = 0;
+	int loop = 0;
 
 	NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
 
@@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
 				shrunk++;
 			if (shrunk >= max_clean)
 				break;
+			if (++loop == 16) {
+				if (ktime_get_ns() > tmax)
+					goto unlock;
+				loop = 0;
+			}
 		}
 	}
 
 	WRITE_ONCE(tbl->last_flush, jiffies);
-
+unlock:
 	write_unlock_bh(&tbl->lock);
 
 	return shrunk;
-- 
2.43.0.rc2.451.g8631bc7472-goog
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by patchwork-bot+netdevbpf@kernel.org 2 years ago
Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Wed,  6 Dec 2023 03:38:33 +0000 you wrote:
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
> 
> [...]

Here is the summary with links:
  - [v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
    https://git.kernel.org/netdev/net/c/e5dc5afff62f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by Stephen Hemminger 2 years ago
On Wed,  6 Dec 2023 03:38:33 +0000
Judy Hsiao <judyhsiao@chromium.org> wrote:

> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index df81c1f0a570..552719c3bbc3 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>  {
>  	int max_clean = atomic_read(&tbl->gc_entries) -
>  			READ_ONCE(tbl->gc_thresh2);
> +	u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>  	unsigned long tref = jiffies - 5 * HZ;
>  	struct neighbour *n, *tmp;
>  	int shrunk = 0;
> +	int loop = 0;
>  
>  	NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>  
> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>  				shrunk++;
>  			if (shrunk >= max_clean)
>  				break;
> +			if (++loop == 16) {

Overall looks good.
Minor comments:
	- loop count should probably be unsigned
        - the magic constant 16 should be a sysctl tuneable
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by David Ahern 2 years ago
On 12/6/23 10:39 AM, Stephen Hemminger wrote:
> On Wed,  6 Dec 2023 03:38:33 +0000
> Judy Hsiao <judyhsiao@chromium.org> wrote:
> 
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index df81c1f0a570..552719c3bbc3 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>  {
>>  	int max_clean = atomic_read(&tbl->gc_entries) -
>>  			READ_ONCE(tbl->gc_thresh2);
>> +	u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>>  	unsigned long tref = jiffies - 5 * HZ;
>>  	struct neighbour *n, *tmp;
>>  	int shrunk = 0;
>> +	int loop = 0;
>>  
>>  	NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>>  
>> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>  				shrunk++;
>>  			if (shrunk >= max_clean)
>>  				break;
>> +			if (++loop == 16) {
> 
> Overall looks good.
> Minor comments:
> 	- loop count should probably be unsigned
>         - the magic constant 16 should be a sysctl tuneable

A tunable is needed here; the loop counter is just to keep the overhead
of the ktime_get_ns call in check.
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by Doug Anderson 2 years ago
Hi,

On Wed, Dec 6, 2023 at 9:51 AM David Ahern <dsahern@kernel.org> wrote:
>
> On 12/6/23 10:39 AM, Stephen Hemminger wrote:
> > On Wed,  6 Dec 2023 03:38:33 +0000
> > Judy Hsiao <judyhsiao@chromium.org> wrote:
> >
> >> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> >> index df81c1f0a570..552719c3bbc3 100644
> >> --- a/net/core/neighbour.c
> >> +++ b/net/core/neighbour.c
> >> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> >>  {
> >>      int max_clean = atomic_read(&tbl->gc_entries) -
> >>                      READ_ONCE(tbl->gc_thresh2);
> >> +    u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
> >>      unsigned long tref = jiffies - 5 * HZ;
> >>      struct neighbour *n, *tmp;
> >>      int shrunk = 0;
> >> +    int loop = 0;
> >>
> >>      NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
> >>
> >> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> >>                              shrunk++;
> >>                      if (shrunk >= max_clean)
> >>                              break;
> >> +                    if (++loop == 16) {
> >
> > Overall looks good.
> > Minor comments:
> >       - loop count should probably be unsigned
> >         - the magic constant 16 should be a sysctl tuneable
>
> A tunable is needed here; the loop counter is just to keep the overhead
> of the ktime_get_ns call in check.

From context, I'm going to assume you meant a tunable is _NOT_ needed here. ;-)

-Doug
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by Doug Anderson 2 years ago
Hi,

On Tue, Dec 5, 2023 at 7:39 PM Judy Hsiao <judyhsiao@chromium.org> wrote:
>
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> Suggested-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Judy Hsiao <judyhsiao@chromium.org>
>
> ---
>
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.
>
>  net/core/neighbour.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Though as evidenced by the discussion in v1 I'm not versed enough in
this code for it to mean much, the patch nonetheless looks reasonable
to me. I'm happy enough with:

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by Eric Dumazet 2 years ago
On Wed, Dec 6, 2023 at 4:39 AM Judy Hsiao <judyhsiao@chromium.org> wrote:
>
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> Suggested-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Judy Hsiao <judyhsiao@chromium.org>
>
> ---
>
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.

SGTM, thanks.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
Posted by David Ahern 2 years ago
On 12/5/23 8:38 PM, Judy Hsiao wrote:
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
> 
> Suggested-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Judy Hsiao <judyhsiao@chromium.org>
> 
> ---
> 
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.
> 
>  net/core/neighbour.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 


Reviewed-by: David Ahern <dsahern@kernel.org>