From: Joe Damato
To: netdev@vger.kernel.org
Cc: mkarsten@uwaterloo.ca, skhawaja@google.com, sdf@fomichev.me, bjorn@rivosinc.com,
	amritha.nambiar@intel.com, sridhar.samudrala@intel.com,
	willemdebruijn.kernel@gmail.com, Joe Damato, "David S. Miller",
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jonathan Corbet,
	Tony Nguyen, Przemek Kitszel, Jiri Pirko, Sebastian Andrzej Siewior,
	Lorenzo Bianconi, David Ahern, Kory Maincent, Johannes Berg,
	Breno Leitao, Alexander Lobakin,
	linux-doc@vger.kernel.org (open list:DOCUMENTATION),
	linux-kernel@vger.kernel.org (open list),
	intel-wired-lan@lists.osuosl.org (moderated list:INTEL ETHERNET DRIVERS)
Subject: [RFC net-next v4 3/9] net: napi: Make gro_flush_timeout per-NAPI
Date: Tue, 1 Oct 2024 23:52:34 +0000
Message-Id: <20241001235302.57609-4-jdamato@fastly.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20241001235302.57609-1-jdamato@fastly.com>
References: <20241001235302.57609-1-jdamato@fastly.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Allow per-NAPI gro_flush_timeout setting.

The existing sysfs parameter is respected; writes to sysfs will write to
all NAPI structs for the device and the net_device gro_flush_timeout
field. Reads from sysfs will read from the net_device field.

The ability to set gro_flush_timeout on specific NAPI instances will be
added in a later commit, via netdev-genl.

Note that idpf has embedded napi_struct in its internals and has
established some series of asserts that involve the size of napi
structure. Since this change increases the napi_struct size from 400 to
416 (according to pahole on my system), I've increased the assertion in
idpf by 16 bytes. No attention whatsoever was paid to the cacheline
placement of idpf internals as a result of this change.
Signed-off-by: Joe Damato
---
 .../networking/net_cachelines/net_device.rst |  2 +-
 drivers/net/ethernet/intel/idpf/idpf_txrx.h  |  2 +-
 include/linux/netdevice.h                    |  3 +-
 net/core/dev.c                               | 12 +++---
 net/core/dev.h                               | 40 +++++++++++++++++++
 net/core/net-sysfs.c                         |  2 +-
 6 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/Documentation/networking/net_cachelines/net_device.rst b/Documentation/networking/net_cachelines/net_device.rst
index eeeb7c925ec5..3d02ae79c850 100644
--- a/Documentation/networking/net_cachelines/net_device.rst
+++ b/Documentation/networking/net_cachelines/net_device.rst
@@ -98,7 +98,6 @@ struct_netdev_queue*                _rx                     read_mostly
 unsigned_int                        num_rx_queues
 unsigned_int                        real_num_rx_queues      -                   read_mostly         get_rps_cpu
 struct_bpf_prog*                    xdp_prog                -                   read_mostly         netif_elide_gro()
-unsigned_long                       gro_flush_timeout       -                   read_mostly         napi_complete_done
 unsigned_int                        gro_max_size            -                   read_mostly         skb_gro_receive
 unsigned_int                        gro_ipv4_max_size       -                   read_mostly         skb_gro_receive
 rx_handler_func_t*                  rx_handler              read_mostly         -                   __netif_receive_skb_core
@@ -182,4 +181,5 @@ struct_devlink_port*                devlink_port
 struct_dpll_pin*                    dpll_pin
 struct hlist_head                   page_pools
 struct dim_irq_moder*               irq_moder
+unsigned_long                       gro_flush_timeout
 u32                                 napi_defer_hard_irqs
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index f0537826f840..fcdf73486d46 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -438,7 +438,7 @@ struct idpf_q_vector {
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_q_vector, 112,
-			    424 + 2 * sizeof(struct dim),
+			    440 + 2 * sizeof(struct dim),
 			    8 + sizeof(cpumask_var_t));
 
 struct idpf_rx_queue_stats {
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 55764efc5c93..33897edd16c8 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -377,6 +377,7 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
 	int			irq;
+	unsigned long		gro_flush_timeout;
 	u32			defer_hard_irqs;
 };
 
@@ -2075,7 +2076,6 @@ struct net_device {
 	int			ifindex;
 	unsigned int		real_num_rx_queues;
 	struct netdev_rx_queue	*_rx;
-	unsigned long		gro_flush_timeout;
 	unsigned int		gro_max_size;
 	unsigned int		gro_ipv4_max_size;
 	rx_handler_func_t __rcu	*rx_handler;
@@ -2398,6 +2398,7 @@ struct net_device {
 
 	/** @irq_moder: dim parameters used if IS_ENABLED(CONFIG_DIMLIB). */
 	struct dim_irq_moder	*irq_moder;
+	unsigned long		gro_flush_timeout;
 	u32			napi_defer_hard_irqs;
 
 	u8			priv[] ____cacheline_aligned
diff --git a/net/core/dev.c b/net/core/dev.c
index 748739958d2a..056ed44f766f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6226,12 +6226,12 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
 
 	if (work_done) {
 		if (n->gro_bitmask)
-			timeout = READ_ONCE(n->dev->gro_flush_timeout);
+			timeout = napi_get_gro_flush_timeout(n);
 		n->defer_hard_irqs_count = napi_get_defer_hard_irqs(n);
 	}
 	if (n->defer_hard_irqs_count > 0) {
 		n->defer_hard_irqs_count--;
-		timeout = READ_ONCE(n->dev->gro_flush_timeout);
+		timeout = napi_get_gro_flush_timeout(n);
 		if (timeout)
 			ret = false;
 	}
@@ -6366,7 +6366,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
 
 	if (flags & NAPI_F_PREFER_BUSY_POLL) {
 		napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi);
-		timeout = READ_ONCE(napi->dev->gro_flush_timeout);
+		timeout = napi_get_gro_flush_timeout(napi);
 		if (napi->defer_hard_irqs_count && timeout) {
 			hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED);
 			skip_schedule = true;
@@ -6648,6 +6648,7 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
 	hrtimer_init(&napi->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED);
 	napi->timer.function = napi_watchdog;
 	napi_set_defer_hard_irqs(napi, READ_ONCE(dev->napi_defer_hard_irqs));
+	napi_set_gro_flush_timeout(napi, READ_ONCE(dev->gro_flush_timeout));
 	init_gro_hash(napi);
 	napi->skb = NULL;
 	INIT_LIST_HEAD(&napi->rx_list);
@@ -11053,7 +11054,7 @@ void netdev_sw_irq_coalesce_default_on(struct net_device *dev)
 	WARN_ON(dev->reg_state == NETREG_REGISTERED);
 
 	if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
-		dev->gro_flush_timeout = 20000;
+		netdev_set_gro_flush_timeout(dev, 20000);
 		netdev_set_defer_hard_irqs(dev, 1);
 	}
 }
@@ -11991,7 +11992,6 @@ static void __init net_dev_struct_check(void)
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, ifindex);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, real_num_rx_queues);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, _rx);
-	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_flush_timeout);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_max_size);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_ipv4_max_size);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, rx_handler);
@@ -12003,7 +12003,7 @@ static void __init net_dev_struct_check(void)
 #ifdef CONFIG_NET_XGRESS
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, tcx_ingress);
 #endif
-	CACHELINE_ASSERT_GROUP_SIZE(struct net_device, net_device_read_rx, 100);
+	CACHELINE_ASSERT_GROUP_SIZE(struct net_device, net_device_read_rx, 92);
 }
 
 /*
diff --git a/net/core/dev.h b/net/core/dev.h
index b3792219879b..26e598aa56c3 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -174,6 +174,46 @@ static inline void netdev_set_defer_hard_irqs(struct net_device *netdev,
 		napi_set_defer_hard_irqs(napi, defer);
 }
 
+/**
+ * napi_get_gro_flush_timeout - get the gro_flush_timeout
+ * @n: napi struct to get the gro_flush_timeout from
+ *
+ * Return: the per-NAPI value of the gro_flush_timeout field.
+ */
+static inline unsigned long
+napi_get_gro_flush_timeout(const struct napi_struct *n)
+{
+	return READ_ONCE(n->gro_flush_timeout);
+}
+
+/**
+ * napi_set_gro_flush_timeout - set the gro_flush_timeout for a napi
+ * @n: napi struct to set the gro_flush_timeout
+ * @timeout: timeout value to set
+ *
+ * napi_set_gro_flush_timeout sets the per-NAPI gro_flush_timeout
+ */
+static inline void napi_set_gro_flush_timeout(struct napi_struct *n,
+					      unsigned long timeout)
+{
+	WRITE_ONCE(n->gro_flush_timeout, timeout);
+}
+
+/**
+ * netdev_set_gro_flush_timeout - set gro_flush_timeout of a netdev's NAPIs
+ * @netdev: the net_device for which all NAPIs will have gro_flush_timeout set
+ * @timeout: the timeout value to set
+ */
+static inline void netdev_set_gro_flush_timeout(struct net_device *netdev,
+						unsigned long timeout)
+{
+	struct napi_struct *napi;
+
+	WRITE_ONCE(netdev->gro_flush_timeout, timeout);
+	list_for_each_entry(napi, &netdev->napi_list, dev_list)
+		napi_set_gro_flush_timeout(napi, timeout);
+}
+
 int rps_cpumask_housekeeping(struct cpumask *mask);
 
 #if defined(CONFIG_DEBUG_NET) && defined(CONFIG_BPF_SYSCALL)
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 25125f356a15..2d9afc6e2161 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -409,7 +409,7 @@ NETDEVICE_SHOW_RW(tx_queue_len, fmt_dec);
 
 static int change_gro_flush_timeout(struct net_device *dev, unsigned long val)
 {
-	WRITE_ONCE(dev->gro_flush_timeout, val);
+	netdev_set_gro_flush_timeout(dev, val);
 	return 0;
 }
 
-- 
2.25.1