From: Joe Damato
To: netdev@vger.kernel.org
Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me,
	peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com,
	hch@infradead.org, willy@infradead.org, willemdebruijn.kernel@gmail.com,
	skhawaja@google.com, kuba@kernel.org, Martin Karsten, Joe Damato,
	"David S. Miller", Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Sebastian Andrzej Siewior, Lorenzo Bianconi, Daniel Borkmann,
	Breno Leitao, Johannes Berg, Heiner Kallweit,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH net-next 1/6] net: Add sysfs parameter irq_suspend_timeout
Date: Fri, 23 Aug 2024 17:30:52 +0000
Message-Id: <20240823173103.94978-2-jdamato@fastly.com>
In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com>
References: <20240823173103.94978-1-jdamato@fastly.com>

From: Martin Karsten

This patch doesn't change any behavior but prepares the code for other
changes in the following commits which use irq_suspend_timeout as a
timeout for IRQ suspension.

Signed-off-by: Martin Karsten
Co-developed-by: Joe Damato
Signed-off-by: Joe Damato
Tested-by: Joe Damato
Tested-by: Martin Karsten
---
 rfc -> v1:
   - Removed napi.rst documentation from this patch; added to patch 6.

 include/linux/netdevice.h |  2 ++
 net/core/dev.c            |  3 ++-
 net/core/net-sysfs.c      | 18 ++++++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0ef3eaa23f4b..31867bb2ff65 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1857,6 +1857,7 @@ enum netdev_reg_state {
  *	@gro_flush_timeout:	timeout for GRO layer in NAPI
  *	@napi_defer_hard_irqs:	If not zero, provides a counter that would
  *				allow to avoid NIC hard IRQ, on busy queues.
+ *	@irq_suspend_timeout:	IRQ suspension timeout
  *
  *	@rx_handler:		handler for received packets
  *	@rx_handler_data:	XXX: need comments on this one
@@ -2060,6 +2061,7 @@ struct net_device {
 	struct netdev_rx_queue	*_rx;
 	unsigned long		gro_flush_timeout;
 	int			napi_defer_hard_irqs;
+	unsigned long		irq_suspend_timeout;
 	unsigned int		gro_max_size;
 	unsigned int		gro_ipv4_max_size;
 	rx_handler_func_t __rcu	*rx_handler;
diff --git a/net/core/dev.c b/net/core/dev.c
index e7260889d4cb..3bf325ec25a3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11945,6 +11945,7 @@ static void __init net_dev_struct_check(void)
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, _rx);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_flush_timeout);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, napi_defer_hard_irqs);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, irq_suspend_timeout);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_max_size);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, gro_ipv4_max_size);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, rx_handler);
@@ -11956,7 +11957,7 @@ static void __init net_dev_struct_check(void)
 #ifdef CONFIG_NET_XGRESS
 	CACHELINE_ASSERT_GROUP_MEMBER(struct net_device, net_device_read_rx, tcx_ingress);
 #endif
-	CACHELINE_ASSERT_GROUP_SIZE(struct net_device, net_device_read_rx, 104);
+	CACHELINE_ASSERT_GROUP_SIZE(struct net_device, net_device_read_rx, 112);
 }
 
 /*
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 0e2084ce7b75..fb6f3327310f 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -440,6 +440,23 @@ static ssize_t napi_defer_hard_irqs_store(struct device *dev,
 }
 NETDEVICE_SHOW_RW(napi_defer_hard_irqs, fmt_dec);
 
+static int change_irq_suspend_timeout(struct net_device *dev, unsigned long val)
+{
+	WRITE_ONCE(dev->irq_suspend_timeout, val);
+	return 0;
+}
+
+static ssize_t irq_suspend_timeout_store(struct device *dev,
+					 struct device_attribute *attr,
+					 const char *buf, size_t len)
+{
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	return netdev_store(dev, attr, buf, len, change_irq_suspend_timeout);
+}
+NETDEVICE_SHOW_RW(irq_suspend_timeout, fmt_ulong);
+
 static ssize_t ifalias_store(struct device *dev, struct device_attribute *attr,
 			     const char *buf, size_t len)
 {
@@ -664,6 +681,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_tx_queue_len.attr,
 	&dev_attr_gro_flush_timeout.attr,
 	&dev_attr_napi_defer_hard_irqs.attr,
+	&dev_attr_irq_suspend_timeout.attr,
 	&dev_attr_phys_port_id.attr,
 	&dev_attr_phys_port_name.attr,
 	&dev_attr_phys_switch_id.attr,
-- 
2.25.1

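
For illustration only: a userspace sketch of how the attribute added by
patch 1/6 might be set, assuming it shows up next to gro_flush_timeout under
/sys/class/net/<ifname>/ as the net_class_attrs change suggests. The helper
names, the interface name "eth0" and the example value are hypothetical. The
value is treated as nanoseconds because later patches in the series pass it
to ns_to_ktime(). Writing the real attribute requires CAP_NET_ADMIN and a
kernel carrying this series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "/sys/class/net/<ifname>/irq_suspend_timeout"
 * into buf. Returns 0 on success, -1 if ifname is empty or buf is too small.
 */
static int irq_suspend_timeout_path(char *buf, size_t len, const char *ifname)
{
	int n;

	if (ifname == NULL || strlen(ifname) == 0)
		return -1;
	n = snprintf(buf, len, "/sys/class/net/%s/irq_suspend_timeout", ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write a timeout (assumed to be in nanoseconds) as
 * decimal text to the given attribute file. Returns 0 on success, -1 on error.
 */
static int write_timeout_ns(const char *path, unsigned long timeout_ns)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (f == NULL)
		return -1;
	ok = fprintf(f, "%lu\n", timeout_ns) > 0;
	if (fclose(f) != 0)	/* buffered write errors surface here */
		ok = 0;
	return ok ? 0 : -1;
}

/* Worked cases for the path helper; "eth0" is only an example name. */
static void worked_examples(void)
{
	char path[128];

	assert(irq_suspend_timeout_path(path, sizeof(path), "eth0") == 0);
	assert(strcmp(path, "/sys/class/net/eth0/irq_suspend_timeout") == 0);
	assert(irq_suspend_timeout_path(path, 8, "eth0") == -1); /* too small */
}
```

A caller would build the path and then call, for example,
write_timeout_ns(path, 20000000UL) for a 20 ms timeout; writing 0 leaves the
feature disabled.
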
From: Joe Damato
To: netdev@vger.kernel.org
Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me,
	peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com,
	hch@infradead.org, willy@infradead.org, willemdebruijn.kernel@gmail.com,
	skhawaja@google.com, kuba@kernel.org, Martin Karsten, Joe Damato,
	"David S. Miller", Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Sebastian Andrzej Siewior, Lorenzo Bianconi, Daniel Borkmann,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH net-next 2/6] net: Suspend softirq when prefer_busy_poll is set
Date: Fri, 23 Aug 2024 17:30:53 +0000
Message-Id: <20240823173103.94978-3-jdamato@fastly.com>
In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com>
References: <20240823173103.94978-1-jdamato@fastly.com>

From: Martin Karsten

When NAPI_F_PREFER_BUSY_POLL is set during busy_poll_stop and the
irq_suspend_timeout sysfs parameter is nonzero, this timeout is used to
defer softirq scheduling, potentially for longer than gro_flush_timeout.
This can be used to effectively suspend softirq processing during the
time it takes for an application to process data and return to the next
busy-poll.

The call to napi->poll in busy_poll_stop might lead to an invocation of
napi_complete_done, but the prefer-busy flag is still set at that time,
so the same logic is used to defer softirq scheduling for
irq_suspend_timeout.

Signed-off-by: Martin Karsten
Co-developed-by: Joe Damato
Signed-off-by: Joe Damato
Tested-by: Joe Damato
Tested-by: Martin Karsten
---
 net/core/dev.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 3bf325ec25a3..74060ba866d4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6230,7 +6230,12 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
 		timeout = READ_ONCE(n->dev->gro_flush_timeout);
 		n->defer_hard_irqs_count = READ_ONCE(n->dev->napi_defer_hard_irqs);
 	}
-	if (n->defer_hard_irqs_count > 0) {
+	if (napi_prefer_busy_poll(n)) {
+		timeout = READ_ONCE(n->dev->irq_suspend_timeout);
+		if (timeout)
+			ret = false;
+	}
+	if (ret && n->defer_hard_irqs_count > 0) {
 		n->defer_hard_irqs_count--;
 		timeout = READ_ONCE(n->dev->gro_flush_timeout);
 		if (timeout)
@@ -6366,9 +6371,13 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock,
 	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
 
 	if (flags & NAPI_F_PREFER_BUSY_POLL) {
-		napi->defer_hard_irqs_count = READ_ONCE(napi->dev->napi_defer_hard_irqs);
-		timeout = READ_ONCE(napi->dev->gro_flush_timeout);
-		if (napi->defer_hard_irqs_count && timeout) {
+		timeout = READ_ONCE(napi->dev->irq_suspend_timeout);
+		if (!timeout) {
+			napi->defer_hard_irqs_count = READ_ONCE(napi->dev->napi_defer_hard_irqs);
+			if (napi->defer_hard_irqs_count)
+				timeout = READ_ONCE(napi->dev->gro_flush_timeout);
+		}
+		if (timeout) {
 			hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED);
 			skip_schedule = true;
 		}
-- 
2.25.1

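
The timeout selection that patch 2/6 adds to busy_poll_stop can be restated
as a small userspace model. This is a sketch for reasoning about the
precedence of the three tunables, not kernel code: struct dev_tunables and
stop_timeout are hypothetical names, and the model covers only the
NAPI_F_PREFER_BUSY_POLL branch shown in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device tunables read in the patch. */
struct dev_tunables {
	unsigned long irq_suspend_timeout;	/* ns, 0 disables suspension */
	unsigned long gro_flush_timeout;	/* ns */
	int napi_defer_hard_irqs;
};

/*
 * Mirrors the patched prefer-busy-poll branch of busy_poll_stop():
 * returns the hrtimer timeout that would be armed, or 0 when this branch
 * arms no timer and the NAPI is not marked skip_schedule.
 */
static unsigned long stop_timeout(const struct dev_tunables *t,
				  bool prefer_busy_poll)
{
	unsigned long timeout;

	if (!prefer_busy_poll)
		return 0;

	/* A nonzero irq_suspend_timeout takes precedence. */
	timeout = t->irq_suspend_timeout;

	/* Otherwise fall back to the pre-existing deferral settings. */
	if (!timeout && t->napi_defer_hard_irqs)
		timeout = t->gro_flush_timeout;

	return timeout;
}

/* Worked cases; the numeric values are arbitrary examples. */
static void worked_examples(void)
{
	struct dev_tunables t = {
		.irq_suspend_timeout = 20000000UL,
		.gro_flush_timeout = 200000UL,
		.napi_defer_hard_irqs = 2,
	};

	assert(stop_timeout(&t, true) == 20000000UL);	/* suspend timeout wins */

	t.irq_suspend_timeout = 0;
	assert(stop_timeout(&t, true) == 200000UL);	/* old behaviour */

	t.napi_defer_hard_irqs = 0;
	assert(stop_timeout(&t, true) == 0);		/* nothing to arm */
}
```

With irq_suspend_timeout left at 0 the model reduces to the condition the
patch removes (defer_hard_irqs_count && timeout), which matches the series'
claim that behaviour is unchanged unless the new parameter is set.
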
From: Joe Damato
To: netdev@vger.kernel.org
Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me,
	peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com,
	hch@infradead.org, willy@infradead.org, willemdebruijn.kernel@gmail.com,
	skhawaja@google.com, kuba@kernel.org, Martin Karsten, Joe Damato,
	"David S. Miller", Eric Dumazet, Paolo Abeni, Jiri Pirko,
	Sebastian Andrzej Siewior, Lorenzo Bianconi, Daniel Borkmann,
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH net-next 3/6] net: Add control functions for irq suspension
Date: Fri, 23 Aug 2024 17:30:54 +0000
Message-Id: <20240823173103.94978-4-jdamato@fastly.com>
In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com>
References: <20240823173103.94978-1-jdamato@fastly.com>

From: Martin Karsten

The napi_suspend_irqs routine bootstraps irq suspension by elongating
the defer timeout to irq_suspend_timeout.

The napi_resume_irqs routine effectively cancels irq suspension by
forcing the napi to be scheduled immediately.

Signed-off-by: Martin Karsten
Co-developed-by: Joe Damato
Signed-off-by: Joe Damato
Tested-by: Joe Damato
Tested-by: Martin Karsten
---
 include/net/busy_poll.h |  3 +++
 net/core/dev.c          | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index 9b09acac538e..f095b2bdeee1 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -52,6 +52,9 @@ void napi_busy_loop_rcu(unsigned int napi_id,
 		       bool (*loop_end)(void *, unsigned long),
 		       void *loop_end_arg, bool prefer_busy_poll, u16 budget);
 
+void napi_suspend_irqs(unsigned int napi_id);
+void napi_resume_irqs(unsigned int napi_id);
+
 #else /* CONFIG_NET_RX_BUSY_POLL */
 static inline unsigned long net_busy_loop_on(void)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index 74060ba866d4..4de0dfc86e21 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6507,6 +6507,39 @@ void napi_busy_loop(unsigned int napi_id,
 }
 EXPORT_SYMBOL(napi_busy_loop);
 
+void napi_suspend_irqs(unsigned int napi_id)
+{
+	struct napi_struct *napi;
+
+	rcu_read_lock();
+	napi = napi_by_id(napi_id);
+	if (napi) {
+		unsigned long timeout = READ_ONCE(napi->dev->irq_suspend_timeout);
+
+		if (timeout)
+			hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED);
+	}
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(napi_suspend_irqs);
+
+void napi_resume_irqs(unsigned int napi_id)
+{
+	struct napi_struct *napi;
+
+	rcu_read_lock();
+	napi = napi_by_id(napi_id);
+	if (napi) {
+		if (READ_ONCE(napi->dev->irq_suspend_timeout)) {
+			local_bh_disable();
+			napi_schedule(napi);
+			local_bh_enable();
+		}
+	}
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(napi_resume_irqs);
+
 #endif /* CONFIG_NET_RX_BUSY_POLL */
 
 static void napi_hash_add(struct napi_struct *napi)
-- 
2.25.1

From: Joe Damato
To: netdev@vger.kernel.org
Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me,
	peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com,
	hch@infradead.org, willy@infradead.org, willemdebruijn.kernel@gmail.com,
	skhawaja@google.com, kuba@kernel.org, Martin Karsten, Joe Damato,
	Alexander Viro, Christian Brauner, Jan Kara,
	linux-fsdevel@vger.kernel.org (open list:FILESYSTEMS (VFS and infrastructure)),
	linux-kernel@vger.kernel.org (open list)
Subject: [PATCH net-next 4/6] eventpoll: Trigger napi_busy_loop, if prefer_busy_poll is set
Date: Fri, 23 Aug 2024 17:30:55 +0000
Message-Id: <20240823173103.94978-5-jdamato@fastly.com>
In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com>
References: <20240823173103.94978-1-jdamato@fastly.com>

From: Martin Karsten

Setting prefer_busy_poll now leads to an effectively nonblocking
iteration through napi_busy_loop, even when busy_poll_usecs is 0.

Signed-off-by: Martin Karsten
Co-developed-by: Joe Damato
Signed-off-by: Joe Damato
Tested-by: Joe Damato
Tested-by: Martin Karsten
---
 fs/eventpoll.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index f53ca4f7fced..cc47f72005ed 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -420,7 +420,9 @@ static bool busy_loop_ep_timeout(unsigned long start_time,
 
 static bool ep_busy_loop_on(struct eventpoll *ep)
 {
-	return !!ep->busy_poll_usecs || net_busy_loop_on();
+	return !!READ_ONCE(ep->busy_poll_usecs) ||
+	       READ_ONCE(ep->prefer_busy_poll) ||
+	       net_busy_loop_on();
 }
 
 static bool ep_busy_loop_end(void *p, unsigned long start_time)
-- 
2.25.1

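
The effect of patch 4/6 on ep_busy_loop_on can be checked with a tiny
userspace model. This is only a sketch: struct ep_model and both function
names are hypothetical, and the result of net_busy_loop_on() is modelled as
a plain flag passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two eventpoll fields the patch reads. */
struct ep_model {
	unsigned int busy_poll_usecs;
	bool prefer_busy_poll;
};

/* Predicate as it was before the patch. */
static bool busy_loop_on_before(const struct ep_model *ep, bool net_busy_loop_on)
{
	return ep->busy_poll_usecs != 0 || net_busy_loop_on;
}

/* Predicate after the patch: prefer_busy_poll alone is now sufficient. */
static bool busy_loop_on_after(const struct ep_model *ep, bool net_busy_loop_on)
{
	return ep->busy_poll_usecs != 0 || ep->prefer_busy_poll ||
	       net_busy_loop_on;
}

/* The case the commit message describes: busy_poll_usecs is 0. */
static void worked_examples(void)
{
	struct ep_model ep = { .busy_poll_usecs = 0, .prefer_busy_poll = true };

	assert(!busy_loop_on_before(&ep, false));
	assert(busy_loop_on_after(&ep, false));
}
```

The two predicates differ only when busy_poll_usecs is 0, the global busy
poll setting is off, and prefer_busy_poll is set, which is the configuration
the commit message calls out.
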
t=1724434316; x=1725039116; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IoqQBySxLV4Sefg/kJRVr1Dr3C923zx/iiQfn4tblTE=; b=I0aUSAXd1zMGToGxTAN0qFQeOJCT9H59E3WwQ4ct9mtFfE5PLF80kg0xHoa8FoZgLw kL1nrFW2QCGlTlnqfgZ2RrOREWYl3iRgV4DtAuSIQ/EzRlXabBx3ofcQq6bWYfrtBWgu 79F44+4XNyDAOhbnRq1QceB5BuMdaRpnwgI8HHDlX/MxW6PVxYmUdr3vn+3S8aTqGOpG zFHxiDxnss6MUG1lAYKtMBI+bSBSmW44EPFBJ1o3xjkNJ+N38lqBajd6AbVHk+0vkegD XbHBCmd1XkRKZ4bAANzjEWh4KqATHFYU9G622yIroQ6th32gbr+cVRbC6SUw6nxtPwLf fHOQ== X-Forwarded-Encrypted: i=1; AJvYcCXhGX2g6XqVBxMQUGy8zVWn7qlTy8w97cu3L+yuNhyOEIa0Uprz7Fh4vaHyIiN54I9E7r2oq6j58dHHWI8=@vger.kernel.org X-Gm-Message-State: AOJu0Yx2BEkkIRzihAxpf77rfHeHVBVoOnx/1OeE/Z/iIHocGQgR1zgB Hw+8w+gnuHCzqAFv9V31eMoqEnpH71ZK7s/LN+UGkiNyHtLiBxk4y6cHvpLOt50= X-Google-Smtp-Source: AGHT+IHXsgl2NQYO+yMV9X1gIfvGuclsqseoRzkfOcO3p7twoiEGni5X0dIXcScV5If78qjNaxucOA== X-Received: by 2002:a05:6a00:3e2a:b0:710:5605:a986 with SMTP id d2e1a72fcca58-714454098b9mr3730824b3a.0.1724434316322; Fri, 23 Aug 2024 10:31:56 -0700 (PDT) Received: from localhost.localdomain ([2620:11a:c019:0:65e:3115:2f58:c5fd]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-7143430964fsm3279624b3a.150.2024.08.23.10.31.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Aug 2024 10:31:56 -0700 (PDT) From: Joe Damato To: netdev@vger.kernel.org Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me, peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com, hch@infradead.org, willy@infradead.org, willemdebruijn.kernel@gmail.com, skhawaja@google.com, kuba@kernel.org, Martin Karsten , Joe Damato , Alexander Viro , Christian Brauner , Jan Kara , linux-fsdevel@vger.kernel.org (open list:FILESYSTEMS (VFS and infrastructure)), linux-kernel@vger.kernel.org (open list) Subject: [PATCH net-next 5/6] eventpoll: Control irq suspension for prefer_busy_poll Date: Fri, 23 Aug 
2024 17:30:56 +0000 Message-Id: <20240823173103.94978-6-jdamato@fastly.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com> References: <20240823173103.94978-1-jdamato@fastly.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Martin Karsten When events are reported to userland and prefer_busy_poll is set, irqs are temporarily suspended using napi_suspend_irqs. If no events are found and ep_poll would go to sleep, irq suspension is cancelled using napi_resume_irqs. Signed-off-by: Martin Karsten Co-developed-by: Joe Damato Signed-off-by: Joe Damato Tested-by: Joe Damato Tested-by: Martin Karsten --- rfc -> v1: - move irq resume code from ep_free to a helper which either resumes IRQs or does nothing if !defined(CONFIG_NET_RX_BUSY_POLL). fs/eventpoll.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index cc47f72005ed..5dbe717c06b4 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -457,6 +457,8 @@ static bool ep_busy_loop(struct eventpoll *ep, int nonb= lock) * it back in when we have moved a socket with a valid NAPI * ID onto the ready list. 
 		 */
+		if (prefer_busy_poll)
+			napi_resume_irqs(napi_id);
 		ep->napi_id = 0;
 		return false;
 	}
@@ -540,6 +542,22 @@ static long ep_eventpoll_bp_ioctl(struct file *file, unsigned int cmd,
 	}
 }
 
+static void ep_suspend_napi_irqs(struct eventpoll *ep)
+{
+	unsigned int napi_id = READ_ONCE(ep->napi_id);
+
+	if (napi_id >= MIN_NAPI_ID && READ_ONCE(ep->prefer_busy_poll))
+		napi_suspend_irqs(napi_id);
+}
+
+static void ep_resume_napi_irqs(struct eventpoll *ep)
+{
+	unsigned int napi_id = READ_ONCE(ep->napi_id);
+
+	if (napi_id >= MIN_NAPI_ID && READ_ONCE(ep->prefer_busy_poll))
+		napi_resume_irqs(napi_id);
+}
+
 #else
 
 static inline bool ep_busy_loop(struct eventpoll *ep, int nonblock)
@@ -557,6 +575,13 @@ static long ep_eventpoll_bp_ioctl(struct file *file, unsigned int cmd,
 	return -EOPNOTSUPP;
 }
 
+static void ep_suspend_napi_irqs(struct eventpoll *ep)
+{
+}
+
+static void ep_resume_napi_irqs(struct eventpoll *ep)
+{
+}
 #endif /* CONFIG_NET_RX_BUSY_POLL */
 
 /*
@@ -788,6 +813,7 @@ static bool ep_refcount_dec_and_test(struct eventpoll *ep)
 
 static void ep_free(struct eventpoll *ep)
 {
+	ep_resume_napi_irqs(ep);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
 	wakeup_source_unregister(ep->ws);
@@ -2005,8 +2031,10 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 			 * trying again in search of more luck.
 			 */
 			res = ep_send_events(ep, events, maxevents);
-			if (res)
+			if (res) {
+				ep_suspend_napi_irqs(ep);
 				return res;
+			}
 		}
 
 		if (timed_out)
-- 
2.25.1

From nobody Sun Feb 8 05:28:16 2026
From: Joe Damato
To: netdev@vger.kernel.org
Cc: amritha.nambiar@intel.com, sridhar.samudrala@intel.com, sdf@fomichev.me,
	peter@typeblog.net, m2shafiei@uwaterloo.ca, bjorn@rivosinc.com,
	hch@infradead.org, willy@infradead.org,
	willemdebruijn.kernel@gmail.com, skhawaja@google.com, kuba@kernel.org,
	Joe Damato, Martin Karsten, "David S. Miller", Eric Dumazet, Paolo Abeni,
	Jonathan Corbet, linux-doc@vger.kernel.org (open list:DOCUMENTATION),
	linux-kernel@vger.kernel.org (open list),
	bpf@vger.kernel.org (open list:BPF [MISC]:Keyword:(?:\b|_)bpf(?:\b|_))
Subject: [PATCH net-next 6/6] docs: networking: Describe irq suspension
Date: Fri, 23 Aug 2024 17:30:57 +0000
Message-Id: <20240823173103.94978-7-jdamato@fastly.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240823173103.94978-1-jdamato@fastly.com>
References: <20240823173103.94978-1-jdamato@fastly.com>

Describe irq suspension, the epoll ioctls, and the tradeoffs of using
different gro_flush_timeout values.

Signed-off-by: Joe Damato
Co-developed-by: Martin Karsten
Signed-off-by: Martin Karsten
Tested-by: Joe Damato
Tested-by: Martin Karsten
---
 Documentation/networking/napi.rst | 112 +++++++++++++++++++++++++++++-
 1 file changed, 110 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/napi.rst b/Documentation/networking/napi.rst
index 7bf7b95c4f7a..04e838835b50 100644
--- a/Documentation/networking/napi.rst
+++ b/Documentation/networking/napi.rst
@@ -192,6 +192,9 @@ The ``gro_flush_timeout`` sysfs configuration of the netdevice
 is reused to control the delay of the timer, while
 ``napi_defer_hard_irqs`` controls the number of consecutive empty polls
 before NAPI gives up and goes back to using hardware IRQs.
+``irq_suspend_timeout`` is used to determine how long an application can
+completely suspend IRQs. It is used in combination with SO_PREFER_BUSY_POLL,
+which can be set on a per-epoll context basis with the ``EPIOCSPARAMS`` ioctl.
 
 .. _poll:
 
@@ -208,6 +211,46 @@ selected sockets or using the global ``net.core.busy_poll`` and
 ``net.core.busy_read`` sysctls. An io_uring API for NAPI busy polling
 also exists.
 
+epoll-based busy polling
+------------------------
+
+It is possible to trigger packet processing directly from calls to
+``epoll_wait``. In order to use this feature, a user application must ensure
+all file descriptors which are added to an epoll context have the same NAPI ID.
+
+If the application uses a dedicated acceptor thread, the application can obtain
+the NAPI ID of the incoming connection using SO_INCOMING_NAPI_ID and then
+distribute that file descriptor to a worker thread. The worker thread would add
+the file descriptor to its epoll context. This would ensure each worker thread
+has an epoll context with FDs that have the same NAPI ID.
+
+Alternatively, if the application uses SO_REUSEPORT, a bpf or ebpf program can be
+inserted to distribute incoming connections to threads such that each thread is
+only given incoming connections with the same NAPI ID. Care must be taken to
+carefully handle cases where a system may have multiple NICs.
+
+In order to enable busy polling, there are two choices:
+
+1. ``/proc/sys/net/core/busy_poll`` can be set with a time in useconds to busy
+   loop waiting for events. This is a system-wide setting and will cause all
+   epoll-based applications to busy poll when they call epoll_wait. This may
+   not be desirable as many applications may not have the need to busy poll.
+
+2. Applications using recent kernels can issue an ioctl on the epoll context
+   file descriptor to set (``EPIOCSPARAMS``) or get (``EPIOCGPARAMS``) ``struct
+   epoll_params``, which user programs can define as follows:
+
+.. code-block:: c
+
+  struct epoll_params {
+      uint32_t busy_poll_usecs;
+      uint16_t busy_poll_budget;
+      uint8_t prefer_busy_poll;
+
+      /* pad the struct to a multiple of 64bits */
+      uint8_t __pad;
+  };
+
 IRQ mitigation
 ---------------
 
@@ -223,12 +266,77 @@ Such applications can pledge to the kernel that they will perform a busy
 polling operation periodically, and the driver should keep the device IRQs
 permanently masked. This mode is enabled by using the ``SO_PREFER_BUSY_POLL``
 socket option. To avoid system misbehavior the pledge is revoked
-if ``gro_flush_timeout`` passes without any busy poll call.
+if ``gro_flush_timeout`` passes without any busy poll call. For epoll-based
+busy polling applications, the ``prefer_busy_poll`` field of ``struct
+epoll_params`` can be set to 1 and the ``EPIOCSPARAMS`` ioctl can be issued to
+enable this mode. See the above section for more details.
 
 The NAPI budget for busy polling is lower than the default (which makes
 sense given the low latency intention of normal busy polling). This is
 not the case with IRQ mitigation, however, so the budget can be adjusted
-with the ``SO_BUSY_POLL_BUDGET`` socket option.
+with the ``SO_BUSY_POLL_BUDGET`` socket option. For epoll-based busy polling
+applications, the ``busy_poll_budget`` field can be adjusted to the desired value
+in ``struct epoll_params`` and set on a specific epoll context using the ``EPIOCSPARAMS``
+ioctl. See the above section for more details.
+
+It is important to note that choosing a large value for ``gro_flush_timeout``
+will defer IRQs to allow for better batch processing, but will induce latency
+when the system is not fully loaded. Choosing a small value for
+``gro_flush_timeout`` can cause interference of the user application which is
+attempting to busy poll by device IRQs and softirq processing. This value
+should be chosen carefully with these tradeoffs in mind. epoll-based busy
+polling applications may be able to mitigate how much user processing happens
+by choosing an appropriate value for ``maxevents``.
+
+Users may want to consider an alternate approach, IRQ suspension, to help deal
+with these tradeoffs.
+
+IRQ suspension
+--------------
+
+IRQ suspension is a mechanism wherein device IRQs are masked while epoll
+triggers NAPI packet processing.
+
+While application calls to epoll_wait successfully retrieve events, the kernel will
+defer the IRQ suspension timer. If the kernel does not retrieve any events
+while busy polling (for example, because network traffic levels subsided), IRQ
+suspension is disabled and the IRQ mitigation strategies described above are
+engaged.
+
+This allows users to balance CPU consumption with network processing
+efficiency.
+
+To use this mechanism:
+
+  1. The sysfs parameter ``irq_suspend_timeout`` should be set to the maximum
+     time (in nanoseconds) the application can have its IRQs suspended. This
+     timeout serves as a safety mechanism to restart IRQ driver interrupt
+     processing if the application has stalled. This value should be chosen so
+     that it covers the amount of time the user application needs to process
+     data from its call to epoll_wait, noting that applications can control how
+     much data they retrieve by setting ``maxevents`` when calling epoll_wait.
+
+  2. The sysfs parameters ``gro_flush_timeout`` and ``napi_defer_hard_irqs`` can
+     be set to low values. They will be used to defer IRQs after busy poll has
+     found no data.
+
+  3. The ``prefer_busy_poll`` flag must be set to true. This can be done using
+     the ``EPIOCSPARAMS`` ioctl as described above.
+
+  4. The application uses epoll as described above to trigger NAPI packet
+     processing.
+
+As mentioned above, as long as subsequent calls to epoll_wait return events to
+userland, the ``irq_suspend_timeout`` is deferred and IRQs are disabled. This
+allows the application to process data without interference.
+
+Once a call to epoll_wait results in no events being found, IRQ suspension is
+automatically disabled and the ``gro_flush_timeout`` and
+``napi_defer_hard_irqs`` mitigation mechanisms take over.
+
+It is expected that ``irq_suspend_timeout`` will be set to a value much larger
+than ``gro_flush_timeout`` as ``irq_suspend_timeout`` should suspend IRQs for
+the duration of one userland processing cycle.
 
 .. _threaded:
 
-- 
2.25.1
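
For readers who want to try step 3 of the documentation above from user
space, here is a minimal sketch of setting prefer_busy_poll on an epoll
context. It is not part of the series. The names bp_epoll_params, bp_enable
and the BP_* macros are hypothetical and local to this example; the struct
mirrors the epoll_params layout quoted in the documentation, and the ioctl
number is assumed to match EPIOCSPARAMS as defined in <linux/eventpoll.h>.
A local name is used so the example does not clash with headers that
already ship the real definition.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Local mirror of struct epoll_params as quoted in the documentation.
 * Hypothetical name, chosen to avoid clashing with libc or kernel headers.
 */
struct bp_epoll_params {
	uint32_t busy_poll_usecs;
	uint16_t busy_poll_budget;
	uint8_t prefer_busy_poll;
	uint8_t pad;	/* padding to 64 bits, left at zero here */
};

/* Assumption: same type and number as EPIOCSPARAMS in <linux/eventpoll.h>. */
#define BP_EPOLL_IOC_TYPE 0x8A
#define BP_EPIOCSPARAMS _IOW(BP_EPOLL_IOC_TYPE, 0x01, struct bp_epoll_params)

/*
 * Ask the kernel to busy poll on this epoll context with prefer_busy_poll
 * set. Returns 0 on success or a negative errno value on failure, for
 * example when the running kernel does not support the ioctl.
 */
static int bp_enable(int epfd, uint32_t usecs, uint16_t budget)
{
	struct bp_epoll_params p;

	memset(&p, 0, sizeof(p));
	p.busy_poll_usecs = usecs;
	p.busy_poll_budget = budget;
	p.prefer_busy_poll = 1;

	if (ioctl(epfd, BP_EPIOCSPARAMS, &p) == -1)
		return -errno;
	return 0;
}
```

An application would call something like bp_enable(epfd, 64, 8) once after
epoll_create1() and before entering its epoll_wait loop. The values 64 and 8
are purely illustrative, and a failure return should be treated as "feature
not available" rather than as fatal.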