From: Bobby Eshleman
Date: Tue, 02 Dec 2025 11:34:17 -0800
Subject: [PATCH net-next v2] net: devmem: convert binding refcount to percpu_ref
Message-Id: <20251202-upstream-percpu-ref-v2-1-4accb717da40@meta.com>
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Mina Almasry, Stanislav Fomichev, asml.silence@gmail.com, Bobby Eshleman

From: Bobby Eshleman

Convert net_devmem_dmabuf_binding refcount from refcount_t to percpu_ref
to optimize common-case reference counting on the hot path.

The typical devmem workflow involves binding a dmabuf to a queue
(acquiring the initial reference on binding->ref), followed by
high-volume traffic where every skb fragment acquires a reference.
Eventually traffic stops and the unbind operation releases the initial
reference. Additionally, the high traffic hot path is often multi-core.
This access pattern is ideal for percpu_ref as the first and last
reference during bind/unbind normally book-ends activity in the hot
path.

net_devmem_dmabuf_binding_release becomes the percpu_ref release
callback, invoked when the last reference is dropped. It schedules
__net_devmem_dmabuf_binding_free via schedule_work(), as
net_devmem_dmabuf_binding_put() did before.

kperf test:
- 4MB message sizes
- 60s of workload each run
- 5 runs
- 4 flows

Throughput:
  Before: 45.31 GB/s (+/- 3.17 GB/s)
  After:  48.67 GB/s (+/- 0.01 GB/s)

Picking throughput-matched kperf runs (both before and after matched at
~48 GB/s) for apples-to-apples comparison:

Summary (averaged across 4 workers):

  TX worker CPU idle %:
    Before: 34.44%
    After:  87.13%

  RX worker CPU idle %:
    Before: 5.38%
    After:  9.73%

kperf before:

client: == Source
client: Tx 98.100 Gbps (735764807680 bytes in 60001149 usec)
client: Tx102.798 Gbps (770996961280 bytes in 60001149 usec)
client: Tx101.534 Gbps (761517834240 bytes in 60001149 usec)
client: Tx 82.794 Gbps (620966707200 bytes in 60001149 usec)
client: net CPU 56: usr: 0.01% sys: 0.12% idle:17.06% iow: 0.00% irq: 9.89% sirq:72.91%
client: app CPU 60: usr: 0.08% sys:63.30% idle:36.24% iow: 0.00% irq: 0.30% sirq: 0.06%
client: net CPU 57: usr: 0.03% sys: 0.08% idle:75.68% iow: 0.00% irq: 2.96% sirq:21.23%
client: app CPU 61: usr: 0.06% sys:67.67% idle:31.94% iow: 0.00% irq: 0.28% sirq: 0.03%
client: net CPU 58: usr: 0.01% sys: 0.06% idle:76.87% iow: 0.00% irq: 2.84% sirq:20.19%
client: app CPU 62: usr: 0.06% sys:69.78% idle:29.79% iow: 0.00% irq: 0.30% sirq: 0.05%
client: net CPU 59: usr: 0.06% sys: 0.16% idle:74.97% iow: 0.00% irq: 3.76% sirq:21.03%
client: app CPU 63: usr: 0.06% sys:59.82% idle:39.80% iow: 0.00% irq: 0.25% sirq: 0.05%
client: == Target
client: Rx 98.092 Gbps (735764807680 bytes in 60006084 usec)
client: Rx102.785 Gbps (770962161664 bytes in 60006084 usec)
client: Rx101.523 Gbps (761499566080 bytes in 60006084 usec)
client: Rx 82.783 Gbps (620933136384 bytes in 60006084 usec)
client: net CPU 2: usr: 0.00% sys: 0.01% idle:24.51% iow: 0.00% irq: 1.67% sirq:73.79%
client: app CPU 6: usr: 1.51% sys:96.43% idle: 1.13% iow: 0.00% irq: 0.36% sirq: 0.55%
client: net CPU 1: usr: 0.00% sys: 0.01% idle:25.18% iow: 0.00% irq: 1.99% sirq:72.80%
client: app CPU 5: usr: 2.21% sys:94.54% idle: 2.54% iow: 0.00% irq: 0.38% sirq: 0.30%
client: net CPU 3: usr: 0.00% sys: 0.01% idle:26.34% iow: 0.00% irq: 2.12% sirq:71.51%
client: app CPU 7: usr: 2.22% sys:94.28% idle: 2.52% iow: 0.00% irq: 0.59% sirq: 0.37%
client: net CPU 0: usr: 0.00% sys: 0.03% idle: 0.00% iow: 0.00% irq:10.44% sirq:89.51%
client: app CPU 4: usr: 2.39% sys:81.46% idle:15.33% iow: 0.00% irq: 0.50% sirq: 0.30%

kperf after:

client: == Source
client: Tx 99.257 Gbps (744447016960 bytes in 60001303 usec)
client: Tx101.013 Gbps (757617131520 bytes in 60001303 usec)
client: Tx 88.179 Gbps (661357854720 bytes in 60001303 usec)
client: Tx101.002 Gbps (757533245440 bytes in 60001303 usec)
client: net CPU 56: usr: 0.00% sys: 0.01% idle: 6.22% iow: 0.00% irq: 8.68% sirq:85.06%
client: app CPU 60: usr: 0.08% sys:12.56% idle:87.21% iow: 0.00% irq: 0.08% sirq: 0.05%
client: net CPU 57: usr: 0.00% sys: 0.05% idle:69.53% iow: 0.00% irq: 2.02% sirq:28.38%
client: app CPU 61: usr: 0.11% sys:13.40% idle:86.36% iow: 0.00% irq: 0.08% sirq: 0.03%
client: net CPU 58: usr: 0.00% sys: 0.03% idle:70.04% iow: 0.00% irq: 3.38% sirq:26.53%
client: app CPU 62: usr: 0.10% sys:11.46% idle:88.31% iow: 0.00% irq: 0.08% sirq: 0.03%
client: net CPU 59: usr: 0.01% sys: 0.06% idle:71.18% iow: 0.00% irq: 1.97% sirq:26.75%
client: app CPU 63: usr: 0.10% sys:13.10% idle:86.64% iow: 0.00% irq: 0.10% sirq: 0.05%
client: == Target
client: Rx 99.250 Gbps (744415182848 bytes in 60003297 usec)
client: Rx101.006 Gbps (757589737472 bytes in 60003297 usec)
client: Rx 88.171 Gbps (661319475200 bytes in 60003297 usec)
client: Rx100.996 Gbps (757514792960 bytes in 60003297 usec)
client: net CPU 2: usr: 0.00% sys: 0.01% idle:28.02% iow: 0.00% irq: 1.95% sirq:70.00%
client: app CPU 6: usr: 2.03% sys:87.20% idle:10.04% iow: 0.00% irq: 0.37% sirq: 0.33%
client: net CPU 3: usr: 0.00% sys: 0.00% idle:27.63% iow: 0.00% irq: 1.90% sirq:70.45%
client: app CPU 7: usr: 1.78% sys:89.70% idle: 7.79% iow: 0.00% irq: 0.37% sirq: 0.34%
client: net CPU 0: usr: 0.00% sys: 0.01% idle: 0.00% iow: 0.00% irq: 9.96% sirq:90.01%
client: app CPU 4: usr: 2.33% sys:83.51% idle:13.24% iow: 0.00% irq: 0.64% sirq: 0.26%
client: net CPU 1: usr: 0.00% sys: 0.01% idle:27.60% iow: 0.00% irq: 1.94% sirq:70.43%
client: app CPU 5: usr: 1.88% sys:89.61% idle: 7.86% iow: 0.00% irq: 0.35% sirq: 0.27%

Signed-off-by: Bobby Eshleman
---
Changes in v2:
- remove comments (Stan and Paolo)
- fix grammar error in commit msg
- avoid unnecessary name change of work_struct wq
- Link to v1: https://lore.kernel.org/r/20251126-upstream-percpu-ref-v1-1-cea20a92b1dd@meta.com
---
 net/core/devmem.c | 23 ++++++++++++++++++++---
 net/core/devmem.h | 10 +++-------
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index ec4217d6c0b4..17ba386c7f67 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -54,6 +54,15 @@ static dma_addr_t net_devmem_get_dma_addr(const struct net_iov *niov)
 	       ((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
 }
 
+static void net_devmem_dmabuf_binding_release(struct percpu_ref *ref)
+{
+	struct net_devmem_dmabuf_binding *binding =
+		container_of(ref, struct net_devmem_dmabuf_binding, ref);
+
+	INIT_WORK(&binding->unbind_w, __net_devmem_dmabuf_binding_free);
+	schedule_work(&binding->unbind_w);
+}
+
 void __net_devmem_dmabuf_binding_free(struct work_struct *wq)
 {
 	struct net_devmem_dmabuf_binding *binding = container_of(wq, typeof(*binding), unbind_w);
@@ -75,6 +84,7 @@ void __net_devmem_dmabuf_binding_free(struct work_struct *wq)
 	dma_buf_detach(binding->dmabuf, binding->attachment);
 	dma_buf_put(binding->dmabuf);
 	xa_destroy(&binding->bound_rxqs);
+	percpu_ref_exit(&binding->ref);
 	kvfree(binding->tx_vec);
 	kfree(binding);
 }
@@ -143,7 +153,7 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
 		__net_mp_close_rxq(binding->dev, rxq_idx, &mp_params);
 	}
 
-	net_devmem_dmabuf_binding_put(binding);
+	percpu_ref_kill(&binding->ref);
 }
 
 int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
@@ -209,7 +219,12 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 	binding->dev = dev;
 	xa_init_flags(&binding->bound_rxqs, XA_FLAGS_ALLOC);
 
-	refcount_set(&binding->ref, 1);
+	err = percpu_ref_init(&binding->ref,
+			      net_devmem_dmabuf_binding_release,
+			      0, GFP_KERNEL);
+
+	if (err < 0)
+		goto err_free_binding;
 
 	mutex_init(&binding->lock);
 
@@ -220,7 +235,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 	if (IS_ERR(binding->attachment)) {
 		err = PTR_ERR(binding->attachment);
 		NL_SET_ERR_MSG(extack, "Failed to bind dmabuf to device");
-		goto err_free_binding;
+		goto err_exit_ref;
 	}
 
 	binding->sgt = dma_buf_map_attachment_unlocked(binding->attachment,
@@ -322,6 +337,8 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 					  direction);
 err_detach:
 	dma_buf_detach(dmabuf, binding->attachment);
+err_exit_ref:
+	percpu_ref_exit(&binding->ref);
 err_free_binding:
 	kfree(binding);
 err_put_dmabuf:
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 0b43a648cd2e..2534c8144212 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -41,7 +41,7 @@ struct net_devmem_dmabuf_binding {
 	 * retransmits) hold a reference to the binding until the skb holding
 	 * them is freed.
 	 */
-	refcount_t ref;
+	struct percpu_ref ref;
 
 	/* The list of bindings currently active. Used for netlink to notify us
 	 * of the user dropping the bind.
@@ -125,17 +125,13 @@ static inline unsigned long net_iov_virtual_addr(const struct net_iov *niov)
 static inline bool
 net_devmem_dmabuf_binding_get(struct net_devmem_dmabuf_binding *binding)
 {
-	return refcount_inc_not_zero(&binding->ref);
+	return percpu_ref_tryget(&binding->ref);
 }
 
 static inline void
 net_devmem_dmabuf_binding_put(struct net_devmem_dmabuf_binding *binding)
 {
-	if (!refcount_dec_and_test(&binding->ref))
-		return;
-
-	INIT_WORK(&binding->unbind_w, __net_devmem_dmabuf_binding_free);
-	schedule_work(&binding->unbind_w);
+	percpu_ref_put(&binding->ref);
 }
 
 void net_devmem_get_net_iov(struct net_iov *niov);

---
base-commit: 3c4159b3019cc3444495f54c18083cda579cba84
change-id: 20251126-upstream-percpu-ref-401327f62b4b

Best regards,
-- 
Bobby Eshleman