From nobody Fri Dec 19 10:55:54 2025 From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:20 -0800 Subject: [PATCH net-next v6 1/6] net: devmem: rename tx_vec to vec in dmabuf binding Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-1-ea98cf4d40b3@meta.com> References: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Rename the 'tx_vec' field in struct net_devmem_dmabuf_binding to 'vec'. This field holds pointers to net_iov structures. The rename prepares for reusing 'vec' for both TX and RX directions. No functional change intended. 
Reviewed-by: Mina Almasry Signed-off-by: Bobby Eshleman --- net/core/devmem.c | 22 +++++++++++----------- net/core/devmem.h | 2 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/net/core/devmem.c b/net/core/devmem.c index 1d04754bc756..4dee2666dd07 100644 --- a/net/core/devmem.c +++ b/net/core/devmem.c @@ -75,7 +75,7 @@ void __net_devmem_dmabuf_binding_free(struct work_struct = *wq) dma_buf_detach(binding->dmabuf, binding->attachment); dma_buf_put(binding->dmabuf); xa_destroy(&binding->bound_rxqs); - kvfree(binding->tx_vec); + kvfree(binding->vec); kfree(binding); } =20 @@ -232,10 +232,10 @@ net_devmem_bind_dmabuf(struct net_device *dev, } =20 if (direction =3D=3D DMA_TO_DEVICE) { - binding->tx_vec =3D kvmalloc_array(dmabuf->size / PAGE_SIZE, - sizeof(struct net_iov *), - GFP_KERNEL); - if (!binding->tx_vec) { + binding->vec =3D kvmalloc_array(dmabuf->size / PAGE_SIZE, + sizeof(struct net_iov *), + GFP_KERNEL); + if (!binding->vec) { err =3D -ENOMEM; goto err_unmap; } @@ -249,7 +249,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, dev_to_node(&dev->dev)); if (!binding->chunk_pool) { err =3D -ENOMEM; - goto err_tx_vec; + goto err_vec; } =20 virtual =3D 0; @@ -295,7 +295,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov), net_devmem_get_dma_addr(niov)); if (direction =3D=3D DMA_TO_DEVICE) - binding->tx_vec[owner->area.base_virtual / PAGE_SIZE + i] =3D niov; + binding->vec[owner->area.base_virtual / PAGE_SIZE + i] =3D niov; } =20 virtual +=3D len; @@ -315,8 +315,8 @@ net_devmem_bind_dmabuf(struct net_device *dev, gen_pool_for_each_chunk(binding->chunk_pool, net_devmem_dmabuf_free_chunk_owner, NULL); gen_pool_destroy(binding->chunk_pool); -err_tx_vec: - kvfree(binding->tx_vec); +err_vec: + kvfree(binding->vec); err_unmap: dma_buf_unmap_attachment_unlocked(binding->attachment, binding->sgt, direction); @@ -363,7 +363,7 @@ struct net_devmem_dmabuf_binding *net_devmem_get_bindin= g(struct sock *sk, 
int err =3D 0; =20 binding =3D net_devmem_lookup_dmabuf(dmabuf_id); - if (!binding || !binding->tx_vec) { + if (!binding || !binding->vec) { err =3D -EINVAL; goto out_err; } @@ -414,7 +414,7 @@ net_devmem_get_niov_at(struct net_devmem_dmabuf_binding= *binding, *off =3D virt_addr % PAGE_SIZE; *size =3D PAGE_SIZE - *off; =20 - return binding->tx_vec[virt_addr / PAGE_SIZE]; + return binding->vec[virt_addr / PAGE_SIZE]; } =20 /*** "Dmabuf devmem memory provider" ***/ diff --git a/net/core/devmem.h b/net/core/devmem.h index 101150d761af..2ada54fb63d7 100644 --- a/net/core/devmem.h +++ b/net/core/devmem.h @@ -63,7 +63,7 @@ struct net_devmem_dmabuf_binding { * address. This array is convenient to map the virtual addresses to * net_iovs in the TX path. */ - struct net_iov **tx_vec; + struct net_iov **vec; =20 struct work_struct unbind_w; }; --=20 2.47.3
From nobody Fri Dec 19 10:55:54 2025 From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:21 -0800 Subject: [PATCH net-next v6 2/6] net: devmem: refactor sock_devmem_dontneed for autorelease split Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-2-ea98cf4d40b3@meta.com> References: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Refactor sock_devmem_dontneed() in preparation for supporting both autorelease and manual token release modes. Split the function into two parts: - sock_devmem_dontneed(): handles input validation, token allocation, and copying from userspace - sock_devmem_dontneed_autorelease(): performs the actual token release via xarray lookup and page pool put This separation allows a future commit to add a parallel sock_devmem_dontneed_manual_release() function that uses a different token tracking mechanism (per-niov reference counting) without duplicating the input validation logic. The refactoring is purely mechanical with no functional change. Only intended to minimize the noise in subsequent patches. 
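The split described above can be pictured with a small userspace sketch. This is not the kernel code: struct token, MAX_TOKENS, dontneed() and release_autorelease() are hypothetical stand-ins, and malloc/memcpy take the place of kvmalloc_array() and copy_from_sockptr(). It only illustrates the shape of the refactor, where one entry point owns validation and the buffer lifetime, and the release step receives already-validated tokens.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct dmabuf_token; not the kernel definition. */
struct token {
	unsigned int start;
	unsigned int count;
};

#define MAX_TOKENS 128 /* assumed limit, mirroring MAX_DONTNEED_TOKENS */

/* Release step: only ever sees validated tokens. Here it just counts
 * how many fragments it would have released.
 */
static int release_autorelease(const struct token *tokens, unsigned int n)
{
	unsigned int i;
	int released = 0;

	for (i = 0; i < n; i++)
		released += (int)tokens[i].count;
	return released;
}

/* Entry point: validates the length, copies the input, delegates the
 * release, and frees the copy on every path.
 */
static int dontneed(const void *optval, size_t optlen)
{
	struct token *tokens;
	unsigned int n;
	int ret;

	if (optlen % sizeof(*tokens) || optlen > sizeof(*tokens) * MAX_TOKENS)
		return -EINVAL;

	n = (unsigned int)(optlen / sizeof(*tokens));
	tokens = calloc(n ? n : 1, sizeof(*tokens));
	if (!tokens)
		return -ENOMEM;
	if (optlen)
		memcpy(tokens, optval, optlen);

	ret = release_autorelease(tokens, n);
	free(tokens);
	return ret;
}
```

With this shape, a second release function taking the same (tokens, n) arguments can be added later and selected inside dontneed() without touching the validation code.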
Reviewed-by: Mina Almasry Signed-off-by: Bobby Eshleman --- net/core/sock.c | 52 ++++++++++++++++++++++++++++++++-------------------- 1 file changed, 32 insertions(+), 20 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 7a9bbc2afcf0..5562f517d889 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1082,30 +1082,13 @@ static int sock_reserve_memory(struct sock *sk, int= bytes) #define MAX_DONTNEED_FRAGS 1024 =20 static noinline_for_stack int -sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optle= n) +sock_devmem_dontneed_autorelease(struct sock *sk, struct dmabuf_token *tok= ens, + unsigned int num_tokens) { - unsigned int num_tokens, i, j, k, netmem_num =3D 0; - struct dmabuf_token *tokens; + unsigned int i, j, k, netmem_num =3D 0; int ret =3D 0, num_frags =3D 0; netmem_ref netmems[16]; =20 - if (!sk_is_tcp(sk)) - return -EBADF; - - if (optlen % sizeof(*tokens) || - optlen > sizeof(*tokens) * MAX_DONTNEED_TOKENS) - return -EINVAL; - - num_tokens =3D optlen / sizeof(*tokens); - tokens =3D kvmalloc_array(num_tokens, sizeof(*tokens), GFP_KERNEL); - if (!tokens) - return -ENOMEM; - - if (copy_from_sockptr(tokens, optval, optlen)) { - kvfree(tokens); - return -EFAULT; - } - xa_lock_bh(&sk->sk_user_frags); for (i =3D 0; i < num_tokens; i++) { for (j =3D 0; j < tokens[i].token_count; j++) { @@ -1135,6 +1118,35 @@ sock_devmem_dontneed(struct sock *sk, sockptr_t optv= al, unsigned int optlen) for (k =3D 0; k < netmem_num; k++) WARN_ON_ONCE(!napi_pp_put_page(netmems[k])); =20 + return ret; +} + +static noinline_for_stack int +sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optle= n) +{ + struct dmabuf_token *tokens; + unsigned int num_tokens; + int ret; + + if (!sk_is_tcp(sk)) + return -EBADF; + + if (optlen % sizeof(*tokens) || + optlen > sizeof(*tokens) * MAX_DONTNEED_TOKENS) + return -EINVAL; + + num_tokens =3D optlen / sizeof(*tokens); + tokens =3D kvmalloc_array(num_tokens, sizeof(*tokens), GFP_KERNEL); + if 
(!tokens) + return -ENOMEM; + + if (copy_from_sockptr(tokens, optval, optlen)) { + kvfree(tokens); + return -EFAULT; + } + + ret =3D sock_devmem_dontneed_autorelease(sk, tokens, num_tokens); + kvfree(tokens); return ret; } --=20 2.47.3
From nobody Fri Dec 19 10:55:54 2025 From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:22 -0800 Subject: [PATCH net-next v6 3/6] net: devmem: prepare for autorelease rx token management Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-3-ea98cf4d40b3@meta.com> References: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. 
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Add alternative token management implementation (autorelease vs non-autorelease) that replaces xarray-based token lookups with direct array access using page offsets as dmabuf tokens. When enabled, this eliminates xarray overhead and reduces CPU utilization in devmem RX threads by approximately 13%. This patch changes the meaning of tokens when this option is used. Tokens previously referred to unique fragments of pages. With this option tokens instead represent references to pages, not fragments. Because of this, multiple tokens may refer to the same page and so have identical value (e.g., two small fragments may coexist on the same page). The token and offset pair that the user receives uniquely identifies fragments if needed. This assumes that the user is not attempting to sort / uniq the token list using tokens alone. This introduces a restriction: devmem RX sockets cannot switch dmabuf bindings when using the autorelease off option. This is necessary because 32-bit tokens lack sufficient bits to encode both large dmabuf page counts and binding/queue IDs. For example, a system with 8 NICs and 32 queues needs 8 bits for binding IDs, leaving only 24 bits for pages (64GB max). This restriction aligns with common usage, as steering flows to different queues/devices is often undesirable for TCP. This patch adds an atomic uref counter to net_iov for tracking user references via binding->vec. The pp_ref_count is only updated on uref transitions from zero to one or from one to zero, to minimize atomic overhead. 
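The uref transition rule can be sketched in userspace C11 as follows. This is an illustration under assumptions, not the patch's implementation: struct page_sketch and the uref_get/uref_put/uref_drain helpers are hypothetical names, and pp_ref is a plain int standing in for the page pool reference count (which is itself atomic in the kernel). The point is that the pool count moves only when the user count crosses between zero and one.

```c
#include <stdatomic.h>

/* Hypothetical model of one page: 'uref' counts tokens held by the
 * user, 'pp_ref' stands in for the page pool reference count.
 */
struct page_sketch {
	atomic_int uref;
	int pp_ref;
};

/* A token for this page is handed to userspace. */
static void uref_get(struct page_sketch *p)
{
	/* fetch_add returns the old value; 0 means a 0 -> 1 transition. */
	if (atomic_fetch_add(&p->uref, 1) == 0)
		p->pp_ref++;
}

/* Userspace returns one token for this page. */
static void uref_put(struct page_sketch *p)
{
	/* fetch_sub returns the old value; 1 means a 1 -> 0 transition. */
	if (atomic_fetch_sub(&p->uref, 1) == 1)
		p->pp_ref--;
}

/* Drop whatever the user still holds: any number of urefs maps to a
 * single pool reference, so at most one put is needed.
 */
static void uref_drain(struct page_sketch *p)
{
	if (atomic_exchange(&p->uref, 0) > 0)
		p->pp_ref--;
}
```

Two tokens on the same page therefore cost one pool reference in this model, and the second uref_get and first uref_put touch only the uref counter.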
If a user fails to refill and closes before returning all tokens, the binding will finish the uref release when unbound. A flag "autorelease" is added per-socket. This will be used for enabling the old behavior of the kernel releasing references for the sockets upon close(2) (autorelease), instead of requiring that socket users do this themselves. The autorelease flag is always true in this patch, meaning that the old (non-optimized) behavior is kept unconditionally. A future patch adds a user-facing knob to toggle this feature and will change the default to false for the improved performance. An outstanding_urefs counter is added per-socket so that changes to the autorelease mode can be rejected for active sockets. The dmabuf unbind path always checks for any leaked urefs. Signed-off-by: Bobby Eshleman --- Changes in v6: - remove sk_devmem_info.autorelease, using binding->autorelease instead - move binding->autorelease check to outside of net_devmem_dmabuf_binding_put_urefs() (Mina) - remove overly defensive net_is_devmem_iov() (Mina) - add comment about multiple urefs mapping to a single netmem ref (Mina) - remove overly defensive netmem NULL and netmem_is_net_iov checks (Mina) - use niov without casting back and forth with netmem (Mina) - move the autorelease flag from per-binding to per-socket (Mina) - remove the batching logic in sock_devmem_dontneed_manual_release() (Mina) - move autorelease check inside tcp_xa_pool_commit() (Mina) - remove single-binding restriction for autorelease mode (Mina) - unbind always checks for leaked urefs Changes in v5: - remove unused variables - introduce autorelease flag, preparing for a future patch to toggle the new behavior Changes in v3: - make urefs per-binding instead of per-socket, reducing memory footprint - fall back to cleaning up references in dmabuf unbind if socket leaked tokens - drop ethtool patch Changes in v2: - always use GFP_ZERO for binding->vec (Mina) - remove WARN for changed binding (Mina) - remove extraneous 
binding ref get (Mina) - remove WARNs on invalid user input (Mina) - pre-assign niovs in binding->vec for RX case (Mina) - use atomic_set(, 0) to initialize sk_user_frags.urefs - fix length of alloc for urefs --- include/net/netmem.h | 1 + include/net/sock.h | 13 +++++++-- net/core/devmem.c | 42 ++++++++++++++++++++++------- net/core/devmem.h | 2 +- net/core/sock.c | 55 +++++++++++++++++++++++++++++++++----- net/ipv4/tcp.c | 69 ++++++++++++++++++++++++++++++++++++++------= ---- net/ipv4/tcp_ipv4.c | 11 +++++--- net/ipv4/tcp_minisocks.c | 5 +++- 8 files changed, 161 insertions(+), 37 deletions(-) diff --git a/include/net/netmem.h b/include/net/netmem.h index 9e10f4ac50c3..80d2263ba4ed 100644 --- a/include/net/netmem.h +++ b/include/net/netmem.h @@ -112,6 +112,7 @@ struct net_iov { }; struct net_iov_area *owner; enum net_iov_type type; + atomic_t uref; }; =20 struct net_iov_area { diff --git a/include/net/sock.h b/include/net/sock.h index c7e58b8e8a90..548fabacff7c 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -350,7 +350,11 @@ struct sk_filter; * @sk_scm_rights: flagged by SO_PASSRIGHTS to recv SCM_RIGHTS * @sk_scm_unused: unused flags for scm_recv() * @ns_tracker: tracker for netns reference - * @sk_user_frags: xarray of pages the user is holding a reference on. + * @sk_devmem_info: the devmem binding information for the socket + * @frags: xarray of tokens for autorelease mode + * @binding: pointer to the dmabuf binding + * @outstanding_urefs: count of outstanding user references + * @autorelease: if true, tokens released on close; if false, user must = release * @sk_owner: reference to the real owner of the socket that calls * sock_lock_init_class_and_name(). 
*/ @@ -579,7 +583,12 @@ struct sock { struct numa_drop_counters *sk_drop_counters; struct rcu_head sk_rcu; netns_tracker ns_tracker; - struct xarray sk_user_frags; + struct { + struct xarray frags; + struct net_devmem_dmabuf_binding *binding; + atomic_t outstanding_urefs; + bool autorelease; + } sk_devmem_info; =20 #if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES) struct module *sk_owner; diff --git a/net/core/devmem.c b/net/core/devmem.c index 4dee2666dd07..904d19e58f4b 100644 --- a/net/core/devmem.c +++ b/net/core/devmem.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -116,6 +117,24 @@ void net_devmem_free_dmabuf(struct net_iov *niov) gen_pool_free(binding->chunk_pool, dma_addr, PAGE_SIZE); } =20 +static void +net_devmem_dmabuf_binding_put_urefs(struct net_devmem_dmabuf_binding *bind= ing) +{ + int i; + + for (i =3D 0; i < binding->dmabuf->size / PAGE_SIZE; i++) { + struct net_iov *niov; + netmem_ref netmem; + + niov =3D binding->vec[i]; + netmem =3D net_iov_to_netmem(niov); + + /* Multiple urefs map to only a single netmem ref. */ + if (atomic_xchg(&niov->uref, 0) > 0) + WARN_ON_ONCE(!napi_pp_put_page(netmem)); + } +} + void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding) { struct netdev_rx_queue *rxq; @@ -143,6 +162,10 @@ void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf= _binding *binding) __net_mp_close_rxq(binding->dev, rxq_idx, &mp_params); } =20 + /* Clean up any lingering urefs from sockets that had autorelease + * disabled. 
+ */ + net_devmem_dmabuf_binding_put_urefs(binding); net_devmem_dmabuf_binding_put(binding); } =20 @@ -231,14 +254,13 @@ net_devmem_bind_dmabuf(struct net_device *dev, goto err_detach; } =20 - if (direction =3D=3D DMA_TO_DEVICE) { - binding->vec =3D kvmalloc_array(dmabuf->size / PAGE_SIZE, - sizeof(struct net_iov *), - GFP_KERNEL); - if (!binding->vec) { - err =3D -ENOMEM; - goto err_unmap; - } + /* Used by TX and also by RX when socket has autorelease disabled */ + binding->vec =3D kvmalloc_array(dmabuf->size / PAGE_SIZE, + sizeof(struct net_iov *), + GFP_KERNEL | __GFP_ZERO); + if (!binding->vec) { + err =3D -ENOMEM; + goto err_unmap; } =20 /* For simplicity we expect to make PAGE_SIZE allocations, but the @@ -292,10 +314,10 @@ net_devmem_bind_dmabuf(struct net_device *dev, niov =3D &owner->area.niovs[i]; niov->type =3D NET_IOV_DMABUF; niov->owner =3D &owner->area; + atomic_set(&niov->uref, 0); page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov), net_devmem_get_dma_addr(niov)); - if (direction =3D=3D DMA_TO_DEVICE) - binding->vec[owner->area.base_virtual / PAGE_SIZE + i] =3D niov; + binding->vec[owner->area.base_virtual / PAGE_SIZE + i] =3D niov; } =20 virtual +=3D len; diff --git a/net/core/devmem.h b/net/core/devmem.h index 2ada54fb63d7..d4eb28d079bb 100644 --- a/net/core/devmem.h +++ b/net/core/devmem.h @@ -61,7 +61,7 @@ struct net_devmem_dmabuf_binding { =20 /* Array of net_iov pointers for this binding, sorted by virtual * address. This array is convenient to map the virtual addresses to - * net_iovs in the TX path. + * net_iovs. 
*/ struct net_iov **vec; =20 diff --git a/net/core/sock.c b/net/core/sock.c index 5562f517d889..465645c1d74f 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -87,6 +87,7 @@ =20 #include #include +#include #include #include #include @@ -151,6 +152,7 @@ #include =20 #include "dev.h" +#include "devmem.h" =20 static DEFINE_MUTEX(proto_list_mutex); static LIST_HEAD(proto_list); @@ -1081,6 +1083,43 @@ static int sock_reserve_memory(struct sock *sk, int = bytes) #define MAX_DONTNEED_TOKENS 128 #define MAX_DONTNEED_FRAGS 1024 =20 +static noinline_for_stack int +sock_devmem_dontneed_manual_release(struct sock *sk, struct dmabuf_token *= tokens, + unsigned int num_tokens) +{ + struct net_iov *niov; + unsigned int i, j; + netmem_ref netmem; + unsigned int token; + int num_frags =3D 0; + int ret =3D 0; + + if (!sk->sk_devmem_info.binding) + return -EINVAL; + + for (i =3D 0; i < num_tokens; i++) { + for (j =3D 0; j < tokens[i].token_count; j++) { + token =3D tokens[i].token_start + j; + if (token >=3D sk->sk_devmem_info.binding->dmabuf->size / PAGE_SIZE) + break; + + if (++num_frags > MAX_DONTNEED_FRAGS) + return ret; + + niov =3D sk->sk_devmem_info.binding->vec[token]; + if (atomic_dec_and_test(&niov->uref)) { + netmem =3D net_iov_to_netmem(niov); + WARN_ON_ONCE(!napi_pp_put_page(netmem)); + } + ret++; + } + } + + atomic_sub(ret, &sk->sk_devmem_info.outstanding_urefs); + + return ret; +} + static noinline_for_stack int sock_devmem_dontneed_autorelease(struct sock *sk, struct dmabuf_token *tok= ens, unsigned int num_tokens) @@ -1089,32 +1128,32 @@ sock_devmem_dontneed_autorelease(struct sock *sk, s= truct dmabuf_token *tokens, int ret =3D 0, num_frags =3D 0; netmem_ref netmems[16]; =20 - xa_lock_bh(&sk->sk_user_frags); + xa_lock_bh(&sk->sk_devmem_info.frags); for (i =3D 0; i < num_tokens; i++) { for (j =3D 0; j < tokens[i].token_count; j++) { if (++num_frags > MAX_DONTNEED_FRAGS) goto frag_limit_reached; =20 netmem_ref netmem =3D (__force netmem_ref)__xa_erase( - 
&sk->sk_user_frags, tokens[i].token_start + j); + &sk->sk_devmem_info.frags, tokens[i].token_start + j); =20 if (!netmem || WARN_ON_ONCE(!netmem_is_net_iov(netmem))) continue; =20 netmems[netmem_num++] =3D netmem; if (netmem_num =3D=3D ARRAY_SIZE(netmems)) { - xa_unlock_bh(&sk->sk_user_frags); + xa_unlock_bh(&sk->sk_devmem_info.frags); for (k =3D 0; k < netmem_num; k++) WARN_ON_ONCE(!napi_pp_put_page(netmems[k])); netmem_num =3D 0; - xa_lock_bh(&sk->sk_user_frags); + xa_lock_bh(&sk->sk_devmem_info.frags); } ret++; } } =20 frag_limit_reached: - xa_unlock_bh(&sk->sk_user_frags); + xa_unlock_bh(&sk->sk_devmem_info.frags); for (k =3D 0; k < netmem_num; k++) WARN_ON_ONCE(!napi_pp_put_page(netmems[k])); =20 @@ -1145,7 +1184,11 @@ sock_devmem_dontneed(struct sock *sk, sockptr_t optv= al, unsigned int optlen) return -EFAULT; } =20 - ret =3D sock_devmem_dontneed_autorelease(sk, tokens, num_tokens); + if (sk->sk_devmem_info.autorelease) + ret =3D sock_devmem_dontneed_autorelease(sk, tokens, num_tokens); + else + ret =3D sock_devmem_dontneed_manual_release(sk, tokens, + num_tokens); =20 kvfree(tokens); return ret; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a9345aa5a2e5..052875c1b547 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -260,6 +260,7 @@ #include #include #include +#include #include #include #include @@ -492,7 +493,10 @@ void tcp_init_sock(struct sock *sk) =20 set_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags); sk_sockets_allocated_inc(sk); - xa_init_flags(&sk->sk_user_frags, XA_FLAGS_ALLOC1); + xa_init_flags(&sk->sk_devmem_info.frags, XA_FLAGS_ALLOC1); + sk->sk_devmem_info.binding =3D NULL; + atomic_set(&sk->sk_devmem_info.outstanding_urefs, 0); + sk->sk_devmem_info.autorelease =3D true; } EXPORT_IPV6_MOD(tcp_init_sock); =20 @@ -2424,11 +2428,11 @@ static void tcp_xa_pool_commit_locked(struct sock *= sk, struct tcp_xa_pool *p) =20 /* Commit part that has been copied to user space. 
*/ for (i =3D 0; i < p->idx; i++) - __xa_cmpxchg(&sk->sk_user_frags, p->tokens[i], XA_ZERO_ENTRY, + __xa_cmpxchg(&sk->sk_devmem_info.frags, p->tokens[i], XA_ZERO_ENTRY, (__force void *)p->netmems[i], GFP_KERNEL); /* Rollback what has been pre-allocated and is no longer needed. */ for (; i < p->max; i++) - __xa_erase(&sk->sk_user_frags, p->tokens[i]); + __xa_erase(&sk->sk_devmem_info.frags, p->tokens[i]); =20 p->max =3D 0; p->idx =3D 0; @@ -2436,14 +2440,17 @@ static void tcp_xa_pool_commit_locked(struct sock *= sk, struct tcp_xa_pool *p) =20 static void tcp_xa_pool_commit(struct sock *sk, struct tcp_xa_pool *p) { + if (!sk->sk_devmem_info.autorelease) + return; + if (!p->max) return; =20 - xa_lock_bh(&sk->sk_user_frags); + xa_lock_bh(&sk->sk_devmem_info.frags); =20 tcp_xa_pool_commit_locked(sk, p); =20 - xa_unlock_bh(&sk->sk_user_frags); + xa_unlock_bh(&sk->sk_devmem_info.frags); } =20 static int tcp_xa_pool_refill(struct sock *sk, struct tcp_xa_pool *p, @@ -2454,18 +2461,18 @@ static int tcp_xa_pool_refill(struct sock *sk, stru= ct tcp_xa_pool *p, if (p->idx < p->max) return 0; =20 - xa_lock_bh(&sk->sk_user_frags); + xa_lock_bh(&sk->sk_devmem_info.frags); =20 tcp_xa_pool_commit_locked(sk, p); =20 for (k =3D 0; k < max_frags; k++) { - err =3D __xa_alloc(&sk->sk_user_frags, &p->tokens[k], + err =3D __xa_alloc(&sk->sk_devmem_info.frags, &p->tokens[k], XA_ZERO_ENTRY, xa_limit_31b, GFP_KERNEL); if (err) break; } =20 - xa_unlock_bh(&sk->sk_user_frags); + xa_unlock_bh(&sk->sk_devmem_info.frags); =20 p->max =3D k; p->idx =3D 0; @@ -2479,12 +2486,14 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, cons= t struct sk_buff *skb, unsigned int offset, struct msghdr *msg, int remaining_len) { + struct net_devmem_dmabuf_binding *binding =3D NULL; struct dmabuf_cmsg dmabuf_cmsg =3D { 0 }; struct tcp_xa_pool tcp_xa_pool; unsigned int start; int i, copy, n; int sent =3D 0; int err =3D 0; + int refs =3D 0; =20 tcp_xa_pool.max =3D 0; tcp_xa_pool.idx =3D 0; @@ -2536,6 +2545,7 @@ 
static int
tcp_recvmsg_dmabuf(struct sock *sk, const = struct sk_buff *skb, skb_frag_t *frag =3D &skb_shinfo(skb)->frags[i]; struct net_iov *niov; u64 frag_offset; + u32 token; int end; =20 /* !skb_frags_readable() should indicate that ALL the @@ -2568,13 +2578,32 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, cons= t struct sk_buff *skb, start; dmabuf_cmsg.frag_offset =3D frag_offset; dmabuf_cmsg.frag_size =3D copy; - err =3D tcp_xa_pool_refill(sk, &tcp_xa_pool, - skb_shinfo(skb)->nr_frags - i); - if (err) + + binding =3D net_devmem_iov_binding(niov); + + if (!sk->sk_devmem_info.binding) + sk->sk_devmem_info.binding =3D binding; + + if (sk->sk_devmem_info.binding !=3D binding) { + err =3D -EFAULT; goto out; + } + + if (sk->sk_devmem_info.autorelease) { + err =3D tcp_xa_pool_refill(sk, &tcp_xa_pool, + skb_shinfo(skb)->nr_frags - i); + if (err) + goto out; + + dmabuf_cmsg.frag_token =3D + tcp_xa_pool.tokens[tcp_xa_pool.idx]; + } else { + token =3D net_iov_virtual_addr(niov) >> PAGE_SHIFT; + dmabuf_cmsg.frag_token =3D token; + } + =20 /* Will perform the exchange later */ - dmabuf_cmsg.frag_token =3D tcp_xa_pool.tokens[tcp_xa_pool.idx]; dmabuf_cmsg.dmabuf_id =3D net_devmem_iov_binding_id(niov); =20 offset +=3D copy; @@ -2587,8 +2616,15 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const= struct sk_buff *skb, if (err) goto out; =20 - atomic_long_inc(&niov->pp_ref_count); - tcp_xa_pool.netmems[tcp_xa_pool.idx++] =3D skb_frag_netmem(frag); + if (sk->sk_devmem_info.autorelease) { + atomic_long_inc(&niov->pp_ref_count); + tcp_xa_pool.netmems[tcp_xa_pool.idx++] =3D + skb_frag_netmem(frag); + } else { + if (atomic_inc_return(&niov->uref) =3D=3D 1) + atomic_long_inc(&niov->pp_ref_count); + refs++; + } =20 sent +=3D copy; =20 @@ -2599,6 +2635,7 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const = struct sk_buff *skb, } =20 tcp_xa_pool_commit(sk, &tcp_xa_pool); + if (!remaining_len) goto out; =20 @@ -2617,9 +2654,13 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const= 
struct sk_buff *skb, =20 out: tcp_xa_pool_commit(sk, &tcp_xa_pool); + if (!sent) sent =3D err; =20 + if (refs > 0) + atomic_add(refs, &sk->sk_devmem_info.outstanding_urefs); + return sent; } =20 diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 40a76da5364a..dbb7a71e3cce 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -89,6 +89,9 @@ =20 #include =20 +#include +#include "../core/devmem.h" + #include =20 #ifdef CONFIG_TCP_MD5SIG @@ -2493,7 +2496,7 @@ static void tcp_release_user_frags(struct sock *sk) unsigned long index; void *netmem; =20 - xa_for_each(&sk->sk_user_frags, index, netmem) + xa_for_each(&sk->sk_devmem_info.frags, index, netmem) WARN_ON_ONCE(!napi_pp_put_page((__force netmem_ref)netmem)); #endif } @@ -2502,9 +2505,11 @@ void tcp_v4_destroy_sock(struct sock *sk) { struct tcp_sock *tp =3D tcp_sk(sk); =20 - tcp_release_user_frags(sk); + if (sk->sk_devmem_info.binding && sk->sk_devmem_info.autorelease) + tcp_release_user_frags(sk); =20 - xa_destroy(&sk->sk_user_frags); + xa_destroy(&sk->sk_devmem_info.frags); + sk->sk_devmem_info.binding =3D NULL; =20 trace_tcp_destroy_sock(sk); =20 diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index ded2cf1f6006..a017dea35bb8 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -663,7 +663,10 @@ struct sock *tcp_create_openreq_child(const struct soc= k *sk, =20 __TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS); =20 - xa_init_flags(&newsk->sk_user_frags, XA_FLAGS_ALLOC1); + xa_init_flags(&newsk->sk_devmem_info.frags, XA_FLAGS_ALLOC1); + newsk->sk_devmem_info.binding =3D NULL; + atomic_set(&newsk->sk_devmem_info.outstanding_urefs, 0); + newsk->sk_devmem_info.autorelease =3D true; =20 return newsk; } --=20 2.47.3 From nobody Fri Dec 19 10:55:54 2025 Received: from mail-yw1-f172.google.com (mail-yw1-f172.google.com [209.85.128.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org 
(Postfix) with ESMTPS id CD79B23C512 for ; Wed, 5 Nov 2025 01:23:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305811; cv=none; b=IeCq6TR8UTavaKTqp5BPdDvsQaaXIoFx18EG7SWb2SCChkiNRhvkQtnW+jdwAWRzg9FWsfid9yxsiVGgZNhbtmm5Z++YihNk0PMkzvaoYiqcNVLYUg4yNO2Q53dwTEtIj9QwSd2hXLPXR/hhYKCVEfnwYWU69IbRC9D6ktBa+pU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305811; c=relaxed/simple; bh=yCIkyL0h8kHUFJCTSyMV+CAW54KfpLm6K2H45fUhKZI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=iFGsA9ZVIEu6n1m41MGEe6Qg/sR1AnG8tlxHqqWDvVWVIb4tEQgFdg9pye+mC35jUpawlbWIOT7FiyAfHdcWQtO/NrHTtGMEG2kfIeXugk/vRDOmsJLshXTj8Cxmqob2x0c+kxNaP+k3/mBMqYDeuKvJIQZ9J1Gg71TUTMhzEVY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=dBqY/or0; arc=none smtp.client-ip=209.85.128.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dBqY/or0" Received: by mail-yw1-f172.google.com with SMTP id 00721157ae682-786a317fe78so6631257b3.2 for ; Tue, 04 Nov 2025 17:23:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762305807; x=1762910607; darn=vger.kernel.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=++wGl6enUw/dRP6YEy+waldxVwK8Dc+ChAEagCck49g=; b=dBqY/or07iDvxPtc+MG+dOBb+UIttyvVsX0qd+7HffYqxVfQuf53ekC5v0nGmVUigx 
aVG2aoDMq4LwMvj5T/RQZpbaTAnPhC06C8/aN/W90TaJ4NmPLNhQhVS3F24GdHLPpBin Mucifdvh2YQiiN+FuzxetELN5cBdZKNrmKCK9intXQItXjPZ7P45q0mHRNyAFhHcCQyk +m7a0jvuryS25COdeiFxA7OetXAOzCny1nS2ZjQ9wqeMSuCUXPna4T36aiQbG0q8NB7T 5OQdDr72xb2W9acG45zbK+x53HeiWqh5QJYP/rY2H6gRtBu0IXp2C+wygRQ7J3u6IMw9 CuuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762305807; x=1762910607; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=++wGl6enUw/dRP6YEy+waldxVwK8Dc+ChAEagCck49g=; b=R0al9abM/zHPDcXI7fWiMzpDN1fQJER+orwritnUtcFj1NQzgixyjpFnm8eOekuWiD UFQ6Q8f4YnlRXnk9Hxb9hHen6dK3kAF7ZZgsz5zxzYEzqg6kb7yLkwiOljsYvthqUWQ/ wrEFAvXXwrEUik2tU7ulLlPjztE4kz9OLG/strd/viuTsio3M9c0qHDYTlwwRoFfrgJs dZgv1APEptAtMBKwuRRS/mOSDihxP7y9lQOdz2vV5s3j4kIB7yqANP4qeifNUWr/6dTv 3KET6gi6/1CGzIjzaPeJQAgeQQ16S8DObyIMAT9xyFUr13qzukkvKKYuYU9xZFTAWR6K Mb6w== X-Forwarded-Encrypted: i=1; AJvYcCVjVhKPl/h6w1N3vywT4mCmp9Re3dgdqTIB53Adz6StU+NtK61UhTaRbM1LkCUhPIp8S4sZZDByILqV6Kw=@vger.kernel.org X-Gm-Message-State: AOJu0YwvB8PTKVmO4Xth/v5Uwk0doXg8vtpqtwxzsmRlKG2Zgey7ZEtK 1haHUUH9RqLUU+TA48cDWkLBZacHo8KAFFpMWRw5E5ctojCHA1pzogvM X-Gm-Gg: ASbGncsYLAwDCoQKCC+yr5Sw4e5rpe7y+FbmwLuP/hlT9ywg5r8zfKI+gREHWccWMg+ ddy0ClerfsygWdnGc7GUAu/dpfkop6OeC9iRWuhz9g4H/ey54EJh7aQg0ZZ1RbYZRLtCtaaXccr MfXz1L9hWUcjV1H3pva/Gt672/PjbXtjyvYDGIEGOQXlgsbAP3pl2sQx6WvonPtB/1/fDavcAmi OWi3pYRkbV6WY/g1vOpRQGYFGbXyZQL9OUJL31H9NpxKhLehRrm+mgzWl/GMQhTrpNchflgX1JQ zB5md0ao1IRdShrCyBLKKqpOqttBNYR2Itv5hXWPEptqckHhNidPvA0ZTxZLs1j8oKvcYMJViHK hdlsWjK/CeipbtSbY2o8svSvcJQwC/kiukV3/NiZhdUzfzQR9hxWrCz18sCrHciP411kLw0ETaC m5rFR51svJ5+c= X-Google-Smtp-Source: AGHT+IEQJ9WEsquzTbokNdx5L6vycggE3/QvpfsVJ9sSnQ7RONk0w2fr5nFci2oGs61wzZsC3j65UQ== X-Received: by 2002:a05:690c:6111:b0:786:61c6:7e71 with SMTP id 00721157ae682-786a41b3d21mr16312747b3.33.1762305806699; Tue, 04 Nov 2025 17:23:26 -0800 (PST) Received: from localhost 
([2a03:2880:25ff:74::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78691d8ef92sm14967307b3.5.2025.11.04.17.23.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Nov 2025 17:23:26 -0800 (PST) From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:23 -0800 Subject: [PATCH net-next v6 4/6] net: devmem: add SO_DEVMEM_AUTORELEASE for autorelease control Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-4-ea98cf4d40b3@meta.com> References: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Add SO_DEVMEM_AUTORELEASE socket option to allow applications to control token release behavior on a per-socket basis. The socket option accepts boolean values (0 or 1): - 1 (true): outstanding tokens are automatically released when the socket closes - 0 (false): outstanding tokens are released when the dmabuf is unbound The option can only be changed when the socket has no outstanding tokens, enforced by checking: 1. The frags xarray is empty (no tokens in autorelease mode) 2. The outstanding_urefs counter is zero (no tokens in manual mode) This restriction prevents inconsistent token tracking state between acquisition and release calls. 
If either condition fails, setsockopt returns -EBUSY. The default state is autorelease off. Signed-off-by: Bobby Eshleman --- include/uapi/asm-generic/socket.h | 2 ++ net/core/sock.c | 51 +++++++++++++++++++++++++++++= ++++ net/ipv4/tcp.c | 2 +- tools/include/uapi/asm-generic/socket.h | 2 ++ 4 files changed, 56 insertions(+), 1 deletion(-) diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/s= ocket.h index 53b5a8c002b1..59302318bb34 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -150,6 +150,8 @@ #define SO_INQ 84 #define SCM_INQ SO_INQ =20 +#define SO_DEVMEM_AUTORELEASE 85 + #if !defined(__KERNEL__) =20 #if __BITS_PER_LONG =3D=3D 64 || (defined(__x86_64__) && defined(__ILP32__= )) diff --git a/net/core/sock.c b/net/core/sock.c index 465645c1d74f..27af476f3cd3 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1160,6 +1160,46 @@ sock_devmem_dontneed_autorelease(struct sock *sk, st= ruct dmabuf_token *tokens, return ret; } =20 +static noinline_for_stack int +sock_devmem_set_autorelease(struct sock *sk, sockptr_t optval, unsigned in= t optlen) +{ + int val; + + if (!sk_is_tcp(sk)) + return -EBADF; + + if (optlen < sizeof(int)) + return -EINVAL; + + if (copy_from_sockptr(&val, optval, sizeof(val))) + return -EFAULT; + + /* Validate that val is 0 or 1 */ + if (val !=3D 0 && val !=3D 1) + return -EINVAL; + + sockopt_lock_sock(sk); + + /* Can only change autorelease if: + * 1. No tokens in the frags xarray (autorelease mode) + * 2. 
No outstanding urefs (manual release mode) + */ + if (!xa_empty(&sk->sk_devmem_info.frags)) { + sockopt_release_sock(sk); + return -EBUSY; + } + + if (atomic_read(&sk->sk_devmem_info.outstanding_urefs) > 0) { + sockopt_release_sock(sk); + return -EBUSY; + } + + sk->sk_devmem_info.autorelease =3D !!val; + + sockopt_release_sock(sk); + return 0; +} + static noinline_for_stack int sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optle= n) { @@ -1351,6 +1391,9 @@ int sk_setsockopt(struct sock *sk, int level, int opt= name, #ifdef CONFIG_PAGE_POOL case SO_DEVMEM_DONTNEED: return sock_devmem_dontneed(sk, optval, optlen); + + case SO_DEVMEM_AUTORELEASE: + return sock_devmem_set_autorelease(sk, optval, optlen); #endif case SO_SNDTIMEO_OLD: case SO_SNDTIMEO_NEW: @@ -2208,6 +2251,14 @@ int sk_getsockopt(struct sock *sk, int level, int op= tname, v.val =3D READ_ONCE(sk->sk_txrehash); break; =20 +#ifdef CONFIG_PAGE_POOL + case SO_DEVMEM_AUTORELEASE: + if (!sk_is_tcp(sk)) + return -EBADF; + v.val =3D sk->sk_devmem_info.autorelease; + break; +#endif + default: /* We implement the SO_SNDLOWAT etc to not be settable * (1003.1g 7). 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 052875c1b547..8226ba892b36 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -496,7 +496,7 @@ void tcp_init_sock(struct sock *sk) xa_init_flags(&sk->sk_devmem_info.frags, XA_FLAGS_ALLOC1); sk->sk_devmem_info.binding =3D NULL; atomic_set(&sk->sk_devmem_info.outstanding_urefs, 0); - sk->sk_devmem_info.autorelease =3D true; + sk->sk_devmem_info.autorelease =3D false; } EXPORT_IPV6_MOD(tcp_init_sock); =20 diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/a= sm-generic/socket.h index f333a0ac4ee4..9710a3d7cc4d 100644 --- a/tools/include/uapi/asm-generic/socket.h +++ b/tools/include/uapi/asm-generic/socket.h @@ -147,6 +147,8 @@ =20 #define SO_PASSRIGHTS 83 =20 +#define SO_DEVMEM_AUTORELEASE 85 + #if !defined(__KERNEL__) =20 #if __BITS_PER_LONG =3D=3D 64 || (defined(__x86_64__) && defined(__ILP32__= )) --=20 2.47.3 From nobody Fri Dec 19 10:55:54 2025 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D983125BEE7 for ; Wed, 5 Nov 2025 01:23:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305812; cv=none; b=ICf6x7qZf9VwTG9idoL5hVEBeZnlRuE7IEbkSqXsB4rT7p3p49cc/Diz0jm6e/VrpufqAv6g/WLFeT11UuQozRaluooc5UjBZBtDXwThuHLyAb4/zfazRsiOgt154P/KfY62NgXYSAIttTvKvZ7gqooeb4U8OnIB+zB1yQd0HnA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305812; c=relaxed/simple; bh=9ZdVGaIt/OvZxDjCG8t0f5xRJnb5qgTQFaBNuCOVr0Y=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=WqYfyDFTvAZkT+nd7rYFxsfgHJkb8Y/h6AF97goTS6R9kGc5HqWjbTomUeYdYzqErlrtjrNzNpfYENXV6cJbbH23pkDLsC8n/ekpRAIyYIOLVNkOTlKX2u61r4YpbGYCjugYl1RXM8FaLqP+jWgBN/+gom7EvbeLFUapmEgmxoY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=cs9ugeJJ; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cs9ugeJJ" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-7866c61a9bbso31140457b3.1 for ; Tue, 04 Nov 2025 17:23:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762305808; x=1762910608; darn=vger.kernel.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=IXQPAkEgQbO8euIcwuGERIBFcD6kBEPGrPcFSGxSLm8=; b=cs9ugeJJrOPiGhprQlaFybPw90yy1tKkHgTW4e2KCR4YsIyQEVt96vg/NXcY6lihGc Ag4CZaDgiBu9IYRtlgOvW60JEyl+7OdB+52qv3Dhmh1gqM9n7OIAc1e51qmaqe0OM/Jl cAOgPDsxcV04FsI4KPeX/gu5W7QEw3rZ64Syn4YItgFYRQRsUGst7BKOk+AosltjfkZ8 OZ4sbaYO8iVfv8Q8QMn+39OKujZCUrI2GGP6WP6iJxu5Ia7e15+GTfJNQkya+OxYgdp6 ST9Q8GD4g76No5HggNmx+3NWbXfi6RwJhMxWlwUkQZnULPp4YjTq0hZ1DpnfkWteCiHo Ip9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762305808; x=1762910608; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IXQPAkEgQbO8euIcwuGERIBFcD6kBEPGrPcFSGxSLm8=; b=TFl5lt1nXXh992QbdLTB8ISfRUT0TF0F6JZj24DfXf3Jt/vmjzPtLZayYg0YXiZ5Qd 
4cr9EwJliiRwcKdxRKyTvSW6lZ6vZNPUZD0ZwhzbIkhzRgx+P2+5bSVgCOrBOckXVyKp j327SC1NhH9+mIetgYMt7ZyIL9G33ued/dXpu7P5O/fb/zJ3VyqS/Y1GkGdn2zGt3JOp ytuc9inAUzPWK5nmmW4OEQL2TAVLulnnAt4y3AHZMj/TH1+/Z405t7iyzpoycJycKENF JxyBSSyF+LCAeufYqdPN6+VbCvXdGX0XjG0aK5I0+/afFxSYsEYcmc/oqoz5Camv4L6e 7zcg== X-Forwarded-Encrypted: i=1; AJvYcCWS32wU2X3qufBLW7Cg0xV27+Fl+cstA1IOiMWlFqoqOi2f8yBqyjvWSeqfPR9uMLdwB5okwTVIO2BcTT8=@vger.kernel.org X-Gm-Message-State: AOJu0YxFLCt/Cr4N7vmvLkazz+0VYyOuA4orhWUDC3RgXHdyPlj+crrM rX+ZpbDJTj3r77wU9GY/yqQkr2zJsBcQEK1OToqaoDAtH8b5HUI2Dbdj X-Gm-Gg: ASbGncvWUjtakrJOuBGWuB9E8BWgjnnkCU+iRgDhNCYLHJV1xgGsdccwJZ4TKCQz+XZ H0NNrwP29pRy7rAA8QP2dNCOgI7dUzhrf05eQKeE6cUzozWuOEiBbrn90z9KdptZlglzzY1au/g wplLgXICsRFC9pgU+pgb/X6xgc9i/k2XwnCHG1yfV4haAgh8lzRf0gP96ATVA/Q5depeMgGr/ck 2dt3L/Lc+bBESYJE8vVHAYGGi4THl00Xufv5xcA7FC7Tcc166y98p+xgS1b1IDWMhrvddpD6Bsc nOIm0UMjmQ0RHRe1tvn5XJiX93/9tXTun8lJGQcADF6GW2sgu6KuqOoE2OerwAQ+u3JHE+p3I2v y8TVr3cozAng5/Khi5eNZdw2PZMTjBh4vkfCvapiwefvc9CqZO+N2pcKoJionbGdBe0m/qTcpuL dGk/pDsY9XeMI= X-Google-Smtp-Source: AGHT+IGmZcasptx2/HU85kDuwAXzBhZHQkoteu1oE0YpB1MwZKyvKWzYVmBym2flQTEn3Ob02W1r2A== X-Received: by 2002:a05:690c:6311:b0:786:652d:50e with SMTP id 00721157ae682-786a419d7d6mr14885297b3.37.1762305807568; Tue, 04 Nov 2025 17:23:27 -0800 (PST) Received: from localhost ([2a03:2880:25ff:70::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78691db1369sm15259657b3.16.2025.11.04.17.23.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Nov 2025 17:23:27 -0800 (PST) From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:24 -0800 Subject: [PATCH net-next v6 5/6] net: devmem: document SO_DEVMEM_AUTORELEASE socket option Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-5-ea98cf4d40b3@meta.com> References: 
<20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Update devmem.rst documentation to describe the new SO_DEVMEM_AUTORELEASE socket option and its usage. Document the following: - The two token release modes (automatic vs manual) - How to use SO_DEVMEM_AUTORELEASE to control the behavior - Performance benefits of disabling autorelease (~10% CPU reduction) - Restrictions and caveats of manual token release - Usage examples for both getsockopt and setsockopt Signed-off-by: Bobby Eshleman --- Documentation/networking/devmem.rst | 70 +++++++++++++++++++++++++++++++++= ++-- 1 file changed, 68 insertions(+), 2 deletions(-) diff --git a/Documentation/networking/devmem.rst b/Documentation/networking= /devmem.rst index a6cd7236bfbd..1bfce686dce6 100644 --- a/Documentation/networking/devmem.rst +++ b/Documentation/networking/devmem.rst @@ -215,8 +215,8 @@ Freeing frags ------------- =20 Frags received via SCM_DEVMEM_DMABUF are pinned by the kernel while the us= er -processes the frag. The user must return the frag to the kernel via -SO_DEVMEM_DONTNEED:: +processes the frag. Users should return tokens to the kernel via +SO_DEVMEM_DONTNEED when they are done processing the data:: =20 ret =3D setsockopt(client_fd, SOL_SOCKET, SO_DEVMEM_DONTNEED, &token, sizeof(token)); @@ -235,6 +235,72 @@ can be less than the tokens provided by the user in ca= se of: (a) an internal kernel leak bug. 
(b) the user passed more than 1024 frags. =20 + +Autorelease Control +~~~~~~~~~~~~~~~~~~~ + +The SO_DEVMEM_AUTORELEASE socket option controls what happens to outstandi= ng +tokens (tokens not released via SO_DEVMEM_DONTNEED) when the socket closes= :: + + int autorelease =3D 0; /* 0 =3D manual release, 1 =3D automatic release = */ + ret =3D setsockopt(client_fd, SOL_SOCKET, SO_DEVMEM_AUTORELEASE, + &autorelease, sizeof(autorelease)); + + /* Query current setting */ + int current_val; + socklen_t len =3D sizeof(current_val); + ret =3D getsockopt(client_fd, SOL_SOCKET, SO_DEVMEM_AUTORELEASE, + &current_val, &len); + +When autorelease is disabled (default): + +- Outstanding tokens are NOT released when the socket closes +- Outstanding tokens are only released when the dmabuf is unbound +- Provides better performance by eliminating xarray overhead (~10% CPU red= uction) +- Kernel tracks tokens via atomic reference counters in net_iov structures + +When autorelease is enabled: + +- Outstanding tokens are automatically released when the socket closes +- Backwards compatible behavior +- Kernel tracks tokens in an xarray per socket + +Important: In both modes, applications should call SO_DEVMEM_DONTNEED to +return tokens as soon as they are done processing. The autorelease setting= only +affects what happens to tokens that are still outstanding when close() is = called. + +The autorelease setting can only be changed when the socket has no outstan= ding +tokens. If tokens are present, setsockopt returns -EBUSY.
+ + +Performance Considerations +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Disabling autorelease provides approximately 10% CPU utilization improvem= ent in +RX workloads by: + +- Eliminating xarray allocations and lookups for token tracking +- Using atomic reference counters instead +- Reducing lock contention on the xarray spinlock + +However, applications must ensure all tokens are released via +SO_DEVMEM_DONTNEED before closing the socket, otherwise the backing pages = will +remain pinned until the dmabuf is unbound. + + +Caveats +~~~~~~~ + +- With autorelease disabled, sockets cannot switch between different dmabuf + bindings. This restriction exists because tokens in this mode do not enc= ode + the binding information necessary to perform the token release. + +- Applications using manual release mode (autorelease=3D0) must ensure all= tokens + are returned via SO_DEVMEM_DONTNEED before socket close to avoid resource + leaks during the lifetime of the dmabuf binding. Tokens not released bef= ore + close() will only be freed when the dmabuf is unbound.
+ + TX Interface =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 --=20 2.47.3 From nobody Fri Dec 19 10:55:54 2025 Received: from mail-yx1-f52.google.com (mail-yx1-f52.google.com [74.125.224.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FC0B25783C for ; Wed, 5 Nov 2025 01:23:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305814; cv=none; b=aS99PR7bRNnKqfB9upNI5rdFF5J1rO1q/QdPMaWLtnM04/5YpygyBBfbspvM+Pi8WMOJu6Mq3bOJO0s24d0VBQH3iUa9u3v6/Kvp99j01Hg4+ISg5RWylYXXhGLnqBcMiCX27oEwiNBLZBV9hRoIz+ElXQ76m8IKMpxjxREwKz4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762305814; c=relaxed/simple; bh=EUVr5JNK/GTAT1ttd11lgstVq87db7L7gS8Y83xgPYo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Nzi2j3aWgyoM9vB9sB7Drxz/Wk7TDQKPq7xR65yLPFFLqRBzwejcvmf2xuxidjSJzvzI7pAu+lO7EPKZNiedy2aNXqY5Y6wGnq6+IhVQIIC+c7xnDgMcIrTjzu09qGinGW7AmQR0h6xBBuOwthtWi3uLjb0oeAlk2g8cg44O4NI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=S4qTPyPy; arc=none smtp.client-ip=74.125.224.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="S4qTPyPy" Received: by mail-yx1-f52.google.com with SMTP id 956f58d0204a3-63f945d2060so4469437d50.3 for ; Tue, 04 Nov 2025 17:23:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1762305808; 
x=1762910608; darn=vger.kernel.org; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:from:to:cc:subject:date:message-id :reply-to; bh=4eVDPnkuv2SP94dsK2EFl5pOIR+N6oa4rL2rsMVs/R8=; b=S4qTPyPymyijDVhDyYOxTorfZgWjhzgv9UvEmkeogZrad0+Ni11qfJbH2VXOmsGQkT WSRN7s725oc7WScFEClLyvYtiHkMENn1NLLdRaZ0MlPXMKxMDTnnbI6dxo8PUonPPnVu Ia1cSW+GDU6XTsyY053c3jrI4HrViJipGWuQE92rDsVXIV9SC1MnG8Fsmnt3S5VDYCNj C1mrFIV8JWo0ScXI/uGmBiLLwcHcgEeW7l5Uh+zl6LOwSJo8A4usOKIiN7SE5hUbPBm9 bPfFwN12J8klL5zDqH3t57kdeuhPmKjQERplVgBG0/dDL5py8YiVzmtrA5w+8o8GDzyc aUtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762305808; x=1762910608; h=cc:to:in-reply-to:references:message-id:content-transfer-encoding :mime-version:subject:date:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4eVDPnkuv2SP94dsK2EFl5pOIR+N6oa4rL2rsMVs/R8=; b=RfY4sjFPngIKfaA8shHnRHaPGPj5P4EVTMDzBEzloCQLJxT7wdFgAU+mbd1jfXLarO iJvoSpy7km3zobWb32JZTqd5muwed3afzchhfCkZlSvOzXguR77HWLimBoEaTvGFQp+C ZHPUZtvG2YrPXp0uGVsJhp3sXVyetxu7IrFA/k7KsBzYwW1p1igJi4r0PRBBf6SDHOhQ 1/ixjazP8riV5S424DmwzDMOYDyiNzxN9E/Rka9PZ9gWxkBixEvei+8zWPu0nqRtNNeD aZhEFtZjFmPitUT28v2bmKh6QntYYbgXCZYKofvY2KhgKa9vcrQfPE/lD8BT/k86D7wk zwwA== X-Forwarded-Encrypted: i=1; AJvYcCX8a0jaJgwk04bhZ4HVU8dzTgkHYLL08yzAHKT03rLf/owdyaLlsJ9sdICFtdXkJBYRpxUOUrPGjhBj0V8=@vger.kernel.org X-Gm-Message-State: AOJu0Yznb1Iz6h75WLW2Xn9Cb05Yg925HUnu7oV+6RjLRFupteGhvmEu r6hPCDIuFimbH2I+w5Nr4C+MxOBvr9jwgO9zWSSJYpLk83bfuwqsr7CR X-Gm-Gg: ASbGncsNFPcbiWmbatu/EDqPh0hNwOlk5Kkd+Dr6ig5wXEl5b9KNzhnFbU93+lajGBy wYMK9Bx8WhBlbTAdVyffdIS+wr78GV4yx7ZRIhZ4WnLwS3iKLAXH8XrGiWp73URuVjK5veT2758 0/32qdS+0hMkBJLpGpLJsV1wVLsNp0BmBQS4LRVKN1apLP1eEAXiQV4Agu+uxkjpUsivYnhRFMB RP77GXWau7A42YeuXsIqPSjYqwMdMcLh91bI0WVqOfpFSgJ7pKW88ANsjfSCY3DsfRkpkp1fJD7 wJ1+W5+aU05drYqmSBuUeVV+hvDNAAAcGGeFy4p+yfaay2U2Yrm/Viya8dC7NvglG8tyIwSMQfE po4OgrNjdl4+Yd82GeslfgcxkAtskyhCGRicAoR/oyVlrI/GwieElS4U4kbxsBGAJayZFY4D6ry 
sIagD9V76UsEo= X-Google-Smtp-Source: AGHT+IEfDb+oJnGjarh+9F7GOHtv3gc1oPFaNFLIKdQInLnQKAgyBWcutw5Cbxj5+5Saxr4AJ66EZw== X-Received: by 2002:a05:690e:241c:b0:63f:9f5a:a555 with SMTP id 956f58d0204a3-63fd35a47ddmr1049504d50.50.1762305808562; Tue, 04 Nov 2025 17:23:28 -0800 (PST) Received: from localhost ([2a03:2880:25ff:5f::]) by smtp.gmail.com with ESMTPSA id 00721157ae682-7869a7b978fsm9216057b3.2.2025.11.04.17.23.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 04 Nov 2025 17:23:28 -0800 (PST) From: Bobby Eshleman Date: Tue, 04 Nov 2025 17:23:25 -0800 Subject: [PATCH net-next v6 6/6] net: devmem: add tests for SO_DEVMEM_AUTORELEASE socket option Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-6-ea98cf4d40b3@meta.com> References: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Kuniyuki Iwashima , Willem de Bruijn , Neal Cardwell , David Ahern , Arnd Bergmann , Jonathan Corbet , Andrew Lunn , Shuah Khan , Mina Almasry Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, Stanislav Fomichev , Bobby Eshleman X-Mailer: b4 0.14.3 From: Bobby Eshleman Add -A flag to ncdevmem to set autorelease mode. 

Add tests for the SO_DEVMEM_AUTORELEASE socket option. New tests include:

- check_sockopt_autorelease_default: Verifies default value is 0
- check_sockopt_autorelease_set_0: Tests setting to 0 and reading back
- check_sockopt_autorelease_set_1: Tests toggling from 0 to 1
- check_sockopt_autorelease_invalid: Tests invalid value (2) returns EINVAL
- check_autorelease_disabled: Tests ncdevmem in manual token release mode
- check_autorelease_enabled: Tests ncdevmem in autorelease mode

All check_sockopt tests gracefully skip with KsftSkipEx if
SO_DEVMEM_AUTORELEASE is not supported by the kernel.

Signed-off-by: Bobby Eshleman
---
 tools/testing/selftests/drivers/net/hw/devmem.py  | 115 +++++++++++++++++++++-
 tools/testing/selftests/drivers/net/hw/ncdevmem.c |  20 +++-
 2 files changed, 133 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/drivers/net/hw/devmem.py b/tools/testing/selftests/drivers/net/hw/devmem.py
index 45c2d49d55b6..29ec179d651f 100755
--- a/tools/testing/selftests/drivers/net/hw/devmem.py
+++ b/tools/testing/selftests/drivers/net/hw/devmem.py
@@ -1,6 +1,9 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: GPL-2.0
 
+import socket
+import errno
+
 from os import path
 from lib.py import ksft_run, ksft_exit
 from lib.py import ksft_eq, KsftSkipEx
@@ -63,12 +66,122 @@ def check_tx_chunks(cfg) -> None:
     ksft_eq(socat.stdout.strip(), "hello\nworld")
 
 
+@ksft_disruptive
+def check_autorelease_disabled(cfg) -> None:
+    """Test RX with autorelease disabled (requires manual token release in ncdevmem)"""
+    require_devmem(cfg)
+
+    port = rand_port()
+    socat = f"socat -u - TCP{cfg.addr_ipver}:{cfg.baddr}:{port},bind={cfg.remote_baddr}:{port}"
+    listen_cmd = f"{cfg.bin_local} -l -f {cfg.ifname} -s {cfg.addr} -p {port} -c {cfg.remote_addr} -v 7 -A 0"
+
+    with bkg(listen_cmd, exit_wait=True) as ncdevmem:
+        wait_port_listen(port)
+        cmd(f"yes $(echo -e \x01\x02\x03\x04\x05\x06) | \
+            head -c 1K | {socat}", host=cfg.remote, shell=True)
+
+    ksft_eq(ncdevmem.ret, 0)
+
+
+@ksft_disruptive
+def check_autorelease_enabled(cfg) -> None:
+    """Test RX with autorelease enabled (requires token autorelease in ncdevmem)"""
+    require_devmem(cfg)
+
+    port = rand_port()
+    socat = f"socat -u - TCP{cfg.addr_ipver}:{cfg.baddr}:{port},bind={cfg.remote_baddr}:{port}"
+    listen_cmd = f"{cfg.bin_local} -l -f {cfg.ifname} -s {cfg.addr} -p {port} -c {cfg.remote_addr} -v 7 -A 1"
+
+    with bkg(listen_cmd, exit_wait=True) as ncdevmem:
+        wait_port_listen(port)
+        cmd(f"yes $(echo -e \x01\x02\x03\x04\x05\x06) | \
+            head -c 1K | {socat}", host=cfg.remote, shell=True)
+
+    ksft_eq(ncdevmem.ret, 0)
+
+
+def check_sockopt_autorelease_default(cfg) -> None:
+    """Test that SO_DEVMEM_AUTORELEASE default is 0"""
+    SO_DEVMEM_AUTORELEASE = 85
+
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    try:
+        val = sock.getsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE)
+        ksft_eq(val, 0, "Default autorelease should be 0")
+    except OSError as e:
+        if e.errno == errno.ENOPROTOOPT:
+            raise KsftSkipEx("SO_DEVMEM_AUTORELEASE not supported")
+        raise
+    finally:
+        sock.close()
+
+
+def check_sockopt_autorelease_set_0(cfg) -> None:
+    """Test setting SO_DEVMEM_AUTORELEASE to 0"""
+    SO_DEVMEM_AUTORELEASE = 85
+
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    try:
+        sock.setsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE, 0)
+        val = sock.getsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE)
+        ksft_eq(val, 0, "Autorelease should be 0 after setting")
+    except OSError as e:
+        if e.errno == errno.ENOPROTOOPT:
+            raise KsftSkipEx("SO_DEVMEM_AUTORELEASE not supported")
+        raise
+    finally:
+        sock.close()
+
+
+def check_sockopt_autorelease_set_1(cfg) -> None:
+    """Test setting SO_DEVMEM_AUTORELEASE to 1"""
+    SO_DEVMEM_AUTORELEASE = 85
+
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    try:
+        # First set to 0
+        sock.setsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE, 0)
+        # Then set to 1
+        sock.setsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE, 1)
+        val = sock.getsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE)
+        ksft_eq(val, 1, "Autorelease should be 1 after setting")
+    except OSError as e:
+        if e.errno == errno.ENOPROTOOPT:
+            raise KsftSkipEx("SO_DEVMEM_AUTORELEASE not supported")
+        raise
+    finally:
+        sock.close()
+
+
+def check_sockopt_autorelease_invalid(cfg) -> None:
+    """Test that SO_DEVMEM_AUTORELEASE rejects invalid values"""
+    SO_DEVMEM_AUTORELEASE = 85
+
+    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+    try:
+        try:
+            sock.setsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE, 2)
+            raise Exception("setsockopt should have failed with EINVAL")
+        except OSError as e:
+            if e.errno == errno.ENOPROTOOPT:
+                raise KsftSkipEx("SO_DEVMEM_AUTORELEASE not supported")
+            ksft_eq(e.errno, errno.EINVAL, "Should fail with EINVAL for invalid value")
+    finally:
+        sock.close()
+
+
 def main() -> None:
     with NetDrvEpEnv(__file__) as cfg:
         cfg.bin_local = path.abspath(path.dirname(__file__) + "/ncdevmem")
         cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
 
-        ksft_run([check_rx, check_tx, check_tx_chunks],
+        ksft_run([check_rx, check_tx, check_tx_chunks,
+                  check_autorelease_enabled,
+                  check_autorelease_disabled,
+                  check_sockopt_autorelease_default,
+                  check_sockopt_autorelease_set_0,
+                  check_sockopt_autorelease_set_1,
+                  check_sockopt_autorelease_invalid],
                  args=(cfg, ))
     ksft_exit()
 
diff --git a/tools/testing/selftests/drivers/net/hw/ncdevmem.c b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
index 3288ed04ce08..34d608d07bec 100644
--- a/tools/testing/selftests/drivers/net/hw/ncdevmem.c
+++ b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
@@ -83,6 +83,10 @@
 #define MSG_SOCK_DEVMEM 0x2000000
 #endif
 
+#ifndef SO_DEVMEM_AUTORELEASE
+#define SO_DEVMEM_AUTORELEASE 85
+#endif
+
 #define MAX_IOV 1024
 
 static size_t max_chunk;
@@ -97,6 +101,7 @@ static unsigned int ifindex;
 static unsigned int dmabuf_id;
 static uint32_t tx_dmabuf_id;
 static int waittime_ms = 500;
+static int autorelease = -1;
 
 /* System state loaded by current_config_load() */
 #define MAX_FLOWS 8
@@ -890,6 +895,16 @@ static int do_server(struct memory_buffer *mem)
 	if (enable_reuseaddr(socket_fd))
 		goto err_close_socket;
 
+	if (autorelease >= 0) {
+		ret = setsockopt(socket_fd, SOL_SOCKET, SO_DEVMEM_AUTORELEASE,
+				 &autorelease, sizeof(autorelease));
+		if (ret) {
+			pr_err("SO_DEVMEM_AUTORELEASE failed");
+			goto err_close_socket;
+		}
+		fprintf(stderr, "Set SO_DEVMEM_AUTORELEASE to %d\n", autorelease);
+	}
+
 	fprintf(stderr, "binding to address %s:%d\n", server_ip,
 		ntohs(server_sin.sin6_port));
 
@@ -1397,7 +1412,7 @@ int main(int argc, char *argv[])
 	int is_server = 0, opt;
 	int ret, err = 1;
 
-	while ((opt = getopt(argc, argv, "ls:c:p:v:q:t:f:z:")) != -1) {
+	while ((opt = getopt(argc, argv, "ls:c:p:v:q:t:f:z:A:")) != -1) {
 		switch (opt) {
 		case 'l':
 			is_server = 1;
@@ -1426,6 +1441,9 @@ int main(int argc, char *argv[])
 		case 'z':
 			max_chunk = atoi(optarg);
 			break;
+		case 'A':
+			autorelease = atoi(optarg);
+			break;
 		case '?':
 			fprintf(stderr, "unknown option: %c\n", optopt);
 			break;
-- 
2.47.3
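
For readers who want to try the option outside the selftest harness, the probe-and-skip pattern used by the check_sockopt tests can be sketched on its own. This is only an illustration, not part of the patch: the helper name get_autorelease is hypothetical, and the value 85 is an assumption copied from the fallback #define in ncdevmem.c, so it may not match the final uapi number.

```python
import errno
import socket

# Assumed value, copied from the fallback #define in this patch; the
# number in the merged uapi headers may differ.
SO_DEVMEM_AUTORELEASE = 85


def get_autorelease(sock):
    """Return the option value, or None if the kernel lacks the option.

    Hypothetical helper: it mirrors the ENOPROTOOPT handling in the
    check_sockopt_* tests but returns None instead of raising KsftSkipEx.
    """
    try:
        return sock.getsockopt(socket.SOL_SOCKET, SO_DEVMEM_AUTORELEASE)
    except OSError as e:
        if e.errno == errno.ENOPROTOOPT:
            return None  # option not known to this kernel
        raise  # any other error is unexpected, so propagate it
```

Called on a fresh TCP socket, the helper is expected to return 0 on a kernel carrying this series (the documented default) and None on a kernel without it.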