[PATCH net] tls: avoid zc receive for file-backed pages

Yeonju Bae posted 1 patch 3 days, 2 hours ago
net/tls/tls_sw.c | 37 +++++++++++++++++++++++++++++++++++++
1 file changed, 37 insertions(+)
The kTLS RX zero-copy (zc) decrypt path writes unauthenticated AEAD
output directly into pages pinned from the recvmsg iterator via
tls_setup_from_iter().  For MAP_SHARED, PROT_WRITE file-backed
destinations, those pages are live page-cache pages rather than
anonymous copies: MAP_SHARED does not trigger copy-on-write, so
FOLL_WRITE returns the actual page-cache page.

crypto_aead_decrypt() writes CTR-mode decryption output into the
scatter-gather list before the authentication tag is verified.  If the
tag check fails (-EBADMSG), the plaintext-like output is already
resident in the page-cache page.  The exit_free_pages path in
tls_decrypt_sg() calls put_page() without any content cleanup, so the
modification persists in the backing file.  An independent
open(O_RDONLY)/read() of the same file returns the modified content, and
the file's SHA-256 changes.  MAP_PRIVATE is safe via COW; PROT_READ-only
destinations fail at iov_iter_get_pages2() before any decryption occurs.
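
The MAP_SHARED versus MAP_PRIVATE difference described above can be
observed from userspace without kTLS at all.  The following is a
minimal sketch, not the reproducer used for this patch: the helper name
write_reaches_file() and the temp-file path are made up for
illustration, and a plain store through the mapping stands in for the
unauthenticated decrypt output.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: store one byte through a mapping of a fresh,
 * zero-filled temp file, then check whether an independent
 * open(O_RDONLY)/read() sees the change.
 * Returns 1 if the file content changed, 0 if not, -1 on setup error.
 */
static int write_reaches_file(int map_flags)
{
	char path[] = "/tmp/zc_demo_XXXXXX";
	char before[16] = { 0 };	/* ftruncate() extends with zeroes */
	char after[16];
	long psz = sysconf(_SC_PAGESIZE);
	int fd, rfd, ret = -1;
	char *p;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	if (psz <= 0 || ftruncate(fd, psz) < 0)
		goto out;

	p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE, map_flags, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	p[0] = 'X';	/* stand-in for unauthenticated output */
	munmap(p, (size_t)psz);

	/* Independent reader, as in the commit message. */
	rfd = open(path, O_RDONLY);
	if (rfd < 0)
		goto out;
	if (read(rfd, after, sizeof(after)) == (ssize_t)sizeof(after))
		ret = memcmp(before, after, sizeof(after)) != 0;
	close(rfd);
out:
	close(fd);
	unlink(path);
	return ret;
}
```

Called with MAP_SHARED the helper should report that the file changed;
with MAP_PRIVATE the store lands in a private COW copy and the file is
left untouched.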

Avoid zc receive for file-backed destination pages.  In
tls_setup_from_iter(), after iov_iter_get_pages2() pins pages, check
each page with folio_mapping(page_folio(page)).  If any pinned page is
file-backed (mapping != NULL), release the pinned pages and return
-EOPNOTSUPP.  Handle -EOPNOTSUPP in tls_decrypt_sw() by clearing
darg->zc and retrying, which causes tls_decrypt_sg() to allocate a
kernel bounce buffer instead.  Decryption output never reaches the
file-backed page; on tag failure the bounce buffer is discarded.
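
The check needs to know how many entries of pages[] were filled by one
iov_iter_get_pages2() call, given the starting offset and the byte
count.  As a userspace sketch of that arithmetic (DEMO_PAGE_SIZE and
both function names are illustrative, and a 4 KiB page is assumed), the
counting loop and an equivalent closed form look like this:

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Loop form, mirroring the counting loop added by the patch.
 * offset must be smaller than DEMO_PAGE_SIZE.
 */
static int pages_spanned_loop(size_t offset, size_t copied)
{
	int np = 0;

	while (copied > 0) {
		size_t use = DEMO_PAGE_SIZE - offset;

		if (use > copied)
			use = copied;
		copied -= use;
		offset = 0;	/* only the first page starts mid-page */
		np++;
	}
	return np;
}

/* Closed form: round (offset + copied) up to whole pages. */
static int pages_spanned(size_t offset, size_t copied)
{
	if (copied == 0)
		return 0;
	return (int)((offset + copied + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE);
}
```

In kernel code the closed form could be spelled with the existing
DIV_ROUND_UP() macro, e.g. DIV_ROUND_UP(offset + copied, PAGE_SIZE);
whether that reads better than the explicit loop is a style choice.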

This follows the existing opportunistic zc retry pattern already used
for TLS 1.3 record type mismatches in tls_decrypt_sw().
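
The control flow of that retry can be modelled in a few lines of
userspace C.  This is only a sketch of the pattern: struct demo_args,
demo_setup() and demo_decrypt() are hypothetical stand-ins, not kernel
symbols, and the "retries" counter merely plays the role of the
TLSDECRYPTRETRY statistic.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the decrypt arguments. */
struct demo_args {
	bool zc;	/* zero-copy requested */
	int retries;	/* number of non-zc retries taken */
};

/* Stand-in for the zero-copy setup step: refuse file-backed
 * destinations when zero-copy is requested.
 */
static int demo_setup(const struct demo_args *a, bool dest_file_backed)
{
	if (a->zc && dest_file_backed)
		return -EOPNOTSUPP;
	return 0;
}

/* Try zero-copy first; on -EOPNOTSUPP clear zc and retry once via
 * the (modelled) bounce-buffer path.  The recursion terminates
 * because zc is false on the second pass.
 */
static int demo_decrypt(struct demo_args *a, bool dest_file_backed)
{
	int err = demo_setup(a, dest_file_backed);

	if (err == -EOPNOTSUPP && a->zc) {
		a->zc = false;
		a->retries++;
		return demo_decrypt(a, dest_file_backed);
	}
	return err;
}
```

With a file-backed destination the call succeeds after exactly one
retry and leaves zc cleared; with an anonymous destination no retry is
taken and zc stays set.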

Verified on linux-7.0-rc3 under QEMU (x86-64) with four destination types:
  MAP_SHARED+RW:   file_changed=0  (was 4077/4096 bytes before patch)
  MAP_PRIVATE+RW:  file_changed=0  (COW isolation; unchanged)
  anonymous heap:  no file backing  (unchanged)
  PROT_READ only:  file_changed=0  (EFAULT before decrypt; unchanged)
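
Assuming file_changed is the number of bytes that differ between a
snapshot of the file taken before recvmsg() and one taken afterwards
(the test harness itself is not part of this posting), the comparison
could be as simple as the following; count_changed_bytes() is an
illustrative name.

```c
#include <stddef.h>

/* Count the positions at which two equally sized buffers differ. */
static size_t count_changed_bytes(const unsigned char *before,
				  const unsigned char *after, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (before[i] != after[i])
			n++;
	return n;
}
```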

Signed-off-by: Yeonju Bae <iwasbaeyz@gmail.com>
---
 net/tls/tls_sw.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index a977b0434..c312a83b4 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -36,6 +36,7 @@
  */
 
 #include <linux/bug.h>
+#include <linux/pagemap.h>
 #include <linux/sched/signal.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
@@ -1443,6 +1444,34 @@ static int tls_setup_from_iter(struct iov_iter *from,
 
 		length -= copied;
 		size += copied;
+		/* Reject file-backed destination pages.  Writing unauthenticated
+		 * AEAD output into a page-cache page before tag verification
+		 * leaves the backing file modified even when recvmsg() returns
+		 * -EBADMSG.  Return -EOPNOTSUPP so the caller retries via the
+		 * non-ZC bounce-buffer path.
+		 */
+		{
+			ssize_t remain = copied;
+			size_t  off    = offset;
+			int     np = 0, j;
+
+			while (remain > 0) {
+				remain -= min_t(ssize_t, remain,
+						(ssize_t)(PAGE_SIZE - off));
+				off = 0;
+				np++;
+			}
+			for (j = 0; j < np; j++) {
+				if (folio_mapping(page_folio(pages[j]))) {
+					int k;
+
+					for (k = 0; k < np; k++)
+						put_page(pages[k]);
+					rc = -EOPNOTSUPP;
+					goto out;
+				}
+			}
+		}
 		while (copied) {
 			use = min_t(int, copied, PAGE_SIZE - offset);
 
@@ -1699,6 +1728,14 @@ tls_decrypt_sw(struct sock *sk, struct tls_context *tls_ctx,
 	if (err < 0) {
 		if (err == -EBADMSG)
 			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTERROR);
+		if (err == -EOPNOTSUPP && darg->zc) {
+			/* tls_setup_from_iter detected file-backed destination
+			 * pages; retry without ZC via the bounce-buffer path.
+			 */
+			darg->zc = false;
+			TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSDECRYPTRETRY);
+			return tls_decrypt_sw(sk, tls_ctx, msg, darg);
+		}
 		return err;
 	}
 	/* keep going even for ->async, the code below is TLS 1.3 */
-- 
2.43.0