From nobody Sun Feb 8 02:42:09 2026 Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 488A63597A for ; Wed, 18 Dec 2024 16:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734540998; cv=none; b=UAID9qVjyLl2q+Icg+TFxle32RiksJ1crkh1l1WqWPGOCpGguO3CbDrGtsooQfv0a+rfVMMjAJXAi0h+rDbxxnSvNBk5ybsrMPDpRpOeI2qmiC0ZRpEE95FhTNR0HuBhT1FXef7TGGzFzwlWhfVZFiQZqjYsX4WBX7SVS3+CIlE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734540998; c=relaxed/simple; bh=MMQEj7LAqkGgvcG6MI8Z5DjGkXDNXgywSBSnTZ1Hw2o=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:To:Cc; b=SEeD/9JFT/3xgEc0trI5VLQ6oklQE5dE/V3oBxulIFJ6lM9t270cwcYpYlYweIcC37x5jbYlQnJ1LFJdjdlPVZP6z1/Uj1lTGquAr75ZNUsPur09bhE7gSMQ1nOCdTyyRdJ9DIYjHJR6SomyiHGaCaMpyyTEqp78M0F7b0rTjFM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=IbtQonUD; arc=none smtp.client-ip=209.85.128.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="IbtQonUD" Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-435b0df5dbdso73485e9.0 for ; Wed, 18 Dec 2024 08:56:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1734540994; x=1735145794; darn=vger.kernel.org; h=cc:to:message-id:content-transfer-encoding:mime-version:subject :date:from:from:to:cc:subject:date:message-id:reply-to; bh=AZ5voeoEGvFIlnkWTtVGdlj8CKOl5YBTaqd9AzIPNK8=; b=IbtQonUDVfNK5sOsZ+Zi2vrQZRt5Px+lnl+NdCykArSpZXE+AbdOpFldru3ZDcyadn PfTdM3dBd7SWosO4r5zwF8hcPKuRWjYMax1m/GrwTAYnm3kdYz0r2hW/FpMiuMlDrNBk kSfNuiZq+tSdDWPMPzJpdIkgHqkaIqFS1CdyhJYF/q94stJfzNu9H3ciy766u0OD+Vtx /FITLPjBvv3One4amVs4pnJD5OcnTG6GeyazCdXvmu91kECRm5KYTmE8DqYgnxQS9Rnu r46UXaOSoF9N8mQOdRvuaOeDKYqOPHOXEtti8YGD/dpuMLk+Qst/Klqh6etQOs2fyCgH e++Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734540994; x=1735145794; h=cc:to:message-id:content-transfer-encoding:mime-version:subject :date:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=AZ5voeoEGvFIlnkWTtVGdlj8CKOl5YBTaqd9AzIPNK8=; b=EAzx7WoBlMJu0jul/xnUwUvv1/LxOLZzdI8neBYLKmPQ8UB3/Xm9iAn+vof/g8jcjS xuFvFmNcSqKtCMWqZlHahmV3MNrOSEkyPLAx2oh4pztwPnhiKQVTfYvX5U87AmLwq2qW tPprhAGSFifNTgQDiVAk753JrKrCjJ/p7WKMZ2hkentwBBFYU4P3/6RgoXRVl5qa98m8 VxZdM4cCS6lDLEJyzS0iSB9ORMoBCgAhkZflDwP4F+xgTwsC5yoapJ4BfvDS5FAvxuc1 t524kwCgjke6MaIkIOg1KHo9NS+Zsz9YNQRSUxdhZ6bgs32W9yliJiKDMburM/bKi7T0 fvXg== X-Forwarded-Encrypted: i=1; AJvYcCVnw5VPPz6n6J+x0MWYs2XEZLDA4i9TOUZxkOisT2OYrY4GMo3iBSNVEw4mZP4OS3mQ76v9+d0MMzDidEg=@vger.kernel.org X-Gm-Message-State: AOJu0YyGMjuKxqryrMko015W9xQeO0pt/rOqykJePtvl4lxvrZ4+1awk Lc/98HoysAaDR3RSOcmC9HG36JZJBWfKETfGZXbc1+KvBczBIbMBWTAczDfCcg== X-Gm-Gg: ASbGnctd1x0nPw79Ynhqz4DLmnnG9dr/YcGwNWkBxKsaP8+29+Sv5ZdmTGSScEngPJj cvXclsEgfHeiSAOmuCRy6GkQHli+1vH1Dg4OKmDfSLiC2V2yc5bhPge75Rs2LcutQC2gwZ9+A67 Kq9BhIS6edG/sonHu++pbDqTW70mZeJEi2fQWmIFudYnmCsnDhpU+1guKrsa+Yh+IVc6zBrBVYz TTLQaojkafEAh1e6Pzr36KW800MbBB77LemNPd5l5s= X-Google-Smtp-Source: AGHT+IF3N05RjEQCj8gANjJelHtGo09ptmZGcgO6vIz0C3tgi4N57ksXoYFvu4L/kTUaof9juBi3zA== X-Received: by 2002:a05:600c:4c97:b0:434:9fac:3408 with SMTP id 5b1f17b1804b1-436553f2cfdmr1449615e9.2.1734540993148; Wed, 18 Dec 2024 08:56:33 -0800 (PST) Received: from localhost ([2a00:79e0:9d:4:2ea4:1f:c82b:c76e]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-388c80162b1sm14336097f8f.38.2024.12.18.08.56.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Dec 2024 08:56:32 -0800 (PST) From: Jann Horn Date: Wed, 18 Dec 2024 17:56:25 +0100 Subject: [PATCH] io_uring: Fix registered ring file refcount leak Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20241218-uring-reg-ring-cleanup-v1-1-8f63e999045b@google.com> X-B4-Tracking: v=1; b=H4sIALj+YmcC/x2MSwqAMAwFryJZWzC14ucq4qK0UQNSS4oiiHe3u hh4s3hzQyJhSjAUNwidnHgPWbAswK02LKTYZwddaYMaO3UIh0UJZb7hNrLhiAob79va1NgbC/k chWa+/vA4Pc8LfNFqYGgAAAA= X-Change-ID: 20241218-uring-reg-ring-cleanup-15dd7343194a To: Jens Axboe , Pavel Begunkov Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Jann Horn X-Mailer: b4 0.15-dev X-Developer-Signature: v=1; a=ed25519-sha256; t=1734540988; l=3962; i=jannh@google.com; s=20240730; h=from:subject:message-id; bh=MMQEj7LAqkGgvcG6MI8Z5DjGkXDNXgywSBSnTZ1Hw2o=; b=p0JbFNAOdCET2T9VN0wW0XHaTI88fETWFrtk8qqq4YrThql8d9t+tUQs+XQEOcPAAWiw262YK E35OYEkrSBDBnVd35h2PGpb4GAofl3HMlVqzgnkojnoOSKLFWkpmCOq X-Developer-Key: i=jannh@google.com; a=ed25519; pk=AljNtGOzXeF6khBXDJVVvwSEkVDGnnZZYqfWhP1V+C8= Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the registered ring pointers are stored) is also called on execve (via begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel -> io_uring_cancel_generic -> __io_uring_free). This means: A process going through execve while having registered rings will leak references to the rings' `struct file`. Fix it by zapping registered rings on execve(). This is implemented by moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its callee __io_uring_cancel(), which is called from io_uring_task_cancel() on execve. This could probably be exploited *on 32-bit kernels* by leaking 2^32 references to the same ring, because the file refcount is stored in a pointer-sized field and get_file() doesn't have protection against refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no impact beyond a memory leak. Cc: stable@vger.kernel.org Fixes: e7a6c00dc77a ("io_uring: add support for registering ring file descr= iptors") Signed-off-by: Jann Horn --- testcase: ``` #define _GNU_SOURCE #include #include #include #include #include #include #include #define SYSCHK(x) ({ \ typeof(x) __res =3D (x); \ if (__res =3D=3D (typeof(x))-1) \ err(1, "SYSCHK(" #x ")"); \ __res; \ }) #define NUM_SQ_PAGES 4 static int uring_fd; int main(void) { int memfd_sq =3D SYSCHK(memfd_create("", 0)); int memfd_cq =3D SYSCHK(memfd_create("", 0)); SYSCHK(ftruncate(memfd_sq, NUM_SQ_PAGES * 0x1000)); SYSCHK(ftruncate(memfd_cq, NUM_SQ_PAGES * 0x1000)); // sq void *sq_data =3D SYSCHK(mmap(NULL, NUM_SQ_PAGES*0x1000, PROT_READ|PROT_W= RITE, MAP_SHARED, memfd_sq, 0)); // cq void *cq_data =3D SYSCHK(mmap(NULL, NUM_SQ_PAGES*0x1000, PROT_READ|PROT_W= RITE, MAP_SHARED, memfd_cq, 0)); *(volatile unsigned int *)(cq_data+4) =3D 64 * NUM_SQ_PAGES; close(memfd_sq); close(memfd_cq); // initialize uring struct io_uring_params params =3D { .flags =3D IORING_SETUP_REGISTERED_FD_ONLY|IORING_SETUP_NO_MMAP|IORING_= SETUP_NO_SQARRAY, .sq_off =3D { .user_addr =3D (unsigned long)sq_data }, .cq_off =3D { .user_addr =3D (unsigned long)cq_data } }; uring_fd =3D SYSCHK(syscall(__NR_io_uring_setup, /*entries=3D*/10, ¶m= s)); // re-execute execl("/proc/self/exe", NULL); err(1, "execve"); } ``` Leave it running for a while and monitor `grep ^filp /proc/slabinfo` and memory usage - without the patch, both will go up slowly but steadily. --- include/linux/io_uring.h | 4 +--- io_uring/io_uring.c | 1 + 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index e123d5e17b526148054872ee513f665adea80eb6..85fe4e6b275c7de260ea9a8552b= 8e1c3e7f7e5ec 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -15,10 +15,8 @@ bool io_is_uring_fops(struct file *file); =20 static inline void io_uring_files_cancel(void) { - if (current->io_uring) { - io_uring_unreg_ringfd(); + if (current->io_uring) __io_uring_cancel(false); - } } static inline void io_uring_task_cancel(void) { diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 06ff41484e29c6e7d8779bd7ff8317ebae003a8d..b1e888999cea137e043bf003725= 76861182857ac 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3214,6 +3214,7 @@ __cold void io_uring_cancel_generic(bool cancel_all, = struct io_sq_data *sqd) =20 void __io_uring_cancel(bool cancel_all) { + io_uring_unreg_ringfd(); io_uring_cancel_generic(cancel_all, NULL); } =20 --- base-commit: aef25be35d23ec768eed08bfcf7ca3cf9685bc28 change-id: 20241218-uring-reg-ring-cleanup-15dd7343194a --=20 Jann Horn