From: Yuhao Jiang
To: Jens Axboe, Pavel Begunkov
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, Yuhao Jiang,
	stable@vger.kernel.org
Subject: [PATCH] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass via compound page accounting
Date: Wed, 17 Dec 2025 20:59:47 -0600
Message-Id: <20251218025947.36115-1-danisjiang@gmail.com>

When multiple registered buffers share the same compound page, only the
first buffer accounts for the memory via io_buffer_account_pin(). The
subsequent buffers skip accounting since headpage_already_acct() returns
true.

When the first buffer is unregistered, the accounting is decremented,
but the compound page remains pinned by the remaining buffers. This
creates a state where pinned memory is not properly accounted against
RLIMIT_MEMLOCK.

On systems with HugeTLB pages pre-allocated, an unprivileged user can
exploit this to pin memory beyond RLIMIT_MEMLOCK by cycling buffer
registrations. The bypass amount is proportional to the number of
available huge pages, potentially allowing gigabytes of memory to be
pinned while the kernel accounting shows near-zero.

Fix this by recalculating the actual pages to unaccount when unmapping
a buffer. For regular pages, always unaccount.
For compound pages, only unaccount if no other registered buffer
references the same compound page. This ensures the accounting persists
until the last buffer referencing the compound page is released.

Reported-by: Yuhao Jiang
Fixes: 57bebf807e2a ("io_uring/rsrc: optimise registered huge pages")
Cc: stable@vger.kernel.org
Signed-off-by: Yuhao Jiang
---
 io_uring/rsrc.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 67 insertions(+), 2 deletions(-)

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index a63474b331bf..dcf2340af5a2 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -139,15 +139,80 @@ static void io_free_imu(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu)
 	kvfree(imu);
 }
 
+/*
+ * Calculate pages to unaccount when unmapping a buffer. Regular pages are
+ * always counted. Compound pages are only counted if no other registered
+ * buffer references them, ensuring accounting persists until the last user.
+ */
+static unsigned long io_buffer_calc_unaccount(struct io_ring_ctx *ctx,
+					      struct io_mapped_ubuf *imu)
+{
+	struct page *last_hpage = NULL;
+	unsigned long acct = 0;
+	unsigned int i;
+
+	for (i = 0; i < imu->nr_bvecs; i++) {
+		struct page *page = imu->bvec[i].bv_page;
+		struct page *hpage;
+		unsigned int j;
+
+		if (!PageCompound(page)) {
+			acct++;
+			continue;
+		}
+
+		hpage = compound_head(page);
+		if (hpage == last_hpage)
+			continue;
+		last_hpage = hpage;
+
+		/* Check if we already processed this hpage earlier in this buffer */
+		for (j = 0; j < i; j++) {
+			if (PageCompound(imu->bvec[j].bv_page) &&
+			    compound_head(imu->bvec[j].bv_page) == hpage)
+				goto next_hpage;
+		}
+
+		/* Only unaccount if no other buffer references this page */
+		for (j = 0; j < ctx->buf_table.nr; j++) {
+			struct io_rsrc_node *node = ctx->buf_table.nodes[j];
+			struct io_mapped_ubuf *other;
+			unsigned int k;
+
+			if (!node)
+				continue;
+			other = node->buf;
+			if (other == imu)
+				continue;
+
+			for (k = 0; k < other->nr_bvecs; k++) {
+				struct page *op = other->bvec[k].bv_page;
+
+				if (!PageCompound(op))
+					continue;
+				if (compound_head(op) == hpage)
+					goto next_hpage;
+			}
+		}
+		acct += page_size(hpage) >> PAGE_SHIFT;
+next_hpage:
+		;
+	}
+	return acct;
+}
+
 static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf *imu)
 {
+	unsigned long acct;
+
 	if (unlikely(refcount_read(&imu->refs) > 1)) {
 		if (!refcount_dec_and_test(&imu->refs))
 			return;
 	}
 
-	if (imu->acct_pages)
-		io_unaccount_mem(ctx->user, ctx->mm_account, imu->acct_pages);
+	acct = io_buffer_calc_unaccount(ctx, imu);
+	if (acct)
+		io_unaccount_mem(ctx->user, ctx->mm_account, acct);
 	imu->release(imu->priv);
 	io_free_imu(ctx, imu);
 }
-- 
2.34.1