From: Michael Bommarito
To: Ilya Dryomov, Alex Markuze, Viacheslav Dubeyko
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH] ceph: bound num_split_inos and num_split_realms in ceph_handle_snap()
Date: Tue, 19 May 2026 07:30:17 -0400
Message-ID: <20260519113017.1851462-1-michael.bommarito@gmail.com>

A peer that can deliver a CEPH_MSG_CLIENT_SNAP to the kernel CephFS
client (a compromised or malicious MDS, or an attacker who has
forged/replayed a cephx session on the cluster network) can cause an
out-of-bounds slab read in ceph_update_snap_trace() by sending
num_split_inos or num_split_realms as a small negative __le32.
ceph_handle_snap() parses both counts into signed int and then advances
the decode pointer with `p += sizeof(u64) * num_split_inos`; the
multiplication is in size_t, so the signed operand is widened modulo
2**64 and a wire value like -32 produces an attacker-chosen byte offset
that walks p backwards into the slab.  The subsequent
ceph_decode_need(&p, e, sizeof(*ri), bad) passes (end - p is huge),
ri = p, and the next 4-byte read inside ceph_update_snap_trace() is
performed from attacker-positioned memory.  The same arithmetic and the
same pointer hand-off exist in the non-split branch.

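The widening described above can be checked in user space. The following is
a minimal sketch, not kernel code: the helper names old_bump_bytes() and
bump() are hypothetical, pointers are modelled as 64-bit integers so that the
wraparound stays well defined in C, and a 64-bit size_t is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: size_t is 64 bits wide, as on x86_64. */
static_assert(sizeof(size_t) == 8, "sketch assumes a 64-bit size_t");

/*
 * Byte offset computed by 'sizeof(u64) * count' when count is a signed
 * int.  The explicit cast spells out the conversion to size_t that C
 * performs implicitly before the multiplication.
 */
static size_t old_bump_bytes(int32_t count)
{
	return sizeof(uint64_t) * (size_t)count;
}

/*
 * Model of 'p += bytes' on a flat 64-bit address.  Unsigned integer
 * addition wraps modulo 2**64, which is what moves p backwards.
 */
static uint64_t bump(uint64_t p, size_t bytes)
{
	return p + bytes;
}
```

Under those assumptions, old_bump_bytes(-32) evaluates to 2**64 - 256, so
bump(0x1000, old_bump_bytes(-32)) yields 0xf00, which is 256 bytes before
the starting address.
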
Promote num_split_inos and num_split_realms to u32 to match the on-wire
__le32 fields, compute each array's byte length with array_size() so a
size_t overflow saturates to SIZE_MAX instead of wrapping, sum the two
lengths with check_add_overflow(), and verify the total against the
remaining front-buffer length before any pointer bump.  Re-use the
validated byte counts for the bumps in both the split and non-split
branches.

The MDS is an authenticated peer under cephx, but the kernel client is
still expected to validate metadata it accepts over the wire; this
hardens the input-validation boundary that snap-message decode crosses.

Fixes: 963b61eb041e8 ("ceph: snapshot management")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito
---
Reproduced on x86_64 QEMU/KVM, KASAN_INLINE generic, two ways:

- In-tree harness that allocates an upstream struct ceph_msg via
  ceph_msg_new(), writes num_split_inos = (u32)-32,
  num_split_realms = 0, op = CEPH_SNAP_OP_UPDATE into the front
  buffer, and calls ceph_handle_snap(&mdsc, &session, msg) directly.

- End-to-end over a real TCP connection from a real ceph-mds daemon to
  the kernel CephFS client, with num_split_inos rewritten to (u32)-32
  in the front buffer and the messenger v1 footer.front_crc recomputed
  so the kernel libceph receive path accepts the message.  The KASAN
  report fires from the tcp_recvmsg softirq path through
  ceph_handle_snap+0x345 into ceph_update_snap_trace+0x23bf, confirming
  the bug is reached via the normal MDS->client receive path and not
  only by direct harness invocation.

A stock v7.1-rc3 kernel produces:

  BUG: KASAN: slab-out-of-bounds in ceph_update_snap_trace+0x23bf/0x31a0
  Read of size 4 at addr ffff8880012be1f8 by task init/1
    ceph_update_snap_trace+0x23bf/0x31a0
    ? ceph_handle_snap+0x312/0x900
    ceph_handle_snap+0x345/0x900
  The buggy address is located 248 bytes to the right of
   allocated 256-byte region [ffff8880012be000, ffff8880012be100)

With this patch applied, the same trigger (both via the harness and via
the wire path) hits the new validator's goto bad path, logs "corrupt
snap message from mds0", calls ceph_msg_dump(), and returns cleanly
with no KASAN report.

Harness and wire-injection scripts available on request.  The kernel
ships no fs/ceph selftests and no ceph KUnit module that exercises
ceph_handle_snap, so no in-tree selftest delta to report.

 fs/ceph/snap.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c
index 52b4c2684f922..7c4487eb2708a 100644
--- a/fs/ceph/snap.c
+++ b/fs/ceph/snap.c
@@ -1027,9 +1027,10 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
 	void *p = msg->front.iov_base;
 	void *e = p + msg->front.iov_len;
 	struct ceph_mds_snap_head *h;
-	int num_split_inos, num_split_realms;
+	u32 num_split_inos, num_split_realms;
 	__le64 *split_inos = NULL, *split_realms = NULL;
-	int i;
+	size_t split_inos_bytes, split_realms_bytes, split_bytes;
+	u32 i;
 	int locked_rwsem = 0;
 	bool close_sessions = false;
 
@@ -1048,6 +1049,24 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
 	trace_len = le32_to_cpu(h->trace_len);
 	p += sizeof(*h);
 
+	/*
+	 * Validate that the two MDS-supplied counts cannot wrap when
+	 * multiplied by sizeof(u64), and that the two arrays together
+	 * fit in the remaining front buffer before any of the pointer
+	 * bumps below.  Without this, a malformed (or malicious) snap
+	 * message can cause 'p += sizeof(u64) * num_split_inos' to land
+	 * at an attacker-chosen offset via the size_t * int widening,
+	 * bypassing ceph_decode_need() and making the subsequent
+	 * 'ri = p; ri->created' read out of bounds.
+	 */
+	split_inos_bytes = array_size(num_split_inos, sizeof(u64));
+	split_realms_bytes = array_size(num_split_realms, sizeof(u64));
+	if (split_inos_bytes == SIZE_MAX || split_realms_bytes == SIZE_MAX ||
+	    check_add_overflow(split_inos_bytes, split_realms_bytes,
+			       &split_bytes) ||
+	    (size_t)(e - p) < split_bytes)
+		goto bad;
+
 	doutc(cl, "from mds%d op %s split %llx tracelen %d\n", mds,
 	      ceph_snap_op_name(op), split, trace_len);
 
@@ -1064,9 +1083,9 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
 		 * child.
 		 */
 		split_inos = p;
-		p += sizeof(u64) * num_split_inos;
+		p += split_inos_bytes;
 		split_realms = p;
-		p += sizeof(u64) * num_split_realms;
+		p += split_realms_bytes;
 		ceph_decode_need(&p, e, sizeof(*ri), bad);
 		/* we will peek at realm info here, but will _not_
 		 * advance p, as the realm update will occur below in
@@ -1144,8 +1163,8 @@ void ceph_handle_snap(struct ceph_mds_client *mdsc,
 		 * positioned at the start of realm info, as expected by
 		 * ceph_update_snap_trace().
 		 */
-		p += sizeof(u64) * num_split_inos;
-		p += sizeof(u64) * num_split_realms;
+		p += split_inos_bytes;
+		p += split_realms_bytes;
 	}
 
 	/*
-- 
2.53.0
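
The validation this patch adds can also be modelled outside the kernel. The
sketch below is an approximation under stated assumptions, not the kernel
code itself: sat_array_size() is a hypothetical stand-in for array_size(),
the GCC/Clang builtins __builtin_mul_overflow() and __builtin_add_overflow()
stand in for the kernel's overflow helpers, and split_counts_fit() is a
hypothetical name for the combined check.

```c
#include <assert.h>	/* for assert() in callers and tests */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for array_size(): saturate to SIZE_MAX on overflow. */
static size_t sat_array_size(size_t n, size_t elem)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, elem, &bytes))
		return SIZE_MAX;
	return bytes;
}

/*
 * Return true only if both u64 arrays fit in 'remaining' bytes.  On
 * success the validated byte lengths are stored for the later pointer
 * bumps, mirroring how the patch re-uses split_inos_bytes and
 * split_realms_bytes.
 */
static bool split_counts_fit(uint32_t num_inos, uint32_t num_realms,
			     size_t remaining,
			     size_t *inos_bytes, size_t *realms_bytes)
{
	size_t total;

	*inos_bytes = sat_array_size(num_inos, sizeof(uint64_t));
	*realms_bytes = sat_array_size(num_realms, sizeof(uint64_t));

	/* A saturated length means the multiplication overflowed. */
	if (*inos_bytes == SIZE_MAX || *realms_bytes == SIZE_MAX)
		return false;
	/* The sum of the two lengths must not wrap either. */
	if (__builtin_add_overflow(*inos_bytes, *realms_bytes, &total))
		return false;
	/* Both arrays together must fit in what is left of the buffer. */
	return total <= remaining;
}
```

With 24 bytes remaining, counts of 2 and 1 are accepted and yield lengths of
16 and 8 bytes. A count of (u32)-32 is rejected: with a 64-bit size_t a u32
count times 8 cannot overflow, so in that configuration it is the comparison
against the remaining length that fails, and the saturation checks matter
where size_t is narrower.
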