decode_lockers() in cls_lock_client.c contains two bare decode operations
that allow a malicious or compromised OSD to trigger slab-out-of-bounds
reads:

1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
   check. ceph_start_decoding() accepts struct_len=0 as valid -- the
   internal ceph_decode_need(p, end, 0, bad) always passes -- so when an
   OSD sends struct_len=0, ceph_start_decoding() returns success with
   p == end. The immediately following bare ceph_decode_32(p) then reads
   4 bytes past the validated buffer boundary. The garbage value is
   passed directly to kzalloc_objs() as the locker count.

   The sibling function decode_watchers() in osd_client.c already uses
   ceph_decode_32_safe() after its own ceph_start_decoding() call.
   decode_lockers() was the only site using the bare variant.

2. ceph_decode_8(p) after the decode_locker() loop has no preceding
   bounds check. If an OSD crafts num_lockers such that the loop
   advances p exactly to end, the subsequent bare ceph_decode_8(p) reads
   one byte past the validated buffer boundary. The result is passed
   directly into *type, which is used as a lock type discriminator by
   callers, giving an OSD-controlled one-byte OOB read with direct
   influence over the lock type field.

Fix both by replacing bare operations with their safe variants:

  ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers,
                                           err_inval)
  ceph_decode_8(p)  -> ceph_decode_8_safe(p, end, *type,
                                          err_free_lockers)
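
For readers unfamiliar with the safe decode helpers, the check-then-read
pattern can be sketched in userspace roughly as follows. This is only an
illustrative model, not the kernel macros: model_has_room() and
model_decode_32_safe() are hypothetical names, and the error is returned
as a value instead of jumping to a label as ceph_decode_32_safe() does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if at least n bytes remain in [p, end). */
static int model_has_room(const unsigned char *p, const unsigned char *end,
			  size_t n)
{
	return p <= end && (size_t)(end - p) >= n;
}

/*
 * Hypothetical model of a bounds-checked 32-bit decode: verify that
 * four bytes remain before touching the buffer, read a little-endian
 * value, then advance the cursor. Returns 0 on success or -EINVAL if
 * the buffer is too short (the cursor is left untouched in that case).
 */
static int model_decode_32_safe(const unsigned char **p,
				const unsigned char *end, uint32_t *v)
{
	const unsigned char *c = *p;

	if (!model_has_room(c, end, sizeof(*v)))
		return -EINVAL;

	*v = (uint32_t)c[0] | ((uint32_t)c[1] << 8) |
	     ((uint32_t)c[2] << 16) | ((uint32_t)c[3] << 24);
	*p = c + sizeof(*v);
	return 0;
}
```

With the cursor already at end (the struct_len=0 case described above),
the model returns -EINVAL without reading, whereas a bare read would
touch memory past the buffer.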

The goto targets differ intentionally:

  err_inval: is a new label returning -EINVAL directly. It is used for
  the pre-allocation failure path where *lockers is not yet allocated
  and must not be passed to ceph_free_lockers().

  err_free_lockers: is the existing label. It is used for the
  post-allocation failure path where *lockers is allocated and must
  be freed.

ret is set to -EINVAL before ceph_decode_8_safe() so that
err_free_lockers returns the correct error code on bounds violation.
Without this, err_free_lockers would return a stale ret value (0 from
the successful decode_locker() loop), silently swallowing the error.

-EINVAL is correct for both failure paths. The data received from the
OSD is structurally malformed. -ENOMEM would misrepresent the failure
class to callers and to stable@ backporters triaging error paths.
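
The two-label error handling and the stale-ret hazard can be modelled in
userspace along these lines. This is a sketch under assumptions, not the
patched function: the wire format (one count byte, count one-byte
entries, one type byte) and the names model_decode_entry() and
model_decode_list() are invented for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-entry decoder: consumes one byte per entry. */
static int model_decode_entry(const unsigned char **p,
			      const unsigned char *end, unsigned char *out)
{
	if (*p >= end)
		return -EINVAL;
	*out = *(*p)++;
	return 0;
}

/*
 * Hypothetical model of the control flow: a bounds failure before the
 * allocation goes to err_inval (nothing to free), a failure after it
 * goes to err_free (allocation must be released).
 */
static int model_decode_list(const unsigned char *p, const unsigned char *end,
			     unsigned char **entries, unsigned char *type)
{
	unsigned int i, n;
	int ret;

	*entries = NULL;
	if (p >= end)
		goto err_inval;		/* pre-allocation failure */
	n = *p++;

	*entries = calloc(n ? n : 1, sizeof(**entries));
	if (!*entries)
		return -ENOMEM;

	for (i = 0; i < n; i++) {
		ret = model_decode_entry(&p, end, *entries + i);
		if (ret)
			goto err_free;
	}

	/*
	 * After a successful loop ret is 0. Set it before the bounds
	 * check so that err_free does not return success by accident.
	 */
	ret = -EINVAL;
	if (p >= end)
		goto err_free;		/* post-allocation failure */
	*type = *p++;
	return 0;

err_inval:
	return -EINVAL;
err_free:
	free(*entries);
	*entries = NULL;
	return ret;
}
```

In this model a buffer that ends right after the last entry returns
-EINVAL and releases the allocation; dropping the ret = -EINVAL
assignment would make the same input return 0 with *type never written.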
KASAN report for bug 1 (kernel 7.0.0-rc7, QEMU/x86_64, KASLR disabled):
==================================================================
BUG: KASAN: slab-out-of-bounds in ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
Read of size 4 at addr ffff88800a29b76e by task insmod/58
CPU: 0 UID: 0 PID: 58 Comm: insmod Tainted: G O 7.0.0-rc7-g9c2abf69da83-dirty #15 PREEMPT(lazy)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x4d/0x70
print_report+0x170/0x4f3
kasan_report+0xda/0x110
ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
do_one_initcall+0x9a/0x3a0
do_init_module+0x27c/0x790
load_module+0x4a9a/0x6350
init_module_from_file+0x15c/0x180
idempotent_init_module+0x21f/0x750
__x64_sys_finit_module+0xba/0x120
do_syscall_64+0xe2/0x570
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Allocated by task 58:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
ceph_oob3_init+0x4d/0xff0 [ceph_oob3_poc]
do_one_initcall+0x9a/0x3a0
do_init_module+0x27c/0x790
load_module+0x4a9a/0x6350
init_module_from_file+0x15c/0x180
idempotent_init_module+0x21f/0x750
__x64_sys_finit_module+0xba/0x120
do_syscall_64+0xe2/0x570
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88800a29a000
which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 5998 bytes inside of
allocated 6000-byte region [ffff88800a29a000, ffff88800a29b770)
Memory state around the buggy address:
ffff88800a29b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88800a29b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88800a29b700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
^
ffff88800a29b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
num_lockers=0xccccaaaa (OOB garbage from KASAN redzone)
Bug 2 (ceph_decode_8) follows from the identical precondition. A
dedicated PoC is available on request.
Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
deployment can trigger this against any kernel client that issues the
lock.get_info class method (e.g. during RBD exclusive lock acquisition)
without any further privileges beyond OSD session establishment.
Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
Cc: stable@vger.kernel.org
Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
---
v3: Combine both fixes (ceph_decode_32 and ceph_decode_8) into a single
patch per Viacheslav Dubeyko's review. Set ret = -EINVAL before
ceph_decode_8_safe() so err_free_lockers returns the correct error
code, not stale ret (caught by Dan Carpenter / smatch). Clarify
err_inval vs err_free_lockers goto selection rationale and
-EINVAL justification.
---
net/ceph/cls_lock_client.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
index c6956f1df..4e6a6d3e4 100644
--- a/net/ceph/cls_lock_client.c
+++ b/net/ceph/cls_lock_client.c
@@ -299,7 +299,7 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 	if (ret)
 		return ret;

-	*num_lockers = ceph_decode_32(p);
+	ceph_decode_32_safe(p, end, *num_lockers, err_inval);
 	*lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);
 	if (!*lockers)
 		return -ENOMEM;
@@ -310,7 +310,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 			goto err_free_lockers;
 	}

-	*type = ceph_decode_8(p);
+	ret = -EINVAL;
+	ceph_decode_8_safe(p, end, *type, err_free_lockers);
 	s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
 	if (IS_ERR(s)) {
 		ret = PTR_ERR(s);
@@ -320,6 +321,8 @@ static int decode_lockers(void **p, void *end, u8 *type, char **tag,
 	*tag = s;
 	return 0;

+err_inval:
+	return -EINVAL;
 err_free_lockers:
 	ceph_free_lockers(*lockers, *num_lockers);
 	return ret;
--
2.53.0
On Tue, 2026-06-02 at 00:17 -0400, Pavitra Jha wrote:
> decode_lockers() in cls_lock_client.c contains two bare decode
> operations
> that allow a malicious or compromised OSD to trigger slab-out-of-
> bounds
> reads:
>
> 1. ceph_decode_32(p) at the num_lockers field has no preceding bounds
> check. ceph_start_decoding() accepts struct_len=0 as valid -- the
> internal ceph_decode_need(p, end, 0, bad) always passes -- so when
> an
> OSD sends struct_len=0, ceph_start_decoding() returns success with
> p == end. The immediately following bare ceph_decode_32(p) then
> reads
> 4 bytes past the validated buffer boundary. The garbage value is
> passed directly to kzalloc_objs() as the locker count.
>
> The sibling function decode_watchers() in osd_client.c already
> uses
> ceph_decode_32_safe() after its own ceph_start_decoding() call.
> decode_lockers() was the only site using the bare variant.
>
> 2. ceph_decode_8(p) after the decode_locker() loop has no preceding
> bounds check. If an OSD crafts num_lockers such that the loop
> advances p exactly to end, the subsequent bare ceph_decode_8(p)
> reads
> one byte past the validated buffer boundary. The result is passed
> directly into *type, which is used as a lock type discriminator by
> callers, giving an OSD-controlled one-byte OOB read with direct
> influence over the lock type field.
>
> Fix both by replacing bare operations with their safe variants:
> ceph_decode_32(p) -> ceph_decode_32_safe(p, end, *num_lockers,
> err_inval)
> ceph_decode_8(p) -> ceph_decode_8_safe(p, end, *type,
> err_free_lockers)
>
> The goto targets differ intentionally:
> err_inval: is a new label returning -EINVAL directly. It is used
> for
> the pre-allocation failure path where *lockers is not yet allocated
> and must not be passed to ceph_free_lockers().
>
> err_free_lockers: is the existing label. It is used for the
> post-allocation failure path where *lockers is allocated and must
> be freed.
>
> ret is set to -EINVAL before ceph_decode_8_safe() so that
> err_free_lockers returns the correct error code on bounds violation.
> Without this, err_free_lockers would return a stale ret value (0 from
> the successful decode_locker() loop), silently swallowing the error.
>
> -EINVAL is correct for both failure paths. The data received from the
> OSD is structurally malformed. -ENOMEM would misrepresent the failure
> class to callers and to stable@ backporters triaging error paths.
>
> KASAN report for bug 1 (kernel 7.0.0-rc7, QEMU/x86_64, KASLR
> disabled):
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in ceph_oob3_init+0x251/0xff0
> [ceph_oob3_poc]
> Read of size 4 at addr ffff88800a29b76e by task insmod/58
>
> CPU: 0 UID: 0 PID: 58 Comm: insmod Tainted: G O
> 7.0.0-rc7-g9c2abf69da83-dirty #15 PREEMPT(lazy)
> Tainted: [O]=OOT_MODULE
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-
> debian-1.17.0-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x4d/0x70
> print_report+0x170/0x4f3
> kasan_report+0xda/0x110
> ceph_oob3_init+0x251/0xff0 [ceph_oob3_poc]
> do_one_initcall+0x9a/0x3a0
> do_init_module+0x27c/0x790
> load_module+0x4a9a/0x6350
> init_module_from_file+0x15c/0x180
> idempotent_init_module+0x21f/0x750
> __x64_sys_finit_module+0xba/0x120
> do_syscall_64+0xe2/0x570
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> Allocated by task 58:
> kasan_save_stack+0x30/0x50
> kasan_save_track+0x14/0x30
> __kasan_kmalloc+0x7f/0x90
> ceph_oob3_init+0x4d/0xff0 [ceph_oob3_poc]
> do_one_initcall+0x9a/0x3a0
> do_init_module+0x27c/0x790
> load_module+0x4a9a/0x6350
> init_module_from_file+0x15c/0x180
> idempotent_init_module+0x21f/0x750
> __x64_sys_finit_module+0xba/0x120
> do_syscall_64+0xe2/0x570
> entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> The buggy address belongs to the object at ffff88800a29a000
> which belongs to the cache kmalloc-8k of size 8192
> The buggy address is located 5998 bytes inside of
> allocated 6000-byte region [ffff88800a29a000, ffff88800a29b770)
>
> Memory state around the buggy address:
> ffff88800a29b600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff88800a29b680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >ffff88800a29b700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fc fc
> ^
> ffff88800a29b780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
>
> num_lockers=0xccccaaaa (OOB garbage from KASAN redzone)
>
> Bug 2 (ceph_decode_8) follows from the identical precondition. A
> dedicated PoC is available on request.
>
> Attacker model: a malicious or compromised OSD in a multi-tenant Ceph
> deployment can trigger this against any kernel client that issues the
> lock.get_info class method (e.g. during RBD exclusive lock
> acquisition)
> without any further privileges beyond OSD session establishment.
>
> Fixes: d4ed4a530562 ("libceph: support for lock.lock_info")
> Cc: stable@vger.kernel.org
> Signed-off-by: Pavitra Jha <jhapavitra98@gmail.com>
> ---
> v3: Combine both fixes (ceph_decode_32 and ceph_decode_8) into a
> single
> patch per Viacheslav Dubeyko's review. Set ret = -EINVAL before
> ceph_decode_8_safe() so err_free_lockers returns the correct
> error
> code, not stale ret (caught by Dan Carpenter / smatch). Clarify
> err_inval vs err_free_lockers goto selection rationale and
> -EINVAL justification.
> ---
> net/ceph/cls_lock_client.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/cls_lock_client.c b/net/ceph/cls_lock_client.c
> index c6956f1df..4e6a6d3e4 100644
> --- a/net/ceph/cls_lock_client.c
> +++ b/net/ceph/cls_lock_client.c
> @@ -299,7 +299,7 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
> if (ret)
> return ret;
>
> - *num_lockers = ceph_decode_32(p);
> + ceph_decode_32_safe(p, end, *num_lockers, err_inval);
> *lockers = kzalloc_objs(**lockers, *num_lockers, GFP_NOIO);
> if (!*lockers)
> return -ENOMEM;
> @@ -310,7 +310,8 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
> goto err_free_lockers;
> }
>
> - *type = ceph_decode_8(p);
> + ret = -EINVAL;
> + ceph_decode_8_safe(p, end, *type, err_free_lockers);
> s = ceph_extract_encoded_string(p, end, NULL, GFP_NOIO);
> if (IS_ERR(s)) {
> ret = PTR_ERR(s);
> @@ -320,6 +321,8 @@ static int decode_lockers(void **p, void *end, u8
> *type, char **tag,
> *tag = s;
> return 0;
>
> +err_inval:
> + return -EINVAL;
> err_free_lockers:
> ceph_free_lockers(*lockers, *num_lockers);
> return ret;
Looks good.
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Thanks,
Slava.