[PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage

xulang posted 1 patch 3 weeks, 5 days ago
From: Lang Xu <xulang@uniontech.com>

An out-of-bounds read occurs when copying an element from a
BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
value_size, when that value_size is not 8-byte aligned.

The issue happens when:
1. A CGROUP_STORAGE map is created with a value_size not aligned to
   8 bytes (e.g., 4 bytes)
2. A HASH map is created with the same value_size (e.g., 4 bytes)
3. An element of map 2 is updated with data from map 1

In the kernel, map elements are typically aligned to 8 bytes. However,
bpf_cgroup_storage_calculate_size() allocates storage based on the exact
value_size without alignment. When copy_map_value_long() is called, it
assumes all map values are 8-byte aligned and rounds up the copy size,
leading to a 4-byte out-of-bounds read from the cgroup storage buffer.

This patch fixes the issue by ensuring cgroup storage allocates 8-byte
aligned buffers, matching the assumptions in copy_map_value_long().

Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
Signed-off-by: Lang Xu <xulang@uniontech.com>
---
 kernel/bpf/local_storage.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
index 8fca0c64f7b1..54b32ba19194 100644
--- a/kernel/bpf/local_storage.c
+++ b/kernel/bpf/local_storage.c
@@ -487,14 +487,13 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
 {
 	size_t size;
 
+	size = round_up(map->value_size, 8);
 	if (cgroup_storage_type(map) == BPF_CGROUP_STORAGE_SHARED) {
-		size = sizeof(struct bpf_storage_buffer) + map->value_size;
+		size += sizeof(struct bpf_storage_buffer);
 		*pages = round_up(sizeof(struct bpf_cgroup_storage) + size,
 				  PAGE_SIZE) >> PAGE_SHIFT;
 	} else {
-		size = map->value_size;
-		*pages = round_up(round_up(size, 8) * num_possible_cpus(),
-				  PAGE_SIZE) >> PAGE_SHIFT;
+		*pages = round_up(size * num_possible_cpus(), PAGE_SIZE) >> PAGE_SHIFT;
 	}
 
 	return size;
-- 
2.51.0
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Martin KaFai Lau 3 weeks, 3 days ago

On 3/11/26 10:25 PM, xulang wrote:
> From: Lang Xu <xulang@uniontech.com>
> 
> An out-of-bounds read occurs when copying element from a
> BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
> value_size that is not 8-byte aligned.
> 
> The issue happens when:
> 1. A CGROUP_STORAGE map is created with value_size not aligned to
>     8 bytes (e.g., 4 bytes)
> 2. A HASH map is created with the same value_size (e.g., 4 bytes)
> 3. Update element in 2 with data in 1

Please create a selftest for this.

pw-bot: cr

> 
> In the kernel, map elements are typically aligned to 8 bytes. However,
> bpf_cgroup_storage_calculate_size() allocates storage based on the exact
> value_size without alignment. When copy_map_value_long() is called, it
> assumes all map values are 8-byte aligned and rounds up the copy size,
> leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
> 
> This patch fixes the issue by ensuring cgroup storage allocates 8-byte
> aligned buffers, matching the assumptions in copy_map_value_long().

This is fixing the src side of the "copy_map_value_long(map, dst, src)".
The src could also be from a skb? What is the value_size that the 
verifier is checking for bpf_map_update_elem?
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by xulang 3 weeks, 1 day ago
From: Lang Xu <xulang@uniontech.com>

> Please create a selftest for this.

Going to do that. To stably reproduce this bug, I need the KASAN
config enabled. How do I ensure it's enabled during a selftest cycle?
By adding the line below to the 'config' file? Not quite sure.

--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -46,6 +46,7 @@ CONFIG_IPV6_GRE=y
 CONFIG_IPV6_SEG6_BPF=y
 CONFIG_IPV6_SIT=y
 CONFIG_IPV6_TUNNEL=y
+CONFIG_KASAN=y
 CONFIG_KEYS=y
 CONFIG_LIRC=y
 CONFIG_LWTUNNEL=y

> This is fixing the src side of the "copy_map_value_long(map, dst, src)".
> The src could also be from a skb? What is the value_size that the
> verifier is checking for bpf_map_update_elem?

The value_size checked by the verifier is exactly the size with which
the map is defined, i.e., not the size rounded up to 8 bytes by the
kernel.

As for bpf_map_update_elem->..->copy_map_value_long, 'src' can't come
from an 'skb', since that mismatches the expected ptr-type of
'bpf_map_update_elem'. I've tried code like the following:

1. bpf_map_update_elem(&lru_map, &key, skb, BPF_ANY);
2. bpf_map_update_elem(&lru_map, &key, skb->sk, BPF_ANY);  // null checked
3. bpf_map_update_elem(&lru_map, &key, skb->flow_keys, BPF_ANY);

All these ptrs mismatch the expected ptr-type, which the verifier
detects. It complains with a msg like 'R3 type=ctx expected=fp, pkt, pkt_meta, map_key,
map_value, mem, ringbuf_mem, buf, trusted_ptr'
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Martin KaFai Lau 3 weeks ago

On 3/16/26 6:51 AM, xulang wrote:
> From: Lang Xu <xulang@uniontech.com>
> 
>> Please create a selftest for this.
> 
> Going to do that. To stably reproduce this bug, I need the KASAN
> config enabled, how do I ensure it's enabled during a selftest cycle,
> by adding the line below to the 'config'? not quite sure.
> 
> --- a/tools/testing/selftests/bpf/config
> +++ b/tools/testing/selftests/bpf/config
> @@ -46,6 +46,7 @@ CONFIG_IPV6_GRE=y
>   CONFIG_IPV6_SEG6_BPF=y
>   CONFIG_IPV6_SIT=y
>   CONFIG_IPV6_TUNNEL=y
> +CONFIG_KASAN=y

I would leave out this config change from this fix for now. cc: Ihor to 
consider enabling it for bpf-next.

It is still useful to have a selftest for this case. I always have KASAN 
turned on when running selftests.


>> This is fixing the src side of the "copy_map_value_long(map, dst, src)".
>> The src could also be from a skb? What is the value_size that the
>> verifier is checking for bpf_map_update_elem?
> 
> The value_size checked by verifier is exactly the size with which
> the map is defined, i.e., not the size rounded up to 8-byte by kernel

If the verifier ensures only 4 bytes, I am not sure the helper should 
read 8 bytes.

> 
> As for bpf_map_update_elem->..->copy_map_value_long, 'src' couldn't be from
> 'skb' which mismatches the expected ptr-type of 'bpf_map_update_elem',
> I've tried codes like these:
> 
> 1. bpf_map_update_elem(&lru_map, &key, skb, BPF_ANY);
> 2. bpf_map_update_elem(&lru_map, &key, skb->sk, BPF_ANY);  // null checked
> 3. bpf_map_update_elem(&lru_map, &key, skb->flow_keys, BPF_ANY);
> 
> All these ptrs mismatch the expected ptr-type, which can be detected by the verifier.
> The verifier complains with msg like 'R3 type=ctx expected=fp, pkt, pkt_meta, map_key,
> map_value, mem, ringbuf_mem, buf, trusted_ptr'

I meant the __sk_buff->data. Take a look at how skb->data can be used in 
the selftests. __sk_buff->data may not have readable bytes rounded up to 
8. Just one example that the src cannot always be fixed by allocating more.

From looking at the git history on pcpu_init_value, the issue was 
likely introduced in commit d3bec0138bfb.
[PATCH bpf 0/2] bpf: Fix and test cgroup storage OOB issue
Posted by xulang 3 weeks ago
This series fixes an out-of-bounds read in BPF cgroup storage when the
value_size is not 8-byte aligned. The fix ensures proper alignment during
buffer allocation, and a test case is added to prevent regression.

Lang Xu (2):
  bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
  selftests/bpf: Add test for cgroup storage OOB read

 kernel/bpf/local_storage.c                    |  7 ++-
 .../selftests/bpf/prog_tests/cgroup_storage.c | 42 ++++++++++++++++++
 .../selftests/bpf/progs/cgroup_storage.c      | 43 +++++++++++++++++++
 3 files changed, 88 insertions(+), 4 deletions(-)

-- 
2.51.0
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Ihor Solodrai 3 weeks ago
On 3/16/26 1:50 PM, Martin KaFai Lau wrote:
> 
> 
> On 3/16/26 6:51 AM, xulang wrote:
>> From: Lang Xu <xulang@uniontech.com>
>>
>>> Please create a selftest for this.
>>
>> Going to do that. To stably reproduce this bug, I need the KASAN
>> config enabled, how do I ensure it's enabled during a selftest cycle,
>> by adding the line below to the 'config'? not quite sure.
>>
>> --- a/tools/testing/selftests/bpf/config
>> +++ b/tools/testing/selftests/bpf/config
>> @@ -46,6 +46,7 @@ CONFIG_IPV6_GRE=y
>>   CONFIG_IPV6_SEG6_BPF=y
>>   CONFIG_IPV6_SIT=y
>>   CONFIG_IPV6_TUNNEL=y
>> +CONFIG_KASAN=y
> 
> I would leave out this config change from this fix for now. cc: Ihor to consider enabling it for bpf-next.
> 
> It is still useful to have a selftest for this case. I always have KASAN turned on when running selftests.

Hi Martin. BPF CI has been running with KASAN since July 2025.
We only disabled it on s390x due to frequent OOMs, x86_64 and 
aarch64 are covered.

For BPF CI it's not important whether CONFIG_KASAN=y is enabled
in tools/testing/selftests/bpf/config, as long as it's not =n

> 
> 
>>> [...]

Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Yonghong Song 3 weeks, 4 days ago

On 3/11/26 10:25 PM, xulang wrote:
> From: Lang Xu <xulang@uniontech.com>
>
> An out-of-bounds read occurs when copying element from a
> BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
> value_size that is not 8-byte aligned.
>
> The issue happens when:
> 1. A CGROUP_STORAGE map is created with value_size not aligned to
>     8 bytes (e.g., 4 bytes)
> 2. A HASH map is created with the same value_size (e.g., 4 bytes)
> 3. Update element in 2 with data in 1
>
> In the kernel, map elements are typically aligned to 8 bytes. However,
> bpf_cgroup_storage_calculate_size() allocates storage based on the exact
> value_size without alignment. When copy_map_value_long() is called, it
> assumes all map values are 8-byte aligned and rounds up the copy size,
> leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
>
> This patch fixes the issue by ensuring cgroup storage allocates 8-byte
> aligned buffers, matching the assumptions in copy_map_value_long().
>
> Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
> Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
> Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
> Signed-off-by: Lang Xu <xulang@uniontech.com>

Acked-by: Yonghong Song <yonghong.song@linux.dev>
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Paul Chaignon 3 weeks, 5 days ago
On Thu, Mar 12, 2026 at 01:25:25PM +0800, xulang wrote:
> From: Lang Xu <xulang@uniontech.com>
> 
> An out-of-bounds read occurs when copying element from a
> BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
> value_size that is not 8-byte aligned.
> 
> The issue happens when:
> 1. A CGROUP_STORAGE map is created with value_size not aligned to
>    8 bytes (e.g., 4 bytes)
> 2. A HASH map is created with the same value_size (e.g., 4 bytes)
> 3. Update element in 2 with data in 1
> 
> In the kernel, map elements are typically aligned to 8 bytes. However,
> bpf_cgroup_storage_calculate_size() allocates storage based on the exact
> value_size without alignment. When copy_map_value_long() is called, it
> assumes all map values are 8-byte aligned and rounds up the copy size,
> leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
> 
> This patch fixes the issue by ensuring cgroup storage allocates 8-byte
> aligned buffers, matching the assumptions in copy_map_value_long().

I don't think this bug is specific to the CGROUP_STORAGE maps. Wouldn't
it affect any copy from a non-percpu map into a percpu hashmap? The
reproducer in [1] copies from a BPF_MAP_TYPE_CGROUP_STORAGE map to a
BPF_MAP_TYPE_LRU_PERCPU_HASH map, but I suspect you'd hit the same bug
if copying from BPF_MAP_TYPE_HASH into BPF_MAP_TYPE_PERCPU_HASH because
for BPF_MAP_TYPE_HASH the value size is also not rounded up to a
multiple of 8.

1 - https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/

> 
> Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
> Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
> Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
> Signed-off-by: Lang Xu <xulang@uniontech.com>
> ---
>  kernel/bpf/local_storage.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> index 8fca0c64f7b1..54b32ba19194 100644
> --- a/kernel/bpf/local_storage.c
> +++ b/kernel/bpf/local_storage.c
> @@ -487,14 +487,13 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
>  {
>  	size_t size;
>  
> +	size = round_up(map->value_size, 8);
>  	if (cgroup_storage_type(map) == BPF_CGROUP_STORAGE_SHARED) {
> -		size = sizeof(struct bpf_storage_buffer) + map->value_size;
> +		size += sizeof(struct bpf_storage_buffer);
>  		*pages = round_up(sizeof(struct bpf_cgroup_storage) + size,
>  				  PAGE_SIZE) >> PAGE_SHIFT;
>  	} else {
> -		size = map->value_size;
> -		*pages = round_up(round_up(size, 8) * num_possible_cpus(),
> -				  PAGE_SIZE) >> PAGE_SHIFT;
> +		*pages = round_up(size * num_possible_cpus(), PAGE_SIZE) >> PAGE_SHIFT;
>  	}
>  
>  	return size;
> -- 
> 2.51.0
> 
>
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Yonghong Song 3 weeks, 4 days ago

On 3/12/26 4:51 AM, Paul Chaignon wrote:
> On Thu, Mar 12, 2026 at 01:25:25PM +0800, xulang wrote:
>> From: Lang Xu <xulang@uniontech.com>
>>
>> An out-of-bounds read occurs when copying element from a
>> BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
>> value_size that is not 8-byte aligned.
>>
>> The issue happens when:
>> 1. A CGROUP_STORAGE map is created with value_size not aligned to
>>     8 bytes (e.g., 4 bytes)
>> 2. A HASH map is created with the same value_size (e.g., 4 bytes)
>> 3. Update element in 2 with data in 1
>>
>> In the kernel, map elements are typically aligned to 8 bytes. However,
>> bpf_cgroup_storage_calculate_size() allocates storage based on the exact
>> value_size without alignment. When copy_map_value_long() is called, it
>> assumes all map values are 8-byte aligned and rounds up the copy size,
>> leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
>>
>> This patch fixes the issue by ensuring cgroup storage allocates 8-byte
>> aligned buffers, matching the assumptions in copy_map_value_long().
> I don't think this bug is specific to the CGROUP_STORAGE maps. Wouldn't
> it affect any copy from a non-percpu map into a percpu hashmap? The
> reproducer in [1] copies from a BPF_MAP_TYPE_CGROUP_STORAGE map to a
> BPF_MAP_TYPE_LRU_PERCPU_HASH map, but I suspect you'd hit the same bug
> if copying from BPF_MAP_TYPE_HASH into BPF_MAP_TYPE_PERCPU_HASH because
> for BPF_MAP_TYPE_HASH the value size is also not rounded up to a
> multiple of 8.

The BPF_MAP_TYPE_HASH table has its value size rounded up to 8. See:

         if (percpu)
                 htab->elem_size += sizeof(void *);
         else
                 htab->elem_size += round_up(htab->map.value_size, 8);

The same for array size.

>
> 1 - https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
>
>> Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
>> Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
>> Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
>> Signed-off-by: Lang Xu <xulang@uniontech.com>
>> ---
>>   kernel/bpf/local_storage.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
>> index 8fca0c64f7b1..54b32ba19194 100644
>> --- a/kernel/bpf/local_storage.c
>> +++ b/kernel/bpf/local_storage.c
>> @@ -487,14 +487,13 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
>>   {
>>   	size_t size;
>>   
>> +	size = round_up(map->value_size, 8);
>>   	if (cgroup_storage_type(map) == BPF_CGROUP_STORAGE_SHARED) {
>> -		size = sizeof(struct bpf_storage_buffer) + map->value_size;
>> +		size += sizeof(struct bpf_storage_buffer);
>>   		*pages = round_up(sizeof(struct bpf_cgroup_storage) + size,
>>   				  PAGE_SIZE) >> PAGE_SHIFT;
>>   	} else {
>> -		size = map->value_size;
>> -		*pages = round_up(round_up(size, 8) * num_possible_cpus(),
>> -				  PAGE_SIZE) >> PAGE_SHIFT;
>> +		*pages = round_up(size * num_possible_cpus(), PAGE_SIZE) >> PAGE_SHIFT;
>>   	}
>>   
>>   	return size;
>> -- 
>> 2.51.0
>>
>>
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Paul Chaignon 3 weeks, 4 days ago
On Thu, Mar 12, 2026 at 09:41:40AM -0700, Yonghong Song wrote:
> 
> 
> On 3/12/26 4:51 AM, Paul Chaignon wrote:
> > On Thu, Mar 12, 2026 at 01:25:25PM +0800, xulang wrote:
> > > From: Lang Xu <xulang@uniontech.com>
> > > 
> > > An out-of-bounds read occurs when copying element from a
> > > BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
> > > value_size that is not 8-byte aligned.
> > > 
> > > The issue happens when:
> > > 1. A CGROUP_STORAGE map is created with value_size not aligned to
> > >     8 bytes (e.g., 4 bytes)
> > > 2. A HASH map is created with the same value_size (e.g., 4 bytes)
> > > 3. Update element in 2 with data in 1
> > > 
> > > In the kernel, map elements are typically aligned to 8 bytes. However,
> > > bpf_cgroup_storage_calculate_size() allocates storage based on the exact
> > > value_size without alignment. When copy_map_value_long() is called, it
> > > assumes all map values are 8-byte aligned and rounds up the copy size,
> > > leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
> > > 
> > > This patch fixes the issue by ensuring cgroup storage allocates 8-byte
> > > aligned buffers, matching the assumptions in copy_map_value_long().
> > I don't think this bug is specific to the CGROUP_STORAGE maps. Wouldn't
> > it affect any copy from a non-percpu map into a percpu hashmap? The
> > reproducer in [1] copies from a BPF_MAP_TYPE_CGROUP_STORAGE map to a
> > BPF_MAP_TYPE_LRU_PERCPU_HASH map, but I suspect you'd hit the same bug
> > if copying from BPF_MAP_TYPE_HASH into BPF_MAP_TYPE_PERCPU_HASH because
> > for BPF_MAP_TYPE_HASH the value size is also not rounded up to a
> > multiple of 8.
> 
> The BPF_MAP_TYPE_HASH table have value size rounds up to 8. See:
> 
>         if (percpu)
>                 htab->elem_size += sizeof(void *);
>         else
>                 htab->elem_size += round_up(htab->map.value_size, 8);
> 
> The same for array size.

My bad, I looked at the _alloc_check and assumed any round_up would be
reflected there :/ Given that:

Acked-by: Paul Chaignon <paul.chaignon@gmail.com>

I also had a look at other map types and they all seem to round up to 8
or to not be susceptible to the oob copy (ex., queue & stack). The one
for which I'm unsure is BPF_MAP_TYPE_*_CGROUP_STORAGE. It doesn't seem
to round up to 8, but I'm unsure it could be used to reproduce the copy.

On a related note, this is the sort of reproducer that would be good to
add in https://github.com/google/syzkaller/tree/master/sys/linux/test
because syzbot can easily learn from it and reach potentially similar
bugs.

> 
> > 
> > 1 - https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
> > 
> > > Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
> > > Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
> > > Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
> > > Signed-off-by: Lang Xu <xulang@uniontech.com>
> > > ---
> > >   kernel/bpf/local_storage.c | 7 +++----
> > >   1 file changed, 3 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> > > index 8fca0c64f7b1..54b32ba19194 100644
> > > --- a/kernel/bpf/local_storage.c
> > > +++ b/kernel/bpf/local_storage.c
> > > @@ -487,14 +487,13 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
> > >   {
> > >   	size_t size;
> > > +	size = round_up(map->value_size, 8);
> > >   	if (cgroup_storage_type(map) == BPF_CGROUP_STORAGE_SHARED) {
> > > -		size = sizeof(struct bpf_storage_buffer) + map->value_size;
> > > +		size += sizeof(struct bpf_storage_buffer);
> > >   		*pages = round_up(sizeof(struct bpf_cgroup_storage) + size,
> > >   				  PAGE_SIZE) >> PAGE_SHIFT;
> > >   	} else {
> > > -		size = map->value_size;
> > > -		*pages = round_up(round_up(size, 8) * num_possible_cpus(),
> > > -				  PAGE_SIZE) >> PAGE_SHIFT;
> > > +		*pages = round_up(size * num_possible_cpus(), PAGE_SIZE) >> PAGE_SHIFT;
> > >   	}
> > >   	return size;
> > > -- 
> > > 2.51.0
> > > 
> > > 
>
Re: [PATCH bpf v1] bpf: Fix OOB in bpf_obj_memcpy for cgroup storage
Posted by Yonghong Song 3 weeks, 4 days ago

On 3/12/26 11:02 AM, Paul Chaignon wrote:
> On Thu, Mar 12, 2026 at 09:41:40AM -0700, Yonghong Song wrote:
>>
>> On 3/12/26 4:51 AM, Paul Chaignon wrote:
>>> On Thu, Mar 12, 2026 at 01:25:25PM +0800, xulang wrote:
>>>> From: Lang Xu <xulang@uniontech.com>
>>>>
>>>> An out-of-bounds read occurs when copying element from a
>>>> BPF_MAP_TYPE_CGROUP_STORAGE map to another map type with the same
>>>> value_size that is not 8-byte aligned.
>>>>
>>>> The issue happens when:
>>>> 1. A CGROUP_STORAGE map is created with value_size not aligned to
>>>>      8 bytes (e.g., 4 bytes)
>>>> 2. A HASH map is created with the same value_size (e.g., 4 bytes)
>>>> 3. Update element in 2 with data in 1
>>>>
>>>> In the kernel, map elements are typically aligned to 8 bytes. However,
>>>> bpf_cgroup_storage_calculate_size() allocates storage based on the exact
>>>> value_size without alignment. When copy_map_value_long() is called, it
>>>> assumes all map values are 8-byte aligned and rounds up the copy size,
>>>> leading to a 4-byte out-of-bounds read from the cgroup storage buffer.
>>>>
>>>> This patch fixes the issue by ensuring cgroup storage allocates 8-byte
>>>> aligned buffers, matching the assumptions in copy_map_value_long().
>>> I don't think this bug is specific to the CGROUP_STORAGE maps. Wouldn't
>>> it affect any copy from a non-percpu map into a percpu hashmap? The
>>> reproducer in [1] copies from a BPF_MAP_TYPE_CGROUP_STORAGE map to a
>>> BPF_MAP_TYPE_LRU_PERCPU_HASH map, but I suspect you'd hit the same bug
>>> if copying from BPF_MAP_TYPE_HASH into BPF_MAP_TYPE_PERCPU_HASH because
>>> for BPF_MAP_TYPE_HASH the value size is also not rounded up to a
>>> multiple of 8.
>> The BPF_MAP_TYPE_HASH table have value size rounds up to 8. See:
>>
>>          if (percpu)
>>                  htab->elem_size += sizeof(void *);
>>          else
>>                  htab->elem_size += round_up(htab->map.value_size, 8);
>>
>> The same for array size.
> My bad, I looked at the _alloc_check and assumed any round_up would be
> reflected there :/ Given that:
>
> Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
>
> I also had a look at other map types and they all seem to round up to 8
> or to not be susceptible to the oob copy (ex., queue & stack). The one
> for which I'm unsure is BPF_MAP_TYPE_*_CGROUP_STORAGE. It doesn't seem
> to round up to 8, but I'm unsure it could be used to reproduce the copy.

For cgroup local storage, I think we should be okay.

struct bpf_local_storage_elem {
         struct hlist_node map_node;     /* Linked to bpf_local_storage_map */
         struct hlist_node snode;        /* Linked to bpf_local_storage */
         struct bpf_local_storage __rcu *local_storage;
         union {
                 struct rcu_head rcu;
                 struct hlist_node free_node;    /* used to postpone
                                                  * bpf_selem_free
                                                  * after raw_spin_unlock
                                                  */
         };
         atomic_t state;
         bool use_kmalloc_nolock;
         /* 3 bytes hole */
         /* The data is stored in another cacheline to minimize
          * the number of cachelines access during a cache hit.
          */
         struct bpf_local_storage_data sdata ____cacheline_aligned;
};

struct bpf_local_storage_data {
         /* smap is used as the searching key when looking up
          * from the object's bpf_local_storage.
          *
          * Put it in the same cacheline as the data to minimize
          * the number of cachelines accessed during the cache hit case.
          */
         struct bpf_local_storage_map __rcu *smap;
         u8 data[] __aligned(8);
};

$ pahole -C bpf_local_storage_elem ../../linux-bld/vmlinux
struct bpf_local_storage_elem {
         struct hlist_node          map_node;             /*     0    16 */
         struct hlist_node          snode;                /*    16    16 */
         struct bpf_local_storage * local_storage;        /*    32     8 */
         union {
                 struct callback_head rcu __attribute__((__aligned__(8))); /*    40    16 */
                 struct hlist_node  free_node;            /*    40    16 */
         };                                               /*    40    16 */

         atomic_t                   state;                /*    56     4 */
         bool                       use_kmalloc_nolock;   /*    60     1 */

         /* XXX 3 bytes hole, try to pack */

         /* --- cacheline 1 boundary (64 bytes) --- */
         struct bpf_local_storage_data sdata __attribute__((__aligned__(64))); /*    64     8 */

         /* XXX last struct has a flexible array */

         /* Force padding: */
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;
         struct bpf_local_storage_data :64;

         /* size: 128, cachelines: 2, members: 7 */
         /* sum members: 69, holes: 1, sum holes: 3 */
         /* padding: 56 */
         /* forced alignments: 1, forced holes: 1, sum forced holes: 3 */
         /* flexible array members: end: 1 */
};


So the minimum size will be 72 (with an elem size of 0). If the elem
size is 4, the allocation size will be 76.

The allocation uses kmalloc, so the size will be promoted to the next
slab bucket, which is a multiple of 8.

>
> On a related note, this is the sort of reproducer that would be good to
> add in https://github.com/google/syzkaller/tree/master/sys/linux/test
> because syzbot can easily learn from it and reach potentially similar
> bugs.
>
>>> 1 - https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
>>>
>>>> Fixes: b741f1630346 ("bpf: introduce per-cpu cgroup local storage")
>>>> Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
>>>> Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
>>>> Signed-off-by: Lang Xu <xulang@uniontech.com>
>>>> ---
>>>>    kernel/bpf/local_storage.c | 7 +++----
>>>>    1 file changed, 3 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
>>>> index 8fca0c64f7b1..54b32ba19194 100644
>>>> --- a/kernel/bpf/local_storage.c
>>>> +++ b/kernel/bpf/local_storage.c
>>>> @@ -487,14 +487,13 @@ static size_t bpf_cgroup_storage_calculate_size(struct bpf_map *map, u32 *pages)
>>>>    {
>>>>    	size_t size;
>>>> +	size = round_up(map->value_size, 8);
>>>>    	if (cgroup_storage_type(map) == BPF_CGROUP_STORAGE_SHARED) {
>>>> -		size = sizeof(struct bpf_storage_buffer) + map->value_size;
>>>> +		size += sizeof(struct bpf_storage_buffer);
>>>>    		*pages = round_up(sizeof(struct bpf_cgroup_storage) + size,
>>>>    				  PAGE_SIZE) >> PAGE_SHIFT;
>>>>    	} else {
>>>> -		size = map->value_size;
>>>> -		*pages = round_up(round_up(size, 8) * num_possible_cpus(),
>>>> -				  PAGE_SIZE) >> PAGE_SHIFT;
>>>> +		*pages = round_up(size * num_possible_cpus(), PAGE_SIZE) >> PAGE_SHIFT;
>>>>    	}
>>>>    	return size;
>>>> -- 
>>>> 2.51.0
>>>>
>>>>