[PATCH] gcov: use atomic counter updates to fix concurrent access crashes

Konstantin Khorenko posted 1 patch 2 months, 1 week ago
There is a newer version of this series
[PATCH] gcov: use atomic counter updates to fix concurrent access crashes
Posted by Konstantin Khorenko 2 months, 1 week ago
GCC's GCOV instrumentation can merge global branch counters with loop
induction variables as an optimization.  In inflate_fast(), the inner
copy loops get transformed so that the GCOV counter value is loaded
multiple times to compute the loop base address, start index, and end
bound.  Since GCOV counters are global (not per-CPU), concurrent
execution on different CPUs causes the counter to change between loads,
producing inconsistent values and out-of-bounds memory writes.

The crash manifests during IPComp (IP Payload Compression) processing
when inflate_fast() runs concurrently on multiple CPUs:

  BUG: unable to handle page fault for address: ffffd0a3c0902ffa
  RIP: inflate_fast+1431
  Call Trace:
   zlib_inflate
   __deflate_decompress
   crypto_comp_decompress
   ipcomp_decompress [xfrm_ipcomp]
   ipcomp_input [xfrm_ipcomp]
   xfrm_input

At the crash point, the compiler generated three loads from the same
global GCOV counter (__gcov0.inflate_fast+216) to compute base, start,
and end for an indexed loop.  Another CPU modified the counter between
loads, making the values inconsistent — the write went 3.4 MB past a
65 KB buffer.

Add -fprofile-update=atomic to CFLAGS_GCOV at the global level in the
top-level Makefile.  This tells GCC that GCOV counters may be
concurrently accessed, causing counter updates to use atomic
instructions (lock addq) instead of plain load/store.  This prevents
the compiler from merging counters with loop induction variables.

Applying this globally rather than per-subsystem not only addresses the
observed crash in zlib but makes GCOV coverage data more consistent
overall, preventing similar issues in any kernel code path that may
execute concurrently.

Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>

---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 6b1d9fb1a6b4..a55ad668d6ba 100644
--- a/Makefile
+++ b/Makefile
@@ -806,7 +806,7 @@ all: vmlinux
 
 CFLAGS_GCOV	:= -fprofile-arcs -ftest-coverage
 ifdef CONFIG_CC_IS_GCC
-CFLAGS_GCOV	+= -fno-tree-loop-im
+CFLAGS_GCOV	+= -fno-tree-loop-im -fprofile-update=atomic
 endif
 export CFLAGS_GCOV
 
-- 
2.43.5

Re: [PATCH] gcov: use atomic counter updates to fix concurrent access crashes
Posted by kernel test robot 2 months ago
Hi Konstantin,

kernel test robot noticed the following build warnings:

[auto build test WARNING on soc/for-next]
[also build test WARNING on linus/master v7.0-rc7 next-20260410]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Konstantin-Khorenko/gcov-use-atomic-counter-updates-to-fix-concurrent-access-crashes/20260411-133428
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
patch link:    https://lore.kernel.org/r/20260402141831.1437357-2-khorenko%40virtuozzo.com
patch subject: [PATCH] gcov: use atomic counter updates to fix concurrent access crashes
config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20260411/202604111946.Erd3tguU-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 15.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260411/202604111946.Erd3tguU-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202604111946.Erd3tguU-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> io_uring/io_uring.c:3224:1: warning: target does not support atomic profile update, single mode is selected
    3224 | __initcall(io_uring_init);
         | ^~~~~~~~~~
--
>> io_uring/opdef.c:890:1: warning: target does not support atomic profile update, single mode is selected
     890 | }
         | ^
--
>> io_uring/kbuf.c:740:1: warning: target does not support atomic profile update, single mode is selected
     740 | }
         | ^
--
>> io_uring/rsrc.c:1555:1: warning: target does not support atomic profile update, single mode is selected
    1555 | }
         | ^
--
>> io_uring/notif.c:141:1: warning: target does not support atomic profile update, single mode is selected
     141 | }
         | ^
--
>> io_uring/tctx.c:388:1: warning: target does not support atomic profile update, single mode is selected
     388 | }
         | ^
--
>> io_uring/filetable.c:158:1: warning: target does not support atomic profile update, single mode is selected
     158 | }
         | ^
--
>> io_uring/rw.c:1397:1: warning: target does not support atomic profile update, single mode is selected
    1397 | }
         | ^
--
>> io_uring/poll.c:963:1: warning: target does not support atomic profile update, single mode is selected
     963 | }
         | ^
--
>> io_uring/tw.c:355:1: warning: target does not support atomic profile update, single mode is selected
     355 | }
         | ^
--
>> io_uring/wait.c:308:1: warning: target does not support atomic profile update, single mode is selected
     308 | }
         | ^
..


vim +3224 io_uring/io_uring.c

76d3ccecfa186a io_uring/io_uring.c Matteo Rizzo 2023-08-21  3221  
2b188cc1bb857a fs/io_uring.c       Jens Axboe   2019-01-07  3222  	return 0;
2b188cc1bb857a fs/io_uring.c       Jens Axboe   2019-01-07  3223  };
2b188cc1bb857a fs/io_uring.c       Jens Axboe   2019-01-07 @3224  __initcall(io_uring_init);

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] gcov: use atomic counter updates to fix concurrent access crashes
Posted by Nathan Chancellor 2 months, 1 week ago
On Thu, Apr 02, 2026 at 05:18:31PM +0300, Konstantin Khorenko wrote:
> GCC's GCOV instrumentation can merge global branch counters with loop
> induction variables as an optimization.  In inflate_fast(), the inner
> copy loops get transformed so that the GCOV counter value is loaded
> multiple times to compute the loop base address, start index, and end
> bound.  Since GCOV counters are global (not per-CPU), concurrent
> execution on different CPUs causes the counter to change between loads,
> producing inconsistent values and out-of-bounds memory writes.
> 
> The crash manifests during IPComp (IP Payload Compression) processing
> when inflate_fast() runs concurrently on multiple CPUs:
> 
>   BUG: unable to handle page fault for address: ffffd0a3c0902ffa
>   RIP: inflate_fast+1431
>   Call Trace:
>    zlib_inflate
>    __deflate_decompress
>    crypto_comp_decompress
>    ipcomp_decompress [xfrm_ipcomp]
>    ipcomp_input [xfrm_ipcomp]
>    xfrm_input
> 
> At the crash point, the compiler generated three loads from the same
> global GCOV counter (__gcov0.inflate_fast+216) to compute base, start,
> and end for an indexed loop.  Another CPU modified the counter between
> loads, making the values inconsistent — the write went 3.4 MB past a
> 65 KB buffer.
> 
> Add -fprofile-update=atomic to CFLAGS_GCOV at the global level in the
> top-level Makefile.  This tells GCC that GCOV counters may be
> concurrently accessed, causing counter updates to use atomic
> instructions (lock addq) instead of plain load/store.  This prevents
> the compiler from merging counters with loop induction variables.
> 
> Applying this globally rather than per-subsystem not only addresses the
> observed crash in zlib but makes GCOV coverage data more consistent
> overall, preventing similar issues in any kernel code path that may
> execute concurrently.
> 
> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
> Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>

While this is obviously a fix, what are the chances of regressions from
this change? As this should only impact GCOV, this could go via whatever
tree carries GCOV patches. If Kbuild is to take this change, my vote
would be to defer it to 7.2 at this point in the development cycle so
that it can have most of a cycle to sit in -next.

Re: [PATCH] gcov: use atomic counter updates to fix concurrent access crashes
Posted by Peter Oberparleiter 2 months ago
On 06.04.2026 21:37, Nathan Chancellor wrote:
> On Thu, Apr 02, 2026 at 05:18:31PM +0300, Konstantin Khorenko wrote:
>> GCC's GCOV instrumentation can merge global branch counters with loop
>> induction variables as an optimization.  In inflate_fast(), the inner
>> copy loops get transformed so that the GCOV counter value is loaded
>> multiple times to compute the loop base address, start index, and end
>> bound.  Since GCOV counters are global (not per-CPU), concurrent
>> execution on different CPUs causes the counter to change between loads,
>> producing inconsistent values and out-of-bounds memory writes.
>>
>> The crash manifests during IPComp (IP Payload Compression) processing
>> when inflate_fast() runs concurrently on multiple CPUs:
>>
>>   BUG: unable to handle page fault for address: ffffd0a3c0902ffa
>>   RIP: inflate_fast+1431
>>   Call Trace:
>>    zlib_inflate
>>    __deflate_decompress
>>    crypto_comp_decompress
>>    ipcomp_decompress [xfrm_ipcomp]
>>    ipcomp_input [xfrm_ipcomp]
>>    xfrm_input
>>
>> At the crash point, the compiler generated three loads from the same
>> global GCOV counter (__gcov0.inflate_fast+216) to compute base, start,
>> and end for an indexed loop.  Another CPU modified the counter between
>> loads, making the values inconsistent — the write went 3.4 MB past a
>> 65 KB buffer.
>>
>> Add -fprofile-update=atomic to CFLAGS_GCOV at the global level in the
>> top-level Makefile.  This tells GCC that GCOV counters may be
>> concurrently accessed, causing counter updates to use atomic
>> instructions (lock addq) instead of plain load/store.  This prevents
>> the compiler from merging counters with loop induction variables.
>>
>> Applying this globally rather than per-subsystem not only addresses the
>> observed crash in zlib but makes GCOV coverage data more consistent
>> overall, preventing similar issues in any kernel code path that may
>> execute concurrently.
>>
>> Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
>> Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
>> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
> 
> While this is obviously a fix, what are the chances of regressions from
> this change? As this should only impact GCOV, this could go via whatever
> tree carries GCOV patches. If Kbuild is to take this change, my vote
> would be to defer it to 7.2 at this point in the development cycle so
> that it can have most of a cycle to sit in -next.

Adding Andrew since he typically integrates GCOV patches via his tree,
and for input on how to handle this patch.

To summarize the situation, this patch:
- is only effective with GCC + GCOV profiling enabled
- fixes a run-time crash
- improves overall GCOV coverage data consistency
- triggers a number of build errors due to side effects on GCC constant
  folding and therefore depends on the associated series [1] that fixes
  these build errors
- has a non-zero chance to trigger additional build-time errors, e.g.
  in similar macros guarded by arch/config symbols not covered by
  current testing

Given the last point, I agree with Nathan that this patch would benefit
from additional test coverage to minimize regression risks, e.g. via a
cycle in -next.

[1]
https://lore.kernel.org/lkml/20260402140558.1437002-1-khorenko@virtuozzo.com/

-- 
Peter Oberparleiter
Linux on IBM Z Development - IBM Germany R&D
Re: [PATCH] gcov: use atomic counter updates to fix concurrent access crashes
Posted by Andrew Morton 1 month, 4 weeks ago
On Thu, 9 Apr 2026 10:11:24 +0200 Peter Oberparleiter <oberpar@linux.ibm.com> wrote:

> > would be to defer it to 7.2 at this point in the development cycle so
> > that it can have most of a cycle to sit in -next.
> 
> Adding Andrew since he typically integrates GCOV patches via his tree,
> and for input on how to handle this patch.
> 
> To summarize the situation, this patch:
> - is only effective with GCC + GCOV profiling enabled
> - fixes a run-time crash
> - improves overall GCOV coverage data consistency
> - triggers a number of build errors due to side effects on GCC constant
>   folding and therefore depends on the associated series [1] that fixes
>   these build errors
> - has a non-zero chance to trigger additional build-time errors, e.g.
>   in similar macros guarded by arch/config symbols not covered by
>   current testing
> 
> Given the last point, I agree with Nathan that this patch would benefit
> from additional test coverage to minimize regression risks, e.g. via a
> cycle in -next.

Great, thanks for preempting lots of dumb akpm questions ;)

Agree, I'll stash this in the post-rc1 pile.