[PATCH v2] tracing: Silence warning when chunk allocation fails in trace_pid_write

Posted by Pu Lehui 3 weeks, 3 days ago

From: Pu Lehui <pulehui@huawei.com>

Syzkaller triggered a fault injection warning:

WARNING: CPU: 1 PID: 12326 at tracepoint_add_func+0xbfc/0xeb0
Modules linked in:
CPU: 1 UID: 0 PID: 12326 Comm: syz.6.10325 Tainted: G U 6.14.0-rc5-syzkaller #0
Tainted: [U]=USER
Hardware name: Google Compute Engine/Google Compute Engine
RIP: 0010:tracepoint_add_func+0xbfc/0xeb0 kernel/tracepoint.c:294
Code: 09 fe ff 90 0f 0b 90 0f b6 74 24 43 31 ff 41 bc ea ff ff ff
RSP: 0018:ffffc9000414fb48 EFLAGS: 00010283
RAX: 00000000000012a1 RBX: ffffffff8e240ae0 RCX: ffffc90014b78000
RDX: 0000000000080000 RSI: ffffffff81bbd78b RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffffffffef
R13: 0000000000000000 R14: dffffc0000000000 R15: ffffffff81c264f0
FS:  00007f27217f66c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e80dff8 CR3: 00000000268f8000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 tracepoint_probe_register_prio+0xc0/0x110 kernel/tracepoint.c:464
 register_trace_prio_sched_switch include/trace/events/sched.h:222 [inline]
 register_pid_events kernel/trace/trace_events.c:2354 [inline]
 event_pid_write.isra.0+0x439/0x7a0 kernel/trace/trace_events.c:2425
 vfs_write+0x24c/0x1150 fs/read_write.c:677
 ksys_write+0x12b/0x250 fs/read_write.c:731
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

We can reproduce the warning by following the steps below:
1. echo 8 >> set_event_notrace_pid. This makes tr->filtered_pids own
   one pid and registers the sched_switch tracepoint.
2. echo ' ' >> set_event_pid, and perform fault injection during chunk
   allocation of trace_pid_list_alloc. This leaves pid_list with no pid
   and assigns it to tr->filtered_pids.
3. echo ' ' >> set_event_pid. This leaves pid_list NULL and assigns it
   to tr->filtered_pids.
4. echo 9 >> set_event_pid. This triggers the double registration of
   the sched_switch tracepoint and the warning above.

The reason is that syzkaller injects a fault into the chunk allocation
in trace_pid_list_alloc, which causes trace_pid_list_set to fail and
may lead to a double registration of the same tracepoint. This only
occurs when the system is about to crash, but to suppress this warning,
let's handle the trace_pid_list_set failure in trace_pid_write.

Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering")
Reported-by: syzbot+161412ccaeff20ce4dde@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67cb890e.050a0220.d8275.022e.GAE@google.com
Signed-off-by: Pu Lehui <pulehui@huawei.com>
---
 kernel/trace/trace.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 1b7db732c0b1..f2a84d1ce4b7 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -834,7 +834,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
 		/* copy the current bits to the new max */
 		ret = trace_pid_list_first(filtered_pids, &pid);
 		while (!ret) {
-			trace_pid_list_set(pid_list, pid);
+			ret = trace_pid_list_set(pid_list, pid);
+			if (ret < 0)
+				goto out;
+
 			ret = trace_pid_list_next(filtered_pids, pid + 1, &pid);
 			nr_pids++;
 		}
@@ -871,6 +874,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
 		trace_parser_clear(&parser);
 		ret = 0;
 	}
+ out:
 	trace_parser_put(&parser);
 
 	if (ret < 0) {
-- 
2.34.1

Re: [PATCH v2] tracing: Silence warning when chunk allocation fails in trace_pid_write

Posted by Masami Hiramatsu (Google) 3 weeks, 2 days ago

On Mon,  8 Sep 2025 02:46:58 +0000
Pu Lehui <pulehui@huaweicloud.com> wrote:

> From: Pu Lehui <pulehui@huawei.com>
> 
> Syzkaller triggered a fault injection warning:
> 
> WARNING: CPU: 1 PID: 12326 at tracepoint_add_func+0xbfc/0xeb0
> Modules linked in:
> CPU: 1 UID: 0 PID: 12326 Comm: syz.6.10325 Tainted: G U 6.14.0-rc5-syzkaller #0
> Tainted: [U]=USER
> Hardware name: Google Compute Engine/Google Compute Engine
> RIP: 0010:tracepoint_add_func+0xbfc/0xeb0 kernel/tracepoint.c:294
> Code: 09 fe ff 90 0f 0b 90 0f b6 74 24 43 31 ff 41 bc ea ff ff ff
> RSP: 0018:ffffc9000414fb48 EFLAGS: 00010283
> RAX: 00000000000012a1 RBX: ffffffff8e240ae0 RCX: ffffc90014b78000
> RDX: 0000000000080000 RSI: ffffffff81bbd78b RDI: 0000000000000001
> RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffffffffef
> R13: 0000000000000000 R14: dffffc0000000000 R15: ffffffff81c264f0
> FS:  00007f27217f66c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2e80dff8 CR3: 00000000268f8000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  tracepoint_probe_register_prio+0xc0/0x110 kernel/tracepoint.c:464
>  register_trace_prio_sched_switch include/trace/events/sched.h:222 [inline]
>  register_pid_events kernel/trace/trace_events.c:2354 [inline]
>  event_pid_write.isra.0+0x439/0x7a0 kernel/trace/trace_events.c:2425
>  vfs_write+0x24c/0x1150 fs/read_write.c:677
>  ksys_write+0x12b/0x250 fs/read_write.c:731
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> We can reproduce the warning by following the steps below:
> 1. echo 8 >> set_event_notrace_pid. This makes tr->filtered_pids own
>    one pid and registers the sched_switch tracepoint.
> 2. echo ' ' >> set_event_pid, and perform fault injection during chunk
>    allocation of trace_pid_list_alloc. This leaves pid_list with no pid
>    and assigns it to tr->filtered_pids.
> 3. echo ' ' >> set_event_pid. This leaves pid_list NULL and assigns it
>    to tr->filtered_pids.
> 4. echo 9 >> set_event_pid. This triggers the double registration of
>    the sched_switch tracepoint and the warning above.
> 
> The reason is that syzkaller injects a fault into the chunk allocation
> in trace_pid_list_alloc, which causes trace_pid_list_set to fail and
> may lead to a double registration of the same tracepoint. This only
> occurs when the system is about to crash, but to suppress this warning,
> let's handle the trace_pid_list_set failure in trace_pid_write.
> 
> Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering")
> Reported-by: syzbot+161412ccaeff20ce4dde@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67cb890e.050a0220.d8275.022e.GAE@google.com
> Signed-off-by: Pu Lehui <pulehui@huawei.com>

Looks good to me.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you,


> ---
>  kernel/trace/trace.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 1b7db732c0b1..f2a84d1ce4b7 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -834,7 +834,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>  		/* copy the current bits to the new max */
>  		ret = trace_pid_list_first(filtered_pids, &pid);
>  		while (!ret) {
> -			trace_pid_list_set(pid_list, pid);
> +			ret = trace_pid_list_set(pid_list, pid);
> +			if (ret < 0)
> +				goto out;
> +
>  			ret = trace_pid_list_next(filtered_pids, pid + 1, &pid);
>  			nr_pids++;
>  		}
> @@ -871,6 +874,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>  		trace_parser_clear(&parser);
>  		ret = 0;
>  	}
> + out:
>  	trace_parser_put(&parser);
>  
>  	if (ret < 0) {
> -- 
> 2.34.1
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH v2] tracing: Silence warning when chunk allocation fails in trace_pid_write

Posted by Steven Rostedt 3 weeks, 3 days ago

On Mon,  8 Sep 2025 02:46:58 +0000
Pu Lehui <pulehui@huaweicloud.com> wrote:

> From: Pu Lehui <pulehui@huawei.com>
> 
> Syzkaller triggered a fault injection warning:
> 
> WARNING: CPU: 1 PID: 12326 at tracepoint_add_func+0xbfc/0xeb0
> Modules linked in:
> CPU: 1 UID: 0 PID: 12326 Comm: syz.6.10325 Tainted: G U 6.14.0-rc5-syzkaller #0
> Tainted: [U]=USER
> Hardware name: Google Compute Engine/Google Compute Engine
> RIP: 0010:tracepoint_add_func+0xbfc/0xeb0 kernel/tracepoint.c:294
> Code: 09 fe ff 90 0f 0b 90 0f b6 74 24 43 31 ff 41 bc ea ff ff ff
> RSP: 0018:ffffc9000414fb48 EFLAGS: 00010283
> RAX: 00000000000012a1 RBX: ffffffff8e240ae0 RCX: ffffc90014b78000
> RDX: 0000000000080000 RSI: ffffffff81bbd78b RDI: 0000000000000001
> RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffffffffef
> R13: 0000000000000000 R14: dffffc0000000000 R15: ffffffff81c264f0
> FS:  00007f27217f66c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2e80dff8 CR3: 00000000268f8000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  tracepoint_probe_register_prio+0xc0/0x110 kernel/tracepoint.c:464
>  register_trace_prio_sched_switch include/trace/events/sched.h:222 [inline]
>  register_pid_events kernel/trace/trace_events.c:2354 [inline]
>  event_pid_write.isra.0+0x439/0x7a0 kernel/trace/trace_events.c:2425
>  vfs_write+0x24c/0x1150 fs/read_write.c:677
>  ksys_write+0x12b/0x250 fs/read_write.c:731
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> We can reproduce the warning by following the steps below:
> 1. echo 8 >> set_event_notrace_pid. This makes tr->filtered_pids own
>    one pid and registers the sched_switch tracepoint.
> 2. echo ' ' >> set_event_pid, and perform fault injection during chunk
>    allocation of trace_pid_list_alloc. This leaves pid_list with no pid
>    and assigns it to tr->filtered_pids.
> 3. echo ' ' >> set_event_pid. This leaves pid_list NULL and assigns it
>    to tr->filtered_pids.
> 4. echo 9 >> set_event_pid. This triggers the double registration of
>    the sched_switch tracepoint and the warning above.
> 
> The reason is that syzkaller injects a fault into the chunk allocation
> in trace_pid_list_alloc, which causes trace_pid_list_set to fail and
> may lead to a double registration of the same tracepoint. This only
> occurs when the system is about to crash, but to suppress this warning,
> let's handle the trace_pid_list_set failure in trace_pid_write.
> 
> Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering")
> Reported-by: syzbot+161412ccaeff20ce4dde@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/67cb890e.050a0220.d8275.022e.GAE@google.com
> Signed-off-by: Pu Lehui <pulehui@huawei.com>
> ---

FYI, when sending a v2, please state below the three dashes what was
changed since v1. Something like:

Changes since v1: https://lore.kernel.org/all/20250821071721.3609109-1-pulehui@huaweicloud.com/

- Instead of returning -EINVAL before trace_parser_load(), have
  trace_pid_write() return an error when trace_pid_list_set() fails.


I have a Link tag to this email that is added by my scripts, with the idea
that this email will have a link to the previous version and so on. It will
create a chain of the email discussions that lead to what lands in mainline.

-- Steve


>  kernel/trace/trace.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 1b7db732c0b1..f2a84d1ce4b7 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -834,7 +834,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>  		/* copy the current bits to the new max */
>  		ret = trace_pid_list_first(filtered_pids, &pid);
>  		while (!ret) {
> -			trace_pid_list_set(pid_list, pid);
> +			ret = trace_pid_list_set(pid_list, pid);
> +			if (ret < 0)
> +				goto out;
> +
>  			ret = trace_pid_list_next(filtered_pids, pid + 1, &pid);
>  			nr_pids++;
>  		}
> @@ -871,6 +874,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>  		trace_parser_clear(&parser);
>  		ret = 0;
>  	}
> + out:
>  	trace_parser_put(&parser);
>  
>  	if (ret < 0) {

Re: [PATCH v2] tracing: Silence warning when chunk allocation fails in trace_pid_write

Posted by Pu Lehui 3 weeks, 2 days ago

On 2025/9/9 3:02, Steven Rostedt wrote:
> On Mon,  8 Sep 2025 02:46:58 +0000
> Pu Lehui <pulehui@huaweicloud.com> wrote:
> 
>> From: Pu Lehui <pulehui@huawei.com>
>>
>> Syzkaller triggered a fault injection warning:
>>
>> WARNING: CPU: 1 PID: 12326 at tracepoint_add_func+0xbfc/0xeb0
>> Modules linked in:
>> CPU: 1 UID: 0 PID: 12326 Comm: syz.6.10325 Tainted: G U 6.14.0-rc5-syzkaller #0
>> Tainted: [U]=USER
>> Hardware name: Google Compute Engine/Google Compute Engine
>> RIP: 0010:tracepoint_add_func+0xbfc/0xeb0 kernel/tracepoint.c:294
>> Code: 09 fe ff 90 0f 0b 90 0f b6 74 24 43 31 ff 41 bc ea ff ff ff
>> RSP: 0018:ffffc9000414fb48 EFLAGS: 00010283
>> RAX: 00000000000012a1 RBX: ffffffff8e240ae0 RCX: ffffc90014b78000
>> RDX: 0000000000080000 RSI: ffffffff81bbd78b RDI: 0000000000000001
>> RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
>> R10: 0000000000000001 R11: 0000000000000001 R12: ffffffffffffffef
>> R13: 0000000000000000 R14: dffffc0000000000 R15: ffffffff81c264f0
>> FS:  00007f27217f66c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000001b2e80dff8 CR3: 00000000268f8000 CR4: 00000000003526f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>   <TASK>
>>   tracepoint_probe_register_prio+0xc0/0x110 kernel/tracepoint.c:464
>>   register_trace_prio_sched_switch include/trace/events/sched.h:222 [inline]
>>   register_pid_events kernel/trace/trace_events.c:2354 [inline]
>>   event_pid_write.isra.0+0x439/0x7a0 kernel/trace/trace_events.c:2425
>>   vfs_write+0x24c/0x1150 fs/read_write.c:677
>>   ksys_write+0x12b/0x250 fs/read_write.c:731
>>   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>>   do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
>>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>
>> We can reproduce the warning by following the steps below:
>> 1. echo 8 >> set_event_notrace_pid. This makes tr->filtered_pids own
>>     one pid and registers the sched_switch tracepoint.
>> 2. echo ' ' >> set_event_pid, and perform fault injection during chunk
>>     allocation of trace_pid_list_alloc. This leaves pid_list with no pid
>>     and assigns it to tr->filtered_pids.
>> 3. echo ' ' >> set_event_pid. This leaves pid_list NULL and assigns it
>>     to tr->filtered_pids.
>> 4. echo 9 >> set_event_pid. This triggers the double registration of
>>     the sched_switch tracepoint and the warning above.
>>
>> The reason is that syzkaller injects a fault into the chunk allocation
>> in trace_pid_list_alloc, which causes trace_pid_list_set to fail and
>> may lead to a double registration of the same tracepoint. This only
>> occurs when the system is about to crash, but to suppress this warning,
>> let's handle the trace_pid_list_set failure in trace_pid_write.
>>
>> Fixes: 8d6e90983ade ("tracing: Create a sparse bitmask for pid filtering")
>> Reported-by: syzbot+161412ccaeff20ce4dde@syzkaller.appspotmail.com
>> Closes: https://lore.kernel.org/all/67cb890e.050a0220.d8275.022e.GAE@google.com
>> Signed-off-by: Pu Lehui <pulehui@huawei.com>
>> ---
> 
> FYI, when sending a v2, please state below the three dashes what was
> changed since v1. Something like:
> 
> Changes since v1: https://lore.kernel.org/all/20250821071721.3609109-1-pulehui@huaweicloud.com/
> 
> - Instead of returning -EINVAL before trace_parser_load(), have
>    trace_pid_write() return an error when trace_pid_list_set() fails.
> 
> 
> I have a Link tag to this email that is added by my scripts, with the idea
> that this email will have a link to the previous version and so on. It will
> create a chain of the email discussions that lead to what lands in mainline.

It's indeed necessary. Thanks Steven.

> 
> -- Steve
> 
> 
>>   kernel/trace/trace.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
>> index 1b7db732c0b1..f2a84d1ce4b7 100644
>> --- a/kernel/trace/trace.c
>> +++ b/kernel/trace/trace.c
>> @@ -834,7 +834,10 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>>   		/* copy the current bits to the new max */
>>   		ret = trace_pid_list_first(filtered_pids, &pid);
>>   		while (!ret) {
>> -			trace_pid_list_set(pid_list, pid);
>> +			ret = trace_pid_list_set(pid_list, pid);
>> +			if (ret < 0)
>> +				goto out;
>> +
>>   			ret = trace_pid_list_next(filtered_pids, pid + 1, &pid);
>>   			nr_pids++;
>>   		}
>> @@ -871,6 +874,7 @@ int trace_pid_write(struct trace_pid_list *filtered_pids,
>>   		trace_parser_clear(&parser);
>>   		ret = 0;
>>   	}
>> + out:
>>   	trace_parser_put(&parser);
>>   
>>   	if (ret < 0) {

Re: [PATCH v2] tracing: Silence warning when chunk allocation fails in trace_pid_write

Posted by Steven Rostedt 3 weeks, 3 days ago

On Mon, 8 Sep 2025 15:02:57 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I have a Link tag to this email that is added by my scripts, with the idea
> that this email will have a link to the previous version and so on. It will
> create a chain of the email discussions that lead to what lands in mainline.

BTW, I purposely replied to your email to create that chain ;-)

-- Steve