delay accounting started populating taskstats records with a valid
version field via fill_pid() and fill_tgid().
Later, commit ad4ecbcba728 ("[PATCH] delay accounting taskstats
interface send tgid once") changed the TGID exit path to send the
cached signal->stats aggregate directly instead of building the outgoing
record through fill_tgid(). Unlike fill_tgid(), fill_tgid_exit() only
accumulates accounting data and never initializes stats->version.
As a result, TGID exit notifications can reach userspace with
version == 0 even though PID exit notifications and
TASKSTATS_CMD_GET replies carry a valid taskstats version.
Set stats->version = TASKSTATS_VERSION after copying the cached TGID
aggregate into the outgoing netlink payload so all taskstats records are
self-describing again.
Fixes: ad4ecbcba728 ("[PATCH] delay accounting taskstats interface send tgid once")
Cc: stable@vger.kernel.org
Signed-off-by: Yiyang Chen <cyyzero16@gmail.com>
---
kernel/taskstats.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 0cd680ccc7e5..73bd6a6a7893 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -649,6 +649,7 @@ void taskstats_exit(struct task_struct *tsk, int group_dead)
 		goto err;
 
 	memcpy(stats, tsk->signal->stats, sizeof(*stats));
+	stats->version = TASKSTATS_VERSION;
 
 send:
 	send_cpu_listeners(rep_skb, listeners);
--
2.43.0
On Mon, 30 Mar 2026 03:00:40 +0800 Yiyang Chen <cyyzero16@gmail.com> wrote:
> delay accounting started populating taskstats records with a valid
> version field via fill_pid() and fill_tgid().
>
> Later, commit ad4ecbcba728 ("[PATCH] delay accounting taskstats
> interface send tgid once") changed the TGID exit path to send the
> cached signal->stats aggregate directly instead of building the outgoing
> record through fill_tgid(). Unlike fill_tgid(), fill_tgid_exit() only
> accumulates accounting data and never initializes stats->version.
>
> As a result, TGID exit notifications can reach userspace with
> version == 0 even though PID exit notifications and
> TASKSTATS_CMD_GET replies carry a valid taskstats version.
>
> Set stats->version = TASKSTATS_VERSION after copying the cached TGID
> aggregate into the outgoing netlink payload so all taskstats records are
> self-describing again.
>
> Fixes: ad4ecbcba728 ("[PATCH] delay accounting taskstats interface send tgid once")
Thanks, lol, 20 years ago.
Can you explain how others can trigger this? Some combination of
steps which results in the bad output?
> Cc: stable@vger.kernel.org
Is there a chance of breaking existing userspace here? Some existing
userspace code which is expecting 0 here and will get surprised by this
change?
> --- a/kernel/taskstats.c
> +++ b/kernel/taskstats.c
> @@ -649,6 +649,7 @@ void taskstats_exit(struct task_struct *tsk, int group_dead)
>  		goto err;
>  
>  	memcpy(stats, tsk->signal->stats, sizeof(*stats));
> +	stats->version = TASKSTATS_VERSION;
>  
>  send:
>  	send_cpu_listeners(rep_skb, listeners);
On Tue, Mar 31, 2026 at 5:29 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 30 Mar 2026 03:00:40 +0800 Yiyang Chen <cyyzero16@gmail.com> wrote:
>
> > delay accounting started populating taskstats records with a valid
> > version field via fill_pid() and fill_tgid().
> >
> > Later, commit ad4ecbcba728 ("[PATCH] delay accounting taskstats
> > interface send tgid once") changed the TGID exit path to send the
> > cached signal->stats aggregate directly instead of building the outgoing
> > record through fill_tgid(). Unlike fill_tgid(), fill_tgid_exit() only
> > accumulates accounting data and never initializes stats->version.
> >
> > As a result, TGID exit notifications can reach userspace with
> > version == 0 even though PID exit notifications and
> > TASKSTATS_CMD_GET replies carry a valid taskstats version.
> >
> > Set stats->version = TASKSTATS_VERSION after copying the cached TGID
> > aggregate into the outgoing netlink payload so all taskstats records are
> > self-describing again.
> >
> > Fixes: ad4ecbcba728 ("[PATCH] delay accounting taskstats interface send tgid once")
>
> Thanks, lol, 20 years ago.
>
> Can you explain how others can trigger this? Some combination of
> steps which results in the bad output?
Yes. This is easy to reproduce with `tools/accounting/getdelays.c`.
I have a small follow-up patch for that tool which:
1. increases the receive buffer/message size so the pid+tgid combined exit
notification is not dropped/truncated
2. prints `stats->version`.
With that patch, the reproducer is:
Terminal 1:
./getdelays -d -v -l -m 0
Terminal 2:
taskset -c 0 python3 -c 'import threading,time; t=threading.Thread(target=time.sleep,args=(0.1,)); t.start(); t.join()'
That produces both PID and TGID exit notifications for the same process. The PID
exit record reports a valid taskstats version, while the TGID exit record reports
`version 0`.
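
For readers without a test kernel at hand, the difference between the two
records can be modeled in plain userspace C. This is only a sketch of the
mechanism described in the patch, not kernel code: `struct model_taskstats`,
`MODEL_TASKSTATS_VERSION` and the `model_*` functions are hypothetical
stand-ins, and the version value is a placeholder rather than the kernel's.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in <linux/taskstats.h>. */
#define MODEL_TASKSTATS_VERSION 1	/* placeholder value, not the kernel's */

struct model_taskstats {
	unsigned short version;
	unsigned long long cpu_delay_total;
};

/* Models fill_tgid_exit(): adds one thread's data to the cached aggregate
 * and never touches the version field. */
static void model_fill_tgid_exit(struct model_taskstats *cached,
				 unsigned long long thread_delay)
{
	cached->cpu_delay_total += thread_delay;
}

/* Models the TGID branch of taskstats_exit(): copy the cached aggregate
 * into the outgoing record. set_version != 0 models the patched kernel. */
static void model_tgid_exit(struct model_taskstats *out,
			    const struct model_taskstats *cached,
			    int set_version)
{
	memcpy(out, cached, sizeof(*out));
	if (set_version)
		out->version = MODEL_TASKSTATS_VERSION;
}

/* Walks through both cases for a two-thread process. */
void model_selfcheck(void)
{
	struct model_taskstats cached = {0}, unpatched, patched;

	model_fill_tgid_exit(&cached, 100);
	model_fill_tgid_exit(&cached, 250);

	model_tgid_exit(&unpatched, &cached, 0);
	model_tgid_exit(&patched, &cached, 1);

	/* Accounting data is identical; only the version differs. */
	assert(unpatched.cpu_delay_total == patched.cpu_delay_total);
	assert(unpatched.version == 0);
	assert(patched.version == MODEL_TASKSTATS_VERSION);
}
```

In this model the unpatched copy carries whatever version the cached aggregate
had, which is 0 because nothing on the accumulation path ever sets it.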
>
> > Cc: stable@vger.kernel.org
>
> Is there a chance of breaking existing userspace here? Some existing
> userspace code which is expecting 0 here and will get surprised by this
> change?
In practice, userspace uses `taskstats.version` to decide which fields are
present in `struct taskstats`, i.e. as a schema/version discriminator. A zero
version does not describe a valid taskstats layout, so it is hard to see how
userspace could use `0` as a meaningful or useful distinction here.
So I do not think fixing this in mainline should break sensible userspace. It
just restores consistency of the taskstats version semantics across
`TASKSTATS_CMD_GET`, PID exit notifications, and TGID exit notifications.
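
A minimal sketch of what "version as a schema discriminator" means for a
consumer, assuming a made-up layout history: `MODEL_NEWER_FIELD_SINCE`,
`enum model_layout` and `model_classify()` are hypothetical names used only
for illustration and do not correspond to real taskstats version numbers.

```c
#include <assert.h>

/* Hypothetical: the first version in which some newer field exists. */
#define MODEL_NEWER_FIELD_SINCE 5

enum model_layout {
	MODEL_LAYOUT_UNKNOWN,	/* version does not describe any layout */
	MODEL_LAYOUT_BASE,	/* only the original fields are present */
	MODEL_LAYOUT_NEWER,	/* the newer field is present as well */
};

/* Decide which fields a consumer may read, based only on the version. */
enum model_layout model_classify(unsigned short version)
{
	if (version == 0)
		return MODEL_LAYOUT_UNKNOWN;
	if (version < MODEL_NEWER_FIELD_SINCE)
		return MODEL_LAYOUT_BASE;
	return MODEL_LAYOUT_NEWER;
}
```

A consumer written this way cannot safely interpret a record with version 0,
which is why a zero version is unlikely to be something userspace relies on.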
To be honest, I'm also not sure whether this should be backported to stable.
But I think mainline should still fix it.
Thanks,
Yiyang