[PATCH v4 0/9] perf tools: Some improvements on data type profiler

Zecheng Li posted 9 patches 2 months ago
tools/perf/arch/x86/annotate/instructions.c | 183 +++++++++++++-
tools/perf/util/annotate-data.c             | 102 ++++++--
tools/perf/util/annotate-data.h             |  14 +-
tools/perf/util/annotate.c                  |  20 ++
tools/perf/util/dwarf-aux.c                 | 266 +++++++++++++++-----
tools/perf/util/dwarf-aux.h                 |   2 +-
6 files changed, 493 insertions(+), 94 deletions(-)
Posted by Zecheng Li 2 months ago
Hi all,

I've identified several missing data type annotations within the perf
tools when annotating the Linux kernel. This patch series improves the
coverage and correctness of data type annotations.

Some patches from the previous version of this series were
cherry-picked. This revision adds new improvements based on feedback and
further development.

Here's a breakdown of the changes in this revision:

Patch 1 skips annotations for LEA instructions in x86, as these do not
involve memory access. It now returns NO_TYPE.
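As an illustration of the idea in patch 1, here is a minimal sketch; it is
not the code from the patch. The function names and the FOUND_TYPE value are
hypothetical, and only the name NO_TYPE appears above (its real definition in
perf may differ). The sketch matches the mnemonic exactly because a plain
prefix match on "lea" would also catch "leave", which does touch the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result codes; only the name NO_TYPE comes from the cover letter. */
enum type_result { FOUND_TYPE, NO_TYPE };

/* True for "lea" with an optional AT&T size suffix (w, l or q). */
static bool insn_is_lea(const char *mnemonic)
{
	size_t len;

	assert(mnemonic != NULL);
	if (strncmp(mnemonic, "lea", 3) != 0)
		return false;

	len = strlen(mnemonic);
	if (len == 3)
		return true;
	return len == 4 && strchr("wlq", mnemonic[3]) != NULL;
}

/*
 * lea only computes an address and never dereferences it, so there is
 * no memory access to attribute a data type to.  FOUND_TYPE stands in
 * for whatever the real type lookup would return.
 */
static enum type_result annotate_insn(const char *mnemonic)
{
	return insn_is_lea(mnemonic) ? NO_TYPE : FOUND_TYPE;
}
```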

Patch 2 implements TSR_KIND_POINTER to represent registers that hold the
memory address of an object of the given type. The pointer size is taken
from the size of void*. This could be improved to use an
architecture-dependent pointer size, but that may require more work.
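The pointer-size caveat can be sketched as follows. This is a simplified
model under my own assumptions, not the perf data structures: the struct,
the enum values and reg_value_size are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified register kinds. */
enum reg_kind {
	KIND_TYPE,	/* register holds a value of the type itself */
	KIND_POINTER,	/* register holds the address of such a value */
};

struct tracked_reg {
	enum reg_kind kind;
	size_t type_size;	/* size in bytes of the tracked type */
};

/*
 * Size of the value held in the register.  For KIND_POINTER this uses
 * the pointer size of the machine running the tool, which is only
 * correct when the analyzed binary has the same pointer width.
 */
static size_t reg_value_size(const struct tracked_reg *reg)
{
	assert(reg != NULL);
	return reg->kind == KIND_POINTER ? sizeof(void *) : reg->type_size;
}
```

An architecture-dependent variant would presumably take the pointer width
from the profiled binary instead of sizeof(void *).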

Patches 3-5 implement a basic approach for register offset tracking that
supports add, sub, and lea operations. The register state is invalidated
when an unsupported arithmetic instruction is encountered. This revision
uses TSR_KIND_POINTER to avoid finding the pointer type in DWARF and
preserves the pointer offset information in the stack state.

Patches 6-8 split patch 8 from v2, with some minor improvements. They skip
check_variable when the type is found directly by register, since
sufficient checking is already performed in match_var_offset and
check_variable lacks some of the DWARF information needed to correctly
determine whether a variable is valid. I also found that members can be
looked up for typedef'd types, so I preserve the typedefs in
match_var_offset.

Patch 9 implements support for DW_OP_piece. Currently, DW_OP_piece is
allowed in check_allowed_ops but is handled like any other single-location
expression. This patch splits any expression containing DW_OP_piece into
multiple parts and handles each part separately.
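To show what splitting on DW_OP_piece means, here is a minimal sketch. It is
not the dwarf-aux.c implementation: struct piece, read_uleb128 and
split_pieces are hypothetical names, and only DW_OP_regN locations are
handled. The opcode values are the ones defined by the DWARF standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DW_OP_reg0	0x50
#define DW_OP_reg31	0x6f
#define DW_OP_piece	0x93

/* One part of a composite location (hypothetical layout). */
struct piece {
	int reg;		/* DWARF register number, -1 if the part has no location */
	uint64_t offset;	/* byte offset of this part within the variable */
	uint64_t size;		/* size of this part in bytes */
};

/* Decode a ULEB128 value; returns bytes consumed, 0 if truncated. */
static size_t read_uleb128(const uint8_t *p, const uint8_t *end, uint64_t *val)
{
	size_t n = 0;
	unsigned int shift = 0;

	*val = 0;
	while (p + n < end) {
		uint8_t byte = p[n++];

		if (shift < 64)
			*val |= (uint64_t)(byte & 0x7f) << shift;
		shift += 7;
		if (!(byte & 0x80))
			return n;
	}
	return 0;
}

/* Split "[DW_OP_regN] DW_OP_piece <size>" sequences; returns count or -1. */
static int split_pieces(const uint8_t *expr, size_t len,
			struct piece *out, int max)
{
	const uint8_t *p = expr, *end = expr + len;
	uint64_t offset = 0;
	int n = 0;

	assert(expr != NULL && out != NULL);
	while (p < end) {
		uint64_t size;
		size_t used;
		int reg = -1;

		if (*p >= DW_OP_reg0 && *p <= DW_OP_reg31)
			reg = *p++ - DW_OP_reg0;
		if (p == end || *p != DW_OP_piece)
			return -1;	/* not a composite this sketch understands */
		p++;
		used = read_uleb128(p, end, &size);
		if (used == 0 || n == max)
			return -1;
		p += used;
		out[n].reg = reg;
		out[n].offset = offset;
		out[n].size = size;
		n++;
		offset += size;
	}
	return n;
}
```

For example, the bytes 0x50 0x93 0x08 0x52 0x93 0x04 describe a 12-byte
variable whose first 8 bytes live in register 0 and whose last 4 bytes live
in register 2. Treating that as one single-location expression would
attribute the whole variable to a single register.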

I have tested each patch on a vmlinux and manually checked the results.
After applying all patches, there are fewer missing or incorrect
annotations. No obvious regressions were observed.

v4:
Merged patch in v3:
perf annotate: Rename TSR_KIND_POINTER to TSR_KIND_PERCPU_POINTER

Updated patches 1-5 based on the feedback from Namhyung.

v3:
https://lore.kernel.org/all/20250917195808.2514277-1-zecheng@google.com/
Merged patches in v2:

perf dwarf-aux: Use signed variable types in match_var_offset
perf dwarf-aux: More accurate variable type match for breg
perf dwarf-aux: Better variable collection for insn tracking
perf dwarf-aux: Skip check_variable for die_find_variable_by_reg

v2:
https://lore.kernel.org/all/20250825195412.223077-1-zecheng@google.com/
1. update the match_var_offset function signature to s64
2. correct the comment for is_breg_access_indirect. Use simpler logic to
match the expressions we support.
3. add is_reg_var_addr to indicate whether a register holds an address
of the variable. This defers the type dereference logic to
update_var_state.
4. invalidate register state for unsupported instructions.
5. include two new patches related to improving data type profiler.

v1:
https://lore.kernel.org/linux-perf-users/20250725202809.1230085-1-zecheng@google.com/

Zecheng Li (9):
  perf annotate: Skip annotating data types to lea instructions
  perf annotate: Track address registers via TSR_KIND_POINTER
  perf annotate: Track arithmetic instructions on pointers
  perf annotate: Save pointer offset in stack state
  perf annotate: Invalidate register states for untracked instructions
  perf dwarf-aux: Skip check_variable for die_find_variable_by_reg
  perf dwarf-aux: Preserve typedefs in match_var_offset
  perf annotate: Improve type comparison from different scopes
  perf dwarf-aux: Support DW_OP_piece expressions

 tools/perf/arch/x86/annotate/instructions.c | 183 +++++++++++++-
 tools/perf/util/annotate-data.c             | 102 ++++++--
 tools/perf/util/annotate-data.h             |  14 +-
 tools/perf/util/annotate.c                  |  20 ++
 tools/perf/util/dwarf-aux.c                 | 266 +++++++++++++++-----
 tools/perf/util/dwarf-aux.h                 |   2 +-
 6 files changed, 493 insertions(+), 94 deletions(-)

-- 
2.51.0
Re: [PATCH v4 0/9] perf tools: Some improvements on data type profiler
Posted by Namhyung Kim 1 month, 3 weeks ago
Hello,

On Mon, Oct 13, 2025 at 06:15:57PM +0000, Zecheng Li wrote:
> Hi all,
> 
> I've identified several missing data type annotations within the perf
> tools when annotating the Linux kernel. This patch series improves the
> coverage and correctness of data type annotations.
> 
> Some patches from the previous version of this series were
> cherry-picked. This revision adds new improvements based on feedback and
> further development.
> 
> Here's a breakdown of the changes in this revision:
> 
> Patch 1 skips annotations for LEA instructions in x86, as these do not
> involve memory access. It now returns NO_TYPE.
> 
> Patch 2 implements TSR_KIND_POINTER to represent registers that hold the
> memory address of an object of the given type. The pointer size is taken
> from the size of void*. This could be improved to use an
> architecture-dependent pointer size, but that may require more work.
> 
> Patches 3-5 implement a basic approach for register offset tracking that
> supports add, sub, and lea operations. The register state is invalidated
> when an unsupported arithmetic instruction is encountered. This revision
> uses TSR_KIND_POINTER to avoid finding the pointer type in DWARF and
> preserves the pointer offset information in the stack state.

I've applied up to this to perf-tools-next, will review the rest later.

Thanks,
Namhyung
