[RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 3 months, 3 weeks ago
Hi,

Version v2 of this RFC addresses all comments from Josh and Peter,
in particular:

- add --disas option to disassemble functions
- do not fail the build if libopcodes is not available. Instead, objtool
  is built without disassembly support and prints a warning message if
  disassembly is requested.

  Example:
  $ ./tools/objtool/objtool --disas --link vmlinux.o
  vmlinux.o: warning: objtool: Rebuild with libopcodes for disassembly support
  
- remove dbuffer
- rename VTRACE* to TRACE*
- add trace.[ch] for trace-related functions and macros


This RFC makes the following changes to objtool:

- Disassemble code with libopcodes instead of running objdump

  objtool currently executes the objdump command to disassemble code. In
  particular, if objtool fails to validate a function, it uses objdump to
  disassemble the entire file, which is not very helpful when processing
  a large file (like vmlinux.o).

  Using libopcodes provides more control over the disassembly scope and
  output: a single instruction or a single function can be disassembled.
  Now, when objtool fails to validate a function, it disassembles just
  that function instead of the entire file. (A minimal libopcodes sketch
  is included after this list.)

- Add the --trace <function> option to trace function validation

  Figuring out why a function validation has failed can be difficult because
  objtool checks all code flows (including alternatives) and maintains
  per-instruction state (in particular call frame information).

  The trace option makes it possible to follow the function validation done
  by objtool instruction by instruction, see what objtool is doing, and get
  function validation information. An output example is shown below (Example 1).

- Add the --disas <function> option to disassemble functions

  Disassembly is done with libopcodes and shows the different code
  alternatives (see Example 2).
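
For reference, here is a minimal, self-contained sketch of the libopcodes
flow described above. It is not the objtool code from this series; it assumes
binutils >= 2.39 (the 4-argument init_disassemble_info(), which is what the
disassembler-init-styled feature test distinguishes) and that passing a NULL
bfd to disassembler() is acceptable for x86. It typically links with
-lopcodes -lbfd.

#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <bfd.h>
#include <dis-asm.h>

/* Styled output callback (binutils >= 2.39); just forward to the stream. */
static int styled_printf(void *stream, enum disassembler_style style,
			 const char *fmt, ...)
{
	va_list ap;
	int ret;

	(void)style;	/* colour/style information is ignored in this sketch */
	va_start(ap, fmt);
	ret = vfprintf(stream, fmt, ap);
	va_end(ap);
	return ret;
}

int main(void)
{
	/* A few x86-64 bytes: "xchg %ax,%ax" (2-byte nop) followed by "ret". */
	unsigned char code[] = { 0x66, 0x90, 0xc3 };
	struct disassemble_info info;
	disassembler_ftype disasm;
	bfd_vma pc = 0;

	bfd_init();
	init_disassemble_info(&info, stdout, (fprintf_ftype)fprintf,
			      styled_printf);
	info.arch = bfd_arch_i386;
	info.mach = bfd_mach_x86_64;

	/* Disassemble straight from a memory buffer, no objdump involved. */
	info.buffer = code;
	info.buffer_vma = 0;
	info.buffer_length = sizeof(code);

	disasm = disassembler(bfd_arch_i386, false, bfd_mach_x86_64, NULL);
	if (!disasm)
		return 1;

	while (pc < sizeof(code)) {
		int len;

		printf("%4lx:  ", (unsigned long)pc);
		len = disasm(pc, &info);	/* prints one insn, returns its length */
		printf("\n");
		if (len <= 0)
			break;
		pc += len;
	}
	return 0;
}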

Note: some changes are architecture-specific (x86, powerpc, loongarch). Any
feedback about the behavior on powerpc and loongarch is welcome.

Thanks,

alex.


-----

Examples:
=========

Example 1 (--trace option): Trace the validation of the os_xsave() function
----------------------------------------------------------------------------

$ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr --hacks=skylake --ibt --orc --retpoline --rethunk --sls --static-call --uaccess --prefix=16 --link --trace os_xsave -v vmlinux.o
os_xsave: validation begin
 65c20: os_xsave+0x0  push %r12				- state: cfa=rsp+16 r12=(cfa-16) stack_size=16 
 65c22: os_xsave+0x2  mov 0x0(%rip),%eax # alternatives_patched 
 65c28: os_xsave+0x8  push %rbp 			- state: cfa=rsp+24 rbp=(cfa-24) stack_size=24 
 65c29: os_xsave+0x9  mov %rdi,%rbp 
 65c2c: os_xsave+0xc  push %rbx 			- state: cfa=rsp+32 rbx=(cfa-32) stack_size=32 
 65c2d: os_xsave+0xd  mov 0x8(%rdi),%rbx 
 65c31: os_xsave+0x11 mov %rbx,%r12 
 65c34: os_xsave+0x14 shr $0x20,%r12 
 65c38: os_xsave+0x18 test %eax,%eax 
 65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> 		- jump taken
 65c6a: os_xsave+0x4a | ud2 
 65c6c: os_xsave+0x4c | jmp 65c3c <os_xsave+0x1c> 	- unconditional jump
 65c3c: os_xsave+0x1c | xor %edx,%edx 
 65c3e: os_xsave+0x1e | mov %rbx,%rsi 
 65c41: os_xsave+0x21 | mov %rbp,%rdi 
 65c44: os_xsave+0x24 | callq xfd_validate_state 	- call
 65c49: os_xsave+0x29 | mov %ebx,%eax 
 65c4b: os_xsave+0x2b | mov %r12d,%edx 
 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 1/4 begin
 65c55: os_xsave+0x35 | | test %ebx,%ebx 
 65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump taken
 65c6e: os_xsave+0x4e | | | ud2 
 65c70: os_xsave+0x50 | | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24 
 65c71: os_xsave+0x51 | | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16 
 65c72: os_xsave+0x52 | | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8 
 65c74: os_xsave+0x54 | | | xor %eax,%eax 
 65c76: os_xsave+0x56 | | | xor %edx,%edx 
 65c78: os_xsave+0x58 | | | xor %esi,%esi 
 65c7a: os_xsave+0x5a | | | xor %edi,%edi 
 65c7c: os_xsave+0x5c | | | jmpq __x86_return_thunk 	- return
 65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump not taken
 65c59: os_xsave+0x39 | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24 
 65c5a: os_xsave+0x3a | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16 
 65c5b: os_xsave+0x3b | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8 
 65c5d: os_xsave+0x3d | | xor %eax,%eax 
 65c5f: os_xsave+0x3f | | xor %edx,%edx 
 65c61: os_xsave+0x41 | | xor %esi,%esi 
 65c63: os_xsave+0x43 | | xor %edi,%edi 
 65c65: os_xsave+0x45 | | jmpq __x86_return_thunk - return
                      | <alternative.65c4e> alt 1/4 end
 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
 1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp) 
 65c53: os_xsave+0x33 | | xor %ebx,%ebx 
 65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
                      | <alternative.65c4e> alt 2/4 end
 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
 1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp) 
 65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
                      | <alternative.65c4e> alt 3/4 end
 65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
 1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp) 
 65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
                      | <alternative.65c4e> alt 4/4 end
 65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
 65c4e: os_xsave+0x2e | xsave64 0x40(%rbp) 
 65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
 65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> - jump not taken
 65c3c: os_xsave+0x1c xor %edx,%edx - already visited
os_xsave: validation end


Example 2 (--disas option): Disassemble perf_get_x86_pmu_capability()
---------------------------------------------------------------------

$ ./tools/objtool/objtool --disas=perf_get_x86_pmu_capability --link vmlinux.o
perf_get_x86_pmu_capability:
  d000:  perf_get_x86_pmu_capability         endbr64                                                   
  d004:  perf_get_x86_pmu_capability+0x4     callq  __fentry__                                         
  d009:  perf_get_x86_pmu_capability+0x9     mov    %rdi,%rdx                                          
                                             <alternative.d00c> default - begin
  d00c:  perf_get_x86_pmu_capability+0xc     | jmpq   .altinstr_aux+0x90                               
                                             <alternative.d00c> default - end
                                             <alternative.d00c> 1/2 - begin
                                             | <fake nop> (5 bytes)                                    
                                             <alternative.d00c> 1/2 end
                                             <alternative.d00c> 2/2 - begin
   5e5:  .altinstr_replacement+0x5e5         | jmpq   perf_get_x86_pmu_capability+0x3f                 
                                             <alternative.d00c> 2/2 end
  d011:  perf_get_x86_pmu_capability+0x11    ud2                                                       
  d013:  perf_get_x86_pmu_capability+0x13    movq   $0x0,(%rdx)                                        
  d01a:  perf_get_x86_pmu_capability+0x1a    movq   $0x0,0x8(%rdx)                                     
  d022:  perf_get_x86_pmu_capability+0x22    movq   $0x0,0x10(%rdx)                                    
  d02a:  perf_get_x86_pmu_capability+0x2a    movq   $0x0,0x18(%rdx)                                    
  d032:  perf_get_x86_pmu_capability+0x32    xor    %eax,%eax                                          
  d034:  perf_get_x86_pmu_capability+0x34    xor    %edx,%edx                                          
  d036:  perf_get_x86_pmu_capability+0x36    xor    %ecx,%ecx                                          
  d038:  perf_get_x86_pmu_capability+0x38    xor    %edi,%edi                                          
  d03a:  perf_get_x86_pmu_capability+0x3a    jmpq   __x86_return_thunk                                 
  d03f:  perf_get_x86_pmu_capability+0x3f    cmpq   $0x0,0x0(%rip)        # x86_pmu+0x10               
  d047:  perf_get_x86_pmu_capability+0x47    je     d013 <perf_get_x86_pmu_capability+0x13>            
  d049:  perf_get_x86_pmu_capability+0x49    mov    0x0(%rip),%eax        # x86_pmu+0x8                
  d04f:  perf_get_x86_pmu_capability+0x4f    mov    %eax,(%rdi)                                        
                                             <jump alternative.d051> default
  d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax                                          
                                             <jump alternative.d051> else
  d051:  perf_get_x86_pmu_capability+0x51    | jmp    d053 <perf_get_x86_pmu_capability+0x53>
                                             <jump alternative.d051> end
  d053:  perf_get_x86_pmu_capability+0x53    mov    0x0(%rip),%rdi        # x86_pmu+0xd8               
                                             <alternative.d05a> default - begin
  d05a:  perf_get_x86_pmu_capability+0x5a    | callq  __sw_hweight64                                   
                                             <alternative.d05a> default - end
                                             <alternative.d05a> 1/1 - begin
   5ea:  .altinstr_replacement+0x5ea         | popcnt %rdi,%rax                                        
                                             <alternative.d05a> 1/1 end
  d05f:  perf_get_x86_pmu_capability+0x5f    mov    %eax,0x4(%rdx)                                     
                                             <jump alternative.d062> default
  d062:  perf_get_x86_pmu_capability+0x62    | xchg   %ax,%ax                                          
                                             <jump alternative.d062> else
  d062:  perf_get_x86_pmu_capability+0x62    | jmp    d064 <perf_get_x86_pmu_capability+0x64>
                                             <jump alternative.d062> end
  d064:  perf_get_x86_pmu_capability+0x64    mov    0x0(%rip),%rdi        # x86_pmu+0xe0               
                                             <alternative.d06b> default - begin
  d06b:  perf_get_x86_pmu_capability+0x6b    | callq  __sw_hweight64                                   
                                             <alternative.d06b> default - end
                                             <alternative.d06b> 1/1 - begin
   5ef:  .altinstr_replacement+0x5ef         | popcnt %rdi,%rax                                        
                                             <alternative.d06b> 1/1 end
  d070:  perf_get_x86_pmu_capability+0x70    mov    %eax,0x8(%rdx)                                     
  d073:  perf_get_x86_pmu_capability+0x73    mov    0x0(%rip),%eax        # x86_pmu+0xf8               
  d079:  perf_get_x86_pmu_capability+0x79    mov    %eax,0xc(%rdx)                                     
  d07c:  perf_get_x86_pmu_capability+0x7c    mov    %eax,0x10(%rdx)                                    
  d07f:  perf_get_x86_pmu_capability+0x7f    mov    0x0(%rip),%rax        # x86_pmu+0x108              
  d086:  perf_get_x86_pmu_capability+0x86    mov    %eax,0x14(%rdx)                                    
  d089:  perf_get_x86_pmu_capability+0x89    mov    0x0(%rip),%eax        # x86_pmu+0x110              
  d08f:  perf_get_x86_pmu_capability+0x8f    mov    %eax,0x18(%rdx)                                    
  d092:  perf_get_x86_pmu_capability+0x92    movzbl 0x0(%rip),%ecx        # x86_pmu+0x1d1              
  d099:  perf_get_x86_pmu_capability+0x99    movzbl 0x1c(%rdx),%eax                                    
  d09d:  perf_get_x86_pmu_capability+0x9d    shr    %cl                                                
  d09f:  perf_get_x86_pmu_capability+0x9f    and    $0x1,%ecx                                          
  d0a2:  perf_get_x86_pmu_capability+0xa2    and    $0xfffffffe,%eax                                   
  d0a5:  perf_get_x86_pmu_capability+0xa5    or     %ecx,%eax                                          
  d0a7:  perf_get_x86_pmu_capability+0xa7    mov    %al,0x1c(%rdx)                                     
  d0aa:  perf_get_x86_pmu_capability+0xaa    xor    %eax,%eax                                          
  d0ac:  perf_get_x86_pmu_capability+0xac    xor    %edx,%edx                                          
  d0ae:  perf_get_x86_pmu_capability+0xae    xor    %ecx,%ecx                                          
  d0b0:  perf_get_x86_pmu_capability+0xb0    xor    %edi,%edi                                          
  d0b2:  perf_get_x86_pmu_capability+0xb2    jmpq   __x86_return_thunk                                 


-----

Alexandre Chartre (17):
  objtool: Move disassembly functions to a separate file
  objtool: Create disassembly context
  objtool: Disassemble code with libopcodes instead of running objdump
  tool build: Remove annoying newline in build output
  objtool: Print symbol during disassembly
  objtool: Improve offstr() output
  objtool: Store instruction disassembly result
  objtool: Disassemble instruction on warning or backtrace
  objtool: Extract code to validate instruction from the validate branch
    loop
  objtool: Record symbol name max length
  objtool: Add option to trace function validation
  objtool: Trace instruction state changes during function validation
  objtool: Improve register reporting during function validation
  objtool: Improve tracing of alternative instructions
  objtool: Do not validate IBT for .return_sites and .call_sites
  objtool: Add the --disas=<function-pattern> action
  objtool: Disassemble all alternatives when using --disas

 tools/build/Makefile.feature            |   4 +-
 tools/objtool/Build                     |   3 +
 tools/objtool/Makefile                  |  22 +
 tools/objtool/arch/loongarch/decode.c   |  30 ++
 tools/objtool/arch/powerpc/decode.c     |  31 ++
 tools/objtool/arch/x86/decode.c         |  34 ++
 tools/objtool/builtin-check.c           |   5 +-
 tools/objtool/check.c                   | 687 ++++++++++++++----------
 tools/objtool/disas.c                   | 569 ++++++++++++++++++++
 tools/objtool/include/objtool/arch.h    |  13 +
 tools/objtool/include/objtool/builtin.h |   2 +
 tools/objtool/include/objtool/check.h   |  32 +-
 tools/objtool/include/objtool/disas.h   |  69 +++
 tools/objtool/include/objtool/trace.h   | 111 ++++
 tools/objtool/include/objtool/warn.h    |  33 +-
 tools/objtool/trace.c                   | 147 +++++
 16 files changed, 1489 insertions(+), 303 deletions(-)
 create mode 100644 tools/objtool/disas.c
 create mode 100644 tools/objtool/include/objtool/disas.h
 create mode 100644 tools/objtool/include/objtool/trace.h
 create mode 100644 tools/objtool/trace.c

-- 
2.43.5
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
Sorry, but I hadn't seen these patches until Josh replied to them just
now :/

In general I really like the direction this is taking; I agree with Josh
on the +0 thing, in general foo and foo+0 are not the same. Where foo
can be the entire function, foo+0 is always the first instruction.

Furthermore I have some nits about the output, but I don't think we
should hold up merging these patches for that; it could just be me
needing to rewire my brain or something ;-)

> Example 1 (--trace option): Trace the validation of the os_save() function
> --------------------------------------------------------------------------
> 
> $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr --hacks=skylake --ibt --orc --retpoline --rethunk --sls --static-call --uaccess --prefix=16 --link --trace os_xsave -v vmlinux.o
> os_xsave: validation begin
>  65c20: os_xsave+0x0  push %r12				- state: cfa=rsp+16 r12=(cfa-16) stack_size=16 
>  65c22: os_xsave+0x2  mov 0x0(%rip),%eax # alternatives_patched 
>  65c28: os_xsave+0x8  push %rbp 			- state: cfa=rsp+24 rbp=(cfa-24) stack_size=24 
>  65c29: os_xsave+0x9  mov %rdi,%rbp 
>  65c2c: os_xsave+0xc  push %rbx 			- state: cfa=rsp+32 rbx=(cfa-32) stack_size=32 
>  65c2d: os_xsave+0xd  mov 0x8(%rdi),%rbx 
>  65c31: os_xsave+0x11 mov %rbx,%r12 
>  65c34: os_xsave+0x14 shr $0x20,%r12 
>  65c38: os_xsave+0x18 test %eax,%eax 
>  65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> 		- jump taken
>  65c6a: os_xsave+0x4a | ud2 
>  65c6c: os_xsave+0x4c | jmp 65c3c <os_xsave+0x1c> 	- unconditional jump
>  65c3c: os_xsave+0x1c | xor %edx,%edx 
>  65c3e: os_xsave+0x1e | mov %rbx,%rsi 
>  65c41: os_xsave+0x21 | mov %rbp,%rdi 
>  65c44: os_xsave+0x24 | callq xfd_validate_state 	- call
>  65c49: os_xsave+0x29 | mov %ebx,%eax 
>  65c4b: os_xsave+0x2b | mov %r12d,%edx 
>  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 1/4 begin
>  65c55: os_xsave+0x35 | | test %ebx,%ebx 
>  65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump taken
>  65c6e: os_xsave+0x4e | | | ud2 
>  65c70: os_xsave+0x50 | | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24 
>  65c71: os_xsave+0x51 | | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16 
>  65c72: os_xsave+0x52 | | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8 
>  65c74: os_xsave+0x54 | | | xor %eax,%eax 
>  65c76: os_xsave+0x56 | | | xor %edx,%edx 
>  65c78: os_xsave+0x58 | | | xor %esi,%esi 
>  65c7a: os_xsave+0x5a | | | xor %edi,%edi 
>  65c7c: os_xsave+0x5c | | | jmpq __x86_return_thunk 	- return
>  65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump not taken
>  65c59: os_xsave+0x39 | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24 
>  65c5a: os_xsave+0x3a | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16 
>  65c5b: os_xsave+0x3b | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8 
>  65c5d: os_xsave+0x3d | | xor %eax,%eax 
>  65c5f: os_xsave+0x3f | | xor %edx,%edx 
>  65c61: os_xsave+0x41 | | xor %esi,%esi 
>  65c63: os_xsave+0x43 | | xor %edi,%edi 
>  65c65: os_xsave+0x45 | | jmpq __x86_return_thunk - return

>                       | <alternative.65c4e> alt 1/4 end
>  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
>  1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp) 
>  65c53: os_xsave+0x33 | | xor %ebx,%ebx 
>  65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
>                       | <alternative.65c4e> alt 2/4 end
>  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
>  1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp) 
>  65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>                       | <alternative.65c4e> alt 3/4 end
>  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
>  1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp) 
>  65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>                       | <alternative.65c4e> alt 4/4 end
>  65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
>  65c4e: os_xsave+0x2e | xsave64 0x40(%rbp) 
>  65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited

I find it *very* hard to read these alternatives. If at all possible, I
think something like:

   65c4e: os_xsave+0x2e | xsave64	| xsaveopt64	| xsavec64	| xsaves64
   65c53: os_xsave+0x33 | xor %ebx,%ebx

Would be *much* easier to follow

>  65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> - jump not taken
>  65c3c: os_xsave+0x1c xor %edx,%edx - already visited
> os_xsave: validation end
> 
> 
> Example 2 (--disas option): Disassemble perf_get_x86_pmu_capability()
> ---------------------------------------------------------------------
> 
> $ ./tools/objtool/objtool --disas=perf_get_x86_pmu_capability --link vmlinux.o
> perf_get_x86_pmu_capability:
>   d000:  perf_get_x86_pmu_capability         endbr64                                                   
>   d004:  perf_get_x86_pmu_capability+0x4     callq  __fentry__                                         
>   d009:  perf_get_x86_pmu_capability+0x9     mov    %rdi,%rdx                                          
>                                              <alternative.d00c> default - begin
>   d00c:  perf_get_x86_pmu_capability+0xc     | jmpq   .altinstr_aux+0x90                               

(you probably need to relocate the target -- we never jump into altinstr)

>                                              <alternative.d00c> default - end
>                                              <alternative.d00c> 1/2 - begin
>                                              | <fake nop> (5 bytes)                                    
>                                              <alternative.d00c> 1/2 end
>                                              <alternative.d00c> 2/2 - begin
>    5e5:  .altinstr_replacement+0x5e5         | jmpq   perf_get_x86_pmu_capability+0x3f                 
>                                              <alternative.d00c> 2/2 end

Idem; the above is *really* hard to decipher.

  d00c:  perf_get_x86_pmu_capability+0xc | jmpq   .altinstr_aux+0x90 | nop5 | jmpq   perf_get_x86_pmu_capability+0x3f

>   d011:  perf_get_x86_pmu_capability+0x11    ud2                                                       
>   d013:  perf_get_x86_pmu_capability+0x13    movq   $0x0,(%rdx)                                        
>   d01a:  perf_get_x86_pmu_capability+0x1a    movq   $0x0,0x8(%rdx)                                     
>   d022:  perf_get_x86_pmu_capability+0x22    movq   $0x0,0x10(%rdx)                                    
>   d02a:  perf_get_x86_pmu_capability+0x2a    movq   $0x0,0x18(%rdx)                                    
>   d032:  perf_get_x86_pmu_capability+0x32    xor    %eax,%eax                                          
>   d034:  perf_get_x86_pmu_capability+0x34    xor    %edx,%edx                                          
>   d036:  perf_get_x86_pmu_capability+0x36    xor    %ecx,%ecx                                          
>   d038:  perf_get_x86_pmu_capability+0x38    xor    %edi,%edi                                          
>   d03a:  perf_get_x86_pmu_capability+0x3a    jmpq   __x86_return_thunk                                 
>   d03f:  perf_get_x86_pmu_capability+0x3f    cmpq   $0x0,0x0(%rip)        # x86_pmu+0x10               
>   d047:  perf_get_x86_pmu_capability+0x47    je     d013 <perf_get_x86_pmu_capability+0x13>            
>   d049:  perf_get_x86_pmu_capability+0x49    mov    0x0(%rip),%eax        # x86_pmu+0x8                
>   d04f:  perf_get_x86_pmu_capability+0x4f    mov    %eax,(%rdi)                                        
>                                              <jump alternative.d051> default
>   d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax                                          
>                                              <jump alternative.d051> else
>   d051:  perf_get_x86_pmu_capability+0x51    | jmp    d053 <perf_get_x86_pmu_capability+0x53>
>                                              <jump alternative.d051> end

this is a jump_label; if we were to retain the whole 'key' reloc, and
not only the key_addend, you could make it something like:

  d051:  perf_get_x86_pmu_capability+0x51      [ jmp.d8 d053 <perf_get_x86_pmu_capability+0x53> ] * perf_is_hybrid

(also, this here reads like it is either nop2 or jmp.d8 +0, which is
'weird')


>   d053:  perf_get_x86_pmu_capability+0x53    mov    0x0(%rip),%rdi        # x86_pmu+0xd8               
>                                              <alternative.d05a> default - begin
>   d05a:  perf_get_x86_pmu_capability+0x5a    | callq  __sw_hweight64                                   
>                                              <alternative.d05a> default - end
>                                              <alternative.d05a> 1/1 - begin
>    5ea:  .altinstr_replacement+0x5ea         | popcnt %rdi,%rax                                        
>                                              <alternative.d05a> 1/1 end

  d05a:  perf_get_x86_pmu_capability+0x5a      | callq  __sw_hweight64 | popcnt %rdi,%rax


>   d05f:  perf_get_x86_pmu_capability+0x5f    mov    %eax,0x4(%rdx)                                     

>                                              <jump alternative.d062> default
>   d062:  perf_get_x86_pmu_capability+0x62    | xchg   %ax,%ax                                          
>                                              <jump alternative.d062> else
>   d062:  perf_get_x86_pmu_capability+0x62    | jmp    d064 <perf_get_x86_pmu_capability+0x64>
>                                              <jump alternative.d062> end

Same jump_label again, with the same problem, the target seems 'wrong'

>   d064:  perf_get_x86_pmu_capability+0x64    mov    0x0(%rip),%rdi        # x86_pmu+0xe0               
>                                              <alternative.d06b> default - begin
>   d06b:  perf_get_x86_pmu_capability+0x6b    | callq  __sw_hweight64                                   
>                                              <alternative.d06b> default - end
>                                              <alternative.d06b> 1/1 - begin
>    5ef:  .altinstr_replacement+0x5ef         | popcnt %rdi,%rax                                        
>                                              <alternative.d06b> 1/1 end
>   d070:  perf_get_x86_pmu_capability+0x70    mov    %eax,0x8(%rdx)                                     
>   d073:  perf_get_x86_pmu_capability+0x73    mov    0x0(%rip),%eax        # x86_pmu+0xf8               
>   d079:  perf_get_x86_pmu_capability+0x79    mov    %eax,0xc(%rdx)                                     
>   d07c:  perf_get_x86_pmu_capability+0x7c    mov    %eax,0x10(%rdx)                                    
>   d07f:  perf_get_x86_pmu_capability+0x7f    mov    0x0(%rip),%rax        # x86_pmu+0x108              
>   d086:  perf_get_x86_pmu_capability+0x86    mov    %eax,0x14(%rdx)                                    
>   d089:  perf_get_x86_pmu_capability+0x89    mov    0x0(%rip),%eax        # x86_pmu+0x110              
>   d08f:  perf_get_x86_pmu_capability+0x8f    mov    %eax,0x18(%rdx)                                    
>   d092:  perf_get_x86_pmu_capability+0x92    movzbl 0x0(%rip),%ecx        # x86_pmu+0x1d1              
>   d099:  perf_get_x86_pmu_capability+0x99    movzbl 0x1c(%rdx),%eax                                    
>   d09d:  perf_get_x86_pmu_capability+0x9d    shr    %cl                                                
>   d09f:  perf_get_x86_pmu_capability+0x9f    and    $0x1,%ecx                                          
>   d0a2:  perf_get_x86_pmu_capability+0xa2    and    $0xfffffffe,%eax                                   
>   d0a5:  perf_get_x86_pmu_capability+0xa5    or     %ecx,%eax                                          
>   d0a7:  perf_get_x86_pmu_capability+0xa7    mov    %al,0x1c(%rdx)                                     
>   d0aa:  perf_get_x86_pmu_capability+0xaa    xor    %eax,%eax                                          
>   d0ac:  perf_get_x86_pmu_capability+0xac    xor    %edx,%edx                                          
>   d0ae:  perf_get_x86_pmu_capability+0xae    xor    %ecx,%ecx                                          
>   d0b0:  perf_get_x86_pmu_capability+0xb0    xor    %edi,%edi                                          
>   d0b2:  perf_get_x86_pmu_capability+0xb2    jmpq   __x86_return_thunk
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 2 weeks ago
On 9/24/25 09:36, Peter Zijlstra wrote:
> 
> Sorry, but I hadn't seen these patches until Josh replied to them just
> now :/
> 
> In general I really like the direction this is taking; I agree with Josh
> on the +0 thing, in general foo and foo+0 are not the same. Where foo
> can be the entire function, foo+0 is always the first instruction.

Ok, I will use this rule.


> Furthermore I have some nits about the output, but I don't think we
> should hold up merging these patches for that; it could just be me
> needing to rewire my brain or something ;-)
> 
>> Example 1 (--trace option): Trace the validation of the os_save() function
>> --------------------------------------------------------------------------
>>
>> $ ./tools/objtool/objtool --hacks=jump_label --hacks=noinstr --hacks=skylake --ibt --orc --retpoline --rethunk --sls --static-call --uaccess --prefix=16 --link --trace os_xsave -v vmlinux.o
>> os_xsave: validation begin
>>   65c20: os_xsave+0x0  push %r12				- state: cfa=rsp+16 r12=(cfa-16) stack_size=16
>>   65c22: os_xsave+0x2  mov 0x0(%rip),%eax # alternatives_patched
>>   65c28: os_xsave+0x8  push %rbp 			- state: cfa=rsp+24 rbp=(cfa-24) stack_size=24
>>   65c29: os_xsave+0x9  mov %rdi,%rbp
>>   65c2c: os_xsave+0xc  push %rbx 			- state: cfa=rsp+32 rbx=(cfa-32) stack_size=32
>>   65c2d: os_xsave+0xd  mov 0x8(%rdi),%rbx
>>   65c31: os_xsave+0x11 mov %rbx,%r12
>>   65c34: os_xsave+0x14 shr $0x20,%r12
>>   65c38: os_xsave+0x18 test %eax,%eax
>>   65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> 		- jump taken
>>   65c6a: os_xsave+0x4a | ud2
>>   65c6c: os_xsave+0x4c | jmp 65c3c <os_xsave+0x1c> 	- unconditional jump
>>   65c3c: os_xsave+0x1c | xor %edx,%edx
>>   65c3e: os_xsave+0x1e | mov %rbx,%rsi
>>   65c41: os_xsave+0x21 | mov %rbp,%rdi
>>   65c44: os_xsave+0x24 | callq xfd_validate_state 	- call
>>   65c49: os_xsave+0x29 | mov %ebx,%eax
>>   65c4b: os_xsave+0x2b | mov %r12d,%edx
>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 1/4 begin
>>   65c55: os_xsave+0x35 | | test %ebx,%ebx
>>   65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump taken
>>   65c6e: os_xsave+0x4e | | | ud2
>>   65c70: os_xsave+0x50 | | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24
>>   65c71: os_xsave+0x51 | | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16
>>   65c72: os_xsave+0x52 | | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8
>>   65c74: os_xsave+0x54 | | | xor %eax,%eax
>>   65c76: os_xsave+0x56 | | | xor %edx,%edx
>>   65c78: os_xsave+0x58 | | | xor %esi,%esi
>>   65c7a: os_xsave+0x5a | | | xor %edi,%edi
>>   65c7c: os_xsave+0x5c | | | jmpq __x86_return_thunk 	- return
>>   65c57: os_xsave+0x37 | | jne 65c6e <os_xsave+0x4e> 	- jump not taken
>>   65c59: os_xsave+0x39 | | pop %rbx 			- state: cfa=rsp+24 rbx=<undef> stack_size=24
>>   65c5a: os_xsave+0x3a | | pop %rbp 			- state: cfa=rsp+16 rbp=<undef> stack_size=16
>>   65c5b: os_xsave+0x3b | | pop %r12 			- state: cfa=rsp+8 r12=<undef> stack_size=8
>>   65c5d: os_xsave+0x3d | | xor %eax,%eax
>>   65c5f: os_xsave+0x3f | | xor %edx,%edx
>>   65c61: os_xsave+0x41 | | xor %esi,%esi
>>   65c63: os_xsave+0x43 | | xor %edi,%edi
>>   65c65: os_xsave+0x45 | | jmpq __x86_return_thunk - return
> 
>>                        | <alternative.65c4e> alt 1/4 end
>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
>>   1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp)
>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx
>>   65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
>>                        | <alternative.65c4e> alt 2/4 end
>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
>>   1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp)
>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>>                        | <alternative.65c4e> alt 3/4 end
>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
>>   1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp)
>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>>                        | <alternative.65c4e> alt 4/4 end
>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
>>   65c4e: os_xsave+0x2e | xsave64 0x40(%rbp)
>>   65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
> 
> I find it *very* hard to read these alternatives. If at all possible, I
> think something like:
> 
>     65c4e: os_xsave+0x2e | xsave64	| xsaveopt64	| xsavec64	| xsaves64
>     65c53: os_xsave+0x33 | xor %ebx,%ebx
> 
> Would be *much* easier to follow
> 

Ok, I can try. It will certainly work well when the alternative has a very small
number of instructions (in particular single instructions). I am not sure it
will display nicely with more instructions, especially when instructions are not
aligned on the same addresses.

But it's worth trying, and maybe have options for different formatting.


>>   65c3a: os_xsave+0x1a je 65c6a <os_xsave+0x4a> - jump not taken
>>   65c3c: os_xsave+0x1c xor %edx,%edx - already visited
>> os_xsave: validation end
>>
>>
>> Example 2 (--disas option): Disassemble perf_get_x86_pmu_capability()
>> ---------------------------------------------------------------------
>>
>> $ ./tools/objtool/objtool --disas=perf_get_x86_pmu_capability --link vmlinux.o
>> perf_get_x86_pmu_capability:
>>    d000:  perf_get_x86_pmu_capability         endbr64
>>    d004:  perf_get_x86_pmu_capability+0x4     callq  __fentry__
>>    d009:  perf_get_x86_pmu_capability+0x9     mov    %rdi,%rdx
>>                                               <alternative.d00c> default - begin
>>    d00c:  perf_get_x86_pmu_capability+0xc     | jmpq   .altinstr_aux+0x90
> 
> (you probably need to relocate the target -- we never jump into alinstr)
> 

Right. I probably don't resolve relocs in alternatives; I will check that.

>>                                               <alternative.d00c> default - end
>>                                               <alternative.d00c> 1/2 - begin
>>                                               | <fake nop> (5 bytes)
>>                                               <alternative.d00c> 1/2 end
>>                                               <alternative.d00c> 2/2 - begin
>>     5e5:  .altinstr_replacement+0x5e5         | jmpq   perf_get_x86_pmu_capability+0x3f
>>                                               <alternative.d00c> 2/2 end
> 
> Idem; the above is *really* hard to decipher.
> 
>    d00c:  perf_get_x86_pmu_capability+0xc | jmpq   .altinstr_aux+0x90 | nop5 | jmpq   perf_get_x86_pmu_capability+0x3f
> 
>>    d011:  perf_get_x86_pmu_capability+0x11    ud2
>>    d013:  perf_get_x86_pmu_capability+0x13    movq   $0x0,(%rdx)
>>    d01a:  perf_get_x86_pmu_capability+0x1a    movq   $0x0,0x8(%rdx)
>>    d022:  perf_get_x86_pmu_capability+0x22    movq   $0x0,0x10(%rdx)
>>    d02a:  perf_get_x86_pmu_capability+0x2a    movq   $0x0,0x18(%rdx)
>>    d032:  perf_get_x86_pmu_capability+0x32    xor    %eax,%eax
>>    d034:  perf_get_x86_pmu_capability+0x34    xor    %edx,%edx
>>    d036:  perf_get_x86_pmu_capability+0x36    xor    %ecx,%ecx
>>    d038:  perf_get_x86_pmu_capability+0x38    xor    %edi,%edi
>>    d03a:  perf_get_x86_pmu_capability+0x3a    jmpq   __x86_return_thunk
>>    d03f:  perf_get_x86_pmu_capability+0x3f    cmpq   $0x0,0x0(%rip)        # x86_pmu+0x10
>>    d047:  perf_get_x86_pmu_capability+0x47    je     d013 <perf_get_x86_pmu_capability+0x13>
>>    d049:  perf_get_x86_pmu_capability+0x49    mov    0x0(%rip),%eax        # x86_pmu+0x8
>>    d04f:  perf_get_x86_pmu_capability+0x4f    mov    %eax,(%rdi)
>>                                               <jump alternative.d051> default
>>    d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax
>>                                               <jump alternative.d051> else
>>    d051:  perf_get_x86_pmu_capability+0x51    | jmp    d053 <perf_get_x86_pmu_capability+0x53>
>>                                               <jump alternative.d051> end
> 
> this is a jump_label; if we would retain the whole 'key' reloc, and
> not only the key_addend, you could make it something like:
> 
>    d051:  perf_get_x86_pmu_capability+0x51      [ jmp.d8 d053 <perf_get_x86_pmu_capability+0x53> ] * perf_is_hybrid
> 
> (also, this here reads like it is either nop2 or jmp.d8 +0, which is
> 'weird')
> 

I will try to improve this.

Thanks for the feedback.

alex.

> 
>>    d053:  perf_get_x86_pmu_capability+0x53    mov    0x0(%rip),%rdi        # x86_pmu+0xd8
>>                                               <alternative.d05a> default - begin
>>    d05a:  perf_get_x86_pmu_capability+0x5a    | callq  __sw_hweight64
>>                                               <alternative.d05a> default - end
>>                                               <alternative.d05a> 1/1 - begin
>>     5ea:  .altinstr_replacement+0x5ea         | popcnt %rdi,%rax
>>                                               <alternative.d05a> 1/1 end
> 
>    d05a:  perf_get_x86_pmu_capability+0x5a      | callq  __sw_hweight64 | popcnt %rdi,%rax
> 
> 
>>    d05f:  perf_get_x86_pmu_capability+0x5f    mov    %eax,0x4(%rdx)
> 
>>                                               <jump alternative.d062> default
>>    d062:  perf_get_x86_pmu_capability+0x62    | xchg   %ax,%ax
>>                                               <jump alternative.d062> else
>>    d062:  perf_get_x86_pmu_capability+0x62    | jmp    d064 <perf_get_x86_pmu_capability+0x64>
>>                                               <jump alternative.d062> end
> 
> Same jump_label again, with the same problem, the target seems 'wrong'
> 
>>    d064:  perf_get_x86_pmu_capability+0x64    mov    0x0(%rip),%rdi        # x86_pmu+0xe0
>>                                               <alternative.d06b> default - begin
>>    d06b:  perf_get_x86_pmu_capability+0x6b    | callq  __sw_hweight64
>>                                               <alternative.d06b> default - end
>>                                               <alternative.d06b> 1/1 - begin
>>     5ef:  .altinstr_replacement+0x5ef         | popcnt %rdi,%rax
>>                                               <alternative.d06b> 1/1 end
>>    d070:  perf_get_x86_pmu_capability+0x70    mov    %eax,0x8(%rdx)
>>    d073:  perf_get_x86_pmu_capability+0x73    mov    0x0(%rip),%eax        # x86_pmu+0xf8
>>    d079:  perf_get_x86_pmu_capability+0x79    mov    %eax,0xc(%rdx)
>>    d07c:  perf_get_x86_pmu_capability+0x7c    mov    %eax,0x10(%rdx)
>>    d07f:  perf_get_x86_pmu_capability+0x7f    mov    0x0(%rip),%rax        # x86_pmu+0x108
>>    d086:  perf_get_x86_pmu_capability+0x86    mov    %eax,0x14(%rdx)
>>    d089:  perf_get_x86_pmu_capability+0x89    mov    0x0(%rip),%eax        # x86_pmu+0x110
>>    d08f:  perf_get_x86_pmu_capability+0x8f    mov    %eax,0x18(%rdx)
>>    d092:  perf_get_x86_pmu_capability+0x92    movzbl 0x0(%rip),%ecx        # x86_pmu+0x1d1
>>    d099:  perf_get_x86_pmu_capability+0x99    movzbl 0x1c(%rdx),%eax
>>    d09d:  perf_get_x86_pmu_capability+0x9d    shr    %cl
>>    d09f:  perf_get_x86_pmu_capability+0x9f    and    $0x1,%ecx
>>    d0a2:  perf_get_x86_pmu_capability+0xa2    and    $0xfffffffe,%eax
>>    d0a5:  perf_get_x86_pmu_capability+0xa5    or     %ecx,%eax
>>    d0a7:  perf_get_x86_pmu_capability+0xa7    mov    %al,0x1c(%rdx)
>>    d0aa:  perf_get_x86_pmu_capability+0xaa    xor    %eax,%eax
>>    d0ac:  perf_get_x86_pmu_capability+0xac    xor    %edx,%edx
>>    d0ae:  perf_get_x86_pmu_capability+0xae    xor    %ecx,%ecx
>>    d0b0:  perf_get_x86_pmu_capability+0xb0    xor    %edi,%edi
>>    d0b2:  perf_get_x86_pmu_capability+0xb2    jmpq   __x86_return_thunk
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 09:36:49AM +0200, Peter Zijlstra wrote:

> >                       | <alternative.65c4e> alt 1/4 end
> >  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
> >  1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp) 
> >  65c53: os_xsave+0x33 | | xor %ebx,%ebx 
> >  65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
> >                       | <alternative.65c4e> alt 2/4 end
> >  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
> >  1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp) 
> >  65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
> >                       | <alternative.65c4e> alt 3/4 end
> >  65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
> >  1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp) 
> >  65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
> >                       | <alternative.65c4e> alt 4/4 end
> >  65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
> >  65c4e: os_xsave+0x2e | xsave64 0x40(%rbp) 
> >  65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
> 
> I find it *very* hard to read these alternatives. If at all possible, I
> think something like:
> 
>    65c4e: os_xsave+0x2e | xsave64	| xsaveopt64	| xsavec64	| xsaves64
>    65c53: os_xsave+0x33 | xor %ebx,%ebx
> 
> Would be *much* easier to follow

Another option is to write it source-like:

  65c4e: os_xsave+0x2e | ALTERNATIVE("xsave64",
                                     "xsaveopt64", X86_FEATURE_XSAVEOPT,
				     "xsavec64", X86_FEATURE_XSAVEC,
				     "xsaves64", X86_FEATURE_XSAVES);
  65c53: os_xsave+0x33 | xor %ebx,%ebx


We have the 'feature' bit, we'd just have to 'borrow' the feature
strings from the kernel I suppose.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 2 weeks ago
On 9/24/25 11:17, Peter Zijlstra wrote:
> On Wed, Sep 24, 2025 at 09:36:49AM +0200, Peter Zijlstra wrote:
> 
>>>                        | <alternative.65c4e> alt 1/4 end
>>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
>>>   1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp)
>>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx
>>>   65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
>>>                        | <alternative.65c4e> alt 2/4 end
>>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
>>>   1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp)
>>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>>>                        | <alternative.65c4e> alt 3/4 end
>>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
>>>   1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp)
>>>   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
>>>                        | <alternative.65c4e> alt 4/4 end
>>>   65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
>>>   65c4e: os_xsave+0x2e | xsave64 0x40(%rbp)
>>>   65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
>>
>> I find it *very* hard to read these alternatives. If at all possible, I
>> think something like:
>>
>>     65c4e: os_xsave+0x2e | xsave64	| xsaveopt64	| xsavec64	| xsaves64
>>     65c53: os_xsave+0x33 | xor %ebx,%ebx
>>
>> Would be *much* easier to follow
> 
> Another option is to write it source-like:
> 
>    65c4e: os_xsave+0x2e | ALTERNATIVE("xsave64",
>                                       "xsaveopt64", X86_FEATURE_XSAVEOPT,
> 				     "xsavec64", X86_FEATURE_XSAVEC,
> 				     "xsaves64", X86_FEATURE_XSAVES);
>    65c53: os_xsave+0x33 | xor %ebx,%ebx
> 
> 
> We have the 'feature' bit, we'd just have to 'borrow' the feature
> strings from the kernel I suppose.

Yes, that would be very useful, but I will probably look at it in a later
set of patches.

alex.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 11:50:25AM +0200, Alexandre Chartre wrote:
> 
> On 9/24/25 11:17, Peter Zijlstra wrote:
> > On Wed, Sep 24, 2025 at 09:36:49AM +0200, Peter Zijlstra wrote:
> > 
> > > >                        | <alternative.65c4e> alt 1/4 end
> > > >   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 2/4 begin
> > > >   1c3d: .altinstr_replacement+0x1c3d | | xsaves64 0x40(%rbp)
> > > >   65c53: os_xsave+0x33 | | xor %ebx,%ebx
> > > >   65c55: os_xsave+0x35 | | test %ebx,%ebx - already visited
> > > >                        | <alternative.65c4e> alt 2/4 end
> > > >   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 3/4 begin
> > > >   1c38: .altinstr_replacement+0x1c38 | | xsavec64 0x40(%rbp)
> > > >   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
> > > >                        | <alternative.65c4e> alt 3/4 end
> > > >   65c4e: os_xsave+0x2e | <alternative.65c4e> alt 4/4 begin
> > > >   1c33: .altinstr_replacement+0x1c33 | | xsaveopt64 0x40(%rbp)
> > > >   65c53: os_xsave+0x33 | | xor %ebx,%ebx - already visited
> > > >                        | <alternative.65c4e> alt 4/4 end
> > > >   65c4e: os_xsave+0x2e | <alternative.65c4e> alt default
> > > >   65c4e: os_xsave+0x2e | xsave64 0x40(%rbp)
> > > >   65c53: os_xsave+0x33 | xor %ebx,%ebx - already visited
> > > 
> > > I find it *very* hard to read these alternatives. If at all possible, I
> > > think something like:
> > > 
> > >     65c4e: os_xsave+0x2e | xsave64	| xsaveopt64	| xsavec64	| xsaves64
> > >     65c53: os_xsave+0x33 | xor %ebx,%ebx
> > > 
> > > Would be *much* easier to follow
> > 
> > Another option is to write it source-like:
> > 
> >    65c4e: os_xsave+0x2e | ALTERNATIVE("xsave64",
> >                                       "xsaveopt64", X86_FEATURE_XSAVEOPT,
> > 				     "xsavec64", X86_FEATURE_XSAVEC,
> > 				     "xsaves64", X86_FEATURE_XSAVES);
> >    65c53: os_xsave+0x33 | xor %ebx,%ebx
> > 
> > 
> > We have the 'feature' bit, we'd just have to 'borrow' the feature
> > strings from the kernel I suppose.
> 
> Yes, that would be very useful. But I will probably look at it for a next
> set of patches.

Yes, like I said, I don't think we need to hold up the current set for
this. But given how I struggle to read that alternative stuff, I figured
we should explore alternatives :-)
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 09:36:49AM +0200, Peter Zijlstra wrote:
> > Example 2 (--disas option): Disassemble perf_get_x86_pmu_capability()
> > ---------------------------------------------------------------------
> > 
> > $ ./tools/objtool/objtool --disas=perf_get_x86_pmu_capability --link vmlinux.o
> > perf_get_x86_pmu_capability:
> >   d000:  perf_get_x86_pmu_capability         endbr64                                                   
> >   d004:  perf_get_x86_pmu_capability+0x4     callq  __fentry__                                         
> >   d009:  perf_get_x86_pmu_capability+0x9     mov    %rdi,%rdx                                          
> >                                              <alternative.d00c> default - begin
> >   d00c:  perf_get_x86_pmu_capability+0xc     | jmpq   .altinstr_aux+0x90                               
> 
> (you probably need to relocate the target -- we never jump into alinstr)
> 
> >                                              <alternative.d00c> default - end
> >                                              <alternative.d00c> 1/2 - begin
> >                                              | <fake nop> (5 bytes)                                    
> >                                              <alternative.d00c> 1/2 end
> >                                              <alternative.d00c> 2/2 - begin
> >    5e5:  .altinstr_replacement+0x5e5         | jmpq   perf_get_x86_pmu_capability+0x3f                 
> >                                              <alternative.d00c> 2/2 end
> 
> Idem; the above is *really* hard to decipher.
> 
>   d00c:  perf_get_x86_pmu_capability+0xc | jmpq   .altinstr_aux+0x90 | nop5 | jmpq   perf_get_x86_pmu_capability+0x3f
> 
> >   d011:  perf_get_x86_pmu_capability+0x11    ud2                                                       
> >   d013:  perf_get_x86_pmu_capability+0x13    movq   $0x0,(%rdx)                                        
> >   d01a:  perf_get_x86_pmu_capability+0x1a    movq   $0x0,0x8(%rdx)                                     
> >   d022:  perf_get_x86_pmu_capability+0x22    movq   $0x0,0x10(%rdx)                                    
> >   d02a:  perf_get_x86_pmu_capability+0x2a    movq   $0x0,0x18(%rdx)                                    
> >   d032:  perf_get_x86_pmu_capability+0x32    xor    %eax,%eax                                          
> >   d034:  perf_get_x86_pmu_capability+0x34    xor    %edx,%edx                                          
> >   d036:  perf_get_x86_pmu_capability+0x36    xor    %ecx,%ecx                                          
> >   d038:  perf_get_x86_pmu_capability+0x38    xor    %edi,%edi                                          
> >   d03a:  perf_get_x86_pmu_capability+0x3a    jmpq   __x86_return_thunk                                 
> >   d03f:  perf_get_x86_pmu_capability+0x3f    cmpq   $0x0,0x0(%rip)        # x86_pmu+0x10               
> >   d047:  perf_get_x86_pmu_capability+0x47    je     d013 <perf_get_x86_pmu_capability+0x13>            
> >   d049:  perf_get_x86_pmu_capability+0x49    mov    0x0(%rip),%eax        # x86_pmu+0x8                
> >   d04f:  perf_get_x86_pmu_capability+0x4f    mov    %eax,(%rdi)                                        
> >                                              <jump alternative.d051> default
> >   d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax                                          
> >                                              <jump alternative.d051> else
> >   d051:  perf_get_x86_pmu_capability+0x51    | jmp    d053 <perf_get_x86_pmu_capability+0x53>
> >                                              <jump alternative.d051> end
> 
> this is a jump_label; if we would retain the whole 'key' reloc, and
> not only the key_addend, you could make it something like:
> 
>   d051:  perf_get_x86_pmu_capability+0x51      [ jmp.d8 d053 <perf_get_x86_pmu_capability+0x53> ] * perf_is_hybrid
> 
> (also, this here reads like it is either nop2 or jmp.d8 +0, which is
> 'weird')

Also, particularly in alternatives, I think it makes sense to make an
explicit distinction between jmp.d8 and jmp.d32 (and similarly for Jcc.d8
/ Jcc.d32).
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:

> > >   d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax                                          

That libopcode is 'funny', isn't that typically spelled "nop" ?
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 12:08:47PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:
> 
> > > >   d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax                                          
> 
> That libopcode is 'funny', isn't that typically spelled "nop" ?

Ooh, I see, it is "osp nop" and yeah binutils also seems to do that as
"xchg %ax,%ax".
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 2 weeks ago
On 9/24/25 12:10, Peter Zijlstra wrote:
> On Wed, Sep 24, 2025 at 12:08:47PM +0200, Peter Zijlstra wrote:
>> On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:
>>
>>>>>    d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax
>>
>> That libopcode is 'funny', isn't that typically spelled "nop" ?
> 
> Ooh, I see, it is "osp nop" and yeah binutils also seems to do that as
> "xchg %ax,%ax".

Yes, "xchg %ax,%ax" is NOP2 (opcodes: 0x66 0x90), "nop" is NOP1 (0x90).

That's one more improvement we can do: identify NOP instructions and
display them as NOP<n>

alex.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 12:52:35PM +0200, Alexandre Chartre wrote:
> 
> On 9/24/25 12:10, Peter Zijlstra wrote:
> > On Wed, Sep 24, 2025 at 12:08:47PM +0200, Peter Zijlstra wrote:
> > > On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:
> > > 
> > > > > >    d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax
> > > 
> > > That libopcode is 'funny', isn't that typically spelled "nop" ?
> > 
> > Ooh, I see, it is "osp nop" and yeah binutils also seems to do that as
> > "xchg %ax,%ax".
> 
> Yes, "xchg %ax,%ax" is NOP2 (opcodes: 0x66 0x90), "nop" is NOP1 (0x90).
> 
> That's one more improvement we can do: identify NOP instructions and
> display them as NOP<n>

Right, I have just the function for that. Let me do a patch for you.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Peter Zijlstra 2 weeks ago
On Wed, Sep 24, 2025 at 12:58:13PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 24, 2025 at 12:52:35PM +0200, Alexandre Chartre wrote:
> > 
> > On 9/24/25 12:10, Peter Zijlstra wrote:
> > > On Wed, Sep 24, 2025 at 12:08:47PM +0200, Peter Zijlstra wrote:
> > > > On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:
> > > > 
> > > > > > >    d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax
> > > > 
> > > > That libopcode is 'funny', isn't that typically spelled "nop" ?
> > > 
> > > Ooh, I see, it is "osp nop" and yeah binutils also seems to do that as
> > > "xchg %ax,%ax".
> > 
> > Yes, "xchg %ax,%ax" is NOP2 (opcodes: 0x66 0x90), "nop" is NOP1 (0x90).
> > 
> > That's one more improvement we can do: identify NOP instructions and
> > display them as NOP<n>
> 
> Right, I have just the function for that. Let me do a patch for you.

Bah, I'm having trouble with the makefiles, but we sorta already detect
nops, and the below should sorta work. I'll improve decode.c a bit.

--- a/tools/objtool/disas.c
+++ b/tools/objtool/disas.c
@@ -273,6 +273,9 @@ static size_t disas_insn(struct disas_co
 	dctx->insn = insn;
 	dctx->result[0] = '\0';
 
+	if (insn->type == INSN_NOP)
+		return snprintf(dctx->result, DISAS_RESULT_SIZE, "NOP%d", insn->len);
+
 	/*
 	 * Set the disassembler buffer to read data from the section
 	 * containing the instruction to disassemble.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 2 weeks ago

On 9/24/25 13:45, Peter Zijlstra wrote:
> On Wed, Sep 24, 2025 at 12:58:13PM +0200, Peter Zijlstra wrote:
>> On Wed, Sep 24, 2025 at 12:52:35PM +0200, Alexandre Chartre wrote:
>>>
>>> On 9/24/25 12:10, Peter Zijlstra wrote:
>>>> On Wed, Sep 24, 2025 at 12:08:47PM +0200, Peter Zijlstra wrote:
>>>>> On Wed, Sep 24, 2025 at 09:42:06AM +0200, Peter Zijlstra wrote:
>>>>>
>>>>>>>>     d051:  perf_get_x86_pmu_capability+0x51    | xchg   %ax,%ax
>>>>>
>>>>> That libopcode is 'funny', isn't that typically spelled "nop" ?
>>>>
>>>> Ooh, I see, it is "osp nop" and yeah binutils also seems to do that as
>>>> "xchg %ax,%ax".
>>>
>>> Yes, "xchg %ax,%ax" is NOP2 (opcodes: 0x66 0x90), "nop" is NOP1 (0x90).
>>>
>>> That's one more improvement we can do: identify NOP instructions and
>>> display them as NOP<n>
>>
>> Right, I have just the function for that. Let me do a patch for you.
> 
> Bah, I'm having trouble with the makefiles, but we sorta already detect
> nops, and the below should sorta work. I'll improve decode.c a bit.
> 
> --- a/tools/objtool/disas.c
> +++ b/tools/objtool/disas.c
> @@ -273,6 +273,9 @@ static size_t disas_insn(struct disas_co
>   	dctx->insn = insn;
>   	dctx->result[0] = '\0';
>   
> +	if (insn->type == INSN_NOP)
> +		return snprintf(dctx->result, DISAS_RESULT_SIZE, "NOP%d", insn->len);
> +
>   	/*
>   	 * Set the disassembler buffer to read data from the section
>   	 * containing the instruction to disassemble.

Cool, an easy change :) Thanks.

alex.
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Josh Poimboeuf 2 weeks, 1 day ago
On Thu, Jun 19, 2025 at 04:56:42PM +0200, Alexandre Chartre wrote:
> Hi,
> 
> Version v2 of this RFC addresses all comments from Josh and Peter,
> in particular:
> 
> - add --disas option to disassemble functions
> - do not fail the build if libopcodes is not available. Instead objtool
>   is then built without disassembly support. In that case, objtool prints
>   a warning message if trying to use disassembly.
> 
>   Example:
>   $ ./tools/objtool/objtool --disas --link vmlinux.o
>   vmlinux.o: warning: objtool: Rebuild with libopcodes for disassembly support
>   
> - remove dbuffer
> - rename VTRACE* to TRACE*
> - add trace.[ch] for trace-related functions and macros

Sorry for the delay... this is looking really good.  A few nits I saw
when testing:

1) With "make -s" I see

  Auto-detecting system features:
  ...                                  libbfd: [ on  ]
  ...                disassembler-init-styled: [ on  ]

but I'm thinking that should be completely silent?

2) Also seeing an oddity with --trace:

  $ OBJTOOL_ARGS='--trace=shrink_node' make -s -j12 mm/vmscan.o
  shrink_node: validation begin
   12440:  shrink_node                                                 push   %rbp                                                -  - statecfa=rsp+16 rbp=(cfa-16) stack_size=16
   12440:  shrink_node                                                 push   %rbp
   12441:  shrink_node+0x1                                             mov    %rsp,%rbp                                           -  - statecfa=rbp+16
   12441:  shrink_node+0x1                                             mov    %rsp,%rbp

For the instructions which have unwinding state changes, it's printing
them twice.  Also the formatting looks a little off (two dashes;
"statecfa").

-- 
Josh
Re: [RFC PATCH v2 00/17] objtool: Function validation tracing
Posted by Alexandre Chartre 2 weeks ago

On 9/24/25 00:33, Josh Poimboeuf wrote:
> On Thu, Jun 19, 2025 at 04:56:42PM +0200, Alexandre Chartre wrote:
>> Hi,
>>
>> Version v2 of this RFC addresses all comments from Josh and Peter,
>> in particular:
>>
>> - add --disas option to disassemble functions
>> - do not fail the build if libopcodes is not available. Instead objtool
>>    is then built without disassembly support. In that case, objtool prints
>>    a warning message if trying to use disassembly.
>>
>>    Example:
>>    $ ./tools/objtool/objtool --disas --link vmlinux.o
>>    vmlinux.o: warning: objtool: Rebuild with libopcodes for disassembly support
>>    
>> - remove dbuffer
>> - rename VTRACE* to TRACE*
>> - add trace.[ch] for trace-related functions and macros
> 
> Sorry for the delay... this is looking really good.  A few nits I saw
> when testing:
> 
> 1) With "make -s" I see
> 
>    Auto-detecting system features:
>    ...                                  libbfd: [ on  ]
>    ...                disassembler-init-styled: [ on  ]
> 
> but I'm thinking that should be completely silent?
> 
>
> 2) Also seeing an oddity with --trace:
> 
>    $ OBJTOOL_ARGS='--trace=shrink_node' make -s -j12 mm/vmscan.o
>    shrink_node: validation begin
>     12440:  shrink_node                                                 push   %rbp                                                -  - statecfa=rsp+16 rbp=(cfa-16) stack_size=16
>     12440:  shrink_node                                                 push   %rbp
>     12441:  shrink_node+0x1                                             mov    %rsp,%rbp                                           -  - statecfa=rbp+16
>     12441:  shrink_node+0x1                                             mov    %rsp,%rbp
> 
> For the instructions which have unwinding state changes, it's printing
> them twice.  Also the formatting looks a little off (two dashes;
> "statecfa").
> 

Thanks for all the comments; I will address them and also resync the code.

alex.