[PATCH] symbols: explicitly specify source file name for symtab

Jan Beulich posted 1 patch 2 weeks, 5 days ago
[PATCH] symbols: explicitly specify source file name for symtab
Posted by Jan Beulich 2 weeks, 5 days ago
If there are any local symbols in an object file, GNU ld will create an
STT_FILE symbol derived from the object file name if there is none in the
incoming symbol table. The object file name, however, varies between
linking passes. As a result, symbol name compression can yield different
results if any of those local symbols need retaining (Arm [and RISC-V]
mapping symbols are omitted, for example). If that difference in
compression would yield a difference in the sizes of symbol_names[] or
symbols_token_table[], the compare-symbol-tables sanity check will fail.

Fixes: d37d63d4b548 ("symbols: prefix static symbols with their source file names")
Reported-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
The observed problem was with a stub generated as Arm64 erratum 843419
workaround. Such stubs' symbols (imo wrongly) are associated with the last
input object, rather than the input object they belong to. Also for other
kinds of stubs, afaict. See
https://sourceware.org/bugzilla/show_bug.cgi?id=34140.

As per the above, having a Fixes: tag here is questionable.

--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -200,7 +200,8 @@ $(TARGET).efi: $(objtree)/prelink.o $(no
 ifeq ($(CONFIG_DEBUG_INFO),y)
 	$(if $(filter --strip-debug,$(EFI_LDFLAGS)),echo,:) "Will strip debug info from $(@F)"
 endif
-	$(objtree)/tools/symbols $(all_symbols) --empty > $(dot-target).0s.S
+	$(objtree)/tools/symbols $(all_symbols) --source-name=$(@F).S --empty \
+		> $(dot-target).0s.S
 	$(MAKE) $(build)=$(@D) .$(@F).0s.o
 	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
 	          $(LD) $(call EFI_LDFLAGS,$(base)) -T $(obj)/efi.lds $< $(relocs-dummy) \
@@ -210,6 +211,7 @@ endif
 		> $(dot-target).1r.S
 	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).0 \
 		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+                  --source-name=$(@F).S \
 		> $(dot-target).1s.S
 	$(MAKE) $(build)=$(@D) .$(@F).1r.o .$(@F).1s.o
 	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
@@ -220,6 +222,7 @@ endif
 		> $(dot-target).2r.S
 	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).1 \
 		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
+                  --source-name=$(@F).S \
 		> $(dot-target).2s.S
 	$(MAKE) $(build)=$(@D) .$(@F).2r.o .$(@F).2s.o
 	$(call compare-symbol-tables, $(dot-target).1r.o, $(dot-target).2r.o)
--- a/xen/tools/symbols.c
+++ b/xen/tools/symbols.c
@@ -66,6 +66,7 @@ int token_profit[0x10000];
 unsigned char best_table[256][2];
 unsigned char best_table_len[256];
 
+static const char *srcname = "xen-syms.S";
 
 static void usage(void)
 {
@@ -356,6 +357,7 @@ static void write_src(void)
 	printf("#define ALGN 4\n");
 	printf("#endif\n");
 
+	printf("\t.file \"%s\"\n", srcname);
 	printf("\t.section .rodata, \"a\"\n");
 
 	printf("#ifndef SYMBOLS_ORIGIN\n");
@@ -679,6 +681,8 @@ int main(int argc, char **argv)
 				unsorted = true;
 			else if (strcmp(argv[i], "--sort-by-name") == 0)
 				sort_by_name = 1;
+			else if (strncmp(argv[i], "--source-name=", 14) == 0)
+				srcname = argv[i] + 14;
 			else if (strcmp(argv[i], "--warn-dup") == 0)
 				warn_dup = true;
 			else if (strcmp(argv[i], "--error-dup") == 0)
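
For readers following the symbols.c hunks, here is a minimal standalone sketch of the same idea: recognise a `--source-name=` prefix and format the `.file` directive from it. The helper names `parse_source_name()` and `format_file_directive()` and the `SRC_OPT` macro are hypothetical and not part of the patch; only the option spelling, the default "xen-syms.S" and the directive format come from the diff above. Deriving the prefix length with sizeof is just one way to avoid the literal 14.

```c
#include <stdio.h>
#include <string.h>

/* Default taken from the patch; everything else here is illustrative. */
static const char *srcname = "xen-syms.S";

#define SRC_OPT "--source-name="

/*
 * Hypothetical helper: returns 1 and records the value if arg starts with
 * the option prefix, 0 otherwise. sizeof() counts the trailing NUL, hence
 * the "- 1"; the result is the same 14 the patch spells out.
 */
static int parse_source_name(const char *arg)
{
    size_t len = sizeof(SRC_OPT) - 1;

    if (strncmp(arg, SRC_OPT, len) != 0)
        return 0;

    srcname = arg + len;
    return 1;
}

/*
 * Hypothetical helper: formats the directive that write_src() prints in
 * the patch into buf, returning what snprintf() returns.
 */
static int format_file_directive(char *buf, size_t size)
{
    return snprintf(buf, size, "\t.file \"%s\"\n", srcname);
}
```

Calling parse_source_name("--source-name=xen.efi.S") and then format_file_directive() would produce a `.file "xen.efi.S"` line, while leaving the option out keeps the "xen-syms.S" default.
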
Re: [PATCH] symbols: explicitly specify source file name for symtab
Posted by Roger Pau Monné 2 weeks, 4 days ago
On Mon, May 11, 2026 at 12:00:03PM +0200, Jan Beulich wrote:
> If there are any local symbols in an object file, GNU ld will create an
> STT_FILE symbol derived from the object file name if there is none in the
> incoming symbol table. The object file name, however, varies between
> linking passes. As a result, symbol name compression can yield different
> results if any of those local symbols need retaining (Arm [and RISC-V]
> mapping symbols are omitted, for example). If that difference in
> compression would yield a difference in the sizes of symbol_names[] or
> symbols_token_table[], the compare-symbol-tables sanity check will fail.
> 
> Fixes: d37d63d4b548 ("symbols: prefix static symbols with their source file names")
> Reported-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> The observed problem was with a stub generated as Arm64 erratum 843419
> workaround. Such stubs' symbols (imo wrongly) are associated with the last
> input object, rather than the input object they belong to. Also for other
> kinds of stubs, afaict. See
> https://sourceware.org/bugzilla/show_bug.cgi?id=34140.
> 
> As per the above, having a Fixes: tag here is questionable.
> 
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -200,7 +200,8 @@ $(TARGET).efi: $(objtree)/prelink.o $(no
>  ifeq ($(CONFIG_DEBUG_INFO),y)
>  	$(if $(filter --strip-debug,$(EFI_LDFLAGS)),echo,:) "Will strip debug info from $(@F)"
>  endif
> -	$(objtree)/tools/symbols $(all_symbols) --empty > $(dot-target).0s.S
> +	$(objtree)/tools/symbols $(all_symbols) --source-name=$(@F).S --empty \
> +		> $(dot-target).0s.S
>  	$(MAKE) $(build)=$(@D) .$(@F).0s.o
>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
>  	          $(LD) $(call EFI_LDFLAGS,$(base)) -T $(obj)/efi.lds $< $(relocs-dummy) \
> @@ -210,6 +211,7 @@ endif
>  		> $(dot-target).1r.S
>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).0 \
>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
> +                  --source-name=$(@F).S \
>  		> $(dot-target).1s.S
>  	$(MAKE) $(build)=$(@D) .$(@F).1r.o .$(@F).1s.o
>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
> @@ -220,6 +222,7 @@ endif
>  		> $(dot-target).2r.S
>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).1 \
>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
> +                  --source-name=$(@F).S \
>  		> $(dot-target).2s.S

Wouldn't it be more accurate to use $(dot-target) as the source name?

Maybe $(notdir $(dot-target)).S?

I see the default is already set to the target filename for other
arches, so not a big deal IMO.

Thanks, Roger.

Re: [PATCH] symbols: explicitly specify source file name for symtab
Posted by Jan Beulich 2 weeks, 3 days ago
On 12.05.2026 11:20, Roger Pau Monné wrote:
> On Mon, May 11, 2026 at 12:00:03PM +0200, Jan Beulich wrote:
>> If there are any local symbols in an object file, GNU ld will create an
>> STT_FILE symbol derived from the object file name if there is none in the
>> incoming symbol table. The object file name, however, varies between
>> linking passes. As a result, symbol name compression can yield different
>> results if any of those local symbols need retaining (Arm [and RISC-V]
>> mapping symbols are omitted, for example). If that difference in
>> compression would yield a difference in the sizes of symbol_names[] or
>> symbols_token_table[], the compare-symbol-tables sanity check will fail.
>>
>> Fixes: d37d63d4b548 ("symbols: prefix static symbols with their source file names")
>> Reported-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/Makefile
>> +++ b/xen/arch/x86/Makefile
>> @@ -200,7 +200,8 @@ $(TARGET).efi: $(objtree)/prelink.o $(no
>>  ifeq ($(CONFIG_DEBUG_INFO),y)
>>  	$(if $(filter --strip-debug,$(EFI_LDFLAGS)),echo,:) "Will strip debug info from $(@F)"
>>  endif
>> -	$(objtree)/tools/symbols $(all_symbols) --empty > $(dot-target).0s.S
>> +	$(objtree)/tools/symbols $(all_symbols) --source-name=$(@F).S --empty \
>> +		> $(dot-target).0s.S
>>  	$(MAKE) $(build)=$(@D) .$(@F).0s.o
>>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
>>  	          $(LD) $(call EFI_LDFLAGS,$(base)) -T $(obj)/efi.lds $< $(relocs-dummy) \
>> @@ -210,6 +211,7 @@ endif
>>  		> $(dot-target).1r.S
>>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).0 \
>>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
>> +                  --source-name=$(@F).S \
>>  		> $(dot-target).1s.S
>>  	$(MAKE) $(build)=$(@D) .$(@F).1r.o .$(@F).1s.o
>>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
>> @@ -220,6 +222,7 @@ endif
>>  		> $(dot-target).2r.S
>>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).1 \
>>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
>> +                  --source-name=$(@F).S \
>>  		> $(dot-target).2s.S
> 
> Wouldn't it be more accurate to use $(dot-target) as the source name?
> 
> Maybe $(notdir $(dot-target)).S?

Why would that be better (more accurate)? The file names change, so the
specified file is "virtual" anyway. I simply don't see why prepending a
. would be helpful.

> I see the default is already set to the target filename for other
> arches, so not a big deal IMO.

It's a "virtual" filename also there. No real xen-syms.S is ever created.

Jan
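
To illustrate why a fixed "virtual" name helps, here is a small sketch comparing the string recorded for a retained local symbol in two linking passes. It assumes static symbols are stored as "<file>#<symbol>" (my reading of the commit named in the Fixes: tag); the function names, the object file names and the symbol name used in the example call are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Assumption: a static symbol is recorded as "<file>#<symbol>".
 * Hypothetical helper, not taken from symbols.c.
 */
static int prefixed_name(char *buf, size_t size,
                         const char *file, const char *sym)
{
    return snprintf(buf, size, "%s#%s", file, sym);
}

/*
 * Returns 1 when both linking passes would feed the same string into
 * symbol name compression, 0 when the strings differ.
 */
static int same_across_passes(const char *file_pass1,
                              const char *file_pass2,
                              const char *sym)
{
    char a[128], b[128];

    prefixed_name(a, sizeof(a), file_pass1, sym);
    prefixed_name(b, sizeof(b), file_pass2, sym);

    return strcmp(a, b) == 0;
}
```

With linker-derived names such as ".xen.efi.1s.o" and ".xen.efi.2s.o" the two strings differ, so the compressor sees different input in each pass; with an explicit "xen.efi.S" in both passes they are identical. Whether a differing string actually changes the table sizes depends on the rest of the symbol set.
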

Re: [PATCH] symbols: explicitly specify source file name for symtab
Posted by Andrew Cooper 2 weeks, 4 days ago
On 11/05/2026 11:00 am, Jan Beulich wrote:
> If there are any local symbols in an object file, GNU ld will create an
> STT_FILE symbol derived from the object file name if there is none in the
> incoming symbol table. The object file name, however, varies between
> linking passes. As a result, symbol name compression can yield different
> results if any of those local symbols need retaining (Arm [and RISC-V]
> mapping symbols are omitted, for example). If that difference in
> compression would yield a difference in the sizes of symbol_names[] or
> symbols_token_table[], the compare-symbol-tables sanity check will fail.
>
> Fixes: d37d63d4b548 ("symbols: prefix static symbols with their source file names")
> Reported-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> The observed problem was with a stub generated as Arm64 erratum 843419
> workaround. Such stubs' symbols (imo wrongly) are associated with the last
> input object, rather than the input object they belong to. Also for other
> kinds of stubs, afaict. See
> https://sourceware.org/bugzilla/show_bug.cgi?id=34140.
>
> As per the above, having a Fixes: tag here is questionable.
>
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -200,7 +200,8 @@ $(TARGET).efi: $(objtree)/prelink.o $(no
>  ifeq ($(CONFIG_DEBUG_INFO),y)
>  	$(if $(filter --strip-debug,$(EFI_LDFLAGS)),echo,:) "Will strip debug info from $(@F)"
>  endif
> -	$(objtree)/tools/symbols $(all_symbols) --empty > $(dot-target).0s.S
> +	$(objtree)/tools/symbols $(all_symbols) --source-name=$(@F).S --empty \
> +		> $(dot-target).0s.S
>  	$(MAKE) $(build)=$(@D) .$(@F).0s.o
>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
>  	          $(LD) $(call EFI_LDFLAGS,$(base)) -T $(obj)/efi.lds $< $(relocs-dummy) \
> @@ -210,6 +211,7 @@ endif
>  		> $(dot-target).1r.S
>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).0 \
>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
> +                  --source-name=$(@F).S \
>  		> $(dot-target).1s.S
>  	$(MAKE) $(build)=$(@D) .$(@F).1r.o .$(@F).1s.o
>  	$(foreach base, $(VIRT_BASE) $(ALT_BASE), \
> @@ -220,6 +222,7 @@ endif
>  		> $(dot-target).2r.S
>  	$(NM) -pa --format=sysv $(dot-target).$(VIRT_BASE).1 \
>  		| $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
> +                  --source-name=$(@F).S \
>  		> $(dot-target).2s.S
>  	$(MAKE) $(build)=$(@D) .$(@F).2r.o .$(@F).2s.o
>  	$(call compare-symbol-tables, $(dot-target).1r.o, $(dot-target).2r.o)
> --- a/xen/tools/symbols.c
> +++ b/xen/tools/symbols.c
> @@ -66,6 +66,7 @@ int token_profit[0x10000];
>  unsigned char best_table[256][2];
>  unsigned char best_table_len[256];
>  
> +static const char *srcname = "xen-syms.S";
>  
>  static void usage(void)
>  {
> @@ -356,6 +357,7 @@ static void write_src(void)
>  	printf("#define ALGN 4\n");
>  	printf("#endif\n");
>  
> +	printf("\t.file \"%s\"\n", srcname);
>  	printf("\t.section .rodata, \"a\"\n");
>  
>  	printf("#ifndef SYMBOLS_ORIGIN\n");
> @@ -679,6 +681,8 @@ int main(int argc, char **argv)
>  				unsorted = true;
>  			else if (strcmp(argv[i], "--sort-by-name") == 0)
>  				sort_by_name = 1;
> +			else if (strncmp(argv[i], "--source-name=", 14) == 0)
> +				srcname = argv[i] + 14;
>  			else if (strcmp(argv[i], "--warn-dup") == 0)
>  				warn_dup = true;
>  			else if (strcmp(argv[i], "--error-dup") == 0)

Why does x86 need to plumb the source name in, but the other
architectures don't?

xen-syms.S suffices for both x86 builds AFAICT, so can't it just be
unconditional?

~Andrew
Re: [PATCH] symbols: explicitly specify source file name for symtab
Posted by Jan Beulich 2 weeks, 4 days ago
On 11.05.2026 15:41, Andrew Cooper wrote:
> On 11/05/2026 11:00 am, Jan Beulich wrote:
>> --- a/xen/tools/symbols.c
>> +++ b/xen/tools/symbols.c
>> @@ -66,6 +66,7 @@ int token_profit[0x10000];
>>  unsigned char best_table[256][2];
>>  unsigned char best_table_len[256];
>>  
>> +static const char *srcname = "xen-syms.S";
>>  
>>  static void usage(void)
>>  {
>> @@ -356,6 +357,7 @@ static void write_src(void)
>>  	printf("#define ALGN 4\n");
>>  	printf("#endif\n");
>>  
>> +	printf("\t.file \"%s\"\n", srcname);
>>  	printf("\t.section .rodata, \"a\"\n");
>>  
>>  	printf("#ifndef SYMBOLS_ORIGIN\n");
>> @@ -679,6 +681,8 @@ int main(int argc, char **argv)
>>  				unsorted = true;
>>  			else if (strcmp(argv[i], "--sort-by-name") == 0)
>>  				sort_by_name = 1;
>> +			else if (strncmp(argv[i], "--source-name=", 14) == 0)
>> +				srcname = argv[i] + 14;
>>  			else if (strcmp(argv[i], "--warn-dup") == 0)
>>  				warn_dup = true;
>>  			else if (strcmp(argv[i], "--error-dup") == 0)
> 
> Why does x86 need to plumb the source name in, but the other
> architectures don't?
> 
> xen-syms.S suffices for both x86 builds AFAICT, so can't it just be
> unconditional?

It could. Yet I'd prefer the distinction between xen.efi and xen-syms to
be recognizable (in case any dependent local symbol would show up).

Jan
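
As a rough illustration of the naming scheme discussed here, the sketch below derives a per-target virtual source name the way `$(@F).S` does in the Makefile hunks: strip the directory part, then append ".S". The function name `virtual_source_name()` is hypothetical, and the sketch only models the string handling, not anything make itself does.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper mimicking "$(@F).S": keep only the part of target
 * after the last '/', then append ".S". Returns what snprintf() returns.
 */
static int virtual_source_name(char *buf, size_t size, const char *target)
{
    const char *slash = strrchr(target, '/');
    const char *file = slash ? slash + 1 : target;

    return snprintf(buf, size, "%s.S", file);
}
```

Under this scheme a target ending in xen.efi maps to "xen.efi.S" and one ending in xen-syms maps to "xen-syms.S", which keeps the two builds distinguishable in the resulting symbol table.
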