[RFC 0/5] Reuse 32 bit C code more safely

Frediano Ziglio posted 5 patches 2 weeks, 1 day ago
Failed in applying to current master (apply log)
.gitignore                                    |   3 +-
xen/arch/x86/boot/Makefile                    |  30 ++-
.../x86/boot/{build32.lds => build32.lds.S}   |  60 +++++-
xen/arch/x86/boot/cmdline.c                   |   7 -
xen/arch/x86/boot/head.S                      |  89 +--------
xen/arch/x86/boot/reloc-trampoline.c          |  28 +++
xen/arch/x86/boot/reloc-trampoline64.c        |   1 +
xen/arch/x86/boot/reloc.c                     |  69 ++++---
xen/arch/x86/boot/setup-pages.c               | 116 ++++++++++++
xen/arch/x86/boot/setup-pages64.c             |   1 +
xen/arch/x86/boot/x86_64.S                    |   2 +-
xen/arch/x86/efi/efi-boot.h                   |  67 +------
xen/arch/x86/include/asm/page.h               |   3 +-
xen/tools/make_output                         | 177 ++++++++++++++++++
14 files changed, 446 insertions(+), 207 deletions(-)
rename xen/arch/x86/boot/{build32.lds => build32.lds.S} (52%)
create mode 100644 xen/arch/x86/boot/reloc-trampoline.c
create mode 120000 xen/arch/x86/boot/reloc-trampoline64.c
create mode 100644 xen/arch/x86/boot/setup-pages.c
create mode 120000 xen/arch/x86/boot/setup-pages64.c
create mode 100755 xen/tools/make_output
[RFC 0/5] Reuse 32 bit C code more safely
Posted by Frediano Ziglio 2 weeks, 1 day ago
This RFC series attempt to:
- use more C code, that is replace some assembly code with C;
- avoid some code duplication between C and assembly;
- prevent some issues having relocations in C code.

The idea is extending the current C to binary code conversion
done for 32 bit C code called from head.S making sure relocations
are safe and allowing external symbols usage from C code.

Note that, as an addition, scripts generating code check for no
data to allow code and data separation.

More details of the implementation are in commit message 2/5,
which is the largest patch.
Patch 1/5 is to prepare code and avoid data.
Patch 3/5 is an example of code reuse between 32 and 64 bit.
Patch 4/5 is also another example of code reuse but is more hacky and
dirty due to not being possible include use some headers.

Code boot successfully using:
- BIOS boot;
- EFI boot with Grub2 and ELF file;
- direct EFI boot without Grub.

Suggestions/opinions are welcome.

Code is currently based on "staging" branch, currently commit
6471badeeec92db1cb8155066551f7509cd82efd.

Frediano Ziglio (5):
  Avoid usage of global in reloc.c
  x86/boot: create a C bundle for 32 bit boot code and use it
  Reuse code to relocate trampoline
  Remove duplication preparing pages
  setup mapping for trampoline in setup_pagesXX

 .gitignore                                    |   3 +-
 xen/arch/x86/boot/Makefile                    |  30 ++-
 .../x86/boot/{build32.lds => build32.lds.S}   |  60 +++++-
 xen/arch/x86/boot/cmdline.c                   |   7 -
 xen/arch/x86/boot/head.S                      |  89 +--------
 xen/arch/x86/boot/reloc-trampoline.c          |  28 +++
 xen/arch/x86/boot/reloc-trampoline64.c        |   1 +
 xen/arch/x86/boot/reloc.c                     |  69 ++++---
 xen/arch/x86/boot/setup-pages.c               | 116 ++++++++++++
 xen/arch/x86/boot/setup-pages64.c             |   1 +
 xen/arch/x86/boot/x86_64.S                    |   2 +-
 xen/arch/x86/efi/efi-boot.h                   |  67 +------
 xen/arch/x86/include/asm/page.h               |   3 +-
 xen/tools/make_output                         | 177 ++++++++++++++++++
 14 files changed, 446 insertions(+), 207 deletions(-)
 rename xen/arch/x86/boot/{build32.lds => build32.lds.S} (52%)
 create mode 100644 xen/arch/x86/boot/reloc-trampoline.c
 create mode 120000 xen/arch/x86/boot/reloc-trampoline64.c
 create mode 100644 xen/arch/x86/boot/setup-pages.c
 create mode 120000 xen/arch/x86/boot/setup-pages64.c
 create mode 100755 xen/tools/make_output

-- 
2.46.0
Re: [RFC 0/5] Reuse 32 bit C code more safely
Posted by Andrew Cooper 1 week, 6 days ago
On 04/09/2024 3:56 pm, Frediano Ziglio wrote:
> This RFC series attempt to:
> - use more C code, that is replace some assembly code with C;
> - avoid some code duplication between C and assembly;
> - prevent some issues having relocations in C code.
>
> The idea is extending the current C to binary code conversion
> done for 32 bit C code called from head.S making sure relocations
> are safe and allowing external symbols usage from C code.
>
> Note that, as an addition, scripts generating code check for no
> data to allow code and data separation.
>
> More details of the implementation are in commit message 2/5,
> which is the largest patch.
> Patch 1/5 is to prepare code and avoid data.
> Patch 3/5 is an example of code reuse between 32 and 64 bit.
> Patch 4/5 is also another example of code reuse but is more hacky and
> dirty due to not being possible include use some headers.
>
> Code boot successfully using:
> - BIOS boot;
> - EFI boot with Grub2 and ELF file;
> - direct EFI boot without Grub.
>
> Suggestions/opinions are welcome.
>
> Code is currently based on "staging" branch, currently commit
> 6471badeeec92db1cb8155066551f7509cd82efd.

I fully support taking logic out of asm and writing it in C, as well as
taking steps to dedup the EFI and non-EFI paths.  A couple of
observations before diving into the details.

The visibility pragmas mean that you've lost the `-include xen/config.h`
from the $(CC) invocation.  We use this to get some Xen-wide settings
everywhere, which includes handling visibility.

The symlinks for dual builds are going to cause problems for tarball
archives.  Instead you can encode this with make rules.  e.g.

    obj-y += foo32.o foo64.o

    %32.o: %.c
        $(CC) -m32 ...

    %64.o: %.c
        $(CC) -m64 ...

will build two different .o's from the same .c.  This is how XTF builds
different tests from the same source.


I'm on the fence with the ifdefary and bit suffixes.  I don't think we
need the error case because x86_128 isn't coming along any time soon.

For completeness, there's a trick used by the shadow code (see
SHADOW_INTERNAL_NAME()) which adds a suffix without local ifdefary. 
It's nicer to read, but breaks grep/cscope/etc.  I'm torn as to which is
the lesser evil.

~Andrew

Re: [RFC 0/5] Reuse 32 bit C code more safely
Posted by Jan Beulich 1 week, 3 days ago
On 06.09.2024 17:59, Andrew Cooper wrote:
> On 04/09/2024 3:56 pm, Frediano Ziglio wrote:
>> This RFC series attempt to:
>> - use more C code, that is replace some assembly code with C;
>> - avoid some code duplication between C and assembly;
>> - prevent some issues having relocations in C code.
>>
>> The idea is extending the current C to binary code conversion
>> done for 32 bit C code called from head.S making sure relocations
>> are safe and allowing external symbols usage from C code.
>>
>> Note that, as an addition, scripts generating code check for no
>> data to allow code and data separation.
>>
>> More details of the implementation are in commit message 2/5,
>> which is the largest patch.
>> Patch 1/5 is to prepare code and avoid data.
>> Patch 3/5 is an example of code reuse between 32 and 64 bit.
>> Patch 4/5 is also another example of code reuse but is more hacky and
>> dirty due to not being possible include use some headers.
>>
>> Code boot successfully using:
>> - BIOS boot;
>> - EFI boot with Grub2 and ELF file;
>> - direct EFI boot without Grub.
>>
>> Suggestions/opinions are welcome.
>>
>> Code is currently based on "staging" branch, currently commit
>> 6471badeeec92db1cb8155066551f7509cd82efd.
> 
> I fully support taking logic out of asm and writing it in C, as well as
> taking steps to dedup the EFI and non-EFI paths.  A couple of
> observations before diving into the details.
> 
> The visibility pragmas mean that you've lost the `-include xen/config.h`
> from the $(CC) invocation.  We use this to get some Xen-wide settings
> everywhere, which includes handling visibility.
> 
> The symlinks for dual builds are going to cause problems for tarball
> archives.  Instead you can encode this with make rules.  e.g.
> 
>     obj-y += foo32.o foo64.o
> 
>     %32.o: %.c
>         $(CC) -m32 ...
> 
>     %64.o: %.c
>         $(CC) -m64 ...
> 
> will build two different .o's from the same .c.  This is how XTF builds
> different tests from the same source.
> 
> 
> I'm on the fence with the ifdefary and bit suffixes.  I don't think we
> need the error case because x86_128 isn't coming along any time soon.
> 
> For completeness, there's a trick used by the shadow code (see
> SHADOW_INTERNAL_NAME()) which adds a suffix without local ifdefary. 
> It's nicer to read, but breaks grep/cscope/etc.  I'm torn as to which is
> the lesser evil.

While generally I prefer that approach shadow code takes, I think the
#ifdef-ary is acceptably bounded here. What I'm more worried by are the
fair number of #define-s for the 32-bit case of setting up page tables.

Jan