Hi,
This series is a big revamp of vendor checking so that the compiler can perform
dead-code elimination (DCE) on it. It improves all configurations useful in
practice at minimal cost in the full build, and with a massive advantage in the
single-vendor case. Many ifdefs can go away as a side effect of the series.
This series depends on cross-vendor removal:
https://lore.kernel.org/xen-devel/20260205170923.38425-1-alejandro.garciavallejo@amd.com/T/#m4c3d318f37e4f24d0f8c62b104221aa5d428cebc
Patch 1 in this series is identical to the one in cross-vendor removal. It is
logically required, and it is the only dependency on that series.
High level description
======================
Compared to the RFC, this takes a different approach. The series introduces
cpu_vendor(), which maps to a constant in the single-vendor case and to
(boot_cpu_data.vendor & X86_ENABLED_VENDORS) otherwise, where
X86_ENABLED_VENDORS is a mask of the vendors chosen at compile time. This
enables the compiler to detect dead code at the uses and remove all unreachable
branches, including in switches.
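
As a rough illustration of the idea (not the code in the patches), a minimal
sketch could look like the following. The vendor bit values, the
X86_SINGLE_VENDOR helper, struct cpuinfo_sketch and host_vendor_name() are
assumptions made up for this example; only cpu_vendor(), boot_cpu_data.vendor
and X86_ENABLED_VENDORS are names taken from the description above.

```c
/* Illustrative vendor bits; the real values live in Xen's headers. */
#define X86_VENDOR_INTEL    (1u << 0)
#define X86_VENDOR_CENTAUR  (1u << 1)
#define X86_VENDOR_AMD      (1u << 2)
#define X86_VENDOR_SHANGHAI (1u << 3)
#define X86_VENDOR_HYGON    (1u << 4)

/* Assumption: in the series this mask is derived from Kconfig choices. */
#ifndef X86_ENABLED_VENDORS
#define X86_ENABLED_VENDORS (X86_VENDOR_AMD | X86_VENDOR_INTEL)
#endif

/* Stand-in for Xen's boot_cpu_data; only the vendor field is modelled. */
struct cpuinfo_sketch { unsigned int vendor; };
static struct cpuinfo_sketch boot_cpu_data = { .vendor = X86_VENDOR_AMD };

/* Hypothetical helper: true when exactly one vendor bit is enabled. */
#define X86_SINGLE_VENDOR \
    ((X86_ENABLED_VENDORS) != 0 && \
     ((X86_ENABLED_VENDORS) & ((X86_ENABLED_VENDORS) - 1)) == 0)

static inline unsigned int cpu_vendor(void)
{
    /* Single-vendor build: a compile-time constant, no memory read. */
    if ( X86_SINGLE_VENDOR )
        return X86_ENABLED_VENDORS;

    /*
     * Otherwise the result can only contain enabled bits, so the compiler
     * can prove that cases for compiled-out vendors are unreachable.
     * A vendor outside the mask reads back as 0 here; how the series
     * reacts to such a host at boot is not modelled.
     */
    return boot_cpu_data.vendor & X86_ENABLED_VENDORS;
}

/* Hypothetical user: cases for compiled-out vendors become dead code. */
static const char *host_vendor_name(void)
{
    switch ( cpu_vendor() )
    {
    case X86_VENDOR_INTEL: return "Intel";
    case X86_VENDOR_AMD:   return "AMD";
    default:               return "unknown";
    }
}
```

Building the sketch with -DX86_ENABLED_VENDORS=X86_VENDOR_AMD makes cpu_vendor()
a constant, so an optimising compiler can reduce host_vendor_name() to a single
return.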
When compared to the x86_vendor_is() macro introduced in the RFC, this is
simpler. It achieves MOST of what the older macro did without touching the
switches, with a few caveats:
1. Compiled-out vendors cause a panic; they don't fall back onto the unknown
vendor case. In retrospect, this is a much saner thing to do.
2. Equalities and inequalities have been replaced by equivalent
(cpu_vendor() & ...) forms. This isn't a stylistic preference. This form allows
the compiler to merge the compared-against constant with X86_ENABLED_VENDORS,
yielding much better codegen throughout the tree.
The effect of (2) triples the delta in the full build below.
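
To make (2) concrete, here is a hedged sketch of the two forms side by side.
The vendor bit values, AMD_LIKE and the helper function names are hypothetical,
and the sketch assumes boot_cpu_data.vendor holds at most one vendor bit. With
the mask form, (vendor & X86_ENABLED_VENDORS) & AMD_LIKE can be folded into a
single AND against the constant (X86_ENABLED_VENDORS & AMD_LIKE), and into a
constant false when that intersection is empty.

```c
#include <stdbool.h>

/* Illustrative vendor bits; the real values live in Xen's headers. */
#define X86_VENDOR_INTEL (1u << 0)
#define X86_VENDOR_AMD   (1u << 2)
#define X86_VENDOR_HYGON (1u << 4)

/* Assumption: Hygon is compiled out in this example configuration. */
#define X86_ENABLED_VENDORS (X86_VENDOR_AMD | X86_VENDOR_INTEL)

/* Hypothetical grouping used only in this sketch. */
#define AMD_LIKE (X86_VENDOR_AMD | X86_VENDOR_HYGON)

/* Stand-in for Xen's boot_cpu_data; only the vendor field is modelled. */
static struct { unsigned int vendor; } boot_cpu_data = { X86_VENDOR_AMD };

/* Multi-vendor flavour of cpu_vendor() as described in the cover letter. */
static inline unsigned int cpu_vendor(void)
{
    return boot_cpu_data.vendor & X86_ENABLED_VENDORS;
}

/* Equality form: one compare per vendor on top of the mask. */
static inline bool amd_like_eq(void)
{
    return cpu_vendor() == X86_VENDOR_AMD ||
           cpu_vendor() == X86_VENDOR_HYGON;
}

/* Mask form: folds to (boot_cpu_data.vendor & X86_VENDOR_AMD) != 0 here. */
static inline bool amd_like_mask(void)
{
    return (cpu_vendor() & AMD_LIKE) != 0;
}

/* Inequality form and its mask equivalent. */
static inline bool not_intel_ne(void)
{
    return cpu_vendor() != X86_VENDOR_INTEL;
}

static inline bool not_intel_mask(void)
{
    return !(cpu_vendor() & X86_VENDOR_INTEL);
}
```

Both pairs return the same result for any single-bit vendor value; the
difference is only in how much the compiler can simplify.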
Some differences might be attributable to the change from policy vendor checks
to boot_cpu_data. In the case of the emulator, this caused a 400-byte increase
because its checks go through LOTS of macro invocations, so I left that one
piece using the policy vendor, except in the single-vendor case.
And finally, some bloat-o-meters to better grasp the effects:
1. AMD + wo/ unknown (compared against prior AMD-only build)
add/remove: 0/12 grow/shrink: 4/66 up/down: 108/-11595 (-11487)
2. AMD + Intel + wo/ unknown (compared against prior full build)
(excludes Hygon, Shanghai, Centaur and any other unknown CPU vendors)
add/remove: 1/6 grow/shrink: 27/31 up/down: 356/-1552 (-1196)
3. All vendors + w/ unknown (compared against prior full build)
add/remove: 0/0 grow/shrink: 33/19 up/down: 398/-273 (125)
1. AMD + wo/ unknown (compared against prior AMD-only build)
============================================================
add/remove: 0/12 grow/shrink: 4/66 up/down: 108/-11595 (-11487)
Function old new delta
x86_cpu_policies_are_compatible 157 194 +37
xen_config_data 1476 1509 +33
amd_check_entrysign 807 827 +20
init_guest_cpu_policies 1364 1382 +18
opt_gds_mit 1 - -1
nmi_p6_event_width 4 - -4
nmi_p4_cccr_val 4 - -4
init_e820 1037 1033 -4
x86_mcinfo_dump 477 471 -6
pci_cfg_ok 307 301 -6
get_hw_residencies 213 205 -8
recalculate_cpuid_policy 909 900 -9
dom0_setup_permissions 3809 3800 -9
arch_ioreq_server_get_type_addr 250 241 -9
cpu_has_amd_erratum 230 219 -11
init_amd 2449 2437 -12
parse_spec_ctrl 2321 2307 -14
amd_nonfatal_mcheck_init 192 177 -15
shanghai_cpu_dev 16 - -16
hygon_cpu_dev 16 - -16
centaur_cpu_dev 16 - -16
x86emul_0fae 2758 2741 -17
vmce_init_vcpu 153 136 -17
pge_init 60 42 -18
cpufreq_cpu_init 34 15 -19
nmi_watchdog_tick 534 514 -20
vmce_restore_vcpu 160 139 -21
init_nonfatal_mce_checker 142 120 -22
ucode_update_hcall_cont 888 865 -23
vpmu_arch_initialise 195 168 -27
mce_firstbank 37 10 -27
init_shanghai 29 - -29
validate_gl4e 617 587 -30
l4e_propagate_from_guest 451 421 -30
guest_walk_tables_4_levels 3411 3381 -30
clear_msr_range 30 - -30
acpi_dead_idle 430 398 -32
print_mtrr_state 719 684 -35
amd_mcheck_init 451 416 -35
hvm_vcpu_virtual_to_linear 631 595 -36
do_IRQ 1783 1747 -36
init_bsp_APIC 193 149 -44
symbols_offsets 30800 30752 -48
cpu_callback 4650 4600 -50
mc_memerr_dhandler 903 851 -52
vpmu_init 309 248 -61
mcheck_init 1187 1122 -65
microcode_nmi_callback 205 139 -66
init_IRQ 477 407 -70
disable_lapic_nmi_watchdog 119 49 -70
__start_xen 9448 9378 -70
alternative_instructions 154 82 -72
traps_init 543 468 -75
protmode_load_seg 1904 1829 -75
set_cx_pminfo 1691 1614 -77
init_intel_cacheinfo 1191 1111 -80
is_cpu_primary 93 - -93
symbols_sorted_offsets 60760 60664 -96
do_mca 3181 3085 -96
guest_cpuid 2395 2292 -103
guest_common_max_feature_adjustments 110 - -110
read_msr 1471 1346 -125
x86emul_decode 12729 12597 -132
guest_common_default_feature_adjustments 232 62 -170
do_microcode_update 793 608 -185
cpufreq_driver_init 453 263 -190
vmce_wrmsr 967 768 -199
symbols_names 108114 107914 -200
recalculate_misc 898 689 -209
vmce_rdmsr 1083 872 -211
early_cpu_init 948 721 -227
guest_wrmsr 2853 2622 -231
init_centaur 238 - -238
domain_cpu_policy_changed 677 408 -269
write_msr 1752 1455 -297
x86_emulate 222203 221896 -307
init_hygon 402 - -402
guest_rdmsr 2308 1881 -427
start_vmx 1607 1105 -502
setup_apic_nmi_watchdog 977 276 -701
do_get_hw_residencies 1289 9 -1280
init_speculation_mitigations 9714 6788 -2926
Total: Before=3878514, After=3867027, chg -0.30%
2. AMD + Intel + wo/ unknown (compared against prior full build)
================================================================
add/remove: 1/6 grow/shrink: 27/31 up/down: 356/-1552 (-1196)
Function old new delta
start_vmx 1607 1686 +79
vcpu_info_reset - 38 +38
xen_config_data 1483 1520 +37
x86_cpu_policies_are_compatible 157 194 +37
amd_check_entrysign 807 838 +31
init_speculation_mitigations 9836 9862 +26
set_cx_pminfo 1691 1709 +18
read_msr 1471 1486 +15
guest_cpuid 2395 2405 +10
vmce_restore_vcpu 160 167 +7
recalculate_cpuid_policy 909 916 +7
vmce_init_vcpu 153 159 +6
do_mc_get_cpu_info 584 590 +6
init_e820 1037 1042 +5
x86_mcinfo_dump 477 480 +3
setup_apic_nmi_watchdog 977 980 +3
guest_common_max_feature_adjustments 110 113 +3
guest_common_default_feature_adjustments 257 260 +3
disable_lapic_nmi_watchdog 119 122 +3
cpu_has_amd_erratum 230 233 +3
amd_nonfatal_mcheck_init 192 195 +3
alternative_instructions 154 157 +3
pge_init 60 62 +2
mce_firstbank 37 39 +2
init_bsp_APIC 193 195 +2
do_get_hw_residencies 1289 1291 +2
mcheck_init 1227 1228 +1
intel_mcheck_init 2398 2399 +1
pci_cfg_ok 307 306 -1
init_nonfatal_mce_checker 160 159 -1
mc_memerr_dhandler 903 901 -2
recalculate_misc 898 890 -8
acpi_cpufreq_cpu_init 823 815 -8
cpufreq_driver_init 468 459 -9
vmce_intel_rdmsr 161 150 -11
validate_gl4e 617 605 -12
guest_walk_tables_4_levels 3411 3399 -12
traps_init 543 528 -15
l4e_propagate_from_guest 451 436 -15
shanghai_cpu_dev 16 - -16
hygon_cpu_dev 16 - -16
centaur_cpu_dev 16 - -16
write_msr 1752 1735 -17
cpu_callback 5100 5080 -20
amd_mcheck_init 451 431 -20
hvm_vcpu_virtual_to_linear 631 610 -21
domain_cpu_policy_changed 677 656 -21
vpmu_arch_initialise 195 173 -22
acpi_dead_idle 430 407 -23
symbols_offsets 31800 31776 -24
print_mtrr_state 719 693 -26
guest_rdmsr 2308 2282 -26
do_mca 3181 3153 -28
vpmu_init 320 291 -29
vcpu_create 864 835 -29
unmap_guest_area 198 169 -29
init_shanghai 29 - -29
symbols_sorted_offsets 62760 62720 -40
vmce_wrmsr 993 936 -57
vmce_rdmsr 1134 1073 -61
symbols_names 112215 112141 -74
init_intel_cacheinfo 1191 1111 -80
early_cpu_init 948 854 -94
init_centaur 238 - -238
init_hygon 402 - -402
Total: Before=3932243, After=3931047, chg -0.03%
3. All vendors + w/ unknown (compared against prior full build)
===============================================================
add/remove: 0/0 grow/shrink: 33/19 up/down: 398/-273 (125)
Function old new delta
start_vmx 1607 1686 +79
early_cpu_init 948 986 +38
x86_cpu_policies_are_compatible 157 194 +37
xen_config_data 1483 1515 +32
amd_check_entrysign 807 838 +31
init_speculation_mitigations 9836 9862 +26
set_cx_pminfo 1691 1709 +18
read_msr 1471 1486 +15
vmce_wrmsr 993 1005 +12
init_nonfatal_mce_checker 160 170 +10
guest_cpuid 2395 2405 +10
mcheck_init 1227 1235 +8
vmce_restore_vcpu 160 167 +7
recalculate_cpuid_policy 909 916 +7
vmce_init_vcpu 153 159 +6
init_intel_cacheinfo 1191 1197 +6
do_mc_get_cpu_info 584 590 +6
cpufreq_driver_init 468 474 +6
init_e820 1037 1042 +5
amd_mcheck_init 451 456 +5
x86_mcinfo_dump 477 480 +3
setup_apic_nmi_watchdog 977 980 +3
guest_common_max_feature_adjustments 110 113 +3
guest_common_default_feature_adjustments 257 260 +3
disable_lapic_nmi_watchdog 119 122 +3
cpu_has_amd_erratum 230 233 +3
cpu_callback 5100 5103 +3
amd_nonfatal_mcheck_init 192 195 +3
alternative_instructions 154 157 +3
mce_firstbank 37 39 +2
init_bsp_APIC 193 195 +2
do_get_hw_residencies 1289 1291 +2
intel_mcheck_init 2398 2399 +1
recalculate_misc 898 897 -1
pci_cfg_ok 307 306 -1
mc_memerr_dhandler 903 901 -2
vmce_rdmsr 1134 1127 -7
traps_init 543 535 -8
acpi_cpufreq_cpu_init 823 815 -8
vmce_intel_rdmsr 161 150 -11
validate_gl4e 617 605 -12
guest_walk_tables_4_levels 3411 3399 -12
l4e_propagate_from_guest 451 437 -14
write_msr 1752 1735 -17
vpmu_init 320 302 -18
print_mtrr_state 719 698 -21
hvm_vcpu_virtual_to_linear 631 610 -21
domain_cpu_policy_changed 677 656 -21
vpmu_arch_initialise 195 173 -22
acpi_dead_idle 430 407 -23
guest_rdmsr 2308 2282 -26
do_mca 3181 3153 -28
Total: Before=3932245, After=3932370, chg +0.00%
Alejandro Vallejo (12):
x86: Reject CPU policies with vendors other than the host's
x86: Add more granularity to the vendors in Kconfig
x86: Add cpu_vendor() as a wrapper for the host's CPU vendor
x86: Migrate MSR handler vendor checks to cpu_vendor()
x86: Migrate spec_ctrl vendor checks to cpu_vendor()
x86: Migrate switch vendor checks to cpu_vendor()
x86: Have x86_emulate/ implement the single-vendor optimisation
x86/acpi: Migrate vendor checks to cpu_vendor()
x86/pv: Migrate vendor checks to cpu_vendor()
x86/mcheck: Migrate vendor checks to use cpu_vendor()
x86/cpu: Migrate vendor checks to use cpu_vendor()
x86: Migrate every remaining raw vendor check to cpu_vendor()
xen/arch/x86/Kconfig.cpu | 43 ++++++++++++++++++++++++++
xen/arch/x86/acpi/cpu_idle.c | 16 +++++-----
xen/arch/x86/acpi/cpufreq/acpi.c | 2 +-
xen/arch/x86/acpi/cpufreq/cpufreq.c | 15 +++------
xen/arch/x86/alternative.c | 2 +-
xen/arch/x86/apic.c | 2 +-
xen/arch/x86/cpu-policy.c | 14 ++++-----
xen/arch/x86/cpu/Makefile | 6 ++--
xen/arch/x86/cpu/amd.c | 6 ++--
xen/arch/x86/cpu/common.c | 8 +++--
xen/arch/x86/cpu/intel_cacheinfo.c | 5 ++-
xen/arch/x86/cpu/mcheck/amd_nonfatal.c | 2 +-
xen/arch/x86/cpu/mcheck/mcaction.c | 2 +-
xen/arch/x86/cpu/mcheck/mce.c | 23 ++++++--------
xen/arch/x86/cpu/mcheck/mce.h | 2 +-
xen/arch/x86/cpu/mcheck/mce_amd.c | 7 ++---
xen/arch/x86/cpu/mcheck/mce_intel.c | 7 ++---
xen/arch/x86/cpu/mcheck/non-fatal.c | 6 +---
xen/arch/x86/cpu/mcheck/vmce.c | 16 +++-------
xen/arch/x86/cpu/microcode/amd.c | 2 +-
xen/arch/x86/cpu/microcode/core.c | 2 +-
xen/arch/x86/cpu/mtrr/generic.c | 5 ++-
xen/arch/x86/cpu/mwait-idle.c | 5 ++-
xen/arch/x86/cpu/vpmu.c | 9 ++----
xen/arch/x86/cpuid.c | 5 ++-
xen/arch/x86/dom0_build.c | 2 +-
xen/arch/x86/domain.c | 16 +++++-----
xen/arch/x86/e820.c | 2 +-
xen/arch/x86/guest/xen/xen.c | 6 +++-
xen/arch/x86/hvm/hvm.c | 3 +-
xen/arch/x86/hvm/ioreq.c | 3 +-
xen/arch/x86/hvm/vmx/vmx.c | 8 ++---
xen/arch/x86/i8259.c | 5 ++-
xen/arch/x86/include/asm/cpufeature.h | 27 ++++++++++++++++
xen/arch/x86/include/asm/guest_pt.h | 3 +-
xen/arch/x86/irq.c | 3 +-
xen/arch/x86/msr.c | 35 +++++++++------------
xen/arch/x86/nmi.c | 4 +--
xen/arch/x86/pv/domain.c | 2 +-
xen/arch/x86/pv/emul-priv-op.c | 27 +++++++---------
xen/arch/x86/setup.c | 7 ++---
xen/arch/x86/spec_ctrl.c | 42 ++++++++++---------------
xen/arch/x86/traps-setup.c | 2 +-
xen/arch/x86/x86_emulate/private.h | 10 +++++-
xen/arch/x86/x86_emulate/x86_emulate.c | 2 +-
xen/lib/x86/policy.c | 3 +-
46 files changed, 224 insertions(+), 200 deletions(-)
base-commit: 381b4ff16f7ff83a2dc44f16b8dd0208f3255ec7
--
2.43.0