From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monné
Subject: [PATCH v2 13/16] x86/msr: Use MSR_IMM when available
Date: Fri, 15 Aug 2025 21:41:14 +0100
Message-Id: <20250815204117.3312742-14-andrew.cooper3@citrix.com>
In-Reply-To: <20250815204117.3312742-1-andrew.cooper3@citrix.com>
References: <20250815204117.3312742-1-andrew.cooper3@citrix.com>

Most MSR accesses have compile time constant indexes.  By using the immediate
form when available, the decoder can start issuing uops directly for the
relevant MSR, rather than having to issue uops to implement "switch (%ecx)".
Modern CPUs have tens of thousands of MSRs, so that's quite an if/else chain.

Create __{rdmsr,wrmsrns}_imm() helpers and use them from {rdmsr,wrmsrns}()
when the compiler can determine that the msr index is known at compile time.

At the instruction level, the combined ABI is awkward.  Explain our choices in
detail.

Signed-off-by: Andrew Cooper
---
CC: Jan Beulich
CC: Roger Pau Monné

The expression wrmsrns(MSR_STAR, rdmsr(MSR_STAR)) now yields:

  :
    b9 81 00 00 c0          mov    $0xc0000081,%ecx
    0f 32                   rdmsr
    48 c1 e2 20             shl    $0x20,%rdx
    48 09 d0                or     %rdx,%rax
    48 89 c2                mov    %rax,%rdx
    48 c1 ea 20             shr    $0x20,%rdx
    2e 0f 30                cs wrmsr
    e9 a3 84 e8 ff          jmp    ffff82d040204260 <__x86_return_thunk>

which is as good as we can manage.  The alternative form of this looks like:

  :
    b9 81 00 00 c0          mov    $0xc0000081,%ecx
    c4 e7 7b f6 c0 81 00    rdmsr  $0xc0000081,%rax
    00 c0
    2e c4 e7 7a f6 c0 81    cs wrmsrns %rax,$0xc0000081
    00 00 c0
    e9 xx xx xx xx          jmp    ffff82d040204260 <__x86_return_thunk>

Still TBD.  We ought to update the *_safe() forms too.  rdmsr_safe() is easier
because the potential #GP locations line up, but there need to be two variants
because of

v2:
 * Let the compiler do %ecx setup
 * Add RDMSR $imm too
---
 xen/arch/x86/include/asm/alternative.h      |  7 ++
 xen/arch/x86/include/asm/msr.h              | 86 ++++++++++++++++++++-
 xen/include/public/arch-x86/cpufeatureset.h |  1 +
 3 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/alternative.h b/xen/arch/x86/include/asm/alternative.h
index 0482bbf7cbf1..fe87b15ec72c 100644
--- a/xen/arch/x86/include/asm/alternative.h
+++ b/xen/arch/x86/include/asm/alternative.h
@@ -151,6 +151,13 @@ extern void alternative_instructions(void);
         ALTERNATIVE(oldinstr, newinstr, feature)                        \
         :: input )
 
+#define alternative_input_2(oldinstr, newinstr1, feature1,              \
+                            newinstr2, feature2, input...)              \
+    asm_inline volatile (                                               \
+        ALTERNATIVE_2(oldinstr, newinstr1, feature1,                    \
+                      newinstr2, feature2)                              \
+        :: input )
+
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)   \
     asm_inline volatile (                                               \
diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h
index 1bd27b989a4d..2ceff6cca8bb 100644
--- a/xen/arch/x86/include/asm/msr.h
+++ b/xen/arch/x86/include/asm/msr.h
@@ -29,10 +29,52 @@
  *   wrmsrl(MSR_FOO, val);
  */
 
-static inline uint64_t rdmsr(unsigned int msr)
+/*
+ * RDMSR with a compile-time constant index, when available.  Falls back to
+ * plain RDMSR.
+ */
+static always_inline uint64_t __rdmsr_imm(uint32_t msr)
+{
+    uint64_t val;
+
+    /*
+     * For best performance, RDMSR $msr, %r64 is recommended.  For
+     * compatibility, we need to fall back to plain RDMSR.
+     *
+     * The combined ABI is awkward, because RDMSR $imm produces an r64,
+     * whereas plain RDMSR produces a split edx:eax pair.
+     *
+     * Always use RDMSR $imm, %rax, because it has the most in common with the
+     * legacy form.  When MSR_IMM isn't available, emit logic to fold %edx
+     * back into %rax.
+     *
+     * Let the compiler do %ecx setup.  This does mean there's a useless `mov
+     * $imm, %ecx` in the instruction stream in the MSR_IMM case, but it means
+     * the compiler can de-duplicate the setup in the common case of reading
+     * and writing the same MSR.
+     */
+    alternative_io(
+        "rdmsr\n\t"
+        "shl $32, %%rdx\n\t"
+        "or %%rdx, %%rax\n\t",
+
+        /* RDMSR $msr, %rax */
+        ".byte 0xc4,0xe7,0x7b,0xf6,0xc0; .long %c[msr]", X86_FEATURE_MSR_IMM,
+
+        "=a" (val),
+
+        [msr] "i" (msr), "c" (msr) : "rdx");
+
+    return val;
+}
+
+static always_inline uint64_t rdmsr(unsigned int msr)
 {
     unsigned long lo, hi;
 
+    if ( __builtin_constant_p(msr) )
+        return __rdmsr_imm(msr);
+
     asm volatile ( "rdmsr"
                    : "=a" (lo), "=d" (hi)
                    : "c" (msr) );
@@ -55,11 +97,51 @@ static inline void wrmsr(unsigned int msr, uint64_t val)
 }
 #define wrmsrl(msr, val) wrmsr(msr, val)
 
+/*
+ * Non-serialising WRMSR with a compile-time constant index, when available.
+ * Falls back to plain WRMSRNS, or to a serialising WRMSR.
+ */
+static always_inline void __wrmsrns_imm(uint32_t msr, uint64_t val)
+{
+    /*
+     * For best performance, WRMSRNS %r64, $msr is recommended.  For
+     * compatibility, we need to fall back to plain WRMSRNS, or to WRMSR.
+     *
+     * The combined ABI is awkward, because WRMSRNS $imm takes a single r64,
+     * whereas WRMSR{,NS} takes a split edx:eax pair.
+     *
+     * Always use WRMSRNS %rax, $imm, because it has the most in common with
+     * the legacy forms.  When MSR_IMM isn't available, emit setup logic for
+     * %edx.
+     *
+     * Let the compiler do %ecx setup.  This does mean there's a useless `mov
+     * $imm, %ecx` in the instruction stream in the MSR_IMM case, but it means
+     * the compiler can de-duplicate the setup in the common case of reading
+     * and writing the same MSR.
+     */
+    alternative_input_2(
+        "mov %%rax, %%rdx\n\t"
+        "shr $32, %%rdx\n\t"
+        ".byte 0x2e; wrmsr",
+
+        /* CS WRMSRNS %rax, $msr */
+        ".byte 0x2e,0xc4,0xe7,0x7a,0xf6,0xc0; .long %c[msr]", X86_FEATURE_MSR_IMM,
+
+        "mov %%rax, %%rdx\n\t"
+        "shr $32, %%rdx\n\t"
+        ".byte 0x0f,0x01,0xc6", X86_FEATURE_WRMSRNS,
+
+        [msr] "i" (msr), "a" (val), "c" (msr) : "rdx");
+}
+
 /* Non-serialising WRMSR, when available.  Falls back to a serialising WRMSR. */
-static inline void wrmsrns(uint32_t msr, uint64_t val)
+static always_inline void wrmsrns(uint32_t msr, uint64_t val)
 {
     uint32_t lo = val, hi = val >> 32;
 
+    if ( __builtin_constant_p(msr) )
+        return __wrmsrns_imm(msr, val);
+
     /*
      * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant CS
      * prefix to avoid a trailing NOP.
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index f7312e0b04e7..990b1d13f301 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -349,6 +349,7 @@ XEN_CPUFEATURE(MCDT_NO,            13*32+ 5) /*A  MCDT_NO */
 XEN_CPUFEATURE(UC_LOCK_DIS,        13*32+ 6) /*   UC-lock disable */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1.ecx, word 14 */
+XEN_CPUFEATURE(MSR_IMM,            14*32+ 5) /*   {RD,WR}MSR $imm32 */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1.edx, word 15 */
 XEN_CPUFEATURE(AVX_VNNI_INT8,      15*32+ 4) /*A  AVX-VNNI-INT8 Instructions */
-- 
2.39.5
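
For readers checking the encodings in this patch, here is a minimal userspace
sketch in C of the two pieces of arithmetic it relies on: folding a split
edx:eax pair into one 64-bit value (and deriving %edx from %rax in the other
direction), and the byte layout of the immediate forms.  The helper names
(fold_edx_eax, edx_from_rax, encode_rdmsr_imm, encode_wrmsrns_imm) are
hypothetical and not part of Xen.  The fixed prefix bytes are copied from the
.byte directives in the patch, and the immediate is assumed to be stored
little-endian, as .long does on x86.  Nothing here executes RDMSR or WRMSR; it
only computes values and fills byte buffers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fold a split edx:eax pair into one 64-bit value,
 * mirroring "shl $32, %rdx; or %rdx, %rax" after a plain RDMSR.
 */
static uint64_t fold_edx_eax(uint32_t eax, uint32_t edx)
{
    return ((uint64_t)edx << 32) | eax;
}

/*
 * Hypothetical helper: derive %edx from a 64-bit value held in %rax,
 * mirroring "mov %rax, %rdx; shr $32, %rdx" ahead of plain WRMSR{,NS}.
 */
static uint32_t edx_from_rax(uint64_t rax)
{
    return (uint32_t)(rax >> 32);
}

/* Store a 32-bit immediate least-significant byte first. */
static void put_le32(uint8_t *p, uint32_t v)
{
    for ( unsigned int i = 0; i < 4; i++ )
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Bytes for "RDMSR $msr, %rax": 5 fixed bytes, then the imm32 index. */
static void encode_rdmsr_imm(uint8_t buf[9], uint32_t msr)
{
    static const uint8_t fixed[] = { 0xc4, 0xe7, 0x7b, 0xf6, 0xc0 };

    memcpy(buf, fixed, sizeof(fixed));
    put_le32(buf + sizeof(fixed), msr);
}

/* Bytes for "CS WRMSRNS %rax, $msr": CS prefix, 5 fixed bytes, imm32. */
static void encode_wrmsrns_imm(uint8_t buf[10], uint32_t msr)
{
    static const uint8_t fixed[] = { 0x2e, 0xc4, 0xe7, 0x7a, 0xf6, 0xc0 };

    memcpy(buf, fixed, sizeof(fixed));
    put_le32(buf + sizeof(fixed), msr);
}
```

Calling encode_rdmsr_imm() with 0xc0000081 (the MSR_STAR index used in the
disassembly) fills the buffer with the same nine bytes listed for the rdmsr
line of the alternative form, and encode_wrmsrns_imm() likewise gives the ten
bytes listed for the cs wrmsrns line.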