[PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed

Posted by Bertrand Marquis 4 years, 5 months ago
Sanitize CTR_EL0 value between cores.

In most cases, different values will taint Xen, but if different
i-cache policies are found, we choose the one that will be compatible
across all cores in terms of invalidation/data cache flushing strategy.

In this case we need to activate the TID2 bit in HCR to emulate the
CTR_EL0 register for guests. This patch is not activating the TID2 bit all
the time, to limit the overhead when possible.

When TID2 is activated, we also need to emulate the CCSIDR, CSSELR and
CLIDR registers, which is done here for both the 32-bit and 64-bit versions
of the registers.

Add CTR register field definitions using the Linux values and define names,
and take the opportunity to rename CTR_L1Ip to an upper-case name
instead. The patch also defines ICACHE_POLICY_xxx instead of only
having CTR_L1IP_xxx. Code using those defines is also updated by this
patch (arm32 setup).

On most big.LITTLE platforms this patch will activate TID2 and emulate a
VIPT type of i-cache for all cores (as most LITTLE cores are VIPT whereas
big ones are PIPT). This is the case for example on Juno boards.

On platforms with only the same type of cores, this patch should not
modify the behaviour.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
Changes in v3: none
Change in v2: Patch introduced in v2
---
 xen/arch/arm/arm64/cpufeature.c  | 19 +++++++++++---
 xen/arch/arm/arm64/vsysreg.c     | 40 ++++++++++++++++++++++++++++
 xen/arch/arm/cpufeature.c        |  2 ++
 xen/arch/arm/domain.c            |  8 ++++++
 xen/arch/arm/setup.c             |  2 +-
 xen/arch/arm/vcpreg.c            | 45 ++++++++++++++++++++++++++++++++
 xen/include/asm-arm/arm64/hsr.h  |  6 +++++
 xen/include/asm-arm/cpufeature.h | 11 ++++++++
 xen/include/asm-arm/processor.h  | 18 ++++++++++---
 9 files changed, 143 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
index b1936ef1d6..334d590ba0 100644
--- a/xen/arch/arm/arm64/cpufeature.c
+++ b/xen/arch/arm/arm64/cpufeature.c
@@ -275,9 +275,6 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 	ARM64_FTR_END,
 };
 
-#if 0
-/* TODO: use this to sanitize the cache line size among cores */
-
 static const struct arm64_ftr_bits ftr_ctr[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
@@ -294,7 +291,6 @@ static const struct arm64_ftr_bits ftr_ctr[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0),
 	ARM64_FTR_END,
 };
-#endif
 
 static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
 	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_MMFR0_INNERSHR_SHIFT, 4, 0xf),
@@ -510,6 +506,12 @@ static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
  * End of imported linux structures and code
  */
 
+/*
+ * This is set to true if we have different type of i-caches on cores
+ * and used to activate TID2 bit to emulate CTR_EL0 register for guests
+ */
+bool mismatch_cache_type = false;
+
 static void sanitize_reg(u64 *cur_reg, u64 new_reg, const char *reg_name,
 						const struct arm64_ftr_bits *ftrp)
 {
@@ -600,6 +602,15 @@ void update_system_features(const struct cpuinfo_arm *new)
 	 */
 	SANITIZE_REG(dczid, 0, dczid);
 
+	SANITIZE_REG(ctr, 0, ctr);
+
+	/*
+	 * If CTR is different among cores, set mismatch_cache_type to activate
+	 * TID2 bit in HCR and emulate CTR register access for guests.
+	 */
+	if ( system_cpuinfo.ctr.bits[0] != new->ctr.bits[0] )
+		mismatch_cache_type = true;
+
 	if ( cpu_feature64_has_el0_32(&system_cpuinfo) )
 	{
 		SANITIZE_ID_REG(pfr32, 0, pfr0);
diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index 887266dd46..17212bd7ae 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -341,6 +341,46 @@ void do_sysreg(struct cpu_user_regs *regs,
     case HSR_SYSREG(3,0,c0,c7,7):
         return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
 
+    /*
+     * HCR_EL2.TID2
+     *
+     * registers related to cache detection
+     */
+    case HSR_SYSREG_CTR_EL0:
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                system_cpuinfo.ctr.bits[0]);
+
+    case HSR_SYSREG_CLIDR_EL1:
+        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
+                READ_SYSREG(CLIDR_EL1));
+
+    case HSR_SYSREG_CSSELR_EL1:
+        if ( psr_mode_is_user(regs) )
+            return inject_undef_exception(regs, hsr);
+        if ( hsr.sysreg.read )
+            set_user_reg(regs, regidx, v->arch.csselr);
+        else
+            v->arch.csselr = get_user_reg(regs, regidx);
+        break;
+
+    case HSR_SYSREG_CCSIDR_EL1:
+        if ( psr_mode_is_user(regs) )
+            return inject_undef_exception(regs, hsr);
+        if ( hsr.sysreg.read )
+        {
+            /* we need to set CSSELR and do the read of CCSIDR atomically */
+            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
+            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
+        }
+        break;
+
+    case HSR_SYSREG_CCSIDR2_EL1:
+        /*
+         * This would need to return a properly defined value if CCIDX is
+         * implemented in the processor
+         */
+        return inject_undef_exception(regs, hsr);
+
     /*
      * HCR_EL2.TIDCP
      *
diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index 113f20f601..6e51f530a8 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -127,6 +127,8 @@ void identify_cpu(struct cpuinfo_arm *c)
 
     c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
 
+    c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
+
     aarch32_el0 = cpu_feature64_has_el0_32(c);
 #endif
 
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 19c756ac3d..7a97fde3e7 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -585,6 +585,14 @@ int arch_vcpu_create(struct vcpu *v)
     v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
 
     v->arch.hcr_el2 = get_default_hcr_flags();
+#ifdef CONFIG_ARM64
+    /*
+     * Only activated TID2 to catch access to CTR_EL0 if the platform has some
+     * mismatching i-cache types among cores
+     */
+    if ( mismatch_cache_type )
+        v->arch.hcr_el2 |= HCR_TID2;
+#endif
 
     if ( (rc = vcpu_vgic_init(v)) != 0 )
         goto fail;
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 3798c5ade0..33b7bfb59c 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -627,7 +627,7 @@ static void __init setup_mm(void)
         panic("No memory bank\n");
 
     /* We only supports instruction caches implementing the IVIPT extension. */
-    if ( ((ctr >> CTR_L1Ip_SHIFT) & CTR_L1Ip_MASK) == CTR_L1Ip_AIVIVT )
+    if ( ((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) == ICACHE_POLICY_AIVIVT )
         panic("AIVIVT instruction cache not supported\n");
 
     init_pdx();
diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
index 33259c4194..5ffed96ded 100644
--- a/xen/arch/arm/vcpreg.c
+++ b/xen/arch/arm/vcpreg.c
@@ -361,6 +361,51 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
     HSR_CPREG32_TID3_CASES(c7):
         return handle_ro_raz(regs, regidx, cp32.read, hsr, 1);
 
+#ifdef CONFIG_ARM64
+    /*
+     * HCR_EL2.TID2
+     *
+     * registers related to cache detection
+     * Only supported on arm64 as we do not sanitize cpuinfo on arm32 so we
+     * do not need to emulate those.
+     */
+    case HSR_CPREG32(CTR):
+        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
+                system_cpuinfo.ctr.bits[0]);
+
+    case HSR_CPREG32(CLIDR):
+        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
+                READ_SYSREG(CLIDR_EL1));
+
+    case HSR_CPREG32(CSSELR):
+        if ( psr_mode_is_user(regs) )
+            return inject_undef_exception(regs, hsr);
+        if ( cp32.read )
+            set_user_reg(regs, regidx, v->arch.csselr);
+        else
+            v->arch.csselr = get_user_reg(regs, regidx);
+        break;
+
+    case HSR_CPREG32(CCSIDR):
+        if ( psr_mode_is_user(regs) )
+            return inject_undef_exception(regs, hsr);
+        if ( cp32.read )
+        {
+            /* we need to set CSSELR and do the read of CCSIDR atomically */
+            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
+            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
+        }
+        break;
+
+    case HSR_CPREG32(CCSIDR2):
+        /*
+         * This would need to return a properly defined value if CCIDX is
+         * implemented in the processor
+         */
+        return inject_undef_exception(regs, hsr);
+
+#endif
+
     /*
      * HCR_EL2.TIDCP
      *
diff --git a/xen/include/asm-arm/arm64/hsr.h b/xen/include/asm-arm/arm64/hsr.h
index e691d41c17..c33980e4e5 100644
--- a/xen/include/asm-arm/arm64/hsr.h
+++ b/xen/include/asm-arm/arm64/hsr.h
@@ -147,6 +147,12 @@
 #define HSR_SYSREG_ID_AA64AFR1_EL1   HSR_SYSREG(3,0,c0,c5,5)
 #define HSR_SYSREG_ID_AA64ZFR0_EL1   HSR_SYSREG(3,0,c0,c4,4)
 
+#define HSR_SYSREG_CTR_EL0      HSR_SYSREG(3,3,c0,c0,1)
+#define HSR_SYSREG_CLIDR_EL1    HSR_SYSREG(3,1,c0,c0,1)
+#define HSR_SYSREG_CSSELR_EL1   HSR_SYSREG(3,2,c0,c0,0)
+#define HSR_SYSREG_CCSIDR_EL1   HSR_SYSREG(3,1,c0,c0,0)
+#define HSR_SYSREG_CCSIDR2_EL1  HSR_SYSREG(3,1,c0,c0,2)
+
 #endif /* __ASM_ARM_ARM64_HSR_H */
 
 /*
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 5219fd3bab..ca6e827fcb 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -267,6 +267,14 @@ struct cpuinfo_arm {
         register_t bits[1];
     } dczid;
 
+    /*
+     * CTR is only used to check for different cache types or policies and
+     * taint Xen in this case
+     */
+    struct {
+        register_t bits[1];
+    } ctr;
+
 #endif
 
     /*
@@ -339,6 +347,9 @@ extern struct cpuinfo_arm system_cpuinfo;
 extern void identify_cpu(struct cpuinfo_arm *);
 
 #ifdef CONFIG_ARM_64
+
+extern bool mismatched_cache_type;
+
 extern void update_system_features(const struct cpuinfo_arm *);
 #else
 static inline void update_system_features(const struct cpuinfo_arm *cpuinfo)
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 2577e9a244..8c9843e12b 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -7,9 +7,21 @@
 #include <public/arch-arm.h>
 
 /* CTR Cache Type Register */
-#define CTR_L1Ip_MASK       0x3
-#define CTR_L1Ip_SHIFT      14
-#define CTR_L1Ip_AIVIVT     0x1
+#define CTR_L1IP_MASK       0x3
+#define CTR_L1IP_SHIFT      14
+#define CTR_DMINLINE_SHIFT  16
+#define CTR_IMINLINE_SHIFT  0
+#define CTR_IMINLINE_MASK   0xf
+#define CTR_ERG_SHIFT       20
+#define CTR_CWG_SHIFT       24
+#define CTR_CWG_MASK        15
+#define CTR_IDC_SHIFT       28
+#define CTR_DIC_SHIFT       29
+
+#define ICACHE_POLICY_VPIPT  0
+#define ICACHE_POLICY_AIVIVT 1
+#define ICACHE_POLICY_VIPT   2
+#define ICACHE_POLICY_PIPT   3
 
 /* MIDR Main ID Register */
 #define MIDR_REVISION_MASK      0xf
-- 
2.17.1


Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Julien Grall 4 years, 5 months ago
Hi Bertrand,

On 25/08/2021 14:18, Bertrand Marquis wrote:
> Sanitize CTR_EL0 value between cores.
> 
> In most cases different values will taint Xen but if different
> i-cache policies are found, we choose the one which will be compatible
> between all cores in terms of invalidation/data cache flushing strategy.

I understand that all the CPUs in Xen need to agree on the cache flush 
strategy. However...

> 
> In this case we need to activate the TID2 bit in HCR to emulate the
> TCR_EL0 register for guests. This patch is not activating TID2 bit all
> the time to limit the overhead when possible.

as we discussed in an earlier version, a vCPU is unlikely (at least in the 
short/medium term) to be able to move across pCPUs of different types. So the 
vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be 
big.LITTLE aware and therefore would be able to make its own strategy 
decision.

So I think we should be able to get away from trapping the registers.

> When TID2 is activate we also need to emulate the CCSIDR, CSSELR and
> CLIDR registers which is done here for both 32 and 64bit versions of the
> registers.
> 
> Add CTR register field definitions using Linux value and define names
> and use the opportunity to rename CTR_L1Ip to use an upper case name
> instead. The patch is also defining ICACHE_POLICY_xxx instead of only
> having CTR_L1IP_xxx. Code using those defines is also updated by this
> patch (arm32 setup).
> 
> On most big/LITTLE platforms this patch will activate TID2 and emulate
> VIPT type of i-cache for all cores (as most LITTLE cores are VIPT where
> big ones are PIPT). This is the case for example on Juno boards.
> 
> On platforms with only the same type of cores, this patch should not
> modify the behaviour.
> 
> Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
> ---
> Changes in v3: none
> Change in v2: Patch introduced in v2
> ---
>   xen/arch/arm/arm64/cpufeature.c  | 19 +++++++++++---
>   xen/arch/arm/arm64/vsysreg.c     | 40 ++++++++++++++++++++++++++++
>   xen/arch/arm/cpufeature.c        |  2 ++
>   xen/arch/arm/domain.c            |  8 ++++++
>   xen/arch/arm/setup.c             |  2 +-
>   xen/arch/arm/vcpreg.c            | 45 ++++++++++++++++++++++++++++++++
>   xen/include/asm-arm/arm64/hsr.h  |  6 +++++
>   xen/include/asm-arm/cpufeature.h | 11 ++++++++
>   xen/include/asm-arm/processor.h  | 18 ++++++++++---
>   9 files changed, 143 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
> index b1936ef1d6..334d590ba0 100644
> --- a/xen/arch/arm/arm64/cpufeature.c
> +++ b/xen/arch/arm/arm64/cpufeature.c
> @@ -275,9 +275,6 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
>   	ARM64_FTR_END,
>   };
>   
> -#if 0
> -/* TODO: use this to sanitize the cache line size among cores */
> -
>   static const struct arm64_ftr_bits ftr_ctr[] = {
>   	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
>   	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
> @@ -294,7 +291,6 @@ static const struct arm64_ftr_bits ftr_ctr[] = {
>   	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0),
>   	ARM64_FTR_END,
>   };
> -#endif
>   
>   static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
>   	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_MMFR0_INNERSHR_SHIFT, 4, 0xf),
> @@ -510,6 +506,12 @@ static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
>    * End of imported linux structures and code
>    */
>   
> +/*
> + * This is set to true if we have different type of i-caches on cores
> + * and used to activate TID2 bit to emulate CTR_EL0 register for guests
> + */
> +bool mismatch_cache_type = false;

If we are still planning to trap and emulate the registers, then this 
needs to be an HW capability (see cpus_set_cap()).
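
A rough sketch of what using an HW capability could look like (the
capability name ARM64_MISMATCHED_CACHE_TYPE and its number below are only
illustrative, not existing defines):

    /* Hypothetical capability: a real patch would add a slot in
     * asm-arm/cpufeature.h and bump ARM_NCAPS accordingly. */
    #define ARM64_MISMATCHED_CACHE_TYPE  12

    /* In update_system_features(), instead of a standalone boolean: */
    if ( system_cpuinfo.ctr.bits[0] != new->ctr.bits[0] )
        cpus_set_cap(ARM64_MISMATCHED_CACHE_TYPE);

    /* In arch_vcpu_create(), without any #ifdef: */
    if ( cpus_have_cap(ARM64_MISMATCHED_CACHE_TYPE) )
        v->arch.hcr_el2 |= HCR_TID2;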

> +
>   static void sanitize_reg(u64 *cur_reg, u64 new_reg, const char *reg_name,
>   						const struct arm64_ftr_bits *ftrp)
>   {
> @@ -600,6 +602,15 @@ void update_system_features(const struct cpuinfo_arm *new)
>   	 */
>   	SANITIZE_REG(dczid, 0, dczid);
>   
> +	SANITIZE_REG(ctr, 0, ctr);
> +
> +	/*
> +	 * If CTR is different among cores, set mismatch_cache_type to activate
> +	 * TID2 bit in HCR and emulate CTR register access for guests.
> +	 */
> +	if ( system_cpuinfo.ctr.bits[0] != new->ctr.bits[0] )
> +		mismatch_cache_type = true;
> +
>   	if ( cpu_feature64_has_el0_32(&system_cpuinfo) )
>   	{
>   		SANITIZE_ID_REG(pfr32, 0, pfr0);
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index 887266dd46..17212bd7ae 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -341,6 +341,46 @@ void do_sysreg(struct cpu_user_regs *regs,
>       case HSR_SYSREG(3,0,c0,c7,7):
>           return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
>   
> +    /*
> +     * HCR_EL2.TID2
> +     *
> +     * registers related to cache detection
> +     */
> +    case HSR_SYSREG_CTR_EL0:
> +        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
> +                system_cpuinfo.ctr.bits[0]);

Coding style: This needs to be aligned with the first argument.

> +
> +    case HSR_SYSREG_CLIDR_EL1:
> +        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
> +                READ_SYSREG(CLIDR_EL1));

Same.

> +
> +    case HSR_SYSREG_CSSELR_EL1:
> +        if ( psr_mode_is_user(regs) )
> +            return inject_undef_exception(regs, hsr);
> +        if ( hsr.sysreg.read )
> +            set_user_reg(regs, regidx, v->arch.csselr);
> +        else
> +            v->arch.csselr = get_user_reg(regs, regidx);
> +        break;
> +
> +    case HSR_SYSREG_CCSIDR_EL1:
> +        if ( psr_mode_is_user(regs) )
> +            return inject_undef_exception(regs, hsr);
> +        if ( hsr.sysreg.read )
> +        {
> +            /* we need to set CSSELR and do the read of CCSIDR atomically */

I couldn't find this requirement in the Arm Arm. Do you have the section 
at hand?

> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
> +        }

 From the Arm Arm (D13.2.25 in ARM DDI 0487F.c):

"
If CSSELR_EL1.Level is programmed to a cache level that is not 
implemented, then on a read of the CCSIDR_EL1
the behavior is CONSTRAINED UNPREDICTABLE, and can be one of the following:
• The CCSIDR_EL1 read is treated as NOP.
• The CCSIDR_EL1 read is UNDEFINED.
• The CCSIDR_EL1 read returns an UNKNOWN value.
"

We can't trust the guest here, so we need to prevent any of this 
behavior (in particular 1 and 2) to happen. The options are:
  1) Sanitize the values from the guest
  2) Make sure the register is 0 before reading (for the NOP case to 
avoid leaking a value from Xen) and catch the undefined (something similar 
to the extable on x86).
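
A minimal sketch of option 1 (the helper name and exact checks are only
illustrative): only accept a CSSELR value that selects a cache level which,
according to CLIDR_EL1, is actually implemented.

    static bool csselr_is_valid(register_t csselr)
    {
        unsigned int level = (csselr >> 1) & 0x7;   /* CSSELR_EL1.Level */
        bool ind = csselr & 1;                      /* CSSELR_EL1.InD */
        /* CLIDR_EL1.Ctype<n+1> lives at bits [3n+2:3n] */
        unsigned int ctype = (READ_SYSREG(CLIDR_EL1) >> (3 * level)) & 0x7;

        if ( ctype == 0 )                   /* no cache at this level */
            return false;
        if ( ind )                          /* I-cache requested */
            return ctype == 1 || ctype == 3;
        return ctype >= 2;                  /* D-cache or unified */
    }

    /* In the CSSELR write emulation, only latch validated values: */
    if ( csselr_is_valid(get_user_reg(regs, regidx)) )
        v->arch.csselr = get_user_reg(regs, regidx);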

> +        break;
> +
> +    case HSR_SYSREG_CCSIDR2_EL1:
> +        /*
> +         * This would need to return a properly defined value if CCIDX is
> +         * implemented in the processor
> +         */
> +        return inject_undef_exception(regs, hsr);
> +
>       /*
>        * HCR_EL2.TIDCP
>        *
> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index 113f20f601..6e51f530a8 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -127,6 +127,8 @@ void identify_cpu(struct cpuinfo_arm *c)
>   
>       c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>   
> +    c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
> +
>       aarch32_el0 = cpu_feature64_has_el0_32(c);
>   #endif
>   
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 19c756ac3d..7a97fde3e7 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -585,6 +585,14 @@ int arch_vcpu_create(struct vcpu *v)
>       v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
>   
>       v->arch.hcr_el2 = get_default_hcr_flags();
> +#ifdef CONFIG_ARM64

This #ifdef could be dropped if we use an HW cap.

> +    /*
> +     * Only activated TID2 to catch access to CTR_EL0 if the platform has some
> +     * mismatching i-cache types among cores
> +     */
> +    if ( mismatch_cache_type )
> +        v->arch.hcr_el2 |= HCR_TID2;
> +#endif
>   
>       if ( (rc = vcpu_vgic_init(v)) != 0 )
>           goto fail;
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 3798c5ade0..33b7bfb59c 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -627,7 +627,7 @@ static void __init setup_mm(void)
>           panic("No memory bank\n");
>   
>       /* We only supports instruction caches implementing the IVIPT extension. */
> -    if ( ((ctr >> CTR_L1Ip_SHIFT) & CTR_L1Ip_MASK) == CTR_L1Ip_AIVIVT )
> +    if ( ((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) == ICACHE_POLICY_AIVIVT )
>           panic("AIVIVT instruction cache not supported\n");
>   
>       init_pdx();
> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
> index 33259c4194..5ffed96ded 100644
> --- a/xen/arch/arm/vcpreg.c
> +++ b/xen/arch/arm/vcpreg.c
> @@ -361,6 +361,51 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>       HSR_CPREG32_TID3_CASES(c7):
>           return handle_ro_raz(regs, regidx, cp32.read, hsr, 1);
>   
> +#ifdef CONFIG_ARM64
> +    /*
> +     * HCR_EL2.TID2
> +     *
> +     * registers related to cache detection
> +     * Only supported on arm64 as we do not sanitize cpuinfo on arm32 so we
> +     * do not need to emulate those.
> +     */
> +    case HSR_CPREG32(CTR):
> +        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
> +                system_cpuinfo.ctr.bits[0]);
> +
> +    case HSR_CPREG32(CLIDR):
> +        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
> +                READ_SYSREG(CLIDR_EL1));
> +
> +    case HSR_CPREG32(CSSELR):
> +        if ( psr_mode_is_user(regs) )
> +            return inject_undef_exception(regs, hsr);
> +        if ( cp32.read )
> +            set_user_reg(regs, regidx, v->arch.csselr);
> +        else
> +            v->arch.csselr = get_user_reg(regs, regidx);
> +        break;
> +
> +    case HSR_CPREG32(CCSIDR):
> +        if ( psr_mode_is_user(regs) )
> +            return inject_undef_exception(regs, hsr);
> +        if ( cp32.read )
> +        {
> +            /* we need to set CSSELR and do the read of CCSIDR atomically */
> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
> +        }
> +        break;
> +
> +    case HSR_CPREG32(CCSIDR2):
> +        /*
> +         * This would need to return a properly defined value if CCIDX is
> +         * implemented in the processor
> +         */
> +        return inject_undef_exception(regs, hsr);
> +
> +#endif
> +
>       /*
>        * HCR_EL2.TIDCP
>        *
> diff --git a/xen/include/asm-arm/arm64/hsr.h b/xen/include/asm-arm/arm64/hsr.h
> index e691d41c17..c33980e4e5 100644
> --- a/xen/include/asm-arm/arm64/hsr.h
> +++ b/xen/include/asm-arm/arm64/hsr.h
> @@ -147,6 +147,12 @@
>   #define HSR_SYSREG_ID_AA64AFR1_EL1   HSR_SYSREG(3,0,c0,c5,5)
>   #define HSR_SYSREG_ID_AA64ZFR0_EL1   HSR_SYSREG(3,0,c0,c4,4)
>   
> +#define HSR_SYSREG_CTR_EL0      HSR_SYSREG(3,3,c0,c0,1)
> +#define HSR_SYSREG_CLIDR_EL1    HSR_SYSREG(3,1,c0,c0,1)
> +#define HSR_SYSREG_CSSELR_EL1   HSR_SYSREG(3,2,c0,c0,0)
> +#define HSR_SYSREG_CCSIDR_EL1   HSR_SYSREG(3,1,c0,c0,0)
> +#define HSR_SYSREG_CCSIDR2_EL1  HSR_SYSREG(3,1,c0,c0,2)
> +
>   #endif /* __ASM_ARM_ARM64_HSR_H */
>   
>   /*
> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
> index 5219fd3bab..ca6e827fcb 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -267,6 +267,14 @@ struct cpuinfo_arm {
>           register_t bits[1];
>       } dczid;
>   
> +    /*
> +     * CTR is only used to check for different cache types or policies and
> +     * taint Xen in this case
> +     */
> +    struct {
> +        register_t bits[1];
> +    } ctr;
> +
>   #endif
>   
>       /*
> @@ -339,6 +347,9 @@ extern struct cpuinfo_arm system_cpuinfo;
>   extern void identify_cpu(struct cpuinfo_arm *);
>   
>   #ifdef CONFIG_ARM_64
> +
> +extern bool mismatched_cache_type;
> +
>   extern void update_system_features(const struct cpuinfo_arm *);
>   #else
>   static inline void update_system_features(const struct cpuinfo_arm *cpuinfo)
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 2577e9a244..8c9843e12b 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -7,9 +7,21 @@
>   #include <public/arch-arm.h>
>   
>   /* CTR Cache Type Register */
> -#define CTR_L1Ip_MASK       0x3
> -#define CTR_L1Ip_SHIFT      14
> -#define CTR_L1Ip_AIVIVT     0x1
> +#define CTR_L1IP_MASK       0x3
> +#define CTR_L1IP_SHIFT      14
> +#define CTR_DMINLINE_SHIFT  16
> +#define CTR_IMINLINE_SHIFT  0
> +#define CTR_IMINLINE_MASK   0xf
> +#define CTR_ERG_SHIFT       20
> +#define CTR_CWG_SHIFT       24
> +#define CTR_CWG_MASK        15
> +#define CTR_IDC_SHIFT       28
> +#define CTR_DIC_SHIFT       29
> +
> +#define ICACHE_POLICY_VPIPT  0
> +#define ICACHE_POLICY_AIVIVT 1
> +#define ICACHE_POLICY_VIPT   2
> +#define ICACHE_POLICY_PIPT   3
>   
>   /* MIDR Main ID Register */
>   #define MIDR_REVISION_MASK      0xf
> 

Cheers,

-- 
Julien Grall

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Bertrand Marquis 4 years, 5 months ago
Hi Julien,

> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
> 
> Hi Bertrand,
> 
> On 25/08/2021 14:18, Bertrand Marquis wrote:
>> Sanitize CTR_EL0 value between cores.
>> In most cases different values will taint Xen but if different
>> i-cache policies are found, we choose the one which will be compatible
>> between all cores in terms of invalidation/data cache flushing strategy.
> 
> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
> 
>> In this case we need to activate the TID2 bit in HCR to emulate the
>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>> the time to limit the overhead when possible.
> 
> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
> 
> So I think we should be able to get away from trappings the registers.

I do agree that we should be able to get away from that in the long term once
we have cpupools properly set up, but right now this is the only way to have
something usable (I will not say right).
I will work on finding a way to properly set up cpupools (or something else as
we discussed earlier) but in the short term I think this is the best we can do.

Another solution would be to discard this patch from the series for now until
I have worked out a proper solution for this case.

Should we discard or merge, or do you have another idea?


> 
>> When TID2 is activate we also need to emulate the CCSIDR, CSSELR and
>> CLIDR registers which is done here for both 32 and 64bit versions of the
>> registers.
>> Add CTR register field definitions using Linux value and define names
>> and use the opportunity to rename CTR_L1Ip to use an upper case name
>> instead. The patch is also defining ICACHE_POLICY_xxx instead of only
>> having CTR_L1IP_xxx. Code using those defines is also updated by this
>> patch (arm32 setup).
>> On most big/LITTLE platforms this patch will activate TID2 and emulate
>> VIPT type of i-cache for all cores (as most LITTLE cores are VIPT where
>> big ones are PIPT). This is the case for example on Juno boards.
>> On platforms with only the same type of cores, this patch should not
>> modify the behaviour.
>> Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
>> ---
>> Changes in v3: none
>> Change in v2: Patch introduced in v2
>> ---
>>  xen/arch/arm/arm64/cpufeature.c  | 19 +++++++++++---
>>  xen/arch/arm/arm64/vsysreg.c     | 40 ++++++++++++++++++++++++++++
>>  xen/arch/arm/cpufeature.c        |  2 ++
>>  xen/arch/arm/domain.c            |  8 ++++++
>>  xen/arch/arm/setup.c             |  2 +-
>>  xen/arch/arm/vcpreg.c            | 45 ++++++++++++++++++++++++++++++++
>>  xen/include/asm-arm/arm64/hsr.h  |  6 +++++
>>  xen/include/asm-arm/cpufeature.h | 11 ++++++++
>>  xen/include/asm-arm/processor.h  | 18 ++++++++++---
>>  9 files changed, 143 insertions(+), 8 deletions(-)
>> diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
>> index b1936ef1d6..334d590ba0 100644
>> --- a/xen/arch/arm/arm64/cpufeature.c
>> +++ b/xen/arch/arm/arm64/cpufeature.c
>> @@ -275,9 +275,6 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
>>  	ARM64_FTR_END,
>>  };
>>  -#if 0
>> -/* TODO: use this to sanitize the cache line size among cores */
>> -
>>  static const struct arm64_ftr_bits ftr_ctr[] = {
>>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
>>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
>> @@ -294,7 +291,6 @@ static const struct arm64_ftr_bits ftr_ctr[] = {
>>  	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0),
>>  	ARM64_FTR_END,
>>  };
>> -#endif
>>    static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
>>  	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_MMFR0_INNERSHR_SHIFT, 4, 0xf),
>> @@ -510,6 +506,12 @@ static s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new,
>>   * End of imported linux structures and code
>>   */
>>  +/*
>> + * This is set to true if we have different type of i-caches on cores
>> + * and used to activate TID2 bit to emulate CTR_EL0 register for guests
>> + */
>> +bool mismatch_cache_type = false;
> 
> If we are still planning to trap and emulate the registers, then this needs to be an HW capability (see cpus_set_cap()).

Ok

> 
>> +
>>  static void sanitize_reg(u64 *cur_reg, u64 new_reg, const char *reg_name,
>>  						const struct arm64_ftr_bits *ftrp)
>>  {
>> @@ -600,6 +602,15 @@ void update_system_features(const struct cpuinfo_arm *new)
>>  	 */
>>  	SANITIZE_REG(dczid, 0, dczid);
>>  +	SANITIZE_REG(ctr, 0, ctr);
>> +
>> +	/*
>> +	 * If CTR is different among cores, set mismatch_cache_type to activate
>> +	 * TID2 bit in HCR and emulate CTR register access for guests.
>> +	 */
>> +	if ( system_cpuinfo.ctr.bits[0] != new->ctr.bits[0] )
>> +		mismatch_cache_type = true;
>> +
>>  	if ( cpu_feature64_has_el0_32(&system_cpuinfo) )
>>  	{
>>  		SANITIZE_ID_REG(pfr32, 0, pfr0);
>> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
>> index 887266dd46..17212bd7ae 100644
>> --- a/xen/arch/arm/arm64/vsysreg.c
>> +++ b/xen/arch/arm/arm64/vsysreg.c
>> @@ -341,6 +341,46 @@ void do_sysreg(struct cpu_user_regs *regs,
>>      case HSR_SYSREG(3,0,c0,c7,7):
>>          return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
>>  +    /*
>> +     * HCR_EL2.TID2
>> +     *
>> +     * registers related to cache detection
>> +     */
>> +    case HSR_SYSREG_CTR_EL0:
>> +        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
>> +                system_cpuinfo.ctr.bits[0]);
> 
> Coding style: This needs to be aligned with the first argument.

Sure I will fix that

> 
>> +
>> +    case HSR_SYSREG_CLIDR_EL1:
>> +        return handle_ro_read_val(regs, regidx, hsr.sysreg.read, hsr, 1,
>> +                READ_SYSREG(CLIDR_EL1));
> 
> Same.

Ok

> 
>> +
>> +    case HSR_SYSREG_CSSELR_EL1:
>> +        if ( psr_mode_is_user(regs) )
>> +            return inject_undef_exception(regs, hsr);
>> +        if ( hsr.sysreg.read )
>> +            set_user_reg(regs, regidx, v->arch.csselr);
>> +        else
>> +            v->arch.csselr = get_user_reg(regs, regidx);
>> +        break;
>> +
>> +    case HSR_SYSREG_CCSIDR_EL1:
>> +        if ( psr_mode_is_user(regs) )
>> +            return inject_undef_exception(regs, hsr);
>> +        if ( hsr.sysreg.read )
>> +        {
>> +            /* we need to set CSSELR and do the read of CCSIDR atomically */
> 
> I couldn't find this requirement in the Arm Arm. Do you have the section at hand?

If we get interrupted, someone could program CSSELR differently and the next
read would not reflect what the guest actually wants to do.

The code is not preemptible right now so this cannot be an issue, but I added
the comment more as a warning.

This is not something from the documentation; it is because the value written
in CSSELR defines what is read from CCSIDR.
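
For illustration (this is not part of the patch), the indirection looks like
this, with the field layout taken from the Armv8 architecture:

    register_t ccsidr;

    /* CSSELR_EL1: Level is bits [3:1] (0 = L1), InD is bit 0 (1 = I-cache) */
    WRITE_SYSREG((0 << 1) | 1, CSSELR_EL1);     /* select the L1 I-cache */
    isb();
    ccsidr = READ_SYSREG(CCSIDR_EL1);
    /* LineSize (bits [2:0]) encodes log2(words per line) - 2 */
    printk("L1 I-cache line size: %u bytes\n", 1U << ((ccsidr & 0x7) + 4));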

> 
>> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
>> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
>> +        }
> 
> From the Arm Arm (D13.2.25 in ARM DDI 0487F.c):
> 
> "
> If CSSELR_EL1.Level is programmed to a cache level that is not implemented, then on a read of the CCSIDR_EL1
> the behavior is CONSTRAINED UNPREDICTABLE, and can be one of the following:
> • The CCSIDR_EL1 read is treated as NOP.
> • The CCSIDR_EL1 read is UNDEFINED.
> • The CCSIDR_EL1 read returns an UNKNOWN value.
> "
> 
> We can't trust the guest here, so we need to prevent any of this behavior (in particular 1 and 2) to happen. The options are:
> 1) Sanitize the values from the guest
> 2) Make sure the register is 0 before reading (for the NOP case to avoid leaking a value from Xen) and catch the undefined (some similar to the extable on x86).

Good catch, I should have thought of that.
I will check this but my preference would go to option 1 if feasible.

> 
>> +        break;
>> +
>> +    case HSR_SYSREG_CCSIDR2_EL1:
>> +        /*
>> +         * This would need to return a properly defined value if CCIDX is
>> +         * implemented in the processor
>> +         */
>> +        return inject_undef_exception(regs, hsr);
>> +
>>      /*
>>       * HCR_EL2.TIDCP
>>       *
>> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
>> index 113f20f601..6e51f530a8 100644
>> --- a/xen/arch/arm/cpufeature.c
>> +++ b/xen/arch/arm/cpufeature.c
>> @@ -127,6 +127,8 @@ void identify_cpu(struct cpuinfo_arm *c)
>>        c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
>>  +    c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
>> +
>>      aarch32_el0 = cpu_feature64_has_el0_32(c);
>>  #endif
>>  diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
>> index 19c756ac3d..7a97fde3e7 100644
>> --- a/xen/arch/arm/domain.c
>> +++ b/xen/arch/arm/domain.c
>> @@ -585,6 +585,14 @@ int arch_vcpu_create(struct vcpu *v)
>>      v->arch.vmpidr = MPIDR_SMP | vcpuid_to_vaffinity(v->vcpu_id);
>>        v->arch.hcr_el2 = get_default_hcr_flags();
>> +#ifdef CONFIG_ARM64
> 
> This #ifdef could be droppped if we use an HW caps.

Ack.

> 
>> +    /*
>> +     * Only activated TID2 to catch access to CTR_EL0 if the platform has some
>> +     * mismatching i-cache types among cores
>> +     */
>> +    if ( mismatch_cache_type )
>> +        v->arch.hcr_el2 |= HCR_TID2;
>> +#endif
>>        if ( (rc = vcpu_vgic_init(v)) != 0 )
>>          goto fail;
>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>> index 3798c5ade0..33b7bfb59c 100644
>> --- a/xen/arch/arm/setup.c
>> +++ b/xen/arch/arm/setup.c
>> @@ -627,7 +627,7 @@ static void __init setup_mm(void)
>>          panic("No memory bank\n");
>>        /* We only supports instruction caches implementing the IVIPT extension. */
>> -    if ( ((ctr >> CTR_L1Ip_SHIFT) & CTR_L1Ip_MASK) == CTR_L1Ip_AIVIVT )
>> +    if ( ((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) == ICACHE_POLICY_AIVIVT )
>>          panic("AIVIVT instruction cache not supported\n");
>>        init_pdx();
>> diff --git a/xen/arch/arm/vcpreg.c b/xen/arch/arm/vcpreg.c
>> index 33259c4194..5ffed96ded 100644
>> --- a/xen/arch/arm/vcpreg.c
>> +++ b/xen/arch/arm/vcpreg.c
>> @@ -361,6 +361,51 @@ void do_cp15_32(struct cpu_user_regs *regs, const union hsr hsr)
>>      HSR_CPREG32_TID3_CASES(c7):
>>          return handle_ro_raz(regs, regidx, cp32.read, hsr, 1);
>>  +#ifdef CONFIG_ARM64
>> +    /*
>> +     * HCR_EL2.TID2
>> +     *
>> +     * registers related to cache detection
>> +     * Only supported on arm64 as we do not sanitize cpuinfo on arm32 so we
>> +     * do not need to emulate those.
>> +     */
>> +    case HSR_CPREG32(CTR):
>> +        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
>> +                system_cpuinfo.ctr.bits[0]);
>> +
>> +    case HSR_CPREG32(CLIDR):
>> +        return handle_ro_read_val(regs, regidx, cp32.read, hsr, 1,
>> +                READ_SYSREG(CLIDR_EL1));
>> +
>> +    case HSR_CPREG32(CSSELR):
>> +        if ( psr_mode_is_user(regs) )
>> +            return inject_undef_exception(regs, hsr);
>> +        if ( cp32.read )
>> +            set_user_reg(regs, regidx, v->arch.csselr);
>> +        else
>> +            v->arch.csselr = get_user_reg(regs, regidx);
>> +        break;
>> +
>> +    case HSR_CPREG32(CCSIDR):
>> +        if ( psr_mode_is_user(regs) )
>> +            return inject_undef_exception(regs, hsr);
>> +        if ( cp32.read )
>> +        {
>> +            /* we need to set CSSELR and do the read of CCSIDR atomically */
>> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
>> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
>> +        }
>> +        break;
>> +
>> +    case HSR_CPREG32(CCSIDR2):
>> +        /*
>> +         * This would need to return a properly defined value if CCIDX is
>> +         * implemented in the processor
>> +         */
>> +        return inject_undef_exception(regs, hsr);
>> +
>> +#endif
>> +
>>      /*
>>       * HCR_EL2.TIDCP
>>       *
>> diff --git a/xen/include/asm-arm/arm64/hsr.h b/xen/include/asm-arm/arm64/hsr.h
>> index e691d41c17..c33980e4e5 100644
>> --- a/xen/include/asm-arm/arm64/hsr.h
>> +++ b/xen/include/asm-arm/arm64/hsr.h
>> @@ -147,6 +147,12 @@
>>  #define HSR_SYSREG_ID_AA64AFR1_EL1   HSR_SYSREG(3,0,c0,c5,5)
>>  #define HSR_SYSREG_ID_AA64ZFR0_EL1   HSR_SYSREG(3,0,c0,c4,4)
>>  +#define HSR_SYSREG_CTR_EL0      HSR_SYSREG(3,3,c0,c0,1)
>> +#define HSR_SYSREG_CLIDR_EL1    HSR_SYSREG(3,1,c0,c0,1)
>> +#define HSR_SYSREG_CSSELR_EL1   HSR_SYSREG(3,2,c0,c0,0)
>> +#define HSR_SYSREG_CCSIDR_EL1   HSR_SYSREG(3,1,c0,c0,0)
>> +#define HSR_SYSREG_CCSIDR2_EL1  HSR_SYSREG(3,1,c0,c0,2)
>> +
>>  #endif /* __ASM_ARM_ARM64_HSR_H */
>>    /*
>> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
>> index 5219fd3bab..ca6e827fcb 100644
>> --- a/xen/include/asm-arm/cpufeature.h
>> +++ b/xen/include/asm-arm/cpufeature.h
>> @@ -267,6 +267,14 @@ struct cpuinfo_arm {
>>          register_t bits[1];
>>      } dczid;
>>  +    /*
>> +     * CTR is only used to check for different cache types or policies and
>> +     * taint Xen in this case
>> +     */
>> +    struct {
>> +        register_t bits[1];
>> +    } ctr;
>> +
>>  #endif
>>        /*
>> @@ -339,6 +347,9 @@ extern struct cpuinfo_arm system_cpuinfo;
>>  extern void identify_cpu(struct cpuinfo_arm *);
>>    #ifdef CONFIG_ARM_64
>> +
>> +extern bool mismatched_cache_type;
>> +
>>  extern void update_system_features(const struct cpuinfo_arm *);
>>  #else
>>  static inline void update_system_features(const struct cpuinfo_arm *cpuinfo)
>> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
>> index 2577e9a244..8c9843e12b 100644
>> --- a/xen/include/asm-arm/processor.h
>> +++ b/xen/include/asm-arm/processor.h
>> @@ -7,9 +7,21 @@
>>  #include <public/arch-arm.h>
>>    /* CTR Cache Type Register */
>> -#define CTR_L1Ip_MASK       0x3
>> -#define CTR_L1Ip_SHIFT      14
>> -#define CTR_L1Ip_AIVIVT     0x1
>> +#define CTR_L1IP_MASK       0x3
>> +#define CTR_L1IP_SHIFT      14
>> +#define CTR_DMINLINE_SHIFT  16
>> +#define CTR_IMINLINE_SHIFT  0
>> +#define CTR_IMINLINE_MASK   0xf
>> +#define CTR_ERG_SHIFT       20
>> +#define CTR_CWG_SHIFT       24
>> +#define CTR_CWG_MASK        15
>> +#define CTR_IDC_SHIFT       28
>> +#define CTR_DIC_SHIFT       29
>> +
>> +#define ICACHE_POLICY_VPIPT  0
>> +#define ICACHE_POLICY_AIVIVT 1
>> +#define ICACHE_POLICY_VIPT   2
>> +#define ICACHE_POLICY_PIPT   3
>>    /* MIDR Main ID Register */
>>  #define MIDR_REVISION_MASK      0xf
> 
> Cheers,

Thanks for the review

Cheers
Bertrand

> 
> -- 
> Julien Grall

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Julien Grall 4 years, 5 months ago

On 31/08/2021 14:17, Bertrand Marquis wrote:
> Hi Julien,

Hi Bertrand,

> 
>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
>>
>> Hi Bertrand,
>>
>> On 25/08/2021 14:18, Bertrand Marquis wrote:
>>> Sanitize CTR_EL0 value between cores.
>>> In most cases different values will taint Xen but if different
>>> i-cache policies are found, we choose the one which will be compatible
>>> between all cores in terms of invalidation/data cache flushing strategy.
>>
>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
>>
>>> In this case we need to activate the TID2 bit in HCR to emulate the
>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>>> the time to limit the overhead when possible.
>>
>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
>>
>> So I think we should be able to get away from trappings the registers.
> 
> I do agree that we should be able to get away from that in the long term once
> we have cpupools properly set but right now this is the only way to have
> something useable (I will not say right).
> I will work on finding a way to setup properly cpupools (or something else as
> we discussed earlier) but in the short term I think this is the best we can do.

My concern is you are making it look like Xen will be able to deal nicely 
with big.LITTLE when in fact there are a lot more potential issues caused by 
allowing a vCPU to move across pCPUs of different types (the errata is one 
example).

> 
> An other solution would be to discard this patch from the serie for now until
> I have worked a proper solution for this case.
> 
> Should we discard or merge or do you have an other idea ?
Please correct me if I am wrong: at the moment, it doesn't look like 
this patch will be part of the longer-term plan. If so, then I think it 
should be parked for now.

This would also have the advantage of avoiding spending too much time on 
resolving the emulation issue I mentioned in my previous answer.

No need to resend a new version of this series yet. You can wait until 
the rest of the series gets more feedback.

[...]

> If we get interrupted, someone could program CSSELR differently and the next read
> will not be reflecting what the guest actually wants to do

AFAICT, CSSELR is preserved during the context switch of a vCPU. So that 
someone would have to be Xen, right?

If so, what you describe would also be an issue even if we didn't trap 
the register. Therefore, if Xen were ever to use CSSELR, then that code 
would need to save the value, use the register and then restore the 
value with preemption disabled.
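
A sketch of what such a Xen-internal user would look like (hypothetical
helper, not existing code):

    static register_t read_cache_geometry(unsigned int level, bool icache)
    {
        unsigned long flags;
        register_t saved, ccsidr;

        local_irq_save(flags);                /* no interruption in between */
        saved = READ_SYSREG(CSSELR_EL1);      /* preserve the guest selector */
        WRITE_SYSREG((level << 1) | icache, CSSELR_EL1);
        isb();
        ccsidr = READ_SYSREG(CCSIDR_EL1);
        WRITE_SYSREG(saved, CSSELR_EL1);      /* restore before returning */
        local_irq_restore(flags);

        return ccsidr;
    }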

> 
> The code is not preemptible right now so this cannot be an issue but I added the
>   comment more as a warning.
> 
> This is not something from the documentation, this is because value written
> in CSSELR is defining what is read from CCSIDR
> 
>>
>>> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
>>> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
>>> +        }

Cheers,

-- 
Julien Grall

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Bertrand Marquis 4 years, 5 months ago
Hi Julien,

> On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
> 
> 
> 
> On 31/08/2021 14:17, Bertrand Marquis wrote:
>> Hi Julien,
> 
> Hi Bertrand,
> 
>>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
>>> 
>>> Hi Bertrand,
>>> 
>>> On 25/08/2021 14:18, Bertrand Marquis wrote:
>>>> Sanitize CTR_EL0 value between cores.
>>>> In most cases different values will taint Xen but if different
>>>> i-cache policies are found, we choose the one which will be compatible
>>>> between all cores in terms of invalidation/data cache flushing strategy.
>>> 
>>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
>>> 
>>>> In this case we need to activate the TID2 bit in HCR to emulate the
>>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>>>> the time to limit the overhead when possible.
>>> 
>>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
>>> 
>>> So I think we should be able to get away from trappings the registers.
>> I do agree that we should be able to get away from that in the long term once
>> we have cpupools properly set but right now this is the only way to have
>> something useable (I will not say right).
>> I will work on finding a way to setup properly cpupools (or something else as
>> we discussed earlier) but in the short term I think this is the best we can do.
> 
> My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).

I agree and this is why Xen is tainted.

> 
>> An other solution would be to discard this patch from the serie for now until
>> I have worked a proper solution for this case.
>> Should we discard or merge or do you have an other idea ?
> Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.

Not sure, it depends on what the final solution would be, but this is highly possible I agree.

> 
> This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
> 
> No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.

Ok, I will wait for feedback and the next series will not include this patch anymore.

> 
> [...]
> 
>> If we get interrupted, someone could program CSSELR differently and the next read
>> will not be reflecting what the guest actually wants to do
> 
> AFAICT, CSSELR is preserved during the context switch of vCPU. So that someone would have to be Xen, right?
> 
> If so, what you describe would also be an issue even if we didn't trap the register. Therefore, if Xen would ever use CSSELR, then that code would need to save the value, use the register and then restore the value with preemption disabled.

I could just remove the comment, I added it as information, but if you think it is misleading, no problem.

Anyway, as we will park this for now, there is no need to discuss that further.

Cheers
Bertrand

> 
>> The code is not preemptible right now so this cannot be an issue but I added the
>>  comment more as a warning.
>> This is not something from the documentation, this is because value written
>> in CSSELR is defining what is read from CCSIDR
>>> 
>>>> +            WRITE_SYSREG(v->arch.csselr, CSSELR_EL1);
>>>> +            set_user_reg(regs, regidx, READ_SYSREG(CCSIDR_EL1));
>>>> +        }
> 
> Cheers,
> 
> -- 
> Julien Grall


Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Stefano Stabellini 4 years, 5 months ago
On Tue, 31 Aug 2021, Bertrand Marquis wrote:
> Hi Julien,
> 
> > On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
> > 
> > 
> > 
> > On 31/08/2021 14:17, Bertrand Marquis wrote:
> >> Hi Julien,
> > 
> > Hi Bertrand,
> > 
> >>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
> >>> 
> >>> Hi Bertrand,
> >>> 
> >>> On 25/08/2021 14:18, Bertrand Marquis wrote:
> >>>> Sanitize CTR_EL0 value between cores.
> >>>> In most cases different values will taint Xen but if different
> >>>> i-cache policies are found, we choose the one which will be compatible
> >>>> between all cores in terms of invalidation/data cache flushing strategy.
> >>> 
> >>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
> >>> 
> >>>> In this case we need to activate the TID2 bit in HCR to emulate the
> >>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
> >>>> the time to limit the overhead when possible.
> >>> 
> >>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
> >>> 
> >>> So I think we should be able to get away from trappings the registers.
> >> I do agree that we should be able to get away from that in the long term once
> >> we have cpupools properly set but right now this is the only way to have
> >> something useable (I will not say right).
> >> I will work on finding a way to setup properly cpupools (or something else as
> >> we discussed earlier) but in the short term I think this is the best we can do.
> > 
> > My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).
> 
> I agree and this is why Xen is tainted.
> 
> > 
> >> An other solution would be to discard this patch from the serie for now until
> >> I have worked a proper solution for this case.
> >> Should we discard or merge or do you have an other idea ?
> > Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.
> 
> Not sure it depends on what the final solution would be but this is highly possible I agree.
> 
> > 
> > This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
> > 
> > No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.
> 
> Ok, I will wait for feedback and next serie will not include this patch anymore.

Would it be worth keeping just the part that sanitizes ctr, without any
of the emulation stuff? That way we could still detect cases where there
is a mismatch between CPUs, print a useful message and taint Xen.
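
As a rough sketch of the reporting side (reusing the TAINT_CPU_OUT_OF_SPEC
taint that the existing sanitization code already uses; the exact check and
message below are only illustrative):

    if ( system_cpuinfo.ctr.bits[0] != new->ctr.bits[0] )
    {
        printk(XENLOG_WARNING
               "CPU%u: mismatched CTR_EL0 (0x%"PRIregister" vs 0x%"PRIregister")\n",
               smp_processor_id(), new->ctr.bits[0], system_cpuinfo.ctr.bits[0]);
        add_taint(TAINT_CPU_OUT_OF_SPEC);
    }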

For clarity something like the appended.


diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
index b1936ef1d6..d2456af2bf 100644
--- a/xen/arch/arm/arm64/cpufeature.c
+++ b/xen/arch/arm/arm64/cpufeature.c
@@ -275,9 +275,6 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
 	ARM64_FTR_END,
 };
 
-#if 0
-/* TODO: use this to sanitize the cache line size among cores */
-
 static const struct arm64_ftr_bits ftr_ctr[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
@@ -294,7 +291,6 @@ static const struct arm64_ftr_bits ftr_ctr[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0),
 	ARM64_FTR_END,
 };
-#endif
 
 static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
 	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_MMFR0_INNERSHR_SHIFT, 4, 0xf),
@@ -600,6 +596,8 @@ void update_system_features(const struct cpuinfo_arm *new)
 	 */
 	SANITIZE_REG(dczid, 0, dczid);
 
+	SANITIZE_REG(ctr, 0, ctr);
+
 	if ( cpu_feature64_has_el0_32(&system_cpuinfo) )
 	{
 		SANITIZE_ID_REG(pfr32, 0, pfr0);
diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
index 113f20f601..6e51f530a8 100644
--- a/xen/arch/arm/cpufeature.c
+++ b/xen/arch/arm/cpufeature.c
@@ -127,6 +127,8 @@ void identify_cpu(struct cpuinfo_arm *c)
 
     c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
 
+    c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
+
     aarch32_el0 = cpu_feature64_has_el0_32(c);
 #endif
 
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 3798c5ade0..33b7bfb59c 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -627,7 +627,7 @@ static void __init setup_mm(void)
         panic("No memory bank\n");
 
     /* We only supports instruction caches implementing the IVIPT extension. */
-    if ( ((ctr >> CTR_L1Ip_SHIFT) & CTR_L1Ip_MASK) == CTR_L1Ip_AIVIVT )
+    if ( ((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) == ICACHE_POLICY_AIVIVT )
         panic("AIVIVT instruction cache not supported\n");
 
     init_pdx();
diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
index 5219fd3bab..ca6e827fcb 100644
--- a/xen/include/asm-arm/cpufeature.h
+++ b/xen/include/asm-arm/cpufeature.h
@@ -267,6 +267,14 @@ struct cpuinfo_arm {
         register_t bits[1];
     } dczid;
 
+    /*
+     * CTR is only used to check for different cache types or policies and
+     * taint Xen in this case
+     */
+    struct {
+        register_t bits[1];
+    } ctr;
+
 #endif
 
     /*
@@ -339,6 +347,9 @@ extern struct cpuinfo_arm system_cpuinfo;
 extern void identify_cpu(struct cpuinfo_arm *);
 
 #ifdef CONFIG_ARM_64
+
+extern bool mismatched_cache_type;
+
 extern void update_system_features(const struct cpuinfo_arm *);
 #else
 static inline void update_system_features(const struct cpuinfo_arm *cpuinfo)
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 2577e9a244..8c9843e12b 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -7,9 +7,21 @@
 #include <public/arch-arm.h>
 
 /* CTR Cache Type Register */
-#define CTR_L1Ip_MASK       0x3
-#define CTR_L1Ip_SHIFT      14
-#define CTR_L1Ip_AIVIVT     0x1
+#define CTR_L1IP_MASK       0x3
+#define CTR_L1IP_SHIFT      14
+#define CTR_DMINLINE_SHIFT  16
+#define CTR_IMINLINE_SHIFT  0
+#define CTR_IMINLINE_MASK   0xf
+#define CTR_ERG_SHIFT       20
+#define CTR_CWG_SHIFT       24
+#define CTR_CWG_MASK        15
+#define CTR_IDC_SHIFT       28
+#define CTR_DIC_SHIFT       29
+
+#define ICACHE_POLICY_VPIPT  0
+#define ICACHE_POLICY_AIVIVT 1
+#define ICACHE_POLICY_VIPT   2
+#define ICACHE_POLICY_PIPT   3
 
 /* MIDR Main ID Register */
 #define MIDR_REVISION_MASK      0xf

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Bertrand Marquis 4 years, 5 months ago
Hi Stefano,

> On 3 Sep 2021, at 23:49, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Tue, 31 Aug 2021, Bertrand Marquis wrote:
>> Hi Julien,
>> 
>>> On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
>>> 
>>> 
>>> 
>>> On 31/08/2021 14:17, Bertrand Marquis wrote:
>>>> Hi Julien,
>>> 
>>> Hi Bertrand,
>>> 
>>>>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> Hi Bertrand,
>>>>> 
>>>>> On 25/08/2021 14:18, Bertrand Marquis wrote:
>>>>>> Sanitize CTR_EL0 value between cores.
>>>>>> In most cases different values will taint Xen but if different
>>>>>> i-cache policies are found, we choose the one which will be compatible
>>>>>> between all cores in terms of invalidation/data cache flushing strategy.
>>>>> 
>>>>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
>>>>> 
>>>>>> In this case we need to activate the TID2 bit in HCR to emulate the
>>>>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>>>>>> the time to limit the overhead when possible.
>>>>> 
>>>>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
>>>>> 
>>>>> So I think we should be able to get away from trappings the registers.
>>>> I do agree that we should be able to get away from that in the long term once
>>>> we have cpupools properly set but right now this is the only way to have
>>>> something useable (I will not say right).
>>>> I will work on finding a way to setup properly cpupools (or something else as
>>>> we discussed earlier) but in the short term I think this is the best we can do.
>>> 
>>> My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).
>> 
>> I agree and this is why Xen is tainted.
>> 
>>> 
>>>> An other solution would be to discard this patch from the serie for now until
>>>> I have worked a proper solution for this case.
>>>> Should we discard or merge or do you have an other idea ?
>>> Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.
>> 
>> Not sure it depends on what the final solution would be but this is highly possible I agree.
>> 
>>> 
>>> This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
>>> 
>>> No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.
>> 
>> Ok, I will wait for feedback and next serie will not include this patch anymore.
> 
> Would it be worth keeping just the part that sanitizes ctr, without any
> of the emulation stuff? That way we could still detect cases where there
> is a mismatch between CPUs, print a useful message and taint Xen.

That’s a good idea: it means removing the emulation part and keeping just the sanitization.

@Julien: would you be ok with that?

Should I send a v4, or should we use Stefano’s patch directly instead?

Cheers
Bertrand

> 
> For clarity something like the appended.
> 
> 
> diff --git a/xen/arch/arm/arm64/cpufeature.c b/xen/arch/arm/arm64/cpufeature.c
> index b1936ef1d6..d2456af2bf 100644
> --- a/xen/arch/arm/arm64/cpufeature.c
> +++ b/xen/arch/arm/arm64/cpufeature.c
> @@ -275,9 +275,6 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr2[] = {
> 	ARM64_FTR_END,
> };
> 
> -#if 0
> -/* TODO: use this to sanitize the cache line size among cores */
> -
> static const struct arm64_ftr_bits ftr_ctr[] = {
> 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */
> 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1),
> @@ -294,7 +291,6 @@ static const struct arm64_ftr_bits ftr_ctr[] = {
> 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0),
> 	ARM64_FTR_END,
> };
> -#endif
> 
> static const struct arm64_ftr_bits ftr_id_mmfr0[] = {
> 	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_MMFR0_INNERSHR_SHIFT, 4, 0xf),
> @@ -600,6 +596,8 @@ void update_system_features(const struct cpuinfo_arm *new)
> 	 */
> 	SANITIZE_REG(dczid, 0, dczid);
> 
> +	SANITIZE_REG(ctr, 0, ctr);
> +
> 	if ( cpu_feature64_has_el0_32(&system_cpuinfo) )
> 	{
> 		SANITIZE_ID_REG(pfr32, 0, pfr0);
> diff --git a/xen/arch/arm/cpufeature.c b/xen/arch/arm/cpufeature.c
> index 113f20f601..6e51f530a8 100644
> --- a/xen/arch/arm/cpufeature.c
> +++ b/xen/arch/arm/cpufeature.c
> @@ -127,6 +127,8 @@ void identify_cpu(struct cpuinfo_arm *c)
> 
>     c->dczid.bits[0] = READ_SYSREG(DCZID_EL0);
> 
> +    c->ctr.bits[0] = READ_SYSREG(CTR_EL0);
> +
>     aarch32_el0 = cpu_feature64_has_el0_32(c);
> #endif
> 
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 3798c5ade0..33b7bfb59c 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -627,7 +627,7 @@ static void __init setup_mm(void)
>         panic("No memory bank\n");
> 
>     /* We only supports instruction caches implementing the IVIPT extension. */
> -    if ( ((ctr >> CTR_L1Ip_SHIFT) & CTR_L1Ip_MASK) == CTR_L1Ip_AIVIVT )
> +    if ( ((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) == ICACHE_POLICY_AIVIVT )
>         panic("AIVIVT instruction cache not supported\n");
> 
>     init_pdx();
> diff --git a/xen/include/asm-arm/cpufeature.h b/xen/include/asm-arm/cpufeature.h
> index 5219fd3bab..ca6e827fcb 100644
> --- a/xen/include/asm-arm/cpufeature.h
> +++ b/xen/include/asm-arm/cpufeature.h
> @@ -267,6 +267,14 @@ struct cpuinfo_arm {
>         register_t bits[1];
>     } dczid;
> 
> +    /*
> +     * CTR is only used to check for different cache types or policies and
> +     * taint Xen in this case
> +     */
> +    struct {
> +        register_t bits[1];
> +    } ctr;
> +
> #endif
> 
>     /*
> @@ -339,6 +347,9 @@ extern struct cpuinfo_arm system_cpuinfo;
> extern void identify_cpu(struct cpuinfo_arm *);
> 
> #ifdef CONFIG_ARM_64
> +
> +extern bool mismatched_cache_type;
> +
> extern void update_system_features(const struct cpuinfo_arm *);
> #else
> static inline void update_system_features(const struct cpuinfo_arm *cpuinfo)
> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
> index 2577e9a244..8c9843e12b 100644
> --- a/xen/include/asm-arm/processor.h
> +++ b/xen/include/asm-arm/processor.h
> @@ -7,9 +7,21 @@
> #include <public/arch-arm.h>
> 
> /* CTR Cache Type Register */
> -#define CTR_L1Ip_MASK       0x3
> -#define CTR_L1Ip_SHIFT      14
> -#define CTR_L1Ip_AIVIVT     0x1
> +#define CTR_L1IP_MASK       0x3
> +#define CTR_L1IP_SHIFT      14
> +#define CTR_DMINLINE_SHIFT  16
> +#define CTR_IMINLINE_SHIFT  0
> +#define CTR_IMINLINE_MASK   0xf
> +#define CTR_ERG_SHIFT       20
> +#define CTR_CWG_SHIFT       24
> +#define CTR_CWG_MASK        15
> +#define CTR_IDC_SHIFT       28
> +#define CTR_DIC_SHIFT       29
> +
> +#define ICACHE_POLICY_VPIPT  0
> +#define ICACHE_POLICY_AIVIVT 1
> +#define ICACHE_POLICY_VIPT   2
> +#define ICACHE_POLICY_PIPT   3
> 
> /* MIDR Main ID Register */
> #define MIDR_REVISION_MASK      0xf
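
To make the sanitize-only suggestion concrete, the "detect a mismatch, print a useful message and taint Xen" path could look roughly like the sketch below. This is an illustration under stated assumptions, not code from the series: the function name is made up, and the TAINT_CPU_OUT_OF_SPEC flag and printk() wording are assumptions about the usual Xen facilities rather than something shown in this patch.

/* Warn and taint Xen if a secondary CPU advertises a different CTR_EL0. */
static void check_ctr_mismatch(uint64_t boot_ctr, uint64_t new_ctr)
{
    if ( boot_ctr != new_ctr )
    {
        printk(XENLOG_WARNING
               "CTR_EL0 mismatch: boot CPU %#"PRIx64", this CPU %#"PRIx64"\n",
               boot_ctr, new_ctr);
        add_taint(TAINT_CPU_OUT_OF_SPEC);
    }
}

In the appended patch the comparison is actually done field by field through the ftr_ctr table and SANITIZE_REG(ctr, 0, ctr), which also picks a safe common value per field, rather than an open-coded whole-register check like the above.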

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Julien Grall 4 years, 5 months ago
Hi Bertrand,

On 06/09/2021 09:29, Bertrand Marquis wrote:
>> On 3 Sep 2021, at 23:49, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>
>> On Tue, 31 Aug 2021, Bertrand Marquis wrote:
>>> Hi Julien,
>>>
>>>> On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
>>>>
>>>>
>>>>
>>>> On 31/08/2021 14:17, Bertrand Marquis wrote:
>>>>> Hi Julien,
>>>>
>>>> Hi Bertrand,
>>>>
>>>>>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
>>>>>>
>>>>>> Hi Bertrand,
>>>>>>
>>>>>> On 25/08/2021 14:18, Bertrand Marquis wrote:
>>>>>>> Sanitize CTR_EL0 value between cores.
>>>>>>> In most cases different values will taint Xen but if different
>>>>>>> i-cache policies are found, we choose the one which will be compatible
>>>>>>> between all cores in terms of invalidation/data cache flushing strategy.
>>>>>>
>>>>>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
>>>>>>
>>>>>>> In this case we need to activate the TID2 bit in HCR to emulate the
>>>>>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>>>>>>> the time to limit the overhead when possible.
>>>>>>
>>>>>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
>>>>>>
>>>>>> So I think we should be able to get away from trappings the registers.
>>>>> I do agree that we should be able to get away from that in the long term once
>>>>> we have cpupools properly set but right now this is the only way to have
>>>>> something useable (I will not say right).
>>>>> I will work on finding a way to setup properly cpupools (or something else as
>>>>> we discussed earlier) but in the short term I think this is the best we can do.
>>>>
>>>> My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).
>>>
>>> I agree and this is why Xen is tainted.
>>>
>>>>
>>>>> An other solution would be to discard this patch from the serie for now until
>>>>> I have worked a proper solution for this case.
>>>>> Should we discard or merge or do you have an other idea ?
>>>> Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.
>>>
>>> Not sure it depends on what the final solution would be but this is highly possible I agree.
>>>
>>>>
>>>> This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
>>>>
>>>> No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.
>>>
>>> Ok, I will wait for feedback and next serie will not include this patch anymore.
>>
>> Would it be worth keeping just the part that sanitizes ctr, without any
>> of the emulation stuff? That way we could still detect cases where there
>> is a mismatch between CPUs, print a useful message and taint Xen.
> 
> That’s a good idea, it means removing the emulation part and just keep the sanitization.
> 
> @Julien: would you be ok with that ?

I actually thought about suggesting it before Stefano did it. So I am OK 
with that.

> 
> Should I send a v4 or should we use Stefano’s patch directly instead ?

I would suggest sending a v4. This needs a signed-off-by from Stefano 
and a new commit message.

Cheers,

-- 
Julien Grall

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Bertrand Marquis 4 years, 5 months ago
Hi Julien,

> On 6 Sep 2021, at 18:36, Julien Grall <julien@xen.org> wrote:
> 
> Hi Bertrand,
> 
> On 06/09/2021 09:29, Bertrand Marquis wrote:
>>> On 3 Sep 2021, at 23:49, Stefano Stabellini <sstabellini@kernel.org> wrote:
>>> 
>>> On Tue, 31 Aug 2021, Bertrand Marquis wrote:
>>>> Hi Julien,
>>>> 
>>>>> On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 31/08/2021 14:17, Bertrand Marquis wrote:
>>>>>> Hi Julien,
>>>>> 
>>>>> Hi Bertrand,
>>>>> 
>>>>>>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
>>>>>>> 
>>>>>>> Hi Bertrand,
>>>>>>> 
>>>>>>> On 25/08/2021 14:18, Bertrand Marquis wrote:
>>>>>>>> Sanitize CTR_EL0 value between cores.
>>>>>>>> In most cases different values will taint Xen but if different
>>>>>>>> i-cache policies are found, we choose the one which will be compatible
>>>>>>>> between all cores in terms of invalidation/data cache flushing strategy.
>>>>>>> 
>>>>>>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
>>>>>>> 
>>>>>>>> In this case we need to activate the TID2 bit in HCR to emulate the
>>>>>>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
>>>>>>>> the time to limit the overhead when possible.
>>>>>>> 
>>>>>>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
>>>>>>> 
>>>>>>> So I think we should be able to get away from trappings the registers.
>>>>>> I do agree that we should be able to get away from that in the long term once
>>>>>> we have cpupools properly set but right now this is the only way to have
>>>>>> something useable (I will not say right).
>>>>>> I will work on finding a way to setup properly cpupools (or something else as
>>>>>> we discussed earlier) but in the short term I think this is the best we can do.
>>>>> 
>>>>> My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).
>>>> 
>>>> I agree and this is why Xen is tainted.
>>>> 
>>>>> 
>>>>>> An other solution would be to discard this patch from the serie for now until
>>>>>> I have worked a proper solution for this case.
>>>>>> Should we discard or merge or do you have an other idea ?
>>>>> Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.
>>>> 
>>>> Not sure it depends on what the final solution would be but this is highly possible I agree.
>>>> 
>>>>> 
>>>>> This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
>>>>> 
>>>>> No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.
>>>> 
>>>> Ok, I will wait for feedback and next serie will not include this patch anymore.
>>> 
>>> Would it be worth keeping just the part that sanitizes ctr, without any
>>> of the emulation stuff? That way we could still detect cases where there
>>> is a mismatch between CPUs, print a useful message and taint Xen.
>> That’s a good idea, it means removing the emulation part and just keep the sanitization.
>> @Julien: would you be ok with that ?
> 
> I actually thought about suggesting it before Stefano did it. So I am OK with that.
> 
>> Should I send a v4 or should we use Stefano’s patch directly instead ?
> 
> I would suggest to send a v4. This needs a signed-off-by from Stefano and a new commit message.

Ok, I will do that at the beginning of next week.

Cheers
Bertrand

> 
> Cheers,
> 
> -- 
> Julien Grall

Re: [PATCH v3 7/7] xen/arm: Sanitize CTR_EL0 and emulate it if needed
Posted by Stefano Stabellini 4 years, 5 months ago
On Tue, 7 Sep 2021, Bertrand Marquis wrote:
> Hi Julien,
> 
> > On 6 Sep 2021, at 18:36, Julien Grall <julien@xen.org> wrote:
> > 
> > Hi Bertrand,
> > 
> > On 06/09/2021 09:29, Bertrand Marquis wrote:
> >>> On 3 Sep 2021, at 23:49, Stefano Stabellini <sstabellini@kernel.org> wrote:
> >>> 
> >>> On Tue, 31 Aug 2021, Bertrand Marquis wrote:
> >>>> Hi Julien,
> >>>> 
> >>>>> On 31 Aug 2021, at 15:47, Julien Grall <julien@xen.org> wrote:
> >>>>> 
> >>>>> 
> >>>>> 
> >>>>> On 31/08/2021 14:17, Bertrand Marquis wrote:
> >>>>>> Hi Julien,
> >>>>> 
> >>>>> Hi Bertrand,
> >>>>> 
> >>>>>>> On 27 Aug 2021, at 16:05, Julien Grall <julien@xen.org> wrote:
> >>>>>>> 
> >>>>>>> Hi Bertrand,
> >>>>>>> 
> >>>>>>> On 25/08/2021 14:18, Bertrand Marquis wrote:
> >>>>>>>> Sanitize CTR_EL0 value between cores.
> >>>>>>>> In most cases different values will taint Xen but if different
> >>>>>>>> i-cache policies are found, we choose the one which will be compatible
> >>>>>>>> between all cores in terms of invalidation/data cache flushing strategy.
> >>>>>>> 
> >>>>>>> I understand that all the CPUs in Xen needs to agree on the cache flush strategy. However...
> >>>>>>> 
> >>>>>>>> In this case we need to activate the TID2 bit in HCR to emulate the
> >>>>>>>> TCR_EL0 register for guests. This patch is not activating TID2 bit all
> >>>>>>>> the time to limit the overhead when possible.
> >>>>>>> 
> >>>>>>> as we discussed in an earlier version, a vCPU is unlikely (at least in short/medium) to be able move across pCPU of different type. So the vCPU would be pinned to a set of pCPUs. IOW, the guest would have to be big.LITTLE aware and therefore would be able to do its own strategy decision.
> >>>>>>> 
> >>>>>>> So I think we should be able to get away from trappings the registers.
> >>>>>> I do agree that we should be able to get away from that in the long term once
> >>>>>> we have cpupools properly set but right now this is the only way to have
> >>>>>> something useable (I will not say right).
> >>>>>> I will work on finding a way to setup properly cpupools (or something else as
> >>>>>> we discussed earlier) but in the short term I think this is the best we can do.
> >>>>> 
> >>>>> My concern is you are making look like Xen will be able to deal nicely with big.LITTLE when in fact there are a lot more potential issue by allow a vCPU moving accross pCPU of different type (the errata is one example).
> >>>> 
> >>>> I agree and this is why Xen is tainted.
> >>>> 
> >>>>> 
> >>>>>> An other solution would be to discard this patch from the serie for now until
> >>>>>> I have worked a proper solution for this case.
> >>>>>> Should we discard or merge or do you have an other idea ?
> >>>>> Please correct me if I am wrong, at the moment, it doesn't look like this patch will be part of the longer plan. If so, then I think it should be parked for now.
> >>>> 
> >>>> Not sure it depends on what the final solution would be but this is highly possible I agree.
> >>>> 
> >>>>> 
> >>>>> This would also have the advantage to avoid spending too much time on resolving the emulation issue I mentioned in my previous answer.
> >>>>> 
> >>>>> No need to resend a new version of this series yet. You can wait until the rest of the series get more feedback.
> >>>> 
> >>>> Ok, I will wait for feedback and next serie will not include this patch anymore.
> >>> 
> >>> Would it be worth keeping just the part that sanitizes ctr, without any
> >>> of the emulation stuff? That way we could still detect cases where there
> >>> is a mismatch between CPUs, print a useful message and taint Xen.
> >> That’s a good idea, it means removing the emulation part and just keep the sanitization.
> >> @Julien: would you be ok with that ?
> > 
> > I actually thought about suggesting it before Stefano did it. So I am OK with that.
> > 
> >> Should I send a v4 or should we use Stefano’s patch directly instead ?
> > 
> > I would suggest to send a v4. This needs a signed-off-by from Stefano and a new commit message.
> 
> Ok I will do that beginning of next week.

Of course you can add my signed-off-by and even my reviewed-by (although it
will look weird, as it will seem that I reviewed my own patch).