From nobody Fri Dec 19 12:00:34 2025
From: Alex Bennée
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org,
 D Scott Phillips, Alex Bennée
Subject: [PATCH 1/3] ampere/arm64: Add a fixup handler for alignment faults in aarch64 code
Date: Tue, 27 Aug 2024 14:08:27 +0100
Message-Id: <20240827130829.43632-2-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org>
References: <20240827130829.43632-1-alex.bennee@linaro.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

From: D Scott Phillips

A later patch will hand out Device memory in some cases to code which
expects a Normal memory type, as an erratum workaround. Unaligned
accesses to Device memory will fault though, so here we add a fixup
handler to emulate faulting accesses, at a performance penalty.

Many of the instructions in the Loads and Stores group are supported,
but these groups are not handled here:

 * Advanced SIMD load/store multiple structures
 * Advanced SIMD load/store multiple structures (post-indexed)
 * Advanced SIMD load/store single structure
 * Advanced SIMD load/store single structure (post-indexed)
 * Load/store memory tags
 * Load/store exclusive
 * LDAPR/STLR (unscaled immediate)
 * Load register (literal) [cannot Alignment fault]
 * Load/store register (unprivileged)
 * Atomic memory operations
 * Load/store register (pac)

Instruction implementations are translated from the Exploration tools'
ASL specifications.
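
For illustration only, the emulation idea can be sketched in userspace
C. The names emulate_load() and emulate_store() are hypothetical and
are not part of this patch. The sketch assumes a little-endian layout
and plain pointers, whereas the kernel code goes through
get_user()/put_user() or memcpy_fromio()/memcpy_toio(). It only shows
that a faulting 2, 4 or 8 byte access can be replayed as single-byte
accesses, which cannot be misaligned:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for align_load(): read sz bytes starting at
 * addr one byte at a time and assemble them as a little-endian value.
 * Returns 0 on success, 1 for an unsupported access size.
 */
static int emulate_load(const volatile uint8_t *addr, int sz, uint64_t *out)
{
	uint64_t val = 0;
	int i;

	if (sz != 1 && sz != 2 && sz != 4 && sz != 8)
		return 1;
	for (i = 0; i < sz; i++)
		val |= (uint64_t)addr[i] << (8 * i);
	*out = val;
	return 0;
}

/*
 * Hypothetical stand-in for align_store(): write the low sz bytes of
 * val to addr one byte at a time, least significant byte first.
 */
static int emulate_store(volatile uint8_t *addr, int sz, uint64_t val)
{
	int i;

	if (sz != 1 && sz != 2 && sz != 4 && sz != 8)
		return 1;
	for (i = 0; i < sz; i++)
		addr[i] = (uint8_t)(val >> (8 * i));
	return 0;
}
```

Both helpers return 1 for any size other than 1, 2, 4 or 8, mirroring
the size check the handler applies before touching memory.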

Upstream-Status: Pending
Signed-off-by: D Scott Phillips
[AJB: fix align_ldst_regoff_simdfp]
Signed-off-by: Alex Benn=C3=A9e
---
v2
  - fix handling of some registers
vAJB:
  - fix align_ldst_regoff_simdfp
  - fix scale calculation (ternary instead of |)
  - don't skip n =3D=3D t && n !=3D 31 (not relevant to simd/fp)
  - check for invalid option<1>
  - expand opc & 0x2 check to include size
  - add failure pr_warn to fixup_alignment
---
 arch/arm64/include/asm/insn.h |   1 +
 arch/arm64/mm/Makefile        |   3 +-
 arch/arm64/mm/fault.c         | 721 ++++++++++++++++++++++++++++++++++
 arch/arm64/mm/fault_neon.c    |  59 +++
 4 files changed, 783 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/mm/fault_neon.c

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h index 8c0a36f72d6fc..d6e926b5046c1 100644 --- a/arch/arm64/include/asm/insn.h +++ b/arch/arm64/include/asm/insn.h @@ -431,6 +431,7 @@ __AARCH64_INSN_FUNCS(clrex, 0xFFFFF0FF, 0xD503305F) __AARCH64_INSN_FUNCS(ssbb, 0xFFFFFFFF, 0xD503309F) __AARCH64_INSN_FUNCS(pssbb, 0xFFFFFFFF, 0xD503349F) __AARCH64_INSN_FUNCS(bti, 0xFFFFFF3F, 0xD503241f) +__AARCH64_INSN_FUNCS(dc_zva, 0xFFFFFFE0, 0xD50B7420) =20 #undef __AARCH64_INSN_FUNCS =20 diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile index 60454256945b8..05f1ac75e315c 100644 --- a/arch/arm64/mm/Makefile +++ b/arch/arm64/mm/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y :=3D dma-mapping.o extable.o fault.o init.o \ +obj-y :=3D dma-mapping.o extable.o fault.o fault_neon.o init.o \ cache.o copypage.o flush.o \ ioremap.o mmap.o pgd.o mmu.o \ context.o proc.o pageattr.o fixmap.o @@ -13,5 +13,6 @@ obj-$(CONFIG_DEBUG_VIRTUAL) +=3D physaddr.o obj-$(CONFIG_ARM64_MTE) +=3D mteswap.o KASAN_SANITIZE_physaddr.o +=3D n =20 +CFLAGS_REMOVE_fault_neon.o +=3D -mgeneral-regs-only obj-$(CONFIG_KASAN) +=3D kasan_init.o KASAN_SANITIZE_kasan_init.o :=3D n diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 451ba7cbd5adb..744e7b1664b1c 100644 ---
a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -5,6 +5,7 @@ * Copyright (C) 1995 Linus Torvalds * Copyright (C) 1995-2004 Russell King * Copyright (C) 2012 ARM Ltd. + * Copyright (C) 2020 Ampere Computing LLC */ =20 #include @@ -42,8 +43,10 @@ #include #include #include +#include =20 struct fault_info { + /* fault handler, return 0 on successful handling */ int (*fn)(unsigned long far, unsigned long esr, struct pt_regs *regs); int sig; @@ -693,9 +696,727 @@ static int __kprobes do_translation_fault(unsigned lo= ng far, return 0; } =20 +static int copy_from_user_io(void *to, const void __user *from, unsigned l= ong n) +{ + const u8 __user *src =3D from; + u8 *dest =3D to; + + for (; n; n--) + if (get_user(*dest++, src++)) + break; + return n; +} + +static int copy_to_user_io(void __user *to, const void *from, unsigned lon= g n) +{ + const u8 *src =3D from; + u8 __user *dest =3D to; + + for (; n; n--) + if (put_user(*src++, dest++)) + break; + return n; +} + +static int align_load(unsigned long addr, int sz, u64 *out) +{ + union { + u8 d8; + u16 d16; + u32 d32; + u64 d64; + char c[8]; + } data; + + if (sz !=3D 1 && sz !=3D 2 && sz !=3D 4 && sz !=3D 8) + return 1; + if (is_ttbr0_addr(addr)) { + if (copy_from_user_io(data.c, (const void __user *)addr, sz)) + return 1; + } else + memcpy_fromio(data.c, (const void __iomem *)addr, sz); + switch (sz) { + case 1: + *out =3D data.d8; + break; + case 2: + *out =3D data.d16; + break; + case 4: + *out =3D data.d32; + break; + case 8: + *out =3D data.d64; + break; + default: + return 1; + } + return 0; +} + +static int align_store(unsigned long addr, int sz, u64 val) +{ + union { + u8 d8; + u16 d16; + u32 d32; + u64 d64; + char c[8]; + } data; + + switch (sz) { + case 1: + data.d8 =3D val; + break; + case 2: + data.d16 =3D val; + break; + case 4: + data.d32 =3D val; + break; + case 8: + data.d64 =3D val; + break; + default: + return 1; + } + if (is_ttbr0_addr(addr)) { + if (copy_to_user_io((void __user *)addr, data.c, 
sz)) + return 1; + } else + memcpy_toio((void __iomem *)addr, data.c, sz); + return 0; +} + +static int align_dc_zva(unsigned long addr, struct pt_regs *regs) +{ + int bs =3D read_cpuid(DCZID_EL0) & 0xf; + int sz =3D 1 << (bs + 2); + + addr &=3D ~(sz - 1); + if (is_ttbr0_addr(addr)) { + for (; sz; sz--, addr++) { + if (align_store(addr, 1, 0)) + return 1; + } + } else + memset_io((void __iomem *)addr, 0, sz); + return 0; +} + +extern u64 __arm64_get_vn_dt(int n, int t); +extern void __arm64_set_vn_dt(int n, int t, u64 val); + +#define get_vn_dt __arm64_get_vn_dt +#define set_vn_dt __arm64_set_vn_dt + +static int align_ldst_pair(u32 insn, struct pt_regs *regs) +{ + const u32 OPC =3D GENMASK(31, 30); + const u32 L_MASK =3D BIT(22); + + int opc =3D FIELD_GET(OPC, insn); + int L =3D FIELD_GET(L_MASK, insn); + + bool wback =3D !!(insn & BIT(23)); + bool postindex =3D !(insn & BIT(24)); + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int t2 =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT2, insn); + bool is_store =3D !L; + bool is_signed =3D !!(opc & 1); + int scale =3D 2 + (opc >> 1); + int datasize =3D 8 << scale; + u64 uoffset =3D aarch64_insn_decode_immediate(AARCH64_INSN_IMM_7, insn); + s64 offset =3D sign_extend64(uoffset, 6) << scale; + u64 address; + u64 data1, data2; + u64 dbytes; + + if ((is_store && (opc & 1)) || opc =3D=3D 3) + return 1; + + if (wback && (t =3D=3D n || t2 =3D=3D n) && n !=3D 31) + return 1; + + if (!is_store && t =3D=3D t2) + return 1; + + dbytes =3D datasize / 8; + + address =3D regs_get_register(regs, n << 3); + + if (!postindex) + address +=3D offset; + + if (is_store) { + data1 =3D pt_regs_read_reg(regs, t); + data2 =3D pt_regs_read_reg(regs, t2); + if (align_store(address, dbytes, data1) || + align_store(address + dbytes, dbytes, data2)) + return 1; + } else { + if (align_load(address, dbytes, &data1) || + align_load(address + dbytes, dbytes, 
&data2)) + return 1; + if (is_signed) { + data1 =3D sign_extend64(data1, datasize - 1); + data2 =3D sign_extend64(data2, datasize - 1); + } + pt_regs_write_reg(regs, t, data1); + pt_regs_write_reg(regs, t2, data2); + } + + if (wback) { + if (postindex) + address +=3D offset; + if (n =3D=3D 31) + regs->sp =3D address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_pair_simdfp(u32 insn, struct pt_regs *regs) +{ + const u32 OPC =3D GENMASK(31, 30); + const u32 L_MASK =3D BIT(22); + + int opc =3D FIELD_GET(OPC, insn); + int L =3D FIELD_GET(L_MASK, insn); + + bool wback =3D !!(insn & BIT(23)); + bool postindex =3D !(insn & BIT(24)); + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int t2 =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT2, insn); + bool is_store =3D !L; + int scale =3D 2 + opc; + int datasize =3D 8 << scale; + u64 uoffset =3D aarch64_insn_decode_immediate(AARCH64_INSN_IMM_7, insn); + s64 offset =3D sign_extend64(uoffset, 6) << scale; + u64 address; + u64 data1_d0, data1_d1, data2_d0, data2_d1; + u64 dbytes; + + if (opc =3D=3D 0x3) + return 1; + + if (!is_store && t =3D=3D t2) + return 1; + + dbytes =3D datasize / 8; + + address =3D regs_get_register(regs, n << 3); + + if (!postindex) + address +=3D offset; + + if (is_store) { + data1_d0 =3D get_vn_dt(t, 0); + data2_d0 =3D get_vn_dt(t2, 0); + if (datasize =3D=3D 128) { + data1_d1 =3D get_vn_dt(t, 1); + data2_d1 =3D get_vn_dt(t2, 1); + if (align_store(address, 8, data1_d0) || + align_store(address + 8, 8, data1_d1) || + align_store(address + 16, 8, data2_d0) || + align_store(address + 24, 8, data2_d1)) + return 1; + } else { + if (align_store(address, dbytes, data1_d0) || + align_store(address + dbytes, dbytes, data2_d0)) + return 1; + } + } else { + if (datasize =3D=3D 128) { + if (align_load(address, 8, &data1_d0) || + align_load(address + 8, 8, 
&data1_d1) || + align_load(address + 16, 8, &data2_d0) || + align_load(address + 24, 8, &data2_d1)) + return 1; + } else { + if (align_load(address, dbytes, &data1_d0) || + align_load(address + dbytes, dbytes, &data2_d0)) + return 1; + data1_d1 =3D data2_d1 =3D 0; + } + set_vn_dt(t, 0, data1_d0); + set_vn_dt(t, 1, data1_d1); + set_vn_dt(t2, 0, data2_d0); + set_vn_dt(t2, 1, data2_d1); + } + + if (wback) { + if (postindex) + address +=3D offset; + if (n =3D=3D 31) + regs->sp =3D address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_regoff(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE =3D GENMASK(31, 30); + const u32 OPC =3D GENMASK(23, 22); + const u32 OPTION =3D GENMASK(15, 13); + const u32 S =3D BIT(12); + + u32 size =3D FIELD_GET(SIZE, insn); + u32 opc =3D FIELD_GET(OPC, insn); + u32 option =3D FIELD_GET(OPTION, insn); + u32 s =3D FIELD_GET(S, insn); + int scale =3D size; + int extend_len =3D (option & 0x1) ? 64 : 32; + bool extend_unsigned =3D !(option & 0x4); + int shift =3D s ? scale : 0; + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int m =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RM, insn); + bool is_store; + bool is_signed; + int regsize; + int datasize; + u64 offset; + u64 address; + u64 data; + + if ((opc & 0x2) =3D=3D 0) { + /* store or zero-extending load */ + is_store =3D !(opc & 0x1); + regsize =3D size =3D=3D 0x3 ? 64 : 32; + is_signed =3D false; + } else { + if (size =3D=3D 0x3) { + if ((opc & 0x1) =3D=3D 0) { + /* prefetch */ + return 0; + } else { + /* undefined */ + return 1; + } + } else { + /* sign-extending load */ + is_store =3D false; + if (size =3D=3D 0x2 && (opc & 0x1) =3D=3D 0x1) { + /* undefined */ + return 1; + } + regsize =3D (opc & 0x1) =3D=3D 0x1 ? 
32 : 64; + is_signed =3D true; + } + } + + datasize =3D 8 << scale; + + if ((option & 0x2) =3D=3D 0) /* option<1> =3D=3D '0' is UNDEFINED */ + return 1; + + offset =3D pt_regs_read_reg(regs, m); + if (extend_len =3D=3D 32) { + offset &=3D (u32)~0; + if (!extend_unsigned) + offset =3D sign_extend64(offset, 31); + } + offset <<=3D shift; + + address =3D regs_get_register(regs, n << 3) + offset; + + if (is_store) { + data =3D pt_regs_read_reg(regs, t); + if (align_store(address, datasize / 8, data)) + return 1; + } else { + if (align_load(address, datasize / 8, &data)) + return 1; + if (is_signed) { + if (regsize =3D=3D 32) + data =3D (u32)sign_extend32(data, datasize - 1); + else + data =3D sign_extend64(data, datasize - 1); + } + pt_regs_write_reg(regs, t, data); + } + return 0; +} + +static int align_ldst_regoff_simdfp(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE =3D GENMASK(31, 30); + const u32 OPC =3D GENMASK(23, 22); + const u32 OPTION =3D GENMASK(15, 13); + const u32 S =3D BIT(12); + + u32 size =3D FIELD_GET(SIZE, insn); + u32 opc =3D FIELD_GET(OPC, insn); + u32 option =3D FIELD_GET(OPTION, insn); + u32 s =3D FIELD_GET(S, insn); + /* this elides the 8/16 bit sign extensions */ + int extend_len =3D (option & 0x1) ? 
64 : 32; + bool extend_unsigned =3D !(option & 0x4); + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int m =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RM, insn); + bool is_store =3D !(opc & BIT(0)); + int scale; + int shift; + int datasize; + u64 offset; + u64 address; + u64 data_d0, data_d1; + + /* if option<1> =3D=3D '0' then UNDEFINED; // sub-word index */ + if ((option & 0x2) =3D=3D 0) { + pr_warn("option<1> =3D=3D 0 is UNDEFINED\n"); + return 1; + } + + /* if opc<1> =3D=3D '1' && size !=3D '00' then UNDEFINED;*/ + if ((opc & 0x2) && size !=3D 0b00) { + pr_warn("opc<1> =3D=3D '1' && size !=3D '00' is UNDEFINED\n"); + return 1; + } + + /* + * constant integer scale =3D if opc<1> =3D=3D '1' then 4 else UInt(size); + */ + scale =3D opc & 0x2 ? 4 : size; + shift =3D s ? scale : 0; + + datasize =3D 8 << scale; + + offset =3D pt_regs_read_reg(regs, m); + if (extend_len =3D=3D 32) { + offset &=3D (u32)~0; + if (!extend_unsigned) + offset =3D sign_extend64(offset, 31); + } + offset <<=3D shift; + + address =3D regs_get_register(regs, n << 3) + offset; + + if (is_store) { + data_d0 =3D get_vn_dt(t, 0); + if (datasize =3D=3D 128) { + data_d1 =3D get_vn_dt(t, 1); + if (align_store(address, 8, data_d0) || + align_store(address + 8, 8, data_d1)) + return 1; + } else { + if (align_store(address, datasize / 8, data_d0)) + return 1; + } + } else { + if (datasize =3D=3D 128) { + if (align_load(address, 8, &data_d0) || + align_load(address + 8, 8, &data_d1)) + return 1; + } else { + if (align_load(address, datasize / 8, &data_d0)) + return 1; + data_d1 =3D 0; + } + set_vn_dt(t, 0, data_d0); + set_vn_dt(t, 1, data_d1); + } + + return 0; +} + +static int align_ldst_imm(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE =3D GENMASK(31, 30); + const u32 OPC =3D GENMASK(23, 22); + + u32 size =3D FIELD_GET(SIZE, insn); + u32 opc =3D FIELD_GET(OPC, insn); + bool wback =3D !(insn & 
BIT(24)) && !!(insn & BIT(10)); + bool postindex =3D wback && !(insn & BIT(11)); + int scale =3D size; + u64 offset; + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + bool is_store; + bool is_signed; + int regsize; + int datasize; + u64 address; + u64 data; + + if (!(insn & BIT(24))) { + u64 uoffset =3D + aarch64_insn_decode_immediate(AARCH64_INSN_IMM_9, insn); + offset =3D sign_extend64(uoffset, 8); + } else { + offset =3D aarch64_insn_decode_immediate(AARCH64_INSN_IMM_12, insn); + offset <<=3D scale; + } + + if ((opc & 0x2) =3D=3D 0) { + /* store or zero-extending load */ + is_store =3D !(opc & 0x1); + regsize =3D size =3D=3D 0x3 ? 64 : 32; + is_signed =3D false; + } else { + if (size =3D=3D 0x3) { + if (FIELD_GET(GENMASK(11, 10), insn) =3D=3D 0 && (opc & 0x1) =3D=3D 0) { + /* prefetch */ + return 0; + } else { + /* undefined */ + return 1; + } + } else { + /* sign-extending load */ + is_store =3D false; + if (size =3D=3D 0x2 && (opc & 0x1) =3D=3D 0x1) { + /* undefined */ + return 1; + } + regsize =3D (opc & 0x1) =3D=3D 0x1 ? 
32 : 64; + is_signed =3D true; + } + } + + datasize =3D 8 << scale; + + if (wback && n =3D=3D t && n !=3D 31) + return 1; + + address =3D regs_get_register(regs, n << 3); + + if (!postindex) + address +=3D offset; + + if (is_store) { + data =3D pt_regs_read_reg(regs, t); + if (align_store(address, datasize / 8, data)) + return 1; + } else { + if (align_load(address, datasize / 8, &data)) + return 1; + if (is_signed) { + if (regsize =3D=3D 32) + data =3D (u32)sign_extend32(data, datasize - 1); + else + data =3D sign_extend64(data, datasize - 1); + } + pt_regs_write_reg(regs, t, data); + } + + if (wback) { + if (postindex) + address +=3D offset; + if (n =3D=3D 31) + regs->sp =3D address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_imm_simdfp(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE =3D GENMASK(31, 30); + const u32 OPC =3D GENMASK(23, 22); + + u32 size =3D FIELD_GET(SIZE, insn); + u32 opc =3D FIELD_GET(OPC, insn); + bool wback =3D !(insn & BIT(24)) && !!(insn & BIT(10)); + bool postindex =3D wback && !(insn & BIT(11)); + int scale =3D (opc & 0x2) << 1 | size; + u64 offset; + + int n =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t =3D aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + bool is_store =3D !(opc & BIT(0)); + int datasize; + u64 address; + u64 data_d0, data_d1; + + if (scale > 4) + return 1; + + if (!(insn & BIT(24))) { + u64 uoffset =3D + aarch64_insn_decode_immediate(AARCH64_INSN_IMM_9, insn); + offset =3D sign_extend64(uoffset, 8); + } else { + offset =3D aarch64_insn_decode_immediate(AARCH64_INSN_IMM_12, insn); + offset <<=3D scale; + } + + datasize =3D 8 << scale; + + address =3D regs_get_register(regs, n << 3); + + if (!postindex) + address +=3D offset; + + if (is_store) { + data_d0 =3D get_vn_dt(t, 0); + if (datasize =3D=3D 128) { + data_d1 =3D get_vn_dt(t, 1); + if (align_store(address, 8, data_d0) || + align_store(address + 8, 8, data_d1)) + return 1; + } else { + if 
(align_store(address, datasize / 8, data_d0)) + return 1; + } + } else { + if (datasize =3D=3D 128) { + if (align_load(address, 8, &data_d0) || + align_load(address + 8, 8, &data_d1)) + return 1; + } else { + if (align_load(address, datasize / 8, &data_d0)) + return 1; + data_d1 =3D 0; + } + set_vn_dt(t, 0, data_d0); + set_vn_dt(t, 1, data_d1); + } + + if (wback) { + if (postindex) + address +=3D offset; + if (n =3D=3D 31) + regs->sp =3D address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst(u32 insn, struct pt_regs *regs) +{ + const u32 op0 =3D FIELD_GET(GENMASK(31, 28), insn); + const u32 op1 =3D FIELD_GET(BIT(26), insn); + const u32 op2 =3D FIELD_GET(GENMASK(24, 23), insn); + const u32 op3 =3D FIELD_GET(GENMASK(21, 16), insn); + const u32 op4 =3D FIELD_GET(GENMASK(11, 10), insn); + + if ((op0 & 0x3) =3D=3D 0x2) { + /* + * |------+-----+-----+-----+-----+-------------------------------------= ----| + * | op0 | op1 | op2 | op3 | op4 | Decode group = | + * |------+-----+-----+-----+-----+-------------------------------------= ----| + * | xx10 | - | 00 | - | - | Load/store no-allocate pair (offset)= | + * | xx10 | - | 01 | - | - | Load/store register pair (post-index= ed) | + * | xx10 | - | 10 | - | - | Load/store register pair (offset) = | + * | xx10 | - | 11 | - | - | Load/store register pair (pre-indexe= d) | + * |------+-----+-----+-----+-----+-------------------------------------= ----| + */ + + if (op1 =3D=3D 0) { /* V =3D=3D 0 */ + /* general */ + return align_ldst_pair(insn, regs); + } else { + /* simdfp */ + return align_ldst_pair_simdfp(insn, regs); + } + } else if ((op0 & 0x3) =3D=3D 0x3 && + (((op2 & 0x2) =3D=3D 0 && (op3 & 0x20) =3D=3D 0 && op4 !=3D 0x2) || + ((op2 & 0x2) =3D=3D 0x2))) { + /* + * |------+-----+-----+--------+-----+----------------------------------= ------------| + * | op0 | op1 | op2 | op3 | op4 | Decode group = | + * |------+-----+-----+--------+-----+----------------------------------= 
------------| + * | xx11 | - | 0x | 0xxxxx | 00 | Load/store register (unscaled imm= ediate) | + * | xx11 | - | 0x | 0xxxxx | 01 | Load/store register (immediate po= st-indexed) | + * | xx11 | - | 0x | 0xxxxx | 11 | Load/store register (immediate pr= e-indexed) | + * | xx11 | - | 1x | - | - | Load/store register (unsigned imm= ediate) | + * |------+-----+-----+--------+-----+----------------------------------= ------------| + */ + + if (op1 =3D=3D 0) { /* V =3D=3D 0 */ + /* general */ + return align_ldst_imm(insn, regs); + } else { + /* simdfp */ + return align_ldst_imm_simdfp(insn, regs); + } + } else if ((op0 & 0x3) =3D=3D 0x3 && (op2 & 0x2) =3D=3D 0 && + (op3 & 0x20) =3D=3D 0x20 && op4 =3D=3D 0x2) { + /* + * |------+-----+-----+--------+-----+----------------------------------= -----| + * | op0 | op1 | op2 | op3 | op4 | = | + * |------+-----+-----+--------+-----+----------------------------------= -----| + * | xx11 | - | 0x | 1xxxxx | 10 | Load/store register (register off= set) | + * |------+-----+-----+--------+-----+----------------------------------= -----| + */ + if (op1 =3D=3D 0) { /* V =3D=3D 0 */ + /* general */ + return align_ldst_regoff(insn, regs); + } else { + /* simdfp */ + return align_ldst_regoff_simdfp(insn, regs); + } + } else + return 1; +} + +static int fixup_alignment(unsigned long addr, unsigned int esr, + struct pt_regs *regs) +{ + u32 insn; + int res; + + if (user_mode(regs)) { + __le32 insn_le; + + if (!is_ttbr0_addr(addr)) + return 1; + + if (get_user(insn_le, + (__le32 __user *)instruction_pointer(regs))) + return 1; + insn =3D le32_to_cpu(insn_le); + } else { + if (aarch64_insn_read((void *)instruction_pointer(regs), &insn)) + return 1; + } + + if (aarch64_insn_is_class_branch_sys(insn)) { + if (aarch64_insn_is_dc_zva(insn)) + res =3D align_dc_zva(addr, regs); + else + res =3D 1; + } else if (((insn >> 25) & 0x5) =3D=3D 0x4) { + res =3D align_ldst(insn, regs); + } else { + res =3D 1; + } + + if (!res) + instruction_pointer_set(regs, 
instruction_pointer(regs) + 4); + else + pr_warn("%s: failed to fixup 0x%08x\n", __func__, insn); + + return res; +} + static int do_alignment_fault(unsigned long far, unsigned long esr, struct pt_regs *regs) { +#ifdef CONFIG_ALTRA_ERRATUM_82288 + if (!fixup_alignment(far, esr, regs)) + return 0; +#endif if (IS_ENABLED(CONFIG_COMPAT_ALIGNMENT_FIXUPS) && compat_user_mode(regs)) return do_compat_alignment_fixup(far, regs); diff --git a/arch/arm64/mm/fault_neon.c b/arch/arm64/mm/fault_neon.c new file mode 100644 index 0000000000000..d5319ed07d89b --- /dev/null +++ b/arch/arm64/mm/fault_neon.c @@ -0,0 +1,59 @@ +/* + * These functions require asimd, which is not accepted by Clang in normal + * kernel code, which is compiled with -mgeneral-regs-only. GCC will someh= ow + * eat it regardless, but we want it to be portable, so move these in their + * own translation unit. This allows us to turn off -mgeneral-regs-only for + * these (where it should be harmless) without risking the compiler doing + * wrong things in places where we don't want it to. + * + * Otherwise this is identical to the original patch. 
+ * + * -- q66 + * + */ + +#include + +u64 __arm64_get_vn_dt(int n, int t) { + u64 res; + + switch (n) { +#define V(n) \ + case n: \ + asm("cbnz %w1, 1f\n\t" \ + "mov %0, v"#n".d[0]\n\t" \ + "b 2f\n\t" \ + "1: mov %0, v"#n".d[1]\n\t" \ + "2:" : "=3Dr" (res) : "r" (t)); \ + break + V( 0); V( 1); V( 2); V( 3); V( 4); V( 5); V( 6); V( 7); + V( 8); V( 9); V(10); V(11); V(12); V(13); V(14); V(15); + V(16); V(17); V(18); V(19); V(20); V(21); V(22); V(23); + V(24); V(25); V(26); V(27); V(28); V(29); V(30); V(31); +#undef V + default: + res =3D 0; + break; + } + return res; +} + +void __arm64_set_vn_dt(int n, int t, u64 val) { + switch (n) { +#define V(n) \ + case n: \ + asm("cbnz %w1, 1f\n\t" \ + "mov v"#n".d[0], %0\n\t" \ + "b 2f\n\t" \ + "1: mov v"#n".d[1], %0\n\t" \ + "2:" :: "r" (val), "r" (t)); \ + break + V( 0); V( 1); V( 2); V( 3); V( 4); V( 5); V( 6); V( 7); + V( 8); V( 9); V(10); V(11); V(12); V(13); V(14); V(15); + V(16); V(17); V(18); V(19); V(20); V(21); V(22); V(23); + V(24); V(25); V(26); V(27); V(28); V(29); V(30); V(31); +#undef V + default: + break; + } +}
--=20
2.39.2

From nobody Fri Dec 19 12:00:34 2025
From: Alex Bennée
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org,
 D Scott Phillips, Alex Bennée
Subject: [PATCH 2/3] ampere/arm64: Work around Ampere Altra erratum #82288 PCIE_65
Date: Tue, 27 Aug 2024 14:08:28 +0100
Message-Id: <20240827130829.43632-3-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org>
References: <20240827130829.43632-1-alex.bennee@linaro.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

From: D Scott Phillips

Altra's PCIe controller may generate incorrect addresses when
receiving writes from the CPU with a discontiguous set of byte
enables. Attempt to work around this by handing out Device-nGnRE maps
instead of Normal Non-cacheable maps for PCIe memory areas.

Upstream-Status: Pending
Signed-off-by: D Scott Phillips
Signed-off-by: Alex Bennée
---
 arch/arm64/Kconfig               | 22 +++++++++++++++++++++-
 arch/arm64/include/asm/io.h      |  3 +++
 arch/arm64/include/asm/pgtable.h | 27 ++++++++++++++++++++++-----
 arch/arm64/mm/ioremap.c          | 27 +++++++++++++++++++++++++++
 drivers/pci/quirks.c             |  9 +++++++++
 include/asm-generic/io.h         |  4 ++++
 mm/ioremap.c                     |  2 +-
 7 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b3fc891f15442..01adb50df214e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -440,6 +440,27 @@ config AMPERE_ERRATUM_AC03_CPU_38
 config ARM64_WORKAROUND_CLEAN_CACHE
 	bool
 
+config ALTRA_ERRATUM_82288
+	bool "Ampere Altra: 82288: PCIE_65: PCIe Root Port outbound write combining issue"
+	default y
+	help
+	  This option adds an alternative code sequence to work around
+	  Ampere Altra erratum 82288.
+
+	  PCIe device drivers may map MMIO space as Normal, non-cacheable
+	  memory attribute (e.g. Linux kernel drivers mapping MMIO
+	  using ioremap_wc). This may be for the purpose of enabling write
+	  combining or unaligned accesses. This can result in data corruption
+	  on the PCIe interface’s outbound MMIO writes due to issues with the
+	  write-combining operation.
+
+	  The workaround modifies software that maps PCIe MMIO space as Normal,
+	  non-cacheable memory (e.g. ioremap_wc) to instead use Device,
+	  non-gathering memory (e.g. ioremap). And all memory operations on PCIe
+	  MMIO space must be strictly aligned.
+
+	  If unsure, say Y.
+
 config ARM64_ERRATUM_826319
 	bool "Cortex-A53: 826319: System might deadlock if a write cannot complete until read data is accepted"
 	default y
@@ -2388,4 +2409,3 @@ endmenu # "CPU Power Management"
 source "drivers/acpi/Kconfig"
 
 source "arch/arm64/kvm/Kconfig"
-
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 41fd90895dfc3..403b65f2f44de 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -273,6 +273,9 @@ __iowrite64_copy(void __iomem *to, const void *from, size_t count)
 
 #define ioremap_prot ioremap_prot
 
+pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size, unsigned long prot);
+#define ioremap_map_prot ioremap_map_prot
+
 #define _PAGE_IOREMAP PROT_DEVICE_nGnRE
 
 #define ioremap_wc(addr, size)	\
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7a4f5604be3f7..f4603924390eb 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -236,11 +236,6 @@ static inline pte_t pte_mkyoung(pte_t pte)
 	return set_pte_bit(pte, __pgprot(PTE_AF));
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return set_pte_bit(pte, __pgprot(PTE_SPECIAL));
-}
-
 static inline pte_t pte_mkcont(pte_t pte)
 {
 	pte = set_pte_bit(pte, __pgprot(PTE_CONT));
@@ -682,6 +677,28 @@ static inline bool pud_table(pud_t pud) { return true; }
 				 PUD_TYPE_TABLE)
 #endif
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+extern bool __read_mostly have_altra_erratum_82288;
+#endif
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+	phys_addr_t phys = __pte_to_phys(pte);
+	pgprot_t prot = __pgprot(pte_val(pte) & ~PTE_ADDR_LOW);
+
+	if (unlikely(have_altra_erratum_82288) &&
+	    (phys < 0x80000000 ||
+	     (phys >= 0x200000000000 && phys < 0x400000000000) ||
+	     (phys >= 0x600000000000 && phys < 0x800000000000))) {
+		pte = __pte(__phys_to_pte_val(phys) | pgprot_val(pgprot_device(prot)));
+	}
+#endif
+
+	return set_pte_bit(pte, __pgprot(PTE_SPECIAL));
+}
+
+
 extern pgd_t init_pg_dir[];
 extern pgd_t init_pg_end[];
 extern pgd_t swapper_pg_dir[];
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index 269f2f63ab7dc..8965766181359 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -3,6 +3,33 @@
 #include
 #include
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+
+bool have_altra_erratum_82288 __read_mostly;
+EXPORT_SYMBOL(have_altra_erratum_82288);
+
+static bool is_altra_pci(phys_addr_t phys_addr, size_t size)
+{
+	phys_addr_t end = phys_addr + size;
+
+	return (phys_addr < 0x80000000 ||
+		(end > 0x200000000000 && phys_addr < 0x400000000000) ||
+		(end > 0x600000000000 && phys_addr < 0x800000000000));
+}
+#endif
+
+pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size,
+			  unsigned long prot_val)
+{
+	pgprot_t prot = __pgprot(prot_val);
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+	if (unlikely(have_altra_erratum_82288 && is_altra_pci(phys_addr, size))) {
+		prot = pgprot_device(prot);
+	}
+#endif
+	return prot;
+}
+
 void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
 			   unsigned long prot)
 {
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a2ce4e08edf5a..8baf90ee3357c 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6234,6 +6234,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa73f, dpc_log_size);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa76e, dpc_log_size);
 #endif
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+static void quirk_altra_erratum_82288(struct pci_dev *dev)
+{
+	pr_info_once("Write combining PCI maps disabled due to hardware erratum\n");
+	have_altra_erratum_82288 = true;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMPERE, 0xe100, quirk_altra_erratum_82288);
+#endif
+
 /*
  * For a PCI device with multiple downstream devices, its driver may use
  * a flattened device tree to describe the downstream devices.
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 80de699bf6af4..75670d7094537 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -1047,6 +1047,10 @@ static inline void iounmap(volatile void __iomem *addr)
 #elif defined(CONFIG_GENERIC_IOREMAP)
 #include
 
+#ifndef ioremap_map_prot
+#define ioremap_map_prot(phys_addr, size, prot) __pgprot(prot)
+#endif
+
 void __iomem *generic_ioremap_prot(phys_addr_t phys_addr, size_t size,
 				   pgprot_t prot);
 
diff --git a/mm/ioremap.c b/mm/ioremap.c
index 3e049dfb28bd0..a4e6950682f33 100644
--- a/mm/ioremap.c
+++ b/mm/ioremap.c
@@ -52,7 +52,7 @@ void __iomem *generic_ioremap_prot(phys_addr_t phys_addr, size_t size,
 void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
 			   unsigned long prot)
 {
-	return generic_ioremap_prot(phys_addr, size, __pgprot(prot));
+	return generic_ioremap_prot(phys_addr, size, ioremap_map_prot(phys_addr, size, prot));
 }
 EXPORT_SYMBOL(ioremap_prot);
 #endif
-- 
2.39.2

From nobody Fri Dec 19 12:00:34 2025
Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A44801C689C for ; Tue, 27 Aug 2024 13:08:34 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.44
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724764117; cv=none; b=agD2kwbXgswnC9Jt0VufAOJKMkVzpXJ6ekzj0gDd1wl7yGAYwmXBKaLR6LSMAZfxSPMD3eumf7ik7e12zNtgv5foBY9PBvSaG9mN/l8H93QsJecd60dr25GAJ1avZbCezPggTw8WWlZ1Q76KfeWWQcH/ucHRLlQACI6KJ3Hux2k=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724764117; c=relaxed/simple; bh=DKZ2PjINcn2iG38PL7e/2uFqlx0qNbPScgOvJPCEMaI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type;
b=fX/boZfyHsKNMrsow9qJgYhTQmCGz+OZuQztV83y7l9UnDjge2UdcnRmqyzvcFCtnkBOC2sj2zpL3ziJMG7LU4t2373vkAUgs3J0r+Bscj9D1L7EU+niFzyasY1DHPeP9IVVlG6OV0Hh9uu2BR7M15HqkvJMAPdPWDJ0s8IMQPs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linaro.org; spf=pass smtp.mailfrom=linaro.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b=K3aCZ5jo; arc=none smtp.client-ip=209.85.221.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linaro.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="K3aCZ5jo" Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-3717ff2358eso3000943f8f.1 for ; Tue, 27 Aug 2024 06:08:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1724764113; x=1725368913; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7KqRcAFW8nUzDZ1FEN66jGQtx4LYaSNAvAG+5szB8q0=; b=K3aCZ5jo8K2zlHgZPYjZIJGNJpUpr05l82Yv/lFxIrdlvjsJBAab1K9PPi6B2CYUUK Tg8RFOCQPqG9ufKL1aqyktO7jsULYeTpNLLpkLSmbzf9ZeJ0cPmJ1AsoZOwtr5648feS XBe033GcGQlsp0niXml7uoadWyxzutHwPNCLT8pmsTv6RNxr7kqKG1tPccqG4CbEAzMY OGa/FVmKcKPhM6ixlSyXPwCNgl5QGVZbmuvCvAlg/EH7cM2u5FudVMkmXC13DYveDXiN 6R8mcz70LySzPXha9gpfxLwV7es1Tk9woyiDVBQxqFM29X2nXfKddpOblgLYcqRrbcOh 6OSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724764113; x=1725368913; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7KqRcAFW8nUzDZ1FEN66jGQtx4LYaSNAvAG+5szB8q0=; b=Jmnb+Pjh86/hLOEkMs1fPXwTgP5m51YGboyXSsYUD+7wGU+CR9+qnsNN6Bz6YyDOi5 
GpRUGmxBhH8QX37nAKk9ldJeN+JqtirvKl6mE0YvPsayttaLj9U8xJr2dAvf+lY0wg/Q thPM8YyHXUBRw05wfsZ1Ai8liHGrM5lp4ct1C56VLbtSbgZzuaz4y/b7OvrNutC1DEEr CXJVnIz0VvHqP4MT/jRikMx0xB+y0dx2OCJu6tgRnG4vkTHR8h7/LJjiI+eBxciwB13s 8XRfkz9d3DQCRuDlI9rrFeYUvAzU0nWe5YQE62x4aszplUSMjKA4UzcRszRjdRD+aTlj ij+g== X-Gm-Message-State: AOJu0YyX4RIRLVtdfT1H4ZPCFKK9JjV/AiR2IG/w9zSgqsmNFiMgKjYt CQdldNo26TQdIBNbjsFce6+twvhvrrhcW9ydzccJD9PKNk8UEBxQpHvedv7O5DU= X-Google-Smtp-Source: AGHT+IFlcKOK92s/UxJ1kIhXDpRFVT9AfPK9EVRV4+49EEVGWs+1h7/ccWiUMy3fn1r+RZMCd6eVbg== X-Received: by 2002:adf:a351:0:b0:371:7e46:68cb with SMTP id ffacd0b85a97d-373118c6b95mr7484089f8f.50.1724764112624; Tue, 27 Aug 2024 06:08:32 -0700 (PDT) Received: from draig.lan ([85.9.250.243]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3730813c1e3sm13012590f8f.35.2024.08.27.06.08.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Aug 2024 06:08:32 -0700 (PDT) Received: from draig.lan (localhost [IPv6:::1]) by draig.lan (Postfix) with ESMTP id 380E35F9FF; Tue, 27 Aug 2024 14:08:30 +0100 (BST) From: =?UTF-8?q?Alex=20Benn=C3=A9e?= To: linux-kernel@vger.kernel.org Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org, =?UTF-8?q?Alex=20Benn=C3=A9e?= Subject: [PATCH 3/3] ampere/arm64: instrument the altra workarounds Date: Tue, 27 Aug 2024 14:08:29 +0100 Message-Id: <20240827130829.43632-4-alex.bennee@linaro.org> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org> References: <20240827130829.43632-1-alex.bennee@linaro.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable This is mostly a debugging aid to measure the impact of workarounds. 
Signed-off-by: Alex Bennée
---
 arch/arm64/include/asm/pgtable.h   |  2 ++
 arch/arm64/mm/fault.c              |  9 +++--
 arch/arm64/mm/ioremap.c            | 11 ++++++
 include/trace/events/altra_fixup.h | 57 ++++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/altra_fixup.h

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f4603924390eb..26812b7fc6d93 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -679,6 +679,7 @@ static inline bool pud_table(pud_t pud) { return true; }
 
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 extern bool __read_mostly have_altra_erratum_82288;
+void do_trace_altra_mkspecial(pte_t pte);
 #endif
 
 static inline pte_t pte_mkspecial(pte_t pte)
@@ -692,6 +693,7 @@ static inline pte_t pte_mkspecial(pte_t pte)
 	     (phys >= 0x200000000000 && phys < 0x400000000000) ||
 	     (phys >= 0x600000000000 && phys < 0x800000000000))) {
 		pte = __pte(__phys_to_pte_val(phys) | pgprot_val(pgprot_device(prot)));
+		do_trace_altra_mkspecial(pte);
 	}
 #endif
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 744e7b1664b1c..6cb3c600cc56a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -45,6 +45,8 @@
 #include
 #include
 
+#include
+
 struct fault_info {
 	/* fault handler, return 0 on successful handling */
 	int (*fn)(unsigned long far, unsigned long esr,
@@ -1376,6 +1378,8 @@ static int fixup_alignment(unsigned long addr, unsigned int esr,
 	u32 insn;
 	int res;
 
+	trace_altra_fixup_alignment(addr, esr);
+
 	if (user_mode(regs)) {
 		__le32 insn_le;
 
@@ -1414,8 +1418,9 @@ static int do_alignment_fault(unsigned long far, unsigned long esr,
 			      struct pt_regs *regs)
 {
 #ifdef CONFIG_ALTRA_ERRATUM_82288
-	if (!fixup_alignment(far, esr, regs))
-		return 0;
+	if (!fixup_alignment(far, esr, regs)) {
+		return 0;
+	}
 #endif
 	if (IS_ENABLED(CONFIG_COMPAT_ALIGNMENT_FIXUPS) &&
 	    compat_user_mode(regs))
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index 8965766181359..d38d903d8a063 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -5,9 +5,19 @@
 
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 
+#define CREATE_TRACE_POINTS
+#include
+
 bool have_altra_erratum_82288 __read_mostly;
 EXPORT_SYMBOL(have_altra_erratum_82288);
 
+void do_trace_altra_mkspecial(pte_t pte)
+{
+	trace_altra_mkspecial(pte);
+}
+EXPORT_SYMBOL(do_trace_altra_mkspecial);
+EXPORT_TRACEPOINT_SYMBOL(altra_mkspecial);
+
 static bool is_altra_pci(phys_addr_t phys_addr, size_t size)
 {
 	phys_addr_t end = phys_addr + size;
@@ -25,6 +35,7 @@ pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size,
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 	if (unlikely(have_altra_erratum_82288 && is_altra_pci(phys_addr, size))) {
 		prot = pgprot_device(prot);
+		trace_altra_ioremap_prot(prot);
 	}
 #endif
 	return prot;
diff --git a/include/trace/events/altra_fixup.h b/include/trace/events/altra_fixup.h
new file mode 100644
index 0000000000000..73115740c5d84
--- /dev/null
+++ b/include/trace/events/altra_fixup.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM altra_fixup
+
+#if !defined(_ALTRA_FIXUP_H_) || defined(TRACE_HEADER_MULTI_READ)
+#define _ALTRA_FIXUP_H_
+
+#include
+#include
+
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+
+TRACE_EVENT(altra_fixup_alignment,
+	TP_PROTO(unsigned long far, unsigned long esr),
+	TP_ARGS(far, esr),
+	TP_STRUCT__entry(
+		__field(unsigned long, far)
+		__field(unsigned long, esr)
+	),
+	TP_fast_assign(
+		__entry->far = far;
+		__entry->esr = esr;
+	),
+	TP_printk("far=0x%016lx esr=0x%016lx",
+		__entry->far, __entry->esr)
+);
+
+TRACE_EVENT(altra_mkspecial,
+	TP_PROTO(pte_t pte),
+	TP_ARGS(pte),
+	TP_STRUCT__entry(
+		__field(pteval_t, pte)
+	),
+	TP_fast_assign(
+		__entry->pte = pte_val(pte);
+	),
+	TP_printk("pte=0x%016llx", __entry->pte)
+);
+
+TRACE_EVENT(altra_ioremap_prot,
+	TP_PROTO(pgprot_t prot),
+	TP_ARGS(prot),
+	TP_STRUCT__entry(
+		__field(pteval_t, pte)
+	),
+	TP_fast_assign(
+		__entry->pte = pgprot_val(prot);
+	),
+	TP_printk("prot=0x%016llx", __entry->pte)
+);
+
+#endif /* CONFIG_ALTRA_ERRATUM_82288 */
+
+#endif /* _ALTRA_FIXUP_H_ */
+
+/* This part must be outside protection */
+#include
-- 
2.39.2
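
A note for readers comparing the two address checks in this series: pte_mkspecial() tests a single physical address, while is_altra_pci() tests whether the half-open range [phys_addr, phys_addr + size) overlaps a PCIe window. The user-space C sketch below reproduces only that overlap arithmetic so it can be tried outside the kernel. The names altra_pci_windows and overlaps_altra_pci are hypothetical and not part of the patch. The window bounds are copied from the diff, and the sketch assumes size > 0 and no address overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical windows [start, end), bounds copied from the patch. */
struct window {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical table name; the patch open-codes these three ranges. */
static const struct window altra_pci_windows[] = {
	{ 0x0ULL,            0x80000000ULL },
	{ 0x200000000000ULL, 0x400000000000ULL },
	{ 0x600000000000ULL, 0x800000000000ULL },
};

/*
 * Return true if [phys, phys + size) overlaps any window.
 * Assumes size > 0 and that phys + size does not wrap around.
 */
static bool overlaps_altra_pci(uint64_t phys, uint64_t size)
{
	uint64_t end = phys + size;
	size_t i;

	for (i = 0; i < sizeof(altra_pci_windows) / sizeof(altra_pci_windows[0]); i++) {
		const struct window *w = &altra_pci_windows[i];

		/* Two half-open ranges overlap when each starts before the other ends. */
		if (end > w->start && phys < w->end)
			return true;
	}
	return false;
}
```

For example, a 0x2000-byte mapping at 0x1ffffffff000 straddles the start of the second window and is reported as overlapping, whereas a check on its start address alone, as in pte_mkspecial(), would not match.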