From nobody Thu Apr  2 17:23:45 2026
Date: Fri, 27 Mar 2026 18:57:30 +0000 (GMT)
From: "Maciej W. Rozycki"
To: Thomas Bogendoerfer, Gregory CLEMENT, Thomas Huth,
 Philippe Mathieu-Daudé, Keguang Zhang, Jiaxun Yang
cc: Waldemar Brodkorb, linux-mips@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] MIPS: mm: Rewrite TLB uniquification for the hidden bit feature
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Before the introduction of the EHINV feature, which lets software mark
TLB entries invalid, certain older implementations of the MIPS ISA were
equipped with an analogous bit, as a vendor extension, which however is
hidden from software and only ever set at reset; any software write then
clears it, making the TLB entry concerned valid.

This feature makes it unsafe to read a TLB entry with TLBR, modify the
page mask, and write the entry back with TLBWI, because this operation
implicitly clears the hidden bit and may therefore create a duplicate
entry: with the hidden bit present there is no guarantee that all the
entries across the TLB are unique.

Usually the firmware has already uniquified TLB entries before handing
control over, in which case we only need to guarantee at bootstrap that
no clash will happen with the VPN2 values chosen in
local_flush_tlb_all().  However with systems such as the Mikrotik RB532
we get handed the TLB as at reset, with the hidden bit set across the
entries and possibly duplicate entries present.  This then causes a
machine check exception when page sizes are reset in r4k_tlb_uniquify()
and prevents the system from booting.

Therefore rewrite the algorithm used in r4k_tlb_uniquify() so as to
avoid reusing ASID/VPN values across the TLB.
Get rid of global entries first, as they may be blocking the entire
address space: e.g. sixteen 256MiB pages will exhaust the whole address
space of a 32-bit CPU, and a single big page can exhaust the 32-bit
compatibility space on a 64-bit CPU.

Details of the chosen algorithm are given in the code itself.

Fixes: 9f048fa48740 ("MIPS: mm: Prevent a TLB shutdown on initial uniquification")
Signed-off-by: Maciej W. Rozycki
Cc: stable@vger.kernel.org # v6.18+
---
Hi,

 I realise this is a little large for backporting, but I think reverting
the original fix isn't a good idea, and neither is going further back,
beyond commit 35ad7e181541 ("MIPS: mm: tlb-r4k: Uniquify TLB entries on
init"), as that would be fixing one set of systems at the expense of
other systems, and then bringing back random crashes at boot due to
clashing TLB entries.

  Maciej
---
 arch/mips/mm/tlb-r4k.c | 278 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 226 insertions(+), 52 deletions(-)

linux-mips-tlb-r4k-uniquify-hidden.diff
Index: linux-macro/arch/mips/mm/tlb-r4k.c
===================================================================
--- linux-macro.orig/arch/mips/mm/tlb-r4k.c
+++ linux-macro/arch/mips/mm/tlb-r4k.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -24,6 +25,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -511,87 +513,259 @@ static int __init set_ntlb(char *str)
 __setup("ntlb=", set_ntlb);
 
 
-/* Comparison function for EntryHi VPN fields. */
-static int r4k_vpn_cmp(const void *a, const void *b)
+/* The start bit position of VPN2 and Mask in EntryHi/PageMask registers. */
+#define VPN2_SHIFT 13
+
+/* Read full EntryHi even with CONFIG_32BIT. */
+static inline unsigned long long read_c0_entryhi_native(void)
 {
-	long v = *(unsigned long *)a - *(unsigned long *)b;
-	int s = sizeof(long) > sizeof(int) ? sizeof(long) * 8 - 1: 0;
-	return s ? (v != 0) | v >> s : v;
+	return cpu_has_64bits ? read_c0_entryhi_64() : read_c0_entryhi();
 }
 
+/* Write full EntryHi even with CONFIG_32BIT. */
+static inline void write_c0_entryhi_native(unsigned long long v)
+{
+	if (cpu_has_64bits)
+		write_c0_entryhi_64(v);
+	else
+		write_c0_entryhi(v);
+}
+
+/* TLB entry state for uniquification. */
+struct tlbent {
+	unsigned long long wired:1;
+	unsigned long long global:1;
+	unsigned long long asid:10;
+	unsigned long long vpn:51;
+	unsigned long long pagesz:5;
+	unsigned long long index:14;
+};
+
 /*
- * Initialise all TLB entries with unique values that do not clash with
- * what we have been handed over and what we'll be using ourselves.
+ * Comparison function for TLB entry sorting.  Place wired entries first,
+ * then global entries, then order by the increasing VPN/ASID and the
+ * decreasing page size.  This lets us avoid clashes with wired entries
+ * easily and get entries for larger pages out of the way first.
+ *
+ * We could group bits so as to reduce the number of comparisons, but this
+ * is seldom executed and not performance-critical, so prefer legibility.
  */
-static void __ref r4k_tlb_uniquify(void)
+static int r4k_entry_cmp(const void *a, const void *b)
 {
-	int tlbsize = current_cpu_data.tlbsize;
-	bool use_slab = slab_is_available();
-	int start = num_wired_entries();
-	phys_addr_t tlb_vpn_size;
-	unsigned long *tlb_vpns;
-	unsigned long vpn_mask;
-	int cnt, ent, idx, i;
+	struct tlbent ea = *(struct tlbent *)a, eb = *(struct tlbent *)b;
 
-	vpn_mask = GENMASK(cpu_vmbits - 1, 13);
-	vpn_mask |= IS_ENABLED(CONFIG_64BIT) ? 3ULL << 62 : 1 << 31;
+	if (ea.wired > eb.wired)
+		return -1;
+	else if (ea.wired < eb.wired)
+		return 1;
+	else if (ea.global > eb.global)
+		return -1;
+	else if (ea.global < eb.global)
+		return 1;
+	else if (ea.vpn < eb.vpn)
+		return -1;
+	else if (ea.vpn > eb.vpn)
+		return 1;
+	else if (ea.asid < eb.asid)
+		return -1;
+	else if (ea.asid > eb.asid)
+		return 1;
+	else if (ea.pagesz > eb.pagesz)
+		return -1;
+	else if (ea.pagesz < eb.pagesz)
+		return 1;
+	else
+		return 0;
+}
 
-	tlb_vpn_size = tlbsize * sizeof(*tlb_vpns);
-	tlb_vpns = (use_slab ?
-		    kmalloc(tlb_vpn_size, GFP_KERNEL) :
-		    memblock_alloc_raw(tlb_vpn_size, sizeof(*tlb_vpns)));
-	if (WARN_ON(!tlb_vpns))
-		return;		/* Pray local_flush_tlb_all() is good enough. */
+/*
+ * Fetch all the TLB entries.  Mask individual VPN values retrieved with
+ * the corresponding page mask and ignoring any 1KiB extension as we'll
+ * be using 4KiB pages for uniquification.
+ */
+static void __ref r4k_tlb_uniquify_read(struct tlbent *tlb_vpns, int tlbsize)
+{
+	int start = num_wired_entries();
+	unsigned long long vpn_mask;
+	bool global;
+	int i;
 
-	htw_stop();
+	vpn_mask = GENMASK(current_cpu_data.vmbits - 1, VPN2_SHIFT);
+	vpn_mask |= cpu_has_64bits ? 3ULL << 62 : 1 << 31;
 
-	for (i = start, cnt = 0; i < tlbsize; i++, cnt++) {
-		unsigned long vpn;
+	for (i = 0; i < tlbsize; i++) {
+		unsigned long long entryhi, vpn, mask, asid;
+		unsigned int pagesz;
 
 		write_c0_index(i);
 		mtc0_tlbr_hazard();
 		tlb_read();
 		tlb_read_hazard();
-		vpn = read_c0_entryhi();
-		vpn &= vpn_mask & PAGE_MASK;
-		tlb_vpns[cnt] = vpn;
 
-		/* Prevent any large pages from overlapping regular ones. */
-		write_c0_pagemask(read_c0_pagemask() & PM_DEFAULT_MASK);
-		mtc0_tlbw_hazard();
-		tlb_write_indexed();
-		tlbw_use_hazard();
+		global = !!(read_c0_entrylo0() & ENTRYLO_G);
+		entryhi = read_c0_entryhi_native();
+		mask = read_c0_pagemask();
+
+		asid = entryhi & cpu_asid_mask(&current_cpu_data);
+		vpn = (entryhi & vpn_mask & ~mask) >> VPN2_SHIFT;
+		pagesz = ilog2((mask >> VPN2_SHIFT) + 1);
+
+		tlb_vpns[i].global = global;
+		tlb_vpns[i].asid = global ? 0 : asid;
+		tlb_vpns[i].vpn = vpn;
+		tlb_vpns[i].pagesz = pagesz;
+		tlb_vpns[i].wired = i < start;
+		tlb_vpns[i].index = i;
 	}
+}
 
-	sort(tlb_vpns, cnt, sizeof(tlb_vpns[0]), r4k_vpn_cmp, NULL);
+/*
+ * Write unique values to all but the wired TLB entries each, using
+ * the 4KiB page size.  This size might not be supported with R6, but
+ * EHINV is mandatory for R6, so we won't ever be called in that case.
+ *
+ * A sorted table is supplied with any wired entries at the beginning,
+ * followed by any global entries, and then finally regular entries.
+ * We start at the VPN and ASID values of zero and only assign user
+ * addresses, therefore guaranteeing no clash with addresses produced
+ * by UNIQUE_ENTRYHI.  We avoid any VPN values used by wired or global
+ * entries, by increasing the VPN value beyond the span of such entry.
+ *
+ * When a VPN/ASID clash is found with a regular entry we increment the
+ * ASID instead until no VPN/ASID clash has been found or the ASID space
+ * has been exhausted, in which case we increase the VPN value beyond
+ * the span of the largest clashing entry.
+ *
+ * We do not need to be concerned about FTLB or MMID configurations as
+ * those are required to implement the EHINV feature.
+ */
+static void __ref r4k_tlb_uniquify_write(struct tlbent *tlb_vpns, int tlbsize)
+{
+	unsigned long long asid, vpn, vpn_size, pagesz;
+	int widx, gidx, idx, sidx, lidx, i;
 
-	write_c0_pagemask(PM_DEFAULT_MASK);
+	vpn_size = 1ULL << (current_cpu_data.vmbits - VPN2_SHIFT);
+	pagesz = ilog2((PM_4K >> VPN2_SHIFT) + 1);
+
+	write_c0_pagemask(PM_4K);
 	write_c0_entrylo0(0);
 	write_c0_entrylo1(0);
 
-	idx = 0;
-	ent = tlbsize;
-	for (i = start; i < tlbsize; i++)
+	asid = 0;
+	vpn = 0;
+	widx = 0;
+	gidx = 0;
+	for (sidx = 0; sidx < tlbsize && tlb_vpns[sidx].wired; sidx++)
+		;
+	for (lidx = sidx; lidx < tlbsize && tlb_vpns[lidx].global; lidx++)
+		;
+	idx = gidx = sidx + 1;
+	for (i = sidx; i < tlbsize; i++) {
+		unsigned long long entryhi, vpn_pagesz = 0;
+
 		while (1) {
-			unsigned long entryhi, vpn;
+			if (WARN_ON(vpn >= vpn_size)) {
+				dump_tlb_all();
+				/* Pray local_flush_tlb_all() will cope. */
+				return;
+			}
 
-			entryhi = UNIQUE_ENTRYHI(ent);
-			vpn = entryhi & vpn_mask & PAGE_MASK;
+			/* VPN must be below the next wired entry. */
+			if (widx < sidx && vpn >= tlb_vpns[widx].vpn) {
+				vpn = max(vpn,
+					  (tlb_vpns[widx].vpn +
+					   (1ULL << tlb_vpns[widx].pagesz)));
+				asid = 0;
+				widx++;
+				continue;
+			}
+			/* VPN must be below the next global entry. */
+			if (gidx < lidx && vpn >= tlb_vpns[gidx].vpn) {
+				vpn = max(vpn,
+					  (tlb_vpns[gidx].vpn +
+					   (1ULL << tlb_vpns[gidx].pagesz)));
+				asid = 0;
+				gidx++;
+				continue;
+			}
+			/* Try to find a free ASID so as to conserve VPNs. */
+			if (idx < tlbsize && vpn == tlb_vpns[idx].vpn &&
+			    asid == tlb_vpns[idx].asid) {
+				unsigned long long idx_pagesz;
 
-			if (idx >= cnt || vpn < tlb_vpns[idx]) {
-				write_c0_entryhi(entryhi);
-				write_c0_index(i);
-				mtc0_tlbw_hazard();
-				tlb_write_indexed();
-				ent++;
-				break;
-			} else if (vpn == tlb_vpns[idx]) {
-				ent++;
-			} else {
+				idx_pagesz = tlb_vpns[idx].pagesz;
+				vpn_pagesz = max(vpn_pagesz, idx_pagesz);
+				do
+					idx++;
+				while (idx < tlbsize &&
+				       vpn == tlb_vpns[idx].vpn &&
+				       asid == tlb_vpns[idx].asid);
+				asid++;
+				if (asid > cpu_asid_mask(&current_cpu_data)) {
+					vpn += vpn_pagesz;
+					asid = 0;
+					vpn_pagesz = 0;
+				}
+				continue;
+			}
+			/* VPN mustn't be above the next regular entry. */
+			if (idx < tlbsize && vpn > tlb_vpns[idx].vpn) {
+				vpn = max(vpn,
+					  (tlb_vpns[idx].vpn +
+					   (1ULL << tlb_vpns[idx].pagesz)));
+				asid = 0;
 				idx++;
+				continue;
 			}
+			break;
 		}
 
+		entryhi = (vpn << VPN2_SHIFT) | asid;
+		write_c0_entryhi_native(entryhi);
+		write_c0_index(tlb_vpns[i].index);
+		mtc0_tlbw_hazard();
+		tlb_write_indexed();
+
+		tlb_vpns[i].asid = asid;
+		tlb_vpns[i].vpn = vpn;
+		tlb_vpns[i].pagesz = pagesz;
+
+		asid++;
+		if (asid > cpu_asid_mask(&current_cpu_data)) {
+			vpn += 1ULL << pagesz;
+			asid = 0;
+		}
+	}
+}
+
+/*
+ * Initialise all TLB entries with unique values that do not clash with
+ * what we have been handed over and what we'll be using ourselves.
+ */
+static void __ref r4k_tlb_uniquify(void)
+{
+	int tlbsize = current_cpu_data.tlbsize;
+	bool use_slab = slab_is_available();
+	phys_addr_t tlb_vpn_size;
+	struct tlbent *tlb_vpns;
+
+	tlb_vpn_size = tlbsize * sizeof(*tlb_vpns);
+	tlb_vpns = (use_slab ?
+		    kmalloc(tlb_vpn_size, GFP_KERNEL) :
+		    memblock_alloc_raw(tlb_vpn_size, sizeof(*tlb_vpns)));
+	if (WARN_ON(!tlb_vpns))
+		return;		/* Pray local_flush_tlb_all() is good enough. */
+
+	htw_stop();
+
+	r4k_tlb_uniquify_read(tlb_vpns, tlbsize);
+
+	sort(tlb_vpns, tlbsize, sizeof(*tlb_vpns), r4k_entry_cmp, NULL);
+
+	r4k_tlb_uniquify_write(tlb_vpns, tlbsize);
+
+	write_c0_pagemask(PM_DEFAULT_MASK);
+
+	tlbw_use_hazard();
 	htw_start();
 	flush_micro_tlb();