From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, alex.bennee@linaro.org
Subject: [PATCH v4 10/18] target/arm: Use SVEContLdSt in sve_ld1_r
Date: Thu, 30 Apr 2020 09:28:05 -0700
Message-Id: <20200430162813.17671-11-richard.henderson@linaro.org>
In-Reply-To: <20200430162813.17671-1-richard.henderson@linaro.org>
References: <20200430162813.17671-1-richard.henderson@linaro.org>

First use of the new helper functions, so we can remove the unused
markup.  No longer need a scratch for user-only, as we completely probe
the page set before reading; system mode still requires a scratch for
MMIO.
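As an aside for review: every loop below shares one predicate-walk
idiom, testing one predicate bit per vector byte and reloading the mask
word once per 64 bits.  The following is a minimal standalone sketch of
that idiom (an illustration only, not part of the patch; reg_max, esz
and the other names are local stand-ins rather than the QEMU
definitions, and a single flat loop stands in for the patch's nested
per-64-bit-word walk):

    /* Sketch of the predicate-walk idiom, outside QEMU. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        enum { reg_max = 16, esz = 2 };   /* 16-byte vector, 4-byte elements */
        /* One predicate bit per byte; an element is active if the bit
           for its first byte is set.  Activate elements 0 and 2. */
        uint64_t vg[1] = { (1ull << 0) | (1ull << 8) };
        uint8_t vd[reg_max], mem[reg_max];
        intptr_t reg_off, mem_off;

        for (int i = 0; i < reg_max; i++) {
            mem[i] = i + 1;               /* arbitrary source bytes */
        }
        memset(vd, 0, reg_max);           /* inactive elements read as zero */

        /* Same shape as the in-RAM fast path, with msz == esz. */
        for (reg_off = mem_off = 0; reg_off < reg_max;
             reg_off += 1 << esz, mem_off += 1 << esz) {
            if ((vg[reg_off >> 6] >> (reg_off & 63)) & 1) {
                memcpy(vd + reg_off, mem + mem_off, 1 << esz);
            }
        }

        for (int i = 0; i < reg_max; i++) {
            printf("%02x ", vd[i]);
        }
        printf("\n");
        return 0;
    }

This prints 01 02 03 04 for element 0, zeros for the inactive
element 1, then 09 0a 0b 0c for element 2, zeros thereafter.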
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/sve_helper.c | 188 +++++++++++++++++++++-------------------
 1 file changed, 97 insertions(+), 91 deletions(-)

diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index d007137735..6bae342a17 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -4221,9 +4221,9 @@ typedef struct {
  * final element on each page.  Identify any single element that spans
  * the page boundary.  Return true if there are any active elements.
  */
-static bool __attribute__((unused))
-sve_cont_ldst_elements(SVEContLdSt *info, target_ulong addr, uint64_t *vg,
-                       intptr_t reg_max, int esz, int msize)
+static bool sve_cont_ldst_elements(SVEContLdSt *info, target_ulong addr,
+                                   uint64_t *vg, intptr_t reg_max,
+                                   int esz, int msize)
 {
     const int esize = 1 << esz;
     const uint64_t pg_mask = pred_esz_masks[esz];
@@ -4313,10 +4313,9 @@ sve_cont_ldst_elements(SVEContLdSt *info, target_ulong addr, uint64_t *vg,
  * Control the generation of page faults with @fault.  Return false if
  * there is no work to do, which can only happen with @fault == FAULT_NO.
  */
-static bool __attribute__((unused))
-sve_cont_ldst_pages(SVEContLdSt *info, SVEContFault fault, CPUARMState *env,
-                    target_ulong addr, MMUAccessType access_type,
-                    uintptr_t retaddr)
+static bool sve_cont_ldst_pages(SVEContLdSt *info, SVEContFault fault,
+                                CPUARMState *env, target_ulong addr,
+                                MMUAccessType access_type, uintptr_t retaddr)
 {
     int mmu_idx = cpu_mmu_index(env, false);
     int mem_off = info->mem_off_first[0];
@@ -4388,109 +4387,116 @@ static inline bool test_host_page(void *host)
 /*
  * Common helper for all contiguous one-register predicated loads.
  */
-static void sve_ld1_r(CPUARMState *env, void *vg, const target_ulong addr,
-                      uint32_t desc, const uintptr_t retaddr,
-                      const int esz, const int msz,
-                      sve_ldst1_host_fn *host_fn,
-                      sve_ldst1_tlb_fn *tlb_fn)
+static inline QEMU_ALWAYS_INLINE
+void sve_ld1_r(CPUARMState *env, uint64_t *vg, const target_ulong addr,
+               uint32_t desc, const uintptr_t retaddr,
+               const int esz, const int msz,
+               sve_ldst1_host_fn *host_fn,
+               sve_ldst1_tlb_fn *tlb_fn)
 {
-    const TCGMemOpIdx oi = extract32(desc, SIMD_DATA_SHIFT, MEMOPIDX_SHIFT);
-    const int mmu_idx = get_mmuidx(oi);
     const unsigned rd = extract32(desc, SIMD_DATA_SHIFT + MEMOPIDX_SHIFT, 5);
     void *vd = &env->vfp.zregs[rd];
-    const int diffsz = esz - msz;
     const intptr_t reg_max = simd_oprsz(desc);
-    const intptr_t mem_max = reg_max >> diffsz;
-    ARMVectorReg scratch;
+    intptr_t reg_off, reg_last, mem_off;
+    SVEContLdSt info;
     void *host;
-    intptr_t split, reg_off, mem_off;
+    int flags;
 
-    /* Find the first active element. */
-    reg_off = find_next_active(vg, 0, reg_max, esz);
-    if (unlikely(reg_off == reg_max)) {
+    /* Find the active elements.  */
+    if (!sve_cont_ldst_elements(&info, addr, vg, reg_max, esz, 1 << msz)) {
         /* The entire predicate was false; no load occurs.  */
         memset(vd, 0, reg_max);
         return;
     }
-    mem_off = reg_off >> diffsz;
 
-    /*
-     * If the (remaining) load is entirely within a single page, then:
-     * For softmmu, and the tlb hits, then no faults will occur;
-     * For user-only, either the first load will fault or none will.
-     * We can thus perform the load directly to the destination and
-     * Vd will be unmodified on any exception path.
-     */
-    split = max_for_page(addr, mem_off, mem_max);
-    if (likely(split == mem_max)) {
-        host = tlb_vaddr_to_host(env, addr + mem_off, MMU_DATA_LOAD, mmu_idx);
-        if (test_host_page(host)) {
-            intptr_t i = reg_off;
-            host -= mem_off;
-            do {
-                host_fn(vd, i, host + (i >> diffsz));
-                i = find_next_active(vg, i + (1 << esz), reg_max, esz);
-            } while (i < reg_max);
-            /* After having taken any fault, zero leading inactive elements. */
-            swap_memzero(vd, reg_off);
-            return;
-        }
-    }
+    /* Probe the page(s).  Exit with exception for any invalid page. */
+    sve_cont_ldst_pages(&info, FAULT_ALL, env, addr, MMU_DATA_LOAD, retaddr);
 
-    /*
-     * Perform the predicated read into a temporary, thus ensuring
-     * if the load of the last element faults, Vd is not modified.
-     */
+    flags = info.page[0].flags | info.page[1].flags;
+    if (unlikely(flags != 0)) {
 #ifdef CONFIG_USER_ONLY
-    swap_memzero(&scratch, reg_off);
-    host = g2h(addr);
-    do {
-        host_fn(&scratch, reg_off, host + (reg_off >> diffsz));
-        reg_off += 1 << esz;
-        reg_off = find_next_active(vg, reg_off, reg_max, esz);
-    } while (reg_off < reg_max);
+        g_assert_not_reached();
 #else
-    memset(&scratch, 0, reg_max);
-    goto start;
-    while (1) {
-        reg_off = find_next_active(vg, reg_off, reg_max, esz);
-        if (reg_off >= reg_max) {
-            break;
-        }
-        mem_off = reg_off >> diffsz;
-        split = max_for_page(addr, mem_off, mem_max);
+        /*
+         * At least one page includes MMIO (or watchpoints).
+         * Any bus operation can fail with cpu_transaction_failed,
+         * which for ARM will raise SyncExternal.  Perform the load
+         * into scratch memory to preserve register state until the end.
+         */
+        ARMVectorReg scratch;
 
-    start:
-        if (split - mem_off >= (1 << msz)) {
-            /* At least one whole element on this page.  */
-            host = tlb_vaddr_to_host(env, addr + mem_off,
-                                     MMU_DATA_LOAD, mmu_idx);
-            if (host) {
-                host -= mem_off;
-                do {
-                    host_fn(&scratch, reg_off, host + mem_off);
-                    reg_off += 1 << esz;
-                    reg_off = find_next_active(vg, reg_off, reg_max, esz);
-                    mem_off = reg_off >> diffsz;
-                } while (split - mem_off >= (1 << msz));
-                continue;
+        memset(&scratch, 0, reg_max);
+        mem_off = info.mem_off_first[0];
+        reg_off = info.reg_off_first[0];
+        reg_last = info.reg_off_last[1];
+        if (reg_last < 0) {
+            reg_last = info.reg_off_split;
+            if (reg_last < 0) {
+                reg_last = info.reg_off_last[0];
             }
         }
 
-        /*
-         * Perform one normal read.  This may fault, longjmping out to the
-         * main loop in order to raise an exception.  It may succeed, and
-         * as a side-effect load the TLB entry for the next round.  Finally,
-         * in the extremely unlikely case we're performing this operation
-         * on I/O memory, it may succeed but not bring in the TLB entry.
-         * But even then we have still made forward progress.
-         */
-        tlb_fn(env, &scratch, reg_off, addr + mem_off, retaddr);
-        reg_off += 1 << esz;
-    }
-#endif
+        do {
+            uint64_t pg = vg[reg_off >> 6];
+            do {
+                if ((pg >> (reg_off & 63)) & 1) {
+                    tlb_fn(env, &scratch, reg_off, addr + mem_off, retaddr);
+                }
+                reg_off += 1 << esz;
+                mem_off += 1 << msz;
+            } while (reg_off & 63);
+        } while (reg_off <= reg_last);
 
-    memcpy(vd, &scratch, reg_max);
+        memcpy(vd, &scratch, reg_max);
+        return;
+#endif
+    }
+
+    /* The entire operation is in RAM, on valid pages. */
+
+    memset(vd, 0, reg_max);
+    mem_off = info.mem_off_first[0];
+    reg_off = info.reg_off_first[0];
+    reg_last = info.reg_off_last[0];
+    host = info.page[0].host;
+
+    while (reg_off <= reg_last) {
+        uint64_t pg = vg[reg_off >> 6];
+        do {
+            if ((pg >> (reg_off & 63)) & 1) {
+                host_fn(vd, reg_off, host + mem_off);
+            }
+            reg_off += 1 << esz;
+            mem_off += 1 << msz;
+        } while (reg_off <= reg_last && (reg_off & 63));
+    }
+
+    /*
+     * Use the slow path to manage the cross-page misalignment.
+     * But we know this is RAM and cannot trap.
+     */
+    mem_off = info.mem_off_split;
+    if (unlikely(mem_off >= 0)) {
+        tlb_fn(env, vd, info.reg_off_split, addr + mem_off, retaddr);
+    }
+
+    mem_off = info.mem_off_first[1];
+    if (unlikely(mem_off >= 0)) {
+        reg_off = info.reg_off_first[1];
+        reg_last = info.reg_off_last[1];
+        host = info.page[1].host;
+
+        do {
+            uint64_t pg = vg[reg_off >> 6];
+            do {
+                if ((pg >> (reg_off & 63)) & 1) {
+                    host_fn(vd, reg_off, host + mem_off);
+                }
+                reg_off += 1 << esz;
+                mem_off += 1 << msz;
+            } while (reg_off & 63);
+        } while (reg_off <= reg_last);
    }
 }
 
 #define DO_LD1_1(NAME, ESZ) \
-- 
2.20.1