From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Date: Mon, 16 Sep 2019 15:57:53 +0200
Message-Id: <20190916135806.1269-17-david@redhat.com>
In-Reply-To: <20190916135806.1269-1-david@redhat.com>
References: <20190916135806.1269-1-david@redhat.com>
Subject: [Qemu-devel] [PATCH v3 16/29] s390x/tcg: Fault-safe memset
Cc: Florian Weimer, Thomas Huth, David Hildenbrand, Dan Horák,
    Cornelia Huck, Stefano Brivio, qemu-s390x@nongnu.org, Cole Robinson,
    Richard Henderson

Replace fast_memset() with access_memset(), which first tries to probe
access to all affected pages (at most two). We'll use the same mechanism
for other types of accesses soon.

Only in very rare cases (especially TLB_NOTDIRTY) will we have to fall
back to ld/st helpers. Try to speed up that case as suggested by Richard.

We'll soon rework most of the involved handlers to do all accesses via the
new fault-safe helpers, especially MVC.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson
---
 target/s390x/mem_helper.c | 123 +++++++++++++++++++++++++++++++-------
 1 file changed, 103 insertions(+), 20 deletions(-)

diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
index a24506676b..dd5da70746 100644
--- a/target/s390x/mem_helper.c
+++ b/target/s390x/mem_helper.c
@@ -117,27 +117,95 @@ static inline void cpu_stsize_data_ra(CPUS390XState *env, uint64_t addr,
     }
 }
 
-static void fast_memset(CPUS390XState *env, uint64_t dest, uint8_t byte,
-                        uint32_t l, uintptr_t ra)
+/* An access covers at most 4096 bytes and therefore at most two pages. */
+typedef struct S390Access {
+    target_ulong vaddr1;
+    target_ulong vaddr2;
+    char *haddr1;
+    char *haddr2;
+    uint16_t size1;
+    uint16_t size2;
+    /*
+     * If we can't access the host page directly, we'll have to do I/O access
+     * via ld/st helpers. These are internal details, so we store the
+     * mmu idx to do the access here instead of passing it around in the
+     * helpers. Maybe, one day we can get rid of ld/st access - once we can
+     * handle TLB_NOTDIRTY differently. We don't expect these special accesses
+     * to trigger exceptions - only if we would have TLB_NOTDIRTY on LAP
+     * pages, we might trigger a new MMU translation - very unlikely that
+     * the mapping changes in between and we would trigger a fault.
+     */
+    int mmu_idx;
+} S390Access;
+
+static S390Access access_prepare(CPUS390XState *env, vaddr vaddr, int size,
+                                 MMUAccessType access_type, int mmu_idx,
+                                 uintptr_t ra)
 {
-    int mmu_idx = cpu_mmu_index(env, false);
+    S390Access access = {
+        .vaddr1 = vaddr,
+        .size1 = MIN(size, -(vaddr | TARGET_PAGE_MASK)),
+        .mmu_idx = mmu_idx,
+    };
 
-    while (l > 0) {
-        void *p = tlb_vaddr_to_host(env, dest, MMU_DATA_STORE, mmu_idx);
-        if (p) {
-            /* Access to the whole page in write mode granted.  */
-            uint32_t l_adj = adj_len_to_page(l, dest);
-            memset(p, byte, l_adj);
-            dest += l_adj;
-            l -= l_adj;
+    g_assert(size > 0 && size <= 4096);
+    access.haddr1 = probe_access(env, access.vaddr1, access.size1, access_type,
+                                 mmu_idx, ra);
+
+    if (unlikely(access.size1 != size)) {
+        /* The access crosses page boundaries. */
+        access.vaddr2 = wrap_address(env, vaddr + access.size1);
+        access.size2 = size - access.size1;
+        access.haddr2 = probe_access(env, access.vaddr2, access.size2,
+                                     access_type, mmu_idx, ra);
+    }
+    return access;
+}
+
+/* Helper to handle memset on a single page. */
+static void do_access_memset(CPUS390XState *env, vaddr vaddr, char *haddr,
+                             uint8_t byte, uint16_t size, int mmu_idx,
+                             uintptr_t ra)
+{
+#ifdef CONFIG_USER_ONLY
+    g_assert(haddr);
+    memset(haddr, byte, size);
+#else
+    TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);
+    int i;
+
+    if (likely(haddr)) {
+        memset(haddr, byte, size);
+    } else {
+        /*
+         * Do a single access and test if we can then get access to the
+         * page. This is especially relevant to speed up TLB_NOTDIRTY.
+         */
+        g_assert(size > 0);
+        helper_ret_stb_mmu(env, vaddr, byte, oi, ra);
+        haddr = tlb_vaddr_to_host(env, vaddr, MMU_DATA_STORE, mmu_idx);
+        if (likely(haddr)) {
+            memset(haddr + 1, byte, size - 1);
         } else {
-            /* We failed to get access to the whole page. The next write
-               access will likely fill the QEMU TLB for the next iteration.  */
-            cpu_stb_data_ra(env, dest, byte, ra);
-            dest++;
-            l--;
+            for (i = 1; i < size; i++) {
+                helper_ret_stb_mmu(env, vaddr + i, byte, oi, ra);
+            }
         }
     }
+#endif
+}
+
+static void access_memset(CPUS390XState *env, S390Access *desta,
+                          uint8_t byte, uintptr_t ra)
+{
+
+    do_access_memset(env, desta->vaddr1, desta->haddr1, byte, desta->size1,
+                     desta->mmu_idx, ra);
+    if (likely(!desta->size2)) {
+        return;
+    }
+    do_access_memset(env, desta->vaddr2, desta->haddr2, byte, desta->size2,
+                     desta->mmu_idx, ra);
 }
 
 #ifndef CONFIG_USER_ONLY
@@ -259,15 +327,19 @@ uint32_t HELPER(nc)(CPUS390XState *env, uint32_t l, uint64_t dest,
 static uint32_t do_helper_xc(CPUS390XState *env, uint32_t l, uint64_t dest,
                              uint64_t src, uintptr_t ra)
 {
+    const int mmu_idx = cpu_mmu_index(env, false);
+    S390Access desta;
     uint32_t i;
     uint8_t c = 0;
 
     HELPER_LOG("%s l %d dest %" PRIx64 " src %" PRIx64 "\n",
                __func__, l, dest, src);
 
+    desta = access_prepare(env, dest, l + 1, MMU_DATA_STORE, mmu_idx, ra);
+
     /* xor with itself is the same as memset(0) */
     if (src == dest) {
-        fast_memset(env, dest, 0, l + 1, ra);
+        access_memset(env, &desta, 0, ra);
         return 0;
     }
 
@@ -315,6 +387,8 @@ uint32_t HELPER(oc)(CPUS390XState *env, uint32_t l, uint64_t dest,
 static uint32_t do_helper_mvc(CPUS390XState *env, uint32_t l, uint64_t dest,
                               uint64_t src, uintptr_t ra)
 {
+    const int mmu_idx = cpu_mmu_index(env, false);
+    S390Access desta;
     uint32_t i;
 
     HELPER_LOG("%s l %d dest %" PRIx64 " src %" PRIx64 "\n",
@@ -323,13 +397,15 @@ static uint32_t do_helper_mvc(CPUS390XState *env, uint32_t l, uint64_t dest,
     /* MVC always copies one more byte than specified - maximum is 256 */
     l++;
 
+    desta = access_prepare(env, dest, l, MMU_DATA_STORE, mmu_idx, ra);
+
     /*
      * "When the operands overlap, the result is obtained as if the operands
      * were processed one byte at a time". Only non-destructive overlaps
     * behave like memmove().
      */
     if (dest == src + 1) {
-        fast_memset(env, dest, cpu_ldub_data_ra(env, src, ra), l, ra);
+        access_memset(env, &desta, cpu_ldub_data_ra(env, src, ra), ra);
     } else if (!is_destructive_overlap(env, dest, src, l)) {
         fast_memmove(env, dest, src, l, ra);
     } else {
@@ -775,7 +851,9 @@ static inline uint32_t do_mvcl(CPUS390XState *env,
                                uint64_t *src, uint64_t *srclen,
                                uint16_t pad, int wordsize, uintptr_t ra)
 {
+    const int mmu_idx = cpu_mmu_index(env, false);
     int len = MIN(*destlen, -(*dest | TARGET_PAGE_MASK));
+    S390Access desta;
     int i, cc;
 
     if (*destlen == *srclen) {
@@ -805,7 +883,8 @@ static inline uint32_t do_mvcl(CPUS390XState *env,
     } else if (wordsize == 1) {
         /* Pad the remaining area */
         *destlen -= len;
-        fast_memset(env, *dest, pad, len, ra);
+        desta = access_prepare(env, *dest, len, MMU_DATA_STORE, mmu_idx, ra);
+        access_memset(env, &desta, pad, ra);
         *dest = wrap_address(env, *dest + len);
     } else {
         /* The remaining length selects the padding byte. */
@@ -825,6 +904,7 @@ static inline uint32_t do_mvcl(CPUS390XState *env,
 /* move long */
 uint32_t HELPER(mvcl)(CPUS390XState *env, uint32_t r1, uint32_t r2)
 {
+    const int mmu_idx = cpu_mmu_index(env, false);
     uintptr_t ra = GETPC();
     uint64_t destlen = env->regs[r1 + 1] & 0xffffff;
     uint64_t dest = get_address(env, r1);
@@ -832,6 +912,7 @@ uint32_t HELPER(mvcl)(CPUS390XState *env, uint32_t r1, uint32_t r2)
     uint64_t src = get_address(env, r2);
     uint8_t pad = env->regs[r2 + 1] >> 24;
     uint32_t cc, cur_len;
+    S390Access desta;
 
     if (is_destructive_overlap(env, dest, src, MIN(srclen, destlen))) {
         cc = 3;
@@ -859,7 +940,9 @@ uint32_t HELPER(mvcl)(CPUS390XState *env, uint32_t r1, uint32_t r2)
     while (destlen) {
         cur_len = MIN(destlen, -(dest | TARGET_PAGE_MASK));
         if (!srclen) {
-            fast_memset(env, dest, pad, cur_len, ra);
+            desta = access_prepare(env, dest, cur_len, MMU_DATA_STORE, mmu_idx,
+                                   ra);
+            access_memset(env, &desta, pad, ra);
         } else {
             cur_len = MIN(MIN(srclen, -(src | TARGET_PAGE_MASK)), cur_len);
 
-- 
2.21.0
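
The "at most two" pages claim in the commit message comes from the size1/size2
split computed at the top of access_prepare(). A minimal standalone sketch of
that split follows; the TARGET_PAGE_SIZE/TARGET_PAGE_MASK definitions and the
main() harness are assumptions for illustration only, not QEMU code.

/*
 * Sketch of the access_prepare() page split, assuming a 4096-byte
 * TARGET_PAGE_SIZE. Not part of the patch.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_SIZE 4096ULL
#define TARGET_PAGE_MASK (~(TARGET_PAGE_SIZE - 1))
#define MIN(a, b) ((a) < (b) ? (a) : (b))

int main(void)
{
    uint64_t vaddr = 0x1ff8;  /* 8 bytes before the next page boundary */
    uint64_t size = 32;       /* the access spills onto a second page */

    /* Bytes on the first page: distance from vaddr to the page boundary. */
    uint64_t size1 = MIN(size, -(vaddr | TARGET_PAGE_MASK));
    /* Whatever remains lands on the second page (0 if no crossing). */
    uint64_t size2 = size - size1;

    printf("size1 = %" PRIu64 ", size2 = %" PRIu64 "\n", size1, size2);
    /* Prints: size1 = 8, size2 = 24 */
    return 0;
}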