From: Joseph Myers
Date: Tue, 8 Aug 2017 20:21:01 +0000
Subject: [Qemu-devel] [PATCH] target/i386: fix pmovsx/pmovzx in-place operations

The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
or 4-byte inputs and sign-extend or zero-extend them to a wider vector
output.  The associated helpers for these instructions do the
extension on each element in turn, starting with the lowest.  If the
input and output are the same register, this means that all the input
elements after the first have been overwritten before they are read.
This patch makes the helpers extend starting with the highest element,
not the lowest, to avoid such overwriting.  This fixes many GCC test
failures (161 in the gcc testsuite in my GCC 6-based testing) when
testing with a default CPU setting enabling those instructions.

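The overwriting described above can be reproduced outside QEMU.  What
follows is only a minimal sketch under stated assumptions: the Xmm
union and the pmovsxbw_ascending, pmovsxbw_descending and in_place_ok
functions are hypothetical names made up for illustration, not QEMU's
Reg type, element accessors or helpers.  It models one byte-to-word
sign extension and checks what happens when destination and source are
the same register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit register; not QEMU's Reg type. */
typedef union {
    int8_t  b[16];
    int16_t w[8];
} Xmm;

/* Lowest element first: writing word i covers bytes 2i and 2i+1, so
 * byte 1 is already clobbered when it is read for word 1 if d == s. */
static void pmovsxbw_ascending(Xmm *d, const Xmm *s)
{
    for (int i = 0; i < 8; i++) {
        d->w[i] = s->b[i];
    }
}

/* Highest element first: word i only covers bytes at index >= i, so
 * every lower input byte is still intact when it is read. */
static void pmovsxbw_descending(Xmm *d, const Xmm *s)
{
    for (int i = 7; i >= 0; i--) {
        d->w[i] = s->b[i];
    }
}

/* Returns 1 if calling fn with destination == source produces the
 * correctly sign-extended words, 0 otherwise. */
static int in_place_ok(void (*fn)(Xmm *, const Xmm *))
{
    static const int8_t input[8] = { -1, 2, -3, 4, -5, 6, -7, 8 };
    Xmm r;

    assert(fn != NULL);
    memset(&r, 0, sizeof(r));
    memcpy(r.b, input, sizeof(input));
    fn(&r, &r);                      /* same register as input and output */
    for (int i = 0; i < 8; i++) {
        if (r.w[i] != input[i]) {
            return 0;
        }
    }
    return 1;
}
```

With this input, in_place_ok(pmovsxbw_ascending) returns 0 and
in_place_ok(pmovsxbw_descending) returns 1.  The wider the output
element relative to the input, the more input bytes each write covers,
which is why the order has to be reversed for every element and not
only the first pair.
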
Signed-off-by: Joseph Myers
---

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 16509d0..d578216 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1617,18 +1617,18 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #define SSE_HELPER_F(name, elem, num, F)                        \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
     {                                                           \
-        d->elem(0) = F(0);                                      \
-        d->elem(1) = F(1);                                      \
         if (num > 2) {                                          \
-            d->elem(2) = F(2);                                  \
-            d->elem(3) = F(3);                                  \
             if (num > 4) {                                      \
-                d->elem(4) = F(4);                              \
-                d->elem(5) = F(5);                              \
-                d->elem(6) = F(6);                              \
                 d->elem(7) = F(7);                              \
+                d->elem(6) = F(6);                              \
+                d->elem(5) = F(5);                              \
+                d->elem(4) = F(4);                              \
             }                                                   \
+            d->elem(3) = F(3);                                  \
+            d->elem(2) = F(2);                                  \
         }                                                       \
+        d->elem(1) = F(1);                                      \
+        d->elem(0) = F(0);                                      \
     }
 
 SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B)
-- 
Joseph S. Myers
joseph@codesourcery.com