From nobody Fri Oct 31 04:01:22 2025
Date: Thu, 5 Jun 2025 12:24:34 +0200
Subject: [PATCH v5 1/6] x86: suppress ERMS for internal use when MISC_ENABLE.FAST_STRING is clear
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>
In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>

Before we start actually adjusting behavior when ERMS is available,
follow Linux commit 161ec53c702c ("x86, mem, intel: Initialize Enhanced
REP MOVSB/STOSB") and zap the CPUID-derived feature flag when the MSR
bit is clear.

Don't extend the artificial clearing to guest view, though: Guests can
take their own decision in this regard, as they can read (most of)
MISC_ENABLE.

Signed-off-by: Jan Beulich
---
TBD: Would be nice if "cpuid=no-erms" propagated to guest view (for
     "cpuid=" generally meaning to affect guests as well as Xen), but
     since both disabling paths use setup_clear_cpu_cap() they're
     indistinguishable in guest_common_feature_adjustments(). A separate
     boolean could take care of this, but would look clumsy to me.
---
v5: Correct guest_common_max_feature_adjustments() addition.
v4: Also adjust guest_common_max_feature_adjustments().
v3: New.
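
For illustration only, the intended split between what Xen uses internally and what guests get to see by default can be modelled in plain C. This is a minimal sketch under stated assumptions: `struct erms_view`, `compute_erms_view()` and their parameters are hypothetical names made up for this note, not Xen interfaces. Only the FAST_STRING bit position (bit 0 of IA32_MISC_ENABLE) is taken from the patch. The sketch models the default guest policy only and ignores the "cpuid=no-erms" case mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of IA32_MISC_ENABLE, matching the msr-index.h addition. */
#define MISC_ENABLE_FAST_STRING (UINT64_C(1) << 0)

/* Hypothetical container for the two views; not a Xen structure. */
struct erms_view {
    bool host;   /* what the hypervisor itself would act upon */
    bool guest;  /* what guests would see by default */
};

/*
 * hw_erms:     ERMS as enumerated by raw CPUID.
 * misc_enable: value read from IA32_MISC_ENABLE.
 */
static struct erms_view compute_erms_view(bool hw_erms, uint64_t misc_enable)
{
    struct erms_view v;

    /* Internal use: disregard ERMS when fast strings are disabled. */
    v.host = hw_erms && (misc_enable & MISC_ENABLE_FAST_STRING);

    /* Guest view: follow hardware; guests can read MISC_ENABLE themselves. */
    v.guest = hw_erms;

    return v;
}
```

With ERMS enumerated but FAST_STRING clear, such a model would report the feature as unavailable internally while still exposing it to guests.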

--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -366,8 +366,18 @@ static void cf_check early_init_intel(st
 		paddr_bits = 36;
 
 	if (c == &boot_cpu_data) {
+		uint64_t misc_enable;
+
 		check_memory_type_self_snoop_errata();
 
+		/*
+		 * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
+		 * clear the enhanced fast string CPU capability.
+		 */
+		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
+		if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING))
+			setup_clear_cpu_cap(X86_FEATURE_ERMS);
+
 		intel_init_levelling();
 	}
 
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -487,6 +487,12 @@ static void __init guest_common_max_feat
      */
     if ( test_bit(X86_FEATURE_RTM, fs) )
         __set_bit(X86_FEATURE_RTM_ALWAYS_ABORT, fs);
+
+    /*
+     * We expose MISC_ENABLE to guests, so our internal clearing of ERMS when
+     * FAST_STRING is not set should not affect the view of migrating-in guests.
+     */
+    __set_bit(X86_FEATURE_ERMS, fs);
 }
 
 static void __init guest_common_default_feature_adjustments(uint32_t *fs)
@@ -567,6 +573,16 @@ static void __init guest_common_default_
         __clear_bit(X86_FEATURE_RTM, fs);
         __set_bit(X86_FEATURE_RTM_ALWAYS_ABORT, fs);
     }
+
+    /*
+     * We expose MISC_ENABLE to guests, so our internal clearing of ERMS when
+     * FAST_STRING is not set should not propagate to guest view.  Guests can
+     * judge on their own whether to ignore the CPUID bit when the MSR bit is
+     * clear.  The bit being uniformly set in the max policies, we only need
+     * to clear it here (if hardware doesn't have it).
+     */
+    if ( !raw_cpu_policy.feat.erms )
+        __clear_bit(X86_FEATURE_ERMS, fs);
 }
 
 static void __init guest_common_feature_adjustments(uint32_t *fs)
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -493,6 +493,7 @@
 #define MSR_IA32_THERM_INTERRUPT          0x0000019b
 #define MSR_IA32_THERM_STATUS             0x0000019c
 #define MSR_IA32_MISC_ENABLE              0x000001a0
+#define MSR_IA32_MISC_ENABLE_FAST_STRING  (1<<0)
 #define MSR_IA32_MISC_ENABLE_PERF_AVAIL   (1<<7)
 #define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL  (1<<11)
 #define MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL (1<<12)

From nobody Fri Oct 31 04:01:22 2025
Message-ID: <4592a702-acf3-4229-9069-d5b639151657@suse.com>
Date: Thu, 5 Jun 2025 12:25:17 +0200
Subject: [PATCH v5 2/6] x86: re-work memset()
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>
In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>

Move the function to its own assembly file. Having it in C just for the
entire body to be an asm() isn't really helpful. Then have two flavors:
A "basic" version using qword steps for the bulk of the operation, and an
ERMS version for modern hardware, to be substituted in via alternatives
patching.

For RET to be usable in an alternative's replacement code, extend the
CALL/JMP patching to cover the case of "JMP __x86_return_thunk" coming
last in replacement code.

Signed-off-by: Jan Beulich
---
We may want to consider branching over the REP STOSQ as well, if the
number of qwords turns out to be zero.
We may also want to consider using non-REP STOS{L,W,B} for the tail.
---
v5: Re-base.
v4: Use %r8 instead of %rsi in a few places.
v3: Re-base.
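
As a reading aid, the "basic" flavor described above can be approximated in portable C. This is a sketch only: `memset_model()` is a made-up name and is not part of the patch, which implements the routine in assembly. The sketch assumes the steps are as follows: replicate the fill byte across a qword by multiplying with 0x0101010101010101, store the bulk in 8-byte steps, then finish with at most seven single-byte stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative C model of the "basic" flavor. memset_model() is a
 * hypothetical name; the real routine is the assembly in memset.S.
 */
static void *memset_model(void *s, int c, size_t n)
{
    unsigned char *p = s;
    /* Replicate the fill byte into all eight bytes (the IMUL step). */
    uint64_t pattern = (uint64_t)(unsigned char)c *
                       UINT64_C(0x0101010101010101);
    size_t qwords = n >> 3;   /* SHR $3: bulk, in 8-byte steps */
    size_t tail = n & 7;      /* AND $7: leftover bytes */

    while ( qwords-- )        /* stands in for REP STOSQ */
    {
        memcpy(p, &pattern, sizeof(pattern));
        p += sizeof(pattern);
    }

    while ( tail-- )          /* stands in for REP STOSB; no-op if tail is 0 */
        *p++ = (unsigned char)c;

    return s;                 /* the original destination, as saved in %r8 */
}
```

The ERMS flavor has no counterpart in this model, as it amounts to a single REP STOSB over the whole length.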
--- a/xen/arch/x86/Makefile +++ b/xen/arch/x86/Makefile @@ -47,6 +47,7 @@ obj-$(CONFIG_RETURN_THUNK) +=3D indirect-t obj-$(CONFIG_PV) +=3D ioport_emulate.o obj-y +=3D irq.o obj-$(CONFIG_KEXEC) +=3D machine_kexec.o +obj-y +=3D memset.o obj-y +=3D mm.o x86_64/mm.o obj-$(CONFIG_VM_EVENT) +=3D monitor.o obj-y +=3D mpparse.o --- a/xen/arch/x86/alternative.c +++ b/xen/arch/x86/alternative.c @@ -346,6 +346,12 @@ static int init_or_livepatch _apply_alte /* 0xe8/0xe9 are relative branches; fix the offset. */ if ( a->repl_len >=3D 5 && (*buf & 0xfe) =3D=3D 0xe8 ) *(int32_t *)(buf + 1) +=3D repl - orig; + else if ( IS_ENABLED(CONFIG_RETURN_THUNK) && + a->repl_len > 5 && buf[a->repl_len - 5] =3D=3D 0xe9 && + ((long)repl + a->repl_len + + *(int32_t *)(buf + a->repl_len - 4) =3D=3D + (long)__x86_return_thunk) ) + *(int32_t *)(buf + a->repl_len - 4) +=3D repl - orig; =20 a->priv =3D 1; =20 --- /dev/null +++ b/xen/arch/x86/memset.S @@ -0,0 +1,30 @@ +#include + +.macro memset + and $7, %edx + shr $3, %rcx + movzbl %sil, %esi + mov $0x0101010101010101, %rax + imul %rsi, %rax + mov %rdi, %r8 + rep stosq + or %edx, %ecx + jz 0f + rep stosb +0: + mov %r8, %rax + RET +.endm + +.macro memset_erms + mov %esi, %eax + mov %rdi, %r8 + rep stosb + mov %r8, %rax + RET +.endm + +FUNC(memset) + mov %rdx, %rcx + ALTERNATIVE memset, memset_erms, X86_FEATURE_ERMS +END(memset) --- a/xen/arch/x86/string.c +++ b/xen/arch/x86/string.c @@ -22,19 +22,6 @@ void *(memcpy)(void *dest, const void *s return dest; } =20 -void *(memset)(void *s, int c, size_t n) -{ - long d0, d1; - - asm volatile ( - "rep stosb" - : "=3D&c" (d0), "=3D&D" (d1) - : "a" (c), "1" (s), "0" (n) - : "memory"); - - return s; -} - void *(memmove)(void *dest, const void *src, size_t n) { long d0, d1, d2; From nobody Fri Oct 31 04:01:22 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; 
envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=quarantine dis=none) header.from=suse.com ARC-Seal: i=1; a=rsa-sha256; t=1749119170; cv=none; d=zohomail.com; s=zohoarc; b=ZIsZJbmOzYfN6+qoGsDbHnKNUiY8eQpbDv0b3pb3V6/RC2MOwN4LcE3o8VLxY5DyGGeaQ33GcpHuZfpvWqxAbTKhyB63ROL8RamDk69UGH6Ww6t17RIpJQc839d5d9qbcQj6J1ympHKT7vRa6caBVT5dMWg/S93oxLNRrSWGv0s= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1749119170; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=Kzfk8kOPjucUBtuwDsuMCecezBv2uHiBq7/jphCJyvE=; b=JbAFZ5LPthLmD9o5XaGhICma6zJfGfLzOSZtoxIa2VDD2WNO0XOLUJ9xCNH26RHKXuvVoeVpsXlaxLUiwnZEFR1Hz5PV5s8NXmLpxiktfiVHxm3qNmT73OyUK7OZ2Etf+9ltjd40BEzOPM6/ZHCLoitQlQBU13HBd8Gl5VXEjPk= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) by mx.zohomail.com with SMTPS id 1749119170039725.2834692788945; Thu, 5 Jun 2025 03:26:10 -0700 (PDT) Received: from list by lists.xenproject.org with outflank-mailman.1006554.1385756 (Exim 4.92) (envelope-from ) id 1uN7nI-0004o3-PL; Thu, 05 Jun 2025 10:25:56 +0000 Received: by outflank-mailman (output) from mailman id 1006554.1385756; Thu, 05 Jun 2025 10:25:56 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) 
(envelope-from ) id 1uN7nI-0004nw-Mp; Thu, 05 Jun 2025 10:25:56 +0000 Received: by outflank-mailman (input) for mailman id 1006554; Thu, 05 Jun 2025 10:25:55 +0000 Received: from se1-gles-flk1-in.inumbo.com ([94.247.172.50] helo=se1-gles-flk1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1uN7nH-00043g-5j for xen-devel@lists.xenproject.org; Thu, 05 Jun 2025 10:25:55 +0000 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [2a00:1450:4864:20::32e]) by se1-gles-flk1.inumbo.com (Halon) with ESMTPS id 75ab4638-41f7-11f0-b894-0df219b8e170; Thu, 05 Jun 2025 12:25:53 +0200 (CEST) Received: by mail-wm1-x32e.google.com with SMTP id 5b1f17b1804b1-451dbe494d6so8992725e9.1 for ; Thu, 05 Jun 2025 03:25:53 -0700 (PDT) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-747afeabc2dsm12994333b3a.65.2025.06.05.03.25.49 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 05 Jun 2025 03:25:52 -0700 (PDT) X-Outflank-Mailman: Message body and most headers restored to incoming version X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: 75ab4638-41f7-11f0-b894-0df219b8e170 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1749119153; x=1749723953; darn=lists.xenproject.org; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=Kzfk8kOPjucUBtuwDsuMCecezBv2uHiBq7/jphCJyvE=; b=BT9XNNNh3KTFaXwaAfAArbWCTKHWj3wOoT9qXhcfedbcLmkvq8ZKmRnt2HeQzNHdBo lyP7xMOZhEK5vPzj0HBH726Oo/jA32U/yLRag5A2O5Tc8LdWZp6sEc3DLGnvc8mo0R5t cgaU158UxQ2+br+T69Ud3Up6wD3lDXZGk6QICszGqqlowa43XVbCPjBn3nnbzQ28C4k8 
fid5n3/8lXSFT1IZCOYAtWCyCOJbwztQUtpC2hkghWAuVzV3P7CcHWfDuPvx2jwc6gXA vpqrVXBj5sCyyA/Q/JxJ3B0qPsfRYovichzZ4cMRL3AJDjgcbln2y8Yb7aCclUWVsgLx F7WA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749119153; x=1749723953; h=content-transfer-encoding:in-reply-to:autocrypt:content-language :references:cc:to:from:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Kzfk8kOPjucUBtuwDsuMCecezBv2uHiBq7/jphCJyvE=; b=MTfQiFYNXGUrkSfexAWTSaEQNf5lezHIkzSXFnuWwJHGwqd3625k0MUvK1YuG9sqiv V9XZX4badaA3tmpkHsL6gDiXqWYQ0z6dM3C8Qoqm6vaid3DN1xScC4DDAzokV6vuwh0e 8ehDoEtbuo0qbYfS/u9SeHDlCW2uM8YvtNpOyF5PSHVfHtkS9Emj20N5oiNr9MAcocWk bGEDJpsquamOtx1xDgneJEIt/MI8wNv8NSb0LK0H3dKpj5a6Zm9P9RfjFWsrNcrz6A8Q 3F7duoyXhDgvUhdB1Q3kuz4Nppa9WfSZVytrSlb0cOw7NkVa5wKnmpDv3qCGo8ZAZMnS m5vg== X-Gm-Message-State: AOJu0Yx8YWsdYpkf7C5IkO5vnLENvSMU/bUCzwxee4b7/qJTerTndb0O JJN3xAWc0RDgsksNjllUoOzQyQGTi411ABCgSFIt952PRHz6gFP3K1S8ZkgPCVL1FNPwduX3ro/ 5vK8= X-Gm-Gg: ASbGncusAN3NPng1/5BoUQ8GQ277hKgeDCA0X83XzZl00qLtJhdhYy20DQj/5HkGCQh YToORmTG1e3VbzQcYf6X3D1x8lZhWvI27nrZSVpBrddfsoily/2LILa6mHPhOemc661YxqrQS0+ NxMQ1qvrYaH8HUk9MC2wUcRNIEeAZinTz1HoSZQzMUeDtcXiaqsSm5C60rDVlxw/KeRna5YoPW2 duJ6dHAffQ7ZEK8d6oaR4gS5XUvYAU+9RFxA53w5GFz53xVOgdPlrR4U4cNB0K4ni6JlyBR+kGo yOazApC+Z91q5s2apMmC4zHgpv3JKEluXF/e1T1jz0/GuLXaDQSfd8C7hvMQNcLd7CoMcOnQulg jEh9iuRoWa0mCw9UPYJy1iZncve1i/9V8NWkq1/hNWmuL1Uc= X-Google-Smtp-Source: AGHT+IE6OiHLhwrRi590UFvEmq1U+3Yvi/OE0meUH/K8SbPo9CJPv30sHVwgWqAXwtobtk/mFAF5gg== X-Received: by 2002:a05:6000:4387:b0:3a5:276b:1ec7 with SMTP id ffacd0b85a97d-3a5276b2066mr2200737f8f.7.1749119152974; Thu, 05 Jun 2025 03:25:52 -0700 (PDT) Message-ID: <017e689a-41a2-4722-a5e7-19ffef27500f@suse.com> Date: Thu, 5 Jun 2025 12:25:46 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH v5 3/6] x86: re-work memcpy() From: Jan Beulich To: "xen-devel@lists.xenproject.org" Cc: Andrew Cooper , 
=?UTF-8?Q?Roger_Pau_Monn=C3=A9?= References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com> Content-Language: en-US Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com> Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass (identity @suse.com) X-ZM-MESSAGEID: 1749119171270116600 Content-Type: text/plain; charset="utf-8" Move the function to its own assembly file. 
Having it in C just for the entire body to be an asm() isn't really helpful. Then have two flavors: A "basic" version using qword steps for the bulk of the operation, and an ERMS version for modern hardware, to be substituted in via alternatives patching. Alternatives patching, however, requires an extra precaution: It uses memcpy() itself, and hence the function may patch itself. Luckily the patched-in code only replaces the prolog of the original function. Make sure this remains this way. Additionally alternatives patching, while supposedly safe via enforcing a control flow change when modifying already prefetched code, may not really be. Afaict a request is pending to drop the first of the two options in the SDM's "Handling Self- and Cross-Modifying Code" section. Insert a serializing instruction there. Signed-off-by: Jan Beulich Reviewed-by: Teddy Astie --- We may want to consider branching over the REP MOVSQ as well, if the number of qwords turns out to be zero. We may also want to consider using non-REP MOVS{L,W,B} for the tail. TBD: We may further need a workaround similar to Linux'es 8ca97812c3c8 ("x86/mce: Work around an erratum on fast string copy instructions"). TBD: Some older AMD CPUs have an issue with REP MOVS when source and destination are misaligned with one another (modulo 32?), which may require a separate memcpy() flavor. --- v5: Re-base. v4: Use CR2 write as serializing insn, and limit its use to boot time. v3: Re-base. --- a/xen/arch/x86/Makefile +++ b/xen/arch/x86/Makefile @@ -47,6 +47,7 @@ obj-$(CONFIG_RETURN_THUNK) +=3D indirect-t obj-$(CONFIG_PV) +=3D ioport_emulate.o obj-y +=3D irq.o obj-$(CONFIG_KEXEC) +=3D machine_kexec.o +obj-y +=3D memcpy.o obj-y +=3D memset.o obj-y +=3D mm.o x86_64/mm.o obj-$(CONFIG_VM_EVENT) +=3D monitor.o --- a/xen/arch/x86/alternative.c +++ b/xen/arch/x86/alternative.c @@ -195,12 +195,16 @@ void *place_ret(void *ptr) * executing. 
* * "noinline" to cause control flow change and thus invalidate I$ and - * cause refetch after modification. + * cause refetch after modification. While the SDM continues to suggest t= his + * is sufficient, it may not be - issue a serializing insn afterwards as w= ell, + * unless this is for live-patching. */ static void init_or_livepatch noinline text_poke(void *addr, const void *opcode, size_t len) { memcpy(addr, opcode, len); + if ( system_state < SYS_STATE_active ) + asm volatile ( "mov %%rax, %%cr2" ::: "memory" ); } =20 extern void *const __initdata_cf_clobber_start[]; --- /dev/null +++ b/xen/arch/x86/memcpy.S @@ -0,0 +1,20 @@ +#include + +FUNC(memcpy) + mov %rdx, %rcx + mov %rdi, %rax + /* + * We need to be careful here: memcpy() is involved in alternatives + * patching, so the code doing the actual copying (i.e. past setti= ng + * up registers) may not be subject to patching (unless further + * precautions were taken). + */ + ALTERNATIVE "and $7, %edx; shr $3, %rcx", \ + STR(rep movsb; RET), X86_FEATURE_ERMS + rep movsq + or %edx, %ecx + jz 1f + rep movsb +1: + RET +END(memcpy) --- a/xen/arch/x86/string.c +++ b/xen/arch/x86/string.c @@ -7,21 +7,6 @@ =20 #include =20 -void *(memcpy)(void *dest, const void *src, size_t n) -{ - long d0, d1, d2; - - asm volatile ( - " rep ; movs"__OS" ; " - " mov %k4,%k3 ; " - " rep ; movsb " - : "=3D&c" (d0), "=3D&D" (d1), "=3D&S" (d2) - : "0" (n/BYTES_PER_LONG), "r" (n%BYTES_PER_LONG), "1" (dest), "2" = (src) - : "memory" ); - - return dest; -} - void *(memmove)(void *dest, const void *src, size_t n) { long d0, d1, d2; From nobody Fri Oct 31 04:01:22 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zohomail.com: domain of lists.xenproject.org designates 192.237.175.120 as permitted sender) client-ip=192.237.175.120; envelope-from=xen-devel-bounces@lists.xenproject.org; helo=lists.xenproject.org; Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of lists.xenproject.org designates 
192.237.175.120 as permitted sender) smtp.mailfrom=xen-devel-bounces@lists.xenproject.org; dmarc=pass(p=quarantine dis=none) header.from=suse.com
Date: Thu, 5 Jun 2025 12:26:48 +0200
Subject: [PATCH v5 4/6] x86: control memset() and memcpy() inlining
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>
In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>

Stop the compiler from inlining non-trivial memset() and memcpy() (for
memset() see e.g. map_vcpu_info() or kimage_load_segments() for
examples). This way we even keep the compiler from using REP STOSQ /
REP MOVSQ when we'd prefer REP STOSB / REP MOVSB (when ERMS is
available).

With gcc10 this yields a modest .text size reduction (release build) of
around 2k.

Unfortunately these options aren't understood by the clang versions I
have readily available for testing with; I'm unaware of equivalents.

Note also that using cc-option-add is not an option here, or at least I
couldn't make things work with it (in case the option was not supported
by the compiler): The embedded comma in the option looks to be getting
in the way.

Requested-by: Andrew Cooper
Signed-off-by: Jan Beulich
---
v3: Re-base.
v2: New.
---
The boundary values are of course up for discussion - I wasn't really
certain whether to use 16 or 32; I'd be less certain about using yet
larger values.

Similarly whether to permit the compiler to emit REP STOSQ / REP MOVSQ
for known size, properly aligned blocks is up for discussion.

--- a/xen/arch/x86/arch.mk
+++ b/xen/arch/x86/arch.mk
@@ -58,6 +58,9 @@ endif
 $(call cc-option-add,CFLAGS_stack_boundary,CC,-mpreferred-stack-boundary=3)
 export CFLAGS_stack_boundary
 
+CFLAGS += $(call cc-option,$(CC),-mmemcpy-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+CFLAGS += $(call cc-option,$(CC),-mmemset-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+
 ifeq ($(CONFIG_UBSAN),y)
 # Don't enable alignment sanitisation. x86 has efficient unaligned accesses,
 # and various things (ACPI tables, hypercall pages, stubs, etc) are wont-fix.

From nobody Fri Oct 31 04:01:22 2025
Message-ID: <5fd7631c-a7aa-438b-ae7e-7f35af65cef2@suse.com>
Date: Thu, 5 Jun 2025 12:27:38 +0200
Subject: [PATCH v5 5/6] x86: introduce "hot" and "cold" page clearing functions
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>
In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>

The present clear_page_sse2() is useful in case a page isn't going to
get touched again soon, or if we want to limit churn on the caches.
Amend it by alternatively using CLZERO, which has been found to be quite
a bit faster on Zen2 hardware at least. Note that to use CLZERO, we need
to know the cache line size, and hence a feature dependency on CLFLUSH
gets introduced.

For cases where latency is the most important aspect, or when it is
expected that sufficiently large parts of a page will get accessed again
soon after the clearing, introduce a "hot" alternative. Again use
alternatives patching to select between a "legacy" and an ERMS variant.

Don't switch any callers just yet - this will be the subject of
subsequent changes.

Signed-off-by: Jan Beulich
---
v5: Re-base.
v3: Re-base.
v2: New.
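
For illustration only (not part of the patch): a minimal C sketch of how
the two immediates of the CLZERO loop could be derived from the line size
reported in CPUID leaf 1 EBX, mirroring the early_cpu_init() change in
this patch. clzero_loop_params() and SKETCH_PAGE_SIZE are made-up names,
and the 4k page size is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 4k pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper, not part of the patch: compute the loop count and
 * the negated step for a CLZERO-based page clearing loop.  Bits 15:8 of
 * CPUID leaf 1 EBX hold the CLFLUSH line size in units of 8 bytes.
 *
 * Returns false if the reported size is zero or not a power of two, in
 * which case the patch disables use of CLZERO instead.
 */
static bool clzero_loop_params(uint32_t ebx, uint32_t *count, int8_t *neg_size)
{
    unsigned int size = ((ebx >> 8) & 0xff) * 8;

    assert(count && neg_size);

    /*
     * Cap at 128 bytes, as the patch does.  -128 is also the most
     * negative value an int8_t can hold.
     */
    if ( size > 128 )
        size = 128;

    if ( !size || (size & (size - 1)) )
        return false;

    *count = SKETCH_PAGE_SIZE / size;  /* loop iterations per page */
    *neg_size = (int8_t)-(int)size;    /* immediate of "sub $imm8, %rax" */

    return true;
}
```

With a 64-byte line (EBX bits 15:8 equal to 8) this gives a count of 64
and an immediate of -64, i.e. the values clear_page_clzero is assembled
with before any patching.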
---
Note: Ankur indicates that for ~L3-size or larger regions MOVNT/CLZERO
      is better even latency-wise.

--- a/xen/arch/x86/clear_page.S
+++ b/xen/arch/x86/clear_page.S
@@ -5,7 +5,7 @@
 #include
 #include
 
-FUNC(clear_page_sse2)
+        .macro clear_page_sse2
         mov     $PAGE_SIZE/32, %ecx
         xor     %eax,%eax
 
@@ -19,4 +19,42 @@ FUNC(clear_page_sse2)
 
         sfence
         RET
-END(clear_page_sse2)
+        .endm
+
+        .macro clear_page_clzero
+        mov     %rdi, %rax
+        mov     $PAGE_SIZE/64, %ecx
+        .globl clear_page_clzero_post_count
+clear_page_clzero_post_count:
+
+0:      clzero
+        sub     $-64, %rax
+        .globl clear_page_clzero_post_neg_size
+clear_page_clzero_post_neg_size:
+        sub     $1, %ecx
+        jnz     0b
+
+        sfence
+        RET
+        .endm
+
+FUNC(clear_page_cold)
+        ALTERNATIVE clear_page_sse2, clear_page_clzero, X86_FEATURE_CLZERO
+END(clear_page_cold)
+
+        .macro clear_page_stosb
+        mov     $PAGE_SIZE, %ecx
+        xor     %eax,%eax
+        rep stosb
+        .endm
+
+        .macro clear_page_stosq
+        mov     $PAGE_SIZE/8, %ecx
+        xor     %eax, %eax
+        rep stosq
+        .endm
+
+FUNC(clear_page_hot)
+        ALTERNATIVE clear_page_stosq, clear_page_stosb, X86_FEATURE_ERMS
+        RET
+END(clear_page_hot)
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -60,6 +60,9 @@ DEFINE_PER_CPU(bool, full_gdt_loaded);
 
 DEFINE_PER_CPU(uint32_t, pkrs);
 
+extern uint32_t clear_page_clzero_post_count[];
+extern int8_t clear_page_clzero_post_neg_size[];
+
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
 	const uint32_t *dfs;
@@ -357,8 +360,38 @@ void __init early_cpu_init(bool verbose)
 
 	edx &= ~cleared_caps[FEATURESET_1d];
 	ecx &= ~cleared_caps[FEATURESET_1c];
-	if (edx & cpufeat_mask(X86_FEATURE_CLFLUSH))
-		c->x86_cache_alignment = ((ebx >> 8) & 0xff) * 8;
+	if (edx & cpufeat_mask(X86_FEATURE_CLFLUSH)) {
+		unsigned int size = ((ebx >> 8) & 0xff) * 8;
+
+		c->x86_cache_alignment = size;
+
+		/*
+		 * Patch in parameters of clear_page_cold()'s CLZERO
+		 * alternative. Note that for now we cap this at 128 bytes.
+		 * Larger cache line sizes would still be dealt with
+		 * correctly, but would cause redundant work done.
+		 */
+		if (size > 128)
+			size = 128;
+		if (size && !(size & (size - 1))) {
+			/*
+			 * Need to play some games to keep the compiler from
+			 * recognizing the negative array index as being out
+			 * of bounds. The labels in assembler code really are
+			 * _after_ the locations to be patched, so the
+			 * negative index is intentional.
+			 */
+			uint32_t *pcount = clear_page_clzero_post_count;
+			int8_t *neg_size = clear_page_clzero_post_neg_size;
+
+			OPTIMIZER_HIDE_VAR(pcount);
+			OPTIMIZER_HIDE_VAR(neg_size);
+			pcount[-1] = PAGE_SIZE / size;
+			neg_size[-1] = -size;
+		}
+		else
+			setup_clear_cpu_cap(X86_FEATURE_CLZERO);
+	}
 	/* Leaf 0x1 capabilities filled in early for Xen. */
 	c->x86_capability[FEATURESET_1d] = edx;
 	c->x86_capability[FEATURESET_1c] = ecx;
--- a/xen/arch/x86/include/asm/asm-defns.h
+++ b/xen/arch/x86/include/asm/asm-defns.h
@@ -1,5 +1,9 @@
 #include
 
+.macro clzero
+    .byte 0x0f, 0x01, 0xfc
+.endm
+
 /*
  * Call a noreturn function. This could be JMP, but CALL results in a more
  * helpful backtrace. BUG is to catch functions which do decide to return...
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -219,10 +219,11 @@ typedef struct { u64 pfn; } pagetable_t;
 #define pagetable_from_paddr(p) pagetable_from_pfn((p)>>PAGE_SHIFT)
 #define pagetable_null() pagetable_from_pfn(0)
 
-void clear_page_sse2(void *pg);
+void clear_page_hot(void *pg);
+void clear_page_cold(void *pg);
 void copy_page_sse2(void *to, const void *from);
 
-#define clear_page(_p) clear_page_sse2(_p)
+#define clear_page(_p) clear_page_cold(_p)
 #define copy_page(_t, _f) copy_page_sse2(_t, _f)
 
 /* Convert between Xen-heap virtual addresses and machine addresses. */
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -212,6 +212,10 @@ def crunch_numbers(state):
         # the first place.
         APIC: [X2APIC, TSC_DEADLINE, EXTAPIC],
 
+        # The CLZERO insn requires a means to determine the cache line size,
+        # which is tied to the CLFLUSH insn.
+        CLFLUSH: [CLZERO],
+
         # AMD built MMXExtentions and 3DNow as extentions to MMX.
         MMX: [MMXEXT, _3DNOW],

From nobody Fri Oct 31 04:01:22 2025
Date: Thu, 5 Jun 2025 12:28:36 +0200
Subject: [PATCH v5 6/6] mm: allow page scrubbing routine(s) to be arch controlled
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Julien Grall, Stefano Stabellini, Anthony PERARD, Michal Orzel, Roger Pau Monné, Bertrand Marquis, Volodymyr Babchuk, Oleksii Kurochko, Shawn Anastasio
References: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>
In-Reply-To: <73481cbf-337f-4e85-81d2-3487366cd822@suse.com>

Especially when dealing with large amounts of memory, memset() may not
be very efficient; this can be bad enough that even for debug builds a
custom function is warranted. We additionally want to distinguish "hot"
and "cold" cases (with, as initial heuristic, "hot" being for any
allocations a domain does for itself, assuming that in all other cases
the page wouldn't be accessed [again] soon). The goal is for accesses of
"cold" pages to not disturb caches (albeit finding a good balance between
this and the higher latency looks to be difficult).

Keep the default fallback to clear_page_*() in common code; this may
want to be revisited down the road.

Signed-off-by: Jan Beulich
Acked-by: Julien Grall
---
v4: Re-base.
v3: Re-base.
v2: New.
---
The choice between hot and cold in scrub_one_page()'s callers is
certainly up for discussion / improvement.

--- a/xen/arch/arm/include/asm/page.h
+++ b/xen/arch/arm/include/asm/page.h
@@ -144,6 +144,12 @@ extern size_t dcache_line_bytes;
 
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 static inline size_t read_dcache_line_bytes(void)
 {
     register_t ctr;
--- a/xen/arch/ppc/include/asm/page.h
+++ b/xen/arch/ppc/include/asm/page.h
@@ -188,6 +188,12 @@ static inline void invalidate_icache(voi
 #define clear_page(page) memset(page, 0, PAGE_SIZE)
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 /* TODO: Flush the dcache for an entire page. */
 static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
 {
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -198,6 +198,12 @@ static inline void invalidate_icache(voi
 #define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
 {
     const void *v = map_domain_page(_mfn(mfn));
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -58,6 +58,7 @@ obj-y += pci.o
 obj-y += physdev.o
 obj-$(CONFIG_COMPAT) += x86_64/physdev.o
 obj-$(CONFIG_X86_PSR) += psr.o
+obj-bin-$(CONFIG_DEBUG) += scrub_page.o
 obj-y += setup.o
 obj-y += shutdown.o
 obj-y += smp.o
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -226,6 +226,11 @@ void copy_page_sse2(void *to, const void
 #define clear_page(_p) clear_page_cold(_p)
 #define copy_page(_t, _f) copy_page_sse2(_t, _f)
 
+#ifdef CONFIG_DEBUG
+void scrub_page_hot(void *);
+void scrub_page_cold(void *);
+#endif
+
 /* Convert between Xen-heap virtual addresses and machine addresses. */
 #define __pa(x) (virt_to_maddr(x))
 #define __va(x) (maddr_to_virt(x))
--- /dev/null
+++ b/xen/arch/x86/scrub_page.S
@@ -0,0 +1,39 @@
+        .file __FILE__
+
+#include
+#include
+#include
+
+FUNC(scrub_page_cold)
+        mov     $PAGE_SIZE/32, %ecx
+        mov     $SCRUB_PATTERN, %rax
+
+0:      movnti  %rax, (%rdi)
+        movnti  %rax, 8(%rdi)
+        movnti  %rax, 16(%rdi)
+        movnti  %rax, 24(%rdi)
+        add     $32, %rdi
+        sub     $1, %ecx
+        jnz     0b
+
+        sfence
+        ret
+END(scrub_page_cold)
+
+        .macro scrub_page_stosb
+        mov     $PAGE_SIZE, %ecx
+        mov     $SCRUB_BYTE_PATTERN, %eax
+        rep stosb
+        ret
+        .endm
+
+        .macro scrub_page_stosq
+        mov     $PAGE_SIZE/8, %ecx
+        mov     $SCRUB_PATTERN, %rax
+        rep stosq
+        ret
+        .endm
+
+FUNC(scrub_page_hot)
+        ALTERNATIVE scrub_page_stosq, scrub_page_stosb, X86_FEATURE_ERMS
+END(scrub_page_hot)
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -135,6 +135,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -779,27 +780,31 @@ static void page_list_add_scrub(struct p
         page_list_add(pg, &heap(node, zone, order));
 }
 
-/* SCRUB_PATTERN needs to be a repeating series of bytes. */
-#ifndef NDEBUG
-#define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL
-#else
-#define SCRUB_PATTERN 0ULL
+/*
+ * While in debug builds we want callers to avoid relying on allocations
+ * returning zeroed pages, for a production build, clear_page_*() is the
+ * fastest way to scrub.
+ */
+#ifndef CONFIG_DEBUG
+# undef scrub_page_hot
+# define scrub_page_hot clear_page_hot
+# undef scrub_page_cold
+# define scrub_page_cold clear_page_cold
 #endif
-#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)
 
-static void scrub_one_page(const struct page_info *pg)
+static void scrub_one_page(const struct page_info *pg, bool cold)
 {
+    void *ptr;
+
     if ( unlikely(pg->count_info & PGC_broken) )
         return;
 
-#ifndef NDEBUG
-    /* Avoid callers relying on allocations returning zeroed pages.
*/ - unmap_domain_page(memset(__map_domain_page(pg), - SCRUB_BYTE_PATTERN, PAGE_SIZE)); -#else - /* For a production build, clear_page() is the fastest way to scrub. */ - clear_domain_page(_mfn(page_to_mfn(pg))); -#endif + ptr =3D __map_domain_page(pg); + if ( cold ) + scrub_page_cold(ptr); + else + scrub_page_hot(ptr); + unmap_domain_page(ptr); } =20 static void poison_one_page(struct page_info *pg) @@ -1079,12 +1084,14 @@ static struct page_info *alloc_heap_page if ( first_dirty !=3D INVALID_DIRTY_IDX || (scrub_debug && !(memflags & MEMF_no_scrub)) ) { + bool cold =3D d && d !=3D current->domain; + for ( i =3D 0; i < (1U << order); i++ ) { if ( test_and_clear_bit(_PGC_need_scrub, &pg[i].count_info) ) { if ( !(memflags & MEMF_no_scrub) ) - scrub_one_page(&pg[i]); + scrub_one_page(&pg[i], cold); =20 dirty_cnt++; } @@ -1349,7 +1356,7 @@ bool scrub_free_pages(void) { if ( test_bit(_PGC_need_scrub, &pg[i].count_info) ) { - scrub_one_page(&pg[i]); + scrub_one_page(&pg[i], true); /* * We can modify count_info without holding heap * lock since we effectively locked this buddy by @@ -2074,7 +2081,7 @@ static struct page_info *alloc_color_hea if ( !(memflags & MEMF_no_scrub) ) { if ( need_scrub ) - scrub_one_page(pg); + scrub_one_page(pg, d !=3D current->domain); else check_one_page(pg); } @@ -2225,7 +2232,7 @@ static void __init cf_check smp_scrub_he if ( !mfn_valid(_mfn(mfn)) || !page_state_is(pg, free) ) continue; =20 - scrub_one_page(pg); + scrub_one_page(pg, true); } } =20 @@ -2930,7 +2937,7 @@ void unprepare_staticmem_pages(struct pa if ( need_scrub ) { /* TODO: asynchronous scrubbing for pages of static memory. */ - scrub_one_page(pg); + scrub_one_page(pg, true); } =20 pg[i].count_info |=3D PGC_static; --- /dev/null +++ b/xen/include/xen/scrub.h @@ -0,0 +1,24 @@ +#ifndef __XEN_SCRUB_H__ +#define __XEN_SCRUB_H__ + +#include + +/* SCRUB_PATTERN needs to be a repeating series of bytes. 
*/ +#ifdef CONFIG_DEBUG +# define SCRUB_PATTERN _AC(0xc2c2c2c2c2c2c2c2,ULL) +#else +# define SCRUB_PATTERN _AC(0,ULL) +#endif +#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff) + +#endif /* __XEN_SCRUB_H__ */ + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */
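
For illustration only (not part of the patch): a minimal user-space
sketch of the hot/cold dispatch, assuming a 4 KiB page and a debug
build. It shows why SCRUB_PATTERN has to be a repeating series of
bytes: the byte-wise fill (memset() / "rep stosb") and the 64-bit fill
("rep stosq" / movnti) must leave the page in the same state. Ordinary
stores stand in for MOVNTI, so the "cold" path here does not actually
bypass the cache. page_is_scrubbed() is a hypothetical helper that the
patch does not contain, and PAGE_SIZE is defined locally rather than
taken from Xen's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumptions for this sketch: 4 KiB pages, debug-build pattern. */
#define PAGE_SIZE 4096u
#define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL
#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)

/* Byte-wise and 64-bit fills only agree if every byte is the same. */
static_assert(SCRUB_PATTERN == SCRUB_BYTE_PATTERN * 0x0101010101010101ULL,
              "SCRUB_PATTERN must be a repeating series of bytes");

/* "Hot" variant: plain memset(), like the non-x86 fallback macros. */
static void scrub_page_hot(void *page)
{
    memset(page, (int)SCRUB_BYTE_PATTERN, PAGE_SIZE);
}

/*
 * "Cold" variant: four 64-bit stores per iteration, mirroring the loop
 * shape of the x86 assembly. Regular stores replace MOVNTI here.
 */
static void scrub_page_cold(void *page)
{
    uint64_t *p = page;
    unsigned int i;

    for ( i = 0; i < PAGE_SIZE / 32; i++, p += 4 )
    {
        p[0] = SCRUB_PATTERN;
        p[1] = SCRUB_PATTERN;
        p[2] = SCRUB_PATTERN;
        p[3] = SCRUB_PATTERN;
    }
}

/* Dispatch on the caller's hint, as scrub_one_page() does after mapping. */
static void scrub_one_page(void *page, bool cold)
{
    if ( cold )
        scrub_page_cold(page);
    else
        scrub_page_hot(page);
}

/* Hypothetical helper: true if every byte carries the scrub byte. */
static bool page_is_scrubbed(const void *page)
{
    const unsigned char *b = page;
    unsigned int i;

    for ( i = 0; i < PAGE_SIZE; i++ )
        if ( b[i] != SCRUB_BYTE_PATTERN )
            return false;

    return true;
}
```

Scrubbing one buffer with cold == false and another with cold == true
should leave both byte-for-byte identical; the two paths are meant to
differ only in their cache behaviour, never in the resulting contents.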