From nobody Sat Oct 4 21:02:24 2025
From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: x86@kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, colinmitchell@google.com,
	chao.gao@intel.com, abusse@amazon.de, chang.seok.bae@intel.com,
	linux-kernel@vger.kernel.org
Subject: [PATCH v4 1/6] x86/microcode: Introduce staging step to reduce late-loading time
Date: Wed, 13 Aug 2025 10:26:44 -0700
Message-ID: <20250813172649.15474-2-chang.seok.bae@intel.com>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250813172649.15474-1-chang.seok.bae@intel.com>
References: <20250409232713.4536-1-chang.seok.bae@intel.com>
 <20250813172649.15474-1-chang.seok.bae@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

As microcode patch sizes continue to grow, late-loading latency spikes
can lead to timeouts and disruptions in running workloads. This trend
of increasing patch sizes is expected to continue, so a foundational
solution is needed to address the issue.

To mitigate the problem, introduce a new staging feature. This option
processes most of the microcode update (excluding activation) on a
non-critical path, allowing CPUs to remain operational during the
majority of the update. By offloading work from the critical path,
staging can significantly reduce latency spikes.

Integrate staging as a preparatory step in late-loading. Introduce a
new callback for staging, invoked at the beginning of
load_late_stop_cpus(), before CPUs enter the rendezvous phase.

Staging follows an opportunistic model:

  * If it succeeds, it reduces CPU rendezvous time.

  * If it fails, the process falls back to the legacy path to finish
    loading, but with potentially higher latency.

Extend struct microcode_ops to incorporate the staging properties,
which will be implemented separately in the vendor code.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
Reviewed-by: Chao Gao <chao.gao@intel.com>
---
There were discussions about whether staging success should be
enforced by a configurable option. That topic is identified as
follow-up work, separate from this series:
  https://lore.kernel.org/lkml/54308373-7867-4b76-be34-63730953f83c@intel.com/

V1 -> V2:
* Move invocation inside of load_late_stop_cpus() (Boris)
* Add more notes about staging (Dave)
---
 arch/x86/kernel/cpu/microcode/core.c     | 11 +++++++++++
 arch/x86/kernel/cpu/microcode/internal.h |  4 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b92e09a87c69..34e569ee1db2 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -552,6 +552,17 @@ static int load_late_stop_cpus(bool is_safe)
 		pr_err("You should switch to early loading, if possible.\n");
 	}
 
+	/*
+	 * Pre-load the microcode image into a staging device. This
+	 * process is preemptible and does not require stopping CPUs.
+	 * Successful staging simplifies the subsequent late-loading
+	 * process, reducing rendezvous time.
+	 *
+	 * Even if the transfer fails, the update will proceed as usual.
+	 */
+	if (microcode_ops->use_staging)
+		microcode_ops->stage_microcode();
+
 	atomic_set(&late_cpus_in, num_online_cpus());
 	atomic_set(&offline_in_nmi, 0);
 	loops_per_usec = loops_per_jiffy / (TICK_NSEC / 1000);
diff --git a/arch/x86/kernel/cpu/microcode/internal.h b/arch/x86/kernel/cpu/microcode/internal.h
index 50a9702ae4e2..adf02ebbf7a3 100644
--- a/arch/x86/kernel/cpu/microcode/internal.h
+++ b/arch/x86/kernel/cpu/microcode/internal.h
@@ -31,10 +31,12 @@ struct microcode_ops {
 	 * See also the "Synchronization" section in microcode_core.c.
 	 */
 	enum ucode_state (*apply_microcode)(int cpu);
+	void (*stage_microcode)(void);
 	int (*collect_cpu_info)(int cpu, struct cpu_signature *csig);
 	void (*finalize_late_load)(int result);
 	unsigned int nmi_safe	: 1,
-		     use_nmi	: 1;
+		     use_nmi	: 1,
+		     use_staging	: 1;
 };
 
 struct early_load_data {
-- 
2.48.1
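
As a reading aid, here is a rough sketch of how a vendor driver might
populate the new hooks. This is illustrative only: intel_stage_microcode()
and the microcode_intel_ops wiring below are assumptions for the example,
not part of this patch; per the commit message, the actual staging
implementation arrives separately in the vendor code.

	/* Hypothetical vendor-side wiring of the staging interface. */
	static void intel_stage_microcode(void)
	{
		/*
		 * Runs in a preemptible context, before the CPU
		 * rendezvous. A failure here is tolerated: late-loading
		 * simply proceeds via the legacy, slower path.
		 */
	}

	static struct microcode_ops microcode_intel_ops = {
		.collect_cpu_info	= collect_cpu_info,
		.apply_microcode	= apply_microcode_late,
		.finalize_late_load	= finalize_late_load,
		.stage_microcode	= intel_stage_microcode,
		.use_nmi		= true,
		/* Opt in to the opportunistic staging step. */
		.use_staging		= true,
	};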