From: "Ionut Nechita (Sunlight Linux)"
To: rafael@kernel.org
Cc: daniel.lezcano@linaro.org, christian.loehle@arm.com,
 linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
 yumpusamongus@gmail.com, Ionut Nechita, stable@vger.kernel.org
Subject: [PATCH v2 1/1] cpuidle: menu: Use min() to prevent deep C-states
 when tick is stopped
Date: Thu, 22 Jan 2026 10:09:39 +0200
Message-ID: <20260122080937.22347-4-sunlightlinux@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260122080937.22347-2-sunlightlinux@gmail.com>
References: <20260122080937.22347-2-sunlightlinux@gmail.com>

From: Ionut Nechita

When the tick is already stopped and the predicted idle duration is
short (< TICK_NSEC), the original code uses next_timer_ns directly.
This can lead to selecting excessively deep C-states when the actual
idle duration is much shorter than the next timer event.

On modern Intel server platforms (Sapphire Rapids and newer), deep
package C-states can have exit latencies of 150-190us due to:
- Tile-based architecture with per-tile power gating
- DDR5 and CXL power management overhead
- Complex mesh interconnect resynchronization

When a network packet arrives after 500us but the governor selected a
deep C-state (PC6) based on a 10ms timer, the high exit latency (150us+)
dominates the response time.

Use the minimum of predicted_ns and next_timer_ns instead of using
next_timer_ns directly. This avoids selecting unnecessarily deep states
when the prediction is short but the next timer is distant, while still
being conservative enough to prevent getting stuck in shallow states for
extended periods.

Testing on Sapphire Rapids with qperf tcp_lat shows:
- Before: 151us average latency (frequent PC6 entry)
- After: ~30us average latency (avoids PC6 on short predictions)
- Improvement: 5x latency reduction

The fix is platform-agnostic and benefits other platforms with high
C-state exit latencies. Testing on systems with large C-state gaps
(e.g., C2 at 36us → C3 at 700us with 350us latency) shows similar
improvements in avoiding deep state selection for short idle periods.

Power efficiency testing shows minimal impact (<1% difference in package
power consumption during mixed workloads), well within measurement
noise.

Cc: stable@vger.kernel.org
Signed-off-by: Ionut Nechita
---
 drivers/cpuidle/governors/menu.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 64d6f7a1c776..199eac2a1849 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -287,12 +287,16 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 	/*
 	 * If the tick is already stopped, the cost of possible short idle
 	 * duration misprediction is much higher, because the CPU may be stuck
-	 * in a shallow idle state for a long time as a result of it. In that
-	 * case, say we might mispredict and use the known time till the closest
-	 * timer event for the idle state selection.
+	 * in a shallow idle state for a long time as a result of it.
+	 *
+	 * Instead of using next_timer_ns directly (which could be very large,
+	 * e.g., 10ms), use the minimum of the prediction and the timer. This
+	 * prevents selecting excessively deep C-states when the prediction
+	 * suggests a short idle period, while still clamping to next_timer_ns
+	 * to avoid unnecessarily shallow states.
 	 */
 	if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
-		predicted_ns = data->next_timer_ns;
+		predicted_ns = min(predicted_ns, data->next_timer_ns);
 
 	/*
 	 * Find the idle state with the lowest power while satisfying
-- 
2.52.0
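
For anyone who wants to see the effect of the one-line change in
isolation, here is a minimal userspace sketch in C. It is not kernel
code. The state table, the 1000 Hz tick and every sketch_* name are
assumptions made for illustration (the residency figures loosely follow
the C2/C3 example in the changelog), and the selection loop only looks
at target residency, whereas the real menu_select() also applies a
latency limit and other heuristics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_USEC 1000ULL
/* Assumption: 1000 Hz tick. The kernel derives TICK_NSEC from CONFIG_HZ. */
#define SKETCH_TICK_NSEC (1000ULL * SKETCH_NSEC_PER_USEC)

/* Hypothetical idle-state descriptor, much simpler than struct cpuidle_state. */
struct sketch_state {
	const char *name;
	uint64_t target_residency_ns;
	uint64_t exit_latency_ns;
};

/* Hypothetical table, ordered shallow to deep. */
static const struct sketch_state sketch_states[] = {
	{ "C1",   2 * SKETCH_NSEC_PER_USEC,   1 * SKETCH_NSEC_PER_USEC },
	{ "C2",  36 * SKETCH_NSEC_PER_USEC,  10 * SKETCH_NSEC_PER_USEC },
	{ "C3", 700 * SKETCH_NSEC_PER_USEC, 350 * SKETCH_NSEC_PER_USEC },
};

/* Behaviour before the patch: a short prediction is replaced by the timer. */
static uint64_t sketch_old_estimate(int tick_stopped, uint64_t predicted_ns,
				    uint64_t next_timer_ns)
{
	if (tick_stopped && predicted_ns < SKETCH_TICK_NSEC)
		return next_timer_ns;
	return predicted_ns;
}

/* Behaviour after the patch: take the smaller of prediction and timer. */
static uint64_t sketch_new_estimate(int tick_stopped, uint64_t predicted_ns,
				    uint64_t next_timer_ns)
{
	if (tick_stopped && predicted_ns < SKETCH_TICK_NSEC)
		return predicted_ns < next_timer_ns ? predicted_ns : next_timer_ns;
	return predicted_ns;
}

/*
 * Return the index of the deepest state whose target residency fits in the
 * estimate. State 0 is always allowed, as a fallback.
 */
static size_t sketch_pick_state(const struct sketch_state *states, size_t n,
				uint64_t estimate_ns)
{
	size_t i, best = 0;

	assert(n > 0);
	for (i = 1; i < n; i++) {
		if (states[i].target_residency_ns > estimate_ns)
			break;
		best = i;
	}
	return best;
}
```

With a 500us prediction and a 10ms timer (the scenario from the
changelog), feeding sketch_old_estimate() into sketch_pick_state() lands
on the 700us-residency entry, while sketch_new_estimate() stays on the
36us-residency entry.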