From nobody Sat Jun 13 17:05:57 2026
Date: Wed, 06 May 2026 07:11:39 -0000
From: "tip-bot2 for Frederic Weisbecker"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: timers/core] timers/migration: Split per-capacity hierarchies
Cc: Sehee Jeong, Thomas Gleixner, Frederic Weisbecker, Thomas Gleixner,
 x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260423165354.95152-5-frederic@kernel.org>
References: <20260423165354.95152-5-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Message-ID: <177805149959.424702.11263336267047588872.tip-bot2@tip-bot2>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     098cbaad8e573cf6cac9e68e7ca2e7b7363d2434
Gitweb:        https://git.kernel.org/tip/098cbaad8e573cf6cac9e68e7ca2e7b7363d2434
Author:        Frederic Weisbecker
AuthorDate:    Thu, 23 Apr 2026 18:53:52 +02:00
Committer:     Thomas Gleixner
CommitterDate: Wed, 06 May 2026 08:33:07 +02:00

timers/migration: Split per-capacity hierarchies

Systems with heterogeneous CPU capacities, such as big.LITTLE, have
reported power issues since the introduction of the new timer migration
code. Timers migrate from small capacity CPUs to big ones, degrading
their target residency and thus overall power consumption.

Solve this with splitting hierarchies per CPU capacity. For example in
a big.LITTLE machine, split a single hierarchy in two: one for big
capacity CPUs and another one for small capacity CPUs. This way global
timers only migrate across CPUs of the same capacity.

For simplicity purpose, split hierarchies keep the same number of
possible levels as if there were a single hierarchy, even though the
CPUs are distributed between multiple hierarchies. This could be a
problem on NUMA systems with heterogeneous CPU capacities (provided that
ever exists yet) where useless intermediate nodes may be created.
Solving this properly will imply on boot to know in advance how many
capacities are available and the number of CPUs for each of them.

Reported-by: Sehee Jeong
Suggested-by: Thomas Gleixner
Signed-off-by: Frederic Weisbecker
Signed-off-by: Thomas Gleixner
Link: https://patch.msgid.link/20260423165354.95152-5-frederic@kernel.org
---
 kernel/time/timer_migration.c | 123 ++++++++++++++++++++++++---------
 kernel/time/timer_migration.h |   7 ++-
 2 files changed, 100 insertions(+), 30 deletions(-)

diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index a68b9c7..03ae8c7 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -417,7 +417,7 @@
 
 static DEFINE_MUTEX(tmigr_mutex);
 
-static struct tmigr_hierarchy *hierarchy;
+static LIST_HEAD(tmigr_hierarchy_list);
 
 static unsigned int tmigr_hierarchy_levels __read_mostly;
 static unsigned int tmigr_crossnode_level __read_mostly;
@@ -1889,6 +1889,12 @@ static int tmigr_setup_groups(struct tmigr_hierarchy *hier, unsigned int cpu,
 			data.childmask = start->groupmask;
 			__walk_groups_from(tmigr_active_up, &data, start, start->parent);
 		}
+	} else if (start) {
+		union tmigr_state state;
+
+		/* Remote activation assumes the whole target's hierarchy is inactive */
+		state.state = atomic_read(&start->migr_state);
+		WARN_ON_ONCE(state.active);
 	}
 
 	/* Root update */
@@ -1907,34 +1913,78 @@ out:
 	return err;
 }
 
-static struct tmigr_hierarchy *tmigr_get_hierarchy(void)
+static struct tmigr_hierarchy *tmigr_get_hierarchy(unsigned int capacity)
 {
-	if (hierarchy)
-		return hierarchy;
+	struct tmigr_hierarchy *hier = NULL, *iter;
+
+	list_for_each_entry(iter, &tmigr_hierarchy_list, node) {
+		if (iter->capacity == capacity)
+			hier = iter;
+	}
+
+	if (hier)
+		return hier;
 
-	hierarchy = kzalloc(sizeof(*hierarchy), GFP_KERNEL);
-	if (!hierarchy)
+	hier = kzalloc(sizeof(*hier), GFP_KERNEL);
+	if (!hier)
 		return ERR_PTR(-ENOMEM);
 
-	hierarchy->cpumask = kzalloc(cpumask_size(), GFP_KERNEL);
-	if (!hierarchy->cpumask)
+	hier->cpumask = kzalloc(cpumask_size(), GFP_KERNEL);
+	if (!hier->cpumask)
 		goto err;
 
-	hierarchy->level_list = kzalloc_objs(struct list_head, tmigr_hierarchy_levels);
-	if (!hierarchy->level_list)
+	hier->level_list = kzalloc_objs(struct list_head, tmigr_hierarchy_levels);
+	if (!hier->level_list)
 		goto err;
 
 	for (int i = 0; i < tmigr_hierarchy_levels; i++)
-		INIT_LIST_HEAD(&hierarchy->level_list[i]);
+		INIT_LIST_HEAD(&hier->level_list[i]);
 
-	return hierarchy;
+	hier->capacity = capacity;
+	list_add_tail(&hier->node, &tmigr_hierarchy_list);
+
+	return hier;
 err:
-	kfree(hierarchy->cpumask);
-	kfree(hierarchy);
-	hierarchy = NULL;
+	kfree(hier->cpumask);
+	kfree(hier);
 	return ERR_PTR(-ENOMEM);
 }
 
+static int tmigr_connect_old_root(struct tmigr_hierarchy *hier, int cpu,
+				  struct tmigr_group *old_root, bool activate)
+{
+	/*
+	 * The target CPU must never do the prepare work, except
+	 * on early boot when the boot CPU is the target. Otherwise
+	 * it may spuriously activate the old top level group inside
+	 * the new one (nevertheless whether old top level group is
+	 * active or not) and/or release an uninitialized childmask.
+	 */
+	WARN_ON_ONCE(cpu == smp_processor_id());
+	if (activate) {
+		/*
+		 * The current CPU is expected to be online in the hierarchy,
+		 * otherwise the old root may not be active as expected.
+		 */
+		WARN_ON_ONCE(!__this_cpu_read(tmigr_cpu.available));
+	}
+
+	return tmigr_setup_groups(hier, -1, old_root->numa_node, old_root, activate);
+}
+
+static long connect_old_root_work(void *arg)
+{
+	struct tmigr_group *old_root = arg;
+	struct tmigr_hierarchy *hier;
+	int cpu = smp_processor_id();
+
+	hier = tmigr_get_hierarchy(arch_scale_cpu_capacity(cpu));
+	if (IS_ERR(hier))
+		return PTR_ERR(hier);
+
+	return tmigr_connect_old_root(hier, cpu, old_root, true);
+}
+
 static int tmigr_add_cpu(unsigned int cpu)
 {
 	struct tmigr_hierarchy *hier;
@@ -1944,7 +1994,7 @@ static int tmigr_add_cpu(unsigned int cpu)
 
 	guard(mutex)(&tmigr_mutex);
 
-	hier = tmigr_get_hierarchy();
+	hier = tmigr_get_hierarchy(arch_scale_cpu_capacity(cpu));
 	if (IS_ERR(hier))
 		return PTR_ERR(hier);
 
@@ -1957,20 +2007,33 @@ static int tmigr_add_cpu(unsigned int cpu)
 
 	/* Root has changed? Connect the old one to the new */
 	if (old_root && old_root != hier->root) {
-		/*
-		 * The target CPU must never do the prepare work, except
-		 * on early boot when the boot CPU is the target. Otherwise
-		 * it may spuriously activate the old top level group inside
-		 * the new one (nevertheless whether old top level group is
-		 * active or not) and/or release an uninitialized childmask.
-		 */
-		WARN_ON_ONCE(cpu == raw_smp_processor_id());
-		/*
-		 * The (likely) current CPU is expected to be online in the hierarchy,
-		 * otherwise the old root may not be active as expected.
-		 */
-		WARN_ON_ONCE(!per_cpu_ptr(&tmigr_cpu, raw_smp_processor_id())->available);
-		ret = tmigr_setup_groups(hier, -1, old_root->numa_node, old_root, true);
+		guard(migrate)();
+
+		if (cpumask_test_cpu(smp_processor_id(), hier->cpumask)) {
+			/*
+			 * If the target belong to the same hierarchy, the old root is expected
+			 * to be active. Link and propagate to the new root.
+			 */
+			ret = tmigr_connect_old_root(hier, cpu, old_root, true);
+		} else {
+			int target = cpumask_first_and(hier->cpumask, tmigr_available_cpumask);
+
+			if (target < nr_cpu_ids) {
+				/*
+				 * If the target doesn't belong to the same hierarchy as the current
+				 * CPU, activate from a relevant one to make sure the old root is
+				 * active.
+				 */
+				ret = work_on_cpu(target, connect_old_root_work, old_root);
+			} else {
+				/*
+				 * No other available CPUs in the remote hierarchy. Link the
+				 * old root remotely but don't propagate activation since the
+				 * old root is not expected to be active.
+				 */
+				ret = tmigr_connect_old_root(hier, cpu, old_root, false);
+			}
+		}
 	}
 
 	if (ret >= 0)
diff --git a/kernel/time/timer_migration.h b/kernel/time/timer_migration.h
index 0cfbb8d..291bfb6 100644
--- a/kernel/time/timer_migration.h
+++ b/kernel/time/timer_migration.h
@@ -7,14 +7,21 @@
 
 /**
  * struct tmigr_hierarchy - a hierarchy associated to a given CPU capacity.
+ *			    Homogeneous systems have only one hierarchy.
+ *			    Heterogenous have one hierarchy per CPU capacity.
  * @level_list:	Per level lists of tmigr groups
  * @cpumask:	CPUs belonging to this hierarchy
  * @root:	The current root of the hierarchy
+ * @capacity:	CPU capacity associated to this hierarchy
+ * @node:	Node in the global hierarchy list
  */
 struct tmigr_hierarchy {
 	struct list_head	*level_list;
 	struct cpumask		*cpumask;
 	struct tmigr_group	*root;
+	unsigned long		capacity;
+	struct list_head	node;
+
 };
 
 /**
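
For readers who want to try the lookup outside the kernel, the following is a minimal user-space sketch of the get-or-create pattern that tmigr_get_hierarchy() follows in the patch above. It is an illustration only, not the kernel code: struct hier, hier_get(), hier_count() and hier_demo() are hypothetical names, a plain singly linked list stands in for the kernel's list_head API, and the capacity values 1024 and 512 are made-up examples rather than values taken from a real platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tmigr_hierarchy: one per CPU capacity. */
struct hier {
	unsigned long capacity;	/* capacity shared by all CPUs in this hierarchy */
	unsigned int nr_cpus;	/* number of CPUs attached so far */
	struct hier *next;	/* simple list instead of the kernel's list_head */
};

static struct hier *hier_list;

/* Return the hierarchy matching @capacity, allocating it on first use. */
static struct hier *hier_get(unsigned long capacity)
{
	struct hier *h;

	for (h = hier_list; h; h = h->next) {
		if (h->capacity == capacity)
			return h;
	}

	h = calloc(1, sizeof(*h));
	if (!h)
		return NULL;

	h->capacity = capacity;
	h->next = hier_list;
	hier_list = h;

	return h;
}

/* Count the hierarchies created so far. */
static int hier_count(void)
{
	int n = 0;

	for (struct hier *h = hier_list; h; h = h->next)
		n++;

	return n;
}

/*
 * Attach eight made-up CPUs (four "big", four "little") and return the
 * number of resulting hierarchies, or -1 on allocation failure.
 */
static int hier_demo(void)
{
	static const unsigned long cpu_capacity[] = {
		1024, 1024, 1024, 1024, 512, 512, 512, 512,
	};

	for (size_t i = 0; i < sizeof(cpu_capacity) / sizeof(cpu_capacity[0]); i++) {
		struct hier *h = hier_get(cpu_capacity[i]);

		if (!h)
			return -1;
		h->nr_cpus++;
	}

	/* CPUs of different capacity must never share a hierarchy. */
	assert(hier_get(1024) != hier_get(512));

	return hier_count();
}
```

With these example capacities, hier_demo() ends up with two hierarchies of four CPUs each, which mirrors the big.LITTLE split described in the changelog. A homogeneous system, where every CPU reports the same capacity, would end up with a single hierarchy.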