From: Xia Fukun
Date: Mon, 27 Apr 2026 21:13:45 +0800
Subject: [RFC] debugfs: sched/migration_cost_ns should accept -1 like legacy sysctl interface?
To: Peter Zijlstra, Juri Lelli, Vincent Guittot, Greg KH, "mingo@redhat.com"
CC: Dietmar Eggemann, Steven Rostedt, "bsegall@google.com", "mgorman@suse.de", "vschneid@redhat.com", LKML, "tanghui (C)"
Message-ID: <6a7905b1-c2b6-42ac-b877-094bb6f6db11@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

I noticed an inconsistency in how the scheduler’s migration_cost_ns tunable
is exposed via debugfs versus its original sysctl/proc interface.

Currently, the file /sys/kernel/debug/sched/migration_cost_ns is created with
debugfs_create_u32().
This means that writing -1 to it fails with “Invalid argument”, because u32
attributes reject negative values.

However, the scheduler logic in task_hot() (in kernel/sched/fair.c)
explicitly checks for -1:

static int task_hot(struct task_struct *p, struct lb_env *env)
{
	...
	if (sysctl_sched_migration_cost == -1)
		return 1;

	/*
	 * Don't migrate task if the task's cookie does not match
	 * with the destination CPU's core cookie.
	 */
	if (!sched_core_cookie_match(cpu_rq(env->dst_cpu), p))
		return 1;

	if (sysctl_sched_migration_cost == 0)
		return 0;

	delta = rq_clock_task(env->src_rq) - p->se.exec_start;

	return delta < (s64)sysctl_sched_migration_cost;
}

In kernels prior to v5.10 (when this tunable was still exposed via
/proc/sys/kernel/sched_migration_cost_ns using proc_dointvec()), writing -1
was perfectly valid and had well-defined semantics: task_hot() returns 1 for
every task, which effectively disables migration based on execution time.

Now that the debugfs interface uses an unsigned type, this useful
configuration can no longer be set by writing -1 from userspace, even though
the kernel code still supports it.

This seems like an unintended regression in the debugfs exposure. Should we
consider changing the interface to use a signed type so that -1 (and other
negative values, if meaningful) can be written again?

One possible approach would be to introduce a debugfs_create_s32() helper
(similar to the existing u32/u64 helpers) and use it for migration_cost_ns.
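To make the difference concrete, here is a rough userspace model of the
behaviour described above. It is a sketch, not kernel code: write_unsigned(),
write_signed() and is_hot() are hypothetical names invented for this
illustration, the tunable is assumed to be stored as an unsigned 32-bit value
(since debugfs_create_u32() takes a u32 pointer), and the core-cookie check
from task_hot() is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of an unsigned attribute write: "-1" is refused. */
static int write_unsigned(const char *s, uint32_t *out)
{
	char *end;
	unsigned long long v;

	assert(s != NULL && out != NULL);
	if (*s == '-')
		return -EINVAL;	/* strtoull() alone would accept and negate */
	errno = 0;
	v = strtoull(s, &end, 10);
	if (errno != 0 || end == s)
		return -EINVAL;
	if (*end == '\n')	/* tolerate the newline added by echo */
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (v > UINT32_MAX)	/* range check is a choice of this sketch */
		return -ERANGE;
	*out = (uint32_t)v;
	return 0;
}

/* Hypothetical model of a signed attribute write: "-1" is accepted. */
static int write_signed(const char *s, uint32_t *out)
{
	char *end;
	long long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoll(s, &end, 10);
	if (errno != 0 || end == s)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	if (v < INT32_MIN || v > INT32_MAX)
		return -ERANGE;
	*out = (uint32_t)(int32_t)v;	/* -1 ends up as 0xffffffff */
	return 0;
}

/* Mirrors only the -1 and 0 special cases quoted from task_hot(). */
static int is_hot(uint32_t cost, int64_t delta_ns)
{
	if (cost == (uint32_t)-1)
		return 1;	/* every task is treated as cache-hot */
	if (cost == 0)
		return 0;	/* no task is treated as cache-hot */
	return delta_ns < (int64_t)cost;
}
```

In this model, write_unsigned("-1", &v) fails with -EINVAL, while
write_signed("-1", &v) stores the all-ones pattern that is_hot() then treats
as "always hot". The model also accepts "4294967295" through the unsigned
path and yields the same stored value; whether the real debugfs file behaves
the same way is not something this sketch establishes.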
The following is a partial code snippet, which does not include the
implementation of the debugfs_create_s32() interface:

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index b24f40f05019..379190e2f8a9 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -608,7 +608,7 @@ static __init int sched_init_debug(void)
 	debugfs_create_u32("latency_warn_once", 0644, debugfs_sched, &sysctl_resched_latency_warn_once);
 	debugfs_create_file("tunable_scaling", 0644, debugfs_sched, NULL, &sched_scaling_fops);
-	debugfs_create_u32("migration_cost_ns", 0644, debugfs_sched, &sysctl_sched_migration_cost);
+	debugfs_create_s32("migration_cost_ns", 0644, debugfs_sched, &sysctl_sched_migration_cost);
 	debugfs_create_u32("nr_migrate", 0644, debugfs_sched, &sysctl_sched_nr_migrate);
 	sched_domains_mutex_lock();

I’d appreciate feedback on:

  - Whether this behavior change was intentional;
  - If not, whether adding debugfs_create_s32() is an acceptable solution;
  - Or if there’s a better way to preserve the -1 semantic without breaking
    debugfs conventions.

Thanks!