From nobody Sat Sep 13 18:30:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A58C4C63797 for ; Wed, 1 Feb 2023 08:04:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230273AbjBAIEC (ORCPT ); Wed, 1 Feb 2023 03:04:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232030AbjBAIDs (ORCPT ); Wed, 1 Feb 2023 03:03:48 -0500 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2076.outbound.protection.outlook.com [40.107.223.76]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A6225D122 for ; Wed, 1 Feb 2023 00:03:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=QutHDQm4lZeAIrwxYF/HoNq/20PqkqrHp+UedaZawYjNZfMVLUGMahVGSeIwQkpgGs0mpXNZRqCh/WlcqU3UdfJePrtjOJgb+AV+NDyq19nYXlU0v35hQNjDpfzcreIRrU3h5yVjz1UObAz+jds8X7Zpt7df4TZv2hBVgC4A7yJI0dHMjnXm9HBJ0mjaeS0UXY0gy4Ib5mVbYrGNocplsUBu3kHnsvGZKtb+D+GhVT/knKyeRkSYCdd/VqIUlzOaJU3pV7vpd4IlNpn8y06YtSo95Qj2eNvsDJ4IHIA7Gt11isAdUW9FamhAZaOrSpWnTbrRF4EwoasZUbzdqF8M2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zBy421iEXPoI1DU2317/qdpzKhPgWFnxaUG/6sW3lQg=; b=EVxB//Q5DkKMXQ+ICFbhLggcbgDfTtefYMDneP9KRL360VFdla2R3Uf4Vm8sVVx8gebO+PoTLQN8RmW1ETaKAV7qbCI7uSXivdGubmjpUdnbt96PkBQat+aiAHegartWGH/bpEV1jf8U6ILc3IUOa7deaCx/nJtUjgQ9/cWxDlxj+tY7btoPH/nx+aMh+/6lqB8nWA4C7K/2mtMWoK9NVvfzdA2lqoasdHiJbeitutXFPZ5oNNSDVlAaljxKwA5bamCQIPbfCe2md+mtKOUwwJa70ROfMf7GhNwDQHRxQKG5vpiyar1oXHIker9wiBIXX3LvwfAfXSEKwjwnMXiNgw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=zBy421iEXPoI1DU2317/qdpzKhPgWFnxaUG/6sW3lQg=; b=qirlsF8c/7kHm5TSItSbvRRqT6avHT5eRJREx0qNRCRuqsknuODdeUwxfE79Kx/tRwUrv6CpkRz2EHutgBB80NaBe6hUzYBAUMPyOXZ84sceCFCz2NiDP6yOsu7dZMKC6VE6qbSPxRHVuhKakwG4KRIwJsUtBZStMk+pMd4g7hE= Received: from BN1PR14CA0014.namprd14.prod.outlook.com (2603:10b6:408:e3::19) by MN0PR12MB6272.namprd12.prod.outlook.com (2603:10b6:208:3c0::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6043.38; Wed, 1 Feb 2023 08:03:41 +0000 Received: from BN8NAM11FT086.eop-nam11.prod.protection.outlook.com (2603:10b6:408:e3:cafe::fd) by BN1PR14CA0014.outlook.office365.com (2603:10b6:408:e3::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6043.38 via Frontend Transport; Wed, 1 Feb 2023 08:03:41 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by BN8NAM11FT086.mail.protection.outlook.com (10.13.176.220) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6064.22 via Frontend Transport; Wed, 1 Feb 2023 08:03:41 +0000 Received: from BLR-L-RKODSARA.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Wed, 1 Feb 2023 02:03:29 -0600 From: Raghavendra K T To: , CC: Ingo Molnar , Peter Zijlstra , "Mel Gorman" , Andrew Morton , "David Hildenbrand" , , Bharata B Rao , Disha Talreja , Raghavendra K T Subject: [PATCH V2 3/3] sched/numa: Reset the accessing PID information periodically Date: Wed, 1 Feb 2023 13:32:22 +0530 Message-ID: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB03.amd.com (10.181.40.144) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: BN8NAM11FT086:EE_|MN0PR12MB6272:EE_ X-MS-Office365-Filtering-Correlation-Id: 753a8f99-9e26-4f55-bbcb-08db042ad528 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: oLTh+cidC2YJSLzJPpFtDSWZ3aa4tbq+BHJD3eBC6XHLkMqyP9hHdLxDGiybeu5SHHs/kpch6WSBZhoXfUFK5xYGVaPOfwpkY/94OX9RiWssyqd9f9gb+TwEmz2ZLq58Ef2blCJxcQN3jO5l3XXwNQgyGike9bC8tmDCDzIdf8ScSMObuW/nq69xTH0h0PInmlUjH3SwHLXDaquVLALw23NPkhKM2hoVrmpNpQm5pfEUprNrM2+ddxBzwU7wcELSL/pzWU4MP6TxkQ+wXtyP+9Ed3papyn+MhAH6RpVvbjfa+dfdaBa/TSu5UBUbi9pDa987wufya1V50YG7e91qHLbq/snFoqj0m6hf7YRUnGqhQ4WoHlaDjMcP/4ycYQtOCCyJnI5tMLHiTH6P1t0sA8rGYqRreldX6xuMWlVaZ9BRtENHtqTO6/gfkZg0P+I9NIq7Jbxu/ziJnU3DlJJXbpnF8HH8hTlWlWfMuv3tEnIz2DOodjXTQyNtvH5HTTxjGUPmbnF9aLHwQP2IuWRZLTtd4BWX5+FqM+BL15qADgK1iv/iYaJCCDfleHHx6586LSrf/2s7NfR5hThK/bX2e8AlB3ii+6CkqQtrb4WrzqkeNEhRuOGDuF01hw4XksVZUvAbaGtFC1eqjZijvxWVZsoLsUrFz8b1y2VQ0U5/DCEidpIovQekiK+STLU+kpSuX1r7+tfitM1G1MG0Q8A+2LYBij/w1RoOuDUE3ikwaJJCjc5pWqLnWxJ6Rw15ws5Q X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230025)(4636009)(346002)(39860400002)(376002)(396003)(136003)(451199018)(46966006)(40470700004)(36840700001)(356005)(81166007)(82740400003)(40480700001)(82310400005)(36756003)(40460700003)(336012)(110136005)(426003)(7696005)(316002)(54906003)(47076005)(2616005)(6666004)(478600001)(26005)(16526019)(186003)(41300700001)(36860700001)(70206006)(70586007)(5660300002)(4326008)(83380400001)(2906002)(8676002)(8936002)(36900700001)(2101003);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 01 Feb 2023 08:03:41.3942 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 753a8f99-9e26-4f55-bbcb-08db042ad528 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT086.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN0PR12MB6272 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This helps to ensure, only recently accessed PIDs scan the VMAs. Current implementation: Reset accessing PIDs every (4 * sysctl_numa_balancing_scan_delay) interval after initial scan delay period expires. The reset logic is implemented in scan path Suggested-by: Mel Gorman Signed-off-by: Raghavendra K T --- Some of the potential ideas for clearing the accessing PIDs 1) Flag to indicate phase in life cycle of vma and tie with timestamp (reus= e next_scan or so) VMA life cycle t1 t2 t3 t4 t5 = t6 |<- DS ->|<- US ->|<- CS ->|<- US ->|<- CS = ->| flags used to indicate whether we are in DS/CS/US phase DS (delay scan): Initial phase where scan is avoided for new VMA US (unconditional scan): Brief period where scanning is allowed irrespectiv= e of task faulting the VMA CS (conditional scan) : Longer conditiona scanning phase where task scanni= ng is allowed only for VMA of interest =20 2) Maintain duplicate list of accessing PIDs to keep track of history of ac= cess. and switch/reset. use OR operation during iteration Two lists of PIDs maintained. At regular interval old list is reset and we= make current list as old list At any point of time tracking of PIDs accessing VMA is determined by ORing = list1 and list2 =20 accessing_pids_list1 <- current list accessing_pids_list2 <- old list 3) Maintain per vma numa_seq also Currently numa_seq (how many times we are scanning entire set of VMAs) is m= aintained at mm level. Having numa_seq (almost like how many times the current VMA considered for = scanning) per VMA may be helpful in some context (for e.g., whether we need to allow VMA scanning unconditio= nally for a newly created VMA). 4) Reset accessing PIDs at regular intervals (current implementation) t1 t2 t3 t4 t5 t6 |<- DS ->|<- CS ->|<- CS ->|<- CS ->|<- CS ->| The current implementation resets accessing PIDs every 4*scan_delay interva= ls after initial scan delay time expires. The reset logic is implemented in scan path include/linux/mm_types.h | 1 + kernel/sched/fair.c | 17 +++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 980a6a4308b6..08a007744ea1 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -437,6 +437,7 @@ struct anon_vma_name { =20 struct vma_numab { unsigned long next_scan; + unsigned long next_pid_reset; unsigned long accessing_pids; }; =20 diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3505ae57c07c..14db6d8a5090 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -2928,6 +2928,8 @@ static bool vma_is_accessed(struct vm_area_struct *vm= a) return vma->numab->accessing_pids & (1UL << active_pid_bit); } =20 +#define VMA_PID_RESET_PERIOD (4 * sysctl_numa_balancing_scan_delay) + /* * The expensive part of numa migration is done from task_work context. * Triggered from task_tick_numa(). @@ -3035,6 +3037,10 @@ static void task_numa_work(struct callback_head *wor= k) =20 vma->numab->next_scan =3D now + msecs_to_jiffies(sysctl_numa_balancing_scan_delay); + + /* Reset happens after 4 times scan delay of scan start */ + vma->numab->next_pid_reset =3D vma->numab->next_scan + + msecs_to_jiffies(VMA_PID_RESET_PERIOD); } =20 /* @@ -3047,6 +3053,17 @@ static void task_numa_work(struct callback_head *wor= k) if (!vma_is_accessed(vma)) continue; =20 + /* + * RESET accessing PIDs regularly for old VMAs. Resetting after checking + * vma for recent access to avoid clearing PID info before access.. + */ + if (mm->numa_scan_seq && + time_after(jiffies, vma->numab->next_pid_reset)) { + vma->numab->next_pid_reset =3D vma->numab->next_pid_reset + + msecs_to_jiffies(VMA_PID_RESET_PERIOD); + vma->numab->accessing_pids =3D 0; + } + do { start =3D max(start, vma->vm_start); end =3D ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE); --=20 2.34.1