From nobody Mon Feb 9 12:01:33 2026 Received: from mail-pf1-f170.google.com (mail-pf1-f170.google.com [209.85.210.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBBD345C3A7 for ; Thu, 8 Jan 2026 10:16:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867388; cv=none; b=rLPUAb4WiKX+41POFABgX8YeziSPeBWmBruJuKMblCYMwo6/5sHz7Mh0Xw1nDUCMyeEaC7OdwnaHl0vCqX1/35KQ8In45Ktt9pNvX1Xfc71Bit6PUJ6KA2RFc4+41ahXuy0zsIwDF9w8F+EMlWzTqUVtanIEj5IGiItTsYsmR90= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867388; c=relaxed/simple; bh=uwpd6A8/JJiZRPsY0eYWPt+toE8mJr3sa3KvHSB6r6E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uUmx6fcgXL3jt/tMcYtYiCLxWqA4m2pK6ZnNpu1H+P4vkiNmhI80N3VB9/2hEGNiNoVxG3O/mDVoUfEUm35d/47FSj9k6FzSOwWn3nPGUoGGsUOU+OB+KAGJgB1SIdZ+Yr5sCv5cRgUUbOpIbk8GjNL7C8IYQLBlFsaXI9FVf6I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=l11mSy+8; arc=none smtp.client-ip=209.85.210.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="l11mSy+8" Received: by mail-pf1-f170.google.com with SMTP id d2e1a72fcca58-7acd9a03ba9so2008238b3a.1 for ; Thu, 08 Jan 2026 02:16:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767867384; x=1768472184; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=10vzVgMW/SIAVZiREFQWpaxB2zwT+8w8TEXAc8bNZjs=; b=l11mSy+8FndsHMTJYiiwv0cFca+qGjx5Wv55/cQ6brh+s4Z9Vf14PURhVRZMtQY90U Wl5vXcoyPAN+QN1hgXYrF2fwVQuEGK5f0ggqcEXEsHjhxrvye3VSblwE1TWFoZzJC6Ho 3mGLoEPNE9tBjBXOod7v6lUKSmp10bX4Y6627VnG9kaUmKrYcmKAuSkevgZLaYf9po/w kOJDmDP1ZiWgqDkNC52UV0VOz6U5zz7tYUL0bOAhOoiSxZM4Cbq37OZmMRqDcOukzmOy bct6DIy7DFcKZnEeN/c2gWBgnmrVjNn+2EUzLt7m5Cuet/xAeKcfN8zUPQ4j1G4XXqHg XIBA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767867384; x=1768472184; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=10vzVgMW/SIAVZiREFQWpaxB2zwT+8w8TEXAc8bNZjs=; b=UkeeUJJIw43m0qoLuvv8wOW8gOYt0XjqHN2LKLIdCigzYGddGcRryzMzZ1LT+FcVpP sd1M7Nnj7NEEZy3c+n0fh7ypjW4YLIfigzv1rhKatXKIa0b2lPcYcc7IjkYleCLPPpK9 ehFcixtp5bvqu8gRe8arufatA4IQNQ7UnhUX7x9I21h2415r6xJfqrymTdbDV9KaDzff d+EjjciMXh0IjnUqzGXR38Iz4JKOplp9LPjHlER/AHDZRs6rvqaIQN6RJR9DzDkE62Rt T95s4EU0qCzB6taRrLNztclSFhubaM4V5vEoDYpRjFkBrDyYDZdjSg22A2e1dsu2k2p1 eTwQ== X-Forwarded-Encrypted: i=1; AJvYcCX1psuZmKx+U1R9KSHA4EO3PMwVuXLfEqa78zdDKG9echUu5YNOBCZzBEssKOJLj4nFmUL94TVhg0+UPWA=@vger.kernel.org X-Gm-Message-State: AOJu0YzR2bSOvUa00gWswHZ1uZBcMpV7UUYsJjTIARZAhfRP3lBnKTOw JyGbrUKnk57cGD/8Lthd96CXOxB4mDci58FtP9Rebt929PP8gcbcge5m X-Gm-Gg: AY/fxX7LXTqMZs+EDNm3k+V2fAG9UdAJ5Fcj9YqHOLVVG300wlK/o6a4ONhR+Rcorlf ebHWtuIGaJx5l45nzL1azig+uzHpkk75HaIum8/eVpaxDMJCabFYe8vb2Zy4pO9MW4Z/XMchSbk cbQyLct9UjET7QFbXsNtrO/5ps18F/O6sJeWloHEMv0WTblan46ojTNANbexG5HiS2H7GlBDNOu n9O7D7dGZ4Tftj458DxkprEGuRJ1c6O+Kz6Hyo9JmUtJNUKHaC4nUSJ8lZpTp9fkjd8oht4KHqb m8DjlqiTlYvE0xcxqpbXz7fl7tTct3rj3RHyn7uOvNlAa35o/yjzT0xjwptqEKjQjOlsqUG7auT qS8z8u5M877Nxh5pvyjaBTvrkeNCxfvUbmw+kRXzIZrtvxSK2VrbaDM0eYsVPJgC+V96neY8/vH sTzftGJeY39qia686W7qxCK7AJ 
X-Google-Smtp-Source: AGHT+IEZ/zQHkmzZD5tH7EULcvT2Rbbyog8IW+xRrH5RFccnrCaoGMPZL2CJNEUsgMOip+rJXNFPnQ==
X-Received: by 2002:a05:6a21:33a7:b0:366:1953:1d30 with SMTP id adf61e73a8af0-3898f8f54d1mr5334164637.5.1767867383759; Thu, 08 Jan 2026 02:16:23 -0800 (PST)
Received: from localhost.localdomain ([240f:34:212d:1:8352:dfa:3b18:eb4e]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2a3e3c49299sm73785245ad.42.2026.01.08.02.16.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jan 2026 02:16:23 -0800 (PST)
From: Akinobu Mita
To: akinobu.mita@gmail.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, axelrasmussen@google.com,
 yuanchu@google.com, weixugc@google.com, hannes@cmpxchg.org,
 david@kernel.org, mhocko@kernel.org, zhengqi.arch@bytedance.com,
 shakeel.butt@linux.dev, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
 surenb@google.com, bingjiao@google.com
Subject: [PATCH v3 1/3] mm: memory-tiers, numa_emu: enable to create memory tiers using fake numa nodes
Date: Thu, 8 Jan 2026 19:15:33 +0900
Message-ID: <20260108101535.50696-2-akinobu.mita@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260108101535.50696-1-akinobu.mita@gmail.com>
References: <20260108101535.50696-1-akinobu.mita@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This makes it possible to create memory tiers using fake NUMA nodes
generated by NUMA emulation.

The "numa_emulation.adistance=" kernel cmdline option allows you to set
the abstract distance for each NUMA node.

For example, you can create two fake nodes, each in a different memory
tier, by booting with "numa=fake=2 numa_emulation.adistance=576,704".
Here, the abstract distances of node0 and node1 are set to 576 and 704,
respectively.
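
As an illustration only (not part of this patch), the sketch below shows
why 576 and 704 end up in different tiers. It assumes that a tier is
simply the 128-wide chunk of abstract distance that contains the value.
TIER_CHUNK_SIZE, tier_chunk_index() and same_tier() are hypothetical
names made up for this example; they are not kernel symbols.

```c
#include <assert.h>

/* Assumption for this sketch: each tier spans 128 abstract-distance units. */
#define TIER_CHUNK_SIZE 128

/* Hypothetical helper: index of the 128-wide chunk that contains adistance. */
int tier_chunk_index(int adistance)
{
	assert(adistance >= 0);
	return adistance / TIER_CHUNK_SIZE;
}

/* Hypothetical helper: nonzero if both abstract distances share a chunk. */
int same_tier(int a, int b)
{
	return tier_chunk_index(a) == tier_chunk_index(b);
}
```

With these helpers, tier_chunk_index(576) is 4 and tier_chunk_index(704)
is 5, so same_tier(576, 704) is 0 and the two fake nodes land in adjacent
tiers.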

Each memory tier covers an abstract distance chunk size of 128.  Thus,
nodes with abstract distances between 512 and 639 are classified into the
same memory tier, and nodes with abstract distances between 640 and 767
are classified into the next slower memory tier.

The abstract distance of fake nodes not specified in the parameter will
be the default DRAM abstract distance of 576.

Signed-off-by: Akinobu Mita
Reviewed-by: Jonathan Cameron
---
v2:
- fix the explanation about cmdline parameter in the commit log

 mm/numa_emulation.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/mm/numa_emulation.c b/mm/numa_emulation.c
index 703c8fa05048..a4266da21344 100644
--- a/mm/numa_emulation.c
+++ b/mm/numa_emulation.c
@@ -6,6 +6,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include
 #include
 #include
@@ -344,6 +347,27 @@ static int __init setup_emu2phys_nid(int *dfl_phys_nid)
 	return max_emu_nid;
 }
 
+static int adistance[MAX_NUMNODES];
+module_param_array(adistance, int, NULL, 0400);
+MODULE_PARM_DESC(adistance, "Abstract distance values for each NUMA node");
+
+static int emu_calculate_adistance(struct notifier_block *self,
+				   unsigned long nid, void *data)
+{
+	if (adistance[nid]) {
+		int *adist = data;
+
+		*adist = adistance[nid];
+		return NOTIFY_STOP;
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block emu_adist_nb = {
+	.notifier_call = emu_calculate_adistance,
+	.priority = INT_MIN,
+};
+
 /**
  * numa_emulation - Emulate NUMA nodes
  * @numa_meminfo: NUMA configuration to massage
@@ -532,6 +556,8 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
 		}
 	}
 
+	register_mt_adistance_algorithm(&emu_adist_nb);
+
 	/* free the copied physical distance table */
 	memblock_free(phys_dist, phys_size);
 	return;
-- 
2.43.0

From nobody Mon Feb 9 12:01:33 2026
Received: from mail-pf1-f193.google.com (mail-pf1-f193.google.com [209.85.210.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256
(128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2B44E4611EC for ; Thu, 8 Jan 2026 10:16:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.193 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867399; cv=none; b=TNs1omhKWQo0o8zzrd+z5vWlg9SYtLDKsDvI+/EnRZfDrrZBlSTBHpB68EL9kTtZuTycDjIjzF+eFdUq6OI4onT3ZYFkIj9m6lCtztM7bXAVeoLsPxU1CLFGrxL9SsaaZSURD+wTS8uteJtBD4VoCXNHz60x439DHasXsTztg7k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867399; c=relaxed/simple; bh=0ItxBjCWeeYPymcdVA6acSuKxkPEbVxqgc4wC6gsjtc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=uJSPNBQOkEn/jhaP0feR85ThTdu1hrZDSuH0mAgrrUAL/oy794oU5wCxGKPNmpUQ9YP1tRfTjlYhErbp6+kL8OmwzvEgtH1lZLDsxorwECfaWXdp89hMWI999W3fVScy7mkGmoPBud4rFM7OT71vvnOHd04HTMyXQ4ayATlN3m8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=j52VY0p1; arc=none smtp.client-ip=209.85.210.193 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="j52VY0p1" Received: by mail-pf1-f193.google.com with SMTP id d2e1a72fcca58-7fc0c1d45a4so1834123b3a.0 for ; Thu, 08 Jan 2026 02:16:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767867392; x=1768472192; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=WKs0XTtJuJVxxVwPA++DZZZ+CNpKQV3ASKLDqcig+Cw=; 
b=j52VY0p1Xy8PNl31uamlftkilNrQ2tMIpuUFGwtjGAjDpzTFfcrTwI0oXVBF+MWpn6 HhGL5u6WaBtgm3LQyuk67R61X9+Oz3itgjBWO5GiyfNBbybUG1QGa3GpSwnWXGY7kVJD B8p6hU9kqlHH7EitI8ON5DyBmWyTUW8frt43igNYRTxjPcV32KGYM9AmrITkZPOiCoKU i8u5zGPJVyyPphQybIlNYLv+8USX0LDkZt7/20KgMQCpUes6FJwspzYmA7E/m0bIq0zH uUficfllC+QgTFdUYqkSeZVrzuH3bYiY9YcnWQzkb+dNOsCRkkAXbX4qM9yHhvAX6xYF ymnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767867392; x=1768472192; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=WKs0XTtJuJVxxVwPA++DZZZ+CNpKQV3ASKLDqcig+Cw=; b=Beye/O4uHVGQ7mZVPTX5m0iu+G2A6TuuqzUGW9F0Lrhe/vQJ9r32f7ogG3mO/H2IqA Nj2k8Q7ckHYRytK6LWdx99qbeq9GNSVgK7LZCd0hma3wt9b75tgAPAjPOl2Y11nEmvCx neTFUdW+EEP1W4T8MU/yRd0Pv4rXX/wcbwU+9BYdxILpluYMkYSe+0W5bXhkbod1WjsV G5zpUjk7w4fwNZoGIoBoSCmUMyznMu3JBOg15fgJIf/Xpk+jFFJTEvnn9sMYHvX2oS6N WyPgUJw6zD1Qfn9okmMJUk8Nd5I8FzEMIbx1xj+Q9hyB2UdNnAg9PmNmFee5utQND0Qo dUXA== X-Forwarded-Encrypted: i=1; AJvYcCXunTBvfcpnlB/NuFMf+Qi0rC8nraGvA7YYGNDh+79zj7gKO0Ihcq8jLQYKU7vQWqidkbCM/bJjjqMO5C8=@vger.kernel.org X-Gm-Message-State: AOJu0YwefqvM3dYNG6rTupqE5Srr/iK8D6VVqKWXwnX9HTHmI3HUSmbT qMWV0GTmimaVO6DBmhvU8Ca+WTBg4PWsOHhLU8KYRKkvZKddGh0190EU X-Gm-Gg: AY/fxX4uh1qZcPxzrCjxUQd+TxTUzzYPXPQZTA0ChDR2j/cSAOBZnGy38vJmNCWbaIy XsjYCMLusm1aw/Q+1Ew4HE1whF3kW8Im5HjUZAdYuKbDEdXHqgsfeKT3QAXSD95lH5aLyIC0+0L KDDf9BRR4w3qhhEZdbw6eAmrlPznXizeN2y+LAHlDXl+E04qOP/b46LMsVCnZWVO57830txfZ7h cgqQzJR5JVqwRQaWeVn7TXnq6cl6XKiYScfrq9vkXHEZ7Jz/se29XHFBcdUCXhEIgiBaLjBMJrU Cb45q0xxGTdqk5yJFSxjlOdtP0vlOR1o06/PdD+eEOUJYjIrtVjs+hBhkjFDT6mPdomtQEkn/0v TpW00ZddVoYrJPjDKrs0trqCJvOkf9NDiXlCu7qgCJuaILfElf9wiuE6XgQYNK0eR4DQ7Lc0NS5 /8ASkF5saSkBTLZGDYO9ibmuJV X-Google-Smtp-Source: AGHT+IG5cjHWe4nCIW20Z8QJTPxi/x38oc0S4rLZqrg1Nkqq2kgc6NrGn9A7eOqoxL6kXyUHFyaZNg== X-Received: by 2002:a05:6a21:3285:b0:35f:b96d:af0f with SMTP id 
adf61e73a8af0-3898f94ca6emr4948333637.26.1767867390988; Thu, 08 Jan 2026 02:16:30 -0800 (PST)
Received: from localhost.localdomain ([240f:34:212d:1:8352:dfa:3b18:eb4e]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2a3e3c49299sm73785245ad.42.2026.01.08.02.16.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jan 2026 02:16:30 -0800 (PST)
From: Akinobu Mita
To: akinobu.mita@gmail.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, axelrasmussen@google.com,
 yuanchu@google.com, weixugc@google.com, hannes@cmpxchg.org,
 david@kernel.org, mhocko@kernel.org, zhengqi.arch@bytedance.com,
 shakeel.butt@linux.dev, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
 surenb@google.com, bingjiao@google.com
Subject: [PATCH v3 2/3] mm: numa_emu: add document for NUMA emulation
Date: Thu, 8 Jan 2026 19:15:34 +0900
Message-ID: <20260108101535.50696-3-akinobu.mita@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260108101535.50696-1-akinobu.mita@gmail.com>
References: <20260108101535.50696-1-akinobu.mita@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add a document with a brief explanation of numa emulation and how to use
the newly added "numa_emulation.adistance=" kernel cmdline parameter.

Signed-off-by: Akinobu Mita
Reviewed-by: Jonathan Cameron
---
v2:
- added in v2

 Documentation/mm/index.rst          |  1 +
 Documentation/mm/numa_emulation.rst | 30 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)
 create mode 100644 Documentation/mm/numa_emulation.rst

diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
index 7aa2a8886908..7d628edd6a17 100644
--- a/Documentation/mm/index.rst
+++ b/Documentation/mm/index.rst
@@ -24,6 +24,7 @@ see the :doc:`admin guide <../admin-guide/mm/index>`.
    page_cache
    shmfs
    oom
+   numa_emulation
 
 Unsorted Documentation
 ======================
diff --git a/Documentation/mm/numa_emulation.rst b/Documentation/mm/numa_emulation.rst
new file mode 100644
index 000000000000..dce9f607c031
--- /dev/null
+++ b/Documentation/mm/numa_emulation.rst
@@ -0,0 +1,30 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+NUMA emulation
+==============
+
+If CONFIG_NUMA_EMU is enabled, you can create fake NUMA nodes with
+the ``numa=fake=`` kernel cmdline option.
+See Documentation/admin-guide/kernel-parameters.txt and
+Documentation/arch/x86/x86_64/fake-numa-for-cpusets.rst for more information.
+
+
+Multiple Memory Tiers Creation
+==============================
+
+The "numa_emulation.adistance=" kernel cmdline option allows you to set
+the abstract distance for each NUMA node.
+
+For example, you can create two fake nodes, each in a different memory
+tier by booting with "numa=fake=2 numa_emulation.adistance=576,704".
+Here, the abstract distances of node0 and node1 are set to 576 and 704,
+respectively.
+
+Each memory tier covers an abstract distance chunk size of 128.  Thus,
+nodes with abstract distances between 512 and 639 are classified into the
+same memory tier, and nodes with abstract distances between 640 and 767
+are classified into the next slower memory tier.
+
+The abstract distance of fake nodes not specified in the parameter will
+be the default DRAM abstract distance of 576.
-- 
2.43.0

From nobody Mon Feb 9 12:01:33 2026
Received: from mail-pl1-f178.google.com (mail-pl1-f178.google.com [209.85.214.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DA66C4611EA for ; Thu, 8 Jan 2026 10:16:37 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.178
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867403; cv=none; b=gHrdqJXst9goEOisphEVCE7miMdrC/TitUbjkiJtm6ajfM/NWoAEvnxcCVVFjNrRT/MrGU0ceXTNid8aWQnXnVmwCY3W2DMyYAqS+47wpLv2p+j2iUiq6S/aUnPwWjOJT8ASNWChyjM/rIJSkJsw/RfpJVjTBaCYndGGXGF2B0s=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767867403; c=relaxed/simple; bh=YYf+X5c8xRQHOIoiVkNKohN/SGgRXpW0mOrRpxFNnJs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EmNYPH+XFooQMyulmabfjjLwDUKHQkSefMCJ4klfiX62EXWSUGTs0FMmcexXhCOLBf+VsTpRGI77vzmTvHxKfBuX+p/3qABHdyliCpZtYzP0bYvR4KUa2KaE88glQABCogVS2BTCr0rVT39Ojkda82cacu6uxK4vD5uqjepKFDo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=imB/Yzn4; arc=none smtp.client-ip=209.85.214.178
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com
header.i=@gmail.com header.b="imB/Yzn4" Received: by mail-pl1-f178.google.com with SMTP id d9443c01a7336-2a0c09bb78cso14848385ad.0 for ; Thu, 08 Jan 2026 02:16:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767867395; x=1768472195; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=suTkrZ1EkPHRVVH34Oml2yxp+xWgDllJvbX0nsQJyK8=; b=imB/Yzn4EKaWXIG92ZpWJAM9kF5qZdKuXBpAwy6BK3qh2Rous2EtDXKRrTC/IzoHKY vOqp6o03EjQxR6Ok5uAKmczFoUS1q3um+IDJqGGc8IqFxmt0bSVVVW1yv3CfKcjjgyOo n24SEEdW809TaZXgtHJXBtkCrk4EDIQ1MjcoQBLh/Cdq/fGS1STsLAKdB5zZw/YpwzcE Wm145S9pghGumtDNirh1GjNbblFxm8pYIrrxK61KFvieBNmdYOjrYqC7BPPoaCfrNWEP BfMCxzqb2eXjnD0J6+jSb5tywRC+C5cNF1YDbSZtoBb3wfg7knvJ6u9GWGYAaaXZiZAY VDJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767867395; x=1768472195; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=suTkrZ1EkPHRVVH34Oml2yxp+xWgDllJvbX0nsQJyK8=; b=GTh96JR3Mha9LZpc2KgyoOu6Bdy4LjOo0GevoEVVXMKqUaiscppq+Q2qAVGSDnftcA f+/IhWkkWkzCasi0hYVfo1vMHE0tNgJYRT8sk/tg3pCDZ3ngX64oSghdK4A6/Yr7V0bJ nbQidc+4LdL1lW9823yEouUd7ODOhLmsU0+kTOQmj9EVXPQw+cdytlgapFCYsSAR9TrF yoVAOMWMZUPxffzMrexK87UeOsTR5WYHV++drcTe97zp8wS6tCthIDGvSsOuNZIrHTi2 uIIjqX3SKy898O5Z8mJ8z6Rw6TKXx/drPP14NpxhXCpDaZHyqKnJDvqJJDZH2xXtXOEZ dQ/A== X-Forwarded-Encrypted: i=1; AJvYcCU4Tm+nV8/FuPMGQnPrWn1rjujN7YJC3QzNRMPWAWETz9w4WL4b++qOCrJzd0xq1iesZ5Jm4wpV1j4X6uQ=@vger.kernel.org X-Gm-Message-State: AOJu0Yw9yWvVKgXdKJJzBfaj15RSvGdQQ6GCDGELSG+HNf/CHLv4uf0z 7Cxv2YqpMQBBgXrdJ/MMBQZ3O7EiXK+TOpd8l2ZmVhmSrCFMimsDtBPH X-Gm-Gg: AY/fxX7eM3o7G1VmAH8w53vjMIU9PQu2JC7+w6kU3nAkq4Mi+EYKBdUu6pvWE8VI4Js awDguCj0D6FkgGkh4Q2V3q9ZMnKqmi7zqTvFgRRBM0aWTYbyNGhc4dmdxmdsI3yEivC+aFY3aj+ 
krSxFc1IlWt4PU2oz67oHfQ9QvehRx8s0v9wD198sxQq9cXNlQ2VK+Eqy8C5/3PFiFQwTt6XuIt ntjftvfCNzdHYmnpKPqlRwBgv9yvMYG1jYASZbaDIDWfLd+yScE75wCZU68EGG5/cdbYGOa/i5Z oLqcKWvvrykI4b12zlc9QGv1IKkz2xm8e29NY3cFvkG15AMTlQlTnMKFJayDb6trjJ0FFhoTISq n+dXlpDyjbtbkVdyD3cyQHj7L9CAWYd4KNPWabMmmYUbsXtpstwn3A/T7mWWnjsEI4y3pX4YA6K WW3BAAXTtG0oQ9GyQMnJdMMFYe
X-Google-Smtp-Source: AGHT+IEjPBmAvzyeXCoA9/7yT7vSwivlYjHbfpR026nkIclI0cizfQblaziO9HyuHaaz+q4kQ2IDPw==
X-Received: by 2002:a17:903:11d0:b0:29f:1b1f:784 with SMTP id d9443c01a7336-2a3e39828f1mr90701885ad.4.1767867395489; Thu, 08 Jan 2026 02:16:35 -0800 (PST)
Received: from localhost.localdomain ([240f:34:212d:1:8352:dfa:3b18:eb4e]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2a3e3c49299sm73785245ad.42.2026.01.08.02.16.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 08 Jan 2026 02:16:35 -0800 (PST)
From: Akinobu Mita
To: akinobu.mita@gmail.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, akpm@linux-foundation.org, axelrasmussen@google.com,
 yuanchu@google.com, weixugc@google.com, hannes@cmpxchg.org,
 david@kernel.org, mhocko@kernel.org, zhengqi.arch@bytedance.com,
 shakeel.butt@linux.dev, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
 surenb@google.com, bingjiao@google.com
Subject: [PATCH v3 3/3] mm/vmscan: don't demote if there is not enough free memory in the lower memory tier
Date: Thu, 8 Jan 2026 19:15:35 +0900
Message-ID: <20260108101535.50696-4-akinobu.mita@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260108101535.50696-1-akinobu.mita@gmail.com>
References: <20260108101535.50696-1-akinobu.mita@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

On systems with multiple memory-tiers consisting of DRAM and CXL memory,
the OOM killer is not invoked properly.

Here's the command to reproduce:

$ sudo swapoff -a
$ stress-ng --oomable -v --memrate 20 --memrate-bytes 10G \
  --memrate-rd-mbs 1 --memrate-wr-mbs 1

The memory usage is the number of workers specified with the --memrate
option multiplied by the buffer size specified with the --memrate-bytes
option, so please adjust it so that it exceeds the total size of the
installed DRAM and CXL memory.

If swap is disabled, you can usually expect the OOM killer to terminate
the stress-ng process when memory usage approaches the installed memory
size.

However, if multiple memory-tiers exist (multiple
/sys/devices/virtual/memory_tiering/memory_tier directories exist) and
/sys/kernel/mm/numa/demotion_enabled is true, the OOM killer will not be
invoked and the system will become inoperable, regardless of whether MGLRU
is enabled or not.

This issue can be reproduced using NUMA emulation even on systems with
only DRAM.  You can create two fake memory tiers by booting a single-node
system with the "numa=fake=2 numa_emulation.adistance=576,704" kernel
parameters.

The reason for this issue is that memory allocations do not directly
trigger the oom-killer, on the assumption that if the target node has an
underlying memory tier, its memory can always be reclaimed by demotion.

So this change avoids this issue by not attempting to demote if the
underlying node has less free memory than the minimum watermark, so that
the oom-killer is triggered directly from memory allocations.

Signed-off-by: Akinobu Mita
---
v3:
- rebase to linux-next (next-20260108), where demotion target has changed
  from node id to node mask.
v2:
- describe reproducibility with !mglru in the commit log
- removed unnecessary consideration for scan control when checking
  demotion_nid watermarks

 mm/vmscan.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a34cf784e131..9a4b12ef6b53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -358,7 +358,21 @@ static bool can_demote(int nid, struct scan_control *sc,
 
 	/* Filter out nodes that are not in cgroup's mems_allowed. */
 	mem_cgroup_node_filter_allowed(memcg, &allowed_mask);
-	return !nodes_empty(allowed_mask);
+	if (nodes_empty(allowed_mask))
+		return false;
+
+	for_each_node_mask(nid, allowed_mask) {
+		int z;
+		struct zone *zone;
+		struct pglist_data *pgdat = NODE_DATA(nid);
+
+		for_each_managed_zone_pgdat(zone, pgdat, z, MAX_NR_ZONES - 1) {
+			if (zone_watermark_ok(zone, 0, min_wmark_pages(zone),
+					      ZONE_MOVABLE, 0))
+				return true;
+		}
+	}
+	return false;
 }
 
 static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
-- 
2.43.0
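
For readers who want to experiment outside the kernel, the decision added
to can_demote() can be modelled roughly as below. This is a simplified
user-space sketch, not the patch itself: struct toy_zone, struct toy_node
and toy_can_demote() are hypothetical stand-ins, and the plain comparison
ignores details that zone_watermark_ok() handles in the kernel, such as
lowmem reserves and the allocation order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a kernel zone. */
struct toy_zone {
	unsigned long free_pages;	/* pages currently free */
	unsigned long min_wmark;	/* minimum watermark, in pages */
};

/* Hypothetical stand-in for one candidate demotion target node. */
struct toy_node {
	const struct toy_zone *zones;
	size_t nr_zones;
};

/*
 * Return true if at least one zone of at least one candidate node has
 * more free pages than its minimum watermark. An empty candidate set
 * yields false, mirroring the nodes_empty() early return.
 */
bool toy_can_demote(const struct toy_node *targets, size_t nr_targets)
{
	assert(targets != NULL || nr_targets == 0);

	for (size_t i = 0; i < nr_targets; i++) {
		for (size_t z = 0; z < targets[i].nr_zones; z++) {
			const struct toy_zone *zone = &targets[i].zones[z];

			if (zone->free_pages > zone->min_wmark)
				return true;
		}
	}
	return false;
}
```

In this model, a lower tier whose zones are all at or below their minimum
watermark makes toy_can_demote() return false, which corresponds to the
case where reclaim stops counting on demotion and the allocation path can
reach the oom-killer.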