From: Akinobu Mita
To: akinobu.mita@gmail.com
Cc: linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, akpm@linux-foundation.org,
	axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
	hannes@cmpxchg.org, david@kernel.org, mhocko@kernel.org,
	zhengqi.arch@bytedance.com, shakeel.butt@linux.dev,
	lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com
Subject: [PATCH 2/2] mm/vmscan: don't demote if there is not enough free memory in the lower memory tier
Date: Mon, 8 Dec 2025 18:40:28 +0900
Message-ID: <20251208094028.214949-3-akinobu.mita@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20251208094028.214949-1-akinobu.mita@gmail.com>
References: <20251208094028.214949-1-akinobu.mita@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On systems with multiple memory tiers consisting of DRAM and CXL memory,
the OOM killer is not invoked properly.
To reproduce, disable swap and run stress-ng:

  $ sudo swapoff -a
  $ stress-ng --oomable -v --memrate 20 --memrate-bytes 10G \
    --memrate-rd-mbs 1 --memrate-wr-mbs 1

The memory usage is the number of workers specified with the --memrate
option multiplied by the buffer size specified with the --memrate-bytes
option. Adjust both so that the product exceeds the total size of the
installed DRAM and CXL memory.

With swap disabled, you can usually expect the OOM killer to terminate
the stress-ng process when memory usage approaches the installed memory
size.

However, the OOM killer is not invoked and the system becomes inoperable
when all of the following hold:

- multiple memory tiers exist (multiple
  /sys/devices/virtual/memory_tiering/memory_tier directories exist)
- /sys/kernel/mm/numa/demotion_enabled is true
- /sys/kernel/mm/lru_gen/min_ttl_ms is 0

This issue can be reproduced using NUMA emulation even on systems with
only DRAM. You can create two fake memory tiers by booting a single-node
system with the "numa=fake=2 numa_emulation.adistance=576,704" kernel
parameters.

The cause is that when the node targeted for allocation has a lower
memory tier beneath it, reclaim always assumes that its pages can be
reclaimed via demotion, even when the lower tier has no room for them.

Avoid this by not attempting demotion when no zone in the demotion
target node has free memory above its minimum watermark.

Signed-off-by: Akinobu Mita
---
 mm/vmscan.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fddd168a9737..f4748f258294 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -356,7 +356,20 @@ static bool can_demote(int nid, struct scan_control *sc,
 		return false;
 
 	/* If demotion node isn't in the cgroup's mems_allowed, fall back */
-	return mem_cgroup_node_allowed(memcg, demotion_nid);
+	if (mem_cgroup_node_allowed(memcg, demotion_nid)) {
+		int z;
+		struct zone *zone;
+		struct pglist_data *pgdat = NODE_DATA(demotion_nid);
+		unsigned int highest_zoneidx = sc ? sc->reclaim_idx : MAX_NR_ZONES - 1;
+		int order = sc ? sc->order : 0;
+
+		for_each_managed_zone_pgdat(zone, pgdat, z, highest_zoneidx) {
+			if (zone_watermark_ok(zone, order, min_wmark_pages(zone),
+					      highest_zoneidx, 0))
+				return true;
+		}
+	}
+	return false;
 }
 
 static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
-- 
2.43.0
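
For readers who want to reason about the new check outside the kernel,
the decision made by the patched can_demote() can be modelled in user
space roughly as below. This is only a sketch under stated assumptions:
struct model_zone, struct model_node, model_watermark_ok() and
model_can_demote() are hypothetical names invented for illustration, not
kernel APIs, and the watermark test is reduced to an order-0 comparison
of free pages against the minimum watermark (no lowmem reserves, no
allocation flags, no higher-order free-list checks).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct zone fields the check needs. */
struct model_zone {
	unsigned long free_pages;	/* pages currently free in the zone */
	unsigned long min_wmark;	/* minimum watermark, in pages */
	bool managed;			/* false models a zone with no managed pages */
};

/* Hypothetical stand-in for the demotion target node (pglist_data). */
struct model_node {
	const struct model_zone *zones;
	size_t nr_zones;
};

/*
 * Simplified order-0 watermark test (assumption: lowmem reserves and
 * allocation flags are ignored). The zone passes only if its free pages
 * are strictly above the minimum watermark.
 */
static bool model_watermark_ok(const struct model_zone *zone)
{
	return zone->free_pages > zone->min_wmark;
}

/*
 * Model of the patched decision: demotion is considered possible only if
 * at least one managed zone of the target node, up to and including
 * highest_zoneidx, passes the watermark test.
 */
static bool model_can_demote(const struct model_node *target,
			     size_t highest_zoneidx)
{
	assert(target != NULL);

	for (size_t i = 0; i < target->nr_zones && i <= highest_zoneidx; i++) {
		const struct model_zone *zone = &target->zones[i];

		if (!zone->managed)
			continue;	/* unmanaged zones are skipped */
		if (model_watermark_ok(zone))
			return true;
	}
	return false;
}
```

In this model, a target node whose zones are all at or below their
minimum watermark makes model_can_demote() return false, which
corresponds to reclaim no longer assuming that demotion can free memory
on the upper tier.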