From: Bijan Tabatabai
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, david@kernel.org, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org, surenb@google.com, mhocko@suse.com, shivankg@amd.com, Bijan Tabatabai
Subject: [PATCH] mm: Consider non-anon swap cache folios in folio_expected_ref_count()
Date: Tue, 16 Dec 2025 14:07:27 -0600
Message-Id: <20251216200727.2360228-1-bijan311@gmail.com>

Currently, folio_expected_ref_count() only adds references for the swap cache if the folio is anonymous. However, according to the comment above the definition of PG_swapcache in enum pageflags, shmem folios can also have PG_swapcache set. This patch makes sure references for the swap cache are added whenever folio_test_swapcache(folio) is true.

This issue was found when trying to hot-unplug memory in a QEMU/KVM virtual machine. When hot-unplug was initiated while most of the guest memory was allocated, it hung partway through removal due to migration failures.
The following message would be printed several times, and would be printed again about every five seconds:

[   49.641309] migrating pfn b12f25 failed ret:7
[   49.641310] page: refcount:2 mapcount:0 mapping:0000000033bd8fe2 index:0x7f404d925 pfn:0xb12f25
[   49.641311] aops:swap_aops
[   49.641313] flags: 0x300000000030508(uptodate|active|owner_priv_1|reclaim|swapbacked|node=0|zone=3)
[   49.641314] raw: 0300000000030508 ffffed312c4bc908 ffffed312c4bc9c8 0000000000000000
[   49.641315] raw: 00000007f404d925 00000000000c823b 00000002ffffffff 0000000000000000
[   49.641315] page dumped because: migration failure

When debugging this, I found that these migration failures were due to __migrate_folio() returning -EAGAIN for a small set of folios, because the expected reference count it calculates via folio_expected_ref_count() is one less than the actual reference count of those folios. Furthermore, all of the affected folios were not anonymous but had the PG_swapcache flag set, which inspired this patch. After applying this patch, the memory hot-unplug behaves as expected.

I tested this on a machine running Ubuntu 24.04 with kernel version 6.8.0-90-generic and 64GB of memory. The guest VM is managed by libvirt and runs Ubuntu 24.04 with kernel version 6.18 (though the head of the mm-unstable branch as of Dec 16, 2025 was also tested and behaves the same) and 48GB of memory. The libvirt XML definition for the VM can be found at [1]. CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE is set in the guest kernel so the hot-pluggable memory is automatically onlined.
Below are the steps to reproduce this behavior:

1) Define and start the virtual machine

host$ virsh -c qemu:///system define ./test_vm.xml # test_vm.xml from [1]
host$ virsh -c qemu:///system start test_vm

2) Set up swap in the guest

guest$ sudo fallocate -l 32G /swapfile
guest$ sudo chmod 0600 /swapfile
guest$ sudo mkswap /swapfile
guest$ sudo swapon /swapfile

3) Use alloc_data [2] to allocate most of the remaining guest memory

guest$ ./alloc_data 45

4) In a separate guest terminal, monitor the amount of used memory

guest$ watch -n1 free -h

5) When alloc_data has finished allocating, initiate the memory hot-unplug using the provided XML file [3]

host$ virsh -c qemu:///system detach-device test_vm ./remove.xml --live

After initiating the memory hot-unplug, you should see the amount of available memory in the guest decrease and the amount of used swap data increase. If everything works as expected, when all of the memory is unplugged, there should be around 8.5-9GB of data in swap. If the unplugging is unsuccessful, the amount of used swap data will settle below that. If that happens, you should be able to see log messages in dmesg similar to the one posted above.

[1] https://github.com/BijanT/linux_patch_files/blob/main/test_vm.xml
[2] https://github.com/BijanT/linux_patch_files/blob/main/alloc_data.c
[3] https://github.com/BijanT/linux_patch_files/blob/main/remove.xml

Fixes: 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation")
Signed-off-by: Bijan Tabatabai
Acked-by: David Hildenbrand (Red Hat)
Reviewed-by: Baolin Wang
---
I am not very familiar with the memory hot-(un)plug or swapping code, so I am not 100% certain that this patch addresses the root of the problem. I believe the issue stems from shmem folios, in which case I believe this patch is correct. However, I couldn't think of an easy way to confirm that the affected folios were from shmem.
It is also possible that the root cause is some bug where certain anonymous pages do not return true from folio_test_anon(). I don't think that's the case, but I figured the MM maintainers would have a better idea of what's going on.
---
 include/linux/mm.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 15076261d0c2..6f959d8ca4b4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2459,10 +2459,10 @@ static inline int folio_expected_ref_count(const struct folio *folio)
 	if (WARN_ON_ONCE(page_has_type(&folio->page) && !folio_test_hugetlb(folio)))
 		return 0;
 
-	if (folio_test_anon(folio)) {
-		/* One reference per page from the swapcache. */
-		ref_count += folio_test_swapcache(folio) << order;
-	} else {
+	/* One reference per page from the swapcache. */
+	ref_count += folio_test_swapcache(folio) << order;
+
+	if (!folio_test_anon(folio)) {
 		/* One reference per page from the pagecache. */
 		ref_count += !!folio->mapping << order;
 		/* One reference from PG_private. */
-- 
2.43.0