From nobody Fri Feb 13 16:37:03 2026 Received: from out-181.mta1.migadu.com (out-181.mta1.migadu.com [95.215.58.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AF437381D1 for ; Fri, 24 May 2024 08:57:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541062; cv=none; b=SzmTwJvSmHhTvx3bJ5dkbWpLXNn3mpQRwnY5sSyUsrj+RotXORue4MbI5sGl/XM2XearG3Zpw6tZa3xe12pNVjXR9QBQFj5CN1QZFVIX72XFFvIV5jgO5pa9s3REOwPBWnIR3PuCHd0kY5I10uucW72xeFY46wMxwkGQNkVN5Q8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541062; c=relaxed/simple; bh=JxsnAPdpaUy+vf0sl0PZjYU5lyGNxjQTcWpShsy4QXs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=GlJTAHtmSaZ3VE0/hgeBVvNGWl1FDXZ93cA7y0fmyHl+6w/aGnSukifrJmid+/ijtzqETnVdN+H/ooEII80YwLKq41pexBiD/czUygNVR0rjdDa0v6VE8ZOZFI/hJvVeW308evd2g91tDJbQWhJwHytpYoctiPLZ8LOa2Ca06Jw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=KzK2wgoY; arc=none smtp.client-ip=95.215.58.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="KzK2wgoY" X-Envelope-To: linux-mm@kvack.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1716541057; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YHfQsO4NVTRNBD9ihuMdP3lZwpfoz/pqOUOarvdBcXM=; b=KzK2wgoYer0vn4GzZWGWDfZ/tEhEAAuly8IRy9wsz+ICAhFVcu1u7w0U2xgEzapA1beUxr Cg2RxMRwAs8u8ZeUyZCjq5yDfqpMWfTtf7foWhd11ZGZBNVmSH3cwAsudxn5P3p1OhqWa2 1T//kN+DQM8ZtlnL6O0rW8oo4QuAuMg= X-Envelope-To: hughd@google.com X-Envelope-To: chengming.zhou@linux.dev X-Envelope-To: zhouchengming@bytedance.com X-Envelope-To: shr@devkernel.io X-Envelope-To: david@redhat.com X-Envelope-To: akpm@linux-foundation.org X-Envelope-To: aarcange@redhat.com X-Envelope-To: linux-kernel@vger.kernel.org X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Chengming Zhou Date: Fri, 24 May 2024 16:56:50 +0800 Subject: [PATCH 1/4] mm/ksm: refactor out try_to_merge_with_zero_page() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240524-b4-ksm-scan-optimize-v1-1-053b31bd7ab4@linux.dev> References: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> In-Reply-To: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> To: Andrew Morton , david@redhat.com, aarcange@redhat.com, hughd@google.com, shr@devkernel.io Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, Chengming Zhou X-Developer-Signature: v=1; a=ed25519-sha256; t=1716541051; l=3157; i=chengming.zhou@linux.dev; s=20240508; h=from:subject:message-id; bh=JxsnAPdpaUy+vf0sl0PZjYU5lyGNxjQTcWpShsy4QXs=; b=+IclB6lAEN9i9mpmGfuCQ9JjgV8dqoL221kOWpl1/LyLcUwK7iGdiv9V31dGtFNFcnBd483w1 oZJ5Bw1PSUwBuBpDWFbZ8IR5pOHuMwXOhMQ8PA7CAvcviaO3yj6FgCI X-Developer-Key: i=chengming.zhou@linux.dev; a=ed25519; pk=kx40VUetZeR6MuiqrM7kPCcGakk1md0Az5qHwb6gBdU= X-Migadu-Flow: FLOW_OUT In preparation for later changes, refactor out a new function called 
try_to_merge_with_zero_page(), which tries to merge with zero page. Signed-off-by: Chengming Zhou --- mm/ksm.c | 67 +++++++++++++++++++++++++++++++++++-------------------------= ---- 1 file changed, 37 insertions(+), 30 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index 4dc707d175fa..cbd4ba7ea974 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1531,6 +1531,41 @@ static int try_to_merge_one_page(struct vm_area_stru= ct *vma, return err; } =20 +/* This function returns 0 if the pages were merged, -EFAULT otherwise. */ +static int try_to_merge_with_zero_page(struct ksm_rmap_item *rmap_item, + struct page *page) +{ + struct mm_struct *mm =3D rmap_item->mm; + int err =3D -EFAULT; + + /* + * Same checksum as an empty page. We attempt to merge it with the + * appropriate zero page if the user enabled this via sysfs. + */ + if (ksm_use_zero_pages && (rmap_item->oldchecksum =3D=3D zero_checksum)) { + struct vm_area_struct *vma; + + mmap_read_lock(mm); + vma =3D find_mergeable_vma(mm, rmap_item->address); + if (vma) { + err =3D try_to_merge_one_page(vma, page, + ZERO_PAGE(rmap_item->address)); + trace_ksm_merge_one_page( + page_to_pfn(ZERO_PAGE(rmap_item->address)), + rmap_item, mm, err); + } else { + /* + * If the vma is out of date, we do not need to + * continue. + */ + err =3D 0; + } + mmap_read_unlock(mm); + } + + return err; +} + /* * try_to_merge_with_ksm_page - like try_to_merge_two_pages, * but no new kernel page is allocated: kpage must already be a ksm page. @@ -2305,7 +2340,6 @@ static void stable_tree_append(struct ksm_rmap_item *= rmap_item, */ static noinline void cmp_and_merge_page(struct page *page, struct ksm_rmap= _item *rmap_item) { - struct mm_struct *mm =3D rmap_item->mm; struct ksm_rmap_item *tree_rmap_item; struct page *tree_page =3D NULL; struct ksm_stable_node *stable_node; @@ -2374,36 +2408,9 @@ static noinline void cmp_and_merge_page(struct page = *page, struct ksm_rmap_item return; } =20 - /* - * Same checksum as an empty page. 
We attempt to merge it with the - * appropriate zero page if the user enabled this via sysfs. - */ - if (ksm_use_zero_pages && (checksum =3D=3D zero_checksum)) { - struct vm_area_struct *vma; + if (!try_to_merge_with_zero_page(rmap_item, page)) + return; =20 - mmap_read_lock(mm); - vma =3D find_mergeable_vma(mm, rmap_item->address); - if (vma) { - err =3D try_to_merge_one_page(vma, page, - ZERO_PAGE(rmap_item->address)); - trace_ksm_merge_one_page( - page_to_pfn(ZERO_PAGE(rmap_item->address)), - rmap_item, mm, err); - } else { - /* - * If the vma is out of date, we do not need to - * continue. - */ - err =3D 0; - } - mmap_read_unlock(mm); - /* - * In case of failure, the page was not really empty, so we - * need to continue. Otherwise we're done. - */ - if (!err) - return; - } tree_rmap_item =3D unstable_tree_search_insert(rmap_item, page, &tree_page); if (tree_rmap_item) { --=20 2.45.1 From nobody Fri Feb 13 16:37:03 2026 Received: from out-174.mta1.migadu.com (out-174.mta1.migadu.com [95.215.58.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8EFC7129A77 for ; Fri, 24 May 2024 08:57:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541065; cv=none; b=ojHPHArpZq4kEqvp9d/9YrwBiRvyzJ1VBT92rndCD/aPz/f4AcKEFACGPhQTNfG8REase+DQd4Hmn7vil6/JgMqIPl7dCUYf85PmPSWHt0GgFy1uCTJUGz5m3Z73oI0ktnsRucPojP79m0GEEnGNNtiVyIJ0G68p+MwBYHktblo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541065; c=relaxed/simple; bh=1G8p4ukUYdMOk654nZZJtD5LJSaJbjnQHPtspc5amuo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=eu5jCUqbmmvvKQBc1IRZc2ZzzvdWyTqgMoAPMWiHrXMRT/nuXm1x+9zL0VcYaTe8YK9wsDTYWfmBgsFGWO3mvvuf7JSY4g36S+uI6JfCDXv/9UPyNhRnuT5thjaQ1myjDaeTXKwrDz9HKQ0ARA1PQXFqcpdO2Az7xNmcK+79Bfg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=E5m6I8de; arc=none smtp.client-ip=95.215.58.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="E5m6I8de" X-Envelope-To: linux-mm@kvack.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1716541060; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IrbgcBme4gf384UGODzdSyOVXcWHfhExwH5eqJcgItc=; b=E5m6I8deIoqw6vq/84vM96nA3u/7CWXC5h8vgVpVwMJUh9OwqPhQXgVDBobdnZCcRDd2Gc B5HXL/d0UK9N+vBSADHhJzUu5BnEttIHr756vbu3OZuMvZ340tmEvLOGgNm72mD3NyGRsH zGXnHf/1q761IrZcj6NTN3l/RrrjkK0= X-Envelope-To: hughd@google.com X-Envelope-To: chengming.zhou@linux.dev X-Envelope-To: zhouchengming@bytedance.com X-Envelope-To: shr@devkernel.io X-Envelope-To: david@redhat.com X-Envelope-To: akpm@linux-foundation.org X-Envelope-To: aarcange@redhat.com X-Envelope-To: linux-kernel@vger.kernel.org X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: Chengming Zhou Date: Fri, 24 May 2024 16:56:51 +0800 Subject: [PATCH 2/4] mm/ksm: don't waste time searching stable tree for fast changing page Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240524-b4-ksm-scan-optimize-v1-2-053b31bd7ab4@linux.dev> References: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> In-Reply-To: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> To: Andrew Morton , david@redhat.com, aarcange@redhat.com, hughd@google.com, shr@devkernel.io Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, Chengming Zhou X-Developer-Signature: v=1; a=ed25519-sha256; t=1716541051; l=2614; i=chengming.zhou@linux.dev; s=20240508; h=from:subject:message-id; bh=1G8p4ukUYdMOk654nZZJtD5LJSaJbjnQHPtspc5amuo=; b=CYjqK00M2lLdR7+WKq2pN5yhtp9lGVF60p4EeSzgxNV31x3wCoi5EIyOTEKvBZMWRWTvsRgmg N2x1EYuvlGYA6VQKuU6yVelvTy0n7Rt4HTLV50/OImvqxM243FfBbNl X-Developer-Key: i=chengming.zhou@linux.dev; a=ed25519; pk=kx40VUetZeR6MuiqrM7kPCcGakk1md0Az5qHwb6gBdU= X-Migadu-Flow: FLOW_OUT

The code flow in cmp_and_merge_page() is suboptimal because it handles
ksm pages and non-ksm pages on the same path. For example:

- ksm page

 1. Mostly we just return, if the ksm page has not been migrated and
    this rmap_item is already on the rmap hlist. Otherwise we have to
    fix up the rmap_item mapping.
 2. We never need to checksum a ksm page, since its content can't
    change.

- non-ksm page

 1. Don't waste time searching the stable tree if the page is changing
    quickly.
 2. Try to merge with the zero page before searching the stable tree.
 3. Then search the stable tree for a mergeable ksm page.

This patch reorders the code flow so that the difference in handling
between ksm pages and non-ksm pages is clearer, and the scan is more
efficient too.
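The order of the early checks for a non-ksm page can be sketched in user space as below. This is only an illustrative model, not the kernel code: struct rmap_item_model, enum scan_action, classify_non_ksm_page() and the zero_merge_ok flag (standing in for the outcome of the zero-page merge attempt) are hypothetical names introduced for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Possible outcomes of the early checks for a non-ksm page. */
enum scan_action {
	SCAN_SKIP_CHANGED,   /* checksum changed since the last scan */
	SCAN_DONE_ZERO_PAGE, /* handled by the zero-page path */
	SCAN_SEARCH_STABLE,  /* go on to search the stable tree */
};

/* Hypothetical stand-in for the one rmap_item field used here. */
struct rmap_item_model {
	uint32_t oldchecksum;
};

/*
 * Model of the reordered checks: a changing page is skipped first,
 * the zero page is tried second, and only then is the stable tree
 * searched. zero_merge_ok is an assumption of this sketch.
 */
static enum scan_action
classify_non_ksm_page(struct rmap_item_model *item, uint32_t checksum,
		      uint32_t zero_checksum, bool use_zero_pages,
		      bool zero_merge_ok)
{
	if (item->oldchecksum != checksum) {
		/* Remember the new checksum and retry on a later scan. */
		item->oldchecksum = checksum;
		return SCAN_SKIP_CHANGED;
	}
	if (use_zero_pages && checksum == zero_checksum && zero_merge_ok)
		return SCAN_DONE_ZERO_PAGE;
	return SCAN_SEARCH_STABLE;
}
```

In this model a page whose checksum differs from the stored one never reaches the stable tree search, which is the saving the patch is after.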
Signed-off-by: Chengming Zhou --- mm/ksm.c | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index cbd4ba7ea974..2424081f386e 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2366,6 +2366,23 @@ static noinline void cmp_and_merge_page(struct page = *page, struct ksm_rmap_item */ if (!is_page_sharing_candidate(stable_node)) max_page_sharing_bypass =3D true; + } else { + remove_rmap_item_from_tree(rmap_item); + + /* + * If the hash value of the page has changed from the last time + * we calculated it, this page is changing frequently: therefore we + * don't want to insert it in the unstable tree, and we don't want + * to waste our time searching for something identical to it there. + */ + checksum =3D calc_checksum(page); + if (rmap_item->oldchecksum !=3D checksum) { + rmap_item->oldchecksum =3D checksum; + return; + } + + if (!try_to_merge_with_zero_page(rmap_item, page)) + return; } =20 /* We first start with searching the page inside the stable tree */ @@ -2396,21 +2413,6 @@ static noinline void cmp_and_merge_page(struct page = *page, struct ksm_rmap_item return; } =20 - /* - * If the hash value of the page has changed from the last time - * we calculated it, this page is changing frequently: therefore we - * don't want to insert it in the unstable tree, and we don't want - * to waste our time searching for something identical to it there. 
- */ - checksum =3D calc_checksum(page); - if (rmap_item->oldchecksum !=3D checksum) { - rmap_item->oldchecksum =3D checksum; - return; - } - - if (!try_to_merge_with_zero_page(rmap_item, page)) - return; - tree_rmap_item =3D unstable_tree_search_insert(rmap_item, page, &tree_page); if (tree_rmap_item) { --=20 2.45.1 From nobody Fri Feb 13 16:37:03 2026 Received: from out-174.mta1.migadu.com (out-174.mta1.migadu.com [95.215.58.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03DDB129E94 for ; Fri, 24 May 2024 08:57:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541067; cv=none; b=BH4SNgYalwFMr9qU4cDHW2n2ZwZ4AwW2AxqXtZWoTPHkVaNUNATfUmkvNiVdQ39SouordkSutmJegaX+LlAUeAM8crkqEKyaUMUHX2lmJUYXb4OE92hQoZ1p0DU/q69mYDpzQLKhaRsx+nAC8vcmw3PFt6s01O5PFhdwS3tm3r4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541067; c=relaxed/simple; bh=B3mWul1z83IaYeQ2cUT6DdRB2E/1FafRRpZB8x99BtA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=OaKpNU41K6HJJgiGZKjcRQlOIJSyS57yV+c73brfgLqnlAbaRmcywFItFAuB770AuWlD98JVAplUuoXrp1M/XmQWP14kc/qVNZYQiQB/rE+4OXnX23k/pX67B8pnMiIf2kOAzhG62+8GaK4B+CHVR7Xiok7BEdUrm/e5Blh9Ujk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=nsleZO40; arc=none smtp.client-ip=95.215.58.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev 
header.b="nsleZO40" X-Envelope-To: linux-mm@kvack.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1716541063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9kGF9w1QUnOSltohj+L0P/TzAsAcBq0IIlZW6mywScQ=; b=nsleZO401TgVx5KF9wb+ceO8era7Jem4f+cMSs9YNzbTHmueEL9+9Y9c8sTtLrNEBtJsGh TVWPzIMkfKyekJKQrQz94vOH7zwFzGNrq1TfyoiswZIpfmVgHm/fUh5BoRL9zHDQe9LJkH BgwGtPqzyKjQ5I8hRzTnr0ocAj0w5Xo= X-Envelope-To: hughd@google.com X-Envelope-To: chengming.zhou@linux.dev X-Envelope-To: zhouchengming@bytedance.com X-Envelope-To: shr@devkernel.io X-Envelope-To: david@redhat.com X-Envelope-To: akpm@linux-foundation.org X-Envelope-To: aarcange@redhat.com X-Envelope-To: linux-kernel@vger.kernel.org X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Chengming Zhou Date: Fri, 24 May 2024 16:56:52 +0800 Subject: [PATCH 3/4] mm/ksm: optimize the chain()/chain_prune() interfaces Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240524-b4-ksm-scan-optimize-v1-3-053b31bd7ab4@linux.dev> References: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> In-Reply-To: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> To: Andrew Morton , david@redhat.com, aarcange@redhat.com, hughd@google.com, shr@devkernel.io Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, Chengming Zhou X-Developer-Signature: v=1; a=ed25519-sha256; t=1716541051; l=9550; i=chengming.zhou@linux.dev; s=20240508; h=from:subject:message-id; bh=B3mWul1z83IaYeQ2cUT6DdRB2E/1FafRRpZB8x99BtA=; 
b=nzbFXMDobe1UukjngEHFuUwYFMRq/StBqzcrTRZ6CxoVvEgcltAoBkvGjF5A0L2B94mAqC+tM 3OkiFr8H+y1DAWKpNEe8YR0LIoUUYoc9WUzqfX4EpwozFF5S//xFomx X-Developer-Key: i=chengming.zhou@linux.dev; a=ed25519; pk=kx40VUetZeR6MuiqrM7kPCcGakk1md0Az5qHwb6gBdU= X-Migadu-Flow: FLOW_OUT

The current implementation of stable_node_dup() makes the
chain()/chain_prune() interfaces and their callers overcomplicated.

Why? stable_node_dup() only finds and returns a stable_node that is a
candidate for sharing, so the callers have to check again with
stable_node_dup_any() whether any non-candidate stable_node exists, and
then call ksm_get_folio() on it again.

Instead, stable_node_dup() can simply return the best stable_node it can
find, and the callers can then check whether it is a candidate for
sharing. This simplifies the code and leaves fewer corner cases: for
example, stable_node and stable_node_dup can't be NULL if the returned
tree_folio is not NULL.

Signed-off-by: Chengming Zhou --- mm/ksm.c | 152 ++++++++++++-----------------------------------------------= ---- 1 file changed, 27 insertions(+), 125 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index 2424081f386e..f923699452ed 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1660,7 +1660,6 @@ static struct folio *stable_node_dup(struct ksm_stabl= e_node **_stable_node_dup, struct ksm_stable_node *dup, *found =3D NULL, *stable_node =3D *_stable_n= ode; struct hlist_node *hlist_safe; struct folio *folio, *tree_folio =3D NULL; - int nr =3D 0; int found_rmap_hlist_len; =20 if (!prune_stale_stable_nodes || @@ -1687,33 +1686,26 @@ static struct folio *stable_node_dup(struct ksm_sta= ble_node **_stable_node_dup, folio =3D ksm_get_folio(dup, KSM_GET_FOLIO_NOLOCK); if (!folio) continue; - nr +=3D 1; - if (is_page_sharing_candidate(dup)) { - if (!found || - dup->rmap_hlist_len > found_rmap_hlist_len) { - if (found) - folio_put(tree_folio); - found =3D dup; - found_rmap_hlist_len =3D found->rmap_hlist_len; - tree_folio =3D folio; - - /* skip put_page for found dup */ - if (!prune_stale_stable_nodes) - break; - continue; 
- } + /* Pick the best candidate if possible. */ + if (!found || (is_page_sharing_candidate(dup) && + (!is_page_sharing_candidate(found) || + dup->rmap_hlist_len > found_rmap_hlist_len))) { + if (found) + folio_put(tree_folio); + found =3D dup; + found_rmap_hlist_len =3D found->rmap_hlist_len; + tree_folio =3D folio; + /* skip put_page for found candidate */ + if (!prune_stale_stable_nodes && + is_page_sharing_candidate(found)) + break; + continue; } folio_put(folio); } =20 if (found) { - /* - * nr is counting all dups in the chain only if - * prune_stale_stable_nodes is true, otherwise we may - * break the loop at nr =3D=3D 1 even if there are - * multiple entries. - */ - if (prune_stale_stable_nodes && nr =3D=3D 1) { + if (hlist_is_singular_node(&found->hlist_dup, &stable_node->hlist)) { /* * If there's not just one entry it would * corrupt memory, better BUG_ON. In KSM @@ -1765,25 +1757,15 @@ static struct folio *stable_node_dup(struct ksm_sta= ble_node **_stable_node_dup, hlist_add_head(&found->hlist_dup, &stable_node->hlist); } + } else { + /* Its hlist must be empty if no one found. */ + free_stable_node_chain(stable_node, root); } =20 *_stable_node_dup =3D found; return tree_folio; } =20 -static struct ksm_stable_node *stable_node_dup_any(struct ksm_stable_node = *stable_node, - struct rb_root *root) -{ - if (!is_stable_node_chain(stable_node)) - return stable_node; - if (hlist_empty(&stable_node->hlist)) { - free_stable_node_chain(stable_node, root); - return NULL; - } - return hlist_entry(stable_node->hlist.first, - typeof(*stable_node), hlist_dup); -} - /* * Like for ksm_get_folio, this function can free the *_stable_node and * *_stable_node_dup if the returned tree_page is NULL. 
@@ -1804,17 +1786,10 @@ static struct folio *__stable_node_chain(struct ksm= _stable_node **_stable_node_d bool prune_stale_stable_nodes) { struct ksm_stable_node *stable_node =3D *_stable_node; + if (!is_stable_node_chain(stable_node)) { - if (is_page_sharing_candidate(stable_node)) { - *_stable_node_dup =3D stable_node; - return ksm_get_folio(stable_node, KSM_GET_FOLIO_NOLOCK); - } - /* - * _stable_node_dup set to NULL means the stable_node - * reached the ksm_max_page_sharing limit. - */ - *_stable_node_dup =3D NULL; - return NULL; + *_stable_node_dup =3D stable_node; + return ksm_get_folio(stable_node, KSM_GET_FOLIO_NOLOCK); } return stable_node_dup(_stable_node_dup, _stable_node, root, prune_stale_stable_nodes); @@ -1828,16 +1803,10 @@ static __always_inline struct folio *chain_prune(st= ruct ksm_stable_node **s_n_d, } =20 static __always_inline struct folio *chain(struct ksm_stable_node **s_n_d, - struct ksm_stable_node *s_n, + struct ksm_stable_node **s_n, struct rb_root *root) { - struct ksm_stable_node *old_stable_node =3D s_n; - struct folio *tree_folio; - - tree_folio =3D __stable_node_chain(s_n_d, &s_n, root, false); - /* not pruning dups so s_n cannot have changed */ - VM_BUG_ON(s_n !=3D old_stable_node); - return tree_folio; + return __stable_node_chain(s_n_d, s_n, root, false); } =20 /* @@ -1855,7 +1824,7 @@ static struct page *stable_tree_search(struct page *p= age) struct rb_root *root; struct rb_node **new; struct rb_node *parent; - struct ksm_stable_node *stable_node, *stable_node_dup, *stable_node_any; + struct ksm_stable_node *stable_node, *stable_node_dup; struct ksm_stable_node *page_node; struct folio *folio; =20 @@ -1879,45 +1848,7 @@ static struct page *stable_tree_search(struct page *= page) =20 cond_resched(); stable_node =3D rb_entry(*new, struct ksm_stable_node, node); - stable_node_any =3D NULL; tree_folio =3D chain_prune(&stable_node_dup, &stable_node, root); - /* - * NOTE: stable_node may have been freed by - * chain_prune() if the 
returned stable_node_dup is - * not NULL. stable_node_dup may have been inserted in - * the rbtree instead as a regular stable_node (in - * order to collapse the stable_node chain if a single - * stable_node dup was found in it). In such case the - * stable_node is overwritten by the callee to point - * to the stable_node_dup that was collapsed in the - * stable rbtree and stable_node will be equal to - * stable_node_dup like if the chain never existed. - */ - if (!stable_node_dup) { - /* - * Either all stable_node dups were full in - * this stable_node chain, or this chain was - * empty and should be rb_erased. - */ - stable_node_any =3D stable_node_dup_any(stable_node, - root); - if (!stable_node_any) { - /* rb_erase just run */ - goto again; - } - /* - * Take any of the stable_node dups page of - * this stable_node chain to let the tree walk - * continue. All KSM pages belonging to the - * stable_node dups in a stable_node chain - * have the same content and they're - * write protected at all times. Any will work - * fine to continue the walk. 
- */ - tree_folio =3D ksm_get_folio(stable_node_any, - KSM_GET_FOLIO_NOLOCK); - } - VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); if (!tree_folio) { /* * If we walked over a stale stable_node, @@ -1955,7 +1886,7 @@ static struct page *stable_tree_search(struct page *p= age) goto chain_append; } =20 - if (!stable_node_dup) { + if (!is_page_sharing_candidate(stable_node_dup)) { /* * If the stable_node is a chain and * we got a payload match in memcmp @@ -2064,9 +1995,6 @@ static struct page *stable_tree_search(struct page *p= age) return &folio->page; =20 chain_append: - /* stable_node_dup could be null if it reached the limit */ - if (!stable_node_dup) - stable_node_dup =3D stable_node_any; /* * If stable_node was a chain and chain_prune collapsed it, * stable_node has been updated to be the new regular @@ -2111,7 +2039,7 @@ static struct ksm_stable_node *stable_tree_insert(str= uct folio *kfolio) struct rb_root *root; struct rb_node **new; struct rb_node *parent; - struct ksm_stable_node *stable_node, *stable_node_dup, *stable_node_any; + struct ksm_stable_node *stable_node, *stable_node_dup; bool need_chain =3D false; =20 kpfn =3D folio_pfn(kfolio); @@ -2127,33 +2055,7 @@ static struct ksm_stable_node *stable_tree_insert(st= ruct folio *kfolio) =20 cond_resched(); stable_node =3D rb_entry(*new, struct ksm_stable_node, node); - stable_node_any =3D NULL; - tree_folio =3D chain(&stable_node_dup, stable_node, root); - if (!stable_node_dup) { - /* - * Either all stable_node dups were full in - * this stable_node chain, or this chain was - * empty and should be rb_erased. - */ - stable_node_any =3D stable_node_dup_any(stable_node, - root); - if (!stable_node_any) { - /* rb_erase just run */ - goto again; - } - /* - * Take any of the stable_node dups page of - * this stable_node chain to let the tree walk - * continue. All KSM pages belonging to the - * stable_node dups in a stable_node chain - * have the same content and they're - * write protected at all times. 
Any will work - * fine to continue the walk. - */ - tree_folio =3D ksm_get_folio(stable_node_any, - KSM_GET_FOLIO_NOLOCK); - } - VM_BUG_ON(!stable_node_dup ^ !!stable_node_any); + tree_folio =3D chain(&stable_node_dup, &stable_node, root); if (!tree_folio) { /* * If we walked over a stale stable_node, --=20 2.45.1 From nobody Fri Feb 13 16:37:03 2026 Received: from out-175.mta1.migadu.com (out-175.mta1.migadu.com [95.215.58.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0EA3712A16B for ; Fri, 24 May 2024 08:57:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541069; cv=none; b=WRcCWMVR0n6qu9tmBCkYow3Wsveb2VMf6BS6RP5en60FYGH8MT8aYiU5VNeT1xXbj4UsxuhWywzFJXMSrlMLw+eYghjLmG+LTppBxY8HTdP2CNjrs8JF5PQFZHjPTAaM1ddalFJ0WxSqid2EjVsAVR+cVfgfQqNNLKnNozclMfc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1716541069; c=relaxed/simple; bh=smYx6vKdhJOj0qp5JFWbHhMVKDAu8/fdVq+SRWlUWTA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XNDqGqSux4AyVb+mF4hv5hAMBueSHK0/2XcmhCEWEkOb556my1tqj92aQRYei/IcYuQ+k0PK9iLDCgdPo6sioQG5+6fSmm2UdzUjSRL8vAht15FU1IYuLfc297YOktW0qleIhXdvLgGoMoSKFGEiEgQfLXO7pv6UGxl6WeUc/z0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=hYgIUtR+; arc=none smtp.client-ip=95.215.58.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev 
header.b="hYgIUtR+" X-Envelope-To: linux-mm@kvack.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1716541066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dYG4t/EaNB7EWepqtfFni8zXj5gK0Ak5wPBBilgzMF4=; b=hYgIUtR+IGQiW5SYzAi6hn187p+6dLSPfSc5qCAwXfrCx9zJofdC5SRZC9uNmtLVNrr//S K35K1A3BY7lW01HjC1Jo6/eGc5/+mcoy71lAOeeLqpOAoODxFEmBtTCscyOa9YL6GanUCE SRBiSiD3MN3ajReabx4U2DoEUf+9QpQ= X-Envelope-To: hughd@google.com X-Envelope-To: chengming.zhou@linux.dev X-Envelope-To: zhouchengming@bytedance.com X-Envelope-To: shr@devkernel.io X-Envelope-To: david@redhat.com X-Envelope-To: akpm@linux-foundation.org X-Envelope-To: aarcange@redhat.com X-Envelope-To: linux-kernel@vger.kernel.org X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Chengming Zhou Date: Fri, 24 May 2024 16:56:53 +0800 Subject: [PATCH 4/4] mm/ksm: use ksm page itself if no another ksm page is found on stable tree Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20240524-b4-ksm-scan-optimize-v1-4-053b31bd7ab4@linux.dev> References: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> In-Reply-To: <20240524-b4-ksm-scan-optimize-v1-0-053b31bd7ab4@linux.dev> To: Andrew Morton , david@redhat.com, aarcange@redhat.com, hughd@google.com, shr@devkernel.io Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, Chengming Zhou X-Developer-Signature: v=1; a=ed25519-sha256; t=1716541051; l=2490; i=chengming.zhou@linux.dev; s=20240508; h=from:subject:message-id; bh=smYx6vKdhJOj0qp5JFWbHhMVKDAu8/fdVq+SRWlUWTA=; 
b=Uk5bDGiklVu1N1UJwFk4FaByr6X04WBgoDgj5019z6tb87qoXKOdmsjtjRjcvjab0iuVGGhhe RoMjv/9K6JPCH5inyUAKr53usGvYwINzGTYbvuRTf/CObYXOBS7Atoe X-Developer-Key: i=chengming.zhou@linux.dev; a=ed25519; pk=kx40VUetZeR6MuiqrM7kPCcGakk1md0Az5qHwb6gBdU= X-Migadu-Flow: FLOW_OUT

It's interesting that a mapped ksm page also needs to go through
stable_tree_search() instead of using stable_tree_insert() directly.
The reason is a minor optimization for a migrated ksm page that has
only one mapcount: in that case we can look for another ksm page that
is already on the stable tree to replace it.

But what if we can't find another shareable candidate on the stable
tree? Obviously, we should just return the ksm page itself if it has
been inserted on the tree. We shouldn't return NULL when no other ksm
page is found on the tree: we would still be mapped on this ksm page,
yet the rmap_item would be removed and inserted into the unstable tree.

We can ignore the is_page_sharing_candidate() check in this case, since
max_page_sharing_bypass is set to true in cmp_and_merge_page().
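The change in what the search hands back for an already-mapped ksm page can be sketched as a small user-space model. Everything here is hypothetical and for illustration only: node_model, page_model, sharing_candidate() and the two search_result_*() helpers are not kernel names, and the candidate predicate (at least one mapping and fewer than max_sharing mappings) is an assumption about what is_page_sharing_candidate() checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a stable node: only the sharing count. */
struct node_model {
	int rmap_hlist_len;
};

/* Hypothetical stand-in for a ksm page and its stable node. */
struct page_model {
	struct node_model *node;
};

/* Assumed predicate: still mapped and below the sharing limit. */
static bool sharing_candidate(const struct node_model *n, int max_sharing)
{
	return n->rmap_hlist_len > 0 && n->rmap_hlist_len < max_sharing;
}

/* Before the patch: a page at the sharing limit yields NULL. */
static struct page_model *search_result_old(struct page_model *page,
					    int max_sharing)
{
	return sharing_candidate(page->node, max_sharing) ? page : NULL;
}

/* After the patch: the ksm page itself is always returned. */
static struct page_model *search_result_new(struct page_model *page,
					    int max_sharing)
{
	(void)max_sharing; /* the limit is bypassed for this case */
	return page;
}
```

With a node that has already reached the limit, the old model returns NULL while the new one returns the page, which matches the case the message above describes.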
Signed-off-by: Chengming Zhou --- mm/ksm.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index f923699452ed..6dea83998258 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -1940,11 +1940,8 @@ static struct page *stable_tree_search(struct page *= page) rb_link_node(&page_node->node, parent, new); rb_insert_color(&page_node->node, root); out: - if (is_page_sharing_candidate(page_node)) { - folio_get(folio); - return &folio->page; - } else - return NULL; + folio_get(folio); + return &folio->page; =20 replace: /* @@ -1966,10 +1963,7 @@ static struct page *stable_tree_search(struct page *= page) rb_replace_node(&stable_node_dup->node, &page_node->node, root); - if (is_page_sharing_candidate(page_node)) - folio_get(folio); - else - folio =3D NULL; + folio_get(folio); } else { rb_erase(&stable_node_dup->node, root); folio =3D NULL; @@ -1982,10 +1976,7 @@ static struct page *stable_tree_search(struct page *= page) list_del(&page_node->list); DO_NUMA(page_node->nid =3D nid); stable_node_chain_add_dup(page_node, stable_node); - if (is_page_sharing_candidate(page_node)) - folio_get(folio); - else - folio =3D NULL; + folio_get(folio); } else { folio =3D NULL; } @@ -2009,7 +2000,7 @@ static struct page *stable_tree_search(struct page *p= age) stable_node =3D alloc_stable_node_chain(stable_node_dup, root); if (!stable_node) - return NULL; + goto out; } /* * Add this stable_node dup that was --=20 2.45.1