From: Huang Ying
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Arjan Van De Ven,
    Huang Ying, Mel Gorman, Vlastimil Babka, David Hildenbrand,
    Johannes Weiner, Dave Hansen, Michal Hocko, Pavel Tatashin,
    Matthew Wilcox, Christoph Lameter
Subject: [PATCH -V2 08/10] mm, pcp: decrease PCP high if free pages < high watermark
Date: Tue, 26 Sep 2023 14:09:09 +0800
Message-Id: <20230926060911.266511-9-ying.huang@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230926060911.266511-1-ying.huang@intel.com>
References: <20230926060911.266511-1-ying.huang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

One goal of PCP is to keep the number of pages held in PCP small when
the system is short of free pages.  To reach that goal, when page
reclaim is active for the zone (ZONE_RECLAIM_ACTIVE), we stop
increasing PCP high in the allocating path, and we decrease PCP high
and free some pages in the freeing path.  But this may be too late,
because background page reclaim may introduce latency for some
workloads.  So, in this patch, during page allocation we detect
whether the number of free pages in the zone is below the high
watermark.  If so, we stop increasing PCP high in the allocating path,
and we decrease PCP high and free some pages in the freeing path.
This reduces the possibility of premature background page reclaim
caused by a too-large PCP.

The high watermark check is done in the allocating path to reduce the
overhead in the hotter freeing path.

Signed-off-by: "Huang, Ying"
Cc: Andrew Morton
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Pavel Tatashin
Cc: Matthew Wilcox
Cc: Christoph Lameter
---
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 22 ++++++++++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index d6cfb5023f3e..8a19e2af89df 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1006,6 +1006,7 @@ enum zone_flags {
 					 * Cleared when kswapd is woken.
 					 */
 	ZONE_RECLAIM_ACTIVE,		/* kswapd may be scanning the zone. */
+	ZONE_BELOW_HIGH,		/* zone is below high watermark. */
 };
 
 static inline unsigned long zone_managed_pages(struct zone *zone)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d482a55235b..08b74c65b88a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2409,7 +2409,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
 		return min(batch << 2, pcp->high);
 	}
 
-	if (pcp->count >= high && high_min != high_max) {
+	if (high_min == high_max)
+		return high;
+
+	if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
+		pcp->high = max(high - (batch << pcp->free_factor), high_min);
+		high = max(pcp->count, high_min);
+	} else if (pcp->count >= high) {
 		int need_high = (batch << pcp->free_factor) + batch;
 
 		/* pcp->high should be large enough to hold batch freed pages */
@@ -2459,6 +2465,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	if (pcp->count >= high) {
 		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
 				   pcp, pindex);
+		if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
+		    zone_watermark_ok(zone, 0, high_wmark_pages(zone),
+				      ZONE_MOVABLE, 0))
+			clear_bit(ZONE_BELOW_HIGH, &zone->flags);
 	}
 }
 
@@ -2765,7 +2775,7 @@ static int nr_pcp_alloc(struct per_cpu_pages *pcp, struct zone *zone, int order)
 	 * If we had larger pcp->high, we could avoid to allocate from
 	 * zone.
 	 */
-	if (high_min != high_max && !test_bit(ZONE_RECLAIM_ACTIVE, &zone->flags))
+	if (high_min != high_max && !test_bit(ZONE_BELOW_HIGH, &zone->flags))
 		high = pcp->high = min(high + batch, high_max);
 
 	if (!order) {
@@ -3226,6 +3236,14 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 			}
 		}
 
+		mark = high_wmark_pages(zone);
+		if (zone_watermark_fast(zone, order, mark,
+					ac->highest_zoneidx, alloc_flags,
+					gfp_mask))
+			goto try_this_zone;
+		else if (!test_bit(ZONE_BELOW_HIGH, &zone->flags))
+			set_bit(ZONE_BELOW_HIGH, &zone->flags);
+
 		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK);
 		if (!zone_watermark_fast(zone, order, mark,
 				       ac->highest_zoneidx, alloc_flags,
-- 
2.39.2
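
For readers who want to step through the flag's life cycle outside the
kernel, the following is a minimal user-space sketch of the behaviour
described in the changelog. It is not kernel code. The toy_zone and
toy_pcp types and every toy_* function are hypothetical stand-ins, and a
plain comparison of free pages against the watermark replaces
zone_watermark_fast() and zone_watermark_ok(), which do more work in the
real allocator. Only the arithmetic on high, high_min, high_max, batch
and free_factor follows the hunks in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not the kernel's struct zone
 * or struct per_cpu_pages. */
struct toy_zone {
	long free_pages;	/* free pages in the zone */
	long high_wmark;	/* high watermark, in pages */
	bool below_high;	/* plays the role of ZONE_BELOW_HIGH */
};

struct toy_pcp {
	int count;		/* pages currently cached on this CPU */
	int high;		/* current auto-tuned limit */
	int high_min;
	int high_max;
	int batch;
	int free_factor;
};

static int max_int(int a, int b) { return a > b ? a : b; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Allocating path: set the flag once free pages drop below the high
 * watermark. Assumption: a plain comparison replaces the real check. */
static void toy_alloc_check(struct toy_zone *zone)
{
	if (zone->free_pages < zone->high_wmark)
		zone->below_high = true;
}

/* Allocating path: grow high by one batch only while the flag is clear. */
static int toy_grow_high(struct toy_pcp *pcp, const struct toy_zone *zone)
{
	if (pcp->high_min != pcp->high_max && !zone->below_high)
		pcp->high = min_int(pcp->high + pcp->batch, pcp->high_max);
	return pcp->high;
}

/* Freeing path: while the flag is set, shrink pcp->high and return
 * max(count, high_min), so the caller's "count >= high" test passes
 * whenever count has reached high_min. */
static int toy_free_limit(struct toy_pcp *pcp, const struct toy_zone *zone)
{
	int high = pcp->high;

	assert(pcp->high_min <= pcp->high_max);
	if (pcp->high_min == pcp->high_max)
		return high;	/* auto-tuning disabled */
	if (zone->below_high) {
		pcp->high = max_int(high - (pcp->batch << pcp->free_factor),
				    pcp->high_min);
		high = max_int(pcp->count, pcp->high_min);
	}
	return high;
}

/* Freeing path, after a bulk free: clear the flag once the zone is back
 * at or above the high watermark. */
static void toy_free_recheck(struct toy_zone *zone)
{
	if (zone->below_high && zone->free_pages >= zone->high_wmark)
		zone->below_high = false;
}
```

In this model, each call to toy_free_limit() made while the flag is set
lowers high by batch << free_factor, never going below high_min, and
toy_grow_high() leaves high untouched until toy_free_recheck() clears the
flag. The watermark test sits in toy_alloc_check() rather than in the
freeing helpers, which mirrors the changelog's point about keeping the
check out of the hotter freeing path.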