From nobody Sat Feb 7 18:52:08 2026
From: "David Hildenbrand (Red Hat)"
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, Broadcom internal kernel review list, linux-doc@vger.kernel.org, virtualization@lists.linux.dev, "David Hildenbrand (Red Hat)", Andrew Morton, Oscar Salvador, Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jonathan Corbet, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Arnd Bergmann, Greg Kroah-Hartman, Jerrin Shaji George, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez, Zi Yan
Subject: [PATCH v3 09/24] mm/balloon_compaction: remove dependency on page lock
Date: Tue, 20 Jan 2026 00:01:17 +0100
Message-ID: <20260119230133.3551867-10-david@kernel.org>
In-Reply-To: <20260119230133.3551867-1-david@kernel.org>
References: <20260119230133.3551867-1-david@kernel.org>

Let's stop using the page lock in balloon code and instead rely only on
the balloon_pages_lock.

As soon as we set the PG_movable_ops flag, we might get isolation
callbacks for that page, because we are no longer holding the page lock.
In the callback, we'll simply synchronize using the balloon_pages_lock:
in balloon_page_isolate(), look up the balloon_dev_info through
page->private under the balloon_pages_lock. It's crucial that we update
page->private under the balloon_pages_lock, so the isolation callback
can properly deal with concurrent deflation.

Consequently, make sure that balloon_page_finalize() is called under the
balloon_pages_lock, as we remove the page from the list and clear
page->private.
balloon_page_insert() is already called with the balloon_pages_lock held.

Note that the core will still lock the pages, for example in
isolate_movable_ops_page(). There, the page lock is still relevant for
handling the PageMovableOpsIsolated flag, but that can later be changed
to use an atomic test-and-set instead, or be moved into the movable_ops
backends.

Acked-by: Michael S. Tsirkin
Signed-off-by: David Hildenbrand (Red Hat)
---
 include/linux/balloon_compaction.h | 25 ++++++++++----------
 mm/balloon_compaction.c            | 38 ++++++++++--------------------
 2 files changed, 25 insertions(+), 38 deletions(-)

diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index 9a8568fcd477d..ad594af6ed100 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -12,25 +12,27 @@
  * is derived from the page type (PageOffline()) combined with the
  * PG_movable_ops flag (PageMovableOps()).
  *
+ * Once the page type and the PG_movable_ops flag are set, migration code
+ * can initiate page isolation by invoking the
+ * movable_operations()->isolate_page() callback.
+ *
+ * As long as page->private is set, the page is either on the balloon list
+ * or isolated for migration. If page->private is not set, the page is
+ * either still getting inflated, or was deflated to be freed by the balloon
+ * driver soon. Isolation is impossible in both cases.
+ *
  * As the page isolation scanning step a compaction thread does is a lockless
  * procedure (from a page standpoint), it might bring some racy situations while
  * performing balloon page compaction. In order to sort out these racy scenarios
  * and safely perform balloon's page compaction and migration we must, always,
  * ensure following these simple rules:
  *
- * i. Setting the PG_movable_ops flag and page->private with the following
- *    lock order
- *      +-page_lock(page);
- *      +--spin_lock_irq(&balloon_pages_lock);
+ * i. Inflation/deflation must set/clear page->private under the
+ *    balloon_pages_lock
  *
  * ii. isolation or dequeueing procedure must remove the page from balloon
  *     device page list under balloon_pages_lock
  *
- * The functions provided by this interface are placed to help on coping with
- * the aforementioned balloon page corner case, as well as to ensure the simple
- * set of exposed rules are satisfied while we are dealing with balloon pages
- * compaction / migration.
- *
  * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini
  */
 #ifndef _LINUX_BALLOON_COMPACTION_H
@@ -93,8 +95,7 @@ static inline struct balloon_dev_info *balloon_page_device(struct page *page)
  * @balloon : pointer to balloon device
  * @page    : page to be assigned as a 'balloon page'
  *
- * Caller must ensure the page is locked and the spin_lock protecting balloon
- * pages list is held before inserting a page into the balloon device.
+ * Caller must ensure the balloon_pages_lock is held.
  */
 static inline void balloon_page_insert(struct balloon_dev_info *balloon,
				       struct page *page)
@@ -119,7 +120,7 @@ static inline gfp_t balloon_mapping_gfp_mask(void)
  * balloon list for release to the page allocator
  * @page: page to be released to the page allocator
  *
- * Caller must ensure that the page is locked.
+ * Caller must ensure the balloon_pages_lock is held.
  */
 static inline void balloon_page_finalize(struct page *page)
 {
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index a0fd779bbd012..75763c73dbd52 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -20,15 +20,7 @@ static DEFINE_SPINLOCK(balloon_pages_lock);
 static void balloon_page_enqueue_one(struct balloon_dev_info *b_dev_info,
				     struct page *page)
 {
-	/*
-	 * Block others from accessing the 'page' when we get around to
-	 * establishing additional references. We should be the only one
-	 * holding a reference to the 'page' at this point. If we are not, then
-	 * memory corruption is possible and we should stop execution.
-	 */
-	BUG_ON(!trylock_page(page));
 	balloon_page_insert(b_dev_info, page);
-	unlock_page(page);
 	if (b_dev_info->adjust_managed_page_count)
 		adjust_managed_page_count(page, -1);
 	__count_vm_event(BALLOON_INFLATE);
@@ -93,22 +85,12 @@ size_t balloon_page_list_dequeue(struct balloon_dev_info *b_dev_info,
 	list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) {
 		if (n_pages == n_req_pages)
 			break;
-
-		/*
-		 * Block others from accessing the 'page' while we get around to
-		 * establishing additional references and preparing the 'page'
-		 * to be released by the balloon driver.
-		 */
-		if (!trylock_page(page))
-			continue;
-
 		list_del(&page->lru);
 		if (b_dev_info->adjust_managed_page_count)
 			adjust_managed_page_count(page, 1);
 		balloon_page_finalize(page);
 		__count_vm_event(BALLOON_DEFLATE);
 		list_add(&page->lru, pages);
-		unlock_page(page);
 		dec_node_page_state(page, NR_BALLOON_PAGES);
 		n_pages++;
 	}
@@ -213,13 +195,19 @@ EXPORT_SYMBOL_GPL(balloon_page_dequeue);
 static bool balloon_page_isolate(struct page *page, isolate_mode_t mode)
 
 {
-	struct balloon_dev_info *b_dev_info = balloon_page_device(page);
+	struct balloon_dev_info *b_dev_info;
 	unsigned long flags;
 
-	if (!b_dev_info)
-		return false;
-
 	spin_lock_irqsave(&balloon_pages_lock, flags);
+	b_dev_info = balloon_page_device(page);
+	if (!b_dev_info) {
+		/*
+		 * The page already got deflated and removed from the
+		 * balloon list.
+		 */
+		spin_unlock_irqrestore(&balloon_pages_lock, flags);
+		return false;
+	}
 	list_del(&page->lru);
 	b_dev_info->isolated_pages++;
 	spin_unlock_irqrestore(&balloon_pages_lock, flags);
@@ -253,9 +241,6 @@ static int balloon_page_migrate(struct page *newpage, struct page *page,
 	unsigned long flags;
 	int rc;
 
-	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
-
 	/*
 	 * When we isolated the page, the page was still inflated in a balloon
 	 * device. As isolated balloon pages cannot get deflated, we still have
@@ -293,10 +278,11 @@ static int balloon_page_migrate(struct page *newpage, struct page *page,
 	}
 
 	b_dev_info->isolated_pages--;
-	spin_unlock_irqrestore(&balloon_pages_lock, flags);
 
 	/* Free the now-deflated page we isolated in balloon_page_isolate(). */
 	balloon_page_finalize(page);
+	spin_unlock_irqrestore(&balloon_pages_lock, flags);
+
 	put_page(page);
 
 	return 0;
-- 
2.52.0