From nobody Fri Dec 19 20:54:13 2025
From: Zhiguo Niu
Subject: [PATCH RFC] f2fs: fix infinite foreground gc loop in f2fs_gc
Date: Wed, 3 Dec 2025 16:56:19 +0800
Message-ID: <1764752179-1936-1-git-send-email-zhiguo.niu@unisoc.com>
X-Mailer: git-send-email 1.9.1
X-Mailing-List: linux-kernel@vger.kernel.org

I'm currently encountering the same issue as described in commit
bbf9f7d90f21e ("f2fs: Fix indefinite loop in f2fs_gc()"), but that fix
only takes effect when CONFIG_F2FS_CHECK_FS is enabled and the blkaddr
check fails. It doesn't seem to cover all !is_alive cases, and
CONFIG_F2FS_CHECK_FS is currently disabled on Android devices.

Here's the problem flow:

1. Some high-pressure read/write/random power-down tests corrupted the
content of the nid=4 entry in the NAT block.
ino/block_addr are both corrupted to 0, and the mount log also
indicates this.

crash_arm64> f2fs_nat_block ffffff8097bfe000 -x
    version = 0x0,
    ino = 0x3,
    block_addr = 0x86da9
  }, {
    version = 0x0,
    ino = 0x0,
    block_addr = 0x0

[    6.495406] F2FS-fs (dm-56): quota file may be corrupted, skip loading it

2. Insufficient free space triggers foreground garbage collection (GC)
during boot.

crash_arm64> bt 1
PID: 1  TASK: ffffff80801cec00  CPU: 6  COMMAND: "init"
 [ffffffc00806b870] rwsem_read_trylock at ffffffc00826d5c8
 [ffffffc00806b980] gc_data_segment at ffffffc0088a2720
 [ffffffc00806ba40] do_garbage_collect at ffffffc0088a21d8
 [ffffffc00806bb60] f2fs_gc at ffffffc0088a17ac
 [ffffffc00806bbf0] f2fs_balance_fs at ffffffc0088cbf44
 [ffffffc00806bc20] f2fs_setattr at ffffffc00885e67c
 [ffffffc00806bc70] notify_change at ffffffc008610ae0
 [ffffffc00806bd30] chown_common at ffffffc0085c8a3c
 [ffffffc00806bdb0] do_fchownat at ffffffc0085c9154
 [ffffffc00806be00] __arm64_sys_fchownat at ffffffc0085c908c
 [ffffffc00806be20] invoke_syscall at ffffffc0081221b4
 [ffffffc00806be40] el0_svc_common at ffffffc008122114

The infinite GC loop blocks critical processes, which prevents the
device from booting up properly.

3. The GC process enters an infinite loop because the victim segment
contains a data block that belongs to nid=4, but the is_alive check
fails through the following call flow:

is_alive->f2fs_get_node_page->__get_node_page->read_node_page return -ENOENT

This prevents the data in this segment from being completely migrated
out.

4. This segment has a low cost, so it is chosen for GC again next time.

The root cause of the NAT block corruption still needs to be found, but
in the meantime this loop prevents the device from booting up.

This patch records each `!is_alive` case as an invalid segment so that
the same segment is not selected again next time.

BTW, the debug information output in is_alive has also been enhanced.

Cc: Sahitya Tummala
Signed-off-by: Zhiguo Niu
---
 fs/f2fs/gc.c      | 32 +++++++++++++++-----------------
 fs/f2fs/segment.c |  6 ++----
 fs/f2fs/segment.h |  3 +--
 3 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 384fa7e..a95ade9 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -872,7 +872,6 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
 		p.offset = segno + p.ofs_unit;
 		nsearched++;
 
-#ifdef CONFIG_F2FS_CHECK_FS
 		/*
 		 * skip selecting the invalid segno (that is failed due to block
 		 * validity check failure during GC) to avoid endless GC loop in
@@ -880,7 +879,6 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
 		 */
 		if (test_bit(segno, sm->invalid_segmap))
 			goto next;
-#endif
 
 		secno = GET_SEC_FROM_SEG(sbi, segno);
 
@@ -1145,16 +1143,19 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 	unsigned int ofs_in_node, max_addrs, base;
 	block_t source_blkaddr;
 
+	unsigned int segno = GET_SEGNO(sbi, blkaddr);
 	nid = le32_to_cpu(sum->nid);
 	ofs_in_node = le16_to_cpu(sum->ofs_in_node);
 
 	node_folio = f2fs_get_node_folio(sbi, nid, NODE_TYPE_REGULAR);
-	if (IS_ERR(node_folio))
-		return false;
+	if (IS_ERR(node_folio)) {
+		f2fs_err(sbi, "get_node_folio err(%ld) for nid(%u)", PTR_ERR(node_folio), nid);
+		goto check_invalid;
+	}
 
 	if (f2fs_get_node_info(sbi, nid, dni, false)) {
 		f2fs_folio_put(node_folio, true);
-		return false;
+		goto check_invalid;
 	}
 
 	if (sum->version != dni->version) {
@@ -1165,7 +1166,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 
 	if (f2fs_check_nid_range(sbi, dni->ino)) {
 		f2fs_folio_put(node_folio, true);
-		return false;
+		goto check_invalid;
 	}
 
 	if (IS_INODE(node_folio)) {
@@ -1180,7 +1181,7 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 		f2fs_err(sbi, "Inconsistent blkaddr offset: base:%u, ofs_in_node:%u, max:%u, ino:%u, nid:%u",
 			base, ofs_in_node, max_addrs, dni->ino, dni->nid);
 		f2fs_folio_put(node_folio, true);
-		return false;
+		goto check_invalid;
 	}
 
 	*nofs = ofs_of_node(node_folio);
@@ -1188,21 +1189,18 @@ static bool is_alive(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
 	f2fs_folio_put(node_folio, true);
 
 	if (source_blkaddr != blkaddr) {
-#ifdef CONFIG_F2FS_CHECK_FS
-		unsigned int segno = GET_SEGNO(sbi, blkaddr);
 		unsigned long offset = GET_BLKOFF_FROM_SEG0(sbi, blkaddr);
-
 		if (unlikely(check_valid_map(sbi, segno, offset))) {
-			if (!test_and_set_bit(segno, SIT_I(sbi)->invalid_segmap)) {
-				f2fs_err(sbi, "mismatched blkaddr %u (source_blkaddr %u) in seg %u",
-					 blkaddr, source_blkaddr, segno);
-				set_sbi_flag(sbi, SBI_NEED_FSCK);
-			}
+			f2fs_err(sbi, "mismatched blkaddr %u (source_blkaddr %u) in seg %u",
+				 blkaddr, source_blkaddr, segno);
+			set_sbi_flag(sbi, SBI_NEED_FSCK);
+			goto check_invalid;
 		}
-#endif
-		return false;
 	}
 	return true;
+check_invalid:
+	set_bit(segno, SIT_I(sbi)->invalid_segmap);
+	return false;
 }
 
 static int ra_data_block(struct inode *inode, pgoff_t index)
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 8375dca..6a55e20 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -826,9 +826,7 @@ static void __remove_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
 		if (valid_blocks == 0) {
 			clear_bit(GET_SEC_FROM_SEG(sbi, segno),
 						dirty_i->victim_secmap);
-#ifdef CONFIG_F2FS_CHECK_FS
 			clear_bit(segno, SIT_I(sbi)->invalid_segmap);
-#endif
 		}
 		if (__is_large_section(sbi)) {
 			unsigned int secno = GET_SEC_FROM_SEG(sbi, segno);
@@ -4899,12 +4897,12 @@ static int build_sit_info(struct f2fs_sb_info *sbi)
 					sit_bitmap_size, GFP_KERNEL);
 	if (!sit_i->sit_bitmap_mir)
 		return -ENOMEM;
+#endif
 
 	sit_i->invalid_segmap = f2fs_kvzalloc(sbi,
 					main_bitmap_size, GFP_KERNEL);
 	if (!sit_i->invalid_segmap)
 		return -ENOMEM;
-#endif
 
 	sit_i->sit_base_addr = le32_to_cpu(raw_super->sit_blkaddr);
 	sit_i->sit_blocks = SEGS_TO_BLKS(sbi, sit_segs);
@@ -5862,8 +5860,8 @@ static void destroy_sit_info(struct f2fs_sb_info *sbi)
 	kfree(sit_i->sit_bitmap);
 #ifdef CONFIG_F2FS_CHECK_FS
 	kfree(sit_i->sit_bitmap_mir);
-	kvfree(sit_i->invalid_segmap);
 #endif
+	kvfree(sit_i->invalid_segmap);
 	kfree(sit_i);
 }
 
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 07dcbcb..2437a7e2 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -211,10 +211,9 @@ struct sit_info {
 	char *sit_bitmap;		/* SIT bitmap pointer */
 #ifdef CONFIG_F2FS_CHECK_FS
 	char *sit_bitmap_mir;		/* SIT bitmap mirror */
-
+#endif
 	/* bitmap of segments to be ignored by GC in case of errors */
 	unsigned long *invalid_segmap;
-#endif
 	unsigned int bitmap_size;		/* SIT bitmap size */
 
 	unsigned long *tmp_map;			/* bitmap for temporal use */
-- 
1.9.1