From nobody Mon Feb  9 00:45:19 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B213BC072A2
	for <linux-kernel@archiver.kernel.org>; Mon, 20 Nov 2023 03:19:56 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231881AbjKTDT5 (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Sun, 19 Nov 2023 22:19:57 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54718 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S231814AbjKTDTy (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 19 Nov 2023 22:19:54 -0500
Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BDBECD3
        for <linux-kernel@vger.kernel.org>;
 Sun, 19 Nov 2023 19:19:49 -0800 (PST)
Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.55])
        by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4SYXmQ11N8zWhYP;
        Mon, 20 Nov 2023 11:19:18 +0800 (CST)
Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com
 (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov
 2023 11:19:47 +0800
From: Zhihao Cheng <chengzhihao1@huawei.com>
To: <richard@nod.at>, <Carson.Li1@unisoc.com>,
        <s.hauer@pengutronix.de>, <houtao1@huawei.com>,
        <ext-adrian.hunter@nokia.com>
CC: <linux-mtd@lists.infradead.org>, <linux-kernel@vger.kernel.org>
Subject: [PATCH 1/3] ubifs: Check @c->dirty_[n|p]n_cnt and @c->nroot state
 under @c->lp_mutex
Date: Mon, 20 Nov 2023 19:13:45 +0800
Message-ID: <20231120111347.2254153-2-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com>
References: <20231120111347.2254153-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.104.67]
X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To
 kwepemm000013.china.huawei.com (7.193.23.81)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The checking of @c->nroot->flags and @c->dirty_[n|p]n_cnt in function
nothing_to_commit() is not atomic, which could be raced with modifying
of lpt, for example:
       P1        P2        P3
run_gc
 ubifs_garbage_collect
              do_commit
 ubifs_return_leb
  ubifs_lpt_lookup_dirty
   dirty_cow_nnode
                       do_commit
			nothing_to_commit
			 if (test_bit(DIRTY_CNODE, &c->nroot->flags)
			 // false
   test_and_set_bit(DIRTY_CNODE, &nnode->flags)
   c->dirty_nn_cnt +=3D 1
                         ubifs_assert(c, c->dirty_nn_cnt =3D=3D 0)
			 // false !

Fetch a reproducer in Link:
 UBIFS error (ubi0:0 pid 2747): ubifs_assert_failed
 UBIFS assert failed: c->dirty_pn_cnt =3D=3D 0, in fs/ubifs/commit.c
 Call Trace:
  ubifs_ro_mode+0x58/0x70 [ubifs]
  ubifs_assert_failed+0x6a/0x90 [ubifs]
  do_commit+0x5b7/0x930 [ubifs]
  ubifs_run_commit+0xc6/0x1a0 [ubifs]
  ubifs_sync_fs+0xd8/0x110 [ubifs]
  sync_filesystem+0xb4/0x120
  do_syscall_64+0x6f/0x140

Fix it by checking @c->dirty_[n|p]n_cnt and @c->nroot state with
@c->lp_mutex locked.

Fixes: 944fdef52ca9 ("UBIFS: do not start the commit if there is nothing to=
 commit")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=3D218162
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/ubifs/commit.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c
index c4fc1047fc07..5b3a840098b0 100644
--- a/fs/ubifs/commit.c
+++ b/fs/ubifs/commit.c
@@ -69,6 +69,14 @@ static int nothing_to_commit(struct ubifs_info *c)
 	if (c->zroot.znode && ubifs_zn_dirty(c->zroot.znode))
 		return 0;
=20
+	/*
+	 * Increasing @c->dirty_pn_cnt/@c->dirty_nn_cnt and marking
+	 * nnodes/pnodes as dirty in run_gc() could race with following
+	 * checking, which leads inconsistent states between @c->nroot
+	 * and @c->dirty_pn_cnt/@c->dirty_nn_cnt, holding @c->lp_mutex
+	 * to avoid that.
+	 */
+	mutex_lock(&c->lp_mutex);
 	/*
 	 * Even though the TNC is clean, the LPT tree may have dirty nodes. For
 	 * example, this may happen if the budgeting subsystem invoked GC to
@@ -76,12 +84,15 @@ static int nothing_to_commit(struct ubifs_info *c)
 	 * free space. In this case GC would just change the lprops of this
 	 * LEB (by turning all space into free space) and unmap it.
 	 */
-	if (c->nroot && test_bit(DIRTY_CNODE, &c->nroot->flags))
+	if (c->nroot && test_bit(DIRTY_CNODE, &c->nroot->flags)) {
+		mutex_unlock(&c->lp_mutex);
 		return 0;
+	}
=20
 	ubifs_assert(c, atomic_long_read(&c->dirty_zn_cnt) =3D=3D 0);
 	ubifs_assert(c, c->dirty_pn_cnt =3D=3D 0);
 	ubifs_assert(c, c->dirty_nn_cnt =3D=3D 0);
+	mutex_unlock(&c->lp_mutex);
=20
 	return 1;
 }
--=20
2.39.2
From nobody Mon Feb  9 00:45:19 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 39CA9C5ACB3
	for <linux-kernel@archiver.kernel.org>; Mon, 20 Nov 2023 03:19:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231898AbjKTDUA (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Sun, 19 Nov 2023 22:20:00 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54734 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S231862AbjKTDTz (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 19 Nov 2023 22:19:55 -0500
Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFC1F137
        for <linux-kernel@vger.kernel.org>;
 Sun, 19 Nov 2023 19:19:50 -0800 (PST)
Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.53])
        by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4SYXj30Chkz1P6j8;
        Mon, 20 Nov 2023 11:16:23 +0800 (CST)
Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com
 (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov
 2023 11:19:48 +0800
From: Zhihao Cheng <chengzhihao1@huawei.com>
To: <richard@nod.at>, <Carson.Li1@unisoc.com>,
        <s.hauer@pengutronix.de>, <houtao1@huawei.com>,
        <ext-adrian.hunter@nokia.com>
CC: <linux-mtd@lists.infradead.org>, <linux-kernel@vger.kernel.org>
Subject: [PATCH 2/3] ubifs: ubifs_tnc_locate: Drop TNC mutex lockless reading
 path
Date: Mon, 20 Nov 2023 19:13:46 +0800
Message-ID: <20231120111347.2254153-3-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com>
References: <20231120111347.2254153-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.104.67]
X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To
 kwepemm000013.china.huawei.com (7.193.23.81)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

This problem is described in [1], and Tao has proposed a solution in [2].
There exists a race between node reading and gc, which is shown as:
         P1                 P2
ubifs_tnc_locate
 zbr.lnum =3D lnum_a
                         GC // node is moved from lnum_a to lnum_b
			 journal head switching // lnum_a becomes a bud
 ubifs_get_wbuf(c, zbr.lnum) // true
 ubifs_tnc_read_node
  ubifs_read_node_wbuf // read data from lnum_a
  check node failed !

There are two ways of reading node(See ubifs_tnc_read_node()):
1. Reading from wbuf. The node is written in wbuf(in mem), and wbuf is
   not written back to flash.
2. Otherwise, reading from flash.

In most cases, ubifs_tnc_read_node() is invoked with TNC mutex locked,
but except the lockless path in ubifs_tnc_locate() which is imported
by commit 601c0bc46753("UBIFS: allow for racing between GC and TNC").

Function ubifs_tnc_locate() is mainly used for path lookup and file
reading, VFS has inode/dentry/page cache for multiple times reading, the
lockless optimization only works for first reading. Based on the
discussion in [2], this patch simply drops the TNC mutex lockless reading
path in ubifs_tnc_locate().

Fetch a reproducer in [3].

[1] https://lore.kernel.org/all/fda84926-09d1-1fc7-4b78-99e0d04508bc@huawei=
.com/T/
[2] https://lore.kernel.org/linux-mtd/20200305092205.127758-1-houtao1@huawe=
i.com/
[3] https://bugzilla.kernel.org/show_bug.cgi?id=3D218163

Fixes: 601c0bc46753 ("UBIFS: allow for racing between GC and TNC")
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Reported-by: =E6=9D=8E=E5=82=B2=E5=82=B2 (Carson Li1/9542) <Carson.Li1@unis=
oc.com>
Analyzed-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/ubifs/tnc.c | 31 +++----------------------------
 1 file changed, 3 insertions(+), 28 deletions(-)

diff --git a/fs/ubifs/tnc.c b/fs/ubifs/tnc.c
index f4728e65d1bd..7b7d75ed3ec7 100644
--- a/fs/ubifs/tnc.c
+++ b/fs/ubifs/tnc.c
@@ -1478,11 +1478,10 @@ static int maybe_leb_gced(struct ubifs_info *c, int=
 lnum, int gc_seq1)
 int ubifs_tnc_locate(struct ubifs_info *c, const union ubifs_key *key,
 		     void *node, int *lnum, int *offs)
 {
-	int found, n, err, safely =3D 0, gc_seq1;
+	int found, n, err;
 	struct ubifs_znode *znode;
-	struct ubifs_zbranch zbr, *zt;
+	struct ubifs_zbranch *zt;
=20
-again:
 	mutex_lock(&c->tnc_mutex);
 	found =3D ubifs_lookup_level0(c, key, &znode, &n);
 	if (!found) {
@@ -1505,31 +1504,7 @@ int ubifs_tnc_locate(struct ubifs_info *c, const uni=
on ubifs_key *key,
 		err =3D tnc_read_hashed_node(c, zt, node);
 		goto out;
 	}
-	if (safely) {
-		err =3D ubifs_tnc_read_node(c, zt, node);
-		goto out;
-	}
-	/* Drop the TNC mutex prematurely and race with garbage collection */
-	zbr =3D znode->zbranch[n];
-	gc_seq1 =3D c->gc_seq;
-	mutex_unlock(&c->tnc_mutex);
-
-	if (ubifs_get_wbuf(c, zbr.lnum)) {
-		/* We do not GC journal heads */
-		err =3D ubifs_tnc_read_node(c, &zbr, node);
-		return err;
-	}
-
-	err =3D fallible_read_node(c, key, &zbr, node);
-	if (err <=3D 0 || maybe_leb_gced(c, zbr.lnum, gc_seq1)) {
-		/*
-		 * The node may have been GC'ed out from under us so try again
-		 * while keeping the TNC mutex locked.
-		 */
-		safely =3D 1;
-		goto again;
-	}
-	return 0;
+	err =3D ubifs_tnc_read_node(c, zt, node);
=20
 out:
 	mutex_unlock(&c->tnc_mutex);
--=20
2.39.2

From nobody Mon Feb  9 00:45:19 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 515EEC072A2
	for <linux-kernel@archiver.kernel.org>; Mon, 20 Nov 2023 03:20:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S231913AbjKTDUC (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Sun, 19 Nov 2023 22:20:02 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54742 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S231874AbjKTDT4 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Sun, 19 Nov 2023 22:19:56 -0500
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7758513E
        for <linux-kernel@vger.kernel.org>;
 Sun, 19 Nov 2023 19:19:51 -0800 (PST)
Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.53])
        by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4SYXj26WJTzsRH6;
        Mon, 20 Nov 2023 11:16:22 +0800 (CST)
Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com
 (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov
 2023 11:19:48 +0800
From: Zhihao Cheng <chengzhihao1@huawei.com>
To: <richard@nod.at>, <Carson.Li1@unisoc.com>,
        <s.hauer@pengutronix.de>, <houtao1@huawei.com>,
        <ext-adrian.hunter@nokia.com>
CC: <linux-mtd@lists.infradead.org>, <linux-kernel@vger.kernel.org>
Subject: [PATCH 3/3] ubifs: Don't stop retrying even if committing times
 exceeds two
Date: Mon, 20 Nov 2023 19:13:47 +0800
Message-ID: <20231120111347.2254153-4-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com>
References: <20231120111347.2254153-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.104.67]
X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To
 kwepemm000013.china.huawei.com (7.193.23.81)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Recently we catched ENOSPC returned by make_reservation() while doing
fsstress on UBIFS, we got following information when it occurred (See
details in Link):

 UBIFS error (ubi0:0 pid 3640152): make_reservation [ubifs]: cannot
 reserve 112 bytes in jhead 2, error -28
 CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G    B   W
 Hardware name: Hisilicon PhosphorHi1230 EMU (DT)
 Workqueue: writeback wb_workfn (flush-ubifs_0_0)
 Call trace:
  dump_stack+0x114/0x198
  make_reservation+0x564/0x610 [ubifs]
  ubifs_jnl_write_data+0x328/0x48c [ubifs]
  do_writepage+0x2a8/0x3e4 [ubifs]
  ubifs_writepage+0x16c/0x374 [ubifs]
  generic_writepages+0xb4/0x114
  do_writepages+0xcc/0x11c
  writeback_sb_inodes+0x2d0/0x564
  wb_writeback+0x20c/0x2b4
  wb_workfn+0x404/0x510
  process_one_work+0x304/0x4ac
  worker_thread+0x31c/0x4e4
  kthread+0x23c/0x290
 (pid 3640152) Budgeting info: data budget sum 17576, total budget sum 17768
	budg_data_growth 4144, budg_dd_growth 13432, budg_idx_growth 192
	min_idx_lebs 13, old_idx_sz 988640, uncommitted_idx 0
	page_budget 4144, inode_budget 160, dent_budget 312
	nospace 0, nospace_rp 0
	dark_wm 8192, dead_wm 4096, max_idx_node_sz 192
	freeable_cnt 0, calc_idx_sz 988640, idx_gc_cnt 0
	dirty_pg_cnt 4, dirty_zn_cnt 0, clean_zn_cnt 4811
	gc_lnum 21, ihead_lnum 14
	jhead 0 (GC)	 LEB 16
	jhead 1 (base)	 LEB 34
	jhead 2 (data)	 LEB 23
	bud LEB 16
	bud LEB 23
	bud LEB 34
	old bud LEB 33
	old bud LEB 31
	old bud LEB 15
	commit state 4
 Budgeting predictions:
	available: 33832, outstanding 17576, free 15356
 (pid 3640152) start dumping LEB properties
 (pid 3640152) Lprops statistics: empty_lebs 3, idx_lebs  11
	taken_empty_lebs 1, total_free 1253376, total_dirty 2445736
	total_used 3438712, total_dark 65536, total_dead 17248
 LEB 15 free 0      dirty 248000   used 5952   (taken)
 LEB 16 free 110592 dirty 896      used 142464 (taken, jhead 0 (GC))
 LEB 21 free 253952 dirty 0        used 0      (taken, GC LEB)
 LEB 23 free 0      dirty 248104   used 5848   (taken, jhead 2 (data))
 LEB 29 free 253952 dirty 0        used 0      (empty)
 LEB 33 free 0      dirty 253952   used 0      (taken)
 LEB 34 free 217088 dirty 36544    used 320    (taken, jhead 1 (base))
 LEB 37 free 253952 dirty 0        used 0      (empty)
 OTHERS: index lebs, zero-available non-index lebs

According to the budget algorithm, there are 5 LEBs reserved for budget:
three journal heads(16,23,34), 1 GC LEB(21) and 1 deletion LEB(can be
used in make_reservation()). There are 2 empty LEBs used for index nodes,
which is calculated as min_idx_lebs - idx_lebs =3D 2. In theory, LEB 15
and 33 should be reclaimed as free state after committing, but it is now
in taken state. After looking the realization of reserve_space(), there's
a possible situation:

LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
LEB 23: free 2000 dirty 248104 used 3848 (bud, taken)
LEB 33: free 2000 dirty 251952 used 0    (bud, taken)

      wb_workfn          wb_workfn_2
do_writepage // write 3000 bytes
 ubifs_jnl_write_data
  make_reservation
   reserve_space
    ubifs_garbage_collect
     ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
   nospc_retries++  // 1
   ubifs_run_commit
    do_commit

LEB 15: free 2000 dirty 248000 used 3952 (jhead 2)
LEB 23: free 2000 dirty 248104 used 3848 (dirty)
LEB 33: free 2000 dirty 251952 used 0    (dirty)

                   do_writepage // write 2000 bytes for 3 times
		    ubifs_jnl_write_data
		    // grabs 15\23\33

LEB 15: free 0    dirty 248000 used 5952 (bud, taken)
LEB 23: free 0    dirty 248104 used 5848 (jhead 2)
LEB 33: free 0    dirty 253952 used 0    (bud, taken)

   reserve_space
    ubifs_garbage_collect
     ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken
   if (nospc_retries++ < 2) // false
 ubifs_ro_mode !

Fetch a reproducer in Link.

The dirty LEBs could be grabbed by other writeback threads, which fails
finding dirty LEBs of GC in current thread, so make_reservation() should
try many times to invoke GC&&committing, but current realization limits
the times of retrying as 'nospc_retries'(twice). Actually the limitation
of retrying times should be removed from make_reservation(), but
considering the existence of 'cmt_retries' in case of infinite loop,
just remove 'nospc_retries' and set the limitation of retrying times same
as 'cmt_retries'.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=3D218164
Fixes: 1e51764a3c2a ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/ubifs/journal.c | 26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index f0a5538c84b0..ec42d42f06dc 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -310,7 +310,7 @@ static int write_head(struct ubifs_info *c, int jhead, =
void *buf, int len,
  */
 static int make_reservation(struct ubifs_info *c, int jhead, int len)
 {
-	int err, cmt_retries =3D 0, nospc_retries =3D 0;
+	int err, cmt_retries =3D 0;
=20
 again:
 	down_read(&c->commit_sem);
@@ -323,21 +323,14 @@ static int make_reservation(struct ubifs_info *c, int=
 jhead, int len)
 	if (err =3D=3D -ENOSPC) {
 		/*
 		 * GC could not make any progress. We should try to commit
-		 * once because it could make some dirty space and GC would
-		 * make progress, so make the error -EAGAIN so that the below
-		 * will commit and re-try.
-		 */
-		if (nospc_retries++ < 2) {
-			dbg_jnl("no space, retry");
-			err =3D -EAGAIN;
-		}
-
-		/*
-		 * This means that the budgeting is incorrect. We always have
-		 * to be able to write to the media, because all operations are
-		 * budgeted. Deletions are not budgeted, though, but we reserve
-		 * an extra LEB for them.
+		 * because it could make some dirty space and GC would make
+		 * progress, so that the below will commit and re-try. This
+		 * process could repeat many times(cmt_retries < 128),
+		 * journal heads in other threads could grab the dirty lebs,
+		 * which fails finding dirty LEBs of GC in current thread.
 		 */
+		dbg_jnl("no space, retry");
+		err =3D -EAGAIN;
 	}
=20
 	if (err !=3D -EAGAIN)
@@ -350,7 +343,8 @@ static int make_reservation(struct ubifs_info *c, int j=
head, int len)
 	if (cmt_retries > 128) {
 		/*
 		 * This should not happen unless the journal size limitations
-		 * are too tough.
+		 * are too tough, or the racing between GC and journal heads
+		 * switching is too frequent when space is nearly run out.
 		 */
 		ubifs_err(c, "stuck in space allocation");
 		err =3D -ENOSPC;
--=20
2.39.2