From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5730C433EF for ; Sat, 18 Dec 2021 10:23:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232714AbhLRKXj (ORCPT ); Sat, 18 Dec 2021 05:23:39 -0500 Received: from szxga01-in.huawei.com ([45.249.212.187]:33866 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232707AbhLRKXh (ORCPT ); Sat, 18 Dec 2021 05:23:37 -0500 Received: from kwepemi500010.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMPZ5TK1zcbfT; Sat, 18 Dec 2021 18:23:14 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500010.china.huawei.com (7.221.188.191) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:34 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:33 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 01/13] ubifs: rename_whiteout: Fix double free for whiteout_ui->data Date: Sat, 18 Dec 2021 18:35:00 +0800 Message-ID: <20211218103512.370420-2-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" 'whiteout_ui->data' will be freed twice if the space budget fails for a rename whiteout operation, as in the following process: rename_whiteout dev =3D kmalloc whiteout_ui->data =3D dev kfree(whiteout_ui->data) // Free first time iput(whiteout) ubifs_free_inode kfree(ui->data) // Double free! KASAN reports: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D BUG: KASAN: double-free or invalid-free in ubifs_free_inode+0x4f/0x70 Call Trace: kfree+0x117/0x490 ubifs_free_inode+0x4f/0x70 [ubifs] i_callback+0x30/0x60 rcu_do_batch+0x366/0xac0 __do_softirq+0x133/0x57f Allocated by task 1506: kmem_cache_alloc_trace+0x3c2/0x7a0 do_rename+0x9b7/0x1150 [ubifs] ubifs_rename+0x106/0x1f0 [ubifs] do_syscall_64+0x35/0x80 Freed by task 1506: kfree+0x117/0x490 do_rename.cold+0x53/0x8a [ubifs] ubifs_rename+0x106/0x1f0 [ubifs] do_syscall_64+0x35/0x80 The buggy address belongs to the object at ffff88810238bed8 which belongs to the cache kmalloc-8 of size 8 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D Let ubifs_free_inode() free 'whiteout_ui->data'. BTW, delete the unused assignment 'whiteout_ui->data_len =3D 0'; the process 'ubifs_evict_inode() -> ubifs_jnl_delete_inode() -> ubifs_jnl_write_inode()' doesn't need it (because 'inc_nlink(whiteout)' won't be executed because of 'goto out_release', and the nlink of the whiteout inode is 0). 
Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng --- fs/ubifs/dir.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 7c61d0ec0159..cfa8881d8cca 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1425,8 +1425,6 @@ static int do_rename(struct inode *old_dir, struct de= ntry *old_dentry, =20 err =3D ubifs_budget_space(c, &wht_req); if (err) { - kfree(whiteout_ui->data); - whiteout_ui->data_len =3D 0; iput(whiteout); goto out_release; } --=20 2.31.1 From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AF62BC433F5 for ; Sat, 18 Dec 2021 10:23:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232898AbhLRKXz (ORCPT ); Sat, 18 Dec 2021 05:23:55 -0500 Received: from szxga02-in.huawei.com ([45.249.212.188]:28329 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232716AbhLRKXl (ORCPT ); Sat, 18 Dec 2021 05:23:41 -0500 Received: from kwepemi500005.china.huawei.com (unknown [172.30.72.55]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4JGMPb0xNkzbhsX; Sat, 18 Dec 2021 18:23:15 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500005.china.huawei.com (7.221.188.179) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:35 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:34 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 02/13] ubifs: Fix deadlock in concurrent rename whiteout and inode writeback Date: Sat, 18 
Dec 2021 18:35:01 +0800 Message-ID: <20211218103512.370420-3-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Following hung tasks: [ 77.028764] task:kworker/u8:4 state:D stack: 0 pid: 132 [ 77.028820] Call Trace: [ 77.029027] schedule+0x8c/0x1b0 [ 77.029067] mutex_lock+0x50/0x60 [ 77.029074] ubifs_write_inode+0x68/0x1f0 [ubifs] [ 77.029117] __writeback_single_inode+0x43c/0x570 [ 77.029128] writeback_sb_inodes+0x259/0x740 [ 77.029148] wb_writeback+0x107/0x4d0 [ 77.029163] wb_workfn+0x162/0x7b0 [ 92.390442] task:aa state:D stack: 0 pid: 1506 [ 92.390448] Call Trace: [ 92.390458] schedule+0x8c/0x1b0 [ 92.390461] wb_wait_for_completion+0x82/0xd0 [ 92.390469] __writeback_inodes_sb_nr+0xb2/0x110 [ 92.390472] writeback_inodes_sb_nr+0x14/0x20 [ 92.390476] ubifs_budget_space+0x705/0xdd0 [ubifs] [ 92.390503] do_rename.cold+0x7f/0x187 [ubifs] [ 92.390549] ubifs_rename+0x8b/0x180 [ubifs] [ 92.390571] vfs_rename+0xdb2/0x1170 [ 92.390580] do_renameat2+0x554/0x770 , are caused by concurrent rename whiteout and inode writeback processes: rename_whiteout(Thread 1) wb_workfn(Thread2) ubifs_rename do_rename lock_4_inodes (Hold ui_mutex) ubifs_budget_space make_free_space shrink_liability __writeback_inodes_sb_nr bdi_split_work_to_wbs (Queue new wb work) wb_do_writeback(wb work) __writeback_single_inode ubifs_write_inode LOCK(ui_mutex) =E2=86=91 wb_wait_for_completion (Wait wb work) <-- deadlock! Reproducer (Detail program in [Link]): 1. SYS_renameat2("/mp/dir/file", "/mp/dir/whiteout", RENAME_WHITEOUT) 2. 
Consume out of space before kernel(mdelay) doing budget for whiteout Fix it by doing whiteout space budget before locking ubifs inodes. BTW, it also fixes wrong goto tag 'out_release' in whiteout budget error handling path(It should at least recover dir i_size and unlock 4 ubifs inodes). Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Link: https://bugzilla.kernel.org/show_bug.cgi?id=3D214733 Signed-off-by: Zhihao Cheng --- fs/ubifs/dir.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index cfa8881d8cca..2735ad1affed 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -1324,6 +1324,7 @@ static int do_rename(struct inode *old_dir, struct de= ntry *old_dentry, =20 if (flags & RENAME_WHITEOUT) { union ubifs_dev_desc *dev =3D NULL; + struct ubifs_budget_req wht_req; =20 dev =3D kmalloc(sizeof(union ubifs_dev_desc), GFP_NOFS); if (!dev) { @@ -1345,6 +1346,20 @@ static int do_rename(struct inode *old_dir, struct d= entry *old_dentry, whiteout_ui->data =3D dev; whiteout_ui->data_len =3D ubifs_encode_dev(dev, MKDEV(0, 0)); ubifs_assert(c, !whiteout_ui->dirty); + + memset(&wht_req, 0, sizeof(struct ubifs_budget_req)); + wht_req.dirtied_ino =3D 1; + wht_req.dirtied_ino_d =3D ALIGN(whiteout_ui->data_len, 8); + /* + * To avoid deadlock between space budget (holds ui_mutex and + * waits wb work) and writeback work(waits ui_mutex), do space + * budget before ubifs inodes locked. 
+ */ + err =3D ubifs_budget_space(c, &wht_req); + if (err) { + iput(whiteout); + goto out_release; + } } =20 lock_4_inodes(old_dir, new_dir, new_inode, whiteout); @@ -1419,16 +1434,6 @@ static int do_rename(struct inode *old_dir, struct d= entry *old_dentry, } =20 if (whiteout) { - struct ubifs_budget_req wht_req =3D { .dirtied_ino =3D 1, - .dirtied_ino_d =3D \ - ALIGN(ubifs_inode(whiteout)->data_len, 8) }; - - err =3D ubifs_budget_space(c, &wht_req); - if (err) { - iput(whiteout); - goto out_release; - } - inc_nlink(whiteout); mark_inode_dirty(whiteout); =20 --=20 2.31.1 From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10BCAC433FE for ; Sat, 18 Dec 2021 10:23:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232738AbhLRKXl (ORCPT ); Sat, 18 Dec 2021 05:23:41 -0500 Received: from szxga01-in.huawei.com ([45.249.212.187]:15938 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229536AbhLRKXh (ORCPT ); Sat, 18 Dec 2021 05:23:37 -0500 Received: from kwepemi500004.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMLS3w67zZcgG; Sat, 18 Dec 2021 18:20:32 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500004.china.huawei.com (7.221.188.17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:36 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:35 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 03/13] ubifs: Fix wrong number of inodes 
locked by ui_mutex in ubifs_inode comment Date: Sat, 18 Dec 2021 18:35:02 +0800 Message-ID: <20211218103512.370420-4-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Since 9ec64962afb1702f75b("ubifs: Implement RENAME_EXCHANGE") and 9e0a1fff8db56eaaebb("ubifs: Implement RENAME_WHITEOUT") are applied, ubifs_rename locks and changes 4 ubifs inodes, correct the comment for ui_mutex in ubifs_inode. Signed-off-by: Zhihao Cheng --- fs/ubifs/ubifs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h index c38066ce9ab0..972e41daff01 100644 --- a/fs/ubifs/ubifs.h +++ b/fs/ubifs/ubifs.h @@ -372,7 +372,7 @@ struct ubifs_gced_idx_leb { * @ui_mutex exists for two main reasons. At first it prevents inodes from * being written back while UBIFS changing them, being in the middle of an= VFS * operation. This way UBIFS makes sure the inode fields are consistent. F= or - * example, in 'ubifs_rename()' we change 3 inodes simultaneously, and + * example, in 'ubifs_rename()' we change 4 inodes simultaneously, and * write-back must not write any of them before we have finished. * * The second reason is budgeting - UBIFS has to budget all operations. 
If= an --=20 2.31.1 From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91D48C433F5 for ; Sat, 18 Dec 2021 10:23:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232733AbhLRKXn (ORCPT ); Sat, 18 Dec 2021 05:23:43 -0500 Received: from szxga03-in.huawei.com ([45.249.212.189]:30139 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232710AbhLRKXj (ORCPT ); Sat, 18 Dec 2021 05:23:39 -0500 Received: from kwepemi500007.china.huawei.com (unknown [172.30.72.57]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4JGMMN5vZDz8vtm; Sat, 18 Dec 2021 18:21:20 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500007.china.huawei.com (7.221.188.207) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:36 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:36 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 04/13] ubifs: Add missing iput if do_tmpfile() failed in rename whiteout Date: Sat, 18 Dec 2021 18:35:03 +0800 Message-ID: <20211218103512.370420-5-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68) X-CFilter-Loop: Reflected Precedence: bulk List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" whiteout inode should be put when do_tmpfile() failed if inode has been initialized. Otherwise we will get following warning during umount: UBIFS error (ubi0:0 pid 1494): ubifs_assert_failed [ubifs]: UBIFS assert failed: c->bi.dd_growth =3D=3D 0, in fs/ubifs/super.c:1930 VFS: Busy inodes after unmount of ubifs. Self-destruct in 5 seconds. Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng Suggested-by: Sascha Hauer --- fs/ubifs/dir.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 2735ad1affed..2cbc5f05f671 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -432,6 +432,8 @@ static int do_tmpfile(struct inode *dir, struct dentry = *dentry, make_bad_inode(inode); if (!instantiated) iput(inode); + else if (whiteout) + iput(*whiteout); out_budg: ubifs_release_budget(c, &req); if (!instantiated) --=20 2.31.1 From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF1A5C433F5 for ; Sat, 18 Dec 2021 10:23:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232789AbhLRKXo (ORCPT ); Sat, 18 Dec 2021 05:23:44 -0500 Received: from szxga02-in.huawei.com ([45.249.212.188]:16826 "EHLO szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232725AbhLRKXk (ORCPT ); Sat, 18 Dec 2021 05:23:40 -0500 Received: from kwepemi500006.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4JGMP66KYJz91Kv; Sat, 18 Dec 2021 18:22:50 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500006.china.huawei.com (7.221.188.68) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:37 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:36 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 05/13] ubifs: Rename whiteout atomically Date: Sat, 18 Dec 2021 18:35:04 +0800 Message-ID: <20211218103512.370420-6-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Currently, rename whiteout has 3 steps: 1. create tmpfile(which associates old dentry to tmpfile inode) for whiteout, and store tmpfile to disk 2. link whiteout, associate whiteout inode to old dentry again and store old dentry, old inode, new dentry on disk 3. writeback dirty whiteout inode to disk A sudden power cut or an error (e.g. ENOSPC returned by budget, memory allocation failure) during the above steps may cause several kinds of problems: Problem 1: If ENOSPC is returned by the whiteout space budget (before step 2), the old dentry disappears after the rename syscall, and the whiteout file cannot be found either. 
ls dir // we get file, whiteout rename(dir/file, dir/whiteout, RENAME_WHITEOUT) ENOSPC =3D ubifs_budget_space(&wht_req) // return ls dir // empty (no file, no whiteout) Problem 2: If a power cut happens before step 3, the whiteout inode with 'nlink=3D= 1' is not stored on disk while the whiteout dentry (old dentry) is written on disk, so the whiteout file is lost on the next mount (we get "dead directory entry" after executing 'ls -l' on the whiteout file). Now, we use the following 3 steps to finish rename whiteout: 1. create an in-mem inode with 'nlink =3D 1' as whiteout 2. ubifs_jnl_rename (Write on disk to finish associating old dentry to whiteout inode, associating new dentry with old inode) 3. iput(whiteout) Rely on ubifs_jnl_rename() writing the in-mem inode to disk to finish rename whiteout, which avoids an intermediate on-disk state caused by a sudden power cut or an error. Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng --- fs/ubifs/dir.c | 144 +++++++++++++++++++++++++++++---------------- fs/ubifs/journal.c | 52 +++++++++++++--- 2 files changed, 136 insertions(+), 60 deletions(-) diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c index 2cbc5f05f671..deaf2d5dba5b 100644 --- a/fs/ubifs/dir.c +++ b/fs/ubifs/dir.c @@ -349,8 +349,56 @@ static int ubifs_create(struct user_namespace *mnt_use= rns, struct inode *dir, return err; } =20 -static int do_tmpfile(struct inode *dir, struct dentry *dentry, - umode_t mode, struct inode **whiteout) +static struct inode *create_whiteout(struct inode *dir, struct dentry *den= try) +{ + int err; + umode_t mode =3D S_IFCHR | WHITEOUT_MODE; + struct inode *inode; + struct ubifs_info *c =3D dir->i_sb->s_fs_info; + struct fscrypt_name nm; + + /* + * Create an inode('nlink =3D 1') for whiteout without updating journal, + * let ubifs_jnl_rename() store it on flash to complete rename whiteout + * atomically. 
+ */ + + dbg_gen("dent '%pd', mode %#hx in dir ino %lu", + dentry, mode, dir->i_ino); + + err =3D fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm); + if (err) + return ERR_PTR(err); + + inode =3D ubifs_new_inode(c, dir, mode); + if (IS_ERR(inode)) { + err =3D PTR_ERR(inode); + goto out_free; + } + + init_special_inode(inode, inode->i_mode, WHITEOUT_DEV); + ubifs_assert(c, inode->i_op =3D=3D &ubifs_file_inode_operations); + + err =3D ubifs_init_security(dir, inode, &dentry->d_name); + if (err) + goto out_inode; + + /* The dir size is updated by do_rename. */ + insert_inode_hash(inode); + + return inode; + +out_inode: + make_bad_inode(inode); + iput(inode); +out_free: + fscrypt_free_filename(&nm); + ubifs_err(c, "cannot create whiteout file, error %d", err); + return ERR_PTR(err); +} + +static int ubifs_tmpfile(struct user_namespace *mnt_userns, struct inode *= dir, + struct dentry *dentry, umode_t mode) { struct inode *inode; struct ubifs_info *c =3D dir->i_sb->s_fs_info; @@ -392,25 +440,13 @@ static int do_tmpfile(struct inode *dir, struct dentr= y *dentry, } ui =3D ubifs_inode(inode); =20 - if (whiteout) { - init_special_inode(inode, inode->i_mode, WHITEOUT_DEV); - ubifs_assert(c, inode->i_op =3D=3D &ubifs_file_inode_operations); - } - err =3D ubifs_init_security(dir, inode, &dentry->d_name); if (err) goto out_inode; =20 mutex_lock(&ui->ui_mutex); insert_inode_hash(inode); - - if (whiteout) { - mark_inode_dirty(inode); - drop_nlink(inode); - *whiteout =3D inode; - } else { - d_tmpfile(dentry, inode); - } + d_tmpfile(dentry, inode); ubifs_assert(c, ui->dirty); =20 instantiated =3D 1; @@ -432,8 +468,6 @@ static int do_tmpfile(struct inode *dir, struct dentry = *dentry, make_bad_inode(inode); if (!instantiated) iput(inode); - else if (whiteout) - iput(*whiteout); out_budg: ubifs_release_budget(c, &req); if (!instantiated) @@ -443,12 +477,6 @@ static int do_tmpfile(struct inode *dir, struct dentry= *dentry, return err; } =20 -static int ubifs_tmpfile(struct 
user_namespace *mnt_userns, struct inode *= dir, - struct dentry *dentry, umode_t mode) -{ - return do_tmpfile(dir, dentry, mode, NULL); -} - /** * vfs_dent_type - get VFS directory entry type. * @type: UBIFS directory entry type @@ -1266,17 +1294,19 @@ static int do_rename(struct inode *old_dir, struct = dentry *old_dentry, .dirtied_ino =3D 3 }; struct ubifs_budget_req ino_req =3D { .dirtied_ino =3D 1, .dirtied_ino_d =3D ALIGN(old_inode_ui->data_len, 8) }; + struct ubifs_budget_req wht_req; struct timespec64 time; unsigned int saved_nlink; struct fscrypt_name old_nm, new_nm; =20 /* - * Budget request settings: deletion direntry, new direntry, removing - * the old inode, and changing old and new parent directory inodes. + * Budget request settings: + * req: deletion direntry, new direntry, removing the old inode, + * and changing old and new parent directory inodes. * - * However, this operation also marks the target inode as dirty and - * does not write it, so we allocate budget for the target inode - * separately. + * wht_req: new whiteout inode for RENAME_WHITEOUT. + * + * ino_req: marks the target inode as dirty and does not write it. */ =20 dbg_gen("dent '%pd' ino %lu in dir ino %lu to dent '%pd' in dir ino %lu f= lags 0x%x", @@ -1326,7 +1356,6 @@ static int do_rename(struct inode *old_dir, struct de= ntry *old_dentry, =20 if (flags & RENAME_WHITEOUT) { union ubifs_dev_desc *dev =3D NULL; - struct ubifs_budget_req wht_req; =20 dev =3D kmalloc(sizeof(union ubifs_dev_desc), GFP_NOFS); if (!dev) { @@ -1334,24 +1363,26 @@ static int do_rename(struct inode *old_dir, struct = dentry *old_dentry, goto out_release; } =20 - err =3D do_tmpfile(old_dir, old_dentry, S_IFCHR | WHITEOUT_MODE, &whiteo= ut); - if (err) { + /* + * The whiteout inode without dentry is pinned in memory, + * umount won't happen during rename process because we + * got parent dentry. 
+ */ + whiteout =3D create_whiteout(old_dir, old_dentry); + if (IS_ERR(whiteout)) { + err =3D PTR_ERR(whiteout); kfree(dev); goto out_release; } =20 - spin_lock(&whiteout->i_lock); - whiteout->i_state |=3D I_LINKABLE; - spin_unlock(&whiteout->i_lock); - whiteout_ui =3D ubifs_inode(whiteout); whiteout_ui->data =3D dev; whiteout_ui->data_len =3D ubifs_encode_dev(dev, MKDEV(0, 0)); ubifs_assert(c, !whiteout_ui->dirty); =20 memset(&wht_req, 0, sizeof(struct ubifs_budget_req)); - wht_req.dirtied_ino =3D 1; - wht_req.dirtied_ino_d =3D ALIGN(whiteout_ui->data_len, 8); + wht_req.new_ino =3D 1; + wht_req.new_ino_d =3D ALIGN(whiteout_ui->data_len, 8); /* * To avoid deadlock between space budget (holds ui_mutex and * waits wb work) and writeback work(waits ui_mutex), do space @@ -1359,6 +1390,11 @@ static int do_rename(struct inode *old_dir, struct d= entry *old_dentry, */ err =3D ubifs_budget_space(c, &wht_req); if (err) { + /* + * Whiteout inode can not be written on flash by + * ubifs_jnl_write_inode(), because it's neither + * dirty nor zero-nlink. + */ iput(whiteout); goto out_release; } @@ -1433,17 +1469,11 @@ static int do_rename(struct inode *old_dir, struct = dentry *old_dentry, sync =3D IS_DIRSYNC(old_dir) || IS_DIRSYNC(new_dir); if (unlink && IS_SYNC(new_inode)) sync =3D 1; - } - - if (whiteout) { - inc_nlink(whiteout); - mark_inode_dirty(whiteout); - - spin_lock(&whiteout->i_lock); - whiteout->i_state &=3D ~I_LINKABLE; - spin_unlock(&whiteout->i_lock); - - iput(whiteout); + /* + * S_SYNC flag of whiteout inherits from the old_dir, and we + * have already checked the old dir inode. So there is no need + * to check whiteout. 
+ */ } =20 err =3D ubifs_jnl_rename(c, old_dir, old_inode, &old_nm, new_dir, @@ -1454,6 +1484,11 @@ static int do_rename(struct inode *old_dir, struct d= entry *old_dentry, unlock_4_inodes(old_dir, new_dir, new_inode, whiteout); ubifs_release_budget(c, &req); =20 + if (whiteout) { + ubifs_release_budget(c, &wht_req); + iput(whiteout); + } + mutex_lock(&old_inode_ui->ui_mutex); release =3D old_inode_ui->dirty; mark_inode_dirty_sync(old_inode); @@ -1462,11 +1497,16 @@ static int do_rename(struct inode *old_dir, struct = dentry *old_dentry, if (release) ubifs_release_budget(c, &ino_req); if (IS_SYNC(old_inode)) - err =3D old_inode->i_sb->s_op->write_inode(old_inode, NULL); + /* + * Rename finished here. Although old inode cannot be updated + * on flash, old ctime is not a big problem, don't return err + * code to userspace. + */ + old_inode->i_sb->s_op->write_inode(old_inode, NULL); =20 fscrypt_free_filename(&old_nm); fscrypt_free_filename(&new_nm); - return err; + return 0; =20 out_cancel: if (unlink) { @@ -1487,11 +1527,11 @@ static int do_rename(struct inode *old_dir, struct = dentry *old_dentry, inc_nlink(old_dir); } } + unlock_4_inodes(old_dir, new_dir, new_inode, whiteout); if (whiteout) { - drop_nlink(whiteout); + ubifs_release_budget(c, &wht_req); iput(whiteout); } - unlock_4_inodes(old_dir, new_dir, new_inode, whiteout); out_release: ubifs_release_budget(c, &ino_req); ubifs_release_budget(c, &req); diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index 8ea680dba61e..75dab0ae3939 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -1207,9 +1207,9 @@ int ubifs_jnl_xrename(struct ubifs_info *c, const str= uct inode *fst_dir, * @sync: non-zero if the write-buffer has to be synchronized * * This function implements the re-name operation which may involve writin= g up - * to 4 inodes and 2 directory entries. It marks the written inodes as cle= an - * and returns zero on success. In case of failure, a negative error code = is - * returned. 
+ * to 4 inodes(new inode, whiteout inode, old and new parent directory ino= des) + * and 2 directory entries. It marks the written inodes as clean and retur= ns + * zero on success. In case of failure, a negative error code is returned. */ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir, const struct inode *old_inode, @@ -1222,14 +1222,15 @@ int ubifs_jnl_rename(struct ubifs_info *c, const st= ruct inode *old_dir, void *p; union ubifs_key key; struct ubifs_dent_node *dent, *dent2; - int err, dlen1, dlen2, ilen, lnum, offs, len, orphan_added =3D 0; + int err, dlen1, dlen2, ilen, wlen, lnum, offs, len, orphan_added =3D 0; int aligned_dlen1, aligned_dlen2, plen =3D UBIFS_INO_NODE_SZ; int last_reference =3D !!(new_inode && new_inode->i_nlink =3D=3D 0); int move =3D (old_dir !=3D new_dir); - struct ubifs_inode *new_ui; + struct ubifs_inode *new_ui, *whiteout_ui; u8 hash_old_dir[UBIFS_HASH_ARR_SZ]; u8 hash_new_dir[UBIFS_HASH_ARR_SZ]; u8 hash_new_inode[UBIFS_HASH_ARR_SZ]; + u8 hash_whiteout_inode[UBIFS_HASH_ARR_SZ]; u8 hash_dent1[UBIFS_HASH_ARR_SZ]; u8 hash_dent2[UBIFS_HASH_ARR_SZ]; =20 @@ -1249,9 +1250,20 @@ int ubifs_jnl_rename(struct ubifs_info *c, const str= uct inode *old_dir, } else ilen =3D 0; =20 + if (whiteout) { + whiteout_ui =3D ubifs_inode(whiteout); + ubifs_assert(c, mutex_is_locked(&whiteout_ui->ui_mutex)); + ubifs_assert(c, whiteout->i_nlink =3D=3D 1); + ubifs_assert(c, !whiteout_ui->dirty); + wlen =3D UBIFS_INO_NODE_SZ; + wlen +=3D whiteout_ui->data_len; + } else + wlen =3D 0; + aligned_dlen1 =3D ALIGN(dlen1, 8); aligned_dlen2 =3D ALIGN(dlen2, 8); - len =3D aligned_dlen1 + aligned_dlen2 + ALIGN(ilen, 8) + ALIGN(plen, 8); + len =3D aligned_dlen1 + aligned_dlen2 + ALIGN(ilen, 8) + + ALIGN(wlen, 8) + ALIGN(plen, 8); if (move) len +=3D plen; =20 @@ -1313,6 +1325,15 @@ int ubifs_jnl_rename(struct ubifs_info *c, const str= uct inode *old_dir, p +=3D ALIGN(ilen, 8); } =20 + if (whiteout) { + pack_inode(c, p, whiteout, 0); + err =3D 
ubifs_node_calc_hash(c, p, hash_whiteout_inode); + if (err) + goto out_release; + + p +=3D ALIGN(wlen, 8); + } + if (!move) { pack_inode(c, p, old_dir, 1); err =3D ubifs_node_calc_hash(c, p, hash_old_dir); @@ -1352,6 +1373,9 @@ int ubifs_jnl_rename(struct ubifs_info *c, const stru= ct inode *old_dir, if (new_inode) ubifs_wbuf_add_ino_nolock(&c->jheads[BASEHD].wbuf, new_inode->i_ino); + if (whiteout) + ubifs_wbuf_add_ino_nolock(&c->jheads[BASEHD].wbuf, + whiteout->i_ino); } release_head(c, BASEHD); =20 @@ -1368,8 +1392,6 @@ int ubifs_jnl_rename(struct ubifs_info *c, const stru= ct inode *old_dir, err =3D ubifs_tnc_add_nm(c, &key, lnum, offs, dlen2, hash_dent2, old_nm); if (err) goto out_ro; - - ubifs_delete_orphan(c, whiteout->i_ino); } else { err =3D ubifs_add_dirt(c, lnum, dlen2); if (err) @@ -1390,6 +1412,15 @@ int ubifs_jnl_rename(struct ubifs_info *c, const str= uct inode *old_dir, offs +=3D ALIGN(ilen, 8); } =20 + if (whiteout) { + ino_key_init(c, &key, whiteout->i_ino); + err =3D ubifs_tnc_add(c, &key, lnum, offs, wlen, + hash_whiteout_inode); + if (err) + goto out_ro; + offs +=3D ALIGN(wlen, 8); + } + ino_key_init(c, &key, old_dir->i_ino); err =3D ubifs_tnc_add(c, &key, lnum, offs, plen, hash_old_dir); if (err) @@ -1410,6 +1441,11 @@ int ubifs_jnl_rename(struct ubifs_info *c, const str= uct inode *old_dir, new_ui->synced_i_size =3D new_ui->ui_size; spin_unlock(&new_ui->ui_lock); } + /* + * No need to mark whiteout inode clean. + * Whiteout doesn't have non-zero size, no need to update + * synced_i_size for whiteout_ui. 
+ */ mark_inode_clean(c, ubifs_inode(old_dir)); if (move) mark_inode_clean(c, ubifs_inode(new_dir)); --=20 2.31.1 From nobody Wed Jul 1 16:47:56 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B0EEC433FE for ; Sat, 18 Dec 2021 10:23:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232815AbhLRKXr (ORCPT ); Sat, 18 Dec 2021 05:23:47 -0500 Received: from szxga03-in.huawei.com ([45.249.212.189]:30140 "EHLO szxga03-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232741AbhLRKXn (ORCPT ); Sat, 18 Dec 2021 05:23:43 -0500 Received: from kwepemi500003.china.huawei.com (unknown [172.30.72.56]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4JGMMQ2pKLz8vrh; Sat, 18 Dec 2021 18:21:22 +0800 (CST) Received: from kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi500003.china.huawei.com (7.221.188.51) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:38 +0800 Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:37 +0800 From: Zhihao Cheng To: , , , , , CC: , , Subject: [PATCH v4 06/13] ubifs: Fix 'ui->dirty' race between do_tmpfile() and writeback work Date: Sat, 18 Dec 2021 18:35:05 +0800 Message-ID: <20211218103512.370420-7-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com> References: <20211218103512.370420-1-chengzhihao1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To 
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

'ui->dirty' is not protected by 'ui_mutex' in function do_tmpfile(), which
may race with ubifs_write_inode[wb_workfn] to access/update 'ui->dirty',
so that the dirty space is eventually released twice.

        open(O_TMPFILE)                wb_workfn
do_tmpfile
  ubifs_budget_space(ino_req = { .dirtied_ino = 1})
  d_tmpfile // mark inode(tmpfile) dirty
  ubifs_jnl_update // without holding tmpfile's ui_mutex
    mark_inode_clean(ui)
      if (ui->dirty)
        ubifs_release_dirty_inode_budget(ui)  // release first time
                                   ubifs_write_inode
                                     mutex_lock(&ui->ui_mutex)
                                     ubifs_release_dirty_inode_budget(ui)
                                     // release second time
                                     mutex_unlock(&ui->ui_mutex)
      ui->dirty = 0

Running generic/476 easily reproduces the following message (see the
reproducer in [Link]):

  UBIFS error (ubi0:0 pid 2578): ubifs_assert_failed [ubifs]: UBIFS assert
  failed: c->bi.dd_growth >= 0, in fs/ubifs/budget.c:554
  UBIFS warning (ubi0:0 pid 2578): ubifs_ro_mode [ubifs]: switched to
  read-only mode, error -22
  Workqueue: writeback wb_workfn (flush-ubifs_0_0)
  Call Trace:
    ubifs_ro_mode+0x54/0x60 [ubifs]
    ubifs_assert_failed+0x4b/0x80 [ubifs]
    ubifs_release_budget+0x468/0x5a0 [ubifs]
    ubifs_release_dirty_inode_budget+0x53/0x80 [ubifs]
    ubifs_write_inode+0x121/0x1f0 [ubifs]
    ...
    wb_workfn+0x283/0x7b0

Fix it by holding the tmpfile's ubifs inode lock during ubifs_jnl_update().
A similar problem exists in whiteout renaming, but the previous fix
("ubifs: Rename whiteout atomically") has already solved it.

Fixes: 474b93704f32163 ("ubifs: Implement O_TMPFILE")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214765
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/dir.c | 60 +++++++++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index deaf2d5dba5b..ebdc9aa04cbb 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -397,6 +397,32 @@ static struct inode *create_whiteout(struct inode *dir, struct dentry *dentry)
 	return ERR_PTR(err);
 }
 
+/**
+ * lock_2_inodes - a wrapper for locking two UBIFS inodes.
+ * @inode1: first inode
+ * @inode2: second inode
+ *
+ * We do not implement any tricks to guarantee strict lock ordering, because
+ * VFS has already done it for us on the @i_mutex. So this is just a simple
+ * wrapper function.
+ */
+static void lock_2_inodes(struct inode *inode1, struct inode *inode2)
+{
+	mutex_lock_nested(&ubifs_inode(inode1)->ui_mutex, WB_MUTEX_1);
+	mutex_lock_nested(&ubifs_inode(inode2)->ui_mutex, WB_MUTEX_2);
+}
+
+/**
+ * unlock_2_inodes - a wrapper for unlocking two UBIFS inodes.
+ * @inode1: first inode
+ * @inode2: second inode
+ */
+static void unlock_2_inodes(struct inode *inode1, struct inode *inode2)
+{
+	mutex_unlock(&ubifs_inode(inode2)->ui_mutex);
+	mutex_unlock(&ubifs_inode(inode1)->ui_mutex);
+}
+
 static int ubifs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 			 struct dentry *dentry, umode_t mode)
 {
@@ -404,7 +430,7 @@ static int ubifs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	struct ubifs_info *c = dir->i_sb->s_fs_info;
 	struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1};
 	struct ubifs_budget_req ino_req = { .dirtied_ino = 1 };
-	struct ubifs_inode *ui, *dir_ui = ubifs_inode(dir);
+	struct ubifs_inode *ui;
 	int err, instantiated = 0;
 	struct fscrypt_name nm;
 
@@ -452,18 +478,18 @@ static int ubifs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 	instantiated = 1;
 	mutex_unlock(&ui->ui_mutex);
 
-	mutex_lock(&dir_ui->ui_mutex);
+	lock_2_inodes(dir, inode);
 	err = ubifs_jnl_update(c, dir, &nm, inode, 1, 0);
 	if (err)
 		goto out_cancel;
-	mutex_unlock(&dir_ui->ui_mutex);
+	unlock_2_inodes(dir, inode);
 
 	ubifs_release_budget(c, &req);
 
 	return 0;
 
 out_cancel:
-	mutex_unlock(&dir_ui->ui_mutex);
+	unlock_2_inodes(dir, inode);
 out_inode:
 	make_bad_inode(inode);
 	if (!instantiated)
@@ -690,32 +716,6 @@ static int ubifs_dir_release(struct inode *dir, struct file *file)
 	return 0;
 }
 
-/**
- * lock_2_inodes - a wrapper for locking two UBIFS inodes.
- * @inode1: first inode
- * @inode2: second inode
- *
- * We do not implement any tricks to guarantee strict lock ordering, because
- * VFS has already done it for us on the @i_mutex. So this is just a simple
- * wrapper function.
- */
-static void lock_2_inodes(struct inode *inode1, struct inode *inode2)
-{
-	mutex_lock_nested(&ubifs_inode(inode1)->ui_mutex, WB_MUTEX_1);
-	mutex_lock_nested(&ubifs_inode(inode2)->ui_mutex, WB_MUTEX_2);
-}
-
-/**
- * unlock_2_inodes - a wrapper for unlocking two UBIFS inodes.
- * @inode1: first inode
- * @inode2: second inode
- */
-static void unlock_2_inodes(struct inode *inode1, struct inode *inode2)
-{
-	mutex_unlock(&ubifs_inode(inode2)->ui_mutex);
-	mutex_unlock(&ubifs_inode(inode1)->ui_mutex);
-}
-
 static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
 		      struct dentry *dentry)
 {
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9937AC433EF
	for ; Sat, 18 Dec 2021 10:24:06 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232972AbhLRKYF (ORCPT ); Sat, 18 Dec 2021 05:24:05 -0500
Received: from szxga01-in.huawei.com ([45.249.212.187]:33867 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232734AbhLRKXo (ORCPT ); Sat, 18 Dec 2021 05:23:44 -0500
Received: from kwepemi500002.china.huawei.com (unknown [172.30.72.57])
	by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMPg2t9MzcbrL;
	Sat, 18 Dec 2021 18:23:19 +0800 (CST)
Received: from kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi500002.china.huawei.com (7.221.188.171) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:39 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:38 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4
	07/13] ubifs: Rectify space amount budget for mkdir/tmpfile operations
Date: Sat, 18 Dec 2021 18:35:06 +0800
Message-ID: <20211218103512.370420-8-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

UBIFS must make sure the flash has enough space to store dirty data
(in-memory data that is newer than what is on flash); the space budget is
designed to do exactly that. If the budget accounts for less data than is
actually needed, 'make_reservation()' has to do more work (returning
-ENOSPC if no free space is left; sometimes "cannot reserve xxx bytes in
jhead xxx, error -28" shows up in the ubifs error messages) with ubifs
inodes locked, which may affect other syscalls.

A simple way to decide how much space a budget needs: see how much space
'make_reservation()' requests in the ubifs_jnl_xxx() function for the
corresponding operation.

It's better to report ENOSPC in ubifs_budget_space(), as early as we can.

Fixes: 474b93704f32163 ("ubifs: Implement O_TMPFILE")
Fixes: 1e51764a3c2ac05 ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/dir.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index ebdc9aa04cbb..7ae48f273feb 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -428,15 +428,18 @@ static int ubifs_tmpfile(struct user_namespace *mnt_userns, struct inode *dir,
 {
 	struct inode *inode;
 	struct ubifs_info *c = dir->i_sb->s_fs_info;
-	struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1};
+	struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1,
+					.dirtied_ino = 1};
 	struct ubifs_budget_req ino_req = { .dirtied_ino = 1 };
 	struct ubifs_inode *ui;
 	int err, instantiated = 0;
 	struct fscrypt_name nm;
 
 	/*
-	 * Budget request settings: new dirty inode, new direntry,
-	 * budget for dirtied inode will be released via writeback.
+	 * Budget request settings: new inode, new direntry, changing the
+	 * parent directory inode.
+	 * Allocate budget separately for new dirtied inode, the budget will
+	 * be released via writeback.
 	 */
 
 	dbg_gen("dent '%pd', mode %#hx in dir ino %lu",
@@ -979,7 +982,8 @@ static int ubifs_mkdir(struct user_namespace *mnt_userns, struct inode *dir,
 	struct ubifs_inode *dir_ui = ubifs_inode(dir);
 	struct ubifs_info *c = dir->i_sb->s_fs_info;
 	int err, sz_change;
-	struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1 };
+	struct ubifs_budget_req req = { .new_ino = 1, .new_dent = 1,
+					.dirtied_ino = 1};
 	struct fscrypt_name nm;
 
 	/*
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 19C77C433F5
	for ; Sat, 18 Dec 2021 10:24:02 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232767AbhLRKYB (ORCPT ); Sat, 18 Dec 2021 05:24:01 -0500
Received: from szxga08-in.huawei.com ([45.249.212.255]:29141 "EHLO
	szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232746AbhLRKXp (ORCPT ); Sat, 18 Dec 2021 05:23:45 -0500
Received: from kwepemi100010.china.huawei.com (unknown [172.30.72.56])
	by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4JGMLX4dTLz1DK4c;
	Sat, 18 Dec 2021 18:20:36 +0800 (CST)
Received: from kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi100010.china.huawei.com (7.221.188.54) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:40 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:39 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 08/13] ubifs: setflags: Make dirtied_ino_d 8 bytes aligned
Date: Sat, 18 Dec 2021 18:35:07 +0800
Message-ID:
	<20211218103512.370420-9-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Align 'ui->data_len' to 8 bytes before assigning it to dirtied_ino_d.
Since 8871d84c8f8b0c6b ("ubifs: convert to fileattr") was applied,
'setflags()' only affects regular files and directories, and only xattr
inodes, symlink inodes and special inodes (pipe/char_dev/block_dev) have
a non-zero 'ui->data_len' field, so the assertion
'!(req->dirtied_ino_d & 7)' cannot fail in ubifs_budget_space().
To keep the assertion from failing after future changes (e.g. setflags
operating on special inodes), it's better to make dirtied_ino_d 8-byte
aligned; after all, the aligned size is still zero for regular files.

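For illustration only (not part of the patch): a minimal user-space sketch
of the rounding that ALIGN(ui->data_len, 8) performs and of the check it is
meant to satisfy. The names align_up_8() and budget_accepts() are
hypothetical stand-ins for the kernel's ALIGN() macro and for the
'!(req->dirtied_ino_d & 7)' assertion; they are assumptions made for this
example, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's ALIGN(x, 8): round x up to the
 * next multiple of 8. Values that are already multiples of 8 (including
 * 0, the regular-file case) are returned unchanged.
 */
static unsigned int align_up_8(unsigned int x)
{
	return (x + 7u) & ~7u;
}

/*
 * Hypothetical mirror of the '!(req->dirtied_ino_d & 7)' assertion in
 * ubifs_budget_space(): true only when the low three bits are clear.
 */
static bool budget_accepts(unsigned int dirtied_ino_d)
{
	return (dirtied_ino_d & 7u) == 0;
}
```

With these helpers, a data_len of 0 stays 0, while a data_len such as 13
would be rejected as-is but accepted once rounded up to 16.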
Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ubifs/ioctl.c b/fs/ubifs/ioctl.c
index c6a863487780..71bcebe45f9c 100644
--- a/fs/ubifs/ioctl.c
+++ b/fs/ubifs/ioctl.c
@@ -108,7 +108,7 @@ static int setflags(struct inode *inode, int flags)
 	struct ubifs_inode *ui = ubifs_inode(inode);
 	struct ubifs_info *c = inode->i_sb->s_fs_info;
 	struct ubifs_budget_req req = { .dirtied_ino = 1,
-					.dirtied_ino_d = ui->data_len };
+			.dirtied_ino_d = ALIGN(ui->data_len, 8) };
 
 	err = ubifs_budget_space(c, &req);
 	if (err)
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 420BDC433F5
	for ; Sat, 18 Dec 2021 10:24:08 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232895AbhLRKYH (ORCPT ); Sat, 18 Dec 2021 05:24:07 -0500
Received: from szxga08-in.huawei.com ([45.249.212.255]:30072 "EHLO
	szxga08-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232755AbhLRKXu (ORCPT ); Sat, 18 Dec 2021 05:23:50 -0500
Received: from kwepemi100008.china.huawei.com (unknown [172.30.72.55])
	by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4JGMLb41rVz1DK5t;
	Sat, 18 Dec 2021 18:20:39 +0800 (CST)
Received: from kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi100008.china.huawei.com (7.221.188.57) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:40 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:40 +0800
From: Zhihao
	Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 09/13] ubifs: Fix read out-of-bounds in ubifs_wbuf_write_nolock()
Date: Sat, 18 Dec 2021 18:35:08 +0800
Message-ID: <20211218103512.370420-10-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Function ubifs_wbuf_write_nolock() may read 'buf' out of bounds in the
following process:

ubifs_wbuf_write_nolock():
  aligned_len = ALIGN(len, 8);   // Assume len = 4089, aligned_len = 4096
  if (aligned_len <= wbuf->avail) ... // Not satisfied
  if (wbuf->used) {
    ubifs_leb_write()  // Fill some data in avail wbuf
    len -= wbuf->avail;   // len is still not 8-bytes aligned
    aligned_len -= wbuf->avail;
  }
  n = aligned_len >> c->max_write_shift;
  if (n) {
    n <<= c->max_write_shift;
    err = ubifs_leb_write(c, wbuf->lnum, buf + written,
                          wbuf->offs, n);
    // n > len, read out of bounds less than 8(n-len) bytes
  }

which is caught by KASAN:
  =========================================================
  BUG: KASAN: slab-out-of-bounds in ecc_sw_hamming_calculate+0x1dc/0x7d0
  Read of size 4 at addr ffff888105594ff8 by task kworker/u8:4/128
  Workqueue: writeback wb_workfn (flush-ubifs_0_0)
  Call Trace:
    kasan_report.cold+0x81/0x165
    nand_write_page_swecc+0xa9/0x160
    ubifs_leb_write+0xf2/0x1b0 [ubifs]
    ubifs_wbuf_write_nolock+0x421/0x12c0 [ubifs]
    write_head+0xdc/0x1c0 [ubifs]
    ubifs_jnl_write_inode+0x627/0x960 [ubifs]
    wb_workfn+0x8af/0xb80

Function
ubifs_wbuf_write_nolock() accepts a 'len' parameter that is not 8-byte
aligned; 'len' represents the true length of buf (which is allocated
in 'ubifs_jnl_xxx', e.g. ubifs_jnl_write_inode), so
ubifs_wbuf_write_nolock() must handle the length read from 'buf' carefully
in order to write the LEB safely.

A reproducer is available at [Link].

Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214785
Reported-by: Chengsong Ke
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/io.c | 34 ++++++++++++++++++++++++++++++----
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/fs/ubifs/io.c b/fs/ubifs/io.c
index 00b61dba62b7..b019dd6f7fa0 100644
--- a/fs/ubifs/io.c
+++ b/fs/ubifs/io.c
@@ -833,16 +833,42 @@ int ubifs_wbuf_write_nolock(struct ubifs_wbuf *wbuf, void *buf, int len)
 	 */
 	n = aligned_len >> c->max_write_shift;
 	if (n) {
-		n <<= c->max_write_shift;
+		int m = n - 1;
+
 		dbg_io("write %d bytes to LEB %d:%d", n, wbuf->lnum,
 		       wbuf->offs);
-		err = ubifs_leb_write(c, wbuf->lnum, buf + written,
-				      wbuf->offs, n);
+
+		if (m) {
+			/* '(n-1)<<c->max_write_shift < len' is always true. */
+			m <<= c->max_write_shift;
+			err = ubifs_leb_write(c, wbuf->lnum, buf + written,
+					      wbuf->offs, m);
+			if (err)
+				goto out;
+			wbuf->offs += m;
+			aligned_len -= m;
+			len -= m;
+			written += m;
+		}
+
+		/*
+		 * The non-written len of buf may be less than 'n' because
+		 * parameter 'len' is not 8 bytes aligned, so here we read
+		 * min(len, n) bytes from buf.
+		 */
+		n = 1 << c->max_write_shift;
+		memcpy(wbuf->buf, buf + written, min(len, n));
+		if (n > len) {
+			ubifs_assert(c, n - len < 8);
+			ubifs_pad(c, wbuf->buf + len, n - len);
+		}
+
+		err = ubifs_leb_write(c, wbuf->lnum, wbuf->buf, wbuf->offs, n);
 		if (err)
 			goto out;
 		wbuf->offs += n;
 		aligned_len -= n;
-		len -= n;
+		len -= min(len, n);
 		written += n;
 	}
 
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A56F1C433F5
	for ; Sat, 18 Dec 2021 10:23:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232801AbhLRKXt (ORCPT ); Sat, 18 Dec 2021 05:23:49 -0500
Received: from szxga01-in.huawei.com ([45.249.212.187]:33868 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232760AbhLRKXn (ORCPT ); Sat, 18 Dec 2021 05:23:43 -0500
Received: from kwepemi500001.china.huawei.com (unknown [172.30.72.56])
	by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMPj4qnRzcbh2;
	Sat, 18 Dec 2021 18:23:21 +0800 (CST)
Received: from kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi500001.china.huawei.com (7.221.188.114) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:41 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:40 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 10/13] ubifs: Fix to add refcount once page is set private
Date: Sat, 18 Dec 2021 18:35:09 +0800
Message-ID: <20211218103512.370420-11-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To:
	<20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

MM defines the rule [1] very clearly: once a page has the PG_private flag
set, its refcount should be incremented. Main flows such as pageout() and
migrate_page() also assume there is one additional page reference if
page_has_private() returns true. Otherwise, we may get a BUG in page
migration:

  page:0000000080d05b9d refcount:-1 mapcount:0 mapping:000000005f4d82a8
  index:0xe2 pfn:0x14c12
  aops:ubifs_file_address_operations [ubifs] ino:8f1 dentry name:"f30e"
  flags: 0x1fffff80002405(locked|uptodate|owner_priv_1|private|node=0|
  zone=1|lastcpupid=0x1fffff)
  page dumped because: VM_BUG_ON_PAGE(page_count(page) != 0)
  ------------[ cut here ]------------
  kernel BUG at include/linux/page_ref.h:184!
  invalid opcode: 0000 [#1] SMP
  CPU: 3 PID: 38 Comm: kcompactd0 Not tainted 5.15.0-rc5
  RIP: 0010:migrate_page_move_mapping+0xac3/0xe70
  Call Trace:
    ubifs_migrate_page+0x22/0xc0 [ubifs]
    move_to_new_page+0xb4/0x600
    migrate_pages+0x1523/0x1cc0
    compact_zone+0x8c5/0x14b0
    kcompactd+0x2bc/0x560
    kthread+0x18c/0x1e0
    ret_from_fork+0x1f/0x30

Before going further, one concept should be made clear: what the refcount
means for a page obtained from grab_cache_page_write_begin(). There are 2
situations:
Situation 1: refcount is 3, page is created by __page_cache_alloc.
  TYPE_A - the write process is using this page
  TYPE_B - page is assigned to one certain mapping by calling
           __add_to_page_cache_locked()
  TYPE_C - page is added into the pagevec list of the current cpu by
           calling lru_cache_add()
Situation 2: refcount is 2, page is obtained from the mapping's tree
  TYPE_B - page has been assigned to one certain mapping
  TYPE_A - the write process is using this page (by calling
           page_cache_get_speculative())
The filesystem releases one refcount by calling put_page() in
xxx_write_end(); the released refcount corresponds to TYPE_A (the write
task is using it). If any process is using a page, the page migration
process skips the page by checking whether expected_page_refs() equals
the page refcount.

The BUG is caused by the following process:
    PA(cpu 0)                           kcompactd(cpu 1)
                                compact_zone
ubifs_write_begin
  page_a = grab_cache_page_write_begin
    add_to_page_cache_lru
      lru_cache_add
        pagevec_add // put page into cpu 0's pagevec
  (refcnt = 3, for page creation process)
ubifs_write_end
  SetPagePrivate(page_a) // doesn't increase page count !
  unlock_page(page_a)
  put_page(page_a)  // refcnt = 2
  [...]

    PB(cpu 0)
filemap_read
  filemap_get_pages
    add_to_page_cache_lru
      lru_cache_add
        __pagevec_lru_add // traverse all pages in cpu 0's pagevec
          __pagevec_lru_add_fn
            SetPageLRU(page_a)
                                isolate_migratepages
                                  isolate_migratepages_block
                                    get_page_unless_zero(page_a)
                                    // refcnt = 3
                                      list_add(page_a, from_list)
                                migrate_pages(from_list)
                                  __unmap_and_move
                                    move_to_new_page
                                      ubifs_migrate_page(page_a)
                                        migrate_page_move_mapping
                                          expected_page_refs get 3
                                  (migration[1] + mapping[1] + private[1])
         release_pages
           put_page_testzero(page_a) // refcnt = 3
                                          page_ref_freeze  // refcnt = 0
             page_ref_dec_and_test(0 - 1 = -1)
                                          page_ref_unfreeze
                                            VM_BUG_ON_PAGE(-1 != 0, page)

UBIFS doesn't increase the page refcount after setting the private flag,
which leads the page migration task to believe that the page is not used
by any other process, so the page is migrated.
This causes concurrent access to the page refcount between put_page(),
called by another process (e.g. a read process calling lru_cache_add), and
page_ref_unfreeze(), called by the migration task.

zhangjun has already tried to fix this problem [2] by recalculating the
page refcnt in ubifs_migrate_page(). It's better to follow the MM rule [1],
because, just as Kirill suggested in [2], we need to check all users of
the page_has_private() helper. As f2fs does in [3], fix it by
adding/dropping a refcount when setting/clearing private for a page. BTW,
according to [4], we set 'page->private' to 1 because ubifs simply calls
SetPagePrivate(). Also, [5] provided a common helper to set/clear page
private; ubifs can use this helper, following the example of iomap, afs,
btrfs, etc.

See [6] for a reproducer.

[1] https://lore.kernel.org/lkml/2b19b3c4-2bc4-15fa-15cc-27a13e5c7af1@aol.com
[2] https://www.spinics.net/lists/linux-mtd/msg04018.html
[3] http://lkml.iu.edu/hypermail/linux/kernel/1903.0/03313.html
[4] https://lore.kernel.org/linux-f2fs-devel/20210422154705.GO3596236@casper.infradead.org
[5] https://lore.kernel.org/all/20200517214718.468-1-guoqing.jiang@cloud.ionos.com
[6] https://bugzilla.kernel.org/show_bug.cgi?id=214961

Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/file.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 5cfa28cd00cd..6b45a037a047 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -570,7 +570,7 @@ static int ubifs_write_end(struct file *file, struct address_space *mapping,
 	}
 
 	if (!PagePrivate(page)) {
-		SetPagePrivate(page);
+		attach_page_private(page, (void *)1);
 		atomic_long_inc(&c->dirty_pg_cnt);
 		__set_page_dirty_nobuffers(page);
 	}
@@ -947,7 +947,7 @@ static int do_writepage(struct page *page, int len)
 	release_existing_page_budget(c);
 
 	atomic_long_dec(&c->dirty_pg_cnt);
-	ClearPagePrivate(page);
+	detach_page_private(page);
 	ClearPageChecked(page);
 
 	kunmap(page);
@@ -1304,7 +1304,7 @@ static void ubifs_invalidatepage(struct page *page, unsigned int offset,
 	release_existing_page_budget(c);
 
 	atomic_long_dec(&c->dirty_pg_cnt);
-	ClearPagePrivate(page);
+	detach_page_private(page);
 	ClearPageChecked(page);
 }
 
@@ -1471,8 +1471,8 @@ static int ubifs_migrate_page(struct address_space *mapping,
 		return rc;
 
 	if (PagePrivate(page)) {
-		ClearPagePrivate(page);
-		SetPagePrivate(newpage);
+		detach_page_private(page);
+		attach_page_private(newpage, (void *)1);
 	}
 
 	if (mode != MIGRATE_SYNC_NO_COPY)
@@ -1496,7 +1496,7 @@ static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags)
 		return 0;
 	ubifs_assert(c, PagePrivate(page));
 	ubifs_assert(c, 0);
-	ClearPagePrivate(page);
+	detach_page_private(page);
 	ClearPageChecked(page);
 	return 1;
 }
@@ -1567,7 +1567,7 @@ static vm_fault_t ubifs_vm_page_mkwrite(struct vm_fault *vmf)
 	else {
 		if (!PageChecked(page))
 			ubifs_convert_page_budget(c);
-		SetPagePrivate(page);
+		attach_page_private(page, (void *)1);
 		atomic_long_inc(&c->dirty_pg_cnt);
 		__set_page_dirty_nobuffers(page);
 	}
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 3C886C433EF
	for ; Sat, 18 Dec 2021 10:23:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232818AbhLRKXy (ORCPT ); Sat, 18 Dec 2021 05:23:54 -0500
Received: from szxga01-in.huawei.com ([45.249.212.187]:33869 "EHLO
	szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232773AbhLRKXo (ORCPT ); Sat, 18 Dec 2021 05:23:44 -0500
Received: from kwepemi100009.china.huawei.com (unknown [172.30.72.57])
	by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMPk3NDQzcbsj;
	Sat, 18 Dec 2021 18:23:22 +0800 (CST)
Received: from
	kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi100009.china.huawei.com (7.221.188.242) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:42 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:41 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 11/13] ubi: fastmap: Return error code if memory allocation fails in add_aeb()
Date: Sat, 18 Dec 2021 18:35:10 +0800
Message-ID: <20211218103512.370420-12-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Abort fastmap scanning and return an error code if memory allocation fails
in add_aeb(). Otherwise UBI ends up with wrong PEB statistics after
scanning.

Fixes: dbb7d2a88d2a7b ("UBI: Add fastmap core")
Signed-off-by: Zhihao Cheng
---
 drivers/mtd/ubi/fastmap.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 022af59906aa..6b5f1ffd961b 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -468,7 +468,9 @@ static int scan_pool(struct ubi_device *ubi, struct ubi_attach_info *ai,
 			if (err == UBI_IO_FF_BITFLIPS)
 				scrub = 1;
 
-			add_aeb(ai, free, pnum, ec, scrub);
+			ret = add_aeb(ai, free, pnum, ec, scrub);
+			if (ret)
+				goto out;
 			continue;
 		} else if (err == 0 || err == UBI_IO_BITFLIPS) {
 			dbg_bld("Found non empty PEB:%i in pool", pnum);
@@ -638,8 +640,10 @@ static int ubi_attach_fastmap(struct ubi_device *ubi,
 		if (fm_pos >= fm_size)
 			goto fail_bad;
 
-		add_aeb(ai, &ai->free, be32_to_cpu(fmec->pnum),
-			be32_to_cpu(fmec->ec), 0);
+		ret = add_aeb(ai, &ai->free, be32_to_cpu(fmec->pnum),
+			      be32_to_cpu(fmec->ec), 0);
+		if (ret)
+			goto fail;
 	}
 
 	/* read EC values from used list */
@@ -649,8 +653,10 @@ static int ubi_attach_fastmap(struct ubi_device *ubi,
 		if (fm_pos >= fm_size)
 			goto fail_bad;
 
-		add_aeb(ai, &used, be32_to_cpu(fmec->pnum),
-			be32_to_cpu(fmec->ec), 0);
+		ret = add_aeb(ai, &used, be32_to_cpu(fmec->pnum),
+			      be32_to_cpu(fmec->ec), 0);
+		if (ret)
+			goto fail;
 	}
 
 	/* read EC values from scrub list */
@@ -660,8 +666,10 @@ static int ubi_attach_fastmap(struct ubi_device *ubi,
 		if (fm_pos >= fm_size)
 			goto fail_bad;
 
-		add_aeb(ai, &used, be32_to_cpu(fmec->pnum),
-			be32_to_cpu(fmec->ec), 1);
+		ret = add_aeb(ai, &used, be32_to_cpu(fmec->pnum),
+			      be32_to_cpu(fmec->ec), 1);
+		if (ret)
+			goto fail;
 	}
 
 	/* read EC values from erase list */
@@ -671,8 +679,10 @@ static int ubi_attach_fastmap(struct ubi_device *ubi,
 		if (fm_pos >= fm_size)
 			goto fail_bad;
 
-		add_aeb(ai, &ai->erase, be32_to_cpu(fmec->pnum),
-			be32_to_cpu(fmec->ec), 1);
+		ret = add_aeb(ai,
&ai->erase, be32_to_cpu(fmec->pnum),
+			      be32_to_cpu(fmec->ec), 1);
+		if (ret)
+			goto fail;
 	}
 
 	ai->mean_ec = div_u64(ai->ec_sum, ai->ec_count);
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A7BABC433EF
	for ; Sat, 18 Dec 2021 10:23:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232830AbhLRKX6 (ORCPT ); Sat, 18 Dec 2021 05:23:58 -0500
Received: from szxga02-in.huawei.com ([45.249.212.188]:16827 "EHLO
	szxga02-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S232740AbhLRKXp (ORCPT ); Sat, 18 Dec 2021 05:23:45 -0500
Received: from kwepemi100007.china.huawei.com (unknown [172.30.72.54])
	by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4JGMPD1xXDz91Pb;
	Sat, 18 Dec 2021 18:22:56 +0800 (CST)
Received: from kwepemm600013.china.huawei.com (7.193.23.68) by
	kwepemi100007.china.huawei.com (7.221.188.115) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2308.20; Sat, 18 Dec 2021 18:23:43 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com
	(7.193.23.68) with Microsoft SMTP Server (version=TLS1_2,
	cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20;
	Sat, 18 Dec 2021 18:23:42 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 12/13] ubi: fastmap: Add all fastmap pebs into 'ai->fastmap' when fm->used_blocks>=2
Date: Sat, 18 Dec 2021 18:35:11 +0800
Message-ID: <20211218103512.370420-13-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy:
 dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Fastmap PEBs (pnum >= UBI_FM_MAX_START) won't be added to 'ai->fastmap'
while attaching a UBI device if 'fm->used_blocks' is at least 2, which may
trigger a warning from 'ubi_assert(ubi->good_peb_count == found_pebs)':

  UBI assert failed in ubi_wl_init at 1878 (pid 2409)
  Call Trace:
    ubi_wl_init.cold+0xae/0x2af [ubi]
    ubi_attach+0x1b0/0x780 [ubi]
    ubi_init+0x23a/0x3ad [ubi]
    load_module+0x22d2/0x2430

Reproduce:
  ID="0x20,0x33,0x00,0x00" # 16M 16KB PEB, 512 page
  modprobe nandsim id_bytes=$ID
  modprobe ubi mtd="0,0" fm_autoconvert  # Fastmap takes 2 pebs
  rmmod ubi
  modprobe ubi mtd="0,0" fm_autoconvert  # Attach by fastmap

Add all used fastmap PEBs to the list 'ai->fastmap' so that they are
counted in 'found_pebs'.

Fixes: fdf10ed710c0aa ("ubi: Rework Fastmap attach base code")
Signed-off-by: Zhihao Cheng
---
 drivers/mtd/ubi/fastmap.c | 35 +++++------------------------------
 1 file changed, 5 insertions(+), 30 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 6b5f1ffd961b..01dcdd94c9d2 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -828,24 +828,6 @@ static int find_fm_anchor(struct ubi_attach_info *ai)
 	return ret;
 }
 
-static struct ubi_ainf_peb *clone_aeb(struct ubi_attach_info *ai,
-				      struct ubi_ainf_peb *old)
-{
-	struct ubi_ainf_peb *new;
-
-	new = ubi_alloc_aeb(ai, old->pnum, old->ec);
-	if (!new)
-		return NULL;
-
-	new->vol_id = old->vol_id;
-	new->sqnum = old->sqnum;
-	new->lnum = old->lnum;
-	new->scrub = old->scrub;
-	new->copy_flag = old->copy_flag;
-
-	return new;
-}
-
 /**
  * ubi_scan_fastmap - scan the fastmap.
  * @ubi: UBI device object
@@ -865,7 +847,6 @@ int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 	struct ubi_vid_hdr *vh;
 	struct ubi_ec_hdr *ech;
 	struct ubi_fastmap_layout *fm;
-	struct ubi_ainf_peb *aeb;
 	int i, used_blocks, pnum, fm_anchor, ret = 0;
 	size_t fm_size;
 	__be32 crc, tmp_crc;
@@ -875,17 +856,6 @@ int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 	if (fm_anchor < 0)
 		return UBI_NO_FASTMAP;
 
-	/* Copy all (possible) fastmap blocks into our new attach structure. */
-	list_for_each_entry(aeb, &scan_ai->fastmap, u.list) {
-		struct ubi_ainf_peb *new;
-
-		new = clone_aeb(ai, aeb);
-		if (!new)
-			return -ENOMEM;
-
-		list_add(&new->u.list, &ai->fastmap);
-	}
-
 	down_write(&ubi->fm_protect);
 	memset(ubi->fm_buf, 0, ubi->fm_size);
 
@@ -1029,6 +999,11 @@ int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 				"err: %i)", i, pnum, ret);
 			goto free_hdr;
 		}
+
+		/* Add all fastmap blocks into attach structure. */
+		ret = add_aeb(ai, &ai->fastmap, pnum, be64_to_cpu(ech->ec), 0);
+		if (ret)
+			goto free_hdr;
 	}
 
 	kfree(fmsb);
-- 
2.31.1

From nobody Wed Jul 1 16:47:56 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 625AAC433FE for ; Sat, 18 Dec 2021 10:24:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232859AbhLRKYC (ORCPT ); Sat, 18 Dec 2021 05:24:02 -0500
Received: from szxga01-in.huawei.com ([45.249.212.187]:15939 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232792AbhLRKXp (ORCPT ); Sat, 18 Dec 2021 05:23:45 -0500
Received: from kwepemi100006.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4JGMLc1lpszZdLq; Sat, 18 Dec 2021 18:20:40 +0800 (CST)
Received: from
 kwepemm600013.china.huawei.com (7.193.23.68) by kwepemi100006.china.huawei.com (7.221.188.165) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:43 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600013.china.huawei.com (7.193.23.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.20; Sat, 18 Dec 2021 18:23:43 +0800
From: Zhihao Cheng
To: , , , , ,
CC: , ,
Subject: [PATCH v4 13/13] ubifs: ubifs_writepage: Mark page dirty after writing inode failed
Date: Sat, 18 Dec 2021 18:35:12 +0800
Message-ID: <20211218103512.370420-14-chengzhihao1@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211218103512.370420-1-chengzhihao1@huawei.com>
References: <20211218103512.370420-1-chengzhihao1@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemm600013.china.huawei.com (7.193.23.68)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

There are two states for ubifs writing pages:
1. Dirty, Private
2. Not Dirty, Not Private

There is a third possibility, which may be related to [1]: a page that is
Private but not Dirty, caused by the following process:

PA:
  lock(page)
  ubifs_write_end
    attach_page_private          // set Private
    __set_page_dirty_nobuffers   // set Dirty
  unlock(page)

  write_cache_pages
    lock(page)
    clear_page_dirty_for_io(page)  // clear Dirty
    ubifs_writepage
      write_inode
      // fail, goto out, the following code is not executed
      // do_writepage
      //   set_page_writeback    // set Writeback
      //   detach_page_private   // clear Private
      //   end_page_writeback    // clear Writeback
      out:
      unlock(page)               // Private, Not Dirty

PB:
  ksys_fadvise64_64
    generic_fadvise
      invalidate_inode_page
      // page is neither Dirty nor Writeback
        invalidate_complete_page
        // page_has_private is true
          try_to_release_page
            ubifs_releasepage
              ubifs_assert(c, 0) !!!

Then we may get the following assertion failure:

  UBIFS error (ubi0:0 pid 1492): ubifs_assert_failed [ubifs]:
  UBIFS assert failed: 0, in fs/ubifs/file.c:1499
  UBIFS warning (ubi0:0 pid 1492): ubifs_ro_mode [ubifs]:
  switched to read-only mode, error -22
  CPU: 2 PID: 1492 Comm: aa Not tainted 5.16.0-rc2-00012-g7bb767dee0ba-dirty
  Call Trace:
    dump_stack+0x13/0x1b
    ubifs_ro_mode+0x54/0x60 [ubifs]
    ubifs_assert_failed+0x4b/0x80 [ubifs]
    ubifs_releasepage+0x7e/0x1e0 [ubifs]
    try_to_release_page+0x57/0xe0
    invalidate_inode_page+0xfb/0x130
    invalidate_mapping_pagevec+0x12/0x20
    generic_fadvise+0x303/0x3c0
    vfs_fadvise+0x35/0x40
    ksys_fadvise64_64+0x4c/0xb0

See [2] for a reproducer.

[1] https://linux-mtd.infradead.narkive.com/NQoBeT1u/patch-rfc-ubifs-fix-assert-failed-in-ubifs-set-page-dirty
[2] https://bugzilla.kernel.org/show_bug.cgi?id=215357

Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system")
Signed-off-by: Zhihao Cheng
---
 fs/ubifs/file.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 6b45a037a047..7cc2abcb70ae 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1031,7 +1031,7 @@ static int ubifs_writepage(struct page *page, struct writeback_control *wbc)
 	if (page->index >= synced_i_size >> PAGE_SHIFT) {
 		err = inode->i_sb->s_op->write_inode(inode, NULL);
 		if (err)
-			goto out_unlock;
+			goto out_redirty;
 		/*
 		 * The inode has been written, but the write-buffer has
 		 * not been synchronized, so in case of an unclean
@@ -1059,11 +1059,17 @@ static int ubifs_writepage(struct page *page, struct writeback_control *wbc)
 	if (i_size > synced_i_size) {
 		err = inode->i_sb->s_op->write_inode(inode, NULL);
 		if (err)
-			goto out_unlock;
+			goto out_redirty;
 	}
 
 	return do_writepage(page, len);
-
+out_redirty:
+	/*
+	 * redirty_page_for_writepage() won't call ubifs_dirty_inode() because
+	 * it passes I_DIRTY_PAGES flag while calling __mark_inode_dirty(), so
+	 * there is no need to do space budget for dirty inode.
+	 */
+	redirty_page_for_writepage(wbc, page);
 out_unlock:
 	unlock_page(page);
 	return err;
-- 
2.31.1
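
The page-flag sequence described in the 13/13 commit message can be sketched
as a small user-space model. This is only an illustration under stated
assumptions, not kernel code: 'struct toy_page' and every toy_* function are
hypothetical names invented here, and each kernel step (ubifs_write_end,
clear_page_dirty_for_io, do_writepage, redirty_page_for_writepage) is reduced
to the single flag change the commit message attributes to it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two page flags from the commit message; not kernel code. */
struct toy_page {
	bool dirty;
	bool private;
};

/* Models ubifs_write_end(): the page becomes Private and Dirty. */
static void toy_write_end(struct toy_page *p)
{
	p->private = true;
	p->dirty = true;
}

/*
 * Models write_cache_pages() calling ubifs_writepage().
 * inode_write_fails: pretend write_inode() returned an error.
 * redirty_on_error:  take the out_redirty path added by the patch.
 */
static int toy_writeback(struct toy_page *p, bool inode_write_fails,
			 bool redirty_on_error)
{
	assert(p->dirty && p->private);
	p->dirty = false;		/* clear_page_dirty_for_io() */

	if (inode_write_fails) {
		if (redirty_on_error)
			p->dirty = true; /* redirty_page_for_writepage() */
		return -1;
	}

	p->private = false;		/* do_writepage(): detach_page_private() */
	return 0;
}

/* Returns true if the page ends in one of the two expected states. */
bool toy_final_state_ok(bool inode_write_fails, bool redirty_on_error)
{
	struct toy_page p = { false, false };

	toy_write_end(&p);
	(void)toy_writeback(&p, inode_write_fails, redirty_on_error);
	return p.dirty == p.private;
}
```

In this model, toy_final_state_ok(true, false) returns false because the page
is left Private but not Dirty, which is the state that trips the assertion in
ubifs_releasepage. With the redirty path, toy_final_state_ok(true, true)
returns true because the page goes back to Dirty and Private.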