ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
while in ocfs2_dio_end_io_write, it acquires these locks in reverse
order. This creates an ABBA lock ordering violation on lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.
Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
  ocfs2_prepare_orphan_dir
    ocfs2_lookup_lock_orphan_dir
      inode_lock(orphan_dir_inode) <- Lock A
    __ocfs2_prepare_orphan_dir
      ocfs2_prepare_dir_for_insert
        ocfs2_extend_dir
          ocfs2_expand_inline_dir
            down_write(&oi->ip_alloc_sem) <- Lock B
Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
  down_write(&oi->ip_alloc_sem) <- Lock B
  ocfs2_del_inode_from_orphan()
    inode_lock(orphan_dir_inode) <- Lock A
Deadlock Scenario:
CPU0 (unlink)                     CPU1 (dio_end_io_write)
------                            ------
inode_lock(orphan_dir_inode)
                                  down_write(ip_alloc_sem)
down_write(ip_alloc_sem)
                                  inode_lock(orphan_dir_inode)
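
The two chains above amount to a cycle in a lock-order graph. As a rough userspace illustration (not lockdep's actual implementation), the sketch below records "held -> taken" dependencies between two lock classes and refuses a dependency that would close a cycle. The enum values and the helpers record_order() and reset_orders() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch; not kernel identifiers. */
enum lock_class { ORPHAN_DIR_INODE_LOCK, IP_ALLOC_SEM, NR_CLASSES };

/* edge[a][b] is true once some path has taken b while holding a. */
static bool edge[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (edge[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	return false;
}

/*
 * Record the dependency "held -> taken". Returns false, without
 * recording the edge, if 'held' is already reachable from 'taken',
 * because the new dependency would then close a cycle (ABBA).
 */
static bool record_order(enum lock_class held, enum lock_class taken)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(taken, held, seen))
		return false;
	edge[held][taken] = true;
	return true;
}

/* Forget all recorded dependencies. */
static void reset_orders(void)
{
	memset(edge, 0, sizeof(edge));
}
```

In this model, chain #0 records ORPHAN_DIR_INODE_LOCK -> IP_ALLOC_SEM successfully, and chain #1 is then rejected when it tries to record the reverse dependency.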
ip_alloc_sem protects allocation changes, which are unrelated to the
operations in ocfs2_del_inode_from_orphan. So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
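
The effect of the reordering can be sketched in userspace with two pthread mutexes standing in for the kernel locks. This is only an analogy under stated assumptions: orphan_dir_lock, ip_alloc_sem, the counters, and the three functions below are hypothetical stand-ins, not the ocfs2 code. The end-io-like path releases lock A before taking lock B, so it never holds B while waiting for A, and the only nested order left is A -> B.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins for the two locks; NOT the kernel objects. */
static pthread_mutex_t orphan_dir_lock = PTHREAD_MUTEX_INITIALIZER; /* Lock A */
static pthread_mutex_t ip_alloc_sem = PTHREAD_MUTEX_INITIALIZER;    /* Lock B */

static long orphan_entries; /* protected by orphan_dir_lock */
static long alloc_changes;  /* protected by ip_alloc_sem */

enum { ITERATIONS = 10000 };

/* Unlink-like path: takes A, then B while still holding A. */
static void *unlink_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&orphan_dir_lock);
		orphan_entries++;
		pthread_mutex_lock(&ip_alloc_sem);
		alloc_changes++;
		pthread_mutex_unlock(&ip_alloc_sem);
		pthread_mutex_unlock(&orphan_dir_lock);
	}
	return NULL;
}

/* End-io-like path after the reordering: A is released before B is taken. */
static void *end_io_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++) {
		pthread_mutex_lock(&orphan_dir_lock);
		orphan_entries--;
		pthread_mutex_unlock(&orphan_dir_lock);

		pthread_mutex_lock(&ip_alloc_sem);
		alloc_changes++;
		pthread_mutex_unlock(&ip_alloc_sem);
	}
	return NULL;
}

/* Run both paths concurrently; returns 0 once both threads have finished. */
static int run_both_paths(void)
{
	pthread_t t0, t1;

	if (pthread_create(&t0, NULL, unlink_path, NULL))
		return -1;
	if (pthread_create(&t1, NULL, end_io_path, NULL)) {
		pthread_join(t0, NULL);
		return -1;
	}
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	return 0;
}
```

If end_io_path() instead took ip_alloc_sem first and orphan_dir_lock inside it, the two threads could block each other exactly as in the scenario above.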
Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
---
fs/ocfs2/aops.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 17ba79f443ee..09146b43d1f0 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2294,8 +2294,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 		goto out;
 	}
 
-	down_write(&oi->ip_alloc_sem);
-
 	/* Delete orphan before acquire i_rwsem. */
 	if (dwc->dw_orphaned) {
 		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
@@ -2308,6 +2306,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 			mlog_errno(ret);
 	}
 
+	down_write(&oi->ip_alloc_sem);
 	di = (struct ocfs2_dinode *)di_bh->b_data;
 
 	ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
--
2.39.3
On Fri, Mar 06, 2026 at 11:22:11AM +0800, Joseph Qi wrote:
> ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
> while in ocfs2_dio_end_io_write, it acquires these locks in reverse
> order. This creates an ABBA lock ordering violation on lock classes
> ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
> ocfs2_file_ip_alloc_sem_key.
>
> Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
> ocfs2_unlink
>   ocfs2_prepare_orphan_dir
>     ocfs2_lookup_lock_orphan_dir
>       inode_lock(orphan_dir_inode) <- Lock A
>     __ocfs2_prepare_orphan_dir
>       ocfs2_prepare_dir_for_insert
>         ocfs2_extend_dir
>           ocfs2_expand_inline_dir
>             down_write(&oi->ip_alloc_sem) <- Lock B
>
> Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
> ocfs2_dio_end_io_write
>   down_write(&oi->ip_alloc_sem) <- Lock B
>   ocfs2_del_inode_from_orphan()
>     inode_lock(orphan_dir_inode) <- Lock A
>
> Deadlock Scenario:
> CPU0 (unlink)                     CPU1 (dio_end_io_write)
> ------                            ------
> inode_lock(orphan_dir_inode)
>                                   down_write(ip_alloc_sem)
> down_write(ip_alloc_sem)
>                                   inode_lock(orphan_dir_inode)
>
> ip_alloc_sem protects allocation changes, which are unrelated to the
> operations in ocfs2_del_inode_from_orphan. So move
> ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
>
> Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
> Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
LGTM.
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
btw, I have a question below that is unrelated to this bug.
> ---
> fs/ocfs2/aops.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 17ba79f443ee..09146b43d1f0 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2294,8 +2294,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  		goto out;
>  	}
>
> -	down_write(&oi->ip_alloc_sem);
> -
>  	/* Delete orphan before acquire i_rwsem. */
The comment above looks weird. Judging from commit a86a72a4a4e0, the correct one seems to be:
/* Delete orphan without acquiring i_rwsem. */
Heming
>  	if (dwc->dw_orphaned) {
>  		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
> @@ -2308,6 +2306,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  			mlog_errno(ret);
>  	}
>
> +	down_write(&oi->ip_alloc_sem);
>  	di = (struct ocfs2_dinode *)di_bh->b_data;
>
>  	ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
> --
> 2.39.3
>