[PATCH] ocfs2: fix possible deadlock between unlink and dio_end_io_write

Joseph Qi posted 1 patch 1 month ago
Posted by Joseph Qi 1 month ago
ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
while in ocfs2_dio_end_io_write, it acquires these locks in reverse
order. This creates an ABBA lock ordering violation on lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.

Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
  ocfs2_prepare_orphan_dir
    ocfs2_lookup_lock_orphan_dir
      inode_lock(orphan_dir_inode) <- Lock A
    __ocfs2_prepare_orphan_dir
      ocfs2_prepare_dir_for_insert
        ocfs2_extend_dir
          ocfs2_expand_inline_dir
            down_write(&oi->ip_alloc_sem) <- Lock B

Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
  down_write(&oi->ip_alloc_sem) <- Lock B
  ocfs2_del_inode_from_orphan()
    inode_lock(orphan_dir_inode) <- Lock A

Deadlock Scenario:
  CPU0 (unlink)                     CPU1 (dio_end_io_write)
  ------                            ------
  inode_lock(orphan_dir_inode)
                                    down_write(ip_alloc_sem)
  down_write(ip_alloc_sem)
                                    inode_lock(orphan_dir_inode)

ip_alloc_sem protects allocation changes, which are unrelated to the
operations in ocfs2_del_inode_from_orphan. So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
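
The ordering argument above can be illustrated with a small userspace
model. This is only a sketch, not kernel code and not how lockdep is
implemented: the order_tracker structure, the lock_class enum, and the
track_*/model_* helpers are hypothetical names invented for this example.
It records which lock class was acquired while another was held and flags
an inversion when both A -> B and B -> A have been seen.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two locks in the report. */
enum lock_class {
	ORPHAN_DIR_INODE_LOCK,	/* "Lock A" */
	IP_ALLOC_SEM,		/* "Lock B" */
	NR_LOCK_CLASSES
};

struct order_tracker {
	/* dep[x][y] is true once y was acquired while x was held. */
	bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
	bool held[NR_LOCK_CLASSES];
	bool inversion;		/* set when both x -> y and y -> x were seen */
};

static void tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

static void track_acquire(struct order_tracker *t, enum lock_class c)
{
	for (int h = 0; h < NR_LOCK_CLASSES; h++) {
		if (!t->held[h] || h == (int)c)
			continue;
		/* c was previously held while h was taken: reverse order. */
		if (t->dep[c][h])
			t->inversion = true;
		t->dep[h][c] = true;
	}
	t->held[c] = true;
}

static void track_release(struct order_tracker *t, enum lock_class c)
{
	assert(t->held[c]);
	t->held[c] = false;
}

/* Lock Chain #0: unlink takes Lock A, then Lock B nested inside it. */
static void model_unlink(struct order_tracker *t)
{
	track_acquire(t, ORPHAN_DIR_INODE_LOCK);
	track_acquire(t, IP_ALLOC_SEM);
	track_release(t, IP_ALLOC_SEM);
	track_release(t, ORPHAN_DIR_INODE_LOCK);
}

/* Lock Chain #1 before the patch: Lock B, then Lock A nested inside it. */
static void model_dio_end_io_before(struct order_tracker *t)
{
	track_acquire(t, IP_ALLOC_SEM);
	track_acquire(t, ORPHAN_DIR_INODE_LOCK);
	track_release(t, ORPHAN_DIR_INODE_LOCK);
	track_release(t, IP_ALLOC_SEM);
}

/* After the patch: orphan deletion (Lock A) finishes before Lock B is taken. */
static void model_dio_end_io_after(struct order_tracker *t)
{
	track_acquire(t, ORPHAN_DIR_INODE_LOCK);
	track_release(t, ORPHAN_DIR_INODE_LOCK);
	track_acquire(t, IP_ALLOC_SEM);
	track_release(t, IP_ALLOC_SEM);
}
```

Running model_unlink followed by model_dio_end_io_before on one tracker
sets the inversion flag, while model_unlink followed by
model_dio_end_io_after leaves it clear, because the two locks are no
longer nested on the dio path.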

Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
---
 fs/ocfs2/aops.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 17ba79f443ee..09146b43d1f0 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -2294,8 +2294,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 		goto out;
 	}
 
-	down_write(&oi->ip_alloc_sem);
-
 	/* Delete orphan before acquire i_rwsem. */
 	if (dwc->dw_orphaned) {
 		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
@@ -2308,6 +2306,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
 			mlog_errno(ret);
 	}
 
+	down_write(&oi->ip_alloc_sem);
 	di = (struct ocfs2_dinode *)di_bh->b_data;
 
 	ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
-- 
2.39.3
Re: [PATCH] ocfs2: fix possible deadlock between unlink and dio_end_io_write
Posted by Heming Zhao 1 month ago
On Fri, Mar 06, 2026 at 11:22:11AM +0800, Joseph Qi wrote:
> ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
> while in ocfs2_dio_end_io_write, it acquires these locks in reverse
> order. This creates an ABBA lock ordering violation on lock classes
> ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
> ocfs2_file_ip_alloc_sem_key.
> 
> Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
> ocfs2_unlink
>   ocfs2_prepare_orphan_dir
>     ocfs2_lookup_lock_orphan_dir
>       inode_lock(orphan_dir_inode) <- lock A
>     __ocfs2_prepare_orphan_dir
>       ocfs2_prepare_dir_for_insert
>         ocfs2_extend_dir
> 	  ocfs2_expand_inline_dir
> 	    down_write(&oi->ip_alloc_sem) <- Lock B
> 
> Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
> ocfs2_dio_end_io_write
>   down_write(&oi->ip_alloc_sem) <- Lock B
>   ocfs2_del_inode_from_orphan()
>     inode_lock(orphan_dir_inode) <- Lock A
> 
> Deadlock Scenario:
>   CPU0 (unlink)                     CPU1 (dio_end_io_write)
>   ------                            ------
>   inode_lock(orphan_dir_inode)
>                                     down_write(ip_alloc_sem)
>   down_write(ip_alloc_sem)
>                                     inode_lock(orphan_dir_inode)
> 
> ip_alloc_sem protects allocation changes, which are unrelated to the
> operations in ocfs2_del_inode_from_orphan. So move
> ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock.
> 
> Reported-by: syzbot+67b90111784a3eac8c04@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=67b90111784a3eac8c04
> Fixes: a86a72a4a4e0 ("ocfs2: take ip_alloc_sem in ocfs2_dio_get_block & ocfs2_dio_end_io_write")
> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>

LGTM.
Reviewed-by: Heming Zhao <heming.zhao@suse.com>

btw, I have a question below that is unrelated to this bug.
> ---
>  fs/ocfs2/aops.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> index 17ba79f443ee..09146b43d1f0 100644
> --- a/fs/ocfs2/aops.c
> +++ b/fs/ocfs2/aops.c
> @@ -2294,8 +2294,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  		goto out;
>  	}
>  
> -	down_write(&oi->ip_alloc_sem);
> -
>  	/* Delete orphan before acquire i_rwsem. */

The comment above looks weird. From commit a86a72a4a4e0, the correct one seems to be:
/* Delete orphan without acquiring i_rwsem. */

Heming
>  	if (dwc->dw_orphaned) {
>  		BUG_ON(dwc->dw_writer_pid != task_pid_nr(current));
> @@ -2308,6 +2306,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>  			mlog_errno(ret);
>  	}
>  
> +	down_write(&oi->ip_alloc_sem);
>  	di = (struct ocfs2_dinode *)di_bh->b_data;
>  
>  	ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode), di_bh);
> -- 
> 2.39.3
>