From nobody Tue Apr 7 04:18:29 2026
From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik, "Lai, Yi"
Subject: [PATCH] fs: revert insert_inode_locked() eviction wait change and explain why
Date: Mon, 16 Mar 2026 11:33:05 +0100
Message-ID: <20260316103306.1258289-1-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
X-Mailing-List: linux-kernel@vger.kernel.org

It causes a deadlock; a reproducer can be found here:
https://lore.kernel.org/linux-fsdevel/abNvb2PcrKj1FBeC@ly-workstation/

The real bug is in ext4, but I'm not digging into it and a working order
needs to be restored.

Commentary is added as a warning sign for another sucker^Wdeveloper.

Fixes: 88ec797c468097a8 ("fs: make insert_inode_locked() wait for inode destruction")
Reported-by: "Lai, Yi"
Signed-off-by: Mateusz Guzik
---

generated against master
FYI next has a change which has a trivial conflict (ino type change)

 fs/inode.c | 53 +++++++++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index cc12b68e021b..5f7e76c9fb53 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1037,20 +1037,19 @@ long prune_icache_sb(struct super_block *sb, struct shrink_control *sc)
 	return freed;
 }
 
-static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool rcu_locked);
-
+static void __wait_on_freeing_inode(struct inode *inode, bool is_inode_hash_locked);
 /*
  * Called with the inode lock held.
  */
 static struct inode *find_inode(struct super_block *sb,
 				struct hlist_head *head,
 				int (*test)(struct inode *, void *),
-				void *data, bool hash_locked,
+				void *data, bool is_inode_hash_locked,
 				bool *isnew)
 {
 	struct inode *inode = NULL;
 
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		lockdep_assert_held(&inode_hash_lock);
 	else
 		lockdep_assert_not_held(&inode_hash_lock);
@@ -1064,7 +1063,7 @@ static struct inode *find_inode(struct super_block *sb,
 			continue;
 		spin_lock(&inode->i_lock);
 		if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(inode, hash_locked, true);
+			__wait_on_freeing_inode(inode, is_inode_hash_locked);
 			goto repeat;
 		}
 		if (unlikely(inode_state_read(inode) & I_CREATING)) {
@@ -1088,11 +1087,11 @@ static struct inode *find_inode(struct super_block *sb,
  */
 static struct inode *find_inode_fast(struct super_block *sb,
 				struct hlist_head *head, unsigned long ino,
-				bool hash_locked, bool *isnew)
+				bool is_inode_hash_locked, bool *isnew)
 {
 	struct inode *inode = NULL;
 
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		lockdep_assert_held(&inode_hash_lock);
 	else
 		lockdep_assert_not_held(&inode_hash_lock);
@@ -1106,7 +1105,7 @@ static struct inode *find_inode_fast(struct super_block *sb,
 			continue;
 		spin_lock(&inode->i_lock);
 		if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(inode, hash_locked, true);
+			__wait_on_freeing_inode(inode, is_inode_hash_locked);
 			goto repeat;
 		}
 		if (unlikely(inode_state_read(inode) & I_CREATING)) {
@@ -1842,13 +1841,28 @@ int insert_inode_locked(struct inode *inode)
 	while (1) {
 		struct inode *old = NULL;
 		spin_lock(&inode_hash_lock);
-repeat:
 		hlist_for_each_entry(old, head, i_hash) {
 			if (old->i_ino != ino)
 				continue;
 			if (old->i_sb != sb)
 				continue;
 			spin_lock(&old->i_lock);
+			/*
+			 * FIXME: inodes awaiting eviction don't get waited for
+			 *
+			 * This is a bug because the hash can temporarily end up with duplicate inodes.
+			 * It happens to work because new inodes are inserted at the beginning of the
+			 * chain, meaning they will be found first should anyone do a lookup.
+			 *
+			 * Fixing the above results in deadlocks in ext4 due to journal handling during
+			 * inode creation and eviction -- the eviction side waits for creation side to
+			 * finish. Adding __wait_on_freeing_inode results in both sides waiting on each
+			 * other.
+			 */
+			if (inode_state_read(old) & (I_FREEING | I_WILL_FREE)) {
+				spin_unlock(&old->i_lock);
+				continue;
+			}
 			break;
 		}
 		if (likely(!old)) {
@@ -1859,11 +1873,6 @@ int insert_inode_locked(struct inode *inode)
 			spin_unlock(&inode_hash_lock);
 			return 0;
 		}
-		if (inode_state_read(old) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(old, true, false);
-			old = NULL;
-			goto repeat;
-		}
 		if (unlikely(inode_state_read(old) & I_CREATING)) {
 			spin_unlock(&old->i_lock);
 			spin_unlock(&inode_hash_lock);
@@ -2534,18 +2543,16 @@ EXPORT_SYMBOL(inode_needs_sync);
  * wake_up_bit(&inode->i_state, __I_NEW) after removing from the hash list
  * will DTRT.
  */
-static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool rcu_locked)
+static void __wait_on_freeing_inode(struct inode *inode, bool is_inode_hash_locked)
 {
 	struct wait_bit_queue_entry wqe;
 	struct wait_queue_head *wq_head;
 
-	VFS_BUG_ON(!hash_locked && !rcu_locked);
-
 	/*
 	 * Handle racing against evict(), see that routine for more details.
 	 */
 	if (unlikely(inode_unhashed(inode))) {
-		WARN_ON(hash_locked);
+		WARN_ON(is_inode_hash_locked);
 		spin_unlock(&inode->i_lock);
 		return;
 	}
@@ -2553,16 +2560,14 @@ static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool
 	wq_head = inode_bit_waitqueue(&wqe, inode, __I_NEW);
 	prepare_to_wait_event(wq_head, &wqe.wq_entry, TASK_UNINTERRUPTIBLE);
 	spin_unlock(&inode->i_lock);
-	if (rcu_locked)
-		rcu_read_unlock();
-	if (hash_locked)
+	rcu_read_unlock();
+	if (is_inode_hash_locked)
 		spin_unlock(&inode_hash_lock);
 	schedule();
 	finish_wait(wq_head, &wqe.wq_entry);
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		spin_lock(&inode_hash_lock);
-	if (rcu_locked)
-		rcu_read_lock();
+	rcu_read_lock();
 }
 
 static __initdata unsigned long ihash_entries;
-- 
2.48.1