From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik, "Lai, Yi"
Subject: [PATCH] fs: revert insert_inode_locked() eviction wait change and explain why
Date: Sat, 28 Mar 2026 16:31:42 +0100
Message-ID: <20260328153146.3368470-3-mjguzik@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260328153146.3368470-1-mjguzik@gmail.com>
References: <20260328153146.3368470-1-mjguzik@gmail.com>

Making insert_inode_locked() wait for inode destruction causes a
deadlock; a reproducer can be found here:
https://lore.kernel.org/linux-fsdevel/abNvb2PcrKj1FBeC@ly-workstation/

The real bug is in ext4, but I'm not digging into it and working order
needs to be restored.

Commentary is added as a warning sign for another sucker^Wdeveloper.

Fixes: 88ec797c468097a8 ("fs: make insert_inode_locked() wait for inode destruction")
Reported-by: "Lai, Yi"
Signed-off-by: Mateusz Guzik
---
 fs/inode.c | 53 +++++++++++++++++++++++++++++------------------------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index cc12b68e021b..5f7e76c9fb53 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1037,20 +1037,19 @@ long prune_icache_sb(struct super_block *sb, struct shrink_control *sc)
 	return freed;
 }
 
-static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool rcu_locked);
-
+static void __wait_on_freeing_inode(struct inode *inode, bool is_inode_hash_locked);
 /*
  * Called with the inode lock held.
  */
 static struct inode *find_inode(struct super_block *sb,
 				struct hlist_head *head,
 				int (*test)(struct inode *, void *),
-				void *data, bool hash_locked,
+				void *data, bool is_inode_hash_locked,
 				bool *isnew)
 {
 	struct inode *inode = NULL;
 
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		lockdep_assert_held(&inode_hash_lock);
 	else
 		lockdep_assert_not_held(&inode_hash_lock);
@@ -1064,7 +1063,7 @@ static struct inode *find_inode(struct super_block *sb,
 			continue;
 		spin_lock(&inode->i_lock);
 		if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(inode, hash_locked, true);
+			__wait_on_freeing_inode(inode, is_inode_hash_locked);
 			goto repeat;
 		}
 		if (unlikely(inode_state_read(inode) & I_CREATING)) {
@@ -1088,11 +1087,11 @@ static struct inode *find_inode(struct super_block *sb,
  */
 static struct inode *find_inode_fast(struct super_block *sb,
 				struct hlist_head *head, unsigned long ino,
-				bool hash_locked, bool *isnew)
+				bool is_inode_hash_locked, bool *isnew)
 {
 	struct inode *inode = NULL;
 
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		lockdep_assert_held(&inode_hash_lock);
 	else
 		lockdep_assert_not_held(&inode_hash_lock);
@@ -1106,7 +1105,7 @@ static struct inode *find_inode_fast(struct super_block *sb,
 			continue;
 		spin_lock(&inode->i_lock);
 		if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(inode, hash_locked, true);
+			__wait_on_freeing_inode(inode, is_inode_hash_locked);
 			goto repeat;
 		}
 		if (unlikely(inode_state_read(inode) & I_CREATING)) {
@@ -1842,13 +1841,28 @@ int insert_inode_locked(struct inode *inode)
 	while (1) {
 		struct inode *old = NULL;
 		spin_lock(&inode_hash_lock);
-repeat:
 		hlist_for_each_entry(old, head, i_hash) {
 			if (old->i_ino != ino)
 				continue;
 			if (old->i_sb != sb)
 				continue;
 			spin_lock(&old->i_lock);
+			/*
+			 * FIXME: inodes awaiting eviction don't get waited for
+			 *
+			 * This is a bug because the hash can temporarily end up with duplicate inodes.
+			 * It happens to work because new inodes are inserted at the beginning of the
+			 * chain, meaning they will be found first should anyone do a lookup.
+			 *
+			 * Fixing the above results in deadlocks in ext4 due to journal handling during
+			 * inode creation and eviction -- the eviction side waits for creation side to
+			 * finish. Adding __wait_on_freeing_inode results in both sides waiting on each
+			 * other.
+			 */
+			if (inode_state_read(old) & (I_FREEING | I_WILL_FREE)) {
+				spin_unlock(&old->i_lock);
+				continue;
+			}
 			break;
 		}
 		if (likely(!old)) {
@@ -1859,11 +1873,6 @@ int insert_inode_locked(struct inode *inode)
 			spin_unlock(&inode_hash_lock);
 			return 0;
 		}
-		if (inode_state_read(old) & (I_FREEING | I_WILL_FREE)) {
-			__wait_on_freeing_inode(old, true, false);
-			old = NULL;
-			goto repeat;
-		}
 		if (unlikely(inode_state_read(old) & I_CREATING)) {
 			spin_unlock(&old->i_lock);
 			spin_unlock(&inode_hash_lock);
@@ -2534,18 +2543,16 @@ EXPORT_SYMBOL(inode_needs_sync);
  * wake_up_bit(&inode->i_state, __I_NEW) after removing from the hash list
  * will DTRT.
  */
-static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool rcu_locked)
+static void __wait_on_freeing_inode(struct inode *inode, bool is_inode_hash_locked)
 {
 	struct wait_bit_queue_entry wqe;
 	struct wait_queue_head *wq_head;
 
-	VFS_BUG_ON(!hash_locked && !rcu_locked);
-
 	/*
 	 * Handle racing against evict(), see that routine for more details.
 	 */
 	if (unlikely(inode_unhashed(inode))) {
-		WARN_ON(hash_locked);
+		WARN_ON(is_inode_hash_locked);
 		spin_unlock(&inode->i_lock);
 		return;
 	}
@@ -2553,16 +2560,14 @@ static void __wait_on_freeing_inode(struct inode *inode, bool hash_locked, bool
 	wq_head = inode_bit_waitqueue(&wqe, inode, __I_NEW);
 	prepare_to_wait_event(wq_head, &wqe.wq_entry, TASK_UNINTERRUPTIBLE);
 	spin_unlock(&inode->i_lock);
-	if (rcu_locked)
-		rcu_read_unlock();
-	if (hash_locked)
+	rcu_read_unlock();
+	if (is_inode_hash_locked)
 		spin_unlock(&inode_hash_lock);
 	schedule();
 	finish_wait(wq_head, &wqe.wq_entry);
-	if (hash_locked)
+	if (is_inode_hash_locked)
 		spin_lock(&inode_hash_lock);
-	if (rcu_locked)
-		rcu_read_lock();
+	rcu_read_lock();
 }
 
 static __initdata unsigned long ihash_entries;
-- 
2.48.1
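
For illustration only, a minimal userspace sketch of the situation the FIXME comment in insert_inode_locked() describes: the insert-side scan skips an entry that is being freed, so a duplicate number lands on the chain, and a lookup still finds the new entry first because insertion happens at the head. All toy_* names are hypothetical, the `freeing` flag merely stands in for I_FREEING | I_WILL_FREE, and locking, RCU and the ext4 journal interaction are left out entirely. This is not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode sitting on one hash chain. */
struct toy_inode {
	unsigned long ino;
	int freeing;		/* models I_FREEING | I_WILL_FREE */
	struct toy_inode *next;
};

/* Insert at the head of the chain, so the newest entry is seen first. */
static void toy_hash_add_head(struct toy_inode **head, struct toy_inode *inode)
{
	assert(head && inode);
	inode->next = *head;
	*head = inode;
}

/* Lookup side: return the first entry with a matching number. */
static struct toy_inode *toy_lookup_first(struct toy_inode *head,
					  unsigned long ino)
{
	for (struct toy_inode *p = head; p; p = p->next)
		if (p->ino == ino)
			return p;
	return NULL;
}

/*
 * Insert side: entries being freed are skipped rather than waited for,
 * so only a live duplicate is reported.
 */
static struct toy_inode *toy_find_live_dup(struct toy_inode *head,
					   unsigned long ino)
{
	for (struct toy_inode *p = head; p; p = p->next) {
		if (p->ino != ino)
			continue;
		if (p->freeing)
			continue;
		return p;
	}
	return NULL;
}
```

With one entry marked as freeing already on the chain, toy_find_live_dup() reports no conflict, a second entry with the same number can be added, and toy_lookup_first() returns the newer one while the old one is still linked behind it.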