From: Mateusz Guzik
To: brauner@kernel.org
Cc: viro@zeniv.linux.org.uk, jack@suse.cz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Mateusz Guzik
Subject: [PATCH] fs: scale opening of character devices
Date: Fri, 21 Nov 2025 08:22:37 +0100
Message-ID: <20251121072237.3230021-1-mjguzik@gmail.com>

chrdev_open() always takes cdev_lock, which is only needed to synchronize
against cd_forget(). But the latter is only ever called by inode evict(),
meaning these two can never legally race. Solidify this with asserts.

More cleanups are needed here but this is enough to get the thing out of
the way.

Rationale is funny-sounding at first: opening of /dev/zero happens to be
a contention point in large-scale package building (think 100+ packages
at the same time, with a thread count to support it). Such a workload is
not only very fork+exec heavy, but frequently involves scripts which use
the idiom of silencing output by redirecting it to /dev/null.

A non-large-scale microbenchmark of opening /dev/null in a loop in 16
processes:
before:	2865472
after:	4011960 (+40%)

Code goes from being bottlenecked on the spinlock to being bottlenecked
on lockref.

Signed-off-by: Mateusz Guzik
---

I'll note for the interested that my experience with the workload at hand
comes from FreeBSD, and I was surprised to find /dev/null on the profile.
Given that Linux is globally serializing on it, it has to be a factor as
well in this case.

 fs/char_dev.c        | 20 +++++++++++---------
 fs/inode.c           |  2 +-
 include/linux/cdev.h |  2 +-
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/char_dev.c b/fs/char_dev.c
index c2ddb998f3c9..dfde57cb5eed 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -374,15 +374,15 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 {
 	const struct file_operations *fops;
 	struct cdev *p;
-	struct cdev *new = NULL;
 	int ret = 0;
 
-	spin_lock(&cdev_lock);
-	p = inode->i_cdev;
+	VFS_BUG_ON_INODE(icount_read(inode) < 1, inode);
+
+	p = READ_ONCE(inode->i_cdev);
 	if (!p) {
 		struct kobject *kobj;
+		struct cdev *new;
 		int idx;
-		spin_unlock(&cdev_lock);
 		kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
 		if (!kobj)
 			return -ENXIO;
@@ -392,19 +392,19 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 		   we dropped the lock. */
 		p = inode->i_cdev;
 		if (!p) {
-			inode->i_cdev = p = new;
+			p = new;
+			WRITE_ONCE(inode->i_cdev, p);
 			list_add(&inode->i_devices, &p->list);
 			new = NULL;
 		} else if (!cdev_get(p))
 			ret = -ENXIO;
+		spin_unlock(&cdev_lock);
+		cdev_put(new);
 	} else if (!cdev_get(p))
 		ret = -ENXIO;
-	spin_unlock(&cdev_lock);
-	cdev_put(new);
 	if (ret)
 		return ret;
 
-	ret = -ENXIO;
 	fops = fops_get(p->ops);
 	if (!fops)
 		goto out_cdev_put;
@@ -423,8 +423,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 	return ret;
 }
 
-void cd_forget(struct inode *inode)
+void inode_cdev_forget(struct inode *inode)
 {
+	VFS_BUG_ON_INODE(!(inode_state_read_once(inode) & I_FREEING), inode);
+
 	spin_lock(&cdev_lock);
 	list_del_init(&inode->i_devices);
 	inode->i_cdev = NULL;
diff --git a/fs/inode.c b/fs/inode.c
index a62032864ddf..88be1f20782d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -840,7 +840,7 @@ static void evict(struct inode *inode)
 		clear_inode(inode);
 	}
 	if (S_ISCHR(inode->i_mode) && inode->i_cdev)
-		cd_forget(inode);
+		inode_cdev_forget(inode);
 
 	remove_inode_hash(inode);
 
diff --git a/include/linux/cdev.h b/include/linux/cdev.h
index 0e8cd6293deb..bed99967ad90 100644
--- a/include/linux/cdev.h
+++ b/include/linux/cdev.h
@@ -34,6 +34,6 @@ void cdev_device_del(struct cdev *cdev, struct device *dev);
 
 void cdev_del(struct cdev *);
 
-void cd_forget(struct inode *);
+void inode_cdev_forget(struct inode *);
 
 #endif
-- 
2.48.1
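
For anyone wanting to reproduce the contention locally: the patch does not
include the benchmark program, so the following is only a minimal sketch of
the kind of microbenchmark described above (open and close /dev/null in a
loop from 16 processes). The names run_open_bench() and open_loop(), the
fixed run time, the pipe used to report counts and the OPEN_BENCH_MAIN guard
are all assumptions made for illustration. Absolute numbers from it are not
expected to be comparable with the ones quoted.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Current time on the monotonic clock, in seconds. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Open and close @path in a tight loop for about @seconds; -1 on error. */
static long open_loop(const char *path, double seconds)
{
	double deadline = now_sec() + seconds;
	long ops = 0;

	for (;;) {
		int fd = open(path, O_RDONLY);

		if (fd < 0)
			return -1;
		close(fd);
		ops++;
		/* Look at the clock only every 1024 opens (arbitrary choice). */
		if ((ops & 1023) == 0 && now_sec() >= deadline)
			return ops;
	}
}

/*
 * Hypothetical driver: fork @nprocs workers, let each run open_loop() and
 * report its count over a shared pipe. Returns the summed count, or -1 if
 * any worker failed.
 */
long run_open_bench(const char *path, int nprocs, double seconds)
{
	int pfd[2];
	long total = 0;
	int failed = 0;

	if (pipe(pfd) < 0)
		return -1;

	for (int i = 0; i < nprocs; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed = 1;
			nprocs = i;	/* only collect from workers that exist */
			break;
		}
		if (pid == 0) {
			long ops;

			close(pfd[0]);
			ops = open_loop(path, seconds);
			if (write(pfd[1], &ops, sizeof(ops)) != sizeof(ops))
				_exit(1);
			_exit(0);
		}
	}
	close(pfd[1]);

	/* Each worker writes one long; such small pipe writes are atomic. */
	for (int i = 0; i < nprocs; i++) {
		long ops;

		if (read(pfd[0], &ops, sizeof(ops)) != sizeof(ops) || ops < 0)
			failed = 1;
		else
			total += ops;
	}
	close(pfd[0]);

	while (wait(NULL) > 0)
		;	/* reap all workers */
	return failed ? -1 : total;
}

#ifdef OPEN_BENCH_MAIN
int main(void)
{
	long total = run_open_bench("/dev/null", 16, 1.0);

	if (total < 0) {
		fprintf(stderr, "benchmark failed\n");
		return 1;
	}
	printf("%ld opens across 16 processes in about 1 second\n", total);
	return 0;
}
#endif
```

Building with -DOPEN_BENCH_MAIN gives a standalone program. Running it on a
kernel with and without the patch should show whether the cdev_lock
acquisition in chrdev_open() is the limiting factor on a given machine.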