From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 1/8] kernfs: Introduce interface to access global kernfs_open_file_mutex.
Date: Fri, 25 Feb 2022 16:21:09 +1100
Message-Id: <20220225052116.1243150-2-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This allows the underlying mutex locking to be changed without changing
the users of the lock. For example, the next patch modifies this
interface to use hashed mutexes in place of a single global
kernfs_open_file_mutex.

Signed-off-by: Imran Khan
---
 fs/kernfs/file.c            | 26 +++++++++++++++-----------
 fs/kernfs/kernfs-internal.h | 18 ++++++++++++++++++
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 7aefaca876a02..99793c32abc39 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -30,7 +30,7 @@
  * kernfs_open_node->files, which is protected by kernfs_open_file_mutex.
  */
 static DEFINE_SPINLOCK(kernfs_open_node_lock);
-static DEFINE_MUTEX(kernfs_open_file_mutex);
+DEFINE_MUTEX(kernfs_open_file_mutex);
 
 struct kernfs_open_node {
 	atomic_t		refcnt;
@@ -519,9 +519,10 @@ static int kernfs_get_open_node(struct kernfs_node *kn,
 				struct kernfs_open_file *of)
 {
 	struct kernfs_open_node *on, *new_on = NULL;
+	struct mutex *mutex = NULL;
 
  retry:
-	mutex_lock(&kernfs_open_file_mutex);
+	mutex = kernfs_open_file_mutex_lock(kn);
 	spin_lock_irq(&kernfs_open_node_lock);
 
 	if (!kn->attr.open && new_on) {
@@ -536,7 +537,7 @@ static int kernfs_get_open_node(struct kernfs_node *kn,
 	}
 
 	spin_unlock_irq(&kernfs_open_node_lock);
-	mutex_unlock(&kernfs_open_file_mutex);
+	mutex_unlock(mutex);
 
 	if (on) {
 		kfree(new_on);
@@ -570,9 +571,10 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
 				 struct kernfs_open_file *of)
 {
 	struct kernfs_open_node *on = kn->attr.open;
+	struct mutex *mutex = NULL;
 	unsigned long flags;
 
-	mutex_lock(&kernfs_open_file_mutex);
+	mutex = kernfs_open_file_mutex_lock(kn);
 	spin_lock_irqsave(&kernfs_open_node_lock, flags);
 
 	if (of)
@@ -584,7 +586,7 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
 		on = NULL;
 
 	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
-	mutex_unlock(&kernfs_open_file_mutex);
+	mutex_unlock(mutex);
 
 	kfree(on);
 }
@@ -724,11 +726,11 @@ static void kernfs_release_file(struct kernfs_node *kn,
 	/*
 	 * @of is guaranteed to have no other file operations in flight and
 	 * we just want to synchronize release and drain paths.
-	 * @kernfs_open_file_mutex is enough. @of->mutex can't be used
+	 * @open_file_mutex is enough. @of->mutex can't be used
 	 * here because drain path may be called from places which can
 	 * cause circular dependency.
 	 */
-	lockdep_assert_held(&kernfs_open_file_mutex);
+	lockdep_assert_held(kernfs_open_file_mutex_ptr(kn));
 
 	if (!of->released) {
 		/*
@@ -745,11 +747,12 @@ static int kernfs_fop_release(struct inode *inode, struct file *filp)
 {
 	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_open_file *of = kernfs_of(filp);
+	struct mutex *lock = NULL;
 
 	if (kn->flags & KERNFS_HAS_RELEASE) {
-		mutex_lock(&kernfs_open_file_mutex);
+		lock = kernfs_open_file_mutex_lock(kn);
 		kernfs_release_file(kn, of);
-		mutex_unlock(&kernfs_open_file_mutex);
+		mutex_unlock(lock);
 	}
 
 	kernfs_put_open_node(kn, of);
@@ -764,6 +767,7 @@ void kernfs_drain_open_files(struct kernfs_node *kn)
 {
 	struct kernfs_open_node *on;
 	struct kernfs_open_file *of;
+	struct mutex *mutex = NULL;
 
 	if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
 		return;
@@ -776,7 +780,7 @@ void kernfs_drain_open_files(struct kernfs_node *kn)
 	if (!on)
 		return;
 
-	mutex_lock(&kernfs_open_file_mutex);
+	mutex = kernfs_open_file_mutex_lock(kn);
 
 	list_for_each_entry(of, &on->files, list) {
 		struct inode *inode = file_inode(of->file);
@@ -788,7 +792,7 @@ void kernfs_drain_open_files(struct kernfs_node *kn)
 			kernfs_release_file(kn, of);
 	}
 
-	mutex_unlock(&kernfs_open_file_mutex);
+	mutex_unlock(mutex);
 
 	kernfs_put_open_node(kn, NULL);
 }
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index f9cc912c31e1b..91b7cfb8a1105 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -147,4 +147,22 @@ void kernfs_drain_open_files(struct kernfs_node *kn);
  */
 extern const struct inode_operations kernfs_symlink_iops;
 
+extern struct mutex kernfs_open_file_mutex;
+
+static inline struct mutex *kernfs_open_file_mutex_ptr(struct kernfs_node *kn)
+{
+	return &kernfs_open_file_mutex;
+}
+
+static inline struct mutex *kernfs_open_file_mutex_lock(struct kernfs_node *kn)
+{
+	struct mutex *lock;
+
+	lock = kernfs_open_file_mutex_ptr(kn);
+
+	mutex_lock(lock);
+
+	return lock;
+}
+
 #endif /* __KERNFS_INTERNAL_H */
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 2/8] kernfs: Replace global kernfs_open_file_mutex with hashed mutexes.
Date: Fri, 25 Feb 2022 16:21:10 +1100
Message-Id: <20220225052116.1243150-3-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In the current kernfs design, a single mutex, kernfs_open_file_mutex,
protects the list of kernfs_open_file instances corresponding to a sysfs
attribute. So even if different tasks are opening or closing different
sysfs files, they can contend on the osq_lock of this mutex. The
contention is more apparent on large-scale systems with a few hundred
CPUs, where most of the CPUs have running tasks that are opening,
accessing or closing sysfs files at any point in time.

Using hashed mutexes in place of a single global mutex can significantly
reduce contention on the global mutex and hence provide better
scalability. Moreover, as these hashed mutexes are not part of
kernfs_node objects, we will not see any significant change in the
memory utilization of kernfs-based file systems like sysfs, cgroupfs etc.

Modify the interface introduced in the previous patch to make use of
hashed mutexes. Use the kernfs_node address as the hashing key.
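
For readers unfamiliar with the technique, the idea of picking a lock by
hashing an object's address can be sketched in userspace as below. This is
only an illustration, not the code in this patch: pthread mutexes stand in
for kernel mutexes, and every demo_* name, the table size and the hash
constant are assumptions made up for the sketch.

```c
/* Userspace sketch only: pthread mutexes stand in for kernel mutexes. */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define DEMO_LOCK_BITS	4			/* hypothetical: 16 locks */
#define DEMO_NR_LOCKS	(1u << DEMO_LOCK_BITS)

static pthread_mutex_t demo_locks[DEMO_NR_LOCKS];
static pthread_once_t demo_once = PTHREAD_ONCE_INIT;

/* One-time initialisation of the whole lock table. */
static void demo_locks_init(void)
{
	for (unsigned int i = 0; i < DEMO_NR_LOCKS; i++)
		pthread_mutex_init(&demo_locks[i], NULL);
}

/*
 * Demo stand-in for a pointer hash: multiply the address by an odd
 * 64-bit constant and keep the top DEMO_LOCK_BITS bits as the index.
 */
static unsigned int demo_hash_ptr(const void *p)
{
	uint64_t v = (uint64_t)(uintptr_t)p;

	return (unsigned int)((v * 0x61C8864680B583EBull) >>
			      (64 - DEMO_LOCK_BITS));
}

/* Map an object to its lock; the same address always gets the same lock. */
static pthread_mutex_t *demo_lock_ptr(const void *obj)
{
	pthread_once(&demo_once, demo_locks_init);
	return &demo_locks[demo_hash_ptr(obj)];
}

/* Take the lock for @obj and return it so the caller can unlock it. */
static pthread_mutex_t *demo_lock(const void *obj)
{
	pthread_mutex_t *m = demo_lock_ptr(obj);
	int rc = pthread_mutex_lock(m);

	assert(rc == 0);
	(void)rc;
	return m;
}
```

Callers keep the returned pointer and pass it to pthread_mutex_unlock(),
mirroring how the interface from the previous patch returns the mutex it
locked. Two different objects may hash to the same lock; that costs some
contention but does not affect correctness.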
Signed-off-by: Imran Khan --- fs/kernfs/file.c | 7 +---- fs/kernfs/kernfs-internal.h | 9 ++++-- fs/kernfs/mount.c | 13 ++++++++ include/linux/kernfs.h | 59 +++++++++++++++++++++++++++++++++++++ 4 files changed, 80 insertions(+), 8 deletions(-) diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 99793c32abc39..8996b00568c38 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -19,18 +19,13 @@ #include "kernfs-internal.h" =20 /* - * There's one kernfs_open_file for each open file and one kernfs_open_node - * for each kernfs_node with one or more open files. - * * kernfs_node->attr.open points to kernfs_open_node. attr.open is * protected by kernfs_open_node_lock. * * filp->private_data points to seq_file whose ->private points to - * kernfs_open_file. kernfs_open_files are chained at - * kernfs_open_node->files, which is protected by kernfs_open_file_mutex. + * kernfs_open_file. */ static DEFINE_SPINLOCK(kernfs_open_node_lock); -DEFINE_MUTEX(kernfs_open_file_mutex); =20 struct kernfs_open_node { atomic_t refcnt; diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h index 91b7cfb8a1105..03e983953eda4 100644 --- a/fs/kernfs/kernfs-internal.h +++ b/fs/kernfs/kernfs-internal.h @@ -147,11 +147,16 @@ void kernfs_drain_open_files(struct kernfs_node *kn); */ extern const struct inode_operations kernfs_symlink_iops; =20 -extern struct mutex kernfs_open_file_mutex; +/* + * kernfs locks + */ +extern struct kernfs_global_locks *kernfs_locks; =20 static inline struct mutex *kernfs_open_file_mutex_ptr(struct kernfs_node = *kn) { - return &kernfs_open_file_mutex; + int idx =3D hash_ptr(kn, NR_KERNFS_LOCK_BITS); + + return &kernfs_locks->open_file_mutex[idx].lock; } =20 static inline struct mutex *kernfs_open_file_mutex_lock(struct kernfs_node= *kn) diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index cfa79715fc1a7..fa3fa22c95b21 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -20,6 +20,7 @@ #include "kernfs-internal.h" =20 struct kmem_cache 
*kernfs_node_cache, *kernfs_iattrs_cache; +struct kernfs_global_locks *kernfs_locks; =20 static int kernfs_sop_show_options(struct seq_file *sf, struct dentry *den= try) { @@ -387,6 +388,17 @@ void kernfs_kill_sb(struct super_block *sb) kfree(info); } =20 +void __init kernfs_lock_init(void) +{ + int count; + + kernfs_locks =3D kmalloc(sizeof(struct kernfs_global_locks), GFP_KERNEL); + WARN_ON(!kernfs_locks); + + for (count =3D 0; count < NR_KERNFS_LOCKS; count++) + mutex_init(&kernfs_locks->open_file_mutex[count].lock); +} + void __init kernfs_init(void) { kernfs_node_cache =3D kmem_cache_create("kernfs_node_cache", @@ -397,4 +409,5 @@ void __init kernfs_init(void) kernfs_iattrs_cache =3D kmem_cache_create("kernfs_iattrs_cache", sizeof(struct kernfs_iattrs), 0, SLAB_PANIC, NULL); + kernfs_lock_init(); } diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 861c4f0f8a29f..3f72d38d48e31 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -18,6 +18,7 @@ #include #include #include +#include =20 struct file; struct dentry; @@ -34,6 +35,62 @@ struct kernfs_fs_context; struct kernfs_open_node; struct kernfs_iattrs; =20 +/* + * NR_KERNFS_LOCK_BITS determines size (NR_KERNFS_LOCKS) of hash + * table of locks. + * Having a small hash table would impact scalability, since + * more and more kernfs_node objects will end up using same lock + * and having a very large hash table would waste memory. 
+ * + * At the moment size of hash table of locks is being set based on + * the number of CPUs as follows: + * + * NR_CPU NR_KERNFS_LOCK_BITS NR_KERNFS_LOCKS + * 1 1 2 + * 2-3 2 4 + * 4-7 4 16 + * 8-15 6 64 + * 16-31 8 256 + * 32 and more 10 1024 + * + * The above relation between NR_CPU and number of locks is based + * on some internal experimentation which involved booting qemu + * with different values of smp, performing some sysfs operations + * on all CPUs and observing how increase in number of locks impacts + * completion time of these sysfs operations on each CPU. + */ +#ifdef CONFIG_SMP +#define NR_KERNFS_LOCK_BITS (2 * (ilog2(NR_CPUS < 32 ? NR_CPUS : 32))) +#else +#define NR_KERNFS_LOCK_BITS 1 +#endif + +#define NR_KERNFS_LOCKS (1 << NR_KERNFS_LOCK_BITS) + +/* + * There's one kernfs_open_file for each open file and one kernfs_open_node + * for each kernfs_node with one or more open files. + * + * filp->private_data points to seq_file whose ->private points to + * kernfs_open_file. + * kernfs_open_files are chained at kernfs_open_node->files, which is + * protected by kernfs_open_file_mutex.lock. + */ + +struct kernfs_open_file_mutex { + struct mutex lock; +} ____cacheline_aligned_in_smp; + +/* + * To reduce possible contention in sysfs access, arising due to single + * locks, use an array of locks and use kernfs_node object address as + * hash keys to get the index of these locks. 
+ */
+
+struct kernfs_global_locks {
+	struct kernfs_open_file_mutex open_file_mutex[NR_KERNFS_LOCKS];
+};
+
 enum kernfs_node_type {
 	KERNFS_DIR = 0x0001,
 	KERNFS_FILE = 0x0002,
@@ -413,6 +470,8 @@ void kernfs_kill_sb(struct super_block *sb);
 
 void kernfs_init(void);
 
+void kernfs_lock_init(void);
+
 struct kernfs_node *kernfs_find_and_get_node_by_id(struct kernfs_root *root,
 						   u64 id);
 #else /* CONFIG_KERNFS */
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 3/8] kernfs: Introduce interface to access kernfs_open_node_lock.
Date: Fri, 25 Feb 2022 16:21:11 +1100
Message-Id: <20220225052116.1243150-4-imran.f.khan@oracle.com>
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>

Having an interface allows the underlying locking mechanism to be changed
without needing to change the users of the lock. For example, the next
patch modifies this interface to make use of hashed spinlocks in place of
the global kernfs_open_node_lock.
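The idea can be illustrated outside the kernel. The following is only a
userspace sketch under stated assumptions, not the kernfs code: the
demo_* names, the C11 atomic_flag spinlock and the open_count field are
all hypothetical stand-ins. It shows callers going through a pair of
accessors (one that returns the lock, one that returns it already held)
so that the lock lookup can later change in one place.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for spinlock_t and struct kernfs_node. */
struct demo_spinlock {
	atomic_flag held;
};

struct demo_node {
	int open_count;		/* state guarded by the lock */
};

/* The single global lock; only the accessor below names it. */
static struct demo_spinlock demo_open_node_lock = { ATOMIC_FLAG_INIT };

static void demo_spin_lock(struct demo_spinlock *l)
{
	/* Busy-wait until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static void demo_spin_unlock(struct demo_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Return the lock that protects @n. Today every node maps to one lock. */
static struct demo_spinlock *demo_open_node_lock_ptr(struct demo_node *n)
{
	(void)n;	/* unused until the lookup depends on the node */
	return &demo_open_node_lock;
}

/* Look up the lock for @n, acquire it and hand it back to the caller. */
static struct demo_spinlock *demo_open_node_lock_acquire(struct demo_node *n)
{
	struct demo_spinlock *lock = demo_open_node_lock_ptr(n);

	demo_spin_lock(lock);
	return lock;
}

/* A caller written purely against the accessors. */
static int demo_node_open(struct demo_node *n)
{
	struct demo_spinlock *lock = demo_open_node_lock_acquire(n);
	int count = ++n->open_count;

	demo_spin_unlock(lock);
	return count;
}
```

Because demo_node_open() never refers to the global lock by name, only
demo_open_node_lock_ptr() would need to change if the single lock were
replaced by something else.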
Signed-off-by: Imran Khan
---
 fs/kernfs/file.c            | 23 ++++++++++++++---------
 fs/kernfs/kernfs-internal.h | 18 ++++++++++++++++++
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 8996b00568c38..1658bfa048df3 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -25,7 +25,7 @@
  * filp->private_data points to seq_file whose ->private points to
  * kernfs_open_file.
  */
-static DEFINE_SPINLOCK(kernfs_open_node_lock);
+DEFINE_SPINLOCK(kernfs_open_node_lock);
 
 struct kernfs_open_node {
 	atomic_t refcnt;
@@ -515,10 +515,11 @@ static int kernfs_get_open_node(struct kernfs_node *kn,
 {
 	struct kernfs_open_node *on, *new_on = NULL;
 	struct mutex *mutex = NULL;
+	spinlock_t *lock = NULL;
 
  retry:
 	mutex = kernfs_open_file_mutex_lock(kn);
-	spin_lock_irq(&kernfs_open_node_lock);
+	lock = kernfs_open_node_spinlock(kn);
 
 	if (!kn->attr.open && new_on) {
 		kn->attr.open = new_on;
@@ -531,7 +532,7 @@ static int kernfs_get_open_node(struct kernfs_node *kn,
 		list_add_tail(&of->list, &on->files);
 	}
 
-	spin_unlock_irq(&kernfs_open_node_lock);
+	spin_unlock_irq(lock);
 	mutex_unlock(mutex);
 
 	if (on) {
@@ -567,10 +568,13 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
 {
 	struct kernfs_open_node *on = kn->attr.open;
 	struct mutex *mutex = NULL;
+	spinlock_t *lock = NULL;
 	unsigned long flags;
 
 	mutex = kernfs_open_file_mutex_lock(kn);
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
+	lock = kernfs_open_node_spinlock_ptr(kn);
+
+	spin_lock_irqsave(lock, flags);
 
 	if (of)
 		list_del(&of->list);
@@ -580,7 +584,7 @@ static void kernfs_put_open_node(struct kernfs_node *kn,
 	else
 		on = NULL;
 
-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
+	spin_unlock_irqrestore(lock, flags);
 	mutex_unlock(mutex);
 
 	kfree(on);
@@ -763,15 +767,16 @@ void kernfs_drain_open_files(struct kernfs_node *kn)
 	struct kernfs_open_node *on;
 	struct kernfs_open_file *of;
 	struct mutex *mutex = NULL;
+	spinlock_t *lock = NULL;
 
 	if (!(kn->flags & (KERNFS_HAS_MMAP | KERNFS_HAS_RELEASE)))
 		return;
 
-	spin_lock_irq(&kernfs_open_node_lock);
+	lock = kernfs_open_node_spinlock(kn);
 	on = kn->attr.open;
 	if (on)
 		atomic_inc(&on->refcnt);
-	spin_unlock_irq(&kernfs_open_node_lock);
+	spin_unlock_irq(lock);
 	if (!on)
 		return;
 
@@ -916,13 +921,13 @@ void kernfs_notify(struct kernfs_node *kn)
 		return;
 
 	/* kick poll immediately */
-	spin_lock_irqsave(&kernfs_open_node_lock, flags);
+	spin_lock_irqsave(kernfs_open_node_spinlock_ptr(kn), flags);
 	on = kn->attr.open;
 	if (on) {
 		atomic_inc(&on->event);
 		wake_up_interruptible(&on->poll);
 	}
-	spin_unlock_irqrestore(&kernfs_open_node_lock, flags);
+	spin_unlock_irqrestore(kernfs_open_node_spinlock_ptr(kn), flags);
 
 	/* schedule work to kick fsnotify */
 	spin_lock_irqsave(&kernfs_notify_lock, flags);
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 03e983953eda4..ef5b04d43ef1b 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -152,6 +152,8 @@ extern const struct inode_operations kernfs_symlink_iops;
  */
 extern struct kernfs_global_locks *kernfs_locks;
 
+extern spinlock_t kernfs_open_node_lock;
+
 static inline struct mutex *kernfs_open_file_mutex_ptr(struct kernfs_node *kn)
 {
 	int idx = hash_ptr(kn, NR_KERNFS_LOCK_BITS);
@@ -170,4 +172,20 @@ static inline struct mutex *kernfs_open_file_mutex_lock(struct kernfs_node *kn)
 	return lock;
 }
 
+static inline spinlock_t *kernfs_open_node_spinlock_ptr(struct kernfs_node *kn)
+{
+	return &kernfs_open_node_lock;
+}
+
+static inline spinlock_t *kernfs_open_node_spinlock(struct kernfs_node *kn)
+{
+	spinlock_t *lock;
+
+	lock = kernfs_open_node_spinlock_ptr(kn);
+
+	spin_lock_irq(lock);
+
+	return lock;
+}
+
 #endif /* __KERNFS_INTERNAL_H */
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 4/8] kernfs: Replace global kernfs_open_node_lock with hashed spinlocks.
Date: Fri, 25 Feb 2022 16:21:12 +1100
Message-Id: <20220225052116.1243150-5-imran.f.khan@oracle.com>
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>

In the current kernfs design a single spinlock, kernfs_open_node_lock,
protects kernfs_node->attr.open, i.e. the kernfs_open_node instance
corresponding to a sysfs attribute. As a result, separate tasks that are
opening or closing separate sysfs files can contend on this spinlock. The
contention is more apparent on large-scale systems with a few hundred
CPUs, where most of the CPUs are running tasks that open, access or close
sysfs files at any point in time.

Using hashed spinlocks in place of a single global spinlock can reduce
this contention and hence provide better scalability. Moreover, as these
hashed spinlocks are not part of kernfs_node objects, we will not see any
significant change in the memory utilization of kernfs-based file systems
such as sysfs and cgroupfs.

Modify the interface introduced in the previous patch to make use of
hashed spinlocks. Use the kernfs_node address as the hashing key.
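The hashing scheme can be sketched in userspace as follows. This is an
illustration under assumptions, not the kernfs implementation: the toy_*
names, the fixed TOY_LOCK_BITS value (the series derives the real value
from NR_CPUS), the C11 atomic_flag spinlock and the multiplicative hash
standing in for the kernel's hash_ptr() are all hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TOY_LOCK_BITS	4			/* assumed fixed for this sketch */
#define TOY_NR_LOCKS	(1u << TOY_LOCK_BITS)

/* Hypothetical stand-ins for spinlock_t and struct kernfs_node. */
struct toy_spinlock {
	atomic_flag held;
};

struct toy_node {
	int open_count;		/* state guarded by the hashed lock */
};

/* Table of locks shared by all nodes; not stored in the nodes themselves. */
static struct toy_spinlock toy_open_node_locks[TOY_NR_LOCKS];

/* Put every lock in the table into the unlocked state. */
static void toy_locks_init(void)
{
	for (unsigned int i = 0; i < TOY_NR_LOCKS; i++)
		atomic_flag_clear(&toy_open_node_locks[i].held);
}

/* Multiplicative pointer hash that keeps the top TOY_LOCK_BITS bits. */
static unsigned int toy_hash_ptr(const void *p)
{
	uint64_t v = (uint64_t)(uintptr_t)p;

	return (unsigned int)((v * 0x61C8864680B583EBull) >> (64 - TOY_LOCK_BITS));
}

/* Map a node address to its lock; distinct nodes usually get distinct locks. */
static struct toy_spinlock *toy_open_node_lock_ptr(struct toy_node *n)
{
	return &toy_open_node_locks[toy_hash_ptr(n)];
}

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Update per-node state under the lock selected by the node's address. */
static int toy_node_open(struct toy_node *n)
{
	struct toy_spinlock *lock = toy_open_node_lock_ptr(n);
	int count;

	toy_spin_lock(lock);
	count = ++n->open_count;
	toy_spin_unlock(lock);
	return count;
}
```

Two nodes can still hash to the same slot, in which case they share a
lock. That is harmless for correctness and only costs some contention,
which is why the table size is a tuning knob rather than a correctness
requirement.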
Signed-off-by: Imran Khan --- fs/kernfs/file.c | 9 --------- fs/kernfs/kernfs-internal.h | 6 +++--- fs/kernfs/mount.c | 4 +++- include/linux/kernfs.h | 10 +++++++++- 4 files changed, 15 insertions(+), 14 deletions(-) diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 1658bfa048df3..95426df9f0304 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -18,15 +18,6 @@ =20 #include "kernfs-internal.h" =20 -/* - * kernfs_node->attr.open points to kernfs_open_node. attr.open is - * protected by kernfs_open_node_lock. - * - * filp->private_data points to seq_file whose ->private points to - * kernfs_open_file. - */ -DEFINE_SPINLOCK(kernfs_open_node_lock); - struct kernfs_open_node { atomic_t refcnt; atomic_t event; diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h index ef5b04d43ef1b..64e9cca66d436 100644 --- a/fs/kernfs/kernfs-internal.h +++ b/fs/kernfs/kernfs-internal.h @@ -152,8 +152,6 @@ extern const struct inode_operations kernfs_symlink_iop= s; */ extern struct kernfs_global_locks *kernfs_locks; =20 -extern spinlock_t kernfs_open_node_lock; - static inline struct mutex *kernfs_open_file_mutex_ptr(struct kernfs_node = *kn) { int idx =3D hash_ptr(kn, NR_KERNFS_LOCK_BITS); @@ -174,7 +172,9 @@ static inline struct mutex *kernfs_open_file_mutex_lock= (struct kernfs_node *kn) =20 static inline spinlock_t *kernfs_open_node_spinlock_ptr(struct kernfs_node= *kn) { - return &kernfs_open_node_lock; + int idx =3D hash_ptr(kn, NR_KERNFS_LOCK_BITS); + + return &kernfs_locks->open_node_locks[idx].lock; } =20 static inline spinlock_t *kernfs_open_node_spinlock(struct kernfs_node *kn) diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c index fa3fa22c95b21..809b738739b18 100644 --- a/fs/kernfs/mount.c +++ b/fs/kernfs/mount.c @@ -395,8 +395,10 @@ void __init kernfs_lock_init(void) kernfs_locks =3D kmalloc(sizeof(struct kernfs_global_locks), GFP_KERNEL); WARN_ON(!kernfs_locks); =20 - for (count =3D 0; count < NR_KERNFS_LOCKS; count++) + for (count =3D 0; count 
< NR_KERNFS_LOCKS; count++) { mutex_init(&kernfs_locks->open_file_mutex[count].lock); + spin_lock_init(&kernfs_locks->open_node_locks[count].lock); + } } =20 void __init kernfs_init(void) diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index 3f72d38d48e31..e50528c45bcd4 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -19,6 +19,7 @@ #include #include #include +#include =20 struct file; struct dentry; @@ -75,20 +76,27 @@ struct kernfs_iattrs; * kernfs_open_file. * kernfs_open_files are chained at kernfs_open_node->files, which is * protected by kernfs_open_file_mutex.lock. + * + * kernfs_node->attr.open points to kernfs_open_node. attr.open is + * protected by kernfs_open_node_lock.lock. */ =20 struct kernfs_open_file_mutex { struct mutex lock; } ____cacheline_aligned_in_smp; =20 +struct kernfs_open_node_lock { + spinlock_t lock; +} ____cacheline_aligned_in_smp; + /* * To reduce possible contention in sysfs access, arising due to single * locks, use an array of locks and use kernfs_node object address as * hash keys to get the index of these locks. 
*/ - struct kernfs_global_locks { struct kernfs_open_file_mutex open_file_mutex[NR_KERNFS_LOCKS]; + struct kernfs_open_node_lock open_node_locks[NR_KERNFS_LOCKS]; }; =20 enum kernfs_node_type { --=20 2.30.2 From nobody Tue Jun 23 22:31:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F05E9C433F5 for ; Fri, 25 Feb 2022 05:22:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237509AbiBYFWb (ORCPT ); Fri, 25 Feb 2022 00:22:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237487AbiBYFWO (ORCPT ); Fri, 25 Feb 2022 00:22:14 -0500 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FAF03B022 for ; Thu, 24 Feb 2022 21:21:43 -0800 (PST) Received: from pps.filterd (m0246629.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 21P4k3tv019955; Fri, 25 Feb 2022 05:21:39 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : content-type : mime-version; s=corp-2021-07-09; bh=aZa/TrkKtTNqKT9QgLUCzdbxF8gSva5/VMgR85ZGRIQ=; b=1Mj1w8dJ0SKZRgxrs06aPQ0VYd4OgIy7cYoiUbbxkR6CpytZBRATY+TE/iu+GEcLCV/E 65wWUhYWtvYf8Lv4JMuj/vKQYrNwak7zOmJAOwOejR4Cc7F21creazRlIElOvEA7BHhh UNci0+2ZMXdAojTsqumx0x8GILlqAWcwUfxsQomLbnmrG5JSD40FF006w5HcHyrzQxxK 7cyjm6U1/8ydKfwuNLn5kJZAjE/Rglj1y7GyQROLiI7GnBIiQwNMX9oIa/jFm+qD/d6M 3rLF7hDKelBgXJoT4W25OQPmipn7caCKwDMEDwR7bNeCTdonEDDAtsB2R8kmLjrtDcdV 9Q== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by mx0b-00069f02.pphosted.com with ESMTP id 3ect3cs1xk-1 
(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 25 Feb 2022 05:21:39 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 21P5AnHw169721; Fri, 25 Feb 2022 05:21:38 GMT Received: from nam10-dm6-obe.outbound.protection.outlook.com (mail-dm6nam10lp2100.outbound.protection.outlook.com [104.47.58.100]) by aserp3020.oracle.com with ESMTP id 3eb484e7gw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 25 Feb 2022 05:21:37 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Jnc5Rzntykcnzx5jijmpQVFMEkjIgO7hJZn4aX0fiYYs0jsD1j8VvyrtnpPbq4TAf6cb0Cv1/2qRe98BD5oQXpSula2CRpah0WZRZTYcz/WJ/gPRb7anCtTmfBLZAJ76GEzSmzFlsYCWCfFJyaZyx0XW22z8sMvt//5krxMeasH9noeEibnQ24BnoTVNqRU81kFhZQ+rQsMdSSntZI8ToqmJRhdXuTbTtD2QRC23UTjCJQhMydWb2qiUMXgMU9SJYlwvIEIrwdxYfqjSoZiqLmpqqFzbPWXMyMvV7oBIpWZC8ryJn0Emwy2SDyARl8Qb5uXtrvmi/nPldB19k+fWDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=aZa/TrkKtTNqKT9QgLUCzdbxF8gSva5/VMgR85ZGRIQ=; b=Apd1RLQ2zLG/SEAR5HtWIJFH3Lc84ffJym1cvMHbbd5Mz4Cl5w1PnA//YlVjLJQv+3++WMnUqc/6a28hTnxPGbhK616DC9jwoU+TWhJhm9DM+2FRNMCRNBiW19h6TXNK3+jqwrQQMtEoMYlDXi4F3+nvIASgpJQNO1YlDXczRdqMz5X8XfD2HNOkiPR8wSRuWuNn7ksaDGVnxD3+yHaX3eFkI6vLw1e6dbId123u5WCGksIfLJmZ8lw6MOrqfQKGXqCENm8i00R/zi1t6EGFYBS6KRTr9vbamA5fiufT3p/KiRqEhruV5+dEmppyl6NzZ/P2QxwyV4sxkaC1Qw8zrg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=aZa/TrkKtTNqKT9QgLUCzdbxF8gSva5/VMgR85ZGRIQ=; b=mvGES3DrPBsFwAg64k+paKTB50IxHaAdMJ4sGtlZE4bl1odDOM64LRc0eRvXoeIPQyWpKh5xrpSbsfw0dVnFTk4K0O3g5p38FJWpVpBXhs4FRjsqoK1ohAmo3ox9xtPN4ERmm/FHQxiWAbTlnO84e9lkeCjMNBArr45kAs+hro0= Received: from CO1PR10MB4468.namprd10.prod.outlook.com (2603:10b6:303:6c::24) by MWHPR10MB1677.namprd10.prod.outlook.com (2603:10b6:301:a::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4995.24; Fri, 25 Feb 2022 05:21:36 +0000 Received: from CO1PR10MB4468.namprd10.prod.outlook.com ([fe80::b5ab:1c3e:6540:d2fa]) by CO1PR10MB4468.namprd10.prod.outlook.com ([fe80::b5ab:1c3e:6540:d2fa%9]) with mapi id 15.20.5017.025; Fri, 25 Feb 2022 05:21:36 +0000 From: Imran Khan To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk Cc: linux-kernel@vger.kernel.org Subject: [PATCH v7 5/8] kernfs: Use a per-fs rwsem to protect per-fs list of kernfs_super_info. 
Date: Fri, 25 Feb 2022 16:21:13 +1100
Message-Id: <20220225052116.1243150-6-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Right now the per-fs kernfs_rwsem protects the list of kernfs_super_info
instances for a kernfs_root. kernfs_rwsem is also used to synchronize
several other operations across kernfs, and most of those operations
don't touch kernfs_super_info, so we can use a separate per-fs rwsem to
synchronize access to the list of kernfs_super_info. This reduces
contention on kernfs_rwsem and lets operations that change or access the
list of kernfs_super_info proceed without contending for kernfs_rwsem.

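As an illustration of the split described above, here is a minimal
userspace sketch. It is not kernel code: toy_rwsem, toy_root,
toy_add_super() and toy_del_super() are hypothetical names, and the toy
lock only records whether a writer holds it, which is enough for a
single-threaded model. The point is that adding or removing a list entry
touches only supers_rwsem, so it can proceed while kernfs_rwsem is held
for unrelated work.

```c
#include <assert.h>

/* Toy stand-in for struct rw_semaphore (assumption, not the kernel type). */
struct toy_rwsem {
	int write_held;
};

static void toy_down_write(struct toy_rwsem *sem)
{
	assert(!sem->write_held);	/* a real rwsem would block here */
	sem->write_held = 1;
}

static void toy_up_write(struct toy_rwsem *sem)
{
	assert(sem->write_held);
	sem->write_held = 0;
}

struct toy_super_info {
	struct toy_super_info *next;
};

/* Toy analogue of kernfs_root after the split: two independent locks. */
struct toy_root {
	struct toy_rwsem kernfs_rwsem;	/* node tree and everything else */
	struct toy_rwsem supers_rwsem;	/* only the list of super infos */
	struct toy_super_info *supers;
	int nr_supers;
};

/* Mount-path analogue: only supers_rwsem is taken. */
static void toy_add_super(struct toy_root *root, struct toy_super_info *info)
{
	toy_down_write(&root->supers_rwsem);
	info->next = root->supers;
	root->supers = info;
	root->nr_supers++;
	toy_up_write(&root->supers_rwsem);
}

/* Unmount-path analogue: again only supers_rwsem is taken. */
static void toy_del_super(struct toy_root *root, struct toy_super_info *info)
{
	struct toy_super_info **pp;

	toy_down_write(&root->supers_rwsem);
	for (pp = &root->supers; *pp; pp = &(*pp)->next) {
		if (*pp == info) {
			*pp = info->next;
			root->nr_supers--;
			break;
		}
	}
	toy_up_write(&root->supers_rwsem);
}
```

In this model a caller may hold root->kernfs_rwsem for writing and still
call toy_add_super() or toy_del_super(); with a single shared lock the
same call would have to wait.
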
Signed-off-by: Imran Khan
---
 fs/kernfs/dir.c        | 1 +
 fs/kernfs/file.c       | 2 ++
 fs/kernfs/mount.c      | 8 ++++----
 include/linux/kernfs.h | 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index e6d9772ddb4ca..dc769301ac96b 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -917,6 +917,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops,
 	idr_init(&root->ino_idr);
 	init_rwsem(&root->kernfs_rwsem);
 	INIT_LIST_HEAD(&root->supers);
+	init_rwsem(&root->supers_rwsem);
 
 	/*
 	 * On 64bit ino setups, id is ino. On 32bit, low 32bits are ino.
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 95426df9f0304..07003d47343d7 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -854,6 +854,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 	/* kick fsnotify */
 	down_write(&root->kernfs_rwsem);
 
+	down_write(&root->supers_rwsem);
 	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
 		struct kernfs_node *parent;
 		struct inode *p_inode = NULL;
@@ -889,6 +890,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 
 		iput(inode);
 	}
+	up_write(&root->supers_rwsem);
 
 	up_write(&root->kernfs_rwsem);
 	kernfs_put(kn);
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index 809b738739b18..d35142226c340 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -347,9 +347,9 @@ int kernfs_get_tree(struct fs_context *fc)
 		}
 		sb->s_flags |= SB_ACTIVE;
 
-		down_write(&root->kernfs_rwsem);
+		down_write(&root->supers_rwsem);
 		list_add(&info->node, &info->root->supers);
-		up_write(&root->kernfs_rwsem);
+		up_write(&root->supers_rwsem);
 	}
 
 	fc->root = dget(sb->s_root);
@@ -376,9 +376,9 @@ void kernfs_kill_sb(struct super_block *sb)
 	struct kernfs_super_info *info = kernfs_info(sb);
 	struct kernfs_root *root = info->root;
 
-	down_write(&root->kernfs_rwsem);
+	down_write(&root->supers_rwsem);
 	list_del(&info->node);
-	up_write(&root->kernfs_rwsem);
+	up_write(&root->supers_rwsem);
 
 	/*
 	 * Remove the superblock from fs_supers/s_instances
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index e50528c45bcd4..3f7f39b92c8b0 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -266,6 +266,7 @@ struct kernfs_root {
 
 	wait_queue_head_t	deactivate_waitq;
 	struct rw_semaphore	kernfs_rwsem;
+	struct rw_semaphore	supers_rwsem;
 };
 
 struct kernfs_open_file {
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 6/8] kernfs: Introduce interface to access per-fs rwsem.
Date: Fri, 25 Feb 2022 16:21:14 +1100
Message-Id: <20220225052116.1243150-7-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The per-fs rwsem is used across kernfs for synchronization. Having an
interface to access it not only avoids code duplication, it also makes
it possible to change the underlying locking mechanism without changing
the lock users. For example, the next patch modifies this interface to
use hashed rwsems in place of the per-fs rwsem.

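The calling convention of such an interface can be sketched in
userspace as follows. This is a toy model, not the kernel code: every
demo_* name is hypothetical, and the address-hash mapping inside
demo_rwsem_ptr() is only an assumed stand-in for "some other way of
picking a lock", not a description of what the next patch does. The
lock helper returns the lock it took and the unlock helper takes that
pointer, so callers never spell out how a node maps to a lock.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_LOCKS 4

/* Toy stand-in for struct rw_semaphore (assumption, not the kernel type). */
struct demo_rwsem {
	int write_held;
};

struct demo_node {
	struct demo_node *parent;
};

static struct demo_rwsem demo_locks[DEMO_NR_LOCKS];

/* The only place that knows how a node maps to a lock. */
static struct demo_rwsem *demo_rwsem_ptr(struct demo_node *kn)
{
	/* Hypothetical mapping: hash the node address into a small table. */
	return &demo_locks[((uintptr_t)kn >> 4) % DEMO_NR_LOCKS];
}

/* Acquire the lock for @kn and hand it back to the caller. */
static struct demo_rwsem *demo_down_write(struct demo_node *kn)
{
	struct demo_rwsem *rwsem = demo_rwsem_ptr(kn);

	assert(!rwsem->write_held);	/* a real rwsem would block here */
	rwsem->write_held = 1;
	return rwsem;
}

/* Release exactly the lock that was acquired earlier. */
static void demo_up_write(struct demo_rwsem *rwsem)
{
	assert(rwsem->write_held);
	rwsem->write_held = 0;
}

/* A caller written in the style of the converted kernfs functions. */
static int demo_touch(struct demo_node *kn)
{
	struct demo_rwsem *rwsem = demo_down_write(kn);
	int held = rwsem->write_held;	/* work done under the lock */

	demo_up_write(rwsem);
	return held;
}
```

Replacing the body of demo_rwsem_ptr() with a lookup of a single
per-root lock would leave demo_touch() unchanged, which is the property
the commit message above relies on.
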
Signed-off-by: Imran Khan
---
 fs/kernfs/dir.c             | 114 ++++++++++++++++++------------------
 fs/kernfs/file.c            |   5 +-
 fs/kernfs/inode.c           |  26 ++++----
 fs/kernfs/kernfs-internal.h |  78 ++++++++++++++++++++++++
 fs/kernfs/mount.c           |   6 +-
 fs/kernfs/symlink.c         |   6 +-
 6 files changed, 156 insertions(+), 79 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index dc769301ac96b..8f22b2735755f 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -25,7 +25,7 @@ static DEFINE_SPINLOCK(kernfs_idr_lock);	/* root->ino_idr */
 
 static bool kernfs_active(struct kernfs_node *kn)
 {
-	lockdep_assert_held(&kernfs_root(kn)->kernfs_rwsem);
+	kernfs_rwsem_assert_held(kn);
 	return atomic_read(&kn->active) >= 0;
 }
 
@@ -461,10 +461,16 @@ static void kernfs_drain(struct kernfs_node *kn)
 {
 	struct kernfs_root *root = kernfs_root(kn);
 
-	lockdep_assert_held_write(&root->kernfs_rwsem);
+	/**
+	 * kn has the same root as its ancestor, so it can be used to get
+	 * per-fs rwsem.
+	 */
+	struct rw_semaphore *rwsem = kernfs_rwsem_ptr(kn);
+
+	kernfs_rwsem_assert_held_write(kn);
 	WARN_ON_ONCE(kernfs_active(kn));
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 
 	if (kernfs_lockdep(kn)) {
 		rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_);
@@ -483,7 +489,7 @@ static void kernfs_drain(struct kernfs_node *kn)
 
 	kernfs_drain_open_files(kn);
 
-	down_write(&root->kernfs_rwsem);
+	kernfs_down_write(kn);
 }
 
 /**
@@ -718,12 +724,12 @@ struct kernfs_node *kernfs_find_and_get_node_by_id(struct kernfs_root *root,
 int kernfs_add_one(struct kernfs_node *kn)
 {
 	struct kernfs_node *parent = kn->parent;
-	struct kernfs_root *root = kernfs_root(parent);
 	struct kernfs_iattrs *ps_iattr;
+	struct rw_semaphore *rwsem;
 	bool has_ns;
 	int ret;
 
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(parent);
 
 	ret = -EINVAL;
 	has_ns = kernfs_ns_enabled(parent);
@@ -754,7 +760,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 		ps_iattr->ia_mtime = ps_iattr->ia_ctime;
 	}
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 
 	/*
 	 * Activate the new node unless CREATE_DEACTIVATED is requested.
@@ -768,7 +774,7 @@ int kernfs_add_one(struct kernfs_node *kn)
 	return 0;
 
 out_unlock:
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	return ret;
 }
 
@@ -789,7 +795,7 @@ static struct kernfs_node *kernfs_find_ns(struct kernfs_node *parent,
 	bool has_ns = kernfs_ns_enabled(parent);
 	unsigned int hash;
 
-	lockdep_assert_held(&kernfs_root(parent)->kernfs_rwsem);
+	kernfs_rwsem_assert_held(parent);
 
 	if (has_ns != (bool)ns) {
 		WARN(1, KERN_WARNING "kernfs: ns %s in '%s' for '%s'\n",
@@ -821,7 +827,7 @@ static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
 	size_t len;
 	char *p, *name;
 
-	lockdep_assert_held_read(&kernfs_root(parent)->kernfs_rwsem);
+	kernfs_rwsem_assert_held_read(parent);
 
 	/* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
 	spin_lock_irq(&kernfs_rename_lock);
@@ -860,12 +866,12 @@ struct kernfs_node *kernfs_find_and_get_ns(struct kernfs_node *parent,
 					   const char *name, const void *ns)
 {
 	struct kernfs_node *kn;
-	struct kernfs_root *root = kernfs_root(parent);
+	struct rw_semaphore *rwsem;
 
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(parent);
 	kn = kernfs_find_ns(parent, name, ns);
 	kernfs_get(kn);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	return kn;
 }
@@ -885,12 +891,12 @@ struct kernfs_node *kernfs_walk_and_get_ns(struct kernfs_node *parent,
 					   const char *path, const void *ns)
 {
 	struct kernfs_node *kn;
-	struct kernfs_root *root = kernfs_root(parent);
+	struct rw_semaphore *rwsem;
 
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(parent);
 	kn = kernfs_walk_ns(parent, path, ns);
 	kernfs_get(kn);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	return kn;
 }
@@ -1046,7 +1052,7 @@ struct kernfs_node *kernfs_create_empty_dir(struct kernfs_node *parent,
 static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 {
 	struct kernfs_node *kn;
-	struct kernfs_root *root;
+	struct rw_semaphore *rwsem;
 
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
@@ -1062,13 +1068,12 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 		parent = kernfs_dentry_node(dentry->d_parent);
 		if (parent) {
 			spin_unlock(&dentry->d_lock);
-			root = kernfs_root(parent);
-			down_read(&root->kernfs_rwsem);
+			rwsem = kernfs_down_read(parent);
 			if (kernfs_dir_changed(parent, dentry)) {
-				up_read(&root->kernfs_rwsem);
+				kernfs_up_read(rwsem);
 				return 0;
 			}
-			up_read(&root->kernfs_rwsem);
+			kernfs_up_read(rwsem);
 		} else
 			spin_unlock(&dentry->d_lock);
 
@@ -1079,8 +1084,7 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	}
 
 	kn = kernfs_dentry_node(dentry);
-	root = kernfs_root(kn);
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(kn);
 
 	/* The kernfs node has been deactivated */
 	if (!kernfs_active(kn))
@@ -1099,10 +1103,10 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
 		goto out_bad;
 
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 	return 1;
 out_bad:
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 	return 0;
 }
 
@@ -1116,12 +1120,11 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 {
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_node *kn;
-	struct kernfs_root *root;
 	struct inode *inode = NULL;
 	const void *ns = NULL;
+	struct rw_semaphore *rwsem;
 
-	root = kernfs_root(parent);
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(parent);
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
 
@@ -1132,7 +1135,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 		 * create a negative.
 		 */
 		if (!kernfs_active(kn)) {
-			up_read(&root->kernfs_rwsem);
+			kernfs_up_read(rwsem);
 			return NULL;
 		}
 		inode = kernfs_get_inode(dir->i_sb, kn);
@@ -1147,7 +1150,7 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	 */
 	if (!IS_ERR(inode))
 		kernfs_set_rev(parent, dentry);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	/* instantiate and hash (possibly negative) dentry */
 	return d_splice_alias(inode, dentry);
@@ -1270,7 +1273,7 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
 {
 	struct rb_node *rbn;
 
-	lockdep_assert_held_write(&kernfs_root(root)->kernfs_rwsem);
+	kernfs_rwsem_assert_held_write(root);
 
 	/* if first iteration, visit leftmost descendant which may be root */
 	if (!pos)
@@ -1305,9 +1308,9 @@ static struct kernfs_node *kernfs_next_descendant_post(struct kernfs_node *pos,
 void kernfs_activate(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
-	struct kernfs_root *root = kernfs_root(kn);
+	struct rw_semaphore *rwsem;
 
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 
 	pos = NULL;
 	while ((pos = kernfs_next_descendant_post(pos, kn))) {
@@ -1321,14 +1324,14 @@ void kernfs_activate(struct kernfs_node *kn)
 		pos->flags |= KERNFS_ACTIVATED;
 	}
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 }
 
 static void __kernfs_remove(struct kernfs_node *kn)
 {
 	struct kernfs_node *pos;
 
-	lockdep_assert_held_write(&kernfs_root(kn)->kernfs_rwsem);
+	kernfs_rwsem_assert_held_write(kn);
 
 	/*
 	 * Short-circuit if non-root @kn has already finished removal.
@@ -1398,11 +1401,11 @@ static void __kernfs_remove(struct kernfs_node *kn)
  */
 void kernfs_remove(struct kernfs_node *kn)
 {
-	struct kernfs_root *root = kernfs_root(kn);
+	struct rw_semaphore *rwsem;
 
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 	__kernfs_remove(kn);
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 }
 
 /**
@@ -1488,9 +1491,9 @@ void kernfs_unbreak_active_protection(struct kernfs_node *kn)
 bool kernfs_remove_self(struct kernfs_node *kn)
 {
 	bool ret;
-	struct kernfs_root *root = kernfs_root(kn);
+	struct rw_semaphore *rwsem;
 
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 	kernfs_break_active_protection(kn);
 
 	/*
@@ -1518,9 +1521,9 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 			    atomic_read(&kn->active) == KN_DEACTIVATED_BIAS)
 				break;
 
-			up_write(&root->kernfs_rwsem);
+			kernfs_up_write(rwsem);
 			schedule();
-			down_write(&root->kernfs_rwsem);
+			rwsem = kernfs_down_write(kn);
 		}
 		finish_wait(waitq, &wait);
 		WARN_ON_ONCE(!RB_EMPTY_NODE(&kn->rb));
@@ -1533,7 +1536,7 @@ bool kernfs_remove_self(struct kernfs_node *kn)
 	 */
 	kernfs_unbreak_active_protection(kn);
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	return ret;
 }
 
@@ -1550,7 +1553,7 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 			     const void *ns)
 {
 	struct kernfs_node *kn;
-	struct kernfs_root *root;
+	struct rw_semaphore *rwsem;
 
 	if (!parent) {
 		WARN(1, KERN_WARNING "kernfs: can not remove '%s', no directory\n",
@@ -1558,14 +1561,13 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
 		return -ENOENT;
 	}
 
-	root = kernfs_root(parent);
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(parent);
 
 	kn = kernfs_find_ns(parent, name, ns);
 	if (kn)
 		__kernfs_remove(kn);
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 
 	if (kn)
 		return 0;
@@ -1584,16 +1586,15 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 		     const char *new_name, const void *new_ns)
 {
 	struct kernfs_node *old_parent;
-	struct kernfs_root *root;
 	const char *old_name = NULL;
+	struct rw_semaphore *rwsem;
 	int error;
 
 	/* can't move or rename root */
 	if (!kn->parent)
 		return -EINVAL;
 
-	root = kernfs_root(kn);
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 
 	error = -ENOENT;
 	if (!kernfs_active(kn) || !kernfs_active(new_parent) ||
@@ -1647,7 +1648,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct kernfs_node *new_parent,
 
 	error = 0;
  out:
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	return error;
 }
 
@@ -1718,14 +1719,13 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 	struct dentry *dentry = file->f_path.dentry;
 	struct kernfs_node *parent = kernfs_dentry_node(dentry);
 	struct kernfs_node *pos = file->private_data;
-	struct kernfs_root *root;
 	const void *ns = NULL;
+	struct rw_semaphore *rwsem;
 
 	if (!dir_emit_dots(file, ctx))
 		return 0;
 
-	root = kernfs_root(parent);
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(parent);
 
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dentry->d_sb)->ns;
@@ -1742,12 +1742,12 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 		file->private_data = pos;
 		kernfs_get(pos);
 
-		up_read(&root->kernfs_rwsem);
+		kernfs_up_read(rwsem);
 		if (!dir_emit(ctx, name, len, ino, type))
 			return 0;
-		down_read(&root->kernfs_rwsem);
+		rwsem = kernfs_down_read(parent);
 	}
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 	file->private_data = NULL;
 	ctx->pos = INT_MAX;
 	return 0;
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index 07003d47343d7..f46c25fb789fb 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -838,6 +838,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 	struct kernfs_node *kn;
 	struct kernfs_super_info *info;
 	struct kernfs_root *root;
+	struct rw_semaphore *rwsem;
 repeat:
 	/* pop one off the notify_list */
 	spin_lock_irq(&kernfs_notify_lock);
@@ -852,7 +853,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 
 	root = kernfs_root(kn);
 	/* kick fsnotify */
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 
 	down_write(&root->supers_rwsem);
 	list_for_each_entry(info, &kernfs_root(kn)->supers, node) {
@@ -892,7 +893,7 @@ static void kernfs_notify_workfn(struct work_struct *work)
 	}
 	up_write(&root->supers_rwsem);
 
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	kernfs_put(kn);
 	goto repeat;
 }
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 3d783d80f5daa..efe5ae98abf46 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -99,11 +99,11 @@ int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr)
 int kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr)
 {
 	int ret;
-	struct kernfs_root *root = kernfs_root(kn);
+	struct rw_semaphore *rwsem;
 
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 	ret = __kernfs_setattr(kn, iattr);
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	return ret;
 }
 
@@ -112,14 +112,13 @@ int kernfs_iop_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 {
 	struct inode *inode = d_inode(dentry);
 	struct kernfs_node *kn = inode->i_private;
-	struct kernfs_root *root;
+	struct rw_semaphore *rwsem;
 	int error;
 
 	if (!kn)
 		return -EINVAL;
 
-	root = kernfs_root(kn);
-	down_write(&root->kernfs_rwsem);
+	rwsem = kernfs_down_write(kn);
 	error = setattr_prepare(&init_user_ns, dentry, iattr);
 	if (error)
 		goto out;
@@ -132,7 +131,7 @@ int kernfs_iop_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 	setattr_copy(&init_user_ns, inode, iattr);
 
 out:
-	up_write(&root->kernfs_rwsem);
+	kernfs_up_write(rwsem);
 	return error;
 }
 
@@ -187,14 +186,14 @@ int kernfs_iop_getattr(struct user_namespace *mnt_userns,
 {
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
-	struct kernfs_root *root = kernfs_root(kn);
+	struct rw_semaphore *rwsem;
 
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(kn);
 	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
 	generic_fillattr(&init_user_ns, inode, stat);
 	spin_unlock(&inode->i_lock);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	return 0;
 }
@@ -277,22 +276,21 @@ void kernfs_evict_inode(struct inode *inode)
 int kernfs_iop_permission(struct user_namespace *mnt_userns,
 			  struct inode *inode, int mask)
 {
+	struct rw_semaphore *rwsem;
 	struct kernfs_node *kn;
-	struct kernfs_root *root;
 	int ret;
 
 	if (mask & MAY_NOT_BLOCK)
 		return -ECHILD;
 
 	kn = inode->i_private;
-	root = kernfs_root(kn);
 
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(kn);
 	spin_lock(&inode->i_lock);
 	kernfs_refresh_inode(kn, inode);
 	ret = generic_permission(&init_user_ns, inode, mask);
 	spin_unlock(&inode->i_lock);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	return ret;
 }
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index 64e9cca66d436..bb934949d5eb5 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -188,4 +188,82 @@ static inline spinlock_t *kernfs_open_node_spinlock(struct kernfs_node *kn)
 	return lock;
 }
 
+static inline struct rw_semaphore *kernfs_rwsem_ptr(struct kernfs_node *kn)
+{
+	struct kernfs_root *root = kernfs_root(kn);
+
+	return &root->kernfs_rwsem;
+}
+
+static inline void kernfs_rwsem_assert_held(struct kernfs_node *kn)
+{
+	lockdep_assert_held(kernfs_rwsem_ptr(kn));
+}
+
+static inline void kernfs_rwsem_assert_held_write(struct kernfs_node *kn)
+{
+	lockdep_assert_held_write(kernfs_rwsem_ptr(kn));
+}
+
+static inline void kernfs_rwsem_assert_held_read(struct kernfs_node *kn)
+{
+	lockdep_assert_held_read(kernfs_rwsem_ptr(kn));
+}
+
+/**
+ * kernfs_down_write() - Acquire kernfs rwsem
+ *
+ * @kn: kernfs_node for which rwsem needs to be taken
+ *
+ * Return: pointer to acquired rwsem
+ */
+static inline struct rw_semaphore *kernfs_down_write(struct kernfs_node *kn)
+{
+	struct rw_semaphore *rwsem = kernfs_rwsem_ptr(kn);
+
+	down_write(rwsem);
+
+	return rwsem;
+}
+
+/**
+ * kernfs_up_write - Release kernfs rwsem
+ *
+ * @rwsem: address of rwsem to release
+ *
+ * Return: void
+ */
+static inline void kernfs_up_write(struct rw_semaphore *rwsem)
+{
+	up_write(rwsem);
+}
+
+/**
+ * kernfs_down_read() - Acquire kernfs rwsem
+ *
+ * @kn: kernfs_node for which rwsem needs to be taken
+ *
+ * Return: pointer to acquired rwsem
+ */
+static inline struct rw_semaphore *kernfs_down_read(struct kernfs_node *kn)
+{
+	struct rw_semaphore *rwsem = kernfs_rwsem_ptr(kn);
+
+	down_read(rwsem);
+
+	return rwsem;
+}
+
+/**
+ * kernfs_up_read - Release kernfs rwsem
+ *
+ * @rwsem: address of rwsem to release
+ *
+ * Return: void
+ */
+static inline void kernfs_up_read(struct rw_semaphore *rwsem)
+{
+	up_read(rwsem);
+}
+
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index d35142226c340..f88dc4e26ffb5 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -237,9 +237,9 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
 static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *kfc)
 {
 	struct kernfs_super_info *info = kernfs_info(sb);
-	struct kernfs_root *kf_root = kfc->root;
 	struct inode *inode;
 	struct dentry *root;
+	struct rw_semaphore *rwsem;
 
 	info->sb = sb;
 	/* Userspace would break if executables or devices appear on sysfs */
@@ -257,9 +257,9 @@ static int kernfs_fill_super(struct super_block *sb, struct kernfs_fs_context *k
 	sb->s_shrink.seeks = 0;
 
 	/* get root inode, initialize and unlock it */
-	down_read(&kf_root->kernfs_rwsem);
+	rwsem = kernfs_down_read(info->root->kn);
 	inode = kernfs_get_inode(sb, info->root->kn);
-	up_read(&kf_root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 	if (!inode) {
 		pr_debug("kernfs: could not get root inode\n");
 		return -ENOMEM;
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
index 0ab13824822f7..9d41036025547 100644
--- a/fs/kernfs/symlink.c
+++ b/fs/kernfs/symlink.c
@@ -113,12 +113,12 @@ static int kernfs_getlink(struct inode *inode, char *path)
 	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_node *parent = kn->parent;
 	struct kernfs_node *target = kn->symlink.target_kn;
-	struct kernfs_root *root = kernfs_root(parent);
+	struct rw_semaphore *rwsem;
 	int error;
 
-	down_read(&root->kernfs_rwsem);
+	rwsem = kernfs_down_read(parent);
 	error = kernfs_get_target_path(parent, target, path);
-	up_read(&root->kernfs_rwsem);
+	kernfs_up_read(rwsem);
 
 	return error;
 }
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding :
content-type : mime-version; s=corp-2021-07-09; bh=ls7MR3VZ94Z64C7mVXThwgv0Ppqc/K95D6mWw4jrrbM=; b=KpiGj+sQS1CpqaY7qGNjhe3u21ubWYY8eQDTohU1PjzkONhRQrzqx+6cb7oQS/kmTBAq 1L/pRPO+QcxV5TpQSAVK1+pHXrP7yhV2JTPfjWbsCNofJk4ZEvW288npwAf36HJxp9i/ K5R1SqiOM7xi9uSnGa5OnH5Moy142/efh4p5XFT/gRYylBV3CbK0Ta8SmefHLklv3+xt jnG8PVbCdlve+XARxM6EQi4uSP8wadH6QdjNdX+d3e+KEGxmgaSUXymW7hLWGCh7qGdN 6/nMCQj1eAZKZw1Q9m4evJBeDu9YucEguLD/x8lmg83hf8Q38zvSSnmYKgpASOB6j3IT pw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by mx0b-00069f02.pphosted.com with ESMTP id 3ecv6f15u1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 25 Feb 2022 05:21:43 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.1.2/8.16.1.2) with SMTP id 21P5B0PB017724; Fri, 25 Feb 2022 05:21:42 GMT Received: from nam10-bn7-obe.outbound.protection.outlook.com (mail-bn7nam10lp2108.outbound.protection.outlook.com [104.47.70.108]) by userp3020.oracle.com with ESMTP id 3eat0rmw9p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 25 Feb 2022 05:21:42 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=QSsedgN6Ieg3jOqVOo3tgUKBn+/txSOsdumKVB6Wm2T4ydlywsdxLCfSX5A0de+9fhxIm7V41BDjvx88Q6uclxavKVcOve2RKZW/hPDd2cTkqPz/svst/AtrxMP3GW2o4ndzJEa80oOTGCBnDJ+FwFhZUuIa1R9/wOz2gnQHqIOk8RiNaQvUKpsJo9o75X1qLKMgllHI1GFyZ9lPY0UU4qgIGuYKBZgc/67oAtSrBemcWRw98hBF7wdM7yREGYQ65iP4ECKnLTHXy18e8/eXlJNOKJdFzyiLdtZIiyqyQQjfJH5uNJAGcuKuQDQ8AxNlP33iMz5dQbp02R/Mf1/wOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ls7MR3VZ94Z64C7mVXThwgv0Ppqc/K95D6mWw4jrrbM=; 
b=Hh944YDRtMUL5znCQyetctq8is27NtTtTuWb5DFqwErAXncrLp3/BN/vWk0zRfKcqmqucMbbKjujGBWPs2o4rZ1EWuPLUCRziTGalFVNKBZmaApkIWbvBisXMLED2c9eTW255c5GnU3ksGKA8yx6i8L6FLlz9g1MpZ5QKTO01goJs94ZNAg/HKn6YVROpwmoulcKkRT71AwKE3NibC7Y27lzGK3PiYerQy5pzLlSygdDqyCbwcvAkREOTtvsL9QArAQqdKoNUwjZqxiTFbMyqQLKec1xbjMBJVvi+tnqfNlLeWStU6mJdV5u3FuNSu2sS9I32E4hXCOvknGlWSOCLg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=oracle.com; dmarc=pass action=none header.from=oracle.com; dkim=pass header.d=oracle.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.onmicrosoft.com; s=selector2-oracle-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ls7MR3VZ94Z64C7mVXThwgv0Ppqc/K95D6mWw4jrrbM=; b=rbvOZVPQWk2WLBHApOQKVJZhloufl7oNVPVmPeVg1Bm2n09uWkG7yY3ax74BRUJZxqFEJO6E+tYv4cid/6eYdfAUblfZ91gwA/tqCVFCJ1JH2GDe793FFTPoJiBJDHLzrXrTIB8zxq24L8i0BPKqfzbaOwOhfBczAPMz4WXk/Iw= Received: from CO1PR10MB4468.namprd10.prod.outlook.com (2603:10b6:303:6c::24) by MWHPR10MB1677.namprd10.prod.outlook.com (2603:10b6:301:a::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4995.24; Fri, 25 Feb 2022 05:21:39 +0000 Received: from CO1PR10MB4468.namprd10.prod.outlook.com ([fe80::b5ab:1c3e:6540:d2fa]) by CO1PR10MB4468.namprd10.prod.outlook.com ([fe80::b5ab:1c3e:6540:d2fa%9]) with mapi id 15.20.5017.025; Fri, 25 Feb 2022 05:21:39 +0000 From: Imran Khan To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk Cc: linux-kernel@vger.kernel.org Subject: [PATCH v7 7/8] kernfs: Replace per-fs rwsem with hashed rwsems. 
Date: Fri, 25 Feb 2022 16:21:15 +1100 Message-Id: <20220225052116.1243150-8-imran.f.khan@oracle.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com> References: <20220225052116.1243150-1-imran.f.khan@oracle.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SYAPR01CA0043.ausprd01.prod.outlook.com (2603:10c6:1:1::31) To CO1PR10MB4468.namprd10.prod.outlook.com (2603:10b6:303:6c::24) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id: d81f7361-7e38-47be-24d8-08d9f81eb390 X-MS-TrafficTypeDiagnostic: MWHPR10MB1677:EE_ X-Microsoft-Antispam-PRVS: X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 80jmJh1LWOLej/A97HsQ/P33uMIbiPBuUu8Tk/ZSSWkirCEni1EW6tymmHypUTWglE9f/2UmrLeuuTR2k0YaDi8JTZu25gdn/dpucyvtey6A6p5fJO+kkQDTbXIDv5mVWVFt2I2IRfHjiNz6xozgF1clRwn4MTnRmR9YfyAk1E3FRAyAhYkdU8olms8rItuWiyY5cBVIP6udEqSHo8NwcwPeJojFwwdigNDJeTIntblWZNtHD2mI/h/yQ8Epd3enTuvFFYORBGxRJzOmpbCa3SBW/TxWTSXlQERcWuEsmeWOu4PUHOwWx0cChjNnrmWtCIHTXz6TbD5RRzLLogSn5cA/6pp+lpdm7fBfnNvPwkbcsJtv/rLpLHRPYySzxXg3XAf1KUUiwh9PkzvcSGs+IAw+lyi9FHmPOAG5JI/b1680foH0sBoQutU/yO/Qv0gJvJoBdwBKRQZu6OjMn1UML/zol+2TqTMzadqAsUKA5Te8RMcMVFjDYTEPOh7Ka9F2dNvJDygbuQtNFRb+0N80QkLMx2golmXfxRUqUfuPXw2YKBsF3O0e0qK2D3ETAxuHnJ5MPHr0nRkMBy0AI23CkFnuRhIxZ3+NfOQha5kokPTphibErFz3Yl1AOzUTt4AVNVwBs74EeS2oYM9zegFhPeQhQ6mmxVyAnsy0mYKWa95/Rog2qSw7+IhwjZ5YgOmNiyvkEzDFq6zzCb4tVabNQg== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:CO1PR10MB4468.namprd10.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230001)(366004)(186003)(26005)(38350700002)(6666004)(8936002)(38100700002)(83380400001)(1076003)(30864003)(6512007)(508600001)(86362001)(6486002)(6506007)(2906002)(52116002)(316002)(66946007)(36756003)(4326008)(5660300002)(8676002)(66476007)(103116003)(66556008)(2616005);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 
X-MS-Exchange-AntiSpam-MessageData-0: =?us-ascii?Q?fH9W0lViS0KUkd2ldFVqEg5BI8oEGnUpiatMmth9OAEaMxc7cYO82Vl9O+Zd?= =?us-ascii?Q?hnrh+AnzOkYhsKUhl97/yD1xwR5iI6ORuEMnixessRyRAbnvyxY5xUMF8Y5v?= =?us-ascii?Q?XkI70Y5Y6pWHMt8g3xny4bW9RtxT1IgSE7d5jkFdDRAnQM0j6V7rVpuwpGXn?= =?us-ascii?Q?SOyjqVTCuZxaZwgMLg6/bu2D9NY4REyqZWhFFWZjkY3AzvzILtuCFLRKKgXH?= =?us-ascii?Q?Y+0nHgCBNqWW/di67TRchPL5sNldLTC/luWmzkLOgQNg7pcokW18Zs5dE9S5?= =?us-ascii?Q?N1nBgPVWAj3EDPbEnqOiioQru/9Cm2jk3RL8ws3VKAqb6PnF/TbcBFu8jFEb?= =?us-ascii?Q?sD5UvYq+/1gzi5Hy+zJKVoF0rYRoBxi4OpGsU8hPcOWy1hAz7dLaHR/wF/HS?= =?us-ascii?Q?QJmOadT3SqNPbiP2GJN/Y2l5QnYagGURQIuRNO6FTj1pQ0+F2vZF9tkFAp8w?= =?us-ascii?Q?JKyXKwYQFZR4Ba/AnznkCCopF/H/K231YIoc2/tXn/X7ji7tDvORVdHNWDJG?= =?us-ascii?Q?OrhWeEliig2W5xFuFofcf5irxfzrDe17z02erd4TmZTaL+L2ZJ9R9R+L6LQJ?= =?us-ascii?Q?q8BPYRQAJBIuj6O1tnf/OGXQwGL0Our6fd+a9dw4ZDFwdf/FumaxYstVpA/r?= =?us-ascii?Q?87uuuK/iZAYCPeMlfQH+lqjksvAa3wSav8uEpynCV45QrRHQUURSSIa7LEWJ?= =?us-ascii?Q?PwQALtslxTiDxT7PuYS9o2qLbntwUxTy5EHnXYr94ZoNS2ObzFS4FpK29tVu?= =?us-ascii?Q?8fpj/RJt5BnfGOK0puYx0h/lsaxVue6V1VQpVIDui1RSHsiXEo3Z4yDFdOUr?= =?us-ascii?Q?sGHVWE1fIgywsyEgxoOalU2ACacWfizdccYQ2K03YG/EFVbLYGtEOFRdqmw/?= =?us-ascii?Q?FoOsVlLsCz/nWWBbkINUUXgjsO2aXdN2HS4ktDUWBiIWGmGN3LVwzvbP0It0?= =?us-ascii?Q?w9lJf7CncDCeBZ60TGyZJPz56mWEgsbm6ctie0z4IfDQrXSNY1IEEpWMUobY?= =?us-ascii?Q?fJ0FYSRuUUaA5vUnTTWo9WyMhsuvplmllmurLrN+je+9MInDIlX3vq4HMSo1?= =?us-ascii?Q?QOQO0Vc30as7I7j4AEBTnWWAdnnzSy6TxnMjD3G0oTQdIH8X8E4kOVVD/Oz/?= =?us-ascii?Q?ECLHF+mUrHNOxcw9E2X8NR4w9cF2aam/JC7LeA+6JKodbxW0cIZd0aW0zIh8?= =?us-ascii?Q?FiWo6mZbJ5cMqBOdVjmLz2EfYQXe1RIcOh0mzHWFueHQJB166fzRle5pwVIG?= =?us-ascii?Q?ZA505rI9YHtZLkdYGwZ0f3JH5qZLjixscGIb4l3pVNW33O2FePtu7YXPGPT2?= =?us-ascii?Q?W2LaSOka4T9GF33psP114lA94f7tddVmQdvSRRcNY7Ced1Wat/oOa5aye58S?= =?us-ascii?Q?bvtgqJAVYXRWgrbgVHztwfin8SQsGx5e/lTb0BsKq03Byql539Tgu7+5QEGD?= =?us-ascii?Q?4879YMVl5F7S9Q3OhLLG8cv1hq1KbKMEVfNIVe6qS8H/rCswvHSyYaklz3ZR?= 
=?us-ascii?Q?A78ODnRjSxhbeG6p6+m8Vsp4cKs9CLjkHaZwU9DFWIFNZGEYWPyCgEFxOmBx?= =?us-ascii?Q?hgcEXWIKz/BQ/ukDOPYSRqqsGqhT3mudd4bnxYG/0cbMSqFjiG0+GPje0909?= =?us-ascii?Q?mQJ+qgmVBkDRvSLnOhFevZk=3D?= X-OriginatorOrg: oracle.com X-MS-Exchange-CrossTenant-Network-Message-Id: d81f7361-7e38-47be-24d8-08d9f81eb390 X-MS-Exchange-CrossTenant-AuthSource: CO1PR10MB4468.namprd10.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 25 Feb 2022 05:21:39.6316 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 4e2c6054-71cb-48f1-bd6c-3a9705aca71b X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: L4Cc+Bpx6b67GKW/mBpAinRt3YsulVeMneLk4EogfX4lPwkbCdszsY14YOjP0aP+hLwHw7BN0QkP0UPtGBt4zw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: MWHPR10MB1677 X-Proofpoint-Virus-Version: vendor=nai engine=6300 definitions=10268 signatures=684655 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 adultscore=0 mlxlogscore=999 suspectscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2201110000 definitions=main-2202250027 X-Proofpoint-GUID: n62w9Kjk8EzLZeY74hUO-1mwFy__hFrP X-Proofpoint-ORIG-GUID: n62w9Kjk8EzLZeY74hUO-1mwFy__hFrP Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Having a single rwsem to synchronize all operations across a kernfs based file system (cgroup, sysfs etc.) does not scale well. The contention around this single rwsem becomes more apparent in large scale systems with few hundred CPUs where most of the CPUs have running tasks that are opening, accessing or closing sysfs files at any point of time. Using hashed rwsems in place of a per-fs rwsem, can significantly reduce contention around per-fs rwsem and hence provide better scalability. 
Moreover, as these hashed rwsems are not part of kernfs_node objects, we will not see any significant change in the memory utilization of kernfs based file systems like sysfs, cgroupfs etc.

Modify the interface introduced in the previous patch to make use of hashed rwsems. As in the earlier changes, use the kernfs_node address as the hashing key. Since we are getting rid of the per-fs lock, in certain cases we may need to acquire the locks corresponding to multiple nodes. In such cases of nested locking, the locks are taken in order of their addresses. Introduce helpers to acquire the rwsems corresponding to multiple nodes for such cases.

For operations that involve finding a node first and then operating on it (for example operations involving find_and_get_ns), acquiring the rwsem of the node being searched for is not possible, because its address is not known yet. Such operations need to make sure that a concurrent remove does not remove the found node. Introduce a per-fs mutex that can be used to synchronize these operations against parallel removal of the involved node.

Replacing the global mutex and spinlocks with hashed ones (as mentioned in previous changes) and the global rwsem with hashed rwsems (as done in this change) reduces contention around kernfs and results in better performance numbers.
As an illustration of these numbers, on a system with 384 cores, if I run 200 instances of an application which is mostly executing the following loop:

    for (int loop =3D 0; loop < 100; loop++) {
        for (int port_num =3D 1; port_num < 2; port_num++) {
            for (int gid_index =3D 0; gid_index < 254; gid_index++) {
                char ret_buf[64], ret_buf_lo[64];
                char gid_file_path[1024];
                int ret_len;
                int ret_fd;
                ssize_t ret_rd;
                ub4 i, saved_errno;

                memset(ret_buf, 0, sizeof(ret_buf));
                memset(gid_file_path, 0, sizeof(gid_file_path));

                ret_len =3D snprintf(gid_file_path, sizeof(gid_file_path),
                                   "/sys/class/infiniband/%s/ports/%d/gids/%d",
                                   dev_name, port_num, gid_index);

                ret_fd =3D open(gid_file_path, O_RDONLY | O_CLOEXEC);
                if (ret_fd < 0) {
                    printf("Failed to open %s\n", gid_file_path);
                    continue;
                }

                /* Read the GID */
                ret_rd =3D read(ret_fd, ret_buf, 40);
                if (ret_rd =3D=3D -1) {
                    /* Save errno before any other call can overwrite it */
                    saved_errno =3D errno;
                    printf("Failed to read from file %s, errno: %u\n",
                           gid_file_path, saved_errno);
                    /* Do not leak the descriptor on the error path */
                    close(ret_fd);
                    continue;
                }

                close(ret_fd);
            }
        }
    }

I can see contention around the above mentioned locks as follows:

- 54.07% 53.60% showgids [kernel.kallsyms] [k] osq_lock
    - 53.60% __libc_start_main
    - 32.29% __GI___libc_open
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    sys_open
    do_sys_open
    do_filp_open
    path_openat
    vfs_open
    do_dentry_open
    kernfs_fop_open
    mutex_lock
    - __mutex_lock_slowpath
    - 32.23% __mutex_lock.isra.5
    osq_lock
    - 21.31% __GI___libc_close
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    exit_to_usermode_loop
    task_work_run
    ____fput
    __fput
    kernfs_fop_release
    kernfs_put_open_node.isra.8
    mutex_lock
    - __mutex_lock_slowpath
    - 21.28% __mutex_lock.isra.5
    osq_lock
- 10.49% 10.39% showgids [kernel.kallsyms] [k] down_read
    10.39% __libc_start_main
    __GI___libc_open
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    sys_open
    do_sys_open
    do_filp_open
    - path_openat
    - 9.72% link_path_walk
    - 5.21% inode_permission
    - __inode_permission
    - 5.21% kernfs_iop_permission
    down_read
    - 4.08% walk_component
    lookup_fast
    - d_revalidate.part.24
    - 4.08% kernfs_dop_revalidate
- 7.48% 7.41% showgids [kernel.kallsyms] [k]
up_read
    7.41% __libc_start_main
    __GI___libc_open
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    sys_open
    do_sys_open
    do_filp_open
    - path_openat
    - 7.01% link_path_walk
    - 4.12% inode_permission
    - __inode_permission
    - 4.12% kernfs_iop_permission
    up_read
    - 2.61% walk_component
    lookup_fast
    - d_revalidate.part.24
    - 2.61% kernfs_dop_revalidate

Moreover, this run of 200 application instances takes 32-34 secs to complete.

With the patched kernel and on the same test setup, we no longer see contention around osq_lock (i.e. kernfs_open_file_mutex), and contention around the per-fs kernfs_rwsem has reduced significantly as well. This can be seen in the following perf snippet:

- 1.66% 1.65% showgids [kernel.kallsyms] [k] down_read
    1.65% __libc_start_main
    __GI___libc_open
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    sys_open
    do_sys_open
    do_filp_open
    - path_openat
    - 1.62% link_path_walk
    - 0.98% inode_permission
    - __inode_permission
    + 0.98% kernfs_iop_permission
    - 0.52% walk_component
    lookup_fast
    - d_revalidate.part.24
    - 0.52% kernfs_dop_revalidate
- 1.12% 1.11% showgids [kernel.kallsyms] [k] up_read
    1.11% __libc_start_main
    __GI___libc_open
    entry_SYSCALL_64_after_hwframe
    do_syscall_64
    sys_open
    do_sys_open
    do_filp_open
    - path_openat
    - 1.11% link_path_walk
    - 0.69% inode_permission
    - __inode_permission
    - 0.69% kernfs_iop_permission
    up_read

Moreover, the test execution time has reduced from 32-34 secs to 18-19 secs.
Signed-off-by: Imran Khan --- fs/kernfs/Makefile | 2 +- fs/kernfs/dir.c | 133 ++++++++++++++---- fs/kernfs/inode.c | 20 +++ fs/kernfs/kernfs-internal.c | 259 ++++++++++++++++++++++++++++++++++++ fs/kernfs/kernfs-internal.h | 44 +++++- fs/kernfs/mount.c | 1 + fs/kernfs/symlink.c | 13 +- include/linux/kernfs.h | 3 +- 8 files changed, 443 insertions(+), 32 deletions(-) create mode 100644 fs/kernfs/kernfs-internal.c diff --git a/fs/kernfs/Makefile b/fs/kernfs/Makefile index 4ca54ff54c986..778da6b118e9b 100644 --- a/fs/kernfs/Makefile +++ b/fs/kernfs/Makefile @@ -3,4 +3,4 @@ # Makefile for the kernfs pseudo filesystem # =20 -obj-y :=3D mount.o inode.o dir.o file.o symlink.o +obj-y :=3D mount.o inode.o dir.o file.o symlink.o kernfs-internal.o diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 8f22b2735755f..169f58e487900 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -25,7 +25,6 @@ static DEFINE_SPINLOCK(kernfs_idr_lock); /* root->ino_idr= */ =20 static bool kernfs_active(struct kernfs_node *kn) { - kernfs_rwsem_assert_held(kn); return atomic_read(&kn->active) >=3D 0; } =20 @@ -450,26 +449,24 @@ void kernfs_put_active(struct kernfs_node *kn) /** * kernfs_drain - drain kernfs_node * @kn: kernfs_node to drain + * @anc: ancestor of kernfs_node to drain * * Drain existing usages and nuke all existing mmaps of @kn. Mutiple * removers may invoke this function concurrently on @kn and all will * return after draining is complete. */ -static void kernfs_drain(struct kernfs_node *kn) - __releases(&kernfs_root(kn)->kernfs_rwsem) - __acquires(&kernfs_root(kn)->kernfs_rwsem) +static void kernfs_drain(struct kernfs_node *kn, struct kernfs_node *anc) + __releases(kernfs_rwsem_ptr(anc)) + __acquires(kernfs_rwsem_ptr(anc)) { struct kernfs_root *root =3D kernfs_root(kn); =20 - /** - * kn has the same root as its ancestor, so it can be used to get - * per-fs rwsem. 
- */ - struct rw_semaphore *rwsem =3D kernfs_rwsem_ptr(kn); + struct rw_semaphore *rwsem; =20 - kernfs_rwsem_assert_held_write(kn); + kernfs_rwsem_assert_held_write(anc); WARN_ON_ONCE(kernfs_active(kn)); =20 + rwsem =3D kernfs_rwsem_ptr(anc); kernfs_up_write(rwsem); =20 if (kernfs_lockdep(kn)) { @@ -489,7 +486,7 @@ static void kernfs_drain(struct kernfs_node *kn) =20 kernfs_drain_open_files(kn); =20 - kernfs_down_write(kn); + kernfs_down_write(anc); } =20 /** @@ -729,6 +726,11 @@ int kernfs_add_one(struct kernfs_node *kn) bool has_ns; int ret; =20 + /** + * The node being added is not active at this point of time and may + * be activated later depending on CREATE_DEACTIVATED flag. So at + * this point of time just locking the parent is enough. + */ rwsem =3D kernfs_down_write(parent); =20 ret =3D -EINVAL; @@ -867,11 +869,20 @@ struct kernfs_node *kernfs_find_and_get_ns(struct ker= nfs_node *parent, { struct kernfs_node *kn; struct rw_semaphore *rwsem; + struct kernfs_root *root =3D kernfs_root(parent); =20 + /** + * We don't have address of kernfs_node (that is being searched) + * yet. Acquiring root->kernfs_rm_mutex and releasing it after + * pinning the found kernfs_node, ensures that found kernfs_node + * will not disappear due to a parallel remove operation. + */ + mutex_lock(&root->kernfs_rm_mutex); rwsem =3D kernfs_down_read(parent); kn =3D kernfs_find_ns(parent, name, ns); kernfs_get(kn); kernfs_up_read(rwsem); + mutex_unlock(&root->kernfs_rm_mutex); =20 return kn; } @@ -892,11 +903,20 @@ struct kernfs_node *kernfs_walk_and_get_ns(struct ker= nfs_node *parent, { struct kernfs_node *kn; struct rw_semaphore *rwsem; + struct kernfs_root *root =3D kernfs_root(parent); =20 + /** + * We don't have address of kernfs_node (that is being searched) + * yet. Acquiring root->kernfs_rm_mutex and releasing it after + * pinning the found kernfs_node, ensures that found kernfs_node + * will not disappear due to a parallel remove operation. 
+ */ + mutex_lock(&root->kernfs_rm_mutex); rwsem =3D kernfs_down_read(parent); kn =3D kernfs_walk_ns(parent, path, ns); kernfs_get(kn); kernfs_up_read(rwsem); + mutex_unlock(&root->kernfs_rm_mutex); =20 return kn; } @@ -921,9 +941,9 @@ struct kernfs_root *kernfs_create_root(struct kernfs_sy= scall_ops *scops, return ERR_PTR(-ENOMEM); =20 idr_init(&root->ino_idr); - init_rwsem(&root->kernfs_rwsem); INIT_LIST_HEAD(&root->supers); init_rwsem(&root->supers_rwsem); + mutex_init(&root->kernfs_rm_mutex); =20 /* * On 64bit ino setups, id is ino. On 32bit, low 32bits are ino. @@ -1084,6 +1104,11 @@ static int kernfs_dop_revalidate(struct dentry *dent= ry, unsigned int flags) } =20 kn =3D kernfs_dentry_node(dentry); + /** + * For dentry revalidation just acquiring kernfs_node's rwsem for + * reading should be enough. If a competing rename or remove wins + * one of the checks below will fail. + */ rwsem =3D kernfs_down_read(kn); =20 /* The kernfs node has been deactivated */ @@ -1123,24 +1148,35 @@ static struct dentry *kernfs_iop_lookup(struct inod= e *dir, struct inode *inode =3D NULL; const void *ns =3D NULL; struct rw_semaphore *rwsem; + struct kernfs_root *root =3D kernfs_root(parent); =20 + /** + * We don't have address of kernfs_node (that is being searched) + * yet. So take root->kernfs_rm_mutex to avoid parallel removal of + * found kernfs_node. + */ + mutex_lock(&root->kernfs_rm_mutex); rwsem =3D kernfs_down_read(parent); if (kernfs_ns_enabled(parent)) ns =3D kernfs_info(dir->i_sb)->ns; =20 kn =3D kernfs_find_ns(parent, dentry->d_name.name, ns); + kernfs_up_read(rwsem); /* attach dentry and inode */ if (kn) { /* Inactive nodes are invisible to the VFS so don't * create a negative. 
*/ + rwsem =3D kernfs_down_read(kn); if (!kernfs_active(kn)) { kernfs_up_read(rwsem); + mutex_unlock(&root->kernfs_rm_mutex); return NULL; } inode =3D kernfs_get_inode(dir->i_sb, kn); if (!inode) inode =3D ERR_PTR(-ENOMEM); + kernfs_up_read(rwsem); } /* * Needed for negative dentry validation. @@ -1148,9 +1184,11 @@ static struct dentry *kernfs_iop_lookup(struct inode= *dir, * or transforms from positive dentry in dentry_unlink_inode() * called from vfs_rmdir(). */ + rwsem =3D kernfs_down_read(parent); if (!IS_ERR(inode)) kernfs_set_rev(parent, dentry); kernfs_up_read(rwsem); + mutex_unlock(&root->kernfs_rm_mutex); =20 /* instantiate and hash (possibly negative) dentry */ return d_splice_alias(inode, dentry); @@ -1330,27 +1368,40 @@ void kernfs_activate(struct kernfs_node *kn) static void __kernfs_remove(struct kernfs_node *kn) { struct kernfs_node *pos; + struct rw_semaphore *rwsem; + struct kernfs_root *root; =20 - kernfs_rwsem_assert_held_write(kn); + if (!kn) + return; + + root =3D kernfs_root(kn); =20 /* * Short-circuit if non-root @kn has already finished removal. * This is for kernfs_remove_self() which plays with active ref * after removal. */ - if (!kn || (kn->parent && RB_EMPTY_NODE(&kn->rb))) + mutex_lock(&root->kernfs_rm_mutex); + rwsem =3D kernfs_down_write(kn); + if (kn->parent && RB_EMPTY_NODE(&kn->rb)) { + kernfs_up_write(rwsem); + mutex_unlock(&root->kernfs_rm_mutex); return; + } =20 pr_debug("kernfs %s: removing\n", kn->name); =20 /* prevent any new usage under @kn by deactivating all nodes */ pos =3D NULL; + while ((pos =3D kernfs_next_descendant_post(pos, kn))) if (kernfs_active(pos)) atomic_add(KN_DEACTIVATED_BIAS, &pos->active); + kernfs_up_write(rwsem); =20 /* deactivate and unlink the subtree node-by-node */ do { + rwsem =3D kernfs_down_write(kn); pos =3D kernfs_leftmost_descendant(kn); =20 /* @@ -1368,10 +1419,25 @@ static void __kernfs_remove(struct kernfs_node *kn) * error paths without worrying about draining. 
*/ if (kn->flags & KERNFS_ACTIVATED) - kernfs_drain(pos); + kernfs_drain(pos, kn); else WARN_ON_ONCE(atomic_read(&kn->active) !=3D KN_DEACTIVATED_BIAS); =20 + kernfs_up_write(rwsem); + + /** + * By now node and all of its descendants have been deactivated + * Once a descendant has been drained, acquire its parent's lock + * and unlink it from parent's children rb tree. + * We drop kn's lock before acquiring pos->parent's lock to avoid + * deadlock that will happen if pos->parent and kn hash to same lock. + * Dropping kn's lock should be safe because it is in deactived state. + * Further root->kernfs_rm_mutex ensures that we will not have + * concurrent instances of __kernfs_remove + */ + if (pos->parent) + rwsem =3D kernfs_down_write(pos->parent); + /* * kernfs_unlink_sibling() succeeds once per node. Use it * to decide who's responsible for cleanups. @@ -1389,8 +1455,12 @@ static void __kernfs_remove(struct kernfs_node *kn) kernfs_put(pos); } =20 + if (pos->parent) + kernfs_up_write(rwsem); kernfs_put(pos); } while (pos !=3D kn); + + mutex_unlock(&root->kernfs_rm_mutex); } =20 /** @@ -1401,11 +1471,7 @@ static void __kernfs_remove(struct kernfs_node *kn) */ void kernfs_remove(struct kernfs_node *kn) { - struct rw_semaphore *rwsem; - - rwsem =3D kernfs_down_write(kn); __kernfs_remove(kn); - kernfs_up_write(rwsem); } =20 /** @@ -1507,9 +1573,11 @@ bool kernfs_remove_self(struct kernfs_node *kn) */ if (!(kn->flags & KERNFS_SUICIDAL)) { kn->flags |=3D KERNFS_SUICIDAL; + kernfs_up_write(rwsem); __kernfs_remove(kn); kn->flags |=3D KERNFS_SUICIDED; ret =3D true; + rwsem =3D kernfs_down_write(kn); } else { wait_queue_head_t *waitq =3D &kernfs_root(kn)->deactivate_waitq; DEFINE_WAIT(wait); @@ -1563,11 +1631,17 @@ int kernfs_remove_by_name_ns(struct kernfs_node *pa= rent, const char *name, =20 rwsem =3D kernfs_down_write(parent); =20 + /** + * Since the node being searched will be removed eventually, + * we don't need to take root->kernfs_rm_mutex. 
+ * Even if a parallel remove succeeds, the subsequent __kernfs_remove + * will detect it and bail-out early. + */ kn =3D kernfs_find_ns(parent, name, ns); - if (kn) - __kernfs_remove(kn); =20 kernfs_up_write(rwsem); + if (kn) + __kernfs_remove(kn); =20 if (kn) return 0; @@ -1587,14 +1661,24 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct= kernfs_node *new_parent, { struct kernfs_node *old_parent; const char *old_name =3D NULL; - struct rw_semaphore *rwsem; + struct kernfs_rwsem_token token; int error; =20 /* can't move or rename root */ if (!kn->parent) return -EINVAL; =20 - rwsem =3D kernfs_down_write(kn); + old_parent =3D kn->parent; + kernfs_get(old_parent); + kernfs_down_write_triple_nodes(kn, old_parent, new_parent, &token); + while (old_parent !=3D kn->parent) { + kernfs_put(old_parent); + kernfs_up_write_triple_nodes(kn, old_parent, new_parent, &token); + old_parent =3D kn->parent; + kernfs_get(old_parent); + kernfs_down_write_triple_nodes(kn, old_parent, new_parent, &token); + } + kernfs_put(old_parent); =20 error =3D -ENOENT; if (!kernfs_active(kn) || !kernfs_active(new_parent) || @@ -1629,7 +1713,6 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct k= ernfs_node *new_parent, /* rename_lock protects ->parent and ->name accessors */ spin_lock_irq(&kernfs_rename_lock); =20 - old_parent =3D kn->parent; kn->parent =3D new_parent; =20 kn->ns =3D new_ns; @@ -1648,7 +1731,7 @@ int kernfs_rename_ns(struct kernfs_node *kn, struct k= ernfs_node *new_parent, =20 error =3D 0; out: - kernfs_up_write(rwsem); + kernfs_up_write_triple_nodes(kn, new_parent, old_parent, &token); return error; } =20 diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c index efe5ae98abf46..36a40b08b97fa 100644 --- a/fs/kernfs/inode.c +++ b/fs/kernfs/inode.c @@ -101,6 +101,12 @@ int kernfs_setattr(struct kernfs_node *kn, const struc= t iattr *iattr) int ret; struct rw_semaphore *rwsem; =20 + /** + * Since we are only modifying the inode attribute, we just need + * to lock involved 
node. Operations that add or remove a node + * acquire parent's lock before changing the inode attributes, so + * such operations are also in sync with this interface. + */ rwsem =3D kernfs_down_write(kn); ret =3D __kernfs_setattr(kn, iattr); kernfs_up_write(rwsem); @@ -118,6 +124,12 @@ int kernfs_iop_setattr(struct user_namespace *mnt_user= ns, struct dentry *dentry, if (!kn) return -EINVAL; =20 + /** + * Since we are only modifying the inode attribute, we just need + * to lock involved node. Operations that add or remove a node + * acquire parent's lock before changing the inode attributes, so + * such operations are also in sync with .setattr backend. + */ rwsem =3D kernfs_down_write(kn); error =3D setattr_prepare(&init_user_ns, dentry, iattr); if (error) @@ -188,6 +200,10 @@ int kernfs_iop_getattr(struct user_namespace *mnt_user= ns, struct kernfs_node *kn =3D inode->i_private; struct rw_semaphore *rwsem; =20 + /** + * As we are only reading ->iattr, acquiring kn's rwsem for + * reading is enough. + */ rwsem =3D kernfs_down_read(kn); spin_lock(&inode->i_lock); kernfs_refresh_inode(kn, inode); @@ -285,6 +301,10 @@ int kernfs_iop_permission(struct user_namespace *mnt_u= serns, =20 kn =3D inode->i_private; =20 + /** + * As we are only reading ->iattr, acquiring kn's rwsem for + * reading is enough. + */ rwsem =3D kernfs_down_read(kn); spin_lock(&inode->i_lock); kernfs_refresh_inode(kn, inode); diff --git a/fs/kernfs/kernfs-internal.c b/fs/kernfs/kernfs-internal.c new file mode 100644 index 0000000000000..80d7d64532fe3 --- /dev/null +++ b/fs/kernfs/kernfs-internal.c @@ -0,0 +1,259 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * This file provides inernal helpers for kernfs. 
+ */ + +#include "kernfs-internal.h" + +static void kernfs_swap_rwsems(struct rw_semaphore **array, int i, int j) +{ + struct rw_semaphore *tmp; + + tmp =3D array[i]; + array[i] =3D array[j]; + array[j] =3D tmp; +} + +static void kernfs_sort_rwsems(struct kernfs_rwsem_token *token) +{ + struct rw_semaphore **array =3D &token->rwsems[0]; + + if (token->count =3D=3D 2) { + if (array[0] =3D=3D array[1]) + token->count =3D 1; + else if (array[0] > array[1]) + kernfs_swap_rwsems(array, 0, 1); + } else { + if (array[0] =3D=3D array[1] && array[0] =3D=3D array[2]) + token->count =3D 1; + else { + if (array[0] > array[1]) + kernfs_swap_rwsems(array, 0, 1); + + if (array[0] > array[2]) + kernfs_swap_rwsems(array, 0, 2); + + if (array[1] > array[2]) + kernfs_swap_rwsems(array, 1, 2); + + if (array[0] =3D=3D array[1] || array[1] =3D=3D array[2]) + token->count =3D 2; + } + } +} + +/** + * kernfs_down_write_double_nodes() - take hashed rwsem for 2 nodes + * + * @kn1: first kernfs_node of interest + * @kn2: second kernfs_node of interest + * @token: token to pass unlocking information to caller + * + * Acquire hashed rwsem for 2 nodes. Some operation may need to acquire + * hashed rwsems for 2 nodes (for example for a node and its parent). + * This function can be used in such cases. 
+ *
+ * Return: void
+ */
+void kernfs_down_write_double_nodes(struct kernfs_node *kn1,
+				    struct kernfs_node *kn2,
+				    struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	array[0] = kernfs_rwsem_ptr(kn1);
+	array[1] = kernfs_rwsem_ptr(kn2);
+	token->count = 2;
+
+	kernfs_sort_rwsems(token);
+
+	if (token->count == 1) {
+		/* Both nodes hash to same rwsem */
+		down_write_nested(array[0], 0);
+	} else {
+		/* Both nodes hash to different rwsems */
+		down_write_nested(array[0], 0);
+		down_write_nested(array[1], 1);
+	}
+}
+
+/**
+ * kernfs_up_write_double_nodes - release hashed rwsem for 2 nodes
+ *
+ * @kn1: first kernfs_node of interest
+ * @kn2: second kernfs_node of interest
+ * @token: token to indicate unlocking information
+ *		->rwsems is a sorted list of rwsem addresses
+ *		->count contains number of unique locks
+ *
+ * Release hashed rwsems for 2 nodes
+ *
+ * Return: void
+ */
+void kernfs_up_write_double_nodes(struct kernfs_node *kn1,
+				  struct kernfs_node *kn2,
+				  struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	if (token->count == 1) {
+		/* Both nodes hash to same rwsem */
+		up_write(array[0]);
+	} else {
+		/* Both nodes hash to different rwsems */
+		up_write(array[0]);
+		up_write(array[1]);
+	}
+}
+
+/**
+ * kernfs_down_read_double_nodes() - take hashed rwsem for 2 nodes
+ *
+ * @kn1: first kernfs_node of interest
+ * @kn2: second kernfs_node of interest
+ * @token: token to pass unlocking information to caller
+ *
+ * Acquire hashed rwsem for 2 nodes. Some operations may need to acquire
+ * hashed rwsems for 2 nodes (for example for a node and its parent).
+ * This function can be used in such cases.
+ *
+ * Return: void
+ */
+void kernfs_down_read_double_nodes(struct kernfs_node *kn1,
+				   struct kernfs_node *kn2,
+				   struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	array[0] = kernfs_rwsem_ptr(kn1);
+	array[1] = kernfs_rwsem_ptr(kn2);
+	token->count = 2;
+
+	kernfs_sort_rwsems(token);
+
+	if (token->count == 1) {
+		/* Both nodes hash to same rwsem */
+		down_read_nested(array[0], 0);
+	} else {
+		/* Both nodes hash to different rwsems */
+		down_read_nested(array[0], 0);
+		down_read_nested(array[1], 1);
+	}
+}
+
+/**
+ * kernfs_up_read_double_nodes - release hashed rwsem for 2 nodes
+ *
+ * @kn1: first kernfs_node of interest
+ * @kn2: second kernfs_node of interest
+ * @token: token to indicate unlocking information
+ *		->rwsems is a sorted list of rwsem addresses
+ *		->count contains number of unique locks
+ *
+ * Release hashed rwsems for 2 nodes
+ *
+ * Return: void
+ */
+void kernfs_up_read_double_nodes(struct kernfs_node *kn1,
+				 struct kernfs_node *kn2,
+				 struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	if (token->count == 1) {
+		/* Both nodes hash to same rwsem */
+		up_read(array[0]);
+	} else {
+		/* Both nodes hash to different rwsems */
+		up_read(array[0]);
+		up_read(array[1]);
+	}
+}
+
+/**
+ * kernfs_down_write_triple_nodes() - take hashed rwsem for 3 nodes
+ *
+ * @kn1: first kernfs_node of interest
+ * @kn2: second kernfs_node of interest
+ * @kn3: third kernfs_node of interest
+ * @token: token to pass unlocking information to caller
+ *
+ * Acquire hashed rwsem for 3 nodes. Some operations may need to acquire
+ * hashed rwsems for 3 nodes (for example rename operation needs to
+ * acquire rwsem corresponding to node, its current parent and its future
+ * parent). This function can be used in such cases.
+ *
+ * Return: void
+ */
+void kernfs_down_write_triple_nodes(struct kernfs_node *kn1,
+				    struct kernfs_node *kn2,
+				    struct kernfs_node *kn3,
+				    struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	array[0] = kernfs_rwsem_ptr(kn1);
+	array[1] = kernfs_rwsem_ptr(kn2);
+	array[2] = kernfs_rwsem_ptr(kn3);
+	token->count = 3;
+
+	kernfs_sort_rwsems(token);
+
+	if (token->count == 1) {
+		/* All 3 nodes hash to same rwsem */
+		down_write_nested(array[0], 0);
+	} else if (token->count == 2) {
+		/*
+		 * Two nodes hash to same rwsem, and these
+		 * will occupy consecutive places in array after
+		 * sorting.
+		 */
+		down_write_nested(array[0], 0);
+		down_write_nested(array[2], 1);
+	} else {
+		/* All 3 nodes hash to different rwsems */
+		down_write_nested(array[0], 0);
+		down_write_nested(array[1], 1);
+		down_write_nested(array[2], 2);
+	}
+}
+
+/**
+ * kernfs_up_write_triple_nodes - release hashed rwsem for 3 nodes
+ *
+ * @kn1: first kernfs_node of interest
+ * @kn2: second kernfs_node of interest
+ * @kn3: third kernfs_node of interest
+ * @token: token to indicate unlocking information
+ *		->rwsems is a sorted list of rwsem addresses
+ *		->count contains number of unique locks
+ *
+ * Release hashed rwsems for 3 nodes
+ *
+ * Return: void
+ */
+void kernfs_up_write_triple_nodes(struct kernfs_node *kn1,
+				  struct kernfs_node *kn2,
+				  struct kernfs_node *kn3,
+				  struct kernfs_rwsem_token *token)
+{
+	struct rw_semaphore **array = &token->rwsems[0];
+
+	if (token->count == 1) {
+		/* All 3 nodes hash to same rwsem */
+		up_write(array[0]);
+	} else if (token->count == 2) {
+		/*
+		 * Two nodes hash to same rwsem, and these
+		 * will occupy consecutive places in array after
+		 * sorting.
+		 */
+		up_write(array[0]);
+		up_write(array[2]);
+	} else {
+		/* All 3 nodes hash to different rwsems */
+		up_write(array[0]);
+		up_write(array[1]);
+		up_write(array[2]);
+	}
+}
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index bb934949d5eb5..d14e197f91684 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -19,6 +19,20 @@
 #include
 #include
 
+/*
+ * Token for nested locking interfaces.
+ *
+ * rwsems: array of rwsems to acquire
+ * count: has 2 uses
+ *        As input argument it specifies size of ->rwsems array
+ *        As return argument it specifies number of unique rwsems
+ *        present in ->rwsems array
+ */
+struct kernfs_rwsem_token {
+	struct rw_semaphore *rwsems[3];
+	int count;
+};
+
 struct kernfs_iattrs {
 	kuid_t			ia_uid;
 	kgid_t			ia_gid;
@@ -190,9 +204,9 @@ static inline spinlock_t *kernfs_open_node_spinlock(struct kernfs_node *kn)
 
 static inline struct rw_semaphore *kernfs_rwsem_ptr(struct kernfs_node *kn)
 {
-	struct kernfs_root *root = kernfs_root(kn);
+	int idx = hash_ptr(kn, NR_KERNFS_LOCK_BITS);
 
-	return &root->kernfs_rwsem;
+	return &kernfs_locks->kernfs_rwsem[idx];
 }
 
 static inline void kernfs_rwsem_assert_held(struct kernfs_node *kn)
@@ -266,4 +280,30 @@ static inline void kernfs_up_read(struct rw_semaphore *rwsem)
 	up_read(rwsem);
 }
 
+
+void kernfs_down_write_double_nodes(struct kernfs_node *kn1,
+				    struct kernfs_node *kn2,
+				    struct kernfs_rwsem_token *token);
+
+void kernfs_up_write_double_nodes(struct kernfs_node *kn1,
+				  struct kernfs_node *kn2,
+				  struct kernfs_rwsem_token *token);
+
+void kernfs_down_read_double_nodes(struct kernfs_node *kn1,
+				   struct kernfs_node *kn2,
+				   struct kernfs_rwsem_token *token);
+
+void kernfs_up_read_double_nodes(struct kernfs_node *kn1,
+				 struct kernfs_node *kn2,
+				 struct kernfs_rwsem_token *token);
+
+void kernfs_down_write_triple_nodes(struct kernfs_node *kn1,
+				    struct kernfs_node *kn2,
+				    struct kernfs_node *kn3,
+				    struct kernfs_rwsem_token *token);
+
+void kernfs_up_write_triple_nodes(struct kernfs_node *kn1,
+				  struct kernfs_node *kn2,
+				  struct kernfs_node *kn3,
+				  struct kernfs_rwsem_token *token);
 #endif	/* __KERNFS_INTERNAL_H */
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index f88dc4e26ffb5..f2b3d981b42d8 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -398,6 +398,7 @@ void __init kernfs_lock_init(void)
 	for (count = 0; count < NR_KERNFS_LOCKS; count++) {
 		mutex_init(&kernfs_locks->open_file_mutex[count].lock);
 		spin_lock_init(&kernfs_locks->open_node_locks[count].lock);
+		init_rwsem(&kernfs_locks->kernfs_rwsem[count]);
 	}
 }
 
diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c
index 9d41036025547..cbdd1be5f0a8c 100644
--- a/fs/kernfs/symlink.c
+++ b/fs/kernfs/symlink.c
@@ -113,12 +113,19 @@ static int kernfs_getlink(struct inode *inode, char *path)
 	struct kernfs_node *kn = inode->i_private;
 	struct kernfs_node *parent = kn->parent;
 	struct kernfs_node *target = kn->symlink.target_kn;
-	struct rw_semaphore *rwsem;
+	struct kernfs_rwsem_token token;
 	int error;
 
-	rwsem = kernfs_down_read(parent);
+	/*
+	 * Lock both parent and target, to avoid their movement
+	 * or removal in the middle of path construction.
+	 * If a competing remove or rename for parent or target
+	 * wins, it will be reflected in result returned from
+	 * kernfs_get_target_path.
+	 */
+	kernfs_down_read_double_nodes(target, parent, &token);
 	error = kernfs_get_target_path(parent, target, path);
-	kernfs_up_read(rwsem);
+	kernfs_up_read_double_nodes(target, parent, &token);
 
 	return error;
 }
diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h
index 3f7f39b92c8b0..54208412ca801 100644
--- a/include/linux/kernfs.h
+++ b/include/linux/kernfs.h
@@ -97,6 +97,7 @@ struct kernfs_open_node_lock {
 struct kernfs_global_locks {
 	struct kernfs_open_file_mutex open_file_mutex[NR_KERNFS_LOCKS];
 	struct kernfs_open_node_lock open_node_locks[NR_KERNFS_LOCKS];
+	struct rw_semaphore kernfs_rwsem[NR_KERNFS_LOCKS];
 };
 
 enum kernfs_node_type {
@@ -265,8 +266,8 @@ struct kernfs_root {
 	struct list_head	supers;
 
 	wait_queue_head_t	deactivate_waitq;
-	struct rw_semaphore	kernfs_rwsem;
 	struct rw_semaphore	supers_rwsem;
+	struct mutex		kernfs_rm_mutex;
 };
 
 struct kernfs_open_file {
-- 
2.30.2

From nobody Tue Jun 23 22:31:37 2026
From: Imran Khan
To: tj@kernel.org, gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v7 8/8] kernfs: Add a document to describe hashed locks used in kernfs.
Date: Fri, 25 Feb 2022 16:21:16 +1100
Message-Id: <20220225052116.1243150-9-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20220225052116.1243150-1-imran.f.khan@oracle.com>
References: <20220225052116.1243150-1-imran.f.khan@oracle.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This document describes the usage and proof of the various hashed locks
introduced in this patch set.

Signed-off-by: Imran Khan
---
 .../filesystems/kernfs-hashed-locks.rst       | 245 ++++++++++++++++++
 1 file changed, 245 insertions(+)
 create mode 100644 Documentation/filesystems/kernfs-hashed-locks.rst

diff --git a/Documentation/filesystems/kernfs-hashed-locks.rst b/Documentation/filesystems/kernfs-hashed-locks.rst
new file mode 100644
index 0000000000000..2ffa579ee1e3b
--- /dev/null
+++ b/Documentation/filesystems/kernfs-hashed-locks.rst
@@ -0,0 +1,245 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+===================
+kernfs hashed locks
+===================
+
+kernfs uses the following hashed locks:
+
+1. Hashed mutexes
+2. Hashed spinlocks
+3. Hashed rwsems
+
+In certain cases a hashed rwsem must work in conjunction with a per-fs mutex
+(described further below), so this document describes this mutex as well.
+
+A kernfs_global_locks object (defined below) provides hashed mutexes,
+hashed spinlocks and hashed rwsems::
+
+	struct kernfs_global_locks {
+		struct kernfs_open_file_mutex open_file_mutex[NR_KERNFS_LOCKS];
+		struct kernfs_open_node_lock open_node_locks[NR_KERNFS_LOCKS];
+		struct rw_semaphore kernfs_rwsem[NR_KERNFS_LOCKS];
+	};
+
+The hashed mutexes and spinlocks are encapsulated in kernfs_open_file_mutex and
+kernfs_open_node_lock respectively, as shown below::
+
+	struct kernfs_open_file_mutex {
+		struct mutex lock;
+	} ____cacheline_aligned_in_smp;
+
+	struct kernfs_open_node_lock {
+		spinlock_t lock;
+	} ____cacheline_aligned_in_smp;
+
+
+For all hashed locks, the address of a kernfs_node object is the hashing key.
+
+For the remainder of this document a node means a kernfs_node object. The
+node can refer to a file, directory or symlink of a kernfs based file system.
+Also, a node's mutex, spinlock or rwsem refers to the hashed mutex, spinlock
+or rwsem corresponding to the node.
+It does not mean any locking construct embedded in the kernfs_node itself.
+
+What is protected by hashed locks
+=================================
+
+(1) There is one kernfs_open_file for each open file, and all kernfs_open_file
+    instances corresponding to a kernfs_node are maintained in a list.
+    The hashed mutexes (kernfs_global_locks.open_file_mutex[index].lock) protect
+    this list.
+
+(2) For each kernfs file that has been opened there is one instance of
+    kernfs_open_node, and kernfs_node->attr.open points to it.
+    The hashed spinlocks (kernfs_global_locks.open_node_locks[index].lock)
+    protect ->attr.open.
+
+(3) The hashed rwsems (kernfs_global_locks.kernfs_rwsem[index]) protect a
+    node's state and synchronize operations that change the state of a node
+    or depend on the state of a node.
+
+(4) The per-fs mutex (mentioned earlier) synchronizes lookup and remove
+    operations.
+    While looking for a node we do not yet have the address of that node,
+    so we can't acquire the node's rwsem right from the beginning.
+    On the other hand a parallel remove operation for the same node can acquire
+    the corresponding rwsem and go ahead with node removal. So it may happen
+    that a search operation finds and returns the node, but before it can be
+    pinned or used, the remove operation that was running in parallel removes
+    the node and hence makes any future use of it invalid.
+    The per-fs mutex lets only one of the competing search and remove
+    operations proceed at a time, and since the object returned by search is
+    pinned before releasing the per-fs mutex, it remains available afterwards.
+
+
+Lock usage and proof
+=======================
+
+(1) Hashed mutexes
+
+    Since hashed mutexes protect the list of kernfs_open_file instances
+    corresponding to a kernfs_node, the ->open and ->release backends of
+    file_operations need to acquire the hashed mutex of that kernfs_node.
+    Also when a kernfs_node is removed, all of its kernfs_open_file instances
+    are drained after deactivating the node. This drain operation acquires
+    the hashed mutex to traverse the list of kernfs_open_file instances.
+    So addition (via ->open), deletion (via ->release) and traversal
+    (during kernfs_drain) of the kernfs_open_file list are serialized
+    against each other.
+
+(2) Hashed spinlocks
+
+    As hashed spinlocks protect ->attr.open, the ->open and ->release backends
+    of file operations need to acquire the spinlock of the kernfs_node so
+    that kernfs_open_node instances can be properly refcounted and freed when
+    this refcount reaches 0.
+
+    The file events notifier uses the ->poll backend of a kernfs_open_node
+    instance, so it also needs the node's spinlock before accessing ->attr.open.
+
+(3) Hashed rwsems
+
+    3.1. A node's rwsem protects its state and needs to be acquired to:
+        3.1.a. Remove the node
+        3.1.b. Move the node
+        3.1.c. Traverse or modify a node's children RB tree (for
+               directories), i.e. to add/remove files/subdirectories
+               within/from a directory.
+        3.1.d. Modify or access the node's inode attributes
+
+    3.2. Hashed rwsems are used in the following operations:
+
+        3.2.a. Addition of a new node
+
+               While adding a new kernfs_node under a kernfs directory,
+               kernfs_add_one acquires the directory node's rwsem for
+               writing. Clause 3.1.a ensures that the directory exists
+               throughout the operation. Clause 3.1.c ensures a proper
+               update of the children RB tree (i.e. ->dir.children).
+               Clause 3.1.d ensures correct modification of the inode
+               attributes to reflect the timestamp of this operation.
+               If the directory gets removed while waiting for the semaphore,
+               the subsequent checks in kernfs_add_one will fail, resulting
+               in an early bail out from kernfs_add_one.
+
+        3.2.b. Removal of a node
+
+               Removal of a node involves recursive removal of all of its
+               descendants as well. The per-fs mutex (i.e. kernfs_rm_mutex)
+               avoids concurrent node removals even if the nodes are different.
+
+               At first the node's rwsem is acquired. Clause 3.1.c avoids
+               parallel modification of the descendant tree, and while holding
+               this rwsem each of the descendants is deactivated.
+
+               Once a descendant has been deactivated and drained, its parent's
+               rwsem is taken. Clause 3.1.c ensures proper unlinking of this
+               descendant from its siblings. Clause 3.1.d ensures that parent's
+               inode attributes are correctly updated to record the time stamp
+               of removal.
+
+        3.2.c. Movement of a node
+
+               Moving or renaming a node (kernfs_rename_ns) acquires rwsem for
+               node and its old and new parents. Clauses 3.1.b and 3.1.c avoid
+               concurrent move operations for the same node.
+               Also if the old parent of a node changes while waiting for rwsem,
+               the acquisition of rwsem for the 3 involved nodes is attempted
+               again. It is always ensured that, as far as the old parent is
+               concerned, the rwsem of the current parent is acquired.
+
+        3.2.d. Reading a directory
+
+               For directory reading kernfs_fop_readdir acquires the directory
+               node's rwsem for reading. Clause 3.1.c ensures a consistent view
+               of the children RB tree.
+               As far as the directory being read is concerned, if it gets
+               removed while waiting for the semaphore, the for loop that
+               iterates through children will be ineffective. So for this
+               operation the directory node's rwsem for reading is enough.
+
+        3.2.e. Dentry revalidation
+
+               A dentry revalidation (kernfs_dop_revalidate) can happen for a
+               negative or for a normal dentry.
+               For negative dentries we just need to check parent change, so in
+               this case acquiring parent kernfs_node's rwsem for reading is
+               enough.
+               For a normal dentry acquiring node's rwsem for reading is enough
+               (Clauses 3.1.a and 3.1.b).
+               If the node gets removed while waiting for the lock, subsequent
+               checks in kernfs_dop_revalidate will fail and
+               kernfs_dop_revalidate will exit early.
+
+        3.2.f. kernfs_node lookup
+
+               While searching for a node under a given parent
+               (kernfs_find_and_get_ns, kernfs_walk_and_get_ns) rwsem of parent
+               node is acquired for reading. Clause 3.1.c ensures a consistent
+               view of parent's children RB tree. To avoid parallel removal of
+               the found node before it gets pinned, these operations use the
+               per-fs mutex (kernfs_rm_mutex) as explained earlier.
+               This per-fs mutex is also taken during kernfs_node removal
+               (__kernfs_remove).
+
+               If the node being searched gets removed while waiting for the
+               mutex or rwsem, the subsequent kernfs_find_ns or kernfs_walk_ns
+               will fail.
+
+        3.2.g. kernfs_node's inode lookup
+
+               Looking up inode instances via kernfs_iop_lookup involves a
+               node lookup, so the locks acquired are the same as in 3.2.f.
+               Also once node lookup is complete parent's rwsem is released and
+               rwsem of found node is acquired to get corresponding inode.
+               Since we are operating under the per-fs kernfs_rm_mutex, the
+               found node will not disappear in the middle.
+
+        3.2.h. Updating or reading inode attribute
+
+               Interfaces that change inode attributes (i.e. kernfs_setattr and
+               kernfs_iop_setattr) acquire node's rwsem for writing.
+               If the kernfs_node gets removed while waiting for the semaphore
+               the subsequent __kernfs_setattr will fail.
+               From 3.2.a and 3.2.b we know that updates due to addition or
+               removal of nodes will not happen in parallel.
+               So just locking the kernfs_node in these cases is enough to
+               guarantee correct modification of inode attributes.
+               Similarly the interfaces that read inode attributes
+               (i.e. kernfs_iop_getattr, kernfs_iop_permission) just need to
+               acquire involved node's rwsem for reading.
+
+        3.2.i. kernfs file event generation
+
+               kernfs_notify pins the involved node before scheduling
+               kernfs_notify_work, and kernfs_notify_workfn acquires node's
+               rwsem. Clauses in 3.1 ensure a consistent view of node state
+               throughout execution of the work handler.
+
+        3.2.j. mount
+
+               kernfs_fill_super, invoked during mount operation, acquires root
+               node's rwsem. During mount there can't be other execution
+               contexts trying to move or delete the node, so just locking the
+               involved node (i.e. the root node) is enough.
+
+        3.2.k. getting symlink for a kernfs file
+
+               kernfs_getlink locks both parent and target nodes. Clauses
+               3.1.a and 3.1.b avoid movement/removal of parent and target
+               nodes. If parent or target gets moved or removed while
+               kernfs_getlink was waiting for rwsem, the subsequent
+               kernfs_get_target_path will return an error.
+
+        3.2.l. while activating a node
+
+               For a node that started as deactivated, kernfs_activate
+               activates the node. In this case acquiring node's rwsem is
+               enough. Since the node is not active yet, any parallel removal
+               that wins the race for rwsem will skip this node and its
+               descendants. Also user space can't see a deactivated node, so
+               there is no parallel access emanating from there either.
+
+    3.3 For operations that involve locking multiple nodes at the same time
+        locks are acquired in order of their addresses.
-- 
2.30.2
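
A note on clause 3.3 of the document above: the rule that multiple locks are taken in address order, with nodes that hash to the same lock counted once, can be tried outside the kernel. The following is only a minimal userspace sketch and is not code from this patch set. The names demo_lock, demo_token and demo_sort_unique are hypothetical, and unlike kernfs_sort_rwsems() the sketch compacts duplicates so that a caller would simply lock entries 0..count-1 in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rw_semaphore; only its address matters. */
struct demo_lock {
	int unused;
};

#define DEMO_MAX_LOCKS 3

/* Hypothetical userspace analogue of struct kernfs_rwsem_token. */
struct demo_token {
	struct demo_lock *locks[DEMO_MAX_LOCKS];
	int count;	/* in: number of entries, out: number of unique entries */
};

/*
 * Sort the lock pointers by address and drop duplicates, so that every
 * caller acquires a given set of locks in the same global order and
 * never tries to take the same lock twice.
 */
static void demo_sort_unique(struct demo_token *t)
{
	int i, j, out;

	/* Insertion sort by address; the array holds at most 3 entries. */
	for (i = 1; i < t->count; i++) {
		struct demo_lock *key = t->locks[i];

		for (j = i - 1;
		     j >= 0 && (uintptr_t)t->locks[j] > (uintptr_t)key; j--)
			t->locks[j + 1] = t->locks[j];
		t->locks[j + 1] = key;
	}

	/* After sorting, equal addresses are adjacent; keep one of each. */
	out = 0;
	for (i = 0; i < t->count; i++) {
		if (out == 0 || t->locks[out - 1] != t->locks[i])
			t->locks[out++] = t->locks[i];
	}
	t->count = out;
}
```

Because two tasks that need an overlapping set of locks both sort by address first, neither can hold a higher-addressed lock while waiting for a lower-addressed one, which is the usual argument against ABBA deadlocks.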