From nobody Sun Feb 8 13:10:21 2026
From: "Jinhui Guo"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
Date: Thu, 22 Jan 2026 22:52:06 +0800
In-Reply-To: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
Message-Id: <20260122145208.1013-2-guojinhui.liam@bytedance.com>
X-Mailer: git-send-email 2.17.1
Subject: [PATCH v2 1/3] driver core: Introduce helper function __device_attach_driver_scan()
Content-Type: text/plain; charset="utf-8"

The logic responsible for managing parent device PM runtime (get/put),
iterating over the bus drivers, and determining whether asynchronous
probing is required is currently duplicated between __device_attach()
and __device_attach_async_helper().

Factor out this common logic into a new static helper function,
__device_attach_driver_scan().

This reduces code duplication and improves maintainability without
altering the existing behavior.

Signed-off-by: Jinhui Guo
Reviewed-by: Danilo Krummrich
Acked-by: Dan Williams
---
 drivers/base/dd.c | 71 ++++++++++++++++++++++++++---------------------
 1 file changed, 40 insertions(+), 31 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index ed3a07624816..b6be95871d3d 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -964,6 +964,44 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
 	return ret == 0;
 }
 
+static int __device_attach_driver_scan(struct device_attach_data *data,
+				       bool *need_async)
+{
+	int ret = 0;
+	struct device *dev = data->dev;
+
+	if (dev->parent)
+		pm_runtime_get_sync(dev->parent);
+
+	ret = bus_for_each_drv(dev->bus, NULL, data,
+			       __device_attach_driver);
+	/*
+	 * When running in an async worker, a NULL need_async is passed
+	 * since we are already in an async worker.
+	 */
+	if (need_async && !ret && data->check_async && data->have_async) {
+		/*
+		 * If we could not find appropriate driver
+		 * synchronously and we are allowed to do
+		 * async probes and there are drivers that
+		 * want to probe asynchronously, we'll
+		 * try them.
+		 */
+		dev_dbg(dev, "scheduling asynchronous probe\n");
+		get_device(dev);
+		*need_async = true;
+	} else {
+		if (!need_async)
+			dev_dbg(dev, "async probe completed\n");
+		pm_request_idle(dev);
+	}
+
+	if (dev->parent)
+		pm_runtime_put(dev->parent);
+
+	return ret;
+}
+
 static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 {
 	struct device *dev = _dev;
@@ -984,16 +1022,8 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 	if (dev->p->dead || dev->driver)
 		goto out_unlock;
 
-	if (dev->parent)
-		pm_runtime_get_sync(dev->parent);
+	__device_attach_driver_scan(&data, NULL);
 
-	bus_for_each_drv(dev->bus, NULL, &data, __device_attach_driver);
-	dev_dbg(dev, "async probe completed\n");
-
-	pm_request_idle(dev);
-
-	if (dev->parent)
-		pm_runtime_put(dev->parent);
 out_unlock:
 	device_unlock(dev);
 
@@ -1027,28 +1057,7 @@ static int __device_attach(struct device *dev, bool allow_async)
 			.want_async = false,
 		};
 
-		if (dev->parent)
-			pm_runtime_get_sync(dev->parent);
-
-		ret = bus_for_each_drv(dev->bus, NULL, &data,
-					__device_attach_driver);
-		if (!ret && allow_async && data.have_async) {
-			/*
-			 * If we could not find appropriate driver
-			 * synchronously and we are allowed to do
-			 * async probes and there are drivers that
-			 * want to probe asynchronously, we'll
-			 * try them.
-			 */
-			dev_dbg(dev, "scheduling asynchronous probe\n");
-			get_device(dev);
-			async = true;
-		} else {
-			pm_request_idle(dev);
-		}
-
-		if (dev->parent)
-			pm_runtime_put(dev->parent);
+		ret = __device_attach_driver_scan(&data, &async);
 	}
 out_unlock:
 	device_unlock(dev);
-- 
2.20.1

From nobody Sun Feb 8 13:10:21 2026
From: "Jinhui Guo"
Date: Thu, 22 Jan 2026 22:52:07 +0800
X-Mailer: git-send-email 2.17.1
Subject: [PATCH v2 2/3] driver core: Add NUMA-node awareness to the synchronous probe path
Message-Id: <20260122145208.1013-3-guojinhui.liam@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
In-Reply-To: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
References: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
Content-Type: text/plain; charset="utf-8"

Introduce NUMA-node-aware synchronous probing: drivers can initialize and
allocate memory on the device’s local node without scattering
kmalloc_node() calls throughout the code.

NUMA-aware probing was first added to PCI drivers by commit d42c69972b85
("[PATCH] PCI: Run PCI driver initialization on local node") in 2005 and
has benefited PCI drivers ever since.

The asynchronous probe path already supports NUMA-node-aware probing via
async_schedule_dev() in the driver core. Since NUMA affinity is orthogonal
to sync/async probing, this patch adds NUMA-node-aware support to the
synchronous probe path.

Signed-off-by: Jinhui Guo
Acked-by: Dan Williams
---
 drivers/base/dd.c | 76 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 70 insertions(+), 6 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b6be95871d3d..a8d560034abe 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -810,10 +810,56 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
 	return ret;
 }
 
+/* Context for NUMA execution */
+struct numa_work_ctx {
+	struct work_struct work;
+	const struct device_driver *drv;
+	struct device *dev;
+	int result;
+};
+
+/* Worker function running on the target node */
+static void __driver_probe_device_node_helper(struct work_struct *work)
+{
+	struct numa_work_ctx *ctx = container_of(work, struct numa_work_ctx, work);
+
+	ctx->result = __driver_probe_device(ctx->drv, ctx->dev);
+}
+
+/*
+ * __driver_probe_device_node - execute __driver_probe_device on a specific NUMA node synchronously
+ * @drv: driver to bind a device to
+ * @dev: device to try to bind to the driver
+ *
+ * Returns the result of the function execution, or -ENODEV if initialization fails.
+ * If the node is invalid or offline, it falls back to local execution.
+ */
+static int __driver_probe_device_node(const struct device_driver *drv, struct device *dev)
+{
+	struct numa_work_ctx ctx;
+	int node = dev_to_node(dev);
+
+	if (node < 0 || node >= MAX_NUMNODES || !node_online(node))
+		return __driver_probe_device(drv, dev);
+
+	ctx.drv = drv;
+	ctx.dev = dev;
+	ctx.result = -ENODEV;
+	INIT_WORK_ONSTACK(&ctx.work, __driver_probe_device_node_helper);
+
+	/* Use system_dfl_wq to allow execution on the specific node. */
+	queue_work_node(node, system_dfl_wq, &ctx.work);
+	flush_work(&ctx.work);
+	destroy_work_on_stack(&ctx.work);
+
+	return ctx.result;
+}
+
 /**
  * driver_probe_device - attempt to bind device & driver together
  * @drv: driver to bind a device to
  * @dev: device to try to bind to the driver
+ * @in_async: true if the caller is running in an asynchronous worker context
  *
  * This function returns -ENODEV if the device is not registered, -EBUSY if it
  * already has a driver, 0 if the device is bound successfully and a positive
@@ -824,13 +870,22 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
  *
  * If the device has a parent, runtime-resume the parent before driver probing.
  */
-static int driver_probe_device(const struct device_driver *drv, struct device *dev)
+static int driver_probe_device(const struct device_driver *drv, struct device *dev, bool in_async)
 {
 	int trigger_count = atomic_read(&deferred_trigger_count);
 	int ret;
 
 	atomic_inc(&probe_count);
-	ret = __driver_probe_device(drv, dev);
+	/*
+	 * If we are already in an asynchronous worker, invoke __driver_probe_device()
+	 * directly to avoid the overhead of an additional workqueue scheduling.
+	 * The async subsystem manages its own concurrency and placement.
+	 */
+	if (in_async)
+		ret = __driver_probe_device(drv, dev);
+	else
+		ret = __driver_probe_device_node(drv, dev);
+
 	if (ret == -EPROBE_DEFER || ret == EPROBE_DEFER) {
 		driver_deferred_probe_add(dev);
 
@@ -919,6 +974,13 @@ struct device_attach_data {
 	 * driver, we'll encounter one that requests asynchronous probing.
 	 */
 	bool have_async;
+
+	/*
+	 * True when running inside an asynchronous worker context scheduled
+	 * by async_schedule_dev() with callback function
+	 * __device_attach_async_helper().
+	 */
+	bool in_async;
 };
 
 static int __device_attach_driver(struct device_driver *drv, void *_data)
@@ -958,7 +1020,7 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
 	 * Ignore errors returned by ->probe so that the next driver can try
 	 * its luck.
 	 */
-	ret = driver_probe_device(drv, dev);
+	ret = driver_probe_device(drv, dev, data->in_async);
 	if (ret < 0)
 		return ret;
 	return ret == 0;
@@ -1009,6 +1071,7 @@ static void __device_attach_async_helper(void *_dev, async_cookie_t cookie)
 		.dev = dev,
 		.check_async = true,
 		.want_async = true,
+		.in_async = true,
 	};
 
 	device_lock(dev);
@@ -1055,6 +1118,7 @@ static int __device_attach(struct device *dev, bool allow_async)
 			.dev = dev,
 			.check_async = allow_async,
 			.want_async = false,
+			.in_async = false,
 		};
 
 		ret = __device_attach_driver_scan(&data, &async);
@@ -1144,7 +1208,7 @@ int device_driver_attach(const struct device_driver *drv, struct device *dev)
 	int ret;
 
 	__device_driver_lock(dev, dev->parent);
-	ret = __driver_probe_device(drv, dev);
+	ret = __driver_probe_device_node(drv, dev);
 	__device_driver_unlock(dev, dev->parent);
 
 	/* also return probe errors as normal negative errnos */
@@ -1165,7 +1229,7 @@ static void __driver_attach_async_helper(void *_dev, async_cookie_t cookie)
 	__device_driver_lock(dev, dev->parent);
 	drv = dev->p->async_driver;
 	dev->p->async_driver = NULL;
-	ret = driver_probe_device(drv, dev);
+	ret = driver_probe_device(drv, dev, true);
 	__device_driver_unlock(dev, dev->parent);
 
 	dev_dbg(dev, "driver %s async attach completed: %d\n", drv->name, ret);
@@ -1233,7 +1297,7 @@ static int __driver_attach(struct device *dev, void *data)
 	}
 
 	__device_driver_lock(dev, dev->parent);
-	driver_probe_device(drv, dev);
+	driver_probe_device(drv, dev, false);
 	__device_driver_unlock(dev, dev->parent);
 
 	return 0;
-- 
2.20.1

From nobody Sun Feb 8 13:10:21 2026
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
Message-Id: <20260122145208.1013-4-guojinhui.liam@bytedance.com>
In-Reply-To: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
X-Mailer: git-send-email 2.17.1
Date: Thu, 22 Jan 2026 22:52:08 +0800
References: <20260122145208.1013-1-guojinhui.liam@bytedance.com>
From: "Jinhui Guo"
Subject: [PATCH v2 3/3] PCI: Clean up NUMA-node awareness in pci_bus_type probe
Content-Type: text/plain; charset="utf-8"

With NUMA-node-aware probing now handled by the driver core, the
equivalent code in the PCI driver is redundant and can be removed.

Dropping it speeds up asynchronous probe by 35%; the gain comes from
eliminating the work_on_cpu() call in pci_call_probe() that previously
pinned every worker to the same CPU, forcing serial probe of devices on
the same NUMA node.

Testing three NVMe devices on the same NUMA node of an AMD EPYC 9A64
2.4 GHz processor shows a 35% probe-time improvement with the patch:

Before (all on CPU 0):
  nvme 0000:01:00.0: CPU: 0, COMM: kworker/0:1, cost: 52266334ns
  nvme 0000:02:00.0: CPU: 0, COMM: kworker/0:0, cost: 50787194ns
  nvme 0000:03:00.0: CPU: 0, COMM: kworker/0:2, cost: 50541584ns

After (spread across CPUs 1, 2, 4):
  nvme 0000:01:00.0: CPU: 1, COMM: kworker/u1025:2, cost: 35399608ns
  nvme 0000:02:00.0: CPU: 2, COMM: kworker/u1025:3, cost: 35156157ns
  nvme 0000:03:00.0: CPU: 4, COMM: kworker/u1025:0, cost: 35322116ns

The improvement grows with more PCI devices because fewer probes contend
for the same CPU.

Signed-off-by: Jinhui Guo
Acked-by: Dan Williams
---
 drivers/pci/pci-driver.c | 116 +++------------------------------------
 include/linux/pci.h      |   4 --
 kernel/sched/isolation.c |   2 -
 3 files changed, 8 insertions(+), 114 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 6b80400ee9b9..258f16da6550 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -296,17 +296,9 @@ static struct attribute *pci_drv_attrs[] = {
 };
 ATTRIBUTE_GROUPS(pci_drv);
 
-struct drv_dev_and_id {
-	struct pci_driver *drv;
-	struct pci_dev *dev;
-	const struct pci_device_id *id;
-};
-
-static int local_pci_probe(struct drv_dev_and_id *ddi)
+static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
+			  const struct pci_device_id *id)
 {
-	struct pci_dev *pci_dev = ddi->dev;
-	struct pci_driver *pci_drv = ddi->drv;
-	struct device *dev = &pci_dev->dev;
 	int rc;
 
 	/*
@@ -318,113 +310,25 @@ static int local_pci_probe(struct drv_dev_and_id *ddi)
 	 * count, in its probe routine and pm_runtime_get_noresume() in
 	 * its remove routine.
 	 */
-	pm_runtime_get_sync(dev);
-	pci_dev->driver = pci_drv;
-	rc = pci_drv->probe(pci_dev, ddi->id);
+	pm_runtime_get_sync(&dev->dev);
+	dev->driver = drv;
+	rc = drv->probe(dev, id);
 	if (!rc)
 		return rc;
 	if (rc < 0) {
-		pci_dev->driver = NULL;
-		pm_runtime_put_sync(dev);
+		dev->driver = NULL;
+		pm_runtime_put_sync(&dev->dev);
 		return rc;
 	}
 	/*
 	 * Probe function should return < 0 for failure, 0 for success
 	 * Treat values > 0 as success, but warn.
 	 */
-	pci_warn(pci_dev, "Driver probe function unexpectedly returned %d\n",
+	pci_warn(dev, "Driver probe function unexpectedly returned %d\n",
 		 rc);
 	return 0;
 }
 
-static struct workqueue_struct *pci_probe_wq;
-
-struct pci_probe_arg {
-	struct drv_dev_and_id *ddi;
-	struct work_struct work;
-	int ret;
-};
-
-static void local_pci_probe_callback(struct work_struct *work)
-{
-	struct pci_probe_arg *arg = container_of(work, struct pci_probe_arg, work);
-
-	arg->ret = local_pci_probe(arg->ddi);
-}
-
-static bool pci_physfn_is_probed(struct pci_dev *dev)
-{
-#ifdef CONFIG_PCI_IOV
-	return dev->is_virtfn && dev->physfn->is_probed;
-#else
-	return false;
-#endif
-}
-
-static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
-			  const struct pci_device_id *id)
-{
-	int error, node, cpu;
-	struct drv_dev_and_id ddi = { drv, dev, id };
-
-	/*
-	 * Execute driver initialization on node where the device is
-	 * attached. This way the driver likely allocates its local memory
-	 * on the right node.
-	 */
-	node = dev_to_node(&dev->dev);
-	dev->is_probed = 1;
-
-	cpu_hotplug_disable();
-	/*
-	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
-	 * device is probed from work_on_cpu() of the Physical device.
-	 */
-	if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
-	    pci_physfn_is_probed(dev)) {
-		error = local_pci_probe(&ddi);
-	} else {
-		struct pci_probe_arg arg = { .ddi = &ddi };
-
-		INIT_WORK_ONSTACK(&arg.work, local_pci_probe_callback);
-		/*
-		 * The target election and the enqueue of the work must be within
-		 * the same RCU read side section so that when the workqueue pool
-		 * is flushed after a housekeeping cpumask update, further readers
-		 * are guaranteed to queue the probing work to the appropriate
-		 * targets.
-		 */
-		rcu_read_lock();
-		cpu = cpumask_any_and(cpumask_of_node(node),
-				      housekeeping_cpumask(HK_TYPE_DOMAIN));
-
-		if (cpu < nr_cpu_ids) {
-			struct workqueue_struct *wq = pci_probe_wq;
-
-			if (WARN_ON_ONCE(!wq))
-				wq = system_percpu_wq;
-			queue_work_on(cpu, wq, &arg.work);
-			rcu_read_unlock();
-			flush_work(&arg.work);
-			error = arg.ret;
-		} else {
-			rcu_read_unlock();
-			error = local_pci_probe(&ddi);
-		}
-
-		destroy_work_on_stack(&arg.work);
-	}
-
-	dev->is_probed = 0;
-	cpu_hotplug_enable();
-	return error;
-}
-
-void pci_probe_flush_workqueue(void)
-{
-	flush_workqueue(pci_probe_wq);
-}
-
 /**
  * __pci_device_probe - check if a driver wants to claim a specific PCI device
  * @drv: driver to call to check if it wants the PCI device
@@ -1734,10 +1638,6 @@ static int __init pci_driver_init(void)
 {
 	int ret;
 
-	pci_probe_wq = alloc_workqueue("sync_wq", WQ_PERCPU, 0);
-	if (!pci_probe_wq)
-		return -ENOMEM;
-
 	ret = bus_register(&pci_bus_type);
 	if (ret)
 		return ret;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7e36936bb37a..ae05faa105e2 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -486,7 +486,6 @@ struct pci_dev {
 	unsigned int	io_window_1k:1;		/* Intel bridge 1K I/O windows */
 	unsigned int	irq_managed:1;
 	unsigned int	non_compliant_bars:1;	/* Broken BARs; ignore them */
-	unsigned int	is_probed:1;		/* Device probing in progress */
 	unsigned int	link_active_reporting:1;/* Device capable of reporting link active */
 	unsigned int	no_vf_scan:1;		/* Don't scan for VFs after IOV enablement */
 	unsigned int	no_command_memory:1;	/* No PCI_COMMAND_MEMORY */
@@ -1211,7 +1210,6 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
 				    struct pci_ops *ops, void *sysdata,
 				    struct list_head *resources);
 int pci_host_probe(struct pci_host_bridge *bridge);
-void pci_probe_flush_workqueue(void);
 int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
 int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
 void pci_bus_release_busn_res(struct pci_bus *b);
@@ -2085,8 +2083,6 @@ static inline int pci_has_flag(int flag) { return 0; }
 _PCI_NOP_ALL(read, *)
 _PCI_NOP_ALL(write,)
 
-static inline void pci_probe_flush_workqueue(void) { }
-
 static inline struct pci_dev *pci_get_device(unsigned int vendor,
 					     unsigned int device,
 					     struct pci_dev *from)
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ef152d401fe2..3d28d8163ee4 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -8,7 +8,6 @@
  *
  */
 #include
-#include
 #include "sched.h"
 
 enum hk_flags {
@@ -144,7 +143,6 @@ int housekeeping_update(struct cpumask *isol_mask)
 
 	synchronize_rcu();
 
-	pci_probe_flush_workqueue();
 	mem_cgroup_flush_workqueue();
 	vmstat_flush_workqueue();
 
-- 
2.20.1