From: Douglas Anderson
To: Greg Kroah-Hartman, "Rafael J . Wysocki", Danilo Krummrich
Cc: Alan Stern, Kay Sievers, Saravana Kannan, Douglas Anderson,
    stable@vger.kernel.org, driver-core@lists.linux.dev,
    linux-kernel@vger.kernel.org
Subject: [RFC PATCH] driver core: Don't link the device to the bus until we're ready to probe
Date: Fri, 20 Mar 2026 20:06:58 -0700
Message-ID: <20260320200656.RFC.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid>

The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:

  really_probe()
  __driver_probe_device()
  driver_probe_device()
  __driver_attach()
  bus_for_each_dev()
  driver_attach()
  bus_add_driver()
  driver_register()
  __platform_driver_register()
  init_module() [some module]
  do_one_initcall()
  do_init_module()
  load_module()
  __arm64_sys_finit_module()
  invoke_syscall()

Practically speaking, in the case this happened
device_links_driver_bound() got called for the device before
"dev->fwnode->dev" was assigned. This prevented
__fw_devlink_pickup_dangling_consumers() from being called which meant
that other devices waiting on our driver's sub-nodes were stuck
deferring forever.

Fix the problem by adjusting where we link the device.
Notably:
* Make sure we assign the dev->fwnode->dev before we link the device,
  since that needs to happen before a device probes.
* Make sure we link the device _before_ sending the UEVENT, as
  described in commit 2023c610dc54 ("Driver core: add new device to
  bus's list before probing").

[1] Captured on a machine running a downstream 6.6 kernel

Cc: stable@vger.kernel.org
Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
Signed-off-by: Douglas Anderson
---
This is a super tricky area of code, and I don't have a lot of
confidence that I got this exactly right. If you think I should do
something different, please yell!

Notably:
* I don't know 100% for sure what happens if probe runs at the same
  time as (or before) the call to
  `bus_notify(dev, BUS_NOTIFY_ADD_DEVICE)`. Presumably we're at least
  no worse off than we were before this patch.
* I haven't dug into whether there could be other types of race
  conditions with a driver trying to probe the device at the same time
  we call bus_probe_device().

I can try to dig into those things, but I also figured that people on
the mailing lists would know this code a whole lot better than I
do. Feel free to consider this RFC patch as a bug report and post a
totally different solution. I won't be offended! :-)

I have only done very minimal testing with this patch so far (the
system doesn't seem to blow up with this patch). I'm kicking off tests
for the weekend, but I figured I'd post this up in parallel.
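
For anyone trying to follow the ordering argument, the sketch below is
a small single-threaded user-space model of the race. It is not kernel
code. Every model_* name is a hypothetical stand-in, and the
"driver_races" flag simply forces the worst-case interleaving: a driver
registering on another thread the instant the device appears on the
bus list. The pickup condition (fwnode->dev == dev) is an assumption
taken from the description above, not a copy of
device_links_driver_bound().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev;

/* Hypothetical stand-in for the firmware node's back-pointer. */
struct model_fwnode {
	struct model_dev *dev;
};

/* Hypothetical stand-in for the bits of "struct device" that matter here. */
struct model_dev {
	struct model_fwnode *fwnode;
	bool on_bus_list;		/* models membership in klist_devices */
	bool probed;			/* models a driver having been bound */
	bool consumers_picked_up;	/* models the dangling-consumer pickup */
};

/* Models a probe attempt; only devices visible on the bus list can probe. */
void model_probe(struct model_dev *dev)
{
	if (!dev->on_bus_list || dev->probed)
		return;
	dev->probed = true;
	/* Assumed condition: pickup only runs once fwnode->dev points at us. */
	if (dev->fwnode && dev->fwnode->dev == dev)
		dev->consumers_picked_up = true;
}

/* Models the "dev->fwnode->dev = dev" assignment in device_add(). */
void model_assign_fwnode(struct model_dev *dev)
{
	if (dev->fwnode && !dev->fwnode->dev)
		dev->fwnode->dev = dev;
}

/* Models linking into the bus list, optionally racing with a new driver. */
void model_link(struct model_dev *dev, bool driver_races)
{
	dev->on_bus_list = true;
	if (driver_races)
		model_probe(dev);
}

/* Ordering before this patch: link first, assign fwnode->dev later. */
void model_device_add_old(struct model_dev *dev, bool driver_races)
{
	model_link(dev, driver_races);
	model_assign_fwnode(dev);
	model_probe(dev);	/* models bus_probe_device() */
	assert(dev->probed);
}

/* Ordering with this patch: assign fwnode->dev, then link. */
void model_device_add_new(struct model_dev *dev, bool driver_races)
{
	model_assign_fwnode(dev);
	model_link(dev, driver_races);
	model_probe(dev);	/* models bus_probe_device() */
	assert(dev->probed);
}
```

Running model_device_add_old() with driver_races set leaves the device
probed but with consumers_picked_up still false, which corresponds to
the "stuck deferring forever" symptom. model_device_add_new() sets the
flag whether or not the driver races. The model says nothing about the
bus_notify() and uevent ordering questions raised above.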

 drivers/base/base.h |  1 +
 drivers/base/bus.c  | 16 ++++++++++++++--
 drivers/base/core.c | 24 ++++++++++++++++++------
 3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 1af95ac68b77..d4933ba7b651 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -166,6 +166,7 @@ static inline void auxiliary_bus_init(void) { }
 struct kobject *virtual_device_parent(void);
 
 int bus_add_device(struct device *dev);
+void bus_link_device(struct device *dev);
 void bus_probe_device(struct device *dev);
 void bus_remove_device(struct device *dev);
 void bus_notify(struct device *dev, enum bus_notifier_event value);
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index bb61d8adbab1..573c22ebb5b7 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -510,7 +510,6 @@ EXPORT_SYMBOL_GPL(bus_for_each_drv);
  *
  * - Add device's bus attributes.
  * - Create links to device's bus.
- * - Add the device to its bus's list of devices.
  */
 int bus_add_device(struct device *dev)
 {
@@ -545,7 +544,6 @@ int bus_add_device(struct device *dev)
 	if (error)
 		goto out_subsys;
 
-	klist_add_tail(&dev->p->knode_bus, &sp->klist_devices);
 	return 0;
 
 out_subsys:
@@ -557,6 +555,20 @@ int bus_add_device(struct device *dev)
 	return error;
 }
 
+/**
+ * bus_link_device - Make a device findable in the list of devices
+ * @dev: device being linked
+ *
+ * - Add the device to its bus's list of devices.
+ */
+void bus_link_device(struct device *dev)
+{
+	struct subsys_private *sp = bus_to_subsys(dev->bus);
+
+	if (sp)
+		klist_add_tail(&dev->p->knode_bus, &sp->klist_devices);
+}
+
 /**
  * bus_probe_device - probe drivers for a new device
  * @dev: device to probe
diff --git a/drivers/base/core.c b/drivers/base/core.c
index 791f9e444df8..6ff4725d677a 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3663,12 +3663,6 @@ int device_add(struct device *dev)
 		devtmpfs_create_node(dev);
 	}
 
-	/* Notify clients of device addition. This call must come
-	 * after dpm_sysfs_add() and before kobject_uevent().
-	 */
-	bus_notify(dev, BUS_NOTIFY_ADD_DEVICE);
-	kobject_uevent(&dev->kobj, KOBJ_ADD);
-
 	/*
 	 * Check if any of the other devices (consumers) have been waiting for
 	 * this device (supplier) to be added so that they can create a device
@@ -3680,12 +3674,30 @@ int device_add(struct device *dev)
 	 * But this also needs to happen before bus_probe_device() to make sure
 	 * waiting consumers can link to it before the driver is bound to the
 	 * device and the driver sync_state callback is called for this device.
+	 *
+	 * Because a bus may be probed the moment bus_link_device() is called,
+	 * this must happen before bus_link_device().
 	 */
 	if (dev->fwnode && !dev->fwnode->dev) {
 		dev->fwnode->dev = dev;
 		fw_devlink_link_device(dev);
 	}
 
+	/*
+	 * The moment we link the bus in, it's possible for another thread
+	 * (one registering a new driver) to notice it and start probing.
+	 * At the same time, we need to link the bus before the uevent is
+	 * sent announcing the device or user programs might try to access
+	 * the device before it has been added to the bus.
+	 */
+	bus_link_device(dev);
+
+	/* Notify clients of device addition. This call must come
+	 * after dpm_sysfs_add() and before kobject_uevent().
+	 */
+	bus_notify(dev, BUS_NOTIFY_ADD_DEVICE);
+	kobject_uevent(&dev->kobj, KOBJ_ADD);
+
 	bus_probe_device(dev);
 
 	/*
-- 
2.53.0.959.g497ff81fa9-goog