From: Philipp Stanner
To: Matthew Brost, Danilo Krummrich, Philipp Stanner, Christian König
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH] drm/sched: Remove racy hack from drm_sched_fini()
Date: Thu, 8 Jan 2026 09:30:20 +0100
Message-ID: <20260108083019.63532-2-phasta@kernel.org>

drm_sched_fini() contained a hack to work around a race in amdgpu.
According to AMD, the hack should not be necessary anymore. In case
there were undetected users,

commit 975ca62a014c ("drm/sched: Add warning for removing hack in drm_sched_fini()")

added a warning one release cycle ago.

Thus, it can be concluded that the hack can now be safely removed.

Remove the hack.

Signed-off-by: Philipp Stanner
---
As hinted at in the commit, I want to queue this one up comfortably for
the next merge window, since we have been printing that warning since
the last merge window already.

If someone has concerns, I'm also happy to delay this patch for a few
more releases.
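For illustration only, and not part of the patch: the FIXME comment
removed in the diff below names the lifetime rule that all entities
have to be torn down before their scheduler. A minimal userspace
sketch of that ordering could look as follows. All model_* names are
hypothetical and are not the drm/sched API; the sketch only models the
bookkeeping, with no locking and no real jobs.

```c
/* Userspace model only; the model_* names are hypothetical, not drm/sched. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_sched {
	int live_entities;	/* entities still attached to this scheduler */
	bool finished;		/* set once model_sched_fini() has run */
};

struct model_entity {
	struct model_sched *sched;	/* NULL once the entity is torn down */
};

static void model_sched_init(struct model_sched *sched)
{
	sched->live_entities = 0;
	sched->finished = false;
}

static void model_entity_init(struct model_entity *entity,
			      struct model_sched *sched)
{
	entity->sched = sched;
	sched->live_entities++;
}

static void model_entity_fini(struct model_entity *entity)
{
	assert(entity->sched != NULL);	/* double teardown is a caller bug */
	entity->sched->live_entities--;
	entity->sched = NULL;
}

/*
 * Returns true if the lifetime rule was respected, false if entities
 * were still alive when the scheduler was torn down.
 */
static bool model_sched_fini(struct model_sched *sched)
{
	bool clean = (sched->live_entities == 0);

	sched->finished = true;
	return clean;
}

/* Driver-style teardown in the order the lifetime rule asks for. */
static bool model_driver_teardown(struct model_sched *sched,
				  struct model_entity *entities, size_t count)
{
	for (size_t i = 0; i < count; i++)
		model_entity_fini(&entities[i]);

	return model_sched_fini(sched);
}
```

In this model, model_driver_teardown() reports a clean teardown,
whereas calling model_sched_fini() while an entity is still attached
returns false. That second case is the situation the removed hack
warned about.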
---
 drivers/gpu/drm/scheduler/sched_main.c | 38 +-------------------------
 1 file changed, 1 insertion(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 1d4f1b822e7b..381c1694a12e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1416,48 +1416,12 @@ static void drm_sched_cancel_remaining_jobs(struct drm_gpu_scheduler *sched)
  */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
-	struct drm_sched_entity *s_entity;
 	int i;
 
 	drm_sched_wqueue_stop(sched);
 
-	for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++) {
-		struct drm_sched_rq *rq = sched->sched_rq[i];
-
-		spin_lock(&rq->lock);
-		list_for_each_entry(s_entity, &rq->entities, list) {
-			/*
-			 * Prevents reinsertion and marks job_queue as idle,
-			 * it will be removed from the rq in drm_sched_entity_fini()
-			 * eventually
-			 *
-			 * FIXME:
-			 * This lacks the proper spin_lock(&s_entity->lock) and
-			 * is, therefore, a race condition. Most notably, it
-			 * can race with drm_sched_entity_push_job(). The lock
-			 * cannot be taken here, however, because this would
-			 * lead to lock inversion -> deadlock.
-			 *
-			 * The best solution probably is to enforce the life
-			 * time rule of all entities having to be torn down
-			 * before their scheduler. Then, however, locking could
-			 * be dropped alltogether from this function.
-			 *
-			 * For now, this remains a potential race in all
-			 * drivers that keep entities alive for longer than
-			 * the scheduler.
-			 *
-			 * The READ_ONCE() is there to make the lockless read
-			 * (warning about the lockless write below) slightly
-			 * less broken...
-			 */
-			if (!READ_ONCE(s_entity->stopped))
-				dev_warn(sched->dev, "Tearing down scheduler with active entities!\n");
-			s_entity->stopped = true;
-		}
-		spin_unlock(&rq->lock);
+	for (i = DRM_SCHED_PRIORITY_KERNEL; i < sched->num_rqs; i++)
 		kfree(sched->sched_rq[i]);
-	}
 
 	/* Wakeup everyone stuck in drm_sched_entity_flush for this scheduler */
 	wake_up_all(&sched->job_scheduled);
-- 
2.49.0