From nobody Sat Feb 7 20:39:39 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED626221F24
	for ; Tue, 28 Oct 2025 13:46:16 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1761659177; cv=none;
	b=hcevC4zRzuEXqUFFraQ0b2UJXWCgdj2u7TO4Cy0Fp1CpdoTe5flol+bkiEv/OuvSgi2up8AOsRLmj66CqUpwZ00eHafgKjo9RMeDmFFWUiELWmfSV+XZPavis/XkyiVOzl/XWf1FGBmm/WESkr/CoP8qunhkMp5h35Q5gETvP8Q=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1761659177; c=relaxed/simple;
	bh=w7f3xidSiMuMp3mDXJlUUc5D0wccUhU3tupiMlxaRc0=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=CtnYX5IaPtf9uMSbLqzJ+Xj8ge+oyXH3zedYnbJDBVA9xhKUf7JLJwPa38np7a/3mmwYwjX5Lll8c34k533m4Fgpc4ifrtEmPssVNkMWkWcInzvQmSbVDT15cV5wL0qg7T0tCT6/SBNYVr2F2gxjhACNtFBtOOh8ak1t2QAhLNo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Zx7oSTRF;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Zx7oSTRF"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0478FC4CEF7;
	Tue, 28 Oct 2025 13:46:13 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1761659176;
	bh=w7f3xidSiMuMp3mDXJlUUc5D0wccUhU3tupiMlxaRc0=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=Zx7oSTRFwurmTdWXau1Z5IYE1gJR/tV9SO4p/Ycxp5J5gqUa4EVZiXTDQw7EMmww0
	 fJ7GOUecF3mfVIVyQvXD/Gn/rJ+UH6togC0DNUVAEc6v6Bdwm0TbtjyJkGmx2EC1Yx
	 e4Pb6WKzbseYp8fk0qIdvvJ7n7qR7sU0hZqco4RKlUyDLBlRJOmmGxAPQqfW67/7bc
	 W09jBQgot9r1DgricVqPjJ/MIM+WgBCtsNyGvo/q+//c54kQdrV++ZDInpwlxDchbR
	 EkidB072S6NQlCijuPnBx8TeEivoLS9NLH6vPrIf1JlN9NuAlS0U4XnJVQHir6lN9l
	 BISEH5KanAWQw==
From: Philipp Stanner
To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter,
	tursulin@ursulin.net
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] drm/sched: Fix comment in drm_sched_run_job_work()
Date: Tue, 28 Oct 2025 14:46:01 +0100
Message-ID: <20251028134602.94125-3-phasta@kernel.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20251028134602.94125-2-phasta@kernel.org>
References: <20251028134602.94125-2-phasta@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

drm_sched_run_job_work() contains a comment which explains that an
entity being NULL means that there is no more work to do. It can,
however, also mean that there is work, but the scheduler doesn't have
enough credits to process the jobs right now.

Provide this detail in the comment.

Signed-off-by: Philipp Stanner
Reviewed-by: Matthew Brost
---
 drivers/gpu/drm/scheduler/sched_main.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c39f0245e3a9..492e8af639db 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1237,8 +1237,13 @@ static void drm_sched_run_job_work(struct work_struct *w)
 
 	/* Find entity with a ready job */
 	entity = drm_sched_select_entity(sched);
-	if (!entity)
-		return; /* No more work */
+	if (!entity) {
+		/*
+		 * Either no more work to do, or the next ready job needs more
+		 * credits than the scheduler has currently available.
+		 */
+		return;
+	}
 
 	sched_job = drm_sched_entity_pop_job(entity);
 	if (!sched_job) {
-- 
2.49.0

From nobody Sat Feb 7 20:39:39 2026
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD002221F24
	for ; Tue, 28 Oct 2025 13:46:20 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1761659180; cv=none;
	b=IaZXrZWUVY8LEurGt5W9vZaW8EK0XCNZiTjXFysJAe+AZn914FDJe3RZPJVWcCBNfAZMDIMjssilm3ar5jYUuVqalMMu9/N+a6BYH68eWP74Nw52AhFCGTOOL80g5JfHv0n/tyyW8sXODV2xXciqiSwSNdg11oOWhKDgcYBswCI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1761659180; c=relaxed/simple;
	bh=VoiscabQPRor6DkMinX+/cG24TnZhK5eJDrniLN5DIE=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=Y/V4oqDcA3Kh97JVTaz5VOvjZwMRjDg06rOykvQhD8iuT5wt0oG6YdEOZP67RgVFJnNiVHCoFxTJx79alf3fPsU08YL7r+TmK6ksmQqhhH+JHwXz+5TWdZJgQDiOaE1s/tVhDLzWxTfag2Be1vtrU/O10UUzEKj/wDL+AWi3840=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ILl/nI7k;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ILl/nI7k"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6114EC4CEFD;
	Tue, 28 Oct 2025 13:46:17 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1761659180;
	bh=VoiscabQPRor6DkMinX+/cG24TnZhK5eJDrniLN5DIE=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=ILl/nI7kFfekVqR2DFI7+J9Vg2cPFUrSZJ3SvXzxDzW11ybKisnkQ/C3oVxqk41Mp
	 45yofm9nVnEgjeseh9U/naHjYrvZkNZVdpFEx5SEjVNE+iBu66XfdKxnc0MbKUL2Gg
	 4rsbSoqL0dlQ0eykSGypfWH/Qy/c+3u1WS6oEwODX7h55tQsyGhrEcoNOylp4H+c1L
	 sCRcouX2c+KzRMBMt5rEEXeEj0rPLlOqbYVcI2iQ1rTzn78XbP8lwokHcEWrkXsRjq
	 ynqpusAbOeQY/DUJRVhdgKv5UBsaYEqNPi5Q1106Pqug51EZ4+gdiZ1lmw/coAjzKF
	 2EObUYJB75QmQ==
From: Philipp Stanner
To: Matthew Brost, Danilo Krummrich, Philipp Stanner,
	Christian König, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter,
	tursulin@ursulin.net
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] drm/sched: Add FIXME detailing potential hang
Date: Tue, 28 Oct 2025 14:46:02 +0100
Message-ID: <20251028134602.94125-4-phasta@kernel.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20251028134602.94125-2-phasta@kernel.org>
References: <20251028134602.94125-2-phasta@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

If a job from a ready entity needs more credits than are currently
available, drm_sched_run_job_work() (a work item) simply returns and
doesn't reschedule itself. The scheduler is only woken up again when the
next job gets pushed with drm_sched_entity_push_job().

If someone submits a job that needs too many credits and doesn't submit
more jobs afterwards, this would lead to the scheduler never pulling the
too-expensive job, effectively hanging forever.

Document this problem as a FIXME.

Signed-off-by: Philipp Stanner
---
 drivers/gpu/drm/scheduler/sched_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 492e8af639db..eaf8d17b2a66 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1237,6 +1237,16 @@ static void drm_sched_run_job_work(struct work_struct *w)
 
 	/* Find entity with a ready job */
 	entity = drm_sched_select_entity(sched);
+	/*
+	 * FIXME:
+	 * The entity can be NULL when the scheduler currently has no capacity
+	 * (credits) for more jobs. If that happens, the work item terminates
+	 * itself here, without rescheduling itself.
+	 *
+	 * It only gets started again in drm_sched_entity_push_job(). IOW, the
+	 * scheduler might hang forever if a job that needs too many credits
+	 * gets submitted to an entity and no other, subsequent jobs are.
+	 */
 	if (!entity) {
 		/*
 		 * Either no more work to do, or the next ready job needs more
-- 
2.49.0
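
Illustration (not part of either patch): the scenario from the FIXME can be
sketched as a toy user-space model. This is only a sketch under stated
assumptions. Every toy_* name is hypothetical and none of it is kernel code;
the model takes the FIXME at face value in assuming that only a push restarts
the work item, and it uses a single FIFO queue in place of entities.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_QUEUE_LEN 8

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_sched {
	unsigned int credit_limit;         /* credits the scheduler can hand out */
	unsigned int credits_in_use;       /* credits held by running jobs */
	unsigned int queue[TOY_QUEUE_LEN]; /* credits needed per queued job */
	unsigned int head, tail;           /* FIFO indices into queue[] */
};

/* Number of jobs that were pushed but never started. */
unsigned int toy_pending(const struct toy_sched *s)
{
	return s->tail - s->head;
}

/* Stand-in for the work item: start jobs while they fit, then return. */
void toy_run_job_work(struct toy_sched *s)
{
	while (s->head != s->tail) {
		unsigned int need = s->queue[s->head];

		/* Too expensive right now; return without rescheduling. */
		if (need > s->credit_limit - s->credits_in_use)
			return;
		s->credits_in_use += need;
		s->head++;
	}
}

/* Stand-in for a push: by assumption, the only thing that kicks the work. */
void toy_push_job(struct toy_sched *s, unsigned int credits)
{
	assert(s->tail < TOY_QUEUE_LEN);
	s->queue[s->tail++] = credits;
	toy_run_job_work(s);
}

/* A job finished: its credits return, but (by assumption) nothing wakes up. */
void toy_job_done(struct toy_sched *s, unsigned int credits)
{
	assert(credits <= s->credits_in_use);
	s->credits_in_use -= credits;
}

/* Replays the scenario; returns how many jobs are still stuck in the queue. */
unsigned int toy_demo(void)
{
	struct toy_sched s = { .credit_limit = 4 };

	toy_push_job(&s, 3); /* fits, starts immediately */
	toy_push_job(&s, 2); /* needs 2, only 1 available: work item returns */
	toy_job_done(&s, 3); /* credits are free again, but nobody re-runs the work */

	printf("pending after completion: %u\n", toy_pending(&s));
	return toy_pending(&s);
}
```

In this model the second job stays queued although enough credits are free
after the first one completes; only a further toy_push_job() call gets it
started. Whether the real scheduler has other wake-up paths is outside what
the sketch covers.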