From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
    Matthew Brost, Philipp Stanner, Christian König, Maarten Lankhorst,
    Maxime Ripard, Thomas Zimmermann, Sumit Semwal, Tvrtko Ursulin,
    Pierre-Eric Pelloux-Prayer
Cc: dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, Maíra Canal
Subject: [PATCH v3 1/7] drm/sched: Avoid memory leaks with cancel_job() callback
Date: Wed, 9 Jul 2025 13:52:51 +0200
Message-ID: <20250709115257.106370-3-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

Since its inception, the GPU scheduler can leak memory if the driver
calls drm_sched_fini() while there are still jobs in flight.

The simplest way to solve this in a backwards-compatible manner is by
adding a new callback, drm_sched_backend_ops.cancel_job(), which
instructs the driver to signal the hardware fence associated with the
job. Afterwards, the scheduler can safely use the established
free_job() callback for freeing the job.

Implement the new backend_ops callback cancel_job().
Suggested-by: Tvrtko Ursulin
Link: https://lore.kernel.org/dri-devel/20250418113211.69956-1-tvrtko.ursulin@igalia.com/
Signed-off-by: Philipp Stanner
Reviewed-by: Maíra Canal
---
 drivers/gpu/drm/scheduler/sched_main.c | 34 ++++++++++++++++----------
 include/drm/gpu_scheduler.h            | 18 ++++++++++++++
 2 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 81ad40d9582b..a971f0c9e6e0 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1352,6 +1352,18 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
 }
 EXPORT_SYMBOL(drm_sched_init);
 
+static void drm_sched_cancel_remaining_jobs(struct drm_gpu_scheduler *sched)
+{
+	struct drm_sched_job *job, *tmp;
+
+	/* All other accessors are stopped. No locking necessary. */
+	list_for_each_entry_safe_reverse(job, tmp, &sched->pending_list, list) {
+		sched->ops->cancel_job(job);
+		list_del(&job->list);
+		sched->ops->free_job(job);
+	}
+}
+
 /**
  * drm_sched_fini - Destroy a gpu scheduler
  *
@@ -1359,19 +1371,11 @@ EXPORT_SYMBOL(drm_sched_init);
  *
  * Tears down and cleans up the scheduler.
  *
- * This stops submission of new jobs to the hardware through
- * drm_sched_backend_ops.run_job(). Consequently, drm_sched_backend_ops.free_job()
- * will not be called for all jobs still in drm_gpu_scheduler.pending_list.
- * There is no solution for this currently. Thus, it is up to the driver to make
- * sure that:
- *
- * a) drm_sched_fini() is only called after for all submitted jobs
- *    drm_sched_backend_ops.free_job() has been called or that
- * b) the jobs for which drm_sched_backend_ops.free_job() has not been called
- *    after drm_sched_fini() ran are freed manually.
- *
- * FIXME: Take care of the above problem and prevent this function from leaking
- * the jobs in drm_gpu_scheduler.pending_list under any circumstances.
+ * This stops submission of new jobs to the hardware through &struct
+ * drm_sched_backend_ops.run_job. If &struct drm_sched_backend_ops.cancel_job
+ * is implemented, all jobs will be canceled through it and afterwards cleaned
+ * up through &struct drm_sched_backend_ops.free_job. If cancel_job is not
+ * implemented, memory could leak.
 */
 void drm_sched_fini(struct drm_gpu_scheduler *sched)
 {
@@ -1401,6 +1405,10 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 	/* Confirm no work left behind accessing device structures */
 	cancel_delayed_work_sync(&sched->work_tdr);
 
+	/* Avoid memory leaks if supported by the driver. */
+	if (sched->ops->cancel_job)
+		drm_sched_cancel_remaining_jobs(sched);
+
 	if (sched->own_submit_wq)
 		destroy_workqueue(sched->submit_wq);
 	sched->ready = false;
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e62a7214e052..190844370f48 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -512,6 +512,24 @@ struct drm_sched_backend_ops {
 	 * and it's time to clean it up.
 	 */
 	void (*free_job)(struct drm_sched_job *sched_job);
+
+	/**
+	 * @cancel_job: Used by the scheduler to guarantee remaining jobs' fences
+	 * get signaled in drm_sched_fini().
+	 *
+	 * Used by the scheduler to cancel all jobs that have not been executed
+	 * with &struct drm_sched_backend_ops.run_job by the time
+	 * drm_sched_fini() gets invoked.
+	 *
+	 * Drivers need to signal the passed job's hardware fence with an
+	 * appropriate error code (e.g., -ECANCELED) in this callback. They
+	 * must not free the job.
+	 *
+	 * The scheduler will only call this callback once it stopped calling
+	 * all other callbacks forever, with the exception of &struct
+	 * drm_sched_backend_ops.free_job.
+	 */
+	void (*cancel_job)(struct drm_sched_job *sched_job);
 };
 
 /**
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
Subject: [PATCH v3 2/7] drm/sched/tests: Implement cancel_job() callback
Date: Wed, 9 Jul 2025 13:52:52 +0200
Message-ID: <20250709115257.106370-4-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

The GPU Scheduler now supports a new callback, cancel_job(), which lets
the scheduler cancel all jobs which might not yet be freed when
drm_sched_fini() runs. Using this callback allows for significantly
simplifying the mock scheduler teardown code.

Implement the cancel_job() callback and adjust the code where necessary.
Signed-off-by: Philipp Stanner
Reviewed-by: Tvrtko Ursulin
---
 .../gpu/drm/scheduler/tests/mock_scheduler.c  | 68 +++++++------------
 drivers/gpu/drm/scheduler/tests/sched_tests.h |  1 -
 2 files changed, 25 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
index 49d067fecd67..0d1d57213e05 100644
--- a/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
+++ b/drivers/gpu/drm/scheduler/tests/mock_scheduler.c
@@ -63,7 +63,7 @@ static void drm_mock_sched_job_complete(struct drm_mock_sched_job *job)
 	lockdep_assert_held(&sched->lock);
 
 	job->flags |= DRM_MOCK_SCHED_JOB_DONE;
-	list_move_tail(&job->link, &sched->done_list);
+	list_del(&job->link);
 	dma_fence_signal_locked(&job->hw_fence);
 	complete(&job->done);
 }
@@ -236,26 +236,41 @@ mock_sched_timedout_job(struct drm_sched_job *sched_job)
 
 static void mock_sched_free_job(struct drm_sched_job *sched_job)
 {
-	struct drm_mock_scheduler *sched =
-		drm_sched_to_mock_sched(sched_job->sched);
 	struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
-	unsigned long flags;
 
-	/* Remove from the scheduler done list. */
-	spin_lock_irqsave(&sched->lock, flags);
-	list_del(&job->link);
-	spin_unlock_irqrestore(&sched->lock, flags);
 	dma_fence_put(&job->hw_fence);
-
 	drm_sched_job_cleanup(sched_job);
 
 	/* Mock job itself is freed by the kunit framework.
 	 */
 }
 
+static void mock_sched_cancel_job(struct drm_sched_job *sched_job)
+{
+	struct drm_mock_scheduler *sched = drm_sched_to_mock_sched(sched_job->sched);
+	struct drm_mock_sched_job *job = drm_sched_job_to_mock_job(sched_job);
+	unsigned long flags;
+
+	hrtimer_cancel(&job->timer);
+
+	spin_lock_irqsave(&sched->lock, flags);
+	if (!dma_fence_is_signaled_locked(&job->hw_fence)) {
+		list_del(&job->link);
+		dma_fence_set_error(&job->hw_fence, -ECANCELED);
+		dma_fence_signal_locked(&job->hw_fence);
+	}
+	spin_unlock_irqrestore(&sched->lock, flags);
+
+	/*
+	 * The GPU Scheduler will call drm_sched_backend_ops.free_job(), still.
+	 * Mock job itself is freed by the kunit framework.
+	 */
+}
+
 static const struct drm_sched_backend_ops drm_mock_scheduler_ops = {
 	.run_job = mock_sched_run_job,
 	.timedout_job = mock_sched_timedout_job,
-	.free_job = mock_sched_free_job
+	.free_job = mock_sched_free_job,
+	.cancel_job = mock_sched_cancel_job,
 };
 
 /**
@@ -289,7 +304,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
 	sched->hw_timeline.context = dma_fence_context_alloc(1);
 	atomic_set(&sched->hw_timeline.next_seqno, 0);
 	INIT_LIST_HEAD(&sched->job_list);
-	INIT_LIST_HEAD(&sched->done_list);
 	spin_lock_init(&sched->lock);
 
 	return sched;
@@ -304,38 +318,6 @@ struct drm_mock_scheduler *drm_mock_sched_new(struct kunit *test, long timeout)
 */
 void drm_mock_sched_fini(struct drm_mock_scheduler *sched)
 {
-	struct drm_mock_sched_job *job, *next;
-	unsigned long flags;
-	LIST_HEAD(list);
-
-	drm_sched_wqueue_stop(&sched->base);
-
-	/* Force complete all unfinished jobs.
-	 */
-	spin_lock_irqsave(&sched->lock, flags);
-	list_for_each_entry_safe(job, next, &sched->job_list, link)
-		list_move_tail(&job->link, &list);
-	spin_unlock_irqrestore(&sched->lock, flags);
-
-	list_for_each_entry(job, &list, link)
-		hrtimer_cancel(&job->timer);
-
-	spin_lock_irqsave(&sched->lock, flags);
-	list_for_each_entry_safe(job, next, &list, link)
-		drm_mock_sched_job_complete(job);
-	spin_unlock_irqrestore(&sched->lock, flags);
-
-	/*
-	 * Free completed jobs and jobs not yet processed by the DRM scheduler
-	 * free worker.
-	 */
-	spin_lock_irqsave(&sched->lock, flags);
-	list_for_each_entry_safe(job, next, &sched->done_list, link)
-		list_move_tail(&job->link, &list);
-	spin_unlock_irqrestore(&sched->lock, flags);
-
-	list_for_each_entry_safe(job, next, &list, link)
-		mock_sched_free_job(&job->base);
-
 	drm_sched_fini(&sched->base);
 }
 
diff --git a/drivers/gpu/drm/scheduler/tests/sched_tests.h b/drivers/gpu/drm/scheduler/tests/sched_tests.h
index fbba38137f0c..0eddfb8d89e6 100644
--- a/drivers/gpu/drm/scheduler/tests/sched_tests.h
+++ b/drivers/gpu/drm/scheduler/tests/sched_tests.h
@@ -49,7 +49,6 @@ struct drm_mock_scheduler {
 
 	spinlock_t lock;
 	struct list_head job_list;
-	struct list_head done_list;
 
 	struct {
 		u64 context;
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
Subject: [PATCH v3 3/7] drm/sched/tests: Add unit test for cancel_job()
Date: Wed, 9 Jul 2025 13:52:53 +0200
Message-ID: <20250709115257.106370-5-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

The scheduler unit tests now provide a new callback, cancel_job(). This
callback gets used by drm_sched_fini() for all still pending jobs to
cancel them.

Implement a new unit test to test this.

Signed-off-by: Philipp Stanner
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/scheduler/tests/tests_basic.c | 42 +++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/tests/tests_basic.c b/drivers/gpu/drm/scheduler/tests/tests_basic.c
index 7230057e0594..b1ae10c6bb37 100644
--- a/drivers/gpu/drm/scheduler/tests/tests_basic.c
+++ b/drivers/gpu/drm/scheduler/tests/tests_basic.c
@@ -204,6 +204,47 @@ static struct kunit_suite drm_sched_basic = {
 	.test_cases = drm_sched_basic_tests,
 };
 
+static void drm_sched_basic_cancel(struct kunit *test)
+{
+	struct drm_mock_sched_entity *entity;
+	struct drm_mock_scheduler *sched;
+	struct drm_mock_sched_job *job;
+	bool done;
+
+	/*
+	 * Check that drm_sched_fini() uses the cancel_job() callback to cancel
+	 * jobs that are still pending.
+	 */
+
+	sched = drm_mock_sched_new(test, MAX_SCHEDULE_TIMEOUT);
+	entity = drm_mock_sched_entity_new(test, DRM_SCHED_PRIORITY_NORMAL,
+					   sched);
+
+	job = drm_mock_sched_job_new(test, entity);
+
+	drm_mock_sched_job_submit(job);
+
+	done = drm_mock_sched_job_wait_scheduled(job, HZ);
+	KUNIT_ASSERT_TRUE(test, done);
+
+	drm_mock_sched_entity_free(entity);
+	drm_mock_sched_fini(sched);
+
+	KUNIT_ASSERT_EQ(test, job->hw_fence.error, -ECANCELED);
+}
+
+static struct kunit_case drm_sched_cancel_tests[] = {
+	KUNIT_CASE(drm_sched_basic_cancel),
+	{}
+};
+
+static struct kunit_suite drm_sched_cancel = {
+	.name = "drm_sched_basic_cancel_tests",
+	.init = drm_sched_basic_init,
+	.exit = drm_sched_basic_exit,
+	.test_cases = drm_sched_cancel_tests,
+};
+
 static void drm_sched_basic_timeout(struct kunit *test)
 {
 	struct drm_mock_scheduler *sched = test->priv;
@@ -471,6 +512,7 @@ static struct kunit_suite drm_sched_credits = {
 
 kunit_test_suites(&drm_sched_basic,
 		  &drm_sched_timeout,
+		  &drm_sched_cancel,
 		  &drm_sched_priority,
 		  &drm_sched_modify_sched,
 		  &drm_sched_credits);
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
Subject: [PATCH v3 4/7] drm/sched: Warn if pending_list is not empty
Date: Wed, 9 Jul 2025 13:52:54 +0200
Message-ID: <20250709115257.106370-6-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

drm_sched_fini() can leak jobs under certain circumstances.

Warn if that happens.

Signed-off-by: Philipp Stanner
---
 drivers/gpu/drm/scheduler/sched_main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index a971f0c9e6e0..93cb74c0ccf8 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1414,6 +1414,9 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched)
 	sched->ready = false;
 	kfree(sched->sched_rq);
 	sched->sched_rq = NULL;
+
+	if (!list_empty(&sched->pending_list))
+		dev_err(sched->dev, "Tearing down scheduler while jobs are pending!\n");
 }
 EXPORT_SYMBOL(drm_sched_fini);
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
Subject: [PATCH v3 5/7] drm/nouveau: Make fence container helper usable driver-wide
Date: Wed, 9 Jul 2025 13:52:55 +0200
Message-ID: <20250709115257.106370-7-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

In order to implement a new DRM GPU scheduler callback in Nouveau, a
helper for obtaining a nouveau_fence from a dma_fence is necessary.
Such a helper exists already inside nouveau_fence.c, called
from_fence().

Make that helper available to other C files with a more precise name.

Signed-off-by: Philipp Stanner
Acked-by: Danilo Krummrich
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 20 +++++++-------------
 drivers/gpu/drm/nouveau/nouveau_fence.h |  6 ++++++
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index d5654e26d5bc..869d4335c0f4 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -38,12 +38,6 @@
 static const struct dma_fence_ops nouveau_fence_ops_uevent;
 static const struct dma_fence_ops nouveau_fence_ops_legacy;
 
-static inline struct nouveau_fence *
-from_fence(struct dma_fence *fence)
-{
-	return container_of(fence, struct nouveau_fence, base);
-}
-
 static inline struct nouveau_fence_chan *
 nouveau_fctx(struct nouveau_fence *fence)
 {
@@ -77,7 +71,7 @@ nouveau_local_fence(struct dma_fence *fence, struct nouveau_drm *drm)
 	    fence->ops != &nouveau_fence_ops_uevent)
 		return NULL;
 
-	return from_fence(fence);
+	return to_nouveau_fence(fence);
 }
 
 void
@@ -268,7 +262,7 @@ nouveau_fence_done(struct nouveau_fence *fence)
 static long
 nouveau_fence_wait_legacy(struct dma_fence *f, bool intr, long wait)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 	unsigned long sleep_time = NSEC_PER_MSEC / 1000;
 	unsigned long t = jiffies, timeout = t + wait;
 
@@ -448,7 +442,7 @@ static const char *nouveau_fence_get_get_driver_name(struct dma_fence *fence)
 
 static const char *nouveau_fence_get_timeline_name(struct dma_fence *f)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
 
 	return !fctx->dead ? fctx->name : "dead channel";
@@ -462,7 +456,7 @@ static const char *nouveau_fence_get_timeline_name(struct dma_fence *f)
 */
 static bool nouveau_fence_is_signaled(struct dma_fence *f)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
 	struct nouveau_channel *chan;
 	bool ret = false;
@@ -478,7 +472,7 @@ static bool nouveau_fence_is_signaled(struct dma_fence *f)
 
 static bool nouveau_fence_no_signaling(struct dma_fence *f)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 
 	/*
 	 * caller should have a reference on the fence,
@@ -503,7 +497,7 @@ static bool nouveau_fence_no_signaling(struct dma_fence *f)
 
 static void nouveau_fence_release(struct dma_fence *f)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
 
 	kref_put(&fctx->fence_ref, nouveau_fence_context_put);
@@ -521,7 +515,7 @@ static const struct dma_fence_ops nouveau_fence_ops_legacy = {
 
 static bool nouveau_fence_enable_signaling(struct dma_fence *f)
 {
-	struct nouveau_fence *fence = from_fence(f);
+	struct nouveau_fence *fence = to_nouveau_fence(f);
 	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
 	bool ret;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 6a983dd9f7b9..183dd43ecfff 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -17,6 +17,12 @@ struct nouveau_fence {
 	unsigned long timeout;
 };
 
+static inline struct nouveau_fence *
+to_nouveau_fence(struct dma_fence *fence)
+{
+	return container_of(fence, struct nouveau_fence, base);
+}
+
 int  nouveau_fence_create(struct nouveau_fence **, struct nouveau_channel *);
 int  nouveau_fence_new(struct nouveau_fence **, struct nouveau_channel *);
 void nouveau_fence_unref(struct nouveau_fence **);
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
Subject: [PATCH v3 6/7] drm/nouveau: Add new callback for scheduler teardown
Date: Wed, 9 Jul 2025 13:52:56 +0200
Message-ID: <20250709115257.106370-8-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

There is a new callback for always tearing the scheduler down in a
leak-free, deadlock-free manner.

Port Nouveau as its first user by providing the scheduler with a
callback that ensures the fence context gets killed in
drm_sched_fini().
Signed-off-by: Philipp Stanner
Acked-by: Danilo Krummrich
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 15 +++++++++++++++
 drivers/gpu/drm/nouveau/nouveau_fence.h |  1 +
 drivers/gpu/drm/nouveau/nouveau_sched.c | 15 ++++++++++++++-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 869d4335c0f4..9f345a008717 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -240,6 +240,21 @@ nouveau_fence_emit(struct nouveau_fence *fence)
 	return ret;
 }
 
+void
+nouveau_fence_cancel(struct nouveau_fence *fence)
+{
+	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
+	unsigned long flags;
+
+	spin_lock_irqsave(&fctx->lock, flags);
+	if (!dma_fence_is_signaled_locked(&fence->base)) {
+		dma_fence_set_error(&fence->base, -ECANCELED);
+		if (nouveau_fence_signal(fence))
+			nvif_event_block(&fctx->event);
+	}
+	spin_unlock_irqrestore(&fctx->lock, flags);
+}
+
 bool
 nouveau_fence_done(struct nouveau_fence *fence)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 183dd43ecfff..9957a919bd38 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -29,6 +29,7 @@ void nouveau_fence_unref(struct nouveau_fence **);
 
 int  nouveau_fence_emit(struct nouveau_fence *);
 bool nouveau_fence_done(struct nouveau_fence *);
+void nouveau_fence_cancel(struct nouveau_fence *fence);
 int  nouveau_fence_wait(struct nouveau_fence *, bool lazy, bool intr);
 int  nouveau_fence_sync(struct nouveau_bo *, struct nouveau_channel *, bool exclusive, bool intr);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 460a5fb02412..2ec62059c351 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -11,6 +11,7 @@
 #include "nouveau_exec.h"
 #include "nouveau_abi16.h"
 #include "nouveau_sched.h"
+#include "nouveau_chan.h"
 
 #define NOUVEAU_SCHED_JOB_TIMEOUT_MS		10000
 
@@ -393,10 +394,23 @@ nouveau_sched_free_job(struct drm_sched_job *sched_job)
 	nouveau_job_fini(job);
 }
 
+static void
+nouveau_sched_cancel_job(struct drm_sched_job *sched_job)
+{
+	struct nouveau_fence *fence;
+	struct nouveau_job *job;
+
+	job = to_nouveau_job(sched_job);
+	fence = to_nouveau_fence(job->done_fence);
+
+	nouveau_fence_cancel(fence);
+}
+
 static const struct drm_sched_backend_ops nouveau_sched_ops = {
 	.run_job = nouveau_sched_run_job,
 	.timedout_job = nouveau_sched_timedout_job,
 	.free_job = nouveau_sched_free_job,
+	.cancel_job = nouveau_sched_cancel_job,
 };
 
 static int
@@ -482,7 +496,6 @@ nouveau_sched_create(struct nouveau_sched **psched, struct nouveau_drm *drm,
 	return 0;
 }
 
-
 static void
 nouveau_sched_fini(struct nouveau_sched *sched)
 {
-- 
2.49.0

From nobody Tue Oct 7 13:10:18 2025
From: Philipp Stanner
To: Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
    Matthew Brost, Philipp Stanner, Christian König, Maarten Lankhorst,
    Maxime Ripard, Thomas Zimmermann, Sumit Semwal, Tvrtko Ursulin,
    Pierre-Eric Pelloux-Prayer
Cc: dri-devel@lists.freedesktop.org, nouveau@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, linux-media@vger.kernel.org
Subject: [PATCH v3 7/7] drm/nouveau: Remove waitque for sched teardown
Date: Wed, 9 Jul 2025 13:52:57 +0200
Message-ID: <20250709115257.106370-9-phasta@kernel.org>
In-Reply-To: <20250709115257.106370-2-phasta@kernel.org>
References: <20250709115257.106370-2-phasta@kernel.org>

struct nouveau_sched contains a waitque needed to prevent
drm_sched_fini() from being called while there are still jobs pending.
Until now, calling it early would have caused memory leaks.

With the new memleak-free mode of operation, switched on in
drm_sched_fini() by providing the callback nouveau_sched_cancel_job(),
the waitque is no longer necessary.

Remove the waitque.

Signed-off-by: Philipp Stanner
Acked-by: Danilo Krummrich
---
 drivers/gpu/drm/nouveau/nouveau_sched.c | 20 +++++++-------------
 drivers/gpu/drm/nouveau/nouveau_sched.h |  9 +++------
 drivers/gpu/drm/nouveau/nouveau_uvmm.c  |  8 ++++----
 3 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 2ec62059c351..7d9c3418e76b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -122,11 +122,9 @@ nouveau_job_done(struct nouveau_job *job)
 {
 	struct nouveau_sched *sched = job->sched;
 
-	spin_lock(&sched->job.list.lock);
+	spin_lock(&sched->job_list.lock);
 	list_del(&job->entry);
-	spin_unlock(&sched->job.list.lock);
-
-	wake_up(&sched->job.wq);
+	spin_unlock(&sched->job_list.lock);
 }
 
 void
@@ -307,9 +305,9 @@ nouveau_job_submit(struct nouveau_job *job)
 	}
 
 	/* Submit was successful; add the job to the schedulers job list. */
-	spin_lock(&sched->job.list.lock);
-	list_add(&job->entry, &sched->job.list.head);
-	spin_unlock(&sched->job.list.lock);
+	spin_lock(&sched->job_list.lock);
+	list_add(&job->entry, &sched->job_list.head);
+	spin_unlock(&sched->job_list.lock);
 
 	drm_sched_job_arm(&job->base);
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
@@ -460,9 +458,8 @@ nouveau_sched_init(struct nouveau_sched *sched, struct nouveau_drm *drm,
 		goto fail_sched;
 
 	mutex_init(&sched->mutex);
-	spin_lock_init(&sched->job.list.lock);
-	INIT_LIST_HEAD(&sched->job.list.head);
-	init_waitqueue_head(&sched->job.wq);
+	spin_lock_init(&sched->job_list.lock);
+	INIT_LIST_HEAD(&sched->job_list.head);
 
 	return 0;
 
@@ -502,9 +499,6 @@ nouveau_sched_fini(struct nouveau_sched *sched)
 	struct drm_gpu_scheduler *drm_sched = &sched->base;
 	struct drm_sched_entity *entity = &sched->entity;
 
-	rmb(); /* for list_empty to work without lock */
-	wait_event(sched->job.wq, list_empty(&sched->job.list.head));
-
 	drm_sched_entity_fini(entity);
 	drm_sched_fini(drm_sched);
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.h b/drivers/gpu/drm/nouveau/nouveau_sched.h
index 20cd1da8db73..b98c3f0bef30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.h
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.h
@@ -103,12 +103,9 @@ struct nouveau_sched {
 	struct mutex mutex;
 
 	struct {
-		struct {
-			struct list_head head;
-			spinlock_t lock;
-		} list;
-		struct wait_queue_head wq;
-	} job;
+		struct list_head head;
+		spinlock_t lock;
+	} job_list;
 };
 
 int nouveau_sched_create(struct nouveau_sched **psched, struct nouveau_drm *drm,
diff --git a/drivers/gpu/drm/nouveau/nouveau_uvmm.c b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
index 48f105239f42..ddfc46bc1b3e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_uvmm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_uvmm.c
@@ -1019,8 +1019,8 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
 	u64 end = addr + range;
 
 again:
-	spin_lock(&sched->job.list.lock);
-	list_for_each_entry(__job, &sched->job.list.head, entry) {
+	spin_lock(&sched->job_list.lock);
+	list_for_each_entry(__job, &sched->job_list.head, entry) {
 		struct nouveau_uvmm_bind_job *bind_job = to_uvmm_bind_job(__job);
 
 		list_for_each_op(op, &bind_job->ops) {
@@ -1030,7 +1030,7 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
 
 			if (!(end <= op_addr || addr >= op_end)) {
 				nouveau_uvmm_bind_job_get(bind_job);
-				spin_unlock(&sched->job.list.lock);
+				spin_unlock(&sched->job_list.lock);
 				wait_for_completion(&bind_job->complete);
 				nouveau_uvmm_bind_job_put(bind_job);
 				goto again;
@@ -1038,7 +1038,7 @@ bind_validate_map_sparse(struct nouveau_job *job, u64 addr, u64 range)
 			}
 		}
 	}
-	spin_unlock(&sched->job.list.lock);
+	spin_unlock(&sched->job_list.lock);
 }
 
 static int
-- 
2.49.0