drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
When entity move from one rq to another, current_entity will be set to NULL
if it is the moving entity. This make entities close to rq head got
selected more frequently, especially when doing load balance between
multiple drm_gpu_scheduler.
Make current_entity to next when removing from rq.
Signed-off-by: brolerliew <brolerliew@gmail.com>
---
drivers/gpu/drm/scheduler/sched_main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 2fab218d7082..00b22cc50f08 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq,
spin_lock(&rq->lock);
atomic_dec(rq->sched->score);
- list_del_init(&entity->list);
if (rq->current_entity == entity)
- rq->current_entity = NULL;
+ rq->current_entity = list_next_entry(entity, list);
+
+ list_del_init(&entity->list);
if (drm_sched_policy == DRM_SCHED_POLICY_FIFO)
drm_sched_rq_remove_fifo_locked(entity);
--
2.34.1
+ Luben On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: > > When entity move from one rq to another, current_entity will be set to NULL > if it is the moving entity. This make entities close to rq head got > selected more frequently, especially when doing load balance between > multiple drm_gpu_scheduler. > > Make current_entity to next when removing from rq. > > Signed-off-by: brolerliew <brolerliew@gmail.com> > --- > drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c > index 2fab218d7082..00b22cc50f08 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, > spin_lock(&rq->lock); > > atomic_dec(rq->sched->score); > - list_del_init(&entity->list); > > if (rq->current_entity == entity) > - rq->current_entity = NULL; > + rq->current_entity = list_next_entry(entity, list); > + > + list_del_init(&entity->list); > > if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) > drm_sched_rq_remove_fifo_locked(entity); > -- > 2.34.1 >
Looking... Regards, Luben On 2022-10-25 09:35, Alex Deucher wrote: > + Luben > > On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: >> >> When entity move from one rq to another, current_entity will be set to NULL >> if it is the moving entity. This make entities close to rq head got >> selected more frequently, especially when doing load balance between >> multiple drm_gpu_scheduler. >> >> Make current_entity to next when removing from rq. >> >> Signed-off-by: brolerliew <brolerliew@gmail.com> >> --- >> drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >> index 2fab218d7082..00b22cc50f08 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, >> spin_lock(&rq->lock); >> >> atomic_dec(rq->sched->score); >> - list_del_init(&entity->list); >> >> if (rq->current_entity == entity) >> - rq->current_entity = NULL; >> + rq->current_entity = list_next_entry(entity, list); >> + >> + list_del_init(&entity->list); >> >> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >> drm_sched_rq_remove_fifo_locked(entity); >> -- >> 2.34.1 >>
On 2022-10-25 13:50, Luben Tuikov wrote: > Looking... > > Regards, > Luben > > On 2022-10-25 09:35, Alex Deucher wrote: >> + Luben >> >> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: >>> >>> When entity move from one rq to another, current_entity will be set to NULL >>> if it is the moving entity. This make entities close to rq head got >>> selected more frequently, especially when doing load balance between >>> multiple drm_gpu_scheduler. >>> >>> Make current_entity to next when removing from rq. >>> >>> Signed-off-by: brolerliew <brolerliew@gmail.com> >>> --- >>> drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>> index 2fab218d7082..00b22cc50f08 100644 >>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, >>> spin_lock(&rq->lock); >>> >>> atomic_dec(rq->sched->score); >>> - list_del_init(&entity->list); >>> >>> if (rq->current_entity == entity) >>> - rq->current_entity = NULL; >>> + rq->current_entity = list_next_entry(entity, list); >>> + >>> + list_del_init(&entity->list); >>> >>> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>> drm_sched_rq_remove_fifo_locked(entity); >>> -- >>> 2.34.1 >>> > Looks good. I'll pick it up into some other changes I've in tow, and repost along with my changes, as they're somewhat related. Regards, Luben
On 2022-10-27 03:01, Luben Tuikov wrote: > On 2022-10-25 13:50, Luben Tuikov wrote: >> Looking... >> >> Regards, >> Luben >> >> On 2022-10-25 09:35, Alex Deucher wrote: >>> + Luben >>> >>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: >>>> >>>> When entity move from one rq to another, current_entity will be set to NULL >>>> if it is the moving entity. This make entities close to rq head got >>>> selected more frequently, especially when doing load balance between >>>> multiple drm_gpu_scheduler. >>>> >>>> Make current_entity to next when removing from rq. >>>> >>>> Signed-off-by: brolerliew <brolerliew@gmail.com> >>>> --- >>>> drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- >>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>>> index 2fab218d7082..00b22cc50f08 100644 >>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, >>>> spin_lock(&rq->lock); >>>> >>>> atomic_dec(rq->sched->score); >>>> - list_del_init(&entity->list); >>>> >>>> if (rq->current_entity == entity) >>>> - rq->current_entity = NULL; >>>> + rq->current_entity = list_next_entry(entity, list); >>>> + >>>> + list_del_init(&entity->list); >>>> >>>> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>>> drm_sched_rq_remove_fifo_locked(entity); >>>> -- >>>> 2.34.1 >>>> >> > > Looks good. I'll pick it up into some other changes I've in tow, and repost > along with my changes, as they're somewhat related. Actually, the more I look at it, the more I think that we do want to set rq->current_entity to NULL in that function, in order to pick the next best entity (or scheduler for that matter), the next time around. See sched_entity.c, and drm_sched_rq_select_entity() where we start evaluating from the _next_ entity. So, it is best to leave it to set it to NULL, for now. Regards, Luben
Am 27.10.22 um 10:07 schrieb Luben Tuikov: > On 2022-10-27 03:01, Luben Tuikov wrote: >> On 2022-10-25 13:50, Luben Tuikov wrote: >>> Looking... >>> >>> Regards, >>> Luben >>> >>> On 2022-10-25 09:35, Alex Deucher wrote: >>>> + Luben >>>> >>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: >>>>> When entity move from one rq to another, current_entity will be set to NULL >>>>> if it is the moving entity. This make entities close to rq head got >>>>> selected more frequently, especially when doing load balance between >>>>> multiple drm_gpu_scheduler. >>>>> >>>>> Make current_entity to next when removing from rq. >>>>> >>>>> Signed-off-by: brolerliew <brolerliew@gmail.com> >>>>> --- >>>>> drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- >>>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>>>> index 2fab218d7082..00b22cc50f08 100644 >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, >>>>> spin_lock(&rq->lock); >>>>> >>>>> atomic_dec(rq->sched->score); >>>>> - list_del_init(&entity->list); >>>>> >>>>> if (rq->current_entity == entity) >>>>> - rq->current_entity = NULL; >>>>> + rq->current_entity = list_next_entry(entity, list); >>>>> + >>>>> + list_del_init(&entity->list); >>>>> >>>>> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>>>> drm_sched_rq_remove_fifo_locked(entity); >>>>> -- >>>>> 2.34.1 >>>>> >> Looks good. I'll pick it up into some other changes I've in tow, and repost >> along with my changes, as they're somewhat related. > Actually, the more I look at it, the more I think that we do want to set > rq->current_entity to NULL in that function, in order to pick the next best entity > (or scheduler for that matter), the next time around. See sched_entity.c, > and drm_sched_rq_select_entity() where we start evaluating from the _next_ > entity. > > So, it is best to leave it to set it to NULL, for now. Apart from that this patch here could cause a crash when the entity is the last one in the list. In this case current current_entity would be set to an incorrect upcast of the head of the list. Regards, Christian. > > Regards, > Luben >
On 2022-10-27 04:19, Christian König wrote: > Am 27.10.22 um 10:07 schrieb Luben Tuikov: >> On 2022-10-27 03:01, Luben Tuikov wrote: >>> On 2022-10-25 13:50, Luben Tuikov wrote: >>>> Looking... >>>> >>>> Regards, >>>> Luben >>>> >>>> On 2022-10-25 09:35, Alex Deucher wrote: >>>>> + Luben >>>>> >>>>> On Tue, Oct 25, 2022 at 2:55 AM brolerliew <brolerliew@gmail.com> wrote: >>>>>> When entity move from one rq to another, current_entity will be set to NULL >>>>>> if it is the moving entity. This make entities close to rq head got >>>>>> selected more frequently, especially when doing load balance between >>>>>> multiple drm_gpu_scheduler. >>>>>> >>>>>> Make current_entity to next when removing from rq. >>>>>> >>>>>> Signed-off-by: brolerliew <brolerliew@gmail.com> >>>>>> --- >>>>>> drivers/gpu/drm/scheduler/sched_main.c | 5 +++-- >>>>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c >>>>>> index 2fab218d7082..00b22cc50f08 100644 >>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c >>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c >>>>>> @@ -168,10 +168,11 @@ void drm_sched_rq_remove_entity(struct drm_sched_rq *rq, >>>>>> spin_lock(&rq->lock); >>>>>> >>>>>> atomic_dec(rq->sched->score); >>>>>> - list_del_init(&entity->list); >>>>>> >>>>>> if (rq->current_entity == entity) >>>>>> - rq->current_entity = NULL; >>>>>> + rq->current_entity = list_next_entry(entity, list); >>>>>> + >>>>>> + list_del_init(&entity->list); >>>>>> >>>>>> if (drm_sched_policy == DRM_SCHED_POLICY_FIFO) >>>>>> drm_sched_rq_remove_fifo_locked(entity); >>>>>> -- >>>>>> 2.34.1 >>>>>> >>> Looks good. I'll pick it up into some other changes I've in tow, and repost >>> along with my changes, as they're somewhat related. >> Actually, the more I look at it, the more I think that we do want to set >> rq->current_entity to NULL in that function, in order to pick the next best entity >> (or scheduler for that matter), the next time around. See sched_entity.c, >> and drm_sched_rq_select_entity() where we start evaluating from the _next_ >> entity. >> >> So, it is best to leave it to set it to NULL, for now. > > Apart from that this patch here could cause a crash when the entity is > the last one in the list. > > In this case current current_entity would be set to an incorrect upcast > of the head of the list. Absolutely. I saw that, but in rejecting the patch, I didn't feel the need to mention it. Thanks for looking into this. Regards, Luben
© 2016 - 2026 Red Hat, Inc.