From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
 George Dunlap, Andrew Cooper, Ian Jackson, Robert VanVossen, Tim Deegan,
 Julien Grall, Josh Whitehead, Meng Xu, Jan Beulich, Dario Faggioli
Date: Mon, 30 Sep 2019 07:21:18 +0200
Message-Id: <20190930052135.11257-3-jgross@suse.com>
In-Reply-To: <20190930052135.11257-1-jgross@suse.com>
References: <20190930052135.11257-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH v5 02/19] xen/sched: introduce unit_runnable_state()

Today the vcpu runstate of a newly scheduled vcpu is always set to
"running" even if at that time vcpu_runnable() is already returning
false due to a race (e.g. with pausing the vcpu). With core scheduling
this can no longer work as not all vcpus of a schedule unit have to be
"running" when being scheduled. So the vcpu's new runstate has to be
selected at the same time as the runnability of the related schedule
unit is probed.

For this purpose introduce a new helper unit_runnable_state() which
will save the new runstate of all tested vcpus in a new field of the
vcpu struct.

Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
RFC V2:
- new patch

V3:
- add vcpu loop to unit_runnable_state() right now instead of doing so
  in next patch (Jan Beulich, Dario Faggioli)
- make new_state unsigned int (Jan Beulich)

V4:
- add comment explaining unit_runnable_state() (Jan Beulich)
---
 xen/common/domain.c         |  1 +
 xen/common/sched_arinc653.c |  2 +-
 xen/common/sched_credit.c   | 49 ++++++++++++++++++++++++---------------------
 xen/common/sched_credit2.c  |  7 ++++---
 xen/common/sched_null.c     |  3 ++-
 xen/common/sched_rt.c       |  8 +++++++-
 xen/common/schedule.c       |  2 +-
 xen/include/xen/sched-if.h  | 30 +++++++++++++++++++++++++++
 xen/include/xen/sched.h     |  1 +
 9 files changed, 73 insertions(+), 30 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 601da28c9c..a9882509ed 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -157,6 +157,7 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
     if ( is_idle_domain(d) )
     {
         v->runstate.state = RUNSTATE_running;
+        v->new_state = RUNSTATE_running;
     }
     else
     {
diff --git a/xen/common/sched_arinc653.c b/xen/common/sched_arinc653.c
index fcf81db19a..dd5876eacd 100644
--- a/xen/common/sched_arinc653.c
+++ b/xen/common/sched_arinc653.c
@@ -563,7 +563,7 @@ a653sched_do_schedule(
     if ( !((new_task != NULL)
            && (AUNIT(new_task) != NULL)
            && AUNIT(new_task)->awake
-           && unit_runnable(new_task)) )
+           && unit_runnable_state(new_task)) )
         new_task = IDLETASK(cpu);
     BUG_ON(new_task == NULL);
 
diff --git a/xen/common/sched_credit.c b/xen/common/sched_credit.c
index 299eff21ac..00beac3ea4 100644
--- a/xen/common/sched_credit.c
+++ b/xen/common/sched_credit.c
@@ -1894,7 +1894,7 @@ static void csched_schedule(
     if ( !test_bit(CSCHED_FLAG_UNIT_YIELD, &scurr->flags)
          && !tasklet_work_scheduled
          && prv->ratelimit
-         && unit_runnable(unit)
+         && unit_runnable_state(unit)
          && !is_idle_unit(unit)
          && runtime < prv->ratelimit )
     {
@@ -1939,33 +1939,36 @@ static void csched_schedule(
         dec_nr_runnable(sched_cpu);
     }
 
-    snext = __runq_elem(runq->next);
-
-    /* Tasklet work (which runs in idle UNIT context) overrides all else. */
-    if ( tasklet_work_scheduled )
-    {
-        TRACE_0D(TRC_CSCHED_SCHED_TASKLET);
-        snext = CSCHED_UNIT(sched_idle_unit(sched_cpu));
-        snext->pri = CSCHED_PRI_TS_BOOST;
-    }
-
     /*
      * Clear YIELD flag before scheduling out
      */
     clear_bit(CSCHED_FLAG_UNIT_YIELD, &scurr->flags);
 
-    /*
-     * SMP Load balance:
-     *
-     * If the next highest priority local runnable UNIT has already eaten
-     * through its credits, look on other PCPUs to see if we have more
-     * urgent work... If not, csched_load_balance() will return snext, but
-     * already removed from the runq.
-     */
-    if ( snext->pri > CSCHED_PRI_TS_OVER )
-        __runq_remove(snext);
-    else
-        snext = csched_load_balance(prv, sched_cpu, snext, &migrated);
+    do {
+        snext = __runq_elem(runq->next);
+
+        /* Tasklet work (which runs in idle UNIT context) overrides all else. */
+        if ( tasklet_work_scheduled )
+        {
+            TRACE_0D(TRC_CSCHED_SCHED_TASKLET);
+            snext = CSCHED_UNIT(sched_idle_unit(sched_cpu));
+            snext->pri = CSCHED_PRI_TS_BOOST;
+        }
+
+        /*
+         * SMP Load balance:
+         *
+         * If the next highest priority local runnable UNIT has already eaten
+         * through its credits, look on other PCPUs to see if we have more
+         * urgent work... If not, csched_load_balance() will return snext, but
+         * already removed from the runq.
+         */
+        if ( snext->pri > CSCHED_PRI_TS_OVER )
+            __runq_remove(snext);
+        else
+            snext = csched_load_balance(prv, sched_cpu, snext, &migrated);
+
+    } while ( !unit_runnable_state(snext->unit) );
 
     /*
      * Update idlers mask if necessary. When we're idling, other CPUs
diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index 87d142bbe4..0e29e56d5a 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -3291,7 +3291,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
      * In fact, it may be the case that scurr is about to spin, and there's
      * no point forcing it to do so until rate limiting expires.
      */
-    if ( !yield && prv->ratelimit_us && unit_runnable(scurr->unit) &&
+    if ( !yield && prv->ratelimit_us && unit_runnable_state(scurr->unit) &&
          (now - scurr->unit->state_entry_time) < MICROSECS(prv->ratelimit_us) )
     {
         if ( unlikely(tb_init_done) )
@@ -3345,7 +3345,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
      *
      * Of course, we also default to idle also if scurr is not runnable.
      */
-    if ( unit_runnable(scurr->unit) && !soft_aff_preempt )
+    if ( unit_runnable_state(scurr->unit) && !soft_aff_preempt )
         snext = scurr;
     else
         snext = csched2_unit(sched_idle_unit(cpu));
@@ -3405,7 +3405,8 @@ runq_candidate(struct csched2_runqueue_data *rqd,
          * some budget, then choose it.
          */
         if ( (yield || svc->credit > snext->credit) &&
-             (!has_cap(svc) || unit_grab_budget(svc)) )
+             (!has_cap(svc) || unit_grab_budget(svc)) &&
+             unit_runnable_state(svc->unit) )
             snext = svc;
 
         /* In any case, if we got this far, break. */
diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 80a7d45935..3dde1dcd00 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -864,7 +864,8 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
         cpumask_set_cpu(sched_cpu, &prv->cpus_free);
     }
 
-    if ( unlikely(prev->next_task == NULL || !unit_runnable(prev->next_task)) )
+    if ( unlikely(prev->next_task == NULL ||
+                  !unit_runnable_state(prev->next_task)) )
         prev->next_task = sched_idle_unit(sched_cpu);
 
     NULL_UNIT_CHECK(prev->next_task);
diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index cfd7d334fa..fd882f2ca4 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -1092,12 +1092,18 @@ rt_schedule(const struct scheduler *ops, struct sched_unit *currunit,
     else
     {
         snext = runq_pick(ops, cpumask_of(sched_cpu));
+
         if ( snext == NULL )
             snext = rt_unit(sched_idle_unit(sched_cpu));
+        else if ( !unit_runnable_state(snext->unit) )
+        {
+            q_remove(snext);
+            snext = rt_unit(sched_idle_unit(sched_cpu));
+        }
 
         /* if scurr has higher priority and budget, still pick scurr */
         if ( !is_idle_unit(currunit) &&
-             unit_runnable(currunit) &&
+             unit_runnable_state(currunit) &&
              scurr->cur_budget > 0 &&
              ( is_idle_unit(snext->unit) ||
                compare_unit_priority(scurr, snext) > 0 ) )
diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index ff67fb3633..9c1b044b49 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -280,7 +280,7 @@ static inline void sched_unit_runstate_change(struct sched_unit *unit,
     for_each_sched_unit_vcpu ( unit, v )
     {
         if ( running )
-            vcpu_runstate_change(v, RUNSTATE_running, new_entry_time);
+            vcpu_runstate_change(v, v->new_state, new_entry_time);
         else
             vcpu_runstate_change(v,
                 ((v->pause_flags & VPF_blocked) ? RUNSTATE_blocked :
diff --git a/xen/include/xen/sched-if.h b/xen/include/xen/sched-if.h
index c65dfa943b..7e568a9d9f 100644
--- a/xen/include/xen/sched-if.h
+++ b/xen/include/xen/sched-if.h
@@ -93,6 +93,36 @@ static inline bool unit_runnable(const struct sched_unit *unit)
     return false;
 }
 
+/*
+ * Returns whether a sched_unit is runnable and sets new_state for each of its
+ * vcpus. It is mandatory to determine the new runstate for all vcpus of a unit
+ * without dropping the schedule lock (which happens when synchronizing the
+ * context switch of the vcpus of a unit) in order to avoid races with e.g.
+ * vcpu_sleep().
+ */
+static inline bool unit_runnable_state(const struct sched_unit *unit)
+{
+    struct vcpu *v;
+    bool runnable, ret = false;
+
+    if ( is_idle_unit(unit) )
+        return true;
+
+    for_each_sched_unit_vcpu ( unit, v )
+    {
+        runnable = vcpu_runnable(v);
+
+        v->new_state = runnable ? RUNSTATE_running
+                                : (v->pause_flags & VPF_blocked)
+                                  ? RUNSTATE_blocked : RUNSTATE_offline;
+
+        if ( runnable )
+            ret = true;
+    }
+
+    return ret;
+}
+
 static inline void sched_set_res(struct sched_unit *unit,
                                  struct sched_resource *res)
 {
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index c770ab4aa0..12f00cd78d 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -174,6 +174,7 @@ struct vcpu
         XEN_GUEST_HANDLE(vcpu_runstate_info_compat_t) compat;
     } runstate_guest; /* guest address */
 #endif
+    unsigned int     new_state;
 
     /* Has the FPU been initialised? */
     bool             fpu_initialised;
-- 
2.16.4
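
The probing logic of unit_runnable_state() can be tried out in isolation,
without a Xen tree. The following is a minimal standalone sketch, not the
hypervisor code: struct vcpu, struct sched_unit, the next_in_unit link, the
is_idle flag, the flag and runstate values, and the reduced vcpu_runnable()
are simplified stand-ins assumed for illustration. Only the selection of
new_state follows the helper added to sched-if.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for Xen's definitions; values are illustrative only. */
enum { RUNSTATE_running, RUNSTATE_runnable, RUNSTATE_blocked, RUNSTATE_offline };
#define VPF_blocked (1UL << 0)   /* mock: vcpu is blocked waiting for an event */
#define VPF_down    (1UL << 1)   /* mock: some other reason for not running */

struct vcpu {
    unsigned long pause_flags;
    unsigned int new_state;      /* runstate chosen while probing the unit */
    struct vcpu *next_in_unit;   /* hypothetical link replacing for_each_sched_unit_vcpu */
};

struct sched_unit {
    struct vcpu *vcpu_list;
    bool is_idle;                /* hypothetical flag replacing is_idle_unit() */
};

/* Mock: the real vcpu_runnable() looks at more than pause_flags. */
static bool vcpu_runnable(const struct vcpu *v)
{
    return v->pause_flags == 0;
}

/*
 * Probe the unit and, in the same pass, record the runstate each vcpu
 * should enter if the unit gets scheduled. The unit counts as runnable
 * as soon as at least one of its vcpus is runnable.
 */
static bool unit_runnable_state(const struct sched_unit *unit)
{
    bool ret = false;

    assert(unit != NULL);

    if ( unit->is_idle )
        return true;

    for ( struct vcpu *v = unit->vcpu_list; v != NULL; v = v->next_in_unit )
    {
        bool runnable = vcpu_runnable(v);

        v->new_state = runnable ? RUNSTATE_running
                                : (v->pause_flags & VPF_blocked)
                                  ? RUNSTATE_blocked : RUNSTATE_offline;

        if ( runnable )
            ret = true;
    }

    return ret;
}
```

For a unit of two vcpus where only the second one has VPF_blocked set, the
sketch returns true and records RUNSTATE_running for the first vcpu and
RUNSTATE_blocked for the second. This is the per-vcpu value that
sched_unit_runstate_change() applies in the patch instead of the
unconditional RUNSTATE_running.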