From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich, Dario Faggioli
Date: Fri, 27 Sep 2019 09:00:28 +0200
Message-Id: <20190927070050.12405-25-jgross@suse.com>
In-Reply-To: <20190927070050.12405-1-jgross@suse.com>
References: <20190927070050.12405-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH v4 24/46] xen: switch from for_each_vcpu() to for_each_sched_unit()

Where appropriate, switch from for_each_vcpu() to for_each_sched_unit() in
order to prepare for core scheduling. As this is beneficial here already,
and will certainly be in the future, add a unit_scheduler() helper and let
vcpu_scheduler() use it.
Signed-off-by: Juergen Gross
Acked-by: Jan Beulich
Reviewed-by: Dario Faggioli
---
V2:
- handle affinity_broken correctly (Jan Beulich)
- add unit_scheduler() (Jan Beulich)
V3:
- add const (Jan Beulich)
- add TODOs for missing multiple vcpu per unit support (Jan Beulich)
V4:
- simplify test for correct cpu (Jan Beulich)
- remove stale change (Jan Beulich)
- use sched_check_affinity_broken(unit) (Jan Beulich)
---
 xen/common/domain.c   |   9 ++-
 xen/common/schedule.c | 155 ++++++++++++++++++++++++++++++++----------------
 2 files changed, 100 insertions(+), 64 deletions(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index b33a7031ed..699e63361b 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -568,7 +568,7 @@ void domain_update_node_affinity(struct domain *d)
     cpumask_var_t dom_cpumask, dom_cpumask_soft;
     cpumask_t *dom_affinity;
     const cpumask_t *online;
-    struct vcpu *v;
+    struct sched_unit *unit;
     unsigned int cpu;
 
     /* Do we have vcpus already? If not, no need to update node-affinity. */
@@ -601,12 +601,11 @@ void domain_update_node_affinity(struct domain *d)
      * and the full mask of where it would prefer to run (the union of
      * the soft affinity of all its various vcpus). Let's build them.
      */
-    for_each_vcpu ( d, v )
+    for_each_sched_unit ( d, unit )
     {
-        cpumask_or(dom_cpumask, dom_cpumask,
-                   v->sched_unit->cpu_hard_affinity);
+        cpumask_or(dom_cpumask, dom_cpumask, unit->cpu_hard_affinity);
         cpumask_or(dom_cpumask_soft, dom_cpumask_soft,
-                   v->sched_unit->cpu_soft_affinity);
+                   unit->cpu_soft_affinity);
     }
     /* Filter out non-online cpus */
     cpumask_and(dom_cpumask, dom_cpumask, online);

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 46f3c85cc5..dc68cb912e 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -157,26 +157,32 @@ static inline struct scheduler *dom_scheduler(const struct domain *d)
     return &ops;
 }
 
-static inline struct scheduler *vcpu_scheduler(const struct vcpu *v)
+static inline struct scheduler *unit_scheduler(const struct sched_unit *unit)
 {
-    struct domain *d = v->domain;
+    struct domain *d = unit->domain;
 
     if ( likely(d->cpupool != NULL) )
         return d->cpupool->sched;
 
     /*
-     * If d->cpupool is NULL, this is a vCPU of the idle domain. And this
+     * If d->cpupool is NULL, this is a unit of the idle domain. And this
      * case is special because the idle domain does not really belong to
      * a cpupool and, hence, doesn't really have a scheduler). In fact, its
-     * vCPUs (may) run on pCPUs which are in different pools, with different
+     * units (may) run on pCPUs which are in different pools, with different
      * schedulers.
      *
      * What we want, in this case, is the scheduler of the pCPU where this
-     * particular idle vCPU is running. And, since v->processor never changes
-     * for idle vCPUs, it is safe to use it, with no locks, to figure that out.
+     * particular idle unit is running. And, since unit->res never changes
+     * for idle units, it is safe to use it, with no locks, to figure that out.
      */
+    ASSERT(is_idle_domain(d));
-    return per_cpu(scheduler, v->processor);
+    return per_cpu(scheduler, unit->res->master_cpu);
+}
+
+static inline struct scheduler *vcpu_scheduler(const struct vcpu *v)
+{
+    return unit_scheduler(v->sched_unit);
 }
 
 #define VCPU2ONLINE(_v) cpupool_domain_cpumask((_v)->domain)
 
@@ -502,10 +508,11 @@ static void sched_move_irqs(const struct sched_unit *unit)
 
 int sched_move_domain(struct domain *d, struct cpupool *c)
 {
     struct vcpu *v;
-    unsigned int new_p;
-    void **vcpu_priv;
+    struct sched_unit *unit;
+    unsigned int new_p, unit_idx;
+    void **unit_priv;
     void *domdata;
-    void *vcpudata;
+    void *unitdata;
     struct scheduler *old_ops;
     void *old_domdata;
 
@@ -519,25 +526,27 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     if ( IS_ERR(domdata) )
         return PTR_ERR(domdata);
 
-    vcpu_priv = xzalloc_array(void *, d->max_vcpus);
-    if ( vcpu_priv == NULL )
+    /* TODO: fix array size with multiple vcpus per unit. */
+    unit_priv = xzalloc_array(void *, d->max_vcpus);
+    if ( unit_priv == NULL )
     {
         sched_free_domdata(c->sched, domdata);
         return -ENOMEM;
     }
 
-    for_each_vcpu ( d, v )
+    unit_idx = 0;
+    for_each_sched_unit ( d, unit )
     {
-        vcpu_priv[v->vcpu_id] = sched_alloc_udata(c->sched, v->sched_unit,
-                                                  domdata);
-        if ( vcpu_priv[v->vcpu_id] == NULL )
+        unit_priv[unit_idx] = sched_alloc_udata(c->sched, unit, domdata);
+        if ( unit_priv[unit_idx] == NULL )
         {
-            for_each_vcpu ( d, v )
-                sched_free_udata(c->sched, vcpu_priv[v->vcpu_id]);
-            xfree(vcpu_priv);
+            for ( unit_idx = 0; unit_priv[unit_idx]; unit_idx++ )
+                sched_free_udata(c->sched, unit_priv[unit_idx]);
+            xfree(unit_priv);
             sched_free_domdata(c->sched, domdata);
             return -ENOMEM;
         }
+        unit_idx++;
     }
 
     domain_pause(d);
@@ -545,30 +554,36 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     old_ops = dom_scheduler(d);
     old_domdata = d->sched_priv;
 
-    for_each_vcpu ( d, v )
+    for_each_sched_unit ( d, unit )
     {
-        sched_remove_unit(old_ops, v->sched_unit);
+        sched_remove_unit(old_ops, unit);
     }
 
     d->cpupool = c;
     d->sched_priv = domdata;
 
     new_p = cpumask_first(c->cpu_valid);
-    for_each_vcpu ( d, v )
+    unit_idx = 0;
+    for_each_sched_unit ( d, unit )
     {
         spinlock_t *lock;
+        unsigned int unit_p = new_p;
 
-        vcpudata = v->sched_unit->priv;
+        unitdata = unit->priv;
 
-        migrate_timer(&v->periodic_timer, new_p);
-        migrate_timer(&v->singleshot_timer, new_p);
-        migrate_timer(&v->poll_timer, new_p);
+        for_each_sched_unit_vcpu ( unit, v )
+        {
+            migrate_timer(&v->periodic_timer, new_p);
+            migrate_timer(&v->singleshot_timer, new_p);
+            migrate_timer(&v->poll_timer, new_p);
+            new_p = cpumask_cycle(new_p, c->cpu_valid);
+        }
 
-        lock = unit_schedule_lock_irq(v->sched_unit);
+        lock = unit_schedule_lock_irq(unit);
 
-        sched_set_affinity(v->sched_unit, &cpumask_all, &cpumask_all);
+        sched_set_affinity(unit, &cpumask_all, &cpumask_all);
 
-        sched_set_res(v->sched_unit, get_sched_res(new_p));
+        sched_set_res(unit, get_sched_res(unit_p));
         /*
          * With v->processor modified we must not
          * - make any further changes assuming we hold the scheduler lock,
@@ -576,15 +591,15 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
          */
         spin_unlock_irq(lock);
 
-        v->sched_unit->priv = vcpu_priv[v->vcpu_id];
+        unit->priv = unit_priv[unit_idx];
         if ( !d->is_dying )
-            sched_move_irqs(v->sched_unit);
+            sched_move_irqs(unit);
 
-        new_p = cpumask_cycle(new_p, c->cpu_valid);
+        sched_insert_unit(c->sched, unit);
 
-        sched_insert_unit(c->sched, v->sched_unit);
+        sched_free_udata(old_ops, unitdata);
 
-        sched_free_udata(old_ops, vcpudata);
+        unit_idx++;
     }
 
     domain_update_node_affinity(d);
@@ -593,7 +608,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 
     sched_free_domdata(old_ops, old_domdata);
 
-    xfree(vcpu_priv);
+    xfree(unit_priv);
 
     return 0;
 }
@@ -877,18 +892,36 @@ static void vcpu_migrate_finish(struct vcpu *v)
         vcpu_wake(v);
 }
 
+static bool sched_check_affinity_broken(const struct sched_unit *unit)
+{
+    const struct vcpu *v;
+
+    for_each_sched_unit_vcpu ( unit, v )
+        if ( v->affinity_broken )
+            return true;
+
+    return false;
+}
+
+static void sched_reset_affinity_broken(struct sched_unit *unit)
+{
+    struct vcpu *v;
+
+    for_each_sched_unit_vcpu ( unit, v )
+        v->affinity_broken = false;
+}
+
 void restore_vcpu_affinity(struct domain *d)
 {
     unsigned int cpu = smp_processor_id();
-    struct vcpu *v;
+    struct sched_unit *unit;
 
     ASSERT(system_state == SYS_STATE_resume);
 
-    for_each_vcpu ( d, v )
+    for_each_sched_unit ( d, unit )
     {
         spinlock_t *lock;
-        unsigned int old_cpu = v->processor;
-        struct sched_unit *unit = v->sched_unit;
+        unsigned int old_cpu = sched_unit_master(unit);
         struct sched_resource *res;
 
         ASSERT(!unit_runnable(unit));
@@ -907,17 +940,19 @@ void restore_vcpu_affinity(struct domain *d)
                     cpupool_domain_cpumask(d));
         if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
         {
-            if ( v->affinity_broken )
+            if ( sched_check_affinity_broken(unit) )
             {
                 sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
-                v->affinity_broken = 0;
+                sched_reset_affinity_broken(unit);
                 cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
                             cpupool_domain_cpumask(d));
             }
 
             if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
             {
-                printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
+                /* Affinity settings of one vcpu are for the complete unit. */
+                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
+                       unit->vcpu_list);
                 sched_set_affinity(unit, &cpumask_all, NULL);
                 cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
                             cpupool_domain_cpumask(d));
@@ -931,12 +966,12 @@ void restore_vcpu_affinity(struct domain *d)
 
         /* v->processor might have changed, so reacquire the lock.
          */
        lock = unit_schedule_lock_irq(unit);
-        res = sched_pick_resource(vcpu_scheduler(v), unit);
+        res = sched_pick_resource(unit_scheduler(unit), unit);
         sched_set_res(unit, res);
         spin_unlock_irq(lock);
 
-        if ( old_cpu != v->processor )
-            sched_move_irqs(v->sched_unit);
+        if ( old_cpu != sched_unit_master(unit) )
+            sched_move_irqs(unit);
     }
 
     domain_update_node_affinity(d);
@@ -950,7 +985,6 @@ void restore_vcpu_affinity(struct domain *d)
 int cpu_disable_scheduler(unsigned int cpu)
 {
     struct domain *d;
-    struct vcpu *v;
     struct cpupool *c;
     cpumask_t online_affinity;
     int ret = 0;
@@ -961,32 +995,34 @@ int cpu_disable_scheduler(unsigned int cpu)
 
     for_each_domain_in_cpupool ( d, c )
     {
-        for_each_vcpu ( d, v )
+        struct sched_unit *unit;
+
+        for_each_sched_unit ( d, unit )
        {
             unsigned long flags;
-            struct sched_unit *unit = v->sched_unit;
             spinlock_t *lock = unit_schedule_lock_irqsave(unit, &flags);
 
             cpumask_and(&online_affinity, unit->cpu_hard_affinity, c->cpu_valid);
             if ( cpumask_empty(&online_affinity) &&
                  cpumask_test_cpu(cpu, unit->cpu_hard_affinity) )
             {
-                if ( v->affinity_broken )
+                if ( sched_check_affinity_broken(unit) )
                 {
-                    /* The vcpu is temporarily pinned, can't move it. */
+                    /* The unit is temporarily pinned, can't move it. */
                     unit_schedule_unlock_irqrestore(lock, flags, unit);
                     ret = -EADDRINUSE;
                     break;
                 }
 
-                printk(XENLOG_DEBUG "Breaking affinity for %pv\n", v);
+                printk(XENLOG_DEBUG "Breaking affinity for %pv\n",
+                       unit->vcpu_list);
 
                 sched_set_affinity(unit, &cpumask_all, NULL);
             }
 
-            if ( v->processor != cpu )
+            if ( unit->res != get_sched_res(cpu) )
             {
-                /* The vcpu is not on this cpu, so we can move on. */
+                /* The unit is not on this cpu, so we can move on. */
                 unit_schedule_unlock_irqrestore(lock, flags, unit);
                 continue;
             }
@@ -999,17 +1035,18 @@ int cpu_disable_scheduler(unsigned int cpu)
              *  * the scheduler will always find a suitable solution, or
              *    things would have failed before getting in here.
              */
-            vcpu_migrate_start(v);
+            /* TODO: multiple vcpus per unit. */
+            vcpu_migrate_start(unit->vcpu_list);
             unit_schedule_unlock_irqrestore(lock, flags, unit);
 
-            vcpu_migrate_finish(v);
+            vcpu_migrate_finish(unit->vcpu_list);
 
             /*
              * The only caveat, in this case, is that if a vcpu active in
              * the hypervisor isn't migratable. In this case, the caller
              * should try again after releasing and reaquiring all locks.
              */
-            if ( v->processor == cpu )
+            if ( unit->res == get_sched_res(cpu) )
                 ret = -EAGAIN;
         }
     }
@@ -1329,17 +1366,17 @@ int vcpu_temporary_affinity(struct vcpu *v, unsigned int cpu, uint8_t reason)
             ret = 0;
             v->affinity_broken &= ~reason;
         }
-        if ( !ret && !v->affinity_broken )
+        if ( !ret && !sched_check_affinity_broken(unit) )
             sched_set_affinity(unit, unit->cpu_hard_affinity_saved, NULL);
     }
     else if ( cpu < nr_cpu_ids )
     {
         if ( (v->affinity_broken & reason) ||
-             (v->affinity_broken && v->processor != cpu) )
+             (sched_check_affinity_broken(unit) && v->processor != cpu) )
             ret = -EBUSY;
         else if ( cpumask_test_cpu(cpu, VCPU2ONLINE(v)) )
         {
-            if ( !v->affinity_broken )
+            if ( !sched_check_affinity_broken(unit) )
             {
                 cpumask_copy(unit->cpu_hard_affinity_saved,
                              unit->cpu_hard_affinity);
-- 
2.16.4