From: Juergen Gross
To: xen-devel@lists.xenproject.org
Cc: Juergen Gross, George Dunlap, Dario Faggioli
Date: Thu, 20 Feb 2020 14:39:49 +0100
Message-Id: <20200220133949.29832-1-jgross@suse.com>
Subject: [Xen-devel] [PATCH v2] xen/sched: rework credit2 run-queue allocation

Currently the memory for each run-queue of the credit2 scheduler is
allocated in the scheduler's init function: a struct
csched2_runqueue_data is allocated for each cpu in the system, even if
the scheduler only handles one physical cpu or is configured to work
with a single run-queue. As each struct contains 4 cpumasks, this
quickly adds up to a rather large amount of memory.

Rework the memory allocation for run-queues to be done only when
needed, i.e. when adding a physical cpu to the scheduler that requires
a new run-queue.

This also fixes a bug in credit2's run-queue handling:
cpu_to_runqueue() returns the first free or matching run-queue,
whichever is found first. So when a cpu is removed from credit2 this
could result in e.g.
run-queue 0 becoming free, so when another cpu is added it will always
be assigned to that free run-queue, even if it would have matched
another run-queue found later.

Signed-off-by: Juergen Gross
Reviewed-by: Dario Faggioli
---
V2:
- added two comments (Dario Faggioli)
---
 xen/common/sched/credit2.c | 371 ++++++++++++++++++++++-----------------
 1 file changed, 183 insertions(+), 188 deletions(-)

diff --git a/xen/common/sched/credit2.c b/xen/common/sched/credit2.c
index 7d104f15d0..8ae3b80d2e 100644
--- a/xen/common/sched/credit2.c
+++ b/xen/common/sched/credit2.c
@@ -467,8 +467,12 @@ custom_param("credit2_runqueue", parse_credit2_runqueue);
 struct csched2_runqueue_data {
     spinlock_t lock;           /* Lock for this runqueue */
 
+    struct list_head rql;      /* List of runqueues */
     struct list_head runq;     /* Ordered list of runnable vms */
+    unsigned int refcnt;       /* How many CPUs reference this runqueue */
+                               /* (including not yet active ones) */
     unsigned int nr_cpus;      /* How many CPUs are sharing this runqueue */
+                               /* (only active ones) */
     int id;                    /* ID of this runqueue (-1 if invalid) */
 
     int load;                  /* Instantaneous load (num of non-idle units) */
@@ -496,8 +500,8 @@ struct csched2_private {
     unsigned int load_window_shift;    /* Lenght of load decaying window */
     unsigned int ratelimit_us;         /* Rate limiting for this scheduler */
 
-    cpumask_t active_queues;           /* Runqueues with (maybe) active cpus */
-    struct csched2_runqueue_data *rqd; /* Data of the various runqueues */
+    unsigned int active_queues;        /* Number of active runqueues */
+    struct list_head rql;              /* List of runqueues */
 
     cpumask_t initialized;             /* CPUs part of this scheduler */
     struct list_head sdom;             /* List of domains (for debug key) */
@@ -508,7 +512,7 @@ struct csched2_private {
  */
 struct csched2_pcpu {
     cpumask_t sibling_mask;            /* Siblings in the same runqueue */
-    int runq_id;
+    struct csched2_runqueue_data *rqd; /* Runqueue for this CPU */
 };
 
 /*
@@ -586,14 +590,13 @@ static inline struct csched2_dom *csched2_dom(const struct domain *d)
 /* CPU to runq_id macro */
 static inline int c2r(unsigned int cpu)
 {
-    return csched2_pcpu(cpu)->runq_id;
+    return csched2_pcpu(cpu)->rqd->id;
 }
 
 /* CPU to runqueue struct macro */
-static inline struct csched2_runqueue_data *c2rqd(const struct scheduler *ops,
-                                                  unsigned int cpu)
+static inline struct csched2_runqueue_data *c2rqd(unsigned int cpu)
 {
-    return &csched2_priv(ops)->rqd[c2r(cpu)];
+    return csched2_pcpu(cpu)->rqd;
 }
 
 /* Does the domain of this unit have a cap? */
@@ -804,36 +807,6 @@ static inline struct csched2_unit * runq_elem(struct list_head *elem)
     return list_entry(elem, struct csched2_unit, runq_elem);
 }
 
-static void activate_runqueue(struct csched2_private *prv, int rqi)
-{
-    struct csched2_runqueue_data *rqd;
-
-    rqd = prv->rqd + rqi;
-
-    BUG_ON(!cpumask_empty(&rqd->active));
-
-    rqd->max_weight = 1;
-    rqd->id = rqi;
-    INIT_LIST_HEAD(&rqd->svc);
-    INIT_LIST_HEAD(&rqd->runq);
-    spin_lock_init(&rqd->lock);
-
-    __cpumask_set_cpu(rqi, &prv->active_queues);
-}
-
-static void deactivate_runqueue(struct csched2_private *prv, int rqi)
-{
-    struct csched2_runqueue_data *rqd;
-
-    rqd = prv->rqd + rqi;
-
-    BUG_ON(!cpumask_empty(&rqd->active));
-
-    rqd->id = -1;
-
-    __cpumask_clear_cpu(rqi, &prv->active_queues);
-}
-
 static inline bool same_node(unsigned int cpua, unsigned int cpub)
 {
     return cpu_to_node(cpua) == cpu_to_node(cpub);
@@ -850,51 +823,73 @@ static inline bool same_core(unsigned int cpua, unsigned int cpub)
            cpu_to_core(cpua) == cpu_to_core(cpub);
 }
 
-static unsigned int
-cpu_to_runqueue(const struct csched2_private *prv, unsigned int cpu)
+static struct csched2_runqueue_data *
+cpu_add_to_runqueue(struct csched2_private *prv, unsigned int cpu)
 {
-    const struct csched2_runqueue_data *rqd;
-    unsigned int rqi;
+    struct csched2_runqueue_data *rqd, *rqd_new;
+    struct list_head *rqd_ins;
+    unsigned long flags;
+    int rqi = 0;
+    bool rqi_unused = false, rqd_valid = false;
 
-    for ( rqi = 0; rqi < nr_cpu_ids; rqi++ )
+    /* Prealloc in case we need it - not allowed with interrupts off. */
+    rqd_new = xzalloc(struct csched2_runqueue_data);
+
+    write_lock_irqsave(&prv->lock, flags);
+
+    rqd_ins = &prv->rql;
+    list_for_each_entry ( rqd, &prv->rql, rql )
     {
         unsigned int peer_cpu;
 
-        /*
-         * As soon as we come across an uninitialized runqueue, use it.
-         * In fact, either:
-         *  - we are initializing the first cpu, and we assign it to
-         *    runqueue 0. This is handy, especially if we are dealing
-         *    with the boot cpu (if credit2 is the default scheduler),
-         *    as we would not be able to use cpu_to_socket() and similar
-         *    helpers anyway (they're result of which is not reliable yet);
-         *  - we have gone through all the active runqueues, and have not
-         *    found anyone whose cpus' topology matches the one we are
-         *    dealing with, so activating a new runqueue is what we want.
-         */
-        if ( prv->rqd[rqi].id == -1 )
-            break;
-
-        rqd = prv->rqd + rqi;
-        BUG_ON(cpumask_empty(&rqd->active));
+        /* Remember first unused queue index. */
+        if ( !rqi_unused && rqd->id > rqi )
+            rqi_unused = true;
 
-        peer_cpu = cpumask_first(&rqd->active);
+        peer_cpu = rqd->pick_bias;
         BUG_ON(cpu_to_socket(cpu) == XEN_INVALID_SOCKET_ID ||
                cpu_to_socket(peer_cpu) == XEN_INVALID_SOCKET_ID);
 
-        if (opt_runqueue == OPT_RUNQUEUE_CPU)
-            continue;
+        /* OPT_RUNQUEUE_CPU will never find an existing runqueue. */
         if ( opt_runqueue == OPT_RUNQUEUE_ALL ||
             (opt_runqueue == OPT_RUNQUEUE_CORE && same_core(peer_cpu, cpu)) ||
             (opt_runqueue == OPT_RUNQUEUE_SOCKET && same_socket(peer_cpu, cpu)) ||
             (opt_runqueue == OPT_RUNQUEUE_NODE && same_node(peer_cpu, cpu)) )
+        {
+            rqd_valid = true;
             break;
+        }
+
+        if ( !rqi_unused )
+        {
+            rqi++;
+            rqd_ins = &rqd->rql;
+        }
+    }
+
+    if ( !rqd_valid )
+    {
+        if ( !rqd_new )
+        {
+            rqd = ERR_PTR(-ENOMEM);
+            goto out;
+        }
+        rqd = rqd_new;
+        rqd_new = NULL;
+
+        list_add(&rqd->rql, rqd_ins);
+        rqd->pick_bias = cpu;
+        rqd->id = rqi;
     }
 
-    /* We really expect to be able to assign each cpu to a runqueue. */
-    BUG_ON(rqi >= nr_cpu_ids);
+    rqd->refcnt++;
 
-    return rqi;
+ out:
+    write_unlock_irqrestore(&prv->lock, flags);
+
+    xfree(rqd_new);
+
+    return rqd;
 }
 
 /* Find the domain with the highest weight. */
@@ -972,13 +967,13 @@ _runq_assign(struct csched2_unit *svc, struct csched2_runqueue_data *rqd)
 }
 
 static void
-runq_assign(const struct scheduler *ops, const struct sched_unit *unit)
+runq_assign(const struct sched_unit *unit)
 {
     struct csched2_unit *svc = unit->priv;
 
     ASSERT(svc->rqd == NULL);
 
-    _runq_assign(svc, c2rqd(ops, sched_unit_master(unit)));
+    _runq_assign(svc, c2rqd(sched_unit_master(unit)));
 }
 
 static void
@@ -999,11 +994,11 @@ _runq_deassign(struct csched2_unit *svc)
 }
 
 static void
-runq_deassign(const struct scheduler *ops, const struct sched_unit *unit)
+runq_deassign(const struct sched_unit *unit)
 {
     struct csched2_unit *svc = unit->priv;
 
-    ASSERT(svc->rqd == c2rqd(ops, sched_unit_master(unit)));
+    ASSERT(svc->rqd == c2rqd(sched_unit_master(unit)));
 
     _runq_deassign(svc);
 }
@@ -1272,12 +1267,11 @@ update_load(const struct scheduler *ops,
     update_svc_load(ops, svc, change, now);
 }
 
-static void
-runq_insert(const struct scheduler *ops, struct csched2_unit *svc)
+static void runq_insert(struct csched2_unit *svc)
 {
     struct list_head *iter;
     unsigned int cpu = sched_unit_master(svc->unit);
-    struct list_head * runq = &c2rqd(ops, cpu)->runq;
+    struct list_head *runq = &c2rqd(cpu)->runq;
     int pos = 0;
 
     ASSERT(spin_is_locked(get_sched_res(cpu)->schedule_lock));
@@ -1366,7 +1360,7 @@ static inline bool is_preemptable(const struct csched2_unit *svc,
 static s_time_t tickle_score(const struct scheduler *ops, s_time_t now,
                              const struct csched2_unit *new, unsigned int cpu)
 {
-    struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
+    struct csched2_runqueue_data *rqd = c2rqd(cpu);
     struct csched2_unit * cur = csched2_unit(curr_on_cpu(cpu));
     const struct csched2_private *prv = csched2_priv(ops);
     s_time_t score;
@@ -1442,7 +1436,7 @@ runq_tickle(const struct scheduler *ops, struct csched2_unit *new, s_time_t now)
     s_time_t max = 0;
     struct sched_unit *unit = new->unit;
     unsigned int bs, cpu = sched_unit_master(unit);
-    struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
+    struct csched2_runqueue_data *rqd = c2rqd(cpu);
     const cpumask_t *online = cpupool_domain_master_cpumask(unit->domain);
     cpumask_t mask;
 
@@ -1618,10 +1612,9 @@ runq_tickle(const struct scheduler *ops, struct csched2_unit *new, s_time_t now)
 /*
  * Credit-related code
  */
-static void reset_credit(const struct scheduler *ops, int cpu, s_time_t now,
-                         struct csched2_unit *snext)
+static void reset_credit(int cpu, s_time_t now, struct csched2_unit *snext)
 {
-    struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
+    struct csched2_runqueue_data *rqd = c2rqd(cpu);
     struct list_head *iter;
     int m;
 
@@ -1910,7 +1903,7 @@ unpark_parked_units(const struct scheduler *ops, struct list_head *units)
              * for the newly replenished budget.
             */
            ASSERT( svc->rqd != NULL );
-            ASSERT( c2rqd(ops, sched_unit_master(svc->unit)) == svc->rqd );
+            ASSERT( c2rqd(sched_unit_master(svc->unit)) == svc->rqd );
             __set_bit(__CSFLAG_delayed_runq_add, &svc->flags);
         }
         else if ( unit_runnable(svc->unit) )
@@ -1923,7 +1916,7 @@ unpark_parked_units(const struct scheduler *ops, struct list_head *units)
              */
             now = NOW();
             update_load(ops, svc->rqd, svc, 1, now);
-            runq_insert(ops, svc);
+            runq_insert(svc);
             runq_tickle(ops, svc, now);
         }
         list_del_init(&svc->parked_elem);
@@ -2088,7 +2081,7 @@ csched2_unit_sleep(const struct scheduler *ops, struct sched_unit *unit)
     }
     else if ( unit_on_runq(svc) )
     {
-        ASSERT(svc->rqd == c2rqd(ops, sched_unit_master(unit)));
+        ASSERT(svc->rqd == c2rqd(sched_unit_master(unit)));
         update_load(ops, svc->rqd, svc, -1, NOW());
         runq_remove(svc);
     }
@@ -2135,16 +2128,16 @@ csched2_unit_wake(const struct scheduler *ops, struct sched_unit *unit)
 
     /* Add into the new runqueue if necessary */
     if ( svc->rqd == NULL )
-        runq_assign(ops, unit);
+        runq_assign(unit);
     else
-        ASSERT(c2rqd(ops, sched_unit_master(unit)) == svc->rqd );
+        ASSERT(c2rqd(sched_unit_master(unit)) == svc->rqd );
 
     now = NOW();
 
     update_load(ops, svc->rqd, svc, 1, now);
 
     /* Put the UNIT on the runq */
-    runq_insert(ops, svc);
+    runq_insert(svc);
     runq_tickle(ops, svc, now);
 
 out:
@@ -2168,7 +2161,7 @@ csched2_context_saved(const struct scheduler *ops, struct sched_unit *unit)
     LIST_HEAD(were_parked);
 
     ASSERT(is_idle_unit(unit) ||
-           svc->rqd == c2rqd(ops, sched_unit_master(unit)));
+           svc->rqd == c2rqd(sched_unit_master(unit)));
 
     /* This unit is now eligible to be put on the runqueue again */
     __clear_bit(__CSFLAG_scheduled, &svc->flags);
@@ -2189,7 +2182,7 @@ csched2_context_saved(const struct scheduler *ops, struct sched_unit *unit)
     {
         ASSERT(!unit_on_runq(svc));
 
-        runq_insert(ops, svc);
+        runq_insert(svc);
         runq_tickle(ops, svc, now);
     }
     else if ( !is_idle_unit(unit) )
@@ -2205,13 +2198,13 @@ static struct sched_resource *
 csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
 {
     struct csched2_private *prv = csched2_priv(ops);
-    int i, min_rqi = -1, min_s_rqi = -1;
     unsigned int new_cpu, cpu = sched_unit_master(unit);
     struct csched2_unit *svc = csched2_unit(unit);
     s_time_t min_avgload = MAX_LOAD, min_s_avgload = MAX_LOAD;
     bool has_soft;
+    struct csched2_runqueue_data *rqd, *min_rqd = NULL, *min_s_rqd = NULL;
 
-    ASSERT(!cpumask_empty(&prv->active_queues));
+    ASSERT(!list_empty(&prv->rql));
 
     SCHED_STAT_CRANK(pick_resource);
 
@@ -2289,13 +2282,10 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
      * Find both runqueues in one pass.
      */
     has_soft = has_soft_affinity(unit);
-    for_each_cpu(i, &prv->active_queues)
+    list_for_each_entry ( rqd, &prv->rql, rql )
     {
-        struct csched2_runqueue_data *rqd;
         s_time_t rqd_avgload = MAX_LOAD;
 
-        rqd = prv->rqd + i;
-
         /*
          * If none of the cpus of this runqueue is in svc's hard-affinity,
          * skip the runqueue.
@@ -2338,18 +2328,18 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
             if ( cpumask_intersects(&mask, unit->cpu_soft_affinity) )
             {
                 min_s_avgload = rqd_avgload;
-                min_s_rqi = i;
+                min_s_rqd = rqd;
             }
         }
         /* In any case, keep the "hard-affinity minimum" updated too. */
         if ( rqd_avgload < min_avgload )
         {
             min_avgload = rqd_avgload;
-            min_rqi = i;
+            min_rqd = rqd;
         }
     }
 
-    if ( has_soft && min_s_rqi != -1 )
+    if ( has_soft && min_s_rqd )
     {
         /*
          * We have soft affinity, and we have a candidate runq, so go for it.
@@ -2369,9 +2359,9 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
         cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
                     unit->cpu_soft_affinity);
         cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
-                    &prv->rqd[min_s_rqi].active);
+                    &min_s_rqd->active);
     }
-    else if ( min_rqi != -1 )
+    else if ( min_rqd )
     {
         /*
          * Either we don't have soft-affinity, or we do, but we did not find
@@ -2383,7 +2373,7 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
          * with the cpus of the runq.
          */
         cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
-                    &prv->rqd[min_rqi].active);
+                    &min_rqd->active);
     }
     else
    {
@@ -2392,14 +2382,13 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
          * contention).
          */
         new_cpu = get_fallback_cpu(svc);
-        min_rqi = c2r(new_cpu);
-        min_avgload = prv->rqd[min_rqi].b_avgload;
+        min_rqd = c2rqd(new_cpu);
+        min_avgload = min_rqd->b_avgload;
         goto out_up;
     }
 
-    new_cpu = cpumask_cycle(prv->rqd[min_rqi].pick_bias,
-                            cpumask_scratch_cpu(cpu));
-    prv->rqd[min_rqi].pick_bias = new_cpu;
+    new_cpu = cpumask_cycle(min_rqd->pick_bias, cpumask_scratch_cpu(cpu));
+    min_rqd->pick_bias = new_cpu;
     BUG_ON(new_cpu >= nr_cpu_ids);
 
 out_up:
@@ -2414,7 +2403,7 @@ csched2_res_pick(const struct scheduler *ops, const struct sched_unit *unit)
         } d;
         d.dom = unit->domain->domain_id;
         d.unit = unit->unit_id;
-        d.rq_id = min_rqi;
+        d.rq_id = min_rqd->id;
         d.b_avgload = min_avgload;
         d.new_cpu = new_cpu;
         __trace_var(TRC_CSCHED2_PICKED_CPU, 1,
@@ -2527,7 +2516,7 @@ static void migrate(const struct scheduler *ops,
     if ( on_runq )
     {
         update_load(ops, svc->rqd, NULL, 1, now);
-        runq_insert(ops, svc);
+        runq_insert(svc);
         runq_tickle(ops, svc, now);
         SCHED_STAT_CRANK(migrate_on_runq);
     }
@@ -2557,9 +2546,9 @@ static bool unit_is_migrateable(const struct csched2_unit *svc,
 static void balance_load(const struct scheduler *ops, int cpu, s_time_t now)
 {
     struct csched2_private *prv = csched2_priv(ops);
-    int i, max_delta_rqi;
     struct list_head *push_iter, *pull_iter;
     bool inner_load_updated = 0;
+    struct csched2_runqueue_data *rqd, *max_delta_rqd;
 
     balance_state_t st = { .best_push_svc = NULL, .best_pull_svc = NULL };
 
@@ -2571,22 +2560,22 @@ static void balance_load(const struct scheduler *ops, int cpu, s_time_t now)
      */
 
     ASSERT(spin_is_locked(get_sched_res(cpu)->schedule_lock));
-    st.lrqd = c2rqd(ops, cpu);
+    st.lrqd = c2rqd(cpu);
 
     update_runq_load(ops, st.lrqd, 0, now);
 
 retry:
-    max_delta_rqi = -1;
+    max_delta_rqd = NULL;
     if ( !read_trylock(&prv->lock) )
         return;
 
     st.load_delta = 0;
 
-    for_each_cpu(i, &prv->active_queues)
+    list_for_each_entry ( rqd, &prv->rql, rql )
     {
         s_time_t delta;
 
-        st.orqd = prv->rqd + i;
+        st.orqd = rqd;
 
         if ( st.orqd == st.lrqd
             || !spin_trylock(&st.orqd->lock) )
@@ -2601,7 +2590,7 @@ retry:
         if ( delta > st.load_delta )
         {
             st.load_delta = delta;
-            max_delta_rqi = i;
+            max_delta_rqd = rqd;
         }
 
         spin_unlock(&st.orqd->lock);
@@ -2609,7 +2598,7 @@ retry:
 
     /* Minimize holding the private scheduler lock. */
     read_unlock(&prv->lock);
-    if ( max_delta_rqi == -1 )
+    if ( !max_delta_rqd )
         goto out;
 
     {
@@ -2621,10 +2610,7 @@ retry:
         if ( st.orqd->b_avgload > load_max )
             load_max = st.orqd->b_avgload;
 
-        cpus_max = st.lrqd->nr_cpus;
-        i = st.orqd->nr_cpus;
-        if ( i > cpus_max )
-            cpus_max = i;
+        cpus_max = max(st.lrqd->nr_cpus, st.orqd->nr_cpus);
 
         if ( unlikely(tb_init_done) )
         {
@@ -2660,7 +2646,7 @@ retry:
          * meantime, try the process over again. This can't deadlock
          * because if it doesn't get any other rqd locks, it will simply
         * give up and return.
         */
-        st.orqd = prv->rqd + max_delta_rqi;
+        st.orqd = max_delta_rqd;
        if ( !spin_trylock(&st.orqd->lock) )
            goto retry;
 
@@ -2751,7 +2737,7 @@ csched2_unit_migrate(
     ASSERT(cpumask_test_cpu(new_cpu, &csched2_priv(ops)->initialized));
     ASSERT(cpumask_test_cpu(new_cpu, unit->cpu_hard_affinity));
 
-    trqd = c2rqd(ops, new_cpu);
+    trqd = c2rqd(new_cpu);
 
     /*
      * Do the actual movement toward new_cpu, and update vc->processor.
@@ -2815,7 +2801,7 @@ csched2_dom_cntl(
             struct csched2_unit *svc = csched2_unit(unit);
             spinlock_t *lock = unit_schedule_lock(unit);
 
-            ASSERT(svc->rqd == c2rqd(ops, sched_unit_master(unit)));
+            ASSERT(svc->rqd == c2rqd(sched_unit_master(unit)));
 
             svc->weight = sdom->weight;
             update_max_weight(svc->rqd, svc->weight, old_weight);
@@ -2898,7 +2884,7 @@ csched2_dom_cntl(
                 if ( unit->is_running )
                 {
                     unsigned int cpu = sched_unit_master(unit);
-                    struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
+                    struct csched2_runqueue_data *rqd = c2rqd(cpu);
 
                     ASSERT(curr_on_cpu(cpu) == unit);
 
@@ -3093,7 +3079,7 @@ csched2_unit_insert(const struct scheduler *ops, struct sched_unit *unit)
     lock = unit_schedule_lock_irq(unit);
 
     /* Add unit to runqueue of initial processor */
-    runq_assign(ops, unit);
+    runq_assign(unit);
 
     unit_schedule_unlock_irq(lock, unit);
 
@@ -3126,7 +3112,7 @@ csched2_unit_remove(const struct scheduler *ops, struct sched_unit *unit)
     /* Remove from runqueue */
     lock = unit_schedule_lock_irq(unit);
 
-    runq_deassign(ops, unit);
+    runq_deassign(unit);
 
     unit_schedule_unlock_irq(lock, unit);
 
@@ -3140,7 +3126,7 @@ csched2_runtime(const struct scheduler *ops, int cpu,
 {
     s_time_t time, min_time;
     int rt_credit; /* Proposed runtime measured in credits */
-    struct csched2_runqueue_data *rqd = c2rqd(ops, cpu);
+    struct csched2_runqueue_data *rqd = c2rqd(cpu);
     struct list_head *runq = &rqd->runq;
     const struct csched2_private *prv = csched2_priv(ops);
 
@@ -3437,7 +3423,7 @@ static void csched2_schedule(
 
     BUG_ON(!cpumask_test_cpu(sched_cpu, &csched2_priv(ops)->initialized));
 
-    rqd = c2rqd(ops, sched_cpu);
+    rqd = c2rqd(sched_cpu);
     BUG_ON(!cpumask_test_cpu(sched_cpu, &rqd->active));
 
     ASSERT(spin_is_locked(get_sched_res(sched_cpu)->schedule_lock));
@@ -3551,7 +3537,7 @@ static void csched2_schedule(
          */
         if ( skipped_units == 0 && snext->credit <= CSCHED2_CREDIT_RESET )
         {
-            reset_credit(ops, sched_cpu, now, snext);
+            reset_credit(sched_cpu, now, snext);
             balance_load(ops, sched_cpu, now);
         }
 
@@ -3650,7 +3636,8 @@ csched2_dump(const struct scheduler *ops)
     struct list_head *iter_sdom;
     struct csched2_private *prv = csched2_priv(ops);
     unsigned long flags;
-    unsigned int i, j, loop;
+    unsigned int j, loop;
+    struct csched2_runqueue_data *rqd;
 
     /*
      * We need the private scheduler lock as we access global
@@ -3660,13 +3647,13 @@ csched2_dump(const struct scheduler *ops)
 
     printk("Active queues: %d\n"
            "\tdefault-weight     = %d\n",
-           cpumask_weight(&prv->active_queues),
+           prv->active_queues,
            CSCHED2_DEFAULT_WEIGHT);
-    for_each_cpu(i, &prv->active_queues)
+    list_for_each_entry ( rqd, &prv->rql, rql )
     {
         s_time_t fraction;
 
-        fraction = (prv->rqd[i].avgload * 100) >> prv->load_precision_shift;
+        fraction = (rqd->avgload * 100) >> prv->load_precision_shift;
 
         printk("Runqueue %d:\n"
                "\tncpus              = %u\n"
@@ -3675,21 +3662,21 @@ csched2_dump(const struct scheduler *ops)
                "\tpick_bias          = %u\n"
                "\tinstload           = %d\n"
                "\taveload            = %"PRI_stime" (~%"PRI_stime"%%)\n",
-               i,
-               prv->rqd[i].nr_cpus,
-               CPUMASK_PR(&prv->rqd[i].active),
-               prv->rqd[i].max_weight,
-               prv->rqd[i].pick_bias,
-               prv->rqd[i].load,
-               prv->rqd[i].avgload,
+               rqd->id,
+               rqd->nr_cpus,
+               CPUMASK_PR(&rqd->active),
+               rqd->max_weight,
+               rqd->pick_bias,
+               rqd->load,
+               rqd->avgload,
                fraction);
 
         printk("\tidlers: %*pb\n"
               "\ttickled: %*pb\n"
               "\tfully idle cores: %*pb\n",
-               CPUMASK_PR(&prv->rqd[i].idle),
-               CPUMASK_PR(&prv->rqd[i].tickled),
-               CPUMASK_PR(&prv->rqd[i].smt_idle));
+               CPUMASK_PR(&rqd->idle),
+               CPUMASK_PR(&rqd->tickled),
+               CPUMASK_PR(&rqd->smt_idle));
     }
 
     printk("Domain info:\n");
@@ -3721,16 +3708,15 @@ csched2_dump(const struct scheduler *ops)
         }
     }
 
-    for_each_cpu(i, &prv->active_queues)
+    list_for_each_entry ( rqd, &prv->rql, rql )
     {
-        struct csched2_runqueue_data *rqd = prv->rqd + i;
         struct list_head *iter, *runq = &rqd->runq;
         int loop = 0;
 
         /* We need the lock to scan the runqueue. */
         spin_lock(&rqd->lock);
 
-        printk("Runqueue %d:\n", i);
+        printk("Runqueue %d:\n", rqd->id);
 
         for_each_cpu(j, &rqd->active)
             dump_pcpu(ops, j);
@@ -3755,20 +3741,28 @@ csched2_dump(const struct scheduler *ops)
 static void *
 csched2_alloc_pdata(const struct scheduler *ops, int cpu)
 {
+    struct csched2_private *prv = csched2_priv(ops);
     struct csched2_pcpu *spc;
+    struct csched2_runqueue_data *rqd;
 
     spc = xzalloc(struct csched2_pcpu);
     if ( spc == NULL )
         return ERR_PTR(-ENOMEM);
 
-    /* Not in any runqueue yet */
-    spc->runq_id = -1;
+    rqd = cpu_add_to_runqueue(prv, cpu);
+    if ( IS_ERR(rqd) )
+    {
+        xfree(spc);
+        return rqd;
+    }
+
+    spc->rqd = rqd;
 
     return spc;
 }
 
 /* Returns the ID of the runqueue the cpu is assigned to. */
-static unsigned
+static struct csched2_runqueue_data *
 init_pdata(struct csched2_private *prv, struct csched2_pcpu *spc,
            unsigned int cpu)
 {
@@ -3778,18 +3772,23 @@ init_pdata(struct csched2_private *prv, struct csched2_pcpu *spc,
     ASSERT(rw_is_write_locked(&prv->lock));
     ASSERT(!cpumask_test_cpu(cpu, &prv->initialized));
     /* CPU data needs to be allocated, but still uninitialized. */
-    ASSERT(spc && spc->runq_id == -1);
+    ASSERT(spc);
 
-    /* Figure out which runqueue to put it in */
-    spc->runq_id = cpu_to_runqueue(prv, cpu);
+    rqd = spc->rqd;
 
-    rqd = prv->rqd + spc->runq_id;
+    ASSERT(rqd && !cpumask_test_cpu(cpu, &spc->rqd->active));
 
-    printk(XENLOG_INFO "Adding cpu %d to runqueue %d\n", cpu, spc->runq_id);
-    if ( !cpumask_test_cpu(spc->runq_id, &prv->active_queues) )
+    printk(XENLOG_INFO "Adding cpu %d to runqueue %d\n", cpu, rqd->id);
+    if ( !rqd->nr_cpus )
     {
         printk(XENLOG_INFO " First cpu on runqueue, activating\n");
-        activate_runqueue(prv, spc->runq_id);
+
+        BUG_ON(!cpumask_empty(&rqd->active));
+        rqd->max_weight = 1;
+        INIT_LIST_HEAD(&rqd->svc);
+        INIT_LIST_HEAD(&rqd->runq);
+        spin_lock_init(&rqd->lock);
+        prv->active_queues++;
     }
 
     __cpumask_set_cpu(cpu, &spc->sibling_mask);
@@ -3813,7 +3812,7 @@ init_pdata(struct csched2_private *prv, struct csched2_pcpu *spc,
     if ( rqd->nr_cpus == 1 )
         rqd->pick_bias = cpu;
 
-    return spc->runq_id;
+    return rqd;
 }
 
 /* Change the scheduler of cpu to us (Credit2). */
@@ -3823,7 +3822,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
 {
     struct csched2_private *prv = csched2_priv(new_ops);
     struct csched2_unit *svc = vdata;
-    unsigned rqi;
+    struct csched2_runqueue_data *rqd;
 
     ASSERT(pdata && svc && is_idle_unit(svc->unit));
 
@@ -3840,7 +3839,7 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
 
     sched_idle_unit(cpu)->priv = vdata;
 
-    rqi = init_pdata(prv, pdata, cpu);
+    rqd = init_pdata(prv, pdata, cpu);
 
     /*
      * Now that we know what runqueue we'll go in, double check what's said
@@ -3848,11 +3847,11 @@ csched2_switch_sched(struct scheduler *new_ops, unsigned int cpu,
      * this scheduler, and so it's safe to have taken it /before/ our
      * private global lock.
      */
-    ASSERT(get_sched_res(cpu)->schedule_lock != &prv->rqd[rqi].lock);
+    ASSERT(get_sched_res(cpu)->schedule_lock != &rqd->lock);
 
     write_unlock(&prv->lock);
 
-    return &prv->rqd[rqi].lock;
+    return &rqd->lock;
 }
 
 static void
@@ -3866,10 +3865,6 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 
     write_lock_irqsave(&prv->lock, flags);
 
-    /*
-     * alloc_pdata is not implemented, so pcpu must be NULL. On the other
-     * hand, init_pdata must have been called for this pCPU.
-     */
     /*
      * Scheduler specific data for this pCPU must still be there and and be
      * valid. In fact, if we are here:
@@ -3878,20 +3873,21 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
      * 2. init_pdata must have been called on this cpu, and deinit_pdata
      *    (us!) must not have been called on it already.
      */
-    ASSERT(spc && spc->runq_id != -1);
+    ASSERT(spc && spc->rqd);
     ASSERT(cpumask_test_cpu(cpu, &prv->initialized));
 
     /* Find the old runqueue and remove this cpu from it */
-    rqd = prv->rqd + spc->runq_id;
+    rqd = spc->rqd;
 
     /* No need to save IRQs here, they're already disabled */
     spin_lock(&rqd->lock);
 
-    printk(XENLOG_INFO "Removing cpu %d from runqueue %d\n", cpu, spc->runq_id);
+    printk(XENLOG_INFO "Removing cpu %d from runqueue %d\n", cpu, rqd->id);
 
     __cpumask_clear_cpu(cpu, &rqd->idle);
     __cpumask_clear_cpu(cpu, &rqd->smt_idle);
     __cpumask_clear_cpu(cpu, &rqd->active);
+    __cpumask_clear_cpu(cpu, &rqd->tickled);
 
     for_each_cpu ( rcpu, &rqd->active )
         __cpumask_clear_cpu(cpu, &csched2_pcpu(rcpu)->sibling_mask);
@@ -3902,13 +3898,13 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
     if ( rqd->nr_cpus == 0 )
     {
         printk(XENLOG_INFO " No cpus left on runqueue, disabling\n");
-        deactivate_runqueue(prv, spc->runq_id);
+
+        BUG_ON(!cpumask_empty(&rqd->active));
+        prv->active_queues--;
     }
     else if ( rqd->pick_bias == cpu )
         rqd->pick_bias = cpumask_first(&rqd->active);
 
-    spc->runq_id = -1;
-
     spin_unlock(&rqd->lock);
 
     __cpumask_clear_cpu(cpu, &prv->initialized);
@@ -3921,18 +3917,29 @@ csched2_deinit_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 static void
 csched2_free_pdata(const struct scheduler *ops, void *pcpu, int cpu)
 {
+    struct csched2_private *prv = csched2_priv(ops);
     struct csched2_pcpu *spc = pcpu;
+    struct csched2_runqueue_data *rqd;
+    unsigned long flags;
 
-    /*
-     * pcpu either points to a valid struct csched2_pcpu, or is NULL (if
-     * CPU bringup failed, and we're beeing called from CPU_UP_CANCELLED).
-     * xfree() does not really mind, but we want to be sure that either
-     * init_pdata has never been called, or deinit_pdata has been called
-     * already.
-     */
-    ASSERT(!pcpu || spc->runq_id == -1);
-    ASSERT(!cpumask_test_cpu(cpu, &csched2_priv(ops)->initialized));
+    if ( !spc )
+        return;
+
+    write_lock_irqsave(&prv->lock, flags);
+
+    rqd = spc->rqd;
+    ASSERT(rqd && rqd->refcnt);
+    ASSERT(!cpumask_test_cpu(cpu, &prv->initialized));
+
+    rqd->refcnt--;
+    if ( !rqd->refcnt )
+        list_del(&rqd->rql);
+    else
+        rqd = NULL;
+
+    write_unlock_irqrestore(&prv->lock, flags);
 
+    xfree(rqd);
     xfree(pcpu);
 }
 
@@ -3966,7 +3973,6 @@ csched2_global_init(void)
 static int
 csched2_init(struct scheduler *ops)
 {
-    int i;
     struct csched2_private *prv;
 
     printk("Initializing Credit2 scheduler\n");
@@ -3999,18 +4005,9 @@ csched2_init(struct scheduler *ops)
     ops->sched_data = prv;
 
     rwlock_init(&prv->lock);
+    INIT_LIST_HEAD(&prv->rql);
     INIT_LIST_HEAD(&prv->sdom);
 
-    /* Allocate all runqueues and mark them as un-initialized */
-    prv->rqd = xzalloc_array(struct csched2_runqueue_data, nr_cpu_ids);
-    if ( !prv->rqd )
-    {
-        xfree(prv);
-        return -ENOMEM;
-    }
-    for ( i = 0; i < nr_cpu_ids; i++ )
-        prv->rqd[i].id = -1;
-
     /* initialize ratelimit */
     prv->ratelimit_us = sched_ratelimit_us;
 
@@ -4028,8 +4025,6 @@ csched2_deinit(struct scheduler *ops)
 
     prv = csched2_priv(ops);
     ops->sched_data = NULL;
-    if ( prv )
-        xfree(prv->rqd);
     xfree(prv);
 }
 
-- 
2.16.4

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel