From: Dario Faggioli
To: xen-devel@lists.xenproject.org
Cc: George.Dunlap@eu.citrix.com, julien.grall@arm.com, sstabellini@kernel.org, jgross@suse.de
Date: Wed, 06 Nov 2019 16:58:49 +0100
Message-ID: <157305592941.20672.10855835711915878073.stgit@Palanthas>
User-Agent: StGit/0.19
Subject: [Xen-devel] [BUGFIX PATCH for-4.13] sched: fix dom0less boot with the null scheduler

In a dom0less configuration, if the null scheduler is used, the system
may fail to boot, because the loop in null_unit_wake() never exits.

Bisection showed that this behavior has been occurring since commit
d545f1d6 ("xen: sched: deal with vCPUs being or becoming online or
offline"), but the real problem is that, in this case, pick_res()
always returns the same CPU.

Fix this by dealing, in null_unit_wake(), only with the simple case,
i.e., when the vCPU that is coming online can be assigned to a
scheduling resource right away. If it can't, just add it to the
waitqueue; it will be dealt with in null_schedule(), taking care not
to race with vcpu_wake().

Reported-by: Stefano Stabellini
Signed-off-by: Dario Faggioli
Tested-by: Stefano Stabellini
Reviewed-by: George Dunlap
---
 xen/common/sched_null.c | 113 +++++++++++++++++++++++++++++++----------------
 1 file changed, 75 insertions(+), 38 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index 2525464a7c..88bd11a187 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -568,50 +568,52 @@ static void null_unit_wake(const struct scheduler *ops,
     else
         SCHED_STAT_CRANK(unit_wake_not_runnable);
 
+    if ( likely(per_cpu(npc, cpu).unit == unit) )
+    {
+        cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
+        return;
+    }
+
     /*
      * If a unit is neither on a pCPU nor in the waitqueue, it means it was
-     * offline, and that it is now coming back being online.
+     * offline, and that it is now coming back being online. If we're lucky,
+     * and it's previous resource is free (and affinities match), we can just
+     * assign the unit to it (we own the proper lock already) and be done.
      */
-    if ( unlikely(per_cpu(npc, cpu).unit != unit && list_empty(&nvc->waitq_elem)) )
+    if ( per_cpu(npc, cpu).unit == NULL &&
+         unit_check_affinity(unit, cpu, BALANCE_HARD_AFFINITY) )
     {
-        spin_lock(&prv->waitq_lock);
-        list_add_tail(&nvc->waitq_elem, &prv->waitq);
-        spin_unlock(&prv->waitq_lock);
-
-        cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
-                    cpupool_domain_master_cpumask(unit->domain));
-
-        if ( !cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
+        if ( !has_soft_affinity(unit) ||
+             unit_check_affinity(unit, cpu, BALANCE_SOFT_AFFINITY) )
         {
-            dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
-                    unit->domain->domain_id, unit->unit_id);
+            unit_assign(prv, unit, cpu);
+            cpu_raise_softirq(cpu, SCHEDULE_SOFTIRQ);
             return;
         }
+    }
 
-        /*
-         * Now we would want to assign the unit to cpu, but we can't, because
-         * we don't have the lock. So, let's do the following:
-         * - try to remove cpu from the list of free cpus, to avoid races with
-         *   other onlining, inserting or migrating operations;
-         * - tickle the cpu, which will pickup work from the waitqueue, and
-         *   assign it to itself;
-         * - if we're racing already, and if there still are free cpus, try
-         *   again.
-         */
-        while ( cpumask_intersects(&prv->cpus_free, cpumask_scratch_cpu(cpu)) )
-        {
-            unsigned int new_cpu = pick_res(prv, unit)->master_cpu;
+    /*
+     * If the resource is not free (or affinities do not match) we need
+     * to assign unit to some other one, but we can't do it here, as:
+     * - we don't own the proper lock,
+     * - we can't change v->processor under vcpu_wake()'s feet.
+     * So we add it to the waitqueue, and tickle all the free CPUs (if any)
+     * on which unit can run. The first one that schedules will pick it up.
+     */
+    spin_lock(&prv->waitq_lock);
+    list_add_tail(&nvc->waitq_elem, &prv->waitq);
+    spin_unlock(&prv->waitq_lock);
 
-            if ( test_and_clear_bit(new_cpu, &prv->cpus_free) )
-            {
-                cpu_raise_softirq(new_cpu, SCHEDULE_SOFTIRQ);
-                return;
-            }
-        }
-    }
+    cpumask_and(cpumask_scratch_cpu(cpu), unit->cpu_hard_affinity,
+                cpupool_domain_master_cpumask(unit->domain));
+    cpumask_and(cpumask_scratch_cpu(cpu), cpumask_scratch_cpu(cpu),
+                &prv->cpus_free);
 
-    /* Note that we get here only for units assigned to a pCPU */
-    cpu_raise_softirq(sched_unit_master(unit), SCHEDULE_SOFTIRQ);
+    if ( cpumask_empty(cpumask_scratch_cpu(cpu)) )
+        dprintk(XENLOG_G_WARNING, "WARNING: d%dv%d not assigned to any CPU!\n",
+                unit->domain->domain_id, unit->unit_id);
+    else
+        cpumask_raise_softirq(cpumask_scratch_cpu(cpu), SCHEDULE_SOFTIRQ);
 }
 
 static void null_unit_sleep(const struct scheduler *ops,
@@ -827,6 +829,8 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
      */
     if ( unlikely(prev->next_task == NULL) )
     {
+        bool unit_found;
+
         spin_lock(&prv->waitq_lock);
 
         if ( list_empty(&prv->waitq) )
@@ -839,6 +843,7 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
          * it only in cases where a pcpu has no unit associated (e.g., as
          * said above, the cpu has just joined a cpupool).
          */
+        unit_found = false;
         for_each_affinity_balance_step( bs )
         {
             list_for_each_entry( wvc, &prv->waitq, waitq_elem )
@@ -849,13 +854,45 @@ static void null_schedule(const struct scheduler *ops, struct sched_unit *prev,
 
                 if ( unit_check_affinity(wvc->unit, sched_cpu, bs) )
                 {
-                    unit_assign(prv, wvc->unit, sched_cpu);
-                    list_del_init(&wvc->waitq_elem);
-                    prev->next_task = wvc->unit;
-                    goto unlock;
+                    spinlock_t *lock;
+
+                    unit_found = true;
+
+                    /*
+                     * If the unit in the waitqueue has just come up online,
+                     * we risk racing with vcpu_wake(). To avoid this, sync
+                     * on the spinlock that vcpu_wake() holds, but only with
+                     * trylock, to avoid deadlock).
+                     */
+                    lock = pcpu_schedule_trylock(sched_unit_master(wvc->unit));
+
+                    /*
+                     * We know the vcpu's lock is not this resource's lock. In
+                     * fact, if it were, since this cpu is free, vcpu_wake()
+                     * would have assigned the unit to here directly.
+                     */
+                    ASSERT(lock != get_sched_res(sched_cpu)->schedule_lock);
+
+                    if ( lock ) {
+                        unit_assign(prv, wvc->unit, sched_cpu);
+                        list_del_init(&wvc->waitq_elem);
+                        prev->next_task = wvc->unit;
+                        spin_unlock(lock);
+                        goto unlock;
+                    }
                 }
             }
         }
+        /*
+         * If we did find a unit with suitable affinity in the waitqueue, but
+         * we could not pick it up (due to lock contention), and hence we are
+         * still free, plan for another try. In fact, we don't want such unit
+         * to be stuck in the waitqueue, when there are free cpus where it
+         * could run.
+         */
+        if ( unlikely( unit_found && prev->next_task == NULL &&
+                       !list_empty(&prv->waitq)) )
+            cpu_raise_softirq(cur_cpu, SCHEDULE_SOFTIRQ);
  unlock:
         spin_unlock(&prv->waitq_lock);
 
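For readers not familiar with the idiom used in null_schedule() above, the
sketch below is a minimal, standalone illustration (not Xen code) of the same
trylock-and-retry pattern: the side that already holds the waitqueue lock only
ever trylocks the lock that the waker may be holding, and plans a retry on
contention instead of blocking. All names here (waker_side, scheduler_side,
unit_on_waitq, need_retry) are invented for the example; it assumes nothing
beyond POSIX threads and builds with "gcc -pthread".

/*
 * Standalone sketch of the trylock pattern; not Xen code.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t unit_lock  = PTHREAD_MUTEX_INITIALIZER; /* "vcpu_wake()'s lock" */
static pthread_mutex_t waitq_lock = PTHREAD_MUTEX_INITIALIZER;
static bool unit_on_waitq = true;   /* pretend a unit is queued              */
static bool need_retry    = false;  /* mimics re-raising SCHEDULE_SOFTIRQ    */

/*
 * "Scheduler" side: holds the waitqueue lock, must not block on the unit's
 * lock, so it only trylocks and schedules a retry on contention.
 */
static void *scheduler_side(void *arg)
{
    pthread_mutex_lock(&waitq_lock);
    if ( unit_on_waitq )
    {
        if ( pthread_mutex_trylock(&unit_lock) == 0 )
        {
            unit_on_waitq = false;          /* "assign the unit"             */
            printf("scheduler: picked unit up from the waitqueue\n");
            pthread_mutex_unlock(&unit_lock);
        }
        else
        {
            need_retry = true;              /* come back later, no deadlock  */
            printf("scheduler: contention, will retry\n");
        }
    }
    pthread_mutex_unlock(&waitq_lock);
    return NULL;
}

/*
 * "Waker" side: holds the unit's lock for a while, like vcpu_wake() holding
 * the scheduler lock while null_unit_wake() queues the unit.
 */
static void *waker_side(void *arg)
{
    pthread_mutex_lock(&unit_lock);
    usleep(1000);                           /* widen the race window         */
    pthread_mutex_unlock(&unit_lock);
    return NULL;
}

int main(void)
{
    pthread_t w, s;

    pthread_create(&w, NULL, waker_side, NULL);
    pthread_create(&s, NULL, scheduler_side, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);

    if ( need_retry && unit_on_waitq )
        scheduler_side(NULL);               /* the "re-raised softirq"       */

    printf("unit_on_waitq=%d\n", unit_on_waitq);
    return 0;
}

Either the trylock succeeds on the first pass and the unit is assigned
directly, or the scheduler notes the contention and the retry pass picks the
unit up, which is the role played by re-raising SCHEDULE_SOFTIRQ in the patch.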