From: Ian Jackson
Cc: Ian Jackson
Date: Thu, 18 Apr 2019 17:31:56 +0100
Message-ID: <20190418163158.11408-20-ian.jackson@eu.citrix.com>
In-Reply-To: <20190418163158.11408-1-ian.jackson@eu.citrix.com>
References: <20190418163158.11408-1-ian.jackson@eu.citrix.com>
Subject: [Xen-devel] [OSSTEST PATCH 19/21] starvation: Abandon jobs which are unreasonably delaying their flight

Sometimes, due to a shortage of available resources, a flight might be
delayed because a handful of jobs are waiting much longer than the
rest.  Add a heuristic which causes these jobs to be abandoned.

We consider ourselves starving if we are starving now, based on the
most optimistic start time seen in the last I.

Signed-off-by: Ian Jackson
---
 ts-hosts-allocate-Executive | 105 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/ts-hosts-allocate-Executive b/ts-hosts-allocate-Executive
index 8c9ddaf7..7ea3c4af 100755
--- a/ts-hosts-allocate-Executive
+++ b/ts-hosts-allocate-Executive
@@ -62,6 +62,8 @@ our %magictaskid;
 our $fi;
 our $jobinfo;
 our $harness_rev;
+our $starvation_p;
+our @abs_start_estimates;
 
 #---------- general utilities, setup, etc. ----------
 
@@ -114,12 +116,16 @@ END
     }
 
     $alloc_start_time = time // die $!;
+
+    $starvation_p =
+      hostalloc_starvation_parse_runvar($r{hostalloc_maxwait_starvation});
 }
 
 #---------- prepared sql statements ----------
 # all users of these must ->finish them afterwards, to avoid db deadlock
 
 our ($flagscheckq, $equivflagscheckq, $duration_estimator, $resprop_q,
+     $starvation_q,
      $alloc_findres_q, $alloc_shared_q, $alloc_sharing_slot_q,
      $claim_share_reuse_q, $claim_maxshare_q, $claim_rmshares_q,
      $claim_noshares_q, $claim_rmshare_q, $claim_setres_q,
@@ -146,6 +152,15 @@ END
            AND name = ?
 END
 
+    $starvation_q= $dbh_tests->prepare(<prepare(<{Start}.
+    # Returns the most optimistic absolute start time "in the last
+    # $period".  Returns undef if we don't have good data yet.
+
+    push @abs_start_estimates, { At => $now, Got => $best->{Start} + $now };
+
+    # Actually, what we do is prune all but the last entry from before
+    # $period, and we expect at least 4 estimates.  That ensures that
+    # the answer involves at least one estimate at least $period ago.
+    # Ie what we actually return is
+    #    Consider the most recent estimate which is at least $period
+    #    ago (the "oldest relevant"), and all subsequent estimates.
+    #    Answer is the most optimistic start time of all of those,
+    #    provided there are at least 4 of them.
+    my $is_old = sub { return $_[0]{At} <= $now - $period; };
+    my $need_estimates = 4;
+    while (@abs_start_estimates > $need_estimates &&
+           $is_old->($abs_start_estimates[1])) {
+        # estimates[1] is at least $period ago and more recent
+        # than $estimates[0], so $estimates[0] cannot be the
+        # oldest relevant and is indeed older than the oldest
+        # relevant.
+        shift @abs_start_estimates;
+    }
+
+    my $pr = sub {
+        my ($e) = @_;
+        printf(DEBUG ' %s (@%s)',
+               $e->{Got} - $now,
+               $e->{At} - $now);
+    };
+
+    print DEBUG "most_optimistic: all:";
+    my $optimist;
+    foreach (@abs_start_estimates) {
+        $pr->($_);
+        $optimist = $_ if !$optimist || $_->{Got} < $optimist->{Got};
+    }
+    print DEBUG "\n";
+    printf(DEBUG "most_optimistic: (period=%s):", $period);
+    $pr->($optimist);
+    print DEBUG "\n";
+
+    return undef unless @abs_start_estimates >= $need_estimates;
+
+    return $optimist->{Got};
+}
+
+sub starving ($) {
+    my ($best_start_abs) = @_;
+    return (0, 'runvar says never give up') unless %$starvation_p;
+    return (0, 'no estimate') unless defined $best_start_abs;
+    $starvation_q->execute($flight);
+    my $d=0;
+    my $w=0;
+    my $maxfin=0;
+    while (my ($j,$st,$fin) = $starvation_q->fetchrow_array()) {
+        if ($st eq 'preparing' ||
+            $st eq 'queued' ||
+            $st eq 'running') {
+            $w++;
+        } else {
+            $d++;
+            return (0, "job $j status $st but no step finished time!")
+                unless defined $fin;
+            $maxfin = $fin if $fin > $maxfin;
+        }
+    }
+    # we quit if the total time from the start of the flight
+    # to our expected finish is more than the total time so
+    # far (for the completed jobs) by the margin X and I
+    my $X = hostalloc_starvation_calculate_X($starvation_p, $w, $d);
+    return (0, 'X=inf') unless defined $X;
+    my $total_d = $maxfin - $fi->{started};
+    my $projected_me = $best_start_abs - $fi->{started};
+    my $m = "D=$d W=$w X=$X maxfin=$maxfin";
+    my $bad = $projected_me > $X * $total_d + $starvation_p->{I};
+    return ($bad, $m);
+}
+
 sub attempt_allocation {
     my $mayalloc;
     ($plan, $mayalloc) = @_;
@@ -772,6 +869,14 @@ sub attempt_allocation {
     if ($wait_sofar > $maxwait/2
         && $wait_sofar + $best->{Start} > $maxwait) {
         logm "timed out: $wait_sofar, $best->{Start}, $maxwait";
+    } elsif (%$starvation_p) {
+        my $est_abs = most_optimistic($best, $now, $starvation_p->{I});
+        my ($starving, $m) = starving($est_abs);
+        $starvation_q->finish();
+        if (!$starving) {
+            print DEBUG "not starving: $m\n";
+        } else {
+            logm "starving ($m)";
         return 2;
     }
 }
-- 
2.11.0
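
The final comparison in starving() and the pruning loop in most_optimistic() can be restated as pure functions, which may make the thresholds easier to reason about. The following is only a sketch and is not part of the patch: is_starving and prune_estimates are hypothetical names, X is passed in directly because hostalloc_starvation_calculate_X is not shown here, and all the numbers are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Restates the last comparison in starving().  Times are in seconds;
# $X is a unitless factor.  How X is really derived is not visible in
# this patch, so here it is simply an argument (an assumption).
sub is_starving {
    my ($flight_started, $maxfin, $best_start_abs, $X, $I) = @_;
    my $total_d      = $maxfin - $flight_started;         # span used by finished jobs
    my $projected_me = $best_start_abs - $flight_started; # span until we could start
    return $projected_me > $X * $total_d + $I ? 1 : 0;
}

# Restates the pruning loop in most_optimistic().  Takes estimates
# ordered oldest first and returns the ones that would be kept.
sub prune_estimates {
    my ($now, $period, $need, @est) = @_;
    while (@est > $need && $est[1]{At} <= $now - $period) {
        # $est[1] is already old enough, so $est[0] is not needed.
        shift @est;
    }
    return @est;
}

# Invented figures: flight started at t=0, last finished job at
# t=3600, X=2, I=600, so the threshold is 2*3600 + 600 = 7800.
print is_starving(0, 3600, 7000, 2, 600), "\n";   # prints 0 (7000 <= 7800)
print is_starving(0, 3600, 9000, 2, 600), "\n";   # prints 1 (9000 >  7800)

# Six invented estimates; with now=1000 and period=300, anything
# taken at or before t=700 counts as old.
my @kept = prune_estimates(1000, 300, 4,
    map { +{ At => $_, Got => $_ + 500 } } (100, 400, 650, 800, 900, 1000));
print scalar(@kept), " kept, oldest At=$kept[0]{At}\n";  # prints "4 kept, oldest At=650"
```

Keeping the comparison separate from the database query in this way would also allow the thresholds to be exercised without a test database, though whether that is worth doing in osstest is a judgment for the maintainers.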