Date: Tue, 5 Nov 2024 18:56:46 -0800
In-Reply-To: <20241106025656.2326794-1-jstultz@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20241106025656.2326794-1-jstultz@google.com>
X-Mailer: git-send-email 2.47.0.199.ga7371fff76-goog
Message-ID: <20241106025656.2326794-7-jstultz@google.com>
Subject: [RFC][PATCH v13 6/7] sched: Fix proxy/current (push,pull)ability
From: John Stultz
To: LKML
Cc: Valentin Schneider, Joel Fernandes, Qais Yousef, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Valentin Schneider, Steven Rostedt, Ben Segall, Zimuzo Ezeozue,
 Mel Gorman, Will Deacon, Waiman Long, Boqun Feng, "Paul E. McKenney",
 Metin Kaya, Xuewen Yan, K Prateek Nayak, Thomas Gleixner,
 Daniel Lezcano, kernel-team@android.com, "Connor O'Brien", John Stultz
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Valentin Schneider

Proxy execution forms atomic pairs of tasks: The waiting donor
task (scheduling context) and a proxy (execution context). The
donor task, along with the rest of the blocked chain, follows the
proxy wrt CPU placement.

They can be the same task, in which case push/pull doesn't need any
modification. When they are different, however, consider the
following FIFO1 & FIFO42 scenario:

	      ,->  RT42
	      |     | blocked-on
	      |     v
blocked_donor |   mutex
	      |     | owner
	      |     v
	      `--  RT1

   RT1
   RT42

  CPU0            CPU1
   ^                ^
   |                |
  overloaded    !overloaded
  rq prio = 42  rq prio = 0

RT1 is eligible to be pushed to CPU1, but should that happen it will
"carry" RT42 along. Clearly here neither RT1 nor RT42 must be seen as
push/pullable.

Unfortunately, only the donor task is usually dequeued from the rq,
and the proxy'ed execution context (rq->curr) remains on the rq.
This can cause RT1 to be selected for migration by logic like the
rt pushable_list.

Thus, add a dequeue/enqueue cycle on the proxy task before
__schedule() returns, which allows the sched class logic to avoid
adding the now current task to the pushable_list.

Furthermore, tasks becoming blocked on a mutex don't need an explicit
dequeue/enqueue cycle to be made (push/pull)able: they have to be running
to block on a mutex, thus they will eventually hit put_prev_task().

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Mel Gorman
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: Daniel Lezcano
Cc: kernel-team@android.com
Signed-off-by: Valentin Schneider
Signed-off-by: Connor O'Brien
Signed-off-by: John Stultz
---
v3:
* Tweaked comments & commit message
v5:
* Minor simplifications to utilize the fix earlier
  in the patch series.
* Rework the wording of the commit message to match selected/
  proxy terminology and expand a bit to make it more clear how
  it works.
v6:
* Dropped now-unused proxied value, to be re-added later in the
  series when it is used, as caught by Dietmar
v7:
* Unused function argument fixup
* Commit message nit pointed out by Metin Kaya
* Dropped unproven unlikely() and use sched_proxy_exec()
  in proxy_tag_curr, suggested by Metin Kaya
v8:
* More cleanups and typo fixes suggested by Metin Kaya
v11:
* Cleanup of commit message suggested by Metin
v12:
* Rework for rq_selected -> rq->donor renaming
---
 kernel/sched/core.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4e2c51c477b0..42ea651d1469 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6688,6 +6688,23 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 }
 #endif /* SCHED_PROXY_EXEC */
 
+static inline void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
+{
+	if (!sched_proxy_exec())
+		return;
+	/*
+	 * pick_next_task() calls set_next_task() on the chosen task
+	 * at some point, which ensures it is not push/pullable.
+	 * However, the chosen/donor task *and* the mutex owner form an
+	 * atomic pair wrt push/pull.
+	 *
+	 * Make sure owner we run is not pushable. Unfortunately we can
+	 * only deal with that by means of a dequeue/enqueue cycle. :-/
+	 */
+	dequeue_task(rq, owner, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
+	enqueue_task(rq, owner, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
+}
+
 /*
  * __schedule() is the main scheduler function.
  *
@@ -6826,6 +6843,10 @@ static void __sched notrace __schedule(int sched_mode)
 		 * changes to task_struct made by pick_next_task().
 		 */
 		RCU_INIT_POINTER(rq->curr, next);
+
+		if (!task_current_donor(rq, next))
+			proxy_tag_curr(rq, next);
+
 		/*
 		 * The membarrier system call requires each architecture
 		 * to have a full memory barrier after updating
@@ -6859,6 +6880,10 @@ static void __sched notrace __schedule(int sched_mode)
 		/* Also unlocks the rq: */
 		rq = context_switch(rq, prev, next, &rf);
 	} else {
+		/* In case next was already curr but just got blocked_donor */
+		if (!task_current_donor(rq, next))
+			proxy_tag_curr(rq, next);
+
 		rq_unpin_lock(rq, &rf);
 		__balance_callbacks(rq);
 		raw_spin_rq_unlock_irq(rq);
-- 
2.47.0.199.ga7371fff76-goog