From nobody Fri Dec 26 01:03:46 2025
From: Andrea Parri
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	mathieu.desnoyers@efficios.com, paulmck@kernel.org, corbet@lwn.net
Cc: mmaas@google.com, hboehm@google.com, striker@us.ibm.com,
	charlie@rivosinc.com, rehn@rivosinc.com,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Andrea Parri
Subject: [PATCH v3 1/4] membarrier: riscv: Add full memory barrier in switch_mm()
Date: Wed, 10 Jan 2024 15:55:30 +0100
Message-Id: <20240110145533.60234-2-parri.andrea@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240110145533.60234-1-parri.andrea@gmail.com>
References: <20240110145533.60234-1-parri.andrea@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The membarrier system call requires a full memory barrier after storing
to rq->curr, before going back to user-space. The barrier is only
needed when switching between processes: the barrier is implied by
mmdrop() when switching from kernel to userspace, and it's not needed
when switching from userspace to kernel.

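For illustration only (this sketch is not part of the patch), the rule
stated above can be modelled in user space. The flag values, the struct
sketch_mm type and the helper name need_switch_mm_barrier() are
assumptions made for the example; the real check is done in
membarrier_arch_switch_mm(), reads next->membarrier_state atomically and
is additionally gated on CONFIG_SMP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch; the kernel defines its own. */
#define SKETCH_PRIVATE_EXPEDITED	(1u << 0)
#define SKETCH_GLOBAL_EXPEDITED		(1u << 1)

/* Hypothetical stand-in for the kernel's struct mm_struct. */
struct sketch_mm {
	unsigned int membarrier_state;
};

/*
 * Return true when a full barrier is wanted while switching to @next:
 * there must be a previous mm, and @next must have registered for one
 * of the expedited membarrier commands.
 */
static bool need_switch_mm_barrier(const struct sketch_mm *prev,
				   const struct sketch_mm *next)
{
	const unsigned int expedited =
		SKETCH_PRIVATE_EXPEDITED | SKETCH_GLOBAL_EXPEDITED;

	assert(next != NULL);

	/* No previous mm: the patch returns early in this case. */
	if (!prev)
		return false;

	return (next->membarrier_state & expedited) != 0;
}
```

With this model, a switch into an mm that never registered for an
expedited command costs no extra barrier, which is the common case the
patch marks as likely().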
Rely on the feature/mechanism ARCH_HAS_MEMBARRIER_CALLBACKS and on the
primitive membarrier_arch_switch_mm(), already adopted by the PowerPC
architecture, to insert the required barrier.

Fixes: fab957c11efe2f ("RISC-V: Atomic and Locking Code")
Signed-off-by: Andrea Parri
Reviewed-by: Mathieu Desnoyers
---
 MAINTAINERS                         |  2 +-
 arch/riscv/Kconfig                  |  1 +
 arch/riscv/include/asm/membarrier.h | 31 +++++++++++++++++++++++++++++
 arch/riscv/mm/context.c             |  2 ++
 kernel/sched/core.c                 |  5 +++--
 5 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 arch/riscv/include/asm/membarrier.h

diff --git a/MAINTAINERS b/MAINTAINERS
index a7c4cf8201e01..0f8cec504b2ba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13815,7 +13815,7 @@ M:	Mathieu Desnoyers
 M:	"Paul E. McKenney"
 L:	linux-kernel@vger.kernel.org
 S:	Supported
-F:	arch/powerpc/include/asm/membarrier.h
+F:	arch/*/include/asm/membarrier.h
 F:	include/uapi/linux/membarrier.h
 F:	kernel/sched/membarrier.c
 
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index cd4c9a204d08c..33d9ea5fa392f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -27,6 +27,7 @@ config RISCV
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_HAS_KCOV
+	select ARCH_HAS_MEMBARRIER_CALLBACKS
 	select ARCH_HAS_MMIOWB
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PMEM_API
diff --git a/arch/riscv/include/asm/membarrier.h b/arch/riscv/include/asm/membarrier.h
new file mode 100644
index 0000000000000..6c016ebb5020a
--- /dev/null
+++ b/arch/riscv/include/asm/membarrier.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_RISCV_MEMBARRIER_H
+#define _ASM_RISCV_MEMBARRIER_H
+
+static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
+					     struct mm_struct *next,
+					     struct task_struct *tsk)
+{
+	/*
+	 * Only need the full barrier when switching between processes.
+	 * Barrier when switching from kernel to userspace is not
+	 * required here, given that it is implied by mmdrop(). Barrier
+	 * when switching from userspace to kernel is not needed after
+	 * store to rq->curr.
+	 */
+	if (IS_ENABLED(CONFIG_SMP) &&
+	    likely(!(atomic_read(&next->membarrier_state) &
+		     (MEMBARRIER_STATE_PRIVATE_EXPEDITED |
+		      MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev))
+		return;
+
+	/*
+	 * The membarrier system call requires a full memory barrier
+	 * after storing to rq->curr, before going back to user-space.
+	 * Matches a full barrier in the proximity of the membarrier
+	 * system call entry.
+	 */
+	smp_mb();
+}
+
+#endif /* _ASM_RISCV_MEMBARRIER_H */
diff --git a/arch/riscv/mm/context.c b/arch/riscv/mm/context.c
index 217fd4de61342..ba8eb3944687c 100644
--- a/arch/riscv/mm/context.c
+++ b/arch/riscv/mm/context.c
@@ -323,6 +323,8 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	if (unlikely(prev == next))
 		return;
 
+	membarrier_arch_switch_mm(prev, next, task);
+
 	/*
 	 * Mark the current MM context as inactive, and the next as
 	 * active. This is at least used by the icache flushing
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a708d225c28e8..711dc753f7216 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6670,8 +6670,9 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		 *
 		 * Here are the schemes providing that barrier on the
 		 * various architectures:
-		 * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC.
-		 *   switch_mm() rely on membarrier_arch_switch_mm() on PowerPC.
+		 * - mm ? switch_mm() : mmdrop() for x86, s390, sparc, PowerPC,
+		 *   RISC-V. switch_mm() relies on membarrier_arch_switch_mm()
+		 *   on PowerPC and on RISC-V.
 		 * - finish_lock_switch() for weakly-ordered
 		 *   architectures where spin_unlock is a full barrier,
 		 * - switch_to() for arm64 (weakly-ordered, spin_unlock
-- 
2.34.1

From nobody Fri Dec 26 01:03:46 2025
From: Andrea Parri
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	mathieu.desnoyers@efficios.com, paulmck@kernel.org, corbet@lwn.net
Cc: mmaas@google.com, hboehm@google.com, striker@us.ibm.com,
	charlie@rivosinc.com, rehn@rivosinc.com,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Andrea Parri
Subject: [PATCH v3 2/4] membarrier: Create Documentation/scheduler/membarrier.rst
Date: Wed, 10 Jan 2024 15:55:31 +0100
Message-Id: <20240110145533.60234-3-parri.andrea@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240110145533.60234-1-parri.andrea@gmail.com>
References: <20240110145533.60234-1-parri.andrea@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

To gather the architecture requirements of the "private/global
expedited" membarrier commands.
The file will be expanded to integrate further information about the
membarrier syscall (as needed/desired in the future).

While at it, amend some related inline comments in the membarrier
codebase.

Suggested-by: Mathieu Desnoyers
Signed-off-by: Andrea Parri
Reviewed-by: Mathieu Desnoyers
---
 Documentation/scheduler/index.rst      |  1 +
 Documentation/scheduler/membarrier.rst | 37 ++++++++++++++++++++++++++
 MAINTAINERS                            |  1 +
 kernel/sched/core.c                    |  7 ++++-
 kernel/sched/membarrier.c              |  8 +++---
 5 files changed, 49 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/scheduler/membarrier.rst

diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
index 3170747226f6d..43bd8a145b7a9 100644
--- a/Documentation/scheduler/index.rst
+++ b/Documentation/scheduler/index.rst
@@ -7,6 +7,7 @@ Scheduler
 
 
     completion
+    membarrier
     sched-arch
     sched-bwc
     sched-deadline
diff --git a/Documentation/scheduler/membarrier.rst b/Documentation/scheduler/membarrier.rst
new file mode 100644
index 0000000000000..ab7ee3824b407
--- /dev/null
+++ b/Documentation/scheduler/membarrier.rst
@@ -0,0 +1,37 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================
+membarrier() System Call
+========================
+
+MEMBARRIER_CMD_{PRIVATE,GLOBAL}_EXPEDITED - Architecture requirements
+=====================================================================
+
+Memory barriers before updating rq->curr
+----------------------------------------
+
+The command requires each architecture to have a full memory barrier after
+coming from user-space, before updating rq->curr. This barrier is implied
+by the sequence rq_lock(); smp_mb__after_spinlock() in __schedule(). The
+barrier matches a full barrier in the proximity of the membarrier system
+call exit, cf. membarrier_{private,global}_expedited().
+
+Memory barriers after updating rq->curr
+---------------------------------------
+
+The command requires each architecture to have a full memory barrier after
+updating rq->curr, before returning to user-space. The schemes providing
+this barrier on the various architectures are as follows.
+
+ - alpha, arc, arm, hexagon, mips rely on the full barrier implied by
+   spin_unlock() in finish_lock_switch().
+
+ - arm64 relies on the full barrier implied by switch_to().
+
+ - powerpc, riscv, s390, sparc, x86 rely on the full barrier implied by
+   switch_mm(), if mm is not NULL; they rely on the full barrier implied
+   by mmdrop(), otherwise. On powerpc and riscv, switch_mm() relies on
+   membarrier_arch_switch_mm().
+
+The barrier matches a full barrier in the proximity of the membarrier system
+call entry, cf. membarrier_{private,global}_expedited().
diff --git a/MAINTAINERS b/MAINTAINERS
index 0f8cec504b2ba..6bce0aeecb4f2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13815,6 +13815,7 @@ M:	Mathieu Desnoyers
 M:	"Paul E. McKenney"
 L:	linux-kernel@vger.kernel.org
 S:	Supported
+F:	Documentation/scheduler/membarrier.rst
 F:	arch/*/include/asm/membarrier.h
 F:	include/uapi/linux/membarrier.h
 F:	kernel/sched/membarrier.c
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 711dc753f7216..b51bc86f8340c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6599,7 +6599,9 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	 *     if (signal_pending_state())	    if (p->state & @state)
 	 *
 	 * Also, the membarrier system call requires a full memory barrier
-	 * after coming from user-space, before storing to rq->curr.
+	 * after coming from user-space, before storing to rq->curr; this
+	 * barrier matches a full barrier in the proximity of the membarrier
+	 * system call exit.
 	 */
 	rq_lock(rq, &rf);
 	smp_mb__after_spinlock();
@@ -6677,6 +6679,9 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		 *   architectures where spin_unlock is a full barrier,
 		 * - switch_to() for arm64 (weakly-ordered, spin_unlock
 		 *   is a RELEASE barrier),
+		 *
+		 * The barrier matches a full barrier in the proximity of
+		 * the membarrier system call entry.
 		 */
 		++*switch_count;
 
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 2ad881d07752c..f3d91628d6b8a 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -251,7 +251,7 @@ static int membarrier_global_expedited(void)
 		return 0;
 
 	/*
-	 * Matches memory barriers around rq->curr modification in
+	 * Matches memory barriers after rq->curr modification in
 	 * scheduler.
 	 */
 	smp_mb();	/* system call entry is not a mb. */
@@ -300,7 +300,7 @@ static int membarrier_global_expedited(void)
 
 	/*
 	 * Memory barrier on the caller thread _after_ we finished
-	 * waiting for the last IPI. Matches memory barriers around
+	 * waiting for the last IPI. Matches memory barriers before
 	 * rq->curr modification in scheduler.
 	 */
 	smp_mb();	/* exit from system call is not a mb */
@@ -339,7 +339,7 @@ static int membarrier_private_expedited(int flags, int cpu_id)
 		return 0;
 
 	/*
-	 * Matches memory barriers around rq->curr modification in
+	 * Matches memory barriers after rq->curr modification in
 	 * scheduler.
 	 */
 	smp_mb();	/* system call entry is not a mb. */
@@ -415,7 +415,7 @@ static int membarrier_private_expedited(int flags, int cpu_id)
 
 	/*
 	 * Memory barrier on the caller thread _after_ we finished
-	 * waiting for the last IPI. Matches memory barriers around
+	 * waiting for the last IPI. Matches memory barriers before
 	 * rq->curr modification in scheduler.
 	 */
 	smp_mb();	/* exit from system call is not a mb */
-- 
2.34.1

From nobody Fri Dec 26 01:03:46 2025
From: Andrea Parri
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	mathieu.desnoyers@efficios.com, paulmck@kernel.org, corbet@lwn.net
Cc: mmaas@google.com, hboehm@google.com, striker@us.ibm.com,
	charlie@rivosinc.com, rehn@rivosinc.com,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Andrea Parri
Subject: [PATCH v3 3/4] locking: Introduce prepare_sync_core_cmd()
Date: Wed, 10 Jan 2024 15:55:32 +0100
Message-Id: <20240110145533.60234-4-parri.andrea@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240110145533.60234-1-parri.andrea@gmail.com>
References: <20240110145533.60234-1-parri.andrea@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Introduce an architecture function that architectures can use to set
up ("prepare") SYNC_CORE commands.

The function will be used by RISC-V to update its "deferred
icache-flush" data structures (icache_stale_mask).

Architectures defining prepare_sync_core_cmd() static inline need to
select ARCH_HAS_PREPARE_SYNC_CORE_CMD.

Suggested-by: Mathieu Desnoyers
Signed-off-by: Andrea Parri
Reviewed-by: Mathieu Desnoyers
---
 include/linux/sync_core.h | 16 +++++++++++++++-
 init/Kconfig              |  3 +++
 kernel/sched/membarrier.c |  1 +
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/sync_core.h b/include/linux/sync_core.h
index 013da4b8b3272..67bb9794b8758 100644
--- a/include/linux/sync_core.h
+++ b/include/linux/sync_core.h
@@ -17,5 +17,19 @@ static inline void sync_core_before_usermode(void)
 }
 #endif
 
-#endif /* _LINUX_SYNC_CORE_H */
+#ifdef CONFIG_ARCH_HAS_PREPARE_SYNC_CORE_CMD
+#include <asm/sync_core.h>
+#else
+/*
+ * This is a dummy prepare_sync_core_cmd() implementation that can be used on
+ * all architectures which provide unconditional core serializing instructions
+ * in switch_mm().
+ * If your architecture doesn't provide such core serializing instructions in
+ * switch_mm(), you may need to write your own functions.
+ */
+static inline void prepare_sync_core_cmd(struct mm_struct *mm)
+{
+}
+#endif
 
+#endif /* _LINUX_SYNC_CORE_H */
diff --git a/init/Kconfig b/init/Kconfig
index 9ffb103fc927b..87daf50838f02 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1972,6 +1972,9 @@ source "kernel/Kconfig.locks"
 config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	bool
 
+config ARCH_HAS_PREPARE_SYNC_CORE_CMD
+	bool
+
 config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
 	bool
 
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index f3d91628d6b8a..6d1f31b3a967b 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -320,6 +320,7 @@ static int membarrier_private_expedited(int flags, int cpu_id)
 		      MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE_READY))
 			return -EPERM;
 		ipi_func = ipi_sync_core;
+		prepare_sync_core_cmd(mm);
 	} else if (flags == MEMBARRIER_FLAG_RSEQ) {
 		if (!IS_ENABLED(CONFIG_RSEQ))
 			return -EINVAL;
-- 
2.34.1

From nobody Fri Dec 26 01:03:46 2025
From: Andrea Parri
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
	mathieu.desnoyers@efficios.com, paulmck@kernel.org, corbet@lwn.net
Cc: mmaas@google.com, hboehm@google.com, striker@us.ibm.com,
	charlie@rivosinc.com, rehn@rivosinc.com,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, Andrea Parri
Subject: [PATCH v3 4/4] membarrier: riscv: Provide core serializing command
Date: Wed, 10 Jan 2024 15:55:33 +0100
Message-Id: <20240110145533.60234-5-parri.andrea@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240110145533.60234-1-parri.andrea@gmail.com>
References: <20240110145533.60234-1-parri.andrea@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

RISC-V uses xRET instructions on return from interrupt and to go back
to user-space; the xRET instruction is not core serializing.

Use FENCE.I for providing core serialization as follows:

 - by calling sync_core_before_usermode() on return from interrupt (cf.
   ipi_sync_core()),

 - via switch_mm() and sync_core_before_usermode() (respectively, for
   uthread->uthread and kthread->uthread transitions) to go back to
   user-space.

On RISC-V, the serialization in switch_mm() is activated by resetting
the icache_stale_mask of the mm at prepare_sync_core_cmd().

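The deferred scheme described above can be pictured with a small
single-threaded user-space model (illustrative only, not kernel code).
The names sketch_mm, sketch_prepare_sync_core_cmd(), sketch_switch_mm(),
the 64-CPU bitmask and the per-CPU counter are all assumptions made for
the example; in the kernel the mask is mm->context.icache_stale_mask
and the flush is a FENCE.I instruction. The model deliberately ignores
memory ordering between CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64	/* assumption: one mask bit per CPU */

/* Hypothetical stand-in for the per-mm "stale icache" state. */
struct sketch_mm {
	uint64_t icache_stale_mask;
};

/* Counts how often each modelled CPU would have executed FENCE.I. */
static unsigned int sketch_fence_i_count[SKETCH_NR_CPUS];

/* Model of the "prepare" step: mark the icache stale on every CPU. */
static void sketch_prepare_sync_core_cmd(struct sketch_mm *mm)
{
	mm->icache_stale_mask = UINT64_MAX;
}

/*
 * Model of the switch_mm() side: if this CPU's bit is set, clear it
 * and "flush". Returns true when a flush was performed.
 */
static bool sketch_switch_mm(struct sketch_mm *next, unsigned int cpu)
{
	uint64_t bit;

	assert(cpu < SKETCH_NR_CPUS);
	bit = UINT64_C(1) << cpu;

	if (!(next->icache_stale_mask & bit))
		return false;

	next->icache_stale_mask &= ~bit;
	sketch_fence_i_count[cpu]++;	/* stands in for FENCE.I */
	return true;
}
```

In this model each CPU flushes at most once per prepare step, on its
next switch to the mm, instead of being interrupted immediately.
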
Suggested-by: Palmer Dabbelt
Signed-off-by: Andrea Parri
---
 .../membarrier-sync-core/arch-support.txt     | 18 +++++++++++-
 MAINTAINERS                                   |  1 +
 arch/riscv/Kconfig                            |  3 ++
 arch/riscv/include/asm/membarrier.h           | 19 ++++++++++++
 arch/riscv/include/asm/sync_core.h            | 29 +++++++++++++++++++
 kernel/sched/core.c                           |  4 +++
 kernel/sched/membarrier.c                     |  4 +++
 7 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/include/asm/sync_core.h

diff --git a/Documentation/features/sched/membarrier-sync-core/arch-support.txt b/Documentation/features/sched/membarrier-sync-core/arch-support.txt
index d96b778b87ed8..a163170fc0f48 100644
--- a/Documentation/features/sched/membarrier-sync-core/arch-support.txt
+++ b/Documentation/features/sched/membarrier-sync-core/arch-support.txt
@@ -10,6 +10,22 @@
 # Rely on implicit context synchronization as a result of exception return
 # when returning from IPI handler, and when returning to user-space.
 #
+# * riscv
+#
+# riscv uses xRET as return from interrupt and to return to user-space.
+#
+# Given that xRET is not core serializing, we rely on FENCE.I for providing
+# core serialization:
+#
+#  - by calling sync_core_before_usermode() on return from interrupt (cf.
+#    ipi_sync_core()),
+#
+#  - via switch_mm() and sync_core_before_usermode() (respectively, for
+#    uthread->uthread and kthread->uthread transitions) to go back to
+#    user-space.
+#
+# The serialization in switch_mm() is activated by prepare_sync_core_cmd().
+#
 # * x86
 #
 # x86-32 uses IRET as return from interrupt, which takes care of the IPI.
@@ -43,7 +59,7 @@
     |    openrisc: | TODO |
     |      parisc: | TODO |
     |     powerpc: |  ok  |
-    |       riscv: | TODO |
+    |       riscv: |  ok  |
     |        s390: |  ok  |
     |          sh: | TODO |
     |       sparc: | TODO |
diff --git a/MAINTAINERS b/MAINTAINERS
index 6bce0aeecb4f2..e4ca6288ea3d1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13817,6 +13817,7 @@ L:	linux-kernel@vger.kernel.org
 S:	Supported
 F:	Documentation/scheduler/membarrier.rst
 F:	arch/*/include/asm/membarrier.h
+F:	arch/*/include/asm/sync_core.h
 F:	include/uapi/linux/membarrier.h
 F:	kernel/sched/membarrier.c
 
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 33d9ea5fa392f..2ad63a216d69a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -28,14 +28,17 @@ config RISCV
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_MEMBARRIER_CALLBACKS
+	select ARCH_HAS_MEMBARRIER_SYNC_CORE
 	select ARCH_HAS_MMIOWB
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PMEM_API
+	select ARCH_HAS_PREPARE_SYNC_CORE_CMD
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SET_DIRECT_MAP if MMU
 	select ARCH_HAS_SET_MEMORY if MMU
 	select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
 	select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL
+	select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
diff --git a/arch/riscv/include/asm/membarrier.h b/arch/riscv/include/asm/membarrier.h
index 6c016ebb5020a..47b240d0d596a 100644
--- a/arch/riscv/include/asm/membarrier.h
+++ b/arch/riscv/include/asm/membarrier.h
@@ -22,6 +22,25 @@ static inline void membarrier_arch_switch_mm(struct mm_struct *prev,
 	/*
 	 * The membarrier system call requires a full memory barrier
 	 * after storing to rq->curr, before going back to user-space.
+	 *
+	 * This barrier is also needed for the SYNC_CORE command when
+	 * switching between processes; in particular, on a transition
+	 * from a thread belonging to another mm to a thread belonging
+	 * to the mm for which a membarrier SYNC_CORE is done on CPU0:
+	 *
+	 *   - [CPU0] sets all bits in the mm icache_stale_mask (in
+	 *     prepare_sync_core_cmd());
+	 *
+	 *   - [CPU1] stores to rq->curr (by the scheduler);
+	 *
+	 *   - [CPU0] loads rq->curr within membarrier and observes
+	 *     cpu_rq(1)->curr->mm != mm, so the IPI is skipped on
+	 *     CPU1; this means membarrier relies on switch_mm() to
+	 *     issue the sync-core;
+	 *
+	 *   - [CPU1] switch_mm() loads icache_stale_mask; if the bit
+	 *     is zero, switch_mm() may incorrectly skip the sync-core.
+	 *
 	 * Matches a full barrier in the proximity of the membarrier
 	 * system call entry.
 	 */
diff --git a/arch/riscv/include/asm/sync_core.h b/arch/riscv/include/asm/sync_core.h
new file mode 100644
index 0000000000000..9153016da8f14
--- /dev/null
+++ b/arch/riscv/include/asm/sync_core.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_SYNC_CORE_H
+#define _ASM_RISCV_SYNC_CORE_H
+
+/*
+ * RISC-V implements return to user-space through an xRET instruction,
+ * which is not core serializing.
+ */
+static inline void sync_core_before_usermode(void)
+{
+	asm volatile ("fence.i" ::: "memory");
+}
+
+#ifdef CONFIG_SMP
+/*
+ * Ensure the next switch_mm() on every CPU issues a core serializing
+ * instruction for the given @mm.
+ */
+static inline void prepare_sync_core_cmd(struct mm_struct *mm)
+{
+	cpumask_setall(&mm->context.icache_stale_mask);
+}
+#else
+static inline void prepare_sync_core_cmd(struct mm_struct *mm)
+{
+}
+#endif /* CONFIG_SMP */
+
+#endif /* _ASM_RISCV_SYNC_CORE_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b51bc86f8340c..82de2b7d253cd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6682,6 +6682,10 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		 *
 		 * The barrier matches a full barrier in the proximity of
 		 * the membarrier system call entry.
+		 *
+		 * On RISC-V, this barrier pairing is also needed for the
+		 * SYNC_CORE command when switching between processes, cf.
+		 * the inline comments in membarrier_arch_switch_mm().
 		 */
 		++*switch_count;
 
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 6d1f31b3a967b..703e8d80a576d 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -342,6 +342,10 @@ static int membarrier_private_expedited(int flags, int cpu_id)
 	/*
 	 * Matches memory barriers after rq->curr modification in
 	 * scheduler.
+	 *
+	 * On RISC-V, this barrier pairing is also needed for the
+	 * SYNC_CORE command when switching between processes, cf.
+	 * the inline comments in membarrier_arch_switch_mm().
 	 */
 	smp_mb();	/* system call entry is not a mb. */
 
-- 
2.34.1
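
For readers who want to exercise the SYNC_CORE command from user space
once the series is applied, a minimal sketch follows. It is not part of
the series. The helper names sketch_membarrier() and
sketch_sync_core_all_threads() are made up for the example; the command
constants come from <linux/membarrier.h>, and the call goes through
syscall(2) on the assumption that the C library provides no membarrier()
wrapper. On kernels built without ARCH_HAS_MEMBARRIER_SYNC_CORE the
registration is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (hypothetical helper). */
static int sketch_membarrier(int cmd, unsigned int flags, int cpu_id)
{
	return (int)syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/*
 * Register the calling process for private expedited SYNC_CORE, then
 * issue one such barrier. Returns 0 on success, -1 with errno set
 * otherwise.
 */
static int sketch_sync_core_all_threads(void)
{
	if (sketch_membarrier(
		    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE,
		    0, 0))
		return -1;

	return sketch_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE,
				 0, 0);
}
```

A JIT, for instance, could call such a helper after rewriting code and
before letting other threads run it; whether that fits a given runtime
is left to the caller, and errors (ENOSYS, EINVAL, EPERM) should be
handled rather than ignored.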