[PATCH v3 2/2] cacheinfo: Add arm64 early level initializer implementation

Radu Rendec posted 2 patches 2 years, 10 months ago
There is a newer version of this series
Posted by Radu Rendec 2 years, 10 months ago
This patch adds an architecture specific early cache level detection
handler for arm64. This is basically the CLIDR_EL1 based detection that
was previously done (only) in init_cache_level().

This is part of a patch series that attempts to further the work in
commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
Previously, in the absence of any DT/ACPI cache info, architecture
specific cache detection and info allocation for secondary CPUs would
happen in non-preemptible context during early CPU initialization and
trigger a "BUG: sleeping function called from invalid context" splat on
an RT kernel.

This patch does not solve the problem completely for RT kernels. It
relies on the assumption that on most systems, the CPUs are symmetrical
and therefore have the same number of cache leaves. The cacheinfo memory
is allocated early (on the primary CPU), relying on the new handler. If
later (when CLIDR_EL1 based detection runs again on the secondary CPU)
the initial assumption proves to be wrong and the CPU has in fact more
leaves, the cacheinfo memory is reallocated, and that still triggers a
splat on an RT kernel.

In other words, asymmetrical CPU systems *must* still provide cacheinfo
data in DT/ACPI to avoid the splat on RT kernels (unless secondary CPUs
happen to have fewer leaves than the primary CPU). But symmetrical CPU
systems (the majority) can now get away without the additional DT/ACPI
data and rely on CLIDR_EL1 based detection.

Signed-off-by: Radu Rendec <rrendec@redhat.com>
---
 arch/arm64/kernel/cacheinfo.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index c307f69e9b55..520d17e4ebe9 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -38,21 +38,37 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
-int init_cache_level(unsigned int cpu)
+static void detect_cache_level(unsigned int *level, unsigned int *leaves)
 {
-	unsigned int ctype, level, leaves;
-	int fw_level, ret;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	unsigned int ctype;
 
-	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
-		ctype = get_cache_type(level);
+	for (*level = 1, *leaves = 0; *level <= MAX_CACHE_LEVEL; (*level)++) {
+		ctype = get_cache_type(*level);
 		if (ctype == CACHE_TYPE_NOCACHE) {
-			level--;
+			(*level)--;
 			break;
 		}
 		/* Separate instruction and data caches */
-		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
+		*leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
 	}
+}
+
+int early_cache_level(unsigned int cpu)
+{
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	detect_cache_level(&this_cpu_ci->num_levels, &this_cpu_ci->num_leaves);
+
+	return 0;
+}
+
+int init_cache_level(unsigned int cpu)
+{
+	unsigned int level, leaves;
+	int fw_level, ret;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	detect_cache_level(&level, &leaves);
 
 	if (acpi_disabled) {
 		fw_level = of_find_last_cache_level(cpu);
-- 
2.39.2
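
The trade-off described in the commit message can be reduced to one comparison. The following is a minimal userspace sketch, not kernel code: needs_realloc() and its parameter names are hypothetical and exist only to illustrate when the early allocation (sized from the primary CPU's CLIDR_EL1) stops being sufficient.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * early_leaves:    leaf count detected early on the primary CPU and used
 *                  to size the cacheinfo allocation for a secondary CPU.
 * detected_leaves: leaf count found later, when CLIDR_EL1 based detection
 *                  runs again on the secondary CPU itself.
 *
 * A reallocation (and hence a possible splat on an RT kernel) is needed
 * only when the secondary CPU turns out to have more leaves than assumed.
 */
static bool needs_realloc(unsigned int early_leaves,
			  unsigned int detected_leaves)
{
	assert(early_leaves > 0);	/* sketch assumes early detection found something */
	return detected_leaves > early_leaves;
}
```

With this model, a symmetrical system (equal counts) and a secondary CPU with fewer leaves both keep the early allocation; only a secondary CPU with more leaves forces the reallocation.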
Re: [PATCH v3 2/2] cacheinfo: Add arm64 early level initializer implementation
Posted by Sudeep Holla 2 years, 10 months ago
On Thu, Apr 06, 2023 at 07:39:26PM -0400, Radu Rendec wrote:
> This patch adds an architecture specific early cache level detection
> handler for arm64. This is basically the CLIDR_EL1 based detection that
> was previously done (only) in init_cache_level().
> 
> This is part of a patch series that attempts to further the work in
> commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
> Previously, in the absence of any DT/ACPI cache info, architecture
> specific cache detection and info allocation for secondary CPUs would
> happen in non-preemptible context during early CPU initialization and
> trigger a "BUG: sleeping function called from invalid context" splat on
> an RT kernel.
> 
> This patch does not solve the problem completely for RT kernels. It
> relies on the assumption that on most systems, the CPUs are symmetrical
> and therefore have the same number of cache leaves. The cacheinfo memory
> is allocated early (on the primary CPU), relying on the new handler. If
> later (when CLIDR_EL1 based detection runs again on the secondary CPU)
> the initial assumption proves to be wrong and the CPU has in fact more
> leaves, the cacheinfo memory is reallocated, and that still triggers a
> splat on an RT kernel.
> 
> In other words, asymmetrical CPU systems *must* still provide cacheinfo
> data in DT/ACPI to avoid the splat on RT kernels (unless secondary CPUs
> happen to have fewer leaves than the primary CPU). But symmetrical CPU
> systems (the majority) can now get away without the additional DT/ACPI
> data and rely on CLIDR_EL1 based detection.
> 
> Signed-off-by: Radu Rendec <rrendec@redhat.com>
> ---
>  arch/arm64/kernel/cacheinfo.c | 32 ++++++++++++++++++++++++--------
>  1 file changed, 24 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
> index c307f69e9b55..520d17e4ebe9 100644
> --- a/arch/arm64/kernel/cacheinfo.c
> +++ b/arch/arm64/kernel/cacheinfo.c
> @@ -38,21 +38,37 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
>  	this_leaf->type = type;
>  }
>  
> -int init_cache_level(unsigned int cpu)
> +static void detect_cache_level(unsigned int *level, unsigned int *leaves)
>  {
> -	unsigned int ctype, level, leaves;
> -	int fw_level, ret;
> -	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> +	unsigned int ctype;
>  
> -	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
> -		ctype = get_cache_type(level);
> +	for (*level = 1, *leaves = 0; *level <= MAX_CACHE_LEVEL; (*level)++) {
> +		ctype = get_cache_type(*level);
>  		if (ctype == CACHE_TYPE_NOCACHE) {
> -			level--;
> +			(*level)--;
>  			break;
>  		}
>  		/* Separate instruction and data caches */
> -		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
> +		*leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
>  	}
> +}

I would prefer to use locals inside the function and assign the values to
the output pointers at the end, to keep it simple and easy to follow. The
compiler can/will optimise this anyway. But I am fine either way.

I need Will's (or Catalin's) ack if I have to take the changes via Greg's tree.

-- 
Regards,
Sudeep
Re: [PATCH v3 2/2] cacheinfo: Add arm64 early level initializer implementation
Posted by Radu Rendec 2 years, 10 months ago
On Wed, 2023-04-12 at 12:40 +0100, Sudeep Holla wrote:
> On Thu, Apr 06, 2023 at 07:39:26PM -0400, Radu Rendec wrote:
> > This patch adds an architecture specific early cache level detection
> > handler for arm64. This is basically the CLIDR_EL1 based detection that
> > was previously done (only) in init_cache_level().
> > 
> > This is part of a patch series that attempts to further the work in
> > commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
> > Previously, in the absence of any DT/ACPI cache info, architecture
> > specific cache detection and info allocation for secondary CPUs would
> > happen in non-preemptible context during early CPU initialization and
> > trigger a "BUG: sleeping function called from invalid context" splat on
> > an RT kernel.
> > 
> > This patch does not solve the problem completely for RT kernels. It
> > relies on the assumption that on most systems, the CPUs are symmetrical
> > and therefore have the same number of cache leaves. The cacheinfo memory
> > is allocated early (on the primary CPU), relying on the new handler. If
> > later (when CLIDR_EL1 based detection runs again on the secondary CPU)
> > the initial assumption proves to be wrong and the CPU has in fact more
> > leaves, the cacheinfo memory is reallocated, and that still triggers a
> > splat on an RT kernel.
> > 
> > In other words, asymmetrical CPU systems *must* still provide cacheinfo
> > data in DT/ACPI to avoid the splat on RT kernels (unless secondary CPUs
> > happen to have fewer leaves than the primary CPU). But symmetrical CPU
> > systems (the majority) can now get away without the additional DT/ACPI
> > data and rely on CLIDR_EL1 based detection.
> > 
> > Signed-off-by: Radu Rendec <rrendec@redhat.com>
> > ---
> >  arch/arm64/kernel/cacheinfo.c | 32 ++++++++++++++++++++++++--------
> >  1 file changed, 24 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
> > index c307f69e9b55..520d17e4ebe9 100644
> > --- a/arch/arm64/kernel/cacheinfo.c
> > +++ b/arch/arm64/kernel/cacheinfo.c
> > @@ -38,21 +38,37 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
> >         this_leaf->type = type;
> >  }
> >  
> > -int init_cache_level(unsigned int cpu)
> > +static void detect_cache_level(unsigned int *level, unsigned int *leaves)
> >  {
> > -       unsigned int ctype, level, leaves;
> > -       int fw_level, ret;
> > -       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> > +       unsigned int ctype;
> >  
> > -       for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
> > -               ctype = get_cache_type(level);
> > +       for (*level = 1, *leaves = 0; *level <= MAX_CACHE_LEVEL; (*level)++) {
> > +               ctype = get_cache_type(*level);
> >                 if (ctype == CACHE_TYPE_NOCACHE) {
> > -                       level--;
> > +                       (*level)--;
> >                         break;
> >                 }
> >                 /* Separate instruction and data caches */
> > -               leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
> > +               *leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
> >         }
> > +}
> 
> I prefer to use locals and assign the value to keep it simple/easy to follow.
> Compiler can/will optimise this anyway. But I am fine either way.

To be honest, I was on the fence about this and decided to go with the
pointers, but now that you brought it up, I changed my mind :)

If I keep the original names for the locals and use different names for
the arguments, the patch will look cleaner and it will be obvious to
anyone looking at it that the algorithm for counting the levels/leaves
is unchanged.
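
A minimal sketch of what that could look like, assuming nothing about the next version of the patch: the argument names level_p and leaves_p are hypothetical, and MAX_CACHE_LEVEL, the cache type constants and get_cache_type() are userspace stand-ins (with a made-up cache layout) so that the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel definitions (assumed values). */
#define MAX_CACHE_LEVEL 7

enum cache_type {
	CACHE_TYPE_NOCACHE = 0,
	CACHE_TYPE_INST = 1,
	CACHE_TYPE_DATA = 2,
	CACHE_TYPE_SEPARATE = 3,	/* instruction + data */
	CACHE_TYPE_UNIFIED = 4,
};

/* Made-up hardware: split L1, unified L2 and L3, nothing above. */
static enum cache_type get_cache_type(unsigned int level)
{
	static const enum cache_type fake[] = {
		CACHE_TYPE_SEPARATE, CACHE_TYPE_UNIFIED, CACHE_TYPE_UNIFIED,
	};

	if (level >= 1 && level <= sizeof(fake) / sizeof(fake[0]))
		return fake[level - 1];
	return CACHE_TYPE_NOCACHE;
}

static void detect_cache_level(unsigned int *level_p, unsigned int *leaves_p)
{
	unsigned int ctype, level, leaves;

	assert(level_p != NULL && leaves_p != NULL);

	/* Same loop as the original init_cache_level(), on locals. */
	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
		ctype = get_cache_type(level);
		if (ctype == CACHE_TYPE_NOCACHE) {
			level--;
			break;
		}
		/* Separate instruction and data caches */
		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
	}

	/* Publish the results only once counting is done. */
	*level_p = level;
	*leaves_p = leaves;
}
```

With the stubbed layout above, detect_cache_level() reports 3 levels and 4 leaves (two for the split L1, one each for L2 and L3), and the loop body is line-for-line the one removed from init_cache_level().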

Best regards,
Radu

> I need Will's (or Catalin's) ack if I have to take the changes via
> Greg's tree.
>