[PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens

James Clark posted 4 patches 11 months ago
[PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens
Posted by James Clark 11 months ago
Opening and closing tokens can also match on words with common prefixes
like "Endsysreg" vs "EndsysregFields". This could potentially make the
script go wrong in weird ways so make it fall through to the fatal
unhandled statement catcher if it doesn't fully match the current
block.

Closing ones also get expect_fields(1) to ensure nothing other than
whitespace follows.

Signed-off-by: James Clark <james.clark@linaro.org>
---
 arch/arm64/tools/gen-sysreg.awk | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
index 1a2afc9fdd42..f2a1732cb1f6 100755
--- a/arch/arm64/tools/gen-sysreg.awk
+++ b/arch/arm64/tools/gen-sysreg.awk
@@ -111,7 +111,7 @@ END {
 /^$/ { next }
 /^[\t ]*#/ { next }
 
-/^SysregFields/ && block_current() == "Root" {
+$1 == "SysregFields" && block_current() == "Root" {
 	block_push("SysregFields")
 
 	expect_fields(2)
@@ -127,7 +127,8 @@ END {
 	next
 }
 
-/^EndSysregFields/ && block_current() == "SysregFields" {
+$1 == "EndSysregFields" && block_current() == "SysregFields" {
+	expect_fields(1)
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
@@ -145,7 +146,7 @@ END {
 	next
 }
 
-/^Sysreg/ && block_current() == "Root" {
+$1 == "Sysreg" && block_current() == "Root" {
 	block_push("Sysreg")
 
 	expect_fields(7)
@@ -177,7 +178,8 @@ END {
 	next
 }
 
-/^EndSysreg/ && block_current() == "Sysreg" {
+$1 == "EndSysreg" && block_current() == "Sysreg" {
+	expect_fields(1)
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
@@ -206,7 +208,7 @@ END {
 
 # Currently this is effectivey a comment, in future we may want to emit
 # defines for the fields.
-(/^Fields/ || /^Mapping/) && block_current() == "Sysreg" {
+($1 == "Fields" || $1 == "Mapping") && block_current() == "Sysreg" {
 	expect_fields(2)
 
 	if (next_bit != 63)
@@ -224,7 +226,7 @@ END {
 }
 
 
-/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Res0" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES0", $2)
 	field = "RES0_" msb "_" lsb
@@ -234,7 +236,7 @@ END {
 	next
 }
 
-/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Res1" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES1", $2)
 	field = "RES1_" msb "_" lsb
@@ -244,7 +246,7 @@ END {
 	next
 }
 
-/^Unkn/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Unkn" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "UNKN", $2)
 	field = "UNKN_" msb "_" lsb
@@ -254,7 +256,7 @@ END {
 	next
 }
 
-/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Field" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -265,14 +267,14 @@ END {
 	next
 }
 
-/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Raz" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, field, $2)
 
 	next
 }
 
-/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "SignedEnum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	block_push("Enum")
 
 	expect_fields(3)
@@ -285,7 +287,7 @@ END {
 	next
 }
 
-/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "UnsignedEnum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	block_push("Enum")
 
 	expect_fields(3)
@@ -298,7 +300,7 @@ END {
 	next
 }
 
-/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+$1 == "Enum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	block_push("Enum")
 
 	expect_fields(3)
@@ -310,7 +312,8 @@ END {
 	next
 }
 
-/^EndEnum/ && block_current() == "Enum" {
+$1 == "EndEnum" && block_current() == "Enum" {
+	expect_fields(1)
 
 	field = null
 	msb = null
-- 
2.34.1
Re: [PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens
Posted by Mark Rutland 9 months ago
On Wed, Jan 15, 2025 at 04:25:56PM +0000, James Clark wrote:
> Opening and closing tokens can also match on words with common prefixes
> like "Endsysreg" vs "EndsysregFields". This could potentially make the
> script go wrong in weird ways so make it fall through to the fatal
> unhandled statement catcher if it doesn't fully match the current
> block.
> 
> Closing ones also get expect_fields(1) to ensure nothing other than
> whitespace follows.
> 
> Signed-off-by: James Clark <james.clark@linaro.org>

In retrospect, this is what I should have wirrten originally. Thanks for
the fix!

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/tools/gen-sysreg.awk | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> index 1a2afc9fdd42..f2a1732cb1f6 100755
> --- a/arch/arm64/tools/gen-sysreg.awk
> +++ b/arch/arm64/tools/gen-sysreg.awk
> @@ -111,7 +111,7 @@ END {
>  /^$/ { next }
>  /^[\t ]*#/ { next }
>  
> -/^SysregFields/ && block_current() == "Root" {
> +$1 == "SysregFields" && block_current() == "Root" {
>  	block_push("SysregFields")
>  
>  	expect_fields(2)
> @@ -127,7 +127,8 @@ END {
>  	next
>  }
>  
> -/^EndSysregFields/ && block_current() == "SysregFields" {
> +$1 == "EndSysregFields" && block_current() == "SysregFields" {
> +	expect_fields(1)
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> @@ -145,7 +146,7 @@ END {
>  	next
>  }
>  
> -/^Sysreg/ && block_current() == "Root" {
> +$1 == "Sysreg" && block_current() == "Root" {
>  	block_push("Sysreg")
>  
>  	expect_fields(7)
> @@ -177,7 +178,8 @@ END {
>  	next
>  }
>  
> -/^EndSysreg/ && block_current() == "Sysreg" {
> +$1 == "EndSysreg" && block_current() == "Sysreg" {
> +	expect_fields(1)
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> @@ -206,7 +208,7 @@ END {
>  
>  # Currently this is effectivey a comment, in future we may want to emit
>  # defines for the fields.
> -(/^Fields/ || /^Mapping/) && block_current() == "Sysreg" {
> +($1 == "Fields" || $1 == "Mapping") && block_current() == "Sysreg" {
>  	expect_fields(2)
>  
>  	if (next_bit != 63)
> @@ -224,7 +226,7 @@ END {
>  }
>  
>  
> -/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Res0" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES0", $2)
>  	field = "RES0_" msb "_" lsb
> @@ -234,7 +236,7 @@ END {
>  	next
>  }
>  
> -/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Res1" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES1", $2)
>  	field = "RES1_" msb "_" lsb
> @@ -244,7 +246,7 @@ END {
>  	next
>  }
>  
> -/^Unkn/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Unkn" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "UNKN", $2)
>  	field = "UNKN_" msb "_" lsb
> @@ -254,7 +256,7 @@ END {
>  	next
>  }
>  
> -/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Field" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -265,14 +267,14 @@ END {
>  	next
>  }
>  
> -/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Raz" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, field, $2)
>  
>  	next
>  }
>  
> -/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "SignedEnum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	block_push("Enum")
>  
>  	expect_fields(3)
> @@ -285,7 +287,7 @@ END {
>  	next
>  }
>  
> -/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "UnsignedEnum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	block_push("Enum")
>  
>  	expect_fields(3)
> @@ -298,7 +300,7 @@ END {
>  	next
>  }
>  
> -/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +$1 == "Enum" && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	block_push("Enum")
>  
>  	expect_fields(3)
> @@ -310,7 +312,8 @@ END {
>  	next
>  }
>  
> -/^EndEnum/ && block_current() == "Enum" {
> +$1 == "EndEnum" && block_current() == "Enum" {
> +	expect_fields(1)
>  
>  	field = null
>  	msb = null
> -- 
> 2.34.1
>
Re: [PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens
Posted by Will Deacon 9 months ago
On Wed, Jan 15, 2025 at 04:25:56PM +0000, James Clark wrote:
> Opening and closing tokens can also match on words with common prefixes
> like "Endsysreg" vs "EndsysregFields". This could potentially make the
> script go wrong in weird ways so make it fall through to the fatal
> unhandled statement catcher if it doesn't fully match the current
> block.
> 
> Closing ones also get expect_fields(1) to ensure nothing other than
> whitespace follows.
> 
> Signed-off-by: James Clark <james.clark@linaro.org>
> ---
>  arch/arm64/tools/gen-sysreg.awk | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> index 1a2afc9fdd42..f2a1732cb1f6 100755
> --- a/arch/arm64/tools/gen-sysreg.awk
> +++ b/arch/arm64/tools/gen-sysreg.awk
> @@ -111,7 +111,7 @@ END {
>  /^$/ { next }
>  /^[\t ]*#/ { next }
>  
> -/^SysregFields/ && block_current() == "Root" {
> +$1 == "SysregFields" && block_current() == "Root" {

Stylistic nit, but could you just do:

	/^SysregFields$/ && block_current() == "Root" {

instead? That way the diff is smaller (well, same number of lines) and
you avoid the ugly $1.

But I'm not an Orc, so who knows...

Will
Re: [PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens
Posted by Marc Zyngier 9 months ago
On 2025-03-13 21:56, Will Deacon wrote:
> On Wed, Jan 15, 2025 at 04:25:56PM +0000, James Clark wrote:
>> Opening and closing tokens can also match on words with common 
>> prefixes
>> like "Endsysreg" vs "EndsysregFields". This could potentially make the
>> script go wrong in weird ways so make it fall through to the fatal
>> unhandled statement catcher if it doesn't fully match the current
>> block.
>> 
>> Closing ones also get expect_fields(1) to ensure nothing other than
>> whitespace follows.
>> 
>> Signed-off-by: James Clark <james.clark@linaro.org>
>> ---
>>  arch/arm64/tools/gen-sysreg.awk | 31 +++++++++++++++++--------------
>>  1 file changed, 17 insertions(+), 14 deletions(-)
>> 
>> diff --git a/arch/arm64/tools/gen-sysreg.awk 
>> b/arch/arm64/tools/gen-sysreg.awk
>> index 1a2afc9fdd42..f2a1732cb1f6 100755
>> --- a/arch/arm64/tools/gen-sysreg.awk
>> +++ b/arch/arm64/tools/gen-sysreg.awk
>> @@ -111,7 +111,7 @@ END {
>>  /^$/ { next }
>>  /^[\t ]*#/ { next }
>> 
>> -/^SysregFields/ && block_current() == "Root" {
>> +$1 == "SysregFields" && block_current() == "Root" {
> 
> Stylistic nit, but could you just do:
> 
> 	/^SysregFields$/ && block_current() == "Root" {
> 
> instead? That way the diff is smaller (well, same number of lines) and
> you avoid the ugly $1.

The code is trying to match the first field of a line such as:

SysregFields	ZCR_ELx

while you seem to try and match a SysregFields all alone on a line.

That being said, my perl-foo is sub-zero, so I may be very wrong myself.

         M.
-- 
Jazz is not dead. It just smells funny...
Re: [PATCH v2 2/4] arm64/sysreg: Enforce whole word match for open/close tokens
Posted by Will Deacon 9 months ago
On Fri, Mar 14, 2025 at 07:40:36AM +0000, Marc Zyngier wrote:
> On 2025-03-13 21:56, Will Deacon wrote:
> > On Wed, Jan 15, 2025 at 04:25:56PM +0000, James Clark wrote:
> > > Opening and closing tokens can also match on words with common
> > > prefixes
> > > like "Endsysreg" vs "EndsysregFields". This could potentially make the
> > > script go wrong in weird ways so make it fall through to the fatal
> > > unhandled statement catcher if it doesn't fully match the current
> > > block.
> > > 
> > > Closing ones also get expect_fields(1) to ensure nothing other than
> > > whitespace follows.
> > > 
> > > Signed-off-by: James Clark <james.clark@linaro.org>
> > > ---
> > >  arch/arm64/tools/gen-sysreg.awk | 31 +++++++++++++++++--------------
> > >  1 file changed, 17 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/arch/arm64/tools/gen-sysreg.awk
> > > b/arch/arm64/tools/gen-sysreg.awk
> > > index 1a2afc9fdd42..f2a1732cb1f6 100755
> > > --- a/arch/arm64/tools/gen-sysreg.awk
> > > +++ b/arch/arm64/tools/gen-sysreg.awk
> > > @@ -111,7 +111,7 @@ END {
> > >  /^$/ { next }
> > >  /^[\t ]*#/ { next }
> > > 
> > > -/^SysregFields/ && block_current() == "Root" {
> > > +$1 == "SysregFields" && block_current() == "Root" {
> > 
> > Stylistic nit, but could you just do:
> > 
> > 	/^SysregFields$/ && block_current() == "Root" {
> > 
> > instead? That way the diff is smaller (well, same number of lines) and
> > you avoid the ugly $1.
> 
> The code is trying to match the first field of a line such as:
> 
> SysregFields	ZCR_ELx
> 
> while you seem to try and match a SysregFields all alone on a line.
> 
> That being said, my perl-foo is sub-zero, so I may be very wrong myself.


Ah, and this is *awk* so $1 is the first field.

So my suggestion is totally broken and the original patch looks correct:

Acked-by: Will Deacon <will@kernel.org>

Will