[PATCH] docs/devel/loads-stores: Fix git grep regexes

Peter Maydell posted 1 patch 1 year, 2 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230904161703.3996734-1-peter.maydell@linaro.org
docs/devel/loads-stores.rst | 40 ++++++++++++++++++-------------------
1 file changed, 20 insertions(+), 20 deletions(-)
[PATCH] docs/devel/loads-stores: Fix git grep regexes
Posted by Peter Maydell 1 year, 2 months ago
The loads-and-stores documentation includes git grep regexes to find
occurrences of the various functions.  Some of these regexes have
errors, typically failing to escape the '?', '(' and ')' when they
should be metacharacters (since these are POSIX basic REs). We also
weren't consistent about whether to have a ':' on the end of the
line introducing the list of regexes in each section.

Fix the errors.

The following shell rune will complain about any REs in the
file which don't have any matches in the codebase:
 for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 docs/devel/loads-stores.rst | 40 ++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
index dab6dfa0acc..ec627aa9c06 100644
--- a/docs/devel/loads-stores.rst
+++ b/docs/devel/loads-stores.rst
@@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value
 of size ``sz`` bytes.
 
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>``
  - ``\<st[bwlq]\(_[hbl]e\)\?_p\>``
  - ``\<st24\(_[hbl]e\)\?_p\>``
- - ``\<ldn_\([hbl]e\)?_p\>``
- - ``\<stn_\([hbl]e\)?_p\>``
+ - ``\<ldn_\([hbl]e\)\?_p\>``
+ - ``\<stn_\([hbl]e\)\?_p\>``
 
 ``cpu_{ld,st}*_mmu``
 ~~~~~~~~~~~~~~~~~~~~
@@ -121,8 +121,8 @@ store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\<cpu_ld[bwlq](_[bl]e)\?_mmu\>``
- - ``\<cpu_st[bwlq](_[bl]e)\?_mmu\>``
+ - ``\<cpu_ld[bwlq]\(_[bl]e\)\?_mmu\>``
+ - ``\<cpu_st[bwlq]\(_[bl]e\)\?_mmu\>``
 
 
 ``cpu_{ld,st}*_mmuidx_ra``
@@ -155,8 +155,8 @@ store: ``cpu_st{size}{end}_mmuidx_ra(env, ptr, val, mmuidx, retaddr)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_mmuidx_ra\>``
- - ``\<cpu_st[bwlq](_[bl]e)\?_mmuidx_ra\>``
+ - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_mmuidx_ra\>``
+ - ``\<cpu_st[bwlq]\(_[bl]e\)\?_mmuidx_ra\>``
 
 ``cpu_{ld,st}*_data_ra``
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -193,8 +193,8 @@ store: ``cpu_st{size}{end}_data_ra(env, ptr, val, ra)``
  - ``_le`` : little endian
 
 Regexes for git grep:
- - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_data_ra\>``
- - ``\<cpu_st[bwlq](_[bl]e)\?_data_ra\>``
+ - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_data_ra\>``
+ - ``\<cpu_st[bwlq]\(_[bl]e\)\?_data_ra\>``
 
 ``cpu_{ld,st}*_data``
 ~~~~~~~~~~~~~~~~~~~~~
@@ -231,9 +231,9 @@ store: ``cpu_st{size}{end}_data(env, ptr, val)``
  - ``_be`` : big endian
  - ``_le`` : little endian
 
-Regexes for git grep
- - ``\<cpu_ld[us]\?[bwlq](_[bl]e)\?_data\>``
- - ``\<cpu_st[bwlq](_[bl]e)\?_data\+\>``
+Regexes for git grep:
+ - ``\<cpu_ld[us]\?[bwlq]\(_[bl]e\)\?_data\>``
+ - ``\<cpu_st[bwlq]\(_[bl]e\)\?_data\+\>``
 
 ``cpu_ld*_code``
 ~~~~~~~~~~~~~~~~
@@ -296,7 +296,7 @@ swap: ``translator_ld{sign}{size}_swap(env, ptr, swap)``
  - ``l`` : 32 bits
  - ``q`` : 64 bits
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<translator_ld[us]\?[bwlq]\(_swap\)\?\>``
 
 ``helper_{ld,st}*_mmu``
@@ -325,7 +325,7 @@ store: ``helper_{size}_mmu(env, addr, val, opindex, retaddr)``
  - ``l`` : 32 bits
  - ``q`` : 64 bits
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<helper_ld[us]\?[bwlq]_mmu\>``
  - ``\<helper_st[bwlq]_mmu\>``
 
@@ -382,7 +382,7 @@ succeeded using a MemTxResult return code.
 
 The ``_{endian}`` suffix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<address_space_\(read\|write\|rw\)\>``
  - ``\<address_space_ldu\?[bwql]\(_[lb]e\)\?\>``
  - ``\<address_space_st[bwql]\(_[lb]e\)\?\>``
@@ -400,7 +400,7 @@ Note that portions of the write which attempt to write data to a
 device will be silently ignored -- only real RAM and ROM will
 be written to.
 
-Regexes for git grep
+Regexes for git grep:
  - ``address_space_write_rom``
 
 ``{ld,st}*_phys``
@@ -438,7 +438,7 @@ device doing the access has no way to report such an error.
 
 The ``_{endian}_`` infix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<ldu\?[bwlq]\(_[bl]e\)\?_phys\>``
  - ``\<st[bwlq]\(_[bl]e\)\?_phys\>``
 
@@ -462,7 +462,7 @@ For new code they are better avoided:
 
 ``cpu_physical_memory_rw``
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<cpu_physical_memory_\(read\|write\|rw\)\>``
 
 ``cpu_memory_rw_debug``
@@ -497,7 +497,7 @@ make sure our existing code is doing things correctly.
 
 ``dma_memory_rw``
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<dma_memory_\(read\|write\|rw\)\>``
  - ``\<ldu\?[bwlq]\(_[bl]e\)\?_dma\>``
  - ``\<st[bwlq]\(_[bl]e\)\?_dma\>``
@@ -538,7 +538,7 @@ correct address space for that device.
 
 The ``_{endian}_`` infix is omitted for byte accesses.
 
-Regexes for git grep
+Regexes for git grep:
  - ``\<pci_dma_\(read\|write\|rw\)\>``
  - ``\<ldu\?[bwlq]\(_[bl]e\)\?_pci_dma\>``
  - ``\<st[bwlq]\(_[bl]e\)\?_pci_dma\>``
-- 
2.34.1
Re: [PATCH] docs/devel/loads-stores: Fix git grep regexes
Posted by Eric Blake 1 year, 2 months ago
On Mon, Sep 04, 2023 at 05:17:03PM +0100, Peter Maydell wrote:
> The loads-and-stores documentation includes git grep regexes to find
> occurrences of the various functions.  Some of these regexes have
> errors, typically failing to escape the '?', '(' and ')' when they
> should be metacharacters (since these are POSIX basic REs). We also
> weren't consistent about whether to have a ':' on the end of the
> line introducing the list of regexes in each section.
> 
> Fix the errors.
> 
> The following shell rune will complain about any REs in the
> file which don't have any matches in the codebase:
>  for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  docs/devel/loads-stores.rst | 40 ++++++++++++++++++-------------------
>  1 file changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
> index dab6dfa0acc..ec627aa9c06 100644
> --- a/docs/devel/loads-stores.rst
> +++ b/docs/devel/loads-stores.rst
> @@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value
>  of size ``sz`` bytes.
>  
>  
> -Regexes for git grep
> +Regexes for git grep:
>   - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>``

This claims that ldul_be_p() is a valid function name (which I would
expect to take a pointer to a 32-bit integer and produce an unsigned
result suitable for assigning into a 64-bit value).  But it does not
exist, and the fact that ldl_be_p() returns 'int' means I had to add a
cast to avoid unintended sign-extension:

https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05234.html

cast added in relation to v5 patch at
https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg04923.html

>   - ``\<st[bwlq]\(_[hbl]e\)\?_p\>``
>   - ``\<st24\(_[hbl]e\)\?_p\>``
> - - ``\<ldn_\([hbl]e\)?_p\>``
> - - ``\<stn_\([hbl]e\)?_p\>``
> + - ``\<ldn_\([hbl]e\)\?_p\>``
> + - ``\<stn_\([hbl]e\)\?_p\>``

So as long as we are touching the docs, is it worth considering the
larger task of auditing whether it is appropriate to have all of the
ld*_ functions return unsigned values, and/or implement ldu/lds
variants that guarantee zero or sign extension for widening 32-bit
values when assigning to 64-bit destinations?

>  
>  ``cpu_{ld,st}*_mmu``
>  ~~~~~~~~~~~~~~~~~~~~
> @@ -121,8 +121,8 @@ store: ``cpu_st{size}{end}_mmu(env, ptr, val, oi, retaddr)``
>   - ``_le`` : little endian
>  
>  Regexes for git grep:
> - - ``\<cpu_ld[bwlq](_[bl]e)\?_mmu\>``

As a counterpoint, the cpu_ldl_* functions already return uint32_t.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH] docs/devel/loads-stores: Fix git grep regexes
Posted by Peter Maydell 1 year, 2 months ago
On Tue, 5 Sept 2023 at 15:31, Eric Blake <eblake@redhat.com> wrote:
>
> On Mon, Sep 04, 2023 at 05:17:03PM +0100, Peter Maydell wrote:
> > The loads-and-stores documentation includes git grep regexes to find
> > occurrences of the various functions.  Some of these regexes have
> > errors, typically failing to escape the '?', '(' and ')' when they
> > should be metacharacters (since these are POSIX basic REs). We also
> > weren't consistent about whether to have a ':' on the end of the
> > line introducing the list of regexes in each section.
> >
> > Fix the errors.
> >
> > The following shell rune will complain about any REs in the
> > file which don't have any matches in the codebase:
> >  for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> >  docs/devel/loads-stores.rst | 40 ++++++++++++++++++-------------------
> >  1 file changed, 20 insertions(+), 20 deletions(-)
> >
> > diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
> > index dab6dfa0acc..ec627aa9c06 100644
> > --- a/docs/devel/loads-stores.rst
> > +++ b/docs/devel/loads-stores.rst
> > @@ -63,12 +63,12 @@ which stores ``val`` to ``ptr`` as an ``{endian}`` order value
> >  of size ``sz`` bytes.
> >
> >
> > -Regexes for git grep
> > +Regexes for git grep:
> >   - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>``
>
> This claims that ldul_be_p() is a valid function name

No, it's not claiming that. It's just claiming that this
regex will catch all the function names defined in this
section, not that it will avoid matching on some non-existent
function names.

The documentation section above tells you what is actually
valid, and that says that "sign" is empty for 32 or 64 bit
accesses.

> (which I would
> expect to take a pointer to a 32-bit integer and produce an unsigned
> result suitable for assigning into a 64-bit value).  But it does not
> exist, and the fact that ldl_be_p() returns 'int' means I had to add a
> cast to avoid unintended sign-extension:
>
> https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg05234.html
>
> cast added in relation to v5 patch at
> https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg04923.html
>
> >   - ``\<st[bwlq]\(_[hbl]e\)\?_p\>``
> >   - ``\<st24\(_[hbl]e\)\?_p\>``
> > - - ``\<ldn_\([hbl]e\)?_p\>``
> > - - ``\<stn_\([hbl]e\)?_p\>``
> > + - ``\<ldn_\([hbl]e\)\?_p\>``
> > + - ``\<stn_\([hbl]e\)\?_p\>``
>
> So as long as we are touching the docs, is it worth considering the
> larger task of auditing whether it is appropriate to have all of the
> ld*_ functions return unsigned values, and/or implement ldu/lds
> variants that guarantee zero or sign extension for widening 32-bit
> values when assigning to 64-bit destinations?

No, I think it clearly is not. All I want to do here is
fix the busted regular expressions.

If you would like to try to tidy up some of the semantics
of the load/store APIs you're welcome to have a go at
that. The major obstacle is the obvious one that there
are an absolute ton of existing uses for all of these
API families.

thanks
-- PMM
Re: [PATCH] docs/devel/loads-stores: Fix git grep regexes
Posted by Philippe Mathieu-Daudé 1 year, 2 months ago
On 4/9/23 18:17, Peter Maydell wrote:
> The loads-and-stores documentation includes git grep regexes to find
> occurrences of the various functions.  Some of these regexes have
> errors, typically failing to escape the '?', '(' and ')' when they
> should be metacharacters (since these are POSIX basic REs). We also
> weren't consistent about whether to have a ':' on the end of the
> line introducing the list of regexes in each section.
> 
> Fix the errors.
> 
> The following shell rune will complain about any REs in the
> file which don't have any matches in the codebase:
>   for re in $(sed -ne 's/ - ``\(\\<.*\)``/\1/p' docs/devel/loads-stores.rst); do git grep -q "$re" || echo "no matches for re $re"; done
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>   docs/devel/loads-stores.rst | 40 ++++++++++++++++++-------------------
>   1 file changed, 20 insertions(+), 20 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>