[PATCH v2] docs/misra: document the expected sizes of integer types

Stefano Stabellini posted 1 patch 8 months, 1 week ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/alpine.DEB.2.22.394.2403141516550.853156@ubuntu-linux-20-04-desktop
There is a newer version of this series
Posted by Stefano Stabellini 8 months, 1 week ago
Xen makes assumptions about the size of integer types on the various
architectures. Document these assumptions.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
---
Changes in v2:
- add alignment info
---
 docs/misra/C-language-toolchain.rst | 58 +++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/docs/misra/C-language-toolchain.rst b/docs/misra/C-language-toolchain.rst
index b7c2000992..24d3c1cac6 100644
--- a/docs/misra/C-language-toolchain.rst
+++ b/docs/misra/C-language-toolchain.rst
@@ -480,4 +480,62 @@ The table columns are as follows:
      - See Section "4.13 Preprocessing Directives" of GCC_MANUAL and Section "11.1 Implementation-defined behavior" of CPP_MANUAL.
 
 
+Sizes of Integer types
+______________________
+
+.. list-table::
+   :widths: 10 10 10 45
+   :header-rows: 1
+
+   * - Type
+     - Size
+     - Alignment
+     - Architectures
+
+   * - char 
+     - 8 bits
+     - 8 bits
+     - all architectures
+
+   * - short
+     - 16 bits
+     - 16 bits
+     - all architectures
+
+   * - int
+     - 32 bits
+     - 32 bits
+     - all architectures
+
+   * - long
+     - 32 bits
+     - 32 bits 
+     - 32-bit architectures (x86_32, ARMv8-A AArch32, ARMv8-R AArch32)
+
+   * - long
+     - 64 bits
+     - 64 bits 
+     - 64-bit architectures (x86_64, ARMv8-A AArch64, RV64, PPC64)
+
+   * - long long
+     - 64 bits
+     - 32 bits
+     - x86_32
+
+   * - long long
+     - 64 bits
+     - 64 bits
+     - 64-bit architectures, ARMv8-A AArch32, ARMv8-R AArch32
+
+   * - pointer
+     - 32 bits
+     - 32 bits
+     - 32-bit architectures (x86_32, ARMv8-A AArch32, ARMv8-R AArch32)
+
+   * - pointer
+     - 64 bits
+     - 64 bits
+     - 64-bit architectures (x86_64, ARMv8-A AArch64, RV64, PPC64)
+
+
 END OF DOCUMENT.
-- 
2.25.1
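
For illustration only (this is not part of the posted patch): the sizes in the table above could in principle be enforced at build time. The sketch below assumes a C11 compiler and a Xen-style data model (ILP32 or LP64); `BITS_OF` and `long_bits` are hypothetical names, not existing Xen helpers, and the alignment column is only checked for `short` and `int`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* UINTPTR_MAX */

/* Hypothetical helper: width of a type in bits. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Rows marked "all architectures" in the table. */
static_assert(BITS_OF(char) == 8, "char is expected to be 8 bits");
static_assert(BITS_OF(short) == 16, "short is expected to be 16 bits");
static_assert(BITS_OF(int) == 32, "int is expected to be 32 bits");
static_assert(BITS_OF(long long) == 64, "long long is expected to be 64 bits");
static_assert(_Alignof(short) * CHAR_BIT == 16, "short alignment");
static_assert(_Alignof(int) * CHAR_BIT == 32, "int alignment");

/* long and pointers follow the word size of the architecture. */
#if UINTPTR_MAX == 0xffffffffu
static_assert(BITS_OF(long) == 32, "long is 32 bits on 32-bit targets");
static_assert(BITS_OF(void *) == 32, "pointers are 32 bits on 32-bit targets");
#else
static_assert(BITS_OF(long) == 64, "long is 64 bits on 64-bit targets");
static_assert(BITS_OF(void *) == 64, "pointers are 64 bits on 64-bit targets");
#endif

/* Run-time view of the same fact, handy for a quick sanity check. */
unsigned int long_bits(void)
{
    return (unsigned int)BITS_OF(long);
}
```

Under these assumptions a toolchain that deviates from the table fails to compile rather than misbehaving at run time.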
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months, 1 week ago
On 14.03.2024 23:17, Stefano Stabellini wrote:
> Xen makes assumptions about the size of integer types on the various
> architectures. Document these assumptions.

My prior reservation wrt exact vs minimum sizes remains. Additionally,
is it really meaningful to document x86-32 as an architecture, when it's
been many years that the hypervisor cannot be built anymore for that
target? If it's not (just) the hypervisor build that's intended to be
covered here (the file living under docs/misra/, after all), can that
further purpose please be mentioned?

Jan
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 8 months, 1 week ago
On Fri, 15 Mar 2024, Jan Beulich wrote:
> On 14.03.2024 23:17, Stefano Stabellini wrote:
> > Xen makes assumptions about the size of integer types on the various
> > architectures. Document these assumptions.
> 
> My prior reservation wrt exact vs minimum sizes remains.

We have to specify the exact size. In practice the size is predetermined
and exact with all our supported compilers given an architecture.

Most importantly, unfortunately we use non-fixed-size integer types in
C hypercall entry points and public ABIs. In my opinion, that is not
acceptable.

We have two options:

1) we go with this document, and we clarify that even if we specify
  "unsigned int", we actually mean a 32-bit integer

2) we change all our public ABIs and C hypercall entry points to use
   fixed-size types (e.g. s/unsigned int/uint32_t/g)

2) is preferred because it is clearer but it is more work. So I went
with 1). I also thought you would like 1) more.


> Additionally, is it really meaningful to document x86-32 as an
> architecture, when it's been many years that the hypervisor cannot be
> built anymore for that target?

You are right. I should take x86_32 out. I'll do it in the next version.
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months, 1 week ago
On 16.03.2024 01:07, Stefano Stabellini wrote:
> On Fri, 15 Mar 2024, Jan Beulich wrote:
>> On 14.03.2024 23:17, Stefano Stabellini wrote:
>>> Xen makes assumptions about the size of integer types on the various
>>> architectures. Document these assumptions.
>>
>> My prior reservation wrt exact vs minimum sizes remains.
> 
> We have to specify the exact size. In practice the size is predetermined
> and exact with all our supported compilers given an architecture.

But that's not the purpose of this document; if it was down to what
compilers offer, we could refer to compiler documentation (and iirc we
already do for various aspects). The purpose of this document, aiui,
is to document assumptions we make in hypervisor code. And those should
be >=, not ==.

> Most importantly, unfortunately we use non-fixed-size integer types in
> C hypercall entry points and public ABIs. In my opinion, that is not
> acceptable.

The problem is that I can't see the reason for you thinking so. The C
entry points sit past assembly code doing (required to do) necessary
adjustments, if any. If there was no assembly layer, whether to use
fixed-width types for such parameters would depend on what the
architecture guarantees.

As to public ABIs - that's structure definitions, and I agree we ought
to uniformly use fixed-width types there. We largely do; a few things
still require fixing.

> We have two options:
> 
> 1) we go with this document, and we clarify that even if we specify
>   "unsigned int", we actually mean a 32-bit integer
> 
> 2) we change all our public ABIs and C hypercall entry points to use
>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
> 
> 2) is preferred because it is clearer but it is more work. So I went
> with 1). I also thought you would like 1) more.

For ABIs (i.e. structures) we ought to be making that change anyway.
Leaving basic types in there is latently buggy.

I'm happy to see a document like this added, for the purpose described
above. But to me 1) and 2) are largely independent of one another.

Jan
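
As an illustration of why basic types in ABI structures are called latently buggy above, here is a minimal sketch. Both structures are hypothetical and do not appear in Xen's public headers. The layout of the first depends on the compiler's data model, while the second is the same everywhere thanks to fixed-width fields and explicit padding.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t, uint64_t */

/* Hypothetical ABI structure using basic C types: its size is 8 bytes on an
 * ILP32 target and 16 bytes on an LP64 target. */
struct example_op_plain {
    unsigned int  cmd;
    unsigned long arg;
};

/* Hypothetical fixed-width variant: same layout on every target. */
struct example_op_fixed {
    uint32_t cmd;
    uint32_t pad;   /* explicit padding, so nothing is left to the compiler */
    uint64_t arg;
};

static_assert(sizeof(struct example_op_fixed) == 16, "fixed layout is 16 bytes");
static_assert(offsetof(struct example_op_fixed, arg) == 8, "arg sits at offset 8");

size_t plain_size(void) { return sizeof(struct example_op_plain); }
size_t fixed_size(void) { return sizeof(struct example_op_fixed); }
```

A guest built with a different data model than the hypervisor would disagree with it about `example_op_plain`, but not about `example_op_fixed`.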
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 8 months, 1 week ago
On Mon, 18 Mar 2024, Jan Beulich wrote:
> On 16.03.2024 01:07, Stefano Stabellini wrote:
> > On Fri, 15 Mar 2024, Jan Beulich wrote:
> >> On 14.03.2024 23:17, Stefano Stabellini wrote:
> >>> Xen makes assumptions about the size of integer types on the various
> >>> architectures. Document these assumptions.
> >>
> >> My prior reservation wrt exact vs minimum sizes remains.
> > 
> > We have to specify the exact size. In practice the size is predetermined
> > and exact with all our supported compilers given an architecture.
> 
> But that's not the purpose of this document; if it was down to what
> compilers offer, we could refer to compiler documentation (and iirc we
> already do for various aspects). The purpose of this document, aiui,
> is to document assumptions we make in hypervisor code. And those should
> be >=, not ==.

Well... I guess the two of us are making different assumptions then :-)

Which is the reason why documenting assumptions is so important. More at
the bottom.


> > Most importantly, unfortunately we use non-fixed-size integer types in
> > C hypercall entry points and public ABIs. In my opinion, that is not
> > acceptable.
> 
> The problem is that I can't see the reason for you thinking so. The C
> entry points sit past assembly code doing (required to do) necessary
> adjustments, if any. If there was no assembly layer, whether to use
> fixed-width types for such parameters would depend on what the
> architecture guarantees.

This could be the source of the disagreement. I see the little assembly
code as not important, I consider it just like a little trampoline to
me. As we describe the hypercalls in C header files, I consider the C
functions the "official" hypercall entry points.

Also, as this is an ABI, I consider it mandatory to use clear width
definitions of all the types (whether with this document or with
fixed-width types, and fixed-width types are clearer and better) in both
the C header files that describe the ABI interfaces, as well as the C
entry points that correspond to them. E.g. I think we have to use
the same types in both do_sched_op and the hypercall description in
xen/include/public/sched.h.


> As to public ABIs - that's structure definitions, and I agree we ought
> to uniformly use fixed-width types there. We largely do; a few things
> still require fixing.

+1


> > We have two options:
> > 
> > 1) we go with this document, and we clarify that even if we specify
> >   "unsigned int", we actually mean a 32-bit integer
> > 
> > 2) we change all our public ABIs and C hypercall entry points to use
> >    fixed-size types (e.g. s/unsigned int/uint32_t/g)
> > 
> > 2) is preferred because it is clearer but it is more work. So I went
> > with 1). I also thought you would like 1) more.
> 
> For ABIs (i.e. structures) we ought to be making that change anyway.
> Leaving basic types in there is latently buggy.

I am glad we agree :-)

It is just that I also consider the C hypercall entry points as part of
the ABI


> I'm happy to see a document like this added, for the purpose described
> above. But to me 1) and 2) are largely independent of one another.

Good that you are also happy with a document like this.

The remaining question is: what about the rest of the C functions in Xen
that are certainly not part of an ABI?

Those are less critical, still this document should apply uniformly to
them too. I don't understand why you are making the >= width assumption
you mentioned at the top of the file when actually it is impossible to
exercise or test this assumption on any compiler or any architecture
that works with Xen. If it cannot be enabled, it hasn't been tested, and
it probably won't work.
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months, 1 week ago
On 19.03.2024 04:37, Stefano Stabellini wrote:
> On Mon, 18 Mar 2024, Jan Beulich wrote:
>> On 16.03.2024 01:07, Stefano Stabellini wrote:
>>> On Fri, 15 Mar 2024, Jan Beulich wrote:
>>>> On 14.03.2024 23:17, Stefano Stabellini wrote:
>>>>> Xen makes assumptions about the size of integer types on the various
>>>>> architectures. Document these assumptions.
>>>>
>>>> My prior reservation wrt exact vs minimum sizes remains.
>>>
>>> We have to specify the exact size. In practice the size is predetermined
>>> and exact with all our supported compilers given an architecture.
>>
>> But that's not the purpose of this document; if it was down to what
>> compilers offer, we could refer to compiler documentation (and iirc we
>> already do for various aspects). The purpose of this document, aiui,
>> is to document assumptions we make in hypervisor code. And those should
>> be >=, not ==.
> 
> Well... I guess the two of us are making different assumptions then :-)
> 
> Which is the reason why documenting assumptions is so important. More at
> the bottom.
> 
> 
>>> Most importantly, unfortunately we use non-fixed-size integer types in
>>> C hypercall entry points and public ABIs. In my opinion, that is not
>>> acceptable.
>>
>> The problem is that I can't see the reason for you thinking so. The C
>> entry points sit past assembly code doing (required to do) necessary
>> adjustments, if any. If there was no assembly layer, whether to use
>> fixed-width types for such parameters would depend on what the
>> architecture guarantees.
> 
> This could be the source of the disagreement. I see the little assembly
> code as not important, I consider it just like a little trampoline to
> me. As we describe the hypercalls in C header files, I consider the C
> functions the "official" hypercall entry points.

Why would that be? Any code we execute in Xen is relevant.

> Also, as this is an ABI, I consider mandatory to use clear width
> definitions of all the types (whether with this document or with
> fixed-width types, and fixed-width types are clearer and better) in both
> the C header files that describe the ABI interfaces, as well as the C
> entry points that corresponds to it. E.g. I think we have to use
> the same types in both do_sched_op and the hypercall description in
> xen/include/public/sched.h

There are two entirely separate aspects to the ABI: One is what we
document towards consumers of it. The other is entirely internal, i.e.
an implementation detail - how we actually consume the data.
Documenting fixed-width types towards consumers is probably okay,
albeit (see below) imo still not strictly necessary (for being
needlessly limiting).

>> As to public ABIs - that's structure definitions, and I agree we ought
>> to uniformly use fixed-width types there. We largely do; a few things
>> still require fixing.
> 
> +1
> 
> 
>>> We have two options:
>>>
>>> 1) we go with this document, and we clarify that even if we specify
>>>   "unsigned int", we actually mean a 32-bit integer
>>>
>>> 2) we change all our public ABIs and C hypercall entry points to use
>>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
>>>
>>> 2) is preferred because it is clearer but it is more work. So I went
>>> with 1). I also thought you would like 1) more.
>>
>> For ABIs (i.e. structures) we ought to be making that change anyway.
>> Leaving basic types in there is latently buggy.
> 
> I am glad we agree :-)
> 
> It is just that I also consider the C hypercall entry points as part of
> the ABI
> 
> 
>> I'm happy to see a document like this added, for the purpose described
>> above. But to me 1) and 2) are largely independent of one another.
> 
> Good that you are also happy with a document like this.
> 
> The remaining question is: what about the rest of the C functions in Xen
> that are certainly not part of an ABI?

As per above - anything internal isn't part of the ABI, C entry points
for hypercall handlers included. All we need to ensure is that we consume
the data according to what the ABI sets forth.

To use wording from George when he criticized my supposed lack of actual
arguments: While there's nothing technically wrong with using fixed
width types there (or in fact everywhere), there's also nothing technically
wrong with using plain C types there and almost everywhere else (ABI
structures excluded). With both technically equal, ./CODING_STYLE has the
only criteria to pick between the two. IOW that's what I view wrong in
George's argumentation: Demanding that I provide technical arguments when
the desire to use fixed width types for the purpose under discussion also
isn't backed by any.

> Those are less critical, still this document should apply uniformly to
> them too. I don't understand why you are making the >= width assumption
> you mentioned at the top of the file when actually it is impossible to
> exercise or test this assumption on any compiler or any architecture
> that works with Xen. If it cannot be enabled, it hasn't been tested, and
> it probably won't work.

Hmm, yes, that's one way to look at it. My perspective is different though:
By writing down assumptions that are more strict than necessary, we'd be
excluding ports to environments meeting the >= assumption, but not meeting
the == one. Unless of course you can point me at any place where - not
just by mistake / by being overly lax - we truly depend on the == that you
want to put in place. IOW yes, there likely would need to be adjustments
to code if such a port was to happen. Yet we shouldn't further harden
requirements that were never meant to be there.

Note that by writing down anything more strict than necessary, you'd also
encourage people to further wrongly treat e.g. uint32_t and unsigned int
as identical. Such wrong assumptions had been a severe hindrance in doing
ports from 32- to 64-bit processors some 20 years ago. I would have hoped
that we'd learn from such mistakes.

Jan
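
To make the >= versus == distinction concrete, here is a minimal sketch that is not taken from Xen; both function names are hypothetical. The first helper silently depends on `unsigned int` being exactly 32 bits wide, the second only needs it to be at least that wide.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/* The weaker (>=) assumption: unsigned int can hold any 32-bit value. */
static_assert(sizeof(unsigned int) * CHAR_BIT >= 32,
              "unsigned int must be at least 32 bits");

/* Depends on ==: the conversion drops the upper half only because
 * unsigned int happens to be exactly 32 bits wide. */
unsigned int low32_implicit(unsigned long long v)
{
    return (unsigned int)v;
}

/* Needs only >=: the mask makes the intended 32-bit truncation explicit,
 * so the result stays the same if unsigned int were wider. */
unsigned int low32_explicit(unsigned long long v)
{
    return (unsigned int)(v & 0xffffffffULL);
}
```

On today's supported targets both helpers return the same value. They would diverge only on a port where `unsigned int` is wider than 32 bits, which is the kind of hidden dependency the paragraph above warns about.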
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 8 months, 1 week ago
On Tue, 19 Mar 2024, Jan Beulich wrote:
> On 19.03.2024 04:37, Stefano Stabellini wrote:
> > On Mon, 18 Mar 2024, Jan Beulich wrote:
> >> On 16.03.2024 01:07, Stefano Stabellini wrote:
> >>> On Fri, 15 Mar 2024, Jan Beulich wrote:
> >>>> On 14.03.2024 23:17, Stefano Stabellini wrote:
> >>>>> Xen makes assumptions about the size of integer types on the various
> >>>>> architectures. Document these assumptions.
> >>>>
> >>>> My prior reservation wrt exact vs minimum sizes remains.
> >>>
> >>> We have to specify the exact size. In practice the size is predetermined
> >>> and exact with all our supported compilers given an architecture.
> >>
> >> But that's not the purpose of this document; if it was down to what
> >> compilers offer, we could refer to compiler documentation (and iirc we
> >> already do for various aspects). The purpose of this document, aiui,
> >> is to document assumptions we make in hypervisor code. And those should
> >> be >=, not ==.
> > 
> > Well... I guess the two of us are making different assumptions then :-)
> > 
> > Which is the reason why documenting assumptions is so important. More at
> > the bottom.
> > 
> > 
> >>> Most importantly, unfortunately we use non-fixed-size integer types in
> >>> C hypercall entry points and public ABIs. In my opinion, that is not
> >>> acceptable.
> >>
> >> The problem is that I can't see the reason for you thinking so. The C
> >> entry points sit past assembly code doing (required to do) necessary
> >> adjustments, if any. If there was no assembly layer, whether to use
> >> fixed-width types for such parameters would depend on what the
> >> architecture guarantees.
> > 
> > This could be the source of the disagreement. I see the little assembly
> > code as not important, I consider it just like a little trampoline to
> > me. As we describe the hypercalls in C header files, I consider the C
> > functions the "official" hypercall entry points.
> 
> Why would that be? Any code we execute in Xen is relevant.

There are a few reasons:

- the public interface is described in a C header so it makes sense for
  the corresponding implementation to be in C

- the C entry point is often both the entry point in C and also common
  code

- depending on the architecture, there is typically always some minimal
  assembly entry code to prepare the environment before we can jump into
  C-land; still one wouldn't consider those minimal and routine assembly
  operations to be a meaningful hypercall entry point corresponding to
  the C declaration in the public headers

- as per MISRA and also general good practice, we need the declaration
  in the public header files to match the definition in C


> > Also, as this is an ABI, I consider mandatory to use clear width
> > definitions of all the types (whether with this document or with
> > fixed-width types, and fixed-width types are clearer and better) in both
> > the C header files that describe the ABI interfaces, as well as the C
> > entry points that corresponds to it. E.g. I think we have to use
> > the same types in both do_sched_op and the hypercall description in
> > xen/include/public/sched.h
> 
> There are two entirely separate aspects to the ABI: One is what we
> document towards consumers of it. The other is entirely internal, i.e.
> an implementation detail - how we actually consume the data.
> Documenting fixed-width types towards consumers is probably okay,
> albeit (see below) imo still not strictly necessary (for being
> needlessly limiting).

I don't see it this way.

As the Xen public interface description is in C and used during the
build, my opinion is that the public description and the C definition
need to match.

Also, I don't understand how you can say that public interfaces don't
strictly necessarily have to use fixed-width types.

Imagine that you use native types with different compilers that can
actually output different width integer sizes (which is not possible
today with gcc or clang). Imagine that a guest is written in a language
other than C (e.g. Java) based on the public interface description. It
cannot work correctly, can it?

I don't see how we can possibly have a public interface with anything
other than fixed-width integers.


> >> As to public ABIs - that's structure definitions, and I agree we ought
> >> to uniformly use fixed-width types there. We largely do; a few things
> >> still require fixing.
> > 
> > +1
> > 
> > 
> >>> We have two options:
> >>>
> >>> 1) we go with this document, and we clarify that even if we specify
> >>>   "unsigned int", we actually mean a 32-bit integer
> >>>
> >>> 2) we change all our public ABIs and C hypercall entry points to use
> >>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
> >>>
> >>> 2) is preferred because it is clearer but it is more work. So I went
> >>> with 1). I also thought you would like 1) more.
> >>
> >> For ABIs (i.e. structures) we ought to be making that change anyway.
> >> Leaving basic types in there is latently buggy.
> > 
> > I am glad we agree :-)
> > 
> > It is just that I also consider the C hypercall entry points as part of
> > the ABI
> > 
> > 
> >> I'm happy to see a document like this added, for the purpose described
> >> above. But to me 1) and 2) are largely independent of one another.
> > 
> > Good that you are also happy with a document like this.
> > 
> > The remaining question is: what about the rest of the C functions in Xen
> > that are certainly not part of an ABI?
> 
> As per above - anything internal isn't part of the ABI, C entry points
> for hypercall handlers included. All we need to ensure is that we consume
> the data according to what the ABI sets forth.

It doesn't look like we'll convince one another on this point. But let
me try another way.

In my view, having mismatched types between declaration and definition
and having non-fixed-width types in C hypercall entry points is really
bad for a number of reasons, among them:
- correctness
- risk of ABI breakage
- mismatch of declaration and definition

In your view, the drawback is not following the CODING_STYLE.

The two points of view on this subject don't have the same to lose. If
I were you, I would probably not invest my energy to defend the
CODING_STYLE.


> To use wording from George when he criticized my supposed lack of actual
> arguments: While there's nothing technically wrong with using fixed
> width types there (or in fact everywhere), there's also nothing technically
> wrong with using plain C types there and almost everywhere else (ABI
> structures excluded). With both technically equal, ./CODING_STYLE has the
> only criteria to pick between the two. IOW that's what I view wrong in
> George's argumentation: Demanding that I provide technical arguments when
> the desire to use fixed width types for the purpose under discussion also
> isn't backed by any.

I don't think we are in violation of the CODING_STYLE as it explicitly
accounts for exceptions. Public interface declarations and definitions
(hypercalls C entry points included) are an exception.

In my opinion, using fixed-width integers in public headers and C
definitions (including C hypercall entry points) is top priority for
correctness. Correctness is more important than style. So, if we need to
change the CODING_STYLE to get there, let's change the CODING_STYLE.


> > Those are less critical, still this document should apply uniformly to
> > them too. I don't understand why you are making the >= width assumption
> > you mentioned at the top of the file when actually it is impossible to
> > exercise or test this assumption on any compiler or any architecture
> > that works with Xen. If it cannot be enabled, it hasn't been tested, and
> > it probably won't work.
> 
> Hmm, yes, that's one way to look at it. My perspective is different though:
> By writing down assumptions that are more strict than necessary, we'd be
> excluding ports to environments meeting the >= assumption, but not meeting
> the == one. Unless of course you can point me at any place where - not
> just by mistake / by being overly lax - we truly depend on the == that you
> want to put in place. IOW yes, there likely would need to be adjustments
> to code if such a port was to happen. Yet we shouldn't further harden
> requirements that were never meant to be there.

I have already shown that all the current implementations and tests only
check for ==. In my opinion, this is sufficient evidence that >= is not
supported.

If you admit it probably wouldn't work without fixes today, would you
security-support such a configuration? Would you safety-support it? I
wouldn't want to buy a car running Xen compiled with a compiler using
integer sizes different from the ones written in this document.

Let me summarize our positions on these topics.

Agreed points:
- public interfaces should use fixed-width types
- it is a good idea to have a document describing our assumptions about
  integer types

Open decision points and misalignments:
- Should the C hypercall entry points match the public header
  declarations and ideally use fixed-width integer types? 

I'd say yes and I would argue for it

- Should the document describing our assumptions about integer types
  specify == (unsigned int == uint32_t) or >= (unsigned int >=
  uint32_t)?

I'd say specify == and I would argue for it
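
The first open point can be pictured with a small sketch. It is not Xen's actual entry code: `dispatch_example_op` stands in for the thin architecture-specific layer (written in C here only for readability) and `do_example_op` for a C handler, both hypothetical. The layer receives register-sized values and narrows them to the width the ABI specifies before the handler runs; the handler in this sketch uses plain C types.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */

/* A register-sized value must be able to carry the 32-bit command. */
static_assert(sizeof(unsigned long) >= sizeof(uint32_t),
              "registers must hold a 32-bit command");

/* Hypothetical C handler: accepts command 1, rejects everything else. */
long do_example_op(unsigned int cmd, unsigned long arg)
{
    (void)arg;                      /* unused in this sketch */
    return cmd == 1 ? 0 : -1;
}

/* Hypothetical stand-in for the entry layer: the ABI defines the command
 * as a 32-bit quantity, so any upper register bits are discarded here. */
long dispatch_example_op(unsigned long reg0, unsigned long reg1)
{
    uint32_t cmd = (uint32_t)reg0;

    return do_example_op(cmd, reg1);
}
```

Whether the handler's parameter should be spelled `unsigned int` or `uint32_t` is exactly the point under discussion; the sketch only shows where the narrowing to the ABI width can take place.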
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months, 1 week ago
On 20.03.2024 07:01, Stefano Stabellini wrote:
> On Tue, 19 Mar 2024, Jan Beulich wrote:
>> On 19.03.2024 04:37, Stefano Stabellini wrote:
>>> On Mon, 18 Mar 2024, Jan Beulich wrote:
>>>> On 16.03.2024 01:07, Stefano Stabellini wrote:
>>>>> On Fri, 15 Mar 2024, Jan Beulich wrote:
>>>>>> On 14.03.2024 23:17, Stefano Stabellini wrote:
>>>>>>> Xen makes assumptions about the size of integer types on the various
>>>>>>> architectures. Document these assumptions.
>>>>>>
>>>>>> My prior reservation wrt exact vs minimum sizes remains.
>>>>>
>>>>> We have to specify the exact size. In practice the size is predetermined
>>>>> and exact with all our supported compilers given an architecture.
>>>>
>>>> But that's not the purpose of this document; if it was down to what
>>>> compilers offer, we could refer to compiler documentation (and iirc we
>>>> already do for various aspects). The purpose of this document, aiui,
>>>> is to document assumptions we make in hypervisor code. And those should
>>>> be >=, not ==.
>>>
>>> Well... I guess the two of us are making different assumptions then :-)
>>>
>>> Which is the reason why documenting assumptions is so important. More at
>>> the bottom.
>>>
>>>
>>>>> Most importantly, unfortunately we use non-fixed-size integer types in
>>>>> C hypercall entry points and public ABIs. In my opinion, that is not
>>>>> acceptable.
>>>>
>>>> The problem is that I can't see the reason for you thinking so. The C
>>>> entry points sit past assembly code doing (required to do) necessary
>>>> adjustments, if any. If there was no assembly layer, whether to use
>>>> fixed-width types for such parameters would depend on what the
>>>> architecture guarantees.
>>>
>>> This could be the source of the disagreement. I see the little assembly
>>> code as not important, I consider it just like a little trampoline to
>>> me. As we describe the hypercalls in C header files, I consider the C
>>> functions the "official" hypercall entry points.
>>
>> Why would that be? Any code we execute in Xen is relevant.
> 
> There are a few reasons:
> 
> - the public interface is described in a C header so it makes sense for
>   the corresponding implementation to be in C
> 
> - the C entry point is often both the entry point in C and also common
>   code
> 
> - depending on the architecture, there is typically always some minimal
>   assembly entry code to prepare the environment before we can jump into
>   C-land; still one wouldn't consider those minimal and routine assembly
>   operations to be a meaningful hypercall entry point corresponding to
>   the C declaration in the public headers
> 
> - as per MISRA and also general good practice, we need the declaration
>   in the public header files to match the definition in C

Throughout, but especially with this last point, I feel there's confusion
(not sure on which side): There are no declarations of hypercall functions
in the public headers. Adding declarations there for the C entry points in
Xen would actually be wrong, as we don't provide such functions anywhere
(to consumers of the ABI).

>>> Also, as this is an ABI, I consider mandatory to use clear width
>>> definitions of all the types (whether with this document or with
>>> fixed-width types, and fixed-width types are clearer and better) in both
>>> the C header files that describe the ABI interfaces, as well as the C
>>> entry points that corresponds to it. E.g. I think we have to use
>>> the same types in both do_sched_op and the hypercall description in
>>> xen/include/public/sched.h
>>
>> There are two entirely separate aspects to the ABI: One is what we
>> document towards consumers of it. The other is entirely internal, i.e.
>> an implementation detail - how we actually consume the data.
>> Documenting fixed-width types towards consumers is probably okay,
>> albeit (see below) imo still not strictly necessary (for being
>> needlessly limiting).
> 
> I don't see it this way.
> 
> As the Xen public interface description is in C and used during the
> build, my opinion is that the public description and the C definition
> need to match.
> 
> Also, I don't understand how you can say that public interfaces don't
> strictly necessarily have to use fixed-width types.
> 
> Imagine that you use native types with different compilers that can
> actually output different width integer sizes (which is not possible
> today with gcc or clang). Imagine that a guest is written in a language
> other than C (e.g. Java) based on the public interface description. It
> cannot work correctly, can it?

They'd need to write appropriate hypercall invocation functions. As per
above - we don't provide these in the public headers, not even for C
consumers.

> I don't see how we can possibly have a public interface with anything
> other than fixed-width integers.

That's the consumer side of the ABI. It says nothing about the internal
implementation details in Xen. All we need to do there is respect the
ABI. That has no influence whatsoever on the C entry points when those
aren't the actual hypercall entrypoints into the hypervisor.

>>>> As to public ABIs - that's structure definitions, and I agree we ought
>>>> to uniformly use fixed-width types there. We largely do; a few things
>>>> still require fixing.
>>>
>>> +1
>>>
>>>
>>>>> We have two options:
>>>>>
>>>>> 1) we go with this document, and we clarify that even if we specify
>>>>>   "unsigned int", we actually mean a 32-bit integer
>>>>>
>>>>> 2) we change all our public ABIs and C hypercall entry points to use
>>>>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
>>>>>
>>>>> 2) is preferred because it is clearer but it is more work. So I went
>>>>> with 1). I also thought you would like 1) more.
>>>>
>>>> For ABIs (i.e. structures) we ought to be making that change anyway.
>>>> Leaving basic types in there is latently buggy.
>>>
>>> I am glad we agree :-)
>>>
>>> It is just that I also consider the C hypercall entry points as part of
>>> the ABI
>>>
>>>
>>>> I'm happy to see a document like this added, for the purpose described
>>>> above. But to me 1) and 2) are largely independent of one another.
>>>
>>> Good that you are also happy with a document like this.
>>>
>>> The remaining question is: what about the rest of the C functions in Xen
>>> that are certainly not part of an ABI?
>>
>> As per above - anything internal isn't part of the ABI, C entry points
>> for hypercall handlers included. All we need to ensure is that we consume
>> the data according to what the ABI sets forth.
> 
> It doesn't look like we'll convince one another on this point. But let
> me try another way.
> 
> In my view, having mismatched types between declaration and definition
> and having non-fixed-width types in C hypercall entry points is really
> bad for a number of reasons, among them:
> - correctness
> - risk of ABI breakage
> - mismatch of declaration and definition

What mismatches are you talking about? There's nothing mismatched now,
and there cannot be any mismatch, because the consumers of the ABI don't
call Xen functions directly.

> In your view, the drawback is not following the CODING_STYLE.
> 
> The two points of view on this subject don't have the same amount to lose. If
> I were you, I would probably not invest my energy to defend the
> CODING_STYLE.
> 
> 
>> To use wording from George when he criticized my supposed lack of actual
>> arguments: While there's nothing technically wrong with using fixed
>> width types there (or in fact everywhere), there's also nothing technically
>> wrong with using plain C types there and almost everywhere else (ABI
>> structures excluded). With both technically equal, ./CODING_STYLE has the
>> only criteria to pick between the two. IOW that's what I view wrong in
>> George's argumentation: Demanding that I provide technical arguments when
>> the desire to use fixed width types for the purpose under discussion also
>> isn't backed by any.
> 
> I don't think we are in violation of the CODING_STYLE as it explicitly
> accounts for exceptions. Public interfaces declarations and definitions
> (hypercalls C entry points included) are an exception.

If that was technically necessary, I would surely agree to there being an
exception here.

> In my opinion, using fixed-width integers in public headers and C
> definitions (including C hypercall entry points) is top priority for
> correctness. Correctness is more important than style. So, if we need to
> change the CODING_STYLE to get there, let's change the CODING_STYLE.
> 
> 
>>> Those are less critical, still this document should apply uniformly to
>>> them too. I don't understand why you are making the >= width assumption
>>> you mentioned at the top of the file when actually it is impossible to
>>> exercise or test this assumption on any compiler or any architecture
>>> that works with Xen. If it cannot be enabled, it hasn't been tested, and
>>> it probably won't work.
>>
>> Hmm, yes, that's one way to look at it. My perspective is different though:
>> By writing down assumptions that are more strict than necessary, we'd be
>> excluding ports to environments meeting the >= assumption, but not meeting
>> the == one. Unless of course you can point me at any place where - not
>> just by mistake / by being overly lax - we truly depend on the == that you
>> want to put in place. IOW yes, there likely would need to be adjustments
>> to code if such a port was to happen. Yet we shouldn't further harden
>> requirements that were never meant to be there.
> 
> I have already shown that all the current implementations and tests only
> check for ==. In my opinion, this is sufficient evidence that >= is not
> supported.
> 
> If you admit it probably wouldn't work without fixes today, would you
> security-support such a configuration? Would you safety-support it? I
> wouldn't want to buy a car running Xen compiled with a compiler using
> integer sizes different from the ones written in this document.
> 
> Let me summarize our positions on these topics.
> 
> Agreed points:
> - public interfaces should use fixed-width types
> - it is a good idea to have a document describing our assumptions about
>   integer types
> 
> Open decision points and misalignments:
> - Should the C hypercall entry points match the public header
>   declarations and ideally use fixed-width integer types? 

As per above, this question just cannot be validly raised. There are
no public header declarations to match.

> I'd say yes and I would argue for it
> 
> - Should the document describing our assumptions about integer types
>   specify == (unsigned int == uint32_t) or >= (unsigned int >=
>   uint32_t)?
> 
> I'd say specify == and I would argue for it

Actually, I had a further thought here in the meantime: For particular
ports, using == is likely okay - they're conforming to particular
psABI-s, after all (and that's what the compilers used also implement).
I'd nevertheless expect >= to be used in common assumptions. That way
for existing ports you get what you want, and there would still be
provisions for new ports using, say, an ILP64 ABI. Common code would
need to adhere to the common assumptions only. Arch-specific code can
work from the more tight assumptions. (If future sub-arch variants are
to be expected, like RV128, arch-code may still be well advised to try
to avoid the more tight assumptions where possible, just to limit
eventual porting effort.)
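
A minimal sketch of how the two levels of assumptions could be written as
compile-time checks, assuming a C11 compiler. The names TYPE_BITS,
DEMO_ARCH_LP64 and common_assumptions_hold() are hypothetical and exist only
for this illustration; they are not taken from the Xen tree.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */

/* Illustrative helper: storage width of a type in bits (padding bits included). */
#define TYPE_BITS(t) (sizeof(t) * CHAR_BIT)

/* Common-code assumptions: "at least this wide" (>=). */
static_assert(TYPE_BITS(unsigned int) >= 32,
              "common: unsigned int is at least 32 bits");
static_assert(TYPE_BITS(unsigned long) >= 32,
              "common: unsigned long is at least 32 bits");

/*
 * Arch-specific assumptions: exactly what the port's psABI specifies (==).
 * DEMO_ARCH_LP64 is a hypothetical per-port symbol, not a real Xen option.
 */
#ifdef DEMO_ARCH_LP64
static_assert(TYPE_BITS(unsigned int) == 32,
              "arch: unsigned int is exactly 32 bits");
static_assert(TYPE_BITS(unsigned long) == 64,
              "arch: unsigned long is exactly 64 bits");
#endif

/* Runtime view of the common (>=) assumptions, so they can also be unit-tested. */
int common_assumptions_hold(void)
{
    return TYPE_BITS(unsigned int) >= 32 && TYPE_BITS(unsigned long) >= 32;
}
```

With this split, a hypothetical ILP64 port would still pass the common checks
and would only need its own arch-specific block.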

Jan
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 8 months ago
On Wed, 20 Mar 2024, Jan Beulich wrote:
> > - the public interface is described in a C header so it makes sense for
> >   the corresponding implementation to be in C
> > 
> > - the C entry point is often both the entry point in C and also common
> >   code
> > 
> > - depending on the architecture, there is typically always some minimal
> >   assembly entry code to prepare the environment before we can jump into
> >   C-land; still one wouldn't consider those minimal and routine assembly
> >   operations to be a meaningful hypercall entry point corresponding to
> >   the C declaration in the public headers
> > 
> > - as per MISRA and also general good practice, we need the declaration
> >   in the public header files to match the definition in C
> 
> Throughout, but especially with this last point, I feel there's confusion
> (not sure on which side): There are no declarations of hypercall functions
> in the public headers. Adding declarations there for the C entry points in
> Xen would actually be wrong, as we don't provide such functions anywhere
> (to consumers of the ABI).

I am copy/pasting text from sched.h:

 * The prototype for this hypercall is:
 * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
 *
 * @cmd == SCHEDOP_??? (scheduler operation).
 * @arg == Operation-specific extra argument(s), as described below.
 * ...  == Additional Operation-specific extra arguments, described below.
 *

from event_channel.h:

 * ` enum neg_errnoval
 * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args)
 * `
 * @cmd  == EVTCHNOP_* (event-channel operation).
 * @args == struct evtchn_* Operation-specific extra arguments (NULL if none).

These are the hypercall declarations in public headers. Although they
are comments, they are the only description of the ABI that we have (as
far as I know). They are in C and use C types. 


> >>> Also, as this is an ABI, I consider mandatory to use clear width
> >>> definitions of all the types (whether with this document or with
> >>> fixed-width types, and fixed-width types are clearer and better) in both
> >>> the C header files that describe the ABI interfaces, as well as the C
> >>> entry points that corresponds to it. E.g. I think we have to use
> >>> the same types in both do_sched_op and the hypercall description in
> >>> xen/include/public/sched.h
> >>
> >> There are two entirely separate aspects to the ABI: One is what we
> >> document towards consumers of it. The other is entirely internal, i.e.
> >> an implementation detail - how we actually consume the data.
> >> Documenting fixed-width types towards consumers is probably okay,
> >> albeit (see below) imo still not strictly necessary (for being
> >> needlessly limiting).
> > 
> > I don't see it this way.
> > 
> > As the Xen public interface description is in C and used during the
> > build, my opinion is that the public description and the C definition
> > need to match.
> > 
> > Also, I don't understand how you can say that public interfaces don't
> > strictly necessarily have to use fixed-width types.
> > 
> > Imagine that you use native types with different compilers that can
> > actually output different width integer sizes (which is not possible
> > today with gcc or clang). Imagine that a guest is written in a language
> > other than C (e.g. Java) based on the public interface description. It
> > cannot work correctly, can it?
> 
> They'd need to write appropriate hypercall invocation functions. As per
> above - we don't provide these in the public headers, not even for C
> consumers.

See above


> > I don't see how we can possibly have a public interface with anything
> > other than fixed-width integers.
> 
> That's the consumer side of the ABI. It says nothing about the internal
> implementation details in Xen. All we need to do there is respect the
> ABI. That has no influence whatsoever on the C entry points when those
> aren't the actual hypercall entrypoints into the hypervisor.

If we go by the strictest definition, nothing is actually called directly
except for the target of a "b" instruction.

When you call a function in C, you are not actually calling a function.
Assembly is generated to save variables and do other things before "b".
Still, typically it is still considered a "direct" call.

It is not exactly the same thing with hypercall, but I hope I conveyed
the idea why I consider the C hypercall entry points part of the ABI.


> >>>> As to public ABIs - that's structure definitions, and I agree we ought
> >>>> to uniformly use fixed-width types there. We largely do; a few things
> >>>> still require fixing.
> >>>
> >>> +1
> >>>
> >>>
> >>>>> We have two options:
> >>>>>
> >>>>> 1) we go with this document, and we clarify that even if we specify
> >>>>>   "unsigned int", we actually mean a 32-bit integer
> >>>>>
> >>>>> 2) we change all our public ABIs and C hypercall entry points to use
> >>>>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
> >>>>>
> >>>>> 2) is preferred because it is clearer but it is more work. So I went
> >>>>> with 1). I also thought you would like 1) more.
> >>>>
> >>>> For ABIs (i.e. structures) we ought to be making that change anyway.
> >>>> Leaving basic types in there is latently buggy.
> >>>
> >>> I am glad we agree :-)
> >>>
> >>> It is just that I also consider the C hypercall entry points as part of
> >>> the ABI
> >>>
> >>>
> >>>> I'm happy to see a document like this added, for the purpose described
> >>>> above. But to me 1) and 2) are largely independent of one another.
> >>>
> >>> Good that you are also happy with a document like this.
> >>>
> >>> The remaining question is: what about the rest of the C functions in Xen
> >>> that are certainly not part of an ABI?
> >>
> >> As per above - anything internal isn't part of the ABI, C entry points
> >> for hypercall handlers included. All we need to ensure is that we consume
> >> the data according to what the ABI sets forth.
> > 
> > It doesn't look like we'll convince one another on this point. But let
> > me try another way.
> > 
> > In my view, having mismatched types between declaration and definition
> > and having non-fixed-width types in C hypercall entry points is really
> > bad for a number of reasons, among them:
> > - correctness
> > - risk of ABI breakage
> > - mismatch of declaration and definition
> 
> What mismatches are you talking about? There's nothing mismatched now,
> and there cannot be any mismatch, because the consumers of the ABI don't
> call Xen functions directly.

Let me make an example:

- public header saying enum event_channel_op cmd
- <assembly>
- do_event_channel_op(int cmd, ...)

Do you think this is all good?

There are two pretty serious problems here:
- enum and int are not the same type
- enum and int are not fixed-width

Don't you think it should be:

- public header saying uint32_t cmd in a comment
- <assembly>
- do_something_op(uint32_t cmd, ...)

Or possibly unsigned long depending on the parameter.

?
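
To make the second variant concrete, here is a rough sketch, assuming a C11
compiler. enum demo_op and do_demo_op() are made-up names for illustration
and do not correspond to an existing Xen hypercall.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical command set standing in for a real hypercall's operations. */
enum demo_op { DEMO_OP_BIND = 0, DEMO_OP_CLOSE = 1, DEMO_OP_SEND = 2 };

/* Every enumerator must be representable in the fixed-width parameter. */
static_assert(DEMO_OP_SEND <= UINT32_MAX, "commands fit in 32 bits");

/*
 * Hypothetical C entry point. The parameter is uint32_t rather than
 * 'enum demo_op' or 'int': the size of an enum type is implementation-defined,
 * while uint32_t is exactly 32 bits wherever it exists.
 */
long do_demo_op(uint32_t cmd)
{
    switch (cmd) {
    case DEMO_OP_BIND:
    case DEMO_OP_CLOSE:
    case DEMO_OP_SEND:
        return 0;   /* recognised command */
    default:
        return -1;  /* unknown command; a real handler would return an errno value */
    }
}
```

The enum is still useful for naming the command values; it is only the
parameter type that becomes fixed-width.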


> > In your view, the drawback is not following the CODING_STYLE.
> > 
> > The two points of view on this subject don't have the same amount to lose. If
> > I were you, I would probably not invest my energy to defend the
> > CODING_STYLE.
> > 
> > 
> >> To use wording from George when he criticized my supposed lack of actual
> >> arguments: While there's nothing technically wrong with using fixed
> >> width types there (or in fact everywhere), there's also nothing technically
> >> wrong with using plain C types there and almost everywhere else (ABI
> >> structures excluded). With both technically equal, ./CODING_STYLE has the
> >> only criteria to pick between the two. IOW that's what I view wrong in
> >> George's argumentation: Demanding that I provide technical arguments when
> >> the desire to use fixed width types for the purpose under discussion also
> >> isn't backed by any.
> > 
> > I don't think we are in violation of the CODING_STYLE as it explicitly
> > accounts for exceptions. Public interfaces declarations and definitions
> > (hypercalls C entry points included) are an exception.
> 
> If that was technically necessary, I would surely agree to there being an
> exception here.

Great, that's a start


> > In my opinion, using fixed-width integers in public headers and C
> > definitions (including C hypercall entry points) is top priority for
> > correctness. Correctness is more important than style. So, if we need to
> > change the CODING_STYLE to get there, let's change the CODING_STYLE.
> > 
> > 
> >>> Those are less critical, still this document should apply uniformly to
> >>> them too. I don't understand why you are making the >= width assumption
> >>> you mentioned at the top of the file when actually it is impossible to
> >>> exercise or test this assumption on any compiler or any architecture
> >>> that works with Xen. If it cannot be enabled, it hasn't been tested, and
> >>> it probably won't work.
> >>
> >> Hmm, yes, that's one way to look at it. My perspective is different though:
> >> By writing down assumptions that are more strict than necessary, we'd be
> >> excluding ports to environments meeting the >= assumption, but not meeting
> >> the == one. Unless of course you can point me at any place where - not
> >> just by mistake / by being overly lax - we truly depend on the == that you
> >> want to put in place. IOW yes, there likely would need to be adjustments
> >> to code if such a port was to happen. Yet we shouldn't further harden
> >> requirements that were never meant to be there.
> > 
> > I have already shown that all the current implementations and tests only
> > check for ==. In my opinion, this is sufficient evidence that >= is not
> > supported.
> > 
> > If you admit it probably wouldn't work without fixes today, would you
> > security-support such a configuration? Would you safety-support it? I
> > wouldn't want to buy a car running Xen compiled with a compiler using
> > integer sizes different from the ones written in this document.
> > 
> > Let me summarize our positions on these topics.
> > 
> > Agreed points:
> > - public interfaces should use fixed-width types
> > - it is a good idea to have a document describing our assumptions about
> >   integer types
> > 
> > Open decision points and misalignments:
> > - Should the C hypercall entry points match the public header
> >   declarations and ideally use fixed-width integer types? 
> 
> As per above, this question just cannot be validly raised. There are
> no public header declarations to match.

I clarified


> > I'd say yes and I would argue for it
> > 
> > - Should the document describing our assumptions about integer types
> >   specify == (unsigned int == uint32_t) or >= (unsigned int >=
> >   uint32_t)?
> > 
> > I'd say specify == and I would argue for it
> 
> Actually, I had a further thought here in the meantime: For particular
> ports, using == is likely okay - they're conforming to particular
> psABI-s, after all (and that's what the compilers used also implement).
> I'd nevertheless expect >= to be used in common assumptions. That way
> for existing ports you get what you want, and there would still be
> provisions for new ports using, say, an ILP64 ABI. Common code would
> need to adhere to the common assumptions only. Arch-specific code can
> work from the more tight assumptions. (If future sub-arch variants are
> to be expected, like RV128, arch-code may still be well advised to try
> to avoid the more tight assumptions where possible, just to limit
> eventual porting effort.)

I understand the aspirational goal of supporting >=, but in reality it is
not tested. If it is not tested, it cannot work; if it cannot work, we
cannot support it. If someone creates a compiler or other tool to check
for >= I would be happy to discuss expanding the document. Without any
tests, I don't think it would be useful to write down >=, not even as
an aspirational goal. A goal must be actionable.
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months ago
On 21.03.2024 02:46, Stefano Stabellini wrote:
> On Wed, 20 Mar 2024, Jan Beulich wrote:
>>> - the public interface is described in a C header so it makes sense for
>>>   the corresponding implementation to be in C
>>>
>>> - the C entry point is often both the entry point in C and also common
>>>   code
>>>
>>> - depending on the architecture, there is typically always some minimal
>>>   assembly entry code to prepare the environment before we can jump into
>>>   C-land; still one wouldn't consider those minimal and routine assembly
>>>   operations to be a meaningful hypercall entry point corresponding to
>>>   the C declaration in the public headers
>>>
>>> - as per MISRA and also general good practice, we need the declaration
>>>   in the public header files to match the definition in C
>>
>> Throughout, but especially with this last point, I feel there's confusion
>> (not sure on which side): There are no declarations of hypercall functions
>> in the public headers. Adding declarations there for the C entry points in
>> Xen would actually be wrong, as we don't provide such functions anywhere
>> (to consumers of the ABI).
> 
> I am copy/pasting text from sched.h:
> 
>  * The prototype for this hypercall is:
>  * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
>  *
>  * @cmd == SCHEDOP_??? (scheduler operation).
>  * @arg == Operation-specific extra argument(s), as described below.
>  * ...  == Additional Operation-specific extra arguments, described below.
>  *
> 
> from event_channel.h:
> 
>  * ` enum neg_errnoval
>  * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args)
>  * `
>  * @cmd  == EVTCHNOP_* (event-channel operation).
>  * @args == struct evtchn_* Operation-specific extra arguments (NULL if none).
> 
> These are the hypercall declarations in public headers. Although they
> are comments, they are the only description of the ABI that we have (as
> far as I know). They are in C and use C types. 

From their use of enum alone they don't qualify as declarations. They're
imo merely meant to provide minimal guidelines.

>>>>>>> We have two options:
>>>>>>>
>>>>>>> 1) we go with this document, and we clarify that even if we specify
>>>>>>>   "unsigned int", we actually mean a 32-bit integer
>>>>>>>
>>>>>>> 2) we change all our public ABIs and C hypercall entry points to use
>>>>>>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
>>>>>>>
>>>>>>> 2) is preferred because it is clearer but it is more work. So I went
>>>>>>> with 1). I also thought you would like 1) more.
>>>>>>
>>>>>> For ABIs (i.e. structures) we ought to be making that change anyway.
>>>>>> Leaving basic types in there is latently buggy.
>>>>>
>>>>> I am glad we agree :-)
>>>>>
>>>>> It is just that I also consider the C hypercall entry points as part of
>>>>> the ABI
>>>>>
>>>>>
>>>>>> I'm happy to see a document like this added, for the purpose described
>>>>>> above. But to me 1) and 2) are largely independent of one another.
>>>>>
>>>>> Good that you are also happy with a document like this.
>>>>>
>>>>> The remaining question is: what about the rest of the C functions in Xen
>>>>> that are certainly not part of an ABI?
>>>>
>>>> As per above - anything internal isn't part of the ABI, C entry points
>>>> for hypercall handlers included. All we need to ensure is that we consume
>>>> the data according to what the ABI sets forth.
>>>
>>> It doesn't look like we'll convince one another on this point. But let
>>> me try another way.
>>>
>>> In my view, having mismatched types between declaration and definition
>>> and having non-fixed-width types in C hypercall entry points is really
>>> bad for a number of reasons, among them:
>>> - correctness
>>> - risk of ABI breakage
>>> - mismatch of declaration and definition
>>
>> What mismatches are you talking about? There's nothing mismatched now,
>> and there cannot be any mismatch, because the consumers of the ABI don't
>> call Xen functions directly.
> 
> Let me make an example:
> 
> - public header saying enum event_channel_op cmd
> - <assembly>
> - do_event_channel_op(int cmd, ...)
> 
> Do you think this is all good?
> 
> There are two pretty serious problems here:
> - enum and int are not the same type

See above. The issue I have with this is use of plain "int". Technically
that's not a problem either, but aiui we're aiming to use "unsigned int"
when negative values aren't possible.

And note that it was in 2012 when "int" there was changed to "enum", in an
effort to document things better.

> - enum and int are not fixed-width

Which I don't view as a problem, thanks to the assembly sitting in between.

> Don't you think it should be:
> 
> - public header saying uint32_t cmd in a comment
> - <assembly>
> - do_something_op(uint32_t cmd, ...)

The public header should say whatever is best suited to not misguide
people writing actual prototypes for their functions. I wouldn't mind
uint32_t being stated there. That has no influence whatsoever on
do_<something>_op(), though.

> Or possibly unsigned long depending on the parameter.

You're contradicting yourself: You mean to advocate for fixed-width types,
yet then you suggest "unsigned long". Perhaps because you realized that
there's no single fixed-width type fitting "unsigned long" for all
architectures. xen_ulong_t would likely come closest, but would - aiui -
still not be suitable for Arm32 when used in hypercall (handler)
prototypes; it's suitable for use (again) only in structure definitions.

Jan
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 8 months ago
On Thu, 21 Mar 2024, Jan Beulich wrote:
> On 21.03.2024 02:46, Stefano Stabellini wrote:
> > On Wed, 20 Mar 2024, Jan Beulich wrote:
> >>> - the public interface is described in a C header so it makes sense for
> >>>   the corresponding implementation to be in C
> >>>
> >>> - the C entry point is often both the entry point in C and also common
> >>>   code
> >>>
> >>> - depending on the architecture, there is typically always some minimal
> >>>   assembly entry code to prepare the environment before we can jump into
> >>>   C-land; still one wouldn't consider those minimal and routine assembly
> >>>   operations to be a meaningful hypercall entry point corresponding to
> >>>   the C declaration in the public headers
> >>>
> >>> - as per MISRA and also general good practice, we need the declaration
> >>>   in the public header files to match the definition in C
> >>
> >> Throughout, but especially with this last point, I feel there's confusion
> >> (not sure on which side): There are no declarations of hypercall functions
> >> in the public headers. Adding declarations there for the C entry points in
> >> Xen would actually be wrong, as we don't provide such functions anywhere
> >> (to consumers of the ABI).
> > 
> > I am copy/pasting text from sched.h:
> > 
> >  * The prototype for this hypercall is:
> >  * ` long HYPERVISOR_sched_op(enum sched_op cmd, void *arg, ...)
> >  *
> >  * @cmd == SCHEDOP_??? (scheduler operation).
> >  * @arg == Operation-specific extra argument(s), as described below.
> >  * ...  == Additional Operation-specific extra arguments, described below.
> >  *
> > 
> > from event_channel.h:
> > 
> >  * ` enum neg_errnoval
> >  * ` HYPERVISOR_event_channel_op(enum event_channel_op cmd, void *args)
> >  * `
> >  * @cmd  == EVTCHNOP_* (event-channel operation).
> >  * @args == struct evtchn_* Operation-specific extra arguments (NULL if none).
> > 
> > These are the hypercall declarations in public headers. Although they
> > are comments, they are the only description of the ABI that we have (as
> > far as I know). They are in C and use C types. 
> 
> From their use of enum alone they don't qualify as declarations. They're
> imo merely meant to provide minimal guidelines.

Even if we call them "minimal guidelines", my opinion is unchanged:
- they need to use fixed-width types
- they should match the C hypercall entry point types


> >>>>>>> We have two options:
> >>>>>>>
> >>>>>>> 1) we go with this document, and we clarify that even if we specify
> >>>>>>>   "unsigned int", we actually mean a 32-bit integer
> >>>>>>>
> >>>>>>> 2) we change all our public ABIs and C hypercall entry points to use
> >>>>>>>    fixed-size types (e.g. s/unsigned int/uint32_t/g)
> >>>>>>>
> >>>>>>> 2) is preferred because it is clearer but it is more work. So I went
> >>>>>>> with 1). I also thought you would like 1) more.
> >>>>>>
> >>>>>> For ABIs (i.e. structures) we ought to be making that change anyway.
> >>>>>> Leaving basic types in there is latently buggy.
> >>>>>
> >>>>> I am glad we agree :-)
> >>>>>
> >>>>> It is just that I also consider the C hypercall entry points as part of
> >>>>> the ABI
> >>>>>
> >>>>>
> >>>>>> I'm happy to see a document like this added, for the purpose described
> >>>>>> above. But to me 1) and 2) are largely independent of one another.
> >>>>>
> >>>>> Good that you are also happy with a document like this.
> >>>>>
> >>>>> The remaining question is: what about the rest of the C functions in Xen
> >>>>> that are certainly not part of an ABI?
> >>>>
> >>>> As per above - anything internal isn't part of the ABI, C entry points
> >>>> for hypercall handlers included. All we need to ensure is that we consume
> >>>> the data according to what the ABI sets forth.
> >>>
> >>> It doesn't look like we'll convince one another on this point. But let
> >>> me try another way.
> >>>
> >>> In my view, having mismatched types between declaration and definition
> >>> and having non-fixed-width types in C hypercall entry points is really
> >>> bad for a number of reasons, among them:
> >>> - correctness
> >>> - risk of ABI breakage
> >>> - mismatch of declaration and definition
> >>
> >> What mismatches are you talking about? There's nothing mismatched now,
> >> and there cannot be any mismatch, because the consumers of the ABI don't
> >> call Xen functions directly.
> > 
> > Let me make an example:
> > 
> > - public header saying enum event_channel_op cmd
> > - <assembly>
> > - do_event_channel_op(int cmd, ...)
> > 
> > Do you think this is all good?
> > 
> > There are two pretty serious problems here:
> > - enum and int are not the same type
> 
> See above. The issue I have with this is use of plain "int". Technically
> that's not a problem either, but aiui we're aiming to use "unsigned int"
> when negative values aren't possible.

Yeah that is also a problem


> And note that it was in 2012 when "int" there was changed to "enum", in an
> effort to document things better.
> 
> > - enum and int are not fixed-width
> 
> Which I don't view as a problem, thanks to the assembly sitting in between.

I disagree. I view this as risky and error prone. We worked for hours
and hours on security issues and MISRA improvements. All this experience
is also meant to teach us what good code looks like: code that is
resilient to attacks, poses fewer safety issues, and is clearer for
others to read and modify. After all of the above, I am surprised we are
not aligned on this issue.

I understand your point of view, as I think you understand mine. We are
not going to be able to convince each other. Having explored the
technical aspects in all their details, I think we need more opinions
from others to move forward.

I'll conclude with this. One doesn't have to agree with me to agree
that the suggestions I am making are meant to make the code and public
interfaces clearer, more consistent, and less error prone. Your suggestions
are meant to make the code follow CODING_STYLE? I have made the value
proposition of what I am suggesting clear, and I fail to see yours.


> > Don't you think it should be:
> > 
> > - public header saying uint32_t cmd in a comment
> > - <assembly>
> > - do_something_op(uint32_t cmd, ...)
> 
> The public header should say whatever is best suited to not misguide
> people writing actual prototypes for their functions. I wouldn't mind
> uint32_t being stated there. That has no influence whatsoever on
> do_<something>_op(), though.

I understand what you are saying but I disagree. It is risky and error
prone. As above, I think we understand each other's points of view but
we won't be able to convince each other.


> > Or possibly unsigned long depending on the parameter.
> 
> You're contradicting yourself: You mean to advocate for fixed-width types,
> yet then you suggest "unsigned long".

No. I explained it in another thread a couple of days ago. There are
cases where we have fixed-width types but the type changes by
architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
Rather than having #ifdefs, which is also an option, that is the one
case where using "unsigned long" could be a decent compromise. In this
context "unsigned long" means register size (on ARM we even have
register_t). Once you pick an architecture, the size is actually meant
to be fixed. In fact, it is specified in this document. Which is one of
the reasons why we have to use == in this document and not >=. In
general, fixed-width types like uint32_t are better because they are
clearer and unambiguous. When possible I think they should be our first
choice in ABIs.
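
For comparison, a rough sketch of the #ifdef alternative, assuming a C11
compiler. demo_register_t and ulong_is_register_sized() are hypothetical
names, and pointer width (UINTPTR_MAX) merely stands in for a real
per-architecture configuration symbol.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stdint.h>

/*
 * Explicit alternative: choose a fixed-width type per architecture.
 * Pointer width (UINTPTR_MAX) stands in for a real per-arch config symbol.
 */
#if UINTPTR_MAX > 0xffffffffu
typedef uint64_t demo_register_t;   /* 64-bit build */
#else
typedef uint32_t demo_register_t;   /* 32-bit build */
#endif

/* On either branch the type has one exact, documented width. */
static_assert(sizeof(demo_register_t) * CHAR_BIT == 32 ||
              sizeof(demo_register_t) * CHAR_BIT == 64,
              "register-sized type is exactly 32 or 64 bits");

/*
 * The "unsigned long" shorthand only works when this returns 1, i.e. when
 * long and pointers have the same width (ILP32, LP64), but not on LLP64.
 */
int ulong_is_register_sized(void)
{
    return sizeof(unsigned long) == sizeof(demo_register_t);
}
```

The typedef makes the per-architecture width explicit, at the cost of the
conditional that "unsigned long" avoids.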
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Julien Grall 8 months ago
Hi Stefano,

I haven't fully read the thread. But I wanted to clarify something.

On 21/03/2024 19:03, Stefano Stabellini wrote:
>>> Or possibly unsigned long depending on the parameter.
>>
>> You're contradicting yourself: You mean to advocate for fixed-width types,
>> yet then you suggest "unsigned long".
> 
> No. I explained it in another thread a couple of days ago. There are
> cases where we have fixed-width types but the type changes by
> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
> Rather than having #ifdefs, which is also an option, that is the one
> case where using "unsigned long" could be a decent compromise. In this
> context "unsigned long" means register size (on ARM we even have
> register_t). Once you pick an architecture, the size is actually meant
> to be fixed. In fact, it is specified in this document. Which is one of
> the reasons why we have to use == in this document and not >=. In
> general, fixed-width types like uint32_t are better because they are
> clearer and unambiguous. When possible I think they should be our first
> choice in ABIs.

"unsigned long" is not fixed in a given architecture. It will change 
based on the data model used by the OS. For instance, for Arm 64-bit, we
have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.

So effectively unsigned long can't be used in the ABI.

As a side note, Xen will use LP64, hence why we tend to use 'unsigned 
long' to describe 32-bit for Arm32 and 64-bit for Arm64.

Cheers,

-- 
Julien Grall
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Stefano Stabellini 7 months, 3 weeks ago
On Thu, 21 Mar 2024, Julien Grall wrote:
> Hi Stefano,
> 
> I haven't fully read the thread. But I wanted to clarify something.
> 
> On 21/03/2024 19:03, Stefano Stabellini wrote:
> > > > Or possibly unsigned long depending on the parameter.
> > > 
> > > You're contradicting yourself: You mean to advocate for fixed-width types,
> > > yet then you suggest "unsigned long".
> > 
> > No. I explained it in another thread a couple of days ago. There are
> > cases where we have fixed-width types but the type changes by
> > architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
> > Rather than having #ifdefs, which is also an option, that is the one
> > case where using "unsigned long" could be a decent compromise. In this
> > context "unsigned long" means register size (on ARM we even have
> > register_t). Once you pick an architecture, the size is actually meant
> > to be fixed. In fact, it is specified in this document. Which is one of
> > the reasons why we have to use == in this document and not >=. In
> > general, fixed-width types like uint32_t are better because they are
> > clearer and unambiguous. When possible I think they should be our first
> > choice in ABIs.
> 
> "unsigned long" is not fixed in a given architecture. It will change based on
> the data model used by the OS. For instance, for Arm 64-bit, we have 3 models:
> ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.
> 
> So effectively unsigned long can't be used in the ABI.

If someone sees "unsigned long" in the public headers to define a public
Xen ABI, they would need to refer to this document to understand what
"unsigned long" really means, which specifies size and alignment of
"unsigned long" based on the architecture. In other words, this document
mandates LP64 (at least for safety configurations, given that nothing
else is tested).

This is the reason why ideally we wouldn't have any "unsigned long" in
the Xen ABI at all. They are not as clear as explicitly-sized integers
(e.g. uint32_t). In an ideal world, we would use explicitly-sized
integers for everything in public ABIs.
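
As a minimal sketch of what "== and not >=" could look like in code, the
documented sizes can be pinned with compile-time checks. This is only an
illustration: EXPECT_BITS and unsigned_long_bits() are hypothetical names
that do not come from the Xen tree, and the UINTPTR_MAX test is an assumed
stand-in for "32-bit arch vs 64-bit arch".

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stdint.h>   /* UINTPTR_MAX */

/* Hypothetical helper: fail the build unless the size matches exactly. */
#define EXPECT_BITS(type, bits) \
    static_assert(sizeof(type) * CHAR_BIT == (bits), \
                  #type " does not have the documented size")

EXPECT_BITS(char, 8);
EXPECT_BITS(short, 16);
EXPECT_BITS(int, 32);
EXPECT_BITS(long long, 64);

/*
 * "unsigned long" as register size: 32 bits on 32-bit architectures,
 * 64 bits on 64-bit ones. Assumption: pointer width is used here to
 * tell the two cases apart.
 */
#if UINTPTR_MAX == 0xffffffffu
EXPECT_BITS(unsigned long, 32);
#else
EXPECT_BITS(unsigned long, 64);
#endif

/* Width of unsigned long in bits for the compiler in use. */
static inline unsigned int unsigned_long_bits(void)
{
    return (unsigned int)(sizeof(unsigned long) * CHAR_BIT);
}
```

With an equality check, a 64-bit build using a data model where
"unsigned long" is 32 bits fails to compile instead of silently changing
the layout of a public structure.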
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Jan Beulich 8 months ago
On 22.03.2024 00:17, Julien Grall wrote:
> Hi Stefano,
> 
> I haven't fully read the thread. But I wanted to clarify something.
> 
> On 21/03/2024 19:03, Stefano Stabellini wrote:
>>>> Or possibly unsigned long depending on the parameter.
>>>
>>> You're contradicting yourself: You mean to advocate for fixed-width types,
>>> yet then you suggest "unsigned long".
>>
>> No. I explained it in another thread a couple of days ago. There are
>> cases where we have fixed-width types but the type changes by
>> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
>> Rather than having #ifdefs, which is also an option, that is the one
>> case where using "unsigned long" could be a decent compromise. In this
>> context "unsigned long" means register size (on ARM we even have
>> register_t). Once you pick an architecture, the size is actually meant
>> to be fixed. In fact, it is specified in this document. Which is one of
>> the reasons why we have to use == in this document and not >=. In
>> general, fixed-width types like uint32_t are better because they are
>> clearer and unambiguous. When possible I think they should be our first
>> choice in ABIs.
> 
> "unsigned long" is not fixed in a given architecture. It will change 
> based on the data model used by the OS. For instance, for Arm 64-bit, we
> have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.

"... is 64-bit" you mean?

Jan

> So effectively unsigned long can't be used in the ABI.
> 
> As a side note, Xen will use LP64, hence why we tend to use 'unsigned 
> long' to describe 32-bit for Arm32 and 64-bit for Arm64.
> 
> Cheers,
>
Re: [PATCH v2] docs/misra: document the expected sizes of integer types
Posted by Julien Grall 8 months ago
Hi Jan,

On 25/03/2024 11:16, Jan Beulich wrote:
> On 22.03.2024 00:17, Julien Grall wrote:
>> Hi Stefano,
>>
>> I haven't fully read the thread. But I wanted to clarify something.
>>
>> On 21/03/2024 19:03, Stefano Stabellini wrote:
>>>>> Or possibly unsigned long depending on the parameter.
>>>>
>>>> You're contradicting yourself: You mean to advocate for fixed-width types,
>>>> yet then you suggest "unsigned long".
>>>
>>> No. I explained it in another thread a couple of days ago. There are
>>> cases where we have fixed-width types but the type changes by
>>> architecture: 32-bit for 32-bit archs and 64-bit for 64-bit archs.
>>> Rather than having #ifdefs, which is also an option, that is the one
>>> case where using "unsigned long" could be a decent compromise. In this
>>> context "unsigned long" means register size (on ARM we even have
>>> register_t). Once you pick an architecture, the size is actually meant
>>> to be fixed. In fact, it is specified in this document. Which is one of
>>> the reasons why we have to use == in this document and not >=. In
>>> general, fixed-width types like uint32_t are better because they are
>>> clearer and unambiguous. When possible I think they should be our first
>>> choice in ABIs.
>>
>> "unsigned long" is not fixed in a given architecture. It will change
>> based on the data model used by the OS. For instance, for Arm 64-bit, we
>> have 3 models: ILP32, LP64, LLP64. Only on LP64, 'unsigned long' is 32-bit.
> 
> "... is 64-bit" you mean?

Whoops. Yes!
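
For reference, a rough sketch of how the three data models mentioned above
differ. The widths are the commonly used definitions of ILP32, LP64 and
LLP64; the struct, the table and current_model() are made-up names for
illustration and are not Xen code.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* NULL, size_t */

static_assert(CHAR_BIT == 8, "the table below assumes 8-bit bytes");

/* Widths, in bits, of the types that distinguish the data models. */
struct data_model {
    const char *name;
    unsigned int int_bits, long_bits, long_long_bits, pointer_bits;
};

static const struct data_model models[] = {
    { "ILP32", 32, 32, 64, 32 },
    { "LP64",  32, 64, 64, 64 },  /* the only one with a 64-bit long */
    { "LLP64", 32, 32, 64, 64 },
};

/* Return the entry matching the compiler in use, or NULL if none does. */
static inline const struct data_model *current_model(void)
{
    for (size_t i = 0; i < sizeof(models) / sizeof(models[0]); i++) {
        const struct data_model *m = &models[i];

        if (m->int_bits == sizeof(int) * CHAR_BIT &&
            m->long_bits == sizeof(long) * CHAR_BIT &&
            m->long_long_bits == sizeof(long long) * CHAR_BIT &&
            m->pointer_bits == sizeof(void *) * CHAR_BIT)
            return m;
    }

    return NULL;
}
```

On a 64-bit architecture only the LP64 row has "long" matching the pointer
width, which is why "unsigned long" alone does not pin down the size.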

Cheers,

-- 
Julien Grall