Windows open(2) implementation opens files in text mode by default and
needs a Windows-only O_BINARY flag to open files as binary. QEMU already
knows about that flag in osdep and it is defined to 0 on non-Windows,
so we can just add it to the host_flags for better compatibility.

Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
---
semihosting/syscalls.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
index 508a0ad88c..b621d78c2d 100644
--- a/semihosting/syscalls.c
+++ b/semihosting/syscalls.c
@@ -253,7 +253,7 @@ static void host_open(CPUState *cs, gdb_syscall_complete_cb complete,
 {
     CPUArchState *env G_GNUC_UNUSED = cs->env_ptr;
     char *p;
-    int ret, host_flags;
+    int ret, host_flags = O_BINARY;
 
     ret = validate_lock_user_string(&p, cs, fname, fname_len);
     if (ret < 0) {
@@ -262,11 +262,11 @@ static void host_open(CPUState *cs, gdb_syscall_complete_cb complete,
     }
 
     if (gdb_flags & GDB_O_WRONLY) {
-        host_flags = O_WRONLY;
+        host_flags |= O_WRONLY;
     } else if (gdb_flags & GDB_O_RDWR) {
-        host_flags = O_RDWR;
+        host_flags |= O_RDWR;
     } else {
-        host_flags = O_RDONLY;
+        host_flags |= O_RDONLY;
     }
     if (gdb_flags & GDB_O_CREAT) {
         host_flags |= O_CREAT;
--
2.34.1
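
For readers who want to try the flag composition outside the QEMU tree, here is a minimal standalone sketch. It is not the QEMU code: the EX_-prefixed names and ex_host_open_flags() are hypothetical, and the numeric values are the open-flag values from the GDB remote File-I/O protocol as I understand them. The O_BINARY fallback mirrors the definition the commit message says QEMU keeps in osdep.

```c
#include <fcntl.h>

/* Fallback for hosts that have no O_BINARY (assumption: same idea as the
 * definition QEMU keeps in osdep, so the flag is a no-op off Windows). */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Open-flag values as used by the GDB remote File-I/O protocol.
 * The EX_ prefix marks these names as local to this example. */
#define EX_GDB_O_WRONLY 0x1
#define EX_GDB_O_RDWR   0x2
#define EX_GDB_O_CREAT  0x200

/* Hypothetical helper: translate GDB-style flags into host open(2)
 * flags, always requesting binary mode as the patch does. */
static int ex_host_open_flags(int gdb_flags)
{
    int host_flags = O_BINARY;

    if (gdb_flags & EX_GDB_O_WRONLY) {
        host_flags |= O_WRONLY;
    } else if (gdb_flags & EX_GDB_O_RDWR) {
        host_flags |= O_RDWR;
    } else {
        host_flags |= O_RDONLY;
    }
    if (gdb_flags & EX_GDB_O_CREAT) {
        host_flags |= O_CREAT;
    }
    return host_flags;
}
```

Starting from O_BINARY and OR-ing in the access mode is what makes the "=" to "|=" change in the patch necessary: a plain assignment would discard the initial flag.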
On Fri, 6 Jan 2023 at 10:21, Evgeny Iakovlev
<eiakovlev@linux.microsoft.com> wrote:
>
> Windows open(2) implementation opens files in text mode by default and
> needs a Windows-only O_BINARY flag to open files as binary. QEMU already
> knows about that flag in osdep and it is defined to 0 on non-Windows,
> so we can just add it to the host_flags for better compatibility.
>
> Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
> ---
>  semihosting/syscalls.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
> index 508a0ad88c..b621d78c2d 100644
> --- a/semihosting/syscalls.c
> +++ b/semihosting/syscalls.c
> @@ -253,7 +253,7 @@ static void host_open(CPUState *cs, gdb_syscall_complete_cb complete,
>  {
>      CPUArchState *env G_GNUC_UNUSED = cs->env_ptr;
>      char *p;
> -    int ret, host_flags;
> +    int ret, host_flags = O_BINARY;

The semihosting API, at least for Arm, has a modeflags string so the
guest can say whether it wants to open O_BINARY or not:
https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst#sys-open-0x01

So we need to plumb that down through the common semihosting code
into this function and set O_BINARY accordingly. Otherwise guest
code that asks for a text-mode file won't get one.

I don't know about other semihosting APIs, so those would need
to be checked to see what they should do.

thanks
-- PMM
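
To make the SYS_OPEN mode argument concrete, here is a small sketch of how a mode value could be decoded. The table of twelve ISO C fopen() strings follows the Arm semihosting specification linked above; the ex_-prefixed names are hypothetical and are not taken from QEMU.

```c
#include <stddef.h>
#include <string.h>

/* ISO C fopen() mode strings, indexed by the SYS_OPEN mode value (0..11)
 * as tabulated in the Arm semihosting specification. */
static const char *const ex_sys_open_modes[] = {
    "r", "rb", "r+", "r+b",
    "w", "wb", "w+", "w+b",
    "a", "ab", "a+", "a+b",
};

/* Hypothetical helper: return the fopen() string for a SYS_OPEN mode,
 * or NULL if the mode is outside the defined range. */
static const char *ex_mode_string(int mode)
{
    size_t count = sizeof(ex_sys_open_modes) / sizeof(ex_sys_open_modes[0]);

    if (mode < 0 || (size_t)mode >= count) {
        return NULL;
    }
    return ex_sys_open_modes[mode];
}

/* Hypothetical helper: 1 if the guest asked for binary mode, 0 if it
 * asked for text mode, -1 if the mode value is invalid. */
static int ex_mode_is_binary(int mode)
{
    const char *s = ex_mode_string(mode);

    if (s == NULL) {
        return -1;
    }
    return strchr(s, 'b') != NULL;
}
```

With this table, every odd mode value is a binary mode, so the guest's request is already available to the host side as a single bit of the mode argument.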
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 6 Jan 2023 at 10:21, Evgeny Iakovlev
> <eiakovlev@linux.microsoft.com> wrote:
>>
>> Windows open(2) implementation opens files in text mode by default and
>> needs a Windows-only O_BINARY flag to open files as binary. QEMU already
>> knows about that flag in osdep and it is defined to 0 on non-Windows,
>> so we can just add it to the host_flags for better compatibility.
>>
>> Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
>> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
>> ---
>>  semihosting/syscalls.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/semihosting/syscalls.c b/semihosting/syscalls.c
>> index 508a0ad88c..b621d78c2d 100644
>> --- a/semihosting/syscalls.c
>> +++ b/semihosting/syscalls.c
>> @@ -253,7 +253,7 @@ static void host_open(CPUState *cs, gdb_syscall_complete_cb complete,
>>  {
>>      CPUArchState *env G_GNUC_UNUSED = cs->env_ptr;
>>      char *p;
>> -    int ret, host_flags;
>> +    int ret, host_flags = O_BINARY;
>
> The semihosting API, at least for Arm, has a modeflags string so the
> guest can say whether it wants to open O_BINARY or not:
> https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst#sys-open-0x01
>
> So we need to plumb that down through the common semihosting code
> into this function and set O_BINARY accordingly. Otherwise guest
> code that asks for a text-mode file won't get one.

We used to, in fact we still have a remnant of the code where we do:

  #ifndef O_BINARY
  #define O_BINARY 0
  #endif

I presume because the only places it exists in libc is wrapped in stuff
like:

  #if defined (__CYGWIN__)
  #define O_BINARY _FBINARY

So the mapping got removed in a1a2a3e609 (semihosting: Remove
GDB_O_BINARY) because GDB knows nothing of this and as far as I can tell
neither does Linux whatever ISO C might say about it.

Is this a host detail leakage to the guest? Should a semihosting app be
caring about what fopen() modes the underlying host supports? At least a
default O_BINARY for windows is most likely to DTRT.

> I don't know about other semihosting APIs, so those would need
> to be checked to see what they should do.
>
> thanks
> -- PMM

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> > The semihosting API, at least for Arm, has a modeflags string so the
> > guest can say whether it wants to open O_BINARY or not:
> > https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst#sys-open-0x01
> >
> > So we need to plumb that down through the common semihosting code
> > into this function and set O_BINARY accordingly. Otherwise guest
> > code that asks for a text-mode file won't get one.
>
> We used to, in fact we still have a remnant of the code where we do:
>
>   #ifndef O_BINARY
>   #define O_BINARY 0
>   #endif
>
> I presume because the only places it exists in libc is wrapped in stuff
> like:
>
>   #if defined (__CYGWIN__)
>   #define O_BINARY _FBINARY
>
> So the mapping got removed in a1a2a3e609 (semihosting: Remove
> GDB_O_BINARY) because GDB knows nothing of this and as far as I can tell
> neither does Linux whatever ISO C might say about it.
>
> Is this a host detail leakage to the guest? Should a semihosting app be
> caring about what fopen() modes the underlying host supports? At least a
> default O_BINARY for windows is most likely to DTRT.

I think the theory when the semihosting API was originally designed
decades ago was basically "when the guest does fopen(...) this
should act like it does on the host". So as a bit of portable
guest code you would say whether you wanted a binary or a text
file, and the effect would be that if you were running on Windows
and you output a text file then you'd get \r\n like the user
probably expected, and if on Linux you get \n.

The gdb remote protocol, on the other hand, assumes "all files
are binary", and the gdb source that implements the gdb remote
file I/O operations does "always set O_BINARY if it's defined":
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote-fileio.c;h=3ff2a65b0ec6c7695f8659690a8f1dce9b5cdf5f;hb=HEAD#l141

So this is kind of an impedance mismatch problem -- the semihosting
API wants functionality that the gdb protocol can't give us.
But we don't have that mismatch issue if we're directly making
host filesystem calls, because there we can set O_BINARY or
not as we choose.

Alternatively, we could decide that our implementation of
semihosting consistently uses \n for the newline character
on all hosts, such that guests which try to write text files
on Windows hosts get the "wrong" newline type, but OTOH
get consistently the same file regardless of host and regardless
of whether semihosting is going via gdb or not. But if
we want to do that we should at least note in a comment
somewhere that that's a behaviour we've chosen, not something
that's happened by accident. Given Windows is less unfriendly
about dealing with \n-terminated files these days that might
not be an unreasonable choice.

-- PMM
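
The newline difference under discussion can be modelled on any host with a few lines of C. The sketch below only illustrates the output-side translation that a Windows text-mode file applies ("\n" written by the program becomes "\r\n" in the file); it does not call any Windows API, it ignores other text-mode details, and ex_expand_newlines() is a hypothetical name.

```c
#include <stddef.h>

/* Hypothetical model of text-mode output translation: copy len bytes
 * from src to dst, expanding each '\n' into "\r\n".
 * Returns the number of bytes written, or -1 if dst (capacity cap)
 * is too small. Binary mode would be a plain byte-for-byte copy. */
static long ex_expand_newlines(const char *src, size_t len,
                               char *dst, size_t cap)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        if (src[i] == '\n') {
            if (out + 2 > cap) {
                return -1;
            }
            dst[out++] = '\r';
            dst[out++] = '\n';
        } else {
            if (out + 1 > cap) {
                return -1;
            }
            dst[out++] = src[i];
        }
    }
    return (long)out;
}
```

A guest that writes the four bytes "a\nb\n" through a text-mode descriptor on Windows would therefore end up with a six-byte file, while the same guest on Linux, or with O_BINARY set, produces four bytes. That size difference is the host dependence the thread is weighing.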
On 1/6/2023 17:28, Peter Maydell wrote:
> On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>> The semihosting API, at least for Arm, has a modeflags string so the
>>> guest can say whether it wants to open O_BINARY or not:
>>> https://github.com/ARM-software/abi-aa/blob/main/semihosting/semihosting.rst#sys-open-0x01
>>>
>>> So we need to plumb that down through the common semihosting code
>>> into this function and set O_BINARY accordingly. Otherwise guest
>>> code that asks for a text-mode file won't get one.
>> We used to, in fact we still have a remnant of the code where we do:
>>
>>   #ifndef O_BINARY
>>   #define O_BINARY 0
>>   #endif
>>
>> I presume because the only places it exists in libc is wrapped in stuff
>> like:
>>
>>   #if defined (__CYGWIN__)
>>   #define O_BINARY _FBINARY
>>
>> So the mapping got removed in a1a2a3e609 (semihosting: Remove
>> GDB_O_BINARY) because GDB knows nothing of this and as far as I can tell
>> neither does Linux whatever ISO C might say about it.
>>
>> Is this a host detail leakage to the guest? Should a semihosting app be
>> caring about what fopen() modes the underlying host supports? At least a
>> default O_BINARY for windows is most likely to DTRT.
> I think the theory when the semihosting API was originally designed
> decades ago was basically "when the guest does fopen(...) this
> should act like it does on the host". So as a bit of portable
> guest code you would say whether you wanted a binary or a text
> file, and the effect would be that if you were running on Windows
> and you output a text file then you'd get \r\n like the user
> probably expected, and if on Linux you get \n.
>
> The gdb remote protocol, on the other hand, assumes "all files
> are binary", and the gdb source that implements the gdb remote
> file I/O operations does "always set O_BINARY if it's defined":
> https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/remote-fileio.c;h=3ff2a65b0ec6c7695f8659690a8f1dce9b5cdf5f;hb=HEAD#l141
>
> So this is kind of an impedance mismatch problem -- the semihosting
> API wants functionality that the gdb protocol can't give us.
> But we don't have that mismatch issue if we're directly making
> host filesystem calls, because there we can set O_BINARY or
> not as we choose.
>
> Alternatively, we could decide that our implementation of
> semihosting consistently uses \n for the newline character
> on all hosts, such that guests which try to write text files
> on Windows hosts get the "wrong" newline type, but OTOH
> get consistently the same file regardless of host and regardless
> of whether semihosting is going via gdb or not. But if
> we want to do that we should at least note in a comment
> somewhere that that's a behaviour we've chosen, not something
> that's happened by accident. Given Windows is less unfriendly
> about dealing with \n-terminated files these days that might
> not be an unreasonable choice.
>
> -- PMM

If SYS_OPEN is supposed to call fopen (i didn't actually know that..)
then it does make more sense for binary/text mode to be propagated from
guest. Qemu's implementation calls open(2) though, which is not correct
at all then. Well, as long as qemu does that, there is no
posix-compliant way to tell open(2) if it should use binary or text
mode, there is no notion of that as far as posix (and most
implementations) is concerned.

My change then acts as a way to at least have predictable behavior
across platforms, but i guess a more correct approach would be to
follow actual semi-hosting spec and switch to fopen.
On Fri, 6 Jan 2023 at 18:22, Evgeny Iakovlev
<eiakovlev@linux.microsoft.com> wrote:
>
>
> On 1/6/2023 17:28, Peter Maydell wrote:
> > On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
> >> Peter Maydell <peter.maydell@linaro.org> writes:
> > I think the theory when the semihosting API was originally designed
> > decades ago was basically "when the guest does fopen(...) this
> > should act like it does on the host". So as a bit of portable
> > guest code you would say whether you wanted a binary or a text
> > file, and the effect would be that if you were running on Windows
> > and you output a text file then you'd get \r\n like the user
> > probably expected, and if on Linux you get \n.

> If SYS_OPEN is supposed to call fopen (i didn't actually know that..)
> then it does make more sense for binary/text mode to be propagated from
> guest.

It's not required to literally call fopen(). It just has to
give the specified semantics for when the guest passes it a
mode integer, which is defined in terms of the ISO C
fopen() string semantics for "r", "rb", "r+", "r+b", etc.

> Qemu's implementation calls open(2) though, which is not correct
> at all then. Well, as long as qemu does that, there is no
> posix-compliant way to tell open(2) if it should use binary or text
> mode, there is no notion of that as far as posix (and most
> implementations) is concerned.

QEMU doesn't have to be pure POSIX compliant: we know what our
supported host platforms are and we can freely use extensions
they provide. If we want to achieve the semantics that semihosting
asks for then we can do that with open(), by passing O_BINARY when
the mode integer from the guest corresponds to a string with "b" in it.

I'm about 50:50 on whether we should do that vs documenting and
commenting that we deliberately produce the same behaviour on all
platforms by ignoring the 'b' flag, though.

thanks
-- PMM
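
As a rough illustration of the first of those two choices, the sketch below derives open(2) flags from a SYS_OPEN mode integer and sets O_BINARY only when the corresponding fopen() string contains "b". This is an assumption-laden sketch and not QEMU's implementation: ex_open_flags_for_mode() is a hypothetical name, the bit layout (bit 0 = "b", bit 1 = "+", bits 2 and up select r/w/a) is read off the mode table in the Arm semihosting specification, and the O_BINARY fallback is defined locally.

```c
#include <fcntl.h>

/* Local fallback so the sketch builds on hosts without O_BINARY. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: map a SYS_OPEN mode (0..11) to host open(2)
 * flags, honouring the guest's text/binary request.
 * Returns -1 for a mode outside the defined range. */
static int ex_open_flags_for_mode(int mode)
{
    int base, flags;

    if (mode < 0 || mode > 11) {
        return -1;
    }

    base = mode >> 2;           /* 0 = "r", 1 = "w", 2 = "a" */
    if (base == 0) {
        flags = 0;
    } else if (base == 1) {
        flags = O_CREAT | O_TRUNC;
    } else {
        flags = O_CREAT | O_APPEND;
    }

    if (mode & 2) {             /* "+" variants are read/write */
        flags |= O_RDWR;
    } else {
        flags |= (base == 0) ? O_RDONLY : O_WRONLY;
    }

    if (mode & 1) {             /* "b" variants request binary mode */
        flags |= O_BINARY;
    }
    return flags;
}
```

The alternative choice, ignoring the "b" bit, would amount to OR-ing in O_BINARY unconditionally in the last step, which is what the patch under review does.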
On 1/6/23 7:58 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On Fri, 6 Jan 2023 at 18:22, Evgeny Iakovlev
> <eiakovlev@linux.microsoft.com> wrote:
> >
> >
> > On 1/6/2023 17:28, Peter Maydell wrote:
> >> On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
> >>> Peter Maydell <peter.maydell@linaro.org> writes:
> >> I think the theory when the semihosting API was originally designed
> >> decades ago was basically "when the guest does fopen(...) this
> >> should act like it does on the host". So as a bit of portable
> >> guest code you would say whether you wanted a binary or a text
> >> file, and the effect would be that if you were running on Windows
> >> and you output a text file then you'd get \r\n like the user
> >> probably expected, and if on Linux you get \n.
> >
> > If SYS_OPEN is supposed to call fopen (i didn't actually know that..)
> > then it does make more sense for binary/text mode to be propagated from
> > guest.
>
> It's not required to literally call fopen(). It just has to
> give the specified semantics for when the guest passes it a
> mode integer, which is defined in terms of the ISO C
> fopen() string semantics for "r", "rb", "r+", "r+b", etc.
>
> > Qemu's implementation calls open(2) though, which is not correct
> > at all then. Well, as long as qemu does that, there is no
> > posix-compliant way to tell open(2) if it should use binary or text
> > mode, there is no notion of that as far as posix (and most
> > implementations) is concerned.
>
> QEMU doesn't have to be pure POSIX compliant: we know what our
> supported host platforms are and we can freely use extensions
> they provide. If we want to achieve the semantics that semihosting
> asks for then we can do that with open(), by passing O_BINARY when
> the mode integer from the guest corresponds to a string with "b" in it.
>
> I'm about 50:50 on whether we should do that vs documenting and
> commenting that we deliberately produce the same behaviour on all
> platforms by ignoring the 'b' flag, though.
>
> thanks
> -- PMM
>

Thanks Peter, i think i see your point. However, if you ask me, i feel
like advertising a feature to guest code and only implementing it on 1
platform that supports it just because it has a non-standard POSIX
implementation will only confuse the issue further.

Guest code doesn't want to care whether or not an emulator is running
on Linux or Windows, there is no notion of that leaking to guest code.
What it cares about is being able to consistently use a certain
feature in their code.

So i think it would be rather useless to implement it on Windows-only
given there is a clear alternative to switch to fopen. Just my 2
cents.
On Mon, 16 Jan 2023 at 15:56, <eiakovlev@linux.microsoft.com> wrote:
>
>
>
> On 1/6/23 7:58 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On Fri, 6 Jan 2023 at 18:22, Evgeny Iakovlev
> > <eiakovlev@linux.microsoft.com> wrote:
> > >
> > >
> > > On 1/6/2023 17:28, Peter Maydell wrote:
> > >> On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
> > >>> Peter Maydell <peter.maydell@linaro.org> writes:
> > >> I think the theory when the semihosting API was originally designed
> > >> decades ago was basically "when the guest does fopen(...) this
> > >> should act like it does on the host". So as a bit of portable
> > >> guest code you would say whether you wanted a binary or a text
> > >> file, and the effect would be that if you were running on Windows
> > >> and you output a text file then you'd get \r\n like the user
> > >> probably expected, and if on Linux you get \n.
> >
> > > If SYS_OPEN is supposed to call fopen (i didn't actually know that..)
> > > then it does make more sense for binary/text mode to be propagated from
> > > guest.
> >
> > It's not required to literally call fopen(). It just has to
> > give the specified semantics for when the guest passes it a
> > mode integer, which is defined in terms of the ISO C
> > fopen() string semantics for "r", "rb", "r+", "r+b", etc.
> >
> > > Qemu's implementation calls open(2) though, which is not correct
> > > at all then. Well, as long as qemu does that, there is no
> > > posix-compliant way to tell open(2) if it should use binary or text
> > > mode, there is no notion of that as far as posix (and most
> > > implementations) is concerned.
> >
> > QEMU doesn't have to be pure POSIX compliant: we know what our
> > supported host platforms are and we can freely use extensions
> > they provide. If we want to achieve the semantics that semihosting
> > asks for then we can do that with open(), by passing O_BINARY when
> > the mode integer from the guest corresponds to a string with "b" in it.

> Thanks Peter, i think i see your point. However, if you ask me, i feel
> like advertising a feature to guest code and only implementing it on 1
> platform that supports it just because it has a non-standard POSIX
> implementation will only confuse the issue further.

Huh? We can implement it, if we want, on *all* hosts that we support:
 * On Windows hosts, plumb the binary indication from the semihosting
   SYS_OPEN call through to whether we pass O_BINARY to open(2)
 * On all other hosts, do nothing: on these hosts, text and binary
   files are identical so there is nothing to do

Note that semihosting is not an API that QEMU has specified: it's
an external one provided by multiple platforms. We do not "advertise"
the existence of the 'binary' flag to SYS_OPEN: it is part of the
pre-existing decades-old specification we implement.

> Guest code doesn't want to care whether or not an emulator is
> running on Linux or Windows, there is no notion of that leaking
> to guest code. What it cares about is being able to consistently
> use a certain feature in their code.

The trouble here is that we have two different choices about how
to be consistent:
 (1) Consistently have guests that use semihosting to open a file
     in text mode get the text-mode file that they asked for,
     regardless of the host operating system and its definition
     of what a text file is
 (2) Consistently have guest code produce a binary-identical output
     file regardless of host operating system

It is not possible to have both; we have to pick one. On balance,
I agree with Alex that option (2) is probably better, especially
with the file-I/O-via-gdbstub part of it; but we are genuinely
giving up property (1) in the process.

thanks
-- PMM
eiakovlev@linux.microsoft.com writes:

> On 1/6/23 7:58 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On Fri, 6 Jan 2023 at 18:22, Evgeny Iakovlev
>> <eiakovlev@linux.microsoft.com> wrote:
>> >
>> >
>> > On 1/6/2023 17:28, Peter Maydell wrote:
>> >> On Fri, 6 Jan 2023 at 15:44, Alex Bennée <alex.bennee@linaro.org> wrote:
>> >>> Peter Maydell <peter.maydell@linaro.org> writes:
>> >> I think the theory when the semihosting API was originally designed
>> >> decades ago was basically "when the guest does fopen(...) this
>> >> should act like it does on the host". So as a bit of portable
>> >> guest code you would say whether you wanted a binary or a text
>> >> file, and the effect would be that if you were running on Windows
>> >> and you output a text file then you'd get \r\n like the user
>> >> probably expected, and if on Linux you get \n.
>> > If SYS_OPEN is supposed to call fopen (i didn't actually know
>> that..)
>> > then it does make more sense for binary/text mode to be propagated from
>> > guest.
>> It's not required to literally call fopen(). It just has to
>> give the specified semantics for when the guest passes it a
>> mode integer, which is defined in terms of the ISO C
>> fopen() string semantics for "r", "rb", "r+", "r+b", etc.
>> > Qemu's implementation calls open(2) though, which is not correct
>> > at all then. Well, as long as qemu does that, there is no
>> > posix-compliant way to tell open(2) if it should use binary or text
>> > mode, there is no notion of that as far as posix (and most
>> > implementations) is concerned.
>> QEMU doesn't have to be pure POSIX compliant: we know what our
>> supported host platforms are and we can freely use extensions
>> they provide. If we want to achieve the semantics that semihosting
>> asks for then we can do that with open(), by passing O_BINARY when
>> the mode integer from the guest corresponds to a string with "b" in it.
>> I'm about 50:50 on whether we should do that vs documenting and
>> commenting that we deliberately produce the same behaviour on all
>> platforms by ignoring the 'b' flag, though.
>> thanks
>> -- PMM
>>
>
> Thanks Peter, i think i see your point. However, if you ask me, i feel
> like advertising a feature to guest code and only implementing it on 1
> platform that supports it just because it has a non-standard POSIX
> implementation will only confuse the issue further.
> Guest code doesn't want to care whether or not an emulator is running
> on Linux or Windows, there is no notion of that leaking to guest code.
> What it cares about is being able to consistently use a certain
> feature in their code.
> So i think it would be rather useless to implement it on Windows-only
> given there is a clear alternative to switch to fopen. Just my 2
> cents.

It's not switching to fopen() that is the issue - it's the interaction
with gdb (via gdbstub) which has no idea about the distinction.

Anyway I already have the patch queued with an additional note in the
documentation that all file accesses are in binary mode.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Evgeny Iakovlev <eiakovlev@linux.microsoft.com> writes:

> Windows open(2) implementation opens files in text mode by default and
> needs a Windows-only O_BINARY flag to open files as binary. QEMU already
> knows about that flag in osdep and it is defined to 0 on non-Windows,
> so we can just add it to the host_flags for better compatibility.
>
> Signed-off-by: Evgeny Iakovlev <eiakovlev@linux.microsoft.com>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Bin Meng <bmeng.cn@gmail.com>

Queued to semihosting/next, thanks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro