Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on
POSIX"), the maximum number of file descriptors that can be opened are
raised to nofile.rlim_max. On recent debian distro, this yield a maximum
of 1073741816 file descriptors. Now, when forking to start
qemu-bridge-helper, this actually calls close() on the full possible file
descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which
takes a considerable amount of time. Use close_range() which only
requires to be called twice and factorize it in a separate function for
both call sites.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
---
net/tap.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/net/tap.c b/net/tap.c
index 51f7aec39d..6f5bf06bb5 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer,
return s;
}
+static void fork_close_all_fds_except(int fd)
+{
+ int open_max = sysconf(_SC_OPEN_MAX);
+
+ if (fd > 3)
+ close_range(3, fd - 1, 0);
+
+ if (fd < open_max)
+ close_range(fd + 1, open_max, 0);
+}
+
static void launch_script(const char *setup_script, const char *ifname,
int fd, Error **errp)
{
@@ -400,13 +411,8 @@ static void launch_script(const char *setup_script, const char *ifname,
return;
}
if (pid == 0) {
- int open_max = sysconf(_SC_OPEN_MAX), i;
+ fork_close_all_fds_except(fd);
- for (i = 3; i < open_max; i++) {
- if (i != fd) {
- close(i);
- }
- }
parg = args;
*parg++ = (char *)setup_script;
*parg++ = (char *)ifname;
@@ -490,16 +496,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge,
return -1;
}
if (pid == 0) {
- int open_max = sysconf(_SC_OPEN_MAX), i;
char *fd_buf = NULL;
char *br_buf = NULL;
char *helper_cmd = NULL;
- for (i = 3; i < open_max; i++) {
- if (i != sv[1]) {
- close(i);
- }
- }
+ fork_close_all_fds_except(sv[1]);
fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
--
2.45.2
On Mon, Jun 17, 2024 at 06:25:18PM +0200, Clément Léger wrote: > Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on > POSIX"), the maximum number of file descriptors that can be opened are > raised to nofile.rlim_max. On recent debian distro, this yield a maximum > of 1073741816 file descriptors. Now, when forking to start > qemu-bridge-helper, this actually calls close() on the full possible file > descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which > takes a considerable amount of time. Use close_range() which only > requires to be called twice and factorize it in a separate function for > both call sites. > > Signed-off-by: Clément Léger <cleger@rivosinc.com> > --- > net/tap.c | 25 +++++++++++++------------ > 1 file changed, 13 insertions(+), 12 deletions(-) > > diff --git a/net/tap.c b/net/tap.c > index 51f7aec39d..6f5bf06bb5 100644 > --- a/net/tap.c > +++ b/net/tap.c > @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, > return s; > } > > +static void fork_close_all_fds_except(int fd) > +{ > + int open_max = sysconf(_SC_OPEN_MAX); > + > + if (fd > 3) > + close_range(3, fd - 1, 0); > + > + if (fd < open_max) > + close_range(fd + 1, open_max, 0); > +} We can't assume that 'close_range' exists on all platforms/versions that QEMU targets. In system/async-teardown.c there is close_all_open_fd() that has a fallback path to iterating over /proc, which gives good fallback for Linux. That code doesn't have to deal with non-Linux though. I'd suggest that util/osdep.c needs to have a 'close_all_open_fd()' method that accepts an array of FDs to skip closing of, rather than assuming we always skip STDIO + 1 extra FD. eg int close_all_open_fd(int *skip, int nskip); Could either declare that 'skip' must be sorted, or we can explicitly run qsort() on it. Try native close_range first. If unavailable, then on Linux try /proc, otherwise the simple for() loop. Then use this common helper from both tap.c and asynct-teardown.c > + > static void launch_script(const char *setup_script, const char *ifname, > int fd, Error **errp) > { > @@ -400,13 +411,8 @@ static void launch_script(const char *setup_script, const char *ifname, > return; > } > if (pid == 0) { > - int open_max = sysconf(_SC_OPEN_MAX), i; > + fork_close_all_fds_except(fd); > > - for (i = 3; i < open_max; i++) { > - if (i != fd) { > - close(i); > - } > - } > parg = args; > *parg++ = (char *)setup_script; > *parg++ = (char *)ifname; > @@ -490,16 +496,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge, > return -1; > } > if (pid == 0) { > - int open_max = sysconf(_SC_OPEN_MAX), i; > char *fd_buf = NULL; > char *br_buf = NULL; > char *helper_cmd = NULL; > > - for (i = 3; i < open_max; i++) { > - if (i != sv[1]) { > - close(i); > - } > - } > + fork_close_all_fds_except(sv[1]); > > fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]); > > -- > 2.45.2 > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote: > > Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on > POSIX"), the maximum number of file descriptors that can be opened are > raised to nofile.rlim_max. On recent debian distro, this yield a maximum > of 1073741816 file descriptors. Now, when forking to start > qemu-bridge-helper, this actually calls close() on the full possible file > descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which > takes a considerable amount of time. Use close_range() which only > requires to be called twice and factorize it in a separate function for > both call sites. > > Signed-off-by: Clément Léger <cleger@rivosinc.com> > --- > net/tap.c | 25 +++++++++++++------------ > 1 file changed, 13 insertions(+), 12 deletions(-) > > diff --git a/net/tap.c b/net/tap.c > index 51f7aec39d..6f5bf06bb5 100644 > --- a/net/tap.c > +++ b/net/tap.c > @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, > return s; > } > > +static void fork_close_all_fds_except(int fd) > +{ > + int open_max = sysconf(_SC_OPEN_MAX); > + > + if (fd > 3) > + close_range(3, fd - 1, 0); > + > + if (fd < open_max) > + close_range(fd + 1, open_max, 0); > +} We can't rely on close_range() being available, I think -- this should be ifdeffed with CONFIG_CLOSE_RANGE, with a fallback if it's not there. Also, QEMU coding style requires braces on all if() statements, even single-line ones. thanks -- PMM
On 17/06/2024 18:36, Peter Maydell wrote: > On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote: >> >> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on >> POSIX"), the maximum number of file descriptors that can be opened are >> raised to nofile.rlim_max. On recent debian distro, this yield a maximum >> of 1073741816 file descriptors. Now, when forking to start >> qemu-bridge-helper, this actually calls close() on the full possible file >> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which >> takes a considerable amount of time. Use close_range() which only >> requires to be called twice and factorize it in a separate function for >> both call sites. >> >> Signed-off-by: Clément Léger <cleger@rivosinc.com> >> --- >> net/tap.c | 25 +++++++++++++------------ >> 1 file changed, 13 insertions(+), 12 deletions(-) >> >> diff --git a/net/tap.c b/net/tap.c >> index 51f7aec39d..6f5bf06bb5 100644 >> --- a/net/tap.c >> +++ b/net/tap.c >> @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, >> return s; >> } >> >> +static void fork_close_all_fds_except(int fd) >> +{ >> + int open_max = sysconf(_SC_OPEN_MAX); >> + >> + if (fd > 3) >> + close_range(3, fd - 1, 0); >> + >> + if (fd < open_max) >> + close_range(fd + 1, open_max, 0); >> +} > > We can't rely on close_range() being available, I think -- > this should be ifdeffed with CONFIG_CLOSE_RANGE, with a > fallback if it's not there. Ok, makes sense. > > Also, QEMU coding style requires braces on all if() statements, > even single-line ones. Sorry for that, noticed it after running checkpatch. I'll send a V2 with these fixes. Thanks, Clément > > thanks > -- PMM
© 2016 - 2024 Red Hat, Inc.