block/ssh.c | 2 -- 1 file changed, 2 deletions(-)
From: Jakub Jelen <jjelen@redhat.com>
The libssh does not handle non-blocking mode in SFTP correctly. The
driver code already changes the mode to blocking for the SFTP
initialization, but for some reason changes to non-blocking mode.
This used to work accidentally until libssh in 0.11 branch merged
the patch to avoid infinite looping in case of network errors:
https://gitlab.com/libssh/libssh-mirror/-/merge_requests/498
Since then, the ssh driver in qemu fails to read files over SFTP
as the first SFTP messages exchanged after switching the session
to non-blocking mode return SSH_AGAIN, but that message is lost
int the SFTP internals and interpretted as SSH_ERROR, which is
returned to the caller:
https://gitlab.com/libssh/libssh-mirror/-/issues/280
This is indeed an issue in libssh that we should address in the
long term, but it will require more work on the internals. For
now, the SFTP is not supported in non-blocking mode.
Fixes: https://gitlab.com/libssh/libssh-mirror/-/issues/280
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
---
block/ssh.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/block/ssh.c b/block/ssh.c
index 9f8140bcb6..e1529cfda9 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -866,8 +866,6 @@ static int ssh_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
goto err;
}
- /* Go non-blocking. */
- ssh_set_blocking(s->session, 0);
if (s->attrs->type == SSH_FILEXFER_TYPE_REGULAR) {
bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE;
--
2.46.0
Heh. I was creating a qemu bug report on gitlab when this email arrived :) 13.11.2024 14:49, Richard W.M. Jones wrote: > From: Jakub Jelen <jjelen@redhat.com> > > The libssh does not handle non-blocking mode in SFTP correctly. The > driver code already changes the mode to blocking for the SFTP > initialization, but for some reason changes to non-blocking mode. "changes to non-blocking mode LATER", I guess, - or else it's a bit difficult to read. But this works too. > This used to work accidentally until libssh in 0.11 branch merged > the patch to avoid infinite looping in case of network errors: > > https://gitlab.com/libssh/libssh-mirror/-/merge_requests/498 > > Since then, the ssh driver in qemu fails to read files over SFTP > as the first SFTP messages exchanged after switching the session > to non-blocking mode return SSH_AGAIN, but that message is lost > int the SFTP internals and interpretted as SSH_ERROR, which is > returned to the caller: > > https://gitlab.com/libssh/libssh-mirror/-/issues/280 > > This is indeed an issue in libssh that we should address in the > long term, but it will require more work on the internals. For > now, the SFTP is not supported in non-blocking mode. The comment at init where the code sets socket to blocking mode, says: /* * Make sure we are in blocking mode during the connection and * authentication phases. */ ssh_set_blocking(s->session, 1); There are a few other places where the code expect "some" blocking mode, changes it to blocking, and restores the mode later, - eg, see ssh_grow_file(). It looks all this has to be fixed too. I wonder if qemu ssh driver needs to mess with blocking mode of this socket in the first place, ever. Is there a way qemu can get non-blocking socket in this context? I can only think of fd=NNN, but is it possible for this socket to be non-blocking? > Fixes: https://gitlab.com/libssh/libssh-mirror/-/issues/280 > Signed-off-by: Jakub Jelen <jjelen@redhat.com> > Signed-off-by: Richard W.M. Jones <rjones@redhat.com> > --- > block/ssh.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/block/ssh.c b/block/ssh.c > index 9f8140bcb6..e1529cfda9 100644 > --- a/block/ssh.c > +++ b/block/ssh.c > @@ -866,8 +866,6 @@ static int ssh_open(BlockDriverState *bs, QDict *options, int bdrv_flags, > goto err; > } > > - /* Go non-blocking. */ > - ssh_set_blocking(s->session, 0); > > if (s->attrs->type == SSH_FILEXFER_TYPE_REGULAR) { > bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE; Please remove the empty line too. /mjt
On Wed, Nov 13, 2024 at 03:02:59PM +0300, Michael Tokarev wrote: > Heh. I was creating a qemu bug report on gitlab when this email arrived :) > > 13.11.2024 14:49, Richard W.M. Jones wrote: > >From: Jakub Jelen <jjelen@redhat.com> > > > >The libssh does not handle non-blocking mode in SFTP correctly. The > >driver code already changes the mode to blocking for the SFTP > >initialization, but for some reason changes to non-blocking mode. > > "changes to non-blocking mode LATER", I guess, - or else it's a bit > difficult to read. But this works too. > > >This used to work accidentally until libssh in 0.11 branch merged > >the patch to avoid infinite looping in case of network errors: > > > >https://gitlab.com/libssh/libssh-mirror/-/merge_requests/498 > > > >Since then, the ssh driver in qemu fails to read files over SFTP > >as the first SFTP messages exchanged after switching the session > >to non-blocking mode return SSH_AGAIN, but that message is lost > >int the SFTP internals and interpretted as SSH_ERROR, which is > >returned to the caller: > > > >https://gitlab.com/libssh/libssh-mirror/-/issues/280 > > > >This is indeed an issue in libssh that we should address in the > >long term, but it will require more work on the internals. For > >now, the SFTP is not supported in non-blocking mode. > > The comment at init where the code sets socket to blocking mode, says: > > /* > * Make sure we are in blocking mode during the connection and > * authentication phases. > */ > ssh_set_blocking(s->session, 1); > > > There are a few other places where the code expect "some" blocking > mode, changes it to blocking, and restores the mode later, - eg, > see ssh_grow_file(). It looks all this has to be fixed too. I'll just note that I'm only forwarding on the patch from Jakub. I didn't write it. I did lightly test it, and it seems to work. However it seems also likely that it is causing qemu to block internally. Probably not noticable for light use, but not something that we'd want for serious use. However if libssh doesn't support non-blocking SFTP there's not much we can do about that in qemu. I would recommend using nbdkit-ssh-plugin instead anyway as it is much more featureful and doesn't have this problem as we use real threads instead of coroutines. > I wonder if qemu ssh driver needs to mess with blocking mode of this > socket in the first place, ever. Is there a way qemu can get non-blocking > socket in this context? I can only think of fd=NNN, but is it > possible for this socket to be non-blocking? > > >Fixes: https://gitlab.com/libssh/libssh-mirror/-/issues/280 > >Signed-off-by: Jakub Jelen <jjelen@redhat.com> > >Signed-off-by: Richard W.M. Jones <rjones@redhat.com> > >--- > > block/ssh.c | 2 -- > > 1 file changed, 2 deletions(-) > > > >diff --git a/block/ssh.c b/block/ssh.c > >index 9f8140bcb6..e1529cfda9 100644 > >--- a/block/ssh.c > >+++ b/block/ssh.c > >@@ -866,8 +866,6 @@ static int ssh_open(BlockDriverState *bs, QDict *options, int bdrv_flags, > > goto err; > > } > >- /* Go non-blocking. */ > >- ssh_set_blocking(s->session, 0); > > if (s->attrs->type == SSH_FILEXFER_TYPE_REGULAR) { > > bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE; > > Please remove the empty line too. I posted a v2 which removes the blank line but is otherwise the same. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Am 13.11.2024 um 14:00 hat Richard W.M. Jones geschrieben: > On Wed, Nov 13, 2024 at 03:02:59PM +0300, Michael Tokarev wrote: > > Heh. I was creating a qemu bug report on gitlab when this email arrived :) > > > > 13.11.2024 14:49, Richard W.M. Jones wrote: > > >From: Jakub Jelen <jjelen@redhat.com> > > > > > >The libssh does not handle non-blocking mode in SFTP correctly. The > > >driver code already changes the mode to blocking for the SFTP > > >initialization, but for some reason changes to non-blocking mode. > > > > "changes to non-blocking mode LATER", I guess, - or else it's a bit > > difficult to read. But this works too. > > > > >This used to work accidentally until libssh in 0.11 branch merged > > >the patch to avoid infinite looping in case of network errors: > > > > > >https://gitlab.com/libssh/libssh-mirror/-/merge_requests/498 > > > > > >Since then, the ssh driver in qemu fails to read files over SFTP > > >as the first SFTP messages exchanged after switching the session > > >to non-blocking mode return SSH_AGAIN, but that message is lost > > >int the SFTP internals and interpretted as SSH_ERROR, which is > > >returned to the caller: > > > > > >https://gitlab.com/libssh/libssh-mirror/-/issues/280 > > > > > >This is indeed an issue in libssh that we should address in the > > >long term, but it will require more work on the internals. For > > >now, the SFTP is not supported in non-blocking mode. > > > > The comment at init where the code sets socket to blocking mode, says: > > > > /* > > * Make sure we are in blocking mode during the connection and > > * authentication phases. > > */ > > ssh_set_blocking(s->session, 1); > > > > > > There are a few other places where the code expect "some" blocking > > mode, changes it to blocking, and restores the mode later, - eg, > > see ssh_grow_file(). It looks all this has to be fixed too. I agree, if we're moving away from non-blocking sessions, then we should remove the switching everywhere. But obviously... > I'll just note that I'm only forwarding on the patch from Jakub. > I didn't write it. > > I did lightly test it, and it seems to work. However it seems also > likely that it is causing qemu to block internally. Probably not > noticable for light use, but not something that we'd want for serious > use. However if libssh doesn't support non-blocking SFTP there's not > much we can do about that in qemu. ...just making it blocking is not acceptable. It will potentially make the guest hang while we're waiting for sftp responses. I see that there is an sftp_aio_*() API, but it looks weird. Instead of allowing you to just poll the next request that is ready, you have to call a (blocking) wait on a specific request. co_yield(), which is currently used when sftp_read() returns SSH_AGAIN, makes sure that we poll the socket fd, so we can know that _something_ new has arrived. However it's unclear to me how to know _which_ request received a reply and can be completed now. It seems you have to call sftp_aio_wait_*() in non-blocking mode on all requests to do that, which probably is affected by the libssh bug, too. So I'm not sure if sftp_aio_*() can be combined with something else into a working solution, and I also don't know if it's affected by the same libssh bug. Jakub, can you help with that? > I would recommend using nbdkit-ssh-plugin instead anyway as it is much > more featureful and doesn't have this problem as we use real threads > instead of coroutines. Telling people to switch away from QEMU is not an appropriate fix for the problem. QEMU has all of the infrastructure with thread pools etc., that's not a unique thing of nbdkit. So if libssh can't provide working non-blocking connections, we'll have to use blocking sftp_read() in a worker thread. It's uglier than using a proper asynchronous interface, but we'll have to work with whatever we get from the library. As far as I can see, libssh sessions aren't thread safe, so we'll have to make sure to have only one request going at the same time, but I assume that calling ssh_read/write() from different threads sequentially isn't a problem? > > I wonder if qemu ssh driver needs to mess with blocking mode of this > > socket in the first place, ever. Is there a way qemu can get non-blocking > > socket in this context? I can only think of fd=NNN, but is it > > possible for this socket to be non-blocking? I'm not sure if this is actually related to blocking sockets specifically. It seems to me that it's more about blocking behaviour in libssh itself, while it internally uses poll() to avoid blocking. Kevin
Hello, comments inline below. On Thu, Nov 14, 2024 at 4:21 PM Kevin Wolf <kwolf@redhat.com> wrote: > [...] > > > I'll just note that I'm only forwarding on the patch from Jakub. > > I didn't write it. > > > > I did lightly test it, and it seems to work. However it seems also > > likely that it is causing qemu to block internally. Probably not > > noticable for light use, but not something that we'd want for serious > > use. However if libssh doesn't support non-blocking SFTP there's not > > much we can do about that in qemu. > > ...just making it blocking is not acceptable. It will potentially make > the guest hang while we're waiting for sftp responses. This is the limitation of the SFTP API we have (and a reason why we have a new API below, but only in new 0.11.0 release so not solution for older systems that wont get new libssh). > I see that there is an sftp_aio_*() API, but it looks weird. Instead of > allowing you to just poll the next request that is ready, you have to > call a (blocking) wait on a specific request. Its more "streaming" API allowing the request and response overlap in time allowing better throughput on networks with large latency. To support fully non blocking API in SFTP, there is still way to go, but this api should be more ready for that than the old one. > co_yield(), which is currently used when sftp_read() returns SSH_AGAIN, > makes sure that we poll the socket fd, so we can know that _something_ > new has arrived. However it's unclear to me how to know _which_ request > received a reply and can be completed now. It seems you have to call > sftp_aio_wait_*() in non-blocking mode on all requests to do that, which > probably is affected by the libssh bug, too. Are you sure that with the old libssh versions you were getting SSH_AGAIN in non-blocking mode? Michael in the following comment found the change where the issue started to demonstrate: https://gitlab.com/libssh/libssh-mirror/-/issues/280#note_2204139954 https://gitlab.com/libssh/libssh-mirror/-/commit/2d3b7e07af3675b9a0326bc5c6253a0bbbda567b And from what I read, it was just silently behaving as blocking (potentially infinitely) instead of returning SSH_AGAIN deep in libssh code. > So I'm not sure if sftp_aio_*() can be combined with something else into > a working solution, and I also don't know if it's affected by the same > libssh bug. Right now, we do not have a full solution. But having SFTP working completely in nonoblocking mode is one of the things we would like to have in the long term. > Jakub, can you help with that? > > [...] > > As far as I can see, libssh sessions aren't thread safe, so we'll have > to make sure to have only one request going at the same time, but I > assume that calling ssh_read/write() from different threads sequentially > isn't a problem? My understanding is that the thread safety of libssh is limited to not sharing session between threads -- there is no synchronization if two threads would send packets at the same time: https://api.libssh.org/master/ If you will make sure you will not call sftp_read()/sftp_write() at the same time from different threads, it might work, but it is untested. (joined the ML so I hope this mail will make it there) Jakub
Am 14.11.2024 um 17:49 hat Jakub Jelen geschrieben: > Hello, > comments inline below. > > On Thu, Nov 14, 2024 at 4:21 PM Kevin Wolf <kwolf@redhat.com> wrote: > > [...] > > > > > I'll just note that I'm only forwarding on the patch from Jakub. > > > I didn't write it. > > > > > > I did lightly test it, and it seems to work. However it seems also > > > likely that it is causing qemu to block internally. Probably not > > > noticable for light use, but not something that we'd want for serious > > > use. However if libssh doesn't support non-blocking SFTP there's not > > > much we can do about that in qemu. > > > > ...just making it blocking is not acceptable. It will potentially make > > the guest hang while we're waiting for sftp responses. > > This is the limitation of the SFTP API we have (and a reason why we > have a new API below, but only in new 0.11.0 release so not solution > for older systems that wont get new libssh). > > > I see that there is an sftp_aio_*() API, but it looks weird. Instead of > > allowing you to just poll the next request that is ready, you have to > > call a (blocking) wait on a specific request. > > Its more "streaming" API allowing the request and response overlap in time > allowing better throughput on networks with large latency. > > To support fully non blocking API in SFTP, there is still way to go, but this > api should be more ready for that than the old one. Ok, so something to possibly look into later, but not for the time being. > > co_yield(), which is currently used when sftp_read() returns SSH_AGAIN, > > makes sure that we poll the socket fd, so we can know that _something_ > > new has arrived. However it's unclear to me how to know _which_ request > > received a reply and can be completed now. It seems you have to call > > sftp_aio_wait_*() in non-blocking mode on all requests to do that, which > > probably is affected by the libssh bug, too. > > Are you sure that with the old libssh versions you were getting SSH_AGAIN > in non-blocking mode? Michael in the following comment found the > change where the issue started to demonstrate: > > https://gitlab.com/libssh/libssh-mirror/-/issues/280#note_2204139954 > https://gitlab.com/libssh/libssh-mirror/-/commit/2d3b7e07af3675b9a0326bc5c6253a0bbbda567b > > And from what I read, it was just silently behaving as blocking > (potentially infinitely) instead of returning SSH_AGAIN deep in libssh > code. Hm, after looking some more at the code, I agree that it can't have worked, for the simple reason that sftp_read() never returns SSH_AGAIN, but turns it into 0. Which QEMU would have reported as an I/O error if we're not at EOF. What I don't understand yet where in the code it would have blocked before rather than returning an error. I tried to follow the code path and didn't see anything like it, but obviously I'm also not familiar with libssh code. I guess it also doesn't really matter as long as we know it has always been broken... The thing that maybe misled me is that sftp_recv_response_msg() calls ssh_channel_poll() first to make sure that there is even something to read. So I expected it should have been non-blocking at least in some cases, but if it had been, we would probably have seen I/O errors all the time? > > So I'm not sure if sftp_aio_*() can be combined with something else into > > a working solution, and I also don't know if it's affected by the same > > libssh bug. > > Right now, we do not have a full solution. But having SFTP working > completely in nonoblocking mode is one of the things we would like to have > in the long term. > > > Jakub, can you help with that? > > > > [...] > > > > As far as I can see, libssh sessions aren't thread safe, so we'll have > > to make sure to have only one request going at the same time, but I > > assume that calling ssh_read/write() from different threads sequentially > > isn't a problem? > > My understanding is that the thread safety of libssh is limited to not > sharing session between threads -- there is no synchronization if two > threads would send packets at the same time: > > https://api.libssh.org/master/ > > If you will make sure you will not call sftp_read()/sftp_write() at > the same time from different threads, it might work, but it is > untested. How do you feel about it? Do you think this is something libssh can support, or is it something that might accidentally work today, but not necessarily next year? We have a thread pool readily available that we could use, but then requests for the same session would come from different threads - just never at the same time. If we need a single long-lived thread per session instead, that might be a little more involved because we might have to implement all the communication and synchronisation from scratch. (Hmm... Or we abuse the IOThread object to create one internally and just move the request coroutine to it around libssh calls. That could be easy enough.) Kevin
Hi Kevin, Sorry for the delay, my gmail filters will need some love to handle this high-traffic mailing lists so I can catch replies ... On Thu, Nov 14, 2024 at 6:49 PM Kevin Wolf <kwolf@redhat.com> wrote: > [...] > Hm, after looking some more at the code, I agree that it can't have > worked, for the simple reason that sftp_read() never returns SSH_AGAIN, > but turns it into 0. Which QEMU would have reported as an I/O error if > we're not at EOF. > > What I don't understand yet where in the code it would have blocked > before rather than returning an error. I tried to follow the code path > and didn't see anything like it, but obviously I'm also not familiar > with libssh code. I guess it also doesn't really matter as long as we > know it has always been broken... I think it is the cycle in the sftp_packet_read(), which was polling until we got something to read here: https://gitlab.com/libssh/libssh-mirror/-/blob/master/src/sftp_common.c?ref_type=heads#L75-L100 But I would have to retrace through the code as I already forgot where I was looking to. > The thing that maybe misled me is that sftp_recv_response_msg() calls > ssh_channel_poll() first to make sure that there is even something to > read. So I expected it should have been non-blocking at least in some > cases, but if it had been, we would probably have seen I/O errors all > the time? Note, that the sftp_recv_response_msg() function is new in master and not yet in any released versions. The 0.11 version had processing directly in the sftp_read() (and in each other function handling some responses). https://gitlab.com/libssh/libssh-mirror/-/blob/stable-0.11/src/sftp.c?ref_type=heads#L1200 Also the ssh_channel_poll() is called only when the file handle is in non-blocking mode (whatever it means). The qemu sets the blocking mode here: https://github.com/qemu/qemu/blob/master/block/ssh.c#L805C1-L805C44 This means the execution goes directly into sftp_read_and_dispatch(), which we discussed above as it does the blocking. > > > So I'm not sure if sftp_aio_*() can be combined with something else into > > > a working solution, and I also don't know if it's affected by the same > > > libssh bug. > > > > Right now, we do not have a full solution. But having SFTP working > > completely in nonoblocking mode is one of the things we would like to have > > in the long term. > > > > > Jakub, can you help with that? > > > > > > [...] > > > > > > As far as I can see, libssh sessions aren't thread safe, so we'll have > > > to make sure to have only one request going at the same time, but I > > > assume that calling ssh_read/write() from different threads sequentially > > > isn't a problem? > > > > My understanding is that the thread safety of libssh is limited to not > > sharing session between threads -- there is no synchronization if two > > threads would send packets at the same time: > > > > https://api.libssh.org/master/ > > > > If you will make sure you will not call sftp_read()/sftp_write() at > > the same time from different threads, it might work, but it is > > untested. > > How do you feel about it? Do you think this is something libssh can > support, or is it something that might accidentally work today, but not > necessarily next year? If we will write some test coverage for this, I think we can make sure it keeps working. There is really nothing that libssh would do in the background so the stuff should not go wrong. We will just make sure to double-check all the variables are a part of session and not scattered around as static variables or thread local ones (I hope we don't have much of these though). > We have a thread pool readily available that we could use, but then > requests for the same session would come from different threads - just > never at the same time. If we need a single long-lived thread per > session instead, that might be a little more involved because we might > have to implement all the communication and synchronisation from > scratch. > > (Hmm... Or we abuse the IOThread object to create one internally and > just move the request coroutine to it around libssh calls. That could be > easy enough.) Sorry, I don't have much experience around this to bring any useful insight here. So going back to the original issue, is the proposed patch something that could work for you in the short term before a better solution will be implemented or is there something we should change? Jakub
Am 18.11.2024 um 18:06 hat Jakub Jelen geschrieben: > Hi Kevin, > Sorry for the delay, my gmail filters will need some love to handle this > high-traffic mailing lists so I can catch replies ... > > On Thu, Nov 14, 2024 at 6:49 PM Kevin Wolf <kwolf@redhat.com> wrote: > > [...] > > Hm, after looking some more at the code, I agree that it can't have > > worked, for the simple reason that sftp_read() never returns SSH_AGAIN, > > but turns it into 0. Which QEMU would have reported as an I/O error if > > we're not at EOF. > > > > What I don't understand yet where in the code it would have blocked > > before rather than returning an error. I tried to follow the code path > > and didn't see anything like it, but obviously I'm also not familiar > > with libssh code. I guess it also doesn't really matter as long as we > > know it has always been broken... > > I think it is the cycle in the sftp_packet_read(), which was polling > until we got something to read here: > > https://gitlab.com/libssh/libssh-mirror/-/blob/master/src/sftp_common.c?ref_type=heads#L75-L100 > > But I would have to retrace through the code as I already forgot where I > was looking to. The loop you highlighted does at most 4 reads and errors out if it doesn't read anything, so I don't think that's it. If it blocks, something inside of ssh_channel_read() must do that. > > The thing that maybe misled me is that sftp_recv_response_msg() calls > > ssh_channel_poll() first to make sure that there is even something to > > read. So I expected it should have been non-blocking at least in some > > cases, but if it had been, we would probably have seen I/O errors all > > the time? > > Note, that the sftp_recv_response_msg() function is new in master and > not yet in any released versions. The 0.11 version had processing directly > in the sftp_read() (and in each other function handling some responses). > > https://gitlab.com/libssh/libssh-mirror/-/blob/stable-0.11/src/sftp.c?ref_type=heads#L1200 > > Also the ssh_channel_poll() is called only when the file handle is in > non-blocking mode (whatever it means). The qemu sets the blocking mode > here: > > https://github.com/qemu/qemu/blob/master/block/ssh.c#L805C1-L805C44 > > This means the execution goes directly into sftp_read_and_dispatch(), > which we discussed above as it does the blocking. Oh, right, I was confused with blocking mode for sftp and for the ssh session. We have sftp blocking and the session non-blocking for the most part. > > > > So I'm not sure if sftp_aio_*() can be combined with something else into > > > > a working solution, and I also don't know if it's affected by the same > > > > libssh bug. > > > > > > Right now, we do not have a full solution. But having SFTP working > > > completely in nonoblocking mode is one of the things we would like to have > > > in the long term. > > > > > > > Jakub, can you help with that? > > > > > > > > [...] > > > > > > > > As far as I can see, libssh sessions aren't thread safe, so we'll have > > > > to make sure to have only one request going at the same time, but I > > > > assume that calling ssh_read/write() from different threads sequentially > > > > isn't a problem? > > > > > > My understanding is that the thread safety of libssh is limited to not > > > sharing session between threads -- there is no synchronization if two > > > threads would send packets at the same time: > > > > > > https://api.libssh.org/master/ > > > > > > If you will make sure you will not call sftp_read()/sftp_write() at > > > the same time from different threads, it might work, but it is > > > untested. > > > > How do you feel about it? Do you think this is something libssh can > > support, or is it something that might accidentally work today, but not > > necessarily next year? > > If we will write some test coverage for this, I think we can make sure it > keeps working. There is really nothing that libssh would do in the background > so the stuff should not go wrong. We will just make sure to double-check > all the variables are a part of session and not scattered around as static > variables or thread local ones (I hope we don't have much of these though). Ok, that sounds good. > > We have a thread pool readily available that we could use, but then > > requests for the same session would come from different threads - just > > never at the same time. If we need a single long-lived thread per > > session instead, that might be a little more involved because we might > > have to implement all the communication and synchronisation from > > scratch. > > > > (Hmm... Or we abuse the IOThread object to create one internally and > > just move the request coroutine to it around libssh calls. That could be > > easy enough.) > > Sorry, I don't have much experience around this to bring any useful insight > here. > > So going back to the original issue, is the proposed patch something > that could work for you in the short term before a better solution > will be implemented or is there something we should change? Given that apparently it was always blocking, the patch doesn't make the situation any worse. I'll apply it for 9.2. Kevin
© 2016 - 2024 Red Hat, Inc.