ludo@gnu.org (Ludovic Courtès) writes:
lshg/lsh (as of lsh 2.1 on GNU/Linux, x86_64) systematically fails for me when passed large data streams on stdin:
--8<---------------cut here---------------start------------->8---
$ lsh -G -B fencepost.gnu.org
Passphrase for key `xxx@yyy':
$ lshg fencepost.gnu.org uname -o
GNU/Linux
$ cat /dev/zero | lshg fencepost.gnu.org md5sum
lsh: Protocol error: Write buffer full, peer not responding.
lsh: write_buffer: Attempt to write data to closed buffer.
...
Conversely, this works well:
--8<---------------cut here---------------start------------->8---
$ cat /dev/zero | lsh fencepost.gnu.org md5sum
Any idea what’s wrong or how to debug it?
Definitely looks like something wrong in the flow control involving lsh and lshg.
You could first try increasing the WRITE_BUFFER_MARGIN in connection.c, but I doubt that will help (if that was the problem, you'd most likely see it also when lshg isn't involved).
You may also get some info by passing some or all of -v, --trace and --debug to the lsh and lshg processes.
It would also be interesting to see if the problem still exists in the latest version in the repo (which works quite differently; lshg is no longer a separate program, but you still set up the gateway with lsh -G, and then further invocations of lsh will try to use it).
It's been some time since I worked on this code... The way it works, there's a soft_limit limiting the amount of data we're willing to keep buffered for writing to the socket. When that limit is reached, the hard_limit is set. We will still generate new packets to be buffered if needed to respond to a key exchange, but otherwise we're not supposed to generate new packets, and this basically works because read_data.c:do_read_query_data checks if connection->hard_limit is set.
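To make the mechanism described above concrete, here is a rough toy model in C. It is only a sketch of one reading of the description, not the actual lsh code: the struct and function names (toy_connection, toy_queue_packet, toy_socket_wrote) are made up, and the exact conditions under which the real code sets and clears hard_limit are an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection's write-side state. Only the
   names soft_limit and hard_limit come from the message; the rest is
   invented for illustration. */
struct toy_connection
{
  size_t buffered;    /* bytes queued for the socket, not yet written */
  size_t soft_limit;  /* how much we are willing to keep buffered */
  int hard_limit;     /* non-zero once soft_limit has been reached (assumed) */
};

/* Try to queue a packet of the given length. Key exchange packets are
   always accepted; ordinary data is refused while hard_limit is set.
   Returns 1 if the packet was queued, 0 if the producer must wait. */
int
toy_queue_packet(struct toy_connection *c, size_t length, int is_kex)
{
  assert(c != NULL);

  if (c->hard_limit && !is_kex)
    return 0;

  c->buffered += length;
  if (c->buffered >= c->soft_limit)
    c->hard_limit = 1;

  return 1;
}

/* The socket accepted length bytes. Clear the flag once the buffer is
   back under soft_limit (assumed policy). */
void
toy_socket_wrote(struct toy_connection *c, size_t length)
{
  assert(c != NULL);
  assert(length <= c->buffered);

  c->buffered -= length;
  if (c->buffered < c->soft_limit)
    c->hard_limit = 0;
}
```

In this model, a producer that ignores the return value of toy_queue_packet (or never looks at hard_limit at all) keeps growing the buffer without bound, which would match the "Write buffer full" symptom if a reader behaves that way.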
Now, I think the problem is that the code reading from a gateway client socket doesn't check if hard_limit > 0. Not entirely trivial to fix. What needs to be done is to
1. Make gateway_commands.c:do_read_gateway check that flag,

     if (self->connection->chain->hard_limit) ...

   In this case, return zero, but we also need to stop reading from the socket. Maybe one can have the caller, io.c:do_buffered_read, check if the return value is zero, and call lsh_oop_cancel_read_fd?
2. Somehow use the wakeup mechanism (invoked from connection.c:do_connection_flow_controlled) to restart reading from the gateway socket.
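The two steps above could look roughly like the following toy model. This is a sketch under stated assumptions, not a patch: none of the toy_ names exist in lsh, the real handler signatures are different, and the "reading" flag merely stands in for registering or cancelling the read callback on the gateway socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the hard_limit flag is taken from the
   message, all other names are invented for illustration. */
struct toy_chain
{
  int hard_limit;     /* set while the outgoing connection is congested */
};

struct toy_gateway_reader
{
  struct toy_chain *chain;
  int reading;        /* non-zero while the read callback is registered */
  size_t forwarded;   /* bytes passed on to the connection so far */
};

/* Step 1: the read handler checks the flag. Returning zero means
   "consumed nothing, stop reading for now". */
size_t
toy_read_gateway(struct toy_gateway_reader *self, size_t available)
{
  assert(self != NULL && self->chain != NULL);

  if (self->chain->hard_limit)
    return 0;

  self->forwarded += available;
  return available;
}

/* The caller treats a zero return as a request to stop polling the
   socket. This is where the real code might call
   lsh_oop_cancel_read_fd. */
void
toy_buffered_read(struct toy_gateway_reader *self, size_t available)
{
  assert(self != NULL);

  if (!self->reading || available == 0)
    return;

  if (toy_read_gateway(self, available) == 0)
    self->reading = 0;
}

/* Step 2: the wakeup hook restarts reading once the connection is no
   longer flow controlled. */
void
toy_wakeup(struct toy_gateway_reader *self)
{
  assert(self != NULL && self->chain != NULL);

  if (!self->chain->hard_limit)
    self->reading = 1;
}
```

While reading is cancelled, unread data stays in the kernel socket buffer, so the gateway client eventually blocks in write(2) and gets back pressure for free.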
Do you think you could write a test case? At the receiving end, it might help to have a slow receiver of data, say a data sink like
while true; do dd bs=1000 count=1 of=/dev/null; sleep 1; done
Regards,
/Niels
Thanks, Ludo’.

_______________________________________________
lsh-bugs mailing list
lsh-bugs@lists.lysator.liu.se
http://lists.lysator.liu.se/mailman/listinfo/lsh-bugs