we've been using fsh for some internal stuff recently, along with
openssh (currently we're using version 3.0.2). we've been having some
problems with ssh connections failing with the message:
Received disconnect from x.x.x.x: 2: fork failed: Resource
temporarily unavailable
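(for context: that disconnect message is just sshd reporting that fork(2) failed with EAGAIN, i.e. the per-user process limit, or the kernel's overall task table, was exhausted. a quick way to see how close a box is to the limit; nothing here is fsh-specific:)

```shell
# soft limit on processes for the current user (bash builtin; may
# print "unlimited" on some systems)
ulimit -u
# rough count of processes this user currently owns
ps -u "$(id -un)" | sed 1d | wc -l
```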
the problem seems to be related to having a lot of openssh procs, a lot
of fsh procs, or some combination of the two. as a workaround we
currently have a cron job that runs every 15 minutes and does:
for A in `ps aux | grep fsh | egrep -v "reset_fsh|fsh -T" | awk '{print $2}'`; do
    kill $A
done
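(a sketch of a slightly safer version of the same cleanup, assuming the
procps pgrep is available -- the reset_fsh / "fsh -T" patterns are from
our setup. pgrep -f matches full command lines and, unlike ps | grep,
never matches itself:)

```shell
# signal every fsh process except the reset_fsh wrapper and the
# "fsh -T" servers; pgrep -f matches against the full command line
for pid in $(pgrep -f fsh); do
    args=$(ps -p "$pid" -o args= 2>/dev/null)
    case "$args" in
        *reset_fsh*|*'fsh -T'*) ;;      # leave these alone
        *) kill "$pid" 2>/dev/null ;;
    esac
done
```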
this fixes the problem, but obviously isn't a good long-term solution.
we use ssh and scp for a great deal of stuff, so the machines that are
having problems (which act as controllers for our other machines)
usually have a LOT of ssh procs running.
most of the machines are debian linux (potato) with a custom-built fsh
1.1 package (based on the woody package). there are a few freebsd
machines as well.
the problem has gotten worse since we switched to using ssh protocol 2
everywhere (i assume the extra processor overhead needed to deal with
the larger dsa keys doesn't help).
so i realize the problem most likely lies partly with ssh, but it only
seems to become problematic when we're using fsh to tunnel connections.
any tips on configuring fsh, ssh, or both to allow more concurrent
procs? i tried editing the limits.h include file to allow more procs
and recompiling ssh, but that didn't seem to help.
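(for what it's worth, the per-user process cap that makes fork() fail
this way is a runtime resource limit on linux, not a compile-time
constant, which would explain why the limits.h recompile did nothing.
a sketch of raising it where sshd is started -- 4096 is an arbitrary
example value, and i haven't confirmed this cures the fsh case:)

```shell
# in sshd's init script, before the daemon is launched:
# show the current per-user process soft limit
ulimit -u
# raise it for sshd and everything it forks (4096 is only an example;
# going above the hard limit needs root)
ulimit -u 4096 2>/dev/null || echo "couldn't raise limit (not root?)"
```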
i'm a bit more hesitant to edit the fsh side of things since i don't
know python at all.
anyway, any help would be greatly appreciated, especially from people
who use fsh with a great number of concurrent connections.
--
Experience -- a great teacher, but the tuition fees...