Expect script times out when > 28 SSH sessions are spawned




Hi all,

I am having a problem with an Expect test script, test.exp, that spawns
more than 28 SSH sessions to execute a shell script on a 64-bit SuSE 9
Linux box. The Expect script only receives output from the first 28 SSH
sessions; for the remaining sessions it neither receives any data nor
recognizes EOF. test.exp spawns 30 SSH sessions, and it always times out
because it gets neither data nor EOF from the last two. It looks as if
some limit is set somewhere in SSH, SSHD, the network settings, etc., but
I am not sure where this 28-session limit is defined, if it is defined
at all.
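
One way to rule out the more common per-process and system-wide limits is
to print them before running the test. The following is a minimal shell
sketch, not a diagnosis: it assumes a Linux system with /proc mounted, and
the function name show_limits is made up for illustration. Each Expect
spawn allocates a pty and holds a file descriptor, which is why these
particular values are worth a look, but nothing here is confirmed to be
the source of the 28-session limit.

```shell
#!/bin/sh
# Hypothetical helper: print limits that could cap the number of sessions.
show_limits() {
    # Open file descriptor limit for this shell and the processes it starts.
    echo "open files (ulimit -n): $(ulimit -n)"

    # System-wide pty ceiling (max) and ptys currently in use (nr).
    for f in /proc/sys/kernel/pty/max /proc/sys/kernel/pty/nr; do
        if [ -r "$f" ]; then
            echo "$f: $(cat "$f")"
        else
            echo "$f: not available"
        fi
    done
}

show_limits
```

Run it from the same shell and user that starts test.exp, since the
descriptor limit is inherited from the parent process.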

Here is information on Tcl/Expect/Linux/SSH.

x. x64 SuSe 9 Linux:

# uname -a
Linux lussd-gsc8-9 2.6.5-7.244-smp #1 SMP Mon Dec 12 18:32:25 UTC 2005 x86_64 x86_64 x86_64 GNU/Linux

x. Tcl/Tk: 8.3.5
Expect: 5.40.0 (I also tried the more recent Expect 5.43.0 binary,
but the script behaves the same)

x. SSH: OpenSSH 4.1p1

# rpm -qf /usr/bin/ssh
openssh-4.1p1-11.10

SSH has been configured to suppress (or bypass) the password prompt when
invoking a command, so that SSH behaves the same as RSH on other UNIXes.

test.exp just spawns a dummy shell script, /tmp/echo_me, via SSH in a
loop, 30 times in total. /tmp/echo_me prints one line of output: its pid
and any command-line options.

# cat /tmp/echo_me
#!/bin/sh

#sleep 2
echo "I got it: $$ : $*"
#sleep 2

exit 0
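
For reference, the output format of the helper can be checked locally
without SSH. The sketch below writes a copy of the script to a temporary
directory and runs it with two arguments. The mktemp-based location and
the sample arguments are assumptions for illustration, not part of the
original setup; in test.exp the only argument passed is the session index.

```shell
#!/bin/sh
# Write a copy of echo_me to a scratch directory (hypothetical location).
tmpdir=$(mktemp -d)
cat > "$tmpdir/echo_me" <<'EOF'
#!/bin/sh
echo "I got it: $$ : $*"
exit 0
EOF

# Run it with two sample arguments. The line has the form
# "I got it: <pid> : 7 extra", and the pid differs on every run.
line=$(sh "$tmpdir/echo_me" 7 extra)
echo "$line"

rm -rf "$tmpdir"
```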

I've tried running test.exp with exp_internal turned on, but it does not
reveal anything new: no data was received from the last two SSH sessions.


With exp_internal turned on, the Expect script output looks like:
..
..
..
expect: does "" (spawn_id exp33) match regular expression "(.+)"? no

expect: does "" (spawn_id exp32) match regular expression "(.+)"? no
expect: timed out
TIMEOUT: 2: exp32 exp33
TIMEOUT: exp32: pid 19890
TIMEOUT: exp33: pid 19905

From another xterm window:

# ps -ef |grep defunc
root 16615 16612 0 14:00 ? 00:00:00 [sh] <defunct>
root 19890 19627 0 14:37 ? 00:00:00 [ssh] <defunct>
root 19905 19627 0 14:37 ? 00:00:00 [ssh] <defunct>
root 19994 9532 0 14:38 pts/3 00:00:00 grep defunc

As you can see above, the two SSH client processes have exited, but
somehow Expect does not see EOF for them, so it times out instead.
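
The defunct entries can also be listed by process state rather than by
matching a word in the ps output. This is a rough sketch that assumes a
Linux /proc filesystem; list_zombies is a made-up name. It prints the pid
and name of every process whose state is Z (zombie), meaning the process
has exited but its parent has not yet collected its exit status.

```shell
#!/bin/sh
# Hypothetical helper: print "pid name" for every zombie process.
list_zombies() {
    for status in /proc/[0-9]*/status; do
        # A process may vanish between the glob and the read; skip it.
        [ -r "$status" ] || continue
        awk '
            /^Name:/  { name = $2 }
            /^Pid:/   { pid = $2 }
            /^State:/ { state = $2 }
            END       { if (state == "Z") print pid, name }
        ' "$status" 2>/dev/null
    done
    return 0
}

list_zombies
```

Comparing this list with the pids that test.exp reports on timeout shows
which spawned sessions have already exited.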

Has anyone seen this sort of problem? Any suggestions on how to resolve
it? The current workaround is to hard-code a maximum of 28 SSH sessions
in the Expect script.
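
The workaround amounts to running the sessions in batches that never
exceed the number known to work. Below is a minimal sketch of that idea
in plain shell, outside Expect. The function run_batched, its argument
order, and the use of echo in place of ssh are all assumptions for
illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command $1 times, at most $2 at once.
# Each run gets its index appended as the last argument, as test.exp does.
run_batched() {
    total=$1
    batch_size=$2
    shift 2
    i=0
    while [ "$i" -lt "$total" ]; do
        n=0
        # Start up to batch_size jobs in the background.
        while [ "$n" -lt "$batch_size" ] && [ "$i" -lt "$total" ]; do
            "$@" "$i" &
            i=$((i + 1))
            n=$((n + 1))
        done
        # Let the whole batch finish before starting the next one.
        wait
    done
}

# Example: 30 runs in batches of 28, with echo standing in for ssh.
run_batched 30 28 echo "run"
```

With the real command, the call would look like
run_batched 30 28 ssh -n localhost /tmp/echo_me. This only sidesteps the
limit; it does not explain it.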

Thanks,

test.exp:

#!/usr/bin/expect

set hosts "localhost"
set cmd_str "ssh -n %s /tmp/echo_me"
set sid_list ""

#log_user 1
#exp_internal 1

# Spawn one SSH session per host, 30 times over, recording each spawn id.
set my_id 0
for {set i 0} {$i < 30} {incr i} {
    foreach h $hosts {
        set cmd [format "$cmd_str" $h]

        if {[catch {eval spawn $cmd $my_id} pid]} {
            puts " '$cmd $i' failed: $pid\n"
            break
        } else {
            puts "$h: $cmd $i: $spawn_id, $pid\n"
            lappend sid_list $spawn_id
            set pid_list($spawn_id) $pid
            set data($spawn_id) {}
        }
        incr my_id 1
    }
}

# Collect output from every session until each one reports EOF.
set sid_eof $sid_list
#set pat "(.*)\r\n"
set pat "(.+)"
set eof_exp [llength $sid_list]
set eof_cnt 0
set eof_left $eof_exp

while {1} {
    expect {
        -i "$sid_eof" -re "$pat" {
            set id $expect_out(spawn_id)
            puts "$id: $expect_out(1,string)"
            append data($id) $expect_out(1,string)
        }
        -i "$sid_eof" eof {
            if {[catch {set id $expect_out(spawn_id)} err]} {
                puts "eof: $err\n"
                break
            }
            puts "EOF: $id"
            # Stop watching this session, then close it and reap the process.
            set k [lsearch -exact $sid_eof $id]
            if {$k >= 0} {
                set sid_eof [lreplace $sid_eof $k $k]
            } else {
                puts "Error, eof missing '$id' in slist: $sid_eof\n"
            }
            set spawn_id $id
            catch {close}
            catch {wait}
            incr eof_cnt 1
            incr eof_left -1
            if { $eof_left != 0 } {
                puts "EOF remaining: $eof_left: $sid_eof"
            }
        }
        timeout {
            # Report the sessions that never produced data or EOF.
            puts "TIMEOUT: $eof_left: $sid_eof"

            foreach k $sid_eof {
                puts "TIMEOUT: $k: pid $pid_list($k)"
            }
            break
        }
    }
}

# Dump everything collected per session.
foreach k $sid_list {
    puts "$k: '$data($k)'"
}
