Detecting potential regular expression matches?
- From: Fredderic <put_my_name_here@xxxxxxxxxxxxxxx>
- Date: Sat, 1 Jul 2006 12:54:21 +1000
I have a server-type Tcl script which needs to accept socket
connections from several different beasts.
Most of them identify themselves pretty much straight off with a hello
keyword, one waits until the end of the line to throw in its magical
keyword, and now I need to add support for two binary streams. Both
binary streams are guaranteed to include a regexp'able pattern within
the first couple of KB of data, and look very un-textish within the
first hundred or so bytes.
What I would like to be able to do is determine not only whether a
particular RE matches the incoming data stream, but also whether it
MIGHT match if we receive some more data.
What I do now is have a general wrapper on the readable file events
for the socket connection. The wrapper reads in whatever data is
available and hands it to the proper handler function, which was passed
to it as an argument. The default ("new connection") handler, in turn,
adds that data to a buffer and then looks over it with a bunch of
regexps. If one matches, it reconfigures the readable file event to
use that handler instead, calls an initialisation procedure to let it
configure the connection to its liking, and then calls the handler with
the contents of the buffer (which it then destroys) to get things
rolling.
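To make that arrangement concrete, here is a minimal sketch of roughly how it hangs together. The handler names (alpha, omega, blob), their patterns, and the init_NAME / handle_NAME procedures are all made up for illustration; they are not the real ones from my script.

```tcl
# Hypothetical table: handler name, then the regexp that identifies it.
set signatures {
    alpha {^HELLO }
    omega {^[^\n]*MAGIC\r?\n}
    blob  {SIG[0-9][0-9]}
}

# Return the name of the first handler whose pattern matches the
# buffered data, or an empty string if nothing matches yet.
proc identify {buffer} {
    global signatures
    foreach {name re} $signatures {
        if {[regexp -- $re $buffer]} {
            return $name
        }
    }
    return ""
}

# General wrapper: read whatever is available and pass it to the
# handler that was given as an argument.
proc onReadable {sock handler} {
    set data [read $sock]
    set atEof [eof $sock]
    if {$data ne ""} {
        $handler $sock $data
    }
    if {$atEof} {
        unset -nocomplain ::pending($sock)
        catch {close $sock}
    }
}

# Default ("new connection") handler: buffer the data, and once a
# pattern matches, switch the readable event over to the real handler.
proc newConnection {sock data} {
    global pending
    append pending($sock) $data
    set name [identify $pending($sock)]
    if {$name eq ""} {
        return
    }
    set buffered $pending($sock)
    unset pending($sock)
    fileevent $sock readable [list onReadable $sock handle_$name]
    init_$name $sock
    handle_$name $sock $buffered
}

# Callback for "socket -server": every new socket starts out with the
# new-connection handler.
proc accept {sock addr port} {
    fconfigure $sock -blocking 0 -translation binary
    fileevent $sock readable [list onReadable $sock newConnection]
}
```

It would be started with something like "socket -server accept PORT" followed by "vwait forever", with the port being whatever the server listens on.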
The problem is what happens when I don't recognise the incoming data.
At the moment, the new connection handler checks the size of the
buffer, and dumps the connection if the buffer exceeds a certain size.
It also has a timeout going to dump the connection if it isn't
recognised within a certain time frame. What I would like to be able to
do is start off with a list of all the known handlers, and knock them
off the list as they get ruled out. Then, instead of having to fill a
buffer to a certain point or sit there waiting for a timeout
(which is what happens when someone telnets to the server and messes up
their entry), I can dump the connection as soon as it runs out of
potential matches. The binary streams look like binary streams very
early on, so all non-matching text connections can be dumped at the
first end-of-line.
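The knock-them-off-the-list idea could be sketched like this. As far as I can tell, the regexp command cannot say "might still match", so this sketch uses a hand-written test per handler instead. The handler names, keywords, and the 100-byte threshold are assumptions for the example only.

```tcl
# True if the buffer contains any byte outside plain printable ASCII,
# CR, LF and TAB.
proc looksBinary {buffer} {
    return [regexp -- {[^ -~\r\n\t]} $buffer]
}

# True if the buffer agrees with the literal over the part they share,
# i.e. the buffer could still turn out to start with the literal.
proc prefixPossible {literal buffer} {
    set n [string length $buffer]
    set m [string length $literal]
    set k [expr {($n < $m ? $n : $m) - 1}]
    return [expr {[string range $literal 0 $k] eq [string range $buffer 0 $k]}]
}

# Hypothetical per-handler rules:
#   alpha  sends a hello keyword straight off
#   omega  puts its keyword at the end of the first line
#   blob   is a binary stream that looks un-textish early on
proc stillPossible {name buffer} {
    switch -- $name {
        alpha {
            return [prefixPossible "HELLO " $buffer]
        }
        omega {
            if {[looksBinary $buffer]} { return 0 }
            if {[string first "\n" $buffer] < 0} { return 1 }
            return [regexp -- {^[^\n]*MAGIC\r?\n} $buffer]
        }
        blob {
            if {[looksBinary $buffer]} { return 1 }
            return [expr {[string first "\n" $buffer] < 0
                          && [string length $buffer] < 100}]
        }
    }
    return 0
}

# Keep only the handlers that have not been ruled out yet.
proc prune {candidates buffer} {
    set keep {}
    foreach name $candidates {
        if {[stillPossible $name $buffer]} {
            lappend keep $name
        }
    }
    return $keep
}
```

The new connection handler would call prune on each read and dump the connection as soon as the result is empty. The cost is that every handler needs its own "still possible" rule written by hand alongside its regexp, which is exactly the duplication I'd like to avoid.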
I can still do it, for example, by checking the buffer for either an
end-of-line or binary-looking data and applying different constraints
as appropriate, but I think it's going to be an awful lot more fiddly,
and slightly less reliable, that way.
Fredderic