Re: regexp validation (arbitrary code execution) (regexp injection)

2011/6/1 Stanisław Findeisen <stf@xxxxxxxxxxxxx>

Suppose you have a collection of books, and want to provide your users
with the ability to search the book title, author or content using
regular expressions.

But you don't want to let them execute any code.

How would you validate/compile/evaluate the user provided regex so as to
provide maximum flexibility and prevent code execution?

Hi Stanisław,

From what you are saying, I think you are looking for a way to take a
string and check it for any potentially bad characters that would cause the
system to execute unwanted code.

So a bit like this: "In.*?forest$" is a safe string to feed into your
regular expression, but ".*/; open my $fh, ">", $0; close $fh; $_ =~ /" is
an evil string that would cause you a lot of grief. At least that is how I
understand your question...

To be honest, I am not sure this is an issue, as I doubt that the following
construction:

    if ( $title =~ m/$userinput/ ) { do stuff... }

will give you any trouble. As far as I can remember, the variable you
interpolate here is not treated as code by the interpreter but simply as
matching instructions, which would mean that whatever your user throws at it,
perl will in the worst case fail to match (or die with a compilation error
on a malformed pattern, which you can trap with eval).

But please don't take my word for it; try it in a very simple test and see
what happens.
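
Such a test could look roughly like the sketch below. The title, the sample
inputs and the classify() helper are made up for illustration, and it assumes
that no "use re 'eval'" is in scope, which as far as I know is what makes perl
refuse (?{ ... }) code blocks in an interpolated pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how perl treats one user-supplied pattern.
sub classify {
    my ( $title, $userinput ) = @_;

    # eval traps the die that perl raises when the pattern fails to compile.
    my $matched = eval { $title =~ m/$userinput/ ? 1 : 0 };
    return 'rejected' unless defined $matched;
    return $matched ? 'match' : 'no match';
}

my $title  = "In the dark forest";
my @inputs = (
    'In.*?forest$',                                     # harmless pattern
    '.*/; open my $fh, ">", $0; close $fh; $_ =~ /',    # the "evil" string
    '(?{ print "code ran" })',                          # embedded code block
    'forest(',                                          # malformed pattern
);

for my $userinput (@inputs) {
    printf "%-9s %s\n", classify( $title, $userinput ), $userinput;
}
```

The first input should match, the "evil" string should simply fail to match
(its ; and / are just characters to look for), and the last two should be
rejected at compile time without running anything.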

If you do have to ensure that a user cannot execute any code, you could
simply prevent the user from entering the ; character or, smarter yet,
filter it out of the user input, to prevent a "smart" user from feeding it
to your code via a method other than the front-end you provided. Without a
means to close the preceding regular expression, the user cannot really
insert executable code into it. At least that's what I would try, but I am
by no means an expert in this area, and I suspect there might be some people
reading this and wondering why I didn't think of A, B or C. If so, please do
speak up, people ;-)
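
As one possible alternative to filtering characters, here is a minimal sketch
that lets perl itself validate the pattern by compiling it inside eval. The
compile_user_regex() name, the sample titles and the 200-character cap are my
own assumptions, not anything standard, and it again relies on "use re 'eval'"
not being enabled.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a compiled regex, or undef if the input is
# rejected. Deliberately no "use re 'eval'" anywhere in this file.
sub compile_user_regex {
    my ($input) = @_;

    # Arbitrary length cap (an assumption, tune to taste).
    return undef unless defined $input && length($input) <= 200;

    # qr// dies on malformed patterns and on (?{ ... }) code blocks;
    # eval turns that die into an undef return value.
    my $re = eval { qr/$input/ };
    return $re;
}

my @titles = ( "In the dark forest", "A river runs" );

my $re   = compile_user_regex('In.*?forest$');
my @hits = defined $re ? grep { $_ =~ $re } @titles : ();

print "found: $_\n" for @hits;
```

This only deals with patterns that fail to compile or that try to embed code;
it does nothing about patterns that are merely very slow to match.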