Re: How good is PERL at searching ASCII files?




Quoth jmartzoo-google@xxxxxxxxx:
> The challenge I'm faced with is to search out all occurrences where a
> function named CONVERT() is being called with three parameters. It's a
> challenge because the function has several signatures and I need to
> segregate the three param calls from the two param calls. To make
> things even spicier, it's very possible that the parameters may also be
> function calls themselves. I think the best strategy becomes finding
> the pattern:
> CONVERT(<something>,<something>,
> The key elements being the string CONVERT, one open paren ( followed by
> two commas at the same nesting level.

This should get you started. Note that it doesn't handle quotes: that
is, parens in quotes will *not* be ignored as they should. You could
perhaps add a first pass to strip out all quoted strings, again using
Regexp::Common.

#!/usr/bin/perl -l

use warnings;
use strict;

# get some test data
my $data = <<'DATA';

Convert( foo, bar, baz )

Convert( (foo, bar), baz)

Convert(foo, bar)

convert (foo, (bar (baz, quux)))

convert (foo, (bar, (baz, quux)), flarp)
DATA

use Regexp::Common;

# $p is a regex that matches one balanced parenthesized expression
my $p = $RE{balanced}{-parens => '()'};

# each time round the loop, remove one call to convert and place it in
# $1 (that's what the parens are for)
while ($data =~ s/( convert \s* $p )//xi) {

    # put the call into $call
    my $call = $1;

    # get just the arguments, without the outer parens
    (my $args = $call) =~ s/convert \s* \( (.*) \)/$1/xsi;

    # strip any balanced sets of parens
    $args =~ s/$p//g;

    # count the number of commas
    my $commas = $args =~ tr/,//;

    if ($commas == 2) {
        print $call;
    }
}
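
For the quote-stripping first pass mentioned above, here is a minimal sketch. It uses a hand-written pattern rather than Regexp::Common so that it runs with core Perl only; the quoted-string patterns in Regexp::Common could probably be substituted. The name strip_quoted is made up for illustration, and the sketch assumes strings are delimited by ' or " with backslash as the escape character, which may not be true of the language you are searching.

```perl
#!/usr/bin/perl

use warnings;
use strict;

# Hypothetical helper, not part of the script above: replace every
# quoted string with an empty pair of quotes, so that parens and commas
# inside strings cannot confuse the later passes.
# Assumption: strings use ' or " and backslash escapes.
sub strip_quoted {
    my ($text) = @_;

    # (["'])       opening quote, remembered in \1
    # \\.          an escaped character, or
    # (?!\1)[^\\]  any character that is neither the quote nor a backslash
    # \1           the matching closing quote
    $text =~ s/(["'])(?:\\.|(?!\1)[^\\])*\1/$1$1/gs;

    return $text;
}

my $sample = q{convert("a, (b", foo, 'c)')};
print strip_quoted($sample), "\n";
```

You would run $data through this before the while loop. Keeping the empty quotes in place, rather than deleting the strings outright, leaves the arguments and their commas where they were, so the comma count is unaffected.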

Ben

--
"If a book is worth reading when you are six, * benmorrow@xxxxxxxxxxxxx
it is worth reading when you are sixty." [C.S.Lewis]