Re: Parse Key=Val parameters with s///eg

On Mar 17, 1:27 pm, c...@xxxxxxxxx (Chap Harrison) wrote:
On Mar 17, 2011, at 1:49 PM, C.DeRykus wrote:

On Mar 16, 9:58 am, c...@xxxxxxxxx (Chap Harrison) wrote:


use warnings;
use strict;
use feature ":5.10";

# $line, unless empty, should contain one or more white-space-separated
# expressions of the form
#       FOO
# or    BAZ = BAR
# We need to parse them and set
# $param{FOO} = 1       # default if value is omitted
# $param{BAZ} = 'BAR'
# Valid input example:
#   MIN=2 MAX = 12  WEIGHTED TOTAL= 20
# $param{MIN} = '2'
# $param{MAX} = '12'
# $param{WEIGHTED} = 1
# $param{TOTAL} = '20'

my $line = 'min=2 max = 12 weighted total= 20';
$line = 'min=2 max, = 12 weighted total= 20';
say $line;
my %param;

if ( $line and
     ($line !~
            s/
                \G            # Begin where prev. left off
                (?:           # Either a parameter...
                    (?:            # Keyword clause:
                        (\w+)      # KEYWORD (captured $1)
                        (?:        # Value clause:
                            \s*    #
                            =      # equal sign
                            \s*    #
                            (\w+)  # VALUE (captured $2)
                        )?         # Value clause is optional
                    \s*            # eat up any trailing ws
                )             ### <-- moved
                |             # ... or ...
                    $         # End of line.
                )             # close the outer group
            /                 # use captured to set %param
                $param{uc $1} = ( $2 ? $2 : 1 ) if defined $1;
            /xeg
   ) ) {
    say "Syntax error: '$line'";
    while (my ($x, $y) = each %param) {say "$x='$y'";}
}

while (my ($x, $y) = each %param) {say "$x='$y'";}

I believe the problem is the "?   # Value clause is optional"
since, in the case of your badline with a ",", the regex will
consume 'max' and then ignore the , since ? means 0 or 1
instance.  Therefore the regex will still succeed and $2 will
be undefined. So the VALUE gets set to 1.

I agree - encountering the ',' causes the regex to think it's encountered a keyword without a value.  But why doesn't the *next* iteration of the global substitution (which would begin at the ',') fail, causing the if-statement to succeed and print "Syntax error"?

Perhaps I don't fully understand how the /g option works....  I thought it would continue to "iterate" until either it reached the end of the string (in which case the s/// would be considered to have succeeded) or it could not match anything further (in which case it would be considered to have failed).

It does iterate through the string until match failure or
end of string. The s///g returns the count of successful
substitutions but, due to the !~ , that count is negated before
it is returned. So only if there had been no matches at all
would the negated return have been true and taken
the syntax error branch.

For instance, this fails to match immediately since 'a' doesn't
match \d and the negated return of false causes "true" to print:

perl -wle "my $x='abc123'; print 'true' if $x and $x !~ s/\G\d//g"

But this matches once before failing, and the negated return of
that count causes the statement modifier to fail so nothing gets
printed:

perl -wle "my $x='1abc23'; print 'true' if $x and $x !~ s/\G\d//g"
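
The same two cases can also be written as a short script that shows the
raw return value before it is negated. This is only a sketch:
strip_leading_digits is a hypothetical helper name, not something from
the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run the same substitution
# and hand back both the match count and the modified string.
sub strip_leading_digits {
    my ($str) = @_;
    my $count = ( $str =~ s/\G\d//g );   # number of substitutions, or false
    return ( $count || 0, $str );
}

for my $input ( 'abc123', '1abc23' ) {
    my ( $count, $rest ) = strip_leading_digits($input);
    my $negated = !$count ? 'true' : 'false';   # what !~ would hand back
    print "$input: count=$count, rest='$rest', negated result is $negated\n";
}
```

The first input should report a count of 0 (so !~ yields true) and the
second a count of 1 (so !~ yields false), which is why one match is
enough to skip the syntax error branch.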

See: perldoc perlop for details about the substitution
operator and the \G assertion.
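
If the goal is to flag a line that is only partly parseable, one possible
approach (a sketch, not necessarily what the original code intended) is
to walk the string with m//gc and then check whether pos() reached the
end. parse_params is a hypothetical name, and this version uses a defined
test instead of "$2 ? $2 : 1", which is a deliberate change.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread). Returns a hash ref on success
# and undef (in scalar context) when part of the line could not be parsed.
sub parse_params {
    my ($line) = @_;
    my %param;
    pos($line) = 0;
    while ( $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?\s*/gc ) {
        # Unlike the original "$2 ? $2 : 1", a value of '0' is kept as '0'.
        $param{ uc $1 } = defined $2 ? $2 : 1;
    }
    # /c keeps pos() at the end of the last successful match, so anything
    # short of the full length means unparsed text remains.
    return if pos($line) < length $line;
    return \%param;
}

for my $line ( 'min=2 max = 12 weighted total= 20',
               'min=2 max, = 12 weighted total= 20' ) {
    my $param = parse_params($line);
    if ($param) {
        print "$_='$param->{$_}'\n" for sort keys %$param;
    }
    else {
        print "Syntax error: '$line'\n";
    }
}
```

With the stray comma, matching stops right after 'max', pos() falls short
of the string length, and the second line is reported as a syntax error.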

Charles DeRykus
