Re: Parse Key=Val parameters with s///eg



On Mar 17, 1:27 pm, c...@xxxxxxxxx (Chap Harrison) wrote:
On Mar 17, 2011, at 1:49 PM, C.DeRykus wrote:



On Mar 16, 9:58 am, c...@xxxxxxxxx (Chap Harrison) wrote:

#!/usr/bin/perl

use warnings;
use strict;
use feature ":5.10";

#
# $line, unless empty, should contain one or more white-space-separated
# expressions of the form
#       FOO
# or    BAZ = BAR
#
# We need to parse them and set
# $param{FOO} = 1       # default if value is omitted
# $param{BAZ} = 'BAR'
#
# Valid input example:
#   MIN=2 MAX = 12  WEIGHTED TOTAL= 20
# $param{MIN} = '2'
# $param{MAX} = '12'
# $param{WEIGHTED} = 1
# $param{TOTAL} = '20'
#

my $line = 'min=2 max = 12 weighted total= 20';
$line = 'min=2 max, = 12 weighted total= 20';
say $line;
my %param;

if ( $line and
     ($line !~
           s/
                \G            # Begin where prev. left off
                (?:           # Either a parameter...
                    (?:            # Keyword clause:
                        (\w+)      # KEYWORD (captured $1)
                        (?:        # Value clause:
                            \s*    #
                            =      # equal sign
                            \s*    #
                            (\w+)  # VALUE (captured $2)
                        )?         # Value clause is optional
                    )
                    \s*            # eat up any trailing ws
                )             ### <-- moved
                |             # ... or ...
                    $         # End of line.
            /                 # use captured to set %param
                $param{uc $1} = ( $2 ? $2 : 1 ) if defined $1;
       /xeg
   ) ) {
    say "Syntax error: '$line'";
    while (my ($x, $y) = each %param) {say "$x='$y'";}
    exit;}

while (my ($x, $y) = each %param) {say "$x='$y'";}

I believe the problem is the "?   # Value clause is optional"
since, in the case of your bad line with a ",", the regex will
consume 'max' and then skip the value clause at the ",", because
? means 0 or 1 instance.  Therefore the regex will still succeed
and $2 will be undefined. So the VALUE gets set to 1.
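
To make that concrete, here is a minimal sketch that applies just the keyword/value part of the pattern once to the text starting at 'max'. The helper name first_pair is hypothetical and is not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only: apply the keyword/value
# pattern once at the start of $text and return what was captured.
sub first_pair {
    my ($text) = @_;
    return unless $text =~ /^(\w+)(?:\s*=\s*(\w+))?/;
    return ($1, $2);
}

# The ',' stops the optional value clause from matching, so the
# overall match still succeeds with $2 left undefined.
my ($key, $val) = first_pair('max, = 12');
print "key=$key value=", (defined $val ? $val : 'undef'), "\n";
# prints: key=max value=undef
```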

I agree - encountering the ',' causes the regex to think it's encountered a keyword without a value.  But why doesn't the *next* iteration of the global substitution (which would begin at the ',') fail, causing the if-statement to succeed and print "Syntax error"?

Perhaps I don't fully understand how the /g option works....  I thought it would continue to "iterate" until either it reached the end of the string (in which case the s/// would be considered to have succeeded) or it could not match anything further (in which case it would be considered to have failed).

It does iterate through the string until a match fails or the
end of the string is reached. The substitution returns the count
of successful matches but, due to the !~ , that count is logically
negated before it is returned. So the negated return would have
been true, and the syntax error branch taken, only if there had
been no matches at all.

For instance, this fails to match immediately since 'a' doesn't
match \d and the negated return of false causes "true" to print:

perl -wle "my $x='abc123'; print 'true' if $x and $x !~ s/\G\d//g"

But this matches once before failing, and the negated return of
that count makes the statement modifier's condition false, so
nothing gets printed:

perl -wle "my $x='1abc23'; print 'true' if $x and $x !~ s/\G\d//g"


See: perldoc perlop for details about the substitution
operator and the \G assertion.
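
One possible way to get the syntax error reported, offered as a sketch rather than a drop-in fix: scan with m//gc in a loop instead of s///g, then check whether the scan reached the end of the string. The helper name parse_params is hypothetical. It also tests defined $2 rather than the truth of $2, so a value of 0 is kept instead of being replaced by the default 1.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): parse a line of
# KEY or KEY = VALUE tokens. Returns a hash reference on success and
# undef when part of the line could not be parsed.
sub parse_params {
    my ($line) = @_;
    my %param;

    pos($line) = 0;
    $line =~ /\G\s*/gc;    # skip any leading whitespace

    # With /c, a failed match leaves pos() where the last successful
    # match ended instead of resetting it.
    while ( $line =~ /\G(\w+)(?:\s*=\s*(\w+))?\s*/gc ) {
        $param{ uc $1 } = defined $2 ? $2 : 1;
    }

    # If the scan stopped short of the end of the string, something
    # did not parse (for example the ',' in 'max, = 12').
    return if pos($line) < length $line;
    return \%param;
}

for my $line ( 'min=2 max = 12 weighted total= 20',
               'min=2 max, = 12 weighted total= 20' ) {
    my $p = parse_params($line);
    if ( defined $p ) {
        print "$_='$p->{$_}'\n" for sort keys %$p;
    }
    else {
        print "Syntax error: '$line'\n";
    }
}
```

The first line parses into four parameters; the second stops at the ',' and is reported as a syntax error.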

--
Charles DeRykus
