Re: regular expression match question



On Jul 29, Pine Yan said:

	line1:	$string3 = "bacdeabcdefghijklabcdeabcdefghijkl";
	line2:	$string4 = "xxyyzzbatttvv";

	line3:	print "\$1 = $1 \@{$-[0],$+[0]}, \$& = $&\n" if($string3 =~ /(a|b)*/);
	line4:	print "\$1 = $1 \@{$-[0],$+[0]}, \$& = $&\n" if($string4 =~ //);

	$1 = a @{0,2}, $& = ba
	$1 =  @{0,0}, $& =

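The regex says "match zero or more of (a or b)". In the first string ($string3), it matches a 'b' and then an 'a' at the beginning, thus $& eq 'ba'. In the second string ($string4), it matches zero characters (because it's allowed to!) at the beginning, thus $& eq ''. The group never took part in that empty match, so $1 is undefined and prints as nothing.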


	print "\$1 = $1 \@{$-[0],$+[0]}, \$& = $&\n" if($string3 =~ /(a|b)+/);

	$1 = a @{0,2}, $& = ba
	$1 = a @{6,8}, $& = ba

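Now your regex says "match one or more of (a or b)". A zero-length match at offset 0 is no longer allowed, so in the second string ($string4) the engine moves along until it finds the "ba" in the middle, at offsets 6 to 8.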
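
To see the difference between the two quantifiers side by side, here is a small sketch. The helper first_match() is a made-up name for this illustration, not anything built in, and it uses a non-capturing group (?:...) so that only the offsets are in play:

```perl
use strict;
use warnings;

# Hypothetical helper for this example: returns [start, end, text] for the
# first match of $re in $str, or undef if nothing matches.
sub first_match {
    my ($str, $re) = @_;
    return unless $str =~ $re;
    # @- and @+ hold the start and end offsets of the last successful match.
    return [ $-[0], $+[0], substr($str, $-[0], $+[0] - $-[0]) ];
}

my $string4 = "xxyyzzbatttvv";

my $star = first_match($string4, qr/(?:a|b)*/);   # zero repetitions allowed
my $plus = first_match($string4, qr/(?:a|b)+/);   # at least one required

printf "*: offsets %d..%d, matched '%s'\n", @$star;   # *: offsets 0..0, matched ''
printf "+: offsets %d..%d, matched '%s'\n", @$plus;   # +: offsets 6..8, matched 'ba'
```

The * version succeeds immediately at offset 0 with an empty match, while the + version has to walk forward to offset 6 before it can match anything.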


HOWEVER, you're doing something weird. You have a QUANTIFIER (the * or the +) on a CAPTURING GROUP. Here's an example of the weirdness:

  "japhy" =~ /(.)+/;
  print $1;

What do you think that prints? It prints 'y'. Why? Because when you put a quantifier on a capturing group, ONLY THE LAST REPETITION of that capturing group gets saved. This is why you're getting only ONE letter in $1.
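
If what you actually want is the whole run rather than its last piece, one common approach (a sketch, not the only way, and the variable names here are just for illustration) is to move the quantifier inside the capturing parentheses, using a non-capturing group (?:...) for the alternation:

```perl
use strict;
use warnings;

my $word = "japhy";

# Quantifier outside the capturing group: only the last repetition is kept.
my ($last) = $word =~ /(.)+/;

# Quantifier inside the capturing group: the whole run is kept.
my ($whole) = $word =~ /(.+)/;

# With an alternation, (?:...) does the grouping and the outer
# parentheses capture everything that was repeated.
my ($run) = "bacde" =~ /((?:a|b)+)/;

# A global match in list context returns each repetition separately.
my @each = $word =~ /(.)/g;

print "last = $last, whole = $whole, run = $run, each = @each\n";
# last = y, whole = japhy, run = ba, each = j a p h y
```

Which form fits depends on whether you need the full matched text or the individual pieces.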

--
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart