RE: Hello a question about ".+?"



Actually, the trailing ? makes the preceding + non-greedy; it does not make the group optional. In detail:
Given:
SASI_Hs01_00205058 HUMAN NM_005762 857 MISSION® siRNA 2 140.00
if (/(SASI\w+)(.+?)\s(\d+)\s/) { print "$3\n"; }
The match starts by looking for the literal SASI followed by one or more \w characters, which are upper- or lower-case letters, digits, or underscores, the character class:
[a-zA-Z0-9_]
So (SASI\w+) consumes SASI_Hs01_00205058 and stops at the first whitespace. In .+?, the '.' is any character except a newline,
the plus means one or more of them, and the '?' after the plus means to match the shortest string possible.
A greedy .+ first grabs every character up to the newline (if there is one) at the end of the line and then backs off only as far as it must, so \s(\d+)\s ends up matching the last whitespace-delimited number on the line, the 2, instead of 857.
The minimal, non-greedy match instead grows one character at a time until the rest of the pattern can match, which is the one or more digits \d+ surrounded by whitespace, so $3 is the first such number, 857.

Sincerely,
David Kronheim
Production Support Tier II
Gateway Error Correction, VZ450 EDI, EDI Billing, & Metakey/LIA
484-213-1315
________________________________________
From: Chris Stinemetz [chrisstinemetz@xxxxxxxxx]
Sent: Thursday, December 29, 2011 2:45 PM
To: Xi Chen
Cc: beginners@xxxxxxxx
Subject: Re: Hello a question about ".+?"

On Thu, Dec 29, 2011 at 1:17 PM, Xi Chen <cxde515@xxxxxxxxx> wrote:
Hello everyone,

I have a question about how to translate the meaning of ".+?". Please
see the examples below:
SASI_Hs01_00205058 HUMAN NM_005762 857 MISSION® siRNA 2 140.00
I want to get the number "857". I found that the command below works:
perl -ne 'if (/(SASI\w+)(.+?)\s(\d+)\s/) { print "$3\n"; }'
but ".+" or ".*" doesn't work. Why is the "?" so important?


I believe it makes the grouping optional.

Match 1 or 0 times

HTH,

Chris
