RE: Hello a question about ".+?"

Actually, the trailing ? makes the match non-greedy. In detail:
SASI_Hs01_00205058 HUMAN NM_005762 857 MISSION® siRNA 2 140.00
if (/(SASI\w+)(.+?)\s(\d+)\s/) { print "$3\n"; }
The match starts with the literal SASI followed by \w+, i.e. one or more word characters (the character class of upper- or lower-case letters, digits and underscores).
So \w+ consumes letters, digits and underscores until it reaches the first whitespace. In .+?, the '.' is any character except a newline,
the '+' means one or more of them, and the '?' makes the quantifier non-greedy: it matches the shortest string that still lets the rest of the pattern match.
A greedy .+ first grabs every character up to the newline (if there is one) at the end of the line, then backs off only as far as it must, so \s(\d+)\s ends up matching the last standalone number on the line (2) rather than the first (857).
The minimal, non-greedy match instead grows one character at a time until the rest of the pattern can match, which is the one or more digits \d+ surrounded by whitespace.
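
To see the difference side by side, here is a small sketch that runs both forms of the pattern against the sample line. The helper third_field and the variable names are only made up for this illustration; the two patterns are the one from the question and the same one without the '?'.

```perl
use strict;
use warnings;

# Sample line from the original question.
my $line = "SASI_Hs01_00205058 HUMAN NM_005762 857 MISSION® siRNA 2 140.00";

# Hypothetical helper: return the third capture group,
# or undef if the pattern does not match at all.
sub third_field {
    my ($re, $text) = @_;
    return $text =~ $re ? $3 : undef;
}

my $lazy   = qr/(SASI\w+)(.+?)\s(\d+)\s/;   # non-greedy: stop at the first number
my $greedy = qr/(SASI\w+)(.+)\s(\d+)\s/;    # greedy: back off from the end of the line

print "non-greedy: ", third_field($lazy,   $line), "\n";   # prints 857
print "greedy:     ", third_field($greedy, $line), "\n";   # prints 2
```

So ".+" does match, it just captures the wrong number: 140.00 is skipped because the digits are not surrounded by whitespace on both sides, and " 2 " is the last place on the line where \s(\d+)\s fits.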

David Kronheim
Production Support Tier II
Gateway Error Correction, VZ450 EDI, EDI Billing, & Metakey/LIA
From: Chris Stinemetz [chrisstinemetz@xxxxxxxxx]
Sent: Thursday, December 29, 2011 2:45 PM
To: Xi Chen
Cc: beginners@xxxxxxxx
Subject: Re: Hello a question about ".+?"

On Thu, Dec 29, 2011 at 1:17 PM, Xi Chen <cxde515@xxxxxxxxx> wrote:
Hello everyone,

I have a question about how to translate the meaning of ".+?". Please
see the examples below:
SASI_Hs01_00205058 HUMAN NM_005762 857 MISSION® siRNA 2 140.00
I want to get the number "857". I found that the command below works:
perl -ne 'if (/(SASI\w+)(.+?)\s(\d+)\s/) { print "$3\n"; }'
but ".+" or ".*" doesn't work. I don't know why "?" is so important?

I believe it makes the grouping optional.

Match 1 or 0 times


