Re: Extracting substring with regexp
- From: Joshua Cranmer <Pidgeot18@xxxxxxxxxxxxxxx>
- Date: Thu, 31 Jan 2008 21:43:22 GMT
Alex wrote:
Pattern p = Pattern.compile("abc(.*)xyz");
Matcher m = p.matcher("xxxxxabc123xyz789xyzxxxxx");
if (m.find()) System.out.println(m.group(1));
should print "123" but, instead, it prints "123xyz789".
How can I force regexp to find first match?
Short answer: By default, quantifiers are greedy, so (.*) grabs as much text as it can while still letting the overall pattern match. Use the reluctant form "abc(.*?)xyz" instead.
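Here is a minimal sketch of that fix applied to the code from the question. The class name FirstMatchDemo and the helper extract() are made up for illustration; only the pattern change comes from the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstMatchDemo {
    // Hypothetical helper: returns the text between the first "abc" and the
    // nearest following "xyz", or null if the input has no such pair.
    static String extract(String input) {
        // (.*?) is reluctant: it stops at the first place where "xyz" can match.
        Pattern p = Pattern.compile("abc(.*?)xyz");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("xxxxxabc123xyz789xyzxxxxx")); // prints 123
    }
}
```

If the delimiters can span lines, note that "." does not match line terminators unless Pattern.DOTALL is set.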
Long answer: The *, +, and ? quantifiers (unqualified) are greedy: they first consume as much input as they can and then backtrack, giving back one repetition at a time, until the rest of the pattern matches. Appending `?' to a quantifier makes it reluctant: it first tries the fewest repetitions allowed and takes more only when the rest of the pattern fails. Appending `+' makes it possessive: it consumes as much as it can and never backtracks, so the overall match fails if the rest of the pattern needs any of those characters.
Example: using find() with "(a*"+operator+")a" on the string "aaa", group 1 matches:
"": aa
"?": <empty string>
"+": <failure>
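
The three cases can be checked with a short program. This is only a sketch: the class name QuantifierDemo and the helper groupOne() are invented for illustration, and it assumes Matcher.find() is used, as in the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: runs find() with "(a*<suffix>)a" against the input
    // and returns group 1, or null when the pattern does not match at all.
    static String groupOne(String suffix, String input) {
        Matcher m = Pattern.compile("(a*" + suffix + ")a").matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // "" is greedy, "?" is reluctant, "+" is possessive.
        for (String suffix : new String[] {"", "?", "+"}) {
            System.out.println("\"" + suffix + "\": " + groupOne(suffix, "aaa"));
        }
    }
}
```

With find(), the reluctant form captures the empty string, because zero repetitions already let the trailing "a" match. With matches(), which must consume the whole input, it would be forced to capture "aa" instead. The possessive form returns null here since it never gives back an "a" for the final literal.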
--
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth