Re: need help with regex
From: Alan Moore (jbigboote_at_yoyodyne.com)
Date: 02/08/05
- Next message: Antti S. Brax: "Re: Counting down faster when looping?"
- Previous message: jonck: "Re: Counting down faster when looping?"
- In reply to: -: "need help with regex"
- Next in thread: -: "Re: need help with regex"
- Reply: -: "Re: need help with regex"
- Reply: -: "Re: need help with regex.. again."
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Tue, 08 Feb 2005 11:42:11 -0800
On Tue, 08 Feb 2005 18:35:05 +0800, - <nobody@hoem.om> wrote:
>i have a sample text:
>"Key: Value Key: Value2 Key: Value3 Subkey: apple Subkey: orange Key:
>Value 4"
>
>i need to extract:
>"Subkey: apple Subkey: orange"
>
>i have a regex expression:
>p.compile("Key: Value2\\s.*(Subkey:\\s.*)Key:");
>
>which only succeeds in extracting "Subkey: orange"
>
>i am pretty sure the solution is to repeat the portion using
>(Subkey:\\s.*)* <--- extra asterisk.
The problem with your regex is that the first ".*" initially matches
all the way to the end of the line. Then the regex engine has to
backtrack in order to match the rest of the pattern--but it only
backtracks as far as it has to, i.e., to the *last* occurrence of
"Subkey:". Adding the asterisk where you suggested only makes things
worse, because now it doesn't have to match the parenthesized
expression even once.
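To watch that backtracking happen, here is a minimal Java sketch. The class name GreedyDemo and the helper firstGroup are made up for illustration, and I'm assuming the sample text sits on a single line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Sample text from the question, assumed to be on one line.
    static final String TEXT =
        "Key: Value Key: Value2 Key: Value3 Subkey: apple Subkey: orange Key: Value 4";

    // Returns group 1 of the first match, or null if the regex does not
    // match or the group did not take part in the match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Original regex: the greedy ".*" gives back only as far as the
        // last "Subkey:".
        // prints [Subkey: orange ]
        System.out.println("[" + firstGroup("Key: Value2\\s.*(Subkey:\\s.*)Key:", TEXT) + "]");

        // With the extra asterisk the group may match zero times, so the
        // greedy ".*" just stops at the last "Key:".
        // prints [null]
        System.out.println("[" + firstGroup("Key: Value2\\s.*(Subkey:\\s.*)*Key:", TEXT) + "]");
    }
}
```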
The simplest solution is to make the first ".*" non-greedy: ".*?".
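As a rough sketch of that one-character change (the class name LazyDemo and the method extract are hypothetical, and the sample text is again assumed to be on one line):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyDemo {
    // Same regex as in the question, except the first ".*" is now ".*?".
    static final Pattern LAZY =
        Pattern.compile("Key: Value2\\s.*?(Subkey:\\s.*)Key:");

    // Returns the captured text with surrounding whitespace trimmed,
    // or null if the pattern does not match.
    static String extract(String input) {
        Matcher m = LAZY.matcher(input);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String text =
            "Key: Value Key: Value2 Key: Value3 Subkey: apple Subkey: orange Key: Value 4";
        // prints Subkey: apple Subkey: orange
        System.out.println(extract(text));
    }
}
```

Keep in mind that the ".*" inside the group is still greedy, so it runs up to the last "Key:" on the line. If more Key entries follow the Subkeys, the group will swallow some of them too.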
You do need to add a quantifier to the subexpression, but just tacking
on another asterisk is a bad idea. Whenever you have a regex of the
form (x*)*, you run the risk that the regex will take forever to
report failure. I suggest you modify the subexpression so that it
doesn't rely on backtracking. Assuming the Subkey values can't
contain spaces, this should work:
"Key: Value2\\s.*?((?:\\sSubkey:\\s\\S++)+)"
Notice that I also had to match the space preceding each "Subkey:"
in order for the quantifier to work. I also used a possessive plus
inside the subexpression to avoid the never-ending non-match problem,
although in this case it isn't really necessary.