Re: reg expression example...



Eric Sosman wrote:
On 8/29/2010 8:46 PM, john wrote:
Eric Sosman wrote:
On 8/29/2010 6:32 PM, john wrote:
Hi All,
I need to process a large text file and I'm using this expression.
All I know is that these words appear in this order.

"word1.*word2(.*)word3.*word4.*word5"

How can I optimize it?

What separates the words from each other, and how do you know
you've reached the end of the interstitial space and reached the
start of the next word?

Or if you're actually looking for lines like

word1word2buzzword3lightyearword4mumbleword5

... then you have my sympathies.


Yeah, all I know is the order, and I need to get the piece between word2 and
word3. It's possible to have multiple word1...word5 patterns, and not
all of them include the other words.

Pattern pattern = Pattern.compile(
"word1.*word2(.*)word3.*word4.*word5" , Pattern.MULTILINE|Pattern.DOTALL);

So, from "word1word2buzzword3lightyearword4mumbleword5", literally,
you want to extract "buzz" as the group between "word2" (those exact
five characters) and "word3" (those five)?
yes.

I need to use word1 and word5 as start and end of this pattern, but
there may be other word1...word5 patterns which don't include word3/word4 - I don't need them.
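
One way to handle several word1...word5 blocks in the same text is a two-pass sketch: first isolate each block, then look for word2, word3 and word4 inside it and skip blocks that lack them. This assumes that word5 never occurs inside a block and that blocks do not nest; neither is confirmed in the thread. The names BlockScanner, BLOCK, INNER and extractAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BlockScanner {
    // Pass 1: the shortest span from a word1 to the next word5.
    private static final Pattern BLOCK =
            Pattern.compile("word1.*?word5", Pattern.DOTALL);

    // Pass 2: inside one block, capture the text between word2 and word3,
    // and require word4 to follow somewhere after word3.
    private static final Pattern INNER =
            Pattern.compile("word2(.*?)word3.*?word4", Pattern.DOTALL);

    // Returns the captured piece from every block that contains word2, word3 and word4.
    static List<String> extractAll(String text) {
        List<String> pieces = new ArrayList<>();
        Matcher block = BLOCK.matcher(text);
        while (block.find()) {
            Matcher inner = INNER.matcher(block.group());
            if (inner.find()) {
                pieces.add(inner.group(1));
            }
            // Blocks without word3/word4 are simply skipped.
        }
        return pieces;
    }
}
```

Splitting the work this way keeps a block that lacks word3/word4 from being merged with the block that follows it, which a single pattern with `.*` and DOTALL can do.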

Actually, I used "word1.*word2(.*?)word3.*word4.*word5"

And you want to reject (not match)
"word9word8buzzword7lightyearword6word5"?
yes.
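
The two cases just discussed can be checked with a minimal sketch, assuming the reluctant pattern and the flags from the Pattern.compile call quoted in this thread. The class and method names (BetweenWords, extract) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenWords {
    // Pattern from the thread, with the reluctant (.*?) capture group.
    private static final Pattern PATTERN = Pattern.compile(
            "word1.*word2(.*?)word3.*word4.*word5",
            Pattern.MULTILINE | Pattern.DOTALL);

    // Returns the text between word2 and word3, or null if nothing matches.
    static String extract(String input) {
        Matcher m = PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample strings taken from the thread.
        System.out.println(extract("word1word2buzzword3lightyearword4mumbleword5")); // buzz
        System.out.println(extract("word9word8buzzword7lightyearword6word5"));       // null
    }
}
```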



This works, but I guess it's not the most efficient way...

It's the straightforward approach for the problem you've described.
Straightforward very often equals best, for many definitions of "best."
Have you made measurements that indicate it's not "good enough?"
no.
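
A rough way to get such a measurement is a wall-clock timing loop like the sketch below. It is only an approximation (a harness such as JMH would give more trustworthy numbers), the run counts are arbitrary, and the names RegexTiming and countMatches are made up for illustration.

```java
import java.util.regex.Pattern;

class RegexTiming {
    // Runs find() against the input `runs` times and returns how many runs matched.
    static int countMatches(Pattern pattern, String input, int runs) {
        int hits = 0;
        for (int i = 0; i < runs; i++) {
            if (pattern.matcher(input).find()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(
                "word1.*word2(.*?)word3.*word4.*word5",
                Pattern.MULTILINE | Pattern.DOTALL);
        // Sample string from the thread; real input would be the large text file.
        String sample = "word1word2buzzword3lightyearword4mumbleword5";

        countMatches(p, sample, 10_000); // warm-up so the JIT has a chance to compile

        long start = System.nanoTime();
        int hits = countMatches(p, sample, 100_000);
        long elapsedNanos = System.nanoTime() - start;
        System.out.printf("%d matches in %.1f ms%n", hits, elapsedNanos / 1e6);
    }
}
```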