Re: Matching parentheses with Regular Expressions



shakah wrote:
On Jul 3, 9:52 pm, Joshua Cranmer <Pidgeo...@xxxxxxxxxxxxxxx> wrote:
James wrote:
The regular expression
\\\\{2}.+\\\\Process\\(java\\).

matches, but it matches too much of it:

In that case, you probably want this regex:
\\\\{2}[^\\\\]+\\\\Process\\(java\\)
--
Beware of bugs in the above code; I have only proved it correct, not
tried it. -- Donald E. Knuth
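
To see the difference between the two regexes, here is a minimal sketch. The sample input is made up (the thread does not show James's actual data), and the class and method names are only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsNegated {
    // Returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Hypothetical input (not from the thread): two UNC-style paths on one line.
        String input = "\\\\HOSTA\\Process(idle)\\x,\\\\HOSTB\\Process(java)\\y";

        // Original regex: ".+" also consumes backslashes, so the match starts
        // at the first "\\" and runs on to the last "\Process(java)".
        System.out.println(firstMatch("\\\\{2}.+\\\\Process\\(java\\).", input));

        // Suggested regex: "[^\\]+" cannot cross a backslash, so only the
        // path whose first component is followed by "\Process(java)" matches.
        System.out.println(firstMatch("\\\\{2}[^\\\\]+\\\\Process\\(java\\)", input));
    }
}
```

With this input the first call should print the whole run from \\HOSTA through \Process(java)\, while the second prints just \\HOSTB\Process(java).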

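FWIW, you could avoid a little of the backslash escape mess
by using single-char character classes for the parentheses, e.g.:
Pattern.compile("\\\\{2}[^\\\\]+\\\\Process[(]java[)]") ;
// ...outside of a Java string that'd be \\{2}[^\\]+\\Process[(]java[)]
(This doesn't help with the backslashes themselves: in java.util.regex
a backslash is still an escape inside a character class, so [\] is an
unclosed class and you'd need [\\], i.e. "[\\\\]" in a Java string.)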
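
A quick way to check the character-class idea, as a sketch only: the sample input and the class/helper names are invented for illustration. It compares [(] with \\( and also shows how java.util.regex treats a lone backslash inside a class.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class CharClassEscapes {
    // True if the regex compiles under java.util.regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample, i.e. the text \\HOSTB\Process(java)
        String input = "\\\\HOSTB\\Process(java)";

        // "[(]" and "\\(" both match a literal parenthesis.
        String escaped = "\\\\{2}[^\\\\]+\\\\Process\\(java\\)";
        String classes = "\\\\{2}[^\\\\]+\\\\Process[(]java[)]";
        System.out.println(Pattern.matches(escaped, input));
        System.out.println(Pattern.matches(classes, input));

        // A lone backslash in a class escapes the "]", leaving the class unclosed;
        // a doubled one is a literal backslash.
        System.out.println(compiles("[\\]"));
        System.out.println(compiles("[\\\\]"));
    }
}
```

So the trick saves escapes for characters like ( and ), but a backslash has to be doubled in the regex (and doubled again in the Java string) whether or not it sits inside brackets.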

You also might get rid of some of those backslashes by substituting another character, then using replace() on the string before compiling it.

final static String PATTERN = "``{2}.+``Process`(java`)";

String myRegex = PATTERN.replace("`", "\\" );
System.out.println( myRegex );

Result:

\\{2}.+\\Process\(java\)


It just makes things more readable. Using a placeholder such as `, %, or # in the string, then replacing that character with backslashes before compiling it as a regex, can save your eyes.
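
Wrapped up as a helper, the substitution idea might look roughly like this. PlaceholderRegex and its compile() method are made-up names, not an existing API, and the sketch assumes the placeholder character never needs to appear literally in the pattern.

```java
import java.util.regex.Pattern;

public class PlaceholderRegex {
    // Swaps every placeholder character for a backslash, then compiles.
    // Assumption: the placeholder never occurs as a literal in the pattern.
    static Pattern compile(String template, char placeholder) {
        return Pattern.compile(template.replace(placeholder, '\\'));
    }

    public static void main(String[] args) {
        // Same regex as the negated-class version, written with ` for \.
        Pattern p = compile("``{2}[^``]+``Process`(java`)", '`');

        // The regex as the engine sees it.
        System.out.println(p.pattern());

        // Hypothetical sample input, i.e. the text \\HOSTB\Process(java)
        System.out.println(p.matcher("\\\\HOSTB\\Process(java)").matches());
    }
}
```

The obvious catch is that whichever character you pick can no longer be matched literally, so choose one that your input will never contain.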

Incidentally, I wonder if Sun could be convinced to add this themselves. Maybe add a new operator/keyword mechanism altogether: # introduces a new keyword or operator and is followed by its name. That would let Sun add new keywords or operators without breaking any existing code. So #s might give us new string constants. Let's say ' then means a Unix-shell-style string, where escaping is ignored.

String regex = #s'\\{2}.+\\Process\(java\)';

Would give that literal string, without the need to escape the backslashes. Easier for regexes at least. Other delimiters besides ' could be introduced too. `, $, @, %, = might do the same thing, just with a different character as the string terminator, in case you want a ' to be part of the string. """ might introduce a "here document" operator. Etc.

Just thinking out loud....

