Re: Regular Expression



On Sep 7, 2:28 pm, "fritz-ba...@xxxxxx" <fritz-ba...@xxxxxx> wrote:
I'm looking for a regular expression which will find a certain word
in a text and replace it, if and only if it does not appear inside
an HTML link or inside a tag

See perlfaq4, "How do I find matching/nesting anything?":

==================================
This isn't something that can be done in one regular expression, no
matter how complicated. To find something between two single
characters, a pattern like /x([^x]*)x/ will get the intervening bits
in $1. For multiple ones, then something more like /alpha(.*?)omega/
would be needed. But none of these deals with nested patterns. For
balanced expressions using (, {, [ or < as delimiters, use the CPAN
module Regexp::Common, or see (??{ code }) in the perlre manpage. For
other cases, you'll have to write a parser.

If you are serious about writing a parser, there are a number of
modules or oddities that will make your life a lot easier. There are
the CPAN modules Parse::RecDescent, Parse::Yapp, and Text::Balanced;
and the byacc program. Starting with Perl 5.8, Text::Balanced is
part of the standard distribution.

One simple destructive, inside-out approach that you might try is to
pull out the smallest nesting parts one at a time:

    while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) {
        # do something with $1
    }

A more complicated and sneaky approach is to make Perl's regular
expression engine do it for you. This is courtesy Dean Inada, and
rather has the nature of an Obfuscated Perl Contest entry, but it
really does work:

    # $_ contains the string to parse
    # BEGIN and END are the opening and closing markers for the
    # nested text.

    @( = ('(','');
    @) = (')','');
    ($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs;
    @$ = (eval{/$re/},$@!~/unmatched/i);
    print join("\n",@$[0..$#$]) if( $$[-1] );
==================================
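
To make the FAQ's pointer to Text::Balanced a bit more concrete, here is
a minimal sketch using its extract_bracketed function. The sample string
and the variable names are made up for illustration; they are not part of
the FAQ.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Illustrative input (an assumption, not from the FAQ): nested parentheses
# followed by some trailing text.
my $text = '(outer (inner) text) trailing';

# In list context extract_bracketed returns the balanced substring found
# at the start of the string, then the remainder of the string.
# The original $text is left unmodified in list context.
my ($extracted, $remainder) = extract_bracketed($text, '()');

print "extracted: $extracted\n";
print "remainder: $remainder\n";
```

The balanced part, including its nested "(inner)", comes back as one
piece, which a single non-recursive pattern like /\((.*?)\)/ would not
manage.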

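For the original question (replace a word unless it sits inside a tag or
inside a link), one way to act on the "write a parser" advice is a tiny
tokenizer: split the text into tags and non-tag chunks, track whether an
<a> element is open, and substitute only in chunks outside of links. This
is only a rough sketch under simplifying assumptions: the helper name
replace_outside_links is hypothetical, and the code assumes well-formed
markup with no ">" inside attribute values, comments, or scripts. For
real-world HTML a proper parser such as the CPAN module HTML::Parser is
the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $word with $replacement in $html, except
# inside tags themselves and inside the text of <a>...</a> links.
sub replace_outside_links {
    my ($html, $word, $replacement) = @_;
    my $in_link = 0;    # number of currently open <a> elements
    my $out     = '';

    # The capturing group makes split return the tags as list items too.
    for my $chunk (split /(<[^>]*>)/, $html) {
        if ($chunk =~ /^</) {
            # A tag: pass it through unchanged, but track <a> and </a>.
            if    ($chunk =~ m{^<a\b}i)    { $in_link++ }
            elsif ($chunk =~ m{^</a\s*>}i) { $in_link-- if $in_link }
        }
        elsif (!$in_link) {
            # Plain text outside any link: replace whole-word matches.
            $chunk =~ s/\b\Q$word\E\b/$replacement/g;
        }
        $out .= $chunk;
    }
    return $out;
}

my $html = 'foo <a href="foo.html">foo</a> <b class="foo">foo</b>';
print replace_outside_links($html, 'foo', 'bar'), "\n";
# prints: bar <a href="foo.html">foo</a> <b class="foo">bar</b>
```

The word is left alone in the href attribute, in the link text, and in
the class attribute, and replaced everywhere else.
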
--
Klaus
