Re: [PHP] not sure why regex is doing this

From: Kelly Hallman (khallman_at_ultrafancy.com)
Date: 01/09/04


Date: Fri, 9 Jan 2004 10:24:24 -0800 (PST)
To: craig <php@lonsbury.com>

On Fri, 9 Jan 2004, craig wrote:
> (4536,'golf tournament management',430,0,0),
> (1434,'Premium golf balls',,,0),
>
> I have to replace the blank entries (,,) with NULLs, using this regex:
> $query = preg_replace('/,\s*,/',',NULL,', $query, -1);
> after this line, only ONE of the ,, sets is replaced by ,NULL, like:
> (1434,'Premium golf balls',NULL,,0)

The regex does keep looking for matches, but it resumes just past the text
it already matched. In other words, the second comma of ,, was consumed by
the first match, so it can't also be the leading comma of the next match.
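
To see that behavior in isolation, here is a minimal sketch using the second
row from your post ($row and $result are just illustrative names):

```php
<?php
// Second row from the original post; $row is an illustrative name.
$row = "(1434,'Premium golf balls',,,0)";

// The pattern consumes BOTH commas of the first ",," pair, so the
// search resumes at the third comma, which is followed by "0".
$result = preg_replace('/,\s*,/', ',NULL,', $row);

echo $result, "\n"; // (1434,'Premium golf balls',NULL,,0)
```

Only the first empty field is filled, which matches what you are seeing.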

This should do the trick:
preg_replace('/,\s*(?=[,\)])/', ',NULL', $input);

(?=pattern) is a positive lookahead. It evaluates true if the next
characters match the pattern, but those characters are not consumed.

So that regex is equivalent to "match a pattern starting with a comma
followed by any existing spaces, ONLY IF the next character is , or )"
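
As a quick check of the lookahead version, here is a small sketch run against
both rows from your post ($rows and $fixed are illustrative names, and I'm
assuming the rows are joined by a newline):

```php
<?php
// Both sample rows from the original post, joined by a newline (assumed).
$rows = "(4536,'golf tournament management',430,0,0),\n"
      . "(1434,'Premium golf balls',,,0),";

// Match a comma plus optional whitespace, but only when the next
// character is "," or ")". The lookahead does not consume that character,
// so it is still available to start the next match.
$fixed = preg_replace('/,\s*(?=[,\)])/', ',NULL', $rows);

echo $fixed, "\n";
// (4536,'golf tournament management',430,0,0),
// (1434,'Premium golf balls',NULL,NULL,0),
```

The first row is untouched because none of its commas is followed by , or ).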

The most robust way you could write this regex is:
preg_replace('/([,\(])\s*(?=[,\)])/', '\1NULL', $input);

I know you'll probably never have input like (,,,,), but this version
would work as expected: the empty field right after the opening
parenthesis gets a NULL too.
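
A short sketch comparing the two patterns on that edge case ($simple and
$robust are illustrative names):

```php
<?php
$input = '(,,,,)';

// Comma-only version: the empty field right after "(" is never matched,
// because nothing before it is a comma.
$simple = preg_replace('/,\s*(?=[,\)])/', ',NULL', $input);

// Version that also accepts "(" as the leading character. \1 puts back
// whichever character was captured ("," or "(") before the NULL.
$robust = preg_replace('/([,\(])\s*(?=[,\)])/', '\1NULL', $input);

echo $simple, "\n"; // (,NULL,NULL,NULL,NULL)
echo $robust, "\n"; // (NULL,NULL,NULL,NULL,NULL)
```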

Many tricky regex problems can be solved with lookaheads. There is also a
negative lookahead, (?!pattern). Also note that lookaheads are an advanced
regex feature and won't work in many regex engines not based on PCRE.
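
For illustration only, here is a sketch of a negative lookahead on the same
kind of row. It counts the commas that are followed by an actual value, that
is, NOT followed by another comma or a closing parenthesis. The names $row,
$matches and $count are hypothetical, and this isn't needed for the fix above:

```php
<?php
$row = "(1434,'Premium golf balls',,,0)";

// (?!...) succeeds only when the enclosed pattern does NOT match at the
// current position. Like the positive form, it consumes nothing.
$count = preg_match_all('/,(?!\s*[,\)])/', $row, $matches);

echo $count, "\n"; // 2
```

The two commas counted are the ones before 'Premium golf balls' and before 0.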

-- 
Kelly Hallman
// Ultrafancy

