Re: Elegant equivalent to this regex?
- From: Mirco Wahab <wahab@xxxxxxxxxxxxxxxxxxx>
- Date: Thu, 04 Jan 2007 23:30:34 +0100
Thus spoke sherifffruitfly (on 2007-01-04 22:47):
> Here's the regex I came up with:
> (?<whole>\"(?<one>\d{1,3}),(?<two>\d{1,3}),(?<three>\d{1,3})\"|\"(?<one>\d{1,3}),(?<two>\d{1,3})\")
> This works fine for me, and getting the desired complete "clean" number
> from it is a triviality.
> But I get the feeling that this is the regex-equivalent of baby-talk.
> I'd like to know if there's a simpler, more elegant regex matching the
> same class of strings, and capturing essentially the same substrings.
You didn't specify how *exact* your matching requirement is.
E.g., suppose you have data like this:
"323, 432, 5" "123, 456, 789" " 888 , 999" " " "1234, 456, 789" "333, 444, 333, 444"
and want to extract *only* sequences with 2 or 3 comma-delimited
fields *and* exactly 3 digits per number(!), so that only groups
#2 and #3 would match. You also didn't say what the whitespace
convention is going to be ...
In the worst case (the strictest specification), this would
look something like:
...
my $stuff = q'
"323, 432, 5" "123, 456, 789" " 888 , 999" " " "1234, 456, 789" "333, 444, 333, 444"
';

my $rexp = qr/
    " \s*            # opening quote
    \d{3} \s*        # first number
    , \s*            # first comma
    \d{3} \s*        # second number
    (?:              # optional third field
        , \s*        #   second comma
        \d{3} \s*    #   third number
    )?
    "                # closing quote
/x;

# strip everything but the digits from each matched group
my @hits =
    map s/\D+//g && $_,
    $stuff =~ /$rexp/g;

print join "\n", @hits;
...
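
If the looser original rule is what you want (1 to 3 digits per field, no whitespace inside the quotes), the two alternatives of the posted pattern can probably be folded into one by making the third field optional. The following is only a sketch under that assumption. The sample data in $input and the variable names are made up for illustration, and numbered captures are used instead of the named groups from the original pattern.

```perl
use strict;
use warnings;

# Hypothetical sample data in the original poster's format
# (1 to 3 digits per field, no whitespace inside the quotes).
my $input = '"1,234" "12,345,678" "999" "1,2,3,4"';

# One pattern instead of two alternatives: the third field is optional.
my $compact = qr/"(\d{1,3}),(\d{1,3})(?:,(\d{1,3}))?"/;

my @numbers;
while ( $input =~ /$compact/g ) {
    # $3 is undef when only two fields are present, so filter it out
    push @numbers, join '', grep { defined } $1, $2, $3;
}

print join( "\n", @numbers ), "\n";
```

With this sample data the script prints 1234 and 12345678. The single-field and four-field strings are skipped, as they would be by the two-alternative pattern.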
Regards
M.