Re: regexp pipe problems..
- From: Christian Winter <thepoet_nospam@xxxxxxxx>
- Date: Thu, 30 Jun 2005 10:29:25 +0200
willwade@xxxxxxxxx wrote:
> OK, I have a "fairly" straightforward regular expression which grabs
> some bits out of a url. Now this should be easy but I can't see the
> wood for the trees. I have a few or's in there but it seems to be
> adding each to memory - when I only want the found one. Also - instead
> of matching just the subdomain it matches the whole domain ($1 see
> below) - whats that all about?
>
> $url = 'http://sub1.site3.org/slash/slashey2/slashey3/39/4/223';
>
> if ($url =~ m{(([^/]+).site1.org|([^/]+).site2.org|([^/]+).site3.org)
> /slash/slashey2/([^/]+)/([0-9]+)/([0-9]+)/([0-9a-zA-Z-]+)}){
> print "yay! $1,$2,$3,$4,$5";
> } else {
> print "poo";
> exit;
> }
>
> and it prints:
>
> yay! sub1.site3.org,,,sub1,
Just make it a little more readable:
m{
    (                          # this parenthesis is captured in $1
        ([^/]+).site1.org |    # here's $2
        ([^/]+).site2.org |    # that's $3
        ([^/]+).site3.org      # and $4
    )
    /slash/slashey2/
    ( [^/]+ )                  # here we get $5...
    /
    ( [0-9]+ )                 # ...and so on...
    /
    ( [0-9]+ )
    /
    ( [0-9a-zA-Z-]+ )
}x;
With the /x modifier you can insert whitespace, comments
and line breaks into your regex without changing its meaning
(a literal space or # then has to be escaped or put in a
character class). This is especially useful when you post
regexes here, as your expressions are no longer damaged by
automated line breaks.
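Capturing parens are numbered in the order in which their
opening parens appear in the regex, from left to right. That's
why $1 in your example captures the FQDN: the outer group opens
first. Each alternative also has its own group, so the groups
in the alternatives that didn't match ($2 and $3) stay undefined.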
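To see the numbering rule in isolation, here is a minimal sketch.
The sample string and the pattern are made up for illustration
and are not taken from your code:

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only to show how groups are numbered.
my $text = 'b';

# Opening parens, left to right: the outer group is number 1,
# the group around "a" is number 2, the group around "b" is number 3.
# In list context the match returns all groups, with undef for any
# group that did not take part in the match.
my @groups = $text =~ m{ ( (a) | (b) ) }x;

print join(',', map { defined $_ ? $_ : 'undef' } @groups), "\n";
# prints "b,undef,b"
```

The outer group and the third group both hold "b", while the
second group is undefined. That is the same effect you are seeing
with the three site alternatives.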
The easiest way around this is to move the subdomain
pattern out of the alternation:
m{
    ([^/]+)\.(site1|site2|site3)\.org
    /slash/slashey2/
    ( [^/]+ ) / ( \d+ ) / ( \d+ ) / ( [0-9a-zA-Z-]+ )
}x;
and change the print statement (or whatever uses the
capture variables) to skip $2, which now holds the site name:
print "yay! $1,$3,$4,$5,$6";
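Put together as a complete script, this could look roughly like
the following. It is only a sketch built from your URL and the
pattern above; the $result variable is my own addition so the
outcome can be checked in one place:

```perl
use strict;
use warnings;

my $url = 'http://sub1.site3.org/slash/slashey2/slashey3/39/4/223';

my $result;   # hypothetical helper variable, not in the original code
if ($url =~ m{
        ([^/]+) \. (site1|site2|site3) \. org   # group 1: subdomain, group 2: site
        /slash/slashey2/
        ( [^/]+ ) / ( \d+ ) / ( \d+ ) / ( [0-9a-zA-Z-]+ )
    }x) {
    # Skip group 2 (the site name) when printing.
    $result = "yay! $1,$3,$4,$5,$6";
} else {
    $result = "poo";
}

print "$result\n";   # prints "yay! sub1,slashey3,39,4,223"
```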
But, of course, TMTOWTDI, and "perldoc perlre" has a lot of
useful information on regular expressions.
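One of those other ways, sketched here under my own assumptions:
group the site names with non-capturing parens, (?: ... ), so they
take no capture number, and assign the match in list context to
named variables. The variable names below are invented for the
example:

```perl
use strict;
use warnings;

my $url = 'http://sub1.site3.org/slash/slashey2/slashey3/39/4/223';

# (?: ... ) groups the alternatives without capturing them, so the
# remaining groups are numbered 1 through 5 with no gap to skip.
# The variable names are hypothetical.
my ($sub, $section, $first, $second, $third) = $url =~ m{
    ([^/]+) \. (?:site1|site2|site3) \. org
    /slash/slashey2/
    ( [^/]+ ) / ( \d+ ) / ( \d+ ) / ( [0-9a-zA-Z-]+ )
}x;

if (defined $sub) {
    print "yay! $sub,$section,$first,$second,$third\n";
} else {
    print "poo\n";
}
```

A failed match returns an empty list, which leaves $sub undefined,
so the defined test doubles as the success check.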
HTH
-Chris