Re: Regexp to match an URL in an HTML <a href=""></a> tag
From: Andy R (andrew_rowland_at_hotmail.com)
Date: 11/15/03
Date: Sat, 15 Nov 2003 11:58:12 GMT
"Charles Nadeau" <charlesnadeau@hotmail.com> wrote in message
news:bp483h$1gv0$1@nwall1.odn.ne.jp...
> Hello,
>
> I am trying to craft a regular expression to filter a URL from a <a
> href=""></a> tag and the one I have doesn't seem right.
> I use the regular expression from this snippet of code:
>
> foreach my $message (@messages)
> {
>     my @match=($message->decoded=~/\bhref="(http.*)">.*/gi);
>
>     foreach my $match(@match)
>     {
>         print $match,"\n";
>     }
>
> }
>
> but it doesn't lead to results that are exactly what I need. An excerpt of
> what I get as an output looks like:
>
> http://2%30%33.197.%3204.1%355/mout/
> http://www.superrxsalesman.info/aff1/?mulish
> http://www.superrxsalesman.info/aff1/?acme
> http://www.superrxsalesman.info/aff1/?blister
> http://www.superrxsalesman.info/aff1/?samba
> http://www.superrxsalesman.info/aff1/?depot"><font color="#0033CC
> http://www.superrxsalesman.info/aff1/?procter"><font color="#0033CC
> http://www.superrxsalesman.info/aff1/?use"><font color="#0033CC
> http://www.superrxsalesman.info/aff1/?butane"><font color="#0033CC
> http://www.superrxsalesman.info/aff1/?fiche"><font color="#0033CC
>
> The first 5 lines are exactly what I want but I don't understand why in the
> following lines I get characters after and including ". I want basically to
> keep what is in between the "" of the <href=""> tag.
> Could anybody tell me what is wrong with my regular expression?
> Thanks!
>
> Charles
>
> --
> Charles-E. Nadeau Ph.D
> http://radio.weblogs.com/0111823/
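Use a ? after the .* to make the match non-greedy, i.e.:
my @match=($message->decoded=~/\bhref="(http.*?)">.*/gi);
A greedy .* runs on to the last "> on the line, which is why you pick up
the <font ...> text whenever another tag follows the link on the same line.
The non-greedy .*? stops at the first "> instead.
This should work, though I've not tested it.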
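
A minimal sketch of the difference, assuming a made-up sample string in
place of $message->decoded (the example.com URL and the variable names are
hypothetical, not from the original messages). It also shows a negated
character class, [^"]*, as one possible alternative that cannot run past
the closing quote:

```perl
use strict;
use warnings;

# Hypothetical sample line, standing in for $message->decoded.
my $html = '<a href="http://www.example.com/aff1/?depot">'
         . '<font color="#0033CC">x</font></a>';

# Greedy: .* extends to the LAST "> on the line.
my @greedy = $html =~ /\bhref="(http.*)">/gi;

# Non-greedy: .*? stops at the FIRST "> after the URL.
my @lazy = $html =~ /\bhref="(http.*?)">/gi;

# Negated class: the capture can never contain a double quote.
my @negated = $html =~ /\bhref="(http[^"]*)"/gi;

print "greedy:  $_\n" for @greedy;
print "lazy:    $_\n" for @lazy;
print "negated: $_\n" for @negated;
```

With this sample the greedy capture still carries the trailing
"><font color="#0033CC, while the other two return only the URL.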
Andy R