Regexp to match an URL in an HTML <a href=""></a> tag
From: Charles Nadeau (charlesnadeau_at_hotmail.com)
Date: 11/15/03
Date: Sat, 15 Nov 2003 12:55:27 +0900
Hello,
I am trying to craft a regular expression to extract a URL from an <a
href=""></a> tag, and the one I have doesn't seem right.
I use the regular expression from this snippet of code:
    foreach my $message (@messages)
    {
        my @match = ($message->decoded =~ /\bhref="(http.*)">.*/gi);
        foreach my $match (@match)
        {
            print $match, "\n";
        }
    }
but the results are not exactly what I need. An excerpt of the output
looks like this:
http://2%30%33.197.%3204.1%355/mout/
http://www.superrxsalesman.info/aff1/?mulish
http://www.superrxsalesman.info/aff1/?acme
http://www.superrxsalesman.info/aff1/?blister
http://www.superrxsalesman.info/aff1/?samba
http://www.superrxsalesman.info/aff1/?depot"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?procter"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?use"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?butane"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?fiche"><font color="#0033CC
The first 5 lines are exactly what I want, but I don't understand why the
following lines include the closing " and the characters after it. I
basically want to keep only what is between the "" of the <a href=""> tag.
Could anybody tell me what is wrong with my regular expression?
Thanks!
Charles
-- Charles-E. Nadeau Ph.D http://radio.weblogs.com/0111823/
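
A likely explanation, sketched here under stated assumptions: .* is greedy, so it runs to the end of the line and then backtracks only as far as the last "> on that line. When the line holds a second tag (the <font color="#0033CC"> case), the capture swallows everything up to that later ">. Replacing .* with [^"]* stops the capture at the first closing quote. The helper name extract_hrefs and the sample text below are hypothetical and not part of the original post; they stand in for $message->decoded.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every
# http/https URL found inside a double-quoted href attribute.
sub extract_hrefs {
    my ($html) = @_;
    # [^"]* cannot cross the closing quote, unlike the greedy .*
    return $html =~ /\bhref="(http[^"]*)"/gi;
}

# Made-up sample text modelled on the excerpt in the post.
my $sample = join "\n",
    '<a href="http://www.superrxsalesman.info/aff1/?samba">',
    '<a href="http://www.superrxsalesman.info/aff1/?depot"><font color="#0033CC">';

print "$_\n" for extract_hrefs($sample);
```

With this pattern both sample lines should yield only the URL. A regular expression is still fragile against single-quoted or unquoted attributes; a real HTML parser such as HTML::LinkExtor is usually the more robust choice.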