Regexp to match an URL in an HTML <a href=""></a> tag

From: Charles Nadeau (charlesnadeau_at_hotmail.com)
Date: Sat, 15 Nov 2003 12:55:27 +0900

Hello,

I am trying to craft a regular expression to extract a URL from an <a
href=""></a> tag, and the one I have doesn't seem right.
I use the regular expression in this snippet of code:

foreach my $message (@messages)
{
    my @match=($message->decoded=~/\bhref="(http.*)">.*/gi);

    foreach my $match(@match)
    {
        print $match,"\n";
    }

}

but the results are not exactly what I need. An excerpt of the output
looks like this:

http://2%30%33.197.%3204.1%355/mout/
http://www.superrxsalesman.info/aff1/?mulish
http://www.superrxsalesman.info/aff1/?acme
http://www.superrxsalesman.info/aff1/?blister
http://www.superrxsalesman.info/aff1/?samba
http://www.superrxsalesman.info/aff1/?depot"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?procter"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?use"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?butane"><font color="#0033CC
http://www.superrxsalesman.info/aff1/?fiche"><font color="#0033CC

The first 5 lines are exactly what I want, but I don't understand why the
following lines include the closing " and the characters after it. Basically,
I want to keep only what is between the double quotes of the href=""
attribute.
Could anybody tell me what is wrong with my regular expression?
Thanks!

Charles

-- 
Charles-E. Nadeau Ph.D
http://radio.weblogs.com/0111823/
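
A likely explanation for the output above is greedy matching: in (http.*)"> the .* runs to the end of the line and then backtracks only as far as the last "> on that line, so any line with a second "> (such as one closing a <font> tag) yields an over-long capture. The sketch below shows one common fix, a negated character class that cannot cross the closing quote. The helper name extract_hrefs and the sample line are made up for illustration and are not from the original post; it also assumes double-quoted href values, and a real HTML parser would be more robust than any regex.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return every http(s)
# URL found inside a double-quoted href attribute. Call it in list context.
sub extract_hrefs {
    my ($html) = @_;
    # [^"]* matches any run of characters except a double quote, so the
    # capture necessarily stops at the quote that closes the attribute.
    return $html =~ /\bhref="(http[^"]*)"/gi;
}

# Made-up sample line containing two "> sequences, like the lines that
# produced the over-long matches.
my $sample = '<a href="http://www.example.com/?depot"><font color="#0033CC">x</font></a>';

print "$_\n" for extract_hrefs($sample);
# prints: http://www.example.com/?depot
```

A non-greedy quantifier, (http.*?)", would also stop at the first closing quote; the character class just makes that boundary explicit.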

