Regular Expressions

From: Andrew Bullock (trullock_at_yahoo.com)
Date: 03/10/04


Date: Wed, 10 Mar 2004 14:32:01 -0000

Hi

I'm REALLY stuck with regular expressions!

Can someone please give me a pointer with this...

I'm trying to extract a load of images from a site, each of which is wrapped in a link,
i.e. I want to extract the link URL (the href) and the image path (the src).

Typical links would be:

<a href="test.html"><img src="clickme.jpg"></a>
<a target=_top href="test.html"><img border=0 src="clickme.jpg"></a>
<a href='test.html'> <img src=clickme.jpg></a>

As you can see, there may be double quotes, single quotes or no quotes
surrounding the image path and the URL, so I need to be able to account for
this.

Also, I only want to return info where the clickable image is a JPEG
and the URL points to an HTML page.

Does that make sense?

$matchstr =
'/<a\s+.*?href=[\"\'s]?(.*?)\"?+src=[\"\'s]?(.*?)\"(.*?)\>\<\/a\>/i';

That almost works, but it returns this:

test.html"><img
thumbs/16.jpg>

How can I fix this?
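
One possible direction, as a rough sketch rather than a definitive fix: the lazy (.*?) groups appear free to run past the closing quote and the >, which would explain the stray characters in the output. The pattern below instead uses negated character classes for the two values and a backreference (\1, \3) so that whichever quote opened a value must also close it. The ~ delimiter, the .html?/.jpe?g extension checks and the assumption that each link contains nothing but a single img tag are illustrative choices, not requirements taken from the site being scraped.

```php
<?php
// The three sample links from the post.
$html = <<<'HTML'
<a href="test.html"><img src="clickme.jpg"></a>
<a target=_top href="test.html"><img border=0 src="clickme.jpg"></a>
<a href='test.html'> <img src=clickme.jpg></a>
HTML;

// Group 1 / group 3: optional opening quote (", ' or nothing).
// Group 2: href value ending in .htm or .html.
// Group 4: src value ending in .jpg or .jpeg.
// \1 and \3 require the same quote (or nothing) to close each value.
$matchstr = '~<a\b[^>]*?\bhref\s*=\s*(["\']?)([^"\'\s>]+\.html?)\1[^>]*>\s*'
          . '<img\b[^>]*?\bsrc\s*=\s*(["\']?)([^"\'\s>]+\.jpe?g)\3[^>]*>\s*</a>~i';

// PREG_SET_ORDER gives one array per matched link.
preg_match_all($matchstr, $html, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[2] is the URL, $m[4] is the image path.
    echo $m[2], ' -> ', $m[4], "\n";
}
```

For the three samples this should print "test.html -> clickme.jpg" three times. Links whose href is not an HTML page, or whose image is not a JPEG, simply fail to match. Markup that strays far from these samples (extra tags inside the link, attributes containing >) is probably better handled with an HTML parser than with a regex.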

Many thanks in advance

Andrew


