Re: crawling the net...
From: JKop (NULL_at_NULL.NULL)
Date: 04/29/04
- Next message: John Harrison: "Re: ozone filter"
- Previous message: John Harrison: "ozone filter"
- In reply to: ask josephsen: "crawling the net..."
- Next in thread: Morten Wennevik: "Re: crawling the net..."
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Thu, 29 Apr 2004 10:38:17 GMT
ask josephsen posted:
> Hi NG
>
> I'm making a program to crawl the internet. It works by retrieving all
> links in a page, downloading the page of each link and again retrieving
> all the links. (If there are better ways, I'd like to hear them.)
>
> My problem is relative links (like "../../wohoo.asp"). What is the
> smartest way to get the full URL (http://www.xyz.com/wohoo.asp)? Do I
> have to parse the relative link in relation to the URL where the
> relative link was found and then concatenate it? Does anyone know how
> other search engines/crawlers walk the net?
>
>
> Thanks :)
>
> ./ask
You should have posted this on:
alt.sports.gymnastics
It would've been more on-topic _there_.
-JKop
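
For what it's worth, the relative-link part of the question usually does not need hand-written parsing: most URL libraries can resolve a reference against the URL of the page it was found on. Below is a minimal sketch in Python (the thread does not name a language, so that choice is an assumption), using the standard library's urljoin and urldefrag. The names resolve_link, crawl and fetch_links are hypothetical, and fetch_links stands in for whatever code downloads a page and extracts its href values.

```python
from collections import deque
from urllib.parse import urljoin, urldefrag


def resolve_link(page_url, href):
    """Turn an href found on page_url into an absolute URL, fragment dropped."""
    return urldefrag(urljoin(page_url, href)).url


def crawl(start_url, fetch_links, max_pages=100):
    """Breadth-first walk starting at start_url.

    fetch_links(url) is a hypothetical callback returning the raw href
    strings found on that page; downloading and HTML parsing are left out.
    """
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])
    order = []                  # URLs in the order they were visited
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for href in fetch_links(url):
            absolute = resolve_link(url, href)
            # Skip mailto:, javascript: and other non-web schemes.
            if not absolute.startswith(("http://", "https://")):
                continue
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order
```

With a page at http://www.xyz.com/a/b/page.asp, resolve_link(page, "../../wohoo.asp") gives http://www.xyz.com/wohoo.asp. A real crawler would also want to honour robots.txt, rate-limit requests, and respect any base element in the page; none of that is covered here.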