Re: Html download challenge
- From: Patricia Shanahan <pats@xxxxxxx>
- Date: Thu, 30 Jun 2005 13:54:23 GMT
Paul Battersby wrote:
I've spent days poking around the internet, reading help information, trying to find working source code but no luck so far.
My problem, on the surface and to someone who knows what he/she is doing, should be easy to solve.
All I want to do is download the HTML from the following URL:
http://www.google.com/search?q=business
Sounds simple. I can type that into a browser and get a page full of information. When I try to download it with a Java program, the server seems to know I am not a browser (my code works fine with other URLs). I figure I need to pass some sort of header information so that I appear to be a browser.
So what I'm looking for, if anyone is up to the challenge, is a small piece of Java source code that can download the HTML from the above-mentioned URL and print it to the screen.
On my own, I think I'm looking at a pretty big learning curve (low-level HTTP) to sort this out.
Any help is of course greatly appreciated.
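For reference, the "header information" guessed at above is most likely the User-Agent request header. The following is only a minimal sketch of that idea, not a tested solution for this particular server: the class name PageFetcher, the User-Agent string, and the UTF-8 decoding are all assumptions made for illustration, and whether a given server accepts such a request is not guaranteed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is illustrative only.
class PageFetcher {
    // Assumed browser-like value; a real browser's User-Agent string could be substituted.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)";

    // Prepares the request with a User-Agent header. Nothing is sent to the
    // server until the response is actually read.
    static HttpURLConnection openAsBrowser(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestProperty("User-Agent", USER_AGENT);
        return conn;
    }

    // Downloads the response body and returns it as one string.
    // Assumes the page is UTF-8 encoded.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = openAsBrowser(address);
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return html.toString();
    }
}
```

Printing the page would then be a matter of calling System.out.println(PageFetcher.fetch("http://www.google.com/search?q=business")) from a main method that declares throws IOException.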
There is an alternative approach. Google has a Java API. See http://www.google.com/apis/.
The licensing limits you to 1000 queries per day, and specifies personal, non-commercial use only. I assume they are trying to prevent anyone from constructing a rival search site with their own ads but Google's search results.
As long as you meet the licensing restrictions, it is MUCH easier to access the results using their API than by trying to parse their web pages, even if you can get hold of them.
Patricia