Re: Html download challenge



Andrea Desole wrote:


sks wrote:


Maybe the URLConnection adds some headers by default, such as "User-Agent: Java"
or something similar, which Google is rejecting.


That's what I thought, but I dumped all the request parameters, and the list turned out to be empty. If the class adds any headers, it hides them.
I would really like to know how they do it.

Using the TcpTunnelGui tool from Apache SOAP, I can see that this is what Java sends when running the above program:


============= start =========================================
GET / HTTP/1.1
User-Agent: Java/1.5.0_03
Host: localhost:8888
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
Content-type: application/x-www-form-urlencoded

GET /search?q=business HTTP/1.1
User-Agent: Java/1.5.0_03
Host: localhost:8888
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
Content-type: application/x-www-form-urlencoded

============= end =========================================

You could probably use the HttpClient module from Apache to change the User-Agent field.
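
For what it's worth, the plain java.net classes may be enough here too. The sketch below is only an illustration, not something tested against Google: it assumes that setRequestProperty() on the connection overrides the default "Java/1.5.0_03" value seen in the trace above. The class name, the helper openWithUserAgent() and the agent string are made up for the example; the localhost:8888 address is the tunnel from the trace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class UserAgentExample {

    // Hypothetical helper: opens (but does not yet connect) an HTTP
    // connection and replaces the default User-Agent header.
    // Checked I/O exceptions are wrapped to keep the sketch short.
    static HttpURLConnection openWithUserAgent(String address, String userAgent) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(address).toURL().openConnection();
            // Must be called before the request is sent; afterwards
            // setRequestProperty() throws IllegalStateException.
            conn.setRequestProperty("User-Agent", userAgent);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The agent string is an arbitrary example value, not a recommendation.
        HttpURLConnection conn = openWithUserAgent(
                "http://localhost:8888/search?q=business",
                "Mozilla/5.0 (compatible; ExampleClient/1.0)");

        // Shows only the headers set explicitly so far; nothing has been
        // sent over the network at this point.
        System.out.println(conn.getRequestProperties());

        // Calling conn.getResponseCode() or conn.getInputStream() here
        // would actually send the request.
    }
}
```

Pointing this at the tunnel again and comparing the captured User-Agent line with the trace above would show whether the override really goes out on the wire.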

Ray

--
XML is the programmer's duct tape.


