Re: URL's and Set's do not mix - DNS lookups are compulsory




"Rogan Dawes" <discard@xxxxxxxxxxxx> wrote in message
news:e4eesf$d$1@xxxxxxxxxxxxxxxxxxxxx
Hi folks,

I noticed a weird problem in a bit of code that I was writing. I wanted to
display a hierarchy of URLs in a TreeModel. In doing so, I was adding
URLs to a HashSet. While I was testing, I was using invalid URLs, like
"http://abcd/", etc.

What happened was that the calls to set.contains(url) and set.add(url)
were showing delays of up to 4 seconds executing the methods. And this
changed, depending on whether I was using a HashSet or a TreeSet with a
custom Comparator.

What turned out to be the problem is that the URL.hashCode() method was
actually trying to resolve the address of the hostname that was specified,
via the protocol-specific handler. And obviously, the hostnames ("abcd") I
was using did not exist, and the DNS resolution was taking some time to
time out.

Am I the only person to think that this is a completely STUPID design?

It's part and parcel of the attempt to make URL.equals() independent of how
the hostname is specified. This is actually hopeless, given the existence
of proxies, VPN, NAT, etc. There's no guarantee that two different IP
addresses don't reference the same host, or that two identical IP addresses,
used at different times or in different contexts, name the same host. I'm
not sure why Sun is even making the attempt.


I can think of many scenarios where one may want to keep even valid URLs
in a Set, without being able to resolve them to an IP address. For
example, a web scanner that works in a private environment (using a
corporate DNS), where the results may be reviewed on a machine outside of
the environment, with no access to the internal DNS servers.

This problem makes it almost impossible to use this kind of data structure
in an offline environment.

Does anyone have any suggestions on how to get around this issue? At this
point, I am thinking of simply copying the URL class into my own code, and
removing all traces of this idiocy.

Two suggestions, at least.

1. Store the string version of the URL instead of the URL itself. It's cheap
enough to reparse it when necessary.
2. Wrap the URL with another class that bases equals() and hashCode() on the
URL's string value.
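
A rough sketch of suggestion 2, in case it helps. The class name UrlKey and
its of() and url() methods are made up for illustration; nothing here comes
from the JDK except URL, URI and MalformedURLException. It assumes that
comparing the URL's external string form is good enough for your purposes.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical wrapper: equality and hashing use the URL's string form only,
// so putting it in a HashSet never asks the URL to resolve its hostname.
final class UrlKey {
    private final URL url;
    private final String text; // cached string form, used by equals/hashCode

    UrlKey(URL url) {
        this.url = url;
        this.text = url.toExternalForm();
    }

    // Convenience factory; rethrows the checked exception as an unchecked one.
    static UrlKey of(String spec) {
        try {
            return new UrlKey(URI.create(spec).toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    // The wrapped URL, for when you actually need to open a connection.
    URL url() {
        return url;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof UrlKey && text.equals(((UrlKey) o).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }

    @Override
    public String toString() {
        return text;
    }
}
```

You would then use a Set<UrlKey> instead of a Set<URL>, e.g.
set.add(UrlKey.of("http://abcd/")). The trade-off is that the comparison is
purely textual, so "http://ABCD/" and "http://abcd/" count as different
entries unless you normalize the string yourself first. Suggestion 1 is the
same idea without the wrapper: keep a Set<String> and reparse on demand.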

