Re: Red-black trees?

Jon Harrop <jon@xxxxxxxxxxxxxxxxx> wrote:
Mark Wooding wrote:
robin <robin_v@xxxxxxxxxxx> wrote:
A hash table is a table, and therefore is of fixed size, and would be
unsuitable for such a task.

You've missed the memo about extending hash tables. If you get it
right, the work done to grow the table leaves you with O(1) insertion
times (amortized).

Or O(n) insertion times, which is worse than a balanced binary tree!

The problems with hash tables that I know of are as follows.

* You sometimes need to wait for hash table growth. If you're
sensible, you'll grow the table by a constant factor each time. If
the table contains n objects, then it'll have been resized O(log n)
times, and you'll have had to move O(n) things in the process.
Since you've inserted O(n) things, that works out constant overall.
So if good-on-average is OK (and it usually is) then you have
nothing to worry about.

* If an adversary can choose the items you'll be inserting into your
table, he might be able to arrange for hash collisions to occur.
The solution is to use an (almost) 2-universal hashing family, and
don't tell the adversary which function you chose.

* The table doesn't store its items in any useful order. Sorry: if
this is a problem, use a tree.

Hash tables have more important problems, e.g. GCs often don't like them.

Certainly if you're using object addresses to identify objects, and
you're using a GC which moves objects about, then you're going to have
to arrange for the hash table to be rehashed during scavenging. (Either
that or you can try to find something else to hash on, and use object
address to disambiguate. But that sounds messy.)

I don't recall reading anything especial in Jones and Lins about hash
tables being problematic. Would you like to explain further?

-- [mdw]

Relevant Pages

  • Re: If you are unhappy with the direction of Ruby 1.8.7+, respond
    ... Hash, and made this change in Ruby 1.9, which has been backported to Ruby ... Seconded -- 1.8.7 is precisely where my code that was relying on insertion ... on keys being inserted in the order they appear in a hash literal, ... use literal appearance order when creating a new hash. ...
  • Re: I dont understand this hash table calculation
    ... "However, if each element hashes to the same bucket, the hash table ... I thought it took constant time ) to search a hash table and ... Insertion of one element into a linked list, in order, takes O. ... lookups, it makes sense to choose the in-order solution... ...
  • Re: Red-black trees?
    ... So operations on your hash table implementation are not all ... cost of an insertion and inclusion in the list remains as O. ... The O notation is an upper bound => it must bound ... That only improves the constant prefactor and does not affect the asymptotic ...
  • Re: What are my options for dictionary hashing?
    ... do the insertion verses the cumulative time needed for a linear search ... hash tables will win over binary searches. ... We have experimented both with different hash algorithms and different numbers of linear lists, and are satisfied that what we use now is close to optimum. ... so I'll model the behavior of a binary search and see if the ...
  • Re: Red-black trees?
    ... So operations on your hash table implementation are not all ... Balanced trees are only O. ... cost of an insertion and inclusion in the list remains as O. ... You are referring to amortized complexity which is irrelevant in the ...