Re: maximum size of a hash table
From: John Bokma (postmaster_at_castleamber.com)
Date: 02/28/05
- Next message: Eric Schwartz: "Re: Performance questions (SQL-statements)"
- Previous message: Henry Law: "Re: cannot open and write file"
- In reply to: Anno Siegel: "Re: maximum size of a hash table"
- Next in thread: Ted Zlatanov: "Re: maximum size of a hash table"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 28 Feb 2005 18:32:49 GMT
Anno Siegel wrote:
> John Bokma <postmaster@castleamber.com> wrote in comp.lang.perl.misc:
>> Anno Siegel wrote:
>>
>> > John Bokma <postmaster@castleamber.com> wrote in
>> > comp.lang.perl.misc:
>> >> Anno Siegel wrote:
>> >>
>> >> > John Bokma <postmaster@castleamber.com> wrote in
>> >> > comp.lang.perl.misc:
>> >> >> Anno Siegel wrote:
>> >> >>
>> >> >> > John Bokma <postmaster@castleamber.com> wrote in
>> >> >> > comp.lang.perl.misc:
>> >
>> > [hashes]
>> >
>> >> >> I am happy with constant avg length of c, and hence O(1) look
>> >> >> up.
>> >> >
>> >> > Fine. My point is that there are useful applications of hash
>> >> > tables also in the O(n) range.
>> >>
>> >> Uhm, like? (Ok, memory stress testing is one).
>> >
>> > It's a simple space/time tradeoff. You only get constant access
>> > time when collisions are rare, so the number of buckets must be
>> > larger than the number of keys.
>>
>> All depends on the hash function of course, but if the size is big
>> (n) and the number of collisions is small (constant), you still have
>> O(1) access.
>
> Collisions *can't* be rare when the number of keys exceeds the number
> of buckets, no matter what the hash function.
You wrote (see above): that "the number of buckets must be larger than
the number of keys" to make collisions rare. With perfect hashing you
can hash n elements in n slots without having any collision.
Moreover, it is possible to have relatively a lot of collision and still
have O(1) look up. As long as the average number of collisions is a
(small) constant wrt n, the look up is O(n).
>> > Say you expect a million keys and linear search
>> > is too slow by a factor of 100. Use a hash with 100 buckets (some
>> > more to compensate overhead).
>>
>> It all depends on what you want. If linear search is slow, and you
>> need a lot of access it might be way smarter to sort ( O(n log n) )
>> and use binary search for the look ups ( O( log n ) ).
>>
>> With 100 buckets you get 100 lists of 10,000 items each. So it takes
>> average 5,000 steps to find something.
>>
>> Compare that with binary search :-D.
>
> You must keep the list sorted. When insertions and/or deletions are
> frequent, you lose the advantage.
search, insertion and deletion are all O(log n). ( at most 20 steps )
Your proposal has O(n) look up (5,000 steps avg), O(1) insertion, and
O(n) deletion (of a specific element, since you have to look it up
first, 5,000 steps avg, otherwise O(1)).
Technically you are using a hash table with 100 buckets, each bucket
containing a list. Implementing it that way, and sorting each list would
give a major speed up (max 14 steps to find an item with binary search,
max 14 steps for insert/delete). [1]
> What's the point? I'm not saying that hashes with large load factors
> are a super trick that solves everything. They can be, and have been
> used that way. You asked for an example.
I can't think of any advantage of an O(n) look up hash table, it works
like an unsorted list, see above.
-- John Small Perl scripts: http://johnbokma.com/perl/ Perl programmer available: http://castleamber.com/ Happy Customers: http://castleamber.com/testimonials.html
- Next message: Eric Schwartz: "Re: Performance questions (SQL-statements)"
- Previous message: Henry Law: "Re: cannot open and write file"
- In reply to: Anno Siegel: "Re: maximum size of a hash table"
- Next in thread: Ted Zlatanov: "Re: maximum size of a hash table"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|
|