Re: Best way to hash a set of integers?
- From: Logan Shaw <lshaw-usenet@xxxxxxxxxxxxx>
- Date: Wed, 20 Sep 2006 04:49:15 GMT
Digital Puer wrote:
Ben Pfaff wrote:"Digital Puer" <digital_puer@xxxxxxxxxxx> writes:
However, I'm still looking for info on how to best hash a *set*
of integers;
Can you
tell us anything about the sets?
I guess I'm looking for a way to compute what could be
called a multi-dimensional key. Suppose I have three integers
representing age, height (in cm), and weight (in lbs). I want
to somehow combine those together into a single key that will be
my index into a hash table (prior to doing mod hash_table_size).
That's not a set! You could have a person who is 95 years old
and 95 lbs. So, "95" can occur twice, so it's not a set. Another
reason it's not a set is that the order matters.
So, it's actually a list of integers. Or maybe a record of
integers would be a better way to describe it.
Just how to hash these depends on several things. How big will
your hash table be? What is the distribution of your numbers?
Is it a random or is there some pattern to it? In most of these
cases, there will be a pattern. Weight and height will be pretty
close to a normal distribution. Age will be different.
You do have one good thing going for you with these kinds of
values: most of the low-order bits will be very evenly
distributed. It's extremely unlikely that even ages will
greatly outnumber odd ages, or that even weights will greatly
outnumber odd ones, etc. So that means that at least the least
significant bit in each is close to evenly distributed.
With weight, the 5 least significant bits are likely to be
pretty evenly distributed, maybe even 5 or 6 bits. With height,
most people will be between 160 cm to 180 cm and evenly spread
across that interval, so you probably only have something like
the 4 least significant bits that are evenly distributed. With
age, things are a little trickier, but you should be able to
use the 3 or 4 least significant bits.
So, one possible hash function is this:
int hash (int ageyears, int heightcm, int weightlbs)
{
int h;
h = ageyears & (1 << 3 - 1);
h = (h << 4) | (heightcm & (1 << 4 - 1));
h = (h << 5) | (weightlbs & (1 << 5 - 1));
return h;
}
That gives you a 12-bit hash code that should be pretty evenly
distributed across the 4096 possible hash values. If you need
something better, you'll have to be more clever.
- Logan
.
- Follow-Ups:
- Re: Best way to hash a set of integers?
- From: Richard Heathfield
- Re: Best way to hash a set of integers?
- References:
- Best way to hash a set of integers?
- From: Digital Puer
- Re: Best way to hash a set of integers?
- From: toby
- Re: Best way to hash a set of integers?
- From: Digital Puer
- Re: Best way to hash a set of integers?
- From: Ben Pfaff
- Re: Best way to hash a set of integers?
- From: Digital Puer
- Best way to hash a set of integers?
- Prev by Date: Re: Best way to hash a set of integers?
- Next by Date: Re: Total Newbie to asp asp & HTML
- Previous by thread: Re: Best way to hash a set of integers?
- Next by thread: Re: Best way to hash a set of integers?
- Index(es):
Relevant Pages
|