Re: How do I hash this?
- From: "Martijn" <subscription_remove_101@xxxxxxxxxxxxxxxxxxx>
- Date: Sun, 12 Feb 2006 09:37:57 +0100
I have 82,160 3 number combinations (non repeating 1- 80, like one
set of 1,2,3 ;
not like 1,2,2) that I'd like create hash keys for with out
collision, if possible. I tried SuperFastHash and when I tried
loading them to my database
I got some primary key failures which would indicate matching keys.
Now given that your integers are sorted, let's call them x,y,z.
Since the values are sorted, and there's no repetition, the largest
value z could be is 79. The largest value y could be is 78, and the
largest value x could be is 77. So your hash value could be x * 6320
+ y * 80 + z.
ok I tested and it works perfectly. I understand that 6320 is 80 * 79
but I don't understand why it works. I know that x will always be
less than y but I'm not sure why multiplying by 79 * 80 makes it
unique.
Think about our own decimal number system, or the hexadecimal system.
For the decimal system, you would construct all unique 4-digit numbers like
so:
n4 * 1000 + n3 * 100 * n2 * 10 + n3 * 1
This will - in our decimal system - always generate unique numbers. If this
were not the case, then our number system would probably be more confusing
(although we'd be used to it), but also less efficient. So if you take for
instance:
n4 = 5
n3 = 2
n2 = 0
n1 = 8
You would get:
5208
This probably makes common sense to you, but let me explain how I got those
number 1000 to 1.
The least significant digit (LSD) will not be altered, so we multiply by 1
(and, as a side-note, 1 happens to be any number to the power of 0). The
LSD has 10 combinations, so if I add another number, it will increase the
possibilities by 10. That is where the 10 comes from which I multiply by
n2. They are the possible permutations with 1 digit.
Now I have two digits, n1 and n2, and they each have 10 possibilities, so
the total number of possibilities with those two digits, is 100. This gives
me the number I need to multiply n3 with.
Now you can imagine that those digits did not form one number, but two
numbers, each with 1 digit. There are still 100 possibilities ( 0 0 -> 9
9) and so if I want to represent them in a single number, that number has to
have at least 100 possible values (0 -> 99). Now the number don't have one
digit, but instead a range (in this case 0 - 9). And now this range
changes, not going up to 9, but going up to 79. This gives you 80
possibilities with one number.
Now, the first number has 80 possibilities, so that means that the next
number has to be multiplied by 80. The number after that, has 79
possibilities. You can see this as if the number system changes depending
on the digit before it. So first you have numbers 0-79, then, if the first
digit is 24, you have 0-23,25-79 . It may not be much of a number system,
but it still only has 79 combinations. This is why it works. There are
only 6'162 numbers ending with 80, not 6'400.
Note that there still is quite some wastage: There are 80!/(77! * 3!)
combinations. 80!/77! takes you to 492'960. Deviding by 3! (6) takes you
back to 82'160. As it turns out this only requires 2 bits less (just a
little over 16 unfortunately, 16.3 versus 18.9 for the actual hash number).
So although you create a number large enough to hold all the values, you
also create space for numbers that will never be.
Hope it made it any clearer :)
--
Martijn
http://www.sereneconcepts.nl
.
- Follow-Ups:
- Re: How do I hash this?
- From: Antony Scriven
- Re: How do I hash this?
- References:
- How do I hash this?
- From: Whybother
- Re: How do I hash this?
- From: Oliver Wong
- Re: How do I hash this?
- From: Whybother
- How do I hash this?
- Prev by Date: Re: Command Line Interface (CLI): your recommendations
- Next by Date: Re: Command Line Interface (CLI): your recommendations
- Previous by thread: Re: How do I hash this?
- Next by thread: Re: How do I hash this?
- Index(es):
Relevant Pages
|