Re: Hash



From: Jon Harrop <j...@xxxxxxxxxxxxxxxxx>
Hash functions (in my experience) typically terminate after a
finite number of atomic operations, so the complexity is not
related to the size of the key but, rather, just to the bounded
length of this computation, i.e. O(m).

You have to be very careful about such hash functions. For example,
suppose m=8, and suppose you've collected a database containing all
Jeopardy (TV show) clues and corresponding responses, and you want
to build a hash table that maps responses to clues. For example:
Category: Astronomy
Clue: It has four large moons which were discovered by Galileo.
Response: What is Jupiter?
Nearly all such responses would have the first 8 characters
identical (a few would start with "Who is " instead), so would hash
to the same bucket.

You'd have a similar problem hashing names of organic chemicals,
many of which start identically:
1,3 methyl 2 hydroxy toluene
1,3 methyl 2 hydroxy benzene
1,3 methyl 2 hydroxy glucose
1,3 methyl 2 hydroxy whatever
Deeper match, so even hashing first 21 characters gives
worst-possible hash within each such group, although not as many
collisions in each group because groups are smaller to begin with.

One idea to avoid being bitten by such prefix-collisions, yet still
not have to hash the whole key if it's rather long, is to fix the
total number of characters included in the hash computation (for
example always include 8 characters, except for strings shorter
than 8 characters), but to distribute them per uniform fraction
down the string:
1,3 methyl 2 hydroxy toluene
1,3 methyl 2 hydroxy benzene
1,3 methyl 2 hydroxy glucose
* * * * * * * *
= = = = = = / / (1-2 chars vary among members of the small group)
1,3 methyl iso-mumbleic acid
= = = / / / / / (5 chars different from sister group)
1,3 methyl 2 hydroxy very long name here
* * * * * * * *
/ / = / / / / / (totally different hash from above,
except if you're just adding the characters together rather than
computing CRC then 1 character matches so only 7 different from 1st g.)

By the way, I thought of this new algorithm just now in reading
this thread. I've never see it mentionned before. Is this a new
invention of mine, or did somebody else think of it before me?
.



Relevant Pages

  • Re: Base36
    ... static string tokens = ... But - I don't think you want all those silly characters in the product key. ... I should be able to recalc the hash at the client ... > conversion to long so I can pass each long to the BaseXX converter to get ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: How to omit blank spaces in the text?
    ... Set adoPrimaryRS = New Recordset ... you're best to read the characters one by one and ... When the password is first created you calculate the hash and store ... then it is almost certain the entered password is correct. ...
    (microsoft.public.vb.general.discussion)
  • Re: How to omit blank spaces in the text?
    ... Set adoPrimaryRS = New Recordset ... character set from &H21 to &H7E provides for ASCI alpha numeric characters ... When the password is first created you calculate the hash and store that, ... then it is almost certain the entered password is correct. ...
    (microsoft.public.vb.general.discussion)
  • Re: How to omit blank spaces in the text?
    ... Private Sub Command1_Click ... Dim ssql As String, ssql2 As String, ssql3 As String, trimname As String ... character set from &H21 to &H7E provides for ASCI alpha numeric characters ... >> When the password is first created you calculate the hash and store>> that, ...
    (microsoft.public.vb.general.discussion)
  • Password Strength III (?)
    ... And then talk about why the NT hash ... > We already know that you only have to go through 7 characters ... > 8byteDeskey1 is used to encrypt the challenge ... > it's end and compares the result to the 24 byte response. ...
    (Security-Basics)