Re: Hashing

From: Chewy509 (chewy509.doesnt.like.spam_at_austarnet.com.au)
Date: 03/03/05


Date: Thu, 3 Mar 2005 20:36:51 +1000

Hutch,

Thanks for the insight, I'll look more into other implementations and how
they handle collisions.

Thinking a bit more about it, I can encode 32 characters into 192 bits (each
character being 6bits, and packed). This would ensure zero collisions...
However, I'm a bit concerned that matching hashes (of 192bits in size) might
get a bit expensive as the number of hashes increases... (Or I could reduce
the maximum symbol length to say 16 characters? which would produce a hash
of only 96bits).

Since speed at this time isn't important, I was thinking a single hash table
of 8,000 entries, and just use a simple list for the table. (I was thinking
since the hash algo and table would be limited to only a few areas, updating
it in the future won't be too difficult). Or am I under-estimating the
importance of the hashing algo and hash table structure within a compiler?

* Since symbols can only contain 'a..z','A..Z','0..9','_', I'm only dealing
with 63 characters, which will easily fit into 6bits. Since max symbol
length is 32 characters, 32 x 6 bits = 192bits.

-- 
Darran (aka Chewy509)... 


Relevant Pages

  • Re: who wants a crack at this
    ... (and IIRC my Javascript) all ... non-alphabetic characters are effectively converted to 'A'. ... word "hash". ... are interested only in the probability of accidental collisions, ...
    (sci.crypt)
  • Re: Hash table
    ... > value for any string of 12 characters. ... For 12 characters to have a non-colliding hash, ... collisions - there are too many strings and not enough possible hash ...
    (microsoft.public.dotnet.languages.vb)
  • Re: Hash table
    ... > value for any string of 12 characters. ... For 12 characters to have a non-colliding hash, ... collisions - there are too many strings and not enough possible hash ...
    (microsoft.public.dotnet.general)
  • Re: When will md5crk complete?
    ... and in that case birthday attack ... > His core message is correct however: you shouldn't be using MD5. ... Collisions DO exist for every hash algorithm... ...
    (sci.crypt)
  • Re: Base36
    ... static string tokens = ... But - I don't think you want all those silly characters in the product key. ... I should be able to recalc the hash at the client ... > conversion to long so I can pass each long to the BaseXX converter to get ...
    (microsoft.public.dotnet.languages.csharp)