Re: Hash function for int-aligned text (was: accessing char as int through union)
- From: CBFalconer <cbfalconer@xxxxxxxxx>
- Date: Fri, 06 Oct 2006 19:52:07 -0400
Hallvard B Furuseth wrote:
.... snip ...
Thanks to several people for answers and explanations, in particular
websnarf. A few details:
websnarf@xxxxxxxxx writes:
Hallvard B Furuseth wrote:
The keys are words in a dictionary or text file, average size
just 5-6 characters + the padding zeroes. (And most searches
find nothing.)
Oh well, these are extremely short strings.
Actually that was quite wrong; I seem to have counted a test sample
instead, or something, late at night. 5-6 letters is more like it
for some inputs, ~10 for others.
What you're going to want to do is special case them by *length*,
not alignment. Then just completely unroll the hash computation
for all the short cases (say up to 16 characters or so) and bail
out with a more general hasher (like mine, or Bob Jenkins').
Aah, of course.
Of course the overhead of splitting off the long strings, and thus
of calling strlen on them, will probably overshadow any gains,
especially when the expected lengths are short. You can't really
say anything without making measurements on the actual data.
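
For concreteness, here is a rough sketch of the length-dispatch idea discussed above. It is not websnarf's or Bob Jenkins' hasher: 32-bit FNV-1a is swapped in only because it is short, the names hash_word and hash_general are made up for this illustration, and the unrolling stops at 8 characters rather than 16 to keep the listing small. Note that the strlen call is exactly the overhead mentioned above, so whether this beats the plain loop would still have to be measured on the real word list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 32-bit FNV-1a stands in for the "more general hasher". */
#define FNV_OFFSET 2166136261u
#define FNV_PRIME  16777619u

/* One FNV-1a step: fold byte i of s into the running hash h. */
#define MIX(i) (h = (h ^ s[(i)]) * FNV_PRIME)

/* General fallback: plain loop over all len bytes. */
static uint32_t hash_general(const unsigned char *s, size_t len)
{
    uint32_t h = FNV_OFFSET;
    for (size_t i = 0; i < len; i++)
        MIX(i);
    return h;
}

/* Dispatch on length: short keys get a fully unrolled computation,
 * anything longer bails out to the general hasher. Both paths
 * produce the same value for the same key. */
static uint32_t hash_word(const char *key)
{
    const unsigned char *s = (const unsigned char *)key;
    size_t len = strlen(key);   /* the cost that has to be measured */
    uint32_t h = FNV_OFFSET;

    switch (len) {
    case 0: break;
    case 1: MIX(0); break;
    case 2: MIX(0); MIX(1); break;
    case 3: MIX(0); MIX(1); MIX(2); break;
    case 4: MIX(0); MIX(1); MIX(2); MIX(3); break;
    case 5: MIX(0); MIX(1); MIX(2); MIX(3); MIX(4); break;
    case 6: MIX(0); MIX(1); MIX(2); MIX(3); MIX(4); MIX(5); break;
    case 7: MIX(0); MIX(1); MIX(2); MIX(3); MIX(4); MIX(5); MIX(6); break;
    case 8: MIX(0); MIX(1); MIX(2); MIX(3); MIX(4); MIX(5); MIX(6);
            MIX(7); break;
    default: h = hash_general(s, len); break;
    }
    return h;
}
```

A lookup would then call something like hash_word("hello") and reduce the result modulo the table size. If the table already stores or knows the key length, passing it in instead of calling strlen would remove most of the extra cost.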
--
Some informative links:
<news:news.announce.newusers>
<http://www.geocities.com/nnqweb/>
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/>