Re: Efficient Text File Copy
From: Keith Bostic (bostic_at_sleepycat.com)
Date: 02/03/04
- Next message: Jack Klein: "Re: C++ error "System.FormatException""
- Previous message: Rick Spinuzzi: "C++ error "System.FormatException""
- In reply to: CBFalconer: "Re: Efficient Text File Copy"
- Next in thread: Chris \( Val \): "Re: Efficient Text File Copy"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 2 Feb 2004 19:00:19 -0800
CBFalconer <cbfalconer@yahoo.com> wrote in message news:<401C3C49.38F12EE7@yahoo.com>...
>>> Why do I recommend Chris Torek's hash? Because Chris says it
>>> works well and I had no reason to disbelieve him; in my
>>> experience, he's right - it does work well.
While the function works well for strings, it doesn't work well
for numbers (for example, indexing on record numbers), in my
experience. Berkeley DB ended up using this:
/*
* Fowler/Noll/Vo hash
*
* The basis of the hash algorithm was taken from an idea sent by
email to the
* IEEE Posix P1003.2 mailing list from Phong Vo
(kpv@research.att.com) and
* Glenn Fowler (gsf@research.att.com). Landon Curt Noll
(chongo@toad.com)
* later improved on their algorithm.
*
* The magic is in the interesting relationship between the special
prime
* 16777619 (2^24 + 403) and 2^32 and 2^8.
*
* This hash produces the fewest collisions of any function that we've
seen so
* far, and works well on both numbers and strings.
*
* PUBLIC: u_int32_t __ham_func5 __P((DB *, const void *, u_int32_t));
*/
u_int32_t
__ham_func5(dbp, key, len)
DB *dbp;
const void *key;
u_int32_t len;
{
const u_int8_t *k, *e;
u_int32_t h;
if (dbp != NULL)
COMPQUIET(dbp, NULL);
k = key;
e = k + len;
for (h = 0; k < e; ++k) {
h *= 16777619;
h ^= *k;
}
return (h);
}
There are a variety of hash functions Berkeley DB has tried
at various times -- the ones we've liked best we've kept:
http://www.opensource.apple.com/darwinsource/10.3/BerkeleyDB-6/db/hash/hash_func.c
Regards,
--keith
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Keith Bostic bostic@sleepycat.com
Sleepycat Software Inc. keithbosticim (ymsgid)
118 Tower Rd. +1-781-259-3139
Lincoln, MA 01773 http://www.sleepycat.com
- Next message: Jack Klein: "Re: C++ error "System.FormatException""
- Previous message: Rick Spinuzzi: "C++ error "System.FormatException""
- In reply to: CBFalconer: "Re: Efficient Text File Copy"
- Next in thread: Chris \( Val \): "Re: Efficient Text File Copy"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|
|