Re: Hash-table versus B-Tree
From: CBFalconer (cbfalconer_at_yahoo.com)
Date: 12/07/04
- Next message: TestDriven.NET: "TestDriven.NET 1.0 - Zero Friction Unit Testing for Visual Studio .NET"
- Previous message: Gerry Quinn: "Re: Graph problem"
- In reply to: kar1107_at_gmail.com: "Re: Hash-table versus B-Tree"
- Next in thread: Alex Fraser: "Re: Hash-table versus B-Tree"
- Reply: Alex Fraser: "Re: Hash-table versus B-Tree"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Tue, 07 Dec 2004 13:00:47 GMT
"kar1107@gmail.com" wrote:
> KP Bhat wrote:
>
> I assume by B-Tree we also include balanced trees like AVL.
> Hash collision is handled thru a linked list.
>
>> I am trying to compare and contrast the use of Hash-table and
>> B-Trees as indexing mechanisms for record-based files. Here's
>> what I have got so far. Kindly suggest additions/modifications.
>>
>> Hash-table
>> 1. Simple indexing scheme
>> 2. Insert and delete operations very efficient
>
> Delete is not efficient. A potentially long linked list need to
> be traversed in case of collisions.
Not necessarily so. See hashlib, reference later. No linked list
is involved, and the operation is O(1).
>
>> 3. Retrieve operation very efficient
>> 4. Need to provide a hashing function that hashes record keys
>> into some number
This is usually a trivial operation. The trick is to supply a hash
function suitable for the data involved, and good systems can be
quite forgiving. The penalty for a bad function is poorer
performance.
>> 5. Supporting GetNext and GetBulk operations quite expensive
Depends on your definition of these. I have no idea what you have
in mind.
>> 6. Does not scale that well as the size of the data increases
Scales just fine until the data base no longer fits in virtual
memory. Then the B-tree is far superior.
>
> 7. Collision handling can lead to poor searches ("retrieve
> operation"). The complexity is O(n) for search and delete.
Not so. Again, see the hashlib reference at end. Search and
delete times are O(1).
>
>>
>> B-Tree
>> 1. Complex indexing scheme
>> 2. Insert and delete operations slightly expensive due to need
>> to retain the B-Tree property (i.e. splitting/joining nodes)
>> 3. Retrieve operation very efficient
>> 4. No need to provide a hashing function
>> 5. Supporting the GetNext and GetBulk operations very efficient
>> 6. Scales extremely well as the size of the data increases
>
> 7. search, insert, delete always O(log n) (thus supports #6 above)
>
> Once a solid library is implemented for the tree, its good to go
> with tree. You never know you may have to handle large data in
> future.
You can conveniently examine the characteristics of hash systems
via hashlib, which is available under GPL at:
<http://cbfalconer.home.att.net/download/hashlib.zip>
-- Chuck F (cbfalconer@yahoo.com) (cbfalconer@worldnet.att.net) Available for consulting/temporary embedded and systems. <http://cbfalconer.home.att.net> USE worldnet address!
- Next message: TestDriven.NET: "TestDriven.NET 1.0 - Zero Friction Unit Testing for Visual Studio .NET"
- Previous message: Gerry Quinn: "Re: Graph problem"
- In reply to: kar1107_at_gmail.com: "Re: Hash-table versus B-Tree"
- Next in thread: Alex Fraser: "Re: Hash-table versus B-Tree"
- Reply: Alex Fraser: "Re: Hash-table versus B-Tree"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|