Re: Extreamly large Hashtable
- From: Eric Sosman <eric.sosman@xxxxxxx>
- Date: Mon, 09 May 2005 11:48:43 -0400
Boudewijn Dijkstra wrote:
> "Eric Sosman" <eric.sosman@xxxxxxx> schreef in bericht
> news:d5g9sf$9sn$1@xxxxxxxxxxxxxxxxxxxxxxxxxxx
>
>>anthony@xxxxxxxxxxxx wrote:
>>
>>>Has anyone created an extreamly large Hashtable? I need to create a
>>>simple look up table key/value of some information.
>>>
>>>I'm assuming that if it is in memory it will be faster then looking
>>>this value up in a database.
>>>
>>>The problem is I have about 4,000,000 rows of data. Assuming I have
>>>enough memory to hold it, will a hashtable break being that large? Will
>>>it be fast? Each row will only be about 10k in size.
>>
>> Four megarows times ten kilobytes per row equals forty
>>gigabytes. Are you sure you have that much RAM available?
>>If you don't, you'll just trade database I/O for swap I/O.
>
>
> I believe the OP was talking about keeping the hashtable in memory, not the
> entire database.
That's possible; the O.P. has not divulged much about
what he's trying to do. I thought he intended to keep the
whole thing in memory because he specifically mentioned the
row size; if he were intending just to store the keys and
a disk pointer he'd have talked about the key size. Also,
he seemed to have a speed fetish, and speed is incompatible
with I/O ...
>> Note that the 10 KB row size is irrelevant to the table's
>>performance (unless it means that the keys' equals() and
>>hashCode() methods are slow). The table contains only the
>>references to the objects (that's another 7 GB, at a guess),
>>not the objects' data fields.
>
> Four million times 8 bytes (4-byte hashcode, 4-byte index) at a load of 75% is
> still less than 40 MB. Where is your guess based on?
I may have guessed wrong; let me try again. The Hashtable
itself contains four million references to Map.Entry objects;
in a 64-bit JVM (on the load-all-the-data assumption) they'll
take eight bytes apiece for 32 MB.
Each Map.Entry object holds references to a key and a value
(sixteen bytes) plus some amount of overhead to express its own
"objectivity" -- I don't know how much that is, but let's assume
it's something like sixteen bytes. That's thirty-two bytes for
each of the four million Map.Entry objects, or 128 MB.
Grand total for metadata: about 160 MB, "just a trifle" less
than my original guess. Even if the Map.Entry overhead is more
like thirty-two bytes and even if the Hashtable itself stores
more than just the Map.Entry reference, the total won't come
close to my 7 GB blunder ... (Memo to self: Stop trying to do
back-of-the-envelope calculations on envelope-less E-mail ;-)
--
Eric.Sosman@xxxxxxx
.
- References:
- Extreamly large Hashtable
- From: anthony
- Re: Extreamly large Hashtable
- From: Eric Sosman
- Re: Extreamly large Hashtable
- From: Boudewijn Dijkstra
- Extreamly large Hashtable
- Prev by Date: Standard Dicom images in java
- Next by Date: Re: Non-blocking method for reading writing objects to sockets
- Previous by thread: Re: Extreamly large Hashtable
- Next by thread: Swing Model Classes Updating Swing Components on a Thread Other Than AWT
- Index(es):
Relevant Pages
|
|