Re: Extremely large Hashtable

anthony@xxxxxxxxxxxx wrote:
> Has anyone created an extremely large Hashtable? I need to create a
> simple key/value lookup table of some information.
>
> I'm assuming that if it is in memory it will be faster than looking
> this value up in a database.
>
> The problem is I have about 4,000,000 rows of data. Assuming I have
> enough memory to hold it, will a hashtable break being that large? Will
> it be fast? Each row will only be about 10k in size.

Four megarows times ten kilobytes per row equals forty
gigabytes. Are you sure you have that much RAM available?
If you don't, you'll just trade database I/O for swap I/O.
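
As a rough sketch of that arithmetic in Java (the class and method
names are made up for illustration, and 10 KB is taken as 10,000
bytes): note that Runtime.maxMemory() reports the heap limit the JVM
was started with, not the physical RAM in the machine, so passing
this check is necessary but not sufficient to stay out of swap.

```java
// Hypothetical sizing helper; the figures in main() come from the question.
class RowMemoryEstimate {
    // Total payload in bytes, ignoring per-object JVM overhead.
    static long payloadBytes(long rows, long bytesPerRow) {
        return Math.multiplyExact(rows, bytesPerRow);
    }

    // True if the payload alone is smaller than the maximum heap
    // this JVM is allowed to use (the -Xmx limit).
    static boolean fitsInHeap(long payloadBytes) {
        return payloadBytes < Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long bytes = payloadBytes(4_000_000L, 10_000L);
        System.out.println("payload GB: " + bytes / 1_000_000_000L);
        System.out.println("fits in max heap: " + fitsInHeap(bytes));
    }
}
```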

Where does this 40 GB come from? If you can read it at
100 MB/s (from a well-tuned RAID stripe, say), it'll take
about seven minutes to pull in all the data and load the
table. Divide those seven minutes by the expected savings
per query to get the number of queries you must process in
order to recoup the startup time.
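
The break-even estimate can be sketched like this. The 5 ms saved
per query is purely an assumed number for illustration (you would
have to measure your own), and the class and method names are
hypothetical. Integer milliseconds are used to keep the arithmetic
exact.

```java
// Hypothetical break-even calculator for preloading a lookup table.
class BreakEvenSketch {
    // Milliseconds needed to read totalBytes at a sustained bytesPerSecond.
    static long loadMillis(long totalBytes, long bytesPerSecond) {
        return Math.multiplyExact(totalBytes, 1000L) / bytesPerSecond;
    }

    // Number of queries after which the preload has paid for itself
    // (load time divided by savings per query, rounded up).
    static long breakEvenQueries(long loadMillis, long savedMillisPerQuery) {
        return (loadMillis + savedMillisPerQuery - 1) / savedMillisPerQuery;
    }

    public static void main(String[] args) {
        long load = loadMillis(40_000_000_000L, 100_000_000L); // 40 GB at 100 MB/s
        long savedPerQuery = 5L; // ASSUMED: 5 ms saved per query versus the database
        System.out.println("load time (ms): " + load);
        System.out.println("break-even queries: " + breakEvenQueries(load, savedPerQuery));
    }
}
```

With those assumed figures the load takes 400 seconds and the table
starts paying off after 80,000 queries; a smaller saving per query
pushes the break-even point out proportionally.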

Note that the 10 KB row size is irrelevant to the table's
performance (unless it means that the keys' equals() and
hashCode() methods are slow). The table contains only the
references to the objects (a few hundred MB more, at a guess),
not the objects' data fields.
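
A small sketch of that point, assuming Integer keys and byte[] rows
(both are my choices for illustration, not anything from the
question): the table hands back the very same object that was put
in, so the row's 10 KB is never copied into the table, and a lookup
only exercises the key's hashCode() and equals().

```java
import java.util.Hashtable;

// Hypothetical demo: a Hashtable holds references, not copies of the rows.
class ReferenceTableSketch {
    // Stores the row under a key and reports whether the lookup
    // returns the identical object (reference equality).
    static boolean storesReferenceOnly(byte[] row) {
        Hashtable<Integer, byte[]> table = new Hashtable<>();
        table.put(42, row);
        return table.get(42) == row;
    }

    public static void main(String[] args) {
        byte[] row = new byte[10_000]; // one 10 KB row
        System.out.println("same object returned: " + storesReferenceOnly(row));
    }
}
```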

The above are just a few miscellaneous thoughts -- maybe
if you described your problem in more detail someone would be
able to comment more particularly.

--
Eric.Sosman@xxxxxxx
