faster data structure to count word frequency?
From: Kevin (kaidizhao_at_yahoo.com.sg)
Date: 03/19/05
- Next message: Kevin: "Re: stuck in regex"
- Previous message: HK: "Re: stuck in regex"
- Next in thread: HK: "Re: faster data structure to count word frequency?"
- Reply: HK: "Re: faster data structure to count word frequency?"
- Reply: Boudewijn Dijkstra: "Re: faster data structure to count word frequency?"
- Reply: Chris Uppal: "Re: faster data structure to count word frequency?"
Date: 19 Mar 2005 13:58:59 -0800
Hello there,
I am just wondering whether there is a faster way to do this.
Suppose we are required to count the frequency of each distinct word
in a large text document. The way I am currently doing it is like this:
//////////////////
Hashtable ht = new Hashtable();
// Read each word in the text in order; for each word (oneWord):
if (ht.containsKey(oneWord))  // contains() would test the values, not the keys
{
    Integer I = (Integer) ht.get(oneWord);
    ht.put(oneWord, new Integer(I.intValue() + 1));
}
else
{
    ht.put(oneWord, new Integer(1));
}
//////////////////
Since I is an Object, we cannot apply I++ to it, so I think we have
to create a new Integer each time we update a count.
Is there a faster way to do this? Or is there another data
structure we could use for this purpose?
Thanks a lot and happy weekend!
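One possible direction, sketched here only as an illustration and not as a measured answer: store a small mutable counter object as the map value, so that an increment changes the counter in place and no new Integer is created per word. The names WordFrequency, Counter and count are hypothetical, splitting on whitespace is an assumption about what a "word" is, and HashMap is swapped in for Hashtable because it is unsynchronized. Whether this is actually faster for a given document would have to be measured.

```java
import java.util.HashMap;
import java.util.Map;

class WordFrequency {
    // Hypothetical helper: a mutable counter, so an increment needs no new object.
    static final class Counter {
        int value = 1; // a Counter is only created on a word's first occurrence
    }

    // Counts whitespace-separated words (the splitting rule is an assumption).
    static Map<String, Counter> count(String text) {
        Map<String, Counter> counts = new HashMap<String, Counter>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields one empty token; skip it
            }
            Counter c = counts.get(word); // one lookup for words already seen
            if (c == null) {
                counts.put(word, new Counter()); // first occurrence, count is 1
            } else {
                c.value++; // in-place increment, no allocation and no put()
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Counter> counts = count("the cat and the hat and the bat");
        for (Map.Entry<String, Counter> e : counts.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue().value);
        }
    }
}
```

With this layout a word that has already been seen costs a single get() and an int increment, instead of a key test, a get(), an allocation and a put().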