Re: count of each word occurred

From: Edo (edwardoJE_at_aking.com)
Date: 06/22/04

• Next message: Edo: "Re: do thing with the inputs"

```Date: Mon, 21 Jun 2004 15:32:05 -0700

```

B. v Ingen Schenau wrote:
> After you have sorted the list, why would you compare the current word with
> *all* the rest of them?

Ok, I see the problem, even thought I fixed the problem in this
following algorithm, I think there would be yet better ways.

struct WordEntry {
string word;
int count;
};
vector<WordEntry> WordCount;
WordEntry wordcount;

int main()
{
vector<string> words; //to store list of words from input
typedef vector<WordEntry>::size_type vec_size;

sort(words.begin(), words.end());

int count = 1;
vec_size j;
vec_size i = 0;
for (j=i+1; j < words.size(); ++j) {
if (words[i] == words[j]){
count++; //
++i; // jumps to the next unique word
} else {
wordcount.word = words[i++];
wordcount.count = count;
WordCount.push_back( wordcount );
count = 1;
}
}
wordcount.word = words[i++];
wordcount.count = count;
WordCount.push_back( wordcount );
count = 1;

}
the above works with no looping waste. but this repeat does not look
good, maybe I will make a function call to look nicer.

wordcount.word = words[i++];
wordcount.count = count;
WordCount.push_back( wordcount );

>
> words: +---------+ Wordcount .word: +---------+ .count: +---+
> (size=9) |and | (size=5) |and | (size=5) | 1 |
> +---------+ +---------+ +---+
> |beginning| |beginning| | 1 |
> +---------+ +---------+ +---+
> |dark | |dark | | 1 |
> +---------+ +---------+ +---+
> |earth | |earth | | 1 |
> +---------+ +---------+ +---+
> |In | |In | | 1 |
> +---------+ +---------+ +---+
> |the |
> +---------+
> |the |
> +---------+
> |void |
> +---------+
> |was |
> +---------+
>
> count: +---+ i: +---+ j: +---+
> | 1 | | 5 | | 6 |
> +---+ +---+ +---+
>

BTW; what tool did you use to explain the problem in your last post so
elegantly?
it would have been very helpful if I could do that on my own to avoid
future code with unnecessary looping.

I think Kerl’s way will require 2 tables to loop through, where as the
above loop requires only one pass through the data “after sorting-not
sure if it is expensive”.

• Next message: Edo: "Re: do thing with the inputs"