Re: count of each word occurred

From: Edo (edwardoJE_at_aking.com)
Date: 06/22/04

  • Next message: Edo: "Re: do thing with the inputs"
    Date: Mon, 21 Jun 2004 15:32:05 -0700
    
    

    B. v Ingen Schenau wrote:
    > After you have sorted the list, why would you compare the current word with
    > *all* the rest of them?

    Ok, I see the problem, even thought I fixed the problem in this
    following algorithm, I think there would be yet better ways.

            struct WordEntry {
                    string word;
                    int count;
            };
            vector<WordEntry> WordCount;
            WordEntry wordcount;

    int main()
    {
            vector<string> words; //to store list of words from input
            typedef vector<WordEntry>::size_type vec_size;
            
            sort(words.begin(), words.end());

            int count = 1;
            vec_size j;
            vec_size i = 0;
            for (j=i+1; j < words.size(); ++j) {
                    if (words[i] == words[j]){
                            count++; //
                            ++i; // jumps to the next unique word
                    } else {
                            wordcount.word = words[i++];
                            wordcount.count = count;
                            WordCount.push_back( wordcount );
                            count = 1;
                    }
            }
            wordcount.word = words[i++];
            wordcount.count = count;
            WordCount.push_back( wordcount );
            count = 1;
            
    }
    the above works with no looping waste. but this repeat does not look
    good, maybe I will make a function call to look nicer.

            wordcount.word = words[i++];
            wordcount.count = count;
            WordCount.push_back( wordcount );

    >
    > words: +---------+ Wordcount .word: +---------+ .count: +---+
    > (size=9) |and | (size=5) |and | (size=5) | 1 |
    > +---------+ +---------+ +---+
    > |beginning| |beginning| | 1 |
    > +---------+ +---------+ +---+
    > |dark | |dark | | 1 |
    > +---------+ +---------+ +---+
    > |earth | |earth | | 1 |
    > +---------+ +---------+ +---+
    > |In | |In | | 1 |
    > +---------+ +---------+ +---+
    > |the |
    > +---------+
    > |the |
    > +---------+
    > |void |
    > +---------+
    > |was |
    > +---------+
    >
    > count: +---+ i: +---+ j: +---+
    > | 1 | | 5 | | 6 |
    > +---+ +---+ +---+
    >

    BTW; what tool did you use to explain the problem in your last post so
    elegantly?
    it would have been very helpful if I could do that on my own to avoid
    future code with unnecessary looping.

    I think Kerl’s way will require 2 tables to loop through, where as the
    above loop requires only one pass through the data “after sorting-not
    sure if it is expensive”.


  • Next message: Edo: "Re: do thing with the inputs"