Re: count of each word occurred
From: Karl Heinz Buchegger (kbuchegg_at_gascad.at)
Date: 06/18/04
- Next message: David White: "Re: How to use Arrays that are not Evil"
- Previous message: Karl Heinz Buchegger: "Re: count of each word occurred"
- In reply to: Edo: "Re: count of each word occurred"
- Next in thread: Edo: "Re: count of each word occurred"
- Reply: Edo: "Re: count of each word occurred"
Date: Fri, 18 Jun 2004 13:09:05 +0200
Edo wrote:
Sorry, I accidentally hit "send".
Please continue reading where I left off in
the last post.
>
>
> thanks, that is a great help, I did the logic part
> but need help with
> C++ part
>
> // take each work and compare it with
> // each word in the list
> int count = 1;
> int idx = 0;
> for (int i=0; i < words.size(); ++i) {
>   for (int j=i+1; j < words.size(); ++j) {
>     if (words[i] == words[j]){
>       count++;
>     }
>   }
>   word_count[idx].word.push_back(words[i]);
>   word_count[idx].count.push_back(count);
>   count = 1;
>   idx++;
> }
>
As to your errors.
> 4_5.cpp:28: warning: comparison between signed and unsigned integer
> expressions
What is the return type of size()?
What are you comparing it with? What is its type?
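A minimal sketch of one way to get rid of this warning, assuming words is a
vector<string> (its definition is not shown in the thread). size() returns an
unsigned type, vector<string>::size_type, so the loop index can be declared
with that same type. The helper name occurrences_from is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: counts how often words[i] occurs at position i or later.
// The index type matches the (unsigned) type returned by size(), so the
// comparison j < words.size() no longer mixes signed and unsigned values.
int occurrences_from(const std::vector<std::string>& words,
                     std::vector<std::string>::size_type i)
{
    int count = 0;
    for (std::vector<std::string>::size_type j = i; j < words.size(); ++j) {
        if (words[j] == words[i]) {
            ++count;
        }
    }
    return count;
}
```

std::size_t would usually work as the index type as well; size_type is just
the name the container itself provides.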
> 4_5.cpp:29: warning: comparison between signed and unsigned integer
> expressions
> 4_5.cpp:34: error: syntax error before `[' token
It seems that you have applied [ to something that is not an array
or vector.
But without seeing the definitions of word_count and words, it is
impossible to tell exactly.
If word_count is the same as in your original post, then
word_count is a *data type*:
  struct word_count {
    vector<string> word;
    vector<int> count;
  };
defines what a structure looks like, but it is not a variable. You need
to create a variable to work with it. In the same way that you cannot write
  int[5] = 8;
but need to define a variable for that
  int a[10];
  a[5] = 8;
you cannot write
  word_count[5].word.....
You need a variable for that
  word_count TheWords;
  TheWords.word.push_back( .... );
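To make the type-versus-variable distinction concrete, here is a small sketch
that compiles, assuming the struct from the original post. The function
make_example and the values pushed into it are only illustrations.

```cpp
#include <string>
#include <vector>

// The structure from the original post: this only describes a type.
struct word_count {
    std::vector<std::string> word;
    std::vector<int> count;
};

// Hypothetical helper: creates a variable of that type and fills it.
word_count make_example()
{
    word_count TheWords;             // a variable of type word_count
    TheWords.word.push_back("In");   // the members are reached through the variable
    TheWords.count.push_back(1);
    return TheWords;
}
```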
BTW: Your design seems to be flawed (which brings me back to:
do the logic part first).
What you want is a structure which bundles a *single* word with
how often that word occurred:
  struct WordEntry {
    string word;
    int count;
  };
You then use this new data type to build a vector from it:
  vector< WordEntry > TheWords;
and use it
  WordEntry NewWord;
  NewWord.word = "In";
  NewWord.count = 1;
  TheWords.push_back( NewWord );
How did I come up with this?
Well. If you are anything like me, you would do the whole thing
with paper and pencil as follows:
Have 2 tables. One contains the original words, the other contains
the unique words, each paired with a count of how often it has occurred:
  table 1        table 2
  *********      ***********
  In
  the
  beginning
  the
  earth
  was
  void
  and
  dark
Now start at the first word. It is "In". I then would try to look it
up in table 2. Hmm, it's not there. Thus I add it to that table and give
it a count of 1.
  table 1        table 2
  *********      ***********
  In             In         1
  the
  beginning
  the
  earth
  was
  void
  and
  dark
Next word: "the".
Looking up table 2 shows that it is not there. Thus I add another entry for
"the" and again give it a count of 1
  table 1        table 2
  *********      ***********
  In             In         1
  the            the        1
  beginning
  the
  earth
  was
  void
  and
  dark
Next word: "beginning"
Same thing: not in table 2, thus add it
  table 1        table 2
  *********      ***********
  In             In         1
  the            the        1
  beginning      beginning  1
  the
  earth
  was
  void
  and
  dark
Next word: "the"
Looking it up in table 2 reveals that it is already there,
thus I simply increment the count by 1
  table 1        table 2
  *********      ***********
  In             In         1
  the            the        2
  beginning      beginning  1
  the
  earth
  was
  void
  and
  dark
Next word: "earth"
....
and so on and so on.
Stepping back and analyzing what I have done:
  for all words in table 1 {
    find word in table 2
    if not found then
      create new entry and give it a count of 1
    else
      increment counter at found position
  }
Refining this brings us closer to a C++ program, but first you need to
give some thought to how table 2 should be organized:
The important thing in table 2 is the connection between the word
and the counter. Those 2 things belong together as far as table 2
is concerned. In the paper-and-pencil exercise, this relationship
was emphasized by the fact that I wrote both items on the same line,
while 2 different lines really have nothing in common; there is no relationship
between 2 lines other than that both happen to be in the same table.
That's why I built the structure to group those 2 items: a single word
and a single counter.
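
Putting the pseudocode and the structure together could look roughly like
the sketch below. It is only one possible refinement, not the finished
program: the function names find_word and count_words are made up for
illustration, and the lookup is a plain linear search, just like scanning
table 2 by eye.

```cpp
#include <string>
#include <vector>

// One line of "table 2": a single word and a single counter.
struct WordEntry {
    std::string word;
    int count;
};

// Hypothetical helper: returns the position of word in table,
// or table.size() if the word is not in the table yet.
std::vector<WordEntry>::size_type
find_word(const std::vector<WordEntry>& table, const std::string& word)
{
    std::vector<WordEntry>::size_type i = 0;
    while (i < table.size() && table[i].word != word) {
        ++i;
    }
    return i;
}

// Hypothetical helper: builds table 2 from table 1 (the list of all words).
std::vector<WordEntry> count_words(const std::vector<std::string>& words)
{
    std::vector<WordEntry> table;
    for (std::vector<std::string>::size_type i = 0; i < words.size(); ++i) {
        std::vector<WordEntry>::size_type pos = find_word(table, words[i]);
        if (pos == table.size()) {
            // not found: create a new entry and give it a count of 1
            WordEntry entry;
            entry.word = words[i];
            entry.count = 1;
            table.push_back(entry);
        } else {
            // found: increment the counter at the found position
            ++table[pos].count;
        }
    }
    return table;
}
```

Called with the words In, the, beginning, the, this yields three entries,
with "the" carrying a count of 2, which matches the paper tables above.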
-- Karl Heinz Buchegger kbuchegg@gascad.at