Re: Counting words in a file



Phil Carmody wrote:
Terje Mathisen <spamtrap@xxxxxxxxxx> writes:

Phil Carmody wrote:

spamtrap@xxxxxxxxxx writes:

Hi everyone,
Sorry for this ad-hoc question, but is there a simple way to count
words in a text file?

....

Note, however, that your question is ill-formed - what you want to
do is entirely dependent on the OS that you are using.

The OS is irrelevant.


I don't believe even the great Terje Mathisen can get data from a text file without calling an underlying OS.

incbin "test.txt"

There ya go! Seriously, that's probably not what Paul had in mind. First we've got to open the file. No, *first* we have to discover the filename. "Kids these days" are going to want a browse-box... but probably we can get by with giving the filename on the command-line.

I suspect the code for opening, reading, and buffering the data would be more lengthy than the code for actually doing the word counting. It's a 2-state state machine, and only one transition causes anything to actually be done, after all.
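
For anyone who wants to see that 2-state machine spelled out, here is a minimal sketch in C rather than asm. The names (count_words, is_delim) and the choice of delimiters (space, tab, CR, LF) are assumptions made for illustration, not anything specified in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: only space, tab, CR and LF separate words. */
static int is_delim(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

enum state { OUTSIDE_WORD, INSIDE_WORD };

/* Count words in a memory buffer with a two-state machine. */
size_t count_words(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    enum state st = OUTSIDE_WORD;
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        if (is_delim((unsigned char)buf[i])) {
            st = OUTSIDE_WORD;          /* nothing else to do */
        } else if (st == OUTSIDE_WORD) {
            st = INSIDE_WORD;
            count++;                    /* the one transition that does work */
        }
        /* INSIDE_WORD followed by a word character: no action at all */
    }
    return count;
}
```

The counter only moves on the OUTSIDE_WORD to INSIDE_WORD transition, so a word that runs to the end of the buffer is still counted without any special case.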

Sure. The actual counting is probably just "inc [count]"* (of some size). But as Terje says, it's the "interesting" part. Then... the pesky user is probably going to expect us to display the results! More code where we're going to have to call the boring OS. And we're going to have to exit! Paul says, "I'm having some difficulty in understanding the way that asm deals with file pointers etc.", so this may be the part he needs help with. (In which case, we're going to want to know "what OS?")
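
To give a feel for how much of the job is the "boring" part, here is a rough sketch in portable C, with the C library standing in for the OS calls. The function names (count_stream, count_file), the 4096-byte buffer, and the whitespace-only delimiters are all assumptions for illustration; an asm version would replace fopen/fread/printf with whatever the target OS provides.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption for this sketch: space, tab, CR and LF separate words. */
static int is_delim(int c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Count words in an already-open stream, reading fixed-size chunks. */
unsigned long count_stream(FILE *fp)
{
    assert(fp != NULL);

    unsigned char buf[4096];
    unsigned long count = 0;
    int in_word = 0;    /* kept across chunks so a split word counts once */
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++) {
            if (is_delim(buf[i])) {
                in_word = 0;
            } else if (!in_word) {
                in_word = 1;
                count++;
            }
        }
    }
    return count;
}

/* Open the named file, count, print the result.
   Returns a value suitable for use as the process exit status. */
int count_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) {
        perror(path);
        return 1;
    }

    unsigned long words = count_stream(fp);
    int failed = ferror(fp);
    fclose(fp);

    if (failed) {
        fprintf(stderr, "%s: read error\n", path);
        return 1;
    }
    printf("%lu\n", words);
    return 0;
}
```

A main() that takes the filename from the command line would only need to check argc and return count_file(argv[1]). Note that the counting loop is a handful of lines, while the open, read, error-check, print, and exit plumbing makes up most of the listing.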

As to the "interesting part", Robert asks, "jmp table? Other LUT? or sequential tests?" I think I'd vote for "Other LUT" - I think I'd mark "in a word" positions with "1", and "delimiters and other junk" with "0"s... haven't fully thought that out, yet. But "sequential tests" might be "easier", depending on Paul's experience level. Rather than testing for explicit delimiters, "in a word" characters could be checked by ranges - '0'..'9', 'A'..'Z', 'a'..'z', plus apostrophe and underscore, I guess. Any others? I like Terje's idea of user-specified delimiters, but it might be overkill for this... homework assignment(?).
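
Here is one way the "Other LUT" idea might look, sketched in C: a 256-entry table holding 1 for "in a word" characters and 0 for everything else, filled from the ranges listed above plus apostrophe and underscore. The names (in_word, build_table, count_words_lut) are made up for this example, and adding a 0/1 value to the counter instead of branching is just one possible way to use the table.

```c
#include <assert.h>
#include <stddef.h>

/* 1 = "in a word" character, 0 = delimiter or other junk. */
static unsigned char in_word[256];

/* Fill the table from the ranges '0'..'9', 'A'..'Z', 'a'..'z',
   plus apostrophe and underscore. Assumes an ASCII-compatible charset. */
void build_table(void)
{
    for (int c = 0; c < 256; c++)
        in_word[c] = 0;
    for (int c = '0'; c <= '9'; c++)
        in_word[c] = 1;
    for (int c = 'A'; c <= 'Z'; c++)
        in_word[c] = 1;
    for (int c = 'a'; c <= 'z'; c++)
        in_word[c] = 1;
    in_word['\''] = 1;
    in_word['_'] = 1;
}

/* Count word starts: positions where the table says 1 and the previous
   position said 0. The 0/1 result is added directly, with no branch. */
size_t count_words_lut(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t count = 0;
    unsigned prev = 0;              /* start of buffer acts like a delimiter */

    for (size_t i = 0; i < len; i++) {
        unsigned cur = in_word[buf[i]];
        count += cur & (prev ^ 1u); /* 1 only on a 0 -> 1 transition */
        prev = cur;
    }
    return count;
}
```

build_table() has to run once before the first call. With this table, "don't" and "stop_now" each count as a single word, while a comma or hyphen splits words. User-specified delimiters would only change how the table is filled, not the counting loop.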

So, Paul... tell us more about what you're trying to do (and what OS), and what parts you *can* do. You'll want different answers if you know how to do most of it, and want to make it go fast, than if you don't know how to open and read (mmap?) a file...

Best,
Frank

* or probably "adc [count], 0" if we can arrange that...
