Re: taking a "word" as input



On Tue, 29 Apr 2008 20:53:50 +0530, santosh wrote:


> What about words with other characters, like a hyphen? What about
> constructs like "get_name"? Will you discard them too? What about words
> that end with a ; or ...? What about words that contain symbols like #@
> etc.? Or words that end with an exclamation mark? Or words within
> parentheses or braces?

All of them will be discarded. Only words made up entirely of letters,
like "santosh", will be considered; nothing else.




> That's one way, yes, suitable when you don't know the lengths of words
> in advance, or you don't want to possibly waste storage with statically
> allocated arrays.

Yes, exactly. The input will only be known at run time.


> Depends on your requirements really, and the type and frequency of input
> you expect. Will you put an upper limit on the length of words? It
> hardly makes sense to accept words longer than about 64 characters if
> you are dealing with normal English text.

OK, make the upper limit 64 :) I usually use 100 out of habit.
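
For what it's worth, here is a rough sketch of the kind of reader those two rules imply: letters only, and at most 64 characters. The name get_alpha_word and the MAXWORD macro are made up for illustration, and I am assuming that tokens are separated by whitespace and that an over-long token is discarded rather than truncated.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define MAXWORD 64  /* upper limit discussed above; an assumption, not a rule */

/*
 * Read the next whitespace-delimited token from fp that consists only of
 * letters and is at most MAXWORD characters long. Every other token is
 * skipped. Returns 1 with the word in buf, or 0 at end of input.
 * buf must have room for MAXWORD + 1 chars.
 */
int get_alpha_word(FILE *fp, char *buf)
{
    int c;

    assert(fp != NULL && buf != NULL);

    for (;;) {
        size_t len = 0;
        int ok = 1;

        /* skip leading whitespace */
        while ((c = getc(fp)) != EOF && isspace(c))
            ;
        if (c == EOF)
            return 0;

        /* consume one whole token, noting whether it is still acceptable */
        do {
            if (ok && isalpha(c) && len < MAXWORD)
                buf[len++] = (char)c;
            else
                ok = 0;
        } while ((c = getc(fp)) != EOF && !isspace(c));

        if (ok) {
            buf[len] = '\0';
            return 1;
        }
        /* token rejected: go round again and try the next one */
    }
}
```

Because it takes a FILE *, the same function can be handed stdin or a stream returned by fopen, so one implementation could serve both keyboard and file input.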



> Static 2D arrays are undoubtedly easier to work with but are less
> flexible than dynamically allocated arrays. Since statically allocated
> arrays are of fixed size, it's possible for some elements to remain
> unused and hence wasted. OTOH a large number of small allocations may
> lead to memory fragmentation, some wastage due to malloc bookkeeping,
> and possibly a slowdown if you'll be reading a very large number of
> words from a file. For input from a human it will not matter.
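
To make that trade-off concrete, here is a minimal sketch of both storage styles side by side. MAXWORDS, MAXLEN, fixed_words, store_fixed and copy_word are all names and limits invented for this example; nothing in the thread fixes them.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXWORDS 1000   /* hypothetical cap on the number of words */
#define MAXLEN   64     /* hypothetical cap on word length */

/* Static 2D array: simple to index, but every slot costs MAXLEN + 1 bytes
 * whether or not a word is stored in it. */
static char fixed_words[MAXWORDS][MAXLEN + 1];

/* Store w in slot i of the static array. Returns 0 on success, -1 if the
 * slot is out of range or the word is too long. */
int store_fixed(size_t i, const char *w)
{
    assert(w != NULL);
    if (i >= MAXWORDS || strlen(w) > MAXLEN)
        return -1;
    strcpy(fixed_words[i], w);
    return 0;
}

/* Dynamic alternative: one exactly sized allocation per word.
 * Returns NULL if malloc fails; the caller frees the result. */
char *copy_word(const char *w)
{
    size_t n;
    char *p;

    assert(w != NULL);
    n = strlen(w) + 1;
    p = malloc(n);
    if (p != NULL)
        memcpy(p, w, n);
    return p;
}
```

With these limits the static table reserves 65,000 bytes up front, while copy_word("santosh") asks malloc for only 8 bytes plus whatever bookkeeping the allocator adds.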


Do you mean that there would be two kinds of implementation if
efficiency is my concern?

1.) input from a human
2.) input from a text file

Couldn't a single implementation handle both kinds of input?


> One efficient method is to use a single dynamically allocated array in
> which words are stored sequentially. The length of each word could be
> specified by either one or two bytes prefixing the word itself. This
> results in very efficient storage, but is grossly inefficient if you
> want to insert and delete words at random. For this a hash table based
> approach is probably the best. OTOH a tree is very convenient for quick
> searching and sorting.
>
> If you tell us more details about the type and volume of input you
> expect and the facilities (like searching, insertion, etc.) you plan to
> implement, perhaps a tailored approach can be suggested.
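
The single packed array with a length prefix, as described above, could be sketched roughly like this. I am assuming a one-byte prefix (so words of 1 to 255 bytes) and a buffer that doubles when full; struct wordpack, wordpack_add and wordpack_get are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: one growable byte buffer laid out as
 * [len][bytes...][len][bytes...] with a one-byte length prefix. */
struct wordpack {
    unsigned char *data;
    size_t used;   /* bytes in use */
    size_t cap;    /* bytes allocated */
};

/* Append word (1..255 bytes). Returns 0 on success, -1 on failure. */
int wordpack_add(struct wordpack *wp, const char *word)
{
    size_t len, need;

    assert(wp != NULL && word != NULL);
    len = strlen(word);
    if (len == 0 || len > 255)
        return -1;

    need = wp->used + 1 + len;
    if (need > wp->cap) {
        size_t ncap = wp->cap ? wp->cap * 2 : 256;
        unsigned char *p;

        while (ncap < need)
            ncap *= 2;
        p = realloc(wp->data, ncap);   /* realloc(NULL, n) acts like malloc */
        if (p == NULL)
            return -1;
        wp->data = p;
        wp->cap = ncap;
    }
    wp->data[wp->used++] = (unsigned char)len;   /* length prefix */
    memcpy(wp->data + wp->used, word, len);      /* word, no terminator */
    wp->used += len;
    return 0;
}

/* Copy the n-th word (0-based) into out, which must hold 256 bytes.
 * Returns 1 if found, 0 otherwise. The walk is linear: each prefix
 * says how far to skip to reach the next word. */
int wordpack_get(const struct wordpack *wp, size_t n, char *out)
{
    size_t pos = 0;

    assert(wp != NULL && out != NULL);
    while (pos < wp->used) {
        size_t len = wp->data[pos];

        if (n == 0) {
            memcpy(out, wp->data + pos + 1, len);
            out[len] = '\0';
            return 1;
        }
        n--;
        pos += 1 + len;
    }
    return 0;
}
```

Storage overhead is one byte per word, but reaching the n-th word means walking every earlier prefix, and inserting or deleting in the middle means shifting everything after it, which is the weakness mentioned above.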


The basic problem is to sort the words, count them, and print them in
sorted order. We are not going to store a word again if it has already
appeared; we will just increase the count for that word.

K&R2 seems to suggest that a binary tree is the most efficient structure
for this. It is described in section 6.5, where the problem is already
solved. However, I am not able to understand the authors' <getword>
function, which in any case does something different from what I want,
so I need to write one of my own.
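
For reference, a word-counting binary tree along those lines might look like the sketch below. This is my own rough version, not the book's code; tree_add, tree_count, tree_print and tree_free are names chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct tnode {
    char *word;                  /* private copy of the word */
    int count;                   /* number of occurrences seen */
    struct tnode *left, *right;  /* smaller words left, larger right */
};

/* Insert word, or bump its count if already present. Returns the root of
 * the subtree. If malloc fails the word is dropped and the tree is left
 * unchanged. */
struct tnode *tree_add(struct tnode *p, const char *word)
{
    int cmp;

    assert(word != NULL);
    if (p == NULL) {
        struct tnode *n = malloc(sizeof *n);

        if (n == NULL)
            return NULL;
        n->word = malloc(strlen(word) + 1);
        if (n->word == NULL) {
            free(n);
            return NULL;
        }
        strcpy(n->word, word);
        n->count = 1;
        n->left = n->right = NULL;
        return n;
    }
    cmp = strcmp(word, p->word);
    if (cmp == 0)
        p->count++;
    else if (cmp < 0)
        p->left = tree_add(p->left, word);
    else
        p->right = tree_add(p->right, word);
    return p;
}

/* Return how many times word was added, or 0 if it is not in the tree. */
int tree_count(const struct tnode *p, const char *word)
{
    while (p != NULL) {
        int cmp = strcmp(word, p->word);

        if (cmp == 0)
            return p->count;
        p = cmp < 0 ? p->left : p->right;
    }
    return 0;
}

/* In-order traversal: prints the words in strcmp order with counts. */
void tree_print(const struct tnode *p, FILE *out)
{
    if (p != NULL) {
        tree_print(p->left, out);
        fprintf(out, "%4d %s\n", p->count, p->word);
        tree_print(p->right, out);
    }
}

/* Release every node and its word. */
void tree_free(struct tnode *p)
{
    if (p != NULL) {
        tree_free(p->left);
        tree_free(p->right);
        free(p->word);
        free(p);
    }
}
```

Calling root = tree_add(root, word) for each word read and then tree_print(root, stdout) gives the sorted list with counts, with no separate sorting pass. Note that the tree is not balanced, so input that is already sorted degrades it into a linked list.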





--
http://lispmachine.wordpress.com/
my email ID is at the above address
