Re: Reading in a Dictionary File




<boanerges35@xxxxxxxxx> wrote in message
news:1151695116.186287.166580@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

John W. Kennedy wrote:
Fahd Shariff wrote:
Reading in a dictionary file is the same as reading in any other file.
You use a BufferedReader and store the contents of the dictionary in
some kind of data structure, for example.

Not necessarily a BufferedReader. An DataInputStream or a
RandomAccessFile might be necessary. There's just no way of knowing
without more information.


What kind of info do you need? One of my concerns is how long it would
take to read in a 500,000 word dictionary. I have read a little online
about this and it does not seem to be as simple as it sounds, if one
considers efficiency. My program would take a word(s) as input,
calculate the permutations of the letters, and then try to match them
to actual words in the dictionary.

There are specialized file structures just for storing dictionaries--they're
organized (in binary, on disk) as a Tries http://en.wikipedia.org/wiki/Trie
in a variety of compressed formats. You have to know how your dictionary is
structured as a file in order to understand how to read and use it. A
plain-text dictionary of half a million words will be very inefficient to
work with. No one can give you much advice until you know what kind of
dictionary you're using.

Creating permutations and looking them up probably isn't such a good idea
either. A 7-letter word has up to 5040 permutations, only a very few of
which are words. Using a trie, it can be much easier to start with the few
letters in 1st position and then see which ones can grow by adding one of
the remaining letters. From any prefix you have already found in the trie,
you find out if it can be extended by a remaining letter.

Cheers,
Matt Humphrey matth@xxxxxxxxxxxxxx http://www.iviz.com/


.



Relevant Pages

  • Re: find words that contains some specific letters
    ... Another approach is to permute the letters and search the dictionary. ... permutations can be checked quickly with the containsmethod ... BufferedReader in = new BufferedReader( ... Finding the permutations of a string in Java is straightforward: ...
    (comp.lang.java.programmer)
  • Re: find words that contains some specific letters
    ... You only build the Set of permuted letters once. ... Then, following John ... One word lookup in the Set costs Obinary search and not O. ... Indeed, my approach is fine for small permutations (e.g. word puzzles, ...
    (comp.lang.java.programmer)
  • Re: Combinatorics question
    ... >> specific set of letters. ... Within a bin, a ... > Your bin is modeled by a mathematical object called a multiset. ... > If you order the objects, you can deal with permutations of the set ...
    (sci.math)
  • Re: Combinatorics
    ... combinations (with and without repetition) ... but the standard combinatorics (replacement ... (permutations with replacement) ...
    (comp.lang.python)
  • Re: Combinatorics question
    ... > specific set of letters. ... Within a bin, a ... Your bin is modeled by a mathematical object called a multiset. ... you can deal with permutations of the set ...
    (sci.math)