Re: find words that contains some specific letters



On Jun 1, 2:54 pm, Lew <l...@xxxxxxxxxxxxx> wrote:

I figured it meant "intersection", too, but that's the part I don't
understand. What does it mean to take an intersection of the result
sets?

With a HashSet approach, you take each of the n! permutations of the
search string and determine 'dictionary.contains( permutation )'. The
result you want is the union of the permutations that return 'true'.
I do not understand why there would be a test for portions of the
search string, much less taking an "intersection" of those results.

With a HashMap<String, Set<String>> approach, the entire Set of
resultant dictionary words is indexed by the search string, so one
simple 'dictionary.get( searchTerm )' yields an entire result set
directly.

So I remain puzzled, and still need the answer.

I can take a crack at it... I understood Giovanni as saying that we
pre-compute something like
Map<Character, Map<Integer, Set<String>>> countMap
which given a letter and a number of occurrences of the letter returns
the set of all words that have exactly that number of occurrences
(using some predefined dictionary).

E.g., countMap.get('d').get(1) will contain "ad" but not "add".


Next one analyzes the base string into its letter occurrences, e.g.,
if we want to find the permutations of aurora we have:
a 2
u 1
o 1
r 2
and find the intersection of
countMap.get('a').get(2)
countMap.get('u').get(1)
countMap.get('o').get(1)
and
countMap.get('r').get(2)

Set doesn't have a named intersection operator, but it does
have retainAll which basically does intersection. [Though
one needs to be check that it's implemented.]

E.g., something like:
Set wordList = countMap.get('a').get(2);
wordList.retainAll(countMap.get('u').get(1));
wordList.retainAll(countMap.get('o').get(1));
wordList.retainAll(countMap.get('r').get(2));
// Modulo the following discussion wordList now contains
// the list of words that have 2 a's, 1 o, 1 u and 2 r's

This is I think what Giovanni described, but I don't think it's quite
enough. I think we also have to constrain the length of the words to
be the same as the length of the original key. (E.g., "auroras" would
match
above). So perhaps we could make countMap a List where the index
gives the length of the words indexed...

List<Map<Char,Map<Integer,Set<String>>>> countMap so
countMap.get(6).get('a').get(2)
would get the set of 6 character words that have exactly two a's in
them.

I think this modified procedure would work though I don't know if it's
particular efficient.

Regards,
Tom McGlynn

.