Re: efficiently accumulating values
- From: "Kaz Kylheku" <kkylheku@xxxxxxxxx>
- Date: 7 Jul 2006 00:13:14 -0700
liyer.vijay@xxxxxxxxx wrote:
Hi,
I have a small lisp program to solve the boggle puzzle
(http://weboggle.shackworks.com/) and it generates a lot of lists.
This is the main part of the solving code:
(defun find-words (matrix)
"Given MATRIX find all possible words."
(destructuring-bind (m n) (array-dimensions matrix)
;; Find words starting at every position in the matrix
(loop with result = ()
as i from 0 to (1- m)
do (loop as j from 0 to (1- n)
do (setq result (nconc result ; <-- major consing
(get-words matrix i j))))
finally (return result))))
Obviously, the arrow here is pointing to the wrong place. All of the
consing is done inside GET-WORDS, which searches for Boggle (tm)
solutions from a given row-column position in the letter grid.
What's going on in the NCONC is that it has to repeatedly search for
the end of the list from the beginning to find the tail where the
NCONC'ing is done.
If you don't care about the order of the words, it would be somewhat
cheaper to do the NCONC'ing in reverse order of the lists, that is:
(setq result (nconc (get-words matrix i j) result))
because then you're scanning only to the end of each new list. You are
not scanning over the items over and over again.
Also, a Boggle matrix isn't that large. You could just PUSH up a list
of lists, and then stitch all the lists together in one fell swoop, and
then have them in the original order:
collecting (get-words matrix i j) into word-list-collection
finally (return apply #'nconc word-list-collection)
So if the Boggle board is 4x4, the APPLY here calls NCONC with at most
16 arguments: each one of the lists collected from each of the 16
starting positions.
Also, here is an idea. How about using a big, fat, special variable to
accumulate the data? The innermost code just pushes onto that variable
directly, and when you call it on all the positions, there it is. The
top level wrapper function, of course, rebinds the variable with a LET
and returns the value.
If you want to be a master hacker, you must liberate your sense of
shame.
(defun get-words (matrix i j)
"Find all possible words starting from MATRIX position I J"
(labels ((rec (i j prefix acc)
(loop for (ni nj) in (adjacent matrix i j)
as next = (aref matrix ni nj)
as word = (concatenate 'string prefix next)
if (and (>= (length word) *minimum-length*)
(wordp word))
do (push word acc)
when (prefixp word)
do (let ((old (aref matrix i j)))
(setf (aref matrix i j) 'nil
acc (rec ni nj word acc)
(aref matrix i j) old))
finally (return acc))))
(rec i j (aref matrix i j) '())))
Here,
(concatenate 'string prefix next)
will allocate a new string just to add a letter. What you might want
to do is to use adjustable vectors of characters instead and use
vector-push-extend. It's uglier code but it can be faster.
Another thing:
( loop for (ni nj) in (adjacent matrix i j) ...)
generates a list of the positions adjacent to i j. It's nice to put it
that way, but it's a lot of consing: several conses for each character
visited. And not all characters visited result in the discovery of
words, obviously.
You could cut some of that consing with a trivial change, namely have
ADJACENT return a list of dotted pairs, so it becomes (loop for (ni .
nj) in (adjacent ....)) . It could return a vector also, in which case
you could loop ACROSS It and destructure the pairs yourself.
This could be a good application for the ITERATE macro instead, which
has some generator capabilities: you'd generate the adjacent positions.
Also, perhaps because I'm tired, I do not see where in your program is
an important Boggle constraint implemented: namely that the search must
not visit a square more than once! It seems that when the local
function REC is invoked on a square adjacent to i j, it will compute
the adjacency for that square, and that adjacency will include i j, and
so REC will be recursively invoked on i j again.
;;;;;;;;
Running the problem as it is, takes 2.713 seconds. Throwing away the
results obtained takes only 0.176 seconds. I changed find-words to
The word prefix check could be done with a lookup in a trie dictionary.
You could traverse the trie in parallel as you scan through the board
positions rather than calling wordp and prefixp predicate functions, so
you could terminate a search as soon as the trie bottoms out.
For instance, suppose you visited T and H. So you are at the trie node
where all of the TH words are rooted. But the next character is Z.
Right away, the trie tells you that this is junk, and your recursion
should bail out one level to try a different letter instead of the Z.
That might be an E and so now you have THE. You can collect that and
keep going.
I would eliminate the character accumulation entirely. When a word is
detected, you simply pull it out of the dictionary, which, if suitably
designed, gives you a pointer to the entire word: a prefix tree that a
list of all the words in their entirety, plus a bunch of nodes that
allow for prefix searches. So as you recurse through the board
positions, you are not building up a string, but keeping track of where
you are in the prefix tree.
A long time ago, someone in the Lisp industry told me it was poor form
quote people; it suggests that they lack value.
-- Kent M Pitman <pitman@xxxxxxxxxxxxx> in comp.lang.lisp
No way! It suggests that they have transcended ordinary existence to
dwell among the symbols. Moreover, it shows that we are actually
interested in /them/ and not merely in what they can evaluate for us.
.
- References:
- efficiently accumulating values
- From: liyer . vijay
- efficiently accumulating values
- Prev by Date: Re: efficiently accumulating values
- Next by Date: Re: Why is lisp better than java perl or python or ruby?
- Previous by thread: Re: efficiently accumulating values
- Next by thread: need some problems to work on
- Index(es):
Relevant Pages
|