Re: coerce for arbitrary types
- From: vanekl <vanek@xxxxxxx>
- Date: Sat, 12 Apr 2008 03:39:54 +0000
Robert Maas, http://tinyurl.com/uh3t wrote:
Regarding that proposed koan earlier: Does anybody have aFrom: vanekl <va...@xxxxxxx>
collection of short code snippets which likewise illustrate a whole
mess of deep understanding, where just trying to understand why
this particular snippet of code produces the result it does will
enlighten a newbie?
As we just witnessed (if you've been reading this list recently),
(It's not a list, it's a group.)
a 10-line recursive BFS
I assume you mean breadth-first search (in some sort of tree,
possibly binary)?
Nope, I mean DFS. I misspoke. I was thinking about the coloring
algorithm that Cormen, Leiserson, and Rivest use in their DFS.
As stated, this was in context to the person who recently
posted his homework assignment. A DFS was a solution
to his homework. If you care to go back and read the problem,
you'd probably agree.
The solution would probably be simplest using a quanary
tree (North, South, East, West branches).
Why would anybody want to write that recursively?
See above and below.
While a depth-first search is nicely expressed recursively, since
the algorithm itself is essentially last-in first-out, hence
stack-like, hence recursive-like, a breadth-first search is
inherently first-in first-out, like a queue, not at all stack-like
or recursive-like. It seems that a iterative loop of appending
newly-discovered branches at the end of the queue and popping the
next task-to-do off the front of the queue is the best way to
expresss the algorithm. Why even bother trying to emulate a queue
via some sort of recursion?
Because it's simple, can be easily understood, and works.
Seem like pretty good ideas to me, especially when you are
discussing code that you would like newbies to understand.
You seem to think queues are easier to understand. From my
experience, once you get over the hurdle of recursion,
recursive code is quicker to understand, shorter, and less
bug prone. Why even bother with possible efficiency issues
at this point in the newbie's process of discovery?
(Alternately you can use two stacks,
one for popping off tasks at the current level, and one for
postponing all new tasks at the next level. Again this is best
expressed as either nested while loops or co-routines, not as
recursion.) Pseudocode for the two-stack algorithm:
NowStack <= new empty stack
WaitStack <= new stack containing topnode
While WaitStack not empty
Swap the two stacks
While NowStack not empty
Pop one item from NowStack and process it,
pushing each next-level node onto WaitStack.
What could be simpler, except for the single-queue algorithm of
course?
A recursive algorithm would be simpler. You could "paint" visited
nodes if they've been visited. (See Cormen, Leiserson, and Rivest.)
No queue required. While I agree that stacks/queues and iteration
is the preferred method if efficiency is your top concern, I cannot
figure out why you think your solution would be easier to understand and
a better demonstration.
Note that the two-stack algorithm could be instrumented to>ftp/pub/techreports/99-1.ps.gz+recursive+BFS&hl=en&ct=clnk&cd=3&gl=us&ie=UTF-8>
announce each new level being explored, which would be more
difficult with the single-queue algorithm.
Queue <= new queue containing topnode
While Queue not empty
Pop one item from front of Queue and process it,
appending each next-level node onto end of Queue.
is too much for a newbie to figure out by himself if he's never
seen something like that done in CL before.
Let me see if I can find the 10-line recursive BFS algorithm ...
according to Google Groups search engine, the only mention of BFS
and recursive in the same article in this newsgroup was your
article I'm replying to. Let me try other keywords ... no articles
whatsoever with breadth first search recursive, let me try more
general Google search ...
<http://209.85.173.104/search?q=cache:uwIaeFMa0xcJ:www.cs.jcu.edu.au/
All known algorithms of BFS use iteration. ...
The standard denition [sic] of BFS involves an iterative loop
over a queue of vertices. ...
Exploration of a graph component is achieved by dequeuing a vertex to
discover unexplored vertices which are subsequently enqueued. This
kind of manipulation produces graph exploration which can be viewed as
progressive levels of discovered vertices
(nicely explained IMO)
This section describes and analyzes recursive and iterative algorithms
that implement the clos recurrence to produce BFS. The recursive algorithm
is described rst [sic], since it is a natural extension of the clos
recurrence. ...
Algorithm 3.1
A recursive BFS algorithm.
(completely illegible, Google didn't do a good job of converting
PostScript to regular text)
(But it notes it's tail-recursive, i.e. actually iterative.)
The same traversal sequence is generated by the iterative algorithm
and the
recursive algorithm, except that the iterative algorithm is more
ecient [sic].
<http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/231503>
Ah, this looks better, a recursive algorithm based on iterators.
Unlike the usual queue-based BFS, the space used is only O(h).
Is this the kind of algorithm you were referring to?
If so, I'll take a closer look at it later. But:
I believe ... that this code does not perform a valid
breadth-first search. The problem is that the 'last' variable only
prevents backtracking to the last visited node, not to any
previously visited node. ...
<http://www.mec.ac.in/resources/notes/notes/ds/gratra.htm>
... when hear you breadth you should think queue, and when
you hear depth you should think stack.
(Somebody agrees with me!)
For a newbie, recursion handles the stack just fine, and the code is
simpler. Stack depth is not an issue considering the problem the newbie
described, and the type of problems newbies are typically assigned.
<http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0C-4MT5K49-4>&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221
>&_version=1&_urlVersion=0&_userid=10&md5=9030bc77b63e54a6798de49d6c28c64d>
recursive best-first search (RBFS).
Is *that* what you meant??
Maybe I should have just asked you what you mean by BFS and where
to find the 10-line algorithm you are referring to, instead of
spending an hour researching what you might have meant.
Pascal-B illustrates recursion in a pedagogical and also useful manner:
(defun enumerate (start end)
(when (< start end)
(cons start (enumerate (1+ start) end))))
IMO this is a *bad* lesson, a lesson how *not* to write an algorithm.
TCONC (or equivalent) is IMO the proper way to do this sort of
thing, incrementally modifying a new data structure in an iterative
loop which uses o(1) stack space and o(1) additional memory.
Except TCONC isn't a part of CL, nor do I think this is newbie "territory."
You appear to differ.
So you're saying that CONS should be deprecated for newbies.
Amazing.
Your education and Lisp practitioning is so disjoint from mine that
there's probably no reconciling the two.
While I agree your earnest attempt at optimizing algorithms is proper
and even notable in some contexts, I believe, in the context of the premise of
this discussion, you are beyond the pale.
The problem with this sort of recursion is that whereas heap space
is allocated in any random order, so that various structures can
share the heap providing they don't all try to use it at the same
time, a stack (as normally implemented) requires a pre-allocation
of dedicated space, which is almost all *wasted* almost all the
time. So you either allocate a huge block of consecutive storage
for a huge stack for the largest graph you'd ever want to traverse,
or you re-allocate stack from time to time when you realize the old
stack wasn't big enough, or your algorithm dies from stack
overflow.
Right.
Those are the only considerations and the only options.
Uh huh.
You have an interesting way of framing problems to suit your
particular biases, ignoring the fact that this technique
is successfully commonly used, and simple to teach and understand.
Try something like this instead, to implement a fifo queue:
(defun make-empty-tconc () (cons nil nil))
(defun is-empty-tconc (tc) (null (car tc)))
(defun append-tconc (tc el)
(let ((newcell (cons el nil)))
(cond ((is-empty-tconc tc) (setf (car tc) newcell) (setf (cdr tc) newcell))
(t (setf (cdr (cdr tc)) newcell) (setf (cdr tc) newcell)))))
(defun pop-tconc (tc) ;To get elements out one by one
(let ((popped (car (car tc))))
(setf (car tc) (cdr (car tc)))
(if (is-empty-tconc tc) (setf (cdr tc) nil))
popped))
(defun see-all-tconc (tc) ;To get list of all remaining elements at once
(car tc))
The TCONC object as defined above can be used both to visit nodes
in fifo sequence per breadth-first search (using append-tconc and
pop-tconc repeatedly), and to build up the final list of nodes to
return from the top level (using append-tconc repeatedly and
see-all-tconc just once at the return point).
(BBN/UCI Lisp had a slightly different naming convention for
essentially the same functionality.)
Interesting, but I submit it is not something important enough that
a newbie should be first exposed to.
The random shuffle algorithm is accessible and shows that
there are lots of ways of accomplishing something seemingly
simple. (Take that Python.)
http://www.pvk.ca/Blog/Lisp/malmesure_quand_tu_nous_tiens.html
That article has a lot of discussion which slightly distracts me
from your point. Note that for me, there are some obvious ways to
randomize a sequence efficiently:
- If a vector, run the selection sort algorithm except that a
random number is used to directly select the element to exchange
with the next-to-fix element.
- If a list:
- Copy to vector and apply above algorithm then copy back to list.
- Tag elements with pseudo-random numbers, run merge-sort, then
collate adjacent elements to find the very few blocks that
got identical random numbers and locally shuffle each such block.
Of course if you have a list and want vector result, or vice versa,
you use the vector-selection-sort algorithm, copying from list at
start or to list at end as required.
Now if you have a large dataset in consecutive memory locations
(i.e. in an array, or in a file then loaded or mapped into an
array) in a large virtual address space but have only a relatively
small amount of RAM, you need to keep your working set small to
avoid thrashing, and the problem gets more interesting. An
array-based implementation of merge-sort, using just one temporary
array smaller than the original array, might be the best algorithm.
I also think Ron's iteration article is well written and
provides insight. He extracts an important pattern.
(Is there any high-level pattern more important than iteration,
other than "IF"?)
http://rondam.blogspot.com/2008/02/joy-of-iterators.html
Actually the lesson I see is that if you can't predict the new
requirements that your boss may throw at you *after* you've already
implemented all the use cases in the previous set of requirements,
you may need to do some refactoring of the old code to make it look
decent with the new requirements. In *some* such cases, writing an
iterator of various sorts may be useful. But you can't know ahead
of time whether it'll be useful or not, unless you can predict what
your boss will throw at you tomorrow. So you should just get the
current use-cases to work with reasonable code, and cross
tomorrow's bridge when your boss throws it at you tomorrow. It
helps to use a fast-prototype-enabling language such as Lisp so
that you can get all the current use cases working *before* the
boss throws the new stuff at you, so that at the end of each day
you have something actually working you can demo. Ideally you set
up a Web demo of each day's accomplishments, and when your boss
tells you he has new requirements, you interrupt him to tell him
that yesterday's requirements are already up and running at
tinyurl.com/<6chars> if he wants to see what you've done so-far
based on yesterday's requirements.
But I agree that producer-->consumer model is useful, whether it's
implemented by an iterator or continuation or stream or co-routine
or multi-tasking or whatever the jargon is today.
A nice example of parameterization, first-class functions,
and functional programming:
(defun best (sequence comparison &key (key #'identity))
(reduce (lambda (x y)
(if (funcall comparison (funcall key x) (funcall key y))
x y))
sequence
:initial-value (elt sequence 0)))
I actually disagree!
What a surprise :)
I agree that REDUCE is a wonderful function
that everyone should be familiar with, and that REDUCE would
accomplish the desired purpose here, but *not* that using REDUCE
here is really appropriate. The problem with using REDUCE is that
really picking the best from a large set could be parallelized,
which REDUCE doesn't do. For this pick-the-best task, a more
enlightening algorithm would quickly traverse the list, activating
side tasks for comparison, and running further traversal of the
list in parallel with the side tasks, rather than waiting for
completion of each comparison task before traversing further down
the list, as REDUCE is defined to do. Drawing a dataflow tree,
REDUCE must work like this (* represents a comparison):
el1 el2 el3 el4
\ / / /
* / /
\ / /
* /
\ /
*
whereas a parallized algorithm could do these comparisons:
el1 el2 el3 el4
\ / \ /
* *
\_ _/
\ /
*
taking log(n) time instead of n-1 time.
Wow. Now your disagreeing not because the algorithm is sound, but
because you would rather have one that is more easily parallelizable.
You continue to amaze me. I thought the entire premise for this
collection of koans was for newbie education.
(Your ascii art is striking, though, and well illustrates
your point.)
Why recursion is not always the best solution:
[16]> (defun fact(n) (if (zerop n) 1 (* n (fact (1- n)))))
(Bug if n less than 0, which somebody else pointed out.)
Indeed although this is often touted as the best first-example of
recursion, it's mis-use of recursion, we agree.
Have you seen my split-in-half recursive algorithm for factorial?
It uses stack depth log(n) instead of n.
<http://groups.google.com/group/comp.lang.lisp/msg/39a686b3e4bffa8a?dmode=source>
No, I haven't seen this until today, but I recently read about this
same algorithm implemented in this same fashion on another web site (I can't
place it right now). The web site did time tests and explained why this type of
algorithm is faster. It was consing less and using far
fewer BigNum calculations, according to the article.
Pretty interesting stuff, but, again, largely irrelevant for newbies IMO.
Example of clear, concise Lispy style:
[From "Tutorial on Good Lisp Programming Style"]
(defun count-all-numbers (exp)
(typecase exp
(cons
(+ (count-all-numbers (first exp))
(count-all-numbers (rest exp))))
(number 1)
(t 0)))
Minor quibble: The use of "first" and "rest" implies the
*intention* is nested lists, *not* arbitrary CONS trees.
Yeah, and this distinction is important for newbs. </sarcasm>
As if CAR and CDR are easier for newbies to grok.
They're /synonyms/, FCS.
The *intention* is not conveyed by the language, per se,
but how the language (e.g., the operators) is commonly used
depending on context. Again, not important for newbs IMO.
In fact, some newbies that I've talked to specifically
mention that they find the CAR and CDR operators distasteful.
Even after I mention that they never have to be used, they
still are turned off by them. I don't understand their
argument, but they are bright guys, so there must be something
to it. I'm guessing that Norvig, et al, took that into account
when this example was written.
Yet the
code works with arbitrary CONS trees. CAR and CDR should be used
instead of FIRST and REST if the intention is to work with
arbitrary CONS trees.
If exp is (5 . 9), should it count one number or two numbers??
Yes, the doc string is not explicit. Got me there :)
As nested lists, there's only one number in a proper position.
But as CONS trees, there are two numbers.
SORT has to be handled with care, even if it is destructive.
If you want to sort a list, you have to use the result of SORT.
Wont give a newbie the expected result:
(let ((mylist (list "c" "d" "a")))
(sort mylist #'string<)
mylist)
Yeah, this is a specific lesson about a specific
possibly-destructive function whose *result* rather than
side-effect is the *correct* result.
Considering how often sort is used, even though it is minor
lesson, it is still important, especially since it almost
certainly will eventually bite the newb on the @ss.
Better:
(let ((mylist (list "c" "d" "a")))
(setq mylist (sort mylist #'string<))
mylist)
Yes, but slightly inefficient compared to:
(let ((mylist (list "c" "d" "a")))
(setq mylist (sort mylist #'string<)))
because after all the return value from SETQ is the value that was assigned,
unlike defconstant whose return value is the symbol being assigned.
Efficiency was not my prime motivation. Pedagogical value was. Did you
notice the symmetry between the two examples? They were like that for a
reason: it's easier to draw conclusions this way. I don't understand your
urge to confuse easy to understand concepts with your (minor) efficiency concerns.
Apparently you find efficiency to be so important that even newbies
must be immediately indoctrinated in the finer points of Lisp.
And a good compiler could peephole optimize that last "mylist" out
anyway. I don't know whether any Lisp compiler does this, but I've
worked with a C compiler that was able to do peephole optimization.
It's not impossible, is what I'm saying, and may completely invalidate
your argument.
But really, since mylist is inaccessible after return, why not just:
(let ((mylist (list "c" "d" "a")))
(sort mylist #'string<))
Who cares that mylist has a randomly munged value if it's not accessible?
Apparently you are snipping things out of the discussion.
Since you do not mention it at all, it's hard to say unless I go
back to my original post and try to discover what you took
out.
And last, I would stay away from mentioning the eval operator.
Actually, when developing code line-at-a-time, which is the way I
strongly prefer in most cases, the REP is your biggest tool, and
EVAL really needs to be mentionned.
Evaluation could be mentioned as apart of the REPL. The EVAL /operator/
doesn't. Two separate, but related concepts. You have mistakenly
conflated the two.
Maybe what you mean is that
*you* explictly embedding an extra call to EVAL in the code you are
already EVALing is most often a design mistake.
Uh, yes, that's what I mean when I specifically say "eval operator."
Again, from context, this should have been evident. Reading each line
completely separate from the rest of the paragraph just makes for
poor reading comprehension.
Better to get
familiar with APPLY and FUNCALL rather than "consing up an
expression to pass to EVAL". Perhaps lots of examples of where
APPLY or FUNCALL is useful, would be a good tutorial, so the idea
of needing to explicitly call EVAL never arises in the first place.
Yes, I showed an example of this already. See above (/\cfuncall/).
Or maybe after the appropriate use of APPLY or FUNCALL is shown,
the same example can be demonstrated with EVAL to show how much
extra work it is by comparision. (Nevermind that lexical variables
aren't captured by EVAL whereas they *are* captured by FUNCALL and
APPLY. If you are never tempted to explicitly call EVAL in the
first place, you'll never be bitten by that problem.)
Good: (setq x <someComplicatedCalculation>)
(setq y <anotherComplicatedCalculation>)
(setq f <someCalculationToProduceAnAppropriateFunctionOBJECT>)
(funcall f x y)
Bad: (setq x <someComplicatedCalculation>)
(setq y <anotherComplicatedCalculation>)
(setq f <someCalculationToProduceAnAppropriateFunctionNAME>)
(setf (symbol-function 'tmpf) f)
(setq exp '(tmpf x y))
(eval exp)
I haven't a clue as to why you think newbies would need to know this.
They tend to look for the biggest hammer first, and showing them EVAL,
IMO, is not helpful. Apparently you think otherwise.
But it doesn't matter anyway because newbies don't read,
they think it's faster to just start coding.
There's a balance between:
- Reading everything before trying to do anything
- Reading nothing before trying to do everything
The trick is to read what's appropriate before trying to do
something appropriate as the first thing to do after reading that
first stuff. Don't read too much before you write your first line
of code, because you'll forget most of it because it's either not
relevant or not comprehendable before you write your first line of
code. But do read *something*, but what? There are several
pedagogical approaches:
- Sequence of lessons which you must follow in sequence.
- Top-down breakdown from end goal back towards prerequisites which
must be mastered before the main task can be attempted.
- Learn a few basic principles, then work from the reference manual
when actually trying to code.
Sorry, that last statement was (a little) tongue in cheek, and you
seemed to have taken it literally. I (poorly) was referring to the
newb who was just here asking for the best lisp operators to use,
as if knowing the most common operators is going to help you when you
are unable to articulate your algorithm. Well, I guess if you were
doing some Genetic Programming, it would help, but I don't think
that was the context.
My preferred method of learning a new language: find a good book,
read a chapter, put it in my lap as I do the most interesting
chapter problems, and continue on to the next chapter. When I've
learned enough to code without the book, put the book down.
My preferred book happened to be Norvig's PAIP. Even with the
advent of the internet, it's still my preferred method, because
published books are generally of higher quality than the language
studies I find on the internet. YMMV.
.
- Follow-Ups:
- How best to teach newbie at algorithms (was: coerce for arbitrary types)
- From: Robert Maas, http://tinyurl.com/uh3t
- How best to teach newbie at algorithms (was: coerce for arbitrary types)
- References:
- coerce for arbitrary types
- From: Albert Krewinkel
- Re: coerce for arbitrary types
- From: Scott Burson
- Re: coerce for arbitrary types
- From: Robert Maas, see http://tinyurl.com/uh3t
- Re: coerce for arbitrary types
- From: Kent M Pitman
- Re: coerce for arbitrary types
- From: Robert Maas, see http://tinyurl.com/uh3t
- Re: coerce for arbitrary types
- From: vanekl
- Re: coerce for arbitrary types
- From: Robert Maas, http://tinyurl.com/uh3t
- coerce for arbitrary types
- Prev by Date: Re: ABCL Lisp, continue project?
- Next by Date: OpenAIR, the RAILS killer?
- Previous by thread: Re: coerce for arbitrary types
- Next by thread: How best to teach newbie at algorithms (was: coerce for arbitrary types)
- Index(es):
Relevant Pages
|