Re: Program compression
- From: jaycx2.3.calrobert@xxxxxxxxxxxxxxxxxxxxxx (Robert Maas, http://tinyurl.com/uh3t)
- Date: Fri, 15 Aug 2008 01:36:04 -0700
RH> In any (even relatively sensible) language where it takes
RH> /more/ than one line of code, it only takes more than one line
RH> the /first/ time - because the programmer will wrap it into a
RH> function (or procedure, or subroutine, or whatever) and, in
RH> future, call the function (or whatever). So this conciseness
RH> argument is a mere canard, as all program tasks eventually boil
RH> down to:
RH> call existing_solution
RH> which is a one-liner in pretty well any language.
p...@xxxxxxxxxxxxxxxxx (Pascal J. Bourguignon) responded:
PJB> Not in languages that don't provide syntactic abstraction.
PJB> Let's take a simple example, the Memento pattern:
PJB> http://en.wikipedia.org/wiki/Memento_pattern
OK, I read that, and it seems that it may take some considerable
work within each new class to support the ability to restore an old
state even after destructive modifications have been done. On the
other hand, with classes that use a never-modify state object, such
as self-balancing binary trees (all operations log(n) nodes rebuilt
with the rest shared between old and new), the original state never
was changed so it's trivial to restore the "official" state to be
in fact the original state, and it would be easy to abstract a
generic tool that worked in all cases without any new code per
case. Unfortunately all the examples in the WikiPedia page, and in
your article, were too complicated for me to feel like reading in
full, so I don't know whether it was easy or hard to restore the
old state.
PJB> Everytime you want to apply this pattern to a new class, you have
PJB> to add the following methods and class, ...
PJB> In lisp, we write the same, but only ONCE: ...
OK, without studying the details, I get the basic idea.
PJB> This is possible because lisp, in addition to supporting data
PJB> abstraction and procedural abstraction, which any other
PJB> programming language followed suit, also supports syntactic
PJB> abstraction and meta-linguistic abstraction.
This may be the best way to show how much better Lisp is than those
other languages. The anti-Lisp folks are saying "nobody really
needs to be able to construct new source code at run time", but
this kind of pattern may be a good example of how it's a royal pain
*not* to be able to do that. I'd like to see a simpler example of
this sort of need, where it's not huge that I simply don't have the
energy to study the details. After all, if *I* am turned away by
the immensity of the example, I'm sure the anti-Lisp folks will
turn away too, and the lesson won't be learned.
By the way, in my software module to analyze the whitespace
properties of tabular data (in my case, the output from the
FileList utility on Macintosh computers) to automatically discern
where the various columns of data begin and end, I actually did
take advantage of the ability to define a new function at runtime.
Specifically I wrote functions that given the list of start/end
column indexes generated the body of a function definition for
extracting all those substrings:
;Given starting and ending index of multi-column, and symbol to be used
; to refer to a string to be split into pieces:
;Generate the SUBSEQ form, either (SUBSEQ SYM IX1 IX2) or (SUBSEQ SYM IX1)
; depending on whether IX2 is a number or NIL.
;Return that form.
(defun ix1+ix2+strsym-make-subseq-form (ix1 ix2 strsym) ...)
;Given alternating list of start..end indices for multi-columns,
; and symbol to be used to represent the string:
;Build list of SUBSEQ forms per pairs from the list, allowing for
; odd number of elements where end index of last pair is NIL.
;Return list of such SUBSEQ forms.
(defun ixpairs+strsym-make-forms (ixpairs asym) ...)
Having applied that function to the actual list of column indexes,
I then wrap FUNCTION and LAMBDA and LIST around the list of SUBSEQ
forms, and call EVAL to build a function object. Then to parse the
entire list of lines that had been previously read in (which had
already been used to generate the column whitespace statistics used
to decide where the multi-columns of data were located), all I had
to do was call MAPCRR on that newly-built function object and the
list of lines.
Now there are probably several other ways I could have written the
program to avoid needing to define a new function object at run
time, but doing it this way just seemed so very natural. One reason
I chose this method is that the code to run down the list of
alternating start/stop column indixes and check for odd/even total
hence last pair short or not, not only is defined just once, but is
*executed* just once, at the time the list of SUBSEQ forms are
generated, rather than being executed for each and every record
being parsed as some other algorithms would require. Once the
function object has been built, there's no indexing down the inner
loop at all, just inline machine instructions to call SUBSEQ just
the right number of times with just the right parameters each time.
Now whether it's actually faster to have inline SUBSEQ calls rather
than an inner loop that makes a single SUBSEQ call with varying
runtime parameters, is anybody's guess until and unless a profiler
can take a look at both ways. But it's done this way, and works,
and I'm not gonna spend my time trying the other way.
Abstract summary of what I said there as difference between my
method and the usual nested-loop method:
- Nested loop method:
(loop for line in allLines collect
(loop for pix from ixpairs step cddr let (ixlo ixhi) = pix collect
(if ixhi (subseq line ixlo ixhi)
(subseq line ixlo))))
- Define-parser-function method:
(setq theFunctionIDefined ...CodeToBuildAllSUBSEQsIntoOneFunction..)
(loop for line in allLines collect
(funcall theFunctionIDefined line)) ;Inner loop is inline in the function
In a sense I've inverted the inner and outer loop. The inner loop
is a "constant" in a sense, with exactly the same ixpairs each
time, hence exactly the same sequence of SUBSEQ calls except
insofar as the outer loop supplies a different line each time the
loop runs. So I simply took all the parts of the logic which were
constant, compiled them into theFunctionIDefined, and then the
outer loop mapped down the list of the one thing that changed, the
various lines from the file.
Hmm, this is actually a powerful design pattern: Given an algorithm
that has some parts constant and some parts variable, condense all
the constant parts into a function definition, with all constant
info compiled inline, and then only the variable parts need to be
passed around when that function is called. You don't have to keep
passing the several constant parts again and again and again. You
have just one constant part that needs to be mentionned repeatedly,
namely the pointer to the function object that you've defined. Can
any language except Lisp implement this design pattern, where the
constant parts are known only at runtime, and where you aren't
allowed to write source-code to the disk and compile it to disk and
then load from disk (as Java can do), because that isn't
timeshare-thread-safe unless you parameterize the file names to
contain the process-ID, a royal pain, nevermind requiring a 'chron'
job just to audit the disk to garbage collect entire disk files
after the process that generated them is no longer running?
.
- Follow-Ups:
- Re: Program compression
- From: Pascal J. Bourguignon
- Re: Program compression
- Prev by Date: Re: Program compression
- Next by Date: Re: Program compression
- Previous by thread: Re: Program compression
- Next by thread: Re: Program compression
- Index(es):
Relevant Pages
|