Re: Program compression
- From: Jon Harrop <jon@xxxxxxxxxxxxxxxxx>
- Date: Wed, 09 Jul 2008 22:35:09 +0100
Robert Maas, http://tinyurl.com/uh3t wrote:
Date: Sun, 22 Jun 2008 13:13:13 +0100Why this response is so belated:
<http://groups.google.com/group/misc.misc/msg/cea714440e591dd2>
= <news:rem-2008jun25-003@xxxxxxxxx>
From: Jon Harrop <j...@xxxxxxxxxxxxxxxxx>
Haskell, SML, OCaml, Mathematica, F# and Scala all allow real
problems to be solved much more concisely than with Lisp. Indeed, I
think it is difficult to imagine even a single example where Lisp
is competitively concise.
What does "solved much more concisely" mean??
The solutions are shorter in modern FPLs.
So why don't you say what you really mean, that some particular
application types are better represented in published libraries in
your favorite language than in mine (Lisp)?
That is neither what I meant nor what I said.
And people who like
Lisp will respond that some *other* application types are better
represented in published libraries in Lisp language than in
*yours*? Why don't you compile a list of all the major application
types that you feel are needed by the largest number of potential
customers/users, and allow the rest of us to propose additions to
your list? For example, IMO the most commonly useful application
types are:
- Business operations (accounts receivable/payable, inventory, payroll)
- Text editing (both plain text and "word processing" formatted text)
- Web browsing/surfing/searching
- E-mail (both plain text and MIME multi-media) including protection
against spam - Conferencing and other public/group forums
- Security, including protection against computer
viruses/worms/trojans/bots There are probably five or ten more that I
could add if I had the time/energy to think of them. Anyway, after we get
a complete list, based on concensus of everyone reading comp.programming
on a regular basis and seeing this thread, *then* you need to say how
well your favorite language addresses these common application
needs.
Note that algebraic datatypes are not among the major applications I
listed.
Common != useful.
...
(defcurry quadratic (a) (b c x) (+ (* a x x) (* b x) c))
...
In F#:
let inline quadratic a b c x = a*x*x + b*x + c
That's more verbose than my use of defcurrypoly.
No it isn't. Also, you did not define "defcurry".
Does F# provide a way to carry out the abstraction of defcurrypoly?
The "inline" did precisely that.
Also, it's absurd that the keyword for generating a curried
function is "inline" instead of "curry".
That is not what the "inline" was for.
Suppose you want to hand-code a listLisp:
of data to be processed, and then map some function down that list.
In Lisp all you need to do is
(mapcar #'function '(val1 val2 val3 ... val4))
where the vals are the expressions of the data you want processed
and function is whatever function you want applied to each element
of the list. You just enter that into the REP and you're done.
(mapcar #'function '(val1 val2 val3 ... val4))
OCaml and F#:
map f [v1; v2; v3; .... vn]
Mathematica:
f /@ {v1, v2, v3, ..., vn}
OK, for simple lists those three languages equal Lisp in conciseness.
So what if you want to pass a more complicated structure to some function.
For example, how do you pass a nested ASSOC list to a function that
automatically traverses it matching keys against given keys to find
the correct branch at each level? Suppose that the function is
called fks which is an abbreviation for Find Key Sequence (in such
a nested assoc list implied).
(defparameter *nal1*
'(:TELEPHONECODES
(:USA "1" (:OREGON "503" "541" "971") (:NEVADA "702" "775")
(:NORTHDAKOTA "701"))
(:CANADA "1" (:MANATOBA 204))
(:BELGIUM "32" (:ANTWERP "3") (:BRUSSELS "2"))
(:CROATIA "385" (:DUBROVNIK "20") (:ZAGREV "1"))
))
(fks '(:TELEPHONECODES :CROATIA :DUBROVNIK) *nal1*) => ("385" "20")
(fks '(:FOOBAR :USA)) ;signals error, toplevel tag wrong
(fks '(:TELEPHONECODES :MEXICO :CANCUN) ;signals error, inner tag #1
missing (fks '(:TELEPHONECODES :USA :TEXAS) ;signals error, inner tag #2
missing
Would that be equally simple to directly code in those three
languages you suggest?
Here is the OCaml/F# including the definition of the searching function:
let nal1 =
[ "USA", ("1", [ "Oregon", ["503"; "541"; "971"];
"Nevada", ["702"; "775"];
"NorthDekota", ["701"] ]);
"Canada", ("1", ["Manatoba", ["204"]]);
"Belgium", ("32", ["Antwerp", ["3"]; "Brussels", ["2"]]);
"Croatia", ("385", ["Dubrovnik", ["20"]; "Zagrev", ["1"]]) ]
let search key map k =
let h, t = List.assoc key map in
h :: k t
search "Croatia" nal1 (List.assoc "Dubrovnik")
The Mathematica is essentially the same as the Lisp because it also lacks
static type checking.
... modern functional languages like Haskell, OCaml and F# ...
They [users] clearly do not have a problem with the supreme brevity
of these languages.
In cases where those particular languages are not already brief as
delivered, how easy is it to define a shorter syntax if you're
going to want a particular operation very terse because it's going
to be done a lot? I.e. how do they compare with Lisp's ability to
define macros?
Macros are very rare in modern languages. OCaml has a full macro system but
it is rarely used. F# has no macro system.
... I was listing some of Lisp's most practically-important
deficiencies. Lack of static typing in the language is one.
I disagree. I pointed out how static typing doesn't really solve
the problem at all because it's incapable of dealing with
intentional data types which is where the *real* problem lies.
Static type-checking of internal datatypes doesn't hardly begin to
solve the real problem. It's just an advertising gimmick that turns
out to be more cost than value.
Why do you think statically typed languages completely dominate general
purpose programming?
(The exception is in tight loops
where some machine type happens to work efficiently and where that
particular loop eats up most of the CPU time, which is precisely
where Lisp allows type declarations and consequent CPU-efficient
code as an *option* where it would actually be worth the cost.)
Lisp is unable to regain the performance cost of its dynamic design:
http://www.ffconsultancy.com/languages/ray_tracer/results.html
Lack of concurrent GC in all Lisp implementations is another.
I gotta look up that to see what the term means ... first match is:
<http://xiao-feng.blogspot.com/2007/03/about-mostly-concurrent-gc.html>
That gives me some idea, but it doesn't seem like what you desire, right?
That is a step in the right direction.
Skipping a few search results that don't look worth studying, I come to:<http://groups.google.com/group/comp.lang.functional/browse_thread/thread/a288a989c5cd6acb/50b6ca2607173c91>
"So concurrent GC makes communication cheaper at the expense of everything
else?" That was written by none other than Jon D Harrop!
(The point was that if you have a shared mutable object, then you
need concurrent GC, but usually it's better design to have each
thread have its own mutable data and instead marshall data to pass
back and forth as needed, but then passing large objects back and
forth can cost a lot.)
There's a compromise between shared mutable data and passing large
data objects back and forth, which is to have each mutable data
object owned by just one process and everyone else must mutate or
examine it via queries in the OO style. So then only the handle on
the object, the parameters of the query (which may be handles on
other objects), and the return value (some piece of the object, or
a success/fail flag) need be passed between processes. It seems to
me that having an actual shared-mutable-object is almost as bad as
having a global variable that is mutable by multiple processes
(except for semaphores/interlocks/etc. which really *do* need to be
shared at a low level to make multi-processing work in the first place).
Later in that thread, it's pointed out that sharing memory doesn't
scale to really large applications, so you need to marshall data
(even if it's only ints) to pass between distant processors.
So in essence shared memory is a bad idea in large applications
where you need parallization in the first place, so you should
avoid shared memory, and thus there's no need for concurrent GC.
Later in that same thread, Harrop said "I'm hard pressed to find a
practically-important application that will be better off with a
concurrent GC." So if concurrent GC is essentially worthless, why
is Lisp's lack of it a reason for disliking Lisp??
The problem is that my tests were all ideally suited to distributed
parallelism when, in fact, even embarassingly parallel programs often
require substantial scattering and gathering of data. Having a
concurrent GC makes it possible for programs to share mutable data
structures which removes the cost of scatter and gather.
For example, I found that a parallel matrix multiply is up to 100x faster in
F# than OCaml:
http://groups.google.com/group/fa.caml/msg/c3dbf6c5cdb3a898
For programs like this (which are very common), scattering and gathering are
free with a concurrent GC but can be extremely expensive without.
Consequently, F# sees performance improvements from parallelization for a
much wider variety of tasks than OCaml can.
So anyway, whether concurrent GC is worth the trouble or not, is
debatable.
If you have >1 core, it is not debateable.
weak references,[suffix "are all fundamental deficiencies of the language" for
these next several items]
It seems to me that a GC hook could be used to purge transient data
that hasn't been referenced recently. How easy is it to actually
implement this and get it working well with only ANSI CL to work
with?
AFAIK, it is inherently implementation dependent.
finalizers,
You mean like to automatically close a system resource whenever the
Lisp hook for it is GC'd? Normally system resources like that are
lexically bound, whereby UNWIND-PROTECT is sufficient to make sure
they get closed when that block of code is exited. Is there
*really* a good reason to have such resources kept around per data
object instead of per block-entry/exit?
Yes. Finalizers are useful for the same reason that GC itself is good (and
Ada is bad): you don't want to have to shoehorn everything you do into
scopes just to ensure that things get freed correctly. Another reason is
that garbage collectors often free sooner than that.
Somehow it seems to me a
dangerous design to have system resources that get closed at
"random" times (when memory runs short) instead of at clearly
defined times.
Absolutely. Finalizers are not suitable if you require determinism.
But if you can think of a really good use for random closure, I may
reconsider my opinion.
I used finalizers in OCaml to collect unused OpenGL display lists and
textures. The results were excellent: the code was greatly simplified at no
cost.
asychronous computations,
I'm not sure what you mean. Don't compilers already generate
efficient CPU data+instruction parallel-pipelining whenever they
detect that two operations are done in close proximity neither of
which depends on the result of the other?
I was referring to the use of green threads from a monadic style, like F#'s
asynchronous workflows. You need to rewrite the code to CPS which can be
done in Lisp but is not provided as standard (and would be useful for a
wide variety of concurrent programming tasks).
memory overflow recovery,
That might be moot on typical modern architectures that have a
small CPU cache and a large RAM and a truly gigantic virtual
address space. Locality of reference to avoid thrashing is a
problem long before total virtual memory or disk space for swapping
is ever exceeded.
The vast majority of computers do not have virtual memory because they are
embedded devices.
In a serial GC, the program often (e.g. at every backward branch)
calls into the GC to have some collection done. Collections are
typically done in small pieces (incrementally) to facilitate
soft-real time applications but can only use a single core. For
example, OCaml has a serial GC so OCaml programs wishing to use
multiple cores fork multiple processes and communicate between them
using message passing which is two orders of magnitude slower than
necessary, largely because it incurs huge amounts of copying that
is not necessary on a shared memory machine:
On a shared-memory machine, how difficult is it to prevent
different processes from trying to read and write the same object
at the same time, whereupon the writer has only partly overwritten
the object when the reader starts to read it and sees pieces of old
and new object scrambled together?
Depends who you ask. The Java community don't seem to have a problem but the
Haskell community will tell you that it is impossible (they often tell me
that what I do is impossible).
With a concurrent GC, the garbage collector's threads run
concurrently with the program threads without globally suspending
all program threads during collection. This is scalable and can be
efficient but it is incredibly difficult to implement correctly.
The OCaml team spent a decade trying to implement a concurrent GC
and never managed to get it working, let alone efficient.
Maybe it's just too tricky (to implement correctly) to be worth
doing?
If that were true then the JVM and .NET would not be worth having but, in
reality, they are two of the most highly valued pieces of software ever
written.
Maybe the Lisp implementors were wise not to make that gamble?
The Lisp implementors had no choice: they cannot write a concurrent GC
because it is too hard for them. Same goes for all other open source FPLs as
well.
Maybe it's "not ready for prime time", i.e. until somebody
finds a way to get it right, it's not worth fretting over the fact
that it hasn't been done, it's better to just live without it?
The JVM and .NET are prime time.
- List several kind of applications and/or API tools that are
hampered by lack of whatever that means.
Any software that requires fine-grained parallelism for
performance will be hampered by the lack of a concurrent GC.
List five such applications, and convince me that any technique
other than fine-grained parallelism is too inefficient for each
such application.
Polygon tesselation, Fast Fourier transform, LU decomposition, suffix trees,
the distinct element method...
Also convince me in each case that circular
structure is absolutely necessary, because after all if there are
no pointer loops then reference counting works just fine and no GC
is needed at all. Convice me that a hybrid system, using a symbol
table of tops of structures, with no loops within any of the
structures except that symbols in the symbol table may be
mentionned within structures, would not be sufficient for each such
application.
In theory, you can code everything in assembler. In practice, it is not
feasible. The same applies here: high-level languages make all of those
parallel tasks much easier.
Development costs are astronomical in Lisp compared to modern2. Even though Lisp's forte is as a language laboratory, Lisp hasUm, some implementations of Common Lisp are **free** to download
so little to offer but costs so much
and then "use to your heart's content". How does that cost too much???
alternatives like F#, largely because it lacks a static type system
As I pointed out before, static type-checking doesn't solve enough
of the "problem" to be worth the trouble it causes.
I disagree.
Any language that imposes static typing is what slows down development.
I have found that modern statically typed languages let me develop
complicated applications much faster than any dynamic language.
decent developer tools like IDEs
I used to use Macintosh Allegro Common Lisp on my Mac Plus.
Wasn't that a IDE?
Compare it to Mathematica (which has the world's best IDE, IMHO).
There are no decent Lisp implementations for .NET.
Why does anybody care? What, if anything, is good about .NET?
Lots of wealthy .NET users leading to high profits is good.
Finally, there is no way you'll ever turn a profit because the
market for commercial third-party software for Lisp is too small.
Consider leased server-side applications (CGI or PHP). How would
the customer know or care whether Lisp or some other language
happens to be used to implement it?
In theory, they would not care. In practice, I am not familiar with any such
businesses.
Just about everyone on the net
uses the Google search engine. It's advertising based rather than
subscription, but ignore that detail for a moment. Do you know what
language(s) are used to implement all its internals?
Mostly C/C++.
Do you care?
Yes but as a programmer rather than a Googler.
I think you should aspire to earn money directly from customers
rather than asking people to give you money to develop your
software.
I like that idea very much. Will you find me a potential customer
who wants some serverside application that doesn't already exist?
You need to write a compelling sales page and make it indexable by search
engines and make it as easy as possible for users to give you their money
before plastering all of the news websites that you can find with
descriptions of your new service and links to it. Then you need to
repeatedly update it and report new news stories about it for many months
before you will start to see significant website hits. Only hits on sales
pages correlate with sales.
As a guide, our products make around 20p profit per page view.
--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
.
- Follow-Ups:
- Re: Program compression
- From: Robert Maas, http://tinyurl.com/uh3t
- Re: Program compression
- From: Alf P. Steinbach
- Re: Program compression
- References:
- Re: Program compression
- From: Robert Maas, http://tinyurl.com/uh3t
- Re: Program compression
- Prev by Date: Re: Which computer language program is best for undergrads?
- Next by Date: Re: Program compression
- Previous by thread: Re: Program compression
- Next by thread: Re: Program compression
- Index(es):
Relevant Pages
|