Re: Java connected Lisp

On 2007-08-08, Scott Burson <FSet.SLB@xxxxxxxxx> wrote:
> > If anyone is interested in picking this up, they can have the source
> > code. But if I recall correctly, there is at least one other project
> > that does a similar thing and got further than mine.
>
> If it exists, I don't know how to find it.

That is because I never released it.

> I'm not sure I'll actually have time to work on this anytime soon, but
> it would be interesting to see what you've done. Is the code
> someplace I can download it?

Beware, this is source code that I never thought I would release before
rewriting it completely.

Although the VM itself is tiny (less than 9 kLOC), you will need matching
versions of GNU Classpath and SBCL, so grab the monster tarball
including full source code and binaries for Classpath, SBCL, binaries of
Eclipse (just for fun, try any other application that you like instead),
and enough precompiled heapfiles to make Eclipse start up in minutes
rather than hours. Linux/x86 is required (because the JNI
implementation is not 64 bit clean or something):

- Run ./ in that directory to start.
- Wait and watch Eclipse start, then click somewhere and watch it
crash. :-)
- That's if it works. Reports on #lisp indicate that a full startup
happens less often than random crashes during startup.

Change test.lisp to run something other than eclipse. For example,
(cloak::test) runs Hello World. (At least that starts up reliably. :-))

Keep in mind that CLOAK is really unfinished. It does not need users,
it needs hackers, and in particular people willing to rewrite large
parts of it. Here is a list of things that might be fun to do:

* update to current SBCL

* update to current Classpath

* (tedious but not hard) Look for race conditions and other bugs

Eclipse usually starts (but not always, so there are probably
threading-related bugs). Actually using it makes it crash soon afterwards.

(Of course, Eclipse is a test case chosen for coolness, not
practicality. Pick any other application you like instead.)

Fix these problems as you find them.

* (lots of work) Switch from GNU Classpath to OpenJDK/IcedTea

Ultimately having OpenJDK support is the right thing to do.

* Validation (Java 2-style)

Validation sounds hard if you believe the JVM spec (so ignore it
completely). Read Alessandro Coglio instead:

* Validation (Java 6-style)


* Java <-> Lisp Interface

Invent a nice (!) and fast interface for calls from Lisp to Java and back.
Bonus points if you implement the same interface on ABCL.

* JNI/libsbcl

Make SBCL loadable as a shared library in order to implement the missing
JNI functions that create a VM in a previously existing process.

Also implement the AttachCurrentThread-style functions that allow JNI code
to turn non-Java pthreads into Java threads.

* Port to other platforms

Linux/x86-only right now. Linux/amd64 should be easy.

(Other threading-enabled Unix on x86 might just work.)

* (relatively easy) Get rid of spinlocking on monitor contention.

Currently, cloak uses Bacon's algorithm, which busy waits in some

Implement at least Onodera's algorithm instead:

* Fix stack trace disaster

There is an extremely terrible mechanism to translate PCs from
backtraces to their methods, which involves a huge table for all
methods and a GC hook.

Instead, implement Juho Snellman's suggestion of an SBCL extension
that would provide a memory array for long-term pinning of objects,
allowing us to pin all methods referenced from stack traces until user
code asks for them to be resolved. (Keep a table of weak references
to exceptions.)

* (hard to get right) Implement OutOfMemoryError

If you actually want the Lisp and the VM to recover from rigorous
testing of out-of-memory situations, this is tricky.

* (easy) weak references

Weak references are not implemented yet (IIRC).

The trick could be to hack the compiler so that it recognizes accesses
to the slot of a weak reference and hides a Lisp weak reference
between the slot and its contents.

* finalization

- Find my patch for Java-style finalization in SBCL and use it to
implement finalize().
- Read Bruno Haible's talk to find out what's wrong with my patch.
- Make sure not to finalize objects with a trivial finalize() method.
- Implement PhantomReference and SoftReference.
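
To illustrate the "trivial finalize()" point above, here is a minimal sketch in plain Java rather than CLOAK code. FinalizerCheck, PlainThing, NeedsCleanup and InheritsCleanup are made-up names. The sketch only treats the finalize() inherited from java.lang.Object as trivial. A VM that has the bytecode at hand could additionally treat an override whose body is a single return instruction as trivial, which reflection cannot see.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of CLOAK: decides at class-load time
// whether instances need to be registered for finalization at all.
final class FinalizerCheck {
    private FinalizerCheck() {}

    // True if some class from c up to (but excluding) java.lang.Object
    // declares its own zero-argument finalize() method.
    static boolean overridesFinalize(Class<?> c) {
        for (Class<?> k = c; k != null && k != Object.class; k = k.getSuperclass()) {
            try {
                Method m = k.getDeclaredMethod("finalize");
                if (m.getParameterCount() == 0) {
                    return true;
                }
            } catch (NoSuchMethodException e) {
                // Not declared here; keep walking up the hierarchy.
            }
        }
        return false;
    }
}

// Example classes, made up for this sketch.
class PlainThing {}

@SuppressWarnings({"deprecation", "removal"})
class NeedsCleanup {
    @Override
    protected void finalize() {
        // Stands in for real cleanup work.
    }
}

class InheritsCleanup extends NeedsCleanup {}
```

With a check like this, allocation of PlainThing instances can skip finalizer registration entirely, while NeedsCleanup and InheritsCleanup instances would still be registered.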

* Interpreter

A bytecode interpreter would be nice to avoid having to run all
classes through the SBCL compiler.

While a simple and slow interpreter could be written in an afternoon,
we would want an interpreter that is much more optimized.

Have a look at how a simple interpreter like JamVM does it, then try to
figure out how to do the same in Lisp. (It is probably harder in Lisp
than in C because you will have endless fun with type declarations and
compiler notes.)

(Alternatively, make the interpreter call specific VOPs instead of
fighting with Lisp types, which might be easy if the compiler has
already been rewritten to do so; see below.)

* Rewrite the "compiler"

The translation from bytecode to Lisp source code is extremely naive.

- (easy) Introduce a BASIC-BLOCK class.

- Compile into IR1 or IR2 instead of going through source code.
o Control exactly which VOPs are used.
o Make compilation fast somehow.
(This amounts to a complete rewrite, but as everyone knows, it is
way easier to rewrite code than write it from scratch, so it might
be easy enough if you know SBCL.)

- (easy) Review binary compatibility (I): Classes are compiled into Lisp
functions and dumped to disk. Right now, we try to generate code
that is independent of other classes, so that each class can be
loaded back from disk even if other classes have changed [in a
binary-compatible way].

- (easy) Review binary compatibility (II): If classes have changed in a way
that breaks binary compatibility, we are required to throw certain
exceptions. Make sure that we actually do that.

- (not too hard and probably fun) Break binary compatibility for
optimization. The generated code can hardly do any optimizations at
all without taking into account the class hierarchy.
Add a mode that optimizes all classes (or perhaps certain jar files)

o All method invocations currently go through a double-dispatch
scheme. In optimized mode, the vtable index could be computed
at compilation time.

o Ditto for slots.

o Inlining of methods!
(Careful, Java needs accurate stack traces, so the effects of
inlining must be recorded exactly so that instructions can be
mapped back to the method they came from at run time.)

o Eliminate unneeded class casts based on type information
resulting from inlining.

o Eliminate unneeded class initialization checks (for example,
superclasses will already have been initialized when the current
class runs.)

o The big catch: Figure out how to make this scheme safe.
Heapfiles might include cryptographic check sums of all classes
they depend on.

But (hard): Make sure to either keep incremental class loading
working or make our code much more compact so that it's actually
feasible to map entire pre-compiled applications into memory in one

- Either: Propagate information about whether variables can be `null' and
eliminate redundant null checks accordingly. (Ditto for class

- Alternatively, restore the logic that lets null pointer accesses
just happen, and signal the NullPointerException in the segfault
handler. There is a *feature* that might even still work if you set
it. It involves lying to the compiler though and makes things
harder to debug. Worse, it didn't make anything faster back then.

- Hack SBCL to allow untagged arguments across full function calls.
We do lots of untagged operations, resulting in significant overhead
when the numbers involved are passed to other functions. (And
Java's use of 32-bit and 64-bit arithmetic really makes it
completely unattractive to do the calculations with tagged numbers).
This change could be in the form of a "Java calling convention" that
ordinary Lisp function calls don't use.

- Rewrite parser.lisp, which is currently rather slow and conses too much.
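
For the BASIC-BLOCK item in the list above, the usual starting point is the textbook "leader" computation. The following is a rough sketch in Java, not CLOAK's compiler. The instruction model is a simplifying assumption: one entry per instruction, with a single optional branch target and a flag for instructions such as return or athrow that end control flow. Real bytecode would also need switch targets and exception handler entry points as leaders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a method body into basic blocks.
final class BasicBlockSketch {
    static final int NONE = -1; // marks "this instruction does not branch"

    private BasicBlockSketch() {}

    // branchTarget[i]: index of the instruction that i may jump to, or NONE.
    // endsFlow[i]: true if i never falls through (return, athrow, ...).
    // Result: half-open ranges {start, end} in instruction order.
    static int[][] split(int[] branchTarget, boolean[] endsFlow) {
        int n = branchTarget.length;
        boolean[] leader = new boolean[n];
        if (n > 0) {
            leader[0] = true; // the first instruction starts a block
        }
        for (int i = 0; i < n; i++) {
            boolean branches = branchTarget[i] != NONE;
            if (branches) {
                leader[branchTarget[i]] = true; // a jump target starts a block
            }
            if ((branches || endsFlow[i]) && i + 1 < n) {
                leader[i + 1] = true; // so does whatever follows a jump or return
            }
        }
        List<int[]> blocks = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= n; i++) {
            if (i == n || leader[i]) {
                blocks.add(new int[] {start, i});
                start = i;
            }
        }
        return blocks.toArray(new int[0][]);
    }
}
```

For a five-instruction loop of the shape "load; ifeq 4; iinc; goto 0; return", that is branch targets {NONE, 4, NONE, 0, NONE} with only the last instruction ending flow, this yields the blocks [0,2), [2,4) and [4,5).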

* (hard) Generate much smaller native code.

* (easy) Store the precompiled .heap files in a zip file to save disk space.

Requires loading them into dynamic space using memcpy() instead of mmap().
Should save lots of space.
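
The reason for the copy is that a compressed zip entry has to be inflated into a buffer before it can be used, so it cannot simply be mapped. Here is a small sketch of that read path in Java (not CLOAK code, which would do the equivalent from Lisp or C). HeapZipSketch and the entry name "java.lang.Object.heap" are made up for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: keep precompiled images as entries of one zip file.
final class HeapZipSketch {
    private HeapZipSketch() {}

    // Writes a zip file containing a single (deflated) entry.
    static void writeEntry(Path zip, String name, byte[] data) throws IOException {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry(name));
            out.write(data);
            out.closeEntry();
        }
    }

    // Inflates one entry into a freshly allocated buffer. This is the
    // "copy" step that replaces mapping the file directly.
    static byte[] readEntry(Path zip, String name) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            ZipEntry entry = zf.getEntry(name);
            if (entry == null) {
                throw new IOException("no such entry: " + name);
            }
            try (InputStream in = zf.getInputStream(entry)) {
                return in.readAllBytes();
            }
        }
    }

    // Round trip through a temporary file. Returns true if the bytes
    // survive and the zip is smaller than the raw image.
    static boolean roundTrip() {
        try {
            Path zip = Files.createTempFile("heaps", ".zip");
            try {
                byte[] image = new byte[64 * 1024]; // mostly zeros, compresses well
                image[0] = 42;
                writeEntry(zip, "java.lang.Object.heap", image);
                byte[] loaded = readEntry(zip, "java.lang.Object.heap");
                return Arrays.equals(image, loaded) && Files.size(zip) < image.length;
            } finally {
                Files.deleteIfExists(zip);
            }
        } catch (IOException e) {
            return false;
        }
    }
}
```

How much space this saves in practice depends on how compressible the real heap files are; the all-zero test image here is a best case.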