Re: Common Lisp implementations are still multiple times slower than C



Frank Buss <fb@xxxxxxxxxxxxx> writes:

I've tested my cellular automaton from
http://www.frank-buss.de/automaton/arabeske.html again on my current
computer with C and some Common Lisp implementations on Windows:

...

I'm interested in the timings for other Lisp implementations. You can use
the C version from http://www.frank-buss.de/tmp/arabeske.zip for a time
reference (or use the CLISP version).

In another thread in this newsgroup some people wrote that Common Lisp can
be as fast as C. This might be possible, but looks like no current
implementation reachs this goal. And I know, it's unfair to compare C and
Lisp with such low-level tasks, because Lisp is better for high-level
tasks, but would be nice to do fast image processing like in
http://www.frank-buss.de/lisp/texture.html in Lisp, only.

If you don't mind compiling your code with slightly different
semantics, then you don't need to do much to have more or less the
same performance as C _without_ messing with lots of type
declarations.

What I usually do is to develop in Common Lisp. If I need extreme
performance, I just add a few type declarations and feed the program
to the Linj compiler that outputs Java source code. Then, I use a
Java compiler to create an efficient version.

The interesting thing is that I only need the include as many type
declarations as the Linj compiler requires me to do. It will infer
all the others according to Java semantics.

So, just for comparison, I timed you original Common Lisp code (in
Allegro 7.0) and your hand-written C and Java code in my laptop and
got the following results:

Allegro CL 7.0 Linux: 117.1 s
GCC 4.0.3 Debian: 11.2 s
Java 1.4.2 Linux: 11.2 s

So, without any special optimization, the Java code runs on par with
the C code and both much faster than the Common Lisp code.

Now, let's change your example so that it becomes easier to translate
into Java. First, here is your code _without_ type declarations:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun generations (count file)
"a generations CA (see http://www.mirekw.com/ca/rullex_gene.html )"
(let* ((width 640)
(height 480)
(ca-width (+ width 2))
(ca-height (+ height 2))
(len (* ca-width ca-height))
(ca1 (make-array len :initial-element 0))
(ca2 (make-array len :initial-element 0)))

;; seed the cellular automaton space
(setf (aref ca1 (- (floor len 2) ca-width 4)) 1)

;; calcuclate the generation "count"
(loop for c from 1 to count do
(loop for y from 0 below height
with i = (1+ ca-width) do
(loop for x from 0 below width do
(let ((cell (aref ca1 i)))
(if (zerop cell)
;; summing the 8 neighbours
(let ((sum (+ (aref ca1 (1+ i))
(aref ca1 (1- i))
(aref ca1 (+ i ca-width -1))
(aref ca1 (+ i ca-width))
(aref ca1 (+ i ca-width 1))
(aref ca1 (- i ca-width -1))
(aref ca1 (- i ca-width))
(aref ca1 (- i ca-width 1)))))
(when (= sum 1) (setf cell 1)))
(if (= cell 62)
(setf cell 0)
(incf cell)))
(setf (aref ca2 i) cell)
(incf i)))
(incf i 2))
(rotatef ca1 ca2))

;; save it
(with-open-file (stream file :direction :output
:element-type '(unsigned-byte 8)
:if-exists :supersede
:if-does-not-exist :create)
(loop for y from 0 below ca-height
with i = 0 do
(loop for x from 0 below ca-width do
(let ((cell (aref ca1 i)))
(write-byte cell stream))
(incf i))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Using the previous code, the timings get even worse (as expected). I
get 133.1 seconds in Allegro.

Now, let's change the function name and include a type declaration for
its parameters. These declarations aren't strictly needed but,
without them, Linj will include extra libraries (to deal with bignums,
symbols and such) that we don't need for this example. Also, we need
to be a bit more specific about the array declaration. You will also
notice that I had to change the with-open-file macro call because Java
IO is somewhat different and I replaced the rotatef macro by a psetf
because I didn't bother to include rotatef in Linj. Anyway, here is
the result. I hope you will not find many differences.

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun main (count file)
"a generations CA (see http://www.mirekw.com/ca/rullex_gene.html )"
(declare (int count) (string file)) ;;changed
(let* ((width 640)
(height 480)
(ca-width (+ width 2))
(ca-height (+ height 2))
(len (* ca-width ca-height))
(ca1 (make-array (list len) :element-type 'byte)) ;;changed
(ca2 (make-array (list len) :element-type 'byte))) ;;changed

;; seed the cellular automaton space
(setf (aref ca1 (- (floor len 2) ca-width 4)) 1)

;; calcuclate the generation "count"
(loop for c from 1 to count do
(loop for y from 0 below height
with i = (1+ ca-width) do
(loop for x from 0 below width do
(let ((cell (aref ca1 i)))
(if (zerop cell)
;; summing the 8 neighbours
(let ((sum (+ (aref ca1 (1+ i))
(aref ca1 (1- i))
(aref ca1 (+ i ca-width -1))
(aref ca1 (+ i ca-width))
(aref ca1 (+ i ca-width 1))
(aref ca1 (- i ca-width -1))
(aref ca1 (- i ca-width))
(aref ca1 (- i ca-width 1)))))
(when (= sum 1) (setf cell 1)))
(if (= cell 62)
(setf cell 0)
(incf cell)))
(setf (aref ca2 i) cell)
(incf i)))
(incf i 2))
(psetf ca1 ca2 ;;changed
ca2 ca1))

;; save it
(with-open-file (stream file :direction :output ;;changed
:stream-class file-output-stream)
(loop for y from 0 below ca-height
with i = 0 do
(loop for x from 0 below ca-width do
(let ((cell (aref ca1 i)))
(write-byte cell stream))
(incf i))))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

Now, you can translate this code into Java using the Linj compiler.
The result is:

//////////////////////////////////////////////////////////////////////
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintWriter;

public class Generator extends Object {

// methods

// a generations CA (see http://www.mirekw.com/ca/rullex_gene.html )
public static void main(String[] outsideArgs) throws FileNotFoundException {
int count = Integer.parseInt(outsideArgs[0]);
String file = outsideArgs[1];
int width = 640;
int height = 480;
int caWidth = width + 2;
int caHeight = height + 2;
int len = caWidth * caHeight;
byte[] ca1 = new byte[len];
byte[] ca2 = new byte[len];
ca1[(((int)Math.floor(len / 2)) - caWidth) - 4] = 1;
for (int c = 1; c <= count; ++c) {
{
int y = 0;
for (int i = caWidth + 1; y < height; ++y) {
for (int x = 0; x < width; ++x) {
byte cell = ca1[i];
if (cell == 0) {
int sum =
ca1[i + 1] +
ca1[i - 1] +
ca1[i + caWidth + -1] +
ca1[i + caWidth] +
ca1[i + caWidth + 1] +
ca1[(i - caWidth) - -1] +
ca1[i - caWidth] +
ca1[(i - caWidth) - 1];
if (sum == 1) {
cell = 1;
}
} else if (cell == 62) {
cell = 0;
} else {
++cell;
}
ca2[i] = cell;
++i;
}
i += 2;
}
}
byte[] ca10 = ca1;
ca1 = ca2;
ca2 = ca10;
}
PrintWriter stream = new PrintWriter(new FileOutputStream(file));
try {
int y = 0;
for (int i = 0; y < caHeight; ++y) {
for (int x = 0; x < caWidth; ++x) {
byte cell = ca1[i];
stream.write(cell);
++i;
}
}
} finally {
if (stream != null) {
stream.close();
}
}
}
}
//////////////////////////////////////////////////////////////////////

You might find it interesting to compare the translated code with your
hand-written code.

Now, the good news is that this code executes the same example in 11.7
seconds which is 10 times faster than your original Common Lisp
version that included type declarations and is similar to your
hand-written Java a C examples. Note that I didn't include type
declarations except for the function parameters.

Here is the summary:

Your versions
Allegro CL 7.0 Linux: 117.1 s
GCC 4.0.3 Debian: 11.2 s
Java 1.4.2 Linux: 11.2 s

Linj version
Java 1.4.2 Linux: 11.7 s

Obviously, now that the code is targeting Java, we might take
advantage of Java libraries and we can do the file printing as you
did. Here is the change:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun main (count file)
"a generations CA (see http://www.mirekw.com/ca/rullex_gene.html )"

...

;; save it
(with-open-file (stream file :direction :output
:stream-class file-output-stream
:decorator-classes nil)
(write stream ca1))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

The translation (for the changed fragment) is:

//////////////////////////////////////////////////////////////////////
import java.io.FileOutputStream;
import java.io.IOException;

public class Generator extends Object {

// methods

// a generations CA (see http://www.mirekw.com/ca/rullex_gene.html )
public static void main(String[] outsideArgs) throws IOException {

...

FileOutputStream stream = new FileOutputStream(file);
try {
stream.write(ca1);
} finally {
if (stream != null) {
stream.close();
}
}
}
}
//////////////////////////////////////////////////////////////////////

Notice that using with-open-file in Linj gives you the correct
treatment to IO errors (that you forgot in the hand-written Java
example).

António Leitão.
.



Relevant Pages

  • Re: Is anything easier to do in java than in lisp?
    ... > (Even if Lisp had fantastic libraries it would take some work, ... Java has this. ... Common Lisp is in an "interesting" position at this moment. ... involve other technologies and/or other "standards". ...
    (comp.lang.lisp)
  • Re: Lex, Yacc, and Lisp...
    ... Common Lisp specification. ... I have been involved both in implementing languages for Java and for Common Lisp. ... If you take a look around in the Java world what language extensions people are working on, and risk to take a look behind the scenes, you can tell that this is mostly the same everywhere. ...
    (comp.lang.lisp)
  • Re: SBCL is now faster than Java, as fast as Ocaml, and getting better
    ... times faster than Java, and is finally as fast as Ocaml. ... For purposes of propaganda, i.e. advocating Common Lisp, this is ... exactly the same ratio for each language. ... programming language in the Benchmarks Game?" ...
    (comp.lang.lisp)
  • Re: Lisp vs. Java vs. C++ speed comparison time? [LONG]
    ... I also tried other Common Lisp compilers. ... >> look carefully at the Java code that was generated by the Linj ... but is it not true that the Boyer benchmark involves only ... important, non built-in, types, namely linj.Cons and linj.Symbol. ...
    (comp.lang.lisp)
  • Re: do expression
    ... assignments be executed during execution of the do loop ... do (funcall g th2) ... the Common Lisp version above is incorrect. ... What needs to be done in a translation of binding semantics to assignment semantics is that new local bindings are created manually in each iteration, so that they can persist, and that all references go to those local bindings. ...
    (comp.lang.scheme)