Re: Aquarius prolog so fast?

"rupertlssmith@xxxxxxxxxxxxxx" <rupertlssmith@xxxxxxxxxxxxxx> writes:
Thanks, thats just the explanation I was looking for. Any chance of a
simple example to illustrate the limitation of Mercury's variables,

This is the serialize program. It is about the smallest program you can get
to illustrate logic variables; any smaller example would be too contrived.

serial(Items, Ranks) :-
pairlists(Items, Ranks, ItemsRanks),
arrange(ItemsRanks, ItemsRanksTree),
numbered(ItemsRanksTree, 1, _N).

pairlists([], [], []).
pairlists([HeadItem | TailItems], [HeadRank | TailRanks],
[pair(HeadItem, HeadRank) | TailItemsRanks]) :-
pairlists(TailItems, TailRanks, TailItemsRanks).

arrange([], void).
arrange([Head | Tail], tree(TreeBelow, Head, TreeAbove)) :-
split(Tail, Head, TailBelow, TailAbove),
arrange(TailBelow, TreeBelow),
arrange(TailAbove, TreeAbove).

split([], _Pivot, [], []).
split([Head | Tail], Pivot, TailBelow, TailAbove) :-
( before(Head, Pivot) ->
split(Tail, Head, TailBelow1, TailAbove),
TailBelow = [Head | TailBelow1]
; before(Pivot, Head) ->
split(Tail, Head, TailBelow, TailAbove1),
TailAbove = [Head | TailAbove1]
split(Tail, Head, TailBelow, TailAbove)

before(pair(Item1, _Rank1), pair(Item2, _Rank2)) :-
Item1 < Item2.

numbered(void, N, N).
numbered(tree(Tree1, pair(_Item, N1), Tree2), N0, N) :-
numbered(Tree1, N0, N1),
N2 is N1 + 1,
numbered(Tree2, N2, N).

The job of this code is to compute the position of each element of the input
list in the sorted list of the elements. A query such as

serial([40,20,50,10], L)

yields L = [3,2,4,1], because 40 is the third number in the sorted order
[10,20,40,50], 20 is the second number, 50 is the fourth, and 10 is the first.

Pairlists creates an association list in which each item has an associated free
variable. ItemRanks will have a value of the form [40 - A, 20 - B, 50 - C,
10 - D], and pairlists also returns the variables [A, B, C, D] as Ranks.
The arrange predicate then turns the association list ItemRanks into a binary
search tree ItemsRanksTree, in which each node is a pair of an input item
and its associated variable. The numbered predicate then replaces each of
those variables with a number from 1 to N.

The key point here is that by binding the variables in the tree, numbered
also binds the OTHER occurrences of those same variables, which in this case
will be in both ItemsRanks and Ranks. ItemRanks will go from
[40 - A, 20 - B, 50 - C, 10 - D] to [40 - 3, 20 - 2, 50 - 4, 10 - 1],
and Ranks will go from [A, B, C, D] to [3, 2, 4, 1], which is the answer.

Mercury cannot do all this because Mercury cannot keep track of the fact
that a variable such as A occurs in all of ItemsTanksTrees, ItemsRanks
and Ranks, and that if any one of these variables are instantiated to
an integer, the others are also instantiated to that integer.

This is a deliberate tradeoff. The only reason why Prolog can do this
is because every access to the value of a variable starts by chasing
a chain of pointers. All these "copies" of A are on the same chain of pointers,
so updating the entity at the end of the chain effectively updates all of them.
This is nice when you need this capability, but

- you need this capability quite rarely,
- the loop that chases the pointer chain is a significant overhead, which
and finally
- code that depends on this capability is almost always quite hard for most
people to read and understand, much less debug. (The original version
of serialize used one-character variable names and used several cuts
instead of the if-then-else, which made things even worse.)

In the Mercury project, we prefer to trade off a bit of flexibility
for a significant increase in programmer productivity and program speed.
We fully recognize that others have other preferences.

One thing I have been wondering about Mercury, is that its compiler
uses the GCC back-end. Why not the alternative GCC back-end, LLVM?

While Mercury can compile to the internal data structures of the gcc backend,
that is not the usual compilation path. Mercury usually generates C source
code, which you can then compile with whatever C compiler you wish. (If that
C compiler supports gcc's extension to C, then we can generate faster code,
but that is a different question.)

As for why we can target the internals of gcc instead of LLVM, I believe
Fergus Henderson, who implemented the initial version of that capability
over a Christmas break, wanted to learn the internals of gcc, and thought
this was a suitable vehicle for that.

Compiling down to LLVM or to C-- have both long been on our list of things
to look at and maybe do, but given that we already have several working
backends, they have never been near the top of the pile. Basically, we
don't have the manpower to work on them, unless a student decides that
that is what they want to do for their research project.

Zoltan Somogyi <zs@xxxxxxxxxxx>
Department of Computer Science and Software Engineering, Univ. of Melbourne