Re: Aquarius prolog so fast?

"rupertlssmith@xxxxxxxxxxxxxx" <rupertlssmith@xxxxxxxxxxxxxx> writes:
Thanks, thats just the explanation I was looking for. Any chance of a
simple example to illustrate the limitation of Mercury's variables,

This is the serialize program. It is about the smallest program you can get
to illustrate logic variables; any smaller example would be too contrived.

serial(Items, Ranks) :-
pairlists(Items, Ranks, ItemsRanks),
arrange(ItemsRanks, ItemsRanksTree),
numbered(ItemsRanksTree, 1, _N).

pairlists([], [], []).
pairlists([HeadItem | TailItems], [HeadRank | TailRanks],
[pair(HeadItem, HeadRank) | TailItemsRanks]) :-
pairlists(TailItems, TailRanks, TailItemsRanks).

arrange([], void).
arrange([Head | Tail], tree(TreeBelow, Head, TreeAbove)) :-
split(Tail, Head, TailBelow, TailAbove),
arrange(TailBelow, TreeBelow),
arrange(TailAbove, TreeAbove).

split([], _Pivot, [], []).
split([Head | Tail], Pivot, TailBelow, TailAbove) :-
( before(Head, Pivot) ->
split(Tail, Head, TailBelow1, TailAbove),
TailBelow = [Head | TailBelow1]
; before(Pivot, Head) ->
split(Tail, Head, TailBelow, TailAbove1),
TailAbove = [Head | TailAbove1]
split(Tail, Head, TailBelow, TailAbove)

before(pair(Item1, _Rank1), pair(Item2, _Rank2)) :-
Item1 < Item2.

numbered(void, N, N).
numbered(tree(Tree1, pair(_Item, N1), Tree2), N0, N) :-
numbered(Tree1, N0, N1),
N2 is N1 + 1,
numbered(Tree2, N2, N).

The job of this code is to compute the position of each element of the input
list in the sorted list of the elements. A query such as

serial([40,20,50,10], L)

yields L = [3,2,4,1], because 40 is the third number in the sorted order
[10,20,40,50], 20 is the second number, 50 is the fourth, and 10 is the first.

Pairlists creates an association list in which each item has an associated free
variable. ItemRanks will have a value of the form [40 - A, 20 - B, 50 - C,
10 - D], and pairlists also returns the variables [A, B, C, D] as Ranks.
The arrange predicate then turns the association list ItemRanks into a binary
search tree ItemsRanksTree, in which each node is a pair of an input item
and its associated variable. The numbered predicate then replaces each of
those variables with a number from 1 to N.

The key point here is that by binding the variables in the tree, numbered
also binds the OTHER occurrences of those same variables, which in this case
will be in both ItemsRanks and Ranks. ItemRanks will go from
[40 - A, 20 - B, 50 - C, 10 - D] to [40 - 3, 20 - 2, 50 - 4, 10 - 1],
and Ranks will go from [A, B, C, D] to [3, 2, 4, 1], which is the answer.

Mercury cannot do all this because Mercury cannot keep track of the fact
that a variable such as A occurs in all of ItemsTanksTrees, ItemsRanks
and Ranks, and that if any one of these variables are instantiated to
an integer, the others are also instantiated to that integer.

This is a deliberate tradeoff. The only reason why Prolog can do this
is because every access to the value of a variable starts by chasing
a chain of pointers. All these "copies" of A are on the same chain of pointers,
so updating the entity at the end of the chain effectively updates all of them.
This is nice when you need this capability, but

- you need this capability quite rarely,
- the loop that chases the pointer chain is a significant overhead, which
and finally
- code that depends on this capability is almost always quite hard for most
people to read and understand, much less debug. (The original version
of serialize used one-character variable names and used several cuts
instead of the if-then-else, which made things even worse.)

In the Mercury project, we prefer to trade off a bit of flexibility
for a significant increase in programmer productivity and program speed.
We fully recognize that others have other preferences.

One thing I have been wondering about Mercury, is that its compiler
uses the GCC back-end. Why not the alternative GCC back-end, LLVM?

While Mercury can compile to the internal data structures of the gcc backend,
that is not the usual compilation path. Mercury usually generates C source
code, which you can then compile with whatever C compiler you wish. (If that
C compiler supports gcc's extension to C, then we can generate faster code,
but that is a different question.)

As for why we can target the internals of gcc instead of LLVM, I believe
Fergus Henderson, who implemented the initial version of that capability
over a Christmas break, wanted to learn the internals of gcc, and thought
this was a suitable vehicle for that.

Compiling down to LLVM or to C-- have both long been on our list of things
to look at and maybe do, but given that we already have several working
backends, they have never been near the top of the pile. Basically, we
don't have the manpower to work on them, unless a student decides that
that is what they want to do for their research project.

Zoltan Somogyi <zs@xxxxxxxxxxx>
Department of Computer Science and Software Engineering, Univ. of Melbourne

Relevant Pages

  • Re: How do I install this missing library?
    ... you really should be carefully following LSTC's installation ... libg2c is part of gcc. ... to tell a newcomer to compile. ... like the source code to GNU tar, and make sure you understand what's ...
  • Re: HPGCC Questions ladies and gentlemen!!!
    ... No matter how you slice it in order to compile a C program you need to know ... it took a few hours just to get gcc running in my computer ... of the students that used an ide in the c++ class I took a few years ago. ... so why not use a data inspector if it's available? ...
  • Re: Just starting out in C
    ... was based on the ANSI standard as it was being developed. ... Some words about gcc: ... gcc provides some extensions to the language that aren't supported by ... For example, to compile the Hello, World program, use the command line ...
  • Re: Question about gcc on OS X Tiger
    ... running Linux and gcc 4.0.2. ... How do I switch to a different version of gcc to compile with on OS X ... fgetsis a libc function and is not really part of the compiler. ... int main(int argc, char* argv){ ...
  • Re: Those who dont know Lisp are doomed to underestimate it [was Re: Python from Wise Guys Viewpoint
    ... > You mean my Mercury version of the unrealistic benchmark where all the ... > If you compile with the appropriate optimization options, the Mercury ... >>You cannot assume that the sum will be smaller than any finite sized ...