Re: In which cases/problems is Prolog faster than Java?



On May 23, 1:51 pm, Cameron Hughes <cahug...@xxxxxxxxxx> wrote:

...
I don't know  what level of programming and algorithm analysis skills Christina has, I'm just a  little
squeemish  about switching a language and paradigm just to  shave seconds off of a sort.
Carl Hoare's 1960's  quick sort  is starting to wear thin in the year 2008 (please don't take that statement wrong)
But we can and must do better.  At the risk of opening a classic 'can of worms'   Is there something
we can do with our new found parallelism (i.e.,  2,4, and 8 processor multicore chips) to improve the Qsort?
The overhead for parallelism can be prohibitive for small datasets, but once we get to sorting datasets that have a size
where we actually care about the performance of Java  qsort vs.  Prolog qsort   then  threading etc., starts to pay for itself.

So what about  parallel implementations  of qsort for Prolog?

Here is one in Logtalk (which you can run with SWI-Prolog, YAP, and
XSB as a back-end compiler when using Logtalk high-level multi-
threading features):


:- object(qsort(_Threads)).

:- info([
version is 1.21,
author is 'Paul Crocker and Paulo Moura',
date is 2007/12/27,
comment is 'Multi-threaded version of the quick sort algorithm.',
parameters is ['Threads'- 'Number of threads to use in sorting.
Valid values are 1, 2, 4, 8, 16, 32, etc.']]).

:- threaded.

:- public(qsort/2).
:- mode(qsort(+list, -list), one).
:- info(qsort/2, [
comment is 'Sorts a list of terms into ascending order.',
argnames is ['List', 'Sorted']]).

qsort(List, Sorted) :-
parameter(1, Threads),
Threads > 0,
qsort(List, [], Sorted, Threads).

qsort([], Sorted, Sorted, _).
qsort([Pivot| Rest], Acc, Sorted, Threads) :-
( Threads =:= 1 ->
quicksort([Pivot| Rest], Acc, Sorted)
; Threads2 is Threads//2,
partition(Rest, Pivot, Smaller0, Bigger0),
threaded((
qsort(Smaller0, [Pivot| Bigger], Sorted, Threads2),
qsort(Bigger0, Acc, Bigger, Threads2)
))
).

partition([], _, [], []).
partition([X| Xs], Pivot, Smalls, Bigs) :-
( X < Pivot ->
Smalls = [X| Rest],
partition(Xs, Pivot, Rest, Bigs)
; Bigs = [X| Rest],
partition(Xs, Pivot, Smalls, Rest)
).

quicksort([], Sorted, Sorted).
quicksort([Pivot| Rest], Acc, Sorted) :-
partition(Rest, Pivot, Smaller0, Bigger0),
quicksort(Smaller0, [Pivot| Bigger], Sorted),
quicksort(Bigger0, Acc, Bigger).

:- end_object.

The problem with speeding up Quicksort using multiple threads is that
the actual performance very much depends on the pivot chosen at each
iteration. I.e. if the pivot does not lead to sub-partitions roughly
the same size you get little more performance compared with a
sequential version. E.g. using SWI-Prolog as the back-end compiler on
a 8-core computer with *random* lists of floats gives (using 1, 2, 4,
8, and 16 threads):

Quicksort (average of 10 runs)
1 2 4 8 16
5000 0.047746 0.047059 0.035708 0.029006 0.032277
10000 0.111083 0.100361 0.076050 0.077118 0.083268
15000 0.167924 0.174797 0.109953 0.100317 0.091422
20000 0.226519 0.212724 0.139554 0.136808 0.134318
25000 0.328326 0.337592 0.339155 0.272412 0.251659
30000 0.396432 0.384252 0.363177 0.364878 0.259189
35000 0.420288 0.246681 0.206172 0.209500 0.182987
40000 0.490231 0.338510 0.249211 0.177443 0.186040
45000 0.588948 0.575981 0.417951 0.325616 0.268653
50000 0.682585 0.698429 0.587631 0.488556 0.374016

Actual speedup also depends on the performance of the Prolog compiler.
E.g. with YAP you will need bigger lists than above to hope to beat
the sequential version (you always need to account for the threads
overhead plus the overhead of all those temporary lists being built).
Other sorting algorithms such as merge sort allow better speedups when
using multiple cores. E.g.:

Merge sort (average of 10 runs)
1 2 4 8 16
5000 0.049974 0.037345 0.027060 0.026583 0.037963
10000 0.146841 0.081894 0.054119 0.048932 0.061557
15000 0.232520 0.128032 0.086671 0.076746 0.096394
20000 0.317598 0.176163 0.119017 0.102286 0.122714
25000 0.419169 0.230886 0.148339 0.129592 0.156422
30000 0.519234 0.281993 0.184366 0.157323 0.185261
35000 0.617644 0.335898 0.216543 0.183829 0.220019
40000 0.714777 0.390772 0.249395 0.214107 0.254999
45000 0.812895 0.447561 0.284706 0.243894 0.283623
50000 0.917376 0.517273 0.322977 0.275071 0.318263

Thus, hard to beat a highly optimized C implementation of a Prolog
sorting built-in predicate for most cases.

Cheers,

Paulo

.



Relevant Pages

  • Re: In which cases/problems is Prolog faster than Java?
    ... Carl Hoare's 1960's quick sort is starting to wear thin in the year 2008 ... So what about parallel implementations of qsort for Prolog? ... partition(Rest, Pivot, Smaller0, Bigger0), ... qsort(Bigger0, Acc, Bigger, Threads2) ...
    (comp.lang.prolog)
  • Re: sorting columns
    ... I wanted to make sure i was not missing some internal functions to help me ... Pivot tables are more like ... If you plan on creating your custom sort code, ... First, if you ever had a course on programming dealing with sorting, ...
    (microsoft.public.excel.programming)
  • Re: Sorting Grouped Dates in Pivot Table
    ... You could ungroup the dates, sort, then regroup them, and the order ... > Having problems sorting grouped dates in a pivot table report. ...
    (microsoft.public.excel)
  • Re: sorting columns
    ... Pivot tables are more like Cross ... If you plan on creating your custom sort code, you many want to think about ... First, if you ever had a course on programming dealing with sorting, there's ... sort objects within sort objects to create you different layers of sort ...
    (microsoft.public.excel.programming)
  • Re: Pivot Table - Grouped Dates - Format
    ... If you sort them later, ... I have a grouped pivot by date. ... the date doesn't allow sorting in the way that I need. ... the pivot and it doesn't do anything at all to how the format in the ...
    (microsoft.public.excel)