Breakthrough



Hi

In a recent discussion ("Open letter to Ian Collins") we found out that the main problem in the C containers library was the time taken by the Sort function.

The reasons for that is that it was a general sort function calling a general comparison function that uses memcmp...

The first step to improve performance was to write data type specific functions using a data type generic file. As a bonus, we get compile time checking and better syntax.

Still, the generic sort function was taking a lot of time.

I have written then a data type generic quick sort function that
receives as a parameter a macro that is used to compare two elements.
This vastly improves performance.

The times are:

Generic sort function without any specialized heap
(using malloc)
real 0m17.029s
user 0m16.654s
sys 0m0.368s

Generic sort function using a specialized heap
(reduces the malloc overhead)
real 0m11.759s
user 0m11.500s
sys 0m0.252s

"Templated" sort function using a macro parameter
(reduces comparisons to a simple expression)
real 0m6.453s
user 0m6.165s
sys 0m0.281s

The expression used to compare two list elements is:
#define COMPARE_EXPRESSION(A, B) \
((*B)->Data > (*A)->Data ? -1 : (*B)->Data != (*A)->Data)

I thank Pete for that proposal, I wouldn't have come to it
alone :-)

I would like to have the corresponding C++ times but I fear that
if I write it it would be unfair since I am not a C++ wizard...
Here is the C code for the example:

#include <stdlib.h>
#include "containers.h"
#include "intlist.h"
#define N 10000000
int main(void)
{
intList *L1;
size_t i;
int d;
long long sum=0,newSum=0;
Iterator *it;
int *pint;

L1 = iintList.Create(sizeof(int));
iintList.UseHeap(L1,NULL); // Use specialized Heap
// Add N random numbers to the integer list
for (i=0; i<N;i++) {
d = rand();
sum += d;
iintList.Add(L1,d);
}
// Sort it
iintList.Sort(L1);
// Go through the sorted list using an iterator
it = iintList.NewIterator(L1);
i=0;
for (pint = it->GetFirst(it); pint; pint = it->GetNext(it)) {
newSum += *pint;
}
// Verify that both sums are identical
if (sum != newSum)
printf("Failed\n");
else printf("Sum = %lld\n",sum);
// Destroy the list
iintList.Finalize(L1);
}

I have uploaded the latest code to:

http://code.google.com/p/ccl/

There you will find (in listgen.c) the "templated" version of quick sort in C.

.



Relevant Pages

  • Re: Breakthrough
    ... In a recent discussion ("Open letter to Ian Collins") we found out that ... The reasons for that is that it was a general sort function calling a ... the generic sort function was taking a lot of time. ... receives as a parameter a macro that is used to compare two elements. ...
    (comp.lang.c)
  • Re: sort on more than one key?
    ... )> It would seem that your sort function would be the thing that handles this ... Assuming we're talking about the libc qsort() function. ... you get to give it your own compare function. ... Prev by Date: ...
    (comp.programming)
  • Sort vector with out using sort function
    ... How can I sort a vector with out using the sort function. ... Also how can I take the average of the vector entries with out the sum or mean built in functions ...
    (comp.soft-sys.matlab)