Re: How to make this unpickling/sorting demo faster?



On Apr 17, 7:33 pm, Steve Bergman <sbergma...@xxxxxxxxx> wrote:
to demonstrate how Python can combine simplicity, readability, *and*
speed when the standard library is leveraged properly. So, in this

But that's not true. You're just creating an artificial example to
prove your point.

Consider this C code that does what you say. The code is very basic
and readable. In my machine, it takes a little longer than one second
to execute.

#include <stdio.h>
#include <stdlib.h>

#define size 1000000

struct float_pair {
double a;
double b;
};
static struct float_pair s[size];

int compare(const void *a, const void *b)
{
const struct float_pair *sa;
const struct float_pair *sb;
sa = a;
sb = b;
if (sa->b > sb->b) return 1;
if (sa->b == sb->b) return 0;
return -1;
}

int main(void)
{
FILE *f;
f = fopen("floats", "rb");
puts("Reading pairs...");
fread(s, sizeof(s), 1, f);
fclose(f);
qsort(s, size, sizeof(struct float_pair), compare);
puts("Done!");
return 0;
}

Is the code too big ? Yes. Can you make it faster in python ?
Probably not in the 10 minutes required to write this. The code that
generates the files is the following:

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

struct float_pair {
double a;
double b;
};

static struct float_pair s[1000000];

int main(void)
{
FILE *f;
unsigned i;
f = fopen("floats", "wb");
srand(time(NULL));
for (i = 0; i < 1000000; i++) {
s[i].a = (double)rand()/RAND_MAX;
s[i].b = (double)rand()/RAND_MAX;
}
fwrite(s, sizeof(s), 1, f);
fclose(f);
return 0;
}

The binary file used is not portable, because the "double" type is
not standardized in C. However:
- if you restrict this code to run only on machines that follow the
IEEE standard, this code will probably be more portable than the
python one
- in python, there may be small rounding errors when exchanging the
data between machines, since "float" is not standardized either.


case, calling C or assembler code would defeat the purpose, as would
partitioning into smaller arrays. And my processor is single core.

I started at about 7 seconds to just read in and sort, and at 20
seconds to read, sort, and write. I can now read in, sort, and write
back out in almost exactly 5 seconds, thanks to marshal and your
suggestions.

I'll look into struct and ctypes. Although I know *of* them, I'm not
very familiar *with* them.

Thank you, again, for your help.

.



Relevant Pages

  • Re: strict aliasing rules in ISO C, someone understands them ?
    ... > I try to understand strict aliasing rules that are in the C Standard. ... > Let's have two struct having different tag names, ... > struct s1 {int i;}; ... Accessing a member of the foreign structure type through the pointer ...
    (comp.lang.c)
  • Re: Why is C Standard Code Example Invalid?
    ... JK> that a declaration of the complete type of the union is ... the safest bet is to take the standard at it's ... struct t1; ... "E.MOS", which would suggest that the implicit dereference occurs, but ...
    (comp.std.c)
  • Re: realloc() implicit free() ?
    ... block size as a request for one byte, and let the alignment mechanisms raise it as they will. ... is therefore suitably aligned for any type (`struct s' in particular), the value can be assigned to any type of pointer, and can then be used to access an object of that type. ... of the Standard can distort its meaning. ...
    (comp.lang.c)
  • Re: Memory layout in unions
    ... If I have a union with an anonymous struct, ... unsigned int ccount; ... Standard C does not allow anonymous struct members. ...
    (comp.lang.c)
  • Re: ctime double double check
    ... with converting the time. ... However I couldn't find anything in the standard that specifies ... how the struct tm - time_t conversions should be done which means ... storing date as a string of characters and it may look like in ...
    (comp.lang.c)