Re: Python(2.5) reads an input file FASTER than pure C(Mingw)



On Apr 27, 4:54 pm, n00m <n...@xxxxxxxx> wrote:
Another PC, another OS (Linux) and another C++ compiler (g++ 4.0.0-8)

Compare my 2 latest submissions: http://www.spoj.pl/status/SBANK,zzz/

times: 1.32s (C++) and 0.60s (Python)

Submitted codes:

import sys
z=sys.stdin.readlines()
print z[5]

#include <cstdio>
#include <cstdlib>
#include <vector>
#include <string>

using namespace std;

vector<string> vs;

int main() {
    while (true) {
        char line[50];
        if (!fgets(line,50,stdin)) break;
        vs.push_back(line);
    }
    return 0;
}

If it proves nothing then white is black and good is evil

It seems that the "push_back" line takes most of the time in that
code. Remove it and the execution time drops to 0.25s.

Python's readlines uses fread instead of fgets:
http://svn.python.org/view/python/tags/r251/Objects/fileobject.c?rev=54864&view=markup
(see the file_readlines function)

If you write code that just does an fread loop, the execution time
drops to 0.01s.
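
For illustration, a minimal sketch of what such an fread-only loop might look like. The function name count_lines and the CHUNK size are hypothetical, not taken from the submitted code, and no timing is claimed for this version. It reads fixed-size chunks and counts newlines without storing any line:

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 8192  /* hypothetical chunk size, not a tuned value */

/* Read fp in fixed-size chunks and count the newlines seen.
   Nothing is stored, so this exercises raw input only. */
long count_lines(FILE *fp) {
    static char buf[CHUNK];
    long lines = 0;
    size_t n;

    while ((n = fread(buf, 1, CHUNK, fp)) > 0) {
        const char *p = buf;
        const char *limit = buf + n;   /* end of the bytes actually read */
        while ((p = memchr(p, '\n', (size_t)(limit - p))) != NULL) {
            lines++;
            p++;                       /* continue after this newline */
        }
    }
    return lines;
}
```

Calling count_lines(stdin) from main and printing the result is enough to try it on the same input file.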

This C code takes 0.25s. Almost all of the time is spent on string
manipulation.

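#include <stdio.h>
#include <string.h>

#define B 8192          /* read buffer size */
#define MAXLINES 100000
#define MAXLEN 40       /* stored line length, including the '\0' */

char vs[MAXLINES][MAXLEN];
char buffer[B];

int main(void) {
    size_t count;       /* bytes returned by the last fread */
    size_t left = 0;    /* bytes of an unfinished line carried over */
    size_t len;
    char *begin, *end, *limit;
    int i = 0;

    while (1) {
        /* Read after any partial line kept from the previous chunk. */
        count = fread(buffer + left, 1, B - left, stdin);
        if (count == 0) break;
        begin = buffer;
        limit = buffer + left + count;   /* end of valid data */
        while (1) {
            /* Search only the bytes actually read, not the whole buffer. */
            end = (char *)memchr(begin, '\n', limit - begin);
            if (end == NULL) {
                /* Incomplete line: move it to the front of the buffer. */
                left = limit - begin;
                memmove(buffer, begin, left);
                break;
            }
            len = end - begin;
            if (len > MAXLEN - 1) len = MAXLEN - 1;   /* truncate long lines */
            memcpy(vs[i], begin, len);
            vs[i][len] = '\0';
            i = (i + 1) % MAXLINES;   /* wrap around instead of overflowing */
            begin = end + 1;
        }
    }
    /* A final line without a trailing newline is not stored. */
    return 0;
}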

The difference, 0.60s - 0.25s = 0.35s, is probably mostly Python's
memory management (which seems to be much more efficient than the
std::vector default).

Very interesting post. :-) I had no idea how heavily optimized the
built-in library was.

