Re: Wow, Python much faster than MatLab




Stef Mientki wrote:

MatLab: 14 msec
Python: 2 msec

I have the same experience. NumPy is usually faster than Matlab. But it
very much depends on how the code is structured.

I wonder if it is possible to improve the performance of NumPy by
having its fundamental types in the language, instead of depending on
operator overloading. For example, in NumPy, a statement like

array3[:] = array1[:] + array2[:]

allocates an intermediate array that is not needed. This is because the
operator overloading cannot know if it's evaluating a part of a larger
statement like

array1[:] = (array1[:] + array2[:]) * (array3[:] + array4[:])
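Within NumPy as it stands, one partial workaround is to call the ufuncs directly and pass an `out=` buffer, so that results are written into existing arrays. The sketch below is only an illustration: the array contents and the `scratch` buffer are made up for the example. It evaluates the compound statement above with one extra buffer; it still makes three passes over the data, so it does not give the single fused loop a compiler could produce.

```python
import numpy as np

n = 1000
array1 = np.linspace(0.0, 1.0, n)
array2 = np.full(n, 2.0)
array3 = np.full(n, 3.0)
array4 = np.full(n, 4.0)

# Reference result from the operator form, which builds a new array
# for each sum and for the product.
expected = (array1 + array2) * (array3 + array4)

# Explicit ufunc calls with out= reuse existing buffers instead.
scratch = np.empty(n)                     # the only extra buffer (illustrative name)
np.add(array3, array4, out=scratch)       # scratch = array3 + array4
np.add(array1, array2, out=array1)        # array1 = array1 + array2, in place
np.multiply(array1, scratch, out=array1)  # array1 = array1 * scratch, in place
```

The price is readability: the expression has to be split by hand into individual ufunc calls.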

If arrays had been a part of the language, as they are in Matlab and
Fortran 95, the compiler could see this and avoid the intermediate
storage, as well as loop over the data only once. This is one of the
main reasons why Fortran is better than C++ for scientific computing.
With operator overloading in C++, instead of the single loop

for (i=0; i<n; i++)
    array1[i] = (array1[i] + array2[i]) * (array3[i] + array4[i]);

one actually gets something like three intermediates and four loops:

tmp1 = malloc(n * sizeof *tmp1);        /* first sum */
for (i=0; i<n; i++)
    tmp1[i] = array1[i] + array2[i];
tmp2 = malloc(n * sizeof *tmp2);        /* second sum */
for (i=0; i<n; i++)
    tmp2[i] = array3[i] + array4[i];
tmp3 = malloc(n * sizeof *tmp3);        /* product of the two sums */
for (i=0; i<n; i++)
    tmp3[i] = tmp1[i] * tmp2[i];
free(tmp1);
free(tmp2);
for (i=0; i<n; i++)                     /* copy the result back */
    array1[i] = tmp3[i];
free(tmp3);

In C++ this is further bloated by constructor, destructor and copy
constructor calls.
Why one should prefer Fortran over C++ for this is obvious. But the same
applies to NumPy, and to the issue of NumPy vs. Matlab: Matlab knows
about arrays and has a compiler that can deal with this, whilst NumPy
depends on bloated operator overloading. On the other hand, Matlab is
fundamentally impaired on function calls and array slicing compared
with NumPy (basically, copies are created instead of views). Thus, which
is faster, Matlab or NumPy, very much depends on how the code is
written.
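The view behaviour on the NumPy side can be checked with a small sketch like the following. The variable names are illustrative only, and it shows NumPy alone, not the corresponding Matlab behaviour.

```python
import numpy as np

data = np.arange(10)

# Basic slicing returns a view: no element data is copied.
window = data[2:6]
window[:] = 0                      # writes through to data

# An independent copy has to be requested explicitly.
independent = data[2:6].copy()
independent[:] = 99                # leaves data untouched
```

Here `np.shares_memory(data, window)` reports that the slice and the original share a buffer, whereas `independent` owns its own.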

Now for my question: operator overloading is (as shown) not the
solution to efficient scientific computing. It creates serious bloat
where it is undesired. Can NumPy's performance be improved by adding
the array types to the Python language itself? Or is the dynamic
nature of Python preventing this?

Sturla Molden
