Re: About alternatives to Matlab
- From: "Mark Morss" <mfmorss@xxxxxxx>
- Date: 5 Dec 2006 07:35:37 -0800
I doubt that anyone would dispute that even as boosted by Numpy/Scipy,
Python will almost certainly be notably slower than moderately
well-written code in a compiled language. The reason Numpy exists,
however, is not to deliver the best possible speed, but to deliver
enough speed to make it possible to solve large numerical problems with
the powerful and flexible Python language. As observed by Hans
Latangen in _Python Scripting for Computational Science_, 2nd ed.,
Springer 2005, scientific computing is more than number crunching:
"Very often programming is about shuffling data in and out of different
tools, converting one data format to another, extracting numerical data
from a text, and administering numerical experiments involving a large
number of data files and directories. Such tasks are much faster to
accomplish in a language like Python than in Fortran, C, C++ or Java."
He might well have mentioned the importance of developing nice-looking
reports once the analysis is complete, and that development is simpler
and more flexible in Python than a compiled language.
In principle, I agree that heavy-duty number-crunching, at least if it
has to be repeated again and again, should be accomplished by means of
a compiled language. However, if someone has to solve many different
problems just one time, or just a few times (for example if you are an
engineering consultant), there is an excellent argument for using
Python + Numpy. Unless it affects feasibility, I opine, computational
speed is important primarily in context of regular production, e.g.,
computing the daily "value at risk" in a commodity trading portfolio,
or making daily weather predictions.
I am aware of the power and flexibility of the OCaml language, and
particularly that an OCaml user can easily switch back and forth
between interpreted and compiled implementation. I'm attacted to OCaml
and, indeed, I'm in the process of reading Smith's (unfortunately not
very well-written) _Practical OCaml_. However, I also understand that
OCaml supports only double-precision implementation of real numbers;
that its implementation of arrays is a little clunky compared to
Fortran 95 or Numpy (and I suspect not as fast as Fortran's); and that
the available libraries, while powerful, are by no means as
comprehensive as those available for Python. For example, I am unaware
of the existance of an HDF5 interface for OCaml.
In summary, I think that OCaml is an excellent language, but I think
that the question of whether to use it in preference to Python+Numpy
for general-purpose numerical analysis must rest on much more than its
computational speed.
Jon Harrop wrote:
Filip Wasilewski wrote:
Besides of that this code is irrelevant to the original one and your
further conclusions may not be perfectly correct. Please learn first
about the topic of your benchmark and different variants of wavelet
transform, namely difference between lifting scheme and dwt, and try
posting some relevant examples or use other topic for your benchmarks.
Your lifting implementation is slightly longer and slower than the
non-lifting one but its characteristics are identical. For n=2^25 I get:
1.88s C++ (816 non-whitespace bytes)
2.00s OCaml (741 b)
2.33s F# (741 b)
9.83s Your Python (1,002 b)
The original python was 789 bytes.
Neglecting this issues, I wonder if it is equally easy to juggle
arbitrary multidimensional arrays in a statically typed language and do
operations on selected axis directly from the interactive interpreter
like in the NumPy example from my other post -
http://groups.google.com/group/comp.lang.python/msg/a71a5bf00a88ad57 ?
I don't know much OCaml so it would be great if you could elaborate on
this.
It is probably just as easy. Instead of dynamic typing you have parametric
polymorphism. If you want to make your function generic over arithmetic
type then you can pass in the arithmetic operators.
0.56s C++ (direct arrays)
0.61s F# (direct arrays)
0.62s OCaml (direct arrays)
1.38s OCaml (slices)
2.38s Python (slices)
10s Mathematica 5.1
Note that all implementations are safe (e.g. C++ uses a.at(i) instead of
a[i]).
So why not use C++ instead of all others if the speed really matters?
What is your point here?
1. Benchmarks should not just include two inappropriate languages.
2. Speed aside, the other languages are more concise.
Could you please share full benchmark code, so we could make
conclusions too?
I'll paste the whole programs at the end...
I would like to get to know about your benchmark
methodology. I wonder if you are allocating the input data really
dynamically or whether it's size is a priori knowledge for the
compiler.
The knowledge is there for the compiler to use but I don't believe any of
them exploit it.
In this specific context (discrete wavelet transform benchmark), I'd have
said that speed was the most important thing after correctness.
Not necessarily, if you are doing research with different kinds of
wavelets and need a general, flexible and easily adaptable tool or just
the computation speed is not considered a bottleneck.
Brevity is probably next.
Language speed is a great advantage, but if it always was the most
important, people would talk directly to the CPUs or knit DSP chips in
theirs backyards whenever they need to solve a calculation problem.
Sure.
Please don't make this discussion heading towards `what is the fastest
way of doing d4 transform and why OCaml rules` instead of `what is the
optimal way of doing things under given circumstances and still have a
free afternoon ;-)`.
Comments like that seem to be based on the fundamental misconception that
writing C++ like this is hard.
Different tasks need different, well-balanced
measures. Neither Python nor OCalm nor any other language is a panacea
for every single problem.
Absolutely. OCaml is as good as the next (compiled) language in this case.
Python and Matlab seem to be particularly bad at this task.
Here's my complete OCaml:
let a = sqrt 3. and b = 4. *. sqrt 2.
let h0, h1, h2, h3 =
(1. +. a) /. b, (3. +. a) /. b, (3. -. a) /. b, (1. -. a) /. b
let g0, g1, g2, g3 = h3, -.h2, h1, -.h0
let rec d4_aux tmp a n =
let n2 = n lsr 1 in
for i=0 to n2-2 do
tmp.(i) <- a.(i*2)*.h0+.a.(i*2+1)*.h1+.a.(i*2+2)*.h2+.a.(i*2+3)*.h3;
tmp.(i+n2) <- a.(i*2)*.g0+.a.(i*2+1)*.g1+.a.(i*2+2)*.g2+.a.(i*2+3)*.g3;
done;
tmp.(n2-1) <- a.(n-2)*.h0 +. a.(n-1)*.h1 +. a.(0)*.h2 +. a.(1)*.h3;
tmp.(2*n2-1) <- a.(n-2)*.g0 +. a.(n-1)*.g1 +. a.(0)*.g2 +. a.(1)*.g3;
Array.blit tmp 0 a 0 n;
if n > 4 then d4_aux tmp a (n lsr 1)
let d4 a =
let tmp = Array.make (Array.length a) 0. in
d4_aux tmp a (Array.length a)
let () =
print_endline "Generating test data...";
let x = Array.init (1 lsl 25) (fun _ -> Random.float 1.) in
print_endline "Benchmarking...";
let t1 = Sys.time() in
ignore(d4 x);
Printf.printf "Elapsed time is %.6f seconds\n" (Sys.time() -. t1)
and C++:
#include <iostream>
#include <vector>
#include <ctime>
#include <cmath>
using namespace std;
double a = sqrt(3), b = 4 * sqrt(2);
double h0 = (1 + a) / b, h1 = (3 + a) / b, h2 = (3 - a) / b, h3 = (1 - a) /
b;
double g0 = h3, g1 = -h2, g2 = h1, g3 = -h0;
void d4(vector<double> &a) {
vector<double> tmp(a.size());
for (int n = a.size(); n >= 4; n >>= 1) {
int half = n >> 1;
for (int i=0; i<n-2; i+=2) {
tmp.at(i/2) = a.at(i)*h0 + a.at(i+1)*h1 + a.at(i+2)*h2 + a.at(i+3)*h3;
tmp.at(i/2+half) = a.at(i)*h3 - a.at(i+1)*h2 + a.at(i+2)*h1 -
a.at(i+3)*h0
;
}
tmp.at(half-1) = a.at(n-2)*h0 + a.at(n-1)*h1 + a.at(0)*h2 + a.at(1)*h3;
tmp.at(2*half-1) = a.at(n-2)*h3 - a.at(n-1)*h2 + a.at(0)*h1 -
a.at(1)*h0;
copy(tmp.begin(), tmp.begin() + n, a.begin());
}
}
int main() {
cout << "Generating data...\n";
vector<double> a(1 << 25);
for (int i=0; i<a.size(); ++i) a.at(i) = rand();
cout << "Benchmarking...\n";
double t1 = clock();
d4(a);
cout << "Elapsed time " << (clock() - t1) / CLOCKS_PER_SEC << "s\n";
}
--
Dr Jon D Harrop, Flying Frog Consultancy
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/index.html?usenet
.
- Follow-Ups:
- Re: About alternatives to Matlab
- From: Konrad Hinsen
- Re: About alternatives to Matlab
- From: Mark Morss
- Re: About alternatives to Matlab
- References:
- Re: About alternatives to Matlab
- From: Bill Maxwell
- Re: About alternatives to Matlab
- From: Jon Harrop
- Re: About alternatives to Matlab
- From: Carl Banks
- Re: About alternatives to Matlab
- From: Carl Banks
- Re: About alternatives to Matlab
- From: Jon Harrop
- Re: About alternatives to Matlab
- From: Filip Wasilewski
- Re: About alternatives to Matlab
- From: Jon Harrop
- Re: About alternatives to Matlab
- Prev by Date: PythonTidy
- Next by Date: Re: About alternatives to Matlab
- Previous by thread: Re: About alternatives to Matlab
- Next by thread: Re: About alternatives to Matlab
- Index(es):
Relevant Pages
|