Re: Poor performance with OpenMP
- From: Daniel Carrera <dcarrera@xxxxxxxxx>
- Date: Thu, 13 Jan 2011 08:24:28 -0800 (PST)
On Jan 13, 2:45 pm, gmail-unlp <ftine...@xxxxxxxxx> wrote:
Just in case: there is an OpenMP forum at http://www.openmp.org/forum/
about "Using OpenMP" where you can also ask, but I'm sure many of us
here are interested too.
About the question: right now, I think you are partially measuring the
difference between computing
C = MATMUL(A,B)
and
do j = 1,N
   C(j,:) = MATMUL(A,B(j,:))
end do
(plus the threading/parallel processing).
Just to see something "interesting" you could time ex2 without
-fopenmp when compiling (it compiles, but does not use OpenMP) or run
it with export OMP_NUM_THREADS=1.
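
The reason the same source builds with or without -fopenmp is that OpenMP directives and lines starting with the `!$` sentinel are ordinary comments to a compiler that has OpenMP switched off. A minimal sketch of that behaviour (the program name and the variable nthreads are made up for illustration, they are not from ex2):

```fortran
program threads
  !$ use omp_lib            ! only seen when OpenMP is enabled
  implicit none
  integer :: nthreads

  ! Default for a build without OpenMP: the "!$" lines are plain comments.
  nthreads = 1

  ! With OpenMP enabled this line is compiled and reports the thread
  ! count that OMP_NUM_THREADS (or the runtime default) allows.
  !$ nthreads = omp_get_max_threads()

  print *, 'threads available:', nthreads
end program threads
```

Built without -fopenmp this should report one thread; built with it, the number should follow OMP_NUM_THREADS.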
Indeed. When I do as you suggest, the two programs have basically the
same runtime. The single-threaded version is still faster, but only by
about 1 second out of ~18.
Lesson: parallel processing can be slower if it forces you to use a
poorer algorithm. In this instance, I guess that the explicit do loop
might prevent the compiler from doing some optimizations, such as
making sure memory is accessed sequentially.
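
On the memory-access guess: Fortran stores arrays column by column, so a row slice such as B(j,:) is strided in memory while a column slice B(:,j) is contiguous. A minimal sketch of a loop over columns, checked against the whole-matrix MATMUL. The size n, the single-precision type and the array names are assumptions for illustration, not taken from ex1 or ex2:

```fortran
program column_loop
  implicit none
  integer, parameter :: n = 500   ! illustrative size, not from the thread
  real, allocatable :: a(:,:), b(:,:), c_ref(:,:), c_col(:,:)
  integer :: j

  allocate(a(n,n), b(n,n), c_ref(n,n), c_col(n,n))
  call random_number(a)
  call random_number(b)

  ! Whole-matrix product, as in the single-threaded version.
  c_ref = matmul(a, b)

  ! Column-wise loop: each iteration reads B(:,j) and writes C(:,j),
  ! both contiguous in Fortran's column-major layout.
  ! Without -fopenmp the directives below are treated as comments.
  !$omp parallel do
  do j = 1, n
     c_col(:,j) = matmul(a, b(:,j))
  end do
  !$omp end parallel do

  ! Should be zero or at rounding level if the two forms agree.
  print *, 'max abs difference:', maxval(abs(c_ref - c_col))
end program column_loop
```

Column j of MATMUL(A,B) is A times column j of B, so this form can be compared directly with the single MATMUL call. Whether it is actually faster than the row-slice loop would have to be measured.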
A side question: how are you measuring runtime?
I'm using the Unix "time" command and I'm reporting the total wall
clock time spent by the user.
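
For reference, "time" prints three figures: real (elapsed wall-clock time), user and sys (CPU time). With several threads the user figure is summed over all threads and can exceed the real one, so it matters which of them is compared. A minimal sketch of taking both measurements inside the program with the standard intrinsics; the matrix size and the MATMUL workload are stand-ins, and how CPU_TIME accounts for multiple threads is compiler dependent:

```fortran
program wall_vs_cpu
  implicit none
  integer, parameter :: n = 500                    ! illustrative size
  integer, parameter :: i8 = selected_int_kind(18) ! 64-bit counter
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  integer(i8) :: t0, t1, rate
  real :: cpu0, cpu1

  allocate(a(n,n), b(n,n), c(n,n))
  call random_number(a)
  call random_number(b)

  call system_clock(count=t0, count_rate=rate)  ! wall-clock ticks, ticks per second
  call cpu_time(cpu0)                           ! processor time used so far

  c = matmul(a, b)                              ! stand-in for the work being timed

  call cpu_time(cpu1)
  call system_clock(count=t1)

  print *, 'wall-clock seconds:', real(t1 - t0) / real(rate)
  print *, 'CPU seconds:       ', cpu1 - cpu0
  print *, 'checksum:          ', sum(c)        ! uses the result so the work is not skipped
end program wall_vs_cpu
```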