Re: MATMUL slower than expected
- From: kargl@xxxxxxxxxxxxxxxxxxxxxxxxxxxx (Steven G. Kargl)
- Date: Mon, 30 Jan 2006 15:21:36 +0000 (UTC)
In article <43DDC1C4.1000708@xxxxxxxxxxxxxxxxxx>,
Tim Prince <tprince@xxxxxxxxxxxxxxxxxx> writes:
>
> gfortran 4.2 on 1.5 GHz Pentium-m (SuSE 9.2):
> time using matmul: 11.13631
> triple loop time: 9.561546
> double loop with dot_product: 11.52825
> ifort 9.1 -xB -O3:
> time using matmul: 2.434629
> triple loop time: 2.432630
> double loop with dot_product: 14.08286
>
> Your choice of loop nesting seems particularly unfavorable when you
> instruct compilers not to try variations. gfortran has no option to do
> so. ifort-optimized code does take advantage of stride-1 storage by column.
> On Pentium-m, there isn't much advantage in MKL over optimized Fortran
> source, but Xeon platforms would show more gain for MKL. As you must be
> aware, the "funny" f77 oriented BLAS calling sequences are well
> entrenched in practice, but you could employ the f90 wrappers.
> gfortran still has the documented peculiarity that sum(a*b) is
> significantly faster than dot_product(a,b).
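For readers following the loop-nesting point, here is a minimal sketch of what "stride 1 storage by column" means for a hand-written triple loop. The program name, the matrix size n = 200, and the two loop orders are assumptions chosen for illustration; the original poster's loop nesting is not shown in this thread, so the i-j-k version is only one example of an unfavorable order. Timings depend heavily on compiler, flags, and hardware.

```fortran
program loop_order
  implicit none
  integer, parameter :: n = 200          ! illustrative size, not from the thread
  real :: a(n,n), b(n,n), c_ref(n,n), c_jki(n,n), c_ijk(n,n)
  real :: t0, t1, t2
  integer :: i, j, k

  call random_number(a)
  call random_number(b)
  c_ref = matmul(a, b)                   ! reference result from the intrinsic

  ! j-k-i order: the innermost index i runs down a column of c and a,
  ! so both are accessed with stride 1 (Fortran arrays are column-major).
  call cpu_time(t0)
  c_jki = 0.0
  do j = 1, n
     do k = 1, n
        do i = 1, n
           c_jki(i,j) = c_jki(i,j) + a(i,k) * b(k,j)
        end do
     end do
  end do
  call cpu_time(t1)

  ! i-j-k order: the innermost index k walks along a row of a,
  ! so consecutive accesses to a are n elements apart.
  c_ijk = 0.0
  do i = 1, n
     do j = 1, n
        do k = 1, n
           c_ijk(i,j) = c_ijk(i,j) + a(i,k) * b(k,j)
        end do
     end do
  end do
  call cpu_time(t2)

  print *, 'j-k-i (stride 1) time:', t1 - t0
  print *, 'i-j-k (stride n) time:', t2 - t1
  print *, 'max diff vs matmul, j-k-i:', maxval(abs(c_jki - c_ref))
  print *, 'max diff vs matmul, i-j-k:', maxval(abs(c_ijk - c_ref))
end program loop_order
```

A compiler that performs loop interchange may turn the second nest into the first, in which case the two timings will be close. Small differences from the matmul result are expected because the summation order differs.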
There is a patch to improve dot_product (and matmul) in gfortran.
The author of the patch isn't quite happy with the current choice
of a switch-over point from one algorithm to another.
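To check the dot_product versus sum(a*b) behaviour on a given compiler, something like the following rough sketch can be used. The vector length, repetition count, and the trick of changing a(1) on each pass (to discourage the compiler from hoisting the call out of the loop) are assumptions for illustration, not part of the patch mentioned above.

```fortran
program dot_vs_sum
  implicit none
  integer, parameter :: n = 1000000, reps = 100   ! illustrative sizes
  real, allocatable :: a(:), b(:)
  real :: s1, s2, t0, t1, t2
  integer :: r

  allocate(a(n), b(n))
  call random_number(a)
  call random_number(b)

  ! Time the intrinsic dot_product.
  s1 = 0.0
  call cpu_time(t0)
  do r = 1, reps
     a(1) = real(r) / real(reps)   ! vary the input so each pass does real work
     s1 = s1 + dot_product(a, b)
  end do
  call cpu_time(t1)

  ! Time the equivalent sum over an elementwise product, same inputs.
  s2 = 0.0
  do r = 1, reps
     a(1) = real(r) / real(reps)
     s2 = s2 + sum(a * b)
  end do
  call cpu_time(t2)

  print *, 'dot_product time:', t1 - t0, ' result:', s1
  print *, 'sum(a*b) time:   ', t2 - t1, ' result:', s2

  deallocate(a, b)
end program dot_vs_sum
```

The two results can differ slightly in the last digits, since the compiler is free to accumulate the terms in a different order for each form.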
--
Steve
http://troutmask.apl.washington.edu/~kargl/