Re: Language efficiency of C versus FORTRAN et al
From: E. Robert Tisdale (E.Robert.Tisdale_at_jpl.nasa.gov)
Date: 12/29/04
- Next message: senthil: "Re: linklist"
- Previous message: Mole Wang: "Re: compiler bug or misuse of a cast?"
- In reply to: travisperkins03_at_hotmail.com: "Language efficiency of C versus FORTRAN et al"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Tue, 28 Dec 2004 17:56:53 -0800
travisperkins03@hotmail.com wrote:
> I have read somewhere
Do you remember where?
> that C code sometimes cannot be compiled to be as efficient as FORTRAN,
> e.g. for matrix multiplication,
> because a C compiler cannot make the assumptions about arrays
> that a FORTRAN compiler can. But I don't understand the example,
> not least because I don't understand FORTRAN.
> I also don't understand why it is more efficient in this case
> for a compiler to choose the order of evaluation
> (or whatever it is that it does
> for matrix multiplication to make it faster).
> Can anyone explain all this, please?
> And how much speed-up might one get from using FORTRAN over C for such things?
> What sort of compilers offer the best performance for issues like this?
> Is there any general advice
> about how to achieve efficient code for such linear algebra?
> This is a fairly live issue
No. It is a *dead* issue.
> because matrix mulitplication
> (and other things, like evaluating a dot product)
> often take an extremely long time for large matrices and vectors.
> I am also wondering how other languages like Pascal
Please ask in the comp.lang.pascal newsgroup instead.
> might compare to C and Fortran in this regard;
> does Pascal have enough array structure to
> allow compilers to take advantage of such optimisations?
>
> Any more general docs on issues like this
> would also be interesting reading.
> cat main.f
program main
real C(1024, 1024)
real B(1024, 1024)
real A(1024, 1024)
integer i, j, k
do j = 1, 1024
do i = 1, 1024
A(i, j) = i - 1 +(j - 1)*1024
B(i, j) = i - 1 +(j - 1)*1024
end do
end do
! C <-- A^TB (matrix-matrix dot product)
do j = 1, 1024
do i = 1, 1024
C(i, j) = 0.0
do k = 1, 1024
C(i, j) = C(i, j) + A(k, i)*B(k, j)
end do
end do
end do
end program main
> f90 -O3 -o main main.f
> time ./main
27.957u 0.105s 0:28.06 99.9% 0+0k 0+0io 0pf+0w
> cat main.c
#include <stdio.h>
#include <limits.h>
int main(int argc, char* argv[]) {
const
int n = 1024;
float A[n][n];
float B[n][n];
float C[n][n];
for (size_t i = 0; i < n; ++i) {
for (size_t j = 0; j < n; ++j) {
A[i][j] = j + i*n;
B[i][j] = j + i*n;
}
}
// C <-- BA^T (matrix-matrix dot product)
for (size_t i = 0; i < n; ++i) {
for (size_t j = 0; j < n; ++j) {
C[i][j] = 0.0;
for (size_t k = 0; k < n; ++k) {
C[i][j] = C[i][j] + B[i][k]*A[j][k];
}
}
}
return 0;
}
> gcc -Wall -std=c99 -pedantic -O3 -o main main.c
> time ./main
31.365u 0.113s 0:31.62 99.5% 0+0k 0+0io 0pf+0w
Quality C and Fortran compilers will optimize these loops
in almost exactly the same way. The difference appears
when you pass arrays to "subprograms".
void matrix_matrix_dot(float C[][1024],
const float A[][1024], const float B[][1024]);
or
interface
subroutine matrix_matrix_dot(C, A, B)
real, intent(in), dimension(1024, 1024):: A
real, intent(in), dimension(1024, 1024):: B
real, intent(out), dimension(1024, 1024):: C
end subroutine matrix_matrix_dot
end interface
The problem is that the compiler does not know that
the destination array C is not an *alias*
for [part of] one of the source operands --
it can't be sure that programmers won't write
call matrix_matrix_dot(A, A, B)
for example. If C or Fortran programmers do this,
they are going to get garbage in A
instead of the matrix-matrix dot product.
Fortran programmers are simply admonished *not* to do this
but, for some reason, C programmers expect to get the same thing
as if they had written
for (size_t i = 0; i < n; ++i) {
for (size_t j = 0; j < n; ++j) {
A[i][j] = 0.0;
for (size_t k = 0; k < n; ++k) {
A[i][j] = A[i][j] + B[i][k]*A[j][k];
}
}
}
at the same place in their program where they wrote
matrix_matrix_dot(A, A, B);
which inhibits the C compiler from performing any optimizations
which might yield a different [wrong] result.
The new C99 standard allows programmers
to qualify aliases with the 'restrict' keyword
so that the C99 compiler is allowed to perform
the same optimizations as the Fortran compiler.
NOTE: Neither Fortran or C99 do anything
to enhance the safety of subprograms
that may modify input arrays through aliases.
keyword
- Next message: senthil: "Re: linklist"
- Previous message: Mole Wang: "Re: compiler bug or misuse of a cast?"
- In reply to: travisperkins03_at_hotmail.com: "Language efficiency of C versus FORTRAN et al"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|