Re: Sun Studio Express 3 compilers available for download



*** Hendrickson wrote:
ejko123@xxxxxxxxx wrote:
robert.corbett@xxxxxxx wrote:
I am unaware of any effective algorithms for optimizing a general
array expression. A superoptimizer approach might work, but I'm
guessing that most people would like their optimized compilations
to finish before the heat death of the universe.

This is probably much more than can be adequately discussed here,
but I'd like to understand why this is hard. Is it hard because of
the register allocation problem, the array temporary problem, or
something else? Are there some good review papers on the subject?

--Eric

I'll take a swing at listing 3 or 4 problems in no particular order.

I have some practical observations which surfaced in a recent
investigation of why our code was crashing with segmentation faults.

This revealed that some compilers generate unnecessary array temporary
variables on the stack, for very simple array expressions
(array+constant), and perform an unnecessary
memory->memory copy of the array.
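
As a rough illustration of the difference (a hand-written sketch, not any compiler's actual output), the two subroutines below spell out the unfused and fused translations of x = a*x + f. The names temp_sketch, with_temporary, fused and demo_temp are made up for this example. The automatic array tmp stands in for the compiler-generated temporary; that such arrays usually end up on the stack is an assumption about typical implementations, not something the standard requires.

```fortran
module temp_sketch
  implicit none
contains

  ! Unfused form: evaluate the right-hand side into a temporary,
  ! then copy the temporary back into x in a second loop.
  subroutine with_temporary(x, f, a)
    real, intent(inout) :: x(:)
    real, intent(in) :: f, a
    real :: tmp(size(x))     ! stand-in for the compiler's array temporary
    integer :: i

    do i = 1, size(x)
      tmp(i) = a*x(i) + f
    end do
    do i = 1, size(x)
      x(i) = tmp(i)          ! the extra memory->memory copy
    end do
  end subroutine with_temporary

  ! Fused form: evaluation and assignment share one loop, no temporary.
  subroutine fused(x, f, a)
    real, intent(inout) :: x(:)
    real, intent(in) :: f, a
    integer :: i

    do i = 1, size(x)
      x(i) = a*x(i) + f
    end do
  end subroutine fused

end module temp_sketch

program demo_temp
  use temp_sketch
  implicit none
  real :: x(4), y(4)

  x = 1.0
  y = 1.0
  call with_temporary(x, 1.0, 2.0)
  call fused(y, 1.0, 2.0)
  write(*,*) ' Both forms agree: ', all(x == y)
end program demo_temp
```

Each element of the result depends only on the same element of the input, so the fused form is safe here and the temporary buys nothing.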

See what happens to the attached test code if you run it after setting
the stack size to less than 1MB (ulimit -s 1000 for bash
shells). This version doesn't have timing code, but adding it reveals
the obvious: the unnecessary array copies cost time.
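
For anyone who wants to try the timing, here is a minimal sketch of how it could be added, using the standard system_clock intrinsic. It is not the timing code from the original investigation. The module name timing_sketch, the program name time_it and the repeat count nrep are assumptions made for this example, and the coefficients (a = 0.5, f = 1.0) are chosen only so that repeated application stays bounded.

```fortran
module timing_sketch
  implicit none

  type my_data
    real, pointer, dimension(:) :: x
  end type my_data

contains

  ! Whole-array form: the one that may produce an array temporary.
  subroutine addit(d, f, a)
    type(my_data), intent(inout) :: d
    real, intent(in) :: f, a
    d%x = a*d%x + f
  end subroutine addit

  ! Explicit loop form: no temporary is needed.
  subroutine loopit(d, f, a)
    type(my_data), intent(inout) :: d
    real, intent(in) :: f, a
    integer :: i
    do i = 1, size(d%x)
      d%x(i) = a*d%x(i) + f
    end do
  end subroutine loopit

end module timing_sketch

program time_it
  use timing_sketch
  implicit none
  integer, parameter :: nrep = 1000   ! hypothetical repeat count
  type(my_data) :: d
  integer :: count0, count1, rate, irep

  allocate(d%x(256000))
  d%x = 1.0

  ! Time the explicit loop version.
  call system_clock(count0, rate)
  do irep = 1, nrep
    call loopit(d, 1.0, 0.5)
  end do
  call system_clock(count1)
  write(*,*) ' loop version (s):        ', real(count1 - count0)/real(rate)

  ! Time the whole-array version.
  call system_clock(count0)
  do irep = 1, nrep
    call addit(d, 1.0, 0.5)
  end do
  call system_clock(count1)
  write(*,*) ' whole array version (s): ', real(count1 - count0)/real(rate)

  ! Print one element so the work cannot be discarded as unused.
  write(*,*) ' d%x(1) = ', d%x(1)
  deallocate(d%x)
end program time_it
```

Run it with the normal stack limit; with the reduced limit, a compiler that builds the temporary on the stack may crash in addit before any time is printed.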

Here are the tabulated results for a number of compilers as to whether
they are able to merge the array expression evaluation and assignment
loops and avoid the array temporary.

Compiler              Avoids temporary (Y/N)
Ifort 8.0, 9.0, 9.1   N
HP Tru64/Alpha        N
IBM XLF               N
Intel Fortran 7.1     Y
Sun Forte             Y
Sun StudioExpress     Y
Nag                   Y
Lahey-Fujitsu         Y
g95                   Y
gfortran              Y
Pathscale             Y
Irix.MIPS             Y
Pgf90                 Y

The example is a slightly more challenging test than the simplest
array operations as the operands are pointer components of derived
types, but most compilers still manage the optimisation. IBM's XLF
compiler manages the optimisation if the pointer components are
changed to allocatable arrays or if the dangerous flag
"-qalias=noaryovrlp" is used. (But this flag could miscompile
standard-conforming code.) Intel have accepted a bug report on this,
and the current status is "targeted to be fixed".
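
For reference, a sketch of the allocatable-component variant mentioned for XLF might look like the following. It assumes the only change is the attribute on the component (which needs a compiler supporting allocatable components, i.e. TR 15581 or Fortran 2003). The names outer_alloc, my_data_alloc, addit_alloc and test_alloc are invented here to keep the example self-contained.

```fortran
module outer_alloc
  implicit none

  type my_data_alloc
    ! Same layout as the pointer version, but the component is allocatable.
    real, allocatable, dimension(:) :: x
    integer :: nx
  end type my_data_alloc

contains

  ! Same whole-array expression as the pointer-component test.
  subroutine addit_alloc(d, f, a)
    type(my_data_alloc), intent(inout) :: d
    real, intent(in) :: f, a

    d%x = a*d%x + f

  end subroutine addit_alloc

end module outer_alloc

program test_alloc
  use outer_alloc
  implicit none
  type(my_data_alloc) :: d

  allocate(d%x(256000))
  d%nx = size(d%x)
  d%x = 0.0

  write(*,*) ' Calling whole array version (allocatable component)'
  call addit_alloc(d, 1.0, 12345.0)
  write(*,*) ' Returned, d%x(1) = ', d%x(1)
end program test_alloc
```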

So the message is that sometimes it is easier for a human being to
figure out the meaning of the code than a compiler....

Keith Refson



--
Dr Keith Refson,
Building R3
Rutherford Appleton Laboratory
Chilton
Didcot kr AT
Oxfordshire OX11 0QX isise D@T rl D.T ac D?T uk
module outer

  type my_data
    real, pointer, dimension(:) :: x
    integer :: nx
  end type my_data

contains

  subroutine addit(d, f, a)

    type(my_data), intent(inout) :: d
    real, intent(in) :: f, a

    d%x = a*d%x + f

  end subroutine addit

  subroutine addit2(d, f, a)

    type(my_data), intent(inout) :: d
    real, intent(in) :: f, a

    d%x(1:d%nx) = a*d%x(1:d%nx) + f

  end subroutine addit2

  subroutine loopit(d, f, a)

    type(my_data), intent(inout) :: d
    real, intent(in) :: f, a

    integer :: i

    do i = 1, size(d%x)
      d%x(i) = a*d%x(i) + f
    end do

  end subroutine loopit

end module outer

program testst

  use outer

  type(my_data) :: d

  allocate(d%x(256000))
  d%nx = size(d%x)
  d%x = 1.0   ! define the data before it is read

  write(*,*) ' Calling loop version'
  call loopit(d, 1.0, 12345.0)
  write(*,*) ' Returned from loop version'
  write(*,*) ' Calling whole array version'
  call addit(d, 1.0, 12345.0)
  write(*,*) ' Returned from whole array version'
  write(*,*) ' Calling array subrange version'
  call addit2(d, 1.0, 12345.0)   ! was addit, so the subrange version was never exercised
  write(*,*) ' Returned from array subrange version'

end program testst