Re: defined operator & assignment speed & memory usage problem



On Jul 16, 3:05 am, Arjen Markus <arjen.mar...@xxxxxxxxxx> wrote:
On 12 jul, 17:49, Damian <dam...@xxxxxxxxxx> wrote:

This is a slightly modified reposting of an earlier posting that
received no replies:

Having read this thread (and the articles you mentioned),
I think I understand the problem. I wrote the program
below to measure the performance of such assignments.

Typically, using overloaded operations takes noticeably longer
than an explicit do-loop: roughly 1.2 to 1.6 times as long in the
g95 and gfortran timings listed in the comments (I measured on a
Windows PC with Cygwin and MinGW, but also on Linux with Ifort).
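
The ratios behind that comparison can be recomputed from the timings in the program's header comments. A small sketch; the program name timing_ratios and its variable names are made up for illustration, and the numbers are copied from the comments as they stand:

```fortran
program timing_ratios
    implicit none

    ! Timings in seconds, copied from the header comments of assign_comp.f90.
    ! Each configuration lists two values; the columns are the four
    ! compiler/option configurations.
    character(len=14), dimension(4), parameter :: label = &
        [ character(len=14) :: 'g95 plain', 'g95 -O2', &
                               'gfortran plain', 'gfortran -O2' ]
    real, dimension(2,4), parameter :: t_operator = reshape( &
        [ 16.45, 16.95,  10.58, 10.88,  15.86, 15.80,  11.47, 11.67 ], [ 2, 4 ] )
    real, dimension(2,4), parameter :: t_doloop = reshape( &
        [ 11.02, 11.44,   6.92,  7.11,  13.25, 13.50,   7.41,  7.37 ], [ 2, 4 ] )
    integer :: i

    do i = 1, 4
        ! Ratio operator / do-loop for both listed timings
        write(*,'(a,2f6.2)') label(i), t_operator(:,i) / t_doloop(:,i)
    enddo
end program timing_ratios
```

For these numbers the ratios fall between roughly 1.2 and 1.6.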

One thing that I have not tested is the following idea:
Rather than work with allocated blocks of memory _per variable_,
you could use a pool of allocated blocks. That way you do not
need to copy data if that data is a temporary result. And
you minimize the memory leaks.

(There are drawbacks to that method! Local variables and such
need to be cleaned up. And I should probably write it down
properly rather than expect you to read my mind ;)).
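
A minimal sketch of what such a pool could look like, offered as an illustration only and not measured in this thread. The names block_pool, pool_block, acquire_block, release_block, blocks_allocated, blocksize and maxblocks are all hypothetical, and the pool capacity is fixed for simplicity:

```fortran
module block_pool
    implicit none
    private
    public :: acquire_block, release_block, blocks_allocated

    integer, parameter :: blocksize = 1000   ! hypothetical block length
    integer, parameter :: maxblocks = 8      ! hypothetical pool capacity

    type pool_block
        logical                     :: in_use = .false.
        real, dimension(:), pointer :: data => null()
    end type pool_block

    type(pool_block), dimension(maxblocks), save :: pool

contains

    ! Hand out a free block; memory is allocated only the first time
    ! a slot is used. Returns a disassociated pointer if the pool is full.
    subroutine acquire_block( ptr )
        real, dimension(:), pointer :: ptr
        integer :: i

        nullify( ptr )
        do i = 1, maxblocks
            if ( .not. pool(i)%in_use ) then
                if ( .not. associated(pool(i)%data) ) then
                    allocate( pool(i)%data(blocksize) )
                endif
                pool(i)%in_use = .true.
                ptr => pool(i)%data
                return
            endif
        enddo
    end subroutine acquire_block

    ! Give a block back; the memory stays allocated for reuse.
    subroutine release_block( ptr )
        real, dimension(:), pointer :: ptr
        integer :: i

        do i = 1, maxblocks
            if ( associated(ptr, pool(i)%data) ) then
                pool(i)%in_use = .false.
                exit
            endif
        enddo
        nullify( ptr )
    end subroutine release_block

    ! Number of slots that currently hold allocated memory
    integer function blocks_allocated()
        integer :: i

        blocks_allocated = 0
        do i = 1, maxblocks
            if ( associated(pool(i)%data) ) blocks_allocated = blocks_allocated + 1
        enddo
    end function blocks_allocated

end module block_pool

program test_pool
    use block_pool
    implicit none

    real, dimension(:), pointer :: b, c, tmp
    integer :: iter

    call acquire_block( b )
    call acquire_block( c )
    b = 1.1
    c = 21.4

    do iter = 1, 100
        call acquire_block( tmp )   ! the temporary reuses the same slot each time
        tmp = b * c
        call release_block( tmp )
    enddo

    write(*,*) 'Blocks allocated: ', blocks_allocated()
end program test_pool
```

In this sketch the hundred temporaries share a single block, so only three blocks are ever allocated. The cleanup drawback is visible too: every acquire_block needs a matching release_block, or the slot stays occupied.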

Anyhow, see the program.

Regards,

Arjen
-------
! assign_comp.f90 --
!     Various ways to implement "A = B * C" where A, B and C
!     are derived types
!
!     g95:
!         Plain:
!             operator: 16.45 16.95 seconds
!             do-loop:  11.02 11.44 seconds
!         -O2:
!             operator: 10.58 10.88 seconds
!             do-loop:   6.92  7.11 seconds
!
!     gfortran:
!         Plain:
!             operator: 15.86 15.80 seconds
!             do-loop:  13.25 13.50 seconds
!         -O2:
!             operator: 11.47 11.67 seconds
!             do-loop:   7.41  7.37 seconds
!
!
module assignment
    implicit none

    integer, parameter :: fieldsize = 10000000

    type field
        logical                     :: temp = .false.
        ! Start disassociated, so that associated() is well defined
        ! on a field that has not been assigned yet
        real, dimension(:), pointer :: data => null()
    end type field

    interface assignment(=)
        module procedure assign_field
        module procedure assign_field_scalar
    end interface

    interface operator(*)
        module procedure multiply_fields
    end interface
contains

    subroutine assign_field( left, right )
        type(field), intent(inout) :: left
        type(field), intent(in)    :: right

        if ( right%temp ) then
            ! The right-hand side is a temporary: take over its memory
            ! instead of copying the data
            if ( associated(left%data) ) then
                deallocate( left%data )
            endif
            left%temp = .false.
            left%data => right%data
        else
            ! Ordinary variable on the right-hand side: copy the data
            if ( .not. associated(left%data) ) then
                allocate( left%data(fieldsize) )
            endif
            left%temp = .false.
            left%data = right%data
        endif

    end subroutine assign_field

    subroutine assign_field_scalar( left, right )
        type(field), intent(inout) :: left
        real, intent(in)           :: right

        if ( .not. associated(left%data) ) then
            allocate( left%data(fieldsize) )
        endif
        left%temp = .false.
        left%data = right

    end subroutine assign_field_scalar

    type(field) function multiply_fields( op1, op2 )
        type(field), intent(in) :: op1, op2

        ! Mark the result as a temporary, so that the assignment
        ! can take over its memory
        multiply_fields%temp = .true.

        ! Reuse the memory of a temporary operand if there is one,
        ! otherwise allocate a new block
        if ( op1%temp ) then
            multiply_fields%data => op1%data
        elseif ( op2%temp ) then
            multiply_fields%data => op2%data
        else
            allocate( multiply_fields%data(1:fieldsize) )
        endif

        multiply_fields%data = op1%data * op2%data

    end function multiply_fields

    ! Do it the classic way
    !
    subroutine multiply_assign( left, op1, op2 )
        type(field), intent(inout) :: left
        type(field), intent(in)    :: op1, op2
        integer :: i

        if ( .not. associated(left%data) ) then
            allocate( left%data(1:fieldsize) )
        endif

        do i = 1,fieldsize
            left%data(i) = op1%data(i) * op2%data(i)
        enddo
    end subroutine multiply_assign

end module assignment

program test_assign
    use assignment

    implicit none

    type(field) :: field_a
    type(field) :: field_b
    type(field) :: field_c

    real :: time1, time2, cumulative1, cumulative2
    integer :: count

    character(len=10) :: dummy

    field_b = 1.1
    field_c = 21.4

    cumulative1 = 0.0
    cumulative2 = 0.0

    do count = 1,100
        call cpu_time( time1 )
        field_a = field_b * field_c
        call cpu_time( time2 )

        cumulative1 = cumulative1 + (time2-time1)

        call cpu_time( time1 )
        call multiply_assign( field_a, field_b, field_c )
        call cpu_time( time2 )

        cumulative2 = cumulative2 + (time2-time1)
    enddo

    write(*,*) field_a%data(1), field_b%data(1), field_c%data(1)

    write(*,*) 'Method 1 (operator): ', cumulative1
    write(*,*) 'Method 2 (do-loop): ', cumulative2

    write(*,'(a)') 'Waiting ...'
    read(*,'(a)') dummy

end program test_assign

Aha! So we finally have some empirical data. I'm going to try this
code out with the IBM XL Fortran compiler because I have been told it
will do interprocedural optimization that might eliminate the
differences between the two approaches.

Damian