Re: Automatically transform or expand do loop in a subroutine
- From: "James Van Buskirk" <not_valid@xxxxxxxxxxx>
- Date: Thu, 31 Jan 2008 00:12:17 -0700
"yaqi" <yaqiwang@xxxxxxxxx> wrote in message
news:7ac8cf62-97a7-49f9-8487-4eb8eaa1533e@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> I know the following piece of code will not compile, at least with
> Compaq Visual Fortran. My intention is to let the compiler
> automatically transform or expand the loop inside the function (I do
> not care about the code size) when the caller passes a SMALL CONSTANT
> integer 'd' in its argument list and the compiler is optimizing.
> Because the function is really crucial for performance, I want to
> save as much computing time as I can. So do not ask me why I want to
> do this. Actually I guess this idea is similar to templates in C++.
> Are there any ways to implement this idea easily with Fortran?
You can do something like a C++ template in Fortran, but just as with
a C++ template you will need to compile every time you want a new
loop length. I have an example in which a Fortran program invokes
the compiler to create a dynamic link library and then calls
LoadLibraryA to link to the code just compiled. This may be
more like what you want:
http://groups.google.com/group/comp.lang.fortran/msg/b5065aaaad9eb748
Of course it's done in gfortran and C binding, but it would be easier
to have done it with CVF and dfwin.mod. A problem with this approach
is that if you want to redistribute the code you have to include
instructions on installing the compiler as well. You might be better
off in that event to use a freeware compiler as your run-time dll-
generator, or as a more lightweight alternative, a freeware
assembler. For this purpose, FASM would be best suited because there
is even a dll form of it available which would obviate the need for
CreateProcessA and maybe WaitForSingleObject.
There is probably even a Win32 API that permits you to allocate a
block of memory and get an executable address for it so that you
can just poke your code into memory directly and call it via a
Cray pointer.
Although all the above is fun, I might point out that loop unrolling
has limitations. In particular, if the loop is unrolled to such an
extent that it no longer fits in the instruction cache, your code is
going to spend all of its time fetching instructions from L2 cache or
beyond and very little time actually executing them.
I have a problem like that, where it's (relatively) easy to write out
the code but it's way too big for cache. Compilers are not as well
equipped to carry out loop synthesis as they are to carry out loop
unrolling, so I would have to rewrite it so as to insert the looping
by hand.
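
For the compile-time, template-like route mentioned at the top (not
the run-time DLL approach in the linked post), here is a minimal
sketch. Everything in it is hypothetical: the original function was
not shown, so a dot product stands in for it, and the names dot_fixed,
dot2, dot3, dotn and dot_d are made up for illustration. Each specific
routine has a trip count that is a compile-time constant, which leaves
an optimizing compiler free to unroll it completely; whether it really
does depends on the compiler and its flags.

```fortran
module dot_fixed
  implicit none
  private
  public :: dot_d
contains

  ! Specific version for d = 2: the loop bound is a named constant.
  function dot2(a, b) result(s)
    integer, parameter :: d = 2
    real, intent(in) :: a(d), b(d)
    real :: s
    integer :: i
    s = 0.0
    do i = 1, d
      s = s + a(i)*b(i)
    end do
  end function dot2

  ! Specific version for d = 3: same body, different constant.
  function dot3(a, b) result(s)
    integer, parameter :: d = 3
    real, intent(in) :: a(d), b(d)
    real :: s
    integer :: i
    s = 0.0
    do i = 1, d
      s = s + a(i)*b(i)
    end do
  end function dot3

  ! General fallback: the loop bound is only known at run time.
  function dotn(d, a, b) result(s)
    integer, intent(in) :: d
    real, intent(in) :: a(d), b(d)
    real :: s
    integer :: i
    s = 0.0
    do i = 1, d
      s = s + a(i)*b(i)
    end do
  end function dotn

  ! Dispatcher: picks a fixed-length version when one exists.
  function dot_d(d, a, b) result(s)
    integer, intent(in) :: d
    real, intent(in) :: a(d), b(d)
    real :: s
    select case (d)
    case (2)
      s = dot2(a, b)
    case (3)
      s = dot3(a, b)
    case default
      s = dotn(d, a, b)
    end select
  end function dot_d

end module dot_fixed

program demo
  use dot_fixed
  implicit none
  real :: x(3), y(3)
  x = (/1.0, 2.0, 3.0/)
  y = (/4.0, 5.0, 6.0/)
  ! 1*4 + 2*5 + 3*6 = 32
  write(*,*) dot_d(3, x, y)
end program demo
```

As with a C++ template, every new length means another specific
routine and another compile. The repeated bodies could be generated
with INCLUDE or a preprocessor instead of being written out by hand,
and the select case costs one branch per call, so calling dot2 or dot3
directly would be preferable where the caller already knows d.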
--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end