Re: Loop unrolling



On Apr 23, 2:34 am, Tim Frink <plfr...@xxxxxxxx> wrote:
> Hi,
>
> what are typical heuristics for loop unrolling, i.e. which constraints
> must be satisfied to have a compiler perform unrolling on a particular
> loop?
>
> I assume that the loop must not exceed a particular size to avoid a
> code size explosion. But are there any other heuristics?

Every compiler is different, so there is no single set of rules. You
can always unroll the loop by hand and compare the timings. E.g.:


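void loopcopy(long *to, long *from, long count);
void duffcopy(long *to, long *from, long count);

/* Plain loop. As in Duff's original, `to` is never incremented (it
   stands in for an output register), so every store lands in to[0].
   count must be > 0. */
void loopcopy(long *to, long *from, long count)
{
    do
        *to = *from++;
    while (--count > 0);
}

/* Duff's device: the same loop unrolled 8 times. The switch jumps into
   the middle of the loop body to handle count % 8. count must be > 0. */
void duffcopy(long *to, long *from, long count)
{
    register long n = (count + 7) / 8;  /* number of passes through the do-loop */

    switch (count % 8)
    {
    case 0:
        do
        {
            *to = *from++;
    case 7:
            *to = *from++;
    case 6:
            *to = *from++;
    case 5:
            *to = *from++;
    case 4:
            *to = *from++;
    case 3:
            *to = *from++;
    case 2:
            *to = *from++;
    case 1:
            *to = *from++;
        }
        while (--n > 0);
    }
}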

#include <stdlib.h>

#define SIZE 4096

long from[SIZE];
long to[SIZE];

int main(void)
{
    long i;

    for (i = 0; i < SIZE; i++)
        from[i] = rand();
    /* Call each version 10000 times so the profiler has something to measure. */
    for (i = 0; i < 10000; i++)
        loopcopy(to, from, SIZE);
    for (i = 0; i < 10000; i++)
        duffcopy(to, from, SIZE);
    return 0;
}
/*
Profile: Function timing, sorted by time
Date: Wed Apr 16 23:38:59 1997


Program Statistics
------------------
Command line at 1997 Apr 16 23:38: "c:\tmp\Release\Duff"
Total time: 1187.011 millisecond
Time outside of functions: 2.139 millisecond
Call depth: 2
Total functions: 3
Total hits: 20001
Function coverage: 100.0%
Overhead Calculated 9
Overhead Average 9

Module Statistics for duff.exe
------------------------------
Time in module: 1184.872 millisecond
Percent of time in module: 100.0%
Functions in module: 3
Hits in module: 20001
Module function coverage: 100.0%

      Func              Func+Child          Hit
      Time      %         Time      %     Count  Function
---------------------------------------------------------
   719.321   60.7      719.321   60.7     10000  _loopcopy (duff.obj)
   444.002   37.5      444.002   37.5     10000  _duffcopy (duff.obj)
    21.550    1.8     1184.872  100.0         1  _main (duff.obj)

*/
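
If the switch-into-the-loop trick looks too opaque, another hand-unrolled
variant worth putting in the same benchmark is a straight unroll with a
cleanup loop for the leftovers. This is only a sketch and not part of the
run profiled above: the name unrolled_copy and the unroll factor of 4 are
my own illustrative choices, and unlike loopcopy/duffcopy it advances the
destination, so it performs a real array copy. No timings are claimed for
it.

```c
#include <assert.h>

/* Hypothetical helper, not from the profiled program: copy `count` longs,
   unrolled by 4, with a cleanup loop for the remaining 0..3 elements. */
void unrolled_copy(long *to, const long *from, long count)
{
    long i = 0;

    assert(count >= 0);

    /* Main body: four copies per loop test and branch. */
    for (; i + 3 < count; i += 4) {
        to[i]     = from[i];
        to[i + 1] = from[i + 1];
        to[i + 2] = from[i + 2];
        to[i + 3] = from[i + 3];
    }

    /* Cleanup: at most three iterations when count is not a multiple of 4. */
    for (; i < count; i++)
        to[i] = from[i];
}
```

To try it, add a third timing loop to main that calls
unrolled_copy(to, from, SIZE) 10000 times and profile again.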