Re: Compile-Time Loops in Assembly Language

From: Percival (dragontamer5788_at_yahoo.com)
Date: 09/07/04


Date: 7 Sep 2004 21:50:12 GMT

Percival wrote:
> For the results, it clearly shows that the compact loop takes, what's
> this? About 100 times longer than the unrolled loop!!
> Percival

Wow, maybe I was the one lying. I meant that the unrolled loop takes
1/30th the time of the compact loop, not 1/100th.

Anyway, here is benchmark 2.

test1.asm:

>>>>>>>>>>>>>.
format ELF executable

entry main
section executable

main:
         mov edx, 100000         ; outer loop count
.loop:
repeat 1000                      ; assembler emits the body 1000 times
         push eax
         pop eax
end repeat
         dec edx
         jnz .loop

         mov eax, 1              ; sys_exit (status is whatever ebx holds)
         int 80h
>>>>>>>>>>>>>>>

test2.asm:

>>>>>>>>>>.
format ELF executable
entry main

section executable

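main:
         mov edx, 1000000        ; outer loop count (test1.asm uses 100000)
.loop:
         mov ecx, 1000           ; inner loop count
.loop2:
         push eax
         pop eax
         dec ecx
         jnz .loop2

         dec edx
         jnz .loop

         mov eax, 1              ; sys_exit (status is whatever ebx holds)
         int 80h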
>>>>>>>>>

time ./test1; time ./test2

real 0m0.459s
user 0m0.458s
sys 0m0.000s

real 0m2.517s
user 0m2.505s
sys 0m0.005s

This time, with more work in the loop body (a push and a pop instead of
a speedy inc eax), the gap has shrunk: the unrolled loop now takes about
1/5 the time of the compact loop instead of 1/30th. Still, I have
provided two examples (on my 2.0 GHz Celeron :) where the unrolled loop
is faster than the compact loop.
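
A quick way to sanity-check these figures is to normalize each user time by the number of push/pop pairs the program executes. The sketch below does that arithmetic. It takes the loop counts exactly as written in the two listings, which is an assumption about what was actually assembled and timed, and the function names are made up for illustration.

```python
# User times copied from the `time` output above.
TEST1_USER_S = 0.458   # unrolled: "repeat 1000" inside one loop
TEST2_USER_S = 2.505   # compact: inner loop of 1000 iterations

# Loop counts as written in the listings (assumed to match what was run).
TEST1_OUTER, TEST1_PAIRS_PER_PASS = 100_000, 1000
TEST2_OUTER, TEST2_PAIRS_PER_PASS = 1_000_000, 1000


def per_pair_ns(user_seconds, outer, pairs_per_pass):
    """Average time per push/pop pair, in nanoseconds."""
    return user_seconds / (outer * pairs_per_pass) * 1e9


def time_ratio(slow_seconds, fast_seconds):
    """How many times longer the slower run took."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    ratio = time_ratio(TEST2_USER_S, TEST1_USER_S)
    t1 = per_pair_ns(TEST1_USER_S, TEST1_OUTER, TEST1_PAIRS_PER_PASS)
    t2 = per_pair_ns(TEST2_USER_S, TEST2_OUTER, TEST2_PAIRS_PER_PASS)
    print(f"raw time ratio test2/test1: {ratio:.2f}")
    print(f"test1: {t1:.2f} ns per push/pop pair")
    print(f"test2: {t2:.2f} ns per push/pop pair")
```

The raw ratio comes out at about 5.5, which is where the "1/5" figure comes from. With the counts as listed, though, test1 executes 10^8 pairs and test2 executes 10^9, giving roughly 4.6 ns and 2.5 ns per pair respectively. The raw ratio is therefore only a fair comparison if both programs were really run with the same outer count.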

Betov either doesn't know what he is talking about, or isn't posting
his benchmark for some odd reason :)

Percival


