Optimizers are overrated

I started learning ASM not long ago to improve my understanding of the
hardware architecture and my ability to optimize C code. The results of my
first experiment were surprising, to say the least. After reading the chapter
on loops in my ASM book, I wanted to test whether modern C compilers are
actually as smart as commonly claimed. I chose the simplest loop I could
think of: calling putchar 100 times. The first function (foo) uses a typical
C-style loop to test the assumption that "the compiler will optimize that
better than any human could". The second function (bar) is based on my newly
gained knowledge; the loop is basically ASM written in C. Now I am certain
that if I asked here which one is more efficient, all you guys would reply
"the compiler will most likely generate the same code in both cases" (I have
read such claims countless times here). Well, look at the ASM output below to
see how wrong that assumption is.

/* C Code */

#include <stdio.h>

void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) putchar('a');
}


void bar(void)
{
    int i = 100;

    do {
        putchar('a');
    } while (--i);
}
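
For anyone who wants to convince themselves that foo and bar really do the
same amount of work, here is a rough sketch that counts the calls instead of
printing. The names emit, calls, count_up and count_down are made up for this
check and are not part of the code I compiled. Note that the count-down form
only matches the count-up form when the count is greater than zero, which is
the case for the constant 100 used here.

```c
/* Hypothetical call counter standing in for putchar. */
static int calls;

static void emit(int c)
{
    (void)c;   /* the character itself does not matter for counting */
    calls++;
}

/* Count-up form, same shape as foo. Returns the number of calls made. */
static int count_up(int n)
{
    int i;

    calls = 0;
    for (i = 0; i < n; i++) emit('a');
    return calls;
}

/* Count-down form, same shape as bar.
   Assumes n > 0: the body runs once before the counter is tested,
   so n == 0 would not behave like the count-up form. */
static int count_down(int n)
{
    int i = n;

    calls = 0;
    do {
        emit('a');
    } while (--i);
    return calls;
}
```

Calling count_up(100) and count_down(100) from a small main should give the
same call count for both forms.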

As I said, damn simple. No nasty side effects, no access to global
variables, etc. The optimizer has no excuses. It should generate optimal
code in both cases. But see the result:


/* GCC 4.3.0 on x86/Windows, -O2 */

/* foo */
L7:
subl $12, %esp
pushl $97
call _putchar

incl %ebx
addl $16, %esp
cmpl $100, %ebx
jne L7


/* bar */
L2:
subl $12, %esp
pushl $97
call _putchar
addl $16, %esp

decl %ebx
jne L2


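Comment: See, even the most recent version of what is probably the most
widely used compiler cannot correctly optimize the simplest of loops! The foo
loop needs an extra cmpl on every iteration, while the bar loop gets by with
just decl and jne. At least GCC understood the bar loop, so my "write C like
ASM" optimization worked.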

At this point you might wonder what horrible things an average C compiler
will do when GCC already fails so badly. Here is the gruesome result:


/* lccwin32, optimize on */

/* foo */
_$4:
pushl $97
call _putchar
popl %ecx

incl %edi
cmpl $100,%edi
jl _$4


/* bar */
_$10:
pushl $97
call _putchar
popl %ecx

movl %edi,%eax
decl %eax
movl %eax,%edi
or %eax,%eax
jne _$10


Comment: lcc is unable to optimize the foo loop, just like GCC, but it adds
insult to injury by actually generating worse code for the ASM-style loop!
So you cannot even optimize the loop yourself!


/* MS Visual C++ 6 /O2 */

For this compiler I had to replace the putchar call with a call to a custom
my_putchar function; otherwise the compiler replaces the putchar calls with
direct OS API stuff. While this is a good optimization, it is not the
subject of this test and only makes the resulting asm harder to read, so I
suppressed it.
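
For reference, the modified test could look roughly like the sketch below.
The body of my_putchar here is an assumption (a plain forwarder to putchar),
since the actual definition is not shown in this post. Keeping it in a
separate source file would presumably stop the compiler from inlining it
into the loops.

```c
#include <stdio.h>

/* Hypothetical stand-in for the custom function: the real definition
   is not shown in the post. This one simply forwards to putchar. */
int my_putchar(int c)
{
    return putchar(c);
}

/* Same count-up loop as before, calling the wrapper instead. */
void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) my_putchar('a');
}

/* Same count-down loop as before, calling the wrapper instead. */
void bar(void)
{
    int i = 100;

    do {
        my_putchar('a');
    } while (--i);
}
```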


/* foo */

jmp SHORT $L833
$L834:
mov eax, DWORD PTR _i$[ebp]
add eax, 1
mov DWORD PTR _i$[ebp], eax
$L833:
cmp DWORD PTR _i$[ebp], 100
jge SHORT $L835

push 97
call _my_putchar
add esp, 4

jmp SHORT $L834
$L835:


/* bar */

$L840:
push 97
call _my_putchar
add esp, 4

mov eax, DWORD PTR _i$[ebp]
sub eax, 1
mov DWORD PTR _i$[ebp], eax
cmp DWORD PTR _i$[ebp], 0
jne SHORT $L840


Comment: Amazingly enough, this compiler has found yet another way to screw
up. Would you have thought that each compiler would generate different code
for such a simple construct?
I hope you agree that the compiler of the beast deserves the "Worst of Show"
award for this mess. Are MS compilers still this bad?



