C/C++ Compiler's Optimization Failed

From: Bryan Parkoff (bryan.nospam.parkoff_at_nospam.com)
Date: 02/21/04


Date: Sat, 21 Feb 2004 01:18:24 +0000 (UTC)


    I wrote the C++ code below.

VOID Set_Red(U_BYTE _Red, U_BYTE _Green, U_BYTE _Blue)
{
    Red = (0x00 << 24) | (_Red << 16) | (_Green << 8) | _Blue;
}
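
    For anyone who wants to compile the function on its own, here is a minimal self-contained sketch. The post does not show how VOID, U_BYTE, or Red are declared, so the typedefs and the 32-bit global below are assumptions. The parameters are renamed to red, green, and blue because names such as _Red (underscore followed by a capital letter) are reserved for the implementation in C++.

```cpp
#include <cstdint>

// Assumed definitions: the original post does not show these.
typedef void         VOID;
typedef std::uint8_t U_BYTE;

// Assumed to be a 32-bit global (the listing stores a DWORD to a fixed address).
std::uint32_t Red = 0;

// Packs three 8-bit channels into 0x00RRGGBB and stores the result in Red.
VOID Set_Red(U_BYTE red, U_BYTE green, U_BYTE blue)
{
    Red = (static_cast<std::uint32_t>(red)   << 16) |
          (static_cast<std::uint32_t>(green) << 8)  |
           static_cast<std::uint32_t>(blue);
}
```

    Calling Set_Red(0x12, 0x34, 0x56) leaves 0x00123456 in Red. The (0x00 << 24) term from the original is dropped here because it contributes nothing to the result.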

    I compiled it with the C/C++ compiler's optimization enabled. The generated
assembly is shown below.

mov ecx, DWORD PTR [esp+0ch]
xor eax, eax
mov ah, BYTE PTR [esp+04h]
and ecx, 0ffh
mov al, BYTE PTR [esp+08h]
shl eax, 08h
or eax, ecx
mov DWORD PTR [0409500h], eax
ret

    I am shocked that the C/C++ compiler did not optimize this very well,
because partial register access DOES EXIST in the output!! It writes ah and al
separately and then reads the full eax.

    My corrected code would be as below.

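mov eax, DWORD PTR [esp+04h]    ; eax = _Red
shl eax, 08h
or eax, DWORD PTR [esp+08h]     ; eax = (_Red << 8) | _Green
shl eax, 08h
or eax, DWORD PTR [esp+0ch]     ; eax = (_Red << 16) | (_Green << 8) | _Blue
mov DWORD PTR [0409500h], eax   ; store to Red, as the compiler's version does
ret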
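
    To compare the two sequences without an assembler, here is a rough C++ model of the arithmetic each one performs on the three 32-bit argument slots. The function names are hypothetical, and the model only covers the register arithmetic, not timing. It shows one thing to watch for: the compiler's version reads only the low byte of each slot, while the DWORD version reads all 32 bits, so the two agree only if the upper 24 bits of every slot are zero. Whether the calling convention guarantees that is an assumption the post does not address.

```cpp
#include <cstdint>

// Models the compiler's listing: byte loads into ah/al plus a masked ecx.
std::uint32_t pack_byte_loads(std::uint32_t red_slot,
                              std::uint32_t green_slot,
                              std::uint32_t blue_slot)
{
    std::uint32_t ecx = blue_slot;                        // mov ecx, [esp+0ch]
    std::uint32_t eax = 0;                                // xor eax, eax
    eax = (eax & ~0xFF00u) | ((red_slot & 0xFFu) << 8);   // mov ah, [esp+04h]
    ecx &= 0xFFu;                                         // and ecx, 0ffh
    eax = (eax & ~0xFFu) | (green_slot & 0xFFu);          // mov al, [esp+08h]
    eax <<= 8;                                            // shl eax, 08h
    eax |= ecx;                                           // or eax, ecx
    return eax;
}

// Models the hand-written listing: full 32-bit loads, no masking.
std::uint32_t pack_dword_loads(std::uint32_t red_slot,
                               std::uint32_t green_slot,
                               std::uint32_t blue_slot)
{
    std::uint32_t eax = red_slot;   // mov eax, [esp+04h]
    eax <<= 8;                      // shl eax, 08h
    eax |= green_slot;              // or eax, [esp+08h]
    eax <<= 8;                      // shl eax, 08h
    eax |= blue_slot;               // or eax, [esp+0ch]
    return eax;
}
```

    With clean slots (0x12, 0x34, 0x56) both functions return 0x00123456. If the red slot holds 0xAAAAAA12, the byte-load model still returns 0x00123456 but the DWORD model returns 0xAA123456.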

    I am very concerned that C/C++ compilers do not do a good job of
optimizing properly. I used Intel VTune and discovered many hot spots that
involve partial register access. I tested on both a Pentium III and a
Pentium 4. The generated code uses partial registers, which should be avoided.

    What is your recommendation?

    1. Write the C/C++ code in a DEBUG build.
    2. Test the C/C++ code on any computer to confirm it works with NO BUGS
(Optimization is not important YET!!)
    3. Change from the DEBUG build to the RELEASE build.
    4. Test the C/C++ code again on any computer to confirm it works with NO
BUGS (Optimization is ENABLED!!)
    5. Use VTune to discover hot spots, including partial register access.
    6. Rewrite the hot C/C++ code in assembly language by hand.
    7. Link your existing C/C++ code and the assembly code together.
    8. Test the mixed C/C++ and assembly code on any computer to confirm it
works with NO BUGS.
    9. After all tests are done, there should be a big IMPACT from the
improved optimization.
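
    Steps 8 and 9 above could be sketched roughly as follows: check a replacement routine against the plain C++ version on every input, then time it. All names here are hypothetical, and pack_candidate is only a C++ stand-in for whatever hand-written routine would actually be linked in. The timer uses std::chrono::steady_clock, which is far coarser than VTune, so treat any numbers as a rough indication only.

```cpp
#include <chrono>
#include <cstdint>

// Reference version in plain C++ (hypothetical name).
std::uint32_t pack_reference(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (static_cast<std::uint32_t>(r) << 16) |
           (static_cast<std::uint32_t>(g) << 8)  |
            static_cast<std::uint32_t>(b);
}

// Stand-in for the hand-written routine under test (hypothetical name).
std::uint32_t pack_candidate(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    std::uint32_t v = r;
    v = (v << 8) | g;
    v = (v << 8) | b;
    return v;
}

// Step 8: compare the candidate with the reference on all 2^24 inputs.
bool candidate_matches_reference()
{
    for (unsigned r = 0; r < 256; ++r)
        for (unsigned g = 0; g < 256; ++g)
            for (unsigned b = 0; b < 256; ++b) {
                const auto rr = static_cast<std::uint8_t>(r);
                const auto gg = static_cast<std::uint8_t>(g);
                const auto bb = static_cast<std::uint8_t>(b);
                if (pack_candidate(rr, gg, bb) != pack_reference(rr, gg, bb))
                    return false;
            }
    return true;
}

// Step 9: time the candidate. The checksum keeps the calls from being
// optimized away entirely.
long long time_candidate_ns(unsigned iterations, std::uint32_t& checksum)
{
    const auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < iterations; ++i) {
        checksum += pack_candidate(static_cast<std::uint8_t>(i),
                                   static_cast<std::uint8_t>(i >> 8),
                                   static_cast<std::uint8_t>(i >> 16));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

    Running the same timing loop over the reference version in the same RELEASE build gives a baseline, so the hand-written routine is kept only if it is measurably faster.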

-- 
Bryan Parkoff

