Memset me up Scotty.

From: Iman Habib (pixelpajasREMOVETHIS_at_hotmail.com)
Date: 02/23/04


    Hi guys..

    I'm trying to pull a fast memset routine out of my magic hat
    for a toy 3D engine of mine.

    And to be honest.. I suck at assembly optimizations. =...(
    The routine I have managed to make is about twice as fast as
    a regular 32-bit "rep stosd" memset on my AMD Athlon XP.
    But I am still not content, as I have a gut feeling that it is
    possible to make it faster.

    So I'll let you guys poke at my memset code
    and see if you can find more places to optimize. =)

    Or even better.. some of you may have links to web pages that have
    better code.

    cheers
    //iman

    -----------------8<----------------8<--------------------

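    inline void memset32mmx(unsigned int *dest, unsigned int c, unsigned int len)
    {
            unsigned int apa[2];
            apa[0] = apa[1] = c; // fill value twice, loaded below as one 64-bit quantity

            // I know I can remove the code here.. remake it, put it in the next
            // asm block and make it a bit faster.. but it won't be significant..
            // do it later
            if(len < 2) {
                _asm {
                    mov eax,c
                    mov edi,dest
                    mov ecx,len
                    cld
                    rep stosd
                }
                return;
            }

            _asm {
                mov edx, [dest]    // edx = write pointer
                mov eax, len
                mov ecx, eax
                shr eax, 1         // len/2 = number of 8-byte stores
                and ecx, 1         // len%2 = 1 if one dword is left over
                movd mm1, c        // note: mm1 is never read afterwards
                movq mm0, [apa]    // mm0 = c:c
                l:
                movntq [edx], mm0  // non-temporal 8-byte store
                add edx, 8
                dec eax
                jnz l
                test ecx, ecx
                je q
                sub edx, 4         // odd len: step back one dword so the last
                movntq [edx], mm0  // store overlaps it and covers the leftover
                q:
                // sfence
                emms
            }
    }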

    -----------------8<----------------8<--------------------
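    For anyone who wants to check the pairing and odd-length logic without
    an MMX-capable compiler, here is a rough plain-C model of the routine
    above. It is only a sketch: the name memset32_pairs is made up for
    illustration, it uses memcpy for the 8-byte stores instead of movntq,
    and it says nothing about speed. It only mirrors the control flow
    (len/2 paired stores, then one overlapping store when len is odd).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-C model of memset32mmx: same control flow, no MMX. */
static void memset32_pairs(uint32_t *dest, uint32_t c, size_t len)
{
    uint32_t pair[2] = { c, c };          /* plays the role of apa[] / mm0 */
    unsigned char *p = (unsigned char *)dest;
    size_t pairs = len / 2;               /* number of 8-byte stores */

    assert(dest != NULL);

    if (len < 2) {                        /* the "rep stosd" path: 0 or 1 dword */
        if (len == 1)
            dest[0] = c;
        return;
    }

    while (pairs--) {                     /* main loop: one 8-byte store per pass */
        memcpy(p, pair, 8);
        p += 8;
    }

    if (len & 1)                          /* odd len: step back 4 bytes so the   */
        memcpy(p - 4, pair, 8);           /* last store covers the leftover dword */
}
```

    The overlapping store at the end rewrites one dword that was already
    set, which is harmless because both halves hold the same value, and
    it never writes past dest + len.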

    cheers
    //iman

