Fast way to do memory copy



Hi,
This program crashes when the data size is 520000.
Can anyone find the reason? This is a fast implementation of
memcpy; it copies almost 40% faster. The program has no
problem when I give it other sizes. Debugging shows that the
program crashes at line (A). I am still trying to find the
problem. If anyone can see what I have done wrong, please post it.
The error I get is an unhandled exception.

void CopyMMX( void* sorce, void* destination, int count )
{
    int nCount64 = ( count / 128 ) * 128;   // bytes handled by the 128-byte loop
    int nRemainder = ( count % 128 );       // bytes left for the byte loop
    _asm
    {
        MOV ESI, sorce
        MOV EDI, destination
        MOV ECX, nCount64
        CMP ECX, 0
        JZ BYTEBYTE
        MOV EDX, 128
        SHR ECX, 7                  // number of 128-byte blocks
    TOP:
        PREFETCHNTA 64[ESI]         // Pre-fetch data for next loop
        PREFETCHNTA 128[ESI]
        // Copy data from source (unaligned loads)
        MOVDQU XMM0, 0[ESI]
        MOVDQU XMM1, 16[ESI]
        MOVDQU XMM2, 32[ESI]
        MOVDQU XMM3, 48[ESI]
        MOVDQU XMM4, 64[ESI]
        MOVDQU XMM5, 80[ESI]
        MOVDQU XMM6, 96[ESI]
        MOVDQU XMM7, 112[ESI]

        // Save the data from the XMM registers to destination
        MOVNTDQ 0[EDI], XMM0        // (A) -> program crashes here
        MOVNTDQ 16[EDI], XMM1
        MOVNTDQ 32[EDI], XMM2
        MOVNTDQ 48[EDI], XMM3
        MOVNTDQ 64[EDI], XMM4
        MOVNTDQ 80[EDI], XMM5
        MOVNTDQ 96[EDI], XMM6
        MOVNTDQ 112[EDI], XMM7

        ADD ESI, EDX
        ADD EDI, EDX
        DEC ECX
        JNZ TOP
        // Copy remaining data byte by byte
    BYTEBYTE:
        MOV ECX, nRemainder
        CMP ECX, 0
        JZ ENDS
        PREFETCHNTA [ESI+ECX]
    REM:
        MOV AL, 0[ESI]
        MOV 0[EDI], AL
        INC ESI
        INC EDI
        DEC ECX
        JNZ REM
    ENDS:
        EMMS
    }
}
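
One thing that may be worth ruling out (a guess, not a confirmed diagnosis): MOVNTDQ requires its memory operand to be 16-byte aligned, whereas MOVDQU does not, so line (A) would fault if destination is not on a 16-byte boundary for the failing buffer. Below is a minimal C sketch of a check. The helper names is_aligned16, bytes_to_align16 and require_aligned16 are hypothetical and not part of the function above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero when p lies on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: number of leading bytes (0..15) that would have to
   be copied one at a time before p reaches the next 16-byte boundary. */
static size_t bytes_to_align16(const void *p)
{
    return (size_t)((16u - ((uintptr_t)p & 15u)) & 15u);
}

/* Hypothetical debug guard: intended to be called on the destination
   pointer before entering a loop that uses aligned-only stores. */
static void require_aligned16(const void *p)
{
    assert(is_aligned16(p));
}
```

Calling require_aligned16( destination ) before the _asm block in a debug build would show whether the failing case has a misaligned destination. If it does, bytes_to_align16 gives the number of leading bytes a byte loop could copy first so that the block loop starts on an aligned address.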

Best regards,
Amal P.
