Re: Using SSE 128 bit movs From One Memory Location To Another



On 17.02.2011 22:28, KA wrote:
Sorry, yes, you are right. I was doing some incorrect math in my code which I have now solved.

I am trying to beat a rep movsb on large objects by using SSE.

To measure this, I've been using GCC C++, with a for loop that contains my inline assembly and iterates a billion times.


Hi KA,
some hints about the subject of this thread:

1) *avoid* C and other HLLs for speed-critical routines.
/solution/ write pure *ASM* object files and link them in.

2) *avoid* C and other HL languages for testing speed-critical routines:
they insert too much bloated code and neuro-handlers.
/solution/ test speed-critical routines *pure as they are*, if possible
without the CRT/libc/SEH stub.

3) *avoid* reinventing the wheel for such operations
(movs) on large blocks of data.
/solution/ reading this could help:

_Intel_64_and_IA-32_Architectures_Optimization_Reference_Manual_

http://www.intel.com/Assets/ja_JP/PDF/manual/248966.pdf

Sections 7.2 and 7.3 could be a starting point.
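
For reference, here is a minimal sketch of the kind of 128-bit copy loop this thread is about. It uses C++ with SSE2 intrinsics only to keep it short and easy to check; it is not the pure-ASM route suggested in 1) and makes no claim to beat rep movsb. The name copy_sse128 is hypothetical, the buffers are assumed not to overlap, and unaligned loads/stores are used so no alignment is assumed. On a build without SSE2 it simply falls back to memcpy.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics: _mm_loadu_si128, _mm_storeu_si128
#define COPY_HAVE_SSE2 1
#else
#define COPY_HAVE_SSE2 0
#endif

// Hypothetical helper (not from the thread): copy n bytes from src to dst,
// 16 bytes per iteration. Assumes the two buffers do not overlap.
void copy_sse128(void* dst, const void* src, std::size_t n) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    std::size_t i = 0;

#if COPY_HAVE_SSE2
    // Main loop: one unaligned 128-bit load and one unaligned 128-bit store
    // (these compile to movdqu or equivalent).
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(d + i), v);
    }
#endif

    // Tail (fewer than 16 bytes left), or the whole copy when SSE2 is absent.
    if (i < n) {
        std::memcpy(d + i, s + i, n - i);
    }
}
```

If both pointers are known to be 16-byte aligned, _mm_load_si128 / _mm_store_si128 (movdqa) can be used instead; whether that or a non-temporal store helps on large objects is something to measure on the target CPU, not assume.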

Cheers,

--

.:hopcode[marc:rainer:kranz]:.
x64 Assembly Lab
http://sites.google.com/site/x64lab