Re: Memory copy



bartc wrote:
This is a problem that's been driving me round the bend.

I think it's an x86 thing and I've also posted in that group, but
someone here might have ideas, and ultimately fixing it might mean
messing about with malloc():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *p, *q;
    int i;
#define n 8192
//#define n 8000

    p=malloc(n);
    q=malloc(n);

    for (i=0; i<1000000; ++i)
        memcpy(p,q,5000);

}

When n is 8192, the memcpy takes nearly twice as long to execute as when
n is 8000 (or 9000, or just about anything else except 8192/16384 and so
on). (And in assembler, a byte-oriented copy takes four times as long
with n=8192 as the same byte-oriented copy with n=8000.)

Any ideas on what I can do with malloc to avoid this slowdown?

Nothing portable, since the size(s) that cause this problem will vary by
CPU.
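
As a non-portable experiment, one thing to try is over-allocating and starting the second buffer a little way into its block, so that the distance between p and q is no longer whatever malloc() happens to produce for n=8192. The sketch below is only an illustration: alloc_skewed() is a hypothetical helper, not a library function, and the skew value that helps (if any does) has to be found by measurement on the CPU in question.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical helper (not a standard function): returns a buffer of
 * `size` usable bytes that begins `skew` bytes into a larger malloc'd
 * block.  *base receives the pointer that must later be passed to
 * free().  Returns NULL, with *base set to NULL, on failure.
 * Assumption: the buffer holds chars; for other types, `skew` would
 * need to be a multiple of the required alignment. */
static char *alloc_skewed(size_t size, size_t skew, void **base)
{
    char *raw;

    *base = NULL;
    if (size > (size_t)-1 - skew)   /* size + skew would overflow */
        return NULL;
    raw = malloc(size + skew);
    if (raw == NULL)
        return NULL;
    *base = raw;
    return raw + skew;
}
```

In the test program, q=malloc(n) would become something like q=alloc_skewed(n, 256, &qbase), with free(qbase) at the end; the 256 is an arbitrary starting guess, not a recommended value.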

I suspect that the particular CPU you are testing this on has an N-way
set-associative cache (N is often 4). If the difference between p and q
is a multiple of a certain amount, the locations you are reading from
and then writing to (or is it writing to and then reading from in the
next cycle?) will end up in the same set and compete for its N ways.
With such cache contention, that level of cache can become nearly
useless, limiting you to the performance of the next level.
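
To make the mapping concrete, here is a minimal sketch of how an address selects a set in a set-associative cache. The geometry (32 KiB, 4-way, 64-byte lines) is assumed purely for illustration and is not taken from the CPU in question; the names cache_set_index() and same_set() are made up for this example.

```c
#include <stdint.h>

/* Assumed cache geometry, for illustration only. */
#define CACHE_SIZE (32u * 1024u)   /* total capacity in bytes */
#define CACHE_WAYS 4u              /* associativity (the "N" in N-way) */
#define LINE_SIZE  64u             /* bytes per cache line */
#define NUM_SETS   (CACHE_SIZE / (CACHE_WAYS * LINE_SIZE))

/* Set selected by an address: drop the offset within the line, then
 * take the line number modulo the number of sets. */
static unsigned cache_set_index(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* Nonzero if two addresses compete for the same set. */
static int same_set(uintptr_t a, uintptr_t b)
{
    return cache_set_index(a) == cache_set_index(b);
}
```

With these assumed numbers, two addresses that differ by a multiple of CACHE_SIZE / CACHE_WAYS = 8192 bytes land in the same set, while addresses 8000 bytes apart do not. A real CPU will have its own geometry, and the actual distance between p and q also depends on malloc's bookkeeping overhead.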

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking