Re: Memory copy
- From: Stephen Sprunk <stephen@xxxxxxxxxx>
- Date: Fri, 18 Dec 2009 09:20:02 -0600
bartc wrote:
> This is a problem that's been driving me round the bend.
> I think it's an x86 thing and I've also posted in that group, but
> someone here might have ideas, and ultimately fixing it might mean
> messing about with malloc():
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> #define n 8192
> //#define n 8000
>
> int main(void) {
>     char *p, *q;
>     int i;
>
>     p = malloc(n);
>     q = malloc(n);
>     if (p == NULL || q == NULL)
>         return EXIT_FAILURE;
>     memset(q, 0, n);    /* give the source buffer defined contents */
>     for (i = 0; i < 1000000; ++i)
>         memcpy(p, q, 5000);
>     free(p);
>     free(q);
>     return 0;
> }
>
> When n is 8192, the memcpy takes nearly twice as long to execute as when
> n is 8000 (or 9000, or just about anything else except 8192, 16384 and so
> on). (And in assembler, a byte-oriented copy takes four times as long
> with n=8192 as the same byte-oriented copy with n=8000.)
>
> Any ideas on what I can do with malloc to avoid this slowdown?

Nothing portable, since the size(s) that cause this problem will vary by
CPU.
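
If you are willing to accept something non-portable, one option is to
control the spacing between the two buffers yourself instead of leaving
it to malloc(). The following is only a sketch: alloc_staggered() is a
made-up helper, not a library function, and the skew of 64 bytes used in
the comment is a guess at one cache line. The right value would have to
be found by measurement on the CPU in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any library): carve two n-byte buffers
   out of a single allocation so that they sit exactly n + skew bytes apart.
   On success returns 0 and stores the pointer to pass to free() in *base.
   Example: alloc_staggered(8192, 64, &p, &q, &base) puts q 8256 bytes
   after p, which is not a power-of-two distance. */
static int alloc_staggered(size_t n, size_t skew,
                           char **p, char **q, void **base)
{
    char *block;

    assert(p != NULL && q != NULL && base != NULL);

    /* Reject sizes for which 2*n + skew would overflow size_t. */
    if (n > ((size_t)-1 - skew) / 2)
        return -1;

    block = malloc(2 * n + skew);
    if (block == NULL)
        return -1;

    *base = block;
    *p = block;             /* first buffer starts at the block */
    *q = block + n + skew;  /* second buffer is offset by the skew */
    return 0;
}
```

Whether this removes the slowdown at all is an assumption until it is
timed. It only guarantees the distance between the two buffers, not
their absolute alignment.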
I suspect that the particular CPU you are testing this on has an N-way
set-associative cache (N is often 4). If the difference between p and q
is a multiple of a certain stride (typically the cache size divided by
the number of ways), the locations you are reading from and then writing
to (or is it writing to and then reading from in the next cycle?) will
end up in the same set. With such cache contention, that level of cache
can become nearly useless, limiting you to the performance of the next
level.
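
To make the mapping concrete, here is a rough sketch of how an address
is turned into a set index in a simple set-associative cache. The
geometry (32 KiB, 8 ways, 64-byte lines) is an assumption picked for
illustration, not a statement about your CPU, and cache_set() and
same_set() are names I made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cache geometry, for illustration only. */
#define LINE_SIZE   64u                                   /* bytes per line */
#define NUM_WAYS    8u                                    /* associativity  */
#define CACHE_SIZE  (32u * 1024u)                         /* total bytes    */
#define NUM_SETS    (CACHE_SIZE / (LINE_SIZE * NUM_WAYS)) /* 64 sets        */

/* Set index for an address: drop the offset within the line, then wrap
   around the number of sets. */
static unsigned cache_set(uintptr_t addr)
{
    assert(NUM_SETS != 0);
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/* Nonzero if the two addresses compete for the same set. */
static int same_set(uintptr_t a, uintptr_t b)
{
    return cache_set(a) == cache_set(b);
}
```

With this geometry the critical stride is CACHE_SIZE / NUM_WAYS = 4096
bytes, so two addresses 8192 bytes apart land in the same set, while two
addresses 8000 bytes apart do not. The real distance between two
malloc() results also includes allocator overhead, so the actual
addresses would need to be checked rather than assumed.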
S
--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking