Re: Tips on optimizing these functions
- From: Tim Prince <tprince@xxxxxxxxxxxxxxxxxx>
- Date: Sat, 27 Sep 2008 06:44:49 -0700
Andrea Taverna wrote:
I've done some benchmarks with copying and initialisation. Compared to a
specific-nested-loop solution, the functions take up to twice the time.
However, turning on some optimization flags, specifically '-O3' with gcc,
the gap between the recursive and the specific solution reduces to 20%.
So, have you got any advice about optimizing this code?
Other suggestions are welcome as well.
typedef unsigned char byte;

// This one copies one row of the matrix. The row is supposed to store the
// values of the elements, not pointers.
void _copy_row(void* dest, void* src, unsigned short elem_size, unsigned int n)
{
    unsigned short length;
    byte *d1, *d2;

    d1 = (byte*)dest;
    d2 = (byte*)src;
    // copy byte by byte
    while (n > 0)
    {
        for (length = 0; length < elem_size; length++)
        {
            (*d1) = (*d2);
            d1++;
            d2++;
        }
        n--;
    }
}

This is so dependent on the platform that we could justifiably argue you
should choose one, and go to a forum associated with that platform.
Do any of the compilers you use take advantage of restrict?
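
As a rough illustration of what using restrict could look like here, the sketch below rewrites the row copy with C99 restrict-qualified pointers. The name copy_row_restrict and the switch to size_t parameters are assumptions made for this example, not part of the original code, and whether a given compiler actually vectorizes the loop has to be checked on the platform in question.

```c
#include <stddef.h>

typedef unsigned char byte;

/* Hypothetical restrict-qualified variant of the row copy (C99 or later).
   restrict promises the compiler that dest and src never overlap, which
   leaves it free to reorder or vectorize the loads and stores. */
void copy_row_restrict(void *restrict dest, const void *restrict src,
                       size_t elem_size, size_t n)
{
    byte *restrict d = dest;
    const byte *restrict s = src;
    size_t total = elem_size * n;   /* assumes the product does not overflow */
    size_t i;

    /* One flat loop over all bytes replaces the nested loops. */
    for (i = 0; i < total; i++)
        d[i] = s[i];
}
```

Calling this with overlapping buffers would be undefined behavior, so it only fits if the rows are known to be distinct.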
If elem_size frequently matches the size of a stdint type, you will need
to switch on elem_size and handle those sizes as separate cases, so that
the inner loop is removed for them.
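
One possible shape for such a dispatch is sketched below. The function name copy_row_dispatch and the macro COPY_FIXED are hypothetical, and the sketch assumes the exact-width types from <stdint.h> exist on the target. It uses a fixed-size memcpy() per element rather than a pointer cast, on the assumption that the compiler turns a constant-size memcpy() into a single load and store; that avoids alignment and aliasing problems but should be verified in the generated code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n elements of type T one element at a time. The size passed to
   memcpy is a compile-time constant, so no per-byte inner loop is needed. */
#define COPY_FIXED(T, d, s, n)                                        \
    do {                                                              \
        size_t i_;                                                    \
        for (i_ = 0; i_ < (n); i_++)                                  \
            memcpy((d) + i_ * sizeof(T), (s) + i_ * sizeof(T),        \
                   sizeof(T));                                        \
    } while (0)

/* Hypothetical dispatching row copy: common element sizes get a
   specialised loop, everything else falls back to a byte loop. */
void copy_row_dispatch(void *dest, const void *src,
                       size_t elem_size, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t i;

    switch (elem_size) {
    case sizeof(uint8_t):  COPY_FIXED(uint8_t,  d, s, n); break;
    case sizeof(uint16_t): COPY_FIXED(uint16_t, d, s, n); break;
    case sizeof(uint32_t): COPY_FIXED(uint32_t, d, s, n); break;
    case sizeof(uint64_t): COPY_FIXED(uint64_t, d, s, n); break;
    default:
        /* Generic fallback: flat byte-by-byte copy. */
        for (i = 0; i < elem_size * n; i++)
            d[i] = s[i];
        break;
    }
}
```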
Some compilers automatically substitute a run-time library copy function
which invokes all the usual memcpy() optimizations (align destination,
move groups of bytes per instruction).
If you wrote memcpy() in line, that would work well with certain
compilers, not so well with others (possibly depending on command-line
options and which run-time library you choose). If you are somehow
prohibited from using restrict, writing in memcpy() makes the same
assertion: that the source and destination do not overlap.
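
For comparison, a minimal sketch of the memcpy() version follows. The name copy_row_memcpy and the size_t parameters are assumptions for this example. Since memcpy() has undefined behavior when the regions overlap, the call carries the same no-overlap promise that restrict would.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical row copy that hands the whole row to the library routine.
   dest and src must not overlap; elem_size * n is assumed not to overflow. */
void copy_row_memcpy(void *dest, const void *src,
                     size_t elem_size, size_t n)
{
    memcpy(dest, src, elem_size * n);
}
```

If rows might overlap, memmove() would be the safe substitute, at some possible cost in speed.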