Re: code optimiation
- From: David Brown <david.brown@xxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Sat, 10 May 2008 21:19:38 +0200
Tomás Ó hÉilidhe wrote:
On May 10, 9:11 am, "aamer" <raqeeb...@xxxxxxxxx> wrote:
Dear all,
Are there any hard and fast rules for code optimization in C targeting a
processor?
I'd advocate using types like "uint_fast8_t" instead of "unsigned
int"; that way you'll get good performance out of all kinds of
machine, whether they be 8-Bit, 16-Bit or 5-billion-Bit. For instance
if you use "unsigned int" on an 8-Bit microcontroller where an 8-Bit
integer would suffice, then your code will be at least twice as slow
because multiple instructions are used every time you do simple
arithmetic.
Using the "fast" types can make sense, especially for speed-critical code. There are advantages in using the size-specific types, however - specifying "uint8_t" rather than "uint_fast8_t" may let the compiler (or linter) spot range errors that would not be found if "uint_fast8_t" boils down to a 32-bit value. Given that the compiler can often optimise the generated code to use the best sized types available, it's seldom worth specifying "fast" types explicitly.
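To make the difference concrete, here is a minimal sketch (the function names next_exact and sum_below are made up for illustration). It assumes a hosted C99 or later toolchain that provides uint8_t. The exact-width type has a fixed wrap point, while the "fast" type only promises at least 8 bits, so code using it has to stay inside the 8-bit range to behave the same everywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Exact-width type: the result is reduced modulo 256 on every target
 * that provides uint8_t, so 255 + 1 always yields 0. */
uint8_t next_exact(uint8_t x)
{
    return (uint8_t)(x + 1u);
}

/* "Fast" type used as a loop counter. uint_fast8_t is at least 8 bits
 * wide but may be wider (often 32 bits on a 32-bit core), so the code
 * only relies on the 0..255 range and checks that assumption. */
unsigned sum_below(unsigned n)
{
    assert(n <= UINT8_MAX);

    unsigned total = 0;
    for (uint_fast8_t i = 0; i < n; i++) {
        total += i;
    }
    return total;
}
```

Calling next_exact(255) gives 0 regardless of target, whereas incrementing a uint_fast8_t holding 255 may give 0 or 256 depending on the width the implementation picked.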
Also I'd advocate using "built-in" parts of the language where
possible, e.g.:
unsigned arr[12] = {0};
That's good advice, except that using "unsigned" contradicts your previous advice. Personally, I dislike abbreviated types like "unsigned" - I always write the implicit "int" explicitly.
instead of:
unsigned arr;
memset(arr,0,sizeof arr);
I presume you meant "unsigned arr[12];" here.
The main reason for using the {} initialiser rather than memset() or other methods is that it gives clearer and shorter source code - smaller and faster object code is a bonus (in some circumstances, compilers will optimise the memset() call to the same code anyway).
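As a rough illustration of the two forms side by side (the helper names and the LENGTH macro here are invented for the example), the following sketch zeroes an array both ways and checks the result. It also shows the initialiser applied to an array of pointers, where {0} yields null pointers by definition.

```c
#include <assert.h>
#include <string.h>

#define LENGTH 12

/* Returns 1 if every element of arr is zero, 0 otherwise. */
static int all_zero(const unsigned *arr, size_t n)
{
    assert(arr != NULL);
    for (size_t i = 0; i < n; i++) {
        if (arr[i] != 0u) {
            return 0;
        }
    }
    return 1;
}

int init_with_braces(void)
{
    unsigned arr[LENGTH] = {0};   /* elements not listed are zeroed too */
    return all_zero(arr, LENGTH);
}

int init_with_memset(void)
{
    unsigned arr[LENGTH];
    memset(arr, 0, sizeof arr);   /* sizeof arr covers the whole array */
    return all_zero(arr, LENGTH);
}

int pointers_start_null(void)
{
    const char *names[LENGTH] = {0};   /* each element is a null pointer */
    for (size_t i = 0; i < LENGTH; i++) {
        if (names[i] != NULL) {
            return 0;
        }
    }
    return 1;
}
```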
(Also the former is fully portable for dealing with types like
pointers and floating point types whose "zero value" might not be all-
bits-zero)
It is virtually impossible to write fully portable code - and totally impossible within the world of embedded programming. Forget the machines that have weird values for zeros, or bizarre numbers of bits (although some DSP's have 16-bit or 32-bit chars), or something other than two's complement arithmetic, or non-ASCII for their basic character set. It's not worth it - code suitable for an ARM is not suitable for running on a 1970's mainframe anyway.
Another thing would be about the use of the post-increment and post-
decrement operators in a conditional. For instance:
void strcpy(char *dst, char const *src)
{
while (*dst++ = *src++);
}
The idiom of using *p++ is widespread, but unfortunately its use is no
longer advisable because hardware has moved on. I think it was the
PDP11 that had a single instruction for dereferencing a pointer and
also incrementing it at the same time, thus it was beneficial to use *p++
wherever possible -- however modern machines don't have such an
instruction, so the assembler produced for *p++ when used as the
conditional in an if statement, for instance, might be sub-optimal. So
I'd say opt for:
for ( ; *dst = *src; ++dst, ++src) ;
In any code review, that form would be taken out and shot. Just because it is legal in C to write an ugly mess inside a for() statement, does not mean that it is sensible to write it. It's not even going to produce smaller or faster code - any compiler that can't produce tight code for the original while() will produce poor code from this construct too.
The first idiom is so commonly used that it is clear to any reader - although I'd have two sets of parentheses (gcc convention to disable a warning) and perhaps a comment to say that I really meant a single "=".
For less capable compilers, you are probably better with:
while ((*dst = *src)) {
dst++;
src++;
}
That's far clearer to the reader, and easier for a less sophisticated compiler.
It's always important to examine the generated assembly code, and learn to know your target architecture and your compiler's idiosyncrasies if you want to get the best from it - don't guess randomly at the most obfuscated expression you can think of.
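For comparison, here are the two acceptable forms written out as complete functions. This is only a sketch: the names copy_compact and copy_explicit are invented so as not to clash with the library strcpy, and the asserts are an addition for the example. Compiling both and comparing the assembly listings (for instance with your compiler's option for emitting assembly) is the way to see what your own toolchain does with each.

```c
#include <assert.h>
#include <string.h>

/* Compact idiom: the copy and both increments happen in the condition.
 * The doubled parentheses tell gcc the single "=" is intentional. */
void copy_compact(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst++ = *src++)) {
        /* empty body: all the work is in the condition */
    }
}

/* Explicit form: the same copy with the increments spelled out,
 * which is easier on the reader and on a simple compiler. */
void copy_explicit(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    while ((*dst = *src)) {   /* assignment intended; stops after copying '\0' */
        dst++;
        src++;
    }
}
```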
Moving on...
On most machines, I would use pointers instead of element indices for
iterating thru an array. For example:
char *p = arr;
char const *const pend = arr + LENGTH;
do if ('a' == *p) return 1;
while (pend != ++p);
instead of:
unsigned i = 0;
do if ('a' == arr[i]) return 1;
while (LENGTH != ++i);
First off, get yourself a decent compiler. It will do the same job, and let you write the source code using proper array constructs.
Secondly, don't write a loop like that (first or second forms) without using brackets - it's unclear, and changes can easily break the code.
Third, forget the silly "if (constant == variable)" form of expression unless you are working for MISRA nazis (i.e., those that think the rules are unbendable). The logical and sensible ordering when reading such a comparison is normally "if (variable == constant)". If your compiler does not spot mistakes such as using a single "=" when you meant "==", get a better compiler or a better linter.
The latter, on most architectures, is a hell of a lot slower. But then
again there are some PC's that have a single instruction for "pointer
+ offset", so I can't discredit that technique altogether.
Do you have any evidence whatsoever for such a wild claim? A good compiler will use pointer instructions for array access, and will do the strength reduction turning the array loop into an incrementing pointer. Also, there are plenty of current modern architectures that have array memory modes that will be used as appropriate.
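A sketch of the same search written both ways, with braces and the conventional comparison order (the function names and the length parameter are invented for the example). Unlike the do/while versions quoted above, these also handle a zero-length array. Whether the two compile to identical code depends on the compiler and optimisation level, so check the listing for your own target.

```c
#include <assert.h>
#include <stddef.h>

/* Array-index form: states the intent directly. */
int contains_indexed(const char *arr, size_t length, char wanted)
{
    assert(arr != NULL);
    for (size_t i = 0; i < length; i++) {
        if (arr[i] == wanted) {
            return 1;
        }
    }
    return 0;
}

/* Pointer form: the hand-written equivalent of the strength reduction
 * an optimising compiler can apply to the indexed loop. */
int contains_pointer(const char *arr, size_t length, char wanted)
{
    assert(arr != NULL);
    const char *const end = arr + length;
    for (const char *p = arr; p != end; p++) {
        if (*p == wanted) {
            return 1;
        }
    }
    return 0;
}
```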
On all architectures, I advocate the use of look-up tables instead of
switch statements where applicable, especially when it's possible to
have a look-up table containing function pointers.
You can advocate all you want - fortunately most people will ignore you. The compiler will almost always generate better code for common switch cases than a lookup table - and will generate a jump table automatically as necessary. This will be significantly smaller and faster than a lookup table of function pointers. (There are plenty of good reasons for using a table of function pointers as a code construct - it's just that replacing switch statements is not one of them.)
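To show what is being compared, here is a small sketch with an invented three-operation enum (none of these names come from the thread). The switch version leaves the choice of compare chain or jump table to the compiler; the table version forces an indirect call through a function pointer and needs its own range check.

```c
#include <assert.h>
#include <stddef.h>

enum op_code { OP_ADD, OP_SUB, OP_MUL, OP_COUNT };

/* Switch form: the compiler picks the dispatch strategy and can
 * inline the small case bodies. */
int apply_switch(enum op_code code, int a, int b)
{
    switch (code) {
    case OP_ADD: return a + b;
    case OP_SUB: return a - b;
    case OP_MUL: return a * b;
    default:     return 0;    /* unknown code */
    }
}

static int do_add(int a, int b) { return a + b; }
static int do_sub(int a, int b) { return a - b; }
static int do_mul(int a, int b) { return a * b; }

typedef int (*binary_fn)(int, int);

/* Table form: one function pointer per operation. */
static const binary_fn op_table[OP_COUNT] = {
    [OP_ADD] = do_add,
    [OP_SUB] = do_sub,
    [OP_MUL] = do_mul,
};

int apply_table(enum op_code code, int a, int b)
{
    if ((unsigned)code >= (unsigned)OP_COUNT) {
        return 0;             /* unknown code, same as the switch default */
    }
    assert(op_table[code] != NULL);
    return op_table[code](a, b);
}
```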
If you're ever dealing with a struct that has a lot of information in
it which is common to a "type", then it might be advisable to follow C++'s
idiom of removing that stuff from the struct and replacing it with
a pointer to a single object which contains all the relevant
information for that type (a V-Table, that is).
It *might* be, but it sounds very unlikely. What you describe is not a C++ idiom, and it's not a vtable - you are describing static data members.
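For what it's worth, the construct being described looks roughly like this in C. It is a sketch with invented names (sensor_type, sensor, thermometer): the per-type constants live in one shared object, and each instance carries only a pointer to it, which is the C counterpart of static data members rather than a table of virtual functions.

```c
#include <assert.h>
#include <stddef.h>

/* Data common to every instance of one kind of sensor. */
struct sensor_type {
    const char *unit;
    int scale;
    int offset;
};

/* Per-instance data: a pointer to the shared description plus the
 * one value that actually differs between instances. */
struct sensor {
    const struct sensor_type *type;
    int raw;
};

/* The single shared object for this kind of sensor. */
static const struct sensor_type thermometer = { "mC", 100, -5000 };

struct sensor make_thermometer(int raw)
{
    struct sensor s = { &thermometer, raw };
    return s;
}

int sensor_value(const struct sensor *s)
{
    assert(s != NULL && s->type != NULL);
    return s->raw * s->type->scale + s->type->offset;
}
```

Whether this saves anything depends on how many instances there are and how large the shared part is; with only a handful of instances, the extra pointer and indirection can cost more than they save.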
Emmm they're the main ones that come to mind right now..