Re: Coldfire MCF5475 performance question
- From: David Brown <david@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: 7 Jun 2006 16:28:34 +0200
David Hearn wrote (quoting earlier messages from David Brown and himself):
<snip>
I can't see how that could give 1000 iterations, unless the local variable is initialised with -937 instead of 0. It's also very poor code - are you compiling it with all optimisation off? I find it is normally much easier to see what is happening at the assembly level with basic optimisation enabled.
Optimisation is off at present, as recommended by P&E for debugging. I also wasn't sure whether the compiler would optimise the loop out entirely, as it has no instructions within it.
Too much optimisation can make debugging hard - it becomes hard to follow what's happening as the compiler re-arranges everything. But too little optimisation can also make it difficult, since the compiler puts data on the stack and uses unnecessarily long, slow code sequences. It depends on your compiler and debugger, but I find (with gcc and gdb) that -O gives a reasonable compromise.
You have to watch out for a few things, however - an empty loop like this can be removed entirely. The correct way to deal with this is to declare "temp_loop" to be volatile (a slightly different alternative would be to add an assembly "nop" inside the loop).
The actual code which generated this empty loop was:
for (temp_loop = 0; temp_loop < 1000; temp_loop++)
{
}
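As a minimal sketch of the "volatile" suggestion above, the same empty loop can be written so that an optimising compiler has to keep it. The function name delay_loop and the uint32_t counter type are assumptions for illustration; the original post does not show how temp_loop was declared.

```c
#include <stdint.h>

/* Hypothetical helper: spin for `count` iterations and return the final
   counter value. Declaring the counter volatile obliges the compiler to
   perform every read and write of it, so the loop is not removed even
   though its body is empty. */
uint32_t delay_loop(uint32_t count)
{
    volatile uint32_t temp_loop;

    for (temp_loop = 0; temp_loop < count; temp_loop++)
    {
        /* intentionally empty */
    }
    return temp_loop;
}
```

Calling delay_loop(1000) corresponds to the loop shown above. Note that volatile also forces the counter into memory rather than a register, so the timing reflects memory accesses as well as the branch.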
Are you running this from external memory with all caching and the like disabled?
Running from SDRAM.
As for cache - I didn't realise I had to turn it on! I looked at the startup code and found an assignment to CACR (the cache control register) that invalidates the data, branch and instruction caches but does not enable them. I've since set the three enable bits in this register to turn each of the caches on, and the empty loop now takes 106 us, an improvement of about a factor of 14.
Thanks for the advice on that!
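To put the 106 us figure in perspective, here is a rough sketch that converts a measured loop time into core clock cycles per iteration. It assumes the core really is running at 266 MHz, which has not been confirmed in this thread, and the function name cycles_per_iteration is made up for illustration.

```c
/* Hypothetical helper: core clock cycles spent per loop iteration.
   Microseconds multiplied by MHz gives a cycle count, which is then
   divided by the number of iterations. */
double cycles_per_iteration(double loop_time_us, double core_mhz,
                            unsigned iterations)
{
    return (loop_time_us * core_mhz) / (double)iterations;
}
```

With 106 us, an assumed 266 MHz and 1000 iterations this works out to roughly 28 cycles per iteration. Taking the "factor of 14" at face value, the uncached time was about 1484 us, or roughly 395 cycles per iteration under the same assumption.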
Have you checked your clock, to see if you are running at 266 MHz ?
This is something I thought of, but wasn't sure where to check this - it's a standard evaluation board with little/no configuration available, and this eval board only had 1 model, so no chance of mistake over purchase.
I'll now go back and look at the other suggestions and see whether there's any more tweaking I can do - but for now, it appears that at least for an empty loop, performance has increased.
Thanks again
D
The easiest way to check your clock rate, if you have a decent scope, is to look at the clock output pin (driving the clock to the sdram, for example). You'll have to check what the bus division ratio is for your chip - it's likely to be divide by 3.
The clock rate for most micros (with configurable clocks) is very conservative to start with - the 150 MHz MCF5234 I'm using at the moment comes out of reset at 37.5 MHz (using a 25 MHz reference).
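The arithmetic behind these checks can be sketched as below. Both function names are hypothetical, and the core-to-bus ratio has to come from the datasheet for your chip; the code only does the multiplication and division.

```c
/* Hypothetical helper: estimate the core clock from the bus clock
   measured on the scope and the core-to-bus division ratio. */
double core_mhz_from_bus(double bus_mhz, double core_to_bus_ratio)
{
    return bus_mhz * core_to_bus_ratio;
}

/* Hypothetical helper: ratio of a clock to the reference it is
   derived from. */
double clock_ratio(double out_mhz, double ref_mhz)
{
    return out_mhz / ref_mhz;
}
```

For the MCF5234 figures above, clock_ratio(37.5, 25.0) gives 1.5 out of reset, against 6.0 for the full 150 MHz. As a purely illustrative case, a measured bus clock of 50 MHz with a divide-by-3 ratio would imply a 150 MHz core.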