Re: Coldfire MCF5475 performance question



David Hearn wrote:
David Brown wrote:
David Hearn wrote:
David Brown wrote:
David Hearn wrote:

<snip>

I can't see how that could give 1000 iterations, unless the local variable is initialised with -937 instead of 0. It's also very poor code - are you compiling it with all optimisation off? I find it is normally much easier to see what is happening at the assembly level with basic optimisation enabled.

Optimisation is off at present as recommended by P&E for debugging. Also wasn't sure if it would optimise out the loop entirely as it had no instructions within it.


Too much optimisation can make debugging hard - it becomes difficult to follow what's happening as the compiler re-arranges everything. But too little optimisation can also make it difficult, since the compiler puts data on the stack and uses unnecessarily long, slow code sequences. It depends on your compiler and debugger, but I find (with gcc and gdb) that -O gives a reasonable compromise.

You have to watch out for a few things, however - an empty loop like this can be removed entirely. The correct way to deal with this is to declare "temp_loop" to be volatile (a slightly different alternative would be to add an assembly "nop" inside the loop).
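As a minimal sketch of the volatile approach (the function name delay_loop and the uint32_t counter type are assumptions for illustration, not taken from the original code), the counter is declared volatile so the compiler has to perform every read and write of it and cannot drop the loop:

```c
#include <stdint.h>

/* Hypothetical helper: spin for `count` iterations.
   Because temp_loop is volatile, each compare and increment must
   really access the variable, so the empty loop survives optimisation. */
static uint32_t delay_loop(uint32_t count)
{
    volatile uint32_t temp_loop;

    for (temp_loop = 0; temp_loop < count; temp_loop++) {
        /* intentionally empty */
    }

    return temp_loop; /* equals count once the loop has finished */
}
```

Calling delay_loop(1000) corresponds to the 1000-iteration loop discussed here. The time it takes still depends on the compiler, the optimisation level and the cache settings, so it needs to be measured rather than assumed.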


The actual code which generated this empty loop was:

for (temp_loop = 0; temp_loop < 1000; temp_loop++)
{
}

Are you running this from external memory with all caching and the like disabled?

Running from SDRAM.

As for cache - didn't realise I had to turn it on! Looked at some startup code and found assignment to CACR (cache control register) invalidating the data, branch and instruction caches and not turning them on. I've since set the 3 bits in this register to turn each of the caches on, and the empty loop now takes 106us, a factor of 14 improvement.
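To put that figure in perspective, here is a rough sketch of the arithmetic, assuming the core really is running at the nominal 266 MHz (not yet confirmed). The helper name cycles_per_iteration is made up for illustration:

```c
/* Hypothetical helper: core clock cycles spent per loop iteration.
   total_us   - measured time for the whole loop, in microseconds
   iterations - number of loop iterations
   core_mhz   - ASSUMED core clock in MHz (i.e. cycles per microsecond) */
static double cycles_per_iteration(double total_us,
                                   unsigned iterations,
                                   double core_mhz)
{
    return total_us * core_mhz / (double)iterations;
}
```

With the numbers above, cycles_per_iteration(106.0, 1000, 266.0) comes to about 28 cycles per iteration. That is still a lot for an empty loop, which fits with unoptimised code keeping the counter on the stack, or with a core clock lower than assumed.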

Thanks for the advice on that!

Have you checked your clock, to see if you are running at 266 MHz ?

This is something I thought of, but wasn't sure where to check this - it's a standard evaluation board with little/no configuration available, and this eval board only had 1 model, so no chance of mistake over purchase.

I'll now go back and look at the other suggestions and see whether there's any more tweaking I can do - but for now, it appears that at least for an empty loop, performance has increased.

Thanks again

D

The easiest way to check your clock rate, if you have a decent scope, is to look at the clock output pin (the one driving the clock to the SDRAM, for example). You'll have to check what the bus division ratio is for your chip - it's likely to be divide by 3.

The clock rate for most micros (with configurable clocks) is very conservative to start with - the 150 MHz MCF5234 I'm using at the moment comes out of reset at 37.5 MHz (using a 25 MHz reference).