Re: code optimization in embedded systems



Stefan Reuther wrote:
Wilco Dijkstra wrote:
"Ken Asbury" <avoidingspam2001@xxxxxxxxx> wrote in message
Not to sound argumentative, but my experience (and a very good living
over the last 30+ years) has repeatedly demonstrated that, in an
embedded environment, someone familiar with both disciplines can
routinely achieve an order-of-magnitude improvement in the speed of
selected functions. The trick is to be able to determine which functions
to rewrite in assembler.
I don't mind sounding argumentative, but 10 times is just ridiculous,
it is simply impossible for a compiler to produce code that bad.

Look at gcc56k and write a simple FIR filter. Then do the same in 56k
assembler and compare. Our 56k developer once showed me the listings,
and there easily is a 10x difference.

In the case of the good old 56000 DSP, I can imagine that being true. I remember a 56000 developer I used to work with who was particularly good at translating algorithms into handcrafted 56000 code. In this case the inner loop took about 10 statements in C, while the 56000 code was 2 or 3 instructions. It took him more than a day and several iterations to get there, and the code bore no resemblance whatsoever to the original C code.
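
For anyone who wants to try the comparison themselves, here is a minimal sketch of the kind of plain C FIR inner loop being discussed. It is not the code from either listing mentioned above; the function name fir_sample, the Q15 fixed-point format and the flat (non-circular) delay line are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute one output sample of a FIR filter.
 * coeff and delay are assumed to hold Q15 fixed-point values, and
 * delay is assumed to be a flat array of the most recent `taps` inputs.
 */
int32_t fir_sample(const int16_t *coeff, const int16_t *delay, size_t taps)
{
    int64_t acc = 0; /* wide accumulator so the sum of products cannot overflow */

    for (size_t i = 0; i < taps; i++) {
        /* One multiply-accumulate per tap: two loads, a multiply and an add. */
        acc += (int32_t)coeff[i] * delay[i];
    }

    /* Scale the Q30 sum of products back to Q15. Right-shifting a negative
       value is implementation-defined in C (arithmetic on most compilers). */
    return (int32_t)(acc >> 15);
}
```

On a DSP with a single-cycle MAC and parallel data moves, a hand-written loop can typically fold the multiply, the accumulate and both operand fetches into one instruction per tap. A compiler that does not recognize this pattern emits each step separately, plus index and loop overhead, which is where a large gap can come from.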

Granted, gcc56k is really old, but when I evaluated Blackfin gcc a while
ago, I also saw an order-of-magnitude difference compared to VisualDSP++
for similar computationally intensive code. gcc simply isn't yet smart
enough for that target (maybe it is today, haven't looked for a while).

GCC isn't generally known for its optimizer; often there are better alternatives.

Modern compilers routinely beat all but the most experienced assembly
programmers. On ARM for example you'd be having difficulty getting
more than 10% speedup over compiled code, and getting more than
that is only possible in extremely rare cases. However even 2 times
speedup for a non-trivial piece of code is impossible.

Billions of ARM processors are shipped each year, hence good tools for
them are widely available and understood. For lower volume
architectures, it might look much worse.

It also depends on the architecture itself; some architectures are just not good targets for C compilers.