Re: STM32 ARM toolset advice?



In article <bM_Gk.19032$wU.1096@xxxxxxxxxxxxx>,
Wilco.removethisDijkstra@xxxxxxxxxxxx says...

"Mark Borgerson" <mborgerson@xxxxxxxxxxx> wrote in message news:MPG.2355df51ee57f03298991d@xxxxxxxxxxxxxxxxxxxxxxxxx
In article <bWQGk.4374$wG3.1249@xxxxxxxxxxxxx>,
Wilco.removethisDijkstra@xxxxxxxxxxxx says...

"Mark Borgerson" <mborgerson@xxxxxxxxxxx> wrote in message news:MPG.2355722479c1065e989919@xxxxxxxxxxxxxxxxxxxxxxxxx
In article <81ene4hsatqaphkmp01cikmpk9l7ana9qi@xxxxxxx>,
nobody@xxxxxxxxxxxxxxxx says...

Keil has been bought by ARM, and AFAIK they now use the compiler from
ARM. Apparently this compiler generates the best code for ARM's CPUs.
GCC generates quite good code for the ARM these days. The biggest
drawback is the use of newlib. Rowley provides a nice IDE with GCC,
and their own library, which removes the one disadvantage of using
GCC. Their product is also available for Windows and Linux.


One of the biggest problems I ran into with GCC-ARM, when using the
Linux libraries for the TRITON boards, is that floating point operations
are executed as kernel interrupts (Undefined Instruction exceptions
generating jumps to a floating point emulator, I think). I think it
turned out to be several times slower than the IAR floating point
library, which runs in user mode.

And given that the IAR floating point library is one of the slowest available,
that is quite slow indeed.


I found that an ARM (PXA-255 at 400 MHz, like that in the Triton board)
does an FP multiply in about 0.13 microseconds using the Soft-Float
library. (It takes about 10X longer using floating-point emulation in
the Linux kernel.)

That's about 52 cycles, which is pretty good if it is a double-precision multiply.
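
For reference, the 52-cycle figure follows directly from the numbers quoted above: elapsed time in microseconds times the clock in MHz gives cycles. A minimal sketch of that arithmetic (the function name is made up for illustration):

```c
/* Convert an elapsed time to CPU cycles.
 * 1 MHz is one cycle per microsecond, so cycles = microseconds * MHz.
 * cycles_from_time is a hypothetical helper, not part of any library. */
double cycles_from_time(double elapsed_us, double clock_mhz)
{
    return elapsed_us * clock_mhz;
}
```

With the figures from the post, cycles_from_time(0.13, 400.0) gives about 52 cycles for the Soft-Float multiply, and the "about 10X longer" kernel-emulation case (roughly 1.3 microseconds) works out to about 520 cycles.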

I think it was only single-precision (32 bits) IEEE-754 format. It is
also possible that the soft float library did not properly handle
NaN and some other conditions. It was supposedly highly optimized for
the array-multiply operations for which it was used (part of an
extended Kalman Filter).

http://albatross-uav.org/index.php/Benchmarks
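
Whether a given soft-float multiply handles NaN and infinity can be checked with a few lines of C. The following is only a sketch, assuming the toolchain's float multiply is meant to follow IEEE-754 rules. The function names are hypothetical, and nothing here is taken from the library discussed above.

```c
/* NaN is the only value that compares unequal to itself. */
int is_nan_f(float x)
{
    return x != x;
}

/* Returns 1 if float multiply treats NaN and infinity as IEEE-754
 * requires, 0 otherwise. The volatile operands keep the compiler from
 * folding the arithmetic at compile time, so the runtime multiply
 * routine is what actually gets exercised. */
int fmul_handles_specials(void)
{
    volatile float zero = 0.0f;
    volatile float one = 1.0f;
    float inf = one / zero;    /* +infinity */
    float nan = zero / zero;   /* NaN */
    int ok = 1;

    ok &= is_nan_f(nan * one);   /* NaN must propagate */
    ok &= is_nan_f(inf * zero);  /* inf * 0 is an invalid operation */
    ok &= (inf * one == inf);    /* inf * finite stays inf */
    return ok;
}
```

A library that skips these cases to save cycles would be expected to return 0 here, which is one way to see what a speed-optimized routine has traded away.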

I did find that an AT91SAM7S at 16 MHz, using the IAR libraries,
had about 5 times the floating point performance of an M68332 at the
same clock using the CodeWarrior libraries. How much of that is
better code and how much is due to a better CPU is open to
question.

The M68332 is such a slow CISC that one can do a full floating point
multiply on ARM in less than half the time it takes the M68332 to execute
one 32x32 multiply instruction.


I agree. I used to have to worry about the effects of DIV and DIVU
instructions on interrupt latency in some low-jitter analog input
routines.

I suspect that there is some real art in the design and coding of
an ARM FP library, so that the operations take full advantage of the
shift/rotate and evaluate instructions and are scheduled to
keep the pipelines full. I don't think the earlier IAR libraries
took advantage of the instruction set as much as they could have
as the code is written in C (by P.J. Plauger in 1994, according
to the available library source). A lot of the math.h routines
also expand 32-bit floats to doubles before doing the math.
That's a good way to maintain precision, but not the fastest
way to do 32-bit FP math.
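
The same float-to-double widening can creep into user code without any library call. This is a small illustration of the general C rule, not of the IAR library source: an unsuffixed constant such as 0.1 has type double, so a float operand is converted to double before the multiply. Both function names are made up for the example.

```c
/* Stays in single precision: the 'f' suffix makes the constant a float,
 * so the multiply is a 32-bit float multiply. */
float scale_single(float x)
{
    return x * 0.1f;
}

/* Widens to double: 0.1 is a double constant, so x is converted to
 * double, multiplied in double precision, and rounded back to float. */
float scale_promoted(float x)
{
    return (float)(x * 0.1);
}
```

On a core without an FPU, the second form would normally cost a float-to-double conversion, a double-precision multiply and a conversion back, all in software, so keeping constants and intermediate values in float matters when only 32-bit results are needed.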


Mark Borgerson
