Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: wilco.dijkstra@xxxxxxxxxxxx
- Date: Sun, 24 Jun 2007 08:45:49 -0700
On 23 Jun, 03:10, rickman <gnu...@xxxxxxxxx> wrote:
On Jun 22, 7:34 pm, "Wilco Dijkstra" <Wilco_dot_Dijks...@xxxxxxxxxxxx>
wrote:
"rickman" <gnu...@xxxxxxxxx> wrote in messagenews:1182540547.184518.91830@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
The data *** says it requires one wait state from 24 to 48 MHz and 2
wait states above 48 MHz. So compared to the Luminary parts running
at 50 MHz with *NO* wait states, I say the ST M3 parts are dogs.
It's not that bad. Cortex-M3 has a prefetch buffer and branch prediction. This
means that the cost of a single waitstate can be hidden for conditional branches,
ie. only indirect branches have a penalty. With 2 wait states the branch prediction
only works on unconditional branches, so you'll get a slowdown. However you can
change loops to use an unconditional branch at the end so they run at the speed
of zero-wait state memory.
I don't follow what you are saying at all. Branch prediction relates
to pipelining. I don't see how it relates to wait states.
Adding a wait state is the same as increasing the pipeline depth, and
branch
prediction coupled with prefetching can hide some of that latency.
The required wait states are added because of a fundamental limitation in
the bandwidth of the Flash memory. You can look-ahead all you want,
but you can still only return one word from Flash per 3 clock cycles
when running at full speed. Unless the Flash word width is increased
(as in the NXP designs) or the instruction size is reduced (many
Cortex M3 instructions are 16 bits, but they would need to be 10 bits
with two wait states and 32 bit memory) this will limit performance in
the Cortex M3.
Am I completely missing something? I always leave that possibility
open...
You have to increase the fetch width if you add waitstates, that's a
given. The
M3 TRM recommends 64-bit flash fetch. While this allows straightline
code to run
at full speed, branches are still slow. What I meant is that M3 has
branch
optimizations that reduce this slowdown.
Yes, that is all pretty obvious. But it does not address the point of
the Y intercept being a hefty 9 mA. This is not as high as the Analog
Devices ARM parts, but it is significant. It means you need to use
modes and hardware features to get better power savings compared to
just slowing the clock which is much simpler to do.
powerFrom what I remember, I think it is the PLL and flash that cause high
consumption at lower speeds. The specs showed various settings that
use
significantly less than 9mA below 8MHz. So I don't think it really
burns 9mA
at 0MHz (which is what I think you mean with Y intercept, right?).
The power for the STM32 is from the data *** and includes basic
power to the peripherals, although since they are not performing work
the power they draw is less than typical. So the comparison is not
perfect.
The ST specs list some numbers with peripherals off, and that is less
than
half the normal current, 21mA from flash at 72MHz IIRC - pretty good.
Now consider that an M3 runs twice as fast as a SAM7 at the same frequency,
so the MIPS/Watt is 4 times as good!
How do you support the claim that the M3 runs twice as fast as the
SAM7 at the same frequency??? Maybe I don't want to know...
Because I've benchmarked it myself?
I have not seen anyone claim that the M3 runs twice as fast as an ARM7
clock for clock. I don't even think ARM claims that. I seem to
ARM claims 70% improvement, but that is an understatement due to the
Dhrystone benchmark. EEMBC is more accurate here.
recall that after all the hoopla is removed, you might see from 10% to
25% speedup from the ARM7 to the M3 depending on your application. If
you disagree on this basic point, then I think we should not discuss
it further. I have seen it discussed before ad nauseum with no hard
information to support any given number.
You seem to forget that the M3 was designed from the ground up to be
an efficient
MCU, while ARM7 wasn't:
1. Thumb-2 is more efficient than Thumb-1 (just like ARM is faster
than Thumb-1)
2. Cortex-M3 has a more efficient micro architecture than ARM7 (it
beats ARM9)
3. Cortex-M3 slows down less with wait states than ARM7
4. New features like unaligned access, hardware divide, better
interrupts
Both 1 and 2 provide about 40% improvement, so about 2 times speedup
together.
3 can give somewhere between 10 and 20% extra performance, but only at
higher
frequencies when waitstates are used. Depending on the application
used, 4 can make
a huge difference (I've seen benchmarks go twice as fast just because
of hardware divide).
I've personally benchmarked the effects of 1, 2 and 4 on a large
amount of benchmarks.
Now where did you get that 10-25% number from?
Actually, I guess a power factor would be required for the SAM7 parts
as well since they run with one wait state at their top speed. So
maybe the STM32 part do better on power than I realized!
If you're trying to compare MIPS/Watt don't forget that different cores running
at the same frequency do not run at the same speed.
Yes, but that is a small delta compared to adding waitstates with a 2x
or 3x reduction in performance and therefore the same effect on power
efficiency.
Not at all. Adding waitstates doesn't slow you down by that much as
you make
the memory wider (not doing that makes no sense at all, so I do not
consider
that a valid option). But instruction set and microarchitecture
differences can easily
make a factor of 2 difference, as shown above.
Wilco
.
- References:
- New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Bill Giovino
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Jim Granville
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Simone
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Bill Giovino
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Wilco Dijkstra
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Jonathan Kirwan
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Jim Granville
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Bill Giovino
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Jim Granville
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Bill Giovino
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Eric
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: rickman
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: Wilco Dijkstra
- Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- From: rickman
- New ARM Cortex Microcontroller Product Family from STMicroelectronics
- Prev by Date: Re: What is your development setup on Linux (or equiv)?
- Next by Date: Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- Previous by thread: Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- Next by thread: Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics
- Index(es):