Re: New ARM Cortex Microcontroller Product Family from STMicroelectronics



On Jul 5, 4:30 pm, "Ulf Samuelsson" <u...@xxxxxxxxxxxxx> wrote:
"rickman" <gnu...@xxxxxxxxx> wrote in message news:1183651600.912254.284760@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

On Jun 24, 11:45 am, wilco.dijks...@xxxxxxxxxxxx wrote:
On 23 Jun, 03:10, rickman <gnu...@xxxxxxxxx> wrote:
I don't follow what you are saying at all. Branch prediction relates
to pipelining. I don't see how it relates to wait states.

Adding a wait state is the same as increasing the pipeline depth, and
branch prediction coupled with prefetching can hide some of that latency.

I don't see how that is true at all. When you add a waitstate you
freeze all stages of the pipeline while you wait for the Flash to
finish the access.

I don't know exactly how the Cortex works, but I worked on the internals
of another 32 bit RISC core.

Thanks for the rundown on this alternative CPU. Sounds a bit like the
National 32 bit CPU with variable length instructions. That was
supposed to be a fast CPU, but not a commercial success. If there had
been a longer term commitment, it may have grown in popularity. But
the realities of the commercial CPU market allowed it to pass on to
the CPU boneyard.

This core had a 16 byte FIFO in the first pipeline stage.
The prefetch mechanism loaded 32 bits into this FIFO each access.
The memory controller could add waitstates to this access if necessary.

....snip...

Since most instructions are 16 bits, and you read 32 bits at a time,
zero waitstate operation allows fetching almost two instructions per cycle.
The FIFO will quite soon be filled, and if the odd 32/48 bit instruction pops
up, it won't hurt your performance.

No, the "odd" 48 bit instruction won't hurt performance, but the FIFO
already has had a negative influence anytime the instruction sequence
is not linear. It is, in terms of the negative effect, like adding
pipeline stages. The entire FIFO has to be flushed anytime you
branch.


If you have one waitstate, you will see that the bandwidth is still high
enough that 1MIPS/MHz can be maintained as long as you only
execute 16 bit instructions. You will be hurt by fetching a 32 bit
instruction since that takes 2 clocks.
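
The bandwidth argument above can be checked with a rough model. This is only a sketch: it assumes each flash access takes (1 + waitstates) clocks and delivers 32 bits, that the core issues at most one instruction per clock, and that the code is purely linear so the FIFO is never flushed. The function and constant names are made up for illustration.

```python
# Rough fetch-bandwidth model; all names here are illustrative assumptions.
FETCH_BYTES = 4  # the prefetcher loads 32 bits per flash access


def sustained_ipc(instr_bytes, waitstates):
    """Upper bound on instructions per clock for straight-line code.

    Assumes each flash access takes (1 + waitstates) clocks, the core
    issues at most one instruction per clock, and the FIFO only runs
    dry when fetch bandwidth is too low.
    """
    fetch_bytes_per_clock = FETCH_BYTES / (1 + waitstates)
    return min(1.0, fetch_bytes_per_clock / instr_bytes)


for waitstates in (0, 1):
    for instr_bytes in (2, 4):
        ipc = sustained_ipc(instr_bytes, waitstates)
        print(f"{waitstates} waitstate(s), {instr_bytes * 8} bit instructions: "
              f"{ipc:.2f} instructions per clock")
```

Under these assumptions, one waitstate still sustains one 16 bit instruction per clock, while a stream of 32 bit instructions drops to one every 2 clocks.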

Even executing 16 bit instructions takes a 1 clock cycle hit on a
branch. Instead of having the next instruction in the FIFO, you have
to wait 2 clock cycles before you can start decoding it.
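
To put a rough number on that hit, here is a minimal sketch. It assumes every taken branch flushes the FIFO and costs one extra clock per waitstate, and that everything else runs at one instruction per clock. The taken-branch fractions are hypothetical inputs, not measurements, and the function name is made up.

```python
def effective_ipc(taken_branch_fraction, waitstates):
    """Instructions per clock when each taken branch stalls for the waitstates.

    Assumes a base rate of one instruction per clock and that the only
    extra cost is `waitstates` clocks on every taken branch.
    """
    if not 0.0 <= taken_branch_fraction <= 1.0:
        raise ValueError("taken_branch_fraction must be between 0 and 1")
    clocks_per_instruction = 1.0 + taken_branch_fraction * waitstates
    return 1.0 / clocks_per_instruction


# Hypothetical branch mixes, one waitstate.
for fraction in (0.0, 0.1, 0.2):
    print(f"{fraction:.0%} taken branches: "
          f"{effective_ipc(fraction, 1):.2f} instructions per clock")
```

With no taken branches the model gives 1.0, and the rate falls as the share of taken branches grows, which is the point about non-linear code.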


I have run the SAM7 at 48 MHz, zero waitstate. Does not work over the full
temp range though.
The AVR32 will support 1.2 MIPS/MHz @ 1 waitstate operation @ 66 MHz
due to its 33 MHz 2 way interleaved flash memory.
(1st access after jump is two clocks, subsequent accesses are 1 clock)
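
The interleaving timing quoted above can be sketched as follows. This only restates the figures given (66 MHz core, 2 flash banks, 2 clocks for the first access after a jump, 1 clock for each following sequential access); the names are made up, and it says nothing about how the controller is actually built.

```python
CPU_MHZ = 66      # core clock from the post
FLASH_BANKS = 2   # 2 way interleaved flash


def fetch_clocks(words_in_run):
    """Clocks to fetch a run of sequential words that starts at a jump target.

    Assumes 2 clocks for the first access and 1 clock for every
    following sequential access, as stated in the post.
    """
    if words_in_run < 1:
        raise ValueError("a run has at least one word")
    return 2 + (words_in_run - 1)


print(f"each bank runs at {CPU_MHZ / FLASH_BANKS:.0f} MHz")
for words in (1, 2, 8):
    clocks = fetch_clocks(words)
    print(f"{words} sequential word(s): {clocks} clocks, "
          f"{clocks / words:.2f} clocks per word")
```

The longer the sequential run after a jump, the closer the average gets to 1 clock per word.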

How does that compare to the Cortex M3 running at 50 MHz with no
waitstates and no branch penalty?

