Re: Caches in embedded systems



On Thu, 04 May 2006 22:21:58 +0200, Rob Windgassen
<rwindgas.delete.this@xxxxxxxxx> wrote:

On Thu, 04 May 2006 11:14:01 +0300, Paul Keinanen wrote:

On 3 May 2006 18:04:17 -0700, "shrey" <shreyas76@xxxxxxxxx> wrote:

I know caches are avoided in real time applications

Unless the cache is very badly implemented, the worst case timing
occurs when the cache is disabled.

Independent of the cache implementation, the software *can*
cause really bad timing in some cases, especially for data accesses.
When the software accesses a large amount of memory, i.e. larger than
the cache size, in an (almost) random pattern, nearly every access
causes a cache line load instead of the single word read the program
actually needs.
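As a rough sketch of such a cache-hostile pattern (the line size and
buffer size below are just assumed figures, not taken from any
particular system; a strided walk is used here instead of a truly
random one, but it misses just as reliably once the buffer is much
larger than the cache):

    /* Illustration only: assumes a 32-byte cache line and a buffer
     * much larger than the cache; both sizes are made-up examples. */
    #define LINE_SIZE  32
    #define BUF_BYTES  (4 * 1024 * 1024)    /* assumed >> cache size */

    static unsigned char buf[BUF_BYTES];

    unsigned long touch_one_byte_per_line(void)
    {
        unsigned long sum = 0;
        unsigned long i;

        /* One access per cache line: each read misses and loads a
         * whole line, of which only a single byte is actually used. */
        for (i = 0; i < BUF_BYTES; i += LINE_SIZE)
            sum += buf[i];
        return sum;
    }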

Even in this situation, the difference is usually not that dramatic.
E.g. in a typical x86 implementation with a 32-byte cache line and
8-byte (64-bit) wide DRAMs, a cache line load requires one full
RAS/CAS cycle (which includes the DRAM access time) to get the first 8
bytes, plus three additional CAS cycles (to get the remaining 24
bytes), which essentially just activate a data selector on the DRAM.

A direct random memory access would still require the full RAS/CAS
cycle to get up to 8 bytes of data. Both the cache line load and the
random-access read contain a single memory cell access time (which
depends on the cell technology), while the line fill additionally
multiplexes three more 8-byte words onto the memory data bus (the
time depends on the bus speed).
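To put rough numbers on that (the nanosecond figures below are assumed
for illustration only, not measured on any specific DRAM), the line
fill only adds the extra burst beats on top of the one RAS/CAS access
that an uncached read pays anyway:

    /* Rough, assumed figures for illustration only. */
    #include <stdio.h>

    #define RAS_CAS_NS   60   /* full RAS/CAS cycle incl. cell access */
    #define CAS_BEAT_NS  10   /* each additional 8-byte CAS beat      */

    int main(void)
    {
        int uncached_read = RAS_CAS_NS;                   /* up to 8 bytes */
        int line_fill     = RAS_CAS_NS + 3 * CAS_BEAT_NS; /* 32-byte line  */

        printf("uncached 8-byte read: %d ns\n", uncached_read);
        printf("32-byte line fill:    %d ns\n", line_fill);
        /* With these assumed numbers the line fill costs about 1.5
         * times a single uncached access, not 4 times. */
        return 0;
    }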

When that data word is also changed, it furthermore means that the
whole updated cache line must be written back to memory when the line
is displaced by a new access.

The need for an immediate write-back usually occurs only in
direct-mapped caches; with any degree of associativity the write-back
can usually be delayed.

It should be noted that even with direct (uncached) access, a read
cycle and a write cycle must still be performed.

Even a single-byte write to a 2..8 byte wide memory requires a read
operation to fetch the unmodified bytes of the memory word, replacing
the byte to be modified in the CPU, and then writing the full 2..8
byte wide memory word back, unless of course the memory provides a
separate write enable signal for each byte (up to 8 of them).
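Sketched in C (read_word64() and write_word64() are hypothetical
stand-ins for the memory controller's bus cycles, not a real API),
a single-byte store into an 8-byte-wide memory without per-byte
write enables turns into a read-modify-write:

    /* Sketch only: what the memory system has to do for a 1-byte store
     * into an 8-byte-wide memory that has no per-byte write enables.
     * read_word64()/write_word64() are hypothetical bus operations. */
    #include <stdint.h>

    extern uint64_t read_word64(uint64_t word_addr);
    extern void     write_word64(uint64_t word_addr, uint64_t data);

    void store_byte(uint64_t addr, uint8_t value)
    {
        uint64_t word_addr = addr & ~(uint64_t)7;      /* align to 8-byte word      */
        unsigned shift = (unsigned)(addr & 7) * 8;     /* byte lane (little-endian) */

        uint64_t word = read_word64(word_addr);        /* full read cycle           */
        word &= ~((uint64_t)0xff << shift);            /* clear the old byte        */
        word |=  (uint64_t)value << shift;             /* merge in the new byte     */
        write_word64(word_addr, word);                 /* full write cycle          */
    }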

The difference between direct access and cached access is not as
great as it might at first appear.

Of course this behaviour will not happen in general, but it can
happen.

In hard real-time systems with firm deadlines, the worst-case
situation must still be identified.

Paul

