Re: ARM926 caching question
- From: Didi <dp@xxxxxxxxxxx>
- Date: Sun, 5 Sep 2010 05:55:33 -0700 (PDT)
On Sep 5, 2:44 pm, Stargazer <stargazer3...@xxxxxxxxx> wrote:
Greetings,
this question is for ARM experts, in particular it's about ARM926 core
(which is used in TI's DM6467 DaVinci processor).
I want to use cache for speeding up processing on video buffer of size
YCbCr 4:2:0 1080P (1920x1088x1.5). Normally the buffer is not cached,
since it is shared between ARM code, the C64 DSP core and an
additional PCI master. The data flow is as follows: an external PCI master
fills in a raw uncompressed frame -> we apply several processing steps (layout
building, background, some graphics and OSD text blending), then the
whole resulting frame is passed to the DSP for compression.
The ARM core runs MontaVista Linux 4.0.1 with kernel 2.6.18 (MV-patched
from MontaVista 5.0 distribution).
I'd like to enable caching on ARM for this buffer, process it in
chunks of 4K (D-cache on DM6467's ARM core is 8K 4-way associative, so
I want to leave at least two ways for caching of other program's data
and stack) and then call a kernel module, which will write-back and
invalidate each 4K chunk. So by the end of wbinvd'ing the last chunk
the whole buffer will be consistent in external RAM ready for DSP
processing (obviously, before starting such a processing, the whole D-
cache will have to be invalidated without write-back).
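
A minimal sketch of the chunked scheme described above, in portable C. The hooks `process_fn` and `clean_inv_line_fn` and the no-op stubs are hypothetical names introduced for illustration; in the real kernel module the per-line hook would issue the ARM926 CP15 "clean and invalidate D-cache entry by MVA" operation, which is privileged and is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FRAME_W    = 1920,
    FRAME_H    = 1088,
    CHUNK_SIZE = 4096,  /* half of the 8K D-cache, i.e. two of its four ways */
    LINE_SIZE  = 32     /* ARM926 cache line length in bytes */
};

/* Hypothetical hooks: the real ones would live in the kernel module. */
typedef void (*process_fn)(uint8_t *chunk, size_t len);
typedef void (*clean_inv_line_fn)(uintptr_t addr);

/* No-op stand-ins so the sketch can be built on a host machine. */
static void nop_process(uint8_t *chunk, size_t len) { (void)chunk; (void)len; }
static void nop_clean_inv(uintptr_t addr) { (void)addr; }

/* YCbCr 4:2:0 takes 1.5 bytes per pixel. */
static size_t frame_bytes(void)
{
    return (size_t)FRAME_W * FRAME_H * 3 / 2;
}

/* Process the buffer one chunk at a time, then write back and invalidate
 * every cache line of that chunk. Returns the number of lines handled. */
static size_t process_frame(uint8_t *buf, size_t len,
                            process_fn process, clean_inv_line_fn clean_inv)
{
    size_t lines = 0;

    assert(process != NULL && clean_inv != NULL);
    for (size_t off = 0; off < len; off += CHUNK_SIZE) {
        size_t n = (len - off < CHUNK_SIZE) ? len - off : CHUNK_SIZE;

        process(buf + off, n);
        for (size_t l = 0; l < n; l += LINE_SIZE) {
            clean_inv((uintptr_t)(buf + off + l));
            lines++;
        }
    }
    return lines;
}
```

With these numbers the frame is 3133440 bytes, which is exactly 765 chunks of 4K, so the last chunk is not a partial one.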
Sounds good, but I see problems with doing so, according to the ARM926 TRM
(or maybe I just misunderstand).
ARM caches data in 32-byte lines tagged with Modified Virtual Address.
MVA is made by appending a special field FCSE PID in CP15 reg. c13 to
program's virtual address, if that address is below 32M; if the
address is above 32M, no appending takes place and VA = MVA (that's
what happens in kernel mode). User-mode programs are mapped to lower
32M VA and hence use that FCSE PID. I tried to use user-mode pointers
in kernel mode, and got inconsistent data in user-mode buffers;
apparently, the kernel changes the FCSE PID on system call entry or just
disables it.
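
The VA to MVA rule described above can be sketched as a small C function. The name `fcse_mva` is made up for illustration, and `pid` is assumed to be the 7-bit process ID already extracted from the top bits of the CP15 c13 FCSE PID register.

```c
#include <assert.h>
#include <stdint.h>

#define FCSE_SHIFT 25u                          /* 32M = 1 << 25 */
#define FCSE_LIMIT (UINT32_C(1) << FCSE_SHIFT)

/* Form the Modified Virtual Address for a given VA and 7-bit FCSE PID.
 * Below 32M the PID is placed in bits [31:25]; otherwise MVA = VA. */
static uint32_t fcse_mva(uint32_t va, uint32_t pid)
{
    assert(pid < 128u);
    if (va < FCSE_LIMIT)
        return va | (pid << FCSE_SHIFT);
    return va;
}
```

For example, with PID 3 the user address 0x00008000 maps to MVA 0x06008000, while a kernel address such as 0xC0008000 is left unchanged.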
Now the TRM says: "FCSE translation is not applied for addresses used
for entry based cache or TLB maintenance operations. For these
operations VA = MVA." That is, I can use VA-based cache manipulation
CP15 instructions in kernel mode without caring about the FCSE PID. If
that is true, suppose that data from 3 different processes is
currently cached for the same VA, just with different PIDs, and we
invalidate the cache entry for that VA via CP15 reg. c7, for which
"translation is not applied". Which of the 3 entries above will get
invalidated? All of them?
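
One way to reason about this question is a toy model in C. It assumes one reading of the quoted TRM sentence, namely that the operand of an entry-based operation is compared against the line tags as-is, as an MVA. The `toy_line` structure and `toy_invalidate_by_mva` are invented for illustration and say nothing definite about how the hardware behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_MASK (~UINT32_C(31))   /* 32-byte cache lines */

/* Toy cache line: only the MVA tag and a valid bit are modelled. */
struct toy_line {
    uint32_t mva_tag;
    int      valid;
};

/* Invalidate every valid line whose tag equals the operand, taken as an
 * MVA with no FCSE translation. Returns how many lines were invalidated. */
static size_t toy_invalidate_by_mva(struct toy_line *lines, size_t n,
                                    uint32_t mva)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (lines[i].valid && lines[i].mva_tag == (mva & LINE_MASK)) {
            lines[i].valid = 0;
            hits++;
        }
    }
    return hits;
}
```

Under that assumption, three lines tagged with the same VA but PIDs 1, 2 and 3 have three different tags. Passing the bare VA would match none of them, and passing the VA with one PID placed in bits [31:25] would match exactly that process's line.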
Thanks,
Daniel
Not having any ARM experience - I live in Power (PPC) land - I would still
question your understanding that they cache based on anything but the physical
address (i.e. I would expect caching to be done after all translation
has been done).
But this is just my speculation, again, I don't know ARM.
Dimiter
------------------------------------------------------
Dimiter Popoff Transgalactic Instruments
http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/