Re: powerpc sync and eieio instructions



Here is an imprecise non-authoritative summary:

eieio: prevents the CPU from reordering memory accesses across it.
The best example is when a program needs to update
peripheral registers in a particular order.


Hi, I've read some references about the use of the eieio instruction,
but I still have some doubts.

The code that confuses me is the following (it concerns MMU initialization):

Page Table Updates (from MPC7450UM -> 5.5.3)

Thus the following code should be used:
/* Code for Modifying a Page Table Entry */
/* First delete the current page table entry */
PTE[V] <- 0                           /* (other fields don’t matter) */
sync                                  /* ensure update completed */
tlbie(old_EA)                         /* invalidate old translation */
eieio                                 /* order tlbie before tlbsync */
tlbsync                               /* ensure tlbie completed on all processors */
sync                                  /* ensure tlbsync completed */
/* Then add new PTE over old */
PTE[RPN,R,C,WIMG,PP] <- new values
eieio                                 /* order 1st PTE update before 2nd */
PTE[VSID,API,H,V] <- new values (V=1)
sync                                  /* ensure updates completed */
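
For reference, the sequence above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not code from the manual: the names pte_t, pte_replace, PTE_V and the barrier macros are hypothetical, the PTE is assumed to be the 32-bit two-word format with V as the most significant bit of the first word, and the inline assembly assumes a GCC-compatible compiler. On a non-PowerPC host the macros fall back to plain compiler barriers so the logic can still be compiled and stepped through.

```c
#include <assert.h>
#include <stdint.h>

/* Barrier wrappers (hypothetical names). On PowerPC they emit the real
 * instructions; tlbie and tlbsync are supervisor-level. On any other host
 * they degrade to a compiler barrier only. */
#if defined(__powerpc__) || defined(__PPC__)
#define SYNC()     __asm__ volatile("sync" ::: "memory")
#define EIEIO()    __asm__ volatile("eieio" ::: "memory")
#define TLBSYNC()  __asm__ volatile("tlbsync" ::: "memory")
#define TLBIE(ea)  __asm__ volatile("tlbie %0" :: "r"(ea) : "memory")
#else
#define SYNC()     __asm__ volatile("" ::: "memory")
#define EIEIO()    __asm__ volatile("" ::: "memory")
#define TLBSYNC()  __asm__ volatile("" ::: "memory")
#define TLBIE(ea)  do { (void)(ea); __asm__ volatile("" ::: "memory"); } while (0)
#endif

/* Assumed layout: word 0 holds V, VSID, H, API; word 1 holds RPN, R, C,
 * WIMG, PP. */
typedef struct {
    volatile uint32_t hi;   /* V | VSID | H | API      */
    volatile uint32_t lo;   /* RPN | R | C | WIMG | PP */
} pte_t;

#define PTE_V 0x80000000u   /* valid bit, assumed to be the MSB of word 0 */

/* Replace one page table entry, following the order of the listing above. */
static void pte_replace(pte_t *pte, uint32_t old_ea,
                        uint32_t new_hi, uint32_t new_lo)
{
    assert(pte != 0);

    /* First delete the current entry. */
    pte->hi = 0;            /* V = 0, other fields don't matter        */
    SYNC();                 /* ensure update completed                 */
    TLBIE(old_ea);          /* invalidate old translation              */
    EIEIO();                /* order tlbie before tlbsync              */
    TLBSYNC();              /* ensure tlbie completed on all CPUs      */
    SYNC();                 /* ensure tlbsync completed                */

    /* Then write the new entry, valid bit last. */
    pte->lo = new_lo;       /* RPN, R, C, WIMG, PP                     */
    EIEIO();                /* order 1st PTE update before 2nd         */
    pte->hi = new_hi | PTE_V;
    SYNC();                 /* ensure updates completed                */
}
```

The point of writing the second word first and the word containing V last is that a table walk never sees a valid entry whose other half is stale; the eieio between the two stores is what keeps them in that order.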

And also this explanation:
(from MPC7450UM -> 3.3.3.5 Enforcing Store Ordering with Respect to Loads)

The PowerPC architecture specifies that an eieio instruction must be
used to ensure sequential ordering of loads with stores.
The MPC7450 guarantees that any load followed by any store is
performed in order (with respect to each other). The reverse, however,
is not guaranteed. An eieio instruction must be inserted between a
store followed by a load to ensure sequential ordering between that
store and that load. Also note that setting HID0[SPD] does not prevent
loads from bypassing stores.
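
A minimal sketch of the store-then-load case described in that passage, assuming the two locations are device registers mapped caching-inhibited and guarded. The function name write_then_read and the EIEIO macro are hypothetical, the inline assembly assumes a GCC-compatible compiler, and on a non-PowerPC host the macro is only a compiler barrier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: real eieio on PowerPC, compiler barrier elsewhere. */
#if defined(__powerpc__) || defined(__PPC__)
#define EIEIO() __asm__ volatile("eieio" ::: "memory")
#else
#define EIEIO() __asm__ volatile("" ::: "memory")
#endif

/* Write a command register, then read a status register, keeping the
 * store ahead of the load. Without the barrier the load may bypass the
 * store, as the passage above warns. */
static uint32_t write_then_read(volatile uint32_t *cmd,
                                volatile uint32_t *status,
                                uint32_t value)
{
    assert(cmd != 0 && status != 0);

    *cmd = value;     /* store                                   */
    EIEIO();          /* order the store before the load below   */
    return *status;   /* load                                    */
}
```

The volatile qualifiers only stop the compiler from reordering or dropping the accesses; the eieio is what constrains the processor itself.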

I have only one processor, so why should I use the tlbsync instruction
in page table updates? And why is the eieio instruction needed in this
context?

Thanks a lot.