7-6
MPC7400 RISC Microprocessor Users Manual
AltiVec Technology and the Programming Model
Bandwidth between the processor and memory is managed explicitly through the use of
cache management instructions, which provide a way to indicate to the cache hardware how
it should prefetch and prioritize the writeback of data. The principal instruction for this
purpose is a software-directed cache prefetch instruction called Data Stream Touch (dst).
Other related instructions are provided for complete control of the software-directed cache
prefetch mechanism.
Table 7-3 summarizes the directed prefetch cache instructions defined by the AltiVec
VEA. Note that these instructions are accessible to user-level programs.
7.1.2.3 Data Stream Touch Instructions
In general, prefetching data that the program will only store to does not help and can
sometimes hinder performance. User-level programs should not use the touch-for-store
prefetches (dstst and dststt) unless the program performs both loads and stores to the data
being prefetched. If the program only stores to the data, performance is almost certainly
better without prefetching, simply performing the stores by themselves.
So, in general, the touch-for-store instructions (dstst and dststt) should be avoided; use
them only to prefetch data that will be both loaded from and then stored to. Otherwise, a
programmer should use the normal touch-for-load instruction (dst), and only to prefetch data
that the program is loading.
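The guidance above amounts to a small decision rule, sketched below in C. The enum and function names are hypothetical and exist only for illustration; they are not part of any AltiVec API.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    PREFETCH_NONE,   /* store-only data: issue the stores without prefetching */
    PREFETCH_DST,    /* data that is only loaded: touch-for-load (dst)        */
    PREFETCH_DSTST   /* data loaded and then stored: touch-for-store (dstst)  */
} prefetch_choice;

/* Pick a prefetch variant from how the program will access the data. */
static prefetch_choice choose_prefetch(bool will_load, bool will_store)
{
    if (will_load && will_store) {
        return PREFETCH_DSTST;
    }
    if (will_load) {
        return PREFETCH_DST;
    }
    /* Store-only (or untouched) data: prefetching is not expected to help. */
    return PREFETCH_NONE;
}
```

Table 7-3 separately marks the store variants as not recommended for use in the MPC7400, so a program targeting this part might reasonably choose dst even for the load-and-store case.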
If HID0[NOPDST] = 1, all subsequent dstx instructions are treated as no-ops and all
previously executed dst streams are canceled. This no-op treatment means that the touch does
not cause a load operation and cannot perform address translation. Therefore, no table search
operations are initiated, and no page table entry (PTE) referenced bits are set.
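The effect of HID0[NOPDST] can be pictured with a toy software model in C. This is a sketch of the behavior described above, not of the hardware. The struct, the function names, and the choice of four stream slots (matching the four data stream engines this section describes) are assumptions made for illustration.

```c
#include <stdbool.h>

#define MODEL_NUM_STREAMS 4  /* assumed: one slot per data stream engine */

/* Hypothetical model state; not a real register layout. */
typedef struct {
    bool nopdst;                     /* stands in for HID0[NOPDST]       */
    bool active[MODEL_NUM_STREAMS];  /* true while a stream is in flight */
} dst_model;

/* Model a write to HID0[NOPDST]: setting the bit cancels every stream. */
static void model_set_nopdst(dst_model *m, bool value)
{
    m->nopdst = value;
    if (value) {
        for (int i = 0; i < MODEL_NUM_STREAMS; i++) {
            m->active[i] = false;
        }
    }
}

/* Model a dstx on stream `strm`. Returns true if a stream was started. */
static bool model_dstx(dst_model *m, unsigned strm)
{
    if (strm >= MODEL_NUM_STREAMS) {
        return false;            /* out of range in this model */
    }
    if (m->nopdst) {
        return false;            /* no-op: no load, no address translation */
    }
    m->active[strm] = true;
    return true;
}
```

In this model, a stream started before the bit is set is canceled by model_set_nopdst, and any later model_dstx call leaves the state untouched.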
The dstx instructions are broken into one or more self-initiated dcbt-like touch line fetches
by the memory subsystem. When the dstx instruction is dispatched to the LSU and all of its
operands are available, the dstx is queued in a vector-touch queue (VTQ) in the next cycle.
There are four data stream engines within the VTQ: data stream 0 uses engine VT0, data
stream 1 uses VT1, and so forth.
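As a rough illustration of the two ideas in this paragraph, the C sketch below maps a stream ID to an engine index and splits one contiguous byte range into line-aligned touches. It assumes that stream n is served by engine VTn (following the "stream 1 uses VT1, and so forth" pattern), that a cache line is 32 bytes, and that the range does not wrap past the top of a 32-bit address space. The function names are hypothetical, and the real engines also handle block counts and strides, which are not modeled here.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LINE_BYTES  32u  /* assumed cache-line size */
#define SKETCH_VTQ_ENGINES 4u   /* four data stream engines in the VTQ */

/* Stream n is assumed to use engine VTn. Returns -1 for an invalid ID. */
static int vtq_engine_for_stream(unsigned strm)
{
    return strm < SKETCH_VTQ_ENGINES ? (int)strm : -1;
}

/*
 * Write the line-aligned addresses covering [addr, addr + len) into out[],
 * stopping after `max` entries. Returns the number of entries written.
 * Assumes addr + len does not overflow 32 bits.
 */
static size_t line_touches(uint32_t addr, uint32_t len,
                           uint32_t *out, size_t max)
{
    size_t n = 0;
    if (len == 0u) {
        return 0;
    }
    uint32_t first = addr & ~(SKETCH_LINE_BYTES - 1u);
    uint32_t last  = (addr + len - 1u) & ~(SKETCH_LINE_BYTES - 1u);
    for (uint32_t line = first; n < max; line += SKETCH_LINE_BYTES) {
        out[n++] = line;         /* one dcbt-like touch per cache line */
        if (line == last) {
            break;
        }
    }
    return n;
}
```

For example, a 64-byte range starting at 0x1010 is not line-aligned, so under these assumptions it spans three lines (0x1000, 0x1020, and 0x1040) rather than two.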
Table 7-3. AltiVec User-Level Cache Instructions
Name                                   Mnemonic  Syntax      Implementation Notes
Data Stream Touch (non-transient)      dst       rA,rB,STRM
Data Stream Touch Transient            dstt      rA,rB,STRM  Used for last access
Data Stream Touch for Store            dstst     rA,rB,STRM  Not recommended for use in the MPC7400
Data Stream Touch for Store Transient  dststt    rA,rB,STRM  Not recommended for use in the MPC7400
Data Stream Stop (one stream)          dss       STRM
Data Stream Stop All                   dssall    STRM