Chapter 3. L1 and L2 Cache Operation
Cache Operations
In the case of a dRLT entry that was allocated by a store marked memory-coherency
required, when subsequent stores have merged to all 32 bytes, the dRLT signals the BIU
that it no longer needs data for that entry. If the cache block fill request in the BIU for the
reload buffer entry has not yet propagated to the bottom of the BIU's address queue, the
transaction is completely dropped and does not appear on the address bus. In this case, store
miss merging to non-global space enables the processor to silently allocate a new cache
block in the data cache.
If the cache block fill request in the BIU is at the bottom of the BIU's address queue but has
not received a qualified bus grant for the read-with-intent-to-modify (RWITM) transaction,
it performs an address-only kill broadcast instead. If the cache block fill request has already
received a qualified bus grant, the transaction completes as a RWITM, but the data is
discarded.
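The three outcomes described above can be summarized as a small decision function. This is only an illustrative sketch: the enum and function names are hypothetical and do not correspond to MPC7400 signals or registers. It encodes what happens to a pending cache block fill request once the dRLT entry has been fully merged.

```c
#include <assert.h>

/* Hypothetical names for illustration; not MPC7400 signal or register names. */
typedef enum {
    FILL_QUEUED,      /* not yet at the bottom of the BIU address queue */
    FILL_AT_BOTTOM,   /* at the bottom, but no qualified bus grant yet */
    FILL_GRANTED      /* qualified bus grant already received for the RWITM */
} fill_request_state;

typedef enum {
    BUS_NONE,           /* transaction dropped; nothing appears on the address bus */
    BUS_KILL,           /* address-only kill broadcast replaces the RWITM */
    BUS_RWITM_DISCARD   /* RWITM completes, but the returned data is discarded */
} bus_outcome;

/* Outcome for a fill request whose dRLT entry has had all 32 bytes merged. */
bus_outcome outcome_when_fully_merged(fill_request_state s)
{
    switch (s) {
    case FILL_QUEUED:    return BUS_NONE;
    case FILL_AT_BOTTOM: return BUS_KILL;
    case FILL_GRANTED:   return BUS_RWITM_DISCARD;
    }
    assert(0 && "unknown fill request state");
    return BUS_RWITM_DISCARD;
}
```

The earlier the merge completes relative to the request's progress through the BIU, the less bus traffic the store miss generates.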
Note that two back-to-back AltiVec store misses can write a full 32-byte dRLT entry. For
these back-to-back AltiVec stores, the MPC7400 nearly always performs kill coherency
actions instead of RWITM transactions. The chances of this happening decrease
if other instructions are placed between the two stores or if a data dependency stalls the
second store.
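As a rough software model of why two back-to-back 16-byte stores are enough, the merge state of an entry can be pictured as one valid bit per byte of the 32-byte block. The structure and function names below are hypothetical, and the real dRLT bookkeeping may well differ; the sketch only shows the "merged to all 32 bytes" condition.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_BYTES 32u

/* Hypothetical model of one reload-table entry: one merged bit per byte. */
typedef struct {
    uint32_t merged_mask;
} rlt_entry;

/* Record a store of `len` bytes starting at byte `offset` within the block. */
void merge_store(rlt_entry *e, unsigned offset, unsigned len)
{
    assert(len >= 1 && offset < BLOCK_BYTES && len <= BLOCK_BYTES - offset);
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    uint32_t span = (len == BLOCK_BYTES) ? UINT32_MAX
                                         : ((UINT32_C(1) << len) - 1u);
    e->merged_mask |= span << offset;
}

/* True once stores have covered every byte of the 32-byte block. */
int fully_merged(const rlt_entry *e)
{
    return e->merged_mask == UINT32_MAX;
}
```

In this model, a 16-byte store at offset 0 followed by a 16-byte store at offset 16 sets all 32 bits, which is the point at which the entry no longer needs data from the bus.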
For large block copies to either global (memory-coherency-required) or non-global
(memory-coherency-not-required) address space, the MPC7400 is more efficient if
adjacent stores are used instead of dcbz or dcba instructions. This is due to the following
three reasons:
• Store hits to the data cache are fully pipelined, whereas dcbz/dcba hits to the data
cache can happen only once every four cycles, best case.
• The store miss merge mechanism allows the MPC7400 to issue kill transactions
similar to dcbz/dcba.
• dcbz/dcba instructions are usually used for prefetching; the real purpose of a copy
is to perform real stores, which the MPC7400 can perform just as efficiently without
dcbz/dcba prefetches.
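A block copy written with adjacent stores, rather than a dcbz or dcba per block, might look like the following portable C sketch. The function name is hypothetical, and whether each 16-byte copy becomes a single AltiVec store depends on the compiler, its flags, and buffer alignment, none of which this sketch controls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_BYTES 32u   /* one data cache block */
#define HALF_BYTES  16u   /* the width of one AltiVec store */

/* Copy `nblocks` 32-byte blocks as two adjacent 16-byte halves per block. */
void copy_blocks(unsigned char *dst, const unsigned char *src, size_t nblocks)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        unsigned char *d = dst + i * BLOCK_BYTES;
        const unsigned char *s = src + i * BLOCK_BYTES;
        /* Two adjacent stores cover the whole block, so no dcbz/dcba is issued. */
        memcpy(d, s, HALF_BYTES);
        memcpy(d + HALF_BYTES, s + HALF_BYTES, HALF_BYTES);
    }
}
```

The intent is that the two halves of each block are written back to back, with no unrelated instructions between them, so that the stores can merge into a full 32-byte entry.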
3.6.6 Store Hit to a Data Cache Block Marked Recent or Shared
Write-back stores that hit to a data cache block in the R or S state cannot be performed
without first obtaining exclusive ownership of that block by a kill broadcast on the system
bus.
When a write-back store hits a shared or recent cache block, the target block is invalidated
in the data cache. The current data from the target block is merged with the new store data
and is copied into a reload buffer entry. A kill operation is propagated to the system bus.
When the kill broadcast is successful, the target block is reloaded into the data cache in the
MCD state.
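The data-merging step can be illustrated with a short sketch: the current block contents are copied into the reload buffer entry, and the new store data is overlaid at its offset. The function and parameter names are hypothetical, and the sketch models only the data movement, not the invalidation, the kill broadcast, or the cache state change.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_BYTES 32u

/* Fill a reload buffer entry with the current block data, then overlay the store. */
void merge_into_reload_buffer(unsigned char *reload,
                              const unsigned char *block,
                              const unsigned char *store_data,
                              unsigned offset, unsigned len)
{
    assert(offset < BLOCK_BYTES && len <= BLOCK_BYTES - offset);
    memcpy(reload, block, BLOCK_BYTES);        /* current data from the target block */
    memcpy(reload + offset, store_data, len);  /* new store data takes precedence */
}
```

Bytes outside the store's range keep the values they had in the cache block, which is why the block does not have to be read again from memory after the kill broadcast succeeds.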