
D-8
PowerPC Microprocessor Family: The Bus Interface for 32-Bit Microprocessors
D.5.2 Operations Required for Processor Bus Operations
The operations performed to maintain marked L1 inclusion are like those required for
simple L1 inclusion. When an allocation is performed by the L1 cache, the L2 cache must
also ensure that a tag is allocated and that the inclusion bit for the accessed 32-byte granule
is set (note that this bit might already be set if a processor discarded this block without being
detected). Before an L2 cache tag can be allocated, the tag it replaces must be inspected. If the old tag contains
address granules for which the inclusion bit is set, a back invalidation for each granule must
be sent to the processor. If it does not, the tag can be removed directly, assuming that the
data is copied back to main memory as required by the state indicated for the cache blocks.
The old tag can be removed from the L2 directory and the fetch for the new tag can begin
only when the snoop push-backs required for all included granules are completed. As with
simple inclusion, replacing a tag from the L2 may require a copy-back to main memory.
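
The allocation and replacement steps described above can be sketched as a small software model. This is an illustration only, not the behavior of any particular L2 controller: the direct-mapped directory, the four granules per tag, the `dirty` flag standing in for "a state that requires copy-back", and all function names are assumptions made for the example. Bus activity is recorded in a list instead of being driven on a bus.

```python
from dataclasses import dataclass, field

GRANULE_BYTES = 32      # granule size stated in the text
GRANULES_PER_TAG = 4    # assumption: one L2 tag covers four granules


@dataclass
class L2Tag:
    base: int           # address of the first granule covered by this tag
    included: list = field(default_factory=lambda: [False] * GRANULES_PER_TAG)
    dirty: bool = False  # assumption: stands in for a state needing copy-back


def replace_tag(old, new_base, log):
    """Evict `old` (if present) and return a fresh tag for `new_base`."""
    if old is not None:
        # Inspect the old tag: each included granule needs a back invalidation.
        for i, is_included in enumerate(old.included):
            if is_included:
                log.append(("back_invalidate", old.base + i * GRANULE_BYTES))
                old.included[i] = False
        # Copy back to main memory if the block state requires it.
        if old.dirty:
            log.append(("copy_back", old.base))
    # In this synchronous model everything above has completed by now,
    # so the fetch for the new tag may begin.
    log.append(("fetch", new_base))
    return L2Tag(base=new_base)


def l1_allocate(tags, addr, log):
    """Model an L1 allocation: ensure an L2 tag exists, then set the inclusion bit."""
    span = GRANULE_BYTES * GRANULES_PER_TAG
    base = addr - addr % span
    index = (base // span) % len(tags)   # assumption: direct-mapped directory
    if tags[index] is None or tags[index].base != base:
        tags[index] = replace_tag(tags[index], base, log)
    granule = (addr % span) // GRANULE_BYTES
    tags[index].included[granule] = True  # may already be set; that is harmless
    return tags[index]
```

Calling `l1_allocate` twice for granules under the same tag and then once for a conflicting address produces a log in which the back invalidations and any copy-back precede the fetch for the new tag.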
The inclusion bit is reset whenever a processor bus operation that visibly removes an entry
from the L1 cache is performed. These operations are as follows:
Write with kill (deallocate)—A cache castout operation or a snoop response in
which the L1 cache state goes to invalid, so the inclusion bit can be reset.
Kill block (deallocate)—The result of a dcbi instruction; the L1 cache state goes to
invalid, so the inclusion bit can be reset.
ICBI—The result of an icbi instruction; the L1 cache state goes to invalid, so the
inclusion bit can be reset.
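
As a rough sketch of the rule above, the following model clears an inclusion bit only for the three operations listed. The `BusOp` names and the dictionary that maps a granule address to its inclusion bit are hypothetical and chosen for the example; `READ` is included only to show an operation that leaves the bit unchanged.

```python
from enum import Enum, auto


class BusOp(Enum):
    # Hypothetical labels for processor bus operations.
    WRITE_WITH_KILL = auto()
    KILL_BLOCK = auto()
    ICBI = auto()
    READ = auto()


# Operations that visibly remove an entry from the L1 cache.
DEALLOCATING_OPS = frozenset(
    {BusOp.WRITE_WITH_KILL, BusOp.KILL_BLOCK, BusOp.ICBI}
)


def apply_bus_op(inclusion_bits, granule_addr, op):
    """Reset the inclusion bit for `granule_addr` if `op` deallocates it.

    `inclusion_bits` maps a granule address to True or False. Operations
    outside DEALLOCATING_OPS leave the bit as it was.
    """
    if op in DEALLOCATING_OPS:
        inclusion_bits[granule_addr] = False
    return inclusion_bits.get(granule_addr, False)
```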
Other operations, read operations in particular, also indicate that an entry has been removed
from the L1 cache. However, because it is not easy to determine for which L2 cache block to reset
the inclusion bit, L2 inclusion bits can only approximate the actual L1 contents.
The exact L2 set index to use can be determined only by creating a structure like the L1
directory that can track both the L1 set index and the L1 group entry information. This
structure could be simpler than the L1 directory because it needs no comparator and
because it requires only an L2 index entry rather than a tag. However, adding this structure
to marked L1 inclusion requires more resources and is needlessly more complicated than
simply implementing an L2 cache with a copy of the L1 tag and state information.
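
To make the shape of such a tracking structure concrete, here is a minimal sketch. It assumes the L2 controller somehow knows which L1 set and group entry (way) each fill lands in, which is exactly the information the text says must be tracked; the class and method names are hypothetical. Each slot holds only an L2 index, with no tag and no comparator.

```python
class L1Shadow:
    """Hypothetical shadow of the L1 directory holding L2 indexes only."""

    def __init__(self, n_sets, n_ways):
        # One slot per L1 set and way; None means nothing is tracked yet.
        self.slots = [[None] * n_ways for _ in range(n_sets)]

    def record_fill(self, l1_set, way, l2_index):
        """Record an L1 fill and return the L2 index it displaced, if any.

        The returned index identifies the L2 entry whose inclusion bit
        could now be reset.
        """
        displaced = self.slots[l1_set][way]
        self.slots[l1_set][way] = l2_index
        return displaced
```

A caller would reset the inclusion bit for whatever index `record_fill` returns, skipping the reset when the result is `None`.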
D.5.3 Forwarding System Bus Operations to the Processor
When a system bus operation is run, the cases of interest are as follows:
Snoop miss—Because of L1 inclusion, there is no need to pass the operation on to
the processor.
Snoop hit with inclusion bit reset—There is also no need to pass the operation to the
processor.
Snoop hit with inclusion bit set—The operations are identical to simple L1 inclusion
operations.
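
The three cases reduce to a single decision, sketched below. The directory is modeled as a dictionary from granule address to inclusion bit, where a missing key represents a snoop miss; this representation and the function name are assumptions made for the example.

```python
def must_forward_snoop(directory, granule_addr):
    """Return True if a system bus snoop must be passed to the processor.

    `directory` maps each granule address that has an L2 tag to its
    inclusion bit.
    """
    if granule_addr not in directory:
        # Snoop miss: L1 inclusion guarantees the L1 cache has no copy.
        return False
    # Snoop hit: forward only when the inclusion bit is set.
    return directory[granule_addr]
```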