
Appendix D. L2 Considerations for the PowerPC 604 Processor
Allocation operations for L2 inclusion are decoded like those required for maintaining L1
tag copies. With L1 inclusion, however, there is no need to monitor the L1 cache’s use of its
copy-back buffers, because the back invalidations force any replaced modified data to be
copied back to the L2 level (if not all the way to main memory) before the fetch operations
can proceed. This in turn reduces the benefit of copy-back buffers in such an environment.
The data cache block allocated in an L2 cache can be larger than that in the L1 cache, in
which case multiple back-invalidation operations to the L1 cache may be required
whenever a tag is deallocated in the L2 cache, depending upon whether subblocking is
implemented. This increases the latency of such operations and must be weighed against
the hit rate advantages of such a configuration.
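As a rough illustration of the latency cost described above, the following Python sketch lists the L1 block addresses that would need back invalidation when one L2 block is deallocated. It assumes a 32-byte L1 block; the function name, the 128-byte L2 block in the usage note, and the per-subblock valid mask are hypothetical and chosen only for illustration.

```python
L1_BLOCK_BYTES = 32  # assumed L1 block size (one coherency granule)


def back_invalidate_addresses(l2_block_addr, l2_block_bytes, valid_subblocks=None):
    """Return the L1 block addresses to back-invalidate for one L2 deallocation.

    valid_subblocks is a hypothetical per-subblock valid mask (list of bools).
    None models an L2 without subblocking, where every L1-sized piece of the
    L2 block must be back-invalidated.
    """
    if l2_block_bytes % L1_BLOCK_BYTES != 0:
        raise ValueError("L2 block size must be a multiple of the L1 block size")
    count = l2_block_bytes // L1_BLOCK_BYTES
    # Align the address down to the start of the L2 block.
    base = l2_block_addr - (l2_block_addr % l2_block_bytes)
    if valid_subblocks is None:
        valid_subblocks = [True] * count
    if len(valid_subblocks) != count:
        raise ValueError("valid mask must have one entry per L1-sized subblock")
    return [base + i * L1_BLOCK_BYTES for i in range(count) if valid_subblocks[i]]
```

Under these assumptions, a 128-byte L2 block costs four back invalidations per deallocation, while a subblocked design only needs one per valid subblock.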
D.4.3 Forwarding System Bus Operations to the Processor
If L1 inclusion is assured, the following scenarios can be considered:
• The system bus operation does not match in the L2 directory. In this case, there can
  be no copy in the L1 cache, so the system bus operation requires no intervention.
• The address matches in the L2 cache and is in the E or M state. In this case, the
  system bus operation must be retried and forwarded to the processor cache, because
  the processor may have a more up-to-date copy of the data.
• The address matches in the L2 cache and is in the S state. In the simple case of a
  read/read-atomic operation, only SYS-SHD needs to be asserted. Other operations
  may require retrying, even if the data is only in the S state, as it is reasonable to
  wait until the operation completes at the processor before letting it complete on the
  system bus.
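The three scenarios above can be sketched as a small decision function. This is a minimal sketch, not a description of any particular L2 controller: the enum and function names are hypothetical, and the choice to always retry non-read operations that hit in the S state is a conservative assumption (the text only says such operations may require retrying).

```python
from enum import Enum


class L2State(Enum):
    MISS = "miss"        # address not found in the L2 directory
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


class SnoopAction(Enum):
    NO_INTERVENTION = "no intervention"
    RETRY_AND_FORWARD = "retry and forward to processor"
    ASSERT_SHARED = "assert SYS-SHD"


def snoop_action(l2_state, is_read):
    """Choose the response to a system bus operation, assuming L1 inclusion.

    is_read is True for read and read-atomic operations.
    """
    if l2_state is L2State.MISS:
        # Inclusion guarantees there is no L1 copy either.
        return SnoopAction.NO_INTERVENTION
    if l2_state in (L2State.EXCLUSIVE, L2State.MODIFIED):
        # The processor cache may hold a more up-to-date copy.
        return SnoopAction.RETRY_AND_FORWARD
    if is_read:
        # S state, simple read: signalling shared is sufficient.
        return SnoopAction.ASSERT_SHARED
    # S state, other operation: conservatively retry (assumption of this sketch).
    return SnoopAction.RETRY_AND_FORWARD
```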
D.5 Marked L1 Inclusion
In addition to guaranteeing inclusion, marked inclusion keeps more information about
when an L2 cache entry is also in the L1 cache. Viewed narrowly, it is only necessary to
require that an entry marked as not in the L1 cache in fact not be present there. It
is acceptable to assume an entry is in the L1 cache when in fact it is not. Without maintaining a
structure that mimics the L1 directory, it is hard to closely match entries marked as included
in the L1 with those that actually are. Marked L1 inclusion offers fewer back
invalidations and more efficient snoop filtering than simple L1 inclusion.
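The one-sided guarantee described above (a "not in L1" mark must be accurate, while an "in L1" mark may be stale) can be illustrated with a small Python sketch. The class and method names are hypothetical, and the sketch assumes the mark is cleared only when the L1 copy is known to be gone, for example after a back invalidation.

```python
class MarkedInclusionDirectory:
    """Toy L2 directory that keeps one 'may be in L1' mark per entry."""

    def __init__(self):
        self._marked_in_l1 = {}  # granule address -> bool

    def allocate(self, addr, fetched_by_l1):
        # Mark the entry if the allocation was caused by an L1 fetch.
        self._marked_in_l1[addr] = fetched_by_l1

    def note_l1_fetch(self, addr):
        # Any L1 fetch of a resident entry must set the mark.
        if addr in self._marked_in_l1:
            self._marked_in_l1[addr] = True

    def note_l1_invalidated(self, addr):
        # Clear the mark only when the L1 copy is known to be gone.
        if addr in self._marked_in_l1:
            self._marked_in_l1[addr] = False

    def snoop_must_reach_l1(self, addr):
        # An L2 miss or an unmarked entry lets the snoop be filtered.
        return self._marked_in_l1.get(addr, False)

    def deallocate(self, addr):
        # Return True if a back invalidation to the L1 cache is required.
        return self._marked_in_l1.pop(addr, False)
```

In this sketch a stale mark only costs an unnecessary snoop or back invalidation, whereas a wrongly cleared mark would break coherency, which is why the mark is cleared conservatively.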
D.5.1 Requirements for Saving State Information
The included state for a tag is typically independent of the coherency states supported by
the L2 cache; for example, both a shared and an exclusive data entry can be present or not
in the L1 cache. Inclusion information is most easily kept per 32-byte coherency granule
(doing otherwise may complicate some mechanisms with no large benefit). Consider the
extra state required for a 1-Mbyte L2 cache: at one inclusion bit per granule, the 32K
granules of such a configuration require 32 Kbits of additional memory.
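The 32-Kbit figure can be checked with a short calculation. The sketch below assumes one inclusion bit per 32-byte granule, which is consistent with the numbers in the text; the function name and parameters are hypothetical.

```python
def inclusion_state_bits(l2_bytes, granule_bytes=32, bits_per_granule=1):
    """Return the extra storage, in bits, needed for marked inclusion."""
    if l2_bytes % granule_bytes != 0:
        raise ValueError("L2 size must be a whole number of granules")
    return (l2_bytes // granule_bytes) * bits_per_granule


# 1-Mbyte L2 cache: 1,048,576 / 32 = 32,768 granules, one bit each.
bits = inclusion_state_bits(1 << 20)
kbits = bits // 1024
```

Here `bits` is 32,768 and `kbits` is 32, matching the figure quoted for a 1-Mbyte L2 cache.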