
Appendix D. L2 Considerations for the PowerPC 604 Processor
D.2.1 Requirements for Saving State Information
This approach requires the implementation of a set of cache tags and comparators to
maintain the addresses in the primary cache. For separate 16-Kbyte instruction and data
caches, each cache directory must have 4 by 128 entries of a valid bit and 20 tag bits, for a
combined total of 21 Kbits and eight 21-bit comparators. The valid bit is assumed to be
implemented as a seven-transistor cell, because it should be cleared at power-up for repeatability
and testing. Because this resource is a small fraction of the total tag required for a 1-Mbyte
L2 (approximately 256 Kbits), it can easily be placed in the same controller. This tag array
must be able to read a tag and valid bit, write a tag and valid bit, and compare a tag and
valid bit against an address (with the valid bit assumed to be one).
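The sizing arithmetic and the three tag-array operations above can be illustrated with a small software model. This is only a sketch, not the controller design: the 32-byte line size (5 offset bits) and 32-bit address are assumptions chosen to be consistent with 128 sets and 20 tag bits, and the names directory_bits, split_address, and L1DirectoryCopy are hypothetical.

```python
WAYS = 4          # four-way set-associative (from the text)
SETS = 128        # 128 sets per cache (from the text)
TAG_BITS = 20     # tag width (from the text)
VALID_BITS = 1    # one valid bit per entry (from the text)
OFFSET_BITS = 5   # assumption: 32-byte cache line
INDEX_BITS = 7    # log2(128 sets)


def directory_bits(caches=2):
    """Total storage, in bits, for the L1 directory copies."""
    return caches * WAYS * SETS * (TAG_BITS + VALID_BITS)


def split_address(addr):
    """Split an address into (tag, set index), assuming a 32-bit address."""
    index = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


class L1DirectoryCopy:
    """Copy of one L1 cache directory, held in the L2 controller."""

    def __init__(self):
        # All valid bits are cleared, as at power-up.
        self.entries = [[(False, 0)] * WAYS for _ in range(SETS)]

    def write(self, index, way, tag, valid=True):
        """Write a tag and valid bit."""
        self.entries[index][way] = (valid, tag)

    def read(self, index, way):
        """Read a (valid, tag) pair."""
        return self.entries[index][way]

    def compare(self, addr):
        """Return the matching way, or None. One comparison per way,
        each covering the tag plus a valid bit that must be one."""
        tag, index = split_address(addr)
        for way, (valid, stored) in enumerate(self.entries[index]):
            if valid and stored == tag:
                return way
        return None
```

With two caches, directory_bits() evaluates to 2 x 4 x 128 x 21 = 21,504 bits, which is the 21 Kbits quoted above; the four ways per cache correspond to the eight 21-bit comparators.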
A second state requirement is a set of registers, each with an associated comparator
(termed the copy-back address registers) to hold addresses displaced from the L1 tags that
need snooping. One such register is needed for each copy-back buffer on the processor.
These registers need only be able to write and compare. Reading them is unnecessary.
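A minimal sketch of the copy-back address registers follows, exposing only the write and compare operations. The number of copy-back buffers is left as a parameter, the 32-byte line granularity is an assumption, and the class name is hypothetical. The oldest entry is overwritten first, following the first-in/first-out use described in Section D.2.2.

```python
from collections import deque

LINE_BYTES = 32  # assumption: addresses are compared at cache-line granularity


class CopyBackAddressRegisters:
    """Write-and-compare-only registers, one per copy-back buffer."""

    def __init__(self, num_buffers=1):
        # A bounded deque drops its oldest entry when a new one is
        # appended to a full set of registers (first-in/first-out).
        self._regs = deque(maxlen=num_buffers)

    def write(self, addr):
        """Save the line address displaced from the L1 directory."""
        self._regs.append(addr // LINE_BYTES)

    def compare(self, snoop_addr):
        """Return True if the snooped address hits any register."""
        return (snoop_addr // LINE_BYTES) in self._regs
```

No read method is provided, matching the statement that reading the registers is unnecessary.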
D.2.2 Operations Required for Processor Bus Operations
Apart from the memory required, the L2 cache must determine when a replacement
operation has been performed by the L1 cache. This determination together with the cache
set information allows the L2 to update the L1 directory copy so as to insert the new address
information. Therefore, the following processor bus operations must be decoded and dealt
with as indicated.
Read, RWITM, read atomic, RWITM atomic—Provided the cache inhibit (CI)
signal is not asserted, the tag is allocated in the L1 directory as indicated by the
CSEn signals and address. Additionally, in the 604 these allocations must indicate
whether they have caused a data and address pair to be transferred into a copy-back
buffer.
Note that the 601 position is that the cache directory model must be increased in
associativity by as many buffers as exist. For example, the 604 would require a
four-way instruction cache model and a ‘4+1’ data cache model. It is referred to as
‘4+1’ because it is not truly a five-way model: groups 0–3 are selected by CSE,
whereas group 4 is replaced by the evicted tag.
Kill block—TC0 asserted distinguishes kill block operations that deallocate cache
entries (caused by DCBI cache operations) from kill block operations that allocate
entries in the L1 cache (caused by DCBZ cache operations) or retain a cache entry
(a store to a shared entry). Likewise, kill block operations that result from a DCBZ
operation must also indicate (through TC2) whether an entry has been placed in a
copy-back buffer.
Whenever an allocation is generated by the processor that uses a copy-back buffer,
the previous L1 directory entry must be saved into a copy-back address register. The
use of these registers is simple first-in/first-out. It is unnecessary to copy the valid
bit from the tag directory into the address register, but for repeatability and testing,