
Chapter 5 Operating System Issues
write–through buffer just like stores that miss. The write–through buffer completes
the second required cycle of a store when the cache is free.
Because the write–through buffer can contain data for a store that hit in the
cache, the write–buffer must be flushed before cache reload can be performed. To
understand this, consider that the write–buffer may contain data for a modified block
which must be written back before the block can be reallocated. The write–buffer
cannot forward the store data to the cache block after it has been assigned to a new
memory address.
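The ordering constraint above can be sketched with a small software model. This is only an illustration under stated assumptions: the single-entry buffer, the four-word block, and the names `write_buffer`, `cache_block`, `wb_flush` and `reload` are all hypothetical and do not describe the actual Am29040 hardware structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS 4  /* assumed words per block, for illustration only */

/* Hypothetical model of one cache block and a single-entry write buffer. */
typedef struct { uint32_t tag; uint32_t word[WORDS]; } cache_block;
typedef struct { size_t word; uint32_t data; bool pending; } write_buffer;

/* Complete the buffered store into the block it hit in. */
static void wb_flush(write_buffer *wb, cache_block *blk)
{
    if (wb->pending) {
        assert(wb->word < WORDS);
        blk->word[wb->word] = wb->data;
        wb->pending = false;
    }
}

/* Reload a block: flush the write buffer first, so the pending store
 * lands in the old block, then capture the old contents (the candidate
 * for copy-back), and only then assign the block to its new address. */
static cache_block reload(cache_block *blk, write_buffer *wb,
                          uint32_t new_tag, const uint32_t new_words[WORDS])
{
    wb_flush(wb, blk);
    cache_block victim = *blk;
    blk->tag = new_tag;
    memcpy(blk->word, new_words, sizeof blk->word);
    return victim;
}
```

If the flush were performed after the block had been given its new tag, the buffered data would be written into a block that now belongs to a different memory address, which is exactly the case the text rules out.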
Not all cache blocks need to be written back to the system memory. The format
of the cache tag and status information is shown in Figure 5-10. The tag information
contains a Modify (M) bit. When a block is first reloaded, the valid (V) bit is set and the M
bit is cleared. If a store (which is not write–through) is performed to an address in the
block, a hit occurs and the cache satisfies the access. At the same time the M bit is set
indicating the block has been modified. If the block is reallocated, it will be copied
back only if the M bit is set. Otherwise the block can be reloaded without the
copy–back being performed.
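The copy–back rule just described can be expressed as a short sketch. The structure and function names below are hypothetical, chosen only to mirror the V and M bits of the tag; the S bit is left out because this passage does not describe its behavior, and the check of the valid bit in `needs_copy_back` is an added assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one data cache tag entry.
 * Field widths are illustrative, not the hardware layout. */
typedef struct {
    uint32_t tag;  /* address tag */
    bool v;        /* Valid bit */
    bool m;        /* Modify bit */
} tag_entry;

/* On reload the valid bit is set and the M bit is cleared. */
static void block_reload(tag_entry *e, uint32_t tag)
{
    e->tag = tag;
    e->v = true;
    e->m = false;
}

/* A store that hits and is not write-through sets the M bit. */
static void store_hit(tag_entry *e, bool write_through)
{
    assert(e->v);  /* a hit implies a valid block */
    if (!write_through)
        e->m = true;
}

/* On reallocation the block is copied back only if M is set.
 * Requiring V as well is an assumption of this sketch. */
static bool needs_copy_back(const tag_entry *e)
{
    return e->v && e->m;
}
```

A freshly reloaded block, or one touched only by write–through stores, reports no copy–back; a single ordinary store hit is enough to require one.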
Figure 5-10. Am29040 Data Cache Tag and Status Bits
[Fields: Address Tag, V, S, M]
Data cache reload always fills a complete block. Unlike the Am29240
microcontroller, reload with critical word first is not performed. The processor will
use burst mode when reloading a block and will start with the first word in the block.
When the critical word is accessed during reload it is forwarded to the execute unit.
This enables reload to continue in parallel with code execution. If the critical word
had been accessed first, and it was not the first word in the block, burst mode access to
the memory block would have to be disrupted. This would increase the overall reload
time and would be particularly noticeable for back–to–back loads which miss in the
cache. Data cache reload is given priority over instruction cache for access to the
system busses. Loads issued while the cache is disabled, or to non–cacheable data,
fetch only the critical word from memory.
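The difference between the two reload orders can be made concrete with a small sketch. The four-word block size and the function names are assumptions made for illustration, and the wrap-around order shown for critical-word-first is a generic scheme, not a statement of how the Am29240 sequences its accesses.

```c
#include <assert.h>
#include <stddef.h>

#define WORDS_PER_BLOCK 4  /* assumed block size, for illustration only */

/* Sequential burst reload: beat i delivers word i of the block.
 * Returns the beat on which the critical word is forwarded. */
static size_t sequential_fill(size_t critical, size_t order[WORDS_PER_BLOCK])
{
    assert(critical < WORDS_PER_BLOCK);
    for (size_t beat = 0; beat < WORDS_PER_BLOCK; beat++)
        order[beat] = beat;
    return critical;
}

/* Generic critical-word-first order, shown only for contrast:
 * start at the critical word and wrap around the block. */
static void critical_first_fill(size_t critical, size_t order[WORDS_PER_BLOCK])
{
    assert(critical < WORDS_PER_BLOCK);
    for (size_t beat = 0; beat < WORDS_PER_BLOCK; beat++)
        order[beat] = (critical + beat) % WORDS_PER_BLOCK;
}
```

With the critical word at index 2, the sequential order is 0, 1, 2, 3 and the word is forwarded on the third beat, while the wrap-around order is 2, 3, 0, 1. The second sequence is not a simple ascending burst, which is the disruption the text refers to.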
There is a minimum access latency of 3 cycles for the first word in a reloaded
cache block. This is true even if the off–chip memory system has the minimum access
latency of 2 cycles. When a block is reloaded, it is possible the block will be supplied
by another Am29040 processor (via data intervention) rather than the memory
system. Data intervention is not asserted until the third cycle after the address of the
first word in the block appears on the address bus. The memory system may supply
the data in two cycles, but the processor holds the data internally for one cycle in case
data intervention occurs. Because cache reload is always block orientated,
intervention only occurs with the first word of the block. If the memory system