
Chapter 4 Interrupts and Traps
Data cache synchronizing is discussed in detail at the end of this section. The number
of cycles required to read the data memory is represented by Dr. Once the address of
the handler has been fetched, it must be routed to the processor PC; this takes one
cycle. A further cycle occurs before the address reaches the Address Pins. Delays
involved in fetching the first instruction are then the same as described above. Once
again, if the first instruction is found in the cache, the Branch Target Cache memory
forwards the instruction directly to the decode unit. The total latency (minimum of
seven cycles for the hit case) is given by the equations below.
delay(miss) = 1 + 1 + Dc + <cache sync.> + 1 + Dr + 1 + 1 + Ir + 1 + 1
delay(hit) = 1 + 1 + Dc + <cache sync.> + 1 + Dr + 1 + 1 + 1
The Am29050 processor supports instruction forwarding. This enables instructions
to be forwarded directly to the decode unit, bypassing the fetch unit and saving
one cycle. The minimum latency for the Am29050 processor for the vector fetch and
non-vector fetch methods is six cycles and four cycles, respectively.
The Am29040 and Am2924x processors have a data cache, which can add to interrupt
latency. The Am29240 has a two-word write–buffer which must be flushed before
interrupt processing can be completed. This adds as much as 2 x Dw cycles to
interrupt latency. The processor could also be performing a load when interrupted.
If the load caused a block (cache entry) to be allocated, then the load is
completed but the block allocation is canceled.
Cache synchronizing for the Am29040 processor is a little more complicated.
The worst-case condition occurs when the write–buffer is full and a load is performed.
The load can cause block allocation and, because of the write–back policy, the
selected block may have to be copied back. The Am29040 always flushes the
write–buffer before reloading a new block. Cache reload cannot be canceled, even if
the interrupt occurs before the write–buffer is flushed. However, the loaded block will
be held in the reload buffer (see Figure 5-9) and the copy–back buffer returned to the
cache. Unfortunately, the reload buffer contents will never make it into the cache.
The effects of data cache synchronizing on interrupt latency are summarized below:
Am29240 <cache sync.> = 2 x Dw
Am29040 <cache sync.> = (2 x Dw) + (4 x Dr)
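The summary above can be expressed as a small lookup, shown here as a sketch only. The function name `worst_case_cache_sync` is hypothetical, and `dw` and `dr` stand for the Dw and Dr cycle counts used in the text. The values returned are the upper bounds given in the summary, not the cost of every interrupt.

```python
def worst_case_cache_sync(processor, dw, dr):
    """Return the <cache sync.> upper bound, in cycles, for a processor.

    The function and argument names are illustrative; only the two
    processors listed in the summary are covered.
    """
    if processor == "Am29240":
        # Two-word write-buffer flushed: 2 x Dw
        return 2 * dw
    if processor == "Am29040":
        # Write-buffer flush plus the reload term: (2 x Dw) + (4 x Dr)
        return (2 * dw) + (4 * dr)
    raise ValueError(f"no cache sync figure listed for {processor!r}")


if __name__ == "__main__":
    # Assumed example memory system: Dw = 2 cycles, Dr = 3 cycles.
    for name in ("Am29240", "Am29040"):
        print(name, worst_case_cache_sync(name, dw=2, dr=3))
```

The result can be substituted for the <cache sync.> term when estimating worst-case interrupt latency.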
4.3.3 Simple Freeze-mode Handlers
The simplest interrupt or trap handler will execute in its entirety in Supervisor
mode, with interrupts disabled, and with the FZ bit set in the CPS register. This
corresponds to the first stage depicted in Figure 4-1.