
1999 LSI Logic Corporation
The Data Unit (DU)
The DU comprises the direct-mapped data cache, the data pre-fetch unit, the circular buffer unit, and
the load/store arbiter. The DU pre-fetches and buffers four data words per cycle and can also write
two data words per cycle, if required. Like the instruction cache, the data cache reduces power
dissipation by minimizing memory accesses. The Data Unit provides hardware for the implementation
of two circular buffers and for the sustained data throughput required by DSP applications.
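The wrap-around addressing that circular buffer hardware performs can be modeled in software. The following is a minimal Python sketch of that idea only; the function name and the base and length parameters are hypothetical and do not describe the actual DU register interface.

```python
def circular_advance(index, step, base, length):
    """Advance index by step inside the circular region [base, base + length).

    All parameter names are illustrative assumptions, not ZSP register names.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    # Work on the offset from the buffer base; Python's modulo keeps the
    # result in [0, length) even when step is negative.
    offset = (index - base + step) % length
    return base + offset
```

For example, stepping forward from the last word of a four-word buffer that starts at address 4 wraps back to address 4, and stepping backward from address 4 wraps to address 7.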
The Pipeline Control Unit (PCU)
The PCU’s role is to group instructions for parallel execution. In this task, the PCU resolves data and
resource dependencies in the program sequence. Stated another way, this hardware schedules
instructions for execution by the four functional units (two MACs and two ALUs), simplifying the tasks
of the programmer or the compiler. The PCU also synchronizes the entire operation of the pipeline,
arranges operand bypass, and processes interrupt requests.
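The grouping task described above can be illustrated with a simplified software model. The sketch below assumes an in-order instruction stream, treats only register dependencies within a group as data hazards, and limits each group to the two MACs and two ALUs named in the text. The Instr tuple and the exact grouping rules are assumptions for illustration; the real PCU rules are not given here.

```python
from collections import namedtuple

# Hypothetical instruction record: unit is "MAC" or "ALU".
Instr = namedtuple("Instr", ["unit", "dest", "srcs"])

UNITS_PER_TYPE = {"MAC": 2, "ALU": 2}  # from the text: two MACs and two ALUs


def group_instructions(program):
    """Split an in-order instruction list into groups that could issue together."""
    groups = []
    current = []
    used = {"MAC": 0, "ALU": 0}
    written = set()
    for ins in program:
        # Resource dependency: no free functional unit of the required type.
        resource_conflict = used[ins.unit] >= UNITS_PER_TYPE[ins.unit]
        # Data dependency: reads or rewrites a register written earlier in this group.
        data_conflict = ins.dest in written or any(s in written for s in ins.srcs)
        if current and (resource_conflict or data_conflict):
            groups.append(current)
            current, used, written = [], {"MAC": 0, "ALU": 0}, set()
        current.append(ins)
        used[ins.unit] += 1
        written.add(ins.dest)
    if current:
        groups.append(current)
    return groups
```

Because each group holds at most two MAC and two ALU instructions, no group in this model exceeds four instructions, which matches the four functional units.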
The ZSP Processor is four-way superscalar and employs a five-stage pipeline. At any time, there may
be a maximum of twenty instructions (four per stage) in various stages of execution in the pipeline. The five pipeline
stages of this machine are Fetch/Decode (F/D), Group (G), Read (R), Execute (E), and Write (W).
The Fetch/Decode stage is where instructions are fetched, decoded, and issued. The Group stage is
where instructions are grouped for parallel execution after thorough checking of dependencies. The
operand register file and the data cache are read, and functional unit bypassing is performed during
the Read stage. Bypassing allows a functional unit to access the result of the previous instruction
without waiting for the result to be written to the operand register file. The Execute stage is where
ALU, MAC, and any internal memory access operations are completed. Finally, the operand register
file is written during the Write stage, and store data is written to memory.
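The stage sequence and the twenty-instruction limit can be checked with a small model. This sketch assumes an ideal pipeline with no stalls, in which one group of up to four instructions enters Fetch/Decode every cycle; the function names are hypothetical.

```python
STAGES = ["F/D", "G", "R", "E", "W"]  # stage names from the text
ISSUE_WIDTH = 4                        # four-way machine


def stage_of(group_index, cycle):
    """Stage occupied at the given cycle by the group fetched on cycle group_index.

    Returns None if the group has not entered the pipeline yet or has left it.
    Assumes no stalls (an idealization).
    """
    position = cycle - group_index
    if 0 <= position < len(STAGES):
        return STAGES[position]
    return None


def max_in_flight():
    """Upper bound on instructions in the pipeline: issue width times stage count."""
    return ISSUE_WIDTH * len(STAGES)
```

With five stages and four instructions per stage, max_in_flight() returns 20, the maximum stated above.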
As previously stated, the PCU processes interrupt requests. The interrupt structure provides four
user-defined priority levels. When an interrupt occurs and is enabled (unmasked), its priority is
compared to the current execution priority level of the machine. If the new interrupt is of equal or
higher priority, its level is saved as the current execution priority level and the interrupt is taken. If
the new interrupt is of lower priority than the execution priority level of the machine, the new
interrupt will remain pending until the execution priority level of the machine drops.
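The acceptance rule above can be expressed as a short behavioral model. The sketch assumes that a larger number means a higher priority and that levels are numbered 0 to 3; neither encoding is stated in the text. The class and method names are hypothetical, and the handling of masked requests is simplified to "not taken".

```python
NUM_LEVELS = 4  # from the text: four user-defined priority levels


class InterruptModel:
    """Behavioral sketch only; not the actual ZSP interrupt hardware."""

    def __init__(self, execution_level=0):
        self.execution_level = execution_level
        self.pending = {}  # source name -> priority

    def request(self, source, priority, enabled=True):
        """Return True if the interrupt is taken now, False otherwise."""
        if not 0 <= priority < NUM_LEVELS:
            raise ValueError("priority out of range")
        if not enabled:
            # Masked: not taken. What the hardware does with it is not modeled.
            return False
        if priority >= self.execution_level:
            # Equal or higher priority: save its level and take the interrupt.
            self.execution_level = priority
            return True
        # Lower priority: remains pending until the execution level drops.
        self.pending[source] = priority
        return False
```

In this model a level-2 request arriving while the machine runs at level 1 is taken and raises the execution level to 2, while a later level-1 request stays pending.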
The user can change the interrupt priority of an interrupt source on the fly by explicitly writing a
new priority level to the appropriate field of the Interrupt Priority registers. This method can be used
to raise the priority level of another interrupt source while executing a lower priority interrupt routine.
The user can also change the execution priority level of the machine on the fly by writing to the
Interrupt Priority register, thereby enabling interrupts of lower priority to be taken without modifying
any of the assigned interrupt priorities. This flexible and programmable interrupt mechanism enables
quick response to real-time tasks without any risk of task starvation.
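The effect of rewriting the execution priority level can be sketched as follows. The model assumes that a larger number means a higher priority and that eligible interrupts are served highest priority first; both are assumptions, as is the function name.

```python
def eligible_after_level_write(pending, new_execution_level):
    """Return pending sources that may be taken after the execution level is rewritten.

    pending maps a source name to its assigned priority. The assigned
    priorities themselves are left unchanged.
    """
    eligible = [
        (priority, source)
        for source, priority in pending.items()
        if priority >= new_execution_level
    ]
    # Highest priority first is an assumed service order; ties are broken
    # by source name only to keep the result deterministic.
    eligible.sort(key=lambda item: (-item[0], item[1]))
    return [source for _, source in eligible]
```

Lowering the execution level from 2 to 1 in this model makes a pending level-1 source eligible without touching its assigned priority.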
It takes five cycles for the processor to respond to an interrupt. The double word load and store
instructions enable fast context saving during interrupt handling.