MPC7400 RISC Microprocessor User's Manual
MPC7400 Microprocessor Features
1.2.2.4.4 Floating-Point Unit (FPU)
The FPU, shown in Figure 1-1, is designed such that single-precision operations require
only a single pass, with a latency of three cycles. As instructions are dispatched to the FPU's
reservation station, source operand data can be accessed from the FPRs or from the FPR
rename buffers. Results in turn are written to the rename buffers and are made available to
subsequent instructions. Instructions pass through the reservation station in dispatch order.
The FPU contains a single-precision multiply-add array and the floating-point status and
control register (FPSCR). The multiply-add array allows the MPC7400 to efficiently
implement multiply and multiply-add operations. The FPU is pipelined so that one single-
or double-precision instruction can be issued per clock cycle. Thirty-two 64-bit
floating-point registers are provided to support floating-point operations. Stalls due to
contention for FPRs are minimized by automatic allocation of the six floating-point rename
registers. The MPC7400 writes the contents of the rename registers to the appropriate FPR
when floating-point instructions are retired by the completion unit.
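As a rough illustration of what pipelining means for throughput, the following sketch estimates cycle counts for a run of floating-point instructions. It assumes the three-cycle single-precision latency and the one-instruction-per-cycle issue rate stated above, and it ignores rename-register stalls and all other hazards. The function names are hypothetical and the model is a simplification, not a description of the actual hardware.

```python
def pipelined_cycles(n_instructions, latency=3, issue_interval=1):
    """Cycles to finish n independent instructions in an ideal pipeline.

    Assumes no data dependencies and no stalls (illustrative model only).
    """
    if n_instructions <= 0:
        return 0
    # The first result appears after `latency` cycles; each later
    # instruction finishes `issue_interval` cycles after its predecessor.
    return latency + (n_instructions - 1) * issue_interval


def dependent_chain_cycles(n_instructions, latency=3):
    """Cycles when every instruction needs the previous instruction's result."""
    # No overlap is possible, so each instruction pays the full latency.
    return max(n_instructions, 0) * latency
```

Under these assumptions, ten independent operations finish in 12 cycles, whereas a chain of ten dependent operations takes 30 cycles.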
The MPC7400 supports all IEEE 754 floating-point data types (normalized, denormalized,
NaN, zero, and infinity) in hardware, eliminating the latency incurred by software
exception routines.
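The five data types listed above can be told apart in software as well. The sketch below classifies a Python float, which is assumed here to be an IEEE 754 double-precision value (true of CPython on common platforms). The function name classify is hypothetical and the code only illustrates the categories; it says nothing about how the FPU detects them.

```python
import math
import sys


def classify(x):
    """Return the IEEE 754 class name of a Python float."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "infinity"
    if x == 0.0:
        # Covers both +0.0 and -0.0.
        return "zero"
    # sys.float_info.min is the smallest positive normalized double;
    # any nonzero magnitude below it is denormalized.
    if abs(x) < sys.float_info.min:
        return "denormalized"
    return "normalized"
```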
1.2.2.4.5 Load/Store Unit (LSU)
The LSU executes all load and store instructions as well as the AltiVec LRU and transient
instructions and provides the data transfer interface between the GPRs, FPRs, VRs, and the
cache/memory subsystem. The LSU calculates effective addresses, performs data
alignment, and provides sequencing for load/store string and multiple instructions.
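For illustration, effective address calculation for a register-plus-displacement load or store can be sketched as follows. The sketch relies on the general PowerPC convention, not stated in this section, that the base is the contents of rA (or zero when the rA field is 0) and that the 16-bit displacement is sign-extended. The function names and the 32-bit address mask are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit effective address


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed displacement."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(ra_index, gprs, displacement):
    """Compute EA = (rA|0) + sign-extended displacement, modulo 2**32."""
    # An rA field of 0 selects the literal value zero, not the contents of GPR0.
    base = 0 if ra_index == 0 else gprs[ra_index]
    return (base + sign_extend16(displacement)) & MASK32
```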
Load and store instructions are issued and translated in program order; however, some
memory accesses can occur out of order. Synchronizing instructions can be used to enforce
strict ordering. When there are no data dependencies and the guarded bit for the page or
block is cleared, a maximum of one out-of-order cacheable load operation can execute per
cycle from the perspective of the LSU, with a two-cycle total latency on a cache hit. Data
returned from the cache is held in a rename register until the completion logic commits the
value to a GPR, FPR, or VR. Stores cannot be executed out of order and are held in the store
queue until the completion logic signals that the store operation is to be completed to
memory. The MPC7400 executes store instructions with a maximum throughput of one per
cycle and a three-cycle total latency to the data cache. The time required to perform the
actual load or store operation depends on the processor/bus clock ratio and whether the
operation involves the on-chip cache, the L2 cache, system memory, or an I/O device.
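The in-order handling of stores described above can be pictured with a toy model. In the sketch below, executed stores wait in a first-in, first-out queue and reach memory only when a completion signal arrives. The class and method names are hypothetical, the queue depth is left unbounded because this section does not state it, and the model omits timing entirely.

```python
from collections import deque


class StoreQueue:
    """Toy model: stores wait in program order until completion commits them."""

    def __init__(self):
        self.pending = deque()  # executed but not yet completed stores
        self.memory = {}        # stands in for the cache/memory subsystem

    def execute_store(self, address, value):
        """Hold an executed store; memory is not modified yet."""
        self.pending.append((address, value))

    def complete_oldest(self):
        """Commit the oldest pending store, as if signaled by completion logic."""
        address, value = self.pending.popleft()
        self.memory[address] = value
        return address
```

Because the queue is first-in, first-out, a younger store can never become visible in memory before an older one in this model.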
1.2.2.4.6 System Register Unit (SRU)
The SRU executes various system-level instructions, as well as condition register logical
operations and move to/from special-purpose register instructions. To maintain system
state, most instructions executed by the SRU are execution-serialized; that is, the
instruction is held for execution in the SRU until all previously issued instructions have