IDT MIPS32 4Kc Processor Core
Pipeline Description
79RC32438 User Reference Manual
November 4, 2002
Figure 2.12 MDU Pipeline Flow During a 32-bit Divide (DIV) Operation
Branch Delay
The pipeline has a branch delay of one cycle. This one-cycle delay results from the branch decision
logic operating during the E pipeline stage: the address calculation and the branch condition check are
both performed in the E stage. The branch target address is therefore available for the instruction access
in the following cycle, and the target PC is used for the I stage of the second instruction after the
branch. Because the delay slot instruction fills the intervening cycle, no bubbles are injected into the
pipeline on branch instructions.
The pipeline begins the fetch of either the branch path or the fall-through path in the cycle following the
delay slot. After the branch decision is made, the processor continues with the fetch of either the branch
path (for a taken branch) or the fall-through path (for the non-taken branch).
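The timing described above can be sketched as a small stage schedule. This is an illustrative model only, assuming one instruction enters the I stage per cycle with no stalls; the helper names `stage_schedule` and `render` are hypothetical and not part of any IDT or MIPS tool.

```python
STAGES = ["I", "E", "M", "A", "W"]  # the five pipeline stages named in this section


def stage_schedule(issue_cycle):
    """Map each stage to the clock cycle in which the instruction occupies it."""
    return {stage: issue_cycle + i for i, stage in enumerate(STAGES)}


# Assumption: one instruction enters the I stage per cycle, with no stalls.
branch = stage_schedule(1)
delay_slot = stage_schedule(2)
target = stage_schedule(3)

# The branch decision is made in the E stage of the branch.
decision_cycle = branch["E"]


def render(rows, last_cycle):
    """Draw one text row per instruction, one column per clock cycle."""
    lines = []
    for name, sched in rows:
        by_cycle = {cycle: stage for stage, cycle in sched.items()}
        cells = [by_cycle.get(c, ".") for c in range(1, last_cycle + 1)]
        lines.append(f"{name:<12}" + " ".join(cells))
    return "\n".join(lines)


diagram = render(
    [("branch", branch), ("delay slot", delay_slot), ("target", target)], 7
)
print(diagram)
```

In this model the delay slot instruction is fetched in the same cycle in which the branch is resolved, and the fetch of the target (or fall-through) instruction starts one cycle later, so no cycle is left empty.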
The branch delay means that the instruction immediately following a branch is always executed,
regardless of the branch direction. If no useful instruction can be placed after the branch, then the
compiler or assembler must insert a NOP instruction in the delay slot.
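The architectural effect of the delay slot can be illustrated with a toy interpreter. This is a minimal sketch, not a model of the 4Kc hardware: the tuple instruction encoding, the `run` function, and the register names are all hypothetical, and the case of a branch placed inside a delay slot is not handled.

```python
def run(program, regs):
    """Execute a toy program with a single branch delay slot.

    Hypothetical instruction encoding, for illustration only:
      ("addi", rd, rs, imm)         rd = rs + imm
      ("beq", rs, rt, target_idx)   branch to target_idx if rs == rt
      ("nop",)                      no operation
    """
    pc = 0
    pending_target = None  # set by a taken branch, applied after the delay slot
    while pc < len(program):
        op = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in the delay slot of a taken branch:
            # it still executes, then control transfers to the target.
            next_pc = pending_target
            pending_target = None
        if op[0] == "addi":
            _, rd, rs, imm = op
            regs[rd] = regs[rs] + imm
        elif op[0] == "beq":
            _, rs, rt, target = op
            if regs[rs] == regs[rt]:
                pending_target = target
        pc = next_pc
    return regs


program = [
    ("beq", "t0", "t1", 3),    # 0: branch
    ("addi", "t2", "t2", 1),   # 1: delay slot, executes in both cases
    ("addi", "t3", "t3", 1),   # 2: fall-through path only
    ("addi", "t4", "t4", 1),   # 3: branch target
]

taken = run(program, {"t0": 5, "t1": 5, "t2": 0, "t3": 0, "t4": 0})
not_taken = run(program, {"t0": 5, "t1": 6, "t2": 0, "t3": 0, "t4": 0})
print(taken["t2"], not_taken["t2"])
```

In both runs `t2` is incremented, because the instruction at index 1 occupies the delay slot; only the taken run skips the instruction at index 2.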
Figure 2.13 illustrates the branch delay.
Figure 2.13 IU Pipeline Branch Delay
Data Bypassing
Most MIPS32 instructions use one or two register values as source operands. These operands are
fetched from the register file in the first part of the E stage. The ALU straddles the E to M boundary
and can present its result early in the M stage. However, the result is not written to the register file
until the W stage, which leaves following instructions unable to use the result for 3 cycles. To overcome
this problem, data bypassing is used.
A data bypass multiplexer is placed on both operands between the register file and the ALU (see Figure
2.14). This enables the 4K core to forward data from a preceding instruction whose target register is one
of the source operands of a following instruction. An M to E bypass and an A to E bypass feed the bypass
multiplexers. A W to E bypass is not needed, because the register file can internally bypass Rd write
data directly to the Rs and Rt read ports.
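The selection made by each bypass multiplexer can be sketched in software as follows. This is a simplified illustration under stated assumptions, not the core's actual logic: the function name `select_operand` and its arguments are hypothetical, the M stage is assumed to take priority because it holds the more recent instruction, and the special handling of register 0 is omitted.

```python
def select_operand(src_reg, m_stage, a_stage, register_file):
    """Choose the value for one ALU source operand in the E stage.

    m_stage / a_stage: (dest_reg, value) for the instruction currently in
    that stage, or None if the stage holds no register-writing instruction.
    Returns a (source_name, value) pair.
    """
    # Assumption: the youngest producer wins, and the M stage holds the
    # more recent of the two in-flight instructions.
    if m_stage is not None and m_stage[0] == src_reg:
        return ("M->E bypass", m_stage[1])
    if a_stage is not None and a_stage[0] == src_reg:
        return ("A->E bypass", a_stage[1])
    # A producer in the W stage is covered by the register file's
    # internal write-to-read bypass, so no W->E path is modeled here.
    return ("register file", register_file[src_reg])


regs = {"t0": 10, "t1": 20}
print(select_operand("t0", ("t0", 99), ("t0", 55), regs))
print(select_operand("t0", None, ("t0", 55), regs))
print(select_operand("t1", ("t0", 99), None, regs))
```

One call like this would be made per source operand (Rs and Rt), matching the multiplexer placed on each operand.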
Figure 2.12 diagram labels: clock cycles 1, 2, 3, 4-34, 35, 36, 37; stages E, M MDU, A MDU, W MDU; operations RS Adjust, Early In, Add/Subtract, Rem Adjust, Sign Adjust, Reg WR.
Figure 2.13 diagram labels: rows Jump or Branch, Delay Slot Instruction, Jump Target Instruction; stages I, E, M, A, W, one cycle each; one-clock branch delay.