Chapter 1. Overview
1-11
MPC7400 Microprocessor Features
Dynamic prediction is implemented using a 512-entry branch history table (BHT), a cache
that provides two bits per entry that together indicate four levels of prediction for a branch
instructionnot-taken, strongly not-taken, taken, strongly taken. When dynamic branch
prediction is disabled, the BPU uses a bit in the instruction encoding to predict the direction
of the conditional branch. Therefore, when an unresolved conditional branch instruction is
encountered, the MPC7400 executes instructions from the predicted target stream although
the results are not committed to architected registers until the conditional branch is
resolved. This execution can continue until a second unresolved branch instruction is
encountered.
When a branch is taken (or predicted as taken), the instructions from the untaken path must
be ushed and the target instruction stream must be fetched into the IQ. The BTIC is a
64-entry, four-way set associative cache that contains the most recently used branch target
instructions, typically in pairs. When an instruction fetch hits in the BTIC, the instructions
arrive in the instruction queue in the next clock cycle, a clock cycle sooner than they would
arrive from the instruction cache. Additional instructions arrive from the instruction cache
in the next clock cycle. The BTIC reduces the number of missed opportunities to dispatch
instructions and gives the processor a one-cycle head start on processing the target stream.
The BPU contains an adder to compute branch target addresses and three user-control
registersthe link register (LR), the count register (CTR), and the condition register (CR).
The BPU calculates the return pointer for subroutine calls and saves it into the LR for
certain types of branch instructions. The LR also contains the branch target address for the
Branch Conditional to Link Register (
bclr
x
) instruction. The CTR contains the branch
target address for the Branch Conditional to Count Register (
bcctr
x
) instruction. Because
the LR and CTR are SPRs, their contents can be copied to or from any GPR. Because the
BPU uses dedicated registers rather than GPRs or FPRs, execution of branch instructions
is largely independent from execution of integer and oating-point instructions.
1.2.2.3 Completion Unit
The completion unit operates closely with the instruction unit. Instructions are fetched and
dispatched in program order. At the point of dispatch, the program order is maintained by
assigning each dispatched instruction a successive entry in the eight-entry completion
queue. The completion unit tracks instructions from dispatch through execution and retires
them in program order from the two bottom entries in the completion queue (CQ0 and
CQ1).
Instructions cannot be dispatched to an execution unit unless there is a vacancy in the
completion queue. Branch instructions that do not update the CTR or LR are removed from
the instruction stream and do not take an entry in the completion queue. Instructions that
update the CTR and LR follow the same dispatch and completion procedures as non-branch
instructions, except that they are not issued to an execution unit.
Completing an instruction commits execution results to architected registers (GPRs, FPRs,
VRs, LR, and CTR). In-order completion ensures the correct architectural state when the