Chapter 6. Instruction Timing
6-23
Execution Unit Timings
6.4.1.1 Branch Folding and Removal of Fall-Through Branch
Instructions
When a branch instruction is encountered by the fetcher, the BPU immediately begins to
decode it and tries to resolve it. All branch instructions except those that update either the
LR or CTR are removed from the instruction ow before they would take a position in the
CQ.
Branch folding occurs either when a branch is taken or is predicted as taken (as is the case
with unconditional branches). When the BPU folds the branch instruction out of the
instruction stream, the target instruction stream that is fetched into the instruction queue
overwrites the branch instruction.
Figure 6-11 shows branch folding. Here a
instructions. The branch is resolved as taken. What happens on the next clock cycle depends
on whether the target instruction stream is in the BTIC, the instruction cache, or if it must
be fetched from the L2 cache or from system memory.
br
instruction is encountered in a series of
add
Figure 6-11 shows cases where there is a BTIC hit, and when there is a BTIC miss (and
instruction cache hit).
If there is a BTIC hit on the next clock cycle the
instruction, and1, that was found in the BTIC; the second and instruction is also fetched
from the BTIC. On the next clock cycle, the next four and
stream are fetched from the instruction cache.
b
instruction is replaced by the target
instructions from the target
If the target instruction is not in the BTIC, there is an idle cycle while the fetcher attempts
to fetch the Trst four instructions from the instruction cache (on the next clock cycle). In
the example in Figure 6-11, the Trst four target instructions are fetched on the next clock.
If it misses in the caches, an L2 cache or memory access is required, the latency of which
is dependent on several factors, such as processor/bus clock ratios. In most cases, new
instructions arrive in the IQ before the execution units become idle.
Figure 6-11. Branch Folding
Figure 6-12 shows the removal of fall-through branch instructions, which occurs when a
branch is not taken or is predicted as not taken.
IQ5
IQ4
IQ3
IQ2
IQ1
IQ0
add5
add4
add3
b
add2
add1
and2
and1
and6
and5
and4
and3
Branch Folding
(Taken Branch/BTIC Hit)
Clock 0
IQ5
IQ4
IQ3
IQ2
IQ1
IQ0
add5
add4
add3
b
add2
add1
and4
and3
and2
and1
Branch Folding
(Taken Branch/BTIC Miss)
Clock 0
Clock 1
Clock 1
Clock 2
Clock 2