PowerPC 440 Core
09/21/1999
instructions are stored, which gives a significant performance boost while keeping the design
straightforward.
Decode and Issue
The four-entry decode queue accepts up to two instructions per clock, submitted from the pre-decode
buffers. Instructions always enter the lowest empty or emptying queue position, behind any instructions
already in the queue. Therefore, the queue fills from the bottom up, instructions stay in order, and no
bubbles exist in the queue. A significant portion of decode is performed in the lowest two positions
(DISS0 and DISS1). Up to two instructions exit the queue based on the instructions’ decode and pipeline
availability, and are issued to the RACC stage. DISS1 can issue out of order with respect to DISS0.
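The queue behavior described above can be sketched as a small software model. This is only an illustration under simplifying assumptions: the class name DecodeQueue, the clock() method, and the can_issue predicate (standing in for "decode and pipeline availability") are hypothetical names, and real issue logic also has to respect dependencies between DISS0 and DISS1, which the model ignores.

```python
QUEUE_DEPTH = 4   # four-entry decode queue (from the text)
MAX_ENQUEUE = 2   # up to two instructions accepted per clock
MAX_ISSUE = 2     # up to two instructions issued per clock


class DecodeQueue:
    """Toy model: index 0 is DISS0, index 1 is DISS1."""

    def __init__(self):
        self.entries = []

    def clock(self, incoming, can_issue):
        """Advance one clock.

        incoming  -- instructions offered by the pre-decode buffers, in order
        can_issue -- hypothetical predicate for decode/pipeline availability
        Returns (issued instructions, number of incoming instructions accepted).
        """
        issued, kept = [], []
        for pos, insn in enumerate(self.entries):
            # Only the two lowest positions are issue candidates. DISS1 may
            # issue even when DISS0 does not (out-of-order issue).
            if pos < 2 and len(issued) < MAX_ISSUE and can_issue(insn):
                issued.append(insn)
            else:
                kept.append(insn)
        # Remaining instructions drop toward the bottom, in order, no bubbles.
        self.entries = kept
        # New instructions enter the lowest empty or emptying positions.
        free = QUEUE_DEPTH - len(self.entries)
        accepted = incoming[:min(MAX_ENQUEUE, free)]
        self.entries.extend(accepted)
        return issued, len(accepted)
```

For example, if DISS0 is stalled but DISS1 is ready, clock() issues only the DISS1 instruction, and the stalled instruction stays in DISS0 while newer instructions line up behind it.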
Register Access
Conceptually, the GPR file consists of thirty-two 32-bit general purpose registers. It is implemented as
two 6-port arrays (one for LRACC, one for IRACC), each containing thirty-two 32-bit registers with
three write ports and three read ports. Every instruction that updates a GPR writes the appropriate write
ports of both arrays, so that the two copies always hold the same contents. The read ports of each array,
however, are dedicated to instructions that are dispatched to the pipe(s) associated with that array's RACC.
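The duplicated register file can be pictured with the following minimal sketch. The class and method names (DualGprFile, write, read) are hypothetical, and port counts and timing are not modeled; the only point illustrated is that writes go to both copies while each read is served by the copy belonging to one RACC.

```python
NUM_GPRS = 32
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


class DualGprFile:
    """Toy model of one architectural GPR file kept as two identical arrays."""

    def __init__(self):
        self.copies = {
            "LRACC": [0] * NUM_GPRS,  # read by L-pipe and J-pipe instructions
            "IRACC": [0] * NUM_GPRS,  # read by I-pipe instructions
        }

    def write(self, reg, value):
        # Every GPR update is written to both arrays so they stay identical.
        for array in self.copies.values():
            array[reg] = value & MASK32

    def read(self, racc, reg):
        # A read only touches the array that belongs to the given RACC.
        return self.copies[racc][reg]
```

Keeping two copies trades array area for read bandwidth: each RACC gets its own three read ports without a single array needing six.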
Execution Pipelines
The PPC440 contains three execution pipes: a load/store pipe (“L-pipe”), a simple integer pipe (“J-pipe”),
and a complex integer pipe (“I-pipe”). The L-pipe and J-pipe instructions are dispatched from the
LRACC; I-pipe instructions are dispatched from IRACC. The three pipes together perform all 32-bit
PowerPC integer instructions in hardware compliant with the PowerPC Book E specification. Table 2
lists the rules for dispatching to each of the three execution pipes.
L-pipe only        Loads/stores¹, cache instructions, mbar, msync

I-pipe or J-pipe²  Add, addi, addis, and, andc, cntlzw, eqv, extsb, extsh, nand, neg, nor, or, orc,
                   ori, oris, xori, xoris, rlwimi, rlwinm, rlwnm, slw, srw, subf

I-pipe only        Branches, multiplies, divides, move to/from DCR/SPR, indirect XER updates,
                   indirect LR/CTR updates, indirect CR updates, CR-logicals, MAC instructions,
                   mcrf, mcrxr, mtcrf, mfcr, compares, dlmzb, isync, rfi, rfci, sc, wrtee, wrteei,
                   mtmsr, mfmsr, traps

Table 2 – Rules for Instruction Issue
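As a rough illustration, the rules in Table 2 and its two footnotes can be written as a lookup. The function name pipes_for and the updates_cr_or_xer flag are hypothetical. The L-pipe set holds only a few example mnemonics (lwz, stw, and dcbf are assumed here as representative load, store, and cache instructions), and anything not listed is treated as I-pipe only, which is a simplification.

```python
# Simple integer operations that Table 2 allows on either the I-pipe or J-pipe.
SIMPLE_INTEGER = {
    "add", "addi", "addis", "and", "andc", "cntlzw", "eqv", "extsb", "extsh",
    "nand", "neg", "nor", "or", "orc", "ori", "oris", "xori", "xoris",
    "rlwimi", "rlwinm", "rlwnm", "slw", "srw", "subf",
}

# Illustrative subset only: a load, a store, a cache op, plus mbar and msync.
L_PIPE_EXAMPLES = {"lwz", "stw", "dcbf", "mbar", "msync"}


def pipes_for(mnemonic, updates_cr_or_xer=False):
    """Return which pipe(s) an instruction may be issued to, per Table 2."""
    if mnemonic == "stwcx.":
        # Footnote 1: goes down both pipes so that the CR can be updated.
        return "L-pipe and I-pipe"
    if mnemonic in L_PIPE_EXAMPLES:
        return "L-pipe only"
    if mnemonic in SIMPLE_INTEGER:
        # Footnote 2: forms that update the CR or XER never use the J-pipe.
        return "I-pipe only" if updates_cr_or_xer else "I-pipe or J-pipe"
    # Branches, multiplies, divides, compares, MAC instructions, and so on.
    return "I-pipe only"
```

For instance, pipes_for("add") gives "I-pipe or J-pipe", while the same operation with updates_cr_or_xer=True is restricted to the I-pipe.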
The MAC unit is an auxiliary processor unit (APU) that adds 24 operations to the PPC440 instruction
set. MAC instructions operate on signed or unsigned 16-bit operands and accumulate the results in
a 32-bit GPR. All MAC unit instructions have single-cycle throughput. The MAC unit is contained
within the I-pipe.
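The arithmetic of a multiply-accumulate step can be sketched as follows. This is not a model of any specific MAC instruction: the function names are hypothetical, operand selection from the source registers is omitted, and wrap-around on overflow is an assumption, since the text does not say how overflow is handled.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def to_signed16(x):
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= MASK16
    return x - 0x10000 if x & 0x8000 else x


def mac_signed16(acc, a, b):
    """acc + (a * b) with signed 16-bit operands; result wraps to 32 bits."""
    return (acc + to_signed16(a) * to_signed16(b)) & MASK32


def mac_unsigned16(acc, a, b):
    """acc + (a * b) with unsigned 16-bit operands; result wraps to 32 bits."""
    return (acc + (a & MASK16) * (b & MASK16)) & MASK32
```

The same bit pattern gives different results in the two modes: 0xFFFF is -1 as a signed operand but 65535 as an unsigned one.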
¹ The stwcx. instruction goes down both the L-pipe and the I-pipe in order to update the CR.
² Instructions that update the CR or XER are not issued to the J-pipe.