ColdFire CF4e Core User’s Manual
— IB (instruction buffer, optional): prefetched instructions can pass directly into
the OEP instruction registers, which hides the IB from software in most cases.
Five-stage OEP with two optional write stages
— DS (decode and select)
— OAG (address generation)
— OC1 (operand fetch cycle 1)
— OC2 (operand fetch cycle 2)
— EX (execute)
— DA (write data available; operand write operations only)
— ST (store data; operand write operations only)
In summary, the OEP implements a five-stage design with limited superscalar instruction
issue capabilities to provide a cost-effective solution for the V4e core.
6.2 Instruction Fetch Pipeline (IFP)
The IFP generates instruction addresses and fetches instructions. Because the fetch and execution
pipelines are decoupled by the eight-instruction FIFO instruction buffer (IB), the IFP can
prefetch instructions before the OEP needs them, minimizing stalls.
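The decoupling described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the type `InstrBuffer` and the functions `ib_push` and `ib_pop` are hypothetical names, and each entry holds just a 16-bit opword rather than the full instruction and decode information the hardware buffers. Only the depth of eight comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IB_DEPTH 8u /* eight-instruction FIFO, per the text */

/* Hypothetical model of the instruction buffer (IB). */
typedef struct {
    uint16_t slot[IB_DEPTH];
    unsigned head;  /* index of the next entry the OEP will take */
    unsigned count; /* number of entries currently buffered */
} InstrBuffer;

/* IFP side: returns false when the buffer is full, so fetch must wait. */
static bool ib_push(InstrBuffer *ib, uint16_t opword) {
    if (ib->count == IB_DEPTH) {
        return false;
    }
    ib->slot[(ib->head + ib->count) % IB_DEPTH] = opword;
    ib->count++;
    assert(ib->count <= IB_DEPTH);
    return true;
}

/* OEP side: returns false when the buffer is empty (models an OEP stall). */
static bool ib_pop(InstrBuffer *ib, uint16_t *opword) {
    if (ib->count == 0u) {
        return false;
    }
    *opword = ib->slot[ib->head];
    ib->head = (ib->head + 1u) % IB_DEPTH;
    ib->count--;
    return true;
}
```

In this model the OEP stalls only when `ib_pop` finds the buffer empty, which is the case that prefetching ahead is meant to make rare.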
The IED stage provides early decoding, which effectively implements a hardware lookup
table. First, the 32 bits of fetched instruction data are separated into 16-bit parcels, the
minimum size for ColdFire instructions. Next, each parcel indexes a hardware table
that returns a vector of decode fields, which carry information such as instruction length,
data memory reference type, necessary register resources, and control information needed
early in the OEP DS stage. Finally, the instruction and its early decode information are
loaded into the instruction buffer or directly into the OEP. The early decode information
becomes the extended opword as it enters the OEP.
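The first two IED steps can be sketched in C as follows. The sketch assumes the parcel at the lower address occupies the upper halfword of the fetched data (big-endian ordering). The `DecodeVector` field layout, the function names, and the flat 65536-entry table are all hypothetical simplifications, since the text does not give the real vector format or table organization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EARLY_DECODE_ENTRIES 65536u /* assumption: one entry per 16-bit parcel value */

/* Hypothetical early-decode vector; the real field layout is not specified here. */
typedef struct {
    uint8_t  length_words;  /* instruction length in 16-bit words */
    uint8_t  mem_ref_type;  /* data memory reference type */
    uint16_t reg_resources; /* register resources the instruction needs */
} DecodeVector;

/* Step 1: separate 32 bits of fetched data into two 16-bit parcels.
 * parcel[0] is the halfword at the lower address (assumed big-endian). */
static void split_parcels(uint32_t fetch, uint16_t parcel[2]) {
    assert(parcel != NULL);
    parcel[0] = (uint16_t)(fetch >> 16);
    parcel[1] = (uint16_t)(fetch & 0xFFFFu);
}

/* Step 2: use a parcel to index the lookup table and return its decode vector. */
static DecodeVector early_decode(const DecodeVector *table, uint16_t parcel) {
    assert(table != NULL);
    return table[parcel];
}
```

The third step, loading the instruction and its vector into the instruction buffer or the OEP, is omitted here.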
The primary IFP/OEP interface includes 48 bits of instruction (16-bit opword and two
optional 16-bit extension words) along with the extended opword containing the decode
vector. The IFP also supplies another 16-bit opword and its extended opword for the next
sequential instruction to the OEP to support the limited superscalar dispatch capabilities.
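As a reading aid, the interface described above could be modeled with a structure like the one below. The names `IfpOepBundle` and `make_bundle`, the `n_ext` bookkeeping field, and the 32-bit width chosen for the extended opword are assumptions for illustration only. The text specifies the 16-bit opword, the two optional 16-bit extension words, and the second opword with its extended opword, but not the width of the decode vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of what the IFP presents to the OEP. */
typedef struct {
    uint16_t opword;          /* primary 16-bit opword */
    uint16_t ext[2];          /* two optional 16-bit extension words */
    uint8_t  n_ext;           /* how many extension words are valid (0..2) */
    uint32_t ext_opword;      /* decode vector; width is an assumption */
    uint16_t next_opword;     /* opword of the next sequential instruction */
    uint32_t next_ext_opword; /* its decode vector; width is an assumption */
} IfpOepBundle;

/* Build a bundle from words[0] (opword) and words[1..n_ext] (extensions). */
static IfpOepBundle make_bundle(const uint16_t *words, unsigned n_ext,
                                uint32_t decode_vec,
                                uint16_t next_opword, uint32_t next_decode_vec) {
    IfpOepBundle b;
    assert(words != NULL && n_ext <= 2u);
    memset(&b, 0, sizeof b); /* unused extension words read as zero */
    b.opword = words[0];
    for (unsigned i = 0; i < n_ext; i++) {
        b.ext[i] = words[1u + i];
    }
    b.n_ext = (uint8_t)n_ext;
    b.ext_opword = decode_vec;
    b.next_opword = next_opword;
    b.next_ext_opword = next_decode_vec;
    return b;
}
```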
In addition to the prefetch function, the IFP improves the performance of change-of-flow
operations through the following:
8-entry, direct-mapped branch cache unit (BCU). Associates branch instruction
addresses with the target address for taken conditional branch instructions (Bcc).
Each entry includes a 2-bit, four-state branch prediction value that predicts a Bcc
instruction to be strongly or weakly taken or not taken. Branch folding of the branch
cache entry allows zero-cycle Bcc execution times for correctly predicted taken
branches. To maximize effectiveness of the small direct-mapped branch cache, a
hashed address indexes into the cache. The hashed address is generated as follows:
hashedBcuAddress[2:0] = IfpAddr[7:5] XOR IfpAddr[4:2]
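The hash above translates directly into C. The sketch below also models the 2-bit prediction value as a saturating counter. The state encoding and the transition rule are assumptions, because the text names the four states but not how they are encoded or updated. The names `bcu_hash`, `bcu_predict_taken`, and `bcu_update` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BCU_ENTRIES 8u /* 8-entry branch cache, so a 3-bit index */

/* hashedBcuAddress[2:0] = IfpAddr[7:5] XOR IfpAddr[4:2] */
static unsigned bcu_hash(uint32_t ifp_addr) {
    unsigned hi = (unsigned)(ifp_addr >> 5) & 0x7u; /* IfpAddr[7:5] */
    unsigned lo = (unsigned)(ifp_addr >> 2) & 0x7u; /* IfpAddr[4:2] */
    unsigned index = hi ^ lo;
    assert(index < BCU_ENTRIES);
    return index;
}

/* Assumed encoding of the four prediction states. */
typedef enum {
    STRONG_NOT_TAKEN = 0,
    WEAK_NOT_TAKEN   = 1,
    WEAK_TAKEN       = 2,
    STRONG_TAKEN     = 3
} BcuState;

/* Predict taken in either of the two "taken" states. */
static int bcu_predict_taken(BcuState s) {
    return s >= WEAK_TAKEN;
}

/* Assumed update rule: move one step toward the actual outcome, saturating. */
static BcuState bcu_update(BcuState s, int taken) {
    if (taken) {
        return (s == STRONG_TAKEN) ? s : (BcuState)(s + 1);
    }
    return (s == STRONG_NOT_TAKEN) ? s : (BcuState)(s - 1);
}
```

The XOR spreads addresses that share the same low index bits: 0x04 and 0x24 have identical IfpAddr[4:2], yet `bcu_hash` maps them to entries 1 and 0 respectively.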
Freescale Semiconductor, Inc.