
ColdFire CF4e Core User’s Manual
Instruction Execution Times
The OEP is loaded with the opword and all required extension words at the
beginning of each instruction execution. This implies that the OEP spends no time
waiting for the IFP to supply opwords or extension words.
The OEP experiences no sequence-related pipeline stalls. For the V4, the most
common example of this type of stall occurs when a register is modified in the EX
engine and a subsequent instruction generates an address that uses the previously
modified register. The second instruction stalls in the OEP until the previous
instruction updates the register. For example:
    muls.l  #<data>,d0
    move.l  (a0,d0.l*4),d1

Here, move.l waits 3 cycles for muls.l to update d0. If consecutive instructions
update a register and then use that register as a base or index value with a scale
factor of 1 (Xi.l*1) in an address calculation, a 2-cycle pipeline stall occurs. If
the destination register is used as an index register with any other scale factor
(Xi.l*2, Xi.l*4), a 3-cycle stall occurs. Table 6-3 lists instructions optimized to
prevent such stalls.
NOTE:
Address register results from postincrement and predecrement
modes are available to subsequent instructions without stalls.
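The stall rules above can be summarized as a small lookup. The following Python sketch is illustrative only: the function name `address_stall_cycles`, its parameters, and the "base"/"index" labels are assumptions made for this example and do not come from the manual. It models only the cases stated in the text.

```python
def address_stall_cycles(use, scale=1, updated_by_autoinc=False):
    """Stall cycles when an instruction forms an address from a register
    that the immediately preceding instruction modified.

    Hypothetical helper; it encodes only the V4 rules stated in the text.

    use                -- "base" or "index": how the modified register is used
    scale              -- index scale factor (1, 2 or 4); unused for "base"
    updated_by_autoinc -- True if the register was updated by a
                          postincrement or predecrement addressing mode
    """
    if use not in ("base", "index"):
        raise ValueError("use must be 'base' or 'index'")
    if scale not in (1, 2, 4):
        raise ValueError("scale must be 1, 2 or 4")
    if updated_by_autoinc:
        return 0  # per the NOTE: these results are available without stalls
    if use == "base" or scale == 1:
        return 2  # base register, or index with Xi.l*1
    return 3      # index with Xi.l*2 or Xi.l*4
```

For the muls.l/move.l pair shown earlier, `address_stall_cycles("index", 4)` gives 3, which matches the 3-cycle wait described in the text.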
The OEP can complete all memory accesses without the memory causing any stalls.
Thus, these timings assume an infinite, zero-wait-state memory attached to the core.
Operand accesses are assumed to be aligned as follows:
— 16-bit operands are aligned on 0-modulo-2 addresses
— 32-bit operands are aligned on 0-modulo-4 addresses
Operands that do not meet these guidelines are misaligned. Table 6-6 shows how the
core decomposes a misaligned operand reference into a series of aligned accesses.
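The alignment rule can be written as a one-line check. This is a minimal sketch, not code from the manual: the name `is_aligned` is hypothetical, and treating byte operands as always aligned is an assumption that the text implies but does not state.

```python
def is_aligned(address, size_bytes):
    """Return True if an operand of size_bytes at address is aligned.

    16-bit operands need an address that is 0 modulo 2 and 32-bit
    operands one that is 0 modulo 4. Byte operands are assumed to be
    aligned at any address.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("size_bytes must be 1, 2 or 4")
    return address % size_bytes == 0
```

For example, a 32-bit operand at address 0x1002 fails the check, while a 16-bit operand at the same address passes.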
Table 6-6. Misaligned Operand References

A[1:0]   Size   Bus Operations      Additional C(R/W) (1)
------   ----   -----------------   --------------------------------
x1       Word   Byte, Byte          2(1/0) if read, 1(0/1) if write
x1       Long   Byte, Word, Byte    3(2/0) if read, 2(0/2) if write
10       Long   Word, Word          2(1/0) if read, 1(0/1) if write

(1) Each timing entry is presented as C(r/w), described as follows:
    C is the number of processor clock cycles, including all applicable operand
    fetches and writes, as well as all internal core cycles required to complete
    the instruction execution.
    r/w is the number of operand reads (r) and writes (w) required by the
    instruction. An operation performing a read-modify-write function is denoted
    as (1/1).
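The table can also be read as a lookup keyed on operand size and the two low address bits. The Python sketch below is one way to express that, under stated assumptions: the names `Timing`, `MISALIGNED`, `misaligned_access`, and `format_timing` are hypothetical, the dictionary layout is a choice made for this example, and only the timing values themselves are transcribed from Table 6-6.

```python
from collections import namedtuple

# C(r/w): cycles = additional processor clock cycles,
# reads/writes = additional operand reads and writes.
Timing = namedtuple("Timing", "cycles reads writes")

# (size, A[1:0]) -> (bus operations, extra timing if read, extra timing if write)
# "x1" in the table covers both A[1:0] = 01 and A[1:0] = 11.
MISALIGNED = {
    ("word", 0b01): (("byte", "byte"), Timing(2, 1, 0), Timing(1, 0, 1)),
    ("word", 0b11): (("byte", "byte"), Timing(2, 1, 0), Timing(1, 0, 1)),
    ("long", 0b01): (("byte", "word", "byte"), Timing(3, 2, 0), Timing(2, 0, 2)),
    ("long", 0b11): (("byte", "word", "byte"), Timing(3, 2, 0), Timing(2, 0, 2)),
    ("long", 0b10): (("word", "word"), Timing(2, 1, 0), Timing(1, 0, 1)),
}


def misaligned_access(address, size, is_write):
    """Return (bus_operations, additional Timing) for one operand access.

    An aligned access is returned as a single bus operation of the
    operand's own size with no additional cycles.
    """
    if size not in ("word", "long"):
        raise ValueError("size must be 'word' or 'long'")
    entry = MISALIGNED.get((size, address & 0b11))
    if entry is None:  # aligned: not listed in the table
        return (size,), Timing(0, 0, 0)
    bus_ops, read_timing, write_timing = entry
    return bus_ops, (write_timing if is_write else read_timing)


def format_timing(timing):
    """Render a Timing in the C(r/w) notation used by the table."""
    return f"{timing.cycles}({timing.reads}/{timing.writes})"
```

As a usage example, a longword read at address 0x1001 yields the bus operations byte, word, byte and an additional timing that formats as 3(2/0), the second row of the table.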
Freescale Semiconductor, Inc.