Performing a sequence of cacheable data loads over
a 100MHz bus, both the MPC750 and the
MPC74xx variants have a peak bandwidth of
800Mbytes per second. With the constraints of the
60x bus protocol and the same memory system
latency, both have a maximum bandwidth of
640Mbytes per second. However, in terms of
sustained bandwidth, which best represents actual
system performance, the MPC74xx devices
outperform the MPC750 by nearly 3:1.
Benefit 2. More Back-to-Back
Transactions on the Bus
Instructions
In the G3 architecture, once an I-cache miss occurs,
no further I-cache misses are issued to the L2 or the
system bus until the cache line fill updates both the
L1 and L2 caches. Thanks to an additional entry in
the instruction reload table, the
MPC7400/MPC7410 architecture allows a second
instruction fetch to start after the first fetch has
updated the L1, but before it has updated the L2.
Going a step further in improving instruction fetch
performance, the MPC7440/MPC7450 can support
up to two outstanding instruction fetches, compared
to just one for the MPC750 and the
MPC7400/MPC7410.
Data
As a result of the G3’s D-cache design, once a
D-cache miss occurs, no further D-cache misses
(triggered by program loads and stores) are
propagated to the L2 or the system bus until the
original missed data is returned. This means that
back-to-back cacheable data reads are not pipelined
on the bus. Even though the bus interface unit may
be ready for more transactions, and the 60x bus
protocol can accept another pipelined address
phase, the blocking caches add latency to a
sequence of read accesses. In order to prevent one
miss from blocking the cache for subsequent
accesses, the MPC7400/MPC7410 supports
‘miss-under-miss.’ If a miss is pending,
subsequent loads that miss in the D-cache will
propagate to the bus, rather than stalling. In fact, the
load/store unit of the MPC7400/MPC7410 can
continue to issue requests until up to six misses are
pending. The MPC7440/MPC7450 can support up
to 16 outstanding data tenures on the bus, five of
which may be data load misses. (The others may be
stores, castouts, snoop pushes, or instruction
fetches.)
Better pipelining of instruction fetches and support
for multiple outstanding data transactions add up to
better bus utilization and higher sustainable
bandwidth than the MPC750 can provide.
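The effect of allowing more misses in flight can be sketched with a toy model. It is only an illustration, not a description of the actual bus interface unit: it assumes each read takes 13 bus cycles from address to last data beat and that the data bus can deliver one cache line every 5 cycles (both figures come from the bandwidth footnotes in this document). The function name and parameters are hypothetical.

```python
LINE_BYTES = 32   # one cache line, per the bandwidth footnotes
BUS_MHZ = 100     # bus clock used throughout this document


def cycles_per_line(outstanding, n_lines=1000, miss_cycles=13, bus_cycles=5):
    """Average bus cycles per cache line in a toy model of pipelined reads.

    outstanding  -- how many misses may be pending at once (assumed knob)
    miss_cycles  -- address-to-last-beat time of one read (assumed: 13)
    bus_cycles   -- minimum spacing of line transfers on the bus (assumed: 5)
    """
    done = []          # completion cycle of each read, in issue order
    last_finish = 0    # completion cycle of the previous read
    for i in range(n_lines):
        # A new miss can be issued only once fewer than `outstanding`
        # reads are pending, i.e. when read i - outstanding has completed.
        issue = done[i - outstanding] if i >= outstanding else 0
        # The read completes after its own latency, or one bus slot after
        # the previous read completed, whichever is later.
        finish = max(issue + miss_cycles, last_finish + bus_cycles)
        done.append(finish)
        last_finish = finish
    return done[-1] / n_lines


def mbytes_per_sec(cycles):
    """Convert cycles per cache line into Mbytes/sec. at BUS_MHZ."""
    return LINE_BYTES * BUS_MHZ / cycles


blocking = cycles_per_line(outstanding=1)    # one miss at a time
pipelined = cycles_per_line(outstanding=6)   # several misses in flight
```

With one miss at a time the model settles at 13 cycles per line (about 246 Mbytes/sec.); with several misses in flight it approaches the 5-cycle bus limit (about 640 Mbytes/sec.).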
D-cache
Benefit 3. L1 Cache Access
Improvements
Load Miss Folding
In the MPC750, if there are two load misses to the
same cache block, the second load must wait until
the entire block is returned before it can access its
data. Subsequent accesses to the cache are also
stalled. When two load misses to the same cache
block occur in the MPC74xx, the stall does not
occur. Instead, as data beats return for the first miss,
results can be provided for the next miss as well.
Furthermore, up to four subsequent misses to the
same cache block can be ‘folded’ into a Load Fold
Queue, allowing full access to the D-cache for the
following instructions while the reload is in
progress. Non-blocked access to the cache,
combined with pipelining of back-to-back data
reads on the bus, can improve the performance of a
PowerPC system limited by bus bandwidth.
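As a rough illustration of load miss folding, the sketch below labels a sequence of loads that arrive while a single block reload is in progress. It is a simplified model under stated assumptions: the 32-byte block size is taken from the bandwidth footnotes, the fold depth of four from the text above, and the "stall" label for a same-block load arriving after the fold queue is full is an assumption, since the text does not say what happens in that case. The function name is hypothetical.

```python
BLOCK_BYTES = 32       # cache block size (from the bandwidth footnotes)
FOLD_QUEUE_DEPTH = 4   # "up to four subsequent misses" per the text


def classify_loads(addresses, fold_depth=FOLD_QUEUE_DEPTH):
    """Label loads issued while one block reload is in progress.

    The first address is taken to be the miss that starts the reload;
    every later load is assumed to arrive before the reload completes.
    """
    pending_block = addresses[0] // BLOCK_BYTES
    folded = 0
    labels = ["miss"]
    for addr in addresses[1:]:
        if addr // BLOCK_BYTES != pending_block:
            # A different block: free to access the D-cache in this model.
            labels.append("other block")
        elif folded < fold_depth:
            # Same block, room in the fold queue: served as beats return.
            folded += 1
            labels.append("folded")
        else:
            # Same block, fold queue full (assumed behavior: wait).
            labels.append("stall")
    return labels


labels = classify_loads([0, 8, 16, 24, 4, 12, 64])
```

Here the first load misses, the next four loads to the same block are folded, the sixth finds the queue full, and the load at address 64 belongs to a different block.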
Comparison of MPC750 and MPC74xx Bus
Bandwidth (Mbytes/sec.)¹ at 100MHz

Device     Peak²    Maximum³    Sustained
MPC750     800      640         246⁴
MPC74xx    800      640         640⁵

¹ Values assume a memory read latency of 10 bus cycles,
counted from the cycle when address is driven and TS is
asserted:
1. Processor bus to system logic
2. System logic to memory interface
3. SDRAM Activate command (assert RAS)
4. Wait for memory (activate to Read/Write = 2 cycles)
5. Read command (assert CAS)
6. Wait for memory (SDRAM Read Latency = 3 cycles)
7. Wait for memory (continued)
8. First beat on memory bus
9. Data latched into system logic (not necessarily required)
10. First beat on processor bus
² Peak bandwidth (MPC750 and MPC74xx) = 8 Bytes/cycle
x 100MHz = 800 MB/sec.
³ Maximum bandwidth (MPC750 and MPC74xx) =
[(1 cache line)/5 bus cycles] x 100MHz =
32 Bytes x 100MHz / 5 cyc = 640 MB/sec.
⁴ Sustained bandwidth (MPC750) = [(1 cache line)/13 bus
cycles] x 100MHz = 32 Bytes x 100MHz / 13 cyc = 246
MB/sec.
⁵ Sustained bandwidth (MPC74xx) = maximum bandwidth
(MPC74xx). By pipelining transactions on the address
bus, the MPC74xx does not incur any additional penalty
beyond the limitations of the 60x bus protocol.
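The footnote arithmetic can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in the footnotes (8 bytes per bus cycle, 32-byte cache line, 5 and 13 bus cycles per line, 100MHz bus); the function and variable names are illustrative.

```python
BUS_MHZ = 100             # bus clock
BUS_BYTES_PER_CYCLE = 8   # width of the data bus
LINE_BYTES = 32           # one cache line


def mbytes_per_sec(bytes_moved, bus_cycles, bus_mhz=BUS_MHZ):
    """Bandwidth in Mbytes/sec. for `bytes_moved` every `bus_cycles` cycles."""
    return bytes_moved * bus_mhz / bus_cycles


peak = mbytes_per_sec(BUS_BYTES_PER_CYCLE, 1)      # 8 bytes every cycle
maximum = mbytes_per_sec(LINE_BYTES, 5)            # one line per 5 cycles
sustained_mpc750 = mbytes_per_sec(LINE_BYTES, 13)  # one line per 13 cycles
sustained_mpc74xx = maximum                        # per the MPC74xx footnote

# Sustained-bandwidth advantage of the MPC74xx over the MPC750.
ratio = sustained_mpc74xx / sustained_mpc750
```

The ratio works out to 13/5 = 2.6, which is the "nearly 3:1" advantage quoted for sustained bandwidth.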
Freescale Semiconductor, Inc.
For More Information On This Product,
Go to: www.freescale.com