November 9, 1998 (Version 3.1)
XC3000 Series Field Programmable Gate Arrays
Device Performance
The XC3000 families of FPGAs can achieve very high
performance. This is the result of:
• A sub-micron manufacturing process, developed and
  continuously being enhanced for the production of
  state-of-the-art CMOS SRAMs.
• Careful optimization of transistor geometries, circuit
  design, and layout, based on years of experience with
  the XC3000 family.
• A look-up table based, coarse-grained architecture that
  can collapse multiple-layer combinatorial logic into a
  single function generator. One CLB can implement up
  to four layers of conventional logic in as little as 1.5 ns.
Actual system performance is determined by the timing of
critical paths, including the delay through the combinatorial
and sequential logic elements within CLBs and IOBs, plus
the delay in the interconnect routing. The AC-timing
specifications state the worst-case timing parameters for the
various logic resources available in the XC3000 family
architecture.
Figure 31 shows a variety of elements
involved in determining system performance.
Logic block performance is expressed as the propagation
time from the interconnect point at the input to the block to
the output of the block in the interconnect area. Since com-
binatorial logic is implemented with a memory lookup table
within a CLB, the combinatorial delay through the CLB,
called TILO, is always the same, regardless of the function
being implemented. For the combinatorial logic function
driving the data input of the storage element, the critical
timing is data set-up relative to the clock edge provided to
the flip-flop element. The delay from the clock source to the
output of the logic block is critical to the timing of signals
produced by storage elements. Loading of a logic-block output
is limited only by the resulting propagation delay of the
larger interconnect network. Speed performance of the
logic block is a function of supply voltage and temperature.
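
The following is a minimal Python sketch of why a lookup-table function generator has the same delay for every function: evaluation is a single memory read, whatever the table contains. The 5-input width and the names make_lut and lut_eval are assumptions chosen for illustration, not taken from this data sheet.

```python
from itertools import product

def make_lut(func, n_inputs=5):
    """Precompute the truth table of func as a list of 2**n_inputs bits."""
    table = []
    for bits in product((0, 1), repeat=n_inputs):
        table.append(int(bool(func(*bits))))
    return table

def lut_eval(table, *bits):
    """Evaluate by indexing the table; the work is identical for any function."""
    index = 0
    for b in bits:
        # The first input is the most significant address bit,
        # matching the order in which make_lut filled the table.
        index = (index << 1) | (b & 1)
    return table[index]

# Two very different 5-input functions (hypothetical examples).
and5 = make_lut(lambda a, b, c, d, e: a and b and c and d and e)
parity5 = make_lut(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
```

Both and5 and parity5 are evaluated by the same indexing step in lut_eval, which mirrors the statement that TILO does not depend on the function being implemented.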
Interconnect performance depends on the routing
resources used to implement the signal path. Direct inter-
connects to the neighboring CLB provide an extremely fast
path. Local interconnects go through switch matrices
(magic boxes) and suffer an RC delay, equal to the resis-
tance of the pass transistor multiplied by the capacitance of
the driven metal line. Longlines carry the signal across the
length or breadth of the chip with only one access delay.
Generous on-chip signal buffering makes performance rel-
atively insensitive to signal fan-out; increasing fan-out from
1 to 8 changes the CLB delay by only 10%. Clocks can be
distributed with two low-skew clock distribution networks.
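
As a rough illustration of the RC delay described above, the sketch below multiplies a pass-transistor resistance by the capacitance of the driven line. The function names and any resistance or capacitance values passed in are hypothetical; this data sheet does not give them. Summing segments independently is a first-order estimate only.

```python
def rc_delay_ns(resistance_ohms, capacitance_pf):
    """RC product in nanoseconds (1 ohm * 1 pF = 1e-12 s = 1e-3 ns)."""
    return resistance_ohms * capacitance_pf * 1e-3

def path_delay_ns(segments):
    """First-order estimate: sum the RC delay of each (R, C) segment.

    segments is an iterable of (resistance_ohms, capacitance_pf) pairs,
    one per switch-matrix pass transistor and the metal line it drives.
    """
    return sum(rc_delay_ns(r, c) for r, c in segments)
```

For example, with assumed values of 1000 ohms and 1 pF, rc_delay_ns(1000, 1.0) gives a 1 ns delay for one segment.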
The tools in the Development System used to place and
route a design in an XC3000 FPGA automatically calculate
the actual maximum worst-case delays along each signal
path. This timing information can be back-annotated to the
design’s netlist for use in timing simulation or examined
with a static timing analyzer.
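
The idea of finding the worst-case delay along each signal path can be sketched as a longest-path search over a small netlist. This is only an illustration of the general technique, not the Development System's algorithm; the netlist, node names, and delay values are all hypothetical.

```python
from functools import lru_cache

# Hypothetical netlist: node -> (own delay in ns, list of fan-in nodes).
NETLIST = {
    "ff_a": (0.0, []),
    "ff_b": (0.0, []),
    "clb1": (4.0, ["ff_a", "ff_b"]),
    "clb2": (4.0, ["clb1"]),
    "clb3": (4.0, ["ff_b"]),
    "out":  (1.0, ["clb2", "clb3"]),
}

@lru_cache(maxsize=None)
def arrival_time(node):
    """Worst-case arrival time at the output of node, in ns."""
    delay, fanin = NETLIST[node]
    return delay + max((arrival_time(src) for src in fanin), default=0.0)

def critical_path(node):
    """Trace back from node through the latest-arriving fan-in at each step."""
    path = [node]
    while NETLIST[node][1]:
        node = max(NETLIST[node][1], key=arrival_time)
        path.append(node)
    return list(reversed(path))
```

Calling arrival_time("out") returns the worst-case delay to that node, and critical_path("out") lists the nodes along the slowest path.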
Actual system performance is application dependent. The
maximum clock rate that can be used in a system is deter-
mined by the critical path delays within that system. These
delays are combinations of incremental logic and routing
delays, and vary from design to design. In a synchronous
system, the maximum clock rate depends on the number of
combinatorial logic layers between re-synchronizing flip-flops.
Figure 33 shows the achievable clock rate as a
function of the number of CLB layers.
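
The dependence of clock rate on logic depth can be sketched with a simplified model in which the clock period is the clock-to-output delay, plus one combinatorial and one routing delay per layer, plus the set-up time. This model and the function name max_clock_mhz are assumptions for illustration; the data sheet's own figure may count delays differently, and real values must come from the AC-timing specifications.

```python
def max_clock_mhz(layers, t_cko_ns, t_ilo_ns, t_route_ns, t_setup_ns):
    """Estimate the maximum clock rate in MHz for a given logic depth.

    Assumed model (not from the data sheet):
        period = t_cko + layers * (t_ilo + t_route) + t_setup
    All delays are in nanoseconds.
    """
    if layers < 0:
        raise ValueError("layers must be non-negative")
    period_ns = t_cko_ns + layers * (t_ilo_ns + t_route_ns) + t_setup_ns
    return 1000.0 / period_ns
```

With placeholder delays of 3 ns clock-to-output, 4 ns per layer, 1 ns routing, and 2 ns set-up, one layer gives a 10 ns period, and each additional layer lowers the achievable clock rate.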
[Figure 31 (X3178): block diagram of CLB, IOB, and PAD elements with labeled delays: Clock to Output (TCKO), Combinatorial (TILO), Setup (TICK), TPID, TOKPO, TOP.]
Figure 31: Primary Block Speed Factors. Actual timing is a function of various block factors combined with routing factors. Overall performance can be evaluated with the timing calculator or by an optional simulation.