Revision 1.2
www.national.com
Architecture Overview (Continued)
1.1 INTEGER UNIT
The integer unit consists of:
• Instruction Buffer
• Instruction Fetch
• Instruction Decoder and Execution
The pipelined integer unit fetches, decodes, and executes
x86 instructions through the use of a five-stage integer
pipeline.
The instruction fetch pipeline stage generates, from the
on-chip cache, a continuous high-speed instruction
stream for use by the processor. Up to 128 bits of code
are read during a single clock cycle.
Branch prediction logic within the prefetch unit generates
a predicted target address for unconditional or conditional
branch instructions. When a branch instruction is detected,
the instruction fetch stage starts loading instructions at
the predicted address within a single clock cycle. Up to
48 bytes of code are queued prior to the instruction
decode stage.
The instruction decode stage evaluates the code stream
provided by the instruction fetch stage and determines the
number of bytes in each instruction and the instruction
type. Instructions are processed and decoded at a maximum
rate of one instruction per clock.
The address calculation function is pipelined and contains
two stages, AC1 and AC2. If the instruction refers to a
memory operand, AC1 calculates a linear memory
address for the instruction.
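
As a rough illustration of what AC1 computes, the sketch below forms an x86 linear address in 32-bit mode: the effective address (base + index*scale + displacement) is added to the segment base, wrapping at 32 bits. The function and parameter names are illustrative assumptions rather than anything defined by this datasheet, and segment limit and protection checks are omitted.

```python
MASK32 = 0xFFFFFFFF  # 32-bit addresses wrap modulo 2**32


def effective_address(base=0, index=0, scale=1, disp=0):
    """Offset within the segment: base + index*scale + displacement."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK32


def linear_address(segment_base, base=0, index=0, scale=1, disp=0):
    """Linear address = segment base + effective address (32-bit wrap)."""
    offset = effective_address(base, index, scale, disp)
    return (segment_base + offset) & MASK32


if __name__ == "__main__":
    # Hypothetical operand [EBX + ESI*4 + 0x10] in a flat (zero-base) segment.
    print(hex(linear_address(0x0, base=0x1000, index=3, scale=4, disp=0x10)))
```

With a flat segment the linear address equals the effective address; a non-zero segment base simply shifts the result.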
The AC2 stage performs any required memory management
functions, cache accesses, and register file accesses. If a
floating point instruction is detected by AC2, the
instruction is sent to the floating point unit for
processing.
The execution stage, under control of microcode, executes
instructions using the operands provided by the address
calculation stage.
Write-back, the last stage of the integer unit, updates the
register file within the integer unit or writes to the
load/store unit within the memory management unit.
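
The flow of instructions through the stages described above can be pictured with a small idealized model. This is only a sketch under stated assumptions: one instruction enters per clock, nothing stalls, and AC1 and AC2 (the two sub-stages of address calculation) each occupy their own slot. The slot names and the `simulate` function are hypothetical and do not describe the actual hardware timing.

```python
# Pipeline slots named in the text. AC1 and AC2 are the two sub-stages of
# the address calculation stage, so they occupy separate slots in this model.
SLOTS = ["IF", "ID", "AC1", "AC2", "EX", "WB"]


def simulate(instructions):
    """Return per-clock snapshots mapping slot name -> instruction.

    Idealized: one instruction enters per clock and nothing ever stalls.
    """
    if not instructions:
        return []
    timeline = []
    total_clocks = len(instructions) + len(SLOTS) - 1
    for clock in range(total_clocks):
        snapshot = {}
        for depth, slot in enumerate(SLOTS):
            i = clock - depth  # instruction i entered the pipeline at clock i
            if 0 <= i < len(instructions):
                snapshot[slot] = instructions[i]
        timeline.append(snapshot)
    return timeline


if __name__ == "__main__":
    for clock, snapshot in enumerate(simulate(["mov", "add", "cmp"])):
        print(clock, snapshot)
```

In this model, once the pipeline is full, one instruction completes write-back on every clock, which matches the stated maximum decode rate of one instruction per clock. Branches, multi-cycle microcode sequences, and cache misses would add stalls that the sketch ignores.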
1.2 FLOATING POINT UNIT
The floating point unit (FPU) interfaces to the integer unit
and the cache unit through a 64-bit bus. The FPU is
x87-instruction-set compatible and adheres to the IEEE-754
standard. Because almost all applications that contain
FPU instructions also contain integer instructions, the
GXLV processor’s FPU achieves high performance by
completing integer and FPU operations in parallel.
FPU instructions are dispatched to the pipeline within the
integer unit. The address calculation stage of the pipeline
checks for memory management exceptions and accesses
memory operands for use by the FPU. Once the instructions
and operands have been provided to the FPU, the FPU
completes instruction execution independently of the
integer unit.
1.3 WRITE-BACK CACHE UNIT
The 16 KB write-back unified (data/instruction) cache is
configured as four-way set associative. The cache stores
up to 16 KB of code and data in 1024 cache lines.
The GXLV processor provides the ability to allocate a
portion of the L1 cache as a scratchpad, which is used to
accelerate the Virtual Systems Architecture technology
algorithms as well as for some graphics operations.
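
The cache figures quoted above (16 KB, 1024 lines, four-way set associative) imply a line size and set count by simple arithmetic, worked through in the sketch below. The 32-bit address width and the conventional tag/index/offset bit layout are assumptions for illustration, not a description of the actual cache addressing, and scratchpad allocation is ignored.

```python
CACHE_BYTES = 16 * 1024   # 16 KB, from the text
NUM_LINES = 1024          # from the text
WAYS = 4                  # four-way set associative, from the text
ADDRESS_BITS = 32         # assumption: 32-bit physical address

LINE_BYTES = CACHE_BYTES // NUM_LINES       # 16 bytes per line
NUM_SETS = NUM_LINES // WAYS                # 256 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # 4 bits select a byte in a line
INDEX_BITS = NUM_SETS.bit_length() - 1      # 8 bits select a set
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS  # 20 bits remain as tag


def split_address(addr):
    """Split an address into (tag, set index, byte offset).

    Assumes the index bits sit directly above the offset bits.
    """
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print(LINE_BYTES, NUM_SETS, TAG_BITS)
    print([hex(field) for field in split_address(0x12345678)])
```

Under these assumptions, any address maps to exactly one of 256 sets and may reside in any of that set's four lines.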
1.4 MEMORY MANAGEMENT UNIT
The memory management unit (MMU) translates the linear
address supplied by the integer unit into a physical
address to be used by the cache unit and the internal bus
interface unit. Memory management procedures are
x86-compatible, adhering to standard paging mechanisms.
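
To make the linear-to-physical translation concrete, the sketch below walks the classic 32-bit x86 two-level paging scheme with 4 KB pages, assuming that is the mode in use. Nested dictionaries stand in for the page directory and page tables held in memory; the function names are hypothetical, and present bits, access rights, and the TLB are left out.

```python
PAGE_SHIFT = 12                       # 4 KB pages: low 12 bits are the offset
INDEX_BITS = 10                       # 1024 entries per directory and table
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_linear(linear):
    """Return (directory index, table index, page offset)."""
    offset = linear & OFFSET_MASK
    table = (linear >> PAGE_SHIFT) & INDEX_MASK
    directory = (linear >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    return directory, table, offset


def translate(linear, page_directory):
    """Translate a linear address; LookupError stands in for a page fault.

    page_directory: {directory index: {table index: physical frame base}}
    """
    directory, table, offset = split_linear(linear)
    try:
        frame_base = page_directory[directory][table]
    except KeyError:
        raise LookupError(f"page fault at linear address {linear:#010x}") from None
    return frame_base | offset


if __name__ == "__main__":
    mapping = {1: {2: 0x00ABC000}}    # hypothetical single mapped page
    print(hex(translate(0x00402345, mapping)))
```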
The MMU also contains a load/store unit that is responsible
for scheduling cache and external memory accesses. The
load/store unit incorporates two performance-enhancing
features:
• Load-store reordering that gives memory reads required
  by the integer unit priority over writes to external
  memory.
• Memory-read bypassing that eliminates unnecessary
  memory reads by using valid data from the execution
  unit.
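
The two features above can be illustrated with a toy software model. It is a sketch only: writes are posted to a buffer so that later reads are serviced first (reordering), and a read whose address matches a posted write is satisfied from that on-chip data without touching memory (bypassing). The class name, the buffer structure, and the `drain` policy are assumptions and do not reflect the actual load/store unit design.

```python
from collections import OrderedDict


class LoadStoreModel:
    """Toy model: posted writes wait in a buffer while reads go first."""

    def __init__(self, memory):
        self.memory = memory            # dict: address -> value
        self.pending = OrderedDict()    # posted writes, oldest first
        self.memory_reads = 0           # reads that actually reached memory

    def write(self, addr, value):
        # Post the write; it does not block later reads.
        self.pending.pop(addr, None)    # a newer write supersedes an older one
        self.pending[addr] = value

    def read(self, addr):
        # Bypassing: the newest data for this address is still on-chip.
        if addr in self.pending:
            return self.pending[addr]
        # Reordering: the read reaches memory ahead of the queued writes.
        self.memory_reads += 1
        return self.memory.get(addr, 0)

    def drain(self):
        # Retire posted writes to memory, oldest first.
        while self.pending:
            addr, value = self.pending.popitem(last=False)
            self.memory[addr] = value


if __name__ == "__main__":
    unit = LoadStoreModel({0x100: 1, 0x200: 2})
    unit.write(0x100, 9)
    print(unit.read(0x200), unit.read(0x100), unit.memory_reads)
```

Letting a read pass queued writes is safe in this model because any read to an address with a posted write is answered from the buffer, so it never observes stale memory.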
1.5 INTERNAL BUS INTERFACE UNIT
The internal bus interface unit provides a bridge from the
GXLV processor to the integrated system functions (i.e.,
memory subsystem, display controller, graphics pipeline)
and the PCI bus interface.
When external memory access is required, the physical
address is calculated by the memory management unit
and then passed to the internal bus interface unit, which
translates the cycle to an X-Bus cycle (the X-Bus is a
proprietary internal bus which provides a common interface
for all of the integrated functions). The X-Bus memory
cycle is arbitrated between other pending X-Bus memory
requests to the SDRAM controller before completing.
In addition, the internal bus interface unit provides
configuration control for up to 20 different regions within
system memory with separate controls for read access, write
access, cacheability, and PCI access.
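
One way to picture the per-region controls is as a small table of address ranges, each carrying the four attributes listed above. The sketch below is purely illustrative: the `Region` fields, the `RegionTable` class, and the example address range are hypothetical and do not correspond to the processor's actual configuration registers.

```python
from dataclasses import dataclass

MAX_REGIONS = 20  # from the text


@dataclass(frozen=True)
class Region:
    start: int                 # first address in the region (inclusive)
    end: int                   # last address in the region (inclusive)
    readable: bool = True
    writable: bool = True
    cacheable: bool = True
    pci_access: bool = False


class RegionTable:
    """Holds up to MAX_REGIONS regions and finds the one covering an address."""

    def __init__(self):
        self.regions = []

    def add(self, region):
        if len(self.regions) >= MAX_REGIONS:
            raise ValueError("no free region slots")
        self.regions.append(region)

    def lookup(self, addr):
        """Return the first region containing addr, or None."""
        for region in self.regions:
            if region.start <= addr <= region.end:
                return region
        return None


if __name__ == "__main__":
    table = RegionTable()
    # Hypothetical example: an uncached window routed to PCI.
    table.add(Region(0xA0000, 0xBFFFF, cacheable=False, pci_access=True))
    print(table.lookup(0xA1234))
```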