
MOTOROLA
Chapter 1. Overview
256-Mbyte segment size. Block sizes range from 128 Kbytes to 256 Mbytes and are software
selectable. In addition, the 603e uses an interim 52-bit virtual address and hashed page
tables for generating 32-bit physical addresses. The MMUs in the 603e rely on the
exception processing mechanism for the implementation of the paged virtual memory
environment and for enforcing protection of designated memory areas.
Instruction and data TLBs provide address translation in parallel with the on-chip cache
access, incurring no additional time penalty in the event of a TLB hit. A TLB is a cache of
the most recently used page table entries. Software is responsible for maintaining the
consistency of the TLB with memory. The 603e’s TLBs are 64-entry, two-way
set-associative caches that contain instruction and data address translations. The 603e
provides hardware assist for software table search operations through the hashed page table
on TLB misses. Supervisor software can invalidate TLB entries selectively.
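The lookup performed by a 64-entry, two-way set-associative TLB can be sketched in C as follows. This is an illustration of the general technique, not the 603e's actual hardware logic; the structure fields and the use of the low virtual page number bits as the set index are assumptions for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_SETS 32   /* 64 entries / 2 ways = 32 sets */
#define TLB_WAYS 2

/* Hypothetical TLB entry: tag holds the upper VPN bits, rpn the
   translated physical (real) page number. */
typedef struct {
    bool     valid;
    uint32_t vpn_tag;  /* virtual page number bits above the set index */
    uint32_t rpn;      /* real page number for this translation */
} tlb_entry_t;

static tlb_entry_t tlb[TLB_SETS][TLB_WAYS];

/* Look up a virtual page number; returns true on a hit and fills *rpn. */
static bool tlb_lookup(uint32_t vpn, uint32_t *rpn)
{
    uint32_t set = vpn % TLB_SETS;   /* low VPN bits select the set */
    uint32_t tag = vpn / TLB_SETS;   /* remaining bits form the tag */
    for (int way = 0; way < TLB_WAYS; way++) {
        if (tlb[set][way].valid && tlb[set][way].vpn_tag == tag) {
            *rpn = tlb[set][way].rpn;
            return true;             /* hit: translation in parallel */
        }
    }
    return false;                    /* miss: software table search */
}
```

Because both ways of a set are compared at once, a hit costs no extra time over the cache access itself; a miss is what triggers the software table search described above.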
The 603e also provides independent four-entry BAT arrays for instructions and data that
maintain address translations for blocks of memory. These entries define blocks that can
vary from 128 Kbytes to 256 Mbytes. The BAT arrays are maintained by system software.
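The effect of a BAT entry can be illustrated with a simplified match-and-translate sketch. The field names (`bepi`, `brpn`) echo the architecture's BAT register fields, but the encoding here is a simplification: the block-length mask tells the comparator how many low bits of the 128-Kbyte-granular page index to ignore, which is how one entry can cover anywhere from 128 Kbytes to 256 Mbytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified BAT entry: bepi = block effective page index (high 15 bits
   of the effective address), brpn = block real page number, bl_mask =
   block-length mask (0 for 128 Kbytes up to 0x7FF for 256 Mbytes). */
typedef struct {
    bool     valid;
    uint32_t bepi;
    uint32_t brpn;
    uint32_t bl_mask;
} bat_t;

/* Check a four-entry BAT array; returns true and fills *pa on a hit. */
static bool bat_translate(const bat_t bat[4], uint32_t ea, uint32_t *pa)
{
    uint32_t epi = ea >> 17;  /* 128-Kbyte granularity: drop low 17 bits */
    for (int i = 0; i < 4; i++) {
        if (bat[i].valid && (epi & ~bat[i].bl_mask) == bat[i].bepi) {
            /* keep the within-block bits from the effective address */
            *pa = ((bat[i].brpn | (epi & bat[i].bl_mask)) << 17)
                | (ea & 0x1FFFF);
            return true;
        }
    }
    return false;
}
```

A block hit bypasses the page table entirely, which is why BATs suit large contiguous regions such as frame buffers or operating-system areas.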
As specified by the PowerPC architecture, the hashed page table is a variable-sized data
structure that defines the mapping between virtual page numbers and physical page
numbers. The page table size is a power of 2, and its starting address is a multiple of its size.
Also as specified by the PowerPC architecture, the page table contains a number of page
table entry groups (PTEGs). A PTEG contains eight page table entries (PTEs) of eight bytes
each; therefore, each PTEG is 64 bytes long. PTEG addresses are entry points for table
search operations.
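The PTEG entry-point computation can be sketched as follows. This is a simplified model: the real architecture forms the address from the SDR1 register's HTABORG and HTABMASK fields and also defines a secondary hash (the one's complement of the primary hash); here the power-of-2 table size and 64-byte PTEGs from the text are kept, and the base/size parameters are assumptions for the sketch.

```c
#include <stdint.h>

/* Simplified primary-hash PTEG address for a 32-bit PowerPC page table.
   htaborg is the table's starting address (a multiple of its size) and
   pteg_count is the number of PTEGs, a power of 2. */
static uint32_t pteg_addr(uint32_t htaborg, uint32_t pteg_count,
                          uint32_t vsid, uint32_t ea)
{
    uint32_t page_index = (ea >> 12) & 0xFFFF;      /* 16-bit page index */
    uint32_t hash = (vsid & 0x7FFFF) ^ page_index;  /* 19-bit primary hash */
    return htaborg + (hash % pteg_count) * 64;      /* 64 bytes per PTEG */
}
```

A table search starts at this address and scans the eight PTEs in the group for a matching virtual page number, falling back to the secondary hash if none matches.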
1.3.6 Instruction Timing
The 603e is a pipelined superscalar processor. In a pipelined processor, instruction processing is divided into discrete stages, so an instruction does not require the entire resources of an execution unit at once. For example, after an instruction completes the decode stage, it can pass on to the next stage while the subsequent instruction advances into the decode stage. This improves the throughput of the instruction flow. For example, it may
take three cycles for a floating-point instruction to complete, but if there are no stalls in the
floating-point pipeline, a series of floating-point instructions can have a throughput of one
instruction per cycle.
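The cycle count behind that throughput claim can be written out directly. Assuming a fixed d-stage pipeline with no stalls (an idealization of the three-cycle floating-point example above), each instruction after the first adds only one cycle:

```c
#include <stdint.h>

/* With a d-cycle pipeline and no stalls, instruction i starts one cycle
   after instruction i-1, so n instructions finish in d + (n - 1) cycles
   rather than d * n. */
static uint32_t pipelined_cycles(uint32_t n, uint32_t depth)
{
    return n == 0 ? 0 : depth + (n - 1);
}
```

For the three-cycle floating-point case, 100 back-to-back instructions take 102 cycles rather than 300, so sustained throughput approaches one instruction per cycle.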