
PRS28.4G
IBM Packet Routing Switch
prs28.03.fm
August 31, 2000
Architecture
Page 15 of 131
2. Architecture
The PRS28.4G incorporates two self-routing sub-switch elements, each running at 888 Mb/s per port, and a
control section that is common to both.
Each 1.77 Gb/s port therefore carries two data streams, one master and one slave, each at 888 Mb/s. The
master stream carries the data packet header bytes, followed by some packet payload bytes. The slave
stream carries only payload bytes.
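As an illustration, the split of one packet across the two streams can be modeled in software. This is a sketch only, not the device's byte-level framing: how many payload bytes the master stream carries is an assumption here (the two streams run at the same rate, so the model splits the packet evenly).

```python
def split_streams(header: bytes, payload: bytes):
    """Model the split of one packet across the master and slave streams.

    Per the description: the master stream carries the header bytes followed
    by some payload bytes, the slave stream carries only payload bytes.
    ASSUMPTION (not from the datasheet): since both streams run at 888 Mb/s,
    the split is balanced so each stream carries about half the total bytes.
    """
    total = len(header) + len(payload)
    # Payload bytes that follow the header on the master stream.
    master_payload = max(0, total // 2 - len(header))
    master = header + payload[:master_payload]
    slave = payload[master_payload:]
    return master, slave
```

Concatenating the master-stream payload bytes with the slave stream reproduces the original payload in order, which is the property any real interleaving must also preserve.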
The input controllers examine the headers of incoming Data Packets and check the data integrity, using a
parity bit on the header bytes. Valid Data Packets are then stored in the shared memory, and their storage
addresses, along with the packet priority and bit map, are further processed by a centralized output queue
access manager. These addresses are enqueued into FIFO queues, one per output port and per priority,
according to the packet priority and bit-map field. Data Packets are then transmitted one at a time, according
to their place in the output queues, with the rule that higher-priority packets always overtake lower-priority
packets.
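The enqueue/dequeue discipline described above can be sketched as a simple software model. This is an illustrative sketch only, not the hardware design: the port count, priority count, and even-parity convention are assumptions made for the example.

```python
from collections import deque

NUM_PORTS = 4       # illustrative only; the real device has more ports
NUM_PRIORITIES = 4  # the text describes four priorities

def header_parity_ok(header: bytes, parity_bit: int) -> bool:
    """Integrity check over the header bytes (even parity assumed here)."""
    ones = sum(bin(b).count("1") for b in header)
    return (ones + parity_bit) % 2 == 0

class OutputQueues:
    """One FIFO per output port and per priority; priority 0 is highest."""
    def __init__(self):
        self.queues = [[deque() for _ in range(NUM_PRIORITIES)]
                       for _ in range(NUM_PORTS)]

    def enqueue(self, address: int, priority: int, bitmap: int) -> None:
        # Enqueue the packet's storage address on every output whose bit
        # is set in the bit-map field; the payload itself is stored once.
        for port in range(NUM_PORTS):
            if bitmap & (1 << port):
                self.queues[port][priority].append(address)

    def dequeue(self, port: int):
        # Strict priority: a higher-priority packet always overtakes
        # lower-priority packets queued for the same output.
        for prio in range(NUM_PRIORITIES):
            if self.queues[port][prio]:
                return self.queues[port][prio].popleft()
        return None
```

Because `enqueue` takes a bit map, the same model also covers the multicast case described next: one address can land in several output queues at once.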
Multicast packets are processed in much the same way. A multicast packet is stored only once in the shared
buffer, while its address is enqueued in all output queues indicated by its bit-map field. A multicast packet is
transmitted on the indicated outputs according to its position in each output queue, not necessarily at the
same time on all outputs.
A central address manager maintains a pool of free shared-buffer addresses and provides new store
addresses to the input controllers. Once a packet has been transmitted on an output, its address is returned to
the address manager. The address manager also keeps track of the number of outputs still holding each
address, since one address can be enqueued multiple times for multicast packets; once this count reaches
zero, the address is returned to the free address pool.
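The address manager's behavior can be modeled as a free list with per-address use counts. This is a hedged sketch with invented names, not the device logic; the 512-address pool merely matches the 512 rows per shared-memory bank described below.

```python
from collections import deque

class AddressManager:
    """Free-address pool with per-address use counts (simplified model)."""

    def __init__(self, num_addresses: int = 512):
        # 512 matches the rows per shared-memory bank; illustrative here.
        self.free = deque(range(num_addresses))
        self.use_count = [0] * num_addresses

    def allocate(self, num_outputs: int) -> int:
        # Hand a free storage address to an input controller, recording how
        # many output queues will hold it (1 for unicast, >1 for multicast).
        address = self.free.popleft()
        self.use_count[address] = num_outputs
        return address

    def release(self, address: int) -> None:
        # Called once per output as the packet is transmitted; the address
        # returns to the free pool only when no output still holds it.
        self.use_count[address] -= 1
        if self.use_count[address] == 0:
            self.free.append(address)
```

For a multicast packet, `release` must fire once per indicated output before the address becomes reusable, which is exactly the use-count rule stated above.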
The shared memory is organized as two banks, one master and one slave, each consisting of 512 rows of 20
bytes, with one write port and one read port. Access to the shared memory is performed for one input and
one output port at a time. Between 16 and 20 bytes are transferred at each access, depending on the packet
length. A central sequencer grants shared-memory access to the input and output ports in round-robin order.
The sequencer cycle is equal, in byte cycles, to the number of data bytes stored in one memory row at every
access; it is defined as an integer between 16 and 20 such that the packet length is a multiple of this integer.
All cycles have equal
length. Without speed expansion or with speed expansion and packet length greater than 128 bytes, an LU is
received in two cycles of equal length. In speed expansion and packet length smaller than 128 bytes, it takes
only one cycle to receive an LU.
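One reading of the cycle-length rule can be expressed as a small helper. The tie-breaking choice below (preferring the largest transfer when several integers in 16..20 divide the packet length) is an assumption for illustration, not something the text states.

```python
def sequencer_cycle(packet_length: int) -> int:
    """Return the number of bytes stored per shared-memory access.

    Per the rule above: an integer between 16 and 20 such that the packet
    length is a multiple of it.  ASSUMPTION: when several values qualify
    (e.g. 80 is a multiple of both 16 and 20), prefer the largest transfer.
    """
    for n in range(20, 15, -1):
        if packet_length % n == 0:
            return n
    raise ValueError(f"no cycle length in 16..20 divides {packet_length}")

def cycles_per_lu(packet_length: int, speed_expansion: bool) -> int:
    """Cycles needed to receive one LU, per the rule above."""
    return 1 if speed_expansion and packet_length < 128 else 2
```

For example, a 64-byte packet yields a 16-byte cycle, while a 72-byte packet yields an 18-byte cycle, so in every case a whole number of accesses fills the packet.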
Data flow is controlled using a grant mechanism. Grants are given to the input interface of the attached
device to allow packets to be transmitted. Similarly, the output interface of the attached device provides
grants to each output port to enable packet transmission. On the input side of the switch, two kinds of grants
are provided: output-queue grants, which reflect the status of the output queues, and memory grants, which
reflect the status of the shared memory. One output-queue grant is provided per output and per priority. The
output queue manager maintains a counter for each output, indicating the total number of packets enqueued
for that output regardless of priority. Four programmable output-queue thresholds are also provided, one for
each priority. All output counters are compared to those four thresholds once per sequencer cycle: if an
output counter value is below a threshold, the corresponding grant is set; otherwise it is cleared. Similarly, a
counter keeps track of the total number of packets in the shared memory. Four programmable shared
memory thresholds are also provided, one for each priority. This counter is compared once per sequencer
cycle to those four thresholds to generate the memory grants. An input interface island is only allowed to