
Rev. 0 | Page 4 of 44 | March 2006
ADSP-21261
GENERAL DESCRIPTION
The ADSP-21261 SHARC DSP is a member of the SIMD SHARC family
of DSPs featuring Analog Devices' Super Harvard Architecture.
The ADSP-21261 is source code compatible with the ADSP-2126x,
ADSP-21160, and ADSP-21161 DSPs, as well as with first
generation ADSP-2106x SHARC processors in SISD
(single-instruction, single-data) mode. Like other SHARC DSPs,
the ADSP-21261 is a 32-bit/40-bit floating-point processor
optimized for high performance signal processing applications
with its dual-ported on-chip SRAM, mask-programmable ROM,
multiple internal buses to eliminate I/O bottlenecks, and an
innovative digital applications interface.
As shown in the Functional Block Diagram on Page 1, the
ADSP-21261 uses two computational units to deliver a five to
ten times performance increase over previous SHARC processors
on a range of DSP algorithms. Fabricated in a state-of-the-art,
high speed CMOS process, the ADSP-21261 DSP achieves an
instruction cycle time of 6.67 ns at 150 MHz. With its SIMD
computational hardware, the ADSP-21261 can perform
900 MFLOPS running at 150 MHz.
Table 1 shows performance benchmarks for the ADSP-21261.
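
The cycle time and peak throughput quoted above follow from simple arithmetic on the 150 MHz core clock. The sketch below reproduces them. The split into three floating-point operations per processing element per cycle is an inference from the 900 MFLOPS figure and the two processing elements, not something this section states, and all names are illustrative.

```python
# Illustrative arithmetic only; the names are hypothetical, not from the data sheet.
CORE_CLOCK_HZ = 150e6        # core clock stated in the text
PEAK_MFLOPS = 900            # peak figure stated in the text
PROCESSING_ELEMENTS = 2      # two processing elements active in SIMD mode

# Instruction cycle time is the reciprocal of the core clock.
cycle_time_ns = 1e9 / CORE_CLOCK_HZ

# Floating-point operations per cycle implied by the peak figure.
flops_per_cycle = PEAK_MFLOPS * 1e6 / CORE_CLOCK_HZ

# Per-element share: implied by the two numbers above, not stated in the text.
flops_per_element = flops_per_cycle / PROCESSING_ELEMENTS

print(f"cycle time: {cycle_time_ns:.2f} ns")
print(f"FLOPs per cycle: {flops_per_cycle:.0f}")
print(f"FLOPs per cycle per element: {flops_per_element:.0f}")
```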
The ADSP-21261 continues SHARC’s industry-leading standards of
integration for DSPs, combining a high performance 32-bit DSP
core with integrated, on-chip system features.
The block diagram of the ADSP-21261 on Page 1 illustrates the
following architectural features:
• Two processing elements, each containing an ALU, multiplier,
  shifter, and data register file
• Data address generators (DAG1, DAG2)
• Program sequencer with instruction cache
• PM and DM buses capable of supporting four 32-bit data
  transfers between memory and the core at every core processor
  cycle
• Three programmable interval timers with PWM generation, PWM
  capture/pulse width measurement, and external event counter
  capabilities
• On-chip dual-ported SRAM (1M bit)
• On-chip dual-ported, mask-programmable ROM (3M bit)
• JTAG test access port
• 8- or 16-bit parallel port that supports interfaces to
  off-chip memory peripherals
• DMA controller
• Four full-duplex serial ports
• SPI-compatible interface
• Digital applications interface that includes two precision
  clock generators (PCG), an input data port (IDP), four serial
  ports, eight serial interfaces, a 20-bit synchronous parallel
  input port, 10 interrupts, six flag outputs, six flag inputs,
  three programmable timers, and a flexible signal routing unit
  (SRU)
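
As a rough check on the PM and DM bus item in the list above, the peak core-to-memory bandwidth can be worked out from the stated four 32-bit transfers per core cycle. This is a back-of-the-envelope sketch that assumes the 150 MHz core clock quoted earlier and a transfer on every cycle; the variable names are illustrative.

```python
# Back-of-the-envelope estimate; names are hypothetical, not from the data sheet.
TRANSFERS_PER_CYCLE = 4      # PM and DM buses together, per the feature list
BITS_PER_TRANSFER = 32
CORE_CLOCK_HZ = 150e6        # assumed: the core clock quoted in the text

bytes_per_cycle = TRANSFERS_PER_CYCLE * BITS_PER_TRANSFER // 8
peak_bytes_per_second = bytes_per_cycle * CORE_CLOCK_HZ

print(f"{bytes_per_cycle} bytes per cycle")
print(f"{peak_bytes_per_second / 1e9:.1f} GB/s peak")
```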
Figure 2 shows one sample configuration of a SPORT using the
precision clock generator to interface with an I²S ADC and an
I²S DAC with a much lower jitter clock than the serial port
would generate itself. Many other SRU configurations are
possible.
ADSP-21261 FAMILY CORE ARCHITECTURE
The ADSP-21261 is code compatible at the assembly level with
the ADSP-2126x, ADSP-2136x, ADSP-2116x, and the first
generation ADSP-2106x SHARC DSPs. The ADSP-21261 shares
architectural features with the ADSP-2126x, ADSP-2136x, and
ADSP-2116x SIMD SHARC family of DSPs, as detailed in the
following sections.
SIMD Computational Engine
The ADSP-21261 contains two computational processing elements
that operate as a single-instruction, multiple-data (SIMD)
engine. The processing elements are referred to as PEX and PEY,
and each contains an ALU, multiplier, shifter, and register
file. PEX is always active, and PEY may be enabled by setting
the PEYEN mode bit in the MODE1 register. When this mode is
enabled, the same instruction is executed in both processing
elements, but each processing element operates on different
data. This architecture is efficient at executing math
intensive DSP algorithms.
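
The execution model described above can be illustrated with a small software model. This is a conceptual sketch only, not processor code: the `ProcessingElement` class, the `issue` function, and the register names `r0` to `r2` are hypothetical and chosen for illustration, and the `peyen` argument stands in for the PEYEN mode bit.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class ProcessingElement:
    """Hypothetical model of one processing element with its own register file."""
    name: str
    regs: dict = field(default_factory=dict)

    def execute(self, op, dst, src_a, src_b):
        # Apply the operation to this element's own registers.
        self.regs[dst] = op(self.regs[src_a], self.regs[src_b])


def issue(op, dst, src_a, src_b, pex, pey, peyen):
    """Issue one instruction: PEX always executes it, PEY only when peyen is set."""
    pex.execute(op, dst, src_a, src_b)
    if peyen:
        pey.execute(op, dst, src_a, src_b)


pex = ProcessingElement("PEX", {"r1": 2.0, "r2": 3.0})
pey = ProcessingElement("PEY", {"r1": 10.0, "r2": 0.5})

# One multiply instruction, two different data sets.
issue(operator.mul, "r0", "r1", "r2", pex, pey, peyen=True)
print(pex.regs["r0"], pey.regs["r0"])
```

With `peyen=False`, only the PEX register file would be updated, which corresponds to the SISD behavior mentioned earlier.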
Entering SIMD mode also affects the way data is transferred
between memory and the processing elements. When in SIMD mode,
twice the data bandwidth is required to sustain computational
operation in the processing elements. Because of this
requirement, entering SIMD mode also doubles the bandwidth
between memory and the processing elements. When using the
DAGs to transfer data in SIMD mode, two data values are
transferred with each access of memory or the register file.
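
The effect on data transfers can be sketched the same way. The model below assumes that the second value in a SIMD access comes from the next consecutive memory location; the text above only states that two values are transferred per access, so treat that addressing detail, along with the function names `dag_load` and `accesses_needed`, as assumptions made for illustration.

```python
def dag_load(memory, address, simd):
    """Return the values delivered by one modeled DAG access.

    Assumption: in SIMD mode the second value is taken from the next
    consecutive location. The source text only says two values move.
    """
    if simd:
        return memory[address], memory[address + 1]
    return (memory[address],)


def accesses_needed(n_values, simd):
    """Number of memory accesses required to move n_values data values."""
    per_access = 2 if simd else 1
    return -(-n_values // per_access)  # ceiling division


memory = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

print(dag_load(memory, 0, simd=False))
print(dag_load(memory, 0, simd=True))
print(accesses_needed(len(memory), simd=False),
      accesses_needed(len(memory), simd=True))
```

Halving the number of accesses for the same amount of data is what the doubled bandwidth amounts to in this simplified view.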
Independent, Parallel Computation Units 
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all
operations in a single cycle. The three units within each processing

Table 1. ADSP-21261 Benchmarks (at 150 MHz)

Benchmark Algorithm                                Speed (at 150 MHz)
1024 Point Complex FFT (Radix 4, with reversal)    46 μs
FIR Filter (per tap)¹                              2.5 ns
IIR Filter (per biquad)¹                           10 ns
Matrix Multiply (pipelined)
  [3×3] × [3×1]                                    22.5 ns
  [4×4] × [4×1]                                    40 ns
Divide (y/x)                                       15 ns
Inverse Square Root                                22.5 ns

¹ Assumes two files in multichannel SIMD mode.
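
The per-tap and per-biquad figures in Table 1 can be used for a first-order processing budget. The sketch below is a rough estimate under stated assumptions: it uses the table values as given, ignores loop setup, data movement, and any conditions attached to the table footnote, and the 64-tap filter and 48 kHz sample rate are hypothetical example inputs, not values from the data sheet.

```python
# First-order estimate from Table 1; function names and inputs are illustrative.
FIR_NS_PER_TAP = 2.5        # Table 1, FIR Filter (per tap)
IIR_NS_PER_BIQUAD = 10.0    # Table 1, IIR Filter (per biquad)


def fir_time_ns(taps):
    """Estimated time to compute one FIR output sample."""
    return taps * FIR_NS_PER_TAP


def iir_time_ns(biquads):
    """Estimated time to compute one output sample of a biquad cascade."""
    return biquads * IIR_NS_PER_BIQUAD


def load_fraction(time_per_sample_ns, sample_rate_hz):
    """Fraction of each sample period spent on the filter."""
    sample_period_ns = 1e9 / sample_rate_hz
    return time_per_sample_ns / sample_period_ns


taps = 64                   # hypothetical filter length
sample_rate_hz = 48_000     # hypothetical audio sample rate

per_sample_ns = fir_time_ns(taps)
print(f"{per_sample_ns:.1f} ns per output sample")
print(f"{100 * load_fraction(per_sample_ns, sample_rate_hz):.3f} % of the sample period")
```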