
Chapter 1: Overview of the AMD64 Architecture
AMD 64-Bit Technology
24593—Rev. 3.09—September 2003
The data are often represented as small quantities, such as 8
bits for pixel values, 16 bits for audio samples, and 32 bits
for object coordinates in floating-point format.
The 128-bit and 64-bit media instructions are designed to
accelerate these applications. The instructions use a form of
vector (or packed) parallel processing known as single-
instruction, multiple data (SIMD) processing. This vector
technology has the following characteristics:
A single register can hold multiple independent pieces of
data. For example, a single 128-bit XMM register can hold 16
8-bit integer data elements, or four 32-bit single-precision
floating-point data elements.
The vector instructions can operate on all data elements in a
register, independently and simultaneously. For example, a
PADDB instruction operating on byte elements of two vector
operands in 128-bit XMM registers performs 16 simultaneous
additions and returns 16 independent results in a single
operation.
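The lane-wise behavior described above can be illustrated with a short Python model. This is only a sketch of the semantics, not the instruction itself: the function name paddb and the use of a 16-byte bytes object to stand in for an XMM register are assumptions made for illustration.

```python
import struct

LANES = 16  # a 128-bit register viewed as sixteen 8-bit elements


def paddb(a: bytes, b: bytes) -> bytes:
    """Illustrative model of a packed byte add.

    Each of the 16 lanes is added independently; a lane that overflows
    wraps around modulo 256 and does not carry into its neighbor.
    """
    if len(a) != LANES or len(b) != LANES:
        raise ValueError("each operand must be exactly 16 bytes (128 bits)")
    return bytes((x + y) & 0xFF for x, y in zip(a, b))


# The same 16 bytes of storage can be viewed as sixteen 8-bit integers
# or as four 32-bit single-precision floating-point values.
assert struct.calcsize("<16B") == struct.calcsize("<4f") == 16

a = bytes(range(16))   # lanes hold 0, 1, ..., 15
b = bytes([10] * 16)   # every lane holds 10
result = paddb(a, b)
print(list(result))    # one call yields 16 independent sums
```

In hardware the 16 additions happen in a single instruction; the generator expression here only mimics the per-lane result.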
128-bit and 64-bit media instructions take SIMD vector
technology a step further by including special instructions that
perform operations commonly found in media applications. For
example, a graphics application that adds the brightness values
of two pixels must prevent the add operation from wrapping
around to a small value if the result overflows the destination
register, because an overflow result can produce unexpected
effects such as a dark pixel where a bright one is expected. The
128-bit and 64-bit media instructions include saturating-
arithmetic instructions to simplify this type of operation. A
result that otherwise would wrap around due to overflow or
underflow is instead forced to saturate at the largest or smallest
value that can be represented in the destination register.
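The difference between wraparound and saturating addition can be seen in a small Python model of the pixel-brightness case. This is a sketch under stated assumptions: the helper names below are hypothetical, the lists stand in for packed byte elements, and the functions model the kind of behavior provided by saturating instructions such as PADDUSB (unsigned) and PADDSB (signed) rather than the instructions themselves.

```python
def add_wrap_u8(x: int, y: int) -> int:
    """Ordinary 8-bit add: the result wraps around modulo 256."""
    return (x + y) & 0xFF


def add_sat_u8(x: int, y: int) -> int:
    """Unsigned saturating 8-bit add: the result is clamped to 255."""
    return min(x + y, 255)


def add_sat_s8(x: int, y: int) -> int:
    """Signed saturating 8-bit add: the result is clamped to -128..127."""
    return max(-128, min(x + y, 127))


def packed(op, a, b):
    """Apply a per-element operation to two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of elements")
    return [op(x, y) for x, y in zip(a, b)]


# Hypothetical brightness values for four pixels and the amounts added.
pixels = [200, 100, 255, 0]
boost = [100, 100, 1, 0]

wrapped = packed(add_wrap_u8, pixels, boost)    # bright pixels turn dark
saturated = packed(add_sat_u8, pixels, boost)   # bright pixels stay bright
print(wrapped, saturated)
```

With wraparound, 200 + 100 becomes 44, a dark pixel where a bright one is expected; with saturation the same sum is held at 255.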
1.1.5 Floating-Point Instructions
The AMD64 architecture provides three floating-point
instruction subsets, using three distinct register sets:
128-Bit Media Instructions support 32-bit single-precision
and 64-bit double-precision floating-point operations, in
addition to integer operations. Operations on both vector
data and scalar data are supported, with a dedicated
floating-point exception-reporting mechanism. These
floating-point operations comply with the IEEE-754
standard.