
Chapter 3: General-Purpose Programming
AMD 64-Bit Technology
24593—Rev. 3.09—September 2003
Caches are divided into fixed-size blocks, called cache lines.
Typically, implementations have either 32-byte or 64-byte cache
lines. The processor allocates a cache line to correspond to an
identically-sized region in main memory. After a cache line is
allocated, the addresses in the corresponding region of main
memory are used as addresses into the cache line. It is the
processor’s responsibility to keep the contents of the allocated
cache line coherent with main memory. Should another system
device access a memory address that is cached, the processor
maintains coherency by providing the correct data back to the
device and main memory.
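
The correspondence between a cache line and its identically-sized region of main memory can be illustrated with a little address arithmetic. The following is a minimal sketch, not part of the architecture definition: it assumes a 64-byte line (one of the two typical sizes mentioned above), and the names CACHE_LINE_SIZE, line_base, and line_offset are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Assumed line size in bytes. The text says implementations typically
 * use 32 or 64; the value must be a power of two for the masks below. */
#define CACHE_LINE_SIZE 64u

/* Address of the first byte of the line-sized memory region that
 * contains addr. Every address in that region maps to the same line. */
static inline uint64_t line_base(uint64_t addr)
{
    return addr & ~(uint64_t)(CACHE_LINE_SIZE - 1);
}

/* Byte position of addr inside its cache line (0 to CACHE_LINE_SIZE-1). */
static inline uint64_t line_offset(uint64_t addr)
{
    return addr & (uint64_t)(CACHE_LINE_SIZE - 1);
}
```

Under these assumptions, addresses 0x1200 through 0x123F all share the line base 0x1200, and address 0x1240 begins the next line.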
When a memory-read occurs as a result of an instruction fetch
or operand access, the processor first checks the cache to see if
the requested information is available. A read hit occurs if the
information is available in the cache, and a read miss occurs if
the information is not available. Likewise, a write hit occurs if a
memory write can be stored in the cache, and a write miss
occurs if it cannot be stored in the cache.
A read miss or write miss can result in the allocation of a cache
line, followed by a cache-line fill. Even if only a single byte is
needed, all bytes in a cache line are loaded from memory by a
cache-line fill. Typically, a cache-line fill must write over an
existing cache line in a process called a cache-line replacement.
In this case, if the existing cache line is modified, the processor
performs a cache-line writeback to main memory prior to
performing the cache-line fill.
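
The sequence of hit, miss, fill, replacement, and writeback described above can be sketched as a small software model. This is an illustration under stated assumptions, not a description of any actual implementation: it assumes a direct-mapped cache with four 64-byte lines, allocates a line on both read and write misses, and tracks only tags and counters rather than data. The names cache_model_t and cache_access are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 64u  /* assumed line size in bytes */
#define NUM_LINES 4u   /* assumed line count, tiny on purpose */

typedef struct {
    bool     valid;     /* line currently holds a memory region */
    bool     modified;  /* line contents differ from main memory */
    uint64_t tag;       /* identifies which region is cached */
} cache_line_t;

typedef struct {
    cache_line_t lines[NUM_LINES];
    unsigned hits, misses, fills, writebacks;
} cache_model_t;

/* Model one read or write access. Returns true on a hit. */
static bool cache_access(cache_model_t *c, uint64_t addr, bool is_write)
{
    uint64_t line_number = addr / LINE_SIZE;
    cache_line_t *line = &c->lines[line_number % NUM_LINES];
    uint64_t tag = line_number / NUM_LINES;
    bool hit = line->valid && line->tag == tag;

    if (hit) {
        c->hits++;
    } else {
        c->misses++;
        /* Replacement: a modified victim is written back to main
         * memory before the fill overwrites it. */
        if (line->valid && line->modified)
            c->writebacks++;
        /* Cache-line fill: the whole line is loaded, even if the
         * access needs only a single byte. */
        line->valid = true;
        line->modified = false;
        line->tag = tag;
        c->fills++;
    }
    if (is_write)
        line->modified = true;  /* the cached copy is now newer than memory */
    return hit;
}
```

Starting from a zero-initialized model, a read of address 0x0 misses and fills a line, a write to 0x3F hits that same line and marks it modified, and a read of 0x100 maps to the same slot with a different tag, forcing a writeback followed by a new fill.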
Cache-line writebacks help maintain coherency between the
caches and main memory. The processor can also maintain cache
coherency by internally probing (checking) the other caches and
write buffers for a more recent version of the requested data.
External devices can likewise check a processor’s caches and
write buffers for more recent versions of data by externally
probing the processor. All coherency operations are performed in
hardware and are completely transparent to applications.
Cache Coherency and MOESI. Implementations of the AMD64
architecture maintain coherency between memory and caches
using a five-state protocol known as MOESI. The five MOESI
states are modified, owned, exclusive, shared, and invalid. See
“Memory System” in Volume 2 for additional information on
MOESI and cache coherency.
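
As a rough illustration of the five states, they can be written as an enumeration with two helper predicates. This sketch is not taken from the architecture definition: the names moesi_state_t, moesi_is_valid, and moesi_needs_writeback are hypothetical, and the rule that modified and owned lines are the ones a cache must write back on eviction follows the way MOESI is commonly described, not anything stated in this section. Volume 2 remains the authoritative source.

```c
#include <stdbool.h>

/* The five MOESI states named in the text. */
typedef enum {
    MOESI_MODIFIED,
    MOESI_OWNED,
    MOESI_EXCLUSIVE,
    MOESI_SHARED,
    MOESI_INVALID
} moesi_state_t;

/* True if a line in this state holds usable data. */
static bool moesi_is_valid(moesi_state_t s)
{
    return s != MOESI_INVALID;
}

/* True if this cache is responsible for updating main memory when the
 * line is evicted. Assumption: as MOESI is commonly described, that is
 * the case for the modified and owned states only. */
static bool moesi_needs_writeback(moesi_state_t s)
{
    return s == MOESI_MODIFIED || s == MOESI_OWNED;
}
```

A software model of line replacement could, for example, consult moesi_needs_writeback before overwriting a line to decide whether a writeback must come first.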