
21918B/0—October 1999
AMD-K6-III Processor Data Sheet
Chapter 7 Cache Organization
7.8 Write Allocate
Write allocate, if enabled, occurs when the processor has a
pending memory write cycle to a cacheable line and the line
does not currently reside in the L1 data cache. If the line does
not exist in the L2 cache, the processor performs a 32-byte burst
read cycle on the system bus to fetch the data-cache line
addressed by the pending write cycle. If the line does exist in
the L2 cache, the data is supplied directly from the L2 cache, in
which case a system bus cycle is not executed. The data
associated with the pending write cycle is merged with the
recently allocated data-cache line and stored in the processor’s
L1 data cache. If the data-cache line was fetched from memory
(because of an L2 cache miss), the data is stored, without
modification, in the L2 cache. The final MESI state of the cache
lines depends on the state of the WB/WT# and PWT signals
during the burst read cycle and the subsequent L1 data cache
write hit (see Table 34 on page 195 for the cache-line
states and access types that follow a cache write miss). If the
L1 data cache line is stored in the modified state, then the same
cache line is stored in the L2 cache in the exclusive state. If the
L1 data cache line is stored in the shared state, then the same
cache line is stored in the L2 cache in the shared state.
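
The merge step and the L1/L2 state pairing described above can be sketched as a small software model. This is an illustration only, not the processor's implementation: the names `merge_write`, `l2_state_for`, and `write_allocate` are hypothetical, the model covers only the case where the line was fetched from memory after an L2 miss, and it includes only the two state pairings stated in the text. All other states are governed by Table 34 and are deliberately left out.

```python
LINE_SIZE = 32  # bytes per cache line, matching the 32-byte burst read


def merge_write(line, offset, data):
    """Merge the pending write data into a newly allocated cache line."""
    if len(line) != LINE_SIZE:
        raise ValueError("a cache line must be exactly 32 bytes")
    if offset < 0 or offset + len(data) > LINE_SIZE:
        raise ValueError("the write must fall entirely within the line")
    return line[:offset] + data + line[offset + len(data):]


def l2_state_for(l1_state):
    """Return the L2 state paired with an L1 state (documented pairs only)."""
    pairs = {"modified": "exclusive", "shared": "shared"}
    if l1_state not in pairs:
        # Not described in this section; consult Table 34 instead of guessing.
        raise ValueError("pairing not described here: " + l1_state)
    return pairs[l1_state]


def write_allocate(fetched_line, offset, data, l1_state):
    """Model a write allocate that missed the L2 cache (hypothetical helper).

    Returns ((l1_line, l1_state), (l2_line, l2_state)).
    """
    l1_line = merge_write(fetched_line, offset, data)  # L1 gets the merged line
    l2_line = fetched_line                             # L2 gets the line unmodified
    return (l1_line, l1_state), (l2_line, l2_state_for(l1_state))
```

For example, calling `write_allocate(bytes(32), 4, b"\xaa\xbb", "modified")` yields an L1 line whose bytes 4 and 5 hold the written data, while the L2 copy still holds the 32 bytes as fetched and is tagged exclusive.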
If a data-cache line fetch from memory is attempted because
the write allocate misses the L2 cache, and KEN# is sampled
negated, the processor does not perform an allocation. In this
case, the pending write cycle is executed as a single write cycle
on the system bus.
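
The resulting choice of system bus cycle can be summarized in a short decision function. This is a minimal sketch under stated assumptions: write allocate is enabled, the write has already missed the L1 data cache, and the function name and return strings are hypothetical labels chosen for the example. It collapses signal timing into two booleans and models only the outcome, not the bus protocol.

```python
def bus_cycle_for_write_miss(l2_hit, ken_asserted):
    """Pick the system bus cycle for a write-allocate candidate.

    Assumes write allocate is enabled and the write missed the L1 data cache.
    The returned strings are illustrative labels, not signal names.
    """
    if l2_hit:
        # The line is supplied by the L2 cache; no system bus cycle runs.
        return "none"
    if ken_asserted:
        # L2 miss and the line is cacheable: fetch the whole 32-byte line.
        return "burst_read_32"
    # L2 miss and KEN# sampled negated: no allocation, write goes to the bus.
    return "single_write"
```

Note that KEN# only matters on an L2 miss in this model, because an L2 hit never reaches the system bus.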
During write allocates that miss the L2 cache, a 32-byte burst
read cycle is executed in place of a non-burst write cycle. While
the burst read cycle generally takes longer to execute than the
non-burst write cycle, performance gains are realized on
subsequent write cycle hits to the write-allocated cache line.
Because memory accesses tend to cluster near one another (the
principle of locality), additional write hits to the
write-allocated cache line are likely.
Write allocates that hit the L2 cache increase performance by
avoiding accesses to the system bus.
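
The trade-off for the L2-miss case can be made concrete with a rough count of system bus clocks. The clock counts used in the example are placeholder values chosen for illustration, not figures from this data sheet, and the function names are hypothetical. The model assumes that without write allocate every write to the line goes to the system bus, that with write allocate every write after the first hits the cache at no bus cost, and it ignores the eventual writeback of the modified line.

```python
def bus_clocks_without_allocate(writes, single_write_clocks):
    """Every write to the line is a separate non-burst write on the bus."""
    return writes * single_write_clocks


def bus_clocks_with_allocate(burst_read_clocks):
    """One burst read fetches the line; later writes hit the cache."""
    return burst_read_clocks


def allocate_pays_off(writes, burst_read_clocks, single_write_clocks):
    """True when the one-time burst read costs less than the repeated writes."""
    return bus_clocks_with_allocate(burst_read_clocks) < bus_clocks_without_allocate(
        writes, single_write_clocks
    )
```

With a placeholder burst read of 8 clocks and a placeholder single write of 3 clocks, one write does not recover the cost of the burst read (8 versus 3), but three writes to the same line do (8 versus 9).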
The AMD-K6-III processor performs write allocations by way of
the three mechanisms described below. A write allocate is
performed when one or more of these mechanisms indicate that a
pending write is to a cacheable area of memory.