
4-8
Viterbi Decoder Implementation
For More Information On This Product,
Go to: www.freescale.com
Algorithmic Extensions
Starting from 0: The Pre ACS Macro
The subl a,b instruction does a shift left and subtract operation on b and a to obtain the
branch metric for encoder 11 outputs. The negated path metric value is saved in x0. The
transfer instruction and its parallel move swap y1 with a to save the 00 branch metric
and place the partial branch metric for the next computation.
The next mac instruction scales the imaginary part and subtracts it from the partial
branch metric to obtain the branch metric for encoder 01 outputs. The 11 branch metric is
saved in register x1. The negated path metric is placed in b, and the 11 branch metric is
move to y0.
The final subl instruction shift left and subtracts the negated path metric from the 01
branch to get the 10 branch metric in b. The 00 branch metric is copied to x0.
After all these moves, we find that register x1:x0 contains branches 11:00, register y1:y0
contains branches 00:11, and a1:b1 has branches 01:10. Note that we can access
accumulators as ab or ba to get branches 01:10 or 10:01. All we have to do now is to place
the branch metric pairs in memory to correspond with their states for the Viterbi
butterfly loop.
Unlike the case in
Section 3
, our branch metric generation allows us to use a simple
memory placement scheme. We begin with the memory location for state 0, and
increment through the states until the r2 address pointer rolls over to 0 after location 15
(r2 is set to be a modulo 16 address pointer). Because we have all possible needed branch
metric pairs in some register, we just code each move with the branch metric pair for that
state.
4.3
STARTING FROM 0: THE PRE ACS MACRO
For data that is coded in packets, it is not unusual for the encoder to start in a known
state. The easiest case is when the encoder begins each data packet starting in state zero.
This is what is assumed for this example. For this case, the beginning trellis is shown in
Figure 4-1
. Notably, the number of states to be updated changes from 2 to 4 to 8, etc.,
doubling until the full 32 states is reached. Until that time, we have fewer computations
to do because the number of states is smaller and because no comparison of competing
branches is needed until 32 states of the trellis are being used. We can reduce the
computation needed for a packet by treating the first five decoder inputs separately.
The code to do this appears in the PreACS macro in
Example 4-3
. The first two moves
save the state address pointers to be restored at the routines end. Next the branch
metrics are loaded into the accumulators b and a. Finally, the path metric/path pair is
loaded into x.
F
Freescale Semiconductor, Inc.
n
.