
Optimization
As you benchmark your programs, keep in mind that optimization can greatly
improve a program's performance, but it can also inflate code size, cause
erratic behavior, and even make the program run slower! Take careful note of
the compiler options you use (-O1, -G, -O2, etc.), as they determine the
optimizations the compiler will perform on your code.
Below are some simple explanations of each major optimization.
Be sure you know the effect of each before you
call any timing optimal.
Conventional wisdom says there are three components to generating
good code on the 80x86 processors: register allocation, register
allocation, and register allocation.
Global register allocation
Because memory references are so expensive on these processors, it is
extremely important to minimize those references through the intelligent
use of registers. Global register allocation both increases the speed and
decreases the size of your application. You should always use global
register allocation when compiling your application with optimizations on.
Dead-code elimination
Although you may never intentionally write code that does unnecessary
work, the optimizer may find opportunities to eliminate stores into
variables whose values are never used.
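
As a minimal sketch of the kind of store the optimizer can remove, consider the two hypothetical functions below (the names are illustrative only and do not come from the compiler). In the first, the initial store into temp is overwritten before it is ever read; the second shows roughly what is left once that store is eliminated. Both return the same result.

```c
/* Hypothetical example: the first store into temp is dead because
   temp is overwritten before its value is ever read. */
int scale_before(int a, int b)
{
    int temp;

    temp = a * b;      /* dead store: this value is never used */
    temp = a + b;      /* this is the value that is actually read */
    return temp * 2;
}

/* Roughly what remains after dead-code elimination:
   the unused store (and its multiply) is gone. */
int scale_after(int a, int b)
{
    int temp;

    temp = a + b;
    return temp * 2;
}
```

Code like scale_before rarely appears as written; it more often arises after other optimizations or macro expansion have already simplified the surrounding code.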
Common subexpression elimination
Common subexpression elimination is the process of finding duplicate
expressions within the target scope and storing the calculated value of
those expressions once so as to avoid recalculating the expression.
Although in theory this optimization could reduce code size, in practice
it is a speed optimization and will only rarely result in size reductions.
You should also use global common subexpression analysis if you like to
reuse expressions rather than create explicit stack locations for them.
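
To illustrate, here is a small sketch using hypothetical functions (the names are assumptions made for this example). The first version repeats the expression x * y; the second is what you would write by hand with an explicit temporary, and is roughly the effect the optimizer aims for when it stores the calculated value once.

```c
/* Hypothetical example: x * y appears twice within the same scope. */
int sum_before(int x, int y, int a, int b)
{
    int first;
    int second;

    first  = x * y + a;   /* x * y calculated here ...        */
    second = x * y + b;   /* ... and written again here       */
    return first + second;
}

/* Hand-written equivalent: the common subexpression is
   calculated once and the stored value is reused. */
int sum_after(int x, int y, int a, int b)
{
    int xy;
    int first;
    int second;

    xy     = x * y;       /* calculated once */
    first  = xy + a;
    second = xy + b;
    return first + second;
}
```

With the optimization enabled, you can keep the more readable first form and let the compiler introduce the temporary.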
Loop invariant code motion
Moving invariant code out of loops is a speed optimization. The
optimizer uses the information about all the expressions in the function
gathered during common subexpression elimination to find expressions
whose values do not change inside a loop. To prevent the calculation
from being done many times inside the loop, the optimizer moves the
code outside the loop so that it is calculated only once. The optimizer
then reuses the calculated value inside the loop.
You should use loop invariant code motion whenever you are compiling
for speed and you have used global common subexpressions, since
moving code out of loops can result in enormous speed gains.
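
The transformation described above can be sketched by hand as follows. The function names are hypothetical and chosen only for this example. In the first version, width * height does not change inside the loop, yet it is written as if it were calculated on every iteration; the second version shows the calculation moved outside the loop and its value reused.

```c
/* Hypothetical example: width * height is invariant inside the loop. */
void fill_before(int *dst, int n, int width, int height)
{
    int i;

    for (i = 0; i < n; i++)
        dst[i] = i + width * height;   /* same product every iteration */
}

/* Hand-written equivalent of what code motion aims for:
   the invariant product is calculated once, before the loop. */
void fill_after(int *dst, int n, int width, int height)
{
    int i;
    int area;

    area = width * height;             /* hoisted out of the loop */
    for (i = 0; i < n; i++)
        dst[i] = i + area;
}
```

Both functions store the same values into dst; only the number of multiplications differs, which is why the gain grows with the number of iterations.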