
Philips Semiconductors
Image Co-Processor
File: icp.fm5, modified 7/26/99
PRELIMINARY INFORMATION
3. SDRAM memory loaded to 95% of its bandwidth by
DCACHE traffic from the DSPCPU. Priority delay = 16, i.e.
the ICP waited 16 block times before competing for memory.
A load of 95% of the memory bandwidth is very rarely
found in a real system, so the results in these tables
may be useful for estimating upper bounds on the
computation time in a loaded system.
The priority delays were set to the minimum and maximum
possible values, so the computation time for other
priority delay values should lie somewhere in between.
A simple linear model of computation time has been fitted
to the tabular data and to corresponding measurements
with half the number of pixels per line.
It has been assumed that
    processing time = (time per line start) * (number of lines)
                    + (time per pixel) * (number of pixels)
The time per line start and the time per pixel in this
equation are tabulated for the three memory bandwidth cases.
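
The model equation above can be sketched in a few lines of Python.
This is only an illustration: the function name and parameter names
are hypothetical, the coefficients in the usage line are placeholders
and not values from this data book, and it is assumed that the number
of lines equals H and the number of pixels equals W * H.

```python
def estimate_processing_time_ms(width, height, ms_per_line_start, ms_per_pixel):
    """Evaluate the linear model: line-start cost plus per-pixel cost.

    Assumptions (not stated explicitly in the text):
      number of lines  = height (H)
      number of pixels = width * height (W * H)
    """
    number_of_lines = height
    number_of_pixels = width * height
    return ms_per_line_start * number_of_lines + ms_per_pixel * number_of_pixels


# Placeholder coefficients for illustration only, NOT data book values.
example_ms = estimate_processing_time_ms(
    720, 480, ms_per_line_start=0.002, ms_per_pixel=1.0e-5
)
print(f"estimated time: {example_ms:.3f} ms")
```

With real coefficients for the operation and memory load of interest,
the same call gives an estimate for any picture size inside the
fitted range.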
The maximum deviation between the measured time and the
fitted model is on the order of 10% in the range
W = 180 .. 1024, H = 240 .. 768. The deviation is much
less in most cases.
The values were found by a least-squares fit to the
measured data.
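
A least-squares fit of this kind might look like the following sketch.
It is not the procedure actually used for this data book: it fits only
the seven "horizontal filter, one component" values of Table 13-5,
without the half-width measurements mentioned above, so its
coefficients will differ from the published ones. The (W, H) pairing of
the seven columns, the function names, and the assumption that
lines = H and pixels = W * H are all illustrative.

```python
# (W in pixels, H in pixels, measured time in ms), taken from Table 13-5,
# row "horizontal filter, one component". The (W, H) pairing per column
# is an assumption of this sketch.
MEASURED = [
    (360, 240, 1.22),
    (640, 480, 3.82),
    (720, 480, 4.43),
    (720, 768, 7.08),
    (800, 480, 4.78),
    (800, 600, 5.98),
    (1024, 768, 9.27),
]


def fit_linear_model(samples):
    """Least-squares fit of t = a * lines + b * pixels (no constant term).

    Solves the 2x2 normal equations directly.
    Returns (a, b): ms per line start and ms per pixel.
    """
    s11 = s12 = s22 = s1t = s2t = 0.0
    for w, h, t in samples:
        lines = float(h)       # assumed: one line start per picture line
        pixels = float(w * h)  # assumed: total pixels in the picture
        s11 += lines * lines
        s12 += lines * pixels
        s22 += pixels * pixels
        s1t += lines * t
        s2t += pixels * t
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("samples do not determine both coefficients")
    a = (s1t * s22 - s12 * s2t) / det
    b = (s11 * s2t - s12 * s1t) / det
    return a, b


def max_relative_deviation(samples, a, b):
    """Largest |model - measured| / measured over all samples."""
    return max(abs(a * h + b * w * h - t) / t for w, h, t in samples)


a, b = fit_linear_model(MEASURED)
print(f"ms per line start: {a:.6f}, ms per pixel: {b:.3e}")
print(f"max relative deviation: {max_relative_deviation(MEASURED, a, b):.1%}")
```

The model has no constant term, matching the equation given above, so
the fit is a plain two-parameter linear regression through the origin.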
In some cases the cumulative time for line starts
contributed so little to the total computation time that
the value per line start could only be determined
relatively inaccurately. In other words, the pixel time
part then dominated the equation so much that the line
time part was negligible, given the inaccuracies of the
model.
Therefore the simple model is intended only for
interpolation to other picture sizes within the range
W = 180 .. 1024, H = 240 .. 768. Extrapolation to picture
sizes much outside this range should not be attempted.
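
Software that applies the model could guard against extrapolation with
a check along the following lines. The names are hypothetical, and the
check is deliberately strict: it rejects any size outside the stated
range, although the text only warns against sizes much outside it.

```python
# Range over which the model was fitted, as stated in the text.
W_RANGE = (180, 1024)
H_RANGE = (240, 768)


def in_fitted_range(width, height):
    """Return True if (width, height) lies inside the fitted range."""
    return (W_RANGE[0] <= width <= W_RANGE[1]
            and H_RANGE[0] <= height <= H_RANGE[1])


for size in [(720, 480), (1920, 1080)]:
    verdict = "interpolation" if in_fitted_range(*size) else "outside fitted range"
    print(size, verdict)
```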
In some cases the real ICP performance may be much
better than that predicted by the model, due to irregular
behavior of the ICP.
For horizontal and vertical up/down-scaling operations,
use the larger of the input and output values of W or H
with the H/V filter times table or model.
This will lead to an overestimation of the processing
time by up to 20%.
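
The rule for scaling operations can be written as a small helper. The
function name is hypothetical, and the sketch simply takes the larger
value in each dimension, which covers both horizontal and vertical
scaling.

```python
def filter_lookup_size(w_in, h_in, w_out, h_out):
    """Pick the (W, H) to use with the H/V filter table or model.

    For each dimension the larger of the input and output values is
    used, so the resulting estimate errs on the high side.
    """
    return max(w_in, w_out), max(h_in, h_out)


# Horizontal up-scaling from 360 to 720 pixels per line at 240 lines:
print(filter_lookup_size(360, 240, 720, 240))
# Vertical down-scaling from 768 to 480 lines at 720 pixels per line:
print(filter_lookup_size(720, 768, 720, 480))
```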
Table 13-5. Measured processing time in ms - no other load to SDRAM
W in pixels                                    360    640    720    720    800    800   1024
H in pixels                                    240    480    480    768    480    600    768
horizontal filter, one component              1.22   3.82   4.43   7.08   4.78   5.98   9.27
horizontal filter, 3 components YUV 4:2:2     2.68   8.18   9.29  14.86  10.08  12.60  19.35
vertical filter, one component                2.57   8.73  10.24  16.36  11.19  13.97  22.30
vertical filter, 3 components YUV 4:2:2       5.15  17.47  20.48  32.72  22.95  28.65  44.60
yuv to rgb8a, pci output                      3.36  10.74  11.93  19.08  13.04  16.30  26.02
yuv to rgb15a, pci output                     3.39  10.79  11.96  19.12  13.10  16.41  26.15
yuv to rgb24, pci output                      3.72  12.24  13.52  21.62  14.85  18.59  29.98
yuv to rgb24a, pci output                     4.34  14.52  16.04  25.02  17.58  21.63  35.01
yuv to rgb8a, sdram output                    3.39  10.78  11.95  19.09  13.13  16.40  26.08
yuv to rgb15a, sdram output                   3.46  11.04  12.26  19.60  13.46  16.82  26.87
yuv to rgb24, sdram output                    3.62  11.69  13.06  20.88  14.43  18.03  28.71
yuv to rgb24a, sdram output                   3.90  12.69  14.11  22.57  15.65  19.56  31.07
yuv to rgb8a, bitmask, pci output             3.37  11.42  12.49  19.97  13.61  17.01  27.83
yuv to rgb8a, rgb15a overlay, pci output      3.67  11.72  12.92  20.67  14.23  17.79  28.23
yuv to rgb8a, rgb24a overlay, pci output      4.23  13.57  15.32  24.51  16.93  21.15  33.15
yuv to rgb8a, yuv422a overlay, pci output     3.67  11.72  12.92  20.67  14.23  17.79  28.23
yuv to rgb8a, 422sequencing, pci output       2.52   7.77   8.57  13.70   9.32  11.65  18.40