Lecture 7 - Programming Issues
Programming Issues
• Analysis of implementation 3
– Use same analysis techniques as implementation 2
– Total execution time for processing one image:
• 1.5 seconds
– Power consumption:
• 0.033 watt (same as 2)
– Energy consumption:
• 0.050 joule (1.5 s x 0.033 watt)
• Battery life 6x longer!!
– Total chip area:
• 90,000 gates
• 8,000 fewer gates (less memory needed for code)
DCT floating-point cost
• Floating-point cost
– DCT uses ~260 floating-point operations per pixel transformation
– 4096 (64 x 64) pixels per image
– About 1 million floating-point operations per image (4096 x 260)
– No floating-point support on the Intel 8051
• Compiler must emulate
– Generates procedures for each floating-point operation
» mult, add
– Each procedure uses tens of integer operations
– Thus, > 10 million integer operations per image
– Procedures increase code size
• Fixed-point arithmetic can improve on this
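The estimate above can be checked with a few lines of C. This is only a sketch: the image size and the ~260 operations per pixel come from the slide, while the function names and the parameter int_ops_per_float_op are hypothetical. The slide only says "tens" of integer operations per emulated floating-point operation, so the caller has to supply an assumed average.

```c
#include <assert.h>

/* Figures taken from the slide; FLOAT_OPS_PER_PIXEL is approximate. */
enum {
    IMAGE_WIDTH = 64,
    IMAGE_HEIGHT = 64,
    FLOAT_OPS_PER_PIXEL = 260
};

/* Floating-point operations needed to transform one image. */
long float_ops_per_image(void) {
    return (long)IMAGE_WIDTH * IMAGE_HEIGHT * FLOAT_OPS_PER_PIXEL;
}

/* Integer operations once each floating-point operation is emulated.
 * int_ops_per_float_op is an assumed average, not a measured value. */
long emulated_int_ops(long int_ops_per_float_op) {
    assert(int_ops_per_float_op > 0);
    return float_ops_per_image() * int_ops_per_float_op;
}
```

Even with the lower bound of 10 integer operations per emulated operation, emulated_int_ops(10) exceeds 10 million, which matches the slide's claim.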
Fixed-point arithmetic
• Integer used to represent a real number
– A fixed number of the integer's bits represents the fractional portion of the real number
• The more bits, the more accurate the representation
– The remaining bits represent the portion of the real number before the decimal point
• Translating a real constant to a fixed-point representation
– Multiply real value by 2 ^ (# of bits used for fractional part)
– Round to nearest integer
– E.g., represent 3.14 as 8-bit integer with 4 bits for fraction
• 2^4 = 16
• 3.14 x 16 = 50.24 ≈ 50 = 00110010
• 16 (2^4) possible values for fraction, each represents 0.0625 (1/16)
• Last 4 bits (0010) = 2
• 2 x 0.0625 = 0.125
• 3(0011) + 0.125 = 3.125 ≈ 3.14 (more bits for fraction would increase accuracy)
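The translation steps above (multiply by 2 ^ fraction bits, round to nearest) can be sketched in C as follows. The helper names to_fixed and to_real are hypothetical, and the sketch assumes non-negative values and a small number of fraction bits so the result fits in an int.

```c
#include <assert.h>

/* Convert a non-negative real value to fixed point:
 * scale by 2^frac_bits, then round to the nearest integer. */
int to_fixed(double value, unsigned frac_bits) {
    assert(value >= 0.0 && frac_bits < 15);
    return (int)(value * (double)(1 << frac_bits) + 0.5);
}

/* Convert a fixed-point value back to the real number it represents. */
double to_real(int fixed, unsigned frac_bits) {
    assert(frac_bits < 15);
    return (double)fixed / (double)(1 << frac_bits);
}
```

With 4 fraction bits, to_fixed(3.14, 4) gives 50 and to_real(50, 4) gives 3.125, the same values as in the worked example.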
Fixed-point arithmetic operations
• Addition
– Simply add integer representations
– E.g., 3.14 + 2.71 = 5.85
• 3.14 → 50 = 00110010
• 2.71 → 43 = 00101011
• 50 + 43 = 93 = 01011101
• 5(0101) + 13(1101) x 0.0625 = 5.8125 ≈ 5.85
• Multiply
– Multiply integer representations
– Shift result right by # of bits in fractional part
– E.g., 3.14 * 2.71 = 8.5094
• 50 * 43 = 2150 = 100001100110
• >> 4 = 10000110
• 8(1000) + 6(0110) x 0.0625 = 8.375 ≈ 8.5094
• The range of real values that can be used is limited by the bit width available for results (e.g., the product 2150 above needs 12 bits before the shift)
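Both operations can be sketched in C as below. The names fixed_add and fixed_mul and the FRAC_BITS macro are hypothetical; the sketch assumes non-negative operands and uses a wider type for the intermediate product, since that product carries twice as many fraction bits as either operand.

```c
#include <assert.h>

#define FRAC_BITS 4   /* same 4-bit fraction as the slide examples */

/* Addition: the integer representations are added directly. */
int fixed_add(int a, int b) {
    return a + b;
}

/* Multiplication: multiply the representations, then shift right
 * by FRAC_BITS to drop the extra fraction bits of the product. */
int fixed_mul(int a, int b) {
    long product;
    assert(a >= 0 && b >= 0);   /* sketch covers non-negative values only */
    product = (long)a * (long)b;
    return (int)(product >> FRAC_BITS);
}
```

Using the slide's values, fixed_add(50, 43) returns 93 (5.8125) and fixed_mul(50, 43) returns 134 (8.375).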
Fixed-point implementation of CODEC
• COS value used extensively, e.g., a = b * cos θ, where a and b are fractions

/* COS_TABLE and ONE_OVER_SQRT_TWO are defined elsewhere in the CODEC;
   the >> 6 shifts imply 6 fractional bits, so 64 represents 1.0 */
static unsigned char C(int h) { return h ? 64 : ONE_OVER_SQRT_TWO; }

static int F(int u, int v, short img[8][8]) {
    long s[8], r = 0;
    unsigned char x, j;
    for(x=0; x<8; x++) {
        s[x] = 0;
        for(j=0; j<8; j++)
            s[x] += (img[x][j] * COS_TABLE[j][v]) >> 6;
    }
    for(x=0; x<8; x++) r += (s[x] * COS_TABLE[x][u]) >> 6;
    return (short)((((r * (((16*C(u)) >> 6) * C(v)) >> 6)) >> 6) >> 6);
}
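The core step in F(), multiplying a value by a stored cosine and shifting right by 6, can be isolated in a small sketch. The function name scale_by_cos and the constant COS_45_DEG are hypothetical and are not taken from the CODEC's actual table; the sketch also restricts itself to non-negative values, whereas a real cosine table contains negative entries too.

```c
#include <assert.h>

#define COS_FRAC_BITS 6    /* matches the >> 6 shifts in F() */
#define COS_45_DEG    45   /* assumed entry: cos(45 deg) * 64 = 45.25, stored as 45 */

/* Compute a = b * cos(theta), with the cosine held as a fixed-point
 * value that has COS_FRAC_BITS fractional bits. */
long scale_by_cos(long b, long cos_fixed) {
    assert(b >= 0 && cos_fixed >= 0);   /* sketch covers non-negative values only */
    return (b * cos_fixed) >> COS_FRAC_BITS;
}
```

For b = 100, scale_by_cos(100, COS_45_DEG) yields 70, against an exact value of about 70.71. The difference comes from storing the cosine in only 6 fraction bits and from truncation in the shift.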
Pipelining