
Computer Architecture - Mid - Solution

Uploaded by

sishahed420
MID-TERM QUESTION SOLUTIONS

COMPUTER
ARCHITECTURE
CSE 3313

SOLUTION BY
MD. NURUL ALAM ADOR

UPDATED TILL SPRING 2024

nurulalamador.github.io/UIUQuestionBank
Index

Spring 2024
Fall 2023
Summer 2023
Spring 2023

Spring 2024

1. Consider the following MIPS code:

    block:
        add $sp, $sp,-8
        sw $ra, $zero($sp)
        sw $s0, 4($sp)

        addi $v0, 0, 0
        addi $s0, $zero, 1

    loop:
        slti $t0, $a0, s0
        bne $t0, $zero, end_loop
        add $v0, $v0, $s0
        addi $s0, $s0, 1
        jump loop

    end_loop:
        lw $ra, 4($sp)
        lw $s0, 8($sp)
        addi $sp, $sp, 12
        jr $ra

Table 1: CPI for each instruction for program 1a

    Instruction   CPI
    add           2
    addi          3
    sub           2
    lw            4
    sw            4
    sll           1
    srl           1
    slt           4
    slti          8
    beq           4
    bne           4
    j             8
    jr            4
    jal           4

[N.B. You should assume the value of $a0 = 3]

1. a) Find the errors and fix them in the above code.

Solution:
The code has been rewritten below with error correction:

    block:
        addi $sp, $sp, -8          # addi must be used when an operand is a constant
        sw   $ra, 0($sp)           # the offset of sw must be a constant
        sw   $s0, 4($sp)

        addi $v0, $zero, 0         # only the last operand of addi may be a constant
        addi $s0, $zero, 1

    loop:
        slt  $t0, $a0, $s0         # the last operand of slti cannot be a register, so slt is used
        bne  $t0, $zero, end_loop
        add  $v0, $v0, $s0
        addi $s0, $s0, 1
        j    loop                  # jump is not a valid instruction; the mnemonic is j

    end_loop:
        lw   $ra, 0($sp)           # the lw offsets must match the sw offsets
        lw   $s0, 4($sp)
        addi $sp, $sp, 8           # only 8 bytes were allocated on the stack
        jr   $ra

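1. b) Using the error-free code from 1a, write the values of the $v0 and $s0 registers after
executing the program.
Solution:
With $a0 = 3, the loop adds 1, 2 and 3 to $v0 and exits when $s0 reaches 4. The values of the
registers at that point are:
$v0 = 6
$s0 = 4
(The final lw $s0, 4($sp) then restores the value $s0 held on entry, so 4 is the value of $s0 at loop exit.)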
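
The register values above can be checked with a small trace. The following is a minimal Python sketch, not part of the original solution; the function name `run_program_1a` is hypothetical, and each Python line is annotated with the corrected MIPS instruction it is assumed to mirror.

```python
def run_program_1a(a0):
    """Mirror the corrected loop of program 1a.

    Returns ($v0, $s0) as they stand when the loop exits,
    before the epilogue restores the saved $s0.
    """
    v0 = 0                          # addi $v0, $zero, 0
    s0 = 1                          # addi $s0, $zero, 1
    while True:
        t0 = 1 if a0 < s0 else 0    # slt  $t0, $a0, $s0
        if t0 != 0:                 # bne  $t0, $zero, end_loop
            break
        v0 += s0                    # add  $v0, $v0, $s0
        s0 += 1                     # addi $s0, $s0, 1
    return v0, s0


if __name__ == "__main__":
    print(run_program_1a(3))
```

Calling `run_program_1a(3)` walks the same three iterations as the hand trace.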

1. c) Consider a CPU with a 4 GHz clock rate and the CPI values in Table 1. Calculate
the CPU time for executing program 1a.
Solution:
Given,
CPU Clock Rate = 4 GHz = $4 \times 10^{9}$ Hz

Here,

    Instruction              addi  sw  slt  bne  add  j  lw  jr
    Instruction Count (IC)   7     2   4    4    3    3  2   1
    CPI                      3     4   4    4    2    8  4   4

Now,
$$\text{Clock Cycles} = \sum (IC \times CPI) = 7 \times 3 + 2 \times 4 + 4 \times 4 + 4 \times 4 + 3 \times 2 + 3 \times 8 + 2 \times 4 + 1 \times 4 = 103$$

We know,
$$\text{CPU Time} = \text{Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{Clock Cycles}}{\text{Clock Rate}} = \frac{103}{4 \times 10^{9}} = 2.575 \times 10^{-8}\ \text{s}$$
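
The cycle count and CPU time can be reproduced mechanically. This is a small Python sketch under the assumption that the instruction counts in the table above are correct; the names `MIX`, `total_cycles` and `cpu_time` are illustrative only.

```python
CLOCK_RATE_HZ = 4e9  # 4 GHz, as given in the question

# instruction: (dynamic instruction count, CPI from Table 1)
MIX = {
    "addi": (7, 3),
    "sw":   (2, 4),
    "slt":  (4, 4),
    "bne":  (4, 4),
    "add":  (3, 2),
    "j":    (3, 8),
    "lw":   (2, 4),
    "jr":   (1, 4),
}


def total_cycles(mix):
    """Sum IC x CPI over every instruction class."""
    return sum(count * cpi for count, cpi in mix.values())


def cpu_time(mix, clock_rate_hz):
    """CPU time in seconds = clock cycles / clock rate."""
    return total_cycles(mix) / clock_rate_hz


if __name__ == "__main__":
    print(total_cycles(MIX), cpu_time(MIX, CLOCK_RATE_HZ))
```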

1. d) Consider program 1a running on another computer that requires 160ns, with 40ns
spent executing FP instructions, 90ns executed L/S instructions, and 30ns spent
executing branch instructions. What is the improvement factor using Amdahl’s
law if we only improve the performance of L/S instructions using a better ALU to
get the program completion time improved by 2x?
Solution:
Here,
$T_{\text{affected}} = 90$ ns [L/S instructions time]
$T_{\text{unaffected}} = (160 - 90)$ ns $= 70$ ns
$T_{\text{improved}} = \frac{160}{2}$ ns $= 80$ ns
Improvement factor, $n$ = ?

We know,
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{n} + T_{\text{unaffected}}$$
or, $80 = \frac{90}{n} + 70$
or, $\frac{90}{n} = 10$
or, $n = \frac{90}{10}$
∴ $n = 9$
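
The same rearrangement of Amdahl's law applies to every improvement-factor question in this document. A minimal Python sketch follows; the function name `improvement_factor` and its parameters are assumptions made for illustration, and times may be in any unit as long as they are consistent.

```python
def improvement_factor(t_total, t_affected, overall_speedup):
    """Solve T_improved = T_affected / n + T_unaffected for n.

    t_total         -- original execution time
    t_affected      -- part of t_total that the enhancement speeds up
    overall_speedup -- required ratio t_total / T_improved
    """
    t_unaffected = t_total - t_affected
    t_improved = t_total / overall_speedup
    remaining = t_improved - t_unaffected
    if remaining <= 0:
        # The unaffected part alone already exceeds the target time.
        raise ValueError("speedup not reachable by improving this part alone")
    return t_affected / remaining


if __name__ == "__main__":
    print(improvement_factor(160, 90, 2))
```

The `ValueError` branch reflects the limit in Amdahl's law: no factor `n` helps once the unaffected time exceeds the target time.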

2. Consider the following C function. Compiler will assign a (base address) into $s0, b
(base address) into $s1, x into $s2 and i into $s3.

    int main() {
        int x=1,i=32,a[10],b[10];
        do{
            if(a[2*i]==b[2*(i+1)]){
                x += a[2*i]- b[2*(i+1)];
                break;
            }
            else if(a[2*i] & b[2*(i+1)]){
                i = i+x;
                continue;
            }
            else{
                i = i- 3;
            }
            i = i+1;
        }while(i%4);
        return 0;
    }

[N.B. You should assume the value of $a0 = 3]

2. a) Convert the code to the corresponding MIPS assembly instructions.

Solution:
The code has been converted into MIPS assembly instructions:

    main:
        addi $sp, $sp, -16
        sw   $s0, 12($sp)
        sw   $s1, 8($sp)
        sw   $s2, 4($sp)
        sw   $s3, 0($sp)
        addi $s2, $zero, 1         # x = 1
        addi $s3, $zero, 32        # i = 32

    loop:
        sll  $t0, $s3, 3           # byte offset of a[2*i] = 2*i*4 = 8*i
        add  $t0, $s0, $t0
        lw   $t1, 0($t0)           # $t1 = a[2*i]
        addi $t0, $s3, 1
        sll  $t0, $t0, 3           # byte offset of b[2*(i+1)] = 8*(i+1)
        add  $t0, $s1, $t0
        lw   $t2, 0($t0)           # $t2 = b[2*(i+1)]

    if:
        bne  $t1, $t2, else_if
        sub  $t0, $t1, $t2
        add  $s2, $s2, $t0
        j    end                   # break

    else_if:
        and  $t0, $t1, $t2
        beq  $t0, $zero, else
        add  $s3, $s3, $s2
        j    cond                  # continue in a do-while goes to the condition test

    else:
        addi $s3, $s3, -3
        addi $s3, $s3, 1           # i = i+1

    cond:
        andi $t0, $s3, 3           # i%4 is nonzero exactly when the low two bits of i are nonzero
        bne  $t0, $zero, loop

    end:
        lw   $s3, 0($sp)
        lw   $s2, 4($sp)
        lw   $s1, 8($sp)
        lw   $s0, 12($sp)
        addi $sp, $sp, 16
        jr   $ra

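2. b) Convert the first 8 lines of MIPS assembly instructions to the corresponding
machine code after entering the loop body. No need to convert it to binary.
Solution:
The instructions at the start of the loop body have been converted into machine code fields (decimal values; X marks an unused field). Each a[2*i] or b[2*(i+1)] access uses a shift amount of 3 because the byte offset is 8 times the loop variable.

    1.  sll $t0, $s3, 3
        op  rs  rt  rd  shamt  funct
        0   X   19  8   3      0

    2.  add $t0, $s0, $t0
        op  rs  rt  rd  shamt  funct
        0   16  8   8   X      32

    3.  lw $t1, 0($t0)
        op  rs  rt  C/A
        35  8   9   0

    4.  addi $t0, $s3, 1
        op  rs  rt  C/A
        8   19  8   1

    5.  sll $t0, $t0, 3
        op  rs  rt  rd  shamt  funct
        0   X   8   8   3      0

    6.  add $t0, $s1, $t0
        op  rs  rt  rd  shamt  funct
        0   17  8   8   X      32

    7.  lw $t2, 0($t0)
        op  rs  rt  C/A
        35  8   10  0

    8.  bne $t1, $t2, else_if
        op  rs  rt  C/A
        5   9   10  3
        (A branch stores a PC-relative word offset. else_if is 3 instructions after the instruction that follows bne, so C/A = 3.)

    9.  sub $t0, $t1, $t2
        op  rs  rt  rd  shamt  funct
        0   9   10  8   X      34

    10. add $s2, $s2, $t0
        op  rs  rt  rd  shamt  funct
        0   18  8   18  X      32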
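
The field tables above map directly onto 32-bit words. As a hedged illustration, the Python sketch below packs the standard MIPS R-type and I-type fields; the helper names `encode_r` and `encode_i` are hypothetical, and unused fields (marked X in the tables) are assumed to be encoded as 0.

```python
def encode_r(rs, rt, rd, shamt, funct, op=0):
    """Pack R-type fields: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """Pack I-type fields: op(6) rs(5) rt(5) imm(16, two's complement)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


if __name__ == "__main__":
    # add $t0, $s0, $t0  -> rs=16, rt=8, rd=8, funct=32
    print(hex(encode_r(16, 8, 8, 0, 32)))
    # lw $t1, 0($t0)     -> op=35, rs=8, rt=9, offset=0
    print(hex(encode_i(35, 8, 9, 0)))
```

Masking the immediate with `0xFFFF` is what lets a negative constant such as -8 fit into the 16-bit field.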

2. c) Assume that the processor has 64 registers. The size of MIPS instruction is 32 bits
and 6 bits are reserved for opcode. The structure for addi instruction is given in the
table-2. Find out the maximum constant value for addi instruction in MIPS. Note
that, MIPS supports negative constants.

opcode rs rt constant

Table 2: Structure of addi instruction


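Solution:
With 64 registers, each register field needs $\log_2 64 = 6$ bits. The structure of the addi instruction therefore becomes:

    opcode   rs       rt       constant
    6 bits   6 bits   6 bits   14 bits

This leaves $32 - 6 - 6 - 6 = 14$ bits for the constant. Since MIPS supports negative constants, the constant is stored in two's complement, so 1 bit acts as the sign and the remaining 13 bits hold the magnitude of the largest positive value.

∴ Maximum constant value $= 2^{13} - 1 = 8191$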
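
How the register count changes the constant range can be worked out in a few lines. This Python sketch is an illustration only: `immediate_range` is a hypothetical helper, and it assumes a 32-bit instruction with a 6-bit opcode, two register fields and a two's complement constant. With the usual 32 registers the constant field is 16 bits (maximum 32767); with the 64 registers stated in the question it shrinks to 14 bits.

```python
def immediate_range(num_registers, instr_bits=32, opcode_bits=6, reg_fields=2):
    """Return (constant_bits, minimum, maximum) for a signed constant field."""
    # Bits needed to name one register, e.g. 64 registers -> 6 bits.
    reg_bits = (num_registers - 1).bit_length()
    const_bits = instr_bits - opcode_bits - reg_fields * reg_bits
    lowest = -(1 << (const_bits - 1))
    highest = (1 << (const_bits - 1)) - 1
    return const_bits, lowest, highest


if __name__ == "__main__":
    print(immediate_range(32))
    print(immediate_range(64))
```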

3. Using the division algorithm, show each step of the division of 15 by 4.

Solution:
15 ÷ 4
Dividend (DN) = 1111
Divisor (DR) = 0100

M = 0100
-M = two's complement of M = 1011 + 1 = 1100

Initially:  A, Q: 0000 1111

Step 1:     A, Q: 0001 111_    shift A, Q left
            A, Q: 1101 1110    A = A-M = 0001+1100 = 1101 (negative, so Q0 = 0)
            A, Q: 0001 1110    restore A

Step 2:     A, Q: 0011 110_    shift A, Q left
            A, Q: 1111 1100    A = A-M = 0011+1100 = 1111 (negative, so Q0 = 0)
            A, Q: 0011 1100    restore A

Step 3:     A, Q: 0111 100_    shift A, Q left
            A, Q: 0011 1001    A = A-M = 0111+1100 = 0011 (non-negative, so Q0 = 1)

Step 4:     A, Q: 0111 001_    shift A, Q left
            A, Q: 0011 0011    A = A-M = 0111+1100 = 0011 (non-negative, so Q0 = 1)

∴ Remainder, A = 0011 (3)

∴ Quotient, Q = 0011 (3)
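
The steps above follow the restoring division scheme: shift A and Q left, try A - M, and keep the result only if it is non-negative. The Python sketch below is one way to replay it; `restoring_divide` is a hypothetical name, and it assumes unsigned operands and compares A with M directly instead of adding the two's complement of M, which gives the same decision.

```python
def restoring_divide(dividend, divisor, bits=4):
    """Unsigned restoring division on `bits`-wide registers.

    Returns (quotient, remainder, steps), where steps holds the
    (A, Q) bit strings at the end of each iteration.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be nonzero")
    mask = (1 << bits) - 1
    a, q, m = 0, dividend & mask, divisor & mask
    steps = []
    for _ in range(bits):
        # Shift the pair A,Q left by one: the top bit of Q moves into A.
        a = (a << 1) | (q >> (bits - 1))
        q = (q << 1) & mask
        if a >= m:
            a -= m          # A - M is non-negative: keep it, set Q0 = 1
            q |= 1
        # Otherwise A is left unchanged (the "restore") and Q0 stays 0.
        steps.append((format(a, f"0{bits}b"), format(q, f"0{bits}b")))
    return q, a, steps


if __name__ == "__main__":
    print(restoring_divide(15, 4))
```

The same function replays the other division questions in this document, for example `restoring_divide(13, 5)` and `restoring_divide(11, 6)`.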

Fall 2023

1. a) Consider three different processors P1, P2, and P3 executing the same instruction
set architecture (ISA). P1 has a 3 GHz clock rate and a CPI of 1.5. P2 has a 2.5 GHz
clock rate and a CPI of 1.0. P3 has a 4.0 GHz clock rate and a CPI of 2.2.
i) Find out which processor performs better by calculating CPU time.
ii) If each processor executes a program in 10 seconds, find the number of clock
cycles and the number of instructions.
Solution:
Here,
CPI P1 = 1.5, Clock Rate P1 = 3 GHz = $3 \times 10^{9}$ Hz
CPI P2 = 1, Clock Rate P2 = 2.5 GHz = $2.5 \times 10^{9}$ Hz
CPI P3 = 2.2, Clock Rate P3 = 4 GHz = $4 \times 10^{9}$ Hz
IC P1 = IC P2 = IC P3 = $I$

i) For processor P1,
$$\text{CPU Time}_{P1} = \frac{IC_{P1} \times CPI_{P1}}{\text{Clock Rate}_{P1}} = \frac{I \times 1.5}{3 \times 10^{9}} = I \times 5 \times 10^{-10}\ \text{s}$$

For processor P2,
$$\text{CPU Time}_{P2} = \frac{IC_{P2} \times CPI_{P2}}{\text{Clock Rate}_{P2}} = \frac{I \times 1}{2.5 \times 10^{9}} = I \times 4 \times 10^{-10}\ \text{s}$$

For processor P3,
$$\text{CPU Time}_{P3} = \frac{IC_{P3} \times CPI_{P3}}{\text{Clock Rate}_{P3}} = \frac{I \times 2.2}{4 \times 10^{9}} = I \times 5.5 \times 10^{-10}\ \text{s}$$

Since CPU Time P3 > CPU Time P1 > CPU Time P2, P2 needs the least time for the same program.
∴ Processor P2 performs best.

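ii) Here, CPU Time = 10 s

For processor P1,
$$\text{Clock Cycles}_{P1} = \text{CPU Time} \times \text{Clock Rate}_{P1} = 10 \times 3 \times 10^{9} = 30 \times 10^{9}$$
$$IC_{P1} = \frac{\text{Clock Cycles}_{P1}}{CPI_{P1}} = \frac{30 \times 10^{9}}{1.5} = 2 \times 10^{10}$$

For processor P2,
$$\text{Clock Cycles}_{P2} = \text{CPU Time} \times \text{Clock Rate}_{P2} = 10 \times 2.5 \times 10^{9} = 25 \times 10^{9}$$
$$IC_{P2} = \frac{\text{Clock Cycles}_{P2}}{CPI_{P2}} = \frac{25 \times 10^{9}}{1} = 2.5 \times 10^{10}$$

For processor P3,
$$\text{Clock Cycles}_{P3} = \text{CPU Time} \times \text{Clock Rate}_{P3} = 10 \times 4 \times 10^{9} = 40 \times 10^{9}$$
$$IC_{P3} = \frac{\text{Clock Cycles}_{P3}}{CPI_{P3}} = \frac{40 \times 10^{9}}{2.2} \approx 1.8 \times 10^{10}$$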
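
Both parts of this question reduce to two formulas, which the following Python sketch applies to all three processors. The dictionary `PROCESSORS` and the helper names are assumptions for illustration; the clock rates and CPI values are the ones given in the question.

```python
# name: (clock rate in Hz, CPI)
PROCESSORS = {
    "P1": (3.0e9, 1.5),
    "P2": (2.5e9, 1.0),
    "P3": (4.0e9, 2.2),
}


def time_per_instruction(clock_rate_hz, cpi):
    """Average seconds per instruction = CPI / clock rate."""
    return cpi / clock_rate_hz


def cycles_and_instructions(clock_rate_hz, cpi, seconds):
    """Clock cycles and instruction count for a run of `seconds`."""
    cycles = seconds * clock_rate_hz
    return cycles, cycles / cpi


# For the same instruction count, the smallest time per instruction wins.
fastest = min(PROCESSORS, key=lambda name: time_per_instruction(*PROCESSORS[name]))

if __name__ == "__main__":
    print("fastest:", fastest)
    for name, (rate, cpi) in PROCESSORS.items():
        print(name, cycles_and_instructions(rate, cpi, 10))
```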

1. b) Consider a program running on a computer that requires 280ns, with 80ns spent
executing FP instructions, 180ns executed L/S instructions, and 20ns spent
executing branch instructions.
i) By how much is the total time reduced if the time for FP operations is reduced
by 20?
ii) What is the improvement factor using Amdahl’s law if we only improve the
performance of L/S instructions using a better ALU to get the program
completion time improved by 2x?
Solution:
i) Here,
Previous total time = 280 𝑛𝑠
L/S instruction execution time = 180 𝑛𝑠
Branch instruction execution time = 20 𝑛𝑠

After reducing by 20 𝑛𝑠,


New FP instruction execution time = 80 − 20 𝑛𝑠
= 60 𝑛𝑠

∴ New total time = 180 + 20 + 60 𝑛𝑠


= 260 𝑛𝑠

∴ Total time reduced = 280 − 260 𝑛𝑠


= 20 𝑛𝑠

ii) Here,
$T_{\text{affected}} = 180$ ns [L/S instructions time]
$T_{\text{unaffected}} = (280 - 180)$ ns $= 100$ ns
$T_{\text{improved}} = \frac{280}{2}$ ns $= 140$ ns
Improvement factor, $n$ = ?

We know,
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{n} + T_{\text{unaffected}}$$
or, $140 = \frac{180}{n} + 100$
or, $\frac{180}{n} = 40$
or, $n = \frac{180}{40}$
∴ $n = 4.5$

2. Consider the following C function. Assume necessary registers.

    int check_func(int a[],int x){              /* starts at address 2000 */
        if(x%2){
            a[x] = a[x]-x;
            return a[x];
        }
    }
    int get_sum(int a[],int b[],int k){         /* starts at address 3000 */
        int sum = 0;
        for(int i=0;i<k;i++){
            if(check_func(a[i+1],i)){
                b[i] = sum + (a[i]-b[i])/2;
                sum = b[i];
            }
        }
        return sum;
    }
    int main() {
        int res=0,n=10;                         /* PC -> 5000 */
        int a[n],b[n];
        res = get_sum(a,b,n);
        res = res*9;
        return 0;
    }

2. a) Convert the code to the corresponding MIPS assembly instructions.

Solution:
The code has been converted into MIPS assembly instructions:

Assuming, in main: res in $s0, n in $s1, a in $s2, b in $s3.
Assuming, in get_sum: sum in $s0, i in $s1.
The address after a branch or jump is the address of its target label.

    5000: addi $s0, $zero, 0
    5004: addi $s1, $zero, 10
    5008: add  $a0, $s2, $zero
    5012: add  $a1, $s3, $zero
    5016: add  $a2, $s1, $zero
    5020: jal  get_sum                  3000
    5024: add  $s0, $v0, $zero
    5028: sll  $t0, $s0, 3
    5032: add  $s0, $t0, $s0
    5036: addi $v0, $zero, 0
    .
    .
    get_sum:
    3000: addi $sp, $sp, -8
    3004: sw   $s0, 4($sp)
    3008: sw   $s1, 0($sp)
    3012: addi $s0, $zero, 0
    3016: addi $s1, $zero, 0
    loop:
    3020: slt  $t0, $s1, $a2
    3024: beq  $t0, $zero, exit_loop    3144
    3028: addi $t0, $s1, 1
    3032: sll  $t0, $t0, 2
    3036: add  $t0, $a0, $t0
    3040: lw   $t1, 0($t0)
    3044: addi $sp, $sp, -12
    3048: sw   $a0, 8($sp)
    3052: sw   $a1, 4($sp)
    3056: sw   $ra, 0($sp)
    3060: add  $a0, $t1, $zero
    3064: add  $a1, $s1, $zero
    3068: jal  check_func               2000
    3072: lw   $ra, 0($sp)
    3076: lw   $a1, 4($sp)
    3080: lw   $a0, 8($sp)
    3084: addi $sp, $sp, 12
    3088: beq  $v0, $zero, inc_i        3136
    3092: sll  $t0, $s1, 2
    3096: add  $t1, $a0, $t0
    3100: add  $t2, $a1, $t0
    3104: lw   $t3, 0($t1)
    3108: lw   $t4, 0($t2)
    3112: sub  $t0, $t3, $t4
    3116: srl  $t0, $t0, 1
    3120: add  $t0, $s0, $t0
    3124: sw   $t0, 0($t2)
    3128: lw   $t0, 0($t2)
    3132: add  $s0, $t0, $zero
    inc_i:
    3136: addi $s1, $s1, 1
    3140: j    loop                     3020
    exit_loop:
    3144: add  $v0, $s0, $zero
    3148: lw   $s1, 0($sp)
    3152: lw   $s0, 4($sp)
    3156: addi $sp, $sp, 8
    3160: jr   $ra
    .
    .
    check_func:
    2000: andi $t0, $a1, 1
    2004: beq  $t0, $zero, exit_check   2036
    2008: sll  $t0, $a1, 2
    2012: add  $t1, $a0, $t0
    2016: lw   $t2, 0($t1)
    2020: sub  $t0, $t2, $a1
    2024: sw   $t0, 0($t1)
    2028: lw   $t0, 0($t1)
    2032: add  $v0, $t0, $zero
    exit_check:
    2036: jr   $ra

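2. b) Convert the first 12 lines of get_sum() function's assembly instructions to the
corresponding machine code. No need to convert it to binary.

Solution:
The first 12 instructions of get_sum have been converted into machine code fields (decimal values; X marks an unused field):

    1.  addi $sp, $sp, -8
        op  rs  rt  C/A
        8   29  29  -8

    2.  sw $s0, 4($sp)
        op  rs  rt  C/A
        43  29  16  4

    3.  sw $s1, 0($sp)
        op  rs  rt  C/A
        43  29  17  0

    4.  addi $s0, $zero, 0
        op  rs  rt  C/A
        8   0   16  0

    5.  addi $s1, $zero, 0
        op  rs  rt  C/A
        8   0   17  0

    6.  slt $t0, $s1, $a2
        op  rs  rt  rd  shamt  funct
        0   17  6   8   X      42

    7.  beq $t0, $zero, exit_loop
        op  rs  rt  C/A
        4   8   0   29
        (A branch stores a PC-relative word offset: (3144 - (3024 + 4)) / 4 = 29.)

    8.  addi $t0, $s1, 1
        op  rs  rt  C/A
        8   17  8   1

    9.  sll $t0, $t0, 2
        op  rs  rt  rd  shamt  funct
        0   X   8   8   2      0

    10. add $t0, $a0, $t0
        op  rs  rt  rd  shamt  funct
        0   4   8   8   X      32

    11. lw $t1, 0($t0)
        op  rs  rt  C/A
        35  8   9   0

    12. addi $sp, $sp, -12
        op  rs  rt  C/A
        8   29  29  -12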
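
Branch and jump instructions fill their address fields differently, which is easy to mix up when converting by hand. The Python sketch below shows the standard MIPS rules under the assumption that the byte addresses written next to the assembly are the real instruction addresses; `branch_offset` and `jump_field` are hypothetical helper names.

```python
def branch_offset(branch_addr, target_addr):
    """16-bit field of beq/bne: word offset relative to PC + 4."""
    diff = target_addr - (branch_addr + 4)
    if diff % 4 != 0:
        raise ValueError("addresses must be word aligned")
    return diff // 4


def jump_field(target_addr):
    """26-bit field of j/jal: target byte address divided by 4 (low 26 bits)."""
    return (target_addr >> 2) & 0x3FFFFFF


if __name__ == "__main__":
    # beq at 3024 branching to exit_loop at 3144
    print(branch_offset(3024, 3144))
    # jal to a function starting at address 2000
    print(jump_field(2000))
```

A backward branch simply produces a negative offset, which is stored in two's complement in the 16-bit field.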

2. c) Monica claims that the sll $t0, $s1, 40 instruction is correct in the MIPS
architecture, but Joey disagrees with Monica’s claim. Justify your opinion with a
proper explanation.
Solution:
We know,
Structure of the sll instruction,

    opcode   rs       rt       rd       shamt    funct
    6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

Here,
Maximum value for shamt $= 2^{5} - 1 = 31$

Given instruction:
sll $t0, $s1, 40

Since the shamt (shift amount) of the given instruction is greater than the maximum value of
shamt (40 > 31), it cannot be encoded, so this is not a valid MIPS instruction.
∴ Joey is correct.

3. Using the optimized multiplication algorithm, show each step of the multiplication of
12 by 3.
Solution:

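12 × 3
Multiplicand (MN) = 1100
Multiplier (MR) = 0011

Initially:  MN: 1100
            P:  0000 0011

Step 1:     P: 1100 0011    PLS = PLS + MN = 0000+1100 = 1100
            P: 0110 0001    shift P right

Step 2:     P: 0010 0001    PLS = PLS + MN = 0110+1100 = 0010 (Carry 1)
            P: 1001 0000    shift P right, with the carry entering from the left

Step 3:     P: 0100 1000    shift P right

Step 4:     P: 0010 0100    shift P right

Here PLS is the left half of P.

∴ Product, P = 0010 0100 (36)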
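
The four steps above can be replayed in code. This is a minimal Python sketch of the optimized scheme as it is used in this solution (multiplier kept in the right half of P, add to the left half when the lowest bit is 1, then shift right); the name `optimized_multiply` is hypothetical and unsigned operands are assumed.

```python
def optimized_multiply(multiplicand, multiplier, bits=4):
    """Shift-add multiplication with a single 2*bits product register.

    Returns (product, steps), where steps holds P as a bit string
    after the shift at the end of each iteration.
    """
    mask = (1 << bits) - 1
    p = multiplier & mask            # right half = multiplier, left half = 0
    steps = []
    for _ in range(bits):
        if p & 1:
            # Add the multiplicand to the left half; a carry lands in
            # bit 2*bits and is brought back in by the shift below.
            p += (multiplicand & mask) << bits
        p >>= 1
        steps.append(format(p, f"0{2 * bits}b"))
    return p, steps


if __name__ == "__main__":
    print(optimized_multiply(12, 3))
```

Because Python integers are unbounded, the carry out of the left half needs no separate register here; in hardware it would be held in an extra carry bit.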

Summer 2023

1. You are the most renowned computer architect in the world. Your friend asks you to
find the answer to the following question for the given scenario: Consider computer A
with a CPU speed of 2 GHz. The program P has the number of instructions and CPI
shown in Table 1.

    Instruction   IC   CPI
    addi          3    2
    add           2    1
    beq           1    4
    bne           0    8
    slt           1    4
    sll           1    4
    and           1    4
    j             1    8

Table 1: Program P's number of instructions and CPI in Computer A architecture

1. a) Find the execution time of program P in Computer A.

Solution:
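Given,
CPU Clock Rate = 2 GHz = $2 \times 10^{9}$ Hz

Now,
$$\text{Clock Cycles} = \sum (IC \times CPI) = 3 \times 2 + 2 \times 1 + 1 \times 4 + 0 \times 8 + 1 \times 4 + 1 \times 4 + 1 \times 4 + 1 \times 8 = 32$$

We know,
$$\text{CPU Time} = \text{Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{Clock Cycles}}{\text{Clock Rate}} = \frac{32}{2 \times 10^{9}} = 16 \times 10^{-9}\ \text{s} = 16\ \text{ns}$$

∴ The execution time of program P on computer A is 16 ns.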

1. b) Computer B with a CPU speed of 10 GHz has the execution time of the program P,
2 times faster than computer A and same instruction set architecture (ISA),
calculate the average CPI of computer B.
Solution:
From 1(a),
Clock Cycles A = 32

Here,
Total Instruction Count of Computer A = 3 + 2 + 1 + 0 + 1 + 1 + 1 + 1 = 10
∴ Average CPI A $= \frac{32}{10} = 3.2$
Clock Rate A = 2 GHz = $2 \times 10^{9}$ Hz
Clock Rate B = 10 GHz = $10 \times 10^{9}$ Hz
IC A = IC B = $I$
Average CPI B = ?

For computer A,
$$\text{CPU Time}_{A} = \frac{IC_{A} \times \text{Average CPI}_{A}}{\text{Clock Rate}_{A}} = \frac{I \times 3.2}{2 \times 10^{9}} = I \times 1.6 \times 10^{-9}\ \text{s}$$

For computer B,
$$\text{CPU Time}_{B} = \frac{IC_{B} \times \text{Average CPI}_{B}}{\text{Clock Rate}_{B}} = \frac{I \times \text{Average CPI}_{B}}{10 \times 10^{9}} = I \times \text{Average CPI}_{B} \times 1 \times 10^{-10}\ \text{s}$$

According to the question,
$$\frac{\text{CPU Time}_{A}}{\text{CPU Time}_{B}} = 2$$
or, $\frac{I \times 1.6 \times 10^{-9}}{I \times \text{Average CPI}_{B} \times 1 \times 10^{-10}} = 2$
or, $\text{Average CPI}_{B} = \frac{1.6 \times 10^{-9}}{2 \times 1 \times 10^{-10}}$
∴ Average CPI B = 8

∴ Average CPI of computer B is 8.
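
The relation between the two machines can be checked numerically. The Python sketch below assumes, as the question states, that both computers run the same instruction count; the helper names are illustrative. With Clock Rate B = 10 GHz and a required speedup of 2, it solves CPU Time A / CPU Time B = 2 for the average CPI of B, which comes out to 8.

```python
# instruction: (instruction count, CPI) for program P on computer A
PROGRAM_P = {
    "addi": (3, 2), "add": (2, 1), "beq": (1, 4), "bne": (0, 8),
    "slt": (1, 4), "sll": (1, 4), "and": (1, 4), "j": (1, 8),
}


def average_cpi(mix):
    """Total clock cycles divided by total instruction count."""
    cycles = sum(count * cpi for count, cpi in mix.values())
    instructions = sum(count for count, _ in mix.values())
    return cycles / instructions


def cpi_for_speedup(cpi_a, rate_a_hz, rate_b_hz, speedup):
    """CPI that machine B needs to be `speedup` times faster than A.

    From (I * cpi_a / rate_a) / (I * cpi_b / rate_b) = speedup.
    """
    return cpi_a * rate_b_hz / (rate_a_hz * speedup)


if __name__ == "__main__":
    cpi_a = average_cpi(PROGRAM_P)
    print(cpi_a, cpi_for_speedup(cpi_a, 2e9, 10e9, 2))
```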

1. c) As we get the total execution time of the given program for Computer A from
question (a). The Arithmetic unit takes 62.5%, the logical unit takes 25%, and the
branch operation takes 12.5% time of total execution. Find the improvement factor
of the given program if we replace the arithmetic unit with a better Arithmetic
unit, which improves total completion time by 2x.
Solution:
From 1(a),
Total execution time = 16 𝑛𝑠
∴ Arithmetic unit time = (16 × 62.5%) 𝑛𝑠
= 10 𝑛𝑠
∴ Logical unit time = (16 × 25%) 𝑛𝑠
= 4 𝑛𝑠
∴ Branch operation time = (16 × 12.5%) 𝑛𝑠
= 2 𝑛𝑠

Here,
$T_{\text{affected}} = 10$ ns [Arithmetic unit time]
$T_{\text{unaffected}} = (16 - 10)$ ns $= 6$ ns
$T_{\text{improved}} = \frac{16}{2}$ ns $= 8$ ns
Improvement factor, $n$ = ?

We know,
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{n} + T_{\text{unaffected}}$$
or, $8 = \frac{10}{n} + 6$
or, $\frac{10}{n} = 2$
or, $n = \frac{10}{2}$
∴ $n = 5$

2. Consider the following C function. Assume necessary registers.

    int find_series_sum(int n,int x){           /* starts at address 2000 */
        int res=0;
        if(n==0){
            res = x;
        }
        else{
            for(int i=1;i<=n;i++){
                res = res + (x*5);
            }
        }
        return res;
    }
    int main(){
        int s=10,x=2,r=0;                       /* starts at address 1000 */
        r = find_series_sum(s,x);
    }

2. a) Convert the code to the corresponding MIPS assembly instructions.

Solution:
The code has been converted into MIPS assembly instructions:

Assuming, in main: s in $s0, x in $s1, r in $s2.
Assuming, in find_series_sum: res in $s0, i in $s1.
The loop for(int i=1;i<=n;i++) is written as the equivalent for(int i=0;i<n;i++), so i starts at 0.
The address after a branch or jump is the address of its target label.

    1000: addi $s0, $zero, 10
    1004: addi $s1, $zero, 2
    1008: addi $s2, $zero, 0
    1012: add  $a0, $s0, $zero
    1016: add  $a1, $s1, $zero
    1020: jal  find_series_sum          2000
    1024: add  $s2, $v0, $zero
    .
    .
    find_series_sum:
    2000: addi $sp, $sp, -8
    2004: sw   $s0, 4($sp)
    2008: sw   $s1, 0($sp)
    2012: addi $s0, $zero, 0
    2016: bne  $a0, $zero, else         2024
    2020: add  $s0, $a1, $zero          # n == 0: res = x, and the loop below runs 0 times
    else:
    2024: addi $s1, $zero, 0            # i = 0
    loop:
    2028: slt  $t0, $s1, $a0
    2032: beq  $t0, $zero, exit_loop    2056
    2036: sll  $t0, $a1, 2
    2040: add  $t0, $t0, $a1            # $t0 = x*4 + x = x*5
    2044: add  $s0, $s0, $t0
    2048: addi $s1, $s1, 1
    2052: j    loop                     2028
    exit_loop:
    2056: add  $v0, $s0, $zero
    2060: lw   $s1, 0($sp)
    2064: lw   $s0, 4($sp)
    2068: addi $sp, $sp, 8
    2072: jr   $ra

2. b) Convert the first 8 lines of your assembly instructions to the corresponding
machine code. No need to convert it to binary.
Solution:
The first 8 instructions have been converted into machine code fields (decimal values; X marks an unused field):

    1.  addi $s0, $zero, 10
        op  rs  rt  C/A
        8   0   16  10

    2.  addi $s1, $zero, 2
        op  rs  rt  C/A
        8   0   17  2

    3.  addi $s2, $zero, 0
        op  rs  rt  C/A
        8   0   18  0

    4.  add $a0, $s0, $zero
        op  rs  rt  rd  shamt  funct
        0   16  0   4   X      32

    5.  add $a1, $s1, $zero
        op  rs  rt  rd  shamt  funct
        0   17  0   5   X      32

    6.  jal find_series_sum
        op  address
        3   500
        (The jump field holds the target address divided by 4: 2000 / 4 = 500.)

    7.  add $s2, $v0, $zero
        op  rs  rt  rd  shamt  funct
        0   2   0   18  X      32

    8.  addi $sp, $sp, -8
        op  rs  rt  C/A
        8   29  29  -8

2. c) Assume we have a new instruction type available in the MIPS architecture, the
S-type. Only load/store instructions can be executed using the S-type format. The
structure of the S-type is given in Table 2. Find the maximum number of
indexes that can be possible in an array. Explain your answer.

    op       rs       rt       C/A
    8 bits   8 bits   8 bits   40 bits

Table 2: S-type format for MIPS

Solution:
Given the structure of the S-type instruction in Table 2, C/A is 40 bits wide.

Here,
Maximum value for C/A $= 2^{40} - 1 \approx 1.09951163 \times 10^{12}$

This is the largest byte offset that can be stored in C/A, but it is not the largest
array index. Each array element is 4 bytes wide, so consecutive indexes are 4 bytes
apart and index k sits at byte offset 4k. Dividing the maximum offset by 4 therefore
gives the maximum index.

$$\text{Maximum index for array} = \frac{1.09951163 \times 10^{12}}{4} \approx 2.74877907 \times 10^{11}$$

∴ The maximum number of indexes that can be possible in an array is about $2.74877907 \times 10^{11}$.

3. a) Assuming 4-bit architecture and using the division algorithm show each step of
the division of 13 by 5.
Solution:
13 ÷ 5
Dividend (DN) = 1101
Divisor (DR) = 0101

M = 0101
-M = two's complement of M = 1010 + 1 = 1011

Initially:  A, Q: 0000 1101

Step 1:     A, Q: 0001 101_    shift A, Q left
            A, Q: 1100 1010    A = A-M = 0001+1011 = 1100 (negative, so Q0 = 0)
            A, Q: 0001 1010    restore A

Step 2:     A, Q: 0011 010_    shift A, Q left
            A, Q: 1110 0100    A = A-M = 0011+1011 = 1110 (negative, so Q0 = 0)
            A, Q: 0011 0100    restore A

Step 3:     A, Q: 0110 100_    shift A, Q left
            A, Q: 0001 1001    A = A-M = 0110+1011 = 0001 (non-negative, so Q0 = 1)

Step 4:     A, Q: 0011 001_    shift A, Q left
            A, Q: 1110 0010    A = A-M = 0011+1011 = 1110 (negative, so Q0 = 0)
            A, Q: 0011 0010    restore A

∴ Remainder, A = 0011 (3)

∴ Quotient, Q = 0010 (2)

3. b) If we want to multiply 32 by 32 using the multiplication algorithm, what will
be the minimum size of the product register?
Solution:
Here,
32 × 32 = 1024

Let the product register need $n$ bits. The largest value an $n$-bit register can hold is $2^{n} - 1$, so we need

$2^{n} - 1 \geq 1024$
or, $2^{n} \geq 1025$
or, $n \geq \log_2 1025 \approx 10.001$
∴ $n = 11$ (rounding up to a whole number of bits)

∴ Minimum size of the product register should be 11 bits.

Spring 2023

1. a) Alternative compiled code sequences use the same instructions: add, sub and
beq. The table below gives the required number of cycles per instruction (CPI)
and the instruction count (IC) for each code sequence.

Find the number of clock cycles and the average CPI for each code sequence.

    Instruction       add   sub   beq
    CPI               3     4     7
    Code Sequence 1   240   300   500
    Code Sequence 2   320   100   150

Solution:
For Code Sequence 1,
$$\text{Clock Cycles}_{1} = \sum (IC \times CPI) = 3 \times 240 + 4 \times 300 + 7 \times 500 = 5420$$
Total Instruction Count = 240 + 300 + 500 = 1040
∴ Average CPI 1 $= \frac{5420}{1040} \approx 5.21$

For Code Sequence 2,
$$\text{Clock Cycles}_{2} = \sum (IC \times CPI) = 3 \times 320 + 4 \times 100 + 7 \times 150 = 2410$$
Total Instruction Count = 320 + 100 + 150 = 570
∴ Average CPI 2 $= \frac{2410}{570} \approx 4.23$
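
The two averages can be recomputed from the table in a few lines. This Python sketch is illustrative only; the names `CPI`, `SEQUENCES` and `cycles_and_average_cpi` are assumptions, while the numbers are taken from the question.

```python
CPI = {"add": 3, "sub": 4, "beq": 7}

SEQUENCES = {
    1: {"add": 240, "sub": 300, "beq": 500},
    2: {"add": 320, "sub": 100, "beq": 150},
}


def cycles_and_average_cpi(counts, cpi=CPI):
    """Return (total clock cycles, average CPI) for one code sequence."""
    cycles = sum(counts[name] * cpi[name] for name in counts)
    instructions = sum(counts.values())
    return cycles, cycles / instructions


if __name__ == "__main__":
    for number, counts in SEQUENCES.items():
        cycles, avg = cycles_and_average_cpi(counts)
        print(number, cycles, round(avg, 2))
```

Note that the average CPI is a weighted average: sequence 1 has the higher value because half of its instructions are the 7-cycle beq.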

1. b) Consider a computer running a program that requires 400 s, with 90 s spent


executing FP instructions, 180 s executed L/S instructions, and 60 s spent
executing branch instructions. Find out the affected and unaffected times for
Amdahl’s law. What is the improvement factor using Amdahl’s law if we get the
program completion time improved by 4x?
Solution:
Here,
$T_{\text{affected}} = (90 + 180 + 60)$ ns $= 330$ ns
$T_{\text{unaffected}} = (400 - 330)$ ns $= 70$ ns
$T_{\text{improved}} = \frac{400}{4}$ ns $= 100$ ns
Improvement factor, $n$ = ?

We know,
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{n} + T_{\text{unaffected}}$$
or, $100 = \frac{330}{n} + 70$
or, $\frac{330}{n} = 30$
or, $n = \frac{330}{30}$
∴ $n = 11$

1. c) What is power wall? Discuss the necessity of multi-core processors.

Solution:
The power wall is the point at which increasing processor clock speed becomes
impractical, because power consumption and heat generation rise beyond what can
be supplied and cooled.

To overcome the power wall, the computer industry shifted its focus from increasing
clock speeds to increasing the number of cores on a single chip. Multi-core
processors raise performance through parallel processing instead of a higher clock
rate, so each core stays within practical power and cooling limits. They also offer
better energy efficiency and scalability than a single faster core.

2. Consider the following C function. The starting MIPS assembly instruction is 1000.
Assume necessary registers.

    int function(int n1, int n2) {
        int i,s=1;
        for(i=n1;i<n2;i++) {
            if(arr[i]<5) {
                arr[i]=arr[i]+(s*5);
                s=s+i;
            }
            else {
                s++;
            }
        }
        return s;
    }

2. a) Convert the code to the corresponding MIPS assembly instructions.

Solution:

The code has been converted into MIPS assembly instructions:

Assuming, n1 in $a0, n2 in $a1, s in $s0, i in $s1, base address of arr[] in $s2.
[ Assuming arr[] was declared outside of the function before ]
The address after a branch or jump is the address of its target label.

    function:
    1000: addi $sp, $sp, -8
    1004: sw   $s0, 4($sp)
    1008: sw   $s1, 0($sp)
    1012: addi $s0, $zero, 1
    1016: add  $s1, $a0, $zero
    loop:
    1020: slt  $t0, $s1, $a1
    1024: beq  $t0, $zero, exit_loop    1084
    1028: sll  $t0, $s1, 2
    1032: add  $t1, $s2, $t0
    1036: lw   $t2, 0($t1)
    1040: slti $t0, $t2, 5
    1044: beq  $t0, $zero, else         1072
    1048: sll  $t0, $s0, 2
    1052: add  $t3, $t0, $s0            # $t3 = s*4 + s = s*5
    1056: add  $t0, $t2, $t3
    1060: sw   $t0, 0($t1)
    1064: add  $s0, $s0, $s1
    1068: j    loop_end                 1076
    else:
    1072: addi $s0, $s0, 1
    loop_end:
    1076: addi $s1, $s1, 1              # addi, because the last operand is a constant
    1080: j    loop                     1020
    exit_loop:
    1084: add  $v0, $s0, $zero
    1088: lw   $s1, 0($sp)
    1092: lw   $s0, 4($sp)
    1096: addi $sp, $sp, 8
    1100: jr   $ra

2. b) Convert the first 10 lines of your assembly instructions to the corresponding
machine code. No need to convert it to binary.
Solution:
The first 10 instructions have been converted into machine code fields (decimal values; X marks an unused field):

    1.  addi $sp, $sp, -8
        op  rs  rt  C/A
        8   29  29  -8

    2.  sw $s0, 4($sp)
        op  rs  rt  C/A
        43  29  16  4

    3.  sw $s1, 0($sp)
        op  rs  rt  C/A
        43  29  17  0

    4.  addi $s0, $zero, 1
        op  rs  rt  C/A
        8   0   16  1

    5.  add $s1, $a0, $zero
        op  rs  rt  rd  shamt  funct
        0   4   0   17  X      32

    6.  slt $t0, $s1, $a1
        op  rs  rt  rd  shamt  funct
        0   17  5   8   X      42

    7.  beq $t0, $zero, exit_loop
        op  rs  rt  C/A
        4   8   0   14
        (A branch stores a PC-relative word offset: (1084 - (1024 + 4)) / 4 = 14.)

    8.  sll $t0, $s1, 2
        op  rs  rt  rd  shamt  funct
        0   X   17  8   2      0

    9.  add $t1, $s2, $t0
        op  rs  rt  rd  shamt  funct
        0   18  8   9   X      32

    10. lw $t2, 0($t1)
        op  rs  rt  C/A
        35  9   10  0

2. c) Assume we have a new instruction type available in the MIPS architecture, the
K-type. Only jump instructions can be executed using the K-type format. The structure
of the K-type is given below. Find the maximum jump address. Explain your
answer.

    op        rs        rt        C/A
    12 bits   10 bits   10 bits   32 bits

Solution:
Given the structure of the K-type instruction above, C/A is 32 bits wide.

Here,
Maximum value for C/A $= 2^{32} - 1 = 4294967295$

This is the maximum value that can be stored in C/A, but it is not the maximum jump
address. Instruction addresses are multiples of 4, so a jump address is divided by 4
before it is stored in C/A. Multiplying the maximum C/A value by 4 therefore gives the
maximum jump address.

∴ Maximum jump address $= 4294967295 \times 4 = 17179869180$

∴ Maximum jump address is 17179869180.

3. a) Assuming 4-bit architecture and using the division algorithm show each step of
the division of 11 by 6.
Solution:
11 ÷ 6
Dividend (DN) = 1011
Divisor (DR) = 0110

M = 0110
-M = two's complement of M = 1001 + 1 = 1010

Initially:  A, Q: 0000 1011

Step 1:     A, Q: 0001 011_    shift A, Q left
            A, Q: 1011 0110    A = A-M = 0001+1010 = 1011 (negative, so Q0 = 0)
            A, Q: 0001 0110    restore A

Step 2:     A, Q: 0010 110_    shift A, Q left
            A, Q: 1100 1100    A = A-M = 0010+1010 = 1100 (negative, so Q0 = 0)
            A, Q: 0010 1100    restore A

Step 3:     A, Q: 0101 100_    shift A, Q left
            A, Q: 1111 1000    A = A-M = 0101+1010 = 1111 (negative, so Q0 = 0)
            A, Q: 0101 1000    restore A

Step 4:     A, Q: 1011 000_    shift A, Q left
            A, Q: 0101 0001    A = A-M = 1011+1010 = 0101 (carry out, so the result is non-negative and Q0 = 1)

∴ Remainder, A = 0101 (5)

∴ Quotient, Q = 0001 (1)

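3. b) Optimized multiplication is better than the normal multiplication algorithm. Why?
Explain.
Solution:
The optimized multiplication algorithm gives the same product as the normal algorithm
but needs less hardware and does less work in each step.

For n-bit operands, the normal algorithm uses a 2n-bit multiplicand register, a 2n-bit
ALU, a 2n-bit product register and a separate n-bit multiplier register. In each step it
shifts the multiplicand left and the multiplier right.

The optimized algorithm stores the multiplier in the right half of the 2n-bit product
register, so no separate multiplier register is needed. The multiplicand register and the
ALU are only n bits wide, because the multiplicand is always added to the left half of
the product. In each step only the product register is shifted right, which moves the
partial product and the multiplier together.

∴ Optimized multiplication is better because it halves the width of the multiplicand
register and the ALU and removes the multiplier register.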
