Ass2 cs637 Merged Organized

Name- Sahil Kenil Gala

Roll No- 210895


Problem 1

[Handwritten worked solution (scanned); the OCR output is largely illegible. Legible fragments include the constants 2.95×10⁻³ V/RPM, 5.9×10⁻³ Nm/A, KT = 0.05 Nm/A, and J ≈ 3.98×10⁻⁶ kg·m².]
Problem 2

To convert this model to a fixed-point data type, we use the Fixed-Point Designer tool of MATLAB, which automatically determines the best precision for each signal. The converted fixed-point controller model is given below, where a Convert block is used to cast each signal to a fixed-point data type.

Every floating-point variable is converted to a fixed-point variable.

There are some Convert blocks inside the subsystem as well to do the same task.
The data types for the 8-bit and 16-bit fixed-point controllers are as follows:
Reference: sine wave
8-bit data type: fixdt(1,8,7)
8-bit precision: 0.0078125
16-bit data type: fixdt(1,16,15)
16-bit precision: 3.0517578125 × 10^(-5)
Steps in Code Generation:

1. Modeling the System:


○ Begin with a high-level model, often created using
MATLAB/Simulink, which defines the system's behavior.
2. Setting Code Generation Parameters:
○ Choose fixed-point or floating-point arithmetic based on
system requirements. Fixed-point is preferred for systems
with limited precision and processing power (e.g., embedded
systems), while floating-point is more common when
precision is essential.
3. Fixed-Point Conversion:
○ Quantization: Variables are represented using integers
with scaling factors to approximate real numbers. You may
see scaling operations in the form of shifts, divisions, or
multiplications.
○ Casting: Variables are cast to appropriate integer types
(e.g., int8, int16, uint8) to control the word length and
precision.
○ Operations: Since fixed-point math has limited precision,
special care is taken to handle overflow, underflow, and
rounding errors. The code includes modulus operations
(fmod) and additional checks (e.g., rtIsNaN).
4. Floating-Point Code Generation:
○ Standard Arithmetic: Floating-point numbers are
handled natively using standard data types like real_T.
There is no need for manual scaling, which simplifies the
code.
○ Solver Integration: The floating-point code often interacts
with solver functions to integrate the continuous states,
making it suitable for high-precision calculations.
5. Final Compilation:
○ After code generation, the output C code is compiled into
machine code specific to the processor in use.

By comparing the two code examples, the fixed-point version is more complex due to manual handling of scaling and precision, while the floating-point version leverages the processor's native capabilities for handling real numbers directly.

Code Generated for Fixed-Point: (C code)


tmp_0 = floor((Controller_P.C[0] * Controller_X.ReplicateOfSource_CSTATE[0] +
               Controller_P.C[1] * Controller_X.ReplicateOfSource_CSTATE[1]) * 256.0);
if (rtIsNaN(tmp_0)) {
  rtb_DTC_output_1 = 0;
} else if (tmp_0 < 0.0) {
  rtb_DTC_output_1 = 0;
} else {
  tmp = fmod(tmp_0, 256.0);
  rtb_DTC_output_1 = (int8_T)(tmp < 0.0 ? (int32_T)(int8_T)-(int8_T)(uint8_T)-tmp :
                              (int32_T)(int8_T)(uint8_T)tmp);
}

Controller_V.Out1[0] = rtb_DTC_output_1;

tmp_0 = floor((Controller_P.K[0] * rtb_DTC_output_1));


rtb_DTC_output_1 = (int16_T)(tmp_0 + Controller_P.K[1] * rtb_DTC_output_1);

// Output:
Controller_Y.Out1[0] = rtb_DTC_output_1;

// Continue with the second part of code


tmp_0 = floor((Controller_P.C[0] * Controller_X.ReplicateOfSource_CSTATE[0] +
               Controller_P.C[1] * Controller_X.ReplicateOfSource_CSTATE[1]) * 256.0);
if (rtIsNaN(tmp_0)) {
  rtb_DTC_output_1 = 0;
} else if (tmp_0 < 0.0) {
  rtb_DTC_output_1 = 0;
} else {
  tmp = fmod(tmp_0, 256.0);
  rtb_DTC_output_1 = (int8_T)(tmp < 0.0 ? (int32_T)(int8_T)-(int8_T)(uint8_T)-tmp :
                              (int32_T)(int8_T)(uint8_T)tmp);
}

Controller_V.Out1[1] = rtb_DTC_output_1;

tmp_0 = (int16_T)(tmp_0 + Controller_P.K[1] * rtb_DTC_output_1);


Controller_B.DTC_output_1 = (real_T)(int16_T)(Controller_U.In1[0] + 6.103515625E-5);
Code Generated for Floating-Point: (C code)

void Controller_step(void) {
  real_T rtb_SteeringController;
  real_T tmp;

  if (rtsiIsMajorTimeStep(Controller_M)) {
    /* Set solver stop time */
    if (!(Controller_M->Timing.clockTick0 + 1)) {
      rtsiSetSolverStopTime(S, ((Controller_M->Timing.clockTick0 + 1) *
        Controller_M->Timing.stepSize0 + 4294967296.0));
    }
  }

  /* StateSpace */
  rtb_SteeringController = Controller_P.C[0] * Controller_X.SteeringController_CSTATE[0] +
    Controller_P.C[1] * Controller_X.SteeringController_CSTATE[1];

  /* Update output */
  Controller_Y.Out1[0] = rtb_SteeringController;

  /* Gain and Sum */
  tmp = Controller_P.K[0] * rtb_SteeringController;
  Controller_Y.Out1[0] = tmp + Controller_P.K[1] * rtb_SteeringController;

  Controller_Y.Out1[1] = rtb_SteeringController;
}
Question 3
To analyze the number of cache misses for the given function
compute_variance, we need to understand how the memory accesses
map to the cache structure, considering the cache parameters.

Cache Parameters Overview

● m = 32: 32-bit addresses (an address space of 2^32 bytes).


● S = 8: 8 cache sets.
● E = 1: Direct-mapped (1 line per set).
● B = 8: Block size of 8 bytes (2 integers per block since an int is 4
bytes).

Cache Size and Structure

1. Cache Size:
○ Each block is 8 bytes.
○ Total number of cache lines = 8 (since S=8).
○ Total cache size = 8×8=64 bytes.

(a) Cache Misses for N=16

For N=16:

● The data array uses 16×4=64 bytes.


● The entire array fits into the cache.

Access Pattern:

1. First Loop (calculating sum1):


○ The first access to each integer will incur a cache miss until
all blocks are loaded.
○ Accesses will be:
■ Block 0 (data[0], data[1]): 1 miss.
■ Block 1 (data[2], data[3]): 1 miss.
■ Block 2 (data[4], data[5]): 1 miss.
■ Block 3 (data[6], data[7]): 1 miss.
■ Block 4 (data[8], data[9]): 1 miss.
■ Block 5 (data[10], data[11]): 1 miss.
■ Block 6 (data[12], data[13]): 1 miss.
■ Block 7 (data[14], data[15]): 1 miss.
○ Total misses for this loop = 8.
2. Second Loop (calculating sum2):
○ The entire array is already resident in the cache, so the
second loop incurs no cache misses.

Total Cache Misses for N=16: 8 (first loop) + 0 (second loop) = 8

(b) Cache Misses for N=32

For N=32:

● The data array uses 32×4=128 bytes.


● The cache can only hold 64 bytes, so it cannot fit the entire array.

Access Pattern:

1. First Loop (calculating sum1):


○ The first 16 accesses will fill the cache as follows:
■ Block 0 (data[0], data[1]): 1 miss.
■ Block 1 (data[2], data[3]): 1 miss.
■ Block 2 (data[4], data[5]): 1 miss.
■ Block 3 (data[6], data[7]): 1 miss.
■ Block 4 (data[8], data[9]): 1 miss.
■ Block 5 (data[10], data[11]): 1 miss.
■ Block 6 (data[12], data[13]): 1 miss.
■ Block 7 (data[14], data[15]): 1 miss.
○ Total misses for the first 16: 8 misses.
○ When accessing data[16] to data[31], each newly fetched block
evicts one of the first 8 blocks (block i and block i+8 map
to the same set).
○ Total misses for the next 16 integers are again 8 (one per
block; the second integer of each block still hits).
2. Second Loop (calculating sum2):
○ Again, each of the blocks must be fetched from memory:
○ Accessing 32 integers will incur 8 misses for the first 16
accesses and another 8 for the next 16.
Total Cache Misses for N=32:

● Total = 16 (first loop) + 16 (second loop) = 32

(c) Cache Misses for N=16 on 2-Way Set-Associative Cache

With the same N=16 but changing to (m,S,E,B)=(32,8,2,4): 4 bytes per
block (one integer per block), now 2-way set-associative:

1. Cache Size:
○ Each block is 4 bytes.
○ There are 8 sets, each with 2 lines (S=8, E=2).
○ Total cache size = 8×2×4 = 64 bytes.

Access Pattern:

1. First Loop (calculating sum1):
○ With 1 integer per block, every access touches a new block
and misses.
○ Total misses for this loop = 16.
2. Second Loop (calculating sum2):
○ All 64 bytes fit in the cache (blocks i and i+8 occupy the
two ways of set i mod 8), so the second loop incurs no
cache misses.

Total Cache Misses for N=16 on 2-way set-associative cache:

● Total = 16 (first loop) + 0 (second loop) = 16 cache misses.

Summary of Results

● (a) For N=16: 8 cache misses
● (b) For N=32: 32 cache misses
● (c) For N=16, 2-way set-associative: 16 cache misses
Question 4

We analyze the problem step by step for both Rate Monotonic (RM) and Earliest Deadline First (EDF) scheduling.

Given:

● Task 1:
○ Period p1=2
○ Execution time e1=1
● Task 2:
○ Period p2=3
○ Execution time e2=1

(a) RM Schedule and Processor Utilization

RM Schedule:

In RM scheduling, tasks are prioritized based on their periods: the shorter the
period, the higher the priority.

1. Task 1 (T1) has a higher priority (period 2).


2. Task 2 (T2) has a lower priority (period 3).

Schedule timeline:

0: | T1 |
1: | T2 |
2: | T1 |
3: idle
4: | T1 |
5: | T2 |

(At time 6, T1 would start again, and this cycle continues.)

Processor Utilization:

The utilization U of the RM scheduling can be calculated as follows:

U=e1/p1+e2/p2=1/2+1/3=3/6+2/6=5/6≈0.8333

Liu and Layland Utilization Bound:

For n tasks, the RM utilization bound is U_bound = n(2^(1/n) − 1). For n = 2, U_bound = 2(√2 − 1) ≈ 0.828.
Comparison:

● The computed utilization U = 5/6 ≈ 0.8333 is slightly greater than the
bound of approximately 0.828, so the Liu–Layland test is inconclusive
(the bound is sufficient but not necessary). The explicit schedule above
shows that the task set is nevertheless schedulable under RM.

(b) Infeasibility of Increased Execution Times

1. Increasing e1 or e2:
○ The RM schedule above has no slack before T2's deadline: T2 receives
only the slot [1,2] before its deadline at t = 3, and T1 exactly fills
one unit in each of its periods.
○ Any increase in e1 or e2 therefore causes a deadline miss under RM.
For instance, if e1 = 2, then U = 2/2 + 1/3 = 4/3 > 1, which is
infeasible for any scheduler.
2. Holding e1 = e2 = 1 and reducing p2:
○ If p2 is reduced below 3 (say p2 = 2):
○ U = 1/2 + 1/2 = 1, which is marginally feasible, but since both tasks
now share the same period, priority ties must be broken carefully.
3. Holding e1 = e2 = 1 and reducing p1:
○ If p1 is reduced below 2 (say p1 = 1):
○ U = 1/1 + 1/3 = 4/3 > 1 (infeasible).

Conclusion:

● It is possible to reduce p2 to 2 and keep the schedule feasible at U = 1,
but it is not feasible to reduce p1.

(c) EDF Scheduling with e2=1.5

Now consider the new task set:

● e2=1.5 , e1=1 , p1=2 , p2=3.

EDF Schedule:

With EDF, tasks are scheduled based on their deadlines, so we have:

● T1: period = 2, deadlines at 2, 4, 6, 8, …; execution time = 1
● T2: period = 3, deadlines at 3, 6, 9, …; execution time = 1.5
Execution Sequence:

● Time 0-1: | T1 |
● Time 1-2.5: | T2 | (deadline is 3)
● Time 2.5 to 3.5: | T1 | (deadline is 4)
● Time 3.5 to 5 : | T2 | (deadline is 6)
● Time 5 to 6: | T1 | (deadline is 6)
This cycle continues

Processor Utilization:

We calculate the processor utilization as:

U = e1/p1 + e2/p2

Substituting the values:

U = 1/2 + 1.5/3 = 0.5 + 0.5 = 1.0

Since the processor utilization U = 1.0 is exactly 1, the task set is
feasible under EDF: EDF can fully utilize the processor without missing any
deadlines.

Summary

● (a) The RM schedule has utilization 5/6 ≈ 0.833, slightly above Liu &
Layland's bound (≈0.828), yet is still schedulable.
● (b) Increasing either execution time leads to infeasibility; reducing
p1 below 2 is not possible, but p2 can be reduced to 2.
● (c) Increasing e2 to 1.5 results in a feasible EDF schedule with utilization
100%.
Problem 5

[Handwritten worked solution (scanned); the OCR output is largely illegible. Legible fragments include a processing-time matrix P = [2 5 8; 2 6 10], a computed makespan, and a concluding remark comparing the result with optimal scheduling.]

You might also like