
Lecture

The document discusses the analysis of algorithms, focusing on the scientific method and mathematical models for measuring running time. It explains the importance of input size, execution frequency, and operation costs in determining algorithm efficiency, along with the challenges of obtaining precise measurements. Additionally, it highlights the differences between constant and linear time operations, using examples to illustrate the concepts.


... Yes, no? I can see you. So it works, yeah? Okay, good.

Okay, so we'll continue today with the analysis of algorithms that we started last week. So again, you have
here the QR code for the Q&A session. So please feel free to use it. So you can use
it to ask questions, and please try to come up with good questions because we'll
moderate the session. We can only allow kind of meaningful questions. So last week
we started with the analysis of algorithms, so we tried to see how to analyze
algorithms. And then we talked about different ways how to do it. Last week we
talked about the scientific method. So what was the idea behind the scientific
method? So basically we ran experiments. So we increased kind of the input size, so
the N, so the number of input kind of data elements. And then we measured kind of
the time. And then we documented the results. And then we plotted the data. And
then we talked about how to plot the data so we can have the standard plot. And
then also we said that we can also plot the data using the log plot. So that means
we use the log of the input size on the x-axis, and we use the log of the time on
the y-axis. And if we do it like this, in many cases we can get a line. And then we can do a bit of math, a regression, to find out what this line is about. So this is a kind
of one way to do it. And then we also had a quicker way: the doubling hypothesis. The idea behind the doubling hypothesis is to double the input size from one experiment to the next and look at the ratio of the measured times, T(2N)/T(N), which for the power law T(N) = a·N^b equals 2^b. We also distinguished two kinds of effects on the running time. The system independent effects are the program, the data that we use as input, and the algorithm itself; these will affect the b, the exponent that we have in this equation. And then we have the system dependent effects, like the hardware, the software, and the system. These system dependent effects, together with the system independent effects, will determine the constant, the a in this formula over here. So that means that the b only depends on the algorithm and the input data.
And the a, so this constant, depends on everything. And then there are kind of good
and bad things. I mean the bad things if we use this scientific method is that it
is not the kind of accurate all the time. Because if we conduct experiments many
things can go wrong. Which will lead to the fact that we don't have precise
measurements and precise results. The good news if we use the scientific method,
yeah, in the case of measuring the running time of algorithms. Is that it is kind
of easier to do and cheaper than other kind of sciences. Because we don't have
humans, we don't have animals, and we are not running experiments on a kind of,
yeah, humans and animals. And that's why in many cases, so this kind of scientific
method will give a kind of good results. This is where we stopped last time. We
will continue from this slide. And we will start with a kind of A-class exercise as
a warm up. So it's related to this doubling hypothesis way to do this kind of
scientific method. So this is the data that we collected. So we ran a kind of
experiment and then we collected some kind of results. So we have the N, so the
input size, and then we measure the kind of the time. And then we try to apply this
doubling hypothesis. So we try to double the input size from one experiment to the
next one. And this is the kind of the data that we collected. So we have the input
size N, we have the running time. And the question now is which of the following
functions best models the running time T of N. So we provide a kind of four kind of
options. As a hint, so we show again this power function. And then as we said, if
we double the kind of the input size, and if we take the ratio, so the time needed
to process 2N divided by the time needed to process N, so we get 2 to the power of
B. And the question now is based on the data, on the collected data, so try to
define the B, so which will be the exponent here. And then try to define the A,
which will be kind of the constant. And then tell which of these kind of options
will model the running time based on the data that we have. And you have three
minutes. And again, try to work in pairs. I mean, I don't want to see people
working alone. If you have someone sitting next to you, please kind of talk to
them. And you have three minutes to solve this small exercise. And time is starting
now. Time is up.
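As a side note, the ratio calculation behind this exercise can be sketched in Python. The (N, T) values below are made up for illustration (the slide's actual table is not reproduced here); the method is exactly the one described above: b = log2(T(2N)/T(N)), then solve T(N) = a·N^b for a.

```python
import math

# Hypothetical measurements: running time T(N) for doubled input sizes.
# These invented values follow T(N) = 1e-8 * N^2, i.e. a quadratic algorithm.
data = [(1000, 0.01), (2000, 0.04), (4000, 0.16), (8000, 0.64)]

# Doubling hypothesis: T(2N)/T(N) = 2^b, so b = log2(ratio).
# Use the largest pair, since larger N tends to give more reliable ratios.
(n1, t1), (n2, t2) = data[-2], data[-1]
b = math.log2(t2 / t1)

# Plug b back into T(N) = a * N^b and solve for a.
a = t2 / n2 ** b

print(f"b = {b:.2f}")   # exponent: 2.00, i.e. quadratic growth
print(f"a = {a:.2e}")   # constant: 1.00e-08
```

With real measurements the ratio will only be close to a power of two, so in practice one rounds b to the nearest plausible exponent before solving for a.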
Do you see the screen now in this group? I thought there were some issues with the screen share. Can you confirm that it's working now in this group? Okay, I did not hear anything, so I assume that it works. I hope that you got the right answer. Okay, so the technology is
working. So this is the right solution; I think most of you got it. C is the right answer, and the solution is quite simple. We have the doubling hypothesis, and we focus on the ratio T(2N)/T(N). If we do it for this pair, for example, we get something close to 4. And it's always good to also do it with the larger values of N, so if we take the last two rows, this ratio is again 4. We said this ratio corresponds to 2 to the power of b. So if 2 to the power of b is 4, it means that b is 2: the growth is quadratic. And once we have b, we can plug in its value and solve for a, and this is the result that we get. So basically, what you need to do is: first take the ratio, then the log (base 2) of the ratio, which gives you the b. Once you have the b, plug it into the formula, solve for a, and then you have your equation. Questions related to this? I think it should be straightforward as well. Good. So, this was about the scientific method. So
as I said, there are some issues with the scientific method. It does not always give accurate results, and we don't know what is going on inside the program. That's why there is another, maybe more reliable, method to compute the running time. Again, it's based on math, and we call it the mathematical model: the mathematical approach to counting the running time. The idea behind it is simple. It was proposed by Knuth, a famous researcher who did a lot of work on the analysis of algorithms at the end of the 60s. And what he said is quite
simple. So if we have a kind of program, we have some operations. And this idea is
quite simple to calculate the total running time. So we try to find the frequency
of execution of the operations. And then we know the cost of each operation. We
multiply the cost with the frequency of execution of the operations, and then we
sum up through all the operations that we have in the program. So these are some
examples of operations that we can have in the program. So we can have the array
access. So if we have an array, we are storing the data in the array, and then we
access the array. If we want to add a kind of integer, if we want to compare
integers, if we want to increment one variable, if we want to assign a kind of
variable, these are kind of examples of operations that we can have in the program.
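The cost-times-frequency idea above can be written down directly as a sketch. The cost numbers below are invented placeholders, not real measurements, and the frequency table is just an example shape; the point is only the structure of the sum.

```python
# Mathematical model: total time = sum over operations of (cost * frequency).
# Costs (seconds per operation) are invented placeholders for illustration.
costs = {
    "variable declaration": 2e-9,
    "assignment":           1e-9,
    "less-than compare":    1e-9,
    "array access":         3e-9,
    "increment":            1e-9,
}

def total_time(frequencies, costs):
    """Sum cost * frequency over all operations in the program."""
    return sum(costs[op] * freq for op, freq in frequencies.items())

# Example frequencies for a hypothetical single-loop program, input size n:
n = 1000
frequencies = {
    "variable declaration": 2,
    "assignment":           2,
    "less-than compare":    n + 1,
    "array access":         n,
    "increment":            n,
}

print(total_time(frequencies, costs))  # one number: the modeled running time
```

Note how the frequencies depend only on the program and n, while the cost table is the machine- and compiler-dependent part, exactly as the lecture says.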
And then for each one of these operations, there is a frequency.
So the frequency of the execution of these operations, for example, how many times
we do a kind of integer add, for example. And on the other side, we have a kind of
constant. So the constant of kind of executing that kind of operation. And then for
the execution of each operation, we can have a different constant. So for example,
for A, for the array access, so we have a kind of constant. So that means the time
which is needed to run that operation. And then we multiply the cost of one
operation with the frequency of the execution of the operation. And then we do it
for all the operations, and this will give us the total running time. So this is
the main idea behind this mathematical model for running time. So we try to get
operations in the program. We try to count kind of the frequency of the execution
of these operations. We know about the cost for each operation. We multiply, we sum
up, and then we get our running time. So the frequency of the operations, I mean
this will depend on the algorithm and the input. So that means how many times we
kind of do one specific operation. This will only depend on the algorithm and the
input, so the input size. And then the cost of running each operation, this will
depend on the machine and the compiler. So it is how much time one operation will
need to execute. This only depends on the machine and the compiler. And this is the
idea behind this kind of computing. The total running time, again, the sum of the
cost times the frequency of execution of the operations. We said that the cost
depends on the machine and the compiler. The frequency of the execution of the
operations, they do depend only on the algorithm and the input data. And the good
thing, so we can compute these things. And then we will get a kind of accurate kind
of result, compared to the scientific method where, depending on the experiment, we might get inaccurate results. So this is the main idea. The question
now is, I mean in this equation you
have two things. I mean first the cost of the operation, and then we have the
frequency of the execution of these operations. Let's start with the cost of the
basic operations. So you can see here some examples. I mean again, we add integers,
we multiply, we divide, floating point add. If you want to go with a kind of more
complicated ones, the sine and the arctangent. These are kind of some basic
operations. And then all these operations, I mean these basic operations, so we can
perform them in a constant time. So it depends on the machine that you have, but on
all of them, so we can perform them in a kind of constant time. So here you can see
some examples of the time needed to execute these operations. All of them are on the order of nanoseconds, and it's a constant time for these basic operations. So these are the basic ones, or
the primitive kind of operations, which mostly take a kind of constant time. But
then we have to be careful with some operations which sound to be primitive, but
this is not the case. And this is in the case if we have, for example, array
allocation. So we have one array, and there are some items, and then we want to
allocate the kind of items in this array. So this is linear, so it's not constant
anymore, because we need to access each kind of item in the array. So this is not
constant, it's linear in this case. The other kind of example, the other operation,
where you need to be kind of very careful, is the concatenation of strings. So many
students, I mean, they do kind of some, kind of novice kind of mistake, and then
they use a kind of string concatenation a lot, because it's kind of easy. In
Python, for example, there is the plus operator, and then you can use it to
concatenate two kind of strings. It sounds like it's a kind of primitive operation,
it's a basic operation, and we know that most primitive operations, they take a
kind of constant running time. This is not the case: string concatenation is linear, so its cost will increase with the size of the strings involved. So here you have to be careful with the usage of string concatenation, because again, it's not a constant running time; it's a linear, more expensive running time. All the other ones take only constant time. Good.
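The cost of string concatenation can be illustrated with a simple counting model (this is a sketch, not a measurement; the function names are mine, and note that CPython has an in-place optimization for `+=` in some cases, so real timings can differ):

```python
# String concatenation with + is linear in the total length of the operands:
# a brand-new string is built and every character is copied.

def concat_cost(len_a, len_b):
    """Characters copied by a single a + b concatenation, in this simple model."""
    return len_a + len_b

def naive_build_cost(n):
    """Copies performed when building an n-char string one char at a time with +."""
    total, length = 0, 0
    for _ in range(n):
        total += concat_cost(length, 1)  # re-copy everything built so far, plus 1
        length += 1
    return total

print(naive_build_cost(1000))  # 500500, i.e. about n^2 / 2 copies: quadratic
```

By contrast, collecting the pieces in a list and calling `''.join(parts)` copies each character once, which is linear overall.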
So this was all related to the first part of the equation, the cost of the primitive operations. We said that in most cases it's constant, with a few exceptions where it is linear. And now the other part of the equation: the frequency of the executions of these operations. And we said this depends on the program that we have. Let's
see the kind of first example, this kind of basic example, and then we'll use the
one sum. So last time we talked about the three sum. So remember the task there was
to find all the kind of the triples in a kind of input, a kind of data, so which we
sum up to a zero. And then one sum is one kind of specific case of this kind of
three sum. So here we count how many zeros we have in our kind of input data. So we
give a kind of some integers, for example, and then we want to count how many zeros
we have in that input data. This is a small program that does it. We have a count, which we initialize to zero. Then we have a for loop with one variable, i, which ranges from zero to n. That means we start at zero and end at n minus one, because this is just the index: the size goes up to n, but the index goes from zero to n minus one. And
this you can see it better maybe in Java for the ones who are more familiar with
Java. So we start from zero, we do the test whether it is smaller than n or not,
and then we increment with one. And then we go through all these kind of items, and
then we do one kind of test. If a i is zero, so then we increment the count of it.
It's kind of very basic kind of code. And this way we want to find how many zeros
we have in our input data. So basically we go through all the items one by one, and
then if it's zero, we increment the count. It's not more than this. And we have
other operations. And we want now to figure out what is the frequency of the
execution of these operations. So we have variable declaration. So we have two of
them. In this case we have two variables. We have the count and we have the i. So
these are the two variables that we declare. We have the assignment statement. Also
we have two. So here we assign a kind of a count, so zero to count, and i also we
start from zero, so we have a kind of two of them. We have the less than compare,
which is this one. You can see it more in the Java code, so here it's a bit hidden.
But here we have the less than compare. So that means you are doing kind of this
test whether i, the index i, is less than n or not. And here we have a kind of n
plus one. So from zero to n, and we repeat a kind of compare from zero to n. That's
why we have n plus one operations related to this less than compare. And then we
have equal to compare operation. This is this one. So where we test whether kind of
the item that we have at index i is equal to zero or not. And this we have a kind
of n times because we go through from this loop from zero until n minus one. And
then we test whether it is equal to zero or not. That's why the frequency of the
execution of this operation is n. And then we have the array access. So how many
times we access the array? And a is the array, and here where we are doing the
array access. And here again, we have a kind of n times that we access the array.
And then we have the increment. So here we are doing kind of two increments. So we
are incrementing the i, and this we are doing it n times. And then we are
incrementing the count, and here it can be something between zero and n, depending
on whether the item is equal to zero or not. That's why the total number of, so the
frequency of the execution of this increment can go from n, so for the i, until to
n, depending on whether we increment the count or not. So this is a kind of first
example. Simple one where we have kind of the operations. And then based on the
code, how to compute the frequency of the execution of these operations. For the
one sum, it's a kind of straightforward, so it's not that much that we have to do.
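The one-sum frequency table above can be double-checked empirically with an instrumented version of the code (a sketch; the counter names are mine, and the while-loop shape mimics the Java for-loop so every test is visible):

```python
def one_sum_counts(a):
    """Count zeros in a, while tallying how often each operation runs."""
    ops = {"lt_compare": 0, "eq_compare": 0, "array_access": 0, "increment": 0}
    count = 0                       # 2 declarations, 2 assignments (count, i)
    i = 0
    while True:
        ops["lt_compare"] += 1      # the i < n test: runs n + 1 times
        if not (i < len(a)):
            break
        ops["eq_compare"] += 1      # the a[i] == 0 test: runs n times
        ops["array_access"] += 1    # one array access per test: n times
        if a[i] == 0:
            count += 1
            ops["increment"] += 1   # increment of count: 0 to n times
        i += 1
        ops["increment"] += 1       # increment of i: n times
    return count, ops

count, ops = one_sum_counts([0, 5, 0, -3])   # n = 4, two zeros
print(count)               # 2
print(ops["lt_compare"])   # n + 1 = 5
print(ops["eq_compare"])   # n = 4
print(ops["increment"])    # between n and 2n: here 4 + 2 = 6
```

Running it on arrays of different sizes reproduces exactly the frequencies from the slide.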
But how is it now with two sum? So two sum, yes? Why is it? You mean this one,
right? So the question is, now I learned that I have to repeat the question, the
question is why it is two n. So here we are talking about the increment, and we are
incrementing the i, and we are incrementing the count. So let's assume that all the
items are equal to zero. So here for the i, we incremented kind of n times, and
then if this test is complete, so that means all the items are equal to zero, so
then we increment the count also n times. And that's why it can go to two n. So in
the worst case, I mean, yeah, not the worst case, but if we don't have any zeros in the array, then we increment i n times, but count will not be incremented, because the test is never fulfilled; that is the lower end of the frequency, the n. And in the other case, where all the items are zeros, the count will also be incremented n times. That's why we have the two n.
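This best-case/worst-case range for the increments, from n (no zeros) to 2n (all zeros), can be demonstrated with a tiny sketch (function name is mine):

```python
def increments_in_one_sum(a):
    """Total increments (of i and of count) while counting zeros in a."""
    n, incs, count = len(a), 0, 0
    for i in range(n):
        incs += 1            # i is incremented once per iteration: n total
        if a[i] == 0:
            count += 1
            incs += 1        # count is incremented only when a zero is found
    return incs

n = 5
print(increments_in_one_sum([1] * n))  # no zeros: n increments  -> 5
print(increments_in_one_sum([0] * n))  # all zeros: 2n increments -> 10
```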
Other questions? No. Okay, good. Let's move on with the two sum. The two sum is the other variant of the three sum, and the idea here is to test how many pairs we have in our data which sum up to zero. So in three sum, how many triples sum up to zero? In one sum, how many zeros do we have in our data? And in two sum, how many pairs in the data sum up to zero? This is the code for it. Here we have a nested loop: we have the i and we have the j. The i will go from zero to n, and the j will go from i plus one to n. So that means we start from the beginning, and then for each kind of
item, we try to get the kind of the other items, and then we do a kind of test
whether these two kind of items will sum up to zero, and this way we kind of
recover all of them via these two kind of four loops. And then if the test is
fulfilled, that is, if we find two items which sum up to zero, then we increment the count. The only difference from one sum: in one sum we had only one loop, with i going through the list, and then we tested whether each item is zero or not. Here we have two loops, so that we can capture all the possible combinations, and then we test whether the two elements, the two items, sum up to zero or not. If this is the case, we increment the count by one. As you can see, it's still not very complicated, it's just two sum, but computing the frequency of the operations is getting a bit more involved. Here we find that the variable declarations are n plus two, and the assignments are n plus two, because
also we need to go through the second kind of loop a bit, and then we have the
other kind of results for the frequency of the execution of the different
operations. Don't worry, you will see these kind of computations step by step. What
I just want to stress at this point is that it's not kind of easy to compute the
frequency of the execution of the operations. So, for example, for the equal to
compare, which is this one, so how many times we do this kind of test, whether the
two items will sum up to zero or not. So we do it with one half n times n minus
one, and then we will see how to do it. Basically, we managed to find a kind of
three ways, maybe with increasing level of complexity,
to compute this one. So here we are interested in computing the frequency of the
execution of this operation equal to compare, this one. So how many times this
operation will be executed. The first method uses combinatorics. The idea behind two sum, again, was to find how many pairs sum up to exactly zero, and testing that is equivalent to enumerating the possible two-element combinations
of a set of n elements. So we have n elements, so this is a kind of our input size,
and then we want to find how many pairs, or what are the possible kind of two
combinations. So if I take a kind of two items from this, a kind of from the n
elements, so how many of them will sum up to zero. And this we know from
combinatorics: it's the binomial coefficient, which we read as n choose two. From n elements, we choose two. In the general case this is the combinatorial number of n and k, also called n choose k: from n elements we choose k. And we know the formula for n choose k: n factorial in the numerator, and in the denominator k factorial times n minus k factorial. This is the general case with k. Now back to our example. So here
we want to have the pairs, that means k is equal to two. And then we want to find
the possible two combinations of a set of n elements. So we have n in two, or n
choose two. And then we apply the formula with k is equal to two. And then we get n
factorial divided by two factorial times n minus two factorial. And this is a kind
of the result that we get. So this is one kind of easy way to compute, this is kind
of equal to compare how many times we do it. Because it reflects what we are doing.
So I think we have these kind of two loops. We fix one, we go through kind of the
other loop, and then we kind of, we try to pick kind of two items, and then we test
whether they sum up to zero or not. And this is n choose two, and this is the result. So this is now the frequency of execution of the equal to compare: one half n times n minus one. This is the first way to do it. There is also another way, using a table. So
here we can have a kind of our indices. We have the i, we have the j. And then we
try to count the number of iterations of j. So we go through the loops. I mean we
start from the first loop with i. So i will go from zero to n minus one. Because as
we said, here we focus on the index. So i will go from zero to n minus one. So this
is the first loop. And then whenever we fix i, we focus on the j. If we start with i equal to zero, in the second loop j will go from i plus one up to n, exclusive. That means if i is zero, j goes from one to n minus one. So i goes from zero to n minus one, and j goes from i plus one to n. That means if
So i is now one. And then again j will go from i plus one to n. In this case it's
from two until n minus one. And then we do it for all the values of i. And then as
we said, i will go from zero to n minus one. Let's look at the last two cases. When i is n minus two, j starts at i plus one, so it takes only the single value n minus one. And when i is n minus one, j would start at n, but the test that j should be strictly less than n fails immediately, so j takes zero values in this case. So this is the table: basically, go through what's happening in the code, fix the index of one loop, and see how the second variable behaves. And then we count the number of iterations. So in the first case, the number
of iterations of j. So here it will go from one until n minus one. Then it will be
n minus one kind of iterations. Here from two to n minus one, that means the
possible values that j can take on. It will be n minus two and so on until in the
end it's one in the second-to-last case, and then zero in the last one. So now we
manage to find kind of the iterations of j. And then we know that the number of
iterations of j will correspond to this kind of equal to kind of comparison. That
means whenever we have a kind of j, so we have a kind of one compare. And that's
why the number of compares, equal compares, will be the sum of these kind of
values. So we take the sum, so the total, so i to j, so the total iterations of j
of n. And then we take the sum. So n minus one plus n minus two until zero. And
then we have this result. I mean, here we just read it from the end to the start, but it's the same sum. So we sum up all the
iterations of j. And then we will get this one. And this will correspond to the
equal to compare. So the frequency of execution of the equal to compare. Probably
you know this kind of sum of it. It's the arithmetic sum or the arithmetic series
of it. And the arithmetic series, so we have a kind of, yeah, we will start with a
kind of a number. And then we increase by something. In this case we are increasing
by one. So we start from zero. And then we increase by one. And this is one example
of the arithmetic kind of sum. And then we know how to compute the kind of, yeah,
the sum of an arithmetic series of it. The sum will be, so the number of terms
times the average of first and last type of terms. So this is how to get the sum of
the arithmetic series of it. You try to find the number of terms that you have. And
then you take the average of the first and the last one. And then you will get your
sum. If we try to apply it on our case, so what is the number of terms? I mean the
number of terms, here we start from zero. We will end up with n minus one. That
means the number of terms is n. And then the average of the first and the last
term. The first term is zero. The last one is n minus one. We take the average,
it's n minus one divided by two. So we get n times n minus one over two, which is the same as the formula that we got in the previous method. Is it clear? Yes? Any questions? No? Okay, good. Let's move
on. So this was the second option, how to compute it: by building the table, finding the number of iterations, and then applying some math, the sum of the arithmetic series, to get the sum. This is the third method. It is basically a bit more math, but it reflects what's happening in this program. We have our two variables, the i and the j. And we
have a kind of two sums. So for the i, so it will go from zero to n minus one. So
this is the first kind of loop a bit. So i will go from zero to n minus one. And
then inside that loop we have the second one, which is kind of about j. So j will
go from i plus one to n. And this is what we have in this sum: j will go from i plus one to n minus one, again because we are talking about indices. And then we have the one: this is the operation which happens inside these two loops.
This is, in our case, this equal to compare operation. And then now we try to solve
this kind of sum a bit. So we start from inside. So what is the sum j from i plus
one to n minus one of one? So that means a kind of how many times we repeat a kind
of this one. It will be the last term minus a kind of the first one plus one. So
this is n minus one, n minus one minus the first one, i plus one. And then we add
one. So as you know, if you want to have a kind of, I don't know, the number of
items, let's say between one and twelve, between one and twelve, it will be twelve.
If you want to get it between, let's say, five and fifteen, so it's fifteen minus
five plus one, it will be eleven in that case. I mean this is what we are applying
here. The last term minus the first one plus one, and this is what we get. So then
we have now the sum over i of n minus i minus one. So this is this one. And now we
want to get the sum. So i will go from zero until n minus one. So we plug in the
values of i. So if we start from zero, i will be zero. That means the first term
will be n minus one. The second one, i will be kind of one. So n minus one minus
one, it's n minus two. And the last one, i will be n minus one. So then it will be
n minus n minus one minus one, and then it will be zero. So this is kind of the
equation that we will get over. And this is again the arithmetic sum, so that we
did kind of before. And we know that the result in this case is n times n minus one
over two. So again, we take the number of terms, in this case it's n. The first
one, the last one, we take the average, and this is kind of the result. Is it
clear? So you have now kind of three ways how to compute the frequency of the
execution of this operation, equal to compare in the case of kind of two sums in
this case. These are options; feel free to use the one that you like. I mean, there is not one option which fits all the time: sometimes you need to use the table, sometimes a mathematical formula; it will depend on the case. These are three possible options that you can use to find these frequencies. Okay, so we did this now for the equal to compare. So we found
now that the frequency of the execution of equal to compare is this one. So how
many times will we kind of run this kind of, or will we execute this operation?
It's a kind of one half n times n minus one. Now, if you focus on the array access,
so that means the frequency of the execution of the operation array access. We can see in the code that we need two array accesses: one to access the item at the i position, and one to access the item at the j position. And this relates to the number of
executions of the equal to compare, so just for each kind of equal to compare
operation, we need two array accesses. That's why the array access, so the number
of executions of array access will be twice the number of executions of equal to
compare. We know the result for the equal to compare. We multiply with two, because
again, for each equal to compare operation, we have two array accesses, and then we
get n times n minus one, and this is this variable. Okay. One question? The
assignment statement? The assignment statement? So the assignment statement should
be related to the, in this case, kind of count, and i, and j, and the count of it.
No, but the count we have it. So for the i, so we're going to kind of, so it will
be kind of only in this kind of, so we do it kind of once, but for each kind of
variable of i, we need to assign the variable of j from i plus one until n, and
that's why this n is related to the assignment of j, because i will be assigned
only once in the beginning, and then for this i, which is zero, so we need to
assign the j kind of i plus one until n, and this is kind of the n, plus two, which
is a kind of the assignment of i, and the assignment of count of it. So this i,
this n, is related to the assignment of j, because we have to do it n times when we
fix, and i is a kind of assignment once. So again, n is the j, so the assignment of
j, and then plus two, the assignment of i, which is only once, the first one, i
zero, and then count is zero. So this is the n plus two. Other questions? Good.
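To make these counts concrete, here is a small sketch in Python (my own illustration, not the code from the slides; names like `two_sum_counts` are made up) that instruments the quadratic two-sum loop and tallies the operations we just discussed:

```python
def two_sum_counts(a):
    """Count operations in the quadratic two-sum loop.

    Returns the tallies from the lecture: equal-to compares
    1/2 n(n-1), array accesses n(n-1), and assignments n + 2
    (count once, i once, and one assignment of j per value of i).
    """
    n = len(a)
    ops = {"eq_compare": 0, "array_access": 0, "assignment": 0}
    ops["assignment"] += 1              # count = 0
    count = 0
    ops["assignment"] += 1              # i = 0
    for i in range(n):
        ops["assignment"] += 1          # j = i + 1
        for j in range(i + 1, n):
            ops["eq_compare"] += 1      # a[i] + a[j] == 0
            ops["array_access"] += 2    # a[i] and a[j]
            if a[i] + a[j] == 0:
                count += 1
    return ops

ops = two_sum_counts(list(range(1, 9)))      # n = 8
assert ops["eq_compare"] == 8 * 7 // 2       # 1/2 n(n-1) = 28
assert ops["array_access"] == 8 * 7          # n(n-1) = 56
assert ops["assignment"] == 8 + 2            # n + 2 = 10
```

The tallies are independent of the array values; only the input size n matters, which is exactly the point of the analysis.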
Okay. Now we talked about, and we saw examples of, how to compute the number of executions of the equal-to compare. We saw three ways to do it, and then the array access was easy, because it can be derived from the equal-to compare. Now we want to do one exercise together, and now we focus on the less-than compare operation. So the less-than compare, maybe you
will not see it kind of directly in the Python code, but you can see it in the Java
code. So what we mean with the less than compare is a kind of comparison. So that
we compare i to n, and we compare the j to n, and we want now to figure out how
many, or what is the frequency of the execution of this operation less than
compare. So here we know the result, but now we need to kind of come to this kind
of result. We do it together as a kind of in-class exercise, and we can apply what
we discussed a bit. So we have the code, we have the code in Java, and then again
we focus on this less than compare. So the i and the j, how many times we compare
them with n. And you have a kind of four minutes for this, and then the hint is try
to use the table. And the time starts now. Okay, try to work together a bit. Try to
talk to each other. That's fine. I want to hear that you are talking to each other.
Let's have a bit more kind of action in the lecture part. Yes, talk to each other.
Haven't we done action so far? So try to fix the i and then run through the j loop. This will be one row in the table; you note down the number of iterations, and then you take the sum of these iterations over every value of i. Okay, so who got it? Raise your
hand. Don't worry, I will not ask you to present the solution. Just I want to see
who got it. Who did not get it? Who got it? Nobody got it? Who did not get it?
Okay, let me know. I believe you, because otherwise... Okay, good. Let's see the
kind of solution together. So the idea is simple. So I kind of try to come up with
this table. We have our two indices, the i and the j. The i will go from 0 to n, because we are interested in the comparison. It's not the i itself, but the comparison: how many times do we do the comparison? We do it for i from 0 until n, because also with n, the last value, we do the comparison. That's why the i will go from 0 to n, and this is what we have in the first row. Now we fix the i. So i is 0; then j will go from i plus 1 until n. So i is 0, j will go from 1 until n. It goes from i plus 1 until n because we also do this compare with the last value, which is n. Okay. Then i is 1, and j will go from i plus 1 until n again, that means from 2 to n. And we go on like this. For the last rows: when i is n minus 1, j will start from i plus 1, which is n, so in that case we only do one compare. We compare j, which is n in that case, with n, and that's why we have n here. And then for i equal to n, we have 0. This is not, it should not be n, it should be 1. So this is not n, this should be 1. We have a typo
on the slide. Okay, so now we get the number of iterations. Now we focus on the number of iterations of j. In the first row, j will go from 1 to n, so this is n times. For the second one, from 2 to n, it's n minus 1 times, and we do the same exercise for each row. In the end it will be 1, and then it will be 0. Now, to get the number of less-than compares for j, we sum up these terms. And this is what we get: we sum up all these terms, and here again we have our arithmetic sum that we discussed before. And we know how to compute this arithmetic sum: the number of terms times the average of the first and the last one. If we apply it here: we go from 0 to n, so the number of terms is n plus 1. The average of the first and the last one is (0 plus n) divided by 2, which gives the result, n times (n plus 1) divided by 2. So it's basically the application of the arithmetic sum, not more, not less. And this way we get the number of iterations of j. But still we need the iterations of i, because also for i we have compares. The total number of compares related to i is n plus 1, because i will go from 0 to n, including n, so we have n plus 1 compares for the i. Now for the result, we sum up the two equations: the total number of iterations of j, which we got through the arithmetic sum, and the easier one, the total number of iterations of i, which is n plus 1; we get it immediately from the loop here. We sum up the two things, we do a factorization, and we get the result, which is one half times (n plus 1) times (n plus 2). Is it clear? I mean the idea is quite simple.
You just focus on the loops that you have. You start with the outer loop, you fix the variable that you have in the outer loop, and then you look at the inner loop. For each value of the outer variable you note down how many steps the inner loop needs, and then you sum up the things, and you are done. Okay? Yes?
Why? Okay, okay, okay. No, that should be n. No, that's correct, because it's the value being compared. The j in that case: because i is n minus 1, j is i plus 1. That's correct, that means the value of j is n, and this is correct. Thank you, Sean, and thank you to the student in this group who wrote it. Okay, the value is n, and then we have only one value of j. That's why we have this 1, and that's correct. So there is no typo on this slide.
Good. Other questions? No. Okay, let's move on. So now we saw how to compute the frequency of execution of the operations. It's a bit tedious, because we have to go through all the operations that we have in our program, try to figure out how many times these operations are executed, and then sum up. So there are one or two simplifications of the calculation. The first idea for how to simplify the calculation was given by Turing, and he did it in the context of matrix processing, or matrix multiplication. What he said is: we have a bunch of different operations that we can do, like additions, subtractions, multiplications, divisions, and so on. But there is no need to focus on all of them. We just focus on the most important ones, and basically the most important ones are the most expensive ones. In the case of matrix processing, he said, we don't need to bother about all the types of operations that we have; it is sufficient to count the number of multiplications and recordings, and we ignore all the other operations, because they don't have a lot of influence. The multiplications and recordings are, in the case of matrix processing, the operations which are the most expensive ones. That means we focus on the expensive ones, we ignore the cheaper ones, and then we get an estimation or approximation of our running time. And this is the idea behind what is called the cost model. And the idea is to
use some basic operation as a proxy of running time. That means I don't focus on
the running time related to all the operations that I have in my program. I focus
on one of them, on one basic operation which will give me a kind of approximation
of the running time, and then I should use a kind of the expensive kind of
operation.
So not, for example, the declaration: the declaration is cheap, we are not doing it frequently. And the question now is, let's focus on the two-sum example: which operation is the most expensive one? What do you think? We have different operations. Here we have the variable declaration, we have the increment, we have the array access, we have the equal-to comparison. So we have a bunch of different operations that we can focus on, but we want to take the most expensive one. And the most expensive one is the one which we execute most frequently. Yes? Yeah, the inner loop. Which one again? The inner loop. Exactly,
the inner loop. What's happening in the inner loop is the operation which will run most frequently. So it's not the declaration of the variables; we focus on the inner loop because there we have to go through the two nested for loops, and this operation will be the one which runs most frequently. That means, back to this cost model: we take one basic operation as a proxy of the running time, but we should use the most expensive one, because this will give us the approximation of the running time. And in most of the cases, if not in all of the cases, we should take the operation which is in the inner loop, because it will run very, very frequently compared to the other operations.
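To see why the inner-loop operation is the right proxy, we can tabulate the frequencies we derived for the two-sum code and compare them. This is my own sketch; the labels are mine, and the increment count is my own back-of-the-envelope figure, not from the slides:

```python
n = 1000
# frequencies derived in the lecture for the two-sum code
frequencies = {
    "variable declaration": n + 2,
    "assignment":           n + 2,
    "less-than compare":    (n + 1) * (n + 2) // 2,
    "equal-to compare":     n * (n - 1) // 2,
    "array access":         n * (n - 1),
    # my own estimate: j is incremented n(n-1)/2 times, i is incremented n times
    "increment":            n * (n - 1) // 2 + n,
}

# the operation in the inner loop dominates everything outside it
most_frequent = max(frequencies, key=frequencies.get)
assert most_frequent == "array access"
assert frequencies["array access"] > 100 * frequencies["assignment"]
```

For n = 1000 the array access happens almost a million times while a declaration happens about a thousand times, which is why the cheap operations can be ignored.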
That means in this case, if we want to use this cost model to get a proxy for the running time, we don't take the declaration (it happens only n plus 2 times), but we take one which is in the inner loop. And this is the idea behind the cost model. So basically, these
are the steps that you need to kind of do to develop a kind of mathematical model
of running time. So the first step is to define the input model; the input model is the data that you want to work on. Then you identify the inner loop, the one which is inside the nested loops, if you have nested loops. Then we define the cost model, that is, we pick one basic operation which will act as the proxy for the running time. We pick one which is in the inner loop, because it will be the most expensive one, because it runs most frequently. And then we determine the frequency of execution of that operation, the expensive one. And this will give us an approximation of the running time on our input model. So the short version: we don't focus on all
the operations. We pick one which is the most expensive one. We try to get the
running time based on that kind of operation. And this will give us kind of the
the approximation of the total running time on our input model. Clear? Okay. If we try to apply this cost model: we said we use the most expensive operation as the proxy, and it should be something which is in the inner loop. In the inner loop we have two things: we have the equal-to compare and we have the array access. So we should pick one of them, and in this case we take the array access, because it is executed more frequently: we do two array accesses for each equal-to compare. That means as a cost model we use the array access as the operation which will act as the proxy for the total running time of the program. And we did this already, so we know that the frequency of execution of the array access is n times (n minus 1). This is the one that we take as the proxy, and we take this result as the running time of the program in this case. This was one simplification. So the cost model, again: one expensive
operation as the proxy for the running time. There is also another simplification
which is the tilde notation. And the idea behind the tilde notation is also simple: we are trying to get approximations. So we don't want to get the exact result; we should be satisfied with an approximation of the running time.
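The idea that an approximation is good enough can be checked numerically. Here is a tiny sketch (my own, using the running-time expression one sixth n cubed plus 20 n that appears on the slides) showing that the lower-order term stops mattering as n grows:

```python
def f(n):
    # full running-time expression: leading cubic term plus a lower-order term
    return n**3 / 6 + 20 * n

def g(n):
    # tilde approximation: keep only the highest-order term
    return n**3 / 6

# f(n)/g(n) = 1 + 120/n**2, so the ratio approaches 1 as n grows,
# which is exactly the definition of f(n) ~ g(n)
ratios = {n: f(n) / g(n) for n in (10, 100, 1000)}
assert ratios[10] > ratios[100] > ratios[1000] > 1
assert abs(ratios[1000] - 1) < 1e-3
```

At n = 10 the two expressions differ by more than a factor of two, but at n = 1000 the ratio is already within about 0.01 percent of 1.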
Let's say we do some computation and we find results like these for the running time: one sixth n cubed plus 20 n, and some other ones. As you can see, all of them are essentially the same: if we focus on the highest-order term, all of them are cubic. The difference is in the lower-order terms. Here we have 20 n plus a constant, here a term with n to the power of four thirds, here n squared and n, and so on. And the idea behind the tilde notation is that we ignore the lower-order terms. We focus only on the highest-order term, which in this case is the cubic one, and we ignore all the other ones. Why is this allowed? Because if we have a large n, whatever comes after the highest-order term we can neglect, and this only holds with a large n. That means these equations might be different for a small n, but with increasing n they will converge. And that's why we can get rid of what are called the lower-order terms. So in this case we write tilde one sixth n cubed: we take only the first term, the highest-order term, and we ignore the other ones, the lower-order terms. We can see this in the example: if we plot the curve related to the full expression (in this case it was the last one) and we plot the curve where we keep only n cubed, then you see that at some point they converge; with increasing n they converge. And this is the idea behind the tilde notation: to simplify the
calculation a bit, I mean we don't try to get kind of the full thing. We just focus
on the highest kind of order terms, and we ignore all the other ones. In this case,
all of these kind of equations, all of them are kind of tilde notation, 1 over 6 n
kind of cubic, because again, with increasing kind of n, all of them will be the
same. And this way, if you use the tilde notation, you get an approximation of the running time. It will not be the exact time, but for n big enough we get a good approximation of the running time of the algorithm, and this is the idea behind the tilde notation. The technical definition: we say that f(n) is tilde g(n), by which we mean that the limit, as n goes to infinity, of the ratio f(n) divided by g(n) is equal to 1. Because they converge, they will be the same at some point; they will be essentially the same curve in that case. And this is the tilde notation, and this is what we'll use in our course. We will use the tilde notation to approximate things: we will not use the full expression, we use the tilde notation. That means we focus only on the highest-order term, and we ignore all the other ones, to talk about the
running time of the program. These are the two simplifications. We talked about the first one, the cost model: we use one expensive operation as a proxy for the running time. And the second one is the tilde notation: we ignore the lower-order terms and focus only on the highest one. We use these two simplifications to approximate the running time. Here is the notation of these running times using the tilde notation. Here it's n plus 2, so we say it's tilde n: we just keep the highest-order term. For this one we don't need the complete expression, so it is just tilde one half n squared: we keep the leading coefficient and focus on the highest-order term, which in this case is n squared. The same for this one. For this one it is again n squared, but without the one half, because there is no one half. And for this one, we will have something between one half n squared and n squared. So we focus only on the highest-order term, and this way we have a simplified notation to express the running time of the program. If we apply this to the two-sum, I think we saw it already: we did our calculation, we found n times (n minus 1), we focus here on the array access, and then we say it's tilde n squared. We take the leading term, and we say that this program makes tilde n squared array accesses. And this is what we use in this course: we use the cost model, and we use the tilde notation to simplify the calculations. Clear? Questions? Good. Let's move
on. Let's see it based on the three-sum example, the example that we started with. If you don't mind, we will maybe take five or ten more minutes because we started a bit late; if you allow it, I will stop at five to twelve. Okay. So if we want to apply this on our three-sum example: this is the three-sum code. We have three loops, with the indices i, j, and k. The i will go from 0 to n, j will go from i plus 1 to n, and k will go from j plus 1 to n. So three nested loops. And now we want to find the running time of this program. We said we focus on one operation, the one in the inner loop, and that's why we focus on the array access, because the array access is the most expensive one. Now, how to compute the running time for this code? We can use a combinatorial approach. In the three-sum, we want to find how many triples sum up to exactly zero, and this is equivalent to finding all possible three-element combinations from a set of n elements: I take three elements, I enumerate all such triples, and I test whether these three elements sum up to zero. If we take the binomial coefficient, n choose 3, and apply the formula that we saw before, we get something tilde one sixth n cubed. So this is the tilde notation for the equal-to compare operation in this example: one sixth n cubed is how many times we do the equal-to compare. But for the array access, because we need to access the array three times, for the i, for the j, and for the k, we multiply by three, and then we get one half n cubed. And this is the tilde notation that we give as the value for the number of array accesses in this program: it's one half n cubed. So we have our cost model, which is the array access, the operation which is executed most frequently; we use the tilde notation by focusing on the highest-order term, and then we get our result. We will
do this quick last exercise, and then we'll stop there, I promise. We have a small code snippet. Again, we have three nested for loops: we have the i, which will go from 0 to n; we have the j, which will go from i plus 1 to n; and we have the k, which will go from 1 to n. But be careful here: we don't increment k by 1. We double the k. That means it will be 1, then 2, then 4, until we reach n. So we do not increment by 1; we double the k from one step to the next. And now we want to compute the running time based on the array access. Again, the array access is kept as the cost model, because it is in the inner loop and it will be the most expensive operation. The array accesses are these ones here: we access the array three times, to get the item at i, the item at j, and the item at k. These are the four answer options, and you have three minutes to tell how many array accesses this code fragment will make as a function of n. Try to use the information that we had before, and use the tilde notation; it should not be the exact count. We saw, for example, with the two-sum, that when we have two nested loops, the inner body runs tilde one half n squared times. So these two outer loops give tilde one half n squared. We just need to see what's happening in this k loop: how many times will we run this k loop? From the two outer loops, here we go from 0 to n, here from i plus 1 to n, and we know from the two-sum that this is tilde one half n squared. We just need to focus on this one, and this will make the difference. Two more minutes. Okay. Okay.
Okay. Okay. So who got it? Who did not get it? Okay. Good. Maybe first you can tell
what might be kind of the right answer, A, B, C, or D. What? So the majority is for
B, and it is correct. So let's see the solution together, and then we'll stop
there. So we said: we know from the two-sum and the three-sum that the number of times the k loop starts executing is tilde one half n squared. This is based on the two outer nested loops: from the two-sum we know the number of (i, j) pairs is tilde one half n squared. Now, if we fix the i and the j, the k loop will require log n iterations; we want to know how many times the loop-condition compare executes. For this one we need log n, because the k will go from 1 to n, but we double the k at each step. And if we double the k, the number of times this k loop executes will be log n. Does it make sense? Did you get why it's log n? We start from 1 and we double until we reach n. Let's assume n is 16. We start from 1; the values that k could take on are 1, 2, 4, 8, and 16. It will not come to 16, because k should stay strictly less than 16, so it will be 1, 2, 4, and 8. That means it will take on four possible values, and this is the log of 16: the log of 16 is 4. That's why this k loop will execute log n times. Now we just multiply the two things. Each execution of the k loop requires log n, and we know that we do this tilde one half n squared times. If we multiply the two, we get the result: this is how many times the comparison in the k loop will execute. So again: what happens in the outer two loops is tilde one half n squared, each k loop executes log n times, so the running time of this comparison is tilde one half n squared times log n. But we are interested in the array access, and we know that for every such compare we have three array accesses. That's why we multiply everything by three, and this is the final result: it's tilde three halves n squared log n. This is the end result. Is it clear? You can also do it
with a little bit of math, but this is only if you want to make it more
complicated. But the first one will do the job. We will stop at this point. Thank you for joining us. Next time we will start with the problem of arrays.
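As a closing check, the two results from today, the three-sum count and the doubling-k exercise, can be verified by direct simulation. This is my own sketch, not lecture code; the function names are made up:

```python
import math

def three_sum_accesses(n):
    """Count array accesses in the triply nested three-sum loops."""
    accesses = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                accesses += 3          # a[i], a[j], a[k]
    return accesses

def doubling_k_accesses(n):
    """Count array accesses when the innermost loop doubles k (k = 1, 2, 4, ...)."""
    accesses = 0
    for i in range(n):
        for j in range(i + 1, n):
            k = 1
            while k < n:               # k takes the values 1, 2, 4, ..., < n
                accesses += 3          # a[i], a[j], a[k]
                k *= 2
    return accesses

n = 16
# three-sum: exactly 3 * C(n, 3) accesses, tilde ~ 1/2 n^3
assert three_sum_accesses(n) == 3 * math.comb(n, 3)
# doubling k: for n a power of two this is exactly 3 * (n(n-1)/2) * log2(n),
# tilde ~ 3/2 n^2 log2(n)
assert doubling_k_accesses(n) == 3 * (n * (n - 1) // 2) * int(math.log2(n))
```

For n that is not a power of two, the doubling-k count uses the number of k values below n, so the tilde formula is only the leading-term approximation there.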
