The document discusses the analysis of algorithms, focusing on the scientific method and mathematical models for measuring running time. It explains the importance of input size, execution frequency, and operation costs in determining algorithm efficiency, along with the challenges of obtaining precise measurements. Additionally, it highlights the differences between constant and linear time operations, using examples to illustrate the concepts.
Lecture
... Yes, no? I can see you. So it works, yeah? Okay, good.
Okay, so we'll continue
today with the analysis of algorithms that we started last week. So again, you have here the QR code for the Q&A session, so please feel free to use it to ask questions, and please try to come up with good questions because we will moderate the session and can only allow meaningful questions. So last week we started with the analysis of algorithms, we tried to see how to analyze algorithms, and we talked about different ways to do it. We talked about the scientific method. What was the idea behind the scientific method? Basically we ran experiments: we increased the input size, so the N, the number of input data elements, we measured the time, we documented the results, and then we plotted the data. We talked about how to plot the data: we can use the standard plot, and we can also use the log plot. That means we use the log of the input size on the x-axis and the log of the time on the y-axis. If we do it like this, in many cases we get a line, and then we can do a bit of math, a regression, to find what this line is about. So this is one way to do it. And then we had a quicker way to do this: the doubling hypothesis. The idea behind the doubling hypothesis was to double the input size from one run to the next and look at the ratio of the running times. We also talked about what determines the running time in the power law T(N) = a * N^b. There are the system independent effects, so the algorithm itself and the data that we use as input. These affect the b, so the exponent that we have in this equation. And then we have the system dependent effects, like the hardware, the software, and the system, and these together with the system independent effects determine the constant, this a in the formula over here. So that means that the b only depends on the algorithm and the input data.
And the a, this constant, depends on everything. And then there are good and bad sides. The bad side of the scientific method is that it is not accurate all the time, because when we conduct experiments many things can go wrong, which leads to measurements and results that are not precise. The good news, if we use the scientific method to measure the running time of algorithms, is that it is easier and cheaper to do than in other sciences, because we are not running experiments on humans or animals. And that's why in many cases the scientific method gives good results. This is where we stopped last time, and we will continue from this slide. We will start with an in-class exercise as a warm up. It's related to the doubling hypothesis way of doing the scientific method. This is the data that we collected: we ran an experiment, we have the N, so the input size, and then we measured the time. We tried to apply the doubling hypothesis, so we doubled the input size from one experiment to the next. The question now is which of the following functions best models the running time T(N). We provide four options. As a hint, we show again this power function. And as we said, if we double the input size and take the ratio, so the time needed to process 2N divided by the time needed to process N, we get 2 to the power of b. The question now is, based on the collected data, try to determine the b, which will be the exponent here, and try to determine the a, which will be the constant.
And then tell which of these options best models the running time based on the data that we have. You have three minutes, and again, try to work in pairs. I don't want to see people working alone; if you have someone sitting next to you, please talk to them. And time is starting now. [pause] Time is up. Do you see the screen now in this group? I thought there were some issues with the screen share. Can you confirm that it's working now? Okay, I did not hear anything, so I assume that it works. So the technology is working, and this is the right solution. I think most of you got it: C is the right answer. And the solution is quite simple. We have the doubling hypothesis, and we focus on the ratio T(2N) divided by T(N). If we do it for this one, for example, we get something close to 4. And if we also do it with the last two, because it's always good to do it with the larger values of N, this ratio is again 4. And we said this ratio corresponds to 2 to the power of b. So if 2 to the power of b is 4, it means that b is 2. That means we got the b: it's quadratic, b is 2. And once we have the b, we can plug in the value of b and solve for a, and this is the result that we get. So basically, what you need to do is try to get the log of the ratio.
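The recipe just described can be sketched in a few lines. This is only an illustration: the measurements below are made-up numbers that happen to follow a clean quadratic, not the figures from the slide.

```python
import math

# Hypothetical doubling-hypothesis measurements (n, seconds); each run
# doubles the input size. These numbers are invented for illustration.
runs = [(1000, 0.1), (2000, 0.4), (4000, 1.6), (8000, 6.4)]

def fit_power_law(runs):
    """Estimate a and b in T(n) = a * n**b from the last two runs."""
    (n1, t1), (n2, t2) = runs[-2], runs[-1]
    b = math.log2(t2 / t1)   # the log of the ratio T(2n)/T(n) gives b
    a = t2 / n2 ** b         # plug b back in and solve for a
    return a, b

a, b = fit_power_law(runs)   # ratio is 4, so b comes out as 2: quadratic
```

Using the last two runs, as in the lecture, is deliberate: the power-law behavior is cleanest at the larger values of N.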
So first the ratio, and then the log of the ratio. The log of the ratio will be the b. And once you have the b, you plug the value of b into the formula, solve for a, and you get your equation. Questions related to this? I think it should be straightforward as well. Good. So this was about the scientific method. As I said, there are some issues with the scientific method: it does not always give accurate results, and we also don't know exactly what is going on inside the machine while the program runs. That's why there is another, maybe more reliable method to compute the running time. Again it's based on math, and we call it the mathematical model; it's the mathematical approach to counting the running time. The idea behind it is simple. It was proposed by Knuth, a famous researcher who did a lot related to the analysis of algorithms at the end of the 60s. And what he said is quite simple: if we have a program, we have some operations, and to calculate the total running time we find the frequency of execution of each operation, we know the cost of each operation, we multiply the cost by the frequency of execution, and then we sum over all the operations that we have in the program. These are some examples of operations that we can have in a program: array access, so if we are storing the data in an array and we access the array; adding integers; comparing integers; incrementing a variable; assigning a variable. These are examples of operations that we can have in a program. And for each of these operations there is a frequency.
So the frequency of execution of these operations: for example, how many times we do an integer add. And on the other side we have a constant, the cost of executing that operation, and for each operation we can have a different constant. For example, for the array access we have a constant: the time which is needed to run that operation. We multiply the cost of one operation by the frequency of execution of that operation, we do it for all the operations, and this gives us the total running time. So this is the main idea behind this mathematical model for running time: we take the operations in the program, we count the frequency of execution of these operations, we know the cost of each operation, we multiply, we sum up, and then we get our running time. The frequency of the operations depends on the algorithm and the input: how many times we do one specific operation depends only on the algorithm and the input size. And the cost of running each operation depends on the machine and the compiler: how much time one operation needs to execute depends only on the machine and the compiler. The total running time, again, is the sum over all operations of the cost times the frequency of execution. And the good thing is that we can compute these things, and then we get an accurate result, compared to the scientific method where, depending on the experiment, we might get inaccurate results. So this is the main idea.
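As a toy illustration of the cost-times-frequency recipe, here is a sketch. Both the per-operation costs (in nanoseconds) and the frequencies below are invented, purely to show the shape of the computation.

```python
# Knuth's recipe: total running time = sum over all operations of
# (cost of the operation) * (frequency of execution).
# All numbers here are hypothetical, for illustration only.
cost = {"array access": 1.0, "integer add": 0.5, "less than compare": 0.5}
frequency = {"array access": 1000, "integer add": 500, "less than compare": 1001}

total_ns = sum(cost[op] * frequency[op] for op in cost)
```

The split mirrors the lecture: the `cost` table belongs to the machine and compiler, while the `frequency` table belongs to the algorithm and its input.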
The question now is, in this equation we have two things: first the cost of the operations, and then the frequency of execution of these operations. Let's start with the cost of the basic operations. You can see here some examples: we add integers, we multiply, we divide, floating point add, and if you want more complicated ones, the sine and the arctangent. These are basic operations, and we can perform all of them in constant time. The exact time depends on the machine that you have, but for all of them it is constant. Here you can see some examples of the time needed to execute these operations; they are all in the range of nanoseconds, and the time is constant for these basic, or primitive, operations. But then we have to be careful with some operations which sound primitive, but are not. One case is array allocation: if we have an array with some number of items and we want to allocate it, this is linear, not constant anymore, because we need to touch each item in the array. The other operation where you need to be very careful is the concatenation of strings. Many students make a kind of novice mistake and use string concatenation a lot, because it's easy: in Python, for example, there is the plus operator, and you can use it to concatenate two strings.
It sounds like a primitive, basic operation, and we know that most primitive operations take constant running time. But this is not the case: string concatenation is linear. Its cost grows with the size of the strings involved. So here you have to be careful with the usage of string concatenation, because it does not have a constant running time; it has a linear, more expensive running time. All the other ones take only constant time. Good. So this was the first part, the cost of the primitive operations. We said in most cases it is constant, with a few exceptions where it is linear. And now the other part of the equation: the frequency of execution of these operations. We said this depends on the program that we have. Let's see a first, basic example: the one sum. Last time we talked about the three sum; remember the task there was to find all the triples in the input data which sum up to zero. One sum is a special case of the three sum: here we count how many zeros we have in our input data. So we are given some integers, for example, and we want to count how many zeros are in the input. This is a small program that does it: we have a count, we initialize it with zero, and we have a for loop. In this case we have only one variable, the i, and the i will be in the range zero to n. That means we start from zero and end at n minus one, because this is the index: the size goes up to n, but the index goes from zero to n minus one.
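Coming back to the string-concatenation warning for a moment, a small sketch shows the two ways to build a string. Repeated `+` copies the accumulated string on every step, so in general building a string of total length n this way costs on the order of n squared character copies, while `''.join` does one linear pass.

```python
def build_with_plus(parts):
    """Build a string with repeated +; each step copies the whole prefix."""
    s = ""
    for p in parts:
        s = s + p          # copies len(s) + len(p) characters each time
    return s

def build_with_join(parts):
    """Build the same string in one linear pass over all characters."""
    return "".join(parts)

parts = ["ab"] * 1000
assert build_with_plus(parts) == build_with_join(parts)
```

Both produce the same result; only the cost model differs, which is exactly the point of the warning above.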
And this you can see it better maybe in Java for the ones who are more familiar with Java. So we start from zero, we do the test whether it is smaller than n or not, and then we increment with one. And then we go through all these kind of items, and then we do one kind of test. If a i is zero, so then we increment the count of it. It's kind of very basic kind of code. And this way we want to find how many zeros we have in our input data. So basically we go through all the items one by one, and then if it's zero, we increment the count. It's not more than this. And we have other operations. And we want now to figure out what is the frequency of the execution of these operations. So we have variable declaration. So we have two of them. In this case we have two variables. We have the count and we have the i. So these are the two variables that we declare. We have the assignment statement. Also we have two. So here we assign a kind of a count, so zero to count, and i also we start from zero, so we have a kind of two of them. We have the less than compare, which is this one. You can see it more in the Java code, so here it's a bit hidden. But here we have the less than compare. So that means you are doing kind of this test whether i, the index i, is less than n or not. And here we have a kind of n plus one. So from zero to n, and we repeat a kind of compare from zero to n. That's why we have n plus one operations related to this less than compare. And then we have equal to compare operation. This is this one. So where we test whether kind of the item that we have at index i is equal to zero or not. And this we have a kind of n times because we go through from this loop from zero until n minus one. And then we test whether it is equal to zero or not. That's why the frequency of the execution of this operation is n. And then we have the array access. So how many times we access the array? And a is the array, and here where we are doing the array access. 
And here again, we access the array n times. Then we have the increment. Here we are doing two increments: we increment the i, and this we do n times; and we increment the count, and this happens between zero and n times, depending on how many items are equal to zero. That's why the frequency of execution of the increment can go from n, for the i alone, up to 2n, depending on whether we increment the count or not. So this is a first, simple example: we have the operations, and based on the code we compute the frequency of execution of these operations. For the one sum it's straightforward; there is not that much we have to do. But how is it now with two sum? Yes? You mean this one, right? The question is, and now I learned that I have to repeat the question, the question is why it is 2n. Here we are talking about the increment: we increment the i and we increment the count. Let's assume that all the items are equal to zero. For the i, we increment n times, and if this test is always fulfilled, that means all the items are equal to zero, then we increment the count also n times, and that's why it can go up to 2n. In the other direction, if we don't have any zeros in the array, then we increment the i n times, but the count will not be incremented because the test is not fulfilled, and this gives the lower bound of the frequency, the n. And in the case where all the items are zeros, the count will be incremented n times; that's why we have the 2n. Other questions? No. Okay, good. Let's move on with the two sum. The two sum is the other variant of the three sum, and the idea here is to count how many pairs we have in our data which sum up to zero.
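As a quick recap before two sum, the one-sum routine can be sketched in Python like this. The frequency counts in the comments mirror the Java-style analysis from the lecture (in Python's `range` loop the bookkeeping is hidden, but the counting argument is the same).

```python
def one_sum(a):
    """Count how many items of a are equal to zero (the 1-sum problem)."""
    count = 0                  # assignment: done once
    n = len(a)
    for i in range(n):         # i < n compare: n + 1 times; i increment: n times
        if a[i] == 0:          # array access and equal-to compare: n times each
            count += 1         # count increment: between 0 and n times
    return count
```

For example, `one_sum([0, 1, 0, 2, 0])` counts three zeros.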
In three sum, how many triples sum up to zero; in one sum, how many zeros we have in our data; and in two sum, how many pairs in the data sum up to zero. This is the code related to it. Here we have a nested loop: we have the i and we have the j. The i again goes from zero to n; the j goes from i plus one to n. That means we start from the beginning, and for each item we pair it with each of the later items, and we test whether these two items sum up to zero; this way we cover all the pairs via these two for loops. And if the test is fulfilled, that is, if we find two items which sum up to zero, we increment the count. The only difference from one sum: in one sum we had only one loop, we went through the list and tested whether each item is zero or not. Here we have two loops, so that we can capture all the possible combinations, and we test whether the two items sum up to zero or not. If this is the case, we increment the count by one. As you can see, it's still not very complicated, it's just the two sum, but computing the frequency of the operations is getting a bit more involved. Here we find that the variable declaration is n plus two, the assignment n plus two, because we also need to go through the second loop, and then we have the other results for the frequency of execution of the different operations. Don't worry, you will see these computations step by step. What I just want to stress at this point is that it's not easy to compute the frequency of execution of the operations.
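The nested-loop code being described can be written in Python as follows; this is a sketch matching the structure from the slide.

```python
def two_sum(a):
    """Count pairs i < j with a[i] + a[j] == 0 (the 2-sum problem)."""
    n = len(a)
    count = 0
    for i in range(n):               # outer loop: fix the first item
        for j in range(i + 1, n):    # inner loop: all later items
            if a[i] + a[j] == 0:     # equal-to compare, with two array accesses
                count += 1           # increment when the pair sums to zero
    return count
```

Starting j at i plus one is what guarantees each pair is tested exactly once, which is why the pair count matches the combinatorial argument that follows.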
So, for example, for the equal to compare, which is this one, so how many times we do the test whether the two items sum up to zero: we do it one half times n times (n minus one) times, and we will now see how to get there. Basically there are three ways, maybe with increasing level of complexity, to compute this. We are interested in the frequency of execution of this operation, the equal to compare: how many times this operation will be executed. The first method uses combinatorics. The idea behind two sum, again, was how many pairs sum up to exactly zero, and testing every pair is equivalent to enumerating the possible two-element combinations of a set of n elements. We have n elements, our input size, and we want to know how many pairs, or two-element combinations, there are. From combinatorics we know this is the binomial coefficient, n choose two. In the general case, the binomial coefficient n choose k counts the ways to choose k elements from n. And we know the formula for n choose k: n factorial in the numerator, and in the denominator k factorial times (n minus k) factorial. Now back to our example: we want the pairs, so k is equal to two, and we want the number of two-element combinations of a set of n elements, so n choose two. We apply the formula with k equal to two and get n factorial divided by (two factorial times (n minus two) factorial), and this simplifies to the result that we get.
So this is one easy way to compute how many times we do the equal to compare, and it works because it reflects what we are doing: we have the two loops, we fix one index, we go through the other loop, so we pick two items and test whether they sum up to zero. That is exactly n choose two, and this is the result: the frequency of execution of the equal to compare is one half times n times (n minus one). This is the first way to do it. There is another way: to use a table. Here we have our indices, the i and the j, and we count the number of iterations of j. We go through the loops, starting with the first loop, the i. The i goes from zero to n minus one, because, as we said, here we focus on the index. This is the first loop. Whenever we fix i, we focus on the j: in the second loop, j goes from i plus one to n. That means if i is zero, j goes from one to n minus one. Then we increment i, so i is now one, and again j goes from i plus one, in this case from two to n minus one. And we do this for all the values of i, which go from zero to n minus one. Let's see maybe the last two cases. When i is n minus two, j starts from i plus one, so it takes only the value n minus one. And when i is n minus one, j would start at n; but we test that j should be strictly less than n, so the number of iterations of j is zero in this case. So this is the table: basically try to follow what's happening in the code.
We fix the index of one loop, see how the second variable behaves, and count the number of iterations. In the first row, j goes from one to n minus one, so that is n minus one iterations. In the next row, from two to n minus one, the possible values that j can take are n minus two, and so on, until it is one in the second to last row and zero in the last one. Now we have the iterations of j, and we know that each iteration of j corresponds to one equal to comparison: whenever we have a value of j, we have one compare. That's why the number of equal compares is the sum of these values. We take the sum of the total iterations of j: n minus one, plus n minus two, and so on down to zero, and we get this result. Here we just read the column from the bottom to the top, but it's the same sum: we sum up all the iterations of j, and this corresponds to the frequency of execution of the equal to compare. Probably you know this sum: it's the arithmetic sum, or the arithmetic series. In an arithmetic series we start with a number and increase by a constant step; in this case we start from zero and increase by one, so this is one example of an arithmetic sum. And we know how to compute the sum of an arithmetic series: the sum is the number of terms times the average of the first and last terms. So you find the number of terms that you have.
Then you take the average of the first and the last term, and you get your sum. If we apply it to our case: what is the number of terms? We start from zero and end at n minus one, so the number of terms is n. And the average of the first and the last term: the first term is zero, the last one is n minus one, so the average is (n minus one) divided by two. And this is the same formula that we got in the previous one. Is it clear? Yes? Is there a question? No? Okay, good. Let's move on. So this was the second option: build the table, find the number of iterations, and apply some math to get the sum of the arithmetic series. This is the third method. It is a bit more mathematical, but it reflects what's happening in this program. We have our two variables, the i and the j, and we have two sums. For the i, it goes from zero to n minus one; this is the first loop. And inside that loop we have the second one, for the j: j goes from i plus one to n minus one, again because we are talking about indices. And then we have the one: this is the operation which happens inside these two loops, in our case the equal to compare operation. Now we try to solve this sum, starting from the inside. What is the sum over j from i plus one to n minus one of one? That is how many times we repeat the one: it is the last term minus the first one plus one, so (n minus one) minus (i plus one), and then we add one.
As you know, if you want the number of integers between one and twelve, it is twelve; if you want it between, say, five and fifteen, it is fifteen minus five plus one, which is eleven. This is what we are applying here: the last term minus the first one plus one, and this is what we get. So now we have the sum over i of (n minus i minus one); this is this one. And now we want this sum for i from zero to n minus one, so we plug in the values of i. If we start from zero, the first term is n minus one. For the second one, i is one, so n minus one minus one is n minus two. And for the last one, i is n minus one, so n minus (n minus one) minus one, which is zero. This is the sum that we get, and this is again the arithmetic sum that we did before. We know that the result in this case is n times (n minus one) over two: again, we take the number of terms, in this case n, and the average of the first and last terms. Is it clear? So you now have three ways to compute the frequency of execution of this operation, the equal to compare, in the case of two sum. These are the options; feel free to use the one that you like. There is not one option which will fit all the time: sometimes you need to use the table, sometimes a mathematical formula, it will depend on the case. These are three possible options that you can use to find these frequencies. Okay, so we did this now for the equal to compare: we found that the frequency of execution of the equal to compare is this one.
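The three derivations can also be cross-checked mechanically: count the inner-loop iterations by brute force and compare against n choose 2 and the closed form. This is a small sanity check added here, not part of the lecture material.

```python
import math

def equal_compares(n):
    """Count how often the inner test of 2-sum runs for input size n."""
    compares = 0
    for i in range(n):
        for j in range(i + 1, n):
            compares += 1        # one equal-to compare per inner iteration
    return compares

# Combinatorics, the table/series argument, and the double sum all agree:
for n in (0, 1, 2, 5, 10):
    assert equal_compares(n) == math.comb(n, 2) == n * (n - 1) // 2
```

All three expressions count the same thing, which is the whole point of offering three derivations.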
So how many times will we execute this operation? One half times n times (n minus one). Now, if we focus on the array access, that means the frequency of execution of the operation array access: we see in the code that we need two array accesses, to access the item at position i and the item at position j. And this relates to the number of executions of the equal to compare: for each equal to compare operation, we need two array accesses. That's why the number of executions of array access is twice the number of executions of equal to compare. We know the result for the equal to compare, we multiply by two, because again, for each equal to compare operation we have two array accesses, and we get n times (n minus one), which is this value. Okay. One question? The assignment statement? So the assignment statements are related to the count, the i, and the j. The count we have already. The i is assigned only once in the beginning, but for each value of i we need to assign the variable j, starting from i plus one. That's why this n is related to the assignments of j: one assignment of j per value of i, so n in total, plus two, which are the assignment of i and the assignment of count.
So again, the n is for the assignments of j, and the plus two is for the assignment of i, which happens only once (i equals zero), and the assignment count equals zero. That is the n plus two. Other questions? Good. Okay. We saw examples of how to compute the number of executions of the equal-to compare — three ways to do it — and the array access was easy because it can be derived from the equal-to compare. Now we want to do one exercise together, focusing on the less-than compare operation. You may not see it directly in the Python code, but you can see it in the Java code. What we mean by the less-than compare is the loop comparison: we compare i to n and we compare j to n, and we want to figure out the frequency of execution of this less-than compare. Here we know the result, but now we need to derive it. We'll do it together as an in-class exercise, applying what we just discussed. We have the code in Java, and we focus on this less-than compare: the i and the j, how many times do we compare them with n? You have four minutes, and the hint is: try to use the table. The time starts now. Okay, try to work together a bit. Try to talk to each other, that's fine — I want to hear that you are talking to each other. Let's have a bit more action in the lecture. So try fixing the i and then running through the values of j. That will be one row in the table, and then you count the iterations and sum them over every value of i and j.
Okay, so who got it? Raise your hand. Don't worry, I will not ask you to present the solution; I just want to see who got it. Who did not get it? Who got it? Nobody got it? Okay, I believe you. Good, let's see the solution together. The idea is simple: we build the table with our two indices, i and j. The i will go from 0 to n, because we are interested in the comparison — not in i itself, but in how many times we do the comparison. We do it for i from 0 until n, because we also do the compare with the last value, n. That is the first row. Now we fix i. For i equal to 0, j goes from i plus 1 until n, so from 1 until n — again including n, because we also do the compare with that last value. Then for i equal to 1, j goes from i plus 1 until n again, that means from 2 to n. And we go on like this. For the last rows: when i is n minus 1, j starts from i plus 1, which is n, so we do only one compare, with j equal to n — that's why we have n here. And then for i equal to n, we have 0. Actually, this should not be n, it should be 1 — we have a typo on the slide. Okay, so now we count the number of iterations of j. In the first row, j goes from 1 to n, so that's n times. In the second, from 2 to n, that's n minus 1 times, and we do the same all the way down.
In the end it will be 1, and then 0. Now we get the less-than compares for j by summing up these terms. And here again we have our arithmetic sum that we discussed before, and we know how to compute it: the number of terms times the average of the first and the last one. The terms here go from 0 to n, so we have n plus 1 terms, and the average of the first and last is (0 plus n) over 2, which gives n times (n plus 1) over 2 — this is the number of iterations of j. But we still need the compares for i, because for i we also have compares: i goes from 0 to n, including n, so we have n plus 1 compares for i. For the result, we sum up the two: the total number of compares for j, which we got through the arithmetic sum, and the total for i, which is n plus 1, read off immediately from the loop. We sum the two, factor out (n plus 1), and get one half times (n plus 1) times (n plus 2). Is it clear? The idea is quite simple. You focus on the loops you have. You start with the outer loop, fix the variable of the outer loop, and then look at the inner loop for each fixed value.
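If you want to double-check the table result, here is a small simulation — my sketch, not the slide's code — that counts every i < n and j < n test exactly the way a Java for loop performs them, including the final failing test of each loop; the total should match ½(n+1)(n+2):

```python
# Sketch: simulate the two nested Java for loops
#   for (int i = 0; i < n; i++)
#     for (int j = i+1; j < n; j++)
# and count every less-than compare, including the ones that fail.
def count_less_than_compares(n):
    compares = 0
    i = 0
    while True:
        compares += 1            # the i < n test
        if not i < n:
            break
        j = i + 1
        while True:
            compares += 1        # the j < n test
            if not j < n:
                break
            j += 1
        i += 1
    return compares

# Closed form from the lecture: (n + 1) * (n + 2) / 2
print(count_less_than_compares(10))  # 66
print(11 * 12 // 2)                  # 66
```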
You note down how many steps you need, then you sum things up, and you are done. Okay? Yes? Why? Okay — no, that should be n. No, that entry is correct, because it is the value being compared. The j in that case: i is n minus 1, so j is i plus 1, which means the value of j is n, and that is correct. Thank you, and thank you to the student in this group who wrote it. So the entry is n, and we have only one value of j there — that is why there is a single compare, and it's correct. So there is no typo on this slide after all. Good. Other questions? No. Okay, let's move on. So we saw how to compute the frequency of execution of the operations. It is a bit tedious, because we have to go through all the operations in our program, figure out how many times each one is executed, and then sum up. So there are one or two simplifications of the calculation. The first idea of how to simplify was given by Turing, and he did it in the context of matrix processing, or matrix multiplication. What he said is: we have a bunch of different operations — additions, subtractions, multiplications, divisions, and so on — but there is no need to focus on all of them. We just focus on the most important ones, and the most important ones are the most expensive ones. In the case of matrix processing, he said, we don't need to bother about all the types of operations we have; it suffices to count the number of multiplications and recordings, and we ignore all the other operations, because they don't have a lot of influence.
So multiplications and recordings, in the case of matrix processing, are the most expensive operations. That means we focus on the expensive ones, ignore the cheaper ones, and get an estimation or approximation of our running time. This is the idea behind what is called the cost model. The idea is to use some basic operation as a proxy for the running time. That means I don't look at the running time of all the operations in my program; I focus on one basic operation which gives me an approximation of the running time — and it should be an expensive operation. Not, for example, a declaration: a declaration is cheap, and we don't execute it frequently. The question now is — let's focus on the two-sum example — which operation is the most expensive one? What do you think? We have different operations here: the variable declaration, the increments, the array access, the equal-to comparison. So we have a bunch of operations we could focus on, but we want the most expensive one, and the most expensive one is the one which is executed most frequently. Yes? The inner loop. Which one again? The inner loop — exactly. What happens in the inner loop is the operation which runs most frequently. It's not the declaration of the variables; we focus on the inner loop, because there we have to go through two nested for loops, so that operation is the one which runs most frequently.
That means, back to the cost model: we take one basic operation as a proxy for the running time, and we should use the most expensive one, because that gives us the approximation of the running time. And in most cases, if not all, we should take the operation which is in the inner loop, because it runs very frequently compared to the other operations. In this case, if we want a proxy for the running time, we don't take the declaration — it only happens n plus 2 times — we take something in the inner loop. This is the idea behind the cost model. So these are the steps to develop a mathematical model of running time. First, define the input model — the data you want to work on. Then identify the inner loop, the one which sits inside the nested loops, if you have nested loops. Then define the cost model: pick one basic operation which will act as the proxy for the running time; we pick one in the inner loop, because it is the most expensive one, as it runs most frequently. Then determine the frequency of execution of that operation — the expensive one. And this gives us an approximation of the running time of the program for our input model. The short version: we don't focus on all the operations; we pick the most expensive one, we get the running time based on that operation, and this gives us an approximation of the total running time. Clear? Okay. If we apply this to two-sum: we said we use the most expensive operation as the proxy, and it should be something in the inner loop.
In the inner loop, we have two things: the equal-to compare and the array access. We should pick one of them, and in this case we take the array access, because it is more expensive: for each equal-to compare, we do two array accesses. That means as the cost model, we use the array access as the operation which acts as the proxy for the total running time of the program. And we did this already: we found that the frequency of execution of the array access is n times (n minus 1). This is the one we take as the proxy for the running time of the program. That was one simplification — the cost model: one expensive operation as the proxy for the running time. There is also another simplification, which is the tilde notation. The idea behind the tilde notation is also simple: we are trying to get approximations. We don't want the exact result; we should be satisfied with an approximation of the running time. Say we do some computation and find various expressions for the running time: one sixth n cubed plus 20 n, and some other ones. As you can see, in a sense they are all the same: if we focus on the highest-order term, all of them are cubic. The difference is in the lower-order terms — here a 20 n, there a constant, there a term like n to the power four thirds, or n squared, n, and so on. And the idea behind the tilde notation is that we ignore the lower-order terms. We focus only on the highest-order term, which in this case is the cubic one, and we ignore all the other ones. Why is this the case?
Because if n is large, whatever comes after the highest-order term we can neglect. And this only happens for large n: the expressions might be quite different for small n, but with increasing n they converge. That's why we can get rid of what are called the lower-order terms. So in this case we write tilde one sixth n cubed: we keep only the first term, the highest-order term, and we ignore the other ones, the lower-order terms. You can see it in this example: if we plot the curve of the full expression — in this case it was the last one — and the curve where we keep only n cubed over six, you see that at some point they converge; with increasing n they come together. This is the idea behind the tilde notation: to simplify the calculation a bit. We don't carry the full expression; we just focus on the highest-order term and ignore all the other ones. In this case, all of these expressions are tilde one sixth n cubed, because with increasing n they all behave the same. And this way, using the tilde notation, you get an approximation of the running time. It will not be the exact time, but for n as large as possible, we get an approximation of the running time of the algorithm. The technical definition: we say that f(n) is tilde g(n) when we look at the limit as n goes to infinity, that is, for larger and larger n.
If we take the ratio f(n) divided by g(n), then — because they converge, they become the same at some point — this limit will be 1; they are essentially the same curve in that case. This is the tilde notation, and this is what we will use in our course: we use the tilde notation to approximate things. We won't carry the full expression; we focus only on the highest-order term and ignore all the other ones when we talk about the running time of the program. These are the two simplifications. We talked about the first one, the cost model: one expensive operation as a proxy for the running time. The second is the tilde notation: we ignore the lower-order terms and focus only on the highest one. And we use these two simplifications to approximate the running time. Here is the notation of those running times using the tilde notation. Here it's n plus 2, so we say it's tilde n: we just keep the highest-order term. For this one, we don't keep it completely either; it becomes just one half n squared — we keep the leading coefficient, and we focus on the highest-order term, in this case n squared. The same thing for this one. This one is again n squared, but without the one half, because there is no one half. And for this one, we get something between one half n squared and n squared. So we focus only on the highest-order term with its leading coefficient, and this gives us a simplified notation to express the running time of the program. So let's apply this to the two-sum — I think we saw it already; we did our calculation.
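You can also see the definition numerically — a sketch using one of the expressions from the slide: the ratio f(n)/g(n) drifts toward 1 as n grows, which is exactly what f(n) ~ g(n) means:

```python
# Sketch: f(n) is a full operation count, g(n) keeps only the
# highest-order term (the tilde approximation).
def f(n):
    return n**3 / 6 + 20 * n   # example expression from the slide

def g(n):
    return n**3 / 6            # tilde: leading term only

for n in (10, 100, 1000, 10000):
    print(n, f(n) / g(n))
# For small n the 20n term still matters; as n grows the
# ratio approaches 1, so f(n) ~ (1/6) n^3.
```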
We found n times (n minus 1) for the array access, and so we say it's tilde n squared: we take the leading term, and we say this program makes tilde n squared array accesses. This is what we use in this course: the cost model and the tilde notation, to simplify things. Clear? Questions? Good. Let's move on. Let's see it on the example of three-sum — the example we started with. If you don't mind, we will take maybe five or ten more minutes, because we started a bit late; if you allow it, I will stop at five to twelve. Okay. So if we apply this to our three-sum example: we have three nested loops, with the indices i, j, and k. The i goes from 0 to n, j goes from i plus 1 to n, and k goes from j plus 1 to n. We want to find the running time of this program, and we said we focus on one operation in the inner loop — the array access, because the array access is the most expensive one. Now, how to compute the running time for this code? We can use a combinatorial approach. In three-sum, we want to find how many triples sum to exactly zero, and this is equivalent to finding all possible combinations of three elements from a set of n elements: I enumerate all triples of elements, and for each triple I test whether the three elements sum to zero. If we take the binomial coefficient n choose 3 and apply the formula we saw before, we get something tilde one sixth n cubed.
So this is the tilde notation in this example — but careful: the one sixth n cubed is the number of equal-to compares, that is, how many times we execute the compare. For the array access, we need to access the array three times — for the i, for the j, and for the k — so we multiply by three, and we get one half n cubed. This is the tilde value we give for the number of array accesses in this program: one half n cubed. So we have our cost model, the array access, the operation which is executed most frequently; we use the tilde notation, focusing on the highest-order term, and we get our result. We will do one quick last exercise, and then we'll stop — I promise. We have a small code snippet, again with three nested for loops. We have i, which goes from 0 to n; j, which goes from i plus 1 to n; and k, which goes from 1 to n. But be careful here: we don't increment k by 1, we double k. So k will be 1, then 2, then 4, until it reaches n. We do not increment by 1; we double k from one step to the next. And now we want to compute the running time based on the array access. Again, the array access is kept as the cost model, because it is in the inner loop and it is the most expensive operation. The array accesses are these ones here: we access the array three times, to get the item at i, at j, and at k. And these are the four answer options.
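(Before you start the exercise: here is a quick way to convince yourself of the one half n cubed count for three-sum — my sketch, not part of the exercise itself.)

```python
# Sketch: count array accesses in the three-sum triple loop.
def count_threesum_accesses(n):
    accesses = 0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                accesses += 3  # a[i], a[j], a[k]
    return accesses

# Exact count is 3 * C(n, 3) = n(n-1)(n-2)/2, which is ~ 1/2 n^3.
print(count_threesum_accesses(10))  # 360
print(10 * 9 * 8 // 2)              # 360
```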
You have three minutes to work out how many array accesses this code fragment makes, as a function of n. Try to use the sums we saw before, and use the tilde notation — it doesn't have to be exact. For example, we saw for two-sum that two nested loops like these give tilde one half n squared: here we go from 0 to n, here from i plus 1 to n, so from two-sum we know the two outer loops contribute tilde one half n squared. We just need to see what happens in the k loop — how many times does it run? That is what makes the difference. Two more minutes. Okay. So who got it? Who did not get it? Okay, good. First, what might be the right answer — A, B, C, or D? The majority is for B, and it is correct. So let's see the solution together, and then we'll stop. We said — we know this from two-sum — that the number of times the k loop starts is tilde one half n squared: that is how many times the body of the two outer loops executes. Now, if we fix i and j, the k loop requires about log n of these loop compares — this is the one we're looking at, and we want to know how many times this compare executes. Here k goes from 1 to n, but we double k at each step, and if we double k, the number of times the k loop executes is about log n. Does it make sense?
Did you see why it's log n? We start from one and double until we reach n. Let's assume n is 16. Starting from one, the values that k takes are 1, 2, 4, and 8 — it will not reach 16, because k must stay strictly less than 16. So k takes four possible values, and the log of 16, base 2, is four. That's why the k loop executes about log n times. Now we just multiply the two things: each run of the k loop requires log n compares, and we run it tilde one half n squared times. Multiplying the two, we get how many times this compare executes: tilde one half n squared times log n. But we are interested in the array accesses, and we know that for every compare we have three array accesses. So we multiply everything by three, and that is the final result: tilde three halves n squared log n. Is it clear? You can also do it with a bit more math, but that only makes it more complicated; the first way does the job. We will stop at this point. Thank you for joining, and we will continue next time.
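To close the loop on this last exercise — again a sketch, under the assumption that the innermost Java loop is `for (int k = 1; k < n; k = k*2)` — counting the accesses directly matches the tilde (3/2) n² log n estimate (for n a power of two, the exact count is 3 · n(n−1)/2 · log₂ n):

```python
import math

# Sketch: count array accesses when the innermost index doubles.
def count_doubling_accesses(n):
    accesses = 0
    for i in range(n):
        for j in range(i + 1, n):
            k = 1
            while k < n:       # k = 1, 2, 4, ... strictly below n
                accesses += 3  # a[i], a[j], a[k]
                k *= 2
    return accesses

n = 16
print(count_doubling_accesses(n))                  # 1440
print(3 * (n * (n - 1) // 2) * int(math.log2(n)))  # 1440
```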