Day13 Basic Stats Via Loops Studentsr 2020
Copyright James Andrew Smith & others. No permission is granted to distribute or upload outside of York University.
Reference
- You should be following along in Chapter 14 (Sections 14.1 to 14.3) of the 5th Edition of the Attaway book.
- The YouTube video points to a different chapter in the older book.
Statistics in the Real World?
- Traditional
  - suppose that you perform multiple measurements of some phenomenon
    - the “average” measurement
    - the amount of variation in the measurements
    - the order of the measurements
- Trending
  - Big Data
  - Machine Learning
  - Artificial Intelligence
Ex. Final Mark of an Arbitrary Class
Maximum
- basic idea for finding the maximum score:
  - create a variable to store the maximum score seen so far
    - initial value?
  - look at each score
    - this requires a loop
      - which kind of loop? for loop? while loop?
  - if the current score is greater than the maximum score seen so far, update the maximum score seen so far
    - this requires an if statement
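% find the maximum score for Test 1
load test1mc.mat
maxValue = 0;             % assumes no score is negative
n = length(x);
for idx = 1:n
    curr = x(idx);
    if curr > maxValue    % a new largest score has been found
        maxValue = curr;
    end
end

% or use max
maxValue = max(x);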
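The "initial value?" question matters more than it may seem. The following is a minimal sketch, assuming a made-up vector named temps (not part of the course data) that contains only negative values. Starting the search at 0 gives the wrong answer for such data, while starting at the first element works for any non-empty vector.

```matlab
% Hypothetical data: every value is negative
temps = [-12 -3 -7 -20];

% Starting from 0: no element is greater than 0, so the value never updates
maxFromZero = 0;
for idx = 1:length(temps)
    if temps(idx) > maxFromZero
        maxFromZero = temps(idx);
    end
end

% Starting from the first element: correct for any non-empty vector
maxFromFirst = temps(1);
for idx = 2:length(temps)
    if temps(idx) > maxFromFirst
        maxFromFirst = temps(idx);
    end
end
```

Here maxFromZero stays at 0, which is not even an element of temps, while maxFromFirst ends at -3. For test scores that cannot be negative, starting at 0 is safe.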
Maximum
- MATLAB's max function returns the maximum value of a vector and, when called with two outputs, the index of the maximum value

>> maxValue = max(x)
maxValue =
    25

>> [maxValue, idx] = max(x)
maxValue =
    25
idx =
    50
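As a small illustration of the two-output form, here is a sketch using a made-up vector named scores (an assumption, not the course data) in which the largest value appears twice. When there are ties, max reports the index of the first occurrence.

```matlab
% Hypothetical scores; the value 25 appears at positions 2 and 4
scores = [18 25 7 25 21];

% Two outputs: the largest value and the index of its first occurrence
[maxValue, maxIdx] = max(scores);

% Indexing back into the vector recovers the same value
check = scores(maxIdx);
```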
Minimum
- basic idea for finding the minimum score:
  - create a variable to store the minimum score seen so far
    - initial value?
  - look at each score
    - this requires a loop
      - which kind of loop? for loop? while loop?
  - if the current score is smaller than the minimum score seen so far, update the minimum score seen so far
    - this requires an if statement
% find the minimum score for Test 1
load test1mc.mat
minValue = 25;            % assumes no score is larger than 25
n = length(x);
for idx = 1:n
    curr = x(idx);
    if curr < minValue    % a new smallest score has been found
        minValue = curr;
    end
end

% or use min
minValue = min(x);
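The slides ask which kind of loop to use. The for loop is the natural fit because the number of scores is known in advance, but the same search can be written with a while loop. This is a sketch only, using a made-up vector named scores in place of the course data.

```matlab
% Hypothetical scores, standing in for the vector x from test1mc.mat
scores = [18 25 7 25 21];

minValue = scores(1);      % start from the first score
idx = 2;                   % next element to examine
while idx <= length(scores)
    if scores(idx) < minValue
        minValue = scores(idx);
    end
    idx = idx + 1;         % advance by hand; a for loop does this automatically
end
```

The while version needs its own counter and its own stopping test, which is why the for loop is usually preferred when every element is visited exactly once.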
Minimum
- MATLAB's min function returns the minimum value of a vector and, when called with two outputs, the index of the minimum value

>> minValue = min(x)
minValue =
    7

>> [minValue, idx] = min(x)
minValue =
    7
idx =
    213
Basic statistics
(just “fancy” loops)
EECS 1011 – Day 13
Original notes by Dr. Burton Ma; Updates by Dr. James Andrew Smith & Dr. Steven Castellucci
Average
- the arithmetic average is an estimate of the mean
  - in statistics, the mean is a measure of the center point of a distribution
- for $N$ measurements $x_i$ the sample mean is

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
Average
- basic idea to compute the average:
  - create a variable to accumulate the running sum
    - initial value?
  - look at each score
    - this requires a loop
      - which kind of loop? for loop? while loop?
  - add the current score to the running sum
  - divide the accumulated sum by the number of scores
% find the average score for Test 1
load test1mc.mat
runningSum = 0;
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + curr;   % accumulate the current score
end
avgValue = runningSum / n;

% or use mean
avgValue = mean(x);
Average on a data file, test1mc.mat
Look again: Equation = For Loop
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

runningSum = 0;
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + curr;
end
avgValue = runningSum / n;
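To make the correspondence concrete, the sketch below computes the same average three ways: with the loop, with the built-in sum function playing the role of the summation sign, and with mean. The vector scores is made up for illustration and is not the course data.

```matlab
% Hypothetical scores, standing in for the vector x from test1mc.mat
scores = [18 25 7 25 21];
n = length(scores);

% 1) Loop form: the for loop is the summation, the division is the 1/N
runningSum = 0;
for idx = 1:n
    runningSum = runningSum + scores(idx);
end
avgLoop = runningSum / n;

% 2) The same summation written with the built-in sum function
avgSum = sum(scores) / n;

% 3) The built-in mean function
avgBuiltin = mean(scores);
```

All three values agree: the scores add up to 96, and 96 / 5 is 19.2.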
Average
- in practice, you should use the function mean instead of writing the loop yourself
Mean
- one problem with the mean is that it is sensitive to erroneous or outlying measurements
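A quick way to see this sensitivity is to add a single bad reading to an otherwise consistent data set. The numbers below are made up for illustration.

```matlab
% Hypothetical measurements that agree closely with one another
clean = [10 11 12 13 14];
meanClean = mean(clean);          % 60 / 5 = 12

% The same data with one erroneous reading appended
withError = [clean 100];
meanWithError = mean(withError);  % 160 / 6, roughly 26.67
```

One wrong value out of six more than doubles the mean, even though the other five measurements have not changed.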
Variance
- the variance is a measure of spread around the mean
- for $N$ measurements $x_i$ the sample variance is

$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$

% "low" variance
>> x = randn(1, 100);
>> hist(x, 20);
>> var(x)

% "high" variance
>> x = 10 * randn(1, 100);
>> hist(x, 20);
>> var(x)
Variance
- basic idea to compute the variance:
  - create a variable to accumulate the running sum
    - initial value?
  - compute and store the average value
  - look at each score
    - this requires a loop
      - which kind of loop? for loop? while loop?
  - add the squared deviation to the running sum
  - divide the accumulated sum by the number of scores minus 1
% find the variance of the scores for Test 1
% file found at http://www.eecs.yorku.ca/~drsmith/eecs1011/test1mc.mat
load test1mc.mat
runningSum = 0;
avgValue = mean(x);
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + (curr - avgValue)^2;   % squared deviation
end
variance = runningSum / (n - 1);

% or use var
variance = var(x);
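The steps can be checked by hand on a small data set. The sketch below uses a made-up vector named data, chosen so that the mean is a whole number, and compares the loop result with var.

```matlab
% Hypothetical data chosen so the arithmetic is easy to follow
data = [2 4 4 4 5 5 7 9];         % sum is 40, so the mean is 40 / 8 = 5

avgValue = mean(data);
runningSum = 0;
n = length(data);
for idx = 1:n
    % squared deviations are 9, 1, 1, 1, 0, 0, 4, 16, which add up to 32
    runningSum = runningSum + (data(idx) - avgValue)^2;
end
varLoop = runningSum / (n - 1);   % 32 / 7

% Built-in var, which also divides by n - 1 by default
varBuiltin = var(data);
```

Both values are 32 / 7, roughly 4.57. Dividing by n instead of n - 1 would give 4, which is not what var returns by default.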
Standard deviation
- the standard deviation is the square root of the variance
- for $N$ measurements $x_i$ the sample standard deviation is often calculated as

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$

>> std(x)
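The relationship between the two functions can be confirmed directly. This is a short sketch on a made-up vector named data, not on the course data.

```matlab
% Hypothetical data; its sample variance is 32 / 7
data = [2 4 4 4 5 5 7 9];

sdFromVar = sqrt(var(data));   % square root of the sample variance
sdBuiltin = std(data);         % built-in sample standard deviation
```

Both values are the square root of 32 / 7, roughly 2.14, and they carry the same units as the data, which is why the standard deviation is often easier to interpret than the variance.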
Standard deviation
- what fraction of scores lie within +/- one standard deviation from the mean?
  - create a variable to count the number of scores within +/- one standard deviation of the mean
  - compute and store the standard deviation of the scores
  - subtract the mean from all of the scores
    - do you need a loop?
  - look at each de-meaned score
    - this requires a loop
  - if the current de-meaned score is within +/- one standard deviation, increase the count by 1
  - divide the count by the number of scores
% count the number of scores within +/- one
% standard deviation of the mean, then convert
% the count to a fraction of all scores
% file found at http://www.eecs.yorku.ca/~drsmith/eecs1011/test1mc.mat
load test1mc.mat
count = 0;
sdev = std(x);
y = x - mean(x);          % de-meaned scores; no loop needed
n = length(x);
for idx = 1:n
    curr = y(idx);
    if abs(curr) < sdev
        count = count + 1;
    end
end
frac = count / n;
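The question "do you need a loop?" applies to the counting step as well. The sketch below, on a made-up vector named data, repeats the loop and then does the same count with a logical comparison applied to the whole vector at once.

```matlab
% Hypothetical data: mean is 5, standard deviation is roughly 2.14
data = [2 4 4 4 5 5 7 9];
n = length(data);
sdev = std(data);
y = data - mean(data);            % de-meaned values

% Loop version, following the steps listed above
count = 0;
for idx = 1:n
    if abs(y(idx)) < sdev
        count = count + 1;
    end
end
fracLoop = count / n;

% Vectorized version: the comparison yields a vector of true/false values
within = abs(y) < sdev;
fracVec = sum(within) / n;        % sum counts the true entries
```

Six of the eight values (4, 4, 4, 5, 5 and 7) lie within one standard deviation of the mean, so both versions give 0.75.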
Normal distribution
- the test scores are almost normally distributed
  - in statistics, the normal distribution is the so-called bell curve
  - many physical measurements are approximately normally distributed
- a normal distribution is completely characterized by two values
  - mean, usually abbreviated as $\mu$
  - variance, usually abbreviated as $\sigma^2$
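Because only these two values are needed, normally distributed samples with any chosen mean and variance can be produced by scaling and shifting the output of randn. The sketch below is illustrative only: the values of mu, sigma and N are made up, and the results vary slightly from run to run because the samples are random.

```matlab
mu = 15;        % hypothetical mean
sigma = 4;      % hypothetical standard deviation, so the variance is 16
N = 100000;     % number of samples to draw

% randn draws from a normal distribution with mean 0 and variance 1;
% multiplying by sigma and adding mu rescales and shifts the samples
samples = mu + sigma * randn(1, N);

sampleMean = mean(samples);       % close to mu
sampleVar = var(samples);         % close to sigma^2

% Fraction of samples within +/- one standard deviation of the mean
fracWithin = sum(abs(samples - sampleMean) < std(samples)) / N;
```

For normally distributed data, fracWithin comes out near 0.68, which gives a reference point for the fraction computed from the test scores.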
Normal distribution