Basic statistics

(just “fancy” loops)


EECS 1011 – Day 13
Original notes by Dr. Burton Ma; Updates by Dr. James Andrew Smith & Dr. Steven Castellucci

Copyright James Andrew Smith & others. No permission is granted to distribute or upload outside of York University.
Reference
• You should be following along in Chapter 14 of the 5th Edition of the Attaway book.
• The YouTube video points to a different chapter in the older book.

Chapter 14, Sections 14.1 to 14.3

Statistics in the Real World?
• Traditional
  • suppose that you perform multiple measurements of some phenomenon
    • the “average” measurement
    • the amount of variation in the measurements
    • the order of the measurements

• Trending
  • Big Data
  • Machine Learning
  • Artificial Intelligence

Ex. Final Mark of an Arbitrary Class

Maximum
• basic idea for finding the maximum score:
  • create a variable to store the maximum score seen so far
    • initial value?
  • look at each score
    • this requires a loop
      ◦ which kind of loop? for loop? while loop?
  • if the current score is greater than the maximum score seen so far, update the maximum score seen so far
      ◦ this requires an if statement

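% find the maximum score for Test 1
load test1mc.mat            % provides the score vector x

maxValue = 0;               % safe starting value because scores are non-negative
n = length(x);
for idx = 1:n
    curr = x(idx);
    if curr > maxValue      % found a new largest score
        maxValue = curr;
    end
end

% or use max
maxValue = max(x);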
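
The starting value matters. Zero works for test scores because they cannot be negative, but it would give the wrong answer for a vector whose values are all negative. A minimal sketch, assuming a small made-up vector `v` (not the test data), starts from the first element instead:

```matlab
% hypothetical data, not taken from test1mc.mat
v = [-4 -9 -2 -7];

maxValue = v(1);            % start from the first element instead of 0
for idx = 2:length(v)
    curr = v(idx);
    if curr > maxValue      % found a new largest value
        maxValue = curr;
    end
end
disp(maxValue)              % displays -2
```

Starting from `v(1)` makes the loop independent of the range of the data, at the cost of assuming the vector is not empty.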

Maximum
• MATLAB's max function returns the maximum value of a vector and the index of the maximum value

>> maxValue = max(x)

maxValue =
    25

>> [maxValue, idx] = max(x)

maxValue =
    25

idx =
    50
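
The two-output form of max can also be written as a loop that remembers where the largest value was seen. The sketch below uses a made-up vector `v` and the illustrative names `maxIdx`, `m` and `k`, none of which come from the slides:

```matlab
% hypothetical data, not taken from test1mc.mat
v = [12 25 7 25 18];

maxValue = v(1);
maxIdx = 1;
for idx = 2:length(v)
    if v(idx) > maxValue    % strict > keeps the first occurrence on ties
        maxValue = v(idx);
        maxIdx = idx;
    end
end

[m, k] = max(v);            % built-in version for comparison
disp([maxValue maxIdx])     % displays 25 2
```

Because the comparison is strict, a tie keeps the earlier position, which is also what max reports when the largest value occurs more than once.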

Minimum
• basic idea for finding the minimum score:
  • create a variable to store the minimum score seen so far
    • initial value?
  • look at each score
    • this requires a loop
      ◦ which kind of loop? for loop? while loop?
  • if the current score is smaller than the minimum score seen so far, update the minimum score seen so far
      ◦ this requires an if statement

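% find the minimum score for Test 1
load test1mc.mat            % provides the score vector x

minValue = 25;              % start high: the maximum score in this data set is 25
n = length(x);
for idx = 1:n
    curr = x(idx);
    if curr < minValue      % found a new smallest score
        minValue = curr;
    end
end

% or use min
minValue = min(x);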
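
On the question of which kind of loop to use: a for loop is the natural fit because the number of scores is known before the loop starts, but a while loop can do the same job if the index is managed by hand. A sketch with a made-up vector `v`:

```matlab
% hypothetical data, not taken from test1mc.mat
v = [14 9 21 7 16];

minValue = v(1);            % start from the first element
idx = 2;
while idx <= length(v)
    if v(idx) < minValue    % found a new smallest value
        minValue = v(idx);
    end
    idx = idx + 1;          % advance manually; forgetting this loops forever
end
disp(minValue)              % displays 7
```

The extra bookkeeping (initializing and incrementing `idx`) is the main reason the for loop is usually preferred here.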

Minimum
• MATLAB's min function returns the minimum value of a vector and the index of the minimum value

>> minValue = min(x)

minValue =
7

>> [minValue, idx] = min(x)

minValue =
7

idx =
213

Average
• the arithmetic average is an estimate of the mean
  • in statistics, the mean is a measure of the center point of a distribution
• for $N$ measurements $x_i$ the sample mean is

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

Average
• basic idea to compute the average:
  • create a variable to accumulate the running sum
    • initial value?
  • look at each score
    • this requires a loop
      ◦ which kind of loop? for loop? while loop?
  • add the current score to the running sum
  • divide the accumulated sum by the number of scores

% find the average score for Test 1
load test1mc.mat            % provides the score vector x

runningSum = 0;
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + curr;   % accumulate the running sum
end
avgValue = runningSum / n;

% or use mean
avgValue = mean(x);

Average on a data file, test1mc.mat

Look again: Equation = For Loop
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

runningSum = 0;
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + curr;
end
avgValue = runningSum / n;

Average
• you should use the function mean instead

>> mean([-1 0 1])

ans =
     0

• mean computes the average of each column for a matrix

>> X = [1 2 3;
        3 8 11];
>> mean(X)

ans =
     2     5     7
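
Column-wise averaging is only the default. As a sketch beyond what the slide shows, mean also accepts a dimension argument, and indexing with `(:)` flattens the matrix first. The variable names `colMeans`, `rowMeans` and `allMean` are illustrative:

```matlab
X = [1 2 3;
     3 8 11];

colMeans = mean(X);        % default: one average per column, [2 5 7]
rowMeans = mean(X, 2);     % dimension 2: one average per row, [2; 7.3333]
allMean  = mean(X(:));     % flatten first: average of all six values, 4.6667
```

Checking the size of the result is a quick way to confirm which direction was averaged.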

Mean
• one problem with the mean is that it is sensitive to erroneous or outlying measurements

>> x = randn(1, 20);


>> x(21) = 100;
>> hist(x, 20);
>> mean(x)
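
Because randn produces different numbers on every run, the effect is easier to verify with fixed data. A minimal sketch, assuming ten made-up measurements that average exactly zero:

```matlab
% hypothetical measurements, chosen so that they sum to zero
v = [-1 0 1 0 -1 1 0 1 -1 0];
meanBefore = mean(v);      % 0

v(11) = 100;               % append one erroneous measurement
meanAfter = mean(v);       % 100/11, roughly 9.09

fprintf('before: %.2f, after: %.2f\n', meanBefore, meanAfter);
```

A single bad value out of eleven moves the mean from 0 to about 9, far outside the range of the ten legitimate measurements.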

Variance
• the variance is a measure of spread around the mean
• for $N$ measurements $x_i$ the sample variance is

$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$

% "low" variance
>> x = randn(1, 100);
>> hist(x, 20);
>> var(x)
% "high" variance
>> x = 10 * randn(1, 100);
>> hist(x, 20);
>> var(x)
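
The "high" variance case is larger by a predictable amount: multiplying every measurement by a constant c multiplies the variance by c squared, so the factor of 10 above corresponds to roughly 100 times the variance. A deterministic sketch with a made-up vector `v`:

```matlab
% hypothetical data with mean 5
v = [2 4 4 4 5 5 7 9];

varSmall = var(v);         % 32/7, roughly 4.57
varLarge = var(10 * v);    % 100 times varSmall

fprintf('ratio: %.1f\n', varLarge / varSmall);   % ratio: 100.0
```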

Variance
• basic idea to compute the variance:
  • create a variable to accumulate the running sum
    • initial value?
  • compute and store the average value
  • look at each score
    • this requires a loop
      ◦ which kind of loop? for loop? while loop?
  • add the squared deviation to the running sum
  • divide the accumulated sum by the number of scores minus 1

% find the variance of the scores for Test 1
% file found at http://www.eecs.yorku.ca/~drsmith/eecs1011/test1mc.mat

load test1mc.mat

runningSum = 0;
avgValue = mean(x);
n = length(x);
for idx = 1:n
    curr = x(idx);
    runningSum = runningSum + (curr - avgValue)^2;   % squared deviation
end
variance = runningSum / (n - 1);

% or use var
variance = var(x);
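
Dividing by n - 1 rather than n is what makes this the sample variance, and it is the default behaviour of var. As a sketch on made-up data, the same running sum can be divided either way and checked against var, whose optional second argument of 1 selects division by n:

```matlab
% hypothetical data with mean 5
v = [2 4 4 4 5 5 7 9];

n = length(v);
avgValue = mean(v);
runningSum = 0;
for idx = 1:n
    runningSum = runningSum + (v(idx) - avgValue)^2;   % squared deviation
end

sampleVar = runningSum / (n - 1);   % 32/7, agrees with var(v)
popVar    = runningSum / n;         % 4, agrees with var(v, 1)
```

The two results differ noticeably for small n and converge as n grows.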

Standard deviation
• the standard deviation is the square root of the variance
• for $N$ measurements $x_i$ the sample standard deviation is often calculated as

$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$$

>> std(x)
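
A quick way to confirm the relationship between the two functions is to compare std against the square root of var. The vector `v` below is made up for illustration:

```matlab
% hypothetical data with mean 5
v = [2 4 4 4 5 5 7 9];

s1 = sqrt(var(v));         % square root of the sample variance
s2 = std(v);               % built-in standard deviation

fprintf('%.4f %.4f\n', s1, s2);   % both print 2.1381
```

Unlike the variance, the standard deviation is in the same units as the measurements, which makes it easier to interpret.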

Standard deviation
• what fraction of scores lie within +/- one standard deviation from the mean?
  • create a variable to count the number of scores within +/- one standard deviation of the mean
  • compute and store the standard deviation of the scores
  • subtract the mean from all of the scores
    • do you need a loop?
  • look at each de-meaned score
    • this requires a loop
  • if the current de-meaned score is within +/- one standard deviation increase the count by 1
  • divide the count by the number of scores

% count the number of scores within +/- one
% standard deviation of the mean
% file found at http://www.eecs.yorku.ca/~drsmith/eecs1011/test1mc.mat
load test1mc.mat

count = 0;
sdev = std(x);
y = x - mean(x);            % de-meaned scores; no loop needed for this step
n = length(x);
for idx = 1:n
    curr = y(idx);
    if abs(curr) < sdev     % within one standard deviation of the mean
        count = count + 1;
    end
end
frac = count / n;
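
The counting loop can also be replaced by a logical comparison on the whole vector, in the same way that subtracting the mean needed no loop. A sketch on a made-up vector `v`, with `inside` as an illustrative name:

```matlab
% hypothetical data with mean 5 and standard deviation of about 2.14
v = [2 4 4 4 5 5 7 9];

inside = abs(v - mean(v)) < std(v);   % logical vector, true where within one std
frac = sum(inside) / length(v);       % 6 of 8 values qualify

disp(frac)                            % displays 0.7500
```

Summing a logical vector counts its true entries, so `sum(inside)` plays the role of `count` in the loop version.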

Normal distribution
• the test scores are almost normally distributed
  • in statistics, the normal distribution is the so-called bell curve
  • many physical measurements are approximately normally distributed
• a normal distribution is completely characterized by two values
  • mean, usually abbreviated as $\mu$
  • variance, usually abbreviated as $\sigma^2$
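
Since two values are enough to describe a normal distribution, they are also enough to generate samples from one: randn draws from a normal distribution with mean 0 and variance 1, so scaling by the standard deviation and shifting by the mean gives the desired distribution. The values of `mu`, `sigma` and `N` below are made up for illustration:

```matlab
% hypothetical parameters, not estimated from test1mc.mat
mu = 15;                   % mean
sigma = 4;                 % standard deviation (variance is sigma^2 = 16)
N = 10000;                 % number of samples

samples = mu + sigma * randn(1, N);

% fraction of samples within one standard deviation of the mean
frac = sum(abs(samples - mean(samples)) < std(samples)) / N;

fprintf('mean %.2f, var %.2f, frac %.3f\n', mean(samples), var(samples), frac);
```

The exact numbers change from run to run, but for normally distributed data the fraction should come out close to 0.68.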

Normal distribution

By Dan Kernler (Own work) [CC-BY-SA-4.0 (http://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

