Math Stats
TERMS:
STATISTICS - Involves collection, organization, presentation, and interpretation of data
DESCRIPTIVE STATISTICS - Involves collection, organization, summarization, and presentation of data
INFERENTIAL STATISTICS - Interprets and draws conclusions from data
POPULATION - Refers to the whole group to be studied
SAMPLE - Representative of the whole group; any subset of the population
RAW DATA - Scores that are not yet organized or presented
MODE - Number that occurs most frequently in the number list
SUMMATION NOTATION - Uses the Greek letter sigma, Σ, to denote a sum
RANKED LIST - Any list of numbers that is arranged in numerical order
VARIANCE - Square of the standard deviation of a data set
MEDIAN - Middlemost data when arranged in ranked list
RANGE - Difference between the greatest and least data value
FREQUENCY DISTRIBUTION - A table that lists observed events and the frequency of occurrence of each observed event; it is often
used to organize raw data
Examples:
Elle is a senior at a university. In a few months she plans to graduate and start a career as a landscape architect. A survey of five
landscape architects from last year’s senior class shows that they received job offers with the following yearly salaries.
Find the median of the data in the following lists. (a. 4,8,1,14,9,21,12) (b. 46,23,92,89,77,108)
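SOLUTION (A): The list 4,8,1,14,9,21,12 contains 7 numbers. The median of a list with an odd number of entries is found by
ranking the numbers and finding the middle number. Ranking the numbers from smallest to largest gives
1, 4, 8, 9, 12, 14, 21
The middle number is 9. Thus 9 is the MEDIAN
SOLUTION (B): The list 46,23,92,89,77,108 contains 6 numbers. The median of a list with an even number of entries is found by
ranking the numbers and finding the mean of the 2 middle numbers. Ranking the numbers from smallest to largest gives
23, 46, 77, 89, 92, 108
The 2 middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus 83 is the MEDIAN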
IN SHORT: IF THE TOTAL N IS ODD, THE MEDIAN IS THE MIDDLE NUMBER.
IF THE TOTAL N IS EVEN, THE MEDIAN IS THE MEAN OF THE 2 MIDDLE NUMBERS
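The odd/even rule for the median can be sketched in Python. This is only an illustration: the function name `median` and the variable names are assumptions, not part of the notes. The two lists are the ones from the exercise.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ranked = sorted(values)      # rank the numbers from smallest to largest
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:               # odd n: the median is the middle number
        return ranked[mid]
    # even n: the median is the mean of the 2 middle numbers
    return (ranked[mid - 1] + ranked[mid]) / 2


median_a = median([4, 8, 1, 14, 9, 21, 12])
median_b = median([46, 23, 92, 89, 77, 108])
print(median_a, median_b)
```

Python's built-in `statistics.median` applies the same rule, so it can be used to check answers.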
Find the mode of the data in the following lists. (a. 18, 15, 21, 16, 15, 14, 15, 21) (b. 2, 5, 8, 9, 11, 4, 7, 23)
SOLUTION (A): In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than others. Thus, 15 is the mode.
SOLUTION (B): Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. NO MODE
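A possible way to find the mode in Python is to count how often each number occurs. This is a sketch only: the function name `modes` is an assumption, and it returns an empty list for the "no mode" case in which every number occurs only once.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when every value occurs only once."""
    counts = Counter(values)             # number -> how many times it occurs
    highest = max(counts.values())
    if highest == 1:                     # each number occurs only once: no mode
        return []
    return [value for value, count in counts.items() if count == highest]


mode_a = modes([18, 15, 21, 16, 15, 14, 15, 21])
mode_b = modes([2, 5, 8, 9, 11, 4, 7, 23])
print(mode_a, mode_b)
```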
• MEAN, MEDIAN, and MODE are all averages. The mean of a set of data is the most sensitive of the averages. A change in any
of the numbers changes the mean, and the mean can be changed drastically by changing an extreme value.
• In contrast, the MEDIAN and MODE of a set of data are not usually changed by changing an extreme value.
• When a data set has one or more extreme values that are very different from the majority of data values, the mean will not
necessarily be a good indicator of an average value.
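The effect of an extreme value can be checked with a short Python sketch. The numbers below are hypothetical and chosen only for illustration; they are not taken from the notes.

```python
from statistics import mean, median

values = [30, 32, 35, 36, 37]          # hypothetical data
with_extreme = [30, 32, 35, 36, 370]   # same data, largest value replaced by an extreme one

mean_before, mean_after = mean(values), mean(with_extreme)
median_before, median_after = median(values), median(with_extreme)

print(mean_before, mean_after)         # the mean changes drastically
print(median_before, median_after)     # the median stays the same
```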
STATISTICS Page 1
In the following example, we compare the mean, median, and mode for the salaries of 5 employees of a small company.
WEIGHTED MEAN: Often used when some data values are more important than others.
Consider the situation in which a professor counts the final examination as 2 test scores. To find the weighted mean of the
student's scores, the professor first assigns a weight to each score.
In this case, the professor could assign each of the test scores a weight of 1 and the final exam score a weight of 2.
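A student with test scores of 65, 70, and 75 and a final examination score of 90 has a weighted mean of:
Weighted mean = [(65 x 1) + (70 x 1) + (75 x 1) + (90 x 2)] / 5 = 390 / 5 = 78
A second worked example, with data values 3, 4, 1, and 2 and weights 4, 3, 3, and 4 (the weights sum to 14):
Weighted mean = [(3 x 4) + (4 x 3) + (1 x 3) + (2 x 4)] / 14 = 35 / 14 = 2.5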
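A weighted mean is the sum of each value times its weight, divided by the sum of the weights. The sketch below applies this to the professor's weighting described above (each test has weight 1, the final exam has weight 2). The function name `weighted_mean` is an assumption made for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Test scores 65, 70, 75 with weight 1 each; final exam 90 with weight 2
exam_mean = weighted_mean([65, 70, 75, 90], [1, 1, 1, 2])
print(exam_mean)
```

The same function gives 35 / 14 = 2.5 for the values 3, 4, 1, 2 with weights 4, 3, 3, 4.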
FREQUENCY DISTRIBUTION: A table that lists observed events and the frequency of occurrence of each observed event; it is often used
to organize raw data
Consider the following table, which lists the number of laptop computers owned by families in each of 40 homes in a subdivision.
The frequency distribution below was constructed using the data from the table. The first column of the frequency distribution
consists of the numbers 0, 1, 2, 3, 4, 5, 6, and 7. The corresponding frequency of occurrence, f, of each of the numbers in the
1st column is listed in the 2nd column.
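Building a frequency distribution from raw data can be sketched in Python with a counter. The raw data below is hypothetical (10 homes, not the 40-home table from the notes) and is used only to show the two-column layout of value and frequency f.

```python
from collections import Counter

# Hypothetical raw data: number of laptops owned in each of 10 homes
raw_data = [2, 0, 1, 2, 3, 1, 2, 0, 4, 2]

frequency = Counter(raw_data)                            # value -> frequency f
table = [(x, frequency[x]) for x in sorted(frequency)]   # 1st column x, 2nd column f

for x, f in table:
    print(x, f)
```

The frequencies in the 2nd column always add up to the number of raw data values.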
"MEASURES OF DISPERSION"
Range - The range of a set of data values is the difference between the greatest data value and the least data value.
Standard Deviation - A measure of dispersion that is less sensitive to extreme values. It makes use of the amount by which each
individual data value deviates from the mean.
These deviations, represented by (x - mean), are positive when the data value x is greater than the mean and are negative when x
is less than the mean. The sum of all the deviations is 0 for all sets of data.
Because the sum of all the deviations of the data values from the mean is always 0, we cannot use the sum of the deviations as a
measure of dispersion for a set of data. Instead, the standard deviation uses the sum of the squares of the deviations
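A quick Python check shows why the plain sum of deviations cannot measure dispersion while the sum of squared deviations can. The three data values are hypothetical, chosen so the arithmetic is easy to follow.

```python
data = [3, 5, 10]                        # hypothetical values
mean = sum(data) / len(data)             # 18 / 3 = 6.0

deviations = [x - mean for x in data]    # negative below the mean, positive above it
sum_of_deviations = sum(deviations)      # the deviations cancel each other out
sum_of_squares = sum(d ** 2 for d in deviations)

print(deviations, sum_of_deviations, sum_of_squares)
```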
PROCEDURE IN CALCULATING THE STANDARD DEVIATION OF n NUMBERS
1. Determine the mean of the n numbers
2. For each number, calculate the deviation (difference) between the number and the mean of the numbers
3. Calculate the square of each of the deviations and find the sum of these squared deviations
4. If the data is a POPULATION, then divide the sum by n. If the data is a SAMPLE, then divide the sum by n - 1
5. Find the square root of the quotient in step 4
Example:
The following numbers were obtained by sampling a population: 2, 4, 7, 12, 15
Find the standard deviation of the sample.
SOLUTION:
1ST STEP: The mean of the numbers is (2 + 4 + 7 + 12 + 15) / 5 = 40 / 5 = 8
2ND STEP: The deviations from the mean are 2 - 8 = -6, 4 - 8 = -4, 7 - 8 = -1, 12 - 8 = 4, and 15 - 8 = 7
3RD STEP: The squared deviations are 36, 16, 1, 16, and 49, and their sum is 118
4TH STEP: Because we have a sample of n = 5 values, divide the sum 118 by n - 1, which is 4: 118 / 4 = 29.5
5TH STEP: The standard deviation of the sample is the square root of 29.5, which is approximately 5.43
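The five-step procedure can be written as a short Python function. This is a minimal sketch: the function name `standard_deviation` and the `sample` flag are assumptions for illustration. It is applied to the sample 2, 4, 7, 12, 15 from the example.

```python
import math


def standard_deviation(numbers, sample=True):
    """Standard deviation following the five steps of the procedure."""
    n = len(numbers)
    mean = sum(numbers) / n                            # step 1: the mean
    deviations = [x - mean for x in numbers]           # step 2: deviation of each number
    sum_of_squares = sum(d ** 2 for d in deviations)   # step 3: sum of squared deviations
    divisor = n - 1 if sample else n                   # step 4: n - 1 for a sample, n for a population
    return math.sqrt(sum_of_squares / divisor)         # step 5: square root of the quotient


s = standard_deviation([2, 4, 7, 12, 15])              # sample: divides 118 by 4
print(round(s, 2))
```

Python's `statistics.stdev` (sample) and `statistics.pstdev` (population) follow the same two divisors and can be used to check a hand computation. Squaring the result gives the variance.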
Variance - Square of the standard deviation of the data
"NORMAL DISTRIBUTION"
Frequency Distributions and Histograms
- Large sets of data are often displayed using a grouped frequency distribution or a histogram
An internet service provider (ISP) has installed new computers. To estimate the new download times its subscribers will experience,
the ISP surveyed 1000 of its subscribers to determine the time required for each subscriber to download a particular file from an
internet site
- The type of frequency distribution that lists the percent of data in each
class is called a RELATIVE FREQUENCY DISTRIBUTION
- The RELATIVE FREQUENCY HISTOGRAM is drawn by using the data
in the relative frequency distribution. It shows the percent of
subscribers along its vertical axis
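Converting class frequencies to relative frequencies can be sketched in Python. The class boundaries and counts below are hypothetical (they are not the ISP table from the notes); only the total of 1000 subscribers matches the example.

```python
# Hypothetical grouped frequency distribution: download time class (seconds) -> number of subscribers
class_counts = {"0-10": 150, "10-20": 450, "20-30": 300, "30-40": 100}

total = sum(class_counts.values())       # 1000 subscribers in all

# Relative frequency: percent of the data that falls in each class
relative_percent = {label: 100 * count / total for label, count in class_counts.items()}

for label, percent in relative_percent.items():
    print(label, percent)
```

The percents add up to 100, and the percent for a class (divided by 100) can be read as the probability that a randomly chosen subscriber falls in that class.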
Using the relative frequency distribution on the last table, determine the
a. Percent of subscribers who required at least 25 s to download the file
b. Probability that a subscriber chosen at random will require at least 5 s but less than 20 s to download the file
The amount spent for lunch by the students is assumed to be normally distributed
with mean of 90 and standard deviation of 5. Draw the curve that will represent this
distribution
If Em-em is one of the students, find the probability that her lunch costs
a. Less than Php90
b. Less than Php95
c. Less than Php80
d. Between Php80 and Php90
SOLUTION (B): The probability that Em-em's lunch costs less than Php95 is represented by the symbol P(x < 95). In the normal
curve, it is the area to the left of x = 95.
Because 95 is 1 standard deviation above the mean, about 34% of the area lies between 90 and 95, so P(x < 95) is about
0.50 + 0.34 = 0.84 or 84%.
Therefore, the probability that her lunch costs more than Php95 is about 1 - 0.84 = 0.16 or 16%
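The four probabilities can be checked with Python's standard library, assuming the stated normal distribution with mean 90 and standard deviation 5. The variable names are illustrative. The exact areas differ slightly from the rounded percentages used in hand solutions (for example, about 0.1587 instead of 0.16).

```python
from statistics import NormalDist

lunch = NormalDist(mu=90, sigma=5)                 # mean 90, standard deviation 5

p_less_90 = lunch.cdf(90)                          # a. area to the left of x = 90
p_less_95 = lunch.cdf(95)                          # b. area to the left of x = 95
p_less_80 = lunch.cdf(80)                          # c. area to the left of x = 80
p_80_to_90 = lunch.cdf(90) - lunch.cdf(80)         # d. area between x = 80 and x = 90
p_more_95 = 1 - p_less_95                          # area to the right of x = 95

print(round(p_less_90, 2), round(p_less_95, 2), round(p_less_80, 3), round(p_80_to_90, 3))
print(round(p_more_95, 2))
```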
STANDARD NORMAL DISTRIBUTION: A normal distribution whose mean is 0 and whose standard deviation is 1. It is denoted by N(0,1).
Any value x from a normal distribution can be transformed to a standard score z using the formula
z = (x - mean) / (standard deviation)
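The transformation to the standard normal distribution is commonly written as z = (x - mean) / (standard deviation). Assuming that formula, the sketch below converts x = 95 from a normal distribution with mean 90 and standard deviation 5 and confirms that the area to the left is the same before and after the transformation. The function name `z_score` is an assumption for illustration.

```python
from statistics import NormalDist


def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


z = z_score(95, 90, 5)                          # 95 is 1 standard deviation above 90

p_standard = NormalDist(0, 1).cdf(z)            # area to the left of z under N(0,1)
p_original = NormalDist(90, 5).cdf(95)          # area to the left of x = 95 under the original curve

print(z, round(p_standard, 4), round(p_original, 4))
```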