
Math Stats

This document defines key terms related to measures of central tendency and dispersion in statistics. It discusses mean, median, mode, range, and standard deviation as measures used to summarize and describe data. Formulas are provided for calculating mean, weighted mean, median, mode, range, and standard deviation. Examples are included to demonstrate calculating these measures from data sets.

Uploaded by

Pein Mwa
Copyright
© All Rights Reserved

"MEASURES OF CENTRAL TENDENCY"

TERMS:
STATISTICS - Involves the collection, organization, presentation, and interpretation of data
DESCRIPTIVE STATISTICS - Involves the collection, organization, summarization, and presentation of data
INFERENTIAL STATISTICS - Interprets and draws conclusions from data
POPULATION - The whole group to be studied
SAMPLE - A representative part of the whole group; any subset of the population
RAW DATA - Scores that are not yet organized or presented
MODE - Number that occurs most frequently in the number list
MEDIAN - Middle value of the data when arranged in a ranked list
RANGE - Difference between the greatest and least data values
FREQUENCY DISTRIBUTION - A table that lists observed events and the frequency of occurrence of each observed event; often used to organize raw data
SUMMATION NOTATION - Uses the Greek letter sigma, Σ
RANKED LIST - Any list of numbers that is arranged in numerical order
VARIANCE - Square of the standard deviation of a data set

Examples:
Elle is a senior at a university. In a few months she plans to graduate and start a career as a landscape architect. A survey of five
landscape architects from last year’s senior class shows that they received job offers with the following yearly salaries.

Php240,000 Php300,000 Php420,000 Php456,000 Php216,000

MEAN = (240,000 + 300,000 + 420,000 + 456,000 + 216,000) / 5 = 1,632,000 / 5 = 326,400

The mean suggests that Elle can reasonably expect a job offer at a salary of about Php326,400.

Six friends in a biology class of 20 students received test grades of 92, 84, 65, 76, 88, and 90. Find the mean of these test grades.

MEAN = (sum of all the numbers) / (how many numbers there are)

SOLUTION: MEAN = (92 + 84 + 65 + 76 + 88 + 90) / 6 = 495 / 6 = 82.5
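The two mean calculations above can be checked with a few lines of Python. This is a minimal sketch; the function name `mean` and the variable names are chosen here for illustration and are not part of the original notes.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)

# Yearly salary offers (Php) and test grades from the examples above.
salaries = [240_000, 300_000, 420_000, 456_000, 216_000]
grades = [92, 84, 65, 76, 88, 90]

salary_mean = mean(salaries)
grade_mean = mean(grades)

print(salary_mean)  # 326400.0
print(grade_mean)   # 82.5
```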

Find the median of the data in the following lists. (a. 4,8,1,14,9,21,12) (b. 46,23,92,89,77,108)

SOLUTION (A): The list 4,8,1,14,9,21,12 contains 7 numbers. The median of a list with an odd number of entries is found by
ranking the numbers and finding the middle number. Ranking the numbers from smallest to largest gives.
1, 4, 8, 9, 12, 14, 21

The middle number is 9. Thus, 9 is the MEDIAN.


SOLUTION (B): The list 46, 23, 92, 89, 77, 108 contains 6 numbers. The median of a list of data with an even number of entries
is found by ranking the numbers and computing the mean of the 2 middle numbers. Ranking the numbers from smallest to
largest gives: 23, 46, 77, 89, 92, 108

The 2 middle numbers are 77 and 89. The mean of 77 and 89 is 83. Thus 83 is the MEDIAN

IF THE TOTAL n IS AN ODD NUMBER, THE MEDIAN IS THE MIDDLE NUMBER.
IF THE TOTAL n IS AN EVEN NUMBER, THE MEDIAN IS THE MEAN OF THE 2 MIDDLE NUMBERS.
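The odd/even rule for the median can be sketched in Python as follows. The function name `median` is an illustrative choice; Python's standard library also provides `statistics.median`, which follows the same rule.

```python
def median(values):
    """Median of a list: rank the values, then take the middle one (odd n)
    or the mean of the two middle ones (even n)."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]                      # odd n: the middle number
    return (ranked[mid - 1] + ranked[mid]) / 2  # even n: mean of the 2 middle numbers

median_a = median([4, 8, 1, 14, 9, 21, 12])
median_b = median([46, 23, 92, 89, 77, 108])

print(median_a)  # 9
print(median_b)  # 83.0
```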

Find the mode of the data in the following lists. (a. 18, 15, 21, 16, 15, 14, 15, 21) (b. 2, 5, 8, 9, 11, 4, 7, 23)

SOLUTION (A): In the list 18, 15, 21, 16, 15, 14, 15, 21, the number 15 occurs more often than others. Thus, 15 is the mode.
SOLUTION (B): Each number in the list 2, 5, 8, 9, 11, 4, 7, 23 occurs only once. Thus, the list has NO MODE.
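A short Python sketch of the mode, using `collections.Counter` to count occurrences. The function name `modes` and the decision to return a list (empty when every value occurs only once, and containing several values when there is a tie) are assumptions made for this illustration, not a convention stated in the notes.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every number occurs only once, so there is no mode.
        return []
    return [value for value, count in counts.items() if count == highest]

modes_a = modes([18, 15, 21, 16, 15, 14, 15, 21])
modes_b = modes([2, 5, 8, 9, 11, 4, 7, 23])

print(modes_a)  # [15]
print(modes_b)  # []
```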

• MEAN, MEDIAN, and MODE are all averages. The mean of a set of data is the most sensitive of the averages. A change in any of the numbers changes the mean, and the mean can be changed drastically by changing an extreme value.

• In contrast, the MEDIAN and MODE of a set of data are not usually changed by changing an extreme value.

• When a data set has one or more extreme values that are very different from the majority of data values, the mean will not necessarily be a good indicator of an average value.

STATISTICS Page 1
In the following example, we compare the mean, median, and mode for the salaries of 5 employees of a small company.

Monthly Salaries: Php370,000   Php60,000   Php36,000   Php20,000   Php20,000

MEAN: (370,000 + 60,000 + 36,000 + 20,000 + 20,000) / 5 = 506,000 / 5 = 101,200
MEDIAN: 36,000 (MIDDLE NUMBER)
MODE: 20,000 (OCCURS TWICE)
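To see how strongly one extreme value pulls the mean, the sketch below computes the three averages for the five salaries and then again after replacing the largest salary with a smaller one. The replacement value of 70,000 is hypothetical and chosen only for illustration. The functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

salaries = [370_000, 60_000, 36_000, 20_000, 20_000]
# Hypothetical variation: the extreme salary 370,000 is replaced by 70,000.
adjusted = [70_000, 60_000, 36_000, 20_000, 20_000]

original_summary = (mean(salaries), median(salaries), mode(salaries))
adjusted_summary = (mean(adjusted), median(adjusted), mode(adjusted))

print(original_summary)  # mean 101200, median 36000, mode 20000
print(adjusted_summary)  # mean 41200, median 36000, mode 20000
```

Only the mean changes; the median and mode are the same for both lists.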

WEIGHTED MEAN: Often used when some data values are more important than others.

Consider the situation in which a professor counts the final examination as 2 test scores. To find the weighted mean of the
student's scores, the professor first assigns a weight to each score.

In this case, the professor could assign each of the test scores a weight of 1 and the final exam score a weight of 2.

A student with test scores of 65, 70, and 75 and a final examination score of 90 has a weighted mean of:

[(65 x 1) + (70 x 1) + (75 x 1) + (90 x 2)] / 5 = 390 / 5 = 78

The table below shows Dillon's fall semester course grades. Use the weighted mean formula to find Dillon's GPA for the fall semester.

The B is worth 3 points, with a weight of 4; the A is worth 4 points, with a weight of 3; the D is worth 1 point, with a weight of 3; and the C is worth 2 points, with a weight of 4. The sum of all weights is 14.

Weighted mean = [(3 x 4) + (4 x 3) + (1 x 3) + (2 x 4)] / 14 = 35 / 14 = 2.5
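Both weighted mean examples follow the same pattern: multiply each value by its weight, add the products, and divide by the sum of the weights. A minimal Python sketch, with the function name `weighted_mean` chosen here for illustration:

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight

# Three tests with weight 1 each and a final examination with weight 2.
exam_mean = weighted_mean([65, 70, 75, 90], [1, 1, 1, 2])

# Dillon's grades: grade points (B=3, A=4, D=1, C=2) weighted by 4, 3, 3, 4.
gpa = weighted_mean([3, 4, 1, 2], [4, 3, 3, 4])

print(exam_mean)  # 78.0
print(gpa)        # 2.5
```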

FREQUENCY DISTRIBUTION: A table that lists observed events and the frequency of occurrence of each observed event; often used to organize raw data.

Consider the following table, which lists the number of laptop computers owned by families in each of 40 homes in a subdivision.

The frequency distribution below was constructed using the data from the table. The first column of the frequency distribution consists of the numbers 0, 1, 2, 3, 4, 5, 6, and 7. The corresponding frequency of occurrence, f, of each of the numbers in the 1st column is listed in the 2nd column.

The formula for a weighted mean can be used to find the mean of the data in a frequency distribution. The only change is that the weights w1, w2, w3, ..., wn are replaced with the frequencies f1, f2, f3, ..., fn.

The numbers in the right-hand column of Table 13.4 are the frequencies f for the numbers in the 1st column. The sum of all the frequencies is 40.
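The mean of a frequency distribution can be sketched in Python by treating each frequency as a weight. The frequencies used below are hypothetical: they were made up so that they sum to 40 and should not be read as the values of Table 13.4. The function name `frequency_mean` is also an illustrative choice.

```python
def frequency_mean(values, frequencies):
    """Mean of a frequency distribution: sum of x * f divided by the sum of f."""
    total = sum(frequencies)
    if len(values) != len(frequencies) or total == 0:
        raise ValueError("need one non-trivial frequency per value")
    return sum(x * f for x, f in zip(values, frequencies)) / total

# Number of laptop computers per home (first column).
values = [0, 1, 2, 3, 4, 5, 6, 7]
# Hypothetical frequencies for 40 homes (assumed for illustration only).
frequencies = [5, 12, 14, 3, 2, 3, 0, 1]

freq_mean = frequency_mean(values, frequencies)
print(sum(frequencies))  # 40
print(freq_mean)         # 1.975 for these assumed frequencies
```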

"MEASURES OF DISPERSION"
Range - The range of a set of data values is the difference between the greatest data value and the least data value.
Standard Deviation - A measure of dispersion that is less sensitive to extreme values than the range. It makes use of the amount by which each individual data value deviates from the mean.

These deviations, represented by (x - x̄), are positive when the data value x is greater than the mean x̄ and are negative when x is less than the mean x̄. The sum of all the deviations is 0 for all sets of data.

Find the RANGE:

The greatest number of ounces dispensed is 10.07 and the least is 5.85. The range of the number of ounces dispensed is 10.07 - 5.85 = 4.22

Because the sum of all the deviations of the data values from the mean is always 0, we cannot use the sum of the deviations as a
measure of dispersion for a set of data. Instead, the standard deviation uses the sum of the squares of the deviations
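A quick numerical check of this point, using the sample 2, 4, 7, 12, 15 that appears in these notes. The variable names are illustrative.

```python
sample = [2, 4, 7, 12, 15]
sample_mean = sum(sample) / len(sample)

# Deviation of each value from the mean: positive above the mean, negative below.
deviations = [x - sample_mean for x in sample]

sum_of_deviations = sum(deviations)
sum_of_squares = sum(d ** 2 for d in deviations)

print(deviations)         # [-6.0, -4.0, -1.0, 4.0, 7.0]
print(sum_of_deviations)  # 0.0
print(sum_of_squares)     # 118.0
```

The deviations cancel out, while their squares do not, which is why the squared deviations are used.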

The standard deviation of a sample is designated by the lowercase letter s.

The standard deviation of a population is designated by the lowercase Greek letter sigma, σ.

PROCEDURE IN CALCULATING THE STANDARD DEVIATION OF n NUMBERS
1. Determine the mean of the n numbers
2. For each number, calculate the deviation (difference) between the number and the mean of the numbers
3. Calculate the square of each of the deviations and find the sum of these squared deviations
4. If the data is a POPULATION, then divide the sum by n. If the data is a SAMPLE, then divide the sum by n - 1
5. Find the square root of the quotient in step 4

The following numbers were obtained by sampling a population: 2, 4, 7, 12, 15
Find the standard deviation of the sample.

SOLUTION:
1ST STEP: The mean of the numbers is (2 + 4 + 7 + 12 + 15) / 5 = 40 / 5 = 8
2ND STEP: The deviations from the mean are 2 - 8 = -6, 4 - 8 = -4, 7 - 8 = -1, 12 - 8 = 4, and 15 - 8 = 7
3RD STEP: The squared deviations are 36, 16, 1, 16, and 49. Their sum is 118
4TH STEP: Because we have a sample of n = 5 values, divide the sum 118 by n - 1, which is 4: 118 / 4 = 29.5
5TH STEP: The standard deviation of the sample is s = √29.5. To the nearest hundredth, the standard deviation is s = 5.43
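The five-step procedure can be sketched as a single Python function. The name `standard_deviation` and its `sample` flag are illustrative choices; Python's standard library offers `statistics.stdev` (sample) and `statistics.pstdev` (population) for the same calculations.

```python
import math

def standard_deviation(values, sample=True):
    """Standard deviation following the five steps in the notes.

    sample=True divides by n - 1; sample=False (population) divides by n.
    """
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough data values")
    mean = sum(values) / n                               # step 1
    deviations = [x - mean for x in values]              # step 2
    sum_of_squares = sum(d ** 2 for d in deviations)     # step 3
    quotient = sum_of_squares / (n - 1 if sample else n) # step 4
    return math.sqrt(quotient)                           # step 5

data = [2, 4, 7, 12, 15]
s = standard_deviation(data, sample=True)
sigma = standard_deviation(data, sample=False)

print(round(s, 2))      # 5.43
print(round(sigma, 2))  # 4.86
```

The second result shows what the answer would be if the same five numbers were treated as a whole population rather than a sample.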

Variance - Square of the standard deviation of the data

"NORMAL DISTRIBUTION"
Frequency Distributions and Histograms
- Large sets of data are often displayed using a grouped frequency distribution or a histogram

An internet service provider (ISP) has installed new computers. To estimate the new download times its subscribers will experience,
the ISP surveyed 1000 of its subscribers to determine the time required for each subscriber to download a particular file from an
internet site

The results of that survey are summarized in Table 13.7

- The table is called a grouped frequency distribution. It shows how often (frequently) certain events occurred.
- Each interval, 0-5, 5-10, and so on, is called a class. The distribution has 12 classes. For the 10-15 class, 10 is the lower class boundary and 15 is the upper class boundary.
- Any data value that lies on a common boundary is assigned to the higher class.
- The graph of a frequency distribution is called a histogram. A histogram provides a pictorial view of how the data are distributed.

A grouped Frequency distribution with 12 classes

- The type of frequency distribution that lists the percent of data in each
class is called RELATIVE FREQUENCY DISTRIBUTION
- The RELATIVE FREQUENCY HISTOGRAM was drawn by using the data
in the relative frequency distribution. It shows the percent of
subscribers along its vertical axis

A Relative Frequency Distribution

One advantage of using a relative frequency distribution instead of a grouped frequency distribution is that there is a direct correspondence between the percent values of the relative frequency distribution and probabilities.

For instance, in the relative frequency distribution in the table, the percent of the data that lies between 35 s and 40 s is 14.9%.

Thus, if a subscriber is chosen at random, the probability that the subscriber will require at least 35 s but less than 40 s to download the file is 0.149.

Using the relative frequency distribution in the last table, determine the
a. Percent of subscribers who required at least 25 s to download the file
b. Probability that a subscriber chosen at random will require at least 5 s but less than 20 s to download the file

a. The percent of data in all the classes with a lower boundary of 25 s or more is the sum of the percents printed in red in Table 13.9. Thus the percent of subscribers who required at least 25 s to download the file is 69.1%.

b. The percent of data in all the classes with a lower boundary of at least 5 s and an upper boundary of at most 20 s is the sum of the percents printed in blue in the table. Thus the percent of subscribers who required at least 5 s but less than 20 s to download the file is 15.2%, so the probability is 0.152.
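The link between a grouped frequency distribution, its relative frequencies, and probabilities can be sketched in Python. The class counts below are hypothetical and were invented for this illustration; they are not the survey data from Table 13.7 or Table 13.9. The function name `probability` is likewise an assumption.

```python
# Hypothetical grouped frequency distribution: (lower, upper) boundary -> count.
# These counts are made up for illustration; they are not the survey data.
counts = {(0, 5): 20, (5, 10): 50, (10, 15): 90, (15, 20): 40}
total = sum(counts.values())

# Relative frequency distribution: percent of the data in each class.
percents = {cls: 100 * f / total for cls, f in counts.items()}

def probability(lower, upper):
    """Probability that a value is at least `lower` but less than `upper`.

    Assumes both arguments are class boundaries, so whole classes are summed.
    """
    inside = sum(f for (lo, hi), f in counts.items() if lo >= lower and hi <= upper)
    return inside / total

p = probability(5, 15)

print(percents[(5, 10)])  # 25.0
print(p)                  # 0.7
```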

Normal Distributions and the Empirical Rule


- One of the most important statistical distributions of data is known as a normal distribution. This occurs in a variety of
applications
- Types of data that may demonstrate a normal distribution include the lengths of leaves on a tree, the weights of newborns in
a hospital, the lengths of time of a student's trip from home to school over a period of months, the SAT scores of a large
group of students and the life spans of light bulbs.

NORMAL DISTRIBUTION: A distribution of a continuous random variable X

- Forms a bell-shaped curve that is symmetric about a vertical line through the mean of the data. A graph of a normal distribution with a mean of 5 is shown below; its graph is bell shaped.
- It depends on 2 parameters:
1. The population mean, µ
2. The population standard deviation, σ
- Often referred to as the Gaussian distribution in honor of Carl Friedrich Gauss (1777 - 1855)
- It is denoted by N(X;µ,σ) or N(µ,σ).

Characteristics of normal distribution:


- Bell-shaped
- Symmetric around their mean
- The mean, median, mode are equal
- Denser in the center and less dense in tails
- Area is equal to 1.0 or 100%

Draw the curve denoted by N(90, 10)

The amount spent for lunch by the students is assumed to be normally distributed
with mean of 90 and standard deviation of 5. Draw the curve that will represent this
distribution

Most tests that gauge one's intelligence quotient (IQ) are designed to have a mean of 100 and a standard deviation of 15. It is also known that IQs are normally distributed. Draw a curve that would represent the distribution for IQs.

The amount spent for lunch by the students is assumed to be normally distributed with mean of 90
and standard deviation of 5. Draw the curve that will represent this distribution.

If Em-em is one of the students, find the probability that her lunch costs
a. Less than Php90
b. Less than Php95
c. Less than Php80
d. Between Php80 and Php90

Less than Php90
Let X be the cost of Em-em's lunch. The probability that Em-em's lunch is less than Php90 is represented by the symbol P(x<90). In the normal curve, it is the area on the left of x = 90. Because 90 is the mean, the shaded region is the left half of the curve. Therefore, the probability that her lunch is less than 90 is 0.5 or 50%.

Less than Php95
The probability that Em-em's lunch is less than Php95 is represented by the symbol P(x<95). In the normal curve, it is the area on the left of x = 95. This area is 0.5 + 0.34 = 0.84. Therefore, the probability that her lunch is less than 95 is 0.84 or 84%.

More than Php95
The probability that Em-em's lunch is more than Php95 is represented by the symbol P(x>95). In the normal curve, it is the area on the right of x = 95. Because the area on the left of x = 95 is 0.5 + 0.34 = 0.84, the area on the right is 1 - 0.84 = 0.16. Therefore, the probability that her lunch is more than 95 is 0.16 or 16%.

Less than Php80
The probability that the cost of Em-em's lunch is less than Php80 is represented by the symbol P(x<80). In the normal curve, it is the area on the left of x = 80. Therefore, the probability that her lunch is less than 80 is 0.02 or 2%.
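These probabilities can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is a sketch that assumes, as the example does, a mean of 90 and a standard deviation of 5; the variable names are illustrative. The exact values differ slightly from the rounded areas read off the curve.

```python
from statistics import NormalDist

# Lunch cost modeled as normal with mean 90 and standard deviation 5.
lunch = NormalDist(mu=90, sigma=5)

p_less_90 = lunch.cdf(90)      # area on the left of x = 90
p_less_95 = lunch.cdf(95)      # area on the left of x = 95
p_more_95 = 1 - lunch.cdf(95)  # area on the right of x = 95
p_less_80 = lunch.cdf(80)      # area on the left of x = 80

for label, p in [("P(x<90)", p_less_90), ("P(x<95)", p_less_95),
                 ("P(x>95)", p_more_95), ("P(x<80)", p_less_80)]:
    print(f"{label} = {p:.2f}")
```

Rounded to two decimal places, the results are 0.50, 0.84, 0.16, and 0.02.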
STANDARD NORMAL DISTRIBUTION: A normal distribution whose mean is 0 and whose standard deviation is 1. It is denoted by N(0,1).
Any normal distribution can be transformed to the standard normal distribution using the formula z = (x - µ) / σ
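The transformation to the standard normal distribution is usually written z = (x - µ) / σ. Assuming that is the formula intended here, the sketch below converts two lunch costs (mean 90, standard deviation 5) to z-scores and confirms that the standard normal curve gives the same probability as the original curve. The function name `z_score` is an illustrative choice.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mu) / sigma

z_95 = z_score(95, mu=90, sigma=5)
z_80 = z_score(80, mu=90, sigma=5)

standard = NormalDist()  # mean 0, standard deviation 1, i.e. N(0,1)
p_standard = standard.cdf(z_95)
p_original = NormalDist(mu=90, sigma=5).cdf(95)

print(z_95)  # 1.0
print(z_80)  # -2.0
print(abs(p_standard - p_original) < 1e-12)  # True
```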
