Lesson1 160127200215
Why should you learn statistics?
In the general sense, statistics is the science of dealing with variability, uncertainty, and
subjectivity to produce objective and quantitative information that can assist in making reliable
decisions about numerous situations in life. Globally, statistics is a key tool in the activities of
governments and organizations.
The reason we need statistics is that we are living in a world of numbers, or more precisely a
world of data.
• Data, on the other hand, are called statistics when they reflect specific or descriptive measures
of the phenomenon or the event under study
What is statistics?
Definitions of statistics
Definition: "The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling."
Source: American Heritage Dictionary®
Definition: "A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data."
Source: The Merriam-Webster’s Collegiate Dictionary®
“The science and art of reading, describing, and manipulating data, which represents variables so that
practical observations about a population can be made from a sample drawn from the population, and
guidelines can be established to allow making precise and accurate conclusions about a certain
process or system”
Statistics
100, 140, 213, 230, 180, 211, 120, 160, 200, 110, 260, 235, 280, 180, 300
DATA
100, 140, 213, 230, 180, 211, 120, 160, 200, 110, 260, 235, 280, 180, 300
NUMBERS
*** If you can maintain your grade at an “A” from course to course, it becomes a controlled variable or a constant.
Example: In a survey conducted in a community college of 5000 students, 800 students were selected
randomly and asked if they would transfer to a four-year university. Five hundred and fifty of the students said
yes. Identify the population and the sample. Describe the data set.
Solution:
• The population consists of all 5000 students at the community college.
• The sample consists of all the students who were randomly selected (800 students).
• The actual data set consists of 550 yes responses and 250 no responses.
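The survey example can also be summarized numerically. The following is a minimal Python sketch, not part of the original lesson; the variable names are illustrative assumptions, and the counts are the ones stated in the example.

```python
# Survey example: 5000 students in the population, 800 sampled, 550 said yes.
population_size = 5000
sample_size = 800
yes_count = 550

no_count = sample_size - yes_count                  # remaining responses are "no"
sample_proportion_yes = yes_count / sample_size     # a sample statistic
sampling_fraction = sample_size / population_size   # share of the population surveyed

print(f"No responses: {no_count}")
print(f"Proportion answering yes: {sample_proportion_yes:.4f}")
print(f"Fraction of population sampled: {sampling_fraction:.2f}")
```

The proportion of yes answers describes only the 800 sampled students; treating it as an estimate for all 5000 students is an inference step.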
Major television networks constantly monitor the popularity of their programs by surveying
TV viewers.
(a) Suppose 1000 randomly selected prime-time TV viewers were asked if they watched a new talk show, and 450
indicated they watched the show. Identify the population and the sample. Describe the data set.
(b) In another survey, suppose 1100 randomly selected prime-time TV viewers were asked if they watched the 2009
Super Bowl on TV, and 999 indicated they watched the game. Identify the population and the sample. Describe the
data set.
Example: Decide whether the numerical values given below describe a sample statistic or a
population parameter.
(a) A sample of community college professors in the U.S.A. revealed that the average starting salary of a
college professor is $52,000 and the range of salaries is $62,000.
(b) In a college survey of all freshmen students, it was revealed that 85% of the students were fresh out of
high school and 15% were students who graduated from high school more than 5 years ago.
Solution:
(a) The average is a sample statistic and the range is a sample statistic, because both come from a sample of professors.
(b) Both percentages are population parameters, because the survey covered all freshmen students.
Determine whether the numerical values given below describe a sample statistic or
a population parameter.
(b) New residents of an apartment complex were asked if they like the landscaping surrounding the
complex. Eighty-five percent indicated that they like it.
(c) The average salary of a group of 120 employees selected randomly from different divisions of a
company was found to be $65,000.
[Figure: two panels labeled Inaccurate/Imprecise and Accurate/Precise]
38.3 37.8 36.0 38.3 38.2 37.6 38.2 38.4 37.9 38.3 39.0
Describe these values in terms of precision and accuracy and explain your
answer.
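As a hedged starting point for this exercise, the sketch below measures only how closely the values agree with one another, which relates to precision. Judging accuracy requires a reference (true or target) value, and none is given in the text, so the sketch does not attempt it. The variable names are illustrative assumptions.

```python
values = [38.3, 37.8, 36.0, 38.3, 38.2, 37.6, 38.2, 38.4, 37.9, 38.3, 39.0]

mean_value = sum(values) / len(values)      # center of the readings
value_range = max(values) - min(values)     # spread of the readings

# The reading farthest from the mean is a candidate for closer inspection.
farthest = max(values, key=lambda v: abs(v - mean_value))

print(f"Mean: {mean_value:.1f}")
print(f"Range: {value_range:.1f}")
print(f"Reading farthest from the mean: {farthest}")
```

A small range relative to the size of the readings suggests the values are close to one another; whether they are also close to the correct value cannot be decided from these numbers alone.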
• A process implies one or more of five basic elements: machine, material, methodology, people,
and environment.
• A system is an entity that has inputs and outputs
Example: the five elements of the education process
• Material: books and notes
• Machine: projectors and computers
• People: teachers and students
• Methodology: face-to-face or distance learning
• Environment: classroom or laboratory
Example (A): a weaving machine as a system
Input (yarns) → System (weaving machine) → Output (fabric)
• Inputs: fiber type, yarn strength, yarn diameter
• Outputs: fabric width, fabric thickness, fabric weight
Output-Input Relationships
Descriptive statistics: how to describe data
• The starting point in any statistical analysis is to read data using the language of statistics.
• This language uses so-called ‘descriptive statistics’ to establish an organized and
meaningful display of data, with the goal of reducing the flood of data down
to a few statistics that can fully describe the data.
Problem: The four sets of data shown below represent student grades of four consecutive
statistics quizzes given to a class of 10 students. Describe each set of data by closely
observing it and writing your comments.
Student   1   2   3   4   5   6   7   8   9  10
Quiz 1   90  90  90  90  90  90  90  90  90  90
Quiz 2   82  86  78  30  88  82  79  77  81  99
Quiz 3   68  90  89  71  92  95  73  75  94  66
Quiz 4   82  76  85  88  95  86  84  87  96  78
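One way to back up the visual comparison of the four quizzes is to compute a measure of center (the mean) and a measure of spread (the range) for each. This is a minimal Python sketch using the grades from the table; the dictionary layout and names are illustrative assumptions.

```python
quizzes = {
    "Quiz 1": [90, 90, 90, 90, 90, 90, 90, 90, 90, 90],
    "Quiz 2": [82, 86, 78, 30, 88, 82, 79, 77, 81, 99],
    "Quiz 3": [68, 90, 89, 71, 92, 95, 73, 75, 94, 66],
    "Quiz 4": [82, 76, 85, 88, 95, 86, 84, 87, 96, 78],
}

# For each quiz: (mean, range), where range = highest grade - lowest grade.
summary = {
    name: (sum(grades) / len(grades), max(grades) - min(grades))
    for name, grades in quizzes.items()
}

for name, (mean, grade_range) in summary.items():
    print(f"{name}: mean = {mean:.1f}, range = {grade_range}")
```

Quiz 1 has no variability at all, while the single grade of 30 in Quiz 2 pulls its mean down and stretches its range, which is the kind of observation the problem asks for.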
Simple Numerical descriptive statistics
Using statistics, we can provide two types of description:
Example: Grades 82 76 85 88 95 86 84 87 96 78
Sorted: 76 78 82 84 85 86 87 88 95 96
Range: R = 96 - 76 = 20
Simple Numerical descriptive statistics
Working Problem 1.4:
Calculate the mean and the range for the following data set of people's incomes ($):
Calculate the mean and the range for the following data set of people's weights (pounds):
Calculate the mean and the range for the following data set of people's temperatures (°F):
Calculate the mean and the range for the following data set of property taxes ($):
8100, 3500, 7000, 4200, 3000, 5000, 5100, 4000, 7500, 4800
Histogram and frequency distribution
A histogram or a frequency distribution is a simple x-y graph in which the horizontal x-axis
represents the values (or classes of values) of the variable and the vertical y-axis represents
the number of observations corresponding to each value (or the frequency).
Observation # 1 2 3 4 5 6 7 8 9 10
Grades 75 65 90 75 90 75 65 90 75 100
[Figure: histogram of the grades above; x-axis: Grade (65 to 100), y-axis: Frequency]
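The frequency of each grade in the ten observations can be tallied directly. The sketch below uses Python's `collections.Counter` and prints a simple text histogram; the text layout is an illustrative choice, not the figure from the slide.

```python
from collections import Counter

grades = [75, 65, 90, 75, 90, 75, 65, 90, 75, 100]

# Count how many observations fall on each distinct grade value.
frequency = Counter(grades)

for grade in sorted(frequency):
    count = frequency[grade]
    print(f"{grade:>3} | {'*' * count} ({count})")
```

Each row corresponds to one bar of the histogram: the grade is the x-axis value and the number of stars is the bar height.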
Typical Example of a Histogram: Data of human weight (lb) of a random sample of 60 people
n Weight n Weight n Weight n Weight n Weight n Weight
1 146 11 145 21 144 31 153 41 127 51 146
2 145 12 157 22 267 32 162 42 145 52 159
3 147 13 148 23 151 33 144 43 137 53 157
4 120 14 155 24 143 34 160 44 160 54 144
5 187 15 158 25 161 35 110 45 141 55 159
6 157 16 195 26 148 36 142 46 154 56 162
7 143 17 142 27 240 37 155 47 152 57 157
8 117 18 154 28 128 38 145 48 149 58 149
9 170 19 160 29 136 39 150 49 125 59 283
10 138 20 160 30 110 40 136 50 139 60 154
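With 60 observations spread over many distinct values, a histogram is usually built on classes of values rather than individual values. The sketch below groups the weights into classes; the class width of 20 lb and the starting point of 100 lb are assumptions chosen for illustration, not values from the lesson.

```python
from collections import Counter

# Weights (lb) for observations n = 1..60, in table order.
weights = [
    146, 145, 147, 120, 187, 157, 143, 117, 170, 138,
    145, 157, 148, 155, 158, 195, 142, 154, 160, 160,
    144, 267, 151, 143, 161, 148, 240, 128, 136, 110,
    153, 162, 144, 160, 110, 142, 155, 145, 150, 136,
    127, 145, 137, 160, 141, 154, 152, 149, 125, 139,
    146, 159, 157, 144, 159, 162, 157, 149, 283, 154,
]

CLASS_WIDTH = 20   # assumed class width (lb)
LOWER_BOUND = 100  # assumed start of the first class (lb)

def class_start(weight):
    """Return the lower limit of the class that contains this weight."""
    return LOWER_BOUND + ((weight - LOWER_BOUND) // CLASS_WIDTH) * CLASS_WIDTH

class_counts = Counter(class_start(w) for w in weights)

for start in range(LOWER_BOUND, max(weights) + 1, CLASS_WIDTH):
    count = class_counts[start]
    print(f"{start}-{start + CLASS_WIDTH - 1}: {'*' * count} ({count})")
```

A different class width would change the shape of the display, so the choice of classes is part of describing the data.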
Key Points about Descriptive Statistics:
•The key elements of descriptive statistics are the measures of central tendency
and the measures of variability
What is probability?
Most people use terms such as chance, likelihood, or probability to reflect the level of uncertainty
about some issues or events. Some of the common examples of using these terms are as follows:
• As you watch the news every day, you hear forecasters saying that there is a 70% chance of rain
tomorrow.
• As you plan to enter a new business, an expert in the field tells you that the probability of making a
profit in this business is only 0.4, that is, there is a 40% chance that you will make a profit.
• As you take a new course, you may wonder about the likelihood of passing or failing the
course.
• Your friend is undergoing surgery and the physician tells him that his chance of surviving the
surgery is 95%.
• Health news regularly reports that a smoker has a greater chance of getting lung cancer
than a nonsmoker.
These are all expressions of probability that we often hear or read about and they can affect
our planning or intention to do or not to do things in life.
What is a probability value?
The general definition of probability is that it is a value between zero and one that indicates
how likely it is that an event will occur.
• A probability of zero or close to zero implies that an event is very unlikely to occur.
• A probability of one or close to one gives us high assurance that an event will occur.
• Between these two extremes, probability values can be expressed as a decimal
such as 0.33, 0.7, or 0.50, as a fraction such as 1/3, 7/10, or 1/2, or as a percent such as
33.33%, 70%, or 50%.
Classically, probability is defined by the ratio of the number of particular target outcomes of an
event to the number of all possible equally likely and mutually exclusive outcomes. For example,
we know that in tossing a coin, the chance of head being the outcome is 50% or, in the context
of probability, we say that the probability of head occurrence is 0.5. This value is a direct
calculation from the classic definition of probability where the total number of possible
outcomes in tossing a coin is 2 (head or tail), and the chance of head being the outcome is 1/2.
[Figure: the two sides of a coin, head (H) and tail (T)]
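The classical definition translates directly into a ratio of counts. The following is a minimal Python sketch for the coin example; the function name `classical_probability` is a hypothetical helper introduced here, and it assumes every listed outcome is equally likely.

```python
from fractions import Fraction

def classical_probability(target_outcomes, all_outcomes):
    """Ratio of target outcomes to all equally likely, mutually exclusive outcomes."""
    favorable = [outcome for outcome in all_outcomes if outcome in target_outcomes]
    return Fraction(len(favorable), len(all_outcomes))

coin = ["H", "T"]  # the two possible outcomes of one toss
p_head = classical_probability({"H"}, coin)

# The same probability written as a fraction, a decimal, and a percent.
print(p_head, float(p_head), f"{float(p_head):.0%}")
```

Using `Fraction` keeps the value exact, which makes the link between the fraction, decimal, and percent forms easy to see.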
Sampling and sampling techniques
Sampling:
Why Sampling?
What are the different types of sampling?
[Figure: a population divided into groups I, II, III, and IV, with a sample drawn by random sampling and a sample drawn by stratified sampling]
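The difference between the two sampling schemes named here can be sketched in a few lines of Python. Everything below is a hypothetical illustration: the population of 100 members, the equal group sizes, and the sample size of 20 are assumptions, not data from the lesson.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same result on every run

groups = ["I", "II", "III", "IV"]

# Hypothetical population: 25 members in each of the four groups.
population = [(group, member) for group in groups for member in range(25)]

# Simple random sampling: every member has the same chance of being chosen,
# so the number drawn from each group is left to chance.
random_sample = random.sample(population, 20)

# Stratified sampling: draw a fixed number from each group separately,
# so every group is represented in the sample.
stratified_sample = []
for group in groups:
    stratum = [member for member in population if member[0] == group]
    stratified_sample.extend(random.sample(stratum, 5))

for name, sample in [("Random", random_sample), ("Stratified", stratified_sample)]:
    per_group = {g: sum(1 for m in sample if m[0] == g) for g in groups}
    print(name, per_group)
```

In the stratified sample each group contributes exactly five members by construction, whereas the counts in the simple random sample vary from run to run when the seed is changed.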
What are the different sources of variability?
Working Problem 1.9:
(a) If one picks any boll of cotton from the field, one will not find two cotton
fibers in this boll that are similar in length, diameter, or maturity.
Inherent ( ) Induced ( )
(b) You open a box of water bottles and you find that some bottles are completely filled and some are
half-empty.
Inherent ( ) Induced ( )
(c) At the workplace, you find some people performing better than other people of the
same experience and background.
Inherent ( ) Induced ( )
(d) In a sample of natural soil aggregates taken from a certain area, you find no two soil particles
that are alike in size or texture.
Inherent ( ) Induced ( )
What are the different types of variables?
•Continuous variables
•Discrete variables
•Special variables
Quantitative Variables
Qualitative Variables
Working Problem 1.10:
Decide whether the following variables are discrete or continuous. Explain your reasoning.
Discrete ( ) Continuous ( )
Discrete ( ) Continuous ( )
Discrete ( ) Continuous ( )
Discrete ( ) Continuous ( )
Discrete ( ) Continuous ( )
What are the levels of measurement?
In order to avoid any confusion about what level of measurement a variable belongs to, we
should begin by addressing the following key questions (Michael Sullivan III, 2010, Statistics:
Informed Decisions Using Data, Third Edition, Prentice Hall, Pearson Education, Inc.):
• Does the variable simply categorize each individual? If so, the variable is nominal (e.g.
gender).
• Does the variable categorize and allow ranking of each value of the variable? If so, the
variable is ordinal (e.g. letter grade in your calculus class).
• Do differences in values of the variable have meaning, but a value of zero does not mean
the absence of the quantity? If so, the variable is interval (e.g. temperature).
• Do ratios of values of the variable have meaning and is there a natural zero starting
point? If so, the variable is ratio (e.g. human weight and number of hours you study every week).
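The distinction between the interval and ratio levels can be checked with a small calculation. The sketch below is illustrative only: the temperature and weight readings are made-up values, and the point is that a ratio of temperatures changes with the unit while a ratio of weights does not.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9

LB_TO_KG = 0.45359237  # kilograms per pound

# Interval variable: temperature (illustrative readings).
warm_f, cool_f = 80.0, 40.0
warm_c, cool_c = fahrenheit_to_celsius(warm_f), fahrenheit_to_celsius(cool_f)
temp_ratio_f = warm_f / cool_f   # ratio in Fahrenheit
temp_ratio_c = warm_c / cool_c   # ratio in Celsius differs, so it has no physical meaning

# Ratio variable: weight (illustrative readings).
heavy_lb, light_lb = 200.0, 100.0
weight_ratio_lb = heavy_lb / light_lb
weight_ratio_kg = (heavy_lb * LB_TO_KG) / (light_lb * LB_TO_KG)  # same ratio in any unit

print(f"Temperature ratio: {temp_ratio_f:.1f} in °F, {temp_ratio_c:.1f} in °C")
print(f"Weight ratio: {weight_ratio_lb:.1f} in lb, {weight_ratio_kg:.1f} in kg")
```

Because zero degrees on either temperature scale does not mean an absence of temperature, "twice as hot" depends on the scale; zero weight does mean an absence of weight, so "twice as heavy" holds in any unit.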
What are the levels of measurement?
Example: What is the level of measurement for each of the following variables?
Working Problem 1.11:
18, 21, 19, 17, 16, 22, 21, 22, 23, 18, 18, 17, 19
(b) In a survey of 500 luxury-house owners (houses priced above $2 million), 200 were from California, 150 from New York, and 150 from
Florida.
What is a Normal Distribution?
[Figure: frequency distribution; x-axis: Characteristic value (10 to 80)]
General Features of a Normal Distribution
[Figure: normal distribution curve; x-axis: Characteristic value (low, center, and high values), y-axis: Frequency; the mean, mode, and median are marked at the center]
Example: The Areas of Two Different Types of Ceramic Tiles
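[Figure: frequency distributions of tile area for Type A and Type B, one with low variability and one with high variability; the mean area, mode, and median are marked; y-axis: Frequency]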
Classifying Variables by the Nature of Values:
Nominal-the-Best Variables
- Thickness of wood board
- Area of ceramic tiles
- Shoe size
[Figure: frequency distribution; x-axis: Characteristic value, y-axis: Frequency]
Working Problem 1.13:
The frequency distributions below illustrate the grades of three quizzes for students taking a biology
class. The final frequency distribution combines all the grades of the three quizzes. Describe the
students’ performances in each quiz and the overall performance of the class.
[Figures: percent frequency distributions of grades (40 to 95) for Quiz 1, Quiz 2, Quiz 3, and All Quizzes]
Working Problem 1.15:
The frequency distribution shown below represents the average grades of the three quizzes in
Working Problems 1.13 or 1.14. Describe this distribution.
Frequency Distribution of Average Grades of the Three Quizzes
[Figure: percent frequency distribution; x-axis: Grade Averages, y-axis: Percent]
Inferential statistics: how do you estimate population parameters from sample statistics?
[Diagram: a sample is drawn from the population; descriptive statistics yield sample statistics, and inferential statistics link them to population parameters]
Population Parameters: population mean (μ), population mode, population range, population standard deviation (σ)
Sample Statistics: sample mean (X-bar), sample mode, sample range, sample standard deviation (s)
Descriptive Statistics: The analysis of determining sample statistics (e.g. mean and range)
Inferential Statistics: The analysis of estimating population parameters from sample
statistics
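The sample statistics named in this section can be computed with Python's standard `statistics` module. The sketch below treats the ten grades from the histogram slide as a sample, which is an assumption made for illustration; the results are point estimates of the corresponding population parameters, not the parameters themselves.

```python
import statistics

# Ten grades, treated here as a sample from a larger population of grades.
sample = [75, 65, 90, 75, 90, 75, 65, 90, 75, 100]

sample_mean = statistics.mean(sample)    # X-bar, estimates the population mean
sample_mode = statistics.mode(sample)    # most frequent value in the sample
sample_range = max(sample) - min(sample) # highest value minus lowest value
sample_std = statistics.stdev(sample)    # s, computed with the n - 1 divisor

print(f"Sample mean (X-bar): {sample_mean}")
print(f"Sample mode: {sample_mode}")
print(f"Sample range: {sample_range}")
print(f"Sample standard deviation (s): {sample_std:.2f}")
```

`statistics.stdev` uses the sample formula (dividing by n - 1); `statistics.pstdev` would be the choice if the ten grades were the entire population.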