Lesson 1-Introduction To Statistics
STATISTICAL CONCEPTS
Prepared by: Roy I. Branzuela STATSAP: Statistical Analysis with Software Application
Learning Outcomes:
At the end of the lesson, you are expected to:
Think about it!
•What is statistics?
Consider the following:
• Twenty-eight percent, or 17.3 million, of Filipino adults aged 15 years and
older are current tobacco smokers, according to the results of the 2009 Global
Adult Tobacco Survey (GATS).
• Based on the 2020 Annual Poverty Indicators Survey (APIS), 7.0 percent of
all family members reported that they were ill or injured in the
month preceding the survey; that is, from 01 to 30 June 2020.
• The unemployment rate in the country rose slightly in December 2021 to
6.6 percent, from the 6.5 percent reported in November 2021.
• Registered deaths attributed to transport accidents decreased by 32.0
percent, from 12.80 thousand (2.1% share) in 2019 to 8.70 thousand (1.4%
share) in 2020, moving its rank from 11th to 17th.
Source: Philippine Statistics Authority
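A percent change like the one in the last bullet can be checked from the two counts. The following is a minimal sketch; the function name `percent_change` is an illustrative choice, not something from the source.

```python
def percent_change(old, new):
    """Return the percent change from old to new (negative means a decrease)."""
    return (new - old) / old * 100

deaths_2019 = 12.80  # thousand registered deaths, from the bullet above
deaths_2020 = 8.70   # thousand registered deaths, from the bullet above

change = percent_change(deaths_2019, deaths_2020)
print(f"Change from 2019 to 2020: {change:.1f}%")
```

The result is negative because the count fell, so it is reported as a decrease of 32.0 percent.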
What is data?
• Data consist of information
coming from observations,
counts, measurements, or
responses.
Statistics
• The word statistics is derived from the Latin word
status, meaning “state.”
• Statistics is the science of collecting, organizing,
analyzing, and interpreting data in order to make
decisions.
Definition:
• A population is the collection of all outcomes,
responses, measurements, or counts that are of
interest.
• A sample is a subset, or part, of a population.
Identifying Data sets
1. In a recent survey, 1500 adults in the United States
were asked if they thought there was solid evidence of
global warming. Eight hundred fifty-five of the adults
said yes. Identify the population and the sample.
Describe the sample data set. (Adapted from Pew
Research Center)
Identifying Data sets
• The population consists of the
responses of all adults in the United
States.
• The sample consists of the responses
of the 1500 adults in the United
States in the survey. The sample is a
subset of the responses of all adults
in the United States. The sample data
set consists of 855 “yes” responses and 645 “no” responses.
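The counts in this sample data set can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative.

```python
sample_size = 1500   # adults surveyed
yes_count = 855      # adults who said yes

no_count = sample_size - yes_count      # the remaining responses
yes_share = yes_count / sample_size     # fraction of the sample saying yes

print(f"yes: {yes_count}, no: {no_count}")
print(f"share of the sample answering yes: {yes_share:.1%}")
```

Note that the 57% share describes only the 1500 adults in the sample, not all adults in the United States.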
Identifying Data sets
2. The US Department of Energy conducts weekly surveys of
approximately 900 gasoline stations to determine the average
price per gallon of regular gasoline. On January 11, 2010, the
average price was $2.75 per gallon. Identify the population
and the sample. Describe the sample data set. (Source: Energy
Information Administration)
Identifying Data sets
• The population consists of the prices per gallon of regular
gasoline at all gasoline stations in the United States.
• The sample consists of the prices per gallon of regular
gasoline at the approximately 900 surveyed stations. The sample data
set consists of the prices recorded at those stations.
Try This!
Identify the population and sample.
Definition
• A parameter is a numerical description of a
population characteristic.
• A statistic is a numerical description of a sample
characteristic.
Parameter or Statistic
Decide whether the numerical value describes a population parameter or a
sample statistic. Explain your reasoning.
1. A recent survey of 200 college career centers reported that the average
starting salary for petroleum engineering majors is Php 83,121.
2. The 2182 students who accepted admission offers to Northwestern
University in 2009 have an average SAT score of 1442.
3. In a random check of a sample of retail stores, the Food and Drug
Administration found that 34% of the stores were not storing fish at the
proper temperature.
Parameter or Statistic
Solution:
1. Because the average of Php 83,121 is based on a subset of
the population, it is a sample statistic.
2. Because the average SAT score of 1442 is based on all the students
who accepted admission offers in 2009, it is a population
parameter.
3. Because the figure of 34% is based on a subset of the
population, it is a sample statistic.
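The distinction can also be seen with simulated data. The sketch below is only an illustration: the population of 2,000 test scores is hypothetical and generated at random, and the sample size of 50 is an arbitrary choice.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: scores for all 2,000 students at a school.
population = [random.gauss(75, 10) for _ in range(2000)]

# A sample is a subset of the population, here 50 students chosen at random.
sample = random.sample(population, 50)

parameter = statistics.mean(population)  # describes the whole population
statistic = statistics.mean(sample)      # describes only the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two means are usually close but not equal, and a different sample would give a different statistic while the parameter stays fixed.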
Try This!
Decide whether the numerical value describes a population parameter or a sample
statistic. Explain your reasoning.
Branches of Statistics
• Descriptive statistics is the branch of statistics that
involves the organization, summarization, and display
of data.
• Inferential statistics is the branch of statistics that
involves using a sample to draw conclusions about a
population. A basic tool in the study of inferential
statistics is probability.
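The two branches can be contrasted using the survey example from earlier in the lesson (855 of 1500 adults said yes). This is a rough sketch under stated assumptions: it treats the survey as a simple random sample and uses the normal approximation with z = 1.96 for an approximate 95% interval, neither of which is stated in the source.

```python
import math

n = 1500          # adults surveyed (from the survey example)
yes_count = 855   # adults who said yes

# Descriptive: summarize what the sample itself shows.
p_hat = yes_count / n

# Inferential: estimate the population share, with a margin of error.
# Assumptions: simple random sample, normal approximation, z = 1.96.
z = 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"Descriptive: {p_hat:.1%} of the sample said yes")
print(f"Inferential: about {low:.1%} to {high:.1%} of all adults would say yes")
```

The first statement is a fact about the sample. The second is a conclusion about the population, and it relies on probability to express how uncertain the estimate is.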
Descriptive and Inferential
Decide which part of the study represents the descriptive branch of
statistics. What conclusions might be drawn from the study using
inferential statistics?
1. A large sample of men, aged 48, was studied for 18 years. For
unmarried men, approximately 70% were alive at age 65. For married
men, 90% were alive at age 65. (Source: The Journal of Family Issues)
Descriptive and Inferential
Solution:
Descriptive and Inferential
Decide which part of the study represents the descriptive branch of
statistics. What conclusions might be drawn from the study using
inferential statistics?
Descriptive and Inferential
Solution:
Try This!
Decide which part of the study represents the descriptive branch of statistics. What
conclusions might be drawn from the study using inferential statistics?
2. A study shows that senior citizens who live in Tacloban City have better
memories than senior citizens who do not live in Tacloban City.
a. Make an inference based on the results of this study.
b. What is wrong with this type of reasoning?
Try This!
Solution:
1.
a. Descriptive statistics involve the statement “76% of women and 60% of
men had a physical examination within the previous year.”
b. An inference drawn from the study is that a higher percentage of women
had a physical examination within the previous year.
2.
a. An inference drawn from the sample is that senior citizens who live in
Tacloban City have better memories than senior citizens who do not live in
Tacloban City.
b. This inference may incorrectly imply that if you live in Tacloban City you
will have a better memory.
Classification of Variables
• Qualitative variables are variables that can be placed into distinct
categories, according to some characteristic or attribute. For example,
if subjects are classified according to gender (male or female), then
the variable gender is qualitative. Other examples of qualitative
variables are religious preference and geographic location.
• Quantitative variables are numerical and can be ordered or ranked.
Examples of quantitative variables are age, height, and weight.
Try This!
Classify each variable as qualitative or quantitative.
Classification of Quantitative Variables
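• Discrete variables assume values that can be counted. Examples of
discrete variables are the number of children in a family, the number
of students in a classroom, and the number of calls received by a
switchboard operator each day for a month.
• Continuous variables can assume any value within an interval and are
obtained by measuring. Examples of continuous variables are height,
weight, and temperature.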
Try This!
Classify each variable as discrete or continuous.
Level of Measurement
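• Nominal level of measurement classifies data into mutually exclusive
(nonoverlapping) categories in which no order or ranking can be
imposed on the data. Examples: basketball jersey numbers, sex,
marital status, religious affiliation.
• Ordinal level of measurement classifies data into categories that can
be ranked; however, precise differences between the ranks do not
exist. Examples: letter grades, ratings such as poor, fair, good, excellent.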
Level of Measurement
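• Interval level of measurement ranks data, and precise differences
between units of measure do exist; however, there is no meaningful
zero. Examples: IQ scores, temperature in degrees Celsius or Fahrenheit.
• Ratio level of measurement has all the properties of interval
measurement, and a true zero exists, so ratios of values are
meaningful. Examples: height, weight, time, salary.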
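The lack of a meaningful zero at the interval level can be illustrated with temperature. In the sketch below, the helper names `c_to_f` and `c_to_k` and the two example temperatures are illustrative choices. The Kelvin scale is included for contrast because its zero is a true zero.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def c_to_k(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0

# Ratios change with the choice of scale, so "twice as hot" has no meaning.
ratio_c = warm_c / cool_c
ratio_f = c_to_f(warm_c) / c_to_f(cool_c)
ratio_k = c_to_k(warm_c) / c_to_k(cool_c)

# Differences stay consistent: a 10-degree Celsius gap is always an
# 18-degree Fahrenheit gap, wherever it sits on the scale.
diff_c = warm_c - cool_c
diff_f = c_to_f(warm_c) - c_to_f(cool_c)

print(f"ratio in Celsius: {ratio_c:.3f}")
print(f"ratio in Fahrenheit: {ratio_f:.3f}")
print(f"ratio in kelvins: {ratio_k:.3f}")
print(f"difference: {diff_c:.1f} C = {diff_f:.1f} F")
```

Because the three ratios disagree, only differences, not ratios, carry meaning for Celsius and Fahrenheit temperatures.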
Try This!
Classify each as nominal-level, ordinal-level, interval-level, or ratio-level measurement.
1. Number of pages in the 25 best-selling mystery novels. (Ratio)
2. Rankings of golfers in a tournament. (Ordinal)
3. Temperatures inside 10 pizza ovens. (Interval)
4. Weights of selected cell phones. (Ratio)
5. Salaries of the coaches in the NFL. (Ratio)
6. Times required to complete a chess game. (Ratio)
7. Ratings of textbooks (poor, fair, good, excellent). (Ordinal)
8. Number of amps delivered by battery chargers. (Ratio)
9. Ages of children in a day care center. (Ratio)
10. Categories of magazines in a physician’s office (sports, women’s, health, men’s, news). (Nominal)
Types of Studies
• Observational study - the researcher merely observes what is
happening or what has happened in the past and tries to draw
conclusions based on these observations.
For example, data from the Motorcycle Industry Council stated that
“Motorcycle owners are getting older and richer.” Data were collected
on the ages and incomes of motorcycle owners for the years 1980 and
1998 and then compared. The findings showed considerable
differences in the ages and incomes of motorcycle owners for the two
years. In this study, the researcher merely observed what had
happened to the motorcycle owners over a period of time. There was
no type of research intervention.
Types of Studies
• Experimental study - the researcher manipulates one of the variables and
tries to determine how the manipulation influences other variables.
For example, a study conducted at Virginia Polytechnic Institute and
presented in Psychology Today divided female undergraduate students into
two groups and had the students perform as many sit-ups as possible in 90
seconds. The first group was told only to “Do your best,” while the second group
was told to try to increase the actual number of sit-ups done each day by
10%.
After four days, the subjects in the group who were given the vague
instructions to “Do your best” averaged 43 sit-ups, while the group that was
given the more specific instructions to increase the number of sit-ups by 10%
averaged 56 sit-ups by the last day’s session. The conclusion then was that
athletes who were given specific goals performed better than those who
were not given specific goals.
Types of Studies
Statistical studies usually include one or more independent variables
and one dependent variable.
• The independent variable in an experimental study is the one that is
being manipulated by the researcher. The independent variable is also
called the explanatory variable. The resultant variable is called the
dependent variable or the outcome variable.
For example, in the sit-up study, the researchers gave the groups two
different types of instructions, general and specific. Hence, the
independent variable is the type of instruction. The dependent variable,
then, is the resultant variable, that is, the number of sit-ups each group
was able to perform after four days of exercise.
Types of Studies
In the sit-up study, there were two groups. The group that received the
special instruction is called the treatment group while the other is
called the control group. The treatment group receives a specific
treatment (in this case, instructions for improvement) while the control
group does not.
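The source does not say how the two groups in the sit-up study were formed. One common approach, assumed here purely for illustration, is to assign participants to the groups at random. The participant IDs below are hypothetical.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical participant IDs; the study's actual roster is not given.
participants = [f"student_{i:02d}" for i in range(1, 21)]

# Shuffle a copy, then split it in half.
shuffled = participants[:]
random.shuffle(shuffled)

half = len(shuffled) // 2
treatment_group = shuffled[:half]   # would receive the specific goal
control_group = shuffled[half:]     # would be told only "Do your best"

print("treatment:", sorted(treatment_group))
print("control:  ", sorted(control_group))
```

Every participant lands in exactly one group, and chance alone decides which one.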
Types of Studies
A confounding variable is one that influences the dependent or
outcome variable but was not separated from the independent
variable.
For example, subjects who are put on an exercise program might also
improve their diet unknown to the researcher and perhaps improve
their health in other ways not due to exercise alone. Then diet
becomes a confounding variable.
Uses and Misuses of Statistics
Ways that statistics can be misrepresented.
1. Suspect Samples - the first thing to consider is the sample that was used in
the research study. Sometimes researchers use very small samples to obtain
information.
2. Ambiguous Averages - several different measures, such as the mean, median,
and mode, are loosely called averages. A person can select the one measure
that lends the most evidence to support a position.
3. Changing the Subject - another type of statistical distortion can occur
when different values are used to represent the same data.
For example, one political candidate who is running for reelection might say,
“During my administration, expenditures increased a mere 3%.” His
opponent, who is trying to unseat him, might say, “During my opponent’s
administration, expenditures have increased a whopping 600,000,000,000.”
Both figures can describe the same increase; stating it as an absolute amount
makes it sound far larger than stating it as a percentage.
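The ambiguous-averages item in the list above can be illustrated with a small data set. The salaries below are hypothetical and chosen so that one very large value pulls the mean upward.

```python
import statistics

# Hypothetical monthly salaries (in pesos) at a small firm; the last
# value belongs to the owner and is far larger than the rest.
salaries = [18000, 18000, 20000, 22000, 25000, 27000, 150000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
mode_salary = statistics.mode(salaries)

print(f"mean:   {mean_salary}")
print(f"median: {median_salary}")
print(f"mode:   {mode_salary}")
```

All three values can be called "the average salary." An employer might quote the mean (40,000) while a union might quote the mode (18,000), even though both are describing the same data.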
Uses and Misuses of Statistics
Ways that statistics can be misrepresented.