Statistic

The document provides an introduction to statistical concepts, defining statistics as the science of collecting, organizing, summarizing, and analyzing data to draw conclusions. It covers the importance and limitations of statistics, differentiates between descriptive and inferential statistics, and explains various types of variables and levels of measurement. Additionally, it discusses data collection methods, sample size considerations, and the significance of proper data gathering for research accuracy.

Uploaded by

Angeli Calumpit

Introduction to Statistical Concepts
Objectives
• Define statistics.
• Enumerate the importance and limitations of statistics.
• Explain the process of statistics.
• Know the difference between descriptive and inferential
statistics.
• Distinguish between qualitative and quantitative
variables.
• Distinguish between discrete and continuous variables.
• Determine the level of measurement of variables.

STATISTICS?

Definition of Statistics

STATISTICS is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions.
Definition of Statistics

1. Collection of information.
2. Organization and summarization of information.
3. Information is analyzed to draw conclusions or
answer specific questions.
4. Results should be reported using some measure
that represents how convinced we are that our
conclusions reflect reality.
Importance of Statistics

• It enables people to make decisions based on empirical evidence.
• It provides us with the tools needed to convert massive data into pertinent information that can be used in decision making.
• It provides us with information that we can use to make sensible decisions.
DATA

DATA are factual information used as a basis for reasoning, discussion, or calculation.
Fields of Statistics
Mathematical Statistics
- The study and development of statistical
theory and methods in the abstract.
Applied Statistics
- The application of statistical methods to solve
real problems involving randomly generated
data and the development of new statistical
methodology motivated by real problems.
Limitations of Statistics
1. Statistics is not suited to the study of qualitative phenomena.
2. Statistics does not study individuals.
3. Statistical laws are not exact.
4. Statistical tables may be misused.
5. Statistics is only one of the methods of studying a problem.
Process of Statistics
1. Identify the research objective
- A researcher must determine the question(s) he or she wants answered. The question(s) must clearly identify the population that is to be studied.
Process of Statistics
2. Collect the information needed
to answer the questions.
- Conducting research on an entire
population is often difficult and
expensive, so we typically look at a
sample.
EXAMPLE
The Philippine Mental Health Association contacted 1,028 teenagers who are 13 to 17 years of age and live in Laoag City and asked whether or not they had been prescribed medications for any mental disorders, such as depression or anxiety.
Population: Teenagers 13 to 17 years of age who live in Laoag City.
Sample: The 1,028 teenagers 13 to 17 years of age who live in Laoag City and were contacted.
EXAMPLE
A farmer wanted to learn about the weight of his corn crop. He randomly sampled 100 plants and weighed the corn on each plant.

Population: The entire corn crop.
Sample: The 100 selected corn plants.
Process of Statistics
3. Organize and summarize the
information
- Descriptive statistics allow the researcher to obtain an overview of the data and can help determine the type of statistical methods the researcher should use.
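A minimal sketch of this summarizing step, using only Python's standard library. The scores below are made-up values for illustration and do not come from the slides.

```python
import statistics

# Hypothetical scores for illustration only; they are not from the slides.
scores = [21, 18, 15, 21, 19, 17, 21, 16, 20, 22]

# A few common descriptive statistics that give an overview of the data.
summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "sample_sd": statistics.stdev(scores),  # sample standard deviation
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

These summaries describe only the data at hand; generalizing beyond it is the job of the next step.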
Process of Statistics
4. Draw conclusions from the information
- Information collected from the sample is generalized to the population.
- Inferential statistics uses methods that extend results from a sample to the population and measure the reliability of those results.
Take Note!
If the entire population is studied, then inferential statistics is not necessary, because descriptive statistics will provide all the information that we need regarding the population.
EXAMPLE
1. A badminton player wants to know his average score for the past 10 games.
2. A car manufacturer wishes to estimate the average lifetime of batteries by testing a sample of 50 batteries.
3. Janine wants to determine the variability of her six exam scores in Algebra.
4. A politician wants to determine the total number of votes his rival obtained in the past election based on his copies of the tally sheet of electoral returns.
5. A shipping company wishes to estimate the number of passengers traveling via their ships next year using their data on the number of passengers in the past three years.
Distinction Between Qualitative and Quantitative Variables

Variables
- Characteristics of the
individuals within the
population.
Qualitative and Quantitative Variables
Qualitative Variable
- is a variable that yields categorical responses. It is a word or a code that represents a class or category.
Quantitative Variable
- takes on numerical values representing an amount or quantity.
EXAMPLE
1. Hair Color
2. Temperature
3. Stages of Breast Cancer
4. Number of Hamburgers Sold
5. Number of Children
6. Zip Code
7. Place of Birth
8. Degree of Pain
Distinction Between Discrete and Continuous Variables

Discrete Variable
- is a quantitative variable that has either a finite number of possible values or a countable number of possible values.

Continuous Variable
- is a quantitative variable that has an infinite number of possible values that are not countable.
EXAMPLE
1. The number of heads obtained after flipping a coin five times.
2. The number of cars that arrive at a McDonald’s drive-through between 12:00 P.M. and 1:00 P.M.
3. The distance a 2005 Toyota car can travel in city conditions with a full tank of gas.
4. Number of words correctly spelled.
5. Time of a runner to finish one lap.
Levels of Measurement

Quantitative: Ratio, Interval
Qualitative: Ordinal, Nominal
Levels of Measurement

Nominal
- They are sometimes called
categorical scales or categorical data.
Such a scale classifies persons or
objects into two or more categories.
Example
Nominal

Method of Payment
Type of School
Eye Color
Levels of Measurement
Ordinal
- This involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless.
Example
Ordinal
Food Preferences
Stage of Diseases
Socioeconomic Class
Severity of Pain
Levels of Measurement

Interval
- This is a measurement level that not only classifies and orders the measurements, but also specifies that the distances between each interval on the scale are equivalent along the scale from low interval to high interval.
Example
Interval
• Temperature on Fahrenheit/Celsius
Thermometer

• Trait Anxiety

• IQ
Levels of Measurement
Ratio
- A ratio scale represents the highest,
most precise, level of measurement. It has the
properties of the interval level of
measurement and the ratios of the values of
the variable have meaning.
Example
Ratio

• Height and Weight


• Time
• Time until death
Scales     Counting   Ranking   Addition/Subtraction   Multiplication/Division
Nominal    √
Ordinal    √          √
Interval   √          √         √
Ratio      √          √         √                      √
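The table of permissible operations can be sketched as a small lookup. The names `PERMISSIBLE` and `is_meaningful` are hypothetical helpers introduced only for this illustration.

```python
# Operations that are meaningful at each level of measurement,
# transcribed from the table of scales and operations.
PERMISSIBLE = {
    "nominal": {"counting"},
    "ordinal": {"counting", "ranking"},
    "interval": {"counting", "ranking", "addition/subtraction"},
    "ratio": {
        "counting",
        "ranking",
        "addition/subtraction",
        "multiplication/division",
    },
}


def is_meaningful(level: str, operation: str) -> bool:
    """Return True if the operation is meaningful at the given level."""
    return operation in PERMISSIBLE[level.lower()]


# Temperature in Celsius is interval-level, so ratios are not meaningful.
print(is_meaningful("interval", "multiplication/division"))
# Height is ratio-level, so ratios are meaningful.
print(is_meaningful("ratio", "multiplication/division"))
```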

Levels of Measurement
Example
1. Ranking of college athletic teams.
2. Employee number.
3. Number of vehicles registered.
4. Brands of soft drinks.
5. Number of car passers along C5 on a
given day.
Assessments/Activities
Identify each of the following data sets as either a Population or a Sample.
1. The grade point averages (GPAs) of all students at a college.
2. The GPAs of a randomly selected group of students at a college campus.
3. The ages of the nine Supreme Court Justices of the United States on January 1, 1842.
4. The gender of every second customer who enters a movie theater.
5. The lengths of Atlantic croakers caught on a fishing trip to the beach.
Identify the following measures as either
Quantitative or Qualitative.
1. The gender of the first 40 newborns in a hospital in one year.
2. The natural hair color of 20 randomly selected
fashion models.
3. The ages of 20 randomly selected fashion models.
4. The fuel economy in miles per gallon of 20 new
cars purchased last month.
5. The political affiliation of 500 randomly selected
voters.
Data Collection and Basic Concepts in Sampling Design
Objectives
• Determine the sources of data (primary and secondary data).
• Distinguish the different methods of data collection under primary and secondary data.
• Determine the appropriate sample size.
• Differentiate various sampling techniques.
• Know the sources of errors in sampling.
Data Collection
Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
Consequences of Improperly Collected Data
• Inability to answer research questions accurately.
• Inability to repeat and validate the study.
• Distorted findings resulting in wasted resources.
• Misleading other researchers to pursue fruitless avenues of investigation.
• Compromising decisions for public policy.
• Causing harm to human participants and animal subjects.
Steps in Data Gathering
1. Set the objectives for collecting data.
2. Determine the data needed based on the set
objectives.
3. Determine the method to be used in data
gathering and define the comprehensive data
collection points.
4. Design data gathering forms to be used.
5. Collect data.
Choosing a Method of Data Collection
Data Collection
Decision-makers need
information that is relevant, timely,
accurate and usable. The cost of
obtaining, processing and
analyzing these data is high.
Sources of
Data
Primary Sources
Provide a first-hand
account of an event or time
period and are considered to
be authoritative.
Primary Data
Data documented by the
primary source. The data
collectors documented the
data themselves.
Secondary Sources
Offer an analysis, interpretation, or restatement of primary sources and are considered to be persuasive.
Secondary Data
Data documented by a
secondary source. The data
collectors had the data
documented by other
sources.
Primary Data Can Be Collected by Five Methods
Methods
1. Direct Personal Interviews
- the researcher has direct
contact with the interviewee. The
researcher gathers information by
asking questions to the interviewee.
Methods
2. Indirect/Questionnaire Method
- this method of data collection involves sourcing and accessing existing data that were originally collected for the purpose of the study.
Questions to be Considered
• What exactly do we want to know, according to the objectives and variables we identified earlier?
• Of whom will we ask questions and what techniques will we use?
• Are our informants mainly literate or illiterate?
• How large is the sample that will be interviewed?
Key Design Principles of a Good Questionnaire
1. Keep the questionnaire as short as possible.
2. Decide on the type of questionnaire (open-ended or closed-ended).
3. Write the questions properly.
4. Order the questions appropriately.
5. Avoid questions that prompt or motivate the respondent to say what you would like to hear.
6. Write an introductory letter or an introduction.
7. Write special instructions for interviewers or respondents.
8. Translate the questions if necessary.
9. Always test your questions before taking the survey.
Open-ended Question & Closed-ended Question

Open-ended Question
- is a type of question that does not include response categories. The respondent is not given any possible answers to choose from.

Closed-ended Question
- is a type of question that includes a list of response categories from which the respondent will select his/her answer.
Advantages: Open-ended VS Closed-ended
Open-ended
• More detailed answer.
• Could reveal additional insights.
Closed-ended
• Easy to encode, tabulate, and analyze.
• Easy to understand.
• Enables inter-study comparison.
• Saves time and money.
• High response rate.
Disadvantages: Open-ended VS Closed-ended
Open-ended
• Difficult to encode, tabulate, and analyze.
• Low response rate.
• Respondent has to be articulate.
• Respondent could feel threatened.
• Responses could have different levels of detail.
Closed-ended
• Could frustrate respondents.
• Potentially biased response sets.
• Difficult or impossible to detect if respondent truly understood the questions.
Methods
3. Focus Group
- is a group interview of
approximately six to twelve people
who share similar characteristics or
common interest.
Methods
4. Experiment
- is a method of collecting data
where there is direct human
intervention on the conditions that
may affect the values of the variable
of interest.
Experiment
• Ethical, moral, and legal
concerns.
• Unrealistic controlled
environments.
• Inability to control for all
variables.
Methods
5. Observation
- is a technique that involves
systematically selecting, watching and
recording behaviors of people or other
phenomena and aspects of the setting
in which they occur, for the purpose of
getting specified information.
Observation
• Radiographic
• Biochemical
• Xray machines
• Microscope
• Clinical examinations
• Microbiological examinations
Secondary Data Can Be Collected by Five Methods
1. Published reports in newspapers and periodicals.
2. Financial data reported in annual
reports.
3. Records maintained by the institution.
4. Internal reports of the government
departments.
5. Information from official publications.
Take Note!
• Always investigate the validity and reliability of
the data by examining the collection method
employed by your source.
• Do not use inappropriate data for your
research.
• The choice of methods of data collection is
largely based on the accuracy of the
information they yield.
Sample Size
“How many participants should be chosen for a survey?”
Sample Size
- is typically denoted by n and
it is always a positive integer.
- no exact sample size can be
mentioned here and it can vary in
different research settings.
Take Note!
• Representativeness, not size, is the
more important consideration.
• Use no less than 30 subjects if possible.
• If you use complex statistics, you may
need a minimum of 100 or more in your
sample (varies with method)
Non-Statistical and Statistical Considerations

Non-Statistical Considerations
- It may include availability of resources, manpower, budget, ethics, and sampling frame.
Statistical
Considerations
- It will include the
desired precision of the
estimate.
Criteria in Determining the Appropriate Sample Size
1. Level of Precision
- Also called sampling error, the level of precision is the range in which the true value of the population is estimated to be.
2. Confidence Level
- It is a statistical measure of the number of times out of 100 that results can be expected to be within a specified range.

Desired Confidence Level    Z-Score
80%                         1.28
85%                         1.44
90%                         1.65
95%                         1.96
99%                         2.58
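As a rough check on the z-scores listed for each confidence level, the two-sided critical value can be computed from the standard normal distribution. This is a sketch using `statistics.NormalDist` from Python's standard library; the function name `z_score` is a hypothetical helper.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided critical value z for a confidence level such as 0.95."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # For a two-sided interval, half of the leftover probability sits in each tail.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.80, 0.85, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}")
```

The 90% value comes out near 1.645, which tables often show as 1.65 or 1.64 depending on the rounding convention.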
3. Degree of Variability
- Depending upon the target population and the attributes under consideration, the degree of variability varies considerably.
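Methods in Determining the Sample Size
1. Estimating the Mean or Average
- The sample size required to estimate the population mean µ with a given level of confidence and a specified margin of error e is $n = (z\sigma / e)^2$, where z is the z-score for the confidence level and σ is the population standard deviation. Round n up to the next whole number.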
Take Note!
When the population standard deviation σ is unknown, it is common practice to conduct a preliminary survey to determine the sample standard deviation s and use it as an estimate of σ, or to use results from previous studies to obtain an estimate of σ. When using this approach, the size of the sample should be at least 30.
Example
A soft drink machine is regulated so that the
amount of drink dispensed is approximately
normally distributed with a standard
deviation equal to 0.5 ounce. Determine the
sample size needed if we wish to be 95%
confident that our sample mean will be
within 0.03 ounce from the true mean.
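A worked sketch of this example, assuming the standard formula n = (z·σ / e)² with z = 1.96 for 95% confidence. The function name `sample_size_mean` is hypothetical, and exact fractions are used so that floating-point noise cannot affect the rounding.

```python
import math
from fractions import Fraction


def sample_size_mean(z: str, sigma: str, e: str) -> int:
    """n = (z * sigma / e)^2, rounded up to the next whole number.

    Arguments are decimal strings so that Fraction keeps them exact.
    """
    n = (Fraction(z) * Fraction(sigma) / Fraction(e)) ** 2
    return math.ceil(n)


# Soft drink machine: sigma = 0.5 oz, margin of error e = 0.03 oz, 95% confidence.
print(sample_size_mean("1.96", "0.5", "0.03"))  # 1068
```

The unrounded value is about 1067.1, so a sample of 1068 is needed to meet the stated margin of error.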
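2. Estimating Proportion (Infinite Population)
- The sample size required to obtain a confidence interval for the population proportion p with a specified margin of error e is $n = z^2 p(1 - p) / e^2$, where z is the z-score for the confidence level. Round n up to the next whole number.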
Example
Suppose we are doing a study on the inhabitants
of a large town, and want to find out how many
households serve breakfast in the mornings. We
don’t have much information on the subject to
begin with, so we’re going to assume that half of
the families serve breakfast: this gives us
maximum variability. So p = 0.5. We want 99%
confidence and at least 1% precision.
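A worked sketch of the breakfast example, assuming the standard formula n = z²·p(1 − p) / e² with z = 2.58 for 99% confidence and e = 0.01. The function name `sample_size_proportion` is hypothetical.

```python
import math
from fractions import Fraction


def sample_size_proportion(z: str, p: str, e: str) -> int:
    """n = z^2 * p * (1 - p) / e^2, rounded up to the next whole number.

    Arguments are decimal strings so that Fraction keeps them exact.
    """
    z_, p_, e_ = Fraction(z), Fraction(p), Fraction(e)
    n = z_ ** 2 * p_ * (1 - p_) / e_ ** 2
    return math.ceil(n)


# Maximum variability (p = 0.5), 99% confidence, 1% margin of error.
print(sample_size_proportion("2.58", "0.5", "0.01"))  # 16641
```

Choosing p = 0.5 maximizes p(1 − p), so this is the most conservative sample size for the given confidence and precision.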
3. Slovin’s Formula
- Slovin’s formula is used to calculate the sample size n given the population size N and the margin of error e: $n = N / (1 + N e^2)$.
Example
A researcher plans to conduct a
survey about food preference of
BS Stat students. If the population
of students is 1000, find the
sample size if the error is 5%
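A worked sketch of this example using Slovin's formula, n = N / (1 + N·e²). The function name `slovin` is hypothetical, and rounding up is one common convention assumed here.

```python
import math
from fractions import Fraction


def slovin(population_size: int, e: str) -> int:
    """Slovin's formula n = N / (1 + N * e^2), rounded up.

    The margin of error e is a decimal string so that Fraction keeps it exact.
    """
    e_ = Fraction(e)
    n = Fraction(population_size) / (1 + population_size * e_ ** 2)
    return math.ceil(n)


# 1000 BS Stat students with a 5% margin of error.
print(slovin(1000, "0.05"))  # 286
```

The unrounded value is 1000 / 3.5, about 285.7, so 286 students would be surveyed.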
4. Finite Population
Correction

- If the population is
small then the sample size
can be reduced slightly.
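The slide does not show a formula for this correction. One commonly used form, assumed here, is n = n₀ / (1 + (n₀ − 1) / N), where n₀ is the sample size computed for an infinite population and N is the population size. The function name and the population of 20,000 households are hypothetical.

```python
import math
from fractions import Fraction


def finite_population_correction(n0: int, population_size: int) -> int:
    """Adjusted sample size n = n0 / (1 + (n0 - 1) / N), rounded up.

    This specific form of the correction is an assumption; the slides
    only state that the sample size can be reduced for small populations.
    """
    n = Fraction(n0) / (1 + Fraction(n0 - 1, population_size))
    return math.ceil(n)


# Hypothetical town of 20,000 households with n0 = 16641.
print(finite_population_correction(16641, 20000))  # 9084
```

The smaller the population relative to n₀, the larger the reduction; for very large N the adjusted size approaches n₀.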
Online Calculators of Sample Size
https://select-statistics.co.uk/calculators/sample-size-calculator-population-proportion/

https://www.calculator.net/sample-size-calculator.html
Basic Sampling Design
Reason for Sampling
• It is important that the individuals included in the sample represent a cross section of individuals in the population.
• If the sample is not representative, it is biased. You cannot generalize to the population from your statistical data.
Observation Unit
• An object on which a
measurement is taken. This
is the basic unit of
observation, sometimes
called an element.
Target Population
• The complete collection of observations we want to study.
Sampled Population
• The collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken.
Sample

• A subset of a
population.
Sampling Unit
• A unit that can be selected for a
sample. We may want to study
individuals, but do not have a
list of all individuals in the target
population.
Sampling Frame
• A list, map, or other
specification of sampling
units in the population from
which a sample may be
selected.
Sampling Bias
• This involves problems in your sampling which reveal that your sample is not representative of your population.
Advantages of Sampling Over Complete Enumeration
• Less Labor
• Reduced Cost
• Greater Speed
• Greater Scope
• Greater Efficiency and Accuracy
• Convenience
• Ethical Considerations
Two Types of Samples
1. Probability Sample
• Samples are obtained using some
objective chance mechanism, thus
involving randomization.
• They require the use of a complete
listing of the elements of the
universe called sampling frame.
2. Non-Probability Sample
• Samples are obtained haphazardly,
selected purposively or are taken as
volunteers.
• The probabilities of selection are
unknown.
• They should not be used for statistical
inference.
Sampling Procedure
• Identify the population
• Determine if population is accessible
• Select a sampling method.
• Choose a sample that is representative of
the population.
• Ask the question, can I generalize to the
general population from the accessible
population?
Basic Sampling Techniques of Probability Sampling
1. Simple Random Sampling
• Most basic method of drawing a probability sample.
• Assigns equal probabilities of selection to each possible sample.
• Results in a simple random sample.
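A minimal sketch of drawing a simple random sample without replacement, so that every possible sample of size n is equally likely. The function name, the population of 500 numbered units, and the seed are hypothetical choices for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(population), n)


# Hypothetical sampling frame: units numbered 1 to 500.
frame = range(1, 501)
sample = simple_random_sample(frame, 50, seed=2024)
print(sorted(sample))
```

Note that a complete list of units (the sampling frame) is required before any unit can be drawn.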
Simple Random Sampling
Advantages and Disadvantages
Advantage
• It is very simple and easy to use.
Disadvantage
• The sample chosen may be distributed over a wide geographic area.
When to Use
Simple Random Sampling
• This is preferable to use if
the population is not
widely spread
geographically.
2. Systematic Random Sampling
• It is obtained by selecting every kth individual from the population.
• The first individual selected corresponds to a random number between 1 and k.
Obtaining a Systematic Random Sample
• Decide on a method of assigning a unique serial number, from 1 to N, to each one of the elements in the population.
• Compute the sampling interval: $k = N / n$, where N is the population size and n is the desired sample size.
• Select a number r, from 1 to k, using a randomization mechanism. The element in the population assigned to this number is the first element of the sample. The other elements of the sample are those assigned to the numbers r + k, r + 2k, r + 3k, and so on until you get a sample of size n.
Example
We want to select a sample of 50 students from 500 students. Under this method, every kth item is picked from the sampling frame, where k = 500 / 50 = 10.
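The example can be sketched as follows, assuming the units are numbered 1 to N and the interval is k = N / n (here 500 / 50 = 10). The function name `systematic_sample` is hypothetical, and the sketch assumes N is a multiple of n.

```python
import random


def systematic_sample(population_size, n, seed=None):
    """Return the serial numbers of a 1-in-k systematic sample.

    Assumes units are numbered 1..N and that N is a multiple of n.
    """
    k = population_size // n          # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)         # random start between 1 and k, inclusive
    return [start + i * k for i in range(n)]


chosen = systematic_sample(500, 50, seed=7)
print(chosen[:5], "...", chosen[-1])
```

Only the starting point is random; every later unit is determined by the interval, which is why hidden periodicity in the list can hurt precision.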
Systematic Random Sampling
Advantages and Disadvantages
Advantage
• Drawing of the sample is easy. It is easy to administer in the field, and the sample is spread evenly over the population.
Disadvantage
• May give poor precision when unsuspected periodicity is present in the population.
When to Use
Systematic Random Sampling
• This is advisable to use if the ordering of the population is essentially random and when stratification with numerous data is used.
3. Stratified Random
Sampling
• It is obtained by separating the
population into non-overlapping
groups called strata and then
obtaining a simple random
sample from each stratum.
Example
A sample of 50 students is to be
drawn from a population consisting of 500
students belonging to two institutions A
and B. The number of students in the
institution A is 200 and the institution B is
300. How will you draw the sample using
proportional allocation?
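A sketch of proportional allocation for this example: each stratum receives a share of the sample equal to its share of the population, n_h = n · N_h / N. The function name and the numbered student lists are hypothetical.

```python
import random
from fractions import Fraction


def proportional_allocation(strata_sizes, n):
    """Allocate n units across strata in proportion to stratum size.

    In general the rounded allocations may not sum exactly to n and
    may need a manual adjustment; here they divide evenly.
    """
    total = sum(strata_sizes.values())
    return {name: round(Fraction(n * size, total))
            for name, size in strata_sizes.items()}


sizes = {"A": 200, "B": 300}
allocation = proportional_allocation(sizes, 50)
print(allocation)  # {'A': 20, 'B': 30}

# Draw a simple random sample of the allocated size within each stratum.
rng = random.Random(11)
sample = {name: rng.sample(range(1, sizes[name] + 1), allocation[name])
          for name in sizes}
```

So 20 students would be drawn at random from institution A and 30 from institution B.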
Stratified Random Sampling
Advantages and Disadvantages
Advantage
• Stratification of respondents is advantageous in terms of precision of the estimates of the characteristics of the population.
Disadvantage
• Values of the stratification variable may not be easily available for all units in the population, especially if the characteristic of interest is homogeneous.
When to Use
Stratified Random Sampling
• Use this if the characteristic under consideration is concentrated in small, scattered segments of the population.
4. Cluster Sampling
• You take a sample from naturally occurring groups in your population.
• The clusters are constructed such that
the sampling units are heterogeneous
within the cluster and homogeneous
among the clusters.
Obtaining a Cluster Sample
• Divide the population into non-overlapping clusters.
• Number the clusters in the population from 1 to N.
Obtaining a Cluster Sample
• Select n distinct numbers from 1 to
N using a randomization
mechanism. The selected clusters
are the clusters associated with the
selected numbers.
• The sample will consist of all
elements in the selected clusters.
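The steps above can be sketched as follows. The cluster names and their members are made-up data, and `cluster_sample` is a hypothetical helper.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    elements = [unit for name in chosen for unit in clusters[name]]
    return chosen, elements


# Hypothetical population: 6 schools (clusters) with 4 students each.
schools = {
    f"School {i}": [f"S{i}-{j}" for j in range(1, 5)]
    for i in range(1, 7)
}

chosen, students = cluster_sample(schools, 2, seed=5)
print(chosen)
print(students)
```

Only a list of clusters is needed to draw the sample; individual units are listed only within the selected clusters.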
Example
A researcher wants to
survey academic performance
of high school students in
MIMAROPA.
Cluster Sampling
Advantages and Disadvantages
Advantage
• There is no need to come up with a list of units in the population; all that is needed is a list of the clusters.
Disadvantage
• In actual field applications, adjacent households tend to have more similar characteristics than households that are far apart.
When to Use
Cluster Sampling
• This is preferable to use if the population can be grouped into clusters where individual population elements are known to be different with respect to the characteristics under study.
5. Multi-Stage Sampling
• Selection of the sample is done in two
or more steps or stages, with sampling
units varying in each stage.
• The population is first divided into a number of first-stage sampling units from which a sample is drawn.
Obtaining a Multi-Stage Sample
• Organize the sampling process into
stages where the unit of analysis is
systematically grouped.
• Select a sampling technique for each.
• Systematically apply the sampling
technique to each stage until the unit of
analysis has been selected.
Example
Suppose we wish to study the
expenditure patterns of households in
NCR. We can select a sample of
households for this study using
simple three-stage sampling.
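A sketch of one way a three-stage design could work. The slide does not name the stages, so the choice of cities, then barangays, then households is an assumption, as are the made-up frame and the function name `three_stage_sample`. Simple random sampling is applied at every stage.

```python
import random


def three_stage_sample(frame, n_first, n_second, n_third, seed=None):
    """Sample first-stage units, then second-stage units within them,
    then final units within those, using simple random sampling each time."""
    rng = random.Random(seed)
    sample = []
    for city in rng.sample(sorted(frame), n_first):
        barangays = frame[city]
        for brgy in rng.sample(sorted(barangays), n_second):
            households = barangays[brgy]
            picked = rng.sample(households, n_third)
            sample.extend((city, brgy, hh) for hh in picked)
    return sample


# Hypothetical frame: 4 cities, 3 barangays each, 20 households each.
frame = {
    f"City {c}": {
        f"Barangay {c}{b}": [f"HH-{c}{b}-{i}" for i in range(1, 21)]
        for b in "XYZ"
    }
    for c in "ABCD"
}

households = three_stage_sample(frame, n_first=2, n_second=2, n_third=5, seed=9)
print(len(households))  # 2 cities x 2 barangays x 5 households = 20
```

A household list is needed only for the barangays that are actually selected, which is what makes sampling frames easier to build in this design.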
Multi-Stage Sampling
Advantages and Disadvantages
Advantages
• It is easier to generate adequate sampling frames. Transportation costs are greatly reduced since there is some form of clustering among ultimate or final samples.
Disadvantages
• Its complexity in theory may make it difficult to apply in the field. Estimation procedures may be difficult for non-statisticians to follow.
When to Use
Multi-Stage Sampling
• If no population list is
available and if the
population covers a wide
area.
Take Note!
• Use probability sampling if the main objective of the sample survey is making inferences about the characteristics of the population under study.
Basic Sampling Techniques of Non-Probability Sampling
Accidental Sampling
• There is no system of selection; only those whom the researcher or interviewer meets by chance are included.
Quota Sampling
• A specified number of persons of certain types is included in the sample.
Convenience Sampling
• It is the process of picking out people in the most convenient and fastest way to get reactions immediately.
Purposive Sampling
• It is based on certain
criteria laid down by the
researcher.
Judgement Sampling
• Selects sample in
accordance with an
expert’s judgement.
Cases wherein Non-Probability Sampling is Useful
• Only a few are willing to be interviewed.
• Extreme difficulties in locating or
identifying subjects.
• Probability sampling is more expensive
to implement.
• Cannot enumerate the population
elements.
Sources of Errors
in Sampling
1. Non-sampling Error
• Errors that result from the survey process.
• Any errors that cannot be attributed to sample-to-sample variability.
Sources of Non-sampling Error
• Non-response
• Interview Error
• Misrepresented Answers
• Data entry errors
• Questionnaire Design
• Wording of Questions
• Selection Bias
2. Sampling Error
• Error that results from taking one
sample instead of examining the
whole population.
• Error that results from using
sampling to estimate information
regarding a population.
