
Lesson 1 Introduction To Statistics

Introduction to statistics

Uploaded by

kentmatthewperez

INTRODUCTION TO STATISTICS
Prepared by: JAKE C. MAGBANUA
Data and Statistics
Statistics
-Collection of methods for planning experiments, obtaining data, and then organizing,
summarizing, presenting, analyzing, interpreting, and drawing conclusions.

Data consists of information coming from observations, counts, measurements, or responses.

Statistics is used in many areas:
• In the field of education,
• In the field of business and economics,
• In the field of science and technology,
• In psychology,
• In the government, and others
Statistics is a very important tool in research. Statistical designs and
experiments are used to gather more information from a limited body of observations.
Various statistical techniques are used in laboratories, in experimental fields, or under
controlled conditions. These tools are needed so that accurate and reliable results are
obtained.
• Thus, the study of statistics requires, first of all, an understanding of basic
concepts, symbols, and mathematical notation.
Statistics
Two main branches of Statistics
1. Descriptive Statistics
Collection, organization, summarization, and presentation of data.

2. Inferential Statistics
Generalizing from samples to populations using probabilities. Performing
hypothesis testing, determining relationships between variables, and making
predictions.
3 main types of descriptive statistics

The 3 main types of descriptive statistics concern the frequency distribution, central
tendency, and variability of a dataset.
• Distribution refers to the frequencies of different responses.
• Measures of central tendency give you the typical or central value of the responses
(for example, the mean, median, or mode).
• Measures of variability show you the spread or dispersion of your dataset.
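The three types of descriptive statistics can be illustrated with a short Python sketch. The quiz scores below are hypothetical (they are not from the lesson), and only the standard library is used.

```python
from collections import Counter
import statistics

# Hypothetical quiz scores for ten students (made up for illustration).
scores = [7, 8, 8, 9, 6, 8, 10, 7, 9, 8]

# Frequency distribution: how often each score occurs.
frequencies = Counter(scores)

# Central tendency: the typical or central value.
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Variability: how spread out the scores are.
score_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)  # sample standard deviation

print("Frequencies:", dict(sorted(frequencies.items())))
print("Mean:", mean_score, "Median:", median_score, "Mode:", mode_score)
print("Range:", score_range, "Standard deviation:", round(sample_sd, 2))
```

Here the score 8 occurs most often, so the mean, median, and mode agree, while the range and standard deviation describe how far the other scores fall from that center.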
Inferential Statistics

-Involves drawing the right conclusions from the statistical analysis that has been performed
using descriptive statistics. In the end, it is the inferences that make studies important and
this aspect is dealt with in inferential statistics.

-Most predictions of the future and generalizations about a population by studying a smaller
sample come under the purview of inferential statistics.

-Most social science experiments study a small sample that helps determine how the
population in general behaves.

-By designing the right experiment, the researcher is able to draw conclusions relevant to
the study.

-While drawing conclusions, one needs to be very careful not to draw wrong or biased
conclusions. Even though this appears to be a science, there are ways in which one can
manipulate studies and results through various means.
Variables

Variable
Characteristic or attribute that can assume different values

Random Variable
A variable whose values are determined by chance.
Data is a specific measurement of a variable – it is the value you record in your data
sheet. Data is generally divided into two categories:

• Quantitative data represents amounts
• Categorical data represents groupings

A variable that contains quantitative data is a quantitative variable; a variable that
contains categorical data is a categorical variable. Each of these types of variable can
be broken down into further types.
Discrete vs Continuous Variables (Categorical and Continuous Variables)
- Categorical variables are also known as discrete or qualitative variables. Categorical
variables can be further categorized as either nominal, ordinal or dichotomous.
- Discrete variables are usually obtained by counting. There are a finite or countable
number of choices available with discrete data. You can't have 2.63 people in the room.

Example: number of deaths, births, students, accident cases, ...

- Continuous variables are usually obtained by measuring. A measurement can never be
exact, no matter how carefully it is taken.
Example: Length, weight, temperature, and time are all examples of continuous variables.

Note: Since continuous variables are real numbers, we usually round them or set an
interval. This implies a boundary that depends on the number of decimal places.
For example: 64 really means any value with 63.5 ≤ x < 64.5. Likewise, with two decimal
places, 64.03 really means any value with 64.025 ≤ x < 64.035. Boundaries always have one
more decimal place than the data and end in a 5.
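The boundary rule in the note can be sketched in Python. The function name `boundaries` is hypothetical, and the recorded value is passed as text so that the number of decimal places is preserved; `Decimal` is used to avoid floating-point rounding noise.

```python
from decimal import Decimal

def boundaries(recorded: str):
    """Return (lower, upper) boundaries of a rounded measurement.

    The recorded value stands for any x with lower <= x < upper.
    """
    value = Decimal(recorded)
    # The exponent is minus the number of decimal places, e.g. -2 for "64.03".
    places_exponent = value.as_tuple().exponent
    # Half of one unit in the last recorded decimal place.
    half_unit = Decimal(1).scaleb(places_exponent) / 2
    return value - half_unit, value + half_unit

for recorded in ("64", "64.03"):
    lower, upper = boundaries(recorded)
    print(f"{recorded} means {lower} <= x < {upper}")
```

Each boundary has one more decimal place than the recorded value and ends in a 5, as the note states.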
Dependent and Independent Variables

Variables can be grouped into dependent and independent variables according to
their use.
The independent variable is used as the predictor when the objective is to predict the
value of one variable on the basis of the other.
- Independent variables are also referred to as treatment variables. They do not change
in relation to other factors. Instead, researchers explore whether or not an
independent variable causes, leads to, or is associated with a change in one or more
dependent variables.
The dependent variable is the variable whose value is predicted.
- Dependent variables are factors studied in terms of how they change in relation to
independent variables.
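One common way to use an independent variable as a predictor is a least-squares line. The sketch below is only an illustration: the pay and satisfaction figures are hypothetical, the helper name `fit_line` is made up, and a straight-line relationship is assumed rather than taken from the lesson.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable (input) and dependent variable (output).
monthly_pay = [20, 30, 40, 50]   # in thousands, made up for illustration
satisfaction = [4, 5, 6, 7]      # rating on a 1-10 scale, made up

slope, intercept = fit_line(monthly_pay, satisfaction)

# Predict the dependent variable for a new value of the independent variable.
predicted = intercept + slope * 45
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The independent variable goes in as `x` and the dependent variable comes out as the predicted `y`; a fitted line describes association and does not by itself show that one variable causes the other.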
Dependent and Independent Variables (continuation)


In a scientific study, the dependent variable is the variable that the researcher is
testing and measuring in relation to the independent variable. The researcher is
seeking to determine whether or not manipulating the independent variable will
lead to different outcomes regarding the dependent variable.
Socioeconomic Status and Number of Children
A social scientist explores if there is a link between socioeconomic status and the number of children someone
has.
• Independent variable (input variable) - socioeconomic status
• Dependent variable (output variable) - number of children
Job Satisfaction and Pay
A human resources professional wonders if how much money a person earns can impact the extent to
which an individual experiences job satisfaction.
• Independent variable - compensation (salary or wages)
• Dependent variable - job satisfaction
Academic Performance of Students Using Different Teaching Strategies
• Independent variable (input) - different teaching strategies
• Dependent variable (output) - academic performance

Yield of Pole Sitao Applied with Different Organic Fertilizers
• Independent variable - different organic fertilizers
• Dependent variable - yield of pole sitao
Qualitative and Quantitative variables


1. Quantitative Variables: Sometimes referred to as “numeric” variables, these are variables that
represent a measurable quantity. Examples include:
• Number of students in a class
• Number of square feet in a house
• Population size of a city
• Age of an individual
• Height of an individual
2. Qualitative Variables: Sometimes referred to as “categorical” variables, these are variables that take on
names or labels and can fit into categories. Examples include:
• Eye color (e.g. “blue”, “green”, “brown”)
• Gender (e.g. “male”, “female”)
• Breed of dog (e.g. “lab”, “bulldog”, “poodle”)
• Level of education (e.g. “high school”, “Associate’s degree”, “Bachelor’s degree”)
• Marital status (e.g. “married”, “single”, “divorced”)
Population vs Sample

Population
- All subjects possessing a common characteristic that is being studied.
Parameter
- Characteristic or measure obtained from a population.
Sample
- Subgroup or subset of the population.
Statistic (not to be confused with Statistics)
- Characteristic or measure obtained from a sample.
Example: In a recent survey, 400 students of CPSU were asked if they smoked cigarettes regularly.
Thirty (30) of the students said yes. Identify the population and the sample.

Population: responses of all CPSU students
Sample: responses of the students in the survey
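As a small illustration, the statistic for this survey can be computed in Python from the two counts given in the example. The variable names are made up, and the population proportion (the parameter) is deliberately left unknown because the survey does not provide it.

```python
# Counts taken from the survey example: 400 students asked, 30 said yes.
surveyed = 400
said_yes = 30

# The sample proportion is a statistic: it is computed from the sample only.
sample_proportion = said_yes / surveyed

# The population proportion is a parameter. It would require the responses of
# all CPSU students, which the survey does not have.
population_proportion = None

print(f"Sample proportion of regular smokers: {sample_proportion:.3f}")
```

The value 0.075 (7.5%) describes the 400 surveyed students; whether it is close to the proportion for all CPSU students depends on how the sample was drawn.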
Population vs Sample – What is the difference?


Usually, a sample of the population is used in research, as it is easier and more
cost-effective to process a smaller subset of the population than the entire group.

Population: The measurable characteristic of the population, like the mean or standard
deviation, is known as the parameter.
Sample: The measurable characteristic of the sample is called a statistic.

Population: Population data is a whole and complete set.
Sample: The sample is a subset of the population that is derived using sampling.

Population: A survey done of an entire population is accurate and more precise, with no
margin of error except human inaccuracy in responses. However, this may not always be
possible.
Sample: A survey done using a sample of the population bears accurate results only after
further factoring in the margin of error and confidence interval.

Population: The parameter of the population is a numerical or measurable element that
defines the system of the set.
Sample: The statistic is the descriptive component of the sample found by using the
sample mean or sample proportion.
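The margin of error for a sample proportion can be sketched as follows. This is one common textbook approach (the normal-approximation interval with z = 1.96 for 95% confidence); the lesson does not say which method it has in mind, so both the formula choice and the function name `margin_of_error` are assumptions. The counts reuse the CPSU survey example (30 of 400).

```python
import math

def margin_of_error(successes: int, n: int, z: float = 1.96) -> float:
    """Normal-approximation margin of error for a sample proportion."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return z * standard_error

# Survey example: 30 of 400 sampled students said yes.
p_hat = 30 / 400
moe = margin_of_error(30, 400)
interval = (p_hat - moe, p_hat + moe)

print(f"Sample proportion: {p_hat:.3f}")
print(f"Margin of error (95%): {moe:.3f}")
print(f"Confidence interval: {interval[0]:.3f} to {interval[1]:.3f}")
```

The interval is a statement about the unknown population parameter, and it is only meaningful if the sample was selected by a probability method.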
Sample rather than the Population
Reasons to choose a sample from a given population
• Practicality: In most cases, a population is too large for collecting accurate data from
every member to be practical. Samples offer a representation of the whole population if
sampled appropriately.
• It offers timely data: When it comes to research, the amount of time available can be a
defining factor for a study. A sample provides a smaller set of the population for review
and delivers data that is useful to represent the whole population.
• Cost-effective: The cost of conducting research is often a limiting factor for the study,
and surveying a sample costs less than surveying the whole population.
• Accuracy of representation: Depending on the method of sampling, research conducted on
a sample can be accurate, with less non-response bias than if performed by census. A
sample that is selected using a probability method can be an accurate representation of the
population. The data collected can be used to gather insight into the whole community.
• Inferential statistics: Inferential statistics is a process by which representative data is used
to infer insights about the entire population. Data collected from a sample is used to
represent the whole population. Inferential statistics can only be obtained using data samples.
• At times, a sample is more accurate than a census: A census of an entire population
does not always offer accurate data due to errors such as inconsistency in responses, or non-
response bias. A carefully obtained sample, however, does away with this bias and provides
Scales of Measurement

There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These go
from lowest level to highest level. Data is classified according to the highest level
which it fits.
1) Nominal is the lowest level. Only names are meaningful here.
Nominal Scale. Nominal variables (also called categorical variables) can be
placed into categories. They don’t have a numeric value and so cannot be
added, subtracted, divided or multiplied.
Dichotomous variables are nominal variables which have only two categories or
levels. For example, if we were looking at gender, we would most probably
categorize somebody as either "male" or "female".
2. Ordinal
Ordinal variables are variables that have two or more categories, just like
nominal variables, only the categories can also be ordered or ranked. So if you asked
someone if they liked the policies of the Duterte Administration and they could
answer "Yes, a lot", "They are OK", or "Not very much", then you have an ordinal
variable. Why? Because the categories can be arranged in order.
Thus, the result can be ranked: you can rank the responses from the most positive (Yes,
a lot), to the middle response (They are OK), to the least positive (Not very much).
However, while we can rank the levels, we cannot place a "value" on them; we
cannot say that "They are OK" is twice as positive as "Not very much", for example.

Response   Strongly disagree   Disagree   Undecided   Agree   Strongly agree
Rating     1                   2          3           4       5
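The response-to-rating coding in the table can be written as a small Python sketch. The list of responses is hypothetical, and the median is used as the summary because it relies only on the order of the categories, not on the distance between them.

```python
import statistics

# Ordered categories, from lowest rating (1) to highest rating (5).
LIKERT = ["Strongly disagree", "Disagree", "Undecided", "Agree", "Strongly agree"]
rating_of = {label: rank for rank, label in enumerate(LIKERT, start=1)}

# Hypothetical answers from five respondents (made up for illustration).
responses = ["Agree", "Undecided", "Strongly agree", "Agree", "Disagree"]

# Code each response as its rating, then rank the responses.
ratings = [rating_of[r] for r in responses]
ranked = sorted(responses, key=rating_of.get)

# median_low always returns one of the observed ratings.
median_rating = statistics.median_low(ratings)
median_label = LIKERT[median_rating - 1]

print("Ratings:", ratings)
print("Ranked responses:", ranked)
print("Median response:", median_label)
```

The ratings can be sorted and a middle response found, but a rating of 4 should not be read as "twice as much agreement" as a rating of 2.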
3. Interval
- Numbers are assigned to the items or objects. These are used to identify and
rank the objects. They also measure the degree of difference between any two
classes, but there is no true zero.
Example: temperatures (in °C or °F), IQ, grades, test scores

4. Ratio
- The ratio of the numbers assigned in the measurement shows the ratio of the
amounts of the property being measured.
Example: weights, heights

In statistics, psychology, the social sciences, and education, interval and ratio
measurements are ordinarily treated in the same manner; the only difference between
them is that the ratio scale has a true zero.
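The role of a true zero can be illustrated with temperature. The sketch below assumes the standard conversion K = °C + 273.15; the two readings are hypothetical. Celsius is an interval scale (its zero is arbitrary), while Kelvin is a ratio scale (zero means no thermal energy).

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin."""
    return celsius + 273.15

# Two hypothetical temperature readings.
warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences are meaningful on both scales and agree with each other.
diff_celsius = warm_c - cool_c
diff_kelvin = warm_k - cool_k

# Ratios are meaningful only on the scale with a true zero.
ratio_celsius = warm_c / cool_c   # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_kelvin = warm_k / cool_k    # about 1.035

print(f"Difference: {diff_celsius:.1f} °C, {diff_kelvin:.1f} K")
print(f"Ratio in Celsius: {ratio_celsius:.3f}, ratio in Kelvin: {ratio_kelvin:.3f}")
```

The difference is the same on both scales, but the ratio changes when the zero point changes, which is why ratios are reserved for ratio-level data.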
Ambiguities in classifying a type of variable

In some cases, the measurement scale for data is ordinal,
but the variable is treated as continuous. For example, a
Likert scale that contains five values - strongly agree,
agree, neither agree nor disagree, disagree, and strongly
disagree - is ordinal. However, where a Likert scale
contains seven or more values - strongly agree, moderately
agree, agree, neither agree nor disagree, disagree,
moderately disagree, and strongly disagree - the
underlying scale is sometimes treated as continuous
(although whether you should do this is a cause of great
dispute).
Types of Sampling (Probabilistic and non-probabilistic sampling)
Probabilistic/random sampling

Random sampling simply describes when every element in a population has an
equal chance of being chosen for the sample.
Probability sampling means that every member of the target population has a
known chance of being included in the sample.
Probability sampling methods include simple random sampling,
systematic sampling, stratified sampling, and cluster sampling.

• Random sampling is analogous to putting everyone's name into a hat and
drawing out several names. Each element in the population has an equal chance
of being selected.
• Systematic sampling is easier to do than random sampling. In systematic
sampling, the list of elements is "counted off". That is, every kth element is
taken. This is similar to lining everyone up and numbering off "1,2,3,4; 1,2,3,4;
etc". When done numbering, all people numbered 4 would be used.
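Simple random sampling and systematic sampling can be sketched in Python with the standard `random` module. The population of 40 student labels is hypothetical, and the random starting point for the systematic sample is an added assumption (the slide's example always takes the people numbered 4).

```python
import random

# Hypothetical population: 40 labelled students (made up for illustration).
population = [f"student_{i:02d}" for i in range(1, 41)]

# A seeded generator makes the sketch reproducible.
rng = random.Random(2024)

# Simple random sampling: like drawing 10 names out of a hat.
simple_random_sample = rng.sample(population, k=10)

# Systematic sampling: count off in groups of k and take every kth element.
k = 4
start = rng.randrange(k)            # random position within the first group
systematic_sample = population[start::k]

print("Simple random:", simple_random_sample)
print("Systematic (every 4th):", systematic_sample)
```

With 40 people and k = 4, the systematic sample always has 10 members, whichever starting position is drawn.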
• Stratified sampling divides the population into groups called strata according to some
characteristic, not geographically. For instance, the population might be separated into
males and females. A sample is taken from each of these strata using either random,
systematic, or convenience sampling.
• Cluster sampling is accomplished by dividing the population into groups, usually
geographically. These groups are called clusters or blocks. The clusters are randomly
selected, and each element in the selected clusters is used.
Cluster sampling starts by dividing a population into groups, or clusters. What makes
this different from stratified sampling is that each cluster must be representative of the
population. Then, you randomly select entire clusters to sample.

[Diagram: CPSU Education Students divided into Year Levels 1 to 4, each divided into Male and Female]

[Diagram: CPSU Education Students (2000) are divided into Year Level 1 (100), Year Level 2
(100), Year Level 3 (100), and Year Level 4 (100); each year level is divided into Male (50)
and Female (50).]
Stratified sampling: sampling is done per stratum.
Cluster sampling: the whole of each selected group is used.
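The difference between stratified and cluster sampling can be sketched in Python. The student records are hypothetical and follow the group sizes in the diagram (four year levels, each with 50 male and 50 female students); the function names and sample sizes are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

rng = random.Random(7)  # seeded so the sketch is reproducible

# Hypothetical records: 4 year levels x 2 sexes x 50 students each.
students = [
    {"id": f"Y{year}-{sex[0]}{n:02d}", "year": year, "sex": sex}
    for year in (1, 2, 3, 4)
    for sex in ("Male", "Female")
    for n in range(1, 51)
]

def group_by(units, key):
    """Split the units into groups that share the same value of `key`."""
    groups = defaultdict(list)
    for unit in units:
        groups[unit[key]].append(unit)
    return groups

def stratified_sample(units, stratum_key, per_stratum):
    """Draw a random sample from every stratum."""
    sample = []
    for members in group_by(units, stratum_key).values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster_sample(units, cluster_key, n_clusters):
    """Randomly choose whole clusters and keep every member of them."""
    clusters = group_by(units, cluster_key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

by_sex = stratified_sample(students, "sex", per_stratum=20)
by_year = cluster_sample(students, "year", n_clusters=2)

print("Stratified sample size:", len(by_sex))
print("Cluster sample size:", len(by_year))
```

In the stratified sample every stratum (Male, Female) contributes some members, while in the cluster sample only the selected year levels appear, and all of their members are included.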
Non-probability sampling, on the other hand, does not involve “random”
processes for selecting participants. In non-probability sampling, the members
of the population will not have an equal chance of being selected, and in many
cases, there will be members of the population who have no chance of being
selected.
• Convenience sampling is very easy to do, but it's probably the worst technique to
use. In convenience sampling, readily available data is used, that is, the first
people the surveyor runs into.
• Quota sampling is a non-probabilistic sampling method where we divide the survey
population into mutually exclusive subgroups. These subgroups are selected with
respect to certain known (and thus non-random) features, traits, or interests.
People in each subgroup are selected by the researcher or interviewer who is
conducting the survey.
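Quota sampling can be sketched as filling a fixed quota for each subgroup from whoever the interviewer meets first. The names, the quotas, and the function `quota_sample` are all hypothetical; nothing in the selection is random, which is what makes this a non-probability method.

```python
def quota_sample(people, group_of, quotas):
    """Take people in the order met until every subgroup quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in people:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return selected

# Hypothetical passersby in the order the interviewer meets them.
passersby = [
    ("Ana", "Female"), ("Ben", "Male"), ("Carl", "Male"),
    ("Dina", "Female"), ("Ed", "Male"), ("Fe", "Female"),
]

picked = quota_sample(
    passersby,
    group_of=lambda person: person[1],
    quotas={"Male": 2, "Female": 2},
)
print([name for name, _ in picked])
```

People who arrive after the quotas are filled (Ed and Fe here) have no chance of being selected.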
• Snowball sampling is where research participants recruit other participants for a test
or study. It is used where potential participants are hard to find. It’s called snowball
sampling because (in theory) once you have the ball rolling, it picks up more “snow”
along the way and becomes larger and larger. Snowball sampling is a non-probability
sampling method. It doesn’t have the probability involved with, say, simple random
sampling (where the odds are the same for any particular participant being chosen).
Rather, the researchers use their own judgment to choose participants.

• Judgmental sampling is a non-probability sampling technique where the
researcher selects units to be sampled based on their knowledge and
professional judgment.
This type of sampling technique is also known as purposive sampling and authoritative
sampling.

Purposive sampling is used in cases where the specialty of an authority can select a more
representative sample that can bring more accurate results than by using other probability
