Lecture 1: Introduction To Statistics


Statistics
• Statistics is a science that involves the
extraction of information from numerical
data obtained during an experiment or
from a sample. It involves the design of
the experiment or sampling procedure, the
collection and analysis of the data, and
making inferences (statements) about the
population based upon information in a
sample.
Population
• A population is a complete set of
individuals, objects, or measurements
having some common characteristics.
• For example, a researcher may be
interested in the relation between class
size (variable 1) and academic
performance (variable 2) for the population
of third-grade children.

Sample
• A sample is a subset of the
population selected to represent the
population.
• Usually populations are so large that a
researcher cannot examine the entire
group. Therefore, a sample is selected to
represent the population in a research
study. The goal is to use the results
obtained from the sample to help answer
questions about the population.
Variables
• A variable is a characteristic or condition
that can change or take on different
values.
• Most research begins with a general
question about the relationship between
two variables for a specific group of
individuals.

Types of Variables
Two kinds of variables:
Qualitative, or Attribute, or Categorical, Variable: A
variable that categorizes or describes an element of a
population.
Note: Arithmetic operations, such as addition and
averaging, are not meaningful for data resulting from a
qualitative variable.
Quantitative, or Numerical, Variable: A variable that
quantifies an element of a population.
Note: Arithmetic operations, such as addition and
averaging, are meaningful for data resulting from a
quantitative variable.
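As a rough illustration of the two notes above, the Python sketch below uses made-up values (the eye colors and heights are assumptions for illustration, not data from the lecture). Averaging is meaningful for the quantitative variable, while the qualitative variable is summarized by counting how often each category occurs.

```python
from collections import Counter
from statistics import mean

# Hypothetical data for five people (illustrative values only).
eye_color = ["brown", "blue", "brown", "green", "brown"]  # qualitative
height_cm = [162.0, 175.5, 168.0, 181.0, 170.5]           # quantitative

# Averaging is meaningful for a quantitative variable.
average_height = mean(height_cm)

# For a qualitative variable, count how often each category occurs instead.
color_counts = Counter(eye_color)

print(average_height)
print(color_counts.most_common())
```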
Types of Variables (cont.)
• Quantitative Variables can be classified as
discrete or continuous.
• Discrete variables (such as class size)
consist of indivisible categories, and
continuous variables (such as time or
weight) are infinitely divisible into whatever
units a researcher may choose. For
example, time can be measured to the
nearest minute, second, half-second, etc.

Measuring Variables
• To establish relationships between
variables, researchers must observe the
variables and record their observations.
This requires that the variables be
measured.
• The process of measuring a variable
requires a set of categories called a scale
of measurement and a process that
classifies each individual into one
category.
4 Types of Measurement Scales
1. A nominal scale is an unordered set of
categories identified only by name.
Nominal measurements only permit you
to determine whether two individuals are
the same or different.
2. An ordinal scale is an ordered set of
categories. Ordinal measurements tell
you the direction of difference between
two individuals.

4 Types of Measurement Scales (cont.)
3. An interval scale is an ordered series of equal-
sized categories. Interval measurements
identify the direction and magnitude of a
difference. The zero point is located arbitrarily
on an interval scale.
4. A ratio scale is an interval scale where a value
of zero indicates none of the variable. Ratio
measurements identify the direction and
magnitude of differences and allow ratio
comparisons of measurements.
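One way to see why ratio comparisons need a true zero is temperature, used here as an assumed example (it does not appear in the lecture). Celsius behaves as an interval scale because 0 °C is an arbitrary point, while Kelvin behaves as a ratio scale because it starts at absolute zero. A minimal Python sketch:

```python
# Two temperatures in Celsius (interval scale: the zero point is arbitrary).
a_celsius = 20.0
b_celsius = 10.0

def to_kelvin(celsius):
    """Convert Celsius to Kelvin (ratio scale: zero is absolute zero)."""
    return celsius + 273.15

# Differences are meaningful on both scales and agree with each other.
diff_celsius = a_celsius - b_celsius
diff_kelvin = to_kelvin(a_celsius) - to_kelvin(b_celsius)

# Ratios are only meaningful on the ratio scale.
ratio_celsius = a_celsius / b_celsius                       # 2.0, but misleading
ratio_kelvin = to_kelvin(a_celsius) / to_kelvin(b_celsius)  # about 1.035

print(diff_celsius, round(diff_kelvin, 2))
print(ratio_celsius, round(ratio_kelvin, 3))
```

Both scales report the same 10-degree difference, but only the Kelvin ratio says how much "more temperature" one reading has than the other.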
Correlational Studies
• The goal of a correlational study is to
determine whether there is a relationship
between two variables and to describe the
relationship.
• A correlational study simply observes the
two variables as they exist naturally.

Experiments
• The goal of an experiment is to
demonstrate a cause-and-effect
relationship between two variables; that is,
to show that changing the value of one
variable causes changes to occur in a
second variable.

Experiments (cont.)
• In an experiment, one variable is manipulated
to create treatment conditions. A second
variable is observed and measured to obtain
scores for a group of individuals in each of the
treatment conditions. The measurements are
then compared to see if there are differences
between treatment conditions. All other
variables are controlled to prevent them from
influencing the results.
• In an experiment, the manipulated variable is
called the independent variable and the
observed variable is the dependent variable.

Data
• The measurements obtained in a research
study are called the data.
• The goal of statistics is to help researchers
organize and interpret the data.

Example: A college dean is interested in learning about the
average age of faculty. Identify the basic terms in this
situation.
The population is the set of ages of all faculty members at the
college.
A sample is any subset of that population. For example, we
might select 10 faculty members and determine their ages.
The variable is the “age” of each faculty member.
One data value (a datum) would be the age of a specific faculty member.
The data would be the set of values in the sample.
The experiment would be the method used to select the ages
forming the sample and to determine the actual age of each
faculty member in the sample.
The parameter of interest is the “average” age of all faculty at
the college.
The statistic is the “average” age for all faculty in the sample.
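The terms in this example can be sketched in Python. The ages below are randomly generated, and the faculty size of 200 and the age range of 28 to 70 are assumptions made purely for illustration; they are not figures from the lecture.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of all 200 faculty members (made-up values).
population_ages = [random.randint(28, 70) for _ in range(200)]

# Sample: 10 faculty members chosen at random without replacement.
sample_ages = random.sample(population_ages, 10)

# Parameter: the average age of all faculty (describes the population).
parameter = mean(population_ages)

# Statistic: the average age of the faculty in the sample.
statistic = mean(sample_ages)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

In practice the dean would usually know only the statistic, because the parameter requires the age of every faculty member.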
Descriptive Statistics
• Descriptive statistics are methods for
organizing and summarizing data.
• For example, tables or graphs are used to
organize data, and descriptive values such
as the average score are used to
summarize data.
• A descriptive value for a population is
called a parameter and a descriptive
value for a sample is called a statistic.
Inferential Statistics
• Inferential statistics are methods for using
sample data to make general conclusions
(inferences) about populations.
• Because a sample is typically only a part of the
whole population, sample data provide only
limited information about the population. As a
result, sample statistics are generally imperfect
representatives of the corresponding population
parameters.

Sampling Error
• The discrepancy between a sample
statistic and its population parameter is
called sampling error.
• Defining and measuring sampling error is
a large part of inferential statistics.
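A small simulation can make sampling error concrete. The sketch below assumes a made-up population of 1,000 scores and a sample size of 25 (both illustrative choices, not values from the lecture) and takes the mean as the descriptive value being compared.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 scores (made-up values).
population = [random.gauss(100, 15) for _ in range(1000)]
mu = mean(population)  # population parameter

# Draw several samples of n = 25 and record each sample's error.
errors = []
for _ in range(5):
    sample = random.sample(population, 25)
    m = mean(sample)        # sample statistic
    errors.append(m - mu)   # sampling error for this sample

for i, e in enumerate(errors, start=1):
    print(f"sample {i}: sampling error = {e:+.2f}")
```

Each sample comes from the same population, yet each one gives a different error. That variability is what inferential statistics sets out to measure.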

Notation
• The individual measurements or scores obtained
for a research participant will be identified by the
letter X (or X and Y if there are multiple scores
for each individual).
• The number of scores in a data set will be
identified by N for a population or n for a sample.
• Summing a set of values is a common operation
in statistics and has its own notation. The Greek
letter sigma, Σ, will be used to stand for "the sum
of." For example, ΣX identifies the sum of the
scores.
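As a quick illustration of this notation, the Python sketch below uses four made-up scores (an assumption for illustration) and computes n and ΣX for that sample.

```python
# Hypothetical sample of scores (made-up values).
X = [3, 1, 7, 4]

n = len(X)      # number of scores in the sample
sum_X = sum(X)  # ΣX: the sum of the scores

print(n)      # 4
print(sum_X)  # 15
```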
Order of Operations
1. All calculations within parentheses are done
first.
2. Squaring or raising to other exponents is done
second.
3. Multiplying and dividing are done third, and
should be completed in order from left to right.
4. Summation with the Σ notation is done next.
5. Any additional adding and subtracting is done
last and should be completed in order from left
to right.
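The rules above determine how expressions that look alike are evaluated differently. The Python sketch below works through four such expressions using made-up scores (an assumption for illustration); the variable names are hypothetical.

```python
# Hypothetical scores (made-up values).
X = [3, 1, 7, 4]

# ΣX²: square each score first (rule 2), then sum (rule 4).
sum_of_squares = sum(x ** 2 for x in X)   # 9 + 1 + 49 + 16 = 75

# (ΣX)²: the parentheses come first (rule 1), so sum, then square.
square_of_sum = sum(X) ** 2               # 15² = 225

# Σ(X − 1): subtract inside the parentheses first, then sum.
sum_of_shifted = sum(x - 1 for x in X)    # 2 + 0 + 6 + 3 = 11

# ΣX − 1: sum first (rule 4), then subtract (rule 5).
sum_then_subtract = sum(X) - 1            # 15 − 1 = 14

print(sum_of_squares, square_of_sum, sum_of_shifted, sum_then_subtract)
```

Note that ΣX² and (ΣX)² give different results (75 versus 225), as do Σ(X − 1) and ΣX − 1 (11 versus 14).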
