Basic Concepts

This document provides an overview of key concepts in econometrics. It discusses variables, populations and samples, types of variables, scales of measurement, descriptive and inferential statistics, sampling error, and frequency distributions. Frequency distributions organize and simplify data by tabulating how many observations fall into each category of a measurement scale. They can take the form of regular or grouped tables and histograms or polygons.


Basic Concepts in Econometrics

1
Variables

• A variable is a characteristic or condition that can change or take on different values.
• Most research begins with a general question about the relationship between two variables for a specific group of individuals.

2
Population

• The entire group of individuals is called the population.
• For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of third-grade children.

3
Sample

• Usually populations are so large that a researcher cannot examine the entire group. Therefore, a sample is selected to represent the population in a research study. The goal is to use the results obtained from the sample to help answer questions about the population.

4
Types of Variables

• Variables can be classified as discrete or continuous.
• Discrete variables (such as class size) consist of indivisible categories, and continuous variables (such as time or weight) are infinitely divisible into whatever units a researcher may choose. For example, time can be measured to the nearest minute, second, half-second, etc.

6
Real Limits

• To define the units for a continuous variable, a researcher must use real limits, which are boundaries located exactly halfway between adjacent categories.

7
Measuring Variables

• To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured.
• The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.
8
4 Types of Measurement Scales

1. A nominal scale is an unordered set of categories identified only by name. Nominal measurements only permit you to determine whether two individuals are the same or different.
2. An ordinal scale is an ordered set of categories. Ordinal measurements tell you the direction of difference between two individuals.

9
4 Types of Measurement Scales

3. An interval scale is an ordered series of equal-sized categories. Interval measurements identify the direction and magnitude of a difference. The zero point is located arbitrarily on an interval scale.
4. A ratio scale is an interval scale where a value of zero indicates none of the variable. Ratio measurements identify the direction and magnitude of differences and allow ratio comparisons of measurements. Ratio scales are common in finance, for example liquidity ratios.
10
Correlational Studies

• The goal of a correlational study is to determine whether there is a relationship between two variables and to describe the relationship.
• A correlational study simply observes the two variables as they exist naturally.
• It does not establish causality between the variables (an important caveat in econometrics).

11
Experiments

• The goal of an experiment is to demonstrate a cause-and-effect relationship between two variables; that is, to show that changing the value of one variable causes changes to occur in a second variable.
• Econometrics mostly uses non-experimental (observational) data.

13
Data

• The measurements obtained in a research study are called the data.
• The goal of statistics is to help researchers organize and interpret the data.
• Panel data: the same variables observed for multiple units across different time periods.
• Time series data: the same variable observed over different time periods.
• Cross-sectional data: different units observed at the same point in time.
14
Descriptive Statistics

• Descriptive statistics are methods for organizing and summarizing data.
• For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.
• A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.
• Common examples are the standard deviation, variance, mean, and median, based on observation rather than experiment.
15
Inferential Statistics

• Inferential statistics are methods for using sample data to make general conclusions (inferences) about populations.
• Because a sample is typically only a part of the whole population, sample data provide only limited information about the population. As a result, sample statistics are generally imperfect representatives of the corresponding population parameters.

16
Sampling Error

• The discrepancy between a sample statistic and its population parameter is called sampling error.
• Defining and measuring sampling error is a large part of inferential statistics.

17
Determining Scale of Measurement

• 3 steps

• Examples

• Try a few
Scales (levels) of Measurement
To determine level of measurement

• Step #1: Identify the unit of observation (also called the observational unit or sampling unit) - "one of something."
• What are you interested in studying?
• Possible observational units:
– A rat
– A group/team
– A household
– A city
– A country
– A college student
What level (scale)?
Number of children aboard each of 15 school buses?

• Observational unit?
– One school bus
• Place bus in category?
– No
• Measurement Unit?
– One student
• Absolute zero?
– Yes, bus could carry no students
• Level of measure?
– Ratio
What Level of Measurement?
Students: Fail vs. Pass
• Observational unit?
– One student
• Place student in category?
– Yes: fail or pass
• Are the categories ranked on some quality?
– Yes (an underlying continuous variable: success in learning)
• Level of measurement?
– Ordinal
Frequency Distributions

• After collecting data, the first task for a researcher is to organize and simplify the data so that it is possible to get a general overview of the results.
• This is the goal of descriptive statistical techniques.
• One method for simplifying and organizing data is to construct a frequency distribution.
23
Frequency Distributions (cont.)

• A frequency distribution is an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement. A frequency distribution presents an organized picture of the entire set of scores, and it shows where each individual is located relative to others in the distribution.

24
Frequency Distribution Tables

• A frequency distribution table consists of at least two columns: one listing categories on the scale of measurement (X) and another for frequency (f).
• In the X column, values are listed from the highest to lowest, without skipping any.
• For the frequency column, tallies are determined for each value (how often each X value occurs in the data set). These tallies are the frequencies for each X value.
• The sum of the frequencies should equal N.

25
Frequency Distribution Tables (cont.)
• A third column can be used for the
proportion (p) for each category: p = f/N.
The sum of the p column should equal
1.00.
• A fourth column can display the
percentage of the distribution
corresponding to each X value. The
percentage is found by multiplying p by
100. The sum of the percentage column is
100%.
26
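The table described above can be sketched in Python. This is a minimal illustration (not part of the original slides): it builds the X, f, p, and % columns for a set of scores, listing X values from highest to lowest without skipping any.

```python
from collections import Counter

def frequency_table(scores):
    """Regular frequency distribution table: rows of
    (X, f, p = f/N, percentage = p * 100), with X values listed
    from highest to lowest without skipping any."""
    n = len(scores)
    counts = Counter(scores)
    rows = []
    for x in range(max(scores), min(scores) - 1, -1):  # highest to lowest
        f = counts.get(x, 0)
        rows.append((x, f, f / n, 100 * f / n))
    return rows

scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8, 7, 8, 10, 9, 8, 6, 9, 7, 8, 8]
table = frequency_table(scores)
assert sum(f for _, f, _, _ in table) == len(scores)      # Σf = N
assert abs(sum(p for _, _, p, _ in table) - 1.0) < 1e-9   # Σp = 1.00
```

The two assertions check the properties the slides state: the frequencies sum to N, and the proportions sum to 1.00.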
Regular Frequency Distribution

• When a frequency distribution table lists all of the individual categories (X values), it is called a regular frequency distribution.

27
Grouped Frequency Distribution

• Sometimes, however, a set of scores covers a wide range of values. In these situations, a list of all the X values would be quite long - too long to be a “simple” presentation of the data.
• To remedy this situation, a grouped frequency distribution table is used.

28
Grouped Frequency Distribution (cont.)

• In a grouped table, the X column lists groups of scores, called class intervals, rather than individual values.
• These intervals all have the same width, usually a simple number such as 2, 5, 10, and so on.
• Each interval begins with a value that is a multiple of the interval width. The interval width is selected so that the table will have approximately ten intervals.

29
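The interval rules above can be sketched in Python. This is an illustrative helper (the candidate widths and the "at most ten intervals" cutoff are assumptions based on the slide's guidelines, not a fixed recipe from the slides):

```python
def grouped_intervals(scores, widths=(2, 5, 10, 20, 50)):
    """Pick a simple interval width that yields roughly ten class
    intervals, each starting at a multiple of the width, and count
    the frequency in each interval."""
    span = max(scores) - min(scores) + 1
    width = next(w for w in widths if span / w <= 10)
    start = (min(scores) // width) * width     # multiple of the width
    intervals = []
    while start <= max(scores):
        hi = start + width - 1
        f = sum(start <= x <= hi for x in scores)
        intervals.append((start, hi, f))
        start += width
    return width, intervals

scores = [53, 84, 91, 87, 72, 75, 95, 76, 84, 80, 67, 78, 61, 59, 90]
width, intervals = grouped_intervals(scores)
assert width == 5                              # span 43 → 43/5 ≤ 10
assert sum(f for _, _, f in intervals) == len(scores)
```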
Frequency Distribution Graphs

• In a frequency distribution graph, the score categories (X values) are listed on the X axis and the frequencies are listed on the Y axis.
• When the score categories consist of numerical scores from an interval or ratio scale, the graph should be either a histogram or a polygon.

30
Histograms

• In a histogram, a bar is centered above each score (or class interval) so that the height of the bar corresponds to the frequency and the width extends to the real limits, so that adjacent bars touch.

31
Frequency distribution graphs

• Frequency distribution graphs are useful because they show the entire set of scores.
• At a glance, you can determine the highest
score, the lowest score, and where the
scores are centered.
• The graph also shows whether the scores
are clustered together or scattered over a
wide range.
33
Shape

• A graph shows the shape of the distribution.


• A distribution is symmetrical if the left side of
the graph is (roughly) a mirror image of the right
side.
• One example of a symmetrical distribution is the
bell-shaped normal distribution.
• On the other hand, distributions are skewed
when scores pile up on one side of the
distribution, leaving a "tail" of a few extreme
values on the other side.

34
Positively and Negatively
Skewed Distributions
• In a positively skewed distribution, the
scores tend to pile up on the left side of
the distribution with the tail tapering off to
the right.
• In a negatively skewed distribution, the
scores tend to pile up on the right side and
the tail points to the left.

35
Percentiles, Percentile Ranks,
and Interpolation
• The relative location of individual scores
within a distribution can be described by
percentiles and percentile ranks.
• The percentile rank for a particular X
value is the percentage of individuals with
scores equal to or less than that X value.
• When an X value is described by its rank,
it is called a percentile.

37
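The percentile-rank definition above translates directly into a one-line computation. A minimal sketch (not from the slides):

```python
def percentile_rank(x, scores):
    """Percentile rank: the percentage of individuals with scores
    equal to or less than the given X value."""
    return 100 * sum(s <= x for s in scores) / len(scores)

scores = [2, 3, 5, 5, 6, 8, 8, 9, 10, 12]
assert percentile_rank(5, scores) == 40.0   # 4 of 10 scores are ≤ 5
```

Read the other way around, X = 5 is the 40th percentile of this distribution.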
Interpolation

• When scores or percentages do not correspond to upper real limits or cumulative percentages, you must use interpolation to determine the corresponding ranks and percentiles.
• Interpolation is a mathematical process based
on the assumption that the scores and the
percentages change in a regular, linear fashion
as you move through an interval from one end to
the other.

38
Central Tendency

40
Central Tendency

• In general terms, central tendency is a statistical measure that determines a single value that accurately describes the center of the distribution and represents the entire distribution of scores.
• The goal of central tendency is to identify the single value that is the best representative for the entire set of data.
• Mean, median, mode.
• For data that are not normally distributed, these measures can differ substantially.
41
Central Tendency (cont.)

• By identifying the "average score," central tendency allows researchers to summarize or condense a large set of data into a single value.
• Thus, central tendency serves as a descriptive
statistic because it allows researchers to
describe or present a set of data in a very
simplified, concise form.
• In addition, it is possible to compare two (or
more) sets of data by simply comparing the
average score (central tendency) for one set
versus the average score for another set.
42
The Mean, the Median,
and the Mode
• It is essential that central tendency be
determined by an objective and well‑defined
procedure so that others will understand exactly
how the "average" value was obtained and can
duplicate the process.
• No single procedure always produces a good,
representative value. Therefore, researchers
have developed three commonly used
techniques for measuring central tendency: the
mean, the median, and the mode.
43
The Mean

• The mean is the most commonly used measure of central tendency.
• Computation of the mean requires scores
that are numerical values measured on an
interval or ratio scale.
• The mean is obtained by computing the
sum, or total, for the entire set of scores,
then dividing this sum by the number of
scores.
44
The Mean (cont.)

Conceptually, the mean can also be defined as:


1. The mean is the amount that each individual
receives when the total (ΣX) is divided equally
among all N individuals.
2. The mean is the balance point of the distribution
because the sum of the distances below the
mean is exactly equal to the sum of the
distances above the mean.

45
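Both conceptual definitions above can be verified numerically. A small check in Python (the scores are an invented example):

```python
scores = [3, 7, 8, 10, 12]
mean = sum(scores) / len(scores)     # ΣX / N = 40 / 5 = 8

# Definition 2: the mean is the balance point, so the distances
# below the mean exactly cancel the distances above it.
deviations = [x - mean for x in scores]
assert mean == 8.0
assert abs(sum(deviations)) < 1e-9   # Σ(X − mean) = 0
```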
Changing the Mean

• Because the calculation of the mean involves every score in the distribution, changing the value of any score will change the value of the mean.
• Modifying a distribution by discarding scores or
by adding new scores will usually change the
value of the mean.
• To determine how the mean will be affected for
any specific situation you must consider: 1) how
the number of scores is affected, and 2) how the
sum of the scores is affected.
46
When the Mean Won’t Work

• Although the mean is the most commonly used measure of central tendency, there are situations where the mean does not provide a good, representative value, and there are situations where you cannot compute a mean at all.
• When a distribution contains a few extreme
scores (or is very skewed), the mean will be
pulled toward the extremes (displaced toward
the tail). In this case, the mean will not provide
a "central" value.

47
The Median

• If the scores in a distribution are listed in order from smallest to largest, the median is defined as the midpoint of the list.
• The median divides the scores so that 50% of
the scores in the distribution have values that
are equal to or less than the median.
• Computation of the median requires scores that
can be placed in rank order (smallest to largest)
and are measured on an ordinal, interval, or
ratio scale.

48
The Median (cont.)

Usually, the median can be found by a simple counting procedure:
1. With an odd number of scores, list the values in order, and the median is the middle score in the list.
2. With an even number of scores, list the values in order, and the median is halfway between the middle two scores.

49
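The counting procedure above can be sketched directly (a minimal illustration, not from the slides):

```python
def median(scores):
    """Median by the counting procedure: the middle score for an odd
    number of scores, halfway between the two middle scores for an
    even number."""
    s = sorted(scores)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

assert median([3, 5, 8, 10, 11]) == 8       # odd N: middle score
assert median([1, 1, 4, 5, 7, 8]) == 4.5    # even N: (4 + 5) / 2
```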
The Median (cont.)

• One advantage of the median is that it is relatively unaffected by extreme scores.
• Thus, the median tends to stay in the
"center" of the distribution even when
there are a few extreme scores or when
the distribution is very skewed. In these
situations, the median serves as a good
alternative to the mean.

50
The Mode

• The mode is defined as the most frequently occurring category or score in the distribution.
• In a frequency distribution graph, the mode is
the category or score corresponding to the peak
or high point of the distribution.
• The mode can be determined for data measured
on any scale of measurement: nominal, ordinal,
interval, or ratio.

51
The Mode (cont.)

• The primary value of the mode is that it is the only measure of central tendency that can be used for data measured on a nominal scale. In addition, the mode often is used as a supplemental measure of central tendency that is reported along with the mean or the median.

52
Bimodal Distributions

• It is possible for a distribution to have more than one mode. Such a distribution is called bimodal. (Note that a distribution can have only one mean and only one median.)
• In addition, the term "mode" is often used to
describe a peak in a distribution that is not really
the highest point. Thus, a distribution may have
a major mode at the highest peak and a minor
mode at a secondary peak in a different location.

53
Central Tendency and the
Shape of the Distribution
• Because the mean, the median, and the
mode are all measuring central tendency,
the three measures are often
systematically related to each other.
• In a symmetrical distribution, for example,
the mean and median will always be
equal.

55
Central Tendency and the
Shape of the Distribution (cont.)
• If a symmetrical distribution has only one
mode, the mode, mean, and median will
all have the same value.
• In a skewed distribution, the mode will be
located at the peak on one side and the
mean usually will be displaced toward the
tail on the other side.
• The median is usually located between the
mean and the mode.
56
Variability

• Variability serves both as a descriptive measure and as an important component of most inferential statistics.
• As a descriptive statistic, variability measures
the degree to which the scores are spread out or
clustered together in a distribution.
• In the context of inferential statistics, variability
provides a measure of how accurately any
individual score or sample represents the entire
population.
• Range, variance, SD.
57
Variability (cont.)

• When the population variability is small, all of the scores are clustered close together and any individual score or sample will necessarily provide a good representation of the entire set.
• On the other hand, when variability is large
and scores are widely spread, it is easy for
one or two extreme scores to give a
distorted picture of the general population.

58
Measuring Variability

• Variability can be measured with:
– the range
– the interquartile range
– the standard deviation/variance
• In each case, variability is determined by
measuring distance.

59
The Range

• The range is the total distance covered by the distribution, from the highest score to the lowest score (using the upper and lower real limits of the range).

60
The Interquartile Range

• The interquartile range is the distance covered by the middle 50% of the distribution (the difference between Q1 and Q3).

61
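The definition above can be sketched in Python. Note that textbooks differ slightly in how Q1 and Q3 are computed; this sketch uses the median-of-halves convention, which is an assumption on my part:

```python
def interquartile_range(scores):
    """IQR = Q3 − Q1 using the median-of-halves convention:
    Q1 is the median of the lower half, Q3 of the upper half."""
    s = sorted(scores)
    n = len(s)
    half = n // 2
    lower, upper = s[:half], s[half + n % 2:]   # drop middle score if odd N

    def med(v):
        m = len(v) // 2
        return v[m] if len(v) % 2 else (v[m - 1] + v[m]) / 2

    return med(upper) - med(lower)

assert interquartile_range([1, 2, 3, 4, 5, 6, 7, 8]) == 4.0  # 6.5 − 2.5
```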
The Standard Deviation

• Standard deviation measures the standard distance between a score and the mean.
• The calculation of standard deviation can be summarized as a four-step process:

62
The Standard Deviation (cont.)

1. Compute the deviation (distance from the mean) for each score.
2. Square each deviation.
3. Compute the mean of the squared deviations. For a population, this involves summing the squared deviations (sum of squares, SS) and then dividing by N. The resulting value is called the variance or mean square and measures the average squared distance from the mean.
   For samples, variance is computed by dividing the sum of the squared deviations (SS) by n - 1, rather than N. The value n - 1 is known as degrees of freedom (df) and is used so that the sample variance will provide an unbiased estimate of the population variance.
4. Finally, take the square root of the variance to obtain the standard deviation.
63
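The four steps above can be sketched as a short Python function (an illustration, not from the slides):

```python
import math

def variance_sd(scores, sample=True):
    """Four-step standard deviation: (1) deviations from the mean,
    (2) square them, (3) average the squares - dividing SS by n − 1
    for a sample or by N for a population - (4) take the square root."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)      # sum of squares, SS
    var = ss / (n - 1 if sample else n)
    return var, math.sqrt(var)

scores = [1, 9, 5, 8, 7]                           # mean = 6, SS = 40
assert variance_sd(scores, sample=False)[0] == 8.0   # population: 40 / 5
assert variance_sd(scores, sample=True)[0] == 10.0   # sample: 40 / 4
```

The example shows why the n − 1 divisor matters: the sample estimate (10.0) is larger than the population value (8.0), correcting the downward bias of sample variability.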
The Mean and Standard Deviation
as Descriptive Statistics
• If you are given numerical values for the
mean and the standard deviation, you
should be able to construct a visual image
(or a sketch) of the distribution of scores.
• As a general rule, about 70% of the scores
will be within one standard deviation of the
mean, and about 95% of the scores will be
within a distance of two standard
deviations of the mean.

65
z-Scores and Location

• By itself, a raw score or X value provides very little information about how that particular score compares with other values in the distribution.
• A score of X = 53, for example, may be a
relatively low score, or an average score, or an
extremely high score depending on the mean
and standard deviation for the distribution from
which the score was obtained.
• If the raw score is transformed into a z-score,
however, the value of the z-score tells exactly
where the score is located relative to all the
other scores in the distribution.
66
z-Scores and Location (cont.)

• The process of changing an X value into a z-score involves creating a signed number, called a z-score, such that:
a. The sign of the z-score (+ or –) identifies whether the X value is located above the mean (positive) or below the mean (negative).
b. The numerical value of the z-score corresponds to the number of standard deviations between X and the mean of the distribution.
67
z-Scores and Location (cont.)

• Thus, a score that is located two standard deviations above the mean will have a z-score of +2.00. And, a z-score of +2.00 always indicates a location above the mean by two standard deviations.

68
Transforming back and forth
between X and z
• The basic z-score definition is usually
sufficient to complete most z-score
transformations. However, the definition
can be written in mathematical notation to
create a formula for computing the z-score
for any value of X.
z = (X – μ) / σ
70
Transforming back and forth
between X and z (cont.)
• Also, the terms in the formula can be
regrouped to create an equation for
computing the value of X corresponding to
any specific z-score.

X = μ + zσ

71
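The two formulas above are a matched pair, which a short Python sketch makes concrete (an illustration with invented numbers):

```python
def z_score(x, mu, sigma):
    """z = (X − μ) / σ: the sign gives the direction from the mean,
    the magnitude gives the distance in standard deviations."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Inverse transformation: X = μ + zσ."""
    return mu + z * sigma

assert z_score(130, 100, 15) == 2.0      # two SDs above the mean
assert from_z(-1.5, 100, 15) == 77.5     # 1.5 SDs below the mean
```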
Z-scores and Locations

• In addition to knowing the basic definition of a z-score and the formula for a z-score, it is useful to be able to visualize z-scores as locations in a distribution.
• Remember, z = 0 is in the center (at the mean),
and the extreme tails correspond to z-scores of
approximately –2.00 on the left and +2.00 on the
right.
• Although more extreme z-score values are
possible, most of the distribution is contained
between z = –2.00 and z = +2.00.
72
Z-scores and Locations (cont.)

• The fact that z-scores identify exact locations within a distribution means that z-scores can be used as descriptive statistics and as inferential statistics.
– As descriptive statistics, z-scores describe
exactly where each individual is located.
– As inferential statistics, z-scores determine
whether a specific sample is representative of
its population, or is extreme and
unrepresentative.
73
z-Scores as a Standardized
Distribution
• When an entire distribution of X values is
transformed into z-scores, the resulting
distribution of z-scores will always have a
mean of zero and a standard deviation of
one.
• The transformation does not change the
shape of the original distribution and it
does not change the location of any
individual score relative to others in the
distribution.
74
z-Scores as a Standardized
Distribution (cont.)
• The advantage of standardizing
distributions is that two (or more) different
distributions can be made the same.
– For example, one distribution has μ = 100 and
σ = 10, and another distribution has μ =
40 and σ = 6.
– When these distribution are transformed to z-
scores, both will have μ = 0 and σ = 1.

75
z-Scores as a Standardized
Distribution (cont.)
• Because z-score distributions all have the
same mean and standard deviation,
individual scores from different
distributions can be directly compared.
• A z-score of +1.00 specifies the same
location in all z-score distributions.

76
z-Scores and Samples

• It is also possible to calculate z-scores for samples.
• The definition of a z-score is the same for
either a sample or a population, and the
formulas are also the same except that the
sample mean and standard deviation are
used in place of the population mean and
standard deviation.

77
z-Scores and Samples (cont.)
• Thus, for a score from a sample, z = (X – M) / s.
• Using z-scores to standardize a sample also has
the same effect as standardizing a population.
• Specifically, the mean of the z-scores will be
zero and the standard deviation of the z-scores
will be equal to 1.00 provided the standard
deviation is computed using the sample formula
(dividing by n – 1 instead of n).
78
Other Standardized Distributions
Based on z-Scores
• Although transforming X values into z-
scores creates a standardized distribution,
many people find z-scores burdensome
because they consist of many decimal
values and negative numbers.
• Therefore, it is often more convenient to
standardize a distribution into numerical
values that are simpler than z-scores.

79
Other Standardized Distributions
Based on z-Scores (cont.)
• To create a simpler standardized
distribution, you first select the mean and
standard deviation that you would like for
the new distribution.
• Then, z-scores are used to identify each
individual's position in the original
distribution and to compute the individual's
position in the new distribution.

80
Other Standardized Distributions
Based on z-Scores (cont.)
• Suppose, for example, that you want to
standardize a distribution so that the new mean
is μ = 50 and the new standard deviation is σ =
10.
• An individual with z = –1.00 in the original
distribution would be assigned a score of X = 40
(below μ by one standard deviation) in the
standardized distribution.
• Repeating this process for each individual score
allows you to transform an entire distribution into
a new, standardized distribution.
81
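The two-step process above (X → z → new X) can be sketched for the slide's μ = 50, σ = 10 example (the input scores are invented):

```python
def standardize(scores, new_mu, new_sigma):
    """Transform scores to a new standardized distribution:
    convert each X to a z-score, then rescale with X = new_mu + z * new_sigma."""
    n = len(scores)
    mu = sum(scores) / n
    sigma = (sum((x - mu) ** 2 for x in scores) / n) ** 0.5  # population SD
    return [new_mu + ((x - mu) / sigma) * new_sigma for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]          # μ = 5, σ = 2
new = standardize(scores, 50, 10)
assert new[0] == 35.0                      # z = −1.5 → 50 + (−1.5)(10)
```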
Correlation

82
Correlations: Measuring and
Describing Relationships
• A correlation is a statistical method used to
measure and describe the relationship
between two variables.
• A relationship exists when changes in one
variable tend to be accompanied by
consistent and predictable changes in the
other variable.

83
Correlations: Measuring and
Describing Relationships (cont.)

• A correlation typically evaluates three aspects of the relationship:
– the direction
– the form
– the degree

86
Correlations: Measuring and
Describing Relationships (cont.)
• The direction of the relationship is
measured by the sign of the correlation (+
or -). A positive correlation means that the
two variables tend to change in the same
direction; as one increases, the other also
tends to increase. A negative correlation
means that the two variables tend to
change in opposite directions; as one
increases, the other tends to decrease.
87
Correlations: Measuring and
Describing Relationships (cont.)

• The most common form of relationship is a straight line or linear relationship, which is measured by the Pearson correlation.

89
Correlations: Measuring and
Describing Relationships (cont.)

• The degree of relationship (the strength or consistency of the relationship) is measured by the numerical value of the correlation. A value of 1.00 indicates a perfect relationship and a value of zero indicates no relationship.
• As rough guidelines, a correlation near 0.1 is weak, near 0.3 is moderate, near 0.5 is strong, and values near 1 indicate a very strong relationship.
91
Correlations: Measuring and
Describing Relationships (cont.)
• To compute a correlation you need two
scores, X and Y, for each individual in the
sample.
• The Pearson correlation requires that the
scores be numerical values from an
interval or ratio scale of measurement.
• Other correlational methods exist for other
scales of measurement.
93
The Pearson Correlation

• The Pearson correlation measures the direction and degree of linear (straight-line) relationship between two variables.
• To compute the Pearson correlation, you first measure the variability of X and Y scores separately by computing SS for the scores of each variable (SSX and SSY).
• Then, the covariability (tendency for X and Y to vary together) is measured by the sum of products (SP).
• The Pearson correlation is found by computing the ratio r = SP / √(SSX · SSY).
94
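The SS/SP computation above can be sketched directly (an illustration with invented data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation r = SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - mx) ** 2 for x in xs)                   # SS for X
    ss_y = sum((y - my) ** 2 for y in ys)                   # SS for Y
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # sum of products
    return sp / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
assert abs(pearson_r(xs, [2, 4, 6, 8, 10]) - 1.0) < 1e-9   # perfect positive
assert abs(pearson_r(xs, [10, 8, 6, 4, 2]) + 1.0) < 1e-9   # perfect negative
```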
The Pearson Correlation (cont.)

• Thus the Pearson correlation is comparing the amount of covariability (variation from the relationship between X and Y) to the amount X and Y vary separately.
• The magnitude of the Pearson correlation
ranges from 0 (indicating no linear relationship
between X and Y) to 1.00 (indicating a perfect
straight-line relationship between X and Y).
• The correlation can be either positive or
negative depending on the direction of the
relationship.
95
The Spearman Correlation

• The Spearman correlation is used in two general situations:
(1) It measures the relationship between two ordinal variables; that is, X and Y both consist of ranks.
(2) It measures the consistency of direction of the relationship between two variables. In this case, the two variables must be converted to ranks before the Spearman correlation is computed.

97
The Spearman Correlation (cont.)
The calculation of the Spearman correlation requires:

1. Two variables are observed for each individual.


2. The observations for each variable are rank ordered.
Note that the X values and the Y values are ranked
separately.
3. After the variables have been ranked, the Spearman
correlation is computed by either:
a. Using the Pearson formula with the ranked
data.
b. Using the special Spearman formula
(assuming there are few, if any, tied ranks).
98
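Option 3a above (the Pearson formula applied to the ranked data) can be sketched in Python. The tie-handling (averaging tied ranks) is a standard convention I am assuming, not something the slides specify:

```python
def ranks(values):
    """Rank from smallest to largest; ties receive the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                      # extend over a run of tied values
        avg = (i + j) / 2 + 1           # average rank, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman correlation: the Pearson formula applied to the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ssx = sum((a - mx) ** 2 for a in rx)
    ssy = sum((b - my) ** 2 for b in ry)
    return sp / (ssx * ssy) ** 0.5

# A consistently increasing (monotonic) relationship gives +1.00,
# even though the raw relationship is not linear.
assert abs(spearman([1, 3, 4, 9], [2, 10, 11, 40]) - 1.0) < 1e-9
```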
The Point-Biserial Correlation and
the Phi Coefficient
• The Pearson correlation formula can also
be used to measure the relationship
between two variables when one or both
of the variables is dichotomous.
• A dichotomous variable is one for which
there are exactly two categories: for
example, men/women or succeed/fail.

100
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
With either one or two dichotomous variables, the
calculation of the correlation proceeds as
follows:
1. Assign numerical values to the two categories of
the dichotomous variable(s). Traditionally, one
category is assigned a value of 0 and the other
is assigned a value of 1.
2. Use the regular Pearson correlation formula to
calculate the correlation.
101
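The two steps above can be sketched in Python: code the dichotomous variable 0/1, then apply the ordinary Pearson formula. The group labels and scores are invented for illustration:

```python
import math

def pearson_r(xs, ys):
    """Ordinary Pearson correlation, reused for the point-biserial case."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ssx * ssy)

# Step 1: dichotomous variable coded 0/1 (e.g. fail = 0, pass = 1).
group = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [3, 4, 5, 4, 8, 9, 7, 8]      # numerical scores (interval/ratio)

# Step 2: the regular Pearson formula gives the point-biserial correlation.
r_pb = pearson_r(group, scores)
assert 0 < r_pb < 1                    # positive: higher scores in group 1
```

With two dichotomous 0/1 variables, the same call would give the phi coefficient.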
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• In situations where one variable is
dichotomous and the other consists of
regular numerical scores (interval or ratio
scale), the resulting correlation is called a
point-biserial correlation.
• When both variables are dichotomous, the
resulting correlation is called a phi-
coefficient.
102
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• The point-biserial correlation is closely related to
the independent-measures t test introduced in
Chapter 10.
• When the data consists of one dichotomous
variable and one numerical variable, the
dichotomous variable can also be used to
separate the individuals into two groups.
• Then, it is possible to compute a sample mean
for the numerical scores in each group.

103
The Point-Biserial Correlation and
the Phi Coefficient (cont.)
• In this case, the independent-measures t
test can be used to evaluate the mean
difference between groups.
• If the effect size for the mean difference is
measured by computing r2 (the percentage
of variance explained), the value of r2 will
be equal to the value obtained by squaring
the point-biserial correlation.
104