Regression & Correlation


2021

Regression Analysis-Qaiser ACMA

Qaiser Iqbal (ACMA)


Associate Member of ICMAP
Regression
Introduction
Regression examines the relationship between a dependent variable and an independent variable. The
variable to be estimated or predicted is the dependent variable; the variable on whose basis it is estimated
or predicted is the independent variable.
For example, in estimating the yield of a crop from the amount of fertilizer used, the yield is the dependent
variable and the amount of fertilizer is the independent variable.
The dependent variable is also called the regressand, the predicted variable, the response variable or the
explained variable. The independent variable is also called the regressor, the predictor or the
explanatory variable.
The values of the independent variable are assumed to be fixed, so it is not treated as a random variable. The
dependent variable, whose values are determined on the basis of the independent variable, is a random variable.
In the example above, the yield of the crop is a random variable and the amount of fertilizer is assumed to be
fixed or known.
Regression Analysis
Regression analysis is the process of estimating a dependent variable on the basis of an independent variable.
If only two variables are involved, it is called simple regression analysis; if more than two variables are
involved, it is called multiple regression analysis.
If Y is to be estimated on the basis of X, the equation is called the regression equation of Y on X. Normally
the dependent variable is placed on the Y-axis and the independent variable on the X-axis. If X is to be
estimated on the basis of Y, the equation is called the regression equation of X on Y.
The term regression is often used to mean "linear regression". If a constant amount of change in the dependent
variable is associated with a unit change in the predicting variable, the relationship is said to be linear;
otherwise it is non-linear.
Regression line Y on X
Y = a + bX
Regression line X on Y
X = a + bY

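Steps to determine the Least Squares Line of Regression (Y on X)

1) Calculate the means of both variables, $\bar{X}$ and $\bar{Y}$.

2) Calculate the combined variance (covariance): $S_{xy} = \frac{1}{n-1}\left(\sum xy - \frac{(\sum x)(\sum y)}{n}\right)$

3) Calculate the variance of X: $S_x^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$

4) Calculate the slope: $b_{yx} = \frac{S_{xy}}{S_x^2}$

5) Calculate the intercept: $a_{yx} = \bar{Y} - b_{yx}\bar{X}$

6) Establish the least-squares regression line: $Y = a_{yx} + b_{yx} X$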
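
The six steps above can be sketched in a few lines of Python. This is only an illustration: the function name `least_squares_line` is hypothetical, and the sample values are the fertilizer and corn-yield figures that appear in one of the practice questions later in these notes.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line  ys = a + b * xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")

    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Step 1: means of both variables
    x_bar, y_bar = sum_x / n, sum_y / n
    # Step 2: covariance S_xy
    s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)
    # Step 3: variance of X
    s_x2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    if s_x2 == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    # Step 4: slope, Step 5: intercept
    b = s_xy / s_x2
    a = y_bar - b * x_bar
    # Step 6: the line is  Y = a + b X
    return a, b


# Illustrative data: fertilizer (kg/acre) and corn yield (bushels/acre)
fertilizer = [100, 200, 400, 500]
corn_yield = [70, 70, 80, 100]

a, b = least_squares_line(fertilizer, corn_yield)
print(f"Y = {a:.2f} + {b:.4f} X")   # Y = 59.00 + 0.0700 X
```

Swapping the arguments, `least_squares_line(corn_yield, fertilizer)`, gives the intercept and slope of the regression line of X on Y instead.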


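Steps to determine the Least Squares Line of Regression (X on Y)
1) Calculate the means of both variables, $\bar{X}$ and $\bar{Y}$.

2) Calculate the combined variance (covariance): $S_{xy} = \frac{1}{n-1}\left(\sum xy - \frac{(\sum x)(\sum y)}{n}\right)$

3) Calculate the variance of Y: $S_y^2 = \frac{1}{n-1}\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)$

4) Calculate the slope: $b_{xy} = \frac{S_{xy}}{S_y^2}$

5) Calculate the intercept: $a_{xy} = \bar{X} - b_{xy}\bar{Y}$

6) Establish the least-squares regression line: $X = a_{xy} + b_{xy} Y$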

Properties of Regression Line


The regression line has the following properties.
o The sum of the differences between the observed values (the y values) and the predicted values (the ŷ values
computed from the regression equation) is always zero: $\sum (Y - \hat{Y}) = 0$.
o The line minimizes the sum of squared differences between the observed values (the y values) and the
predicted values (the ŷ values computed from the regression equation): $\sum (Y - \hat{Y})^2$ is a minimum.
o The regression line passes through the point $(\bar{X}, \bar{Y})$, that is, through the mean of the X values
and the mean of the Y values.
o The regression coefficient (slope) is the average change in the dependent variable (Y) for a 1-unit
change in the independent variable (X).
o $b_{yx}$ (Y on X) and $b_{xy}$ (X on Y) are the two regression coefficients.
o Both regression coefficients have the same sign, i.e. they are either both positive or both negative.
It is not possible for one regression coefficient to be negative while the other is positive.
o The regression coefficients are independent of a change of origin, but not of scale. By origin, we mean
that the regression coefficients are unaffected if a constant is subtracted from the values of X and Y.
By scale, we mean that if the values of X and Y are multiplied or divided by constants, the regression
coefficients generally change (for example, multiplying Y alone by c multiplies $b_{yx}$ by c).
o The correlation coefficient is the geometric mean of the two regression coefficients:
$r = \pm\sqrt{b_{xy} \cdot b_{yx}}$, where r takes the common sign of the two coefficients.
o If one of the regression coefficients is numerically greater than unity, the other must be numerically less
than unity, because their product equals $r^2$ and the coefficient of correlation cannot exceed unity.
o When both coefficients are positive, their average is at least as large as the correlation coefficient:
$\frac{b_{xy} + b_{yx}}{2} \ge r$, with equality only when the two coefficients are equal.
o $b_{xy} = r \frac{S_x}{S_y}$ and also $b_{yx} = r \frac{S_y}{S_x}$
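
Several of these properties can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper `sample_cov` is a hypothetical name, the data are the fertilizer and corn-yield figures from a practice question in these notes, and one small data set cannot prove the properties in general.

```python
from math import sqrt, isclose


def sample_cov(us, vs):
    """Sample covariance with the n - 1 divisor; sample_cov(u, u) is the variance."""
    n = len(us)
    return (sum(u * v for u, v in zip(us, vs)) - sum(us) * sum(vs) / n) / (n - 1)


# Illustrative data (assumed for this example)
xs = [100, 200, 400, 500]
ys = [70, 70, 80, 100]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n

s_xy = sample_cov(xs, ys)
s_x2 = sample_cov(xs, xs)
s_y2 = sample_cov(ys, ys)

b_yx = s_xy / s_x2               # slope of Y on X
b_xy = s_xy / s_y2               # slope of X on Y
a_yx = y_bar - b_yx * x_bar      # intercept of Y on X
r = s_xy / sqrt(s_x2 * s_y2)     # correlation coefficient

residuals = [y - (a_yx + b_yx * x) for x, y in zip(xs, ys)]

checks = {
    "residuals sum to zero": isclose(sum(residuals), 0.0, abs_tol=1e-9),
    "line passes through the means": isclose(a_yx + b_yx * x_bar, y_bar),
    "coefficients have the same sign": b_yx * b_xy > 0,
    "r^2 = b_xy * b_yx": isclose(r * r, b_xy * b_yx),
    "b_yx = r * S_y / S_x": isclose(b_yx, r * sqrt(s_y2) / sqrt(s_x2)),
    "b_xy = r * S_x / S_y": isclose(b_xy, r * sqrt(s_x2) / sqrt(s_y2)),
    "average of coefficients >= r": (b_xy + b_yx) / 2 >= r,
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Each line should print `True` for this data set; the comparisons use a small tolerance because the arithmetic is done in floating point.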
PRACTICE QUESTION
For each of the following data sets, determine the estimated regression equation Y = a + bX.
(a) X̅ = 10; Y̅ = 20; ΣXY = 1000; ΣX² = 2000; n = 10.
(b) ΣX = 528; ΣY = 11720; ΣXY = 193640; ΣX² = 11440; n = 32.
(c) ΣX = 1239; ΣY = 79; ΣXY = 1613; ΣX² = 17322; ΣY² = 293; n = 100.
(d) n = 10; ΣX = 1710; ΣY = 760; ΣX² = 293162; ΣY² = 59390; ΣXY = 130628.
(e) X̅ = 52; Y̅ = 237; Σ(X - X̅)² = 2800; Σ(X - X̅)(Y - Y̅) = 9871.
Given these ten pairs of (X, Y) values:

X 1 1 2 3 4 4 5 6 6 7
Y 2.1 2.5 3.1 3.0 3.8 3.2 4.3 3.9 4.4 4.8

(a) Plot a scatter diagram for the above data.


(b) Carry out the necessary computations to obtain the least-squares estimates of the parameters in
the simple linear regression Y = a + bX
(c) Compute the residuals and verify that they add to zero.
(d) Use the regression equation to predict the value of Y when X=10.
Given the following sets of values:

Y 6.5 5.3 8.6 1.2 4.2 2.9 1.1 3.0


X 3.2 2.7 4.5 1.0 2.0 1.7 0.6 1.9

a) Compute the least-squares regression equation for Y values on X values, that is the equation Ŷ = a +
bX.
b) Compute the residuals and verify that they add to zero.
c) Use the regression equation to predict the value of Y when X=10.

The owner of a retailing organization is interested in the relationship between price at which a commodity is
offered for sale and the quantity sold. The following sample data have been collected.
Price 25 45 30 50 35 40 65 75 70 60
Quantity sold 118 105 112 100 111 108 95 88 91 96
(a) Plot a scatter diagram for the above data.
(b) Using the method of least squares, determine the equation for the estimated regression line. Plot this
line on the scatter diagram.
Given the following set of values: Determine the equation of the least square regression line.

X 20 11 15 10 17 19
Y 5 15 14 17 8 9

The following table gives the market share of a product and its television advertising expenditure:
X = Advertising expenditure 15 17 13 14 16
Y = Market share 23 25 21 24 26
Estimate the market share when advertising expenditure is 20:

A)28 B)28.8
C)26.5 D)25.6
Y = 45 - 3X is the regression line of y on x. By how many units is y expected to increase if x is
decreased by two units?

A)6 B)7
C)8 D)9

A researcher finds that there is a linear relationship between amount of fertilizer supplied to tomato plants
and the subsequent yield of tomatoes obtained. Eight tomato plants of the same variety were selected at
random and treated weekly with a solution in which x grams of fertilizer was dissolved in a fixed quantity of
water. The yield, y kilograms of tomatoes was recorded

Plant A B C D E F G H
X 1 1.5 2 2.5 3 3.5 4 4.5
Y 3.9 4.4 5.8 6.6 7 7.1 7.3 7.7
Estimate the yield of a plant treated weekly with 3.2 grams of fertilizer
A)6.7 B)3.9
C)7.6 D)8.2

A regression analysis between sales (in Rs. 1,000) and advertising (in Rs. 1,000) resulted in the following
least-squares line: Y = 80 + 5X. This implies that:

A) As advertising increases by Rs. 1,000, sales increase by Rs. 5,000
B) As advertising increases by Rs. 1,000, sales increase by Rs. 80,000
C) As advertising increases by Rs. 5, sales increase by Rs. 80
D) None of these

The two lines of regression are given by 8x+10y=25 and 16x+5y=12 respectively. If the variance of x is 25.
What is the standard deviation of y?

A)16 B)8
C)64 D)4
Given the following data

Variable X Y
Mean 80 98
Variance 4 9
Coefficient of correlation=0.6 What is the most likely value of y when X=90

A)90 B)103
C)104 D)107

Use the data for the following questions


∑x= 1,239 , ∑Y=79, ∑XY=17233, ∑X2=568,925, ∑Y2=293, n=100
Find the line 'y on x'
a) Y = 0.4813 + 0.03X b) Y = 0.4813 - 0.03X
c) Y = 0.4813 + 1.03X d) None of these

Find the line 'x on y'

a) X = -43.3 - 70.49Y b) X = 43.3 + 70.49Y
c) X = -43.3 + 70.49Y d) None of these
Regression line ‘x on y ‘ (2x+3y=3) , Regression line ‘y on x ‘ (x+4y=10) find slopes of both lines
a) bxy =-2, byx=-1/4 b) bxy =-2, byx =-4
c) bxy =2, byx =4 d) bxy =-3, byx =5

Regression line ‘x on y ‘ (2x+3y=3) , Regression line ‘y on x ‘ (x+4y=10) find x̅ and y̅


a) 4 and 3.5 b) -4 and 3.5
c) 2 and 4 d) 5 and 4

Regression line ‘x on y ‘given. You are required to find the change in y due to 1 unit increase in x
a) Y will increase 3 units b) Y will increase 5 units
c) Y will decrease 3 units d) Y will increase 2 units

For the variables x and y, the regression equations are given as 7x - 3y - 18 = 0 and 4x - y - 11 = 0
respectively. Find the arithmetic means of x and y.
a) Mean (x)=3 and Mean (y)=1 b) Mean (x)=2 and Mean (y)=3
c) Mean (x)= 2 and Mean (y)=1 d) Mean (x)=1 and Mean (y)=3

Regression coefficients remain unchanged under a change of:

a) Origin b) Scale
c) Both origin and scale d) Either origin or scale

If y=3x+4 is the regression line ‘x on y’ and the arithmetic mean of x is -1 , what is the arithmetic mean of
y?
A) 1 b) -1
c) 7 d) 5

Suppose that four randomly chosen plots were treated with various levels of fertilizer, resulting in the
following yields of corn.
Fertilizer(kg/Acre) X 100 200 400 500
Production (Bushels/Acre) Y 70 70 80 100
i. Estimate the linear regression
ii. Estimate the yield when no fertilizer is applied.
iii. Estimate the yield when the average amount of fertilizer is applied.
iv. Estimate how much yield is increased for every kilogram of fertilizer

The linear relation between a dependent and an independent variable is called:


A) Regression line B) Regression co-efficient
C) Co-efficient of correlation D) None of these
Slope of the regression line is called:
A) Regression parameter B) Sample parameter
C) Regression co-efficient D) None of these
In regression analysis, if the value of a is positive the value of b:
A) Must be positive B) May take any value
C) Must be negative D) Less than -1or more than 1
The procedure which selects that particular line for which the sum of the squares of the vertical distances
from the observed points to the line is as small as possible, is called:

A) Sum of squares method B) Sum of squares of errors method


C) Least square method D) None of these

The numerical values of regression co-efficients must be:


A) Both positive B) Both negative
C) Both positive or both negative D) None of these

In regression, the dependent variable is assumed to be a random variable whereas the independent
variable is assumed to have:
A) Random values B) Fixed values
C) Both (a) or (b) D) None of these

The dependent variable is also called response or:


A) The explained variable B) Unexplained variable
C) The explanatory variable D) None of these
The explained variable or response is also called:
A) The independent variable B) The dependent variable
C) Non-random variable D) None of these
The predictor or unexplained variable is also called:
A) The independent variable B) The dependent variable
C) Random variable D) None of these
In regression analysis, b = 2.8 indicates that the value of the dependent variable:
A) Increases by 2.8 units per unit increase in the independent variable
B) Decreases by 2.8 units per unit increase in the independent variable
C) Increases by 2.8 units per unit decrease in the independent variable
D) None of these

If Y is the observed value and $\hat{Y}$ is the estimated value (estimated by using the regression line), then
$\sum (Y - \hat{Y})$:
A) Should be zero B) Is likely to be close to zero
C) In majority of the cases would be equal to zero D) None of these
If there are two variables x and y, then the number of regression could be:
A) 1 B) 2
C) Any number D) 3
Since the blood pressure of a person depends on age, we need to consider:
A) The regression equation of blood pressure on age
B) The regression of age on blood pressure
C) Both A & B
D) Either A or B
The method applied for deriving the regression equations is known as:
A) Least squares B) Concurrent deviation
C) Product moment D) Normal equation

The difference between the observed value and the estimated value in regression analysis is known as:
A) Error B) Residue
C) Deviation D) A or B
The errors in case of regression equations are:
A) Positive B) Negative
C) Zero D) All of these
The regression line of y on x is derived by:
A) The minimization of vertical distances in the scatter diagram
B) The minimization of horizontal distances in the scatter diagram
C) Both A and B
D) A or B

The two lines of regression become identical when:


A) r = 1 B) r = -1
C) r = 0 D) A or B
What are the limits of the two regression coefficients?
A) No limit B) Must be positive
C) One positive and the other negative D) Product of the regression coefficient must be
numerically less than unity
The regression coefficients remain unchanged due to a:
A) Shift of origin B) Shift of scale
C) Both A and B D) A or B
The difference between actual and estimated value in regression analysis is known as:
A) y- intercept B) Error
C) Slope D) None of these

Which of the following represents the proportion of variation in dependent variable that is explained by
the independent variable?
A) Co-efficient of determination B) Co-efficient of correlation
C) Regression co-efficient D) None of these
As the angle between two regression lines increases the correlation co-efficient:
A) Remains same B) increases
C) Decreases D) None of these
The independent variable is also called:
A) Regressor B) Predictor
C) Regression D) All of these
In a regression problem, the independent and dependent variables are:
A) Both fixed B) Both random
C) Independent variable fixed & dependent variable random
D) Dependent variable fixed & independent variable random

if a constant amount of change in the predicted variable is associated with a unit change in the
predicting variable the relation is said to be:
A) Linear B) Non linear
C) Inverse D) None of these
The two regression co-efficient always have:
A) Opposite signs B) Same signs
C) Not definite D) No signs
Which of the following is true when the slope of a regression line is positive?
A) The correlation co-efficient between the dependent and independent variables is 1
B) The regression line is parallel to the horizontal line
C) There is positive correlation between the dependent and independent variables
D) None of these
The regression co-efficient is independent of change of:
A) Origin B) Scale
C) Both D) None of these
Regression equation is also called:
A) Predicting equation B) Estimating equation
C) Line of average relationship D) All A,B and C
The regression line of x on y derived by the method of least squares:
A) Minimizes the horizontal distances in the scatter diagram
B) Maximizes the vertical distances in the scatter diagram
C) Minimizes the vertical distances in the scatter diagram
D) Maximizes the horizontal distances in the scatter diagram
Correlation
Introduction
Correlation is the degree of covariation between variables. The correlation coefficient, denoted by r, is a
measure of the strength of the straight-line (linear) relationship between two variables. It takes values
between -1 and +1. The formulas and notes below are the accepted guidelines for computing and interpreting
the correlation coefficient.

$r = \pm\sqrt{b_{xy} \cdot b_{yx}}$

$r = \frac{S_{xy}}{S_x S_y}$

Note:
o If $b_{yx}$ and $b_{xy}$ are both positive, then r is positive.
o If $b_{yx}$ and $b_{xy}$ are both negative, then r is negative.
o The coefficient of correlation is a pure number; it does not depend on the units of measurement.
o The correlation coefficient is symmetrical with respect to X and Y, i.e. $r_{xy} = r_{yx}$.
o The correlation coefficient lies between -1 and +1, i.e. $-1 \le r \le +1$.
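
The notes above can be illustrated with a short Python sketch. It assumes a hypothetical helper `pearson_r` and borrows the price and quantity figures from the retailing practice question in the regression part of these notes; it demonstrates the symmetry, range and unit-free behaviour of r on one data set and is not a proof.

```python
from math import sqrt, isclose


def pearson_r(xs, ys):
    """r = S_xy / (S_x * S_y); the (n - 1) divisors cancel, so plain sums are used."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Illustrative data (assumed for this example): price and quantity sold
price = [25, 45, 30, 50, 35, 40, 65, 75, 70, 60]
quantity = [118, 105, 112, 100, 111, 108, 95, 88, 91, 96]

r = pearson_r(price, quantity)
r_swapped = pearson_r(quantity, price)                       # symmetry: r_xy = r_yx
r_shifted = pearson_r([p - 5 for p in price],
                      [q + 3 for q in quantity])             # change of origin
r_scaled = pearson_r([p * 100 for p in price],
                     [q / 10 for q in quantity])             # change of scale (positive factors)
r_flipped = pearson_r([-p for p in price], quantity)         # negative factor reverses the sign

print(f"r = {r:.4f}")
print("symmetric:", isclose(r, r_swapped))
print("unchanged by shift:", isclose(r, r_shifted))
print("unchanged by positive scaling:", isclose(r, r_scaled))
print("sign reversed by negative scaling:", isclose(r, -r_flipped))
```

Here r comes out negative, which matches the scatter of the data: the quantity sold falls as the price rises.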

Scatter diagram (A rough measure of Relationship Between two variables)


A scatter diagram (also called a scatter plot or X-Y graph) plots pairs of numerical data, with one variable
on each axis, to look for a relationship between them.
o If the values of two different variables (say x and y) are plotted on rectangular axes, the
plot is referred to as a scatter diagram.
o If inspection of the scatter diagram shows that the points closely follow a straight line, the two
variables are to some extent linearly related.
o If the points closely follow a straight line of positive slope, the two variables are said to have
high positive correlation.
o If the points closely follow a straight line of negative slope, the two variables are said to have
high negative correlation.
o If the points follow a strictly random pattern, the two variables are said to have no linear
relationship.
o If two variables tend to increase (or decrease) together, the correlation is said to be direct or
positive.
o If one variable tends to increase as the other variable decreases, the correlation is said to be inverse or
negative.
o The correlation coefficient remains unchanged when a constant is added to or subtracted from every value
of a variable, or when every value is multiplied or divided by a positive constant. In other words, the
coefficient of correlation is independent of origin and scale. (Multiplying one variable by a negative
constant reverses the sign of r.)
o If all the plotted points in a scatter diagram lie on a single straight line, the correlation is perfect:
perfect positive if the line slopes upward and perfect negative if it slopes downward.
o If the plotted points in a scatter diagram run from upper left to lower right, the correlation is
negative.
o The correlation between shoe size and intelligence is zero.
o Spurious correlation is correlation between two variables that have no causal relation.
Spearman’s correlation
The Spearman rank-order correlation coefficient (Spearman's correlation, for short) is a nonparametric
measure of the strength and direction of association between two variables measured on at least an
ordinal scale. It is used for ordinal variables, or for continuous data that fail the assumptions
necessary for Pearson's product-moment correlation. For example, you could use Spearman's
correlation to understand whether there is an association between exam performance and time spent revising.

$r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$

where d is the difference between the two ranks of each pair and n is the number of pairs.
The ranks given by two judges are as follows:
Judge x 1 2 3 4 5 6 7 8
Judge y 6 5 1 4 3 2 8 7

Find Spearman's rank correlation.
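
As a worked illustration, the rank-correlation formula can be applied to the two judges' ranks above. The function name `spearman_rank_correlation` is hypothetical, and the sketch assumes the inputs are already ranks with no ties, which is when this simple formula applies.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), for ranks without ties."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two rankings of equal length with at least 2 items")
    # d is the difference between the two ranks of each item
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


judge_x = [1, 2, 3, 4, 5, 6, 7, 8]
judge_y = [6, 5, 1, 4, 3, 2, 8, 7]

r = spearman_rank_correlation(judge_x, judge_y)
print(round(r, 4))   # 0.2857
```

The squared rank differences add up to 60, so r = 1 - 360/504, roughly 0.29, which indicates only weak agreement between the two judges. For raw scores such as marks, the values would first have to be converted to ranks.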


Ten students got the following percentage marks in mathematics and physics. Find the rank correlation.
Mathematics x 8 36 98 25 75 82 92 62 65 35
Physics y 84 51 91 60 68 62 86 58 35 49

Find the coefficient of rank correlation for the following data

A 35 40 42 43 40 53 54 49 41 55
B 102 101 97 98 38 101 97 92 95 95
PRACTICE QUESTION
1. The two regression coefficients have the following values: $b_{yx} = 0.86$, $b_{xy} = 0.95$. Find r.
2. Find the coefficient of correlation if the two regression coefficients have the values -0.1 and -0.4.
3. Given r = 0.60, $S_x^2 = 9$ and $b_{xy} = 0.80$, find $S_y$.
4. For a given set of data, r = 0.48, $S_x^2 = 16$ and $S_{xy} = 36$. Find $S_y$.
5. For a set of 50 pairs of observations, the standard deviations of X and Y are 4.5 and 3.5
respectively. If the sum of the products of the deviations of the X and Y values from their respective
means is 420, find Karl Pearson's coefficient of correlation.
The correlation coefficient between two variables:
A) is a unit free measure B) Is expressed as product of units of two variables
C) Is expressed in units of first variable D) Is expressed in units of second variable
Correlation co-efficient is independent of change of origin:
A) Is always false B) Is always true
C) can be false D) Can be true
In rank correlation the association should be linear:
A) False B) True
C)A&B D) None of these
If the values of two different variables (say x and y) are plotted on a rectangular axes, such a plot is
referred to as a:
A) Frequency diagram B) Value diagram
C) Scatter diagram D) None of these
From the inspection of scatter diagram if it is seen that the points follow closely a straight line, it
indicates that the two variables are to some extent:
A) Unrelated B) Related
C) Linearly related D) None of these
In a scatter diagram, if the points follow closely a straight line of positive slope, the two variables are
said to have:
A) No correlation B) High positive correlation
C) Negative correlation D) None of these
In a scatter diagram, if the points follow clearly a straight line of negative slope, the two variables are
said to have:
A) No correlation B) High positive correlation
C) High negative correlation D) None of these
In a scatter diagram, if the points follow a strictly random pattern, the two variables are said to have:
A) No linear relationship B) Low positive relationship
C) Low negative relationship D) None of these
A measure of the strength or degree of relationship or the interdependence is called:
A) Correlation B) Regression
C) Least square estimate D) None of these
The phenomenon that investigates the dependence of one variable on one or more independent
variables is called:
A) Correlation B) Regression
C) Least square estimate D) None of these
Which of the statements about Spearman’s Co-efficient of Correlation is NOT correct:
A) It can co-relate two or more set of rankings B) It applies only when no ties exist
C) Both (a) and (b) D) None of the above
If two variables tend to vary simultaneously in some direction, they are said to be:
A) Dependent B) Independent
C) Correlated D) None of these
If two variable tends to increase (or decrease) together, the correlation is said to be
A) Zero B) Direct or positive
C) 1 D) None of these
If one variable tends to increase as the other variable decreases, the correlation is said to be:
A) Zero B) Inverse or negative

C) -1 D) None of these
While calculating r, if x and y are interchanged, i.e. $r_{yx}$ is calculated instead of $r_{xy}$, then:
A) $r_{xy} = r_{yx}$ B) $r_{xy} > r_{yx}$
C) $r_{xy} < r_{yx}$ D) None of these
Limits of the co-efficient of Correlation are:
A) -1 to 0 B) 0 to 1
C) -1 to +1 D) None of these
If r = 0.9 and if 5 is subtracted from each observation of x, then r will:
A) Decrease by 5 units B) Decreases by less than 5 units
C) Remain unchanged D) None of these
If r = 0.9 and if 5 is added to each observation of x, then r will:
A) Increase by 5 units B) Increase by more than 5 units
C) Remain unchanged D) None of these
If r = 0.9 and if 3 is subtracted from each observation of Y, then r will:
A) Decrease by 3 units B) Decrease by less than 3 units
C) Remain unchanged D) None of these
If r = 0.9 and if 3 is added to each observation of y, then r will:
A) Increase by 3 units B) Increase by more than 3 units
C) Remain unchanged D) None of these
If r = 0.9 and if 3 is subtracted from each observation of x and 5 is added to each observation of y, then
r will:
A) Decrease by 2 units B) Increase by 2 units
C) Remain unchanged D) None of these
If r = 0.9 and each observation of x is multiplied by 100, then r will:
A) Increase by 100 times B) Less than 100 times
C) Remain unchanged D) None of these
If r = 0.9 and each observation of Y is divided by 10, then r will:
A) Decrease by 10 times B) Decrease by less than 10 times
C) Remain unchanged D) None of these
If r = 0.9 and each observation of x and y is divided by 10, then r will:
A) Decrease by 10 times B) Decrease by 100 times
C) Remain unchanged D) None of these
The co-efficient of correlation is independent of:
A) Only origin B) Only scale
C) Origin and scale D) None of these
The geometric mean of two regressions co-efficient is equal to:
A) Co-efficient of determination B) Co-efficient of correlation
C) Co-efficient of rank correlation D) None of these
If $b_{xy} = -0.78$ and $b_{yx} = -0.45$, then r is equal to:
A) +0.351 B) -0.351
C) Cannot be determined D) None of these
If $b_{xy} = -0.78$ and $b_{yx} = 0.45$, then r is equal to:
A) +0.351 B) -0.351
C) Cannot be determined D) None of these
If $b_{xy} = +1.93$ and $b_{yx} = 0.6$, then r is equal to:
A) 1.158 B) 1.0761
C) Data is fictitious D) None of these
If $b_{xy} = 1.93$ and $b_{yx} = 0.51$, then r is equal to:
A) 0.9843 B) 0.992
C) Data is fictitious D) None of these
If $b_{xy} = -1.93$ and $b_{yx} = 0.51$, then r is equal to:
A) -0.9843 B) -0.992
C) Data is fictitious D) None of these
The quantity which describes that the proportion (or percentage) of variation in the dependent
variable explained (or reduced) by the independent variable is called:
A) Co-efficient of determination B) Co-efficient of regression
C) Co-efficient of correlation D) None of these
If r = 0.8, then the variation in the dependent variable y due to independent variable x is about:
A) 80% B) 64%
C) 64% to 80% D) None of these
If r = 0.8 and $b_{yx} = 1.04$, then $b_{xy}$ is equal to:
A) 0.769 B) 0.615
C) Cannot be determined D) None of these
If $r^2 = 0.796$ and $b_{xy} = -1.04$, then $b_{yx}$ is equal to:
A) 0.765 B) -0.765
C) Cannot be determined D) None of these
Correlation analysis aims at:
A) Predicting one variable for a given value of the other variable
B) Establishing a relation between two variables
C) Measuring the extent of relation between two variables
D) Both B & C
What is spurious correlation?
A) It is a bad relation between two variables
B) It is very low correlation between two variables
C) It is the correlation between two variables having no causal relation
D) It is negative correlation
Scatter diagram is considered for measuring:
A) Linear relationship between two variables B) Curvilinear relationship between two variables
C) Neither A or B D) Both A & B
If the plotted points in a scatter diagram lie from upper left to lower right, then the correlation is:
A) Positive B) Zero
C) Negative D) None of these
If the plotted points in a scatter diagram are evenly distributed, then the correlation is:
A) Zero B) negative
C) Positive D) None of these
If all the plotted points In a scatter diagram lie on a single line, then the correlation is:
A) Perfect positive B) Perfect negative
C) Both A & B D) Either A or B
The correlation between shoe size and intelligence is:
A) Zero B) Positive
C) Negative D) None of these
The correlation between the speed of an automobile and the distance travelled by it after applying the
brakes is:
A) Negative B) Zero
C) Positive D) None of these
A scatter diagram helps us to:
A) Find the nature of correlation between two variables
B) Compute the extent of correlation between two variables
C) Obtain the mathematical relationship between two variables
D) Both A & C
Pearson's correlation is used for finding:
A) Correlation for any type of relation B) Correlation for linear relation only
C) Correlation for curvilinear relation only D) Both A & C
Product moment correlation coefficient is considered for:
A) Finding the nature of correlation B) Finding the amount of correlation
C) Both A & B D) Either A or B
If the value of correlation coefficient is positive, then the points in a scatter diagram tend to cluster:
A) From lower left corner to upper right corner B) From lower left corner to lower right corner
C) From lower right corner to upper left corner D) From lower right corner to upper right corner
Product moment correlation coefficient may be defined as the ratio of:
A) The product of the standard deviations of the two variables to the covariance between them
B) The covariance between the variables to the product of their variances
C) The covariance between the variables to the product of their standard deviations
D) Either B or C
The covariance between two variables is:
A) Strictly positive B) Strictly negative
C) Always 0 D) Either positive or negative or zero
Which of the following is NOT a possible value of the correlation coefficient?
A) Negative 0.9 B) Zero
C) positive 0.15 D) Positive 1.5
The coefficient of correlation between two variables:
A) Can have any unit
B) Is expressed as the product of units of the two variables
C) Is a unit-free measure
D) None of these
What are the limits of the correlation coefficient?
A) No limit B) -1 and 1
C) 0 and 1, including the limits D) -1 and 1, including the limits
For finding correlation between two attributes, we consider:
A) Pearson's correlation coefficient B) Scatter diagram
C) Spearman’s rank correlation coefficient D) Coefficient of concurrent deviations
For finding the degree of agreement about beauty, between two judges in a Beauty Contest, we use:
A) Scatter diagram B) Coefficient of rank correlation
C) Coefficient of correlation D) Coefficient of concurrent deviation
If there is a perfect disagreement between the marks in Geography and Statistics, then what would be the
value of rank correlation coefficient?
A) Any value B) Only 1
C) Only -1 D) B or C
When we are not concerned with the magnitude of the two variables under discussion, we consider:
A) Rank correlation coefficient B) Product moment correlation coefficient
C) Coefficient of concurrent deviation D) A or B but not c
What is the quickest method to find correlation between two variables?
A) Scatter diagram B) Method of concurrent deviation
C) Method of rank correlation D) Method of product moment correlation
What are the limits of the coefficient of concurrent deviations?
A) no limit B) Between -1 and 0, including the limiting values
C) Between 0 and 1, including the limiting values D) Between -1 and 1, including the limiting values
The correlation between Shoes size and IQ:
A) Positive B) Negative
C) Might be any D) None of these
If the coefficient of determination is positive, then r:
A) Must be positive B) Must be negative
C) Might be any D) None of these
Bivariate data are the data collected for:
A) One variable B) Two variables at different points of time
C) Two variables at the same point of time D) None of these
Correlation analysis is used to:
A) Predict one variable for a given value of the other variable
B) Establish a relationship between two variables
C) Measure the extent of relation between two variables
D) Both B & C
If the plotted points in a scatter diagram lie from lower left to upper right, then correlation is:
A) Negative B) Positive
C) Perfect negative D) Perfect positive
Sign of product moment correlation co-efficient depends on:
A) Variance of X B) variance of Y
C) Co-variance D) Product of two standard deviation
Karl Pearson's correlation is the ratio of:
A) Two variances
B) The product of the standard deviations of two variables to the covariance
C) The covariance to the product of the standard deviations of two variables
D) The covariance to the product of the variances of two variables
Which of the following methods takes the magnitude of the observations into account:
A) Scatter diagram B) Correlation
C) Rank correlation D) All of these
