Corr and Regress

PROBABILITY AND STATISTICS

Uploaded by

Ian Miles Daligdig

Correlation and Regression
Dr. Anil V Dusane
Sir Parashurambhau College, Pune, India
www.careerguru.co.com

* The relationship between variables is called correlation.
* Correlation analysis is a statistical tool that measures the closeness or strength of the relationship between the variables.
* In correlation, the two variables are inter-dependent or co-vary, and we cannot distinguish between the independent and the dependent variable. E.g. birth weight and maternal height, drug intake and number of days taken to cure, etc.
* Correlation analysis not only establishes a relationship but also quantifies it.
* Correlation is unable to indicate a cause-and-effect relationship between the variables.

On the basis of the nature of the relationship between the variables, correlation can be categorized as:
1. Positive and negative correlation
2. Simple, partial and multiple correlation
3. Linear and non-linear correlation

Positive correlation:
[Figure: plot with No. of days on one axis]
* This correlation is also called direct correlation.
* In this, an increase or decrease in the value of one variable is associated with an increase or decrease, respectively, in the value of the other.
* Both variables move in the same direction.
* E.g. number of tillers and plant yield in wheat, plant yield and number of pods, number of days and height of the plant, etc.

Negative correlation:
[Figure: plot with Supply (Tonnes) on one axis]
* In this, an increase in one variable is associated with a proportionate decrease in the other variable.
* Here the two variables move in opposite directions.
* E.g. supply and price of a commodity. If the supply of the commodity is high, the price falls, and if the commodity is scarce, the price goes up. There is a negative relationship between supply and price.

Simple, partial and multiple correlation:
* Depending on the number of variables, correlation is classified into simple, partial and multiple correlation.
1. Simple correlation:
* Only two variables are involved, and these two variables are considered at a time.
* E.g. yield of wheat and the amount (dose) of fertilizer.
2. Partial correlation:
* The relationship between three or more variables is studied.
* Only two variables are taken into consideration and the effect of the other variables is excluded.
* E.g. the yield of maize and the amount of fertilizer applied to it are taken into consideration, and the effect of other variables such as pesticides, type of soil, availability of water, etc. is not taken into consideration.
3. Multiple correlation:
* Three or more variables are studied simultaneously.
* Multiple correlation consists of measuring the relationship between a dependent variable and two or more independent variables.
* Partial and multiple correlation are mainly associated with multivariate analysis.
* E.g. the relationship between agricultural production, rainfall and quantity of fertilizer used.

Linear and non-linear correlation:
* The difference between these two is based on the ratio of change between the variables under study.
* Linear correlation: the values have a constant ratio. E.g. X = 30, 60, 90 and Y = 10, 20, 30.
* Non-linear correlation: the amount of change in one variable does not have a constant ratio to the change in the other related variable. E.g. if the use of fertilizer is doubled, the yield of the maize crop would not be exactly doubled.

Measures of correlation:
* There are several measures of correlation, but the following three are important.
1. Scatter diagram
2. Graph method
3. Correlation coefficient

Scatter diagram:
* This is the simplest method for checking whether there is any relationship between two variables, by plotting the values on a chart or graph.
* It is nothing but a visual representation of two variables by points (dots) on a graph.
* In a scatter diagram one variable is taken on the X-axis and the other on the Y-axis, and the data are represented in the form of points.
* It is called a scatter diagram because it shows the scatter of the various points (observations).
* A scatter diagram gives a general idea about the existence of correlation between two variables and the type of correlation, but it does not give a numerical value of the correlation.
* Depending on the extent of the relationship between the two variables, scatter diagrams show perfect correlation, perfect negative correlation, no correlation, high positive correlation and high negative correlation.

Merits of scatter diagram:
1. It is a simple method to find out the nature of the correlation between two variables.
2. It is not influenced by extreme limits.
3. It is easy to understand.

Demerits:
1. It does not give a numerical value of correlation, so it cannot give the exact degree of correlation between two variables.
2. It is a subjective method.
3. It cannot be applied to qualitative data.

Correlation coefficient:
* The scatter diagram and the graphic method give only a rough idea about the relationship between two variables and do not give a numerical measure of correlation.
* The degree of relationship can be established by calculating Karl Pearson's coefficient, which is denoted by 'r'.
* Definition: the coefficient of correlation 'r' is a measure of the strength of the linear relationship between the two variables X and Y.

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$

* where X = independent variable
* Y = dependent variable
* $X - \bar{X}$ = deviation of X from its arithmetic mean (AM)
* $Y - \bar{Y}$ = deviation of Y from its mean
* If r > 0 the correlation is positive, and if r < 0 the correlation is negative.
* If r = 0 the variables are not linearly related.
* The larger the numerical value of 'r', the closer the relationship between the variables.
* If r = 1, there is a perfect positive relationship.
* If r = -1, there is a perfect negative relationship.
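As a rough illustration of the definition of 'r' above, the deviations-from-the-mean calculation can be sketched in Python. The function name `pearson_r` is an assumption made for this sketch, and the supply and price figures are invented for illustration. Only the X = 30, 60, 90 and Y = 10, 20, 30 values come from the linear-correlation example in the slides.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's r computed from deviations about the arithmetic means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations X - mean(X)
    dy = [y - mean_y for y in ys]  # deviations Y - mean(Y)
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator


# Values with a constant ratio, taken from the linear-correlation example.
r_linear = pearson_r([30, 60, 90], [10, 20, 30])

# Hypothetical supply (tonnes) and price figures, invented for illustration.
r_negative = pearson_r([20, 40, 60, 80], [95, 80, 50, 35])

print("constant-ratio example:", round(r_linear, 4))
print("hypothetical supply vs price:", round(r_negative, 4))
```

The constant-ratio data should give a coefficient of +1 (up to rounding), while the invented supply and price data give a value close to -1, matching the sign rules listed above.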
* In general, for |r| > 0.8 we can say that there is high correlation.
* If |r| is between 0.3 and 0.8, there is considerable correlation.
* If |r| < 0.3, we can say that the correlation is negligible.

The value of r ranges between -1 and +1:
* If there is no relationship at all between the two variables, the value is zero.
* On the other hand, if the relationship is perfect, which means that all the points on the scatter diagram fall on a straight line, the value of r is +1 or -1, depending on the direction of the line.
* Other values of r show an intermediate degree of relationship between the two variables.

Sign of the coefficient can be positive or negative:
* It is positive when the slope of the line is positive, and negative when the slope of the line is negative.
* If the value of Y increases as the value of X increases, the sign will be positive. If the value of Y decreases as the value of X increases, the slope will be negative, and so the coefficient of correlation will be negative.

Merits of the correlation coefficient:
1. It is a numerical measure of correlation.
2. It gives a single value which summarizes the extent of the linear relationship.
3. It also indicates the type of correlation.
4. It depends on all the observations, so it gives a true picture.

Demerits:
1. It cannot be computed for qualitative data such as flower colour, honesty, beauty, intelligence, etc.
2. It measures only linear relationships and fails to measure non-linear relationships.
3. It is difficult to calculate.

Applications of correlation:
* In agriculture, genetics, physiology, medicine, etc. correlation is used as a tool of analysis.

Agriculture:
* Correlation is widely used as a tool of analysis in agricultural sciences.
* E.g. to estimate the role of various variables (factors) such as fertilizers, irrigation, fertility of soil, etc. in crop yield.

Physiology:
* Using regression and correlation analysis, the relationship between germination time and temperature of soil, alkalinity of river water and growth of fungi, etc. can be estimated.

Genetics:
* Correlation analysis finds a lot of application in genetics.
* For instance, when 'r' (the correlation coefficient) = 0, it indicates that the genes concerned are located at a distance on the same chromosome.
* When r = 1, it indicates that the genes are linked. Thus, correlation analysis is very important in gene mapping.

Types of scatter diagram:
* Depending on the extent of the relationship between the two variables, scatter diagrams show perfect correlation, perfect negative correlation, no correlation, high positive correlation and high negative correlation.

Perfect correlation:
* All the points lie on a straight line.
* As the value on the X-axis increases, the value on the Y-axis also increases, and vice versa.
* E.g. height and biomass.

Perfect negative correlation:
* All the points lie on a straight line.
* As the value on the X-axis increases, the value on the Y-axis decreases proportionately.
* E.g. water temperature and amount of dissolved oxygen.

No correlation:
* No line can be drawn that passes through most of the plotted points, and the points are totally scattered.
* Hence there is no correlation between the variables on the X- and Y-axes.

High positive correlation:
* Most of the plotted points lie on the line and the others lie near it.

High negative correlation:
* The diagram shows high negative correlation when the line slopes downward (makes an angle of more than 90° with the X-axis) and most of the points either lie on the straight line or in its close vicinity.

Regression:
* This term was first used by Sir Francis Galton to describe the laws of human inheritance.
* Regression describes the linear relationship in quantitative terms.
* It is used to make predictions about one variable based on our knowledge of the other.
* Regression is divided into two categories, i.e. simple regression and multiple regression.
* Simple regression is concerned with two variables, while multiple regression is concerned with more than two variables.
* Simple regression is further classified into linear and non-linear regression.
* A linear regression is one in which some change in the dependent variable (Y) can be expected for a change in the independent variable (X), irrespective of the values of Y.
* In studying the way in which the yield of wheat varies in relation to a change in the amount of fertilizer applied, yield is the dependent variable (Y) and fertilizer level is the independent variable (X).
* The starting point in regression is to illustrate the relationship between the dependent variable (e.g. weight) and the independent variable (e.g. age) by a scatter diagram.

Uses of regression analysis:
* Regression analysis is widely used for prediction and forecasting.
* It is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships.
* In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables.

Linear regression:
* In statistics, linear regression includes any approach to modelling the relationship between a scalar variable y and one or more variables denoted X, such that the model depends linearly on the unknown parameters to be estimated from the data.
* Such a model is called a "linear model".
* Linear regression has many practical applications.
* This is because models that depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters.
* Linear regression is widely used in biological, behavioural and social sciences to describe possible relationships between variables.
* It ranks as one of the most important tools used in these disciplines.

Prediction or forecasting:
* Linear regression can be used to fit a predictive model to an observed data set of y and X values.
* After developing such a model, if an additional value of X is then given without its accompanying value of y, the fitted model can be used to make a prediction of the value of y.

Epidemiology:
* Early evidence relating tobacco smoking to mortality and morbidity came from observational studies employing regression analysis.
* In order to reduce spurious correlations when analyzing observational data, researchers usually include several variables in their regression models in addition to the variable of primary interest.
* For example, suppose we have a regression model in which cigarette smoking is the independent variable of interest, and the dependent variable is lifespan measured in years.

Environmental science:
* Linear regression finds application in a wide range of environmental sciences.
* In Canada, the Environmental Effects Monitoring Program uses statistical analyses on fish and benthic surveys to measure the effects of pulp mill or metal mine effluent on the aquatic ecosystem.
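The prediction step described under "Prediction or forecasting" can be sketched in Python for the simple case of one X variable. The slides do not name a fitting method, so ordinary least squares is assumed here. The names `fit_line` and `predict` and the fertilizer and yield figures are all hypothetical, chosen only to echo the wheat example.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all X values are equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predict y for a new x from the fitted line."""
    return intercept + slope * x


# Hypothetical data: fertilizer dose (X) and wheat yield (Y), invented here.
fertilizer = [0, 20, 40, 60, 80]
wheat_yield = [2.0, 2.6, 3.1, 3.5, 4.1]

a, b = fit_line(fertilizer, wheat_yield)
predicted = predict(a, b, 50)  # a new X value with no observed y
print("intercept:", round(a, 4), "slope:", round(b, 4))
print("predicted yield at dose 50:", round(predicted, 4))
```

The fitted line only describes the range of X that was observed. Predictions for doses far outside that range would rest on the untested assumption that the relationship stays linear.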
