
Correlation

The document explains how to calculate correlation coefficients and perform linear regression analysis using Excel's Analysis Toolpak. It details the steps to find correlations between variables, interpret regression outputs, and assess statistical significance through R Square and P-values. Additionally, it covers generating descriptive statistics for a dataset.

Uploaded by

anuanamika0220
Copyright
© All Rights Reserved

Correlation

The correlation coefficient (a value between -1 and +1) tells you how
strongly two variables are related to each other. We can use the CORREL
function or the Analysis Toolpak add-in in Excel to find the correlation
coefficient between two variables.

- A correlation coefficient of +1 indicates a perfect positive correlation. As
variable X increases, variable Y increases. As variable X decreases,
variable Y decreases.

- A correlation coefficient of -1 indicates a perfect negative correlation. As
variable X increases, variable Z decreases. As variable X decreases,
variable Z increases.

- A correlation coefficient near 0 indicates no linear correlation.
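
The three cases above can be checked with a small calculation. The following is a minimal Python sketch of the Pearson correlation coefficient, which is what Excel's CORREL function returns. The function name pearson and the sample lists x, y and z are made up for illustration and are not taken from the worksheet.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of products of deviations, scaled by the spread of both variables.
    covariance = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / spread

# Illustrative data only (not from the worksheet).
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]   # rises whenever x rises
z = [10, 8, 6, 4, 2]   # falls whenever x rises

r_xy = pearson(x, y)   # perfect positive correlation
r_xz = pearson(x, z)   # perfect negative correlation
print(r_xy, r_xz)
```
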

To use the Analysis ToolPak add-in in Excel to quickly generate correlation
coefficients between multiple variables, execute the following steps.

1. On the Data tab, in the Analysis group, click Data Analysis.

Note: if you can't find the Data Analysis button, load the Analysis
ToolPak add-in first.

2. Select Correlation and click OK.

3. For example, select the range A1:C6 as the Input Range.

4. Check Labels in first row.

5. Select cell A8 as the Output Range.


6. Click OK.

Result.

Conclusion: variables A and C are positively correlated (0.91). Variables A
and B are not correlated (0.19). Variables B and C are also not correlated
(0.11). You can verify these conclusions by looking at the graph.

Regression

This example teaches you how to run a linear regression analysis in Excel
and how to interpret the Summary Output.
Below you can find our data. The big question is: is there a relation
between Quantity Sold (Output) and Price and Advertising (Input)? In other
words: can we predict Quantity Sold if we know Price and Advertising?

1. On the Data tab, in the Analysis group, click Data Analysis.

Note: if you can't find the Data Analysis button, load the Analysis
ToolPak add-in first.

2. Select Regression and click OK.

3. Select the Y Range (A1:A8). This is the response variable (also called
dependent variable), the one you want to predict.

4. Select the X Range (B1:C8). These are the explanatory variables (also
called independent variables). These columns must be adjacent to each
other.

5. Check Labels.

6. Click in the Output Range box and select cell A11.

7. Check Residuals.

8. Click OK.

Excel produces the following Summary Output (rounded to 3 decimal
places).
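
To see what the Regression tool computes behind the dialog, here is a minimal Python sketch of ordinary least squares that solves the normal equations directly. It is an illustration of the technique, not a description of Excel's internal algorithm. The function name fit_linear_regression and the data are hypothetical; the y values were constructed to satisfy y = 10 - 2 * price + 0.5 * advertising exactly, so the fit should recover those three numbers.

```python
def fit_linear_regression(rows, ys):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    # Design matrix with a leading 1 in each row for the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Normal equations (X^T X) beta = X^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(X, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

# Hypothetical (price, advertising) rows and matching y values.
rows = [(1, 10), (2, 10), (3, 20), (4, 30), (5, 20)]
ys = [13, 11, 14, 17, 10]

coefficients = fit_linear_regression(rows, ys)
print([round(c, 6) for c in coefficients])
```
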

R Square
R Square equals 0.962, which is a very good fit. 96% of the variation in
Quantity Sold is explained by the independent variables Price and
Advertising. The closer to 1, the better the regression line (read on) fits
the data.
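
R Square can be computed from the actual and predicted values as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that calculation in Python. The lists actual and predicted are invented for illustration and are unrelated to the Quantity Sold data.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Variation left unexplained by the model.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation around the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical values.
actual = [10, 12, 15, 19, 24]
predicted = [9, 13, 16, 19, 23]

r2 = r_squared(actual, predicted)
print(round(r2, 3))
```
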

Significance F and P-values
To check if your results are reliable (statistically significant), look at
Significance F (0.001). If this value is less than 0.05, you're OK. If
Significance F is greater than 0.05, it's probably better to stop using this
set of independent variables. Delete a variable with a high P-value
(greater than 0.05) and rerun the regression until Significance F drops
below 0.05.

Most or all P-values should be below 0.05. In our example this is the
case (0.000, 0.001 and 0.005).
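
Significance F is the p-value of the overall F test, and it can be derived from R Square. The Python sketch below assumes 7 observations (inferred from the Y Range A1:A8 with a label row) and 2 independent variables. With exactly 2 predictors the F distribution has a simple closed-form tail probability, which is used here; for any other number of predictors a statistics library would be needed. The function names are made up for illustration.

```python
def f_statistic(r2, n, k):
    """Overall F statistic from R Square, n observations and k predictors."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def significance_f_two_predictors(r2, n):
    """p-value of the overall F test; valid only for k = 2 predictors."""
    k = 2
    d2 = n - k - 1  # residual degrees of freedom
    f = f_statistic(r2, n, k)
    # Upper tail of the F(2, d2) distribution: (d2 / (d2 + 2f)) ** (d2 / 2)
    return (d2 / (d2 + 2 * f)) ** (d2 / 2)

# R Square from the Summary Output; n = 7 is an inferred assumption.
f_value = f_statistic(0.962, 7, 2)
p_value = significance_f_two_predictors(0.962, 7)
print(round(f_value, 2), round(p_value, 3))
```

Under these assumptions the p-value rounds to 0.001, in line with the Significance F reported above.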

Coefficients
The regression line is: y = Quantity Sold = 8536.214 - 835.722 * Price
+ 0.592 * Advertising. In other words, for each unit increase in Price,
Quantity Sold decreases by 835.722 units, with Advertising held constant.
For each unit increase in Advertising, Quantity Sold increases by 0.592
units, with Price held constant. This is valuable information.

You can also use these coefficients to do a forecast. For example, if Price
equals $4 and Advertising equals $3000, you might be able to achieve a
Quantity Sold of 8536.214 - 835.722 * 4 + 0.592 * 3000 = 6969.326, or
about 6970 units.

Residuals
The residuals show you how far away the actual data points are from the
predicted data points (using the equation). For example, the first data
point equals 8500. Using the equation, the predicted data point equals
8536.214 - 835.722 * 2 + 0.592 * 2800 = 8523.009, giving a residual of
8500 - 8523.009 = -23.009. (The rounded coefficients shown here give
8522.370; the small difference presumably comes from Excel using
unrounded coefficients.)
You can also create a scatter plot of these residuals.
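
The forecast and residual calculations above can be reproduced in a few lines of Python. This sketch uses the coefficients as printed in the Summary Output, rounded to 3 decimal places, so its results can differ slightly from Excel's own figures. The function names predict_quantity and residual are made up for illustration.

```python
# Coefficients from the Summary Output, rounded to 3 decimal places.
INTERCEPT = 8536.214
PRICE_COEF = -835.722
ADVERTISING_COEF = 0.592

def predict_quantity(price, advertising):
    """Predicted Quantity Sold from the fitted regression line."""
    return INTERCEPT + PRICE_COEF * price + ADVERTISING_COEF * advertising

def residual(actual, price, advertising):
    """Actual value minus the value predicted by the regression line."""
    return actual - predict_quantity(price, advertising)

# Forecast for Price = 4 and Advertising = 3000.
forecast = predict_quantity(4, 3000)

# Residual for the first data point (actual 8500, Price 2, Advertising 2800).
first_residual = residual(8500, price=2, advertising=2800)

print(round(forecast, 3), round(first_residual, 3))
```

A negative residual means the actual value lies below the regression line; a positive one means it lies above.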

Descriptive Statistics
You can use the Analysis Toolpak add-in to generate descriptive
statistics. For example, you may have the scores of 14 participants for a
test.
To generate descriptive statistics for these scores, execute the following
steps.

1. On the Data tab, in the Analysis group, click Data Analysis.

Note: if you can't find the Data Analysis button, load the Analysis
ToolPak add-in first.

2. Select Descriptive Statistics and click OK.

3. Select the range A2:A15 as the Input Range.

4. Select cell C1 as the Output Range.


5. Make sure Summary statistics is checked.

6. Click OK.

Result:
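
For comparison, most of the same summary figures can be computed with Python's standard statistics module. This is a minimal sketch: the 14 scores are hypothetical because the worksheet values are not reproduced here, and skewness and kurtosis, which Excel also reports, are left out.

```python
import math
import statistics

# Hypothetical scores for 14 participants.
scores = [82, 75, 91, 68, 75, 88, 79, 95, 70, 84, 75, 90, 66, 82]

def describe(values):
    """Summary statistics similar to Excel's Descriptive Statistics output."""
    stdev = statistics.stdev(values)  # sample standard deviation
    return {
        "Mean": statistics.mean(values),
        "Standard Error": stdev / math.sqrt(len(values)),
        "Median": statistics.median(values),
        "Mode": statistics.mode(values),
        "Standard Deviation": stdev,
        "Sample Variance": statistics.variance(values),
        "Range": max(values) - min(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": sum(values),
        "Count": len(values),
    }

summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```
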
