Micromod 4

microbiology

Uploaded by

Bhavya Sree
Copyright
© © All Rights Reserved

Pearson Correlation Test in Excel, Linear Regression, Non-parametric Tests:
Mann-Whitney U Test, Wilcoxon Signed-Rank Test, Kruskal-Wallis Test, Chi-Square Test

MOHAMED JABIR PK
ASSISTANT PROFESSOR
DEPARTMENT OF COMPUTER SCIENCE
MAJLIS ARTS AND SCIENCE COLLEGE
What is a Pearson correlation test?

1. A Pearson correlation is a statistical test to determine the
association between two continuous variables.
2. The output is given as the Pearson correlation coefficient (r)
which is a value ranging from -1 to 1 to indicate the
strength of the association.
3. The following values of r indicate the direction and strength
of the association.
● r = -1: A perfect negative association
● r = 0: No linear association
● r = +1: A perfect positive association
How to perform a Pearson correlation test in Excel

● In Excel, there is a function available to calculate the
Pearson correlation coefficient. However, there is no simple
means of calculating a p-value for this.
● A way around this is to first calculate a t-statistic, which
is then used to determine the p-value.
1. Calculate the Pearson correlation coefficient in Excel
In Excel, click on an empty cell where you want the correlation
coefficient to be entered. Then enter the following formula.
=PEARSON(array1, array2)
Simply replace 'array1' with the range of cells containing the first
variable and replace 'array2' with the range of cells containing the
second variable.
2. Calculate the t-statistic from the coefficient value
1. The next step is to convert the Pearson correlation coefficient
value to a t-statistic. To do this, two components are required: r
and the number of pairs in the test (n).
2. To determine the number of pairs, count them manually or use
the COUNT function (=COUNT).
3. Every observation must have a value for both variables, so remove
any rows where one of the two values is missing.
The equation used to convert r to the t-statistic is:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
The equivalent formula in Excel is:
=(r*SQRT(n-2))/(SQRT(1-r^2))
Simply replace 'r' with the correlation coefficient value and
replace 'n' with the number of pairs in the analysis.
The p-value is then read from a t distribution with n-2 degrees of freedom.
For the example above, the Pearson correlation coefficient (r) is '0.76'.
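As a cross-check on the Excel steps, the same two calculations can be sketched in Python. This is only an illustration: the function names and the five data pairs are made up for the example and are not the data behind the r = 0.76 result.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient (the quantity Excel's PEARSON returns)."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """Convert r to a t-statistic: t = r*sqrt(n-2)/sqrt(1-r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical paired observations (illustration only)
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
t = t_from_r(r, len(x))
print(f"r = {r:.2f}, t = {t:.4f}, df = {len(x) - 2}")
```

The p-value itself is left to Excel or a statistics package, since the Python standard library has no t distribution.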
Non-Parametric Test
1. Non-parametric tests are statistical hypothesis tests that do not assume
a particular frequency distribution for the variables being evaluated.
2. Non-parametric tests are used when the data are skewed, and they
comprise techniques that do not depend on the data following any
particular distribution.
3. The word non-parametric does not mean that these models have no
parameters.
4. Rather, the number and nature of the parameters are flexible and
not fixed in advance.
5. For this reason, these methods are also called distribution-free methods.
Mann Whitney U Test
The Mann-Whitney U test is used to compare a continuous outcome between
two independent samples.
Null hypothesis, H0: The two populations are equal (they have the same distribution).
Perform the following steps to conduct a Mann-Whitney U test in Excel.

Step 1: Enter the data.

Enter the data as follows:


Step 2: Calculate the ranks for both groups.

Use the following formula to calculate the rank of the first value in the
Treated group:
Although this formula is fairly complicated, you only have to enter it one
time. Then, you can simply drag the formula to all of the other cells to fill in
the ranks.
Step 3: Calculate the necessary values for the test statistic.

Next, we’ll use the following formulas to calculate the sum of the
ranks for each group, the sample size for each group, the U test
statistic for each group, and the overall U test statistic:
Step 4: Calculate the z test statistic and the corresponding p-value.

Lastly, we’ll use the following formulas to calculate the z test statistic
and the corresponding p-value to determine if we should reject or fail to
reject the null hypothesis:
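Steps 2 to 4 can be sketched in Python as a cross-check on the spreadsheet. The sketch assumes the usual large-sample normal approximation without a tie correction, which may differ from the exact Excel formulas used on the slides. The function names and the two small groups are hypothetical.

```python
import math

def average_ranks(values):
    """Rank ascending (1 = smallest); ties share the average rank, like RANK.AVG."""
    ranks = []
    for v in values:
        less = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def mann_whitney_u(group1, group2):
    """Return U1, U2, U, z and a two-sided p-value (normal approximation)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))  # rank the pooled data
    r1 = sum(ranks[:n1])   # rank sum of group 1
    r2 = sum(ranks[n1:])   # rank sum of group 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    u = min(u1, u2)        # overall U test statistic
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return {"U1": u1, "U2": u2, "U": u, "z": z, "p": p}

# Hypothetical measurements (illustration only)
treated = [1, 2, 3]
control = [4, 5, 6]
result = mann_whitney_u(treated, control)
print(result)
```

With samples this small the normal approximation is rough; the example is only meant to show how the rank sums feed into U and z.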
What Is Linear Regression?
● Linear regression is a type of data analysis that considers the linear
relationship between a dependent variable and one or more independent
variables.
● It is typically used to visually show the strength of the relationship or
correlation between various factors and the dispersion of results – all for
the purpose of explaining the behavior of the dependent variable.
● The goal of a linear regression model is to estimate the magnitude of a
relationship between variables and whether or not it is statistically
significant.
Regression in Excel
1. The first step in running regression analysis in Excel is to double-check
that the free Excel plugin Data Analysis ToolPak is installed.
2. This plugin makes calculating a range of statistics very easy.
3. It is not required to chart a linear regression line, but it makes creating
statistics tables simpler.
4. To verify that it is installed, select "Data" from the toolbar. If "Data Analysis"
is an option, the feature is installed and ready to use.
5. If it is not installed, enable it by clicking on the Office button ("File" in
newer versions of Excel), selecting "Options" and then "Add-Ins". In the "Manage"
box, select "Excel Add-Ins" and click "Go", then tick the ToolPak and click "OK".
Say we want to know whether S&P 500 returns can explain Visa (V) stock returns, and
how strong that relationship is. The Visa (V) stock returns data populates column
1 as the dependent variable. S&P 500 returns data populates column 2 as the
independent variable.
1. Select "Data" from the toolbar. The "Data" menu displays.
2. Select "Data Analysis". The Data Analysis - Analysis Tools dialog box
displays.
3. From the menu, select "Regression" and click "OK".
4. In the Regression dialog box, click the "Input Y Range" box and select
the dependent variable data (Visa (V) stock returns).
5. Click the "Input X Range" box and select the independent variable data
(S&P 500 returns).
6. Click "OK" to run the results.
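For a single independent variable, the core numbers in the regression output (slope, intercept and R Square) can be reproduced with a few lines of Python. This is a minimal sketch of ordinary least squares, not a replacement for the ToolPak output. The return figures are invented for illustration and are not real Visa or S&P 500 data.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical percentage returns (illustration only)
sp500_returns = [-2.0, -1.0, 0.0, 1.0, 2.0]   # independent variable (X)
visa_returns = [-1.0, -2.0, 1.0, 0.0, 2.0]    # dependent variable (Y)

slope, intercept, r_squared = simple_linear_regression(sp500_returns, visa_returns)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.2f}")
```

The slope estimates how much the dependent variable moves per unit change in the independent variable, and R squared is the share of its variation explained by the fitted line.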
What is a Wilcoxon signed-rank test?
1. The Wilcoxon signed-rank test is a non-parametric test used to compare two
paired samples.
2. Two tests have been proposed for the case where samples are paired: the sign
test and the Wilcoxon signed-rank test.
3. The sign test is based on a simple principle: we compare the number of cases
where the first sample is greater than the second sample to the number of cases
where the second sample is greater than the first sample.
4. The disadvantage of the sign test is that it does not take into account the size of
the difference within each pair, information which is often available.
5. Wilcoxon proposed a test which takes into account the size of the difference
within pairs.
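The difference between the two tests can be seen in a short Python sketch. It assumes the common convention of dropping pairs with zero difference and giving tied absolute differences their average rank; XLSTAT may treat these cases differently. The function names and the paired values are hypothetical.

```python
def sign_counts(sample1, sample2):
    """Sign test ingredients: how often sample1 > sample2 and sample1 < sample2."""
    positives = sum(1 for a, b in zip(sample1, sample2) if a > b)
    negatives = sum(1 for a, b in zip(sample1, sample2) if a < b)
    return positives, negatives

def signed_rank_sums(sample1, sample2):
    """Wilcoxon ingredients: rank sums of the positive and negative differences."""
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]  # drop zero differences
    abs_diffs = [abs(d) for d in diffs]
    ranks = []
    for v in abs_diffs:
        less = sum(1 for w in abs_diffs if w < v)
        equal = sum(1 for w in abs_diffs if w == v)
        ranks.append(less + (equal + 1) / 2)  # average rank for ties
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical paired measurements (illustration only)
before = [10, 12, 9, 15, 11]
after = [8, 13, 9, 10, 9]

print(sign_counts(before, after))
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)
```

The sign test only sees the counts of positive and negative differences, while the signed-rank sums also reflect how large each difference is.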
Setting up a sign test and a Wilcoxon signed rank test on two paired samples

Once XLSTAT is activated, select the XLSTAT / Nonparametric tests /
Comparison of two samples (Wilcoxon, Mann-Whitney, ...) command, or
click on the corresponding button of the Nonparametric test menu.
● Once you've clicked the button, the dialog box appears.
● You can then select the data on the Excel sheet.
● Select the paired samples option.
● After you have clicked on the OK button, the results are displayed
on a new Excel sheet (because the Sheet option has been selected for outputs).
Kruskal-Wallis Test in Excel
● A Kruskal-Wallis Test is used to determine whether or not there is a
statistically significant difference between the medians of three or
more independent groups.
● It is considered to be the non-parametric equivalent of the One-Way
ANOVA.
Kruskal-Wallis Test in Excel
Step 1: Enter the data.
Step 2: Rank the data. Next, we will use the RANK.AVG() function to rank each
value among all of the values from every group combined.
Copy this formula to the rest of the cells:
Step 3: Calculate the test statistic and the corresponding p-value.

The test statistic is defined as:

$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$
where:
n = total sample size
k = number of groups
R_j = sum of ranks for the jth group
n_j = sample size of the jth group
Under the null hypothesis, H approximately follows a Chi-square distribution with k-1 degrees of freedom.
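The ranking step and the H formula can be sketched in Python as follows. The sketch applies no tie correction, and the three groups are invented for illustration. The closed-form p-value in the last lines holds only because this example has three groups, so the Chi-square distribution has 2 degrees of freedom.

```python
import math

def average_ranks(values):
    """Rank ascending (1 = smallest); ties share the average rank, like RANK.AVG."""
    ranks = []
    for v in values:
        less = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        ranks.append(less + (equal + 1) / 2)
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) for a list of independent groups."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)      # ranks are taken over all groups combined
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        r_j = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += r_j ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(groups) - 1

# Hypothetical measurements for three groups (illustration only)
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis_h(groups)

# For df = 2 only, the Chi-square right-tail probability is exp(-H/2)
p_value = math.exp(-h / 2)
print(f"H = {h:.2f}, df = {df}, p = {p_value:.4f}")
```

For other numbers of groups, the p-value would come from a Chi-square right-tail function, for example CHISQ.DIST.RT in Excel.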
Chi-Square Test in Excel
1. The chi-square test is a non-parametric test that compares
two or more categorical variables from randomly selected data.
2. It helps determine whether there is a relationship between the variables.
3. In Excel, we calculate the chi-square p-value.
4. Excel has no single built-in tool that runs the whole test from the raw
counts, so the expected frequencies are worked out with formulas; functions
such as CHISQ.TEST or CHISQ.DIST.RT then return the p-value.
Chi-Square Goodness of Fit Test
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
Where,
● "χ²" is the chi-square statistic
● "O_i" is the observed frequency of the ith category
● "E_i" is the expected frequency of the ith category
● "i" is the index of the category
● "k" is the number of categories
● Degrees of freedom (df) = k-1
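The goodness of fit statistic is a direct sum over the categories, as this short Python sketch shows. The observed and expected counts are hypothetical, and the function name is chosen only for this example.

```python
def chi_square_goodness_of_fit(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for k categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

# Hypothetical counts for three categories (illustration only)
observed = [50, 30, 20]
expected = [40, 40, 20]

chi2_gof, df_gof = chi_square_goodness_of_fit(observed, expected)
print(f"chi-square = {chi2_gof:.2f}, df = {df_gof}")
```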
Chi-Square Test for Independence

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Where,
● "χ²" is the chi-square statistic
● "O_ij" is the observed frequency in the ith row and jth column
● "E_ij" is the expected frequency in the ith row and jth column
● "r" is the number of rows
● "c" is the number of columns
● Degrees of freedom (df) = (r-1)(c-1)
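For the test of independence, the expected frequencies are not given in advance. A standard way to obtain them, assumed here, is E_ij = (row i total × column j total) / grand total. The Python sketch below uses that rule on a hypothetical 2 × 2 table.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, df, expected table) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Hypothetical contingency table of counts (illustration only)
table = [[10, 20],
         [30, 40]]

chi2_ind, df_ind, expected_table = chi_square_independence(table)
print(f"chi-square = {chi2_ind:.4f}, df = {df_ind}")
print(expected_table)
```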
References
https://toptipbio.com/pearson-correlation-excel/
https://www.statology.org/how-to-perform-a-mann-whitney-u-test-in-excel/
https://www.investopedia.com/ask/answers/062215/how-can-i-run-linear-and-multiple-regressions-excel.asp
https://help.xlstat.com/6740-wilcoxon-signed-rank-test-two-paired-samples-excel
https://www.statology.org/kruskal-wallis-test-excel/
https://www.wallstreetmojo.com/chi-square-test-in-excel/
