Data Analysis Guide

Data analysis guide on job stress

Uploaded by

angelo.zilva
Copyright
© All Rights Reserved

1. CRONBACH’S ALPHA - Cronbach’s alpha, α (or coefficient alpha), developed by Lee Cronbach in 1951, measures reliability, or internal consistency. “Reliability” is another name for consistency.

Cronbach’s alpha tests whether multiple-question Likert scale surveys are reliable. These questions measure latent variables, that is, hidden or unobservable variables such as a person’s conscientiousness, neuroticism or openness. These are very difficult to measure directly in real life. Cronbach’s alpha tells you how closely related a set of test items are as a group.
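
As an illustration, the sketch below computes alpha with the commonly used formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The guide itself does not give a formula, so the formula choice, the function name cronbach_alpha and the sample responses are assumptions made for this example only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Return Cronbach's alpha.

    items: one list of scores per survey item, with respondents
    in the same order in every list (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Made-up 5-point Likert responses: 3 items, 5 respondents.
responses = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items move together more closely; what counts as an acceptable alpha depends on the field and the purpose of the survey.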

2. STANDARD DEVIATION - Standard deviation is a statistic that measures the dispersion of a dataset relative to its mean. It is calculated as the square root of the variance, which is itself derived from each data point's deviation from the mean. The further the data points are from the mean, the greater the deviation within the data set; thus, the more spread out the data, the higher the standard deviation.

$$\text{Standard Deviation} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

where:

$x_i$ = Value of the ith point in the data set
$\bar{x}$ = The mean value of the data set
$n$ = The number of data points in the data set

Calculating Standard Deviation

Standard deviation is calculated as follows:

1. Calculate the mean of all data points. The mean is calculated by adding all the data points and dividing the sum by the number of data points.
2. Calculate the deviation of each data point. The deviation is calculated by subtracting the mean from the value of the data point.
3. Square each deviation (from Step 2).
4. Sum the squared deviations (from Step 3).
5. Divide the sum of squared deviations (from Step 4) by the number of data points in the data set less 1.
6. Take the square root of the quotient (from Step 5).
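
The calculation can be sketched in Python as follows. This is a minimal illustration that uses the "number of data points less 1" divisor described above; the function name and the sample scores are hypothetical.

```python
import math


def sample_standard_deviation(data):
    """Sample standard deviation, computed step by step."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    # Mean of all data points.
    mean = sum(data) / n
    # Deviation of each data point from the mean.
    deviations = [x - mean for x in data]
    # Square each deviation, then sum the squares.
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Divide by the number of data points less 1.
    sample_variance = sum_of_squares / (n - 1)
    # The square root of the quotient is the standard deviation.
    return math.sqrt(sample_variance)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
sd = sample_standard_deviation(scores)
print(round(sd, 3))
```

Python's built-in statistics.stdev uses the same n - 1 divisor and can serve as a cross-check.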

• What Does a High Standard Deviation Mean?

A large standard deviation indicates that the observed data vary widely around the mean, that is, the data are quite spread out. A small or low standard deviation instead indicates that much of the observed data is clustered tightly around the mean.

• What Does Standard Deviation Tell You?

Standard deviation describes how dispersed a set of data is. It summarizes in a single value how far the data points lie from their mean: a small value means the data points are in close proximity, and a large value means they are spread out. In a normal distribution, about 68% of values fall within one standard deviation of the mean and about 95% fall within two.

3. CORRELATION ANALYSIS - Correlation analysis in research is a statistical method used to measure the strength of the linear relationship between two variables and compute their association. Simply put, correlation analysis measures how much one variable tends to change as the other changes; it does not by itself show that one change causes the other. A high correlation points to a strong relationship between the two variables, while a low correlation means that the variables are weakly related.

When it comes to market research, researchers use correlation analysis to analyze quantitative data collected through research methods like surveys and live polls. They try to identify the relationship, patterns, significant connections, and trends between two variables or datasets.

There is a positive correlation between two variables when an increase in one variable is accompanied by an increase in the other. On the other hand, a negative correlation means that when one variable increases, the other decreases, and vice versa.

• The Correlation Coefficient

One of the statistical concepts most closely related to this type of analysis is the correlation coefficient.

The correlation coefficient measures the strength and direction of the linear relationship between the variables involved in a correlation analysis. It is easy to identify because it is represented by the symbol r, and it is a unitless value between -1 and 1.

• Positive correlation: A positive correlation between two variables means both variables move in the same direction. An increase in one variable is accompanied by an increase in the other variable, and vice versa.
For example, spending more time on a treadmill burns more calories.
• Negative correlation: A negative correlation between two variables means that the variables move in opposite directions. An increase in one variable is accompanied by a decrease in the other variable, and vice versa.
For example, increasing the speed of a vehicle decreases the time you take to reach your destination.
• Weak/Zero correlation: No correlation exists when changes in one variable show no consistent linear relationship with changes in the other.
For example, there is no correlation between the number of years of school a person has attended and the number of letters in his/her name.
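
To make the coefficient concrete, the sketch below computes r as Pearson's product-moment correlation, the usual choice for a linear relationship. The guide does not name a specific coefficient, so this choice is an assumption, and the treadmill and travel-time figures are invented to mirror the examples above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (numerator of r).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Invented data: minutes on a treadmill vs. calories burned.
minutes = [10, 20, 30, 40, 50]
calories = [95, 210, 290, 405, 500]
# Invented data: vehicle speed (km/h) vs. hours to cover 120 km.
speed = [40, 50, 60, 80, 100]
travel_time = [3.0, 2.4, 2.0, 1.5, 1.2]

r_pos = pearson_r(minutes, calories)   # close to +1
r_neg = pearson_r(speed, travel_time)  # clearly negative
print(round(r_pos, 3), round(r_neg, 3))
```

Note that r only captures the linear part of a relationship: speed and travel time are perfectly related here, yet r is not exactly -1 because the relationship is curved.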

4. REGRESSION ANALYSIS - Regression analysis is a set of statistical methods used for the estimation of relationships between a dependent variable and one or more independent variables. It can be utilized to assess the strength of the relationship between variables and for modeling the future relationship between them.

Regression Analysis – Linear Model Assumptions

Linear regression analysis is based on six fundamental assumptions:

1. The dependent and independent variables show a linear relationship, described by a slope and an intercept.
2. The independent variable is not random.
3. The expected value (mean) of the residual (error) is zero.
4. The variance of the residual (error) is constant across all observations.
5. The residual (error) values are not correlated across observations.
6. The residual (error) values follow the normal distribution.

Regression Analysis – Simple Linear Regression

Simple linear regression is a model that assesses the relationship between a dependent
variable and an independent variable. The simple linear model is expressed using the
following equation:

Y = a + bX + ϵ

Where:

• Y – Dependent variable
• X – Independent (explanatory) variable
• a – Intercept
• b – Slope
• ϵ – Residual (error)
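
As a rough illustration of how a and b can be estimated from data, the sketch below uses ordinary least squares, which is the standard estimator for this model but is not named in this guide, so treat it as an assumption. The function name and the data points are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Estimate intercept a and slope b of Y = a + bX by least squares.

    Returns (a, b, residuals), where each residual is the observed Y
    minus the fitted value a + b * X.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary to estimate a slope")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, residuals


# Made-up observations that lie close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b, residuals = fit_simple_linear_regression(xs, ys)
print(round(a, 3), round(b, 3))
```

With an intercept in the model, the least-squares residuals sum to zero, which is consistent with the assumption that the error has a mean of zero.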

5. ANOVA TABLE - Analysis of Variance (ANOVA) is a statistical analysis that tests whether the means of two or more groups in an experiment differ. The results of the ANOVA test are displayed in a tabular form known as an ANOVA table. The ANOVA table displays the statistics that are used to test hypotheses about the population means. The ANOVA table can be either a one-way or a two-way ANOVA table.

The various column headings that are included in the ANOVA table are as follows:

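1. “Source” – the source responsible for the variation in the data.
2. “DF” – the degrees of freedom associated with each source.
3. “SS” – the sum of squares for each source.
4. “MS” – the mean square, that is, the sum of squares divided by its degrees of freedom.
5. “F” – the F-statistic, the ratio of the factor mean square to the error mean square.
6. “P” – the P-value associated with the F-statistic.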

The various row headings that are included in the ANOVA table are as follows:

1. “Factor” – It indicates the variability that results from the factor of interest.
2. “Error” – It means the unexplained random error or the variability within the groups.
3. “Total” – It is the total deviation of the data from the grand mean.

An ANOVA table can be constructed either by hand or by using statistical software.
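
The sketch below shows one way the one-way table could be built by hand in Python, filling in the DF, SS, MS and F columns for the Factor, Error and Total rows. The function name, the dictionary layout and the three groups of scores are assumptions for illustration. The P column is left out because it requires the F distribution, which the Python standard library does not provide.

```python
def one_way_anova_table(groups):
    """Build a one-way ANOVA table as a dict of rows.

    groups: one list of observations per group (hypothetical layout).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group variation (the "Factor" row).
    ss_factor = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group variation (the "Error" row).
    ss_error = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_factor = k - 1
    df_error = n_total - k
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    if ms_error == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    return {
        "Factor": {"DF": df_factor, "SS": ss_factor, "MS": ms_factor,
                   "F": ms_factor / ms_error},
        "Error": {"DF": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"DF": n_total - 1, "SS": ss_factor + ss_error},
    }


# Made-up scores for three groups.
table = one_way_anova_table([[4, 5, 6], [6, 7, 8], [9, 10, 11]])
for row, cells in table.items():
    print(row, cells)
```

If SciPy is available, the P-value for the Factor row can be obtained from the upper tail of the F distribution, for example with scipy.stats.f.sf(F, df_factor, df_error).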

The ANOVA table is interpreted as follows:

If the P-value obtained from the ANOVA table is less than or equal to the level of significance, the null hypothesis is rejected, and we conclude that not all the population means are equal.

If the P-value obtained from the ANOVA table is greater than the level of significance, the null hypothesis is not rejected: the data do not provide sufficient evidence that the population means differ.
