EDA
Engineering Data Analysis - Final Exam Reviewer
Study online at https://quizlet.com/_f4ndxl
One sample t-test Test used for one sample when the population standard deviation is unknown and n < 30 (small sample)
One sample z-test Test used for one sample when the population standard deviation is unknown and n ≥ 30
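As a minimal sketch of the two one-sample cards above: the test statistic is (sample mean - hypothesized mean) / (s / sqrt(n)), and the n < 30 rule from the cards decides which test name applies. The function name `one_sample_statistic` and the data values are hypothetical, chosen only for illustration.

```python
import math
import statistics

def one_sample_statistic(sample, mu0):
    """Return (test_name, statistic) for a one-sample test of the mean.

    Assumes the population standard deviation is unknown, so the sample
    standard deviation is used. Follows the reviewer's rule of thumb:
    n < 30 -> t-test, n >= 30 -> z-test.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    statistic = (mean - mu0) / (s / math.sqrt(n))
    test_name = "one sample t-test" if n < 30 else "one sample z-test"
    return test_name, statistic

# Hypothetical data: 8 measurements tested against a claimed mean of 50.
name, stat = one_sample_statistic([52, 48, 51, 53, 49, 50, 54, 51], mu0=50)
print(name, round(stat, 3))
```

The statistic is then compared against the t distribution (n - 1 degrees of freedom) or the standard normal distribution, depending on the test chosen.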
Nominal Level of measurement for categories without order
Ordinal Level of measurement for ordered categories
Interval Level of measurement for categories with equal intervals but without a true zero
Ratio Level of measurement for categories with equal intervals with a true zero
Interval/Ratio Parametric levels of measurement
Nominal/Ordinal Non-parametric levels of measurement
Shapiro-Wilk test What test is used to check the normality of a sample?
Levene's test What test is used to check the homogeneity of variances?
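Both assumption checks are read the same way: a p-value above the significance level means the assumption (normality for Shapiro-Wilk, equal variances for Levene's) is not rejected. A minimal sketch follows, assuming a significance level of 0.05, which this reviewer does not state explicitly; `assumption_met` is a hypothetical helper name.

```python
ALPHA = 0.05  # assumed significance level; not stated in the reviewer

def assumption_met(p_value, alpha=ALPHA):
    """Return True when the test does not reject its null hypothesis.

    Shapiro-Wilk null: the sample comes from a normal distribution.
    Levene's null: the groups have equal variances.
    """
    return p_value > alpha

print(assumption_met(0.415))  # Levene's p-value from a card in this reviewer
print(assumption_met(0.047))  # Levene's p-value from a card in this reviewer
```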
Determine the statistical test used for the following data:
Levene's: p = 0.415
Determine the statistical test used for the following data:
Levene's: p = 0.649
Determine the statistical test used for the following data:
Levene's: p = 0.047
Determine the statistical test used for the following data:
Levene's: p = 0.532
Test of Difference for Independent Samples What test of difference has the following question: Is there a significant difference between the means of two populations?
Is there a significant difference between the means of a population
between two points in time?
Determine the statistical test used for the following data:
Shapiro-Wilk: p = 0.423
Determine the statistical test used for the following data:
Shapiro-Wilk: p = 0.073
There is a significant and strong positive correlation between the Math and Science grades of Grade 11 students. (r = 0.691, p = 0.043)
Determine the statistical test used for the following data:
Shapiro-Wilk:
Math Grade: p = 0.41
Science Grade: p = 0.77
Pearson's r:
r = 0.691
p = 0.043
Spearman's rho:
ρ = -0.093
p = 0.751
There is no significant correlation between the Math and Science grades of Grade 11 students. (ρ = -0.127, p = 0.475)
(Significant outlier/s exist)
(No linear relationship)
Shapiro-Wilk:
Math Grade: p = 0.051
Science Grade: p = 0.349
Pearson's r:
r = -0.116
p = 0.049
Spearman's rho:
ρ = -0.127
p = 0.475
Positive weak correlation What is the strength and direction of the correlation for the value: r = 0.27
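The sign of the coefficient gives the direction and its absolute value gives the strength. The sketch below uses an assumed set of cut-offs (they differ between textbooks, and the reviewer does not list its own); they were chosen so that r = 0.27 reads as weak and r = 0.691 reads as strong, matching the cards. `describe_correlation` is a hypothetical name.

```python
def describe_correlation(r):
    """Label the direction and strength of a correlation coefficient.

    Assumed cut-offs on |r| (conventions vary by textbook):
    < 0.20 very weak, < 0.40 weak, < 0.60 moderate, < 0.80 strong,
    otherwise very strong.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r == 0:
        return "No correlation"
    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)
    if size < 0.20:
        strength = "very weak"
    elif size < 0.40:
        strength = "weak"
    elif size < 0.60:
        strength = "moderate"
    elif size < 0.80:
        strength = "strong"
    else:
        strength = "very strong"
    return f"{direction} {strength} correlation"

print(describe_correlation(0.27))
print(describe_correlation(0.691))
```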
Population Correlation Coefficient (ρ) The correlation computed by using all the possible pairs of the data values (x, y) taken from a population.
Linear Correlation Coefficient (r) The correlation computed from the sample data, measuring the strength and direction of a linear relationship between two quantitative variables.
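To make the definition of r concrete, here is a minimal sketch of the Pearson formula: the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the paired values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson linear correlation coefficient for paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data (x, y).
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))
```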
Scatter Plot How is linearity of samples checked?
Boxplot How are outliers identified?
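A boxplot usually draws as outliers the points that lie more than 1.5 times the interquartile range (IQR) beyond the quartiles. The reviewer does not state which rule it uses, so the 1.5 multiplier and the quartile method (Python's default in `statistics.quantiles`) are assumptions here; `boxplot_outliers` and the scores are hypothetical.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical scores with one extreme value.
scores = [70, 72, 75, 78, 80, 82, 85, 150]
print(boxplot_outliers(scores))
```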
Testing Differences Among Groups What test of difference has the following question: Is there a significant difference between the (dependent variable) among the (independent variable)?
At least one group is different from the others. Construct the alternative hypothesis for testing differences among groups.
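For the parametric case (one-way ANOVA), evidence for this alternative hypothesis comes from the F statistic: the variation between group means divided by the variation within groups. The sketch below only computes F and does not produce a p-value; `one_way_anova_f` and the three groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three groups of three observations each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3))
```

A large F means the group means differ by more than the within-group spread would suggest, which supports "at least one group is different".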
Post Hoc Test How are the specific group/s that differ identified for ANOVA?
Dwass-Steel-Critchlow-Fligner (DSCF) pairwise comparisons How are the specific group/s that differ identified for the Kruskal-Wallis test?
ANOVA
Determine the statistical test used for the following data:
Shapiro-Wilk:
Facebook: p = 0.015
Tiktok: p = 0.439
Youtube: p = 0.098
Levene's: p = 0.254
Kruskal-Wallis
Determine the statistical test used for the following data:
Shapiro-Wilk:
Facebook: p = 0.135
Tiktok: p = 0.790
Youtube: p = 0.284
Levene's: p = 0.034
Simple Linear Regression This predicts the value of a dependent variable based on the value of the independent variable.
Regression Line The straight line that represents the relationship between the independent and dependent variables in the regression model.
By minimizing the errors or the sum of the squared differences between the observed data points and the corresponding predicted values along the line. How is the regression line obtained?
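Minimizing the sum of squared differences has a closed-form answer for a single predictor: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). A minimal sketch, with the hypothetical name `fit_line` and made-up data:

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data.
intercept, slope = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {intercept:.2f} + {slope:.2f}x")
```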
Regression Equation The equation or model that describes the relationship between the independent and dependent variables.
Residual The differences between the observed values of the dependent variable and the values predicted by the regression equation.
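As a small illustration, residuals are simply observed minus predicted. The regression equation y = 2.2 + 0.6x and the data below are hypothetical, as is the function name `residuals`.

```python
def residuals(x, y, intercept, slope):
    """Observed y minus the value predicted by y_hat = intercept + slope * x."""
    return [obs - (intercept + slope * xi) for xi, obs in zip(x, y)]

# Hypothetical data and a hypothetical fitted equation y_hat = 2.2 + 0.6x.
res = residuals([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], intercept=2.2, slope=0.6)
print([round(e, 2) for e in res])
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding).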
Correlation Coefficient (R) This measures the strength of the linear relationship between two variables, indicating how well the independent variable(s) predict the dependent variable.
Coefficient of Determination (R²) This represents the proportion of the variance in the dependent variable that is explained by the independent variable(s).
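One common way to compute it is R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the observed values from their mean. A minimal sketch with hypothetical observed and predicted values; `r_squared` is an illustrative name.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the observed values explained by the predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Hypothetical observed values and predictions from y_hat = 2.2 + 0.6x, x = 1..5.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(observed, predicted)
print(round(r2, 3))
```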
Multiple Linear Regression This predicts the value of a dependent variable based on the value of two or more independent variables.
Do the independent variables significantly predict the dependent variable? What is the inferential question for Multiple Linear Regression?
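Before the significance question can be answered, the coefficients have to be estimated. The sketch below fits y = b0 + b1*x1 + ... + bk*xk by solving the normal equations with plain Gauss-Jordan elimination. It does not compute the p-values needed for the inferential question, and the function name `fit_multiple` and the data are hypothetical.

```python
def fit_multiple(X, y):
    """Least-squares coefficients [b0, b1, ..., bk] for rows of predictors X."""
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept column
    p = len(rows[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(M[idx][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for idx in range(p):
            if idx != col:
                factor = M[idx][col]
                M[idx] = [a - factor * b for a, b in zip(M[idx], M[col])]
    return [M[i][p] for i in range(p)]

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]
coefficients = fit_multiple(X, y)
print([round(b, 3) for b in coefficients])
```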
Determine the regression equation for the following data: