EDA

The document serves as a comprehensive review for an Engineering Data Analysis final exam, covering key statistical concepts such as hypotheses, types of errors, sampling methods, statistical tests, and data analysis techniques. It includes definitions, examples of statistical tests for various scenarios, and methods for analyzing relationships between variables. Additionally, it outlines regression analysis and the interpretation of results, including correlation coefficients and regression equations.

Uploaded by

ANDREY AGMANA

Engineering Data Analysis - Final Exam Reviewer

Study online at https://quizlet.com/_f4ndxl


Statistical Hypothesis: A conjecture about a population parameter which may or may not be true.
Null Hypothesis: Statistical hypothesis stating that there is no difference between a parameter and a specific value, or between two parameters.
Alternative Hypothesis: Statistical hypothesis stating the existence of a difference between a parameter and a specific value, or between two parameters.
Types of Alternative Hypothesis: Non-directional (two-tailed) and Directional (one-tailed)
Types of Error: Type I and Type II
Type I error: This error occurs when the null hypothesis is rejected even though it is true.
Type II error: This error occurs when the null hypothesis is not rejected even though it is false.
Level of significance: The maximum probability of committing a Type I error.
Greek letter alpha (α): Symbol for level of significance.
Statistical Test: This uses the data obtained from a sample to make a decision about whether the null hypothesis should be rejected.
Statistical Value: The numerical value obtained from a statistical test.
p-value: The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true.
What is the decision when p ≤ α?
Answer: Reject H0
What is the decision when p > α?
Answer: Fail to reject H0 (often loosely stated as "accept H0")
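The decision rule in the two cards above can be sketched as a small function. The name `decide` and the default `alpha = 0.05` are assumptions made for illustration; the reviewer itself does not fix a value for α.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha.

    alpha = 0.05 is an assumed default, not a value stated in the reviewer.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Reject H0 when p <= alpha; otherwise we fail to reject it.
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


print(decide(0.043))  # Reject H0
print(decide(0.475))  # Fail to reject H0
```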
Types of Data: Primary Data and Secondary Data
Types of Sampling: Probability Sampling and Non-probability Sampling
Simple Random Sampling: A sampling procedure in which each member of the population has an equal probability of being included in the sample.
Systematic Random Sampling: A variation of random sampling in which a researcher selects every nth person from the population.
Stratified Sampling: A type of probability sampling in which the population is divided into groups with a common attribute and a random sample is chosen within each group.
Cluster Sampling: A probability sampling technique in which clusters of participants within the population of interest are selected at random.
Purposive Sampling: A nonprobability sampling method in which elements are selected for a purpose, usually because of their unique position.
Convenience Sampling: Using a sample of people who are readily available to participate.
Quota Sampling: A nonprobability sampling method in which elements are selected to ensure that the sample represents certain characteristics in proportion to their prevalence in the population, but without random selection.
Snowball Sampling: Recruitment of participants based on word of mouth or referrals from other participants.
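Three of the probability sampling methods defined above can be illustrated with Python's standard `random` module. This is a minimal sketch: the function names, the `seed` parameter, and the list of 100 student IDs are all hypothetical, and the systematic sampler assumes a random starting point within the first interval.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Draw k members so that every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def systematic_sample(population, step, seed=None):
    """Pick a random start in the first interval, then take every step-th member."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(population, stratum_of, k_per_stratum, seed=None):
    """Group members by a shared attribute, then sample randomly within each group."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


# Hypothetical population: student IDs 1..100.
students = list(range(1, 101))
print(simple_random_sample(students, 5, seed=1))
print(systematic_sample(students, 10, seed=1))
print(stratified_sample(students, lambda s: "odd" if s % 2 else "even", 3, seed=1))
```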
What test of hypothesis has the following question: Is there a significant difference between sample mean (x̄) and population mean (μ)?
Answer: Test of Hypothesis for a Single Sample
One sample z-test: Test used for one sample when the population standard deviation is known and n < 30 (small sample)

One sample t-test: Test used for one sample when the population standard deviation is unknown and n < 30 (small sample)
One sample z-test: Test used for one sample when the population standard deviation is unknown and n ≥ 30
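As a rough illustration of the one-sample tests above, the sketch below computes the z statistic with a two-tailed p-value (population standard deviation known) and the t statistic with its degrees of freedom (population standard deviation unknown). The function names and the sample values are hypothetical. The t-test p-value is left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z(sample, mu0, sigma):
    """z statistic and two-tailed p-value when the population sigma is known."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def one_sample_t_statistic(sample, mu0):
    """t statistic and degrees of freedom when sigma is estimated from the sample."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1


# Hypothetical measurements tested against a claimed population mean of 50.
data = [52, 48, 55, 51, 49, 53, 50, 54]
print(one_sample_z(data, mu0=50, sigma=2))
print(one_sample_t_statistic(data, mu0=50))
```

The resulting p-value is then compared with α in the usual way; for the t statistic, the comparison is made against a critical value or p-value from a t table with n - 1 degrees of freedom.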
Nominal: Level of measurement for categories without order
Ordinal: Level of measurement for ordered categories
Interval: Level of measurement for values with equal intervals but without a true zero
Ratio: Level of measurement for values with equal intervals and a true zero
Interval/Ratio: Levels of measurement suited to parametric tests
Nominal/Ordinal: Levels of measurement suited to non-parametric tests
What test is used to check the normality of a sample?
Answer: Shapiro-Wilk test
What test is used to check the homogeneity of variances?
Answer: Levene's test
Determine the statistical test used for the following data: Height of Male and Female students
Shapiro-Wilk: Male p = 0.883, Female p = 0.061
Levene's: p = 0.415
Answer: Independent Samples Student's t-test

Determine the statistical test used for the following data: IQ test scores of public and private school students
Shapiro-Wilk: Public p = 0.009, Private p = 0.530
Levene's: p = 0.649
Answer: Mann-Whitney U test

Determine the statistical test used for the following data: Temperature of Room A and B
Shapiro-Wilk: A p = 0.534, B p = 0.181
Levene's: p = 0.047
Answer: Independent Samples Welch's t-test
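
Determine the statistical test used for the following data: Stress Level of Grade 11 and Grade 12 students
Shapiro-Wilk: Grade 11 p = 0.187, Grade 12 p = 0.076
Levene's: p = 0.532
Answer: Mann-Whitney U test (stress level is presumably an ordinal measure, so a non-parametric test applies even though both assumption checks pass)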
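The selection logic behind the four cards above can be sketched as a decision function. The function name, the `alpha = 0.05` threshold, and the `ordinal` flag are assumptions for illustration. In particular, the `ordinal` flag encodes a guess about why the stress-level card answers Mann-Whitney U despite passing both checks; the reviewer does not state its reasoning.

```python
def choose_two_group_test(shapiro_ps, levene_p, alpha=0.05, ordinal=False):
    """Pick a test for two independent samples from assumption-check p-values.

    shapiro_ps: Shapiro-Wilk p-values, one per group.
    levene_p:   Levene's test p-value for homogeneity of variances.
    ordinal:    assumed flag for data measured on an ordinal scale.
    """
    # Non-parametric route: ordinal data, or normality rejected in any group.
    if ordinal or any(p <= alpha for p in shapiro_ps):
        return "Mann-Whitney U test"
    # Normal data but unequal variances.
    if levene_p <= alpha:
        return "Independent Samples Welch's t-test"
    # Normal data with equal variances.
    return "Independent Samples Student's t-test"


print(choose_two_group_test([0.883, 0.061], 0.415))                # heights
print(choose_two_group_test([0.009, 0.530], 0.649))                # IQ scores
print(choose_two_group_test([0.534, 0.181], 0.047))                # room temperature
print(choose_two_group_test([0.187, 0.076], 0.532, ordinal=True))  # stress level
```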
What test of difference has the following question: Is there a significant difference between the means of two populations?
Answer: Test of Difference for Independent Samples
What test of difference has the following question: Is there a significant difference between the means of a population between two points in time?
Answer: Test of Difference for Paired Samples
Determine the statistical test used for the following data: Temperature of a room before and after having an AC unit
(No significant outliers)
Shapiro-Wilk: p = 0.423
Answer: Paired Samples Student's t-test
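
Determine the statistical test used for the following data: Happiness Index of workers over the span of 2 years
(No significant outliers)
Shapiro-Wilk: p = 0.073
Answer: Wilcoxon Signed Rank Test (the happiness index is presumably an ordinal measure, so a non-parametric test applies even though the normality check passes)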

Determine the statistical test used for the following data: Hours of sleep before and after release of famous series
(No significant outliers)
Shapiro-Wilk: p < 0.001
Answer: Wilcoxon Signed Rank Test
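For the paired Student's t-test case, the statistic is computed on the within-pair differences. The sketch below assumes hypothetical before/after room temperatures and a hypothetical function name. It returns only the t statistic and degrees of freedom, since the standard library has no t distribution for the p-value.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for paired (before/after) samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # The test works on the per-pair differences, not on the two raw samples.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical room temperatures (°C) before and after installing an AC unit.
before = [30, 31, 29, 32, 30]
after = [25, 27, 24, 26, 26]
print(paired_t_statistic(before, after))
```

A negative t here means the "after" values are lower on average than the "before" values.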


What test has the following question: Is there a significant correlation/relationship between the two variables?
Answer: Correlation Analysis
Scatter plot: A graph of the ordered pairs (x, y) of numbers consisting of the independent variable x and the dependent variable y.
Correlation Coefficient: A measure used by statisticians to determine the strength of the linear relationship between two variables.
Provide a conclusion for the following data: Math and Science grades of Grade 11 students
(No significant outliers) (Linear relationship exists)
Shapiro-Wilk: Math Grade p = 0.41, Science Grade p = 0.77
Pearson's r: r = 0.691, p = 0.043
Spearman's rho: ρ = -0.093, p = 0.751
Answer: There is a significant and strong positive correlation between the Math and Science grades of Grade 11 students. (r = 0.691, p = 0.043)

Provide a conclusion for the following data: Math and Science grades of Grade 11 students
(Significant outlier/s exist) (No linear relationship)
Shapiro-Wilk: Math Grade p = 0.051, Science Grade p = 0.349
Pearson's r: r = -0.116, p = 0.049
Spearman's rho: ρ = -0.127, p = 0.475
Answer: There is no significant correlation between the Math and Science grades of Grade 11 students. (ρ = -0.127, p = 0.475)
What is the strength and direction of the correlation for the value r = 0.27?
Answer: Positive weak correlation
Population Correlation Coefficient (ρ): The correlation computed by using all the possible pairs of the data values (x, y) taken from a population.
Linear Correlation Coefficient (r): The correlation computed from the sample data, measuring the strength and direction of a linear relationship between two quantitative variables.
How is linearity of samples checked?
Answer: Scatter Plot
How are outliers identified?
Answer: Boxplot
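The two coefficients used in the correlation cards can be computed from first principles: Pearson's r from deviations about the means, and Spearman's rho as Pearson's r applied to ranks. This is a minimal sketch with hypothetical function names. The strength cutoffs in `describe` (0.3 and 0.6) are assumed values chosen only so that the labels agree with the two examples in this reviewer (0.27 weak, 0.691 strong); other textbooks use different cutoffs.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample linear correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson_r(average_ranks(x), average_ranks(y))


def describe(r, weak=0.3, strong=0.6):
    """Label direction and strength; the cutoffs are assumed, not from the reviewer."""
    if r == 0:
        return "No correlation"
    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)
    strength = "weak" if size < weak else "moderate" if size < strong else "strong"
    return f"{direction} {strength} correlation"


print(describe(0.27))   # Positive weak correlation
print(describe(0.691))  # Positive strong correlation
```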
What test of difference has the following question: Is there a significant difference between the (dependent variable) among the (independent variable)?
Answer: Testing Differences Among Groups
Construct the alternative hypothesis for testing differences among groups.
Answer: At least one group is different from the others.
How is the specific group (or groups) that differs identified for ANOVA?
Answer: Post Hoc Test
How is the specific group (or groups) that differs identified for the Kruskal-Wallis test?
Answer: Dwass-Steel-Critchlow-Fligner (DSCF) pairwise comparisons
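Determine the statistical test used for the following data: Hours of activity on social media platforms
Shapiro-Wilk: Facebook p = 0.015, Tiktok p = 0.439, Youtube p = 0.098
Levene's: p = 0.254
Answer: ANOVA
Note: Facebook's Shapiro-Wilk p = 0.015 is below 0.05, so normality is doubtful for that group; under the rule applied in the two-group cards this would point to Kruskal-Wallis instead.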
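
Determine the statistical test used for the following data: Hours of activity on social media platforms
Shapiro-Wilk: Facebook p = 0.135, Tiktok p = 0.790, Youtube p = 0.284
Levene's: p = 0.034
Answer: Kruskal-Wallis
Note: all three groups pass the Shapiro-Wilk check and only Levene's test fails; a Welch-type ANOVA is the usual parametric alternative when only the equal-variance assumption is violated.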
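To make the ANOVA question concrete, the sketch below computes the one-way ANOVA F statistic by splitting the total variation into between-group and within-group sums of squares. The function name and the hours-per-day figures are hypothetical. The p-value is omitted because the standard library has no F distribution.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic with (between, within) degrees of freedom for one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical hours of activity per day on three platforms.
facebook = [2, 3, 4]
tiktok = [4, 5, 6]
youtube = [6, 7, 8]
print(one_way_anova_f([facebook, tiktok, youtube]))  # (12.0, 2, 6)
```

A large F suggests that at least one group mean differs from the others; a post hoc test is then needed to identify which one.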
Simple Linear Regression: This predicts the value of a dependent variable based on the value of the independent variable.
Regression Line: The straight line that represents the relationship between the independent and dependent variables in the regression model.
How is the regression line obtained?
Answer: By minimizing the errors, that is, the sum of the squared differences between the observed data points and the corresponding predicted values along the line.
Regression Equation: The equation or model that describes the relationship between the independent and dependent variables.
Residual: The differences between the observed values of the dependent variable and the values predicted by the regression equation.
Correlation Coefficient (R): This measures the strength of the linear relationship between two variables, indicating how well the independent variable(s) predict the dependent variable.
Coefficient of Determination (R²): This represents the proportion of the variance in the dependent variable that is explained by the independent variable(s).
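The regression line, residuals, and R² defined above fit together as follows. This is a minimal least-squares sketch for one predictor; the function names and the five data points are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus predicted value for each data point."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def r_squared(x, y, intercept, slope):
    """Proportion of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum(e ** 2 for e in residuals(x, y, intercept, slope))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = fit_line(x, y)
print(b0, b1)                   # intercept close to 2.2, slope close to 0.6
print(r_squared(x, y, b0, b1))  # close to 0.6
```

For a single predictor, R² equals the square of Pearson's r between x and y.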
Multiple Linear Regression: This predicts the value of a dependent variable based on the value of two or more independent variables.
What is the inferential question for Multiple Linear Regression?
Answer: Do the independent variables significantly predict the dependent variable?
Determine the regression equation for the following data: College entrance exam scores based on Math grade, Science grade, English grade, and Level of College Readiness

Predictor                     Estimate    p
Intercept                     -1.015      0.994
Math grade                     2.015      0.005
Science grade                  0.326      0.663
English grade                 -0.886      0.210
Level of College Readiness     4.465      0.071

Answer: College Entrance Exam Score = -1.015 + 2.015(Math grade) + 0.326(Science grade) - 0.886(English grade) + 4.465(Level of College Readiness)
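Once the equation is written down, a prediction is just a weighted sum of the predictor values. The sketch below uses the estimates and p-values from the table above; the function names, the input values (grades of 90, 85, 88 and a readiness level of 3), and the `alpha = 0.05` cutoff are hypothetical.

```python
# Estimates and p-values taken from the regression table in the card above.
ESTIMATES = {
    "Intercept": -1.015,
    "Math grade": 2.015,
    "Science grade": 0.326,
    "English grade": -0.886,
    "Level of College Readiness": 4.465,
}
P_VALUES = {
    "Intercept": 0.994,
    "Math grade": 0.005,
    "Science grade": 0.663,
    "English grade": 0.210,
    "Level of College Readiness": 0.071,
}


def predict_score(values):
    """Apply the regression equation to a dict of predictor values."""
    score = ESTIMATES["Intercept"]
    for name, value in values.items():
        score += ESTIMATES[name] * value
    return score


def significant_predictors(alpha=0.05):
    """Predictors (excluding the intercept) whose p-value is at or below alpha."""
    return [name for name, p in P_VALUES.items()
            if name != "Intercept" and p <= alpha]


# Hypothetical student.
student = {
    "Math grade": 90,
    "Science grade": 85,
    "English grade": 88,
    "Level of College Readiness": 3,
}
print(round(predict_score(student), 3))  # 143.472
print(significant_predictors())          # ['Math grade']
```

At the assumed α of 0.05, only Math grade is a significant predictor, even though the equation still includes all four estimates.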
