MAS202Group1 Group-Assignment
Table of Contents
Part 1: Introduction
Part 2: Collecting sample data
Part 3: Descriptive statistics
1. Classified by gender
Part 4: Inferential statistics
Problem 1: Test a hypothesis that the average first year GPA is less than 3 at 0.05 level of significance
Problem 2: Construct a 95% confidence interval for the average first-year GPA
Problem 3: Test the hypothesis that the proportion of good students is 20% at a 0.05 level of significance
Problem 4: Construct a 95% confidence interval for the proportion of good students
Problem 5: Test the hypothesis that there is a difference in the average first year GPA between female and male students at 0.05 level of significance
Problem 6: Test the hypothesis that there is a difference in the proportion of good students between female and male students at 0.05 level of significance
Problem 7: Use simple linear regression to predict first year GPA based on high school GPA
Part 1: Introduction
The transition from school to university presents a formidable challenge for many students. This
shift in academic rigor can be attributed to several key factors. Chief among these is that university
courses are characterized by a greater depth of knowledge and a faster pace of learning compared
to school. This demands a higher level of critical thinking, independent research, and analytical
skills, all of which freshmen may severely lack. Another contributing factor is the newfound
independence that comes with university life. Unlike school, where there is often a structured
timetable and close monitoring by teachers and parents, university students are expected to manage
their own time, which can be a daunting task for those unaccustomed to this level of autonomy.
GPA stands for Grade Point Average. It is a numerical value that represents a student's average
academic performance.
GPA plays a significant role in assessing a student's academic performance and can have various
implications depending on the educational level. In high school, GPA is often used to determine
eligibility for graduation and to evaluate a student's readiness for college admissions.
For college admissions, a high school GPA is an essential factor considered by admissions
committees. It provides insight into a student's academic abilities and serves as a measure of their
consistency and dedication to their studies. A strong high school GPA can enhance the chances of
gaining admission to competitive colleges and universities.
Understanding the significance of GPA in both high school and college is important for students
aiming to maximize their academic success and future prospects.
The purpose of this study was to assess the influence of high school GPA and gender on the
first-year GPA of freshman students.
Part 2: Collecting sample data
• The variables used are:
  sex: gender of the student.
  hs_gpa: high school grade point average.
  fy_gpa: first-year (college) grade point average.
• Gender is a categorical variable, while high school GPA and first-year college GPA
are continuous variables. The dependent variable is first-year college GPA. The
independent variable studied is high school GPA. We chose these two variables because
we are curious about the relationship between them.
Part 3: Descriptive statistics
1. Classified by gender
[Pie chart "Gender": Male 52%, Female 48%]
The data sample has 515 male students and 484 female students.
2. Numerical descriptive measures for high school GPA
hs_gpa
Mean 3.20
Sample Variance 0.291975902
Kurtosis -0.908886554
Skewness -0.180353821
Range 2.2
Minimum 1.8
Maximum 4
Sum 3193.6
Count 999
The minimum high school GPA is 1.8, the maximum is 4. The range is 2.2.
The average high school GPA is 3.2 with a standard deviation of 0.54.
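As a quick check, the mean and standard deviation quoted above can be recomputed from the Sum, Count, and Sample Variance reported in the table. This is a minimal Python sketch; the variable names are illustrative, and only the numeric inputs come from the summary table.

```python
from math import sqrt

# Values copied from the hs_gpa summary table in this report
hs_sum = 3193.6
hs_count = 999
hs_variance = 0.291975902      # sample variance
hs_min, hs_max = 1.8, 4.0

hs_mean = hs_sum / hs_count    # mean = sum / count
hs_std = sqrt(hs_variance)     # standard deviation = square root of the variance
hs_range = hs_max - hs_min     # range = maximum - minimum

print(f"mean = {hs_mean:.2f}, std = {hs_std:.2f}, range = {hs_range:.1f}")
```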
The column chart shows that the largest groups of students have high school GPAs in the
ranges 3.6-3.8 and 2.6-3.
The boxplot shows that the data have no outliers.
3. Numerical descriptive measures for first year GPA
fy_gpa
Mean 2.47
Standard Error 0.023411424
Median 2.47
Mode 2.24
Standard Deviation 0.74
Sample Variance 0.547546689
Kurtosis -0.183197715
Skewness -0.216848085
Range 4
Minimum 0
Maximum 4
Sum 2466.82
Count 999
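Part 4: Inferential statistics
Problem 1: Test a hypothesis that the average first year GPA is less than 3 at 0.05 level of
significance.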
• H0: The population average first year GPA is at least 3.
• H1: The population average first year GPA is less than 3.
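• Sample data (from the fy_gpa summary in Part 3):
Sample mean 2.47
Sample standard deviation 0.74
Sample size 999
Level of significance 0.05
• Implementation:
$H_0: \mu \geq 3$
$H_1: \mu < 3$
Critical value: -1.65
Thus, reject H0 if test statistic < -1.65.
Test statistic: $\dfrac{\bar{x} - \mu}{S / \sqrt{n}} = -22.67$
Since -22.67 < -1.65, we reject H0.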
• Conclusion: There is sufficient evidence that the average first-year GPA is less than 3 at a
0.05 level of significance.
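The left-tailed test in Problem 1 can be reproduced from the summary statistics in Part 3. The sketch below is a minimal Python check, not the procedure used to produce the report; it assumes a normal (Z) approximation for the critical value, which is reasonable for n = 999, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary values copied from the fy_gpa table in Part 3
n = 999
x_bar = 2466.82 / n            # sample mean = Sum / Count
s = sqrt(0.547546689)          # sample standard deviation = sqrt(Sample Variance)
mu_0 = 3.0                     # hypothesized mean under H0
alpha = 0.05

std_error = s / sqrt(n)
test_stat = (x_bar - mu_0) / std_error

# Assumption: normal approximation for the lower-tail critical value
critical_value = NormalDist().inv_cdf(alpha)

reject_h0 = test_stat < critical_value
print(f"test statistic = {test_stat:.2f}, critical value = {critical_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The exact normal critical value is about -1.645, which the report rounds to -1.65; the decision is the same either way.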
Problem 2: Construct a 95% confidence interval for the average first-year GPA.
• Implementation:
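Critical value: 1.96
The confidence interval is
$\bar{x} - 1.96 \cdot \dfrac{S}{\sqrt{n}} \leq \mu \leq \bar{x} + 1.96 \cdot \dfrac{S}{\sqrt{n}}$
$2.42 \leq \mu \leq 2.52$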
• Conclusion: We are 95% confident that the average first-year GPA is between 2.42 and
2.52.
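The interval in Problem 2 can be checked with a few lines of Python. This is a sketch under the assumption that the normal critical value (about 1.96) is used; the inputs are the Sum, Count, and Sample Variance from the fy_gpa table, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary values copied from the fy_gpa table in Part 3
n = 999
x_bar = 2466.82 / n            # sample mean
s = sqrt(0.547546689)          # sample standard deviation
confidence = 0.95

# Assumption: normal approximation, giving a critical value of about 1.96
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * s / sqrt(n)       # margin of error

lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```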
Problem 3: Test the hypothesis that the proportion of good students is 20% at a 0.05 level of
significance.
• Parameter to be estimated: The population proportion of students who have first-year GPA
at least 3.2
• H0: The population proportion of students who have first year GPA at least 3.2 is 20%.
• H1: The population proportion of students who have first year GPA at least 3.2 is different
from 20%.
• Sample data:
Number of students who have first year GPA at least 3.2    176
Proportion of students who have first year GPA at least 3.2    0.18
Level of significance    0.05
• Implementation:
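$H_0: \pi = 20\%$
$H_1: \pi \neq 20\%$
Critical value: $\pm Z_{\alpha/2} = \pm Z_{0.025} = \pm 1.96$
Thus, reject H0 if test statistic < -1.96 or test statistic > 1.96.
Test statistic: $Z_{stat} = \dfrac{p - \pi}{\sqrt{\pi (1 - \pi) / n}} = \dfrac{176/999 - 0.20}{\sqrt{0.20 \times 0.80 / 999}} = -1.88$
Since -1.96 < -1.88 < 1.96, we do not reject H0.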
• Conclusion: There is insufficient evidence that the proportion of good students differs from
20% at the 0.05 level of significance.
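The two-tailed proportion test in Problem 3 can be sketched in Python as follows. The counts come from the sample data above; the variable names and the extra p-value calculation are illustrative additions, not part of the original Excel work.

```python
from math import sqrt
from statistics import NormalDist

x = 176          # students with first year GPA of at least 3.2
n = 999          # sample size
pi_0 = 0.20      # hypothesized proportion under H0
alpha = 0.05

p = x / n                                   # sample proportion (unrounded)
std_error = sqrt(pi_0 * (1 - pi_0) / n)     # standard error under H0
z_stat = (p - pi_0) / std_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)            # about 1.96
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))       # two-tailed p-value

reject_h0 = abs(z_stat) > z_crit
print(f"Z = {z_stat:.2f}, critical value = +/-{z_crit:.2f}, p-value = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Note that the unrounded proportion 176/999 is needed to reproduce the test statistic of -1.88; substituting the rounded value 0.18 gives a noticeably different result.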
Problem 4: Construct a 95% confidence interval for the proportion of good students.
• Parameter to be estimated: The population proportion of students who have first year GPA
at least 3.2
• Sample data:
Number of students who have first year GPA at least 3.2 176
Proportion of students who have first year GPA at least 3.2 0.18
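• Implementation:
$Z_{\alpha/2} = Z_{0.025} = 1.96$
$p - Z_{\alpha/2} \sqrt{\dfrac{p (1 - p)}{n}} \leq \pi \leq p + Z_{\alpha/2} \sqrt{\dfrac{p (1 - p)}{n}}$
$0.18 - 1.96 \sqrt{\dfrac{0.18 (1 - 0.18)}{999}} \leq \pi \leq 0.18 + 1.96 \sqrt{\dfrac{0.18 (1 - 0.18)}{999}}$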
• Conclusion: We are 95% confident that the proportion of good students is between 15%
and 20%.
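A minimal Python sketch of the interval in Problem 4 is given below. It uses the counts from the sample data; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x = 176          # students with first year GPA of at least 3.2
n = 999          # sample size
confidence = 0.95

p = x / n                                            # sample proportion (unrounded)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z * sqrt(p * (1 - p) / n)                   # margin of error

lower, upper = p - margin, p + margin
print(f"{lower:.2%} <= pi <= {upper:.2%}")
```

Using the unrounded proportion 176/999 rather than 0.18 is what reproduces the lower bound of 15%; with 0.18 the lower bound is about 0.156.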
Problem 5: Test the hypothesis that there is a difference in the average first year GPA between
female and male students at 0.05 level of significance.
• Parameter to be estimated: The difference in the average first year GPA between female
and male students.
• H0: There is no difference in the average first year GPA between female and male students.
• H1: There is a significant difference in the average first year GPA between female and
male students.
• Sample data:
Female Male
Mean 2.544587 2.398524
Variance 0.576608 0.510947
Observations 484 515
• Excel output:
                                Female      Male
Mean                            2.544587    2.398524
Variance                        0.576608    0.510947
Observations                    484         515
Pooled Variance                 0.542757
Hypothesized Mean Difference    0
df                              997
t Stat                          3.131697
P(T<=t) one-tail                0.000894
t Critical one-tail             1.646383
P(T<=t) two-tail                0.001789
t Critical two-tail             1.962346
• Implementation:
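$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \neq 0$
Critical value: ±1.96
Thus, reject H0 if test statistic < -1.96 or test statistic > 1.96.
Test statistic: t Stat = 3.13 (from the Excel output)
Since 3.13 > 1.96, we reject H0.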
• Conclusion: There is sufficient evidence that there is a difference in the average first year
GPA between female students and male students at 0.05 level of significance.
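The pooled variance, degrees of freedom, and t statistic in the Excel output can be recomputed from the three summary rows (mean, variance, observations). The Python sketch below assumes equal population variances, as the "Pooled Variance" row implies; the critical value is copied from the Excel output rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Summary values copied from the sample data table
mean_f, var_f, n_f = 2.544587, 0.576608, 484   # female students
mean_m, var_m, n_m = 2.398524, 0.510947, 515   # male students
t_crit = 1.962346   # "t Critical two-tail" copied from the Excel output

df = n_f + n_m - 2
# Pooled variance: variances weighted by their degrees of freedom
pooled_var = ((n_f - 1) * var_f + (n_m - 1) * var_m) / df
std_error = sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_stat = (mean_f - mean_m) / std_error

reject_h0 = abs(t_stat) > t_crit
print(f"df = {df}, pooled variance = {pooled_var:.6f}, t = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```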
Problem 6: Test the hypothesis that there is a difference in the proportion of good students between
female and male students at 0.05 level of significance.
• H0: There is no difference in the proportion of good students between female and male
students.
• H1: There is a significant difference in the proportion of good students between female and
male students.
• Sample data:
First year GPA Female Male Total
>= 3.2 102 74 176
< 3.2 382 441 823
Total 484 515 999
• Implementation:
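$H_0: \pi_1 - \pi_2 = 0$
$H_1: \pi_1 - \pi_2 \neq 0$
Critical value: $\pm Z_{\alpha/2} = \pm Z_{0.025} = \pm 1.96$
Thus, reject H0 if test statistic < -1.96 or test statistic > 1.96.
Test statistic:
$\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{102 + 74}{484 + 515} = 0.18$
$p_1 = \dfrac{X_1}{n_1} = \dfrac{102}{484} = 0.21$
$p_2 = \dfrac{X_2}{n_2} = \dfrac{74}{515} = 0.14$
$Z_{stat} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p} (1 - \bar{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = 2.78$
Since 2.78 > 1.96, we reject H0.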
• Conclusion: There is sufficient evidence that there is a difference in the proportion of good
students between female and male students at 0.05 level of significance.
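The two-proportion Z test in Problem 6 can be reproduced from the counts in the sample data table. This is a minimal Python sketch using the pooled proportion under H0; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts copied from the sample data table (first year GPA >= 3.2)
x_f, n_f = 102, 484    # female students
x_m, n_m = 74, 515     # male students
alpha = 0.05

p_f = x_f / n_f
p_m = x_m / n_m
p_pooled = (x_f + x_m) / (n_f + n_m)   # pooled proportion under H0

std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n_f + 1 / n_m))
z_stat = (p_f - p_m) / std_error

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
reject_h0 = abs(z_stat) > z_crit
print(f"p_f = {p_f:.2f}, p_m = {p_m:.2f}, pooled = {p_pooled:.2f}, Z = {z_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```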
Problem 7: Use simple linear regression to predict first year GPA based on high school GPA.
$\hat{Y} = 0.06 + 0.75 X$
b0 = 0.06
Meaning: Because a student cannot have a high school GPA of 0, b0 has no practical
interpretation.
b1 = 0.75
Meaning: The mean value of first year GPA increases by 0.75, on average, for each
additional 1 point of high school GPA.
We can predict the first year GPA for a student with high school GPA of 3.5 by substituting
X = 3.5 into the regression equation (using the rounded coefficients):
$\hat{Y} = 0.06 + 0.75 \times 3.5 = 2.685$
R^2 = 30.27%
Meaning: 30.27% of the total variation in the first-year GPA is explained by variation in
the high school GPA.
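The fitted line can be turned into a small prediction helper. The Python sketch below uses the rounded coefficients reported above (b0 = 0.06, b1 = 0.75), so its predictions are approximate; the function name `predict_fy_gpa` is hypothetical. It also derives the sample correlation coefficient from R^2, taking the positive root because the slope is positive.

```python
from math import sqrt

B0 = 0.06          # intercept, rounded value from the report
B1 = 0.75          # slope, rounded value from the report
R_SQUARED = 0.3027 # coefficient of determination from the report

def predict_fy_gpa(hs_gpa: float) -> float:
    """Predict first year GPA from high school GPA using the fitted line."""
    return B0 + B1 * hs_gpa

prediction = predict_fy_gpa(3.5)

# For simple linear regression, r has the same sign as the slope
r = sqrt(R_SQUARED) if B1 >= 0 else -sqrt(R_SQUARED)

print(f"predicted first year GPA at hs_gpa = 3.5: {prediction:.3f}")
print(f"correlation coefficient r = {r:.2f}")
```

Predictions are only meaningful for high school GPAs inside the observed range of 1.8 to 4.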
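• Test a hypothesis that a linear relationship exists between the high school GPA and first-year
GPA at a 0.05 significance level.
$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)
alpha = 0.05
Thus, reject H0 if p-value < 0.05.
p-value = 0.00
Since p-value < 0.05, we reject H0.
Conclusion: There is sufficient evidence that a linear relationship exists between high school
GPA and first year GPA at a 0.05 significance level.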
Part 5: Conclusion
There is a difference in the average first year GPA between female students and male students.
There is a difference in the proportion of good students between female and male students.
Because high school GPA is positively related to first year GPA, students should build good
study habits from high school onward. Those habits help them achieve high scores at higher
levels of education.
You will learn how to collect data, process raw data, and turn it into useful information.
Statistical analysis and graphing help you better understand data and find trends and relationships.
Hypothesis testing helps you determine whether there is a difference between groups or parameters.
During a project, you may have to work with other teammates to collect data or discuss results.