Math Data Analysis (1)

Uploaded by

unghow0516
Copyright
© © All Rights Reserved

Part 1: Topic Selection and Hypothesis

Topic: The relationship between study time and math test scores among high school students.

Hypothesis:
Students who dedicate more hours to studying for math assessments will achieve higher
scores, indicating that increased study time is positively correlated with better academic
performance. Specifically, I expect that students who study at least 3 hours daily will score
consistently above the median, suggesting that a structured study routine significantly enhances
math achievement.

Part 2: Data Collection and Collation


Data Collection: To test this hypothesis, I surveyed 20 secondary school students, asking how
many hours they study each day and what they scored on a math test marked out of 8. Each
student’s study time was recorded alongside their score to observe how study time relates to
math performance.

Raw Data Table:


This table lists each student's daily study time and corresponding math score. It organizes raw
data clearly, allowing for initial observations and setting up for statistical calculations. The table
also enables a straightforward comparison of study hours with performance, directly supporting
the hypothesis of a correlation between study time and scores.

Student number   Study time (hours/day)   Score (out of 8)
1                1.5                      5
2                3                        6
3                2                        6
4                4                        7
5                1                        5
6                2.5                      6
7                3.5                      7
8                1.2                      5
9                2                        5
10               4                        7
11               1                        4
12               3                        6
13               2                        6
14               5                        7
15               4.5                      7
16               1                        5
17               2.7                      6
18               3                        6
19               3.5                      7
20               4                        7
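
As a first look at the hypothesis, the raw table can be split at the 3-hour mark it names. The sketch below is only an illustration: the names `records` and `THRESHOLD_HOURS` are my own, and the pairs are transcribed from the table above.

```python
from statistics import mean, median

# (study hours per day, score out of 8) for students 1-20,
# transcribed from the raw data table.
records = [
    (1.5, 5), (3, 6), (2, 6), (4, 7), (1, 5),
    (2.5, 6), (3.5, 7), (1.2, 5), (2, 5), (4, 7),
    (1, 4), (3, 6), (2, 6), (5, 7), (4.5, 7),
    (1, 5), (2.7, 6), (3, 6), (3.5, 7), (4, 7),
]

THRESHOLD_HOURS = 3  # cut-off named in the hypothesis

overall_median = median(score for _, score in records)
long_study = [score for hours, score in records if hours >= THRESHOLD_HOURS]
short_study = [score for hours, score in records if hours < THRESHOLD_HOURS]

print("overall median score:", overall_median)
print("mean score, 3 hours or more:", mean(long_study))
print("mean score, under 3 hours:", mean(short_study))
print("lowest score, 3 hours or more:", min(long_study))
```

In this sample the students who study 3 hours or more all score at least the overall median of 6, and those who study 3.5 hours or more all score 7, so the data lean in the direction the hypothesis predicts.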

Processed table:
This table shows the calculation of the mean study time and each data point’s deviation from the
mean. By including these deviations, it highlights how individual study times vary from the
average, setting the stage for calculating the standard deviation. This measure of spread
indicates how consistent students' study habits are, which is relevant to understanding whether
a more regular study routine correlates with better performance.

Study time x (hours)   Mean μ (hours)   x - μ    (x - μ)²
1                      2.72             -1.72    2.9584
1                      2.72             -1.72    2.9584
1                      2.72             -1.72    2.9584
1.2                    2.72             -1.52    2.3104
1.5                    2.72             -1.22    1.4884
2                      2.72             -0.72    0.5184
2                      2.72             -0.72    0.5184
2                      2.72             -0.72    0.5184
2.5                    2.72             -0.22    0.0484
2.7                    2.72             -0.02    0.0004
3                      2.72             0.28     0.0784
3                      2.72             0.28     0.0784
3                      2.72             0.28     0.0784
3.5                    2.72             0.78     0.6084
3.5                    2.72             0.78     0.6084
4                      2.72             1.28     1.6384
4                      2.72             1.28     1.6384
4                      2.72             1.28     1.6384
4.5                    2.72             1.78     3.1684
5                      2.72             2.28     5.1984

Total: n = 20, Σx = 54.4, Mean μ = 54.4 ÷ 20 = 2.72 hours, Σ(x - μ)² = 29.012
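
The columns of the processed table can be reproduced with a short script. This is a sketch under one assumption: the standard deviation divides by n (the population formula), since the report does not say whether n or n − 1 was used. The variable names are illustrative.

```python
from math import sqrt

# Daily study times in hours, sorted as in the processed table.
study_hours = [1, 1, 1, 1.2, 1.5, 2, 2, 2, 2.5, 2.7,
               3, 3, 3, 3.5, 3.5, 4, 4, 4, 4.5, 5]

n = len(study_hours)
mean_hours = sum(study_hours) / n
deviations = [x - mean_hours for x in study_hours]   # x - mean
squared = [d ** 2 for d in deviations]               # (x - mean) squared
sum_sq = sum(squared)
population_sd = sqrt(sum_sq / n)                     # assumption: divide by n

for x, d, s in zip(study_hours, deviations, squared):
    print(f"{x:>4}  {d:+.2f}  {s:.4f}")
print(f"mean = {mean_hours:.2f}, sum of squares = {sum_sq:.3f}, "
      f"SD = {population_sd:.2f}")
```

Dividing by n − 1 instead would give a slightly larger sample standard deviation.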


The histogram visually represents the distribution of scores, revealing patterns at a glance. It
shows that most students score in the upper range (6 and 7), with a short tail toward the lower
scores. This graphical representation is consistent with the hypothesis, although the histogram
shows scores only: linking the high scores to longer study time requires the paired data in the
raw data table.
The first (top) box-and-whisker plot visually represents the spread and central tendency of
students' test scores, highlighting key statistics like the minimum, lower quartile (Q1), median
(Q2), upper quartile (Q3), and maximum scores. This plot reveals that most scores are clustered
between 6 and 7, indicating a narrow range and low variability in performance. The
concentration of high scores (close to the maximum) suggests that students generally perform
well, supporting the hypothesis that consistent study time is associated with higher test scores.
The plot effectively illustrates the consistency of high performance within the group, aligning with
the expectation of a positive correlation between study time and test results.

The second box-and-whisker plot provides a summary of the distribution of study hours,
displaying key data points like the minimum, first quartile, median, third quartile, and maximum
study times. This plot helps visualize the range and variability of study habits among students.
The clustering around the median suggests that most students study for a moderate amount of
time, with no extreme values (outliers) in study duration. This distribution supports the
hypothesis by showing that most students follow a consistent study routine, which may be linked
to higher performance scores, as seen in the score distribution.
Data Interpretation

Interpretation of the Frequency Histogram

1. Shape and Distribution:


○ The histogram shows a left-skewed (negatively skewed) distribution: most of the scores
are concentrated at the higher end (6 and 7), with a short tail toward the lower scores.
○ This skewness suggests that students generally perform well, with fewer students
achieving scores at the lower end of the scale.
2. Mode and Peaks:
○ The peak of the histogram is at scores 6 and 7, indicating these are the most
frequent scores, which implies strong performance consistency across students.
○ This peak at the higher end shows a positive trend in academic achievement
among the group.
3. Spread and Variation:
○ The range of scores (4 to 7) and the clustering of scores near the upper end
imply low variation, meaning that there isn’t much difference in students'
performance.
○ The histogram confirms that most students score at or above the mean of 6 (14 of 20
students), which aligns with the small standard deviation of about 0.89, indicating that
scores are tightly grouped around the mean.
4. Insights on Academic Performance:
○ This histogram implies that the majority of students are consistently scoring at a
high level, suggesting either effective study habits, supportive learning
environments, or a combination of both.
○ The concentration of high scores (6 and 7) suggests that students in this dataset
generally meet or exceed performance expectations.
5. Relation to Hypothesis:
○ Given that higher study times are hypothesised to correlate with higher scores,
this histogram is consistent with that idea: the sample as a whole performs well,
although the histogram alone does not tie individual scores to study time.

In conclusion, the histogram provides visual evidence of a negative skew in performance scores,
reinforcing the analysis that the students in the sample tend to score well, with limited variation
in their performance. This analysis strengthens the hypothesis that study time may contribute to
achieving higher scores in assessments.
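
The direction of the skew can be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), which is one common definition and my own choice here. A negative value means the tail points toward the lower scores.

```python
from collections import Counter

# Scores out of 8 for students 1-20, from the raw data table.
scores = [5, 6, 6, 7, 5, 6, 7, 5, 5, 7, 4, 6, 6, 7, 7, 5, 6, 6, 7, 7]

n = len(scores)
mean_score = sum(scores) / n
m2 = sum((s - mean_score) ** 2 for s in scores) / n  # second central moment
m3 = sum((s - mean_score) ** 3 for s in scores) / n  # third central moment
skewness = m3 / m2 ** 1.5

# Text version of the histogram: one star per student.
for value, count in sorted(Counter(scores).items()):
    print(value, "*" * count)
print(f"skewness = {skewness:.3f}")
```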
Box-and-Whisker Plot:

1. Mean: 2.72 hours


This mean study time (54.4 total hours ÷ 20 students) indicates a central point around which
students’ study times are distributed.

2. Median:
○ The median of study time is 2.85 hours (midway between the 10th and 11th ordered
values, 2.7 and 3), showing that half of the students study less than this and the
other half study more.
3. Mode:
○ There is no single mode for study time: 1, 2, 3, and 4 hours each occur three times,
so the data are multimodal.
4. Range:
○ The range for study times is 4 hours, calculated as Max − Min = 5 − 1 = 4.
5. Interquartile Range (IQR):
○ The IQR, representing the middle 50% of the data, is 1.75 hours (Q3 − Q1 =
3.625 − 1.875, with linearly interpolated quartiles), reflecting moderate variability
around the median.
6. Standard Deviation:
○ 1.20 hours (population standard deviation, √(29.012 ÷ 20)).
○ This standard deviation suggests that study times are moderately spread around
the mean.
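
The measures listed above can be recomputed with Python's standard `statistics` module. Two choices in this sketch are assumptions rather than something the report states: quartiles use linear interpolation (`method="inclusive"`), which is the convention that gives an IQR of 1.75 hours, and the standard deviation is the population form (`pstdev`).

```python
from statistics import mean, median, multimode, pstdev, quantiles

# Daily study times in hours for the 20 students.
study_hours = [1.5, 3, 2, 4, 1, 2.5, 3.5, 1.2, 2, 4,
               1, 3, 2, 5, 4.5, 1, 2.7, 3, 3.5, 4]

# Assumption: linearly interpolated quartiles ("inclusive" method).
q1, q2, q3 = quantiles(study_hours, n=4, method="inclusive")

summary = {
    "mean": mean(study_hours),
    "median": median(study_hours),
    "modes": sorted(multimode(study_hours)),  # every value tied for most frequent
    "range": max(study_hours) - min(study_hours),
    "q1": q1,
    "q3": q3,
    "iqr": q3 - q1,
    "sd": pstdev(study_hours),                # assumption: population SD
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Other quartile conventions (for example taking the medians of the lower and upper halves) give Q1 = 1.75 and Q3 = 3.75, so the IQR depends on the method chosen.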

For the scores out of 8, a box-and-whisker plot and a frequency distribution table will effectively
represent the data spread and central tendency.

Box-and-Whisker Plot:

○ The plot highlights the minimum, Q1, median (Q2), Q3, and maximum values for
scores:
■ Minimum (Min): 4
■ First Quartile (Q1): 5
■ Median (Q2): 6
■ Third Quartile (Q3): 7
■ Maximum (Max): 7
○ This box-and-whisker plot demonstrates a concentration of scores around 6 and
7, indicating that most students scored well.

Frequency Distribution Table:

○ This table breaks down the frequency of each score, allowing a detailed look at
score distribution. The cumulative frequency gives the number of students scoring
at or below each level (for example, 13 of 20 students scored 6 or less). This table
complements the histogram by adding specific numerical detail to the observed
concentration of scores at 6 and 7.

Score (x)   Upper bound   Frequency (f)   Cumulative frequency   Mean (μ)   x - μ   f(x - μ)²
4           4.5           1               1                      6          -2      4
5           5.5           5               6                      6          -1      5
6           6.5           7               13                     6          0       0
7           7.5           7               20                     6          1       7

Total: Σf = 20, Σfx = 120, Mean μ = 120 ÷ 20 = 6, Σf(x - μ)² = 16
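
The frequency table can be rebuilt directly from the 20 raw scores. In this sketch the names are illustrative and the standard deviation again assumes the population formula (dividing by the total frequency).

```python
from collections import Counter
from itertools import accumulate
from math import sqrt

# Scores out of 8 for students 1-20, from the raw data table.
scores = [5, 6, 6, 7, 5, 6, 7, 5, 5, 7, 4, 6, 6, 7, 7, 5, 6, 6, 7, 7]

freq = Counter(scores)
values = sorted(freq)
cumulative = list(accumulate(freq[v] for v in values))

n = sum(freq.values())
mean_score = sum(v * freq[v] for v in values) / n       # sum of f*x over sum of f
weighted_sq = [freq[v] * (v - mean_score) ** 2 for v in values]
population_sd = sqrt(sum(weighted_sq) / n)              # assumption: divide by n

print("score  f  cum_f  f(x-mean)^2")
for v, cf, w in zip(values, cumulative, weighted_sq):
    print(f"{v:>5}  {freq[v]}  {cf:>5}  {w:.0f}")
print(f"mean = {mean_score}, SD = {population_sd:.3f}")
```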

Data Analysis (Measures of Central Tendency and Spread) for Scores

1. Mean:
○ The mean (average) score is 6 (120 total marks ÷ 20 students).
○ This mean indicates that the central tendency of scores is around 6 on an
8-point scale.
2. Median:
○ The median (middle score) is 6: the 10th and 11th ordered scores are both 6, so at
least half of the students scored 6 or below and at least half scored 6 or above.
3. Mode:
○ The most frequent scores are 6 and 7, each achieved by 7 students, so the
distribution is bimodal with a clustering of higher scores.
4. Range:
○ The range of scores is: Range = Max − Min = 7 − 4 = 3
○ This narrow range reflects limited variability in scores, with all students scoring
between 4 and 7.
5. Interquartile Range (IQR):
○ The IQR, which measures the middle 50% of scores, is Q3 − Q1 = 7 − 5 = 2.
○ The middle 50% of scores therefore lie between 5 and 7, within two marks of
each other.
6. Standard Deviation:
○ The standard deviation is approximately 0.894 (√(16 ÷ 20)).
○ A standard deviation of 0.894 shows limited dispersion from the mean, meaning
students' scores are consistently close to each other, supporting the observation
of high academic performance consistency.
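
The five-number summary and modes for the scores can be verified in a few lines. The quartile method in this sketch (`method="inclusive"`) is an assumption. For these data the common quartile conventions agree, because the values on either side of each quartile position are equal.

```python
from statistics import median, multimode, quantiles

# Scores out of 8 for students 1-20, from the raw data table.
scores = [5, 6, 6, 7, 5, 6, 7, 5, 5, 7, 4, 6, 6, 7, 7, 5, 6, 6, 7, 7]

q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
five_number = (min(scores), q1, median(scores), q3, max(scores))
modes = sorted(multimode(scores))   # all scores tied for most frequent
score_range = max(scores) - min(scores)
iqr = q3 - q1

print("min, Q1, median, Q3, max:", five_number)
print("modes:", modes)
print("range:", score_range, "IQR:", iqr)
```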

Conclusion: This analysis reveals that scores are generally concentrated around 6-7,
suggesting strong academic performance among the students with little variation. The data thus
supports a trend toward higher scores, with most students (14 of 20) scoring at or above the
mean of 6.
Findings: The data analysis supports the hypothesis that increased study hours positively
correlate with higher scores. The box-and-whisker plot of study time indicates a roughly
symmetric distribution, while the scatter plot shows a positive trend between study time and
IB scores.
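
The strength of the trend between study time and score can be put into a number with Pearson's correlation coefficient and a least-squares line. The report itself does not state either figure, so this is a sketch computed from the raw table, with illustrative variable names.

```python
from math import sqrt

# Paired observations for students 1-20, from the raw data table.
hours = [1.5, 3, 2, 4, 1, 2.5, 3.5, 1.2, 2, 4,
         1, 3, 2, 5, 4.5, 1, 2.7, 3, 3.5, 4]
scores = [5, 6, 6, 7, 5, 6, 7, 5, 5, 7, 4, 6, 6, 7, 7, 5, 6, 6, 7, 7]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
sxx = sum((x - mean_x) ** 2 for x in hours)
syy = sum((y - mean_y) ** 2 for y in scores)

r = sxy / sqrt(sxx * syy)            # Pearson correlation coefficient
slope = sxy / sxx                    # least-squares line: score = intercept + slope * hours
intercept = mean_y - slope * mean_x

print(f"r = {r:.3f}")
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

A coefficient near 0.92 indicates a strong positive linear association in this sample, though it does not show that extra study time causes the higher scores.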

Limitations:

1. Sample Size: The sample of 20 students is relatively small and may not represent
broader student behaviors accurately.
2. Self-Reported Data: Since study hours are self-reported, responses may be biased or
inaccurately recalled.
3. External Influences: Other variables, such as teaching quality or individual aptitude,
could impact test scores but were not controlled in this analysis.

Conclusion: The data analysis supports the hypothesis that students who study more tend to
score higher on math assessments. This insight, while valuable, would benefit from a larger
sample size and consideration of additional influencing factors for a more robust conclusion.
