1.8.4 Test (TST) - Statistical Analysis (Test)

The document contains a test on statistical analysis covering various topics including graphical analysis of data, measures of central tendency, random variables, experimental design, and reports and experiments. It includes specific tasks such as creating stem-and-leaf plots, calculating mean and variance, and evaluating the design of studies. Additionally, it discusses flaws in study designs and the importance of control groups and randomization in experiments.

1.8.4 Test (TST): Statistical Analysis Test

Mathematics III Sem 1
Name: Micaela Moro
Date: 1/13/2025

Answer the following questions using what you've learned from this unit. Write your
answers in the space provided. Be sure to show all the work.

GRAPHICAL ANALYSIS OF DATA

1. The list below shows the ages of the first 20 fans to arrive at a professional
basketball game.

38, 39, 14, 46, 9, 25, 27, 33, 60, 11, 14, 37, 25, 28, 16, 30, 42, 35, 35, 47

Part A: Display the fan age data on this stem-and-leaf plot. (1 point)

Part B: Use your plot from Part A to display the data in this frequency table. (1
point)
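The blank plot and table from the worksheet are not reproduced here, so the following is only a sketch of how the displays in Parts A and B could be built from the listed ages. The ten-year intervals (0-9, 10-19, and so on) are an assumption that follows from using the tens digit as the stem; the original frequency table may have used different intervals.

```python
from collections import defaultdict

ages = [38, 39, 14, 46, 9, 25, 27, 33, 60, 11,
        14, 37, 25, 28, 16, 30, 42, 35, 35, 47]


def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    # Include empty stems so gaps in the data stay visible.
    return {stem: plot.get(stem, []) for stem in range(min(plot), max(plot) + 1)}


plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

# Frequency table: one row per ten-year interval (assumed interval width).
frequency = {f"{stem * 10}-{stem * 10 + 9}": len(leaves)
             for stem, leaves in plot.items()}
for interval, count in frequency.items():
    print(f"{interval:>5}: {count}")
```

The same counts give the bar heights for the histogram in Part C.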

Part C: Use your table from Part B to display the data on this histogram. (1 point)

Part D: Use your results from Parts A – C to answer these questions. (6 points)

a. What does it mean that the fan data are numerical and univariate?

Numerical means each data value is a number, here an age in years, rather
than a category. Univariate means we’re looking at just one characteristic of
each fan, the age, and nothing else. We’re only trying to understand and
describe that one variable.

b. Are the data discrete or continuous? Explain.

The data are continuous because age can take any value within a range: it
can be measured in years, months, days, minutes, etc. The ages in the list
are simply rounded to whole years.

c. Which age group had the most fans? Explain how to find the answer using
each of the three data displays.

Stem-and-leaf plot: Count the leaves (individual data points) on each stem.
The stem for ages 30-39 has the most leaves, 7.

Histogram: Find the tallest bar, which is the one over the 30-39 interval.

Frequency table: Find the interval with the highest frequency (number of
fans), which is 30-39 with 7 fans.

d. Write a question that can be answered using the stem-and-leaf plot but
cannot be answered using the frequency table or histogram.

What are the individual ages of the fans attending the game?

MEASURE AND DESCRIBE DATA

2. Jason is the captain of his basketball team. The list below shows how many
points Jason scored in each of the first 10 games of the season.

18, 23, 14, 26, 16, 10, 12, 24, 14, 13

Part A: Find the range, mean, median, and mode of Jason's scores. Show your
work. (4 points)

Range:

The range is the difference between the highest and lowest scores.
Highest score: 26
Lowest score: 10
Range = Highest score - Lowest score = 26 -10 = 16

Mean:

The mean (average) is the sum of all scores divided by the total number of scores.
Sum of all scores: 18 + 23 + 14 + 26 + 16 + 10 + 12 + 24 + 14 + 13 = 170
Number of scores: 10
Mean = Sum of all scores / Number of scores = 170 / 10 = 17

Median:

Arrange the scores in numerical order: 10, 12, 13, 14, 14, 16, 18, 23, 24, 26
Since there are 10 scores (an even number), the median is the average of the two
middle scores: (14 + 16) / 2 = 15

Mode:

The mode is the score that appears most frequently.

Scores and their frequencies: 10(1), 12(1), 13(1), 14(2), 16(1), 18(1), 23(1), 24(1),
26(1)
The score 14 appears twice, which is more than any other score. Therefore, the
mode is 14.
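As a cross-check of the hand calculations in Part A, the same four statistics can be computed with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

scores = [18, 23, 14, 26, 16, 10, 12, 24, 14, 13]

score_range = max(scores) - min(scores)   # highest minus lowest
mean = statistics.mean(scores)            # sum divided by count
median = statistics.median(scores)        # average of the two middle values
mode = statistics.mode(scores)            # most frequent value

print("range:", score_range)
print("mean:", mean)
print("median:", median)
print("mode:", mode)
```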

Part B: Use your mean result from Part A to find the variance and standard
deviation of Jason's scores. Show your work. Round your answers to the nearest
hundredth. (2 points)

Find the squared differences from the mean:

(18 - 17)^2 = 1^2 = 1
(23 - 17)^2 = 6^2 = 36
(14 - 17)^2 = (-3)^2 = 9
(26 - 17)^2 = 9^2 = 81
(16 - 17)^2 = (-1)^2 = 1
(10 - 17)^2 = (-7)^2 = 49
(12 - 17)^2 = (-5)^2 = 25
(24 - 17)^2 = 7^2 = 49
(14 - 17)^2 = (-3)^2 = 9
(13 - 17)^2 = (-4)^2 = 16

Sum of squared differences: 1 + 36 + 9 + 81 + 1 + 49 + 25 + 49 + 9 + 16 = 276

Number of scores: 10

Variance: 276 / 10 = 27.6

Taking the square root of the variance to find the standard deviation:

Standard deviation: sqrt(27.6) ≈ 5.25
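The steps above can be sketched in a few lines of Python. The calculation divides by the number of scores, as the worked answer does, and `statistics.pvariance` is used only to confirm the result.

```python
import statistics

scores = [18, 23, 14, 26, 16, 10, 12, 24, 14, 13]

mean = statistics.mean(scores)
squared_diffs = [(s - mean) ** 2 for s in scores]   # one term per game
sum_of_squares = sum(squared_diffs)

variance = sum_of_squares / len(scores)   # divide by the number of scores
std_dev = variance ** 0.5                 # square root of the variance

print("sum of squared differences:", sum_of_squares)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```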

Part C: Are your results in Part B for a sample or a population? Explain. (1 point)

Answer:
The results in Part B are for a population.

Explanation:
In Part B, the sum of squared differences was divided by the number of scores
(N = 10), which is the population formula. That choice fits if the 10 games
themselves are the full set of data being described. If the 10 games were
instead treated as a sample of all the games Jason could play in a season, the
sum would be divided by n - 1 = 9, giving a sample variance of 276 / 9 ≈ 30.67
and a sample standard deviation of about 5.54.
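The sample-versus-population choice changes the divisor in the variance formula. As a hedged illustration, the standard library exposes both conventions: `pvariance` and `pstdev` divide by n, while `variance` and `stdev` divide by n - 1.

```python
import statistics

scores = [18, 23, 14, 26, 16, 10, 12, 24, 14, 13]

# Population formulas: divide the sum of squared differences by n.
population_variance = statistics.pvariance(scores)
population_sd = statistics.pstdev(scores)

# Sample formulas: divide by n - 1 instead.
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

print("population:", round(population_variance, 2), round(population_sd, 2))
print("sample:    ", round(sample_variance, 2), round(sample_sd, 2))
```

The value 27.6 found in Part B corresponds to the divide-by-n convention.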

Part D: Use your results from Parts A – C to describe the shape of the
distribution of Jason's scores. (1 point)

The distribution of Jason’s scores is skewed to the right rather than symmetric.
The mean (17) is greater than the median (15), which is greater than the mode
(14): most scores cluster between 10 and 18, while a few high scores (23, 24,
and 26) pull the mean upward. The standard deviation of about 5.25 shows a
moderate spread around the mean. In a histogram, we would expect to see a peak
in the low-to-mid teens with a longer tail extending toward the higher scores.

Part E: Use your results from Parts A – D to answer these questions. (3 points)

a. If Jason scored 40 points on his next game, would it be an outlier for his
first 11 games? Explain.
How would a score of 40 affect the range, mean, median, and mode of
Jason's scored points?

If Jason were to score 40 points on his next game, it would be considered
an outlier under the 1.5 * IQR rule. For the first 10 games, taking the
quartiles as the medians of the lower and upper halves gives Q1 = 13 and
Q3 = 23, so IQR = 10 and the upper bound for potential outliers is
23 + 1.5 * 10 = 38. Because 40 is greater than 38, it is an outlier. Other
quartile conventions, or recomputing the quartiles with the new score
included, move the bound by a few points, so 40 is a borderline case. The
impact of this score on the summary statistics:

Range: The range would increase from 16 to 30, as the maximum score would
now be 40 (40 - 10 = 30).
Mean: The mean would increase from 17 to 210 / 11 ≈ 19.09, as the sum of all
scores would now include the additional 40 points.
Median: With 11 scores the median is the 6th value in order, which is 16,
so the median would increase from 15 to 16.
Mode: The mode would remain 14, because 40 occurs only once.

Overall, a score of 40 would be an outlier; it would increase the range,
mean, and median of Jason’s scores but leave the mode unchanged.
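The outlier check and the effect on each statistic can be sketched as follows. The quartile convention is an assumption (Q1 and Q3 taken as the medians of the lower and upper halves of the first 10 games); other conventions give somewhat different bounds, so this is one reasonable check rather than the only one.

```python
import statistics

first_ten = [18, 23, 14, 26, 16, 10, 12, 24, 14, 13]
new_score = 40


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves.

    The middle value is left out of both halves when the count is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])


q1, q3 = quartiles(first_ten)
iqr = q3 - q1
upper_bound = q3 + 1.5 * iqr          # 1.5 * IQR rule
is_outlier = new_score > upper_bound

# Summary statistics once the 11th game is included.
eleven = first_ten + [new_score]
new_range = max(eleven) - min(eleven)
new_mean = statistics.mean(eleven)
new_median = statistics.median(eleven)
new_mode = statistics.mode(eleven)

print("Q1, Q3, IQR:", q1, q3, iqr)
print("upper bound:", upper_bound, "outlier:", is_outlier)
print("range, mean, median, mode:",
      new_range, round(new_mean, 2), new_median, new_mode)
```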

b. If Jason could add 10 points to the number of points he scored each game,
how would it affect the mean, median, and standard deviation of his scored
points?

If Jason were to add 10 points to the number of points he scored each
game, it would uniformly increase all scores in his dataset by 10 points.
Consequently, the mean and median of his scored points would both
increase by 10 points (the mean from 17 to 27 and the median from 15 to
25), as each individual score is raised by the same amount. However, the
standard deviation of his scores would remain unchanged at about 5.25, as
adding a constant value to each score does not alter the spread of the
data points relative to the mean. This adjustment would shift the whole
distribution upward without affecting its variability.

c. If Jason could double every game's scored points, how would it affect the
mean, median, and standard deviation of his scored points?

If Jason were to double every game’s scored points, every score in the
dataset would be multiplied by 2. Consequently, both the mean and median
of his scored points would double (the mean from 17 to 34 and the median
from 15 to 30). The standard deviation, which measures the spread of the
data points around the mean, would also double, from about 5.25 to about
10.51, because doubling each score doubles its distance from the mean.
Both the central tendency measures and the variability measure therefore
increase by the same factor of 2.
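The effects described in parts b and c can be checked numerically. This sketch applies each transformation to Jason's 10 scores and recomputes the mean, median, and standard deviation (dividing by n, as in Part B); the helper name `summary` is illustrative.

```python
import statistics

scores = [18, 23, 14, 26, 16, 10, 12, 24, 14, 13]


def summary(values):
    """Return (mean, median, standard deviation rounded to hundredths)."""
    return (statistics.mean(values),
            statistics.median(values),
            round(statistics.pstdev(values), 2))


original = summary(scores)
shifted = summary([s + 10 for s in scores])   # add 10 to every game
doubled = summary([2 * s for s in scores])    # double every game

print("original:", original)
print("plus 10: ", shifted)
print("doubled: ", doubled)
```

Adding a constant moves the center but not the spread, while multiplying by a constant scales both.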

RANDOM VARIABLES

3. The table below shows the probabilities of rolling sums from 2 to 12 with a pair of
6-sided dice. Use the table to find the mean of the random variable x (the expected
value of the sum rolled with 2 dice).

x     2     3     4     5     6     7     8     9     10    11    12
P(x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Part A: Complete the table below. You do not need to reduce fractions. The first
row is filled in for you. (5 points)

i     xi    P(xi)   xiP(xi)
1     2     1/36    2/36
2     3     2/36    6/36
3     4     3/36    12/36
4     5     4/36    20/36
5     6     5/36    30/36
6     7     6/36    42/36
7     8     5/36    40/36
8     9     4/36    36/36
9     10    3/36    30/36
10    11    2/36    22/36
11    12    1/36    12/36

Part B: Find the mean of the random variable x. Use this formula:
\mu = \sum_{i=1}^{11} x_i P(x_i). (3 points)

(2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12) / 36 = 252 / 36 = 7

Mean of the random variable x = 7
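As a check on the table and the result, the probabilities can be rebuilt by enumerating all 36 equally likely rolls of two dice and summing x * P(x) with exact fractions. This is a verification sketch, not part of the original test.

```python
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls produce each sum.
counts = {}
for first, second in product(range(1, 7), repeat=2):
    total = first + second
    counts[total] = counts.get(total, 0) + 1

probabilities = {x: Fraction(n, 36) for x, n in counts.items()}

# Expected value: sum of x * P(x) over every possible sum.
expected_value = sum(x * p for x, p in probabilities.items())

for x in sorted(probabilities):
    print(x, probabilities[x], x * probabilities[x])
print("mean of x:", expected_value)
```

Note that `Fraction` prints values in lowest terms, so 6/36 appears as 1/6.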

EXPERIMENTAL DESIGN

4. Below is a list of 30 students numbered from 00 to 29. Use the random number
table below to put exactly 6 of these students into a treatment group. Start with the
first line of the table (line 101). List the students of the treatment group in the order
in which their numbers come up in the table.

00 Aaron      06 Fallon     12 Kiefer   18 Quincy    24 Ukiah
01 Buffy      07 Graham     13 Lucia    19 Rachael   25 Valerie
02 Chandler   08 Heather    14 Monte    20 Sarah     26 Wahib
03 Cindy      09 Hsin-Chi   15 Naomi    21 Stacy     27 Xandra
04 Drusilla   10 Ismail     16 Otis     22 Tasha     28 Yale
05 Eric       11 Jasmine    17 Polly    23 Turan     29 Zach

Part A: Search through the list for pairs that match student numbers and circle
each one found. If a student number is repeated, only use the first occurrence.
Stop when you have 6 matches. (3 points)

20-02-03-11-14-26

Part B: List the 6 matching student numbers you found. (3 points)

20, 02, 03, 11, 14, 26

Part C: Which 6 students will be in the treatment group? (3 points)

Sarah, Chandler, Cindy, Jasmine, Monte, Wahib
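The selection procedure can be sketched in code. The random number table from the test is not reproduced in this document, so the digit string below is a hypothetical stand-in chosen only to illustrate the steps (read two digits at a time, skip numbers above 29, skip repeats, stop at 6). The function name `pick_from_table` is also illustrative.

```python
students = [
    "Aaron", "Buffy", "Chandler", "Cindy", "Drusilla", "Eric",
    "Fallon", "Graham", "Heather", "Hsin-Chi", "Ismail", "Jasmine",
    "Kiefer", "Lucia", "Monte", "Naomi", "Otis", "Polly",
    "Quincy", "Rachael", "Sarah", "Stacy", "Tasha", "Turan",
    "Ukiah", "Valerie", "Wahib", "Xandra", "Yale", "Zach",
]

# HYPOTHETICAL digits, not line 101 of the actual table used in the test.
hypothetical_line = "205702880320114514932607"


def pick_from_table(digits, population_size, group_size):
    """Read digit pairs left to right, keeping the first valid, unused numbers."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        # Keep only numbers that label a student and have not come up before.
        if number < population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == group_size:
            break
    return chosen


numbers = pick_from_table(hypothetical_line, len(students), 6)
treatment_group = [students[n] for n in numbers]

print([f"{n:02d}" for n in numbers])
print(treatment_group)
```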

REPORTS and EXPERIMENTS

5. Devon wants to know if drinking milk before bed helps teenagers sleep. He
chooses 10 friends on his high school basketball team for his study. Every night for
one month, the friends drink a glass of milk before bed and later record how many
hours they slept that night.

Part A: Describe three reasons why Devon's study design is flawed. (3 points)

1. Small, Non-Random Sample: Devon chose only 10 friends from his own
basketball team. A convenience sample of athletes is too small and too
similar to represent teenagers in general, which increases the risk of bias.

2. Lack of Control Group: Every participant drinks milk, so there is no group
without milk to compare against, and it’s difficult to isolate the effect of
drinking milk before bed on sleep duration.

3. Reliance on Self-Reported Data: Using self-reported sleep data introduces
potential inaccuracies and biases, as people may not report their sleep
accurately.

Part B: After one month, Devon finds that his 10 friends slept an average of 0.5
hour more each night when they drank milk before bed. Based on this, Devon
concludes that drinking milk makes teenagers sleepy. What is one reason why
Devon's conclusion is most likely invalid? (1 point)

Devon’s conclusion is likely invalid because the observed increase in sleep
duration could be due to factors other than drinking milk before bed. Without a
control group, it’s impossible to determine whether the increase in sleep
duration is truly caused by drinking milk, is simply a coincidence, or is
influenced by other variables.

Part C: A study conducted by a major milk manufacturer showed that 83% of
American teenagers prefer drinking milk to drinking soda. What are two reasons
why this statistic cannot be trusted? (2 points)

1. Potential Bias: The study was conducted by a major milk manufacturer,
which introduces a potential conflict of interest. The manufacturer may
have a vested interest in promoting milk consumption and could
manipulate the study design or results to favor their product.

2. Sampling Bias: The study’s sample may not be representative of the entire
population of American teenagers. The sample might have been selectively
chosen to include individuals more likely to prefer milk over soda, leading to
an overestimation of the preference for milk among teenagers. Additionally,
the methodology used to select participants may not have been rigorous or
random, further undermining the reliability of the statistic.

6. A lab researcher wants to find out whether mice will run through a maze quicker
during the day or at night, after training. He has 100 mice available. He randomly
assigns 50 of them into each group. He trains the first group to run the maze at 9:00
am and trains the second group to run the maze at 9:00 pm. Each mouse is trained
the same way and the last three run times are recorded.

Part A: Describe what is being measured in this experiment and what variable is
being manipulated. (1 point)

In this experiment, the researcher is measuring the time it takes for mice to
navigate through a maze, which is the dependent variable. The independent
variable being manipulated is the time of day at which the mice are trained to
run the maze, with one group trained during the day and the other at night.

Part B: Control, randomization, and replication are ways to design an experiment
so that bias is reduced. Describe whether or not this experiment incorporates
each of these principles into the experimental design. (6 points)

Control
Incorporation: The experiment partly incorporates control. Each mouse is
trained the same way and the same measurement (the last three run times) is
recorded for every mouse, so the training method is held constant and each
group serves as a comparison for the other. Other factors, such as
environmental conditions at the two times of day, are not explicitly
addressed, so some differences in maze-running performance could still come
from something other than the time of training.

Randomization
Incorporation: Random assignment of mice into the day and night training
groups is mentioned, which is a form of randomization. This helps ensure that
any individual differences among the mice are evenly distributed between the
two groups, reducing the likelihood of bias in group assignment.

Replication
Incorporation: The experiment incorporates replication within the study: each
treatment is applied to 50 mice and three run times are recorded per mouse,
so the results do not depend on one or two animals. The description does not
mention repeating the whole experiment, which would further verify the
consistency and reliability of the results.
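The random assignment described in the question (100 mice, 50 per group) could be carried out along these lines. The mouse ID numbers and the fixed seed are assumptions made for the sketch; the seed is there only so the example is repeatable.

```python
import random

mice = list(range(1, 101))   # hypothetical ID numbers for the 100 mice

rng = random.Random(2025)    # fixed seed only to make the sketch repeatable
shuffled = mice[:]
rng.shuffle(shuffled)        # every ordering of the mice is equally likely

day_group = sorted(shuffled[:50])     # trained at 9:00 am
night_group = sorted(shuffled[50:])   # trained at 9:00 pm

print("day group size:", len(day_group))
print("night group size:", len(night_group))
```

Because the shuffle decides group membership, no characteristic of a mouse can influence which training time it receives.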

Copyright © 2018 Apex Learning Inc. Use of this material is subject to Apex Learning's Terms of Use. Any unauthorized
copying, reuse, or redistribution is prohibited. Apex Learning ® and the Apex Learning Logo are registered trademarks of Apex
Learning Inc.