Skittles Project
For this project, everyone in the class was asked to buy a 2.17-ounce individual-size bag of
Skittles and count the number of each color of candy in the bag. The class data was gathered and
In the first part, we determined the proportion of each color of candy and created a
Pareto chart and a pie chart for the total number of each color across the entire class. We compared
the class data to our own personal data and noted any similarities or differences.
Part two consisted of using the Skittles data to create statistical summaries of the mean,
The last part of the project involved confidence intervals and hypothesis testing. We
found three different confidence intervals for the population proportion, mean, and standard
Pie Chart
[Figure: pie chart titled "Proportion Percentages" with five slices (legend entries 1 to 5) labeled 20%, 21%, 19%, 20%, and 20%.]
Pareto Chart
[Figure: Pareto chart of color proportions. Vertical axis from 0.175 to 0.215; bars in order Red, Yellow, Purple, Orange, Green.]
It's interesting that in the pie chart all of the proportion percentages look about the same, but
in the Pareto chart there is an obvious difference. Even though the pie chart shows
purple, orange, and yellow Skittles at the same percentage, it clearly is not the case
Minimum: 53
Q1: 58
Median: 60
Q3: 61
Maximum: 65
Mean: 59.414
Colors
Orange Skittles 14
Purple Skittles 9
Green Skittles 12
Red Skittles 14
Yellow Skittles 11
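As a small illustration of the proportion step from part one, the counts from the single bag listed above can be turned into proportions and sorted the way a Pareto chart orders its bars. This is only a sketch: the names `bag_counts`, `color_proportions`, and `pareto_order` are made up for this example, and it uses one bag rather than the class totals, which are not listed here.

```python
# Counts taken from the single bag in the Colors list above.
bag_counts = {"Orange": 14, "Purple": 9, "Green": 12, "Red": 14, "Yellow": 11}


def color_proportions(counts):
    """Return each color's share of the bag as a fraction between 0 and 1."""
    total = sum(counts.values())
    return {color: n / total for color, n in counts.items()}


def pareto_order(counts):
    """Return (color, count) pairs sorted from most to least frequent."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


proportions = color_proportions(bag_counts)
for color, count in pareto_order(bag_counts):
    # Print the color, its count, and its proportion as a percentage.
    print(f"{color:<7}{count:>3}  {proportions[color]:.1%}")
```

The same two functions would work on the class totals if the per-color counts for the whole class were substituted for `bag_counts`.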
There are two basic divisions of data: quantitative and categorical. Quantitative data are
values that can be measured or counted; you will sometimes hear it called numerical data. Some
examples of quantitative data are weight and time. Categorical data are values or
observations, like names or labels, that can be sorted into groups or categories but cannot be
measured. Categorical data can take on numerical values in some cases, but those numbers don't
have mathematical meaning. Examples of categorical data are gender and eye color.
Graphs that make sense with categorical data include the Pareto chart, bar graph, and pie
chart. Scatterplots, stem-and-leaf plots, time-series graphs, and dot plots are examples of graphs that
make sense with quantitative data. With quantitative data you can add, subtract, multiply, and
divide the values, and the results still make sense mathematically. With categorical data
you cannot do arithmetic on the values themselves; you can only count how many fall into each category.
Part 2:
99%: The confidence interval for the true proportion of yellow candies is (0.178, 0.227)
95%: The confidence interval for the true mean number of candies per bag is (58.46, 60.34)
98%: The confidence interval for the standard deviation of the number of candies per bag is
(22.2, 27.7)
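To show how an interval like the one for the proportion of yellow candies can be computed, here is a minimal sketch. It assumes the usual normal-approximation interval, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n), which may not be exactly the procedure used for the class results. The function name `proportion_ci` is made up for this example. Because the class totals are not listed here, the sketch uses the 11 yellow candies out of 60 from the single bag above, so it will not reproduce the class interval of (0.178, 0.227).

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumption: one bag only (11 yellow out of 60), not the class data.
low, high = proportion_ci(11, 60, 0.99)
print(f"99% CI for the yellow proportion, one bag: ({low:.3f}, {high:.3f})")
```

With only 60 candies the interval is much wider than the class interval, which shows why pooling every bag in the class gives a more precise estimate.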
statistical hypothesis.
Reflection:
Mistakes could be made while gathering this data. One type of error is recording
incorrect data. This could happen if someone counted incorrectly or wrote down the wrong quantity
for a color. Nonresponse error could also affect the data. Each person
in the class was assigned to buy a bag of Skittles, but if someone never bought a bag and recorded the
data, then we are missing part of our sample. After calculating confidence intervals and
running hypothesis tests, I have learned that it is important to get a simple random sample because it
affects the accuracy of the results you get from the data. I have learned how to
properly set up the procedure for a hypothesis test and how to decide whether to reject or fail to
When I was assigned the Skittles project, I was intimidated by the process of using
statistical concepts to interpret real-life data. As the project continued, I became more
comfortable using confidence intervals and interpreting the different types of graphs as
well. Understanding what things like confidence intervals are and what makes data significant or
unusual is very helpful when you need to figure out what the data means. Completing the Skittles
project helped me understand how companies and corporations need to use statistics in order to
achieve accuracy and consistency in their work and products. This project helped me learn how
to do confidence intervals and hypothesis testing, which I was super nervous about before.