Statsfinalproject

1) The document summarizes a student's statistics project analyzing data collected from Skittles bags. The student and classmates each recorded the Skittle colors in their own bags, and the combined class data was used to analyze distributions, means, and confidence intervals. 2) The class data showed the colors were close to evenly distributed at about 20% each, with some variation. The student was surprised that red was the second most common color rather than one of the less common ones. 3) Through this project, the student learned that statistics can be applied to real-world situations by collecting data and analyzing it in various ways such as graphs, calculations, and confidence intervals.

Mikaela Ingraham

Math 1040 Signature Project

Throughout this project I learned how statistical calculations can be
applied to realistic everyday situations. This was a very interesting project
to me because I did not realize that data collected from bags of Skittles
could be used in so many different statistical calculations, such as graph
compilation, confidence intervals, and margin of error. To summarize, this
project helped me realize that statistical calculations can be applied to all
kinds of real-world problems, as long as data on the population is collected.
The following information includes data I have collected and summarized
during my studies of statistics, beginning with sample data collection.
Based on limited information about the manufacturing process, we
assumed each color would make up a roughly equal share of about 20%.
If you look at the results of the class data set you can see
that the deviation from 20% ranges from +0.014 (yellow) to -0.010 (purple).
With a sample size of 2,268, an even split would mean a count of about 454
of each color. We think that the
estimate of 20% per color is valid and relatively accurate. We resisted the
temptation to make an inference from our own bags of Skittles because we
considered the sample size to be too small, although there is no reason to
believe it is not random.
If you consider the overall class data as the sample, then the population
would be all Skittles manufactured. If for experimental purposes you designate
only the Skittles purchased by our class as the population, then every
individual bag would be a sample. In either case, unless there are distribution
factors we are unaware of, each sample is random.
Count          Red   Orange   Yellow   Green   Purple   Total
Class Counts   464      439      485     449      431    2268
In My Bag        8       19       12       8        9      56
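As a rough check on the 20% assumption, the class counts in the table can be turned into proportions and compared with an even split. This is only a sketch: the name `color_deviations` is made up for illustration, and the equal 20% share per color is the assumption described above, not a figure from the manufacturer.

```python
# Class counts copied from the table above.
class_counts = {"Red": 464, "Orange": 439, "Yellow": 485, "Green": 449, "Purple": 431}

total = sum(class_counts.values())        # total candies in the class sample
expected_share = 0.20                     # assumed equal share for each of 5 colors
expected_count = expected_share * total   # count per color under an even split


def color_deviations(counts, share=expected_share):
    """Return each color's sample proportion minus the assumed share."""
    n = sum(counts.values())
    return {color: count / n - share for color, count in counts.items()}


deviations = color_deviations(class_counts)
for color, dev in deviations.items():
    proportion = class_counts[color] / total
    print(f"{color:<7} proportion = {proportion:.3f}  deviation = {dev:+.3f}")
```

The largest positive deviation belongs to yellow and the largest negative one to purple, which matches the range quoted in the text.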

The graphs in some ways reflected what I expected to see. I had heard
a rumor that yellows and oranges were by far the most common colors in a
bag of Skittles, but based on our data, the colors are pretty evenly matched.
Orange appears to be the second least common color, which didn't match
my bag's data, although yellow is the most common color, as I predicted. It was
also a surprise that red Skittles were the second most common color; I
expected fewer, since I feel I never get a lot of red Skittles in a bag.
There were a few outliers, like my abnormally high orange Skittles
count, and I did see that someone else's data had a high purple Skittles count.
Outliers add to the class count for that color, which can make it higher than
it would be in an average sample.
I think the class Skittles distribution mostly matches up with my data,
besides the orange count. Unlike the class data, yellow was not my highest
count, nor did I have a large number of red Skittles.
The next part of the project I worked on included summarizing the data
numerically and in graphs. In order to build these graphs, we had to
calculate the mean and standard deviation of the number of
candies per bag. The results of our calculations and graphs are included.
Mean number of candies per bag: 59.5
Standard deviation of the number of candies per bag: 2.9
5-number summary for the number of candies per bag: 53, 58, 59.5, 61, 66
The shape of the distribution is symmetrical, or bell shaped. I was a
little surprised, because glancing over the data, everyone's number of
Skittles in each bag seemed completely random. However, seeing the data in
a graph made it apparent that the most common number of Skittles in a bag
was around 60. Both the frequency histogram and the box-and-whiskers plot
agreed with this amount. I had 56 Skittles in my bag, so out of the class's
38 bags, my bag agreed with the rest of the data. My bag would fall in the
first quartile, below the first-quartile value of 58.
Categorical data is qualitative data that can be sorted into groups, such as
the color of each Skittle in a bag, as we demonstrated in this project.
Categorical data is organized into groups for organizational and
statistical purposes. In our project, we individually recorded how many
Skittles of each color we had in a bag and combined our data into a class
sample. We grouped the data into categories because we wanted to see how
the colors compared to each other in the sample. For categorical data, using
a pie chart makes sense because a pie chart shows the share of the whole
that falls in each category. A Pareto chart is also useful for categorical data
because the categories are arranged in descending order of frequency. Pareto
charts put the focus on the most significant part of the data, for instance
when we want to know the frequency of the most common Skittles color. For
categorical data, which focuses on frequency, we would want to calculate a
frequency distribution and a relative frequency distribution. The relative
frequencies always add up to one, which is ideal for pie chart
representations.
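The relative frequency distribution and the Pareto ordering described above can be sketched in a few lines, using the class counts from the table earlier in the report. The variable names are only illustrative.

```python
# Class counts from the table earlier in the report.
class_counts = {"Red": 464, "Orange": 439, "Yellow": 485, "Green": 449, "Purple": 431}
total = sum(class_counts.values())

# A Pareto chart lists the categories from most to least frequent.
pareto_order = sorted(class_counts.items(), key=lambda item: item[1], reverse=True)

# Relative frequency = count in the category / total count.
relative_freq = {color: count / total for color, count in pareto_order}

cumulative = 0.0
for color, count in pareto_order:
    cumulative += relative_freq[color]
    print(f"{color:<7} count = {count:>4}  relative = {relative_freq[color]:.3f}  "
          f"cumulative = {cumulative:.3f}")
```

The cumulative column ends at 1 (up to rounding), which is the property that makes relative frequencies a natural fit for a pie chart.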
Quantitative data is numerical data that can be ordered and measured.
We would want to use this data for comparison of measurements. Graphs
that are used for quantitative data include frequency histograms and
box-and-whiskers plots. These graphs are useful because they summarize
quantitative data in numerical terms that can be easily organized. These
graphs also make it easy to see whether the data is skewed
or symmetrical. Common calculations for quantitative data include
determining the mean, mode, range, standard deviation, quartiles, and
five-number summary. We can also use lower and upper fence calculations to
determine whether there is an outlier in the data set.
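To illustrate the fence calculation, here is a small sketch that applies the common 1.5 x IQR rule to the five-number summary reported earlier (53, 58, 59.5, 61, 66). The 1.5 multiplier is the usual textbook convention and is an assumption here, and `is_outlier` is just an illustrative helper name.

```python
# Five-number summary for candies per bag, as reported earlier in the report.
minimum, q1, median, q3, maximum = 53, 58, 59.5, 61, 66

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr     # values above this are flagged as outliers


def is_outlier(value):
    """Flag a bag size that falls outside the lower or upper fence."""
    return value < lower_fence or value > upper_fence


for bag_size in (minimum, 56, maximum):
    print(bag_size, "outlier" if is_outlier(bag_size) else "within fences")
```

Under this rule the fences come out to 53.5 and 65.5, so the smallest and largest bags sit just outside them, while my own bag of 56 falls inside.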
The final part of the project I worked on included calculating
proportions, margin of error, and confidence intervals for different
proportions of Skittles. The calculations produced confidence
intervals for the proportion of yellow Skittles and for the true value of
the population mean.
A confidence interval provides a range of values that is likely to contain
the population parameter of interest, and it expresses the degree of
uncertainty associated with a sample statistic. In statistics it is important
to know how well a sample statistic estimates the population value.
Specifically, a confidence interval is
an interval estimate combined with a probability statement. Typically
confidence intervals are preferred to point estimates because confidence
intervals convey the uncertainty and precision of the estimate.
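As an example of what such an interval looks like, the sketch below computes a confidence interval for the proportion of yellow Skittles from the class counts (485 yellow out of 2,268). The 95% confidence level and the normal-approximation formula are assumptions made for this illustration, since the level used in the project is not stated here, and `proportion_ci` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                               # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat, margin, (p_hat - margin, p_hat + margin)


# Yellow count and class total from the table earlier in the report.
p_hat, margin, (low, high) = proportion_ci(485, 2268)
print(f"point estimate  = {p_hat:.4f}")
print(f"margin of error = {margin:.4f}")
print(f"interval        = ({low:.4f}, {high:.4f})")
```

With these inputs the interval runs from a little under 0.20 to about 0.23, so the assumed 20% share for yellow is still inside the range even though yellow was the most common color.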
During my time in Math 1040 I learned exactly how data is gathered,
organized, and turned into readable information that is simple to
scan and understand. This class helped expand my understanding of
statistics as a whole and why gathering data about a population is important
in real-world applications. To summarize, I learned that statistics is about
gathering information in a relatively simple and reliable way and using the
results to improve existing conditions.

Prior to beginning this class I had no idea what statistics entailed. I had
an image in my head of a call center gathering data for personal interest, but
I didn't know how that data was organized or used. I learned in this class that
there are diverse ways of sampling data from a population, such as simple
random or systematic sampling. Statistics is also not a science used only
by call centers, as I had previously believed, but can be used in very
real-world settings such as analyzing customer responses to improve a
company, or optimizing standardized testing in schools. One of my favorite
problems in this class involved a teacher missing an exam score but being
able to find out what it was using the mean and the number of exam scores.
It seemed simple, but it had never occurred to me previously that there
could be a way to figure that out.
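The idea behind that problem is that the mean times the number of scores gives the total, so subtracting the known scores leaves the missing one. The scores below are invented purely for illustration and are not from the class problem, and `missing_score` is a hypothetical helper name.

```python
def missing_score(known_scores, mean, n_scores):
    """Recover one missing score from the mean and the number of scores."""
    total = mean * n_scores           # the mean times n gives the sum of all scores
    return total - sum(known_scores)  # whatever is left is the missing score


# Hypothetical example: five exams with a mean of 85, four scores known.
known = [82, 91, 77, 88]
print(missing_score(known, mean=85, n_scores=5))
```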

In this project itself I found it interesting that we could take a vast
amount of data, the number of each color in a bag of Skittles for each person
in the class, and organize it in so many different ways. I liked
arranging the data into graphs that held a lot of information but were visually
easy to scan and read. Applying what we learned about confidence intervals
to the project also helped me realize that confidence interval calculations
can be used and interpreted for a variety of data. Overall, this project helped
me understand that statistical calculations can be applied to many different
real-world applications.
