Final Project

This document summarizes a group project analyzing Skittles candy data. The group calculated proportions of each Skittles color using a class data set of over 1800 candies. They found the mean number of candies per bag was 59, with a standard deviation of 2.4. Confidence intervals were also calculated: the group is 99% confident the population proportion of yellow Skittles is between 18.54-23.45%, and 90% confident the population mean number of candies per bag is between 58.279-59.721.


John Hutchins

Report | Reflection

Introduction to our work:

In this project, our group used a convenience sample of 2.17-ounce bags of Skittles, provided by each student in this class and compiled into one data set. We answered the question of why we would expect each color to make up 20% of the candies, and we made both a Pareto chart and a pie chart of the relative frequency of each Skittles color. Next, our team calculated the mean, standard deviation, and 5-number summary (minimum, Q1, median, Q3, maximum) of the number of candies per bag. Then our team constructed both a 99% and a 90% confidence interval and interpreted what these intervals mean. You will also find a “My Take” at the end of each section of our project, which briefly covers a related topic.

Group Project #1:

To start off this project, we found the relative frequency of each color in the class data set. We also thought it would be a good idea to graph these.

1. What proportion (or percentage) of the Skittles do you expect to see of each color?

Why?

If you were to pick one Skittle from a bag of original Skittles, you would expect the same probability of pulling out a red, orange, yellow, green, or purple Skittle. Here we are assuming that the colors are distributed completely at random and in equal amounts (for a single bag to split evenly, its count would also have to be divisible by 5, which is not very probable). Then, statistically, the probability of pulling any one of the 5 colors is 1/5 = 20%. We have the results from opening and counting 31 bags of Skittles; let’s see them below.

2. Now open the data set and compute the proportions of Red, Orange, Yellow, Green,

and Purple candies in the class data set. Note that the sample size is the total number

of candies collected by the class.

Pareto Chart to match our data: Pie chart to match our data:

3. Does the class data represent a random sample? What would the population be?

Collaborate to discuss sampling and our data in a paragraph or two. Look carefully at

the definition of random sample when you work on your group response. This will

likely take some discussion!

The 31 bags of original Skittles (2.17 ounces each) opened by the students in this class make up our sample. It is only somewhat random: the bags were gathered by convenience rather than by a true random selection. The population is all 2.17-ounce bags of original Skittles and the candies inside them. As you can see from the graphs (though the pie chart colors may be deceiving), yellow was the most frequent color at a proportion of 21%. Purple was the least frequent at 18.64% and came in under our expected proportion, as did orange (at a below-average 19.41%). Red and green were above average at 20.56% and 20.4%. Though we didn’t get exactly 20% of each color, a bigger sample would show whether the proportions really move closer to our expected values.
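The comparison above can be sketched in a few lines of Python. This is an illustration only, not part of the original StatCrunch work; the proportions are the ones reported in this report (yellow from x = 384 and n = 1829, the others as rounded percentages), and the variable names are our own.

```python
# Class proportions as reported in this report.
observed = {
    "yellow": 384 / 1829,
    "red": 0.2056,
    "green": 0.2040,
    "orange": 0.1941,
    "purple": 0.1864,
}

# With 5 equally likely colors, each one is expected to make up 1/5 = 20%.
expected = 1 / len(observed)

# Pareto order: categories sorted from most to least frequent.
pareto = sorted(observed.items(), key=lambda kv: kv[1], reverse=True)

for color, p in pareto:
    print(f"{color:<7} {p:6.2%}  difference from expected: {p - expected:+.2%}")
```

Sorting from largest to smallest is what distinguishes a Pareto chart from an ordinary bar graph; the printed differences show that every color is within about 1.4 percentage points of 20%.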

My take:

Creating a table that displays the proportions by color and the total count from your own

bag of candies together with the proportions by color and total count for the entire class sample:

There are 5 colors of Skittles and they are put into bags at random, so you would think each color would make up about 20% if we surveyed all bags of Skittles. The class graphs reflect this better than my bag did. This illustrates that a single sample, or a few, may or may not represent the population as a whole. It does appear that I got shortchanged on orange candies (my bag: 8.2%), while the class proportion (19.41%) was close to the expected 20%. If you looked only at my bag, you would think that most Skittles bags are about 8% orange, though this is not the case, as the class data contained about 20% orange Skittles. However, I did have an above-average share of green, which made up for the shortage of orange. Overall, the class counts showed what I expected to see, because combining many bags averages out the variation, even though some of my proportions were way off the class proportions. My orange count was low and my green count was high, so my own bag does not match the class data.

Group Project #2:

Now that we have some visual aids and the relative frequencies, we are going to find the mean number of candies per bag, the standard deviation, and a 5-number summary to better describe our data. We also added more visual aids in the form of a box plot and a histogram.

1. Total candies in each bag (calculation via StatCrunch):

a. Mean number of candies per bag: 59

b. Standard Deviation: 2.4
c. 5-Number Summary:

i. Minimum: 53
ii. Q1: 57
iii. Median: 59
iv. Q3: 61
v. Maximum: 63

2. Histogram: 3. Box Plot:
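As a small check on the 5-number summary, the usual 1.5 × IQR rule for flagging outliers on a box plot can be sketched in Python. This is an illustration only; it uses the summary values reported above, and the fence rule is the standard textbook convention rather than something taken from the StatCrunch output.

```python
# 5-number summary of candies per bag, as reported above.
minimum, q1, median, q3, maximum = 53, 57, 59, 61, 63

# Interquartile range: the spread of the middle half of the bags.
iqr = q3 - q1

# Values beyond these fences would be drawn as outliers on a box plot.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

has_outliers = minimum < lower_fence or maximum > upper_fence
print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence}), outliers: {has_outliers}")
```

Because the minimum (53) and maximum (63) both fall inside the fences, no bag count would be flagged as an outlier under this rule.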

My Take:

Looking at the histogram of the number of candies in each bag, you notice it is roughly bell-shaped (there is a gap at 54, but overall the shape still holds). This suggests that the data are approximately symmetric, which is also supported by the fact that the mean equals the median. As each bag contains 2.17 ounces of Skittles, you would expect most bags to hold roughly the same number of whole Skittles, and the graph reflects this with its bell shape. My bag contained 61 Skittles, which is above both the mean (59) and the median (59) of the 31 bags sampled. So my bag was above average, but it still fits the class data: 61 is the third quartile and lies within one standard deviation (2.4) of the mean.

Along with comparing my bag to the class average and discussing graph shapes, I am also going to discuss the differences between categorical and quantitative data and their graphs. Categorical data, as the name implies, sorts observations into categories, and no real arithmetic can be done with the values. Gender, for instance, is a categorical variable whose values could be “Male”, “Female”, or “Other”, and it wouldn’t make sense to add “Male” to “Male”. Quantitative data, on the other hand, records numerical variables such as the number of males or the number of Skittles. Adding up the Skittles in a bag makes sense, and with multiple bags we can compute meaningful values such as a mean.

Pie charts and bar graphs work well for categorical data because they display the count or percentage in each category; a box plot or histogram wouldn’t make sense there because those graphs represent numerical data. With quantitative data, you do want graphs such as a histogram or box plot (or a scatter plot when comparing two quantitative variables), as they summarize the center and spread of the values. Pie charts are strongly discouraged for quantitative data because it can be hard to see whether values are close to each other. In summary, categorical data describes categories such as gender or eye color, is typically graphed with pie charts or bar graphs, and supports no real arithmetic beyond counting. Quantitative data is numerical, such as height, is graphed with histograms, box plots, and similar graphs, and supports real arithmetic, which makes it very useful to statisticians.

Group Project #3:

Now that we have a mean, standard deviation, and some other useful statistics, we want to find out how reliable these statistics are as estimates of the population. For this, we are going to construct 2 different confidence intervals and find the margin of error for each.

1. Our 99% confidence interval estimate for the population proportion of yellow candies:
(0.1854, 0.2345).

a. Our yellow Skittles had a sample proportion of 0.21 (where x = 384 and n = 1829).
b. We checked the requirements for using a z interval for a proportion:

i. Simple random sample or randomized experiment: NO, the data failed to meet the requirements for a simple random sample or randomized experiment because selection was based on convenience.
ii. np(1 - p) > 10: YES, with n = 1829 and p = 0.21, np(1 - p) = 303.43 > 10.
iii. n < 0.05N: YES, because the total number of Skittles in the population is greater than 36,580 (1829/0.05 = 36,580).

c. The margin of error is E = 0.0246.

2. Interpret with a complete sentence the confidence interval estimate for the population

proportion of yellow candies.

We are 99% confident that the population proportion of yellow Skittles lies between 0.1854 and 0.2345 (the sample proportion of 0.21, plus or minus a margin of error of 0.0246).
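The interval above can be reproduced with the standard formula for a one-proportion z interval, p ± z·sqrt(p(1 - p)/n). The Python sketch below is an illustration only, not the StatCrunch procedure the group used; it takes x = 384 and n = 1829 from the report, and the variable names are our own. The margin it prints may differ from the reported 0.0246 in the last digit because of rounding.

```python
from math import sqrt
from statistics import NormalDist

# Counts reported above: 384 yellow candies out of 1829 in total.
x, n = 384, 1829
confidence = 0.99

p_hat = x / n

# Critical value z* that leaves (1 - confidence)/2 in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error and interval for a one-proportion z interval.
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

# Requirement ii from the list above: n * p * (1 - p) > 10.
large_enough = n * p_hat * (1 - p_hat) > 10

print(f"p_hat = {p_hat:.4f}, z* = {z_star:.3f}, E = {margin:.4f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f}), requirement met: {large_enough}")
```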

3. Our 90% confidence interval estimate for the population mean number of candies per

bag: (58.279, 59.721).

a. Sample mean: 59.

b. We also verified that we need to use a t interval for this portion by checking:

i. Simple random sample or randomized experiment: NO, the data failed to meet the requirements for a simple random sample or randomized experiment because selection was based on convenience.
ii. Sample size small relative to the size of the population (n < 0.05N): YES, a sample of 31 bags is small compared to the total number of bags of Skittles in existence.
iii. n ≥ 30 OR data come from a population that is at least approximately normal with no outliers (verified using a normal probability plot and box plot): YES, because n = 31 ≥ 30.

c. We then calculated a margin of error of 0.721.

4. Interpret with a complete sentence the confidence interval estimate for the population
mean number of candies per bag.

We are 90% confident that the population mean number of Skittles per bag is between 58.279 and 59.721 (the sample mean of 59, plus or minus a margin of error of 0.721 Skittles).
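For comparison, the t interval formula, mean ± t·s/sqrt(n), can be applied to the rounded summary statistics in this report. The Python sketch below is an illustration only; the critical value 1.697 is the standard t-table entry for 90% confidence with 30 degrees of freedom, hard-coded here because the Python standard library has no t distribution. With the rounded s = 2.4 the margin comes out near 0.73, slightly wider than the reported 0.721, which would be consistent with StatCrunch working from the unrounded standard deviation (an assumption, since the raw data are not shown here).

```python
from math import sqrt

# Summary statistics as reported (rounded) in this report.
n = 31
sample_mean = 59.0
sample_sd = 2.4

# t critical value for 90% confidence, df = n - 1 = 30 (from a t-table).
t_star = 1.697

# Margin of error and interval for a one-sample t interval.
margin = t_star * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"E = {margin:.3f}, 90% CI: ({lower:.3f}, {upper:.3f})")
```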

My Take:

In statistics we take samples and compute statistics on them, such as the mean of a data set. The problem is that a sample statistic almost never equals the population value exactly, so we need a way to say how far off it might be. This is where confidence intervals come into play. A confidence interval gives a range of plausible values for a population parameter: the sample estimate plus or minus a margin of error, which yields a lower bound and an upper bound. For example, a 95% confidence interval comes from a method that captures the true population value in about 95% of all samples. Overall, a confidence interval is a range of values that we can be confident, at the stated level, contains the true value of the population parameter.
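What a confidence level means can be illustrated with a small simulation. The Python sketch below is an illustration under stated assumptions, not part of the project: it pretends the population of bag counts is normal with mean 59 and standard deviation 2.4 (borrowing the sample values as if they were the truth), treats that standard deviation as known so that a z interval applies, and counts how often the interval from a sample of 31 bags contains the true mean. The function name and defaults are our own.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def coverage(level=0.95, mu=59.0, sigma=2.4, n=31, trials=2000, seed=1):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / sqrt(n)  # sigma is treated as known
    hits = 0
    for _ in range(trials):
        # Draw one simulated sample of n bags and take its mean.
        sample_mean = mean(rng.gauss(mu, sigma) for _ in range(n))
        if sample_mean - margin <= mu <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(f"{rate:.1%} of the simulated 95% intervals contain the true mean")
```

The printed rate should land close to 95%, and raising `level` widens the intervals so that more of them capture the true mean.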

Summary:

After seeing everything we were able to do with just a convenience sample of Skittles, it’s easy to see how important statistics is. We used many of the basic but essential calculations in statistics to describe our data and put it to further use. As you’ve seen above, our group calculated the mean, standard deviation, confidence intervals, and margins of error, and graphed our results. Even with just a bag of Skittles, statistics can be applied.

Reflection on this class:

At the beginning of the semester, each of us bought a 2.17-ounce bag of Skittles. Our goal was to count how many Skittles of each color were in the bag. We then submitted our results to Professor Maw to be compiled and given back to us for further instructions. Over the semester, we calculated frequencies, relative frequencies, the mean, the standard deviation, and a 5-number summary, constructed confidence intervals, and graphed some of these results.

Along with all of the statistical calculations listed above, we learned about z and t intervals and when we should use one over the other. We also discussed what a margin of error is and why it is important. All of the things we learned in statistics have a real-world use. In computer science, we often use statistics to monitor how efficient an algorithm is. More often than not, we use the standard deviation when measuring batches of processors, to determine how many are expected to be nonfunctional as well as a given batch’s clock speed. Statistics constantly proves its usefulness in everyday life.

Through this course I was reminded how important it is to know and understand statistics. Everywhere you go there are data banks filled with data from millions of people, just waiting to be explored with statistical analysis. If there is anything I have learned from my computer science classes, it is that data matters, and the more efficient you are at analyzing data with statistics, the more successful the company you work for or own will grow to be. Throughout the semester, statistics showed me all of the things you can do with data and how applicable it is in the real world, and it was truly an eye-opening experience.
