Case 1
A Case Study
Presented to the Department of Industrial and Systems Engineering
De La Salle University-Manila
1st Term, A.Y. 2023-2024
In partial fulfillment
of the requirements for the course
Advanced Quantitative Methods Laboratory (LBYIE2C)
Variable      Mean      SE Mean    StDev      Variance   Q1        Median   Q3       Minimum   Maximum   N
M&M Weights   0.85649   0.005179   0.051794   0.002683   0.82625   0.858    0.881    0.696     1.015     100
The standard error of the mean (0.005179) is small relative to the mean (0.85649),
indicating that the sample mean is estimated with a high level of precision and is likely a
good representation of the population mean. The standard deviation (0.051794) measures
the spread or variability in the data. Its small value suggests that M&M weights in the
sample are relatively consistent, with most candies clustered around the mean weight. The
variance (0.002683), which is the square of the standard deviation, is likewise small and
reinforces the observation that M&M weights in the sample are not highly variable.
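The relationship between these three measures can be checked directly. The short sketch below uses only the standard deviation and sample size reported in the summary row; the variable names are illustrative.

```python
import math

# Values reported in the summary row for all M&M weights
n = 100
std_dev = 0.051794

# Standard error of the mean: standard deviation divided by sqrt(n)
std_error = std_dev / math.sqrt(n)

# Variance: the square of the standard deviation
variance = std_dev ** 2

print(round(std_error, 6))
print(round(variance, 6))
```

Both results agree with the reported standard error and variance to the precision shown in the table.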
Color     Mean       SE Mean    StDev      Variance   Q1       Median   Q3        Minimum   Maximum   N
Red       0.863538   0.015974   0.057594   0.003317   0.825    0.859    0.8975    0.751     0.966     13
Orange    0.8578     0.010021   0.050104   0.00251    0.84     0.863    0.881     0.735     0.977     25
Yellow    0.8345     0.013958   0.039479   0.001559   0.794    0.8495   0.85875   0.769     0.883     8
Brown     0.84775    0.028107   0.0795     0.00632    0.8145   0.857    0.874     0.696     0.982     8
Blue      0.856037   0.008082   0.041995   0.001764   0.825    0.855    0.881     0.775     0.942     27
Green     0.863526   0.013068   0.056963   0.003245   0.814    0.865    0.881     0.778     1.015     19
Smaller standard errors indicate more precise estimates. In this case, Blue M&Ms have
the smallest standard error (0.008082), suggesting that the mean weight of Blue M&Ms is
estimated with relatively high precision, while Brown M&Ms have the largest (0.028107).
Brown M&Ms also have the largest standard deviation (0.0795), indicating that their
weights are more variable compared to other colors. Conversely, Yellow M&Ms have the
smallest standard deviation (0.039479), indicating less variability in their weights.
Brown M&Ms likewise have the highest variance (0.00632), further indicating greater
dispersion in their weights.
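The ranking of colors by precision and spread can be reproduced from the table. The sketch below recomputes each standard error from the reported standard deviation and sample size, then picks out the extremes; the dictionary layout and names are illustrative.

```python
import math

# (standard deviation, sample size) per color, copied from the table
stats = {
    "Red": (0.057594, 13),
    "Orange": (0.050104, 25),
    "Yellow": (0.039479, 8),
    "Brown": (0.0795, 8),
    "Blue": (0.041995, 27),
    "Green": (0.056963, 19),
}

# Standard error of the mean for each color: sd / sqrt(n)
std_errors = {color: sd / math.sqrt(n) for color, (sd, n) in stats.items()}

smallest_se = min(std_errors, key=std_errors.get)
largest_se = max(std_errors, key=std_errors.get)
smallest_sd = min(stats, key=lambda color: stats[color][0])
largest_sd = max(stats, key=lambda color: stats[color][0])

print("Smallest SE:", smallest_se)
print("Largest SE:", largest_se)
print("Smallest StDev:", smallest_sd)
print("Largest StDev:", largest_sd)
```

Because the standard error depends on both the spread and the sample size, a color with a moderate standard deviation but many observations can still have the most precise mean.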
b) Provide a pie chart of the distribution of colors if we can say that the data provides a
representative proportion of each color from the manufacturing production counts.
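The slice sizes for such a pie chart follow from the sample counts (N) per color in the table. The sketch below computes each color's share; it assumes, as the question does, that the sample counts are representative of production.

```python
# Sample counts per color, taken from the N column of the table
counts = {
    "Red": 13,
    "Orange": 25,
    "Yellow": 8,
    "Brown": 8,
    "Blue": 27,
    "Green": 19,
}

total = sum(counts.values())

# Percentage share of each color in the sample
shares = {color: 100 * k / total for color, k in counts.items()}

# List the colors from largest to smallest share
for color, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{color}: {pct:.1f}%")
```

These shares could then be passed to a plotting function such as matplotlib's `pyplot.pie` to draw the chart itself.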
c) Give a histogram of weights for all the observations with a normal curve for all
observations and then by each color. Does it seem to show a normally distributed fit if a
normal curve is superimposed against the histograms?
The histogram of all M&M weights appears approximately normally distributed when the
normal curve is superimposed. An approximately normal distribution of weights is useful
for quality control and manufacturing because it suggests that the production process is
consistent and predictable. In the future, deviations from normality may indicate issues
with the production process that need to be addressed.
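One rough numerical check on the normal fit, offered as a sketch rather than a formal test, is to compare the observed quartiles with the quartiles a normal distribution with the same mean and standard deviation would have. The mean and standard deviation below come from the summary row for all M&M weights.

```python
from statistics import NormalDist

# Normal distribution with the reported sample mean and standard deviation
fitted = NormalDist(mu=0.85649, sigma=0.051794)

# Quartiles implied by the fitted normal curve
q1_normal = fitted.inv_cdf(0.25)
q3_normal = fitted.inv_cdf(0.75)

# Quartiles observed in the sample (from the summary row)
q1_observed = 0.82625
q3_observed = 0.881

print(f"Q1: normal {q1_normal:.4f} vs observed {q1_observed:.4f}")
print(f"Q3: normal {q3_normal:.4f} vs observed {q3_observed:.4f}")
```

The closer the observed quartiles are to the fitted ones, the more plausible the normal fit. A formal normality test would require the raw weights, which are not reproduced here.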
c) Which category of garbage is most correlated with the total weight? Which categories are
just random and cannot be predicted if the total weight is known from a sample household?
The box plot suggests that plastics account for the majority of the overall weight of
garbage. Textile, Yard, and Others, by contrast, each contain many outliers (values at an
abnormal distance from the rest of the data), so their relationship with total weight
cannot be predicted.
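Correlation with total weight can also be measured directly with the Pearson correlation coefficient, computed once per garbage category. The sketch below is a minimal illustration only: the function name is an assumption, and the weights are HYPOTHETICAL placeholder values, not the household data from this case.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# HYPOTHETICAL placeholder weights for four households (not the case data)
plastic = [1.0, 2.0, 3.0, 4.0]
total = [10.0, 19.0, 31.0, 40.0]

r = pearson_r(plastic, total)
print(round(r, 3))
```

A coefficient near 1 or -1 indicates a category that moves closely with total weight, while a coefficient near 0 indicates a category that cannot be predicted from the total.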
The pie chart above illustrates a noticeable difference in word counts between men and
women: women contributed 53.9% of the total word count, whereas men contributed
46.1%. This suggests that in this context, women are more talkative than men.
The bar chart provides a clear visual representation of the data, highlighting that women
surpass men when it comes to word count.
The box plot shows that women typically contribute more words than men, as reflected in
their higher median word count. The women’s word counts also exhibit a more extensive
spread, encompassing a wider range of values. Men, on the other hand, exhibit a lower
median and a narrower range of word counts, although their maximum word count is
higher than the women’s. Moreover, the presence of outliers in both genders’ data
indicates extreme values in their word counts. These observations collectively offer a
thorough perspective on the distribution of word counts.
The histogram provides a comprehensive view of the word count distribution between
men and women. Women have a higher mean word count of 16,215 compared to men’s
mean of 15,669, a difference of 546 words, indicating that, as a group, women tend to
speak slightly more. Moreover, the approximately normal shape of the histogram suggests
that the word counts are not skewed, which lends some support to the conclusion that
women tend to have a higher word count than men.
Situation 4: Speed Dating
Data are from 199 speed-dating encounters. DEC BY FEM is decision (1 = yes) of female to
date again, AGE FEM is age of female, LIKE BY FEM is “like” rating by female of male
(scale of 1-10), ATTRACT BY FEM is “attractive” rating by female of male (scale of 1-10),
ATTRIB BY FEM is sum of ratings of five attributes (sincerity, intelligence, fun,
ambitious, shared interests) by female of male. Data for males use corresponding
descriptors. Higher scale ratings correspond to more positive impressions.
Answer each of the following questions by showing a graph and making a conclusion based
on the graph.
a) Is a woman more likely to decide to date again if the man is older than her, or less likely to
decide to date again if the man is younger than her? Is there a threshold on age difference
that affects a woman’s decision to date again?
The dotplots appear approximately normally distributed, which suggests that no threshold
on age difference affects a woman’s decision to date again.
b) Is a man more likely to decide to date again if the woman is younger than him, or less
likely to date again if the woman is older than him? Is there a limit on how much older a
woman can be before a man will not pursue her?
Both dotplots appear approximately normally distributed and show a limit: a man would
decide not to date a woman again if she is more than 10 years younger or more than 14
years older than him.
c) Are attraction and liking strongly related? Are the attraction score and the liking score
usually both high or both low, and less likely to be high-low or low-high? Create different
graphs for men and for women.
Both scatterplots show a linear correlation between attraction and liking, for men as well
as for women.
d) Compare and contrast the sample observations when both decide to pursue dating (dec
by male=1 and dec by fem=1) versus the observations when both decide to not pursue
dating (dec=0, dec=0 for both). Is liking, attraction, or attribution the factor that decides
this mutual decision?
The boxplot shows that higher average attribution, like, and attraction scores are all
associated with a higher probability of a mutual decision to date.
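The comparison behind this kind of boxplot can be sketched as a grouped mean: select the encounters with a given pair of decisions, then average each score. The field names and the rows below are HYPOTHETICAL placeholders chosen for illustration; they are not the 199 encounters from this case.

```python
from statistics import mean

# HYPOTHETICAL example rows (not the case data); field names are illustrative
encounters = [
    {"dec_by_fem": 1, "dec_by_male": 1, "like_by_fem": 8, "attract_by_fem": 7, "attrib_by_fem": 38},
    {"dec_by_fem": 1, "dec_by_male": 1, "like_by_fem": 9, "attract_by_fem": 8, "attrib_by_fem": 40},
    {"dec_by_fem": 0, "dec_by_male": 0, "like_by_fem": 4, "attract_by_fem": 3, "attrib_by_fem": 25},
    {"dec_by_fem": 0, "dec_by_male": 0, "like_by_fem": 5, "attract_by_fem": 5, "attrib_by_fem": 27},
    {"dec_by_fem": 1, "dec_by_male": 0, "like_by_fem": 7, "attract_by_fem": 6, "attrib_by_fem": 33},
]

SCORES = ["like_by_fem", "attract_by_fem", "attrib_by_fem"]


def group_means(rows, dec_fem, dec_male, fields):
    """Mean of each score over rows matching the given pair of decisions."""
    selected = [
        r for r in rows
        if r["dec_by_fem"] == dec_fem and r["dec_by_male"] == dec_male
    ]
    return {f: mean(r[f] for r in selected) for f in fields}


both_yes = group_means(encounters, 1, 1, SCORES)
both_no = group_means(encounters, 0, 0, SCORES)

for field in SCORES:
    print(field, both_yes[field], both_no[field])
```

The same helper could be called with decisions (1, 0) or (0, 1) to examine the encounters in which only one side wants to pursue dating.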
e) Take the observations when the woman wants to pursue but the man says no (dec by
fem=1 and dec by male=0). Is liking, attraction or attribution a factor in predicting this
outcome? Can you identify a cause-and-effect statement?
Average like, attraction, and attribution scores do not affect this outcome.
f) Take the observations when the man wants to pursue but the woman says no (dec by
fem=0 and dec by male=1). Is liking, attraction or attribution a factor in predicting this
outcome? Can you identify a cause-and-effect statement?
Average like, attraction, and attribution scores do not affect this outcome.