Decision Science - Semester 2

The regression analysis shows a weak relationship between Instagram followers and number of posts per day. The correlation coefficient (R) is 0.0466, indicating a poor fit of the linear regression model to the data. The coefficient of determination (R-squared) is only 0.00217, meaning the independent variable only explains 0.2% of the variation in the dependent variable. Based on the p-value of 0.874 for the F-statistic, the regression model is not statistically significant. Overall, the number of posts per day does not appear to be a good predictor of the number of Instagram followers based on this data.

NMIMS – Global Access School For Continuing Education (NGA-SCE)
Subject: Decision Science
Student ID: 77122921935
Internal Assignment: June’2023 – Semester 2nd

Prepared By,
Vibhor Shrivastava
Question:1) Bad gums may mean a bad mood. Researchers discovered that 85% of people who have
suffered a bad mood had periodontal disease, an inflammation of the gums. Only 29% of healthy
people have this disease. Suppose that in a certain community bad moods are quite rare, occurring
with only 10% probability. If someone has periodontal disease, what is the probability that he or she
will have a bad mood?
Note: Draw the tree diagram for the above problem. Handwritten tree diagram is prohibited.
Answer:1)
INTRODUCTION:
A tree diagram allows users to visualize possible outcomes and probabilities for a given situation. Tree
diagrams, also called decision trees, are particularly useful in charting the outcomes of dependent
events, where if one element changes, it impacts the entire outcome. Tracking and analysing cause-
and-effect scenarios is much easier when you have a visual aid such as a tree diagram. In a tree
diagram, each "branch" of the tree connects an idea or a step in the process to a possible outcome.
Outcomes are commonly referred to as "nodes" on a tree diagram. The resulting diagram resembles a
tree with many options and outcomes that branch off from the original idea. Tree diagrams are versatile
and useful for decision-making and other tasks across various fields and industries, including
marketing, software development, logistics, project management, and more.
Process to make a tree diagram:
a. Choose your main concept, idea, or topic: This could be a problem you need to solve, a project
you're starting, or another topic.
b. Place your main concept at the top of your diagram: Tree diagrams are hierarchical, so you
should always start with your biggest, broadest idea and get more specific as you go.
c. Create the first branches: Your first level of branches will be ideas or steps that would come
immediately after or are immediately related to the main concept.
d. Keep adding branches: Add more ideas based on your first layer of branches and continue
branching off until you reach a conclusion or outcome of each path.
e. Finish your tree diagram: Once you've exhausted all ideas, you should have enough possible
outcomes mapped out to assist you in solving your problem, making your decision, pursuing your
project, or moving forward with whatever situation inspired your tree diagram.

DESCRIPTION & CONTEXT:

The solution uses Bayes' theorem, which states that the probability of A given B is equal to
the probability of B given A times the probability of A, divided by the probability of B.

Given:
Proportion of people with a bad mood who have periodontal disease = 85%
Proportion of healthy people (no bad mood) who have periodontal disease = 29%
Proportion of the community that has a bad mood = 10%
To Find:
The probability that a person has a bad mood, given that he or she has periodontal disease, in a
community where 10% of the population has a bad mood

Let A be the event of having periodontal disease and B the event of having a bad mood. Then:
P(A/B) is the probability of having periodontal disease given a bad mood = 0.85
P(A/B') is the probability of having periodontal disease without a bad mood = 0.29
P(B) is the probability of having a bad mood = 0.1
P(B') = 1 - P(B) = 0.9 is the probability of not having a bad mood
P(A) is the overall probability of having periodontal disease
P(B/A) is the required probability of having a bad mood given periodontal disease
To find P(A), we use the law of total probability:
P(A) = P(A/B) * P(B) + P(A/B') * P(B')
Substituting the values we have:

P(A) = 0.85 * 0.10 + 0.29 * 0.90 = 0.085 + 0.261 = 0.346

Now we can find P(B/A):

P(B/A) = P(A/B) * P(B) / P(A) = 0.085 / 0.346 = 0.246 or 25% (Approximately)

CONCLUSION:
From the above calculations, the probability that someone with periodontal disease will have a bad
mood is 0.246, or about 25%.
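
The branches of the tree diagram can also be laid out numerically. The following is a minimal Python sketch, not part of the original answer, that multiplies the probabilities along each path of the tree and then applies Bayes' theorem. The variable names are illustrative assumptions; the three input probabilities come from the question.

```python
# Input probabilities taken from the question.
p_mood = 0.10                    # P(B): person has a bad mood
p_disease_given_mood = 0.85      # P(A/B): disease given a bad mood
p_disease_given_healthy = 0.29   # P(A/B'): disease given no bad mood

# Each leaf of the tree is the product of the probabilities along its path.
leaves = {
    ("bad mood", "disease"): p_mood * p_disease_given_mood,
    ("bad mood", "no disease"): p_mood * (1 - p_disease_given_mood),
    ("no bad mood", "disease"): (1 - p_mood) * p_disease_given_healthy,
    ("no bad mood", "no disease"): (1 - p_mood) * (1 - p_disease_given_healthy),
}

# Law of total probability: add the two leaves that end in "disease".
p_disease = leaves[("bad mood", "disease")] + leaves[("no bad mood", "disease")]

# Bayes' theorem: share of the "disease" leaves that come from "bad mood".
p_mood_given_disease = leaves[("bad mood", "disease")] / p_disease

for path, prob in leaves.items():
    print(" -> ".join(path), round(prob, 4))
print("P(A)   =", round(p_disease, 4))
print("P(B/A) =", round(p_mood_given_disease, 4))
```

The four leaf probabilities add up to 1, which is a quick check that the tree has been drawn completely.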

*---------xxx---------*

Question:2) Using MS-EXCEL show the Regression model, consider ‘Instagram followers’ as
dependent variable and ‘no. of posts per day’ as an independent variable. Write the interpretation of
EXCEL Tables. Write the conclusion on the fitting of your model also.
No. of followers No. of post per day
439 2
340 1
315 4
444 5
377 2
456 5
495 2
304 2
401 5
305 5
338 4
348 2
402 1
395 5

Answer:2)
INTRODUCTION:
Regression analysis is a set of statistical methods used for the estimation of relationships between a
dependent variable and independent variables. We can use it to assess the strength of the relationship
between variables and for modelling the future relationship between them.

DESCRIPTION & CONTEXT:

When using regression analysis to estimate the relationship between two or more variables, there are
two basic terms to be understood. The first is the “Dependent Variable”, which is the factor you are
trying to predict. The second is the “Independent Variable”, which is the factor that might influence the
dependent variable.
Consider the following data, where ‘Instagram followers’ is the dependent variable and ‘no. of
posts per day’ is the independent variable:
No. of followers No. of post per day
439 2
340 1
315 4
444 5
377 2
456 5
495 2
304 2
401 5
305 5
338 4
348 2
402 1
395 5

“Summary Output” tells you how well the calculated linear regression equation fits your data source.

SUMMARY OUTPUT
Regression Statistics
Multiple R 0.046617964
R Square 0.002173235
Adjusted R Square -0.080978996
Standard Error 1.690228701
Observations 14

“Multiple R” is the Correlation Coefficient that measures the strength of a linear relationship between
two variables. The larger the absolute value, the stronger the relationship.
• 1 means a perfect positive linear relationship
• -1 means a perfect negative linear relationship
• 0 means no linear relationship at all
“R Square” signifies the Coefficient of Determination, which shows the goodness of fit. It is the
proportion of the variation in the dependent variable that is explained by the regression.
“Adjusted R Square” is the modified version of R square that adjusts for the number of predictors in
the regression model.
“Standard Error” is another goodness-of-fit measure that shows the precision of your regression
analysis. It is the typical distance of the observed values from the regression line.
“ANOVA” stands for Analysis of Variance. It gives information about the levels of variability within
your regression model.

ANOVA
df SS MS F Significance F
Regression 1 0.07466613 0.07466613 0.026135613 0.874259616
Residual 12 34.28247673 2.856873061
Total 13 34.35714286

“df” is the number of degrees of freedom associated with the sources of variance.
“SS” is the sum of squares. The smaller the Residual SS relative to the Total SS, the better the model
fits the data.
“MS” is the mean square, i.e. SS divided by df.
“F” is the F statistic for the null hypothesis that the model explains none of the variation. It is used
to test the overall significance of the model.
“Significance F” is the P-value of F. A value below 0.05 is usually taken to mean that the model is
statistically significant.

             Coefficients  Standard Error  t Stat       P-value      Lower 95%    Upper 95%    Lower 95.0%  Upper 95.0%
Intercept    2.735081178   2.998403817     0.912179061  0.379635263  -3.79787953  9.268041883  -3.79787953  9.268041883
X Variable 1 0.001251887   0.007743706     0.161665127  0.874259616  -0.0156202   0.018123973  -0.0156202   0.018123973

RESIDUAL OUTPUT
Observation Predicted Y Residuals
1 3.284659659 -1.284659659
2 3.160722826 -2.160722826
3 3.129425646 0.870574354
4 3.290919095 1.709080905
5 3.207042653 -1.207042653
6 3.305941742 1.694058258
7 3.354765342 -1.354765342
8 3.115654887 -1.115654887
9 3.237087945 1.762912055
10 3.116906774 1.883093226
11 3.158219052 0.841780948
12 3.170737924 -1.170737924
13 3.238339833 -2.238339833
14 3.229576622 1.770423378

[Figure: scatter plot with the Dependent Variable on the vertical axis and the Independent Variable on the horizontal axis]

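CONCLUSION:
No linear trend is observed between the two variables. Multiple R is only 0.0466 and R Square is
0.0022, so the regression explains about 0.2% of the variation, and Adjusted R Square is negative.
Significance F is 0.874, far above 0.05, so the model is not statistically significant, and the 95%
confidence interval for the slope (-0.0156 to 0.0181) includes zero. Note that the predicted values in
the residual output (about 3.1 to 3.4) are on the scale of posts per day, which indicates that the
Excel run took the number of posts as Y and the number of followers as X. In a simple regression,
Multiple R, R Square and Significance F are the same whichever variable is treated as dependent, so
the conclusion on the fit is unchanged, but the coefficients apply to the reversed model.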
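
For comparison, the model named in the question (followers as the dependent variable, posts per day as the independent variable) can be fitted by ordinary least squares. The following is a minimal sketch in plain Python rather than Excel; the variable names are illustrative assumptions and the data are the 14 observations from the question.

```python
# Data from the question: y = followers (dependent), x = posts per day (independent).
followers = [439, 340, 315, 444, 377, 456, 495, 304, 401, 305, 338, 348, 402, 395]
posts = [2, 1, 4, 5, 2, 5, 2, 2, 5, 5, 4, 2, 1, 5]

n = len(followers)
mean_x = sum(posts) / n
mean_y = sum(followers) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(posts, followers))
s_xx = sum((x - mean_x) ** 2 for x in posts)
s_yy = sum((y - mean_y) ** 2 for y in followers)

# Least-squares line: followers = intercept + slope * posts.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Goodness of fit and overall F statistic (1 and n - 2 degrees of freedom).
r_squared = s_xy ** 2 / (s_xx * s_yy)
ss_regression = slope * s_xy
ss_residual = s_yy - ss_regression
f_stat = ss_regression / (ss_residual / (n - 2))

print("slope     =", round(slope, 4))
print("intercept =", round(intercept, 4))
print("R Square  =", round(r_squared, 6))
print("F         =", round(f_stat, 6))
```

R Square and F do not depend on which of the two variables is treated as dependent, so they can be compared directly with the Excel tables; the slope and intercept, by contrast, are specific to the direction of the fit.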

*---------xxx---------*

Question:3) 1000 light bulbs with a mean life of 120 days are installed in a new factory and their
length of life is normally distributed with standard deviation of 20 days.
Note: You are not supposed to use EXCEL or any other software to write this answer.
Question:3A) If it is decided to replace all the bulbs together, what interval should be allowed
between replacements if not more than 10% should expire before replacement? (5 Marks)
Answer:3A)
SOLUTION:
Given:
Mean life of light bulb (m) = 120 days
Standard deviation (s) = 20 days
Number of light bulbs = 1000
We need to find out the interval between replacements such that not more than 10% of the bulbs expire
before replacement.
As an illustration, suppose the bulbs were replaced after 90 days:
z = (x - m) / s = (90 - 120) / 20 = -1.5
The area under the normal distribution curve to the left of z = -1.5 is 0.0668, so the expected number
of bulbs failing before 90 days = 0.0668 * 1000 = 66.8, i.e. about 67 bulbs (6.68%).
For not more than 10% of the bulbs to expire before replacement, we need the z-score with an area of
10% to its left, which is z = -1.28.
z = (x - m) / s
Inserting the values;
-1.28 = (x – 120)/20
x = 120 – 1.28 * 20 = 94.4 (Approximately)
CONCLUSION:

Hence, if not more than 10% of the bulbs are to fail before replacement, all the bulbs should be
replaced at intervals of not more than about 94 days.
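
The question requires the answer itself to be worked by hand from the normal table. Purely as a cross-check of that hand calculation, the following minimal sketch uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Bulb life is assumed normal with mean 120 days and standard deviation 20 days.
life = NormalDist(mu=120, sigma=20)

# Share of bulbs expected to have failed by day 90 (z = -1.5).
share_failed_by_90 = life.cdf(90)

# Replacement interval at which exactly 10% of bulbs are expected to have failed.
replacement_interval = life.inv_cdf(0.10)

print("Share failed by day 90:", round(share_failed_by_90, 4))
print("Interval for 10% failures:", round(replacement_interval, 2), "days")
```

The interval comes out marginally below the hand value of 94.4 days because the table value of z is rounded to -1.28.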

*---------xxx---------*

Question:3B) Calculate the average age of migrants for both the categories of gender and write your
interpretation.
Age group Male Female
0-4 98,34,738 91,27,975
5-9 1,09,59,506 99,58,059
10-14 1,24,25,108 1,14,51,227
15-19 1,26,83,733 1,65,18,666
20-24 1,31,97,283 3,36,58,466
25-29 1,30,45,214 3,75,22,017
30-34 1,21,34,009 3,42,86,096
35-39 1,20,60,030 3,30,54,887
40-44 1,09,00,143 2,72,61,236
45-49 97,04,026 2,34,47,716
50-54 79,40,152 1,78,42,986
55-59 61,61,754 1,51,92,910
60-64 54,01,736 1,43,47,372
65-69 36,87,082 1,01,41,196
70-74 26,62,421 70,33,728
75-79 13,41,572 34,93,001
80-85 14,61,296 42,53,695

Answer:3B)
SOLUTION:
The average age indicates the age around which the members of a group are centred. Because the data
are grouped, the average is estimated as ∑Fx/∑f, where x is the midpoint of each age group and f is
its frequency.
The calculation for males is as follows:
Age Midpoint x Frequency Fx (Male)
0-4 2 98,34,738 1,96,69,476
5-9 7 1,09,59,506 7,67,16,542
10-14 12 1,24,25,108 14,91,01,296
15-19 17 1,26,83,733 21,56,23,461
20-24 22 1,31,97,283 29,03,40,226
25-29 27 1,30,45,214 35,22,20,778
30-34 32 1,21,34,009 38,82,88,288
35-39 37 1,20,60,030 44,62,21,110
40-44 42 1,09,00,143 45,78,06,006
45-49 47 97,04,026 45,60,89,222
50-54 52 79,40,152 41,28,87,904
55-59 57 61,61,754 35,12,19,978
60-64 62 54,01,736 33,49,07,632
65-69 67 36,87,082 24,70,34,494
70-74 72 26,62,421 19,16,94,312
75-79 77 13,41,572 10,33,01,044
80-85 82 14,61,296 11,98,26,272
Total 14,55,99,803 4,61,29,48,041
Average age for male = ∑Fx/∑f

= 4,61,29,48,041/14,55,99,803

= 31.68
The calculation for females is as follows:
Age Midpoint x Frequency Fx (Female)
0-4 2 91,27,975 1,82,55,950
5-9 7 99,58,059 6,97,06,413
10-14 12 1,14,51,227 13,74,14,724
15-19 17 1,65,18,666 28,08,17,322
20-24 22 3,36,58,466 74,04,86,252
25-29 27 3,75,22,017 1,01,30,94,459
30-34 32 3,42,86,096 1,09,71,55,072
35-39 37 3,30,54,887 1,22,30,30,819
40-44 42 2,72,61,236 1,14,49,71,912
45-49 47 2,34,47,716 1,10,20,42,652
50-54 52 1,78,42,986 92,78,35,272
55-59 57 1,51,92,910 86,59,95,870
60-64 62 1,43,47,372 88,95,37,064
65-69 67 1,01,41,196 67,94,60,132
70-74 72 70,33,728 50,64,28,416
75-79 77 34,93,001 26,89,61,077
80-85 82 42,53,695 34,88,02,990
Total 30,85,91,233 11,31,39,96,396

Average age for female = ∑Fx/∑f

= 11,31,39,96,396/30,85,91,233

= 36.66
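
The two grouped means can be recomputed with a short script. The following is a minimal Python sketch, assuming the same midpoints as in the tables above (2, 7, ..., 82); the function name `grouped_mean` is an illustrative assumption.

```python
# Midpoints of the 17 age groups as used in the tables: 2, 7, 12, ..., 82.
midpoints = list(range(2, 83, 5))

# Frequencies from the question (Indian digit grouping removed).
male = [9834738, 10959506, 12425108, 12683733, 13197283, 13045214,
        12134009, 12060030, 10900143, 9704026, 7940152, 6161754,
        5401736, 3687082, 2662421, 1341572, 1461296]
female = [9127975, 9958059, 11451227, 16518666, 33658466, 37522017,
          34286096, 33054887, 27261236, 23447716, 17842986, 15192910,
          14347372, 10141196, 7033728, 3493001, 4253695]


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum of f*x divided by sum of f."""
    total_fx = sum(x * f for x, f in zip(midpoints, frequencies))
    return total_fx / sum(frequencies)


mean_male = grouped_mean(midpoints, male)
mean_female = grouped_mean(midpoints, female)

print("Average age (male):  ", round(mean_male, 2))
print("Average age (female):", round(mean_female, 2))
```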

CONCLUSION:
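The above calculation shows that the average age of female migrants (36.66 years) is about 5 years
higher than that of male migrants (31.68 years). Female migrants outnumber male migrants in every age
group from 15-19 upwards, which pulls the female average up. One possible explanation is higher male
mortality, which would raise the share of females in each consecutive age group, although the migrant
counts alone cannot confirm this.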

*---------xxx---------*
