Decision Science Assignment
ASSIGNMENT
ANSWER 1
INTRODUCTION:
Probability is the chance that a random event occurs. For equally likely outcomes, it is defined
as the ratio of favorable outcomes to the total number of possible outcomes. It is denoted by the
symbol “P”. The probabilities of all possible outcomes of an experiment sum to 1. The probability
of any event lies between 0 and 1: it cannot be negative because a count of outcomes cannot be
negative, and it cannot exceed 1 because the favorable outcomes cannot exceed the total outcomes.
Common illustrations of probability include drawing a card from a deck, winning a lottery,
tossing a coin, and rolling a die. Beyond these, companies use probability for risk evaluation,
meteorologists use it to forecast the weather from observed changes in conditions, and analysts
use it to assess the likelihood of a rise in share prices. The formula for probability is:
Probability (P) = No. of Favorable Outcomes / Total No. of Outcomes
CONCEPT:
The question calls for BAYES’ THEOREM.
Bayes’ Theorem gives the probability of an event on the basis of prior knowledge of conditions
related to that event. It is an extension of the law of conditional probability, and the
technique of calculating conditional probabilities in this way is also known as Bayes’ Rule. The
rule is widely used in fields such as medicine, sports, philosophy, law, and engineering.
P (X|Y) = P (Y|X) × P (X) / P (Y)
Where,
P (X|Y) = Conditional Probability of event X given that event Y has already occurred
P (Y|X) = Conditional Probability of event Y given that event X has already occurred
P (X) = Probability of event X
P (Y) = Probability of event Y
For solving Bayes’ Rule problems, a Tree Diagram is one of the clearest ways to lay out the
probabilities.
EVENT
    BAD MOOD (0.10)
        HAVING DISEASE (1 – 0.15 = 0.85)
        NOT HAVING DISEASE (0.15)
    HEALTHY MOOD (0.90)
        HAVING DISEASE (0.29)
        NOT HAVING DISEASE (0.71)
With the help of the Tree Diagram, we can easily calculate the remaining probabilities, because
the probabilities on the branches leaving any node sum to 1. The first layer of the tree diagram
divides the event into Bad Mood and Healthy Mood.
P (B) = 0.10, therefore, P (H) = 0.90
Here, P (H) = P (B’) is the probability of having a Healthy Mood, i.e. of not having a Bad
Mood.
P (H) = P (B’) = 1 – P (B)
= 1 – 0.10
= 0.90
The second layer of the tree diagram divides each of Bad Mood and Healthy Mood into two
branches, Having Disease (D) and Not Having Disease. The probability of not having the disease
given a bad mood is 0.15, and the probability of not having the disease given no bad mood is
0.71. The complements give P (D|B) = 1 – 0.15 = 0.85 and P (D|H) = 1 – 0.71 = 0.29.
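Applying Bayes’ Theorem, with D denoting Having Disease:
P (B|D) = P (D|B) × P (B) / [P (D|B) × P (B) + P (D|H) × P (H)]
= (0.85 × 0.10) / (0.85 × 0.10 + 0.29 × 0.90)
= 0.085 / 0.346
= 0.2457
Thus, the probability of having a Bad Mood given Periodontal Disease is 0.2457. In other words,
there is a 24.57% chance that a person who has periodontal disease is in a bad mood.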
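The calculation can be cross-checked with a short script. This is a minimal sketch that uses only the probabilities shown in the tree diagram; the variable names are illustrative and not part of the original question.

```python
# Probabilities read from the tree diagram.
p_bad = 0.10                       # P(B): bad mood
p_healthy = 1 - p_bad              # P(H) = P(B')
p_no_disease_given_bad = 0.15      # P(D'|B)
p_no_disease_given_healthy = 0.71  # P(D'|H)

# Complements give the "having disease" branches.
p_disease_given_bad = 1 - p_no_disease_given_bad          # P(D|B)
p_disease_given_healthy = 1 - p_no_disease_given_healthy  # P(D|H)

# Total probability of disease, then Bayes' Theorem for P(B|D).
p_disease = p_disease_given_bad * p_bad + p_disease_given_healthy * p_healthy
p_bad_given_disease = p_disease_given_bad * p_bad / p_disease

print(round(p_bad_given_disease, 4))  # 0.2457
```

The denominator is the total probability of disease across both mood branches, which is why the tree diagram is a convenient starting point.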
CONCLUSION:
With the help of the Tree Diagram and Bayes’ Theorem, we determine that there is a 24.57% chance
that a person suffering from periodontal disease is in a bad mood. Since this is higher than the
10% chance of a bad mood overall, the result indicates an association between people's mood and
the disease from which they are suffering.
ANSWER 2
INTRODUCTION:
REGRESSION ANALYSIS is a statistical approach, used across finance, investing, and other fields,
that estimates the nature and strength of the relationship between a dependent variable and one
or more independent variables. The dependent variable is represented by “y”, and the independent
variable, also known as the predictor, is represented by “x”. It is a useful technique that helps
an organization measure the effect of the independent variables on the dependent variable, i.e.
the extent to which the independent variables affect the dependent variable. Many companies apply
regression analysis to their data to uncover insights that previously went unnoticed and to help
optimize resources, manufacturing processes, and logistics. It also aids in predicting future
events and reducing errors, which can increase the productivity and efficiency of the firm.
CONCEPT:
Regression Analysis is a mathematical and statistical technique that examines the relationship
between a dependent variable and one or more independent (predictor) variables. The analysis
can be done with different methods such as Simple Regression Analysis and Multiple Regression
Analysis.
▪ SIMPLE REGRESSION ANALYSIS refers to the analysis of a data set that has a single
dependent variable and a single independent variable.
The equation of the line is: y = a + bx + u
▪ MULTIPLE REGRESSION ANALYSIS refers to the analysis of a data set that has one
dependent variable and more than one independent variable.
The equation is: y = a + b1x1 + b2x2 + b3x3 + … + bnxn + u
Where, y = Dependent Variable
x = Independent Variable
a = Y Intercept
b = Slope (change in y for a one-unit change in x)
u = Error of Prediction
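The simple regression equation above can be illustrated with a least-squares fit. This is a minimal sketch: the function name and the four data points are hypothetical, chosen only so that the intercept a and slope b are easy to verify by hand. They are not the assignment's data.

```python
def fit_simple_regression(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The fitted line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x, for illustration only.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_simple_regression(xs, ys)
print(a, b)  # 1.0 2.0
```

With real data the points do not fall exactly on a line, and the leftover differences between actual and fitted values are the error term u.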
It is hard to examine raw data directly and draw conclusions from it. So, we use graphs (a
scatter plot or line chart) to view the relationship, estimate the slope and intercept, and judge
whether a linear model suits the company's data. Excel provides a set of tools that run a
regression on our data. With a well-fitting regression model in Excel, we can predict future
values for the company.
SUMMARY OUTPUT

Regression Statistics
Multiple R             0.046617964
R Square               0.002173235
Adjusted R Square     -0.080978996
Standard Error        62.9409903
Observations          14

ANOVA
              df    SS             MS             F              Significance F
Regression     1    103.538016     103.538016     0.026135613    0.874259616
Residual      12    47538.81913    3961.568261
Total         13    47642.35714
❖ REGRESSION STATISTICS:
It summarizes how well the regression model fits the given data.
o STANDARD ERROR: This shows the accuracy of the regression model. It is the
typical distance between the actual and estimated values. In our regression model,
the standard error is 62.94.
o OBSERVATIONS: The sample size of the data for which the regression model is
created. Here there are 14 observations.
❖ ANOVA:
ANOVA stands for Analysis of Variance. It splits the total variability of the dependent
variable into the part explained by the regression model and the unexplained part. It
comprises the Regression, Residual, and Total rows.
The fitted regression equation is: y = 377.20 + 1.735x
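The regression statistics in the summary output follow directly from the ANOVA rows, so they can be cross-checked. This is a minimal sketch using the standard textbook relationships (R Square = SS Regression / SS Total, and so on); the variable names are illustrative, and the inputs are copied from the table.

```python
import math

# Values copied from the ANOVA table in the summary output.
ss_regression = 103.538016
ss_total = 47642.35714
df_regression = 1
df_residual = 12
n_obs = 14

# Residual sum of squares and the mean squares.
ss_residual = ss_total - ss_regression
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Regression statistics derived from the sums of squares.
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual
standard_error = math.sqrt(ms_residual)
f_statistic = ms_regression / ms_residual

print(f"R Square          {r_square:.6f}")
print(f"Multiple R        {multiple_r:.6f}")
print(f"Adjusted R Square {adjusted_r_square:.6f}")
print(f"Standard Error    {standard_error:.4f}")
print(f"F                 {f_statistic:.6f}")
```

An R Square this close to zero means the regression explains almost none of the total variation, which is also why Adjusted R Square comes out negative.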
CONCLUSION:
In conclusion, the regression model is not a good fit for the company's past records. The values
of Multiple R (0.047), R Square (0.002), and the F statistic (0.026) are very low, Adjusted R
Square is negative, and Significance F (0.874) is far above the 0.05 threshold that corresponds
to the 95% confidence level. Thus, there is no significant linear relationship between the
variables, i.e. the number of followers and the number of posts per day.
ANSWER 3 (A)
INTRODUCTION:
NORMAL DISTRIBUTION is one of the fundamental and most important continuous probability
distributions of a random variable. Its graph is known as the bell-shaped curve; it is uni-modal,
and all possible values lie under the curve. The total area under the bell-shaped curve is the
sum of all probabilities and equals 1. The normal distribution has several features: it is a
continuous distribution, and it is symmetrical about its mean, i.e. one half of the distribution
mirrors the other half. The curve is asymptotic to the horizontal axis, which means it approaches
the x-axis but never touches it. In addition, it is a family of curves, each determined by two
values: the Mean and the Standard Deviation.
CONCEPT:
A special type of normal distribution in which the mean is 0 and the standard deviation is 1 is
known as the STANDARD NORMAL DISTRIBUTION. It is also known as the Z DISTRIBUTION. In a
normal distribution, individual values are referred to as “x”, whereas in the standard normal
distribution they are referred to as “z”. The probability corresponding to a z-score is the
probability that a value falls below that z-score. The z-score can be calculated as follows:
z = (x – μ) / σ
Where, μ = Mean and σ = Standard Deviation
According to the question, we need to determine the interval that must be allowed between
replacements so that not more than 10% of the bulbs expire before replacement; this can be
done with the help of the Standard Normal Distribution and z-scores.
It is given that;
Mean (μ) = 120 days
Standard Deviation (σ) = 20 days
Number of light bulbs = 1000
We need to find;
Value of x = the lifetime, in days, before which not more than 10% of the light bulbs expire.
P (X < x) = 0.10. This means 10% (0.10) of the bulbs are expected to expire before replacement.
We search the z distribution table for the z-score with a cumulative area of 0.10 to its left.
That z-score is approximately -1.28.
Thus, z – score = -1.28
z = (x – μ) / σ
-1.28 = (x – 120) / 20
-1.28 × 20 = x – 120
-25.6 = x – 120
-25.6 + 120 = x
94.4 = x
So, the interval between replacements of the 1000 light bulbs that ensures not more than
10% of the bulbs expire before replacement is 94.4 days.
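The table lookup and the algebra above can be cross-checked with Python's standard library. This is a minimal sketch, assuming the bulb lifetime is normally distributed with the stated mean of 120 days and standard deviation of 20 days; the variable names are illustrative.

```python
from statistics import NormalDist

mean_life_days = 120
sd_life_days = 20
max_share_expired = 0.10  # at most 10% may expire before replacement

# z-score with 10% of the standard normal area to its left.
z_score = NormalDist().inv_cdf(max_share_expired)

# The same percentile on the lifetime distribution gives the interval.
lifetime = NormalDist(mu=mean_life_days, sigma=sd_life_days)
interval_days = lifetime.inv_cdf(max_share_expired)

print(round(z_score, 2), round(interval_days, 1))  # -1.28 94.4
```

The unrounded z-score is slightly below -1.28, so the exact interval is a little under 94.4 days; the difference comes only from rounding the table value.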
CONCLUSION:
In conclusion, the bulbs should be replaced every 94.4 days if the firm wants no more than 10%
of them to expire before replacement. Thus, for the 1000 light bulbs with a mean life of 120
days and a standard deviation of 20 days, the interval between replacements should be 94.4 days.
ANSWER 3 (B)
INTRODUCTION:
Measures of Central Tendency are a statistical concept that lets a single number represent a
whole distribution or set of data. Each measure of central tendency summarizes the distribution
with one typical value. There are three such measures: Mean, Median and Mode. Mean is the
average of the recorded observations. Median is the middle value of the ordered data. Mode is
the value that occurs most often in the data set. Of these, Mean is the most commonly used
measure of central tendency.
CONCEPT:
Mean is also referred to as Average. It is computed by dividing the sum of all observations by
the total number of observations. Mean can be calculated for grouped as well as ungrouped data.
It is denoted by x̄.
The given data is a continuous (grouped) series, so the formula for the Mean is:
MEAN (x̄) = ∑fx / ∑f
Where; f = frequency of the class interval
x = mid value of the interval
∑ = summation
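The grouped-mean formula can be sketched in a few lines. This is a minimal illustration: the function name, age groups, and migrant counts are hypothetical, chosen for easy hand-checking, and are not the assignment's data.

```python
def grouped_mean(class_limits, frequencies):
    """Mean of a grouped (continuous) series: sum(f * x) / sum(f)."""
    # x is the mid value of each class interval.
    mid_values = [(lower + upper) / 2 for lower, upper in class_limits]
    total_fx = sum(f * x for f, x in zip(frequencies, mid_values))
    total_f = sum(frequencies)
    return total_fx / total_f


# Hypothetical age groups (years) and migrant counts, for illustration only.
age_groups = [(0, 20), (20, 40), (40, 60)]
migrants = [10, 30, 20]

print(round(grouped_mean(age_groups, migrants), 2))  # 33.33
```

Here the mid values are 10, 30 and 50, so ∑fx = 2000 and ∑f = 60. The same function would be applied once to the male frequencies and once to the female frequencies.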
So, with the given dataset, we first need to calculate the Mid Value of each age group.
The Mid Value is calculated as follows:
Mid Value (x) = (Lower Limit + Upper Limit) / 2
Next, we calculate the product of the mid value with the frequency of migrants, separately for
males (MX) and females (FX).
The columns Male Migrants, Female Migrants, MX and FX are then summed separately. Therefore,
the values are:
CONCLUSION:
In conclusion, the Mean Age of Male Migrants and Female Migrants is 31.69 years and 36.67 years
respectively. It indicates that female migrants are, on average, older than male migrants. This
could be due to many factors, such as differences in migration patterns or in life expectancy.