Assignment 1 Ans (Reference)
Q2) Identify the data type of each of the following:
Nominal, Ordinal, Interval or Ratio.
Data                                 Data Type
Gender                               Nominal
High School Class Ranking            Ordinal
Celsius Temperature                  Interval
Weight                               Ratio
Hair Color                           Nominal
Socioeconomic Status                 Ordinal
Fahrenheit Temperature               Interval
Height                               Ratio
Type of living accommodation         Nominal
Level of Agreement                   Ordinal
IQ (Intelligence Scale)              Interval
Sales Figures                        Ratio
Blood Group                          Nominal
Time Of Day                          Interval
Time on a Clock with Hands           Interval
Number of Children                   Ratio
Religious Preference                 Nominal
Barometer Pressure                   Ratio
SAT Scores                           Interval
Years of Education                   Ratio
Q3) Three Coins are tossed, find the probability that two heads and one tail are
obtained?
Ans- Three coins are tossed, so the possible outcomes are: HHH, HHT,
HTH, THH, TTT, TTH, THT and HTT
Two heads and one tail occur in 3 of these outcomes (HHT, HTH, THH), so the
probability = (favourable outcomes / total no of outcomes) = 3/8 = 37.5 %
In Python
def event_probability(event_outcomes, sample_space):
    # Probability of the event, expressed as a percentage
    probability = (event_outcomes / sample_space) * 100
    return round(probability, 1)

tevents = 8  # total number of outcomes
ievents = 3  # outcomes with 2 heads and 1 tail
HT_probability = event_probability(ievents, tevents)
print('Probability of getting 2 heads & 1 tails is:', str(HT_probability) + '%')
Probability of getting 2 heads & 1 tails is: 37.5%
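The counts 8 and 3 can also be derived rather than typed in. The following is a minimal sketch, assuming fair coins, that enumerates the sample space with itertools; the variable names are illustrative only.

```python
from itertools import product

# Every ordered outcome of tossing three coins: ('H','H','H'), ('H','H','T'), ...
outcomes = list(product("HT", repeat=3))

# Keep the outcomes with exactly two heads (and therefore one tail)
favourable = [o for o in outcomes if o.count("H") == 2]

probability = len(favourable) / len(outcomes)
print(len(outcomes), len(favourable), probability)  # 8 3 0.375
```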
Q4) Two Dice are rolled, find the probability that sum is
a) Equal to 1
b) Less than or equal to 4
c) Sum is divisible by 2 and 3
Ans –
Two Dice are rolled, so the no of outcomes = 6 * 6 = 36
a) The smallest possible sum when two dice are rolled is 2, from the outcome
(1,1), so the probability that the sum is equal to 1 = 0
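Parts b) and c) are not worked out here. One way to check all three parts is to enumerate the 36 outcomes, as in the sketch below. It assumes fair six-sided dice and reads part c) as "sum divisible by both 2 and 3", that is, divisible by 6; the helper name `probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two six-sided dice
rolls = list(product(range(1, 7), repeat=2))

def probability(condition):
    """Fraction of outcomes whose sum satisfies the given condition."""
    hits = sum(1 for a, b in rolls if condition(a + b))
    return Fraction(hits, len(rolls))

p_a = probability(lambda s: s == 1)      # a) sum equal to 1
p_b = probability(lambda s: s <= 4)      # b) sum less than or equal to 4
p_c = probability(lambda s: s % 6 == 0)  # c) sum divisible by 2 and 3

print(p_a, p_b, p_c)  # 0 1/6 1/6
```

Under these assumptions, six outcomes give a sum of at most 4, and six outcomes give a sum of 6 or 12, so both b) and c) come to 6/36 = 1/6.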
Q5) A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at
random. What is the probability that none of the balls drawn is blue?
Ans-
Total number of balls = (2 + 3 + 2) = 7
Let S be the sample space
Then, n(S) = Number of ways of drawing 2 balls out of 7
n(S)=7C2
n(S)=(7×6) / (2×1)= 21
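The working stops at n(S). The remaining step is to count the draws that contain no blue ball: 5 balls are not blue, so there are 5C2 = 10 such draws and the probability is 10/21. A short sketch using `math.comb` confirms the arithmetic; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

red, green, blue = 2, 3, 2
total = red + green + blue

n_s = comb(total, 2)         # ways to draw any 2 balls out of 7
n_e = comb(red + green, 2)   # ways to draw 2 balls that are not blue

p_no_blue = Fraction(n_e, n_s)
print(n_s, n_e, p_no_blue)  # 21 10 10/21
```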
Q6) Calculate the Expected number of candies for a randomly selected child
Below are the probabilities of count of candies for children (ignoring the nature of
the child-Generalized view)
CHILD Candies count Probability
A 1 0.015
B 4 0.20
C 3 0.65
D 5 0.005
E 6 0.01
F 2 0.120
Child A – probability of having 1 candy = 0.015.
Child B – probability of having 4 candies = 0.20
Ans –
Expected number of candies for a randomly selected child
= (1 x 0.015) + (4 x 0.20) + (3 x 0.65) + (5 x 0.005) + (6 x 0.01) + (2 x 0.12)
= 0.015 + 0.8 + 1.95 + 0.025 + 0.06 + 0.24 = 3.09
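The same expected value can be checked in Python. This is a minimal sketch using the table above; the dictionary layout is just one convenient way to hold the data.

```python
# Candies count and probability for each child, taken from the table above
candies = {
    "A": (1, 0.015),
    "B": (4, 0.20),
    "C": (3, 0.65),
    "D": (5, 0.005),
    "E": (6, 0.01),
    "F": (2, 0.120),
}

# The probabilities should sum to 1 for a valid distribution
total_probability = sum(p for _, p in candies.values())

# Expected value = sum of (value x probability)
expected = sum(count * p for count, p in candies.values())

print(round(total_probability, 3), round(expected, 2))  # 1.0 3.09
```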
Q7) Calculate Mean, Median, Mode, Variance, Standard Deviation and Range
for Points, Score and Weigh in the given dataset, and comment on the
values / draw inferences.
Use Q7.csv file
Ans-
import pandas as pd
df = pd.read_csv('D:\\Study\\Assignments\\Q7.csv')
df
df.describe()
df.var()
                          Points      Score       Weigh
Mean                      3.596563    3.217250    17.848750
Median                    3.695000    3.325000    17.710000
Mode                      3.92        3.44        17.02
Variance (s²)             0.285881    0.957379    3.193166
Standard Deviation (s)    0.534679    0.978457    1.786943
Range                     2.17        3.911       8.4
Inferences:
In this data on different car models, the average Points is 3.596563, the
average Score is 3.217250 and the average Weigh is 17.848750. The standard
deviation is low for Points and Score, so large outliers are less likely in
those two columns. Weigh has a somewhat higher standard deviation, so it may
contain some outliers.
In all three columns the mean is close to the median and the spread is
modest, so most of the data points lie near the centre.
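`describe()` and `var()` do not report the mode or the range, so those two rows need separate calls. The sketch below shows one way to compute all six statistics with pandas. The five-value series is made-up illustration data, not the contents of Q7.csv, and the `summary` dictionary is a hypothetical helper.

```python
import pandas as pd

# Hypothetical stand-in for one numeric column of the dataset
points = pd.Series([2, 4, 4, 5, 7])

summary = {
    "mean": points.mean(),
    "median": points.median(),
    "mode": points.mode()[0],             # mode() returns a Series; take the first value
    "variance": points.var(),             # sample variance (divides by n - 1)
    "std": points.std(),                  # sample standard deviation
    "range": points.max() - points.min(),
}

for name, value in summary.items():
    print(name, value)
```

The same calls work column by column on a DataFrame, for example `df['Points'].mode()`, assuming the column names match those in the table.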
Inferences:
For car speed, both the skewness and the kurtosis are negative. Negative
skewness means the distribution is left skewed (negatively skewed): the
longer tail is on the left, and the mean is typically less than the
median. Negative kurtosis indicates a broad, flat peak and thin tails.
For the distance travelled by the car, both the skewness and the kurtosis
are positive. Positive skewness means the distribution is right skewed
(positively skewed): the longer tail is on the right, and the mean is
typically greater than the median. Positive kurtosis indicates a sharp
peak and heavy tails.
SP and Weight (WT)
Use Q9_b.csv
Ans:
import pandas as pd
Q9_b = pd.read_csv("D:\\Study\\Assignments\\Q9_b.csv")
Q9_b
Q9_b.skew(axis = 0, skipna = True)
Q9_b.kurt(axis = 0, skipna = True)
Skewness of SP = 1.611450
Skewness of WT = -0.614753
Kurtosis of SP = 2.977329
Kurtosis of WT = 0.950291
Inferences:
For SP, both the skewness and the kurtosis are positive. The distribution
is right skewed (positively skewed): the longer tail is on the right, and
the mean is typically greater than the median. The positive kurtosis
indicates a sharp peak and heavy tails.
For WT, the skewness is negative and the kurtosis is positive. The
distribution is left skewed (negatively skewed): the longer tail is on the
left, and the mean is typically less than the median. The positive
kurtosis again indicates heavier tails than a normal distribution.
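The link between the sign of the skewness and the order of the mean and median can be seen on a small example. The sketch below uses made-up values with one large observation on the right; it is not taken from Q9_b.csv.

```python
import pandas as pd

# Hypothetical data with a long right tail (one large value)
values = pd.Series([1, 2, 2, 3, 3, 3, 4, 10])

skewness = values.skew()   # sample skewness, as used above
kurtosis = values.kurt()   # excess kurtosis (0 for a normal distribution)

print("mean:", values.mean(), "median:", values.median())
print("skewness:", skewness, "kurtosis:", kurtosis)

# Positive skewness goes with a mean pulled above the median
if skewness > 0:
    print("right skewed")
elif skewness < 0:
    print("left skewed")
```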
Q10) Draw inferences about the following boxplot & histogram
Inference:
The histogram and box plot above show that the distribution has outliers at
the upper end (in the right tail of the histogram and beyond the upper
whisker of the box plot). The distribution is positively skewed (right skewed).
Ans:
94% confidence interval = 198.73 to 201.27
98% confidence interval = 198.43 to 201.56
96% confidence interval = 198.62 to 201.38
OR in Python
1.
from scipy import stats
import numpy as np
from math import sqrt
ci_94 = stats.norm.interval(0.94,200,scale = (30/sqrt(2000)))
print('Weight at 94% confidence interval is:',np.round(ci_94,4))
Weight at 94% confidence interval is: [198.7383 201.2617]
2.
ci_98 = stats.norm.interval(0.98,200,scale = (30/sqrt(2000)))
print('Weight at 98% confidence interval is:',np.round(ci_98,4))
Weight at 98% confidence interval is: [198.4394 201.5606]
3.
ci_96 = stats.norm.interval(0.96,200,scale = (30/sqrt(2000)))
print('Weight at 96% confidence interval is:',np.round(ci_96,4))
Weight at 96% confidence interval is: [198.6223 201.3777]
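`stats.norm.interval` applies the formula mean ± z × (sd / √n). The sketch below writes that formula out by hand, assuming, as the calls above do, a mean of 200, a standard deviation of 30 and a sample size of 2000. The function name `z_interval` is hypothetical.

```python
from math import sqrt
from scipy import stats

def z_interval(mean, sd, n, confidence):
    """Two-sided confidence interval for the mean using the normal distribution."""
    # Critical value z such that the central area equals the confidence level
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)   # z times the standard error
    return mean - margin, mean + margin

low, high = z_interval(200, 30, 2000, 0.96)
print(round(low, 4), round(high, 4))
```

Small differences between hand calculations and the Python output usually come from rounding z before multiplying.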
#a. P(MPG>38)
from scipy import stats
1-stats.norm.cdf(38,34.42,9.13)
0.34748702501304063
#b. P(MPG<40)
stats.norm.cdf(40,34.42,9.13)
0.7294571279557076
#c. P(20<MPG<50)
stats.norm.cdf(50,34.42,9.13)-stats.norm.cdf(20,34.42,9.13)
0.8989177824549222
plt.hist(cars['MPG'])
From the box plot and histogram above, the MPG of Cars appears to follow an
approximately normal distribution.
plt.hist(wc_at["Waist"])
plt.boxplot(wc_at["Waist"])
From the histograms and box plots of AT and Waist in the wc-at data set,
both variables appear to follow an approximately normal distribution.
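A visual check can be backed up with a formal test. The sketch below uses the Shapiro-Wilk test from scipy, which is not part of the original answer. The sample is synthetic data generated for illustration (the mean and standard deviation are borrowed from the MPG example), so the real columns would need to be passed in instead.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for a column such as cars['MPG'] or wc_at['Waist']
rng = np.random.default_rng(0)
sample = rng.normal(loc=34.42, scale=9.13, size=80)

# Null hypothesis of Shapiro-Wilk: the data come from a normal distribution
statistic, p_value = stats.shapiro(sample)
print("W =", statistic, "p-value =", p_value)

if p_value > 0.05:
    print("No evidence against normality at the 5% level")
else:
    print("Normality is rejected at the 5% level")
```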
Q24) A Government company claims that an average light bulb lasts 270
days. A researcher randomly selects 18 bulbs for testing. The sampled bulbs
last an average of 260 days, with a standard deviation of 90 days. If the
CEO's claim were true, what is the probability that 18 randomly selected
bulbs would have an average life of no more than 260 days?
Hint:
R code: pt(tscore, df)
df = degrees of freedom
Ans:
Sample size, n = 18, so df = n - 1 = 17
t = (260-270)/(90/sqrt(18))
= -10/21.21
= -0.47
Probability
pt(tscore,df)
pt(-0.47,17)
ans = 0.3221639
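The same lower-tail probability can be obtained in Python, where `scipy.stats.t.cdf` plays the role of R's `pt`. This is a minimal sketch using the figures from the question; the variable names are illustrative.

```python
from math import sqrt
from scipy import stats

claimed_mean = 270   # days, the claim being tested
sample_mean = 260    # days, observed average
sample_sd = 90       # days, sample standard deviation
n = 18               # sample size

# t-score of the sample mean under the claim
t_score = (sample_mean - claimed_mean) / (sample_sd / sqrt(n))

# Lower-tail probability with n - 1 degrees of freedom (equivalent to pt(t_score, 17))
p = stats.t.cdf(t_score, df=n - 1)

print(round(t_score, 2), round(p, 2))  # -0.47 0.32
```

The unrounded t-score is used here, so the result can differ slightly in later decimal places from the value obtained with t rounded to -0.47.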