Assignment: Basic Statistics 1
Q1) Identify the Data Type for the following
Q2) Identify the Data types, which were among the following:
Q3) Three Coins are tossed, find the probability that two heads and one tail are
obtained?
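Ans. The sample space is HHH, HHT, HTH, THH, HTT, THT, TTH, TTT, i.e. 8 equally likely results.
Three of these (HHT, HTH, THH) contain two heads and one tail, so P = 3/8 = 0.375.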
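The probability for Q3 can be checked by enumerating the sample space in R. This is a minimal sketch; the names `outcomes`, `n_heads` and `p_two_heads` are illustrative and not part of the assignment.

```r
# Enumerate all outcomes of three fair coin tosses (2^3 = 8 rows).
outcomes <- expand.grid(c1 = c("H", "T"), c2 = c("H", "T"), c3 = c("H", "T"),
                        stringsAsFactors = FALSE)

# Count the heads in each outcome.
n_heads <- rowSums(outcomes == "H")

# Exactly two heads implies exactly one tail.
p_two_heads <- sum(n_heads == 2) / nrow(outcomes)
print(p_two_heads)
```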
Q4) Two Dice are rolled, find the probability that sum is
a) Equal to 1
b) Less than or equal to 4
c) Sum is divisible by 2 and 3
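Ans. Possible outcomes when 2 dice are rolled (36 in total):
(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)
a) The smallest possible sum is 2, so P(sum = 1) = 0.
b) Sums of 4 or less: (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1), so P = 6/36 = 1/6.
c) A sum divisible by both 2 and 3 is divisible by 6: (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (6, 6), so P = 6/36 = 1/6.
If the condition is instead read as divisible by 2 or 3, the favourable outcomes (a list that includes
(3, 6), (4, 2), (4, 4), (4, 5), (4, 6), (5, 1), (5, 3), (5, 4), (5, 5), (6, 2), (6, 3)) number 24, so P = 24/36 = 2/3.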
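The three parts of Q4 can be checked by enumerating all 36 rolls. This is a sketch, not the assignment's own code; the variable names are illustrative, and part c) is computed under both readings of the question ("2 and 3" versus "2 or 3").

```r
# All 36 ordered outcomes of two fair dice.
rolls <- expand.grid(d1 = 1:6, d2 = 1:6)
total <- rolls$d1 + rolls$d2

# mean() of a logical vector gives the proportion of TRUE values.
p_a <- mean(total == 1)                              # a) sum equal to 1
p_b <- mean(total <= 4)                              # b) sum less than or equal to 4
p_c_and <- mean(total %% 6 == 0)                     # c) divisible by 2 and 3 (i.e. by 6)
p_c_or <- mean(total %% 2 == 0 | total %% 3 == 0)    # c) alternative "2 or 3" reading

print(c(a = p_a, b = p_b, c_and = p_c_and, c_or = p_c_or))
```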
Q5) A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at random.
What is the probability that none of the balls drawn is blue?
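Ans. Number of ways to draw 2 balls from all 7, i.e. N(Possible events) = 7C2 = 21
Number of ways to draw 2 balls, none of which is blue, i.e. N(Possible events with no blue
balls) = 5C2 = 10
P(Possible events with no blue balls) = N(Possible events with no blue balls)/
N(Possible events)
= 10/21 = 0.476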
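The combination counts in Q5 can be reproduced with R's built-in `choose()`. The variable names below are illustrative only.

```r
# Bag contents from the question: 2 red, 3 green, 2 blue.
total_balls <- 2 + 3 + 2
non_blue <- 2 + 3

# Ways to pick 2 non-blue balls divided by ways to pick any 2 balls.
p_no_blue <- choose(non_blue, 2) / choose(total_balls, 2)
print(round(p_no_blue, 3))
```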
Q6) Calculate the Expected number of candies for a randomly selected child
Below are the probabilities of the count of candies for children (ignoring the nature of
the child; generalized view)
Q7) Calculate Mean, Median, Mode, Variance, Standard Deviation and Range
for Points, Score and Weigh in the given dataset, and comment about the
values / draw inferences.
Ans.
Mean-
Points = 115.09/32 = 3.597
Score = 102.952/32 = 3.217
Weigh = 571.16/32 = 17.85
Median-
Points = (3.69+3.70)/2 = 7.39/2 = 3.695
Score = (3.215+3.435)/2 = 6.65/2 = 3.325
Weigh = (17.6+17.82)/2 = 35.42/2 = 17.71
Mode-
Points = 3.92
Scores = 3.44
Weigh = 17.02
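Variance (population form, dividing by n = 32)-
Points = 8.862/32 = 0.2769
Score = 29.678748/32 = 0.9275
Weigh = 98.98815/32 = 3.0934
Standard Deviation (square root of the variance above)-
Points = 0.526
Score = 0.9630
Weigh = 1.7588
These are slightly smaller than the sd() values reported from R below, because sd() divides by n - 1 = 31 rather than n.
Range (Max - Min, using the summary below)-
Points = 4.930 - 2.760 = 2.170
Score = 5.424 - 1.513 = 3.911
Weigh = 22.90 - 14.50 = 8.40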
From RStudio-
      X1               Points          Score           Weigh
 Length:32          Min.   :2.760   Min.   :1.513   Min.   :14.50
 Class :character   1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89
 Mode  :character   Median :3.695   Median :3.325   Median :17.71
                    Mean   :3.597   Mean   :3.217   Mean   :17.85
                    3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90
                    Max.   :4.930   Max.   :5.424   Max.   :22.90
> sd(Points)
[1] 0.5346787
> sd(Score)
[1] 0.9784574
> sd(Weigh)
[1] 1.786943
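The hand-calculated standard deviations divide the sum of squared deviations by n = 32, while R's `sd()` divides by n - 1. The sketch below starts from the sums of squares quoted in the answer (not from the raw dataset, which is not reproduced here) and shows both versions; the names `ss`, `pop_sd` and `samp_sd` are illustrative.

```r
n <- 32

# Sums of squared deviations as quoted in the Variance section of the answer.
ss <- c(Points = 8.862, Score = 29.678748, Weigh = 98.98815)

# Population form (divide by n), as in the hand calculation.
pop_sd <- sqrt(ss / n)

# Sample form (divide by n - 1), which is what sd() computes.
samp_sd <- sqrt(ss / (n - 1))

print(round(rbind(pop_sd, samp_sd), 4))
```

The second row should agree with the `sd()` output above to the precision of the quoted sums.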
Q8) Calculate Expected Value for the problem below
Assume one of the patients is chosen at random. What is the Expected Value
of the Weight of that patient?
Ans.
Q9) Calculate Skewness, Kurtosis & draw inferences on the following data
Negative kurtosis
SP and Weight(WT)
Ans.
From the histogram we can see that the data is right-skewed, because the histogram shows the
shape of the distribution.
The main purpose of a boxplot is to find outliers. In the above boxplot,
we can see that there are outliers beyond the upper extreme.
Ans.
Confidence interval = X̄ ± Z(1-α/2) × σ/sqrt(n)
Degrees of freedom = 2000 - 1 = 1999
For a 94% confidence interval, α = 0.06, so 1 - α/2 = 1 - 0.03 = 0.97 and the Z value is 1.881
Z value for a 98% confidence interval = 2.33
Z value for a 96% confidence interval = 2.05
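The Z values for these confidence levels come from the standard normal quantile function at 1 - α/2. A minimal sketch using R's `qnorm()`; the variable names are illustrative.

```r
conf_levels <- c(0.94, 0.98, 0.96)

# Two-sided interval: put alpha/2 in each tail.
alpha <- 1 - conf_levels
z_crit <- qnorm(1 - alpha / 2)
names(z_crit) <- paste0(conf_levels * 100, "%")

print(round(z_crit, 3))
```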
Mean= 41
Median= 40
Variance= 24.111
Standard deviation= 4.910
Q13) What is the nature of skewness when mean, median of data are equal?
Ans.
The distribution is symmetric (skewness is approximately zero) when the mean and median of the data are equal.
Ans.
The distribution is right-skewed (positive skewness) when mean > median.
Ans.
The distribution is left-skewed (negative skewness) when median > mean.
Ans.
A positive kurtosis value indicates a sharper peak and heavier tails than the normal distribution, for which the (excess) kurtosis value is 0.
Ans.
The distribution of the data has lighter tails and a flatter peak than
the normal distribution.
Q18) Answer the below questions using the below boxplot visualization.
Draw an inference from the distribution of data for Boxplot 1 with respect to Boxplot 2.
Ans.
By observing both plots, the whiskers extend further in boxplot 2. The mean and
median are equal, hence the distribution is symmetrical.
Q 20) Calculate probability from the given dataset for the below cases
MPG<- Cars$MPG
a. P(MPG>38)
b. P(MPG<40)
c. P (20<MPG<50)
Ans.
By using the filter command after installing the dplyr package in R:
a) 33 of the 81 observations in MPG are greater than 38, so P(MPG>38) = 33/81 = 0.407
b) 61 of the 81 observations in MPG are less than 40, so P(MPG<40) = 61/81 = 0.753
c) P (20<MPG<50) = 69/81 = 0.852
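R code:
MPG <- Cars$MPG
# Observations matching each condition
a <- subset(MPG, MPG > 38)
b <- subset(MPG, MPG < 40)
c <- subset(MPG, MPG > 20 & MPG < 50)
# Probability = matching observations / total observations
length(a) / length(MPG)
length(b) / length(MPG)
length(c) / length(MPG)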
Q 21) Check whether the data follows normal distribution
a) Check whether the MPG of Cars follows Normal Distribution
Dataset: Cars.csv
Ans.
We can interpret that the data of MPG of Cars follows the normal distribution by:
For the Weight data of WC_AT the evidence is mixed, and the formal test does not support normality:
1) The Shapiro test (W = 0.95586; p value = 0.00117) gives a p value below 0.05, so normality is rejected at the 5% level.
2) The kurtosis value is -1.141846, which indicates a flatter shape than the normal distribution.
3) The mean value (91.902) is not far from the median value (90.8), so the data is roughly symmetric.
We can interpret that the AT data of WC_AT does not follow the normal distribution by:
1) Conducting the Shapiro test (W = 0.95234; p value = 0.000654), where the p value is significantly
lower than 0.05.
2) Evaluating the kurtosis value, which is -0.37600593.
3) Finding that the mean value (101.894) is quite far from the median value (96.54).
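The three checks used above (Shapiro test, kurtosis, mean versus median) can be sketched in base R as follows. The vector `x` is simulated, hypothetical data used only so that the snippet runs on its own; it is not the Cars or WC_AT data, and the kurtosis is computed by hand rather than with an add-on package.

```r
# Hypothetical data for illustration only (NOT the assignment's dataset).
set.seed(1)
x <- rnorm(100, mean = 90, sd = 13)

# 1) Shapiro-Wilk test: a p value below 0.05 rejects normality at the 5% level.
sw <- shapiro.test(x)
is_non_normal <- sw$p.value < 0.05

# 2) Excess kurtosis (moment form): close to 0 for normal data.
m <- mean(x)
excess_kurtosis <- mean((x - m)^4) / mean((x - m)^2)^2 - 3

# 3) Mean versus median: close together for a symmetric distribution.
mean_minus_median <- mean(x) - median(x)

print(sw)
print(c(excess_kurtosis = excess_kurtosis, mean_minus_median = mean_minus_median))
```

To apply this to the assignment, `x` would be replaced by the relevant column of the dataset.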
Q 22) Calculate the Z scores of 90% confidence interval, 94% confidence interval,
60% confidence interval
Ans.
Z score of 90% confidence interval is 1.645
Z score of 94% confidence interval is 1.88
Z score of 60% confidence interval is 0.84
Q 23) Calculate the t scores of 95% confidence interval, 96% confidence interval,
99% confidence interval for sample size of 25
Ans.
With a sample size of 25, degrees of freedom = 25 - 1 = 24.
For 95% = 2.064
For 96% = 2.172
For 99% = 2.797
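For a sample size of 25 the t scores use 24 degrees of freedom and the quantile at 1 - α/2. A short sketch with R's `qt()`; the names `dof` and `t_crit` are illustrative.

```r
n <- 25
dof <- n - 1                      # degrees of freedom for a one-sample t interval
conf_levels <- c(0.95, 0.96, 0.99)

# Two-sided critical values: alpha/2 in each tail.
t_crit <- qt(1 - (1 - conf_levels) / 2, df = dof)
names(t_crit) <- paste0(conf_levels * 100, "%")

print(round(t_crit, 3))
```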
Q 24) A Government company claims that an average light bulb lasts 270 days. A
researcher randomly selects 18 bulbs for testing. The sampled bulbs last an average
of 260 days, with a standard deviation of 90 days. If the CEO's claim were true,
what is the probability that 18 randomly selected bulbs would have an average life
of no more than 260 days?
Hint:
R code: pt(tscore, df)
Ans.
Claimed population mean = 270 days
Sample size = 18
Sample mean = 260 days
Sample standard deviation = 90 days
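Following the hint, the t score is (sample mean - claimed mean) / (s / sqrt(n)) and the probability is the lower tail of the t distribution with n - 1 degrees of freedom. A minimal sketch with the values listed above; the variable names are illustrative.

```r
mu0 <- 270    # claimed population mean (days)
n <- 18       # sample size
xbar <- 260   # sample mean (days)
s <- 90       # sample standard deviation (days)

# t score of the observed sample mean under the claim.
t_score <- (xbar - mu0) / (s / sqrt(n))

# P(average life <= 260 days) = lower-tail probability with n - 1 = 17 df.
p_value <- pt(t_score, df = n - 1)

print(c(t_score = t_score, p_value = p_value))
```

The t score is about -0.47, and the resulting probability is roughly 0.32.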