Workbook Part 1 Revised 2023 Header

PROBABILITY & STATISTICS

WORK BOOK
By

DEEPAK DHAS GEORGE


Junior Associate
Junior Associate SeniorProfessor
Professor Lecturer, Kantipur Engineering College, Dhapakhel

Part 1 (Statistics and Correlation & Regression)

NAME OF THE STUDENT:

ROLL NUMBER:
Probability & Statistics Workbook by Deepak Dhas George Contact: [email protected]

Topic 1 : Statistics And Correlation & Regression

* Mean, Median, Mode, Standard Deviation, Variance, Coefficient of Variation
* Percentiles, Deciles, Quartiles
* Histogram, Bar Diagrams, Pie Diagrams, Cumulative Frequency Graphs
* Stem and Leaf Diagram, Box Plot
* Simple Correlation, Multiple Correlation and Partial Correlation
* Simple Regression and Multiple Regression, Confidence Interval of Regression Lines

Very Important Theory Questions For Topic 1



Note Down Important Points and Formulae



1) The following table shows the number of hours 45 hospital patients slept following the
administration of a certain anesthetic.

7 10 12 4 8 7 3 8 5
12 11 3 8 1 1 13 10 4
4 5 5 8 7 7 3 2 3
8 13 1 7 17 3 4 5 5
3 1 17 10 4 7 7 11 8

(a) Find the sample mean, sample variance and sample standard deviation.
(b) Compute a value that measures the amount of variability relative to the value of the mean.

2) The following data give the speeds (km/hr) of a sample of 50 vehicles travelling through an
intersection of a busy road. Estimate the average speed of the vehicles. Also test consistency by
applying a suitable formula.

54 65 58 59 53 67 68 59 64 68
69 72 75 48 44 42 53 52 51 49
48 46 54 52 50 51 64 45 42 49
54 71 72 58 59 52 55 55 56 54
54 68 66 65 67 67 65 65 64 64

3) A semiconductor manufacturer produces devices used as central processing units in personal
computers. The speed of the device (in Megahertz) is important because it determines the price
that the manufacturer can charge for the devices. The following table contains measurements on
48 devices.

717 727 653 637 660 693 679 682 724 642 704 695
704 652 664 702 661 720 695 670 656 718 660 648
683 723 710 680 684 705 681 748 697 703 660 722
662 709 683 705 678 674 656 667 683 691 750 685

Find the
(a) Sample mean of the distribution.
(b) Sample standard deviation and coefficient of variation.

4) As part of a study monitoring acid rain, measurements of sulfate deposits (kg/hectare) are
recorded for different locations on the Eastern Terai of Nepal. The results are listed in the
following table for 15 recent and consecutive years:
Acid Rain : Sulfate Deposited (kg/hectare)
Year   Location 1 (x)   Location 2 (y)   Location 3 (z)
1      11.94            13.09            7.96
2      11.28            10.88            12.84
3      10.38            12.19            7.38
4      8.00             10.75            7.26
5      12.12            17.21            10.12
6      10.27            10.26            8.89
7      14.80            15.49            11.60
8      13.52            11.61            9.02
9      10.55            10.53            7.78
10     9.81             12.50            8.70
11     11.27            9.94             10.50
12     12.12            11.21            9.95
13     11.68            9.71             15.59
14     11.77            9.37             10.54
15     17.29            13.87            13.64

(a) Find the sample mean, sample standard deviation and coefficient of variation for the
sulfate deposits of each location.
(b) Give your conclusion about variability and uniformity from the analysis.

PARAMETER – A measurable characteristic of a Population
STATISTIC – A measurable characteristic of a Sample

PERCENTILE
The pth percentile is a value such that at least p percent of the observations are less than or equal to
this value and at least (100-p) percent of the observations are greater than or equal to this value.

CALCULATING THE Pth PERCENTILE

Step 1 : Arrange the data in ascending order (smallest value to largest value).
Step 2 : Compute an index i

$$ i = \left( \frac{p}{100} \right) n $$

Step 3 : (a) If i is not an integer, round up. The next integer greater than i denotes the position of
the pth percentile.

(b) If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
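
The three steps above can be sketched in Python as follows. This is only an illustrative sketch, not part of the original workbook: the function name `percentile` and the variable `salaries` are assumptions made here, and the data are the twelve starting salaries from problem 5. The index i is held as a quotient and remainder so that the check "is i an integer?" does not depend on floating-point rounding.

```python
def percentile(data, p):
    """Return the p-th percentile using the three-step rule (hypothetical helper)."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)                  # Step 1: arrange in ascending order
    n = len(values)
    # Step 2: i = (p / 100) * n, stored as whole part and remainder of p * n / 100
    whole, remainder = divmod(p * n, 100)
    whole = int(whole)
    if remainder != 0:
        # Step 3(a): i is not an integer, so round up.
        # Position whole + 1 (counting from 1) is list index `whole`.
        return values[whole]
    # Step 3(b): i is an integer, so average the values in positions i and i + 1.
    return (values[whole - 1] + values[whole]) / 2


# Monthly starting salaries ($) of the 12 graduates in problem 5
salaries = [3450, 3550, 3650, 3480, 3355, 3310,
            3490, 3730, 3540, 3925, 3520, 3480]

for p in (25, 50, 75, 85):
    print(p, percentile(salaries, p))
```

Note that statistical software often uses interpolation rules that differ from this one, so its quartiles need not agree with the hand rule used in this workbook.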

5) Monthly starting salaries for a sample of 12 Business school Graduates are given below:

Graduate Monthly Starting Salary ($) Graduate Monthly Starting Salary ($)
1 3450 7 3490
2 3550 8 3730
3 3650 9 3540
4 3480 10 3925
5 3355 11 3520
6 3310 12 3480
(a) Calculate the Mean, SD, Variance and CV using the Formula Method.
(b) Calculate the three Quartiles. Which quartile represents the Median?
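(c) Make a Box plot of the above data.

FORMULAE

$$ \mu = \frac{\sum x}{N} $$

$$ \sigma = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}} $$ (YOU CAN SIMPLIFY)

$$ \bar{x} = \frac{\sum x}{n} $$

$$ s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} $$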
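
As a check on hand calculations, the sample formulae can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `sample_stats` is hypothetical, the square root is taken so that s is the standard deviation (its square being the sample variance), and the coefficient of variation is taken as CV = (s / x̄) × 100%, which the formula list itself does not spell out. The data are the salaries from problem 5.

```python
import math


def sample_stats(data):
    """Sample mean, variance, SD and CV by the formula method (hypothetical helper)."""
    n = len(data)
    sum_x = sum(data)                      # Σx
    sum_x2 = sum(x * x for x in data)      # Σx²
    mean = sum_x / n
    # Sample variance: (Σx² − (Σx)² / n) / (n − 1)
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    sd = math.sqrt(variance)
    cv = 100 * sd / mean                   # assumed definition: CV in percent
    return mean, variance, sd, cv


# Monthly starting salaries ($) of the 12 graduates in problem 5
salaries = [3450, 3550, 3650, 3480, 3355, 3310,
            3490, 3730, 3540, 3925, 3520, 3480]

mean, variance, sd, cv = sample_stats(salaries)
print(f"mean = {mean:.2f}, variance = {variance:.2f}, SD = {sd:.2f}, CV = {cv:.2f}%")
```

For the population versions (μ and σ), the divisor n − 1 would be replaced by N.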

6) The numbers of minutes that a person had to wait for the bus to work on 13 working days are:
1, 10, 13, 12, 8, 2, 6, 9, 17, 30, 5, 4 and 15
(a) Find the values constituting the five-number summary.
(b) Construct a box plot.

7) The cost of consumer purchases such as single family housing, gasoline, Internet services, tax
preparation and hospitalization were provided in the Wall Street Journal (January 2, 2007).
Sample data typical of the cost of tax-return preparation by services such as H & R Block are
shown below.
120 230 110 115 160
130 150 105 195 155
105 360 120 120 140
100 115 180 235 255
(a) Compute the mean, median and mode.
(b) Compute the first and third quartiles.
(c) Compute and interpret the 90th percentile.

8) Arrange the following data using stem and leaf method.

108 94 188 116 165 181 106 133 176 110
169 134 129 109 85 124 119 165 153 135
105 180 105 91 117 148 83 96 101 123
128 143 136 99 169 133 89 90 174 144
151 168 103 116 106 107 179 113 172 120
179 183 99 94 87 120 154 159 103 139

9) Find the mean, median and mode.

Marks   Number of Students
20      3
25      4
30      7
35      2
40      3
10) Two different sections of a statistics class take the same quiz and the scores are recorded below.
(a) Find the range and standard deviation for each section.
(b) What do the range values lead you to conclude about the variation in the two sections?
(c) Why is the range misleading in this case?
(d) What do the standard deviation values lead you to conclude about the variation in the two
sections?

Section 1 1 20 20 20 20 20 20 20 20 20 20
Section 2 2 3 4 5 6 14 15 16 17 18 19

11) Write down the significance of statistics in Engineering. An experiment records the heights of 51
plants, as given below. If the average height of all 51 plants is 40 cm, find the missing
frequencies corresponding to the heights 30 cm and 50 cm.
Height (cm) 10 20 30 40 50 60
Number of Plants 2 3 - 21 - 5

12) The heights of female and male students are given below :

(a) Calculate the mean height for male and female students.
(b) Calculate the sample standard deviation and sample variance for the given data.
(c) Which set of heights is more consistent?

13) Calculate Quartiles and Median from the following data of rainfall.

Rainfall in mm 20-30 30-40 40-50 50-60 60-70
No. of days 15 16 20 24 15

14) Calculate Mode from the following data of rainfall.

Rainfall in mm 20-30 30-40 40-50 50-60 60-70
No. of days 15 16 20 24 15

15) The expenditure of 1000 families is given below:

Expenditure 40-59 60-79 80-99 100-119 120-139
No. of families 50 - 500 - 50

If the median of the frequency distribution is 87, find the missing frequencies.



16) Following is the age distribution of 1000 persons working in a factory.

Age Group 15-20 20-25 25-30 30-35 35-40 40-45 45-50 50-55 55-60
Number of People 60 122 135 242 148 107 85 63 38

Due to heavy losses, the management decides to bring down the strength to 50 percent of the
present number according to the following scheme:
(i) to retrench the first 8% from the lower age group
(ii) to absorb the next 32% in other branches
(iii) to make the 10% from the highest age group retire prematurely.
What will be the age limits of the persons retained in the mill and of those transferred to other
branches?

17) In two companies A and B engaged in similar type of industry, the average weekly wage and
standard deviation are given below:

Company A Company B
Average Weekly Wage (Rs) 460 490
Standard Deviation 50 40
No of wage earners 100 80

(i) Which company pays the larger amount as weekly wages?
(ii) Which company shows greater variability in the distribution of weekly wages?
(iii) What are the mean and standard deviation of all the workers in the two companies taken
together?

18) The mean weight of 100 students in a certain class is 59 kg. The mean weight of boys in the
class is 65 kg and that of girls is 50 kg. Find the number of boys and girls in the class.

19) The mean and S.D. of 20 items are found to be 10 and 2 respectively. At the time of checking, it
was found that one item, 8, was incorrect.
Calculate the mean and standard deviation if
(a) the wrong item is omitted.
(b) it is replaced by 12.

CORRELATION & REGRESSION

$$ r_{xy} = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2} \, \sqrt{n \sum y^2 - (\sum y)^2}} $$

Note – The square of coefficient of correlation is called coefficient of determination.

• Since $-1 \le r_{xy} \le 1$, $0 \le r_{xy}^2 \le 1$
• $(r_{xy}^2 \times 100)\%$ gives the % coefficient of determination
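
A short Python sketch of the correlation formula may help when checking hand working. It is an illustration only: the function name `pearson_r` is hypothetical, and the data are the weeks and speed-gain figures from problem 22.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from raw sums (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    # Each factor of the denominator carries its own square root
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from problem 22: weeks in the program and gain in reading speed
weeks = [3, 5, 2, 8, 6, 9, 3, 4]
speed_gain = [86, 118, 49, 193, 164, 232, 73, 109]

r = pearson_r(weeks, speed_gain)
r_squared = r ** 2                         # coefficient of determination
print(f"r = {r:.4f}, coefficient of determination = {100 * r_squared:.2f}%")
```

The sign of r gives the direction of the relationship, and the percentage form of r² is read as the share of the variation in y explained by x.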

*** $(\hat{y} - \bar{y}) = b_{yx} (x - \bar{x})$ is called the regression line of y on x,

where $$ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} $$

*** Similarly, if x is the dependent variable, the line of x on y is given by

$(\hat{x} - \bar{x}) = b_{xy} (y - \bar{y})$, where $$ b_{xy} = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} $$

A common Confusion !!!

NOTE - $b_{yx}$ and $b_{xy}$ do not have a square root in their formulae.
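
The two regression coefficients can be computed from the same raw sums, as in the following sketch. The names `regression_coefficients` and `predict_cost` are hypothetical, and the data are the car age and maintenance cost figures from problem 23; treat it as one possible way of checking a hand calculation, not as the workbook's prescribed working.

```python
def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy) from raw sums; neither involves a square root."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    b_yx = numerator / (n * sum_x2 - sum_x ** 2)   # slope of the line of y on x
    b_xy = numerator / (n * sum_y2 - sum_y ** 2)   # slope of the line of x on y
    return b_yx, b_xy


# Data from problem 23: age of cars (years) and maintenance cost (Rs)
age = [2, 4, 6, 8, 10]
cost = [10, 15, 22, 32, 46]

b_yx, b_xy = regression_coefficients(age, cost)
x_bar = sum(age) / len(age)
y_bar = sum(cost) / len(cost)


def predict_cost(x):
    """Regression line of y on x: y_hat = y_bar + b_yx * (x - x_bar)."""
    return y_bar + b_yx * (x - x_bar)


print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
print(f"estimated cost for a 10-year-old car = {predict_cost(10):.2f}")
```

A useful check is that the product of b_yx and b_xy equals the coefficient of determination, so it must lie between 0 and 1.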

20) The following data gives the experience of machine operators in years and their performance as
given by the number of good parts turned out per 100 pieces.

(a) Find Karl Pearson’s Correlation coefficient and interpret it.
(b) Determine the coefficient of determination and interpret it.
(c) Fit the regression equation of performance rating on experience and estimate the probable
performance of an operator who has 8 years of experience.
(d) What does the regression coefficient indicate?
21) In trying to evaluate the effectiveness of antibiotics in killing bacteria, a research institute
compiled the following information.

Antibiotics (mg) 12 15 14 16 17 10
Bacteria 5 7 5.6 7.2 8.6 6.2
Find the strength and direction of the relationship between them.
Also find a relationship between the variables.

22) The following data show the improvement (gain in reading speed) of eight students in a speed
reading program, and the number of weeks they have been in the program.
Estimate the parameters of a simple linear regression model with No of weeks as the independent
variable.

No of weeks 3 5 2 8 6 9 3 4
Speed Gain (words/minute) 86 118 49 193 164 232 73 109

Least Square Method Normal Equation Method

23) The following table gives the age of the cars of a certain company and annual maintenance
costs:

Age of cars (Years) 2 4 6 8 10
Maintenance costs (Rs) 10 15 22 32 46

(a) Obtain the regression equation for cost related to age and also estimate the cost of
maintenance for a 10-year-old car.
(b) Does it match the observed value? Why?

24) The sample correlation coefficients between temperature (X1), corn yield (X2) and rainfall (X3)
are $r_{12} = 0.59$, $r_{13} = 0.46$ and $r_{23} = 0.77$. Calculate the partial correlation coefficient
$r_{12.3}$ and the multiple correlation coefficient $R_{1.23}$.

$$ r_{xy.z} = \frac{r_{xy} - r_{xz} r_{yz}}{\sqrt{1 - r_{xz}^2} \, \sqrt{1 - r_{yz}^2}} $$

$$ R_{x.yz} = \sqrt{\frac{r_{xy}^2 + r_{xz}^2 - 2 r_{xy} r_{xz} r_{yz}}{1 - r_{yz}^2}} $$
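
The two formulae can be evaluated with a short Python sketch. It assumes the standard textbook forms: the partial correlation has a product of two square roots in the denominator, and the multiple correlation has $r_{xy}^2 + r_{xz}^2 - 2 r_{xy} r_{xz} r_{yz}$ in the numerator with the square root taken over the whole quotient. The function names are hypothetical, and the inputs are the three coefficients given in problem 24.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """r_xy.z: correlation of x and y with the effect of z eliminated."""
    return (r_xy - r_xz * r_yz) / (
        math.sqrt(1 - r_xz ** 2) * math.sqrt(1 - r_yz ** 2)
    )


def multiple_correlation(r_xy, r_xz, r_yz):
    """R_x.yz: multiple correlation of x on y and z (assumed standard form)."""
    return math.sqrt(
        (r_xy ** 2 + r_xz ** 2 - 2 * r_xy * r_xz * r_yz) / (1 - r_yz ** 2)
    )


# Coefficients from problem 24: 1 = temperature, 2 = corn yield, 3 = rainfall
r12, r13, r23 = 0.59, 0.46, 0.77

r12_3 = partial_correlation(r12, r13, r23)
R1_23 = multiple_correlation(r12, r13, r23)
print(f"r12.3 = {r12_3:.4f}, R1.23 = {R1_23:.4f}")
```

As a sanity check, the multiple correlation should never be smaller in magnitude than either of the simple correlations r12 and r13.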

25) The sample of 10 values of three variables X1, X2, X3 were obtained as :

(a) Find Partial Correlation between X1 and X3 eliminating the effect of X2
(b) Find Multiple Correlation between X1, X2 and X3 assuming X1 as dependent.
(c) Find the Multiple regression plane by treating X1 as independent.

26) A household survey on monthly expenditure on food yielded the following data :

Find the linear regression plane to predict monthly expenditure.

27) In a moderately asymmetrical distribution, the values of the mean and median are 20 and 24
respectively. Find the value of the mode.
