VASIREDDY VENKATADRI INSTITUTE OF TECHNOLOGY
NAMBUR-522508.
YEAR: II B. Tech SEMESTER: I (OR) II
COURSE NAME: Probability and Statistics
COURSE CODE: 23XXXXXX
BRANCH: CSE AI&ML and CSM Branches
PREREQUISITE: Sets, functions, Permutations and Combinations
Course Objectives:
● To familiarize the students with the foundations of probability and statistical methods.
● To impart probability concepts and statistical methods for various engineering applications.
Course Outcomes:
At the end of the course, the student will be able to:
S.No | Outcome | Cognitive Levels as per Bloom’s Taxonomy | Weightage (%)
CO1 | Classify the concepts of data science and its importance. | L1, L2, L3, L4 | 20
CO2 | Interpret the association of characteristics through correlation and regression tools. | L1, L2, L3, L4 | 20
CO3 | Apply discrete and continuous probability distributions. | L1, L2, L3, L4 | 20
CO4 | Design the components of a classical hypothesis test. | L1, L2, L3, L4 | 20
CO5 | Infer the statistical inferential methods based on small and large sampling tests. | L1, L2, L3, L4 | 20
WEIGHTAGE OF BLOOM’S LEGENDS & PERCENTAGE OF QUESTIONS IN EXAMINATIONS:
L1 (Remembering) = 10-20%, L2 (Understanding) = 30-40%,
L3 (Applying) = 30-40%, L4 (Analysing) = 20-30%,
Easy (%) = 15-20%, Average (%) = 60-70%, Difficult (%) = 15-20%
TOTAL = L1 + L2 + L3 + L4 = 100% (on average, about 2 minutes per mark)
Note: The weightage specified above shall be treated as a general
guideline for students, teachers, and paper setters. The actual distribution of
marks in the question paper may vary slightly.
DETAILED SYLLABUS:
UNIT-I: Descriptive statistics and methods for data science: 12 hrs
Data science – Statistics: Introduction – Population vs Sample – Collection of data: primary and
secondary data – Types of variables: dependent and independent; categorical and continuous
variables – Data visualization – Measures of central tendency – Measures of variability –
Skewness – Kurtosis.
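As an illustration of the Unit-I measures, the following is a minimal Python sketch that computes the mean, median, standard deviation, moment coefficient of skewness and kurtosis. The function name `describe` and the sample values are hypothetical, and the population form (divisor n) of the moments is an assumption; a textbook may use the sample form instead.

```python
import statistics


def describe(data):
    """Return basic descriptive measures using population moments (divisor n)."""
    n = len(data)
    mean = statistics.fmean(data)
    # Central moments about the mean.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "std_dev": m2 ** 0.5,
        "skewness": m3 / m2 ** 1.5,  # 0 for a perfectly symmetric data set
        "kurtosis": m4 / m2 ** 2,    # 3 for a normal distribution
    }


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
summary = describe(sample)
print(summary)
```

A positive skewness value indicates a longer right tail, and a kurtosis value below 3 indicates a flatter peak than the normal curve.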
UNIT-II: Correlation and Regression: 8 hrs
Correlation – Correlation coefficient – Rank correlation.
Linear Regression: Straight line – Multiple Linear Regression - Regression coefficients and
properties – Curvilinear Regression: Parabola – Exponential – Power curves.
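The correlation coefficient and the least-squares straight line of Unit-II can be sketched in a few lines of Python. The function name `correlation_and_line` and the data values are hypothetical; the sketch fits the regression line of y on x, y = a + bx, from the sums of squares and products.

```python
def correlation_and_line(xs, ys):
    """Return Pearson's r and the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / (sxx * syy) ** 0.5
    slope = sxy / sxx                  # regression coefficient of y on x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
r, slope, intercept = correlation_and_line(xs, ys)
print(round(r, 4), slope, intercept)
```

Swapping the arguments gives the regression coefficient of x on y; the product of the two regression coefficients equals r squared, which is one of the properties listed in this unit.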
UNIT-III: Probability and Distributions: 10 hrs
Probability – Conditional probability and Bayes’ theorem – Random variables – Discrete and
Continuous random variables – Distribution functions – Probability mass function, Probability
density function and Cumulative distribution functions – Mathematical Expectation and Variance
– Binomial, Poisson, Uniform and Normal distributions.
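The distributions of Unit-III can be explored numerically. The sketch below, with hypothetical parameter values, evaluates the binomial and Poisson probability mass functions, shows that the binomial approaches the Poisson with mean np when n is large and p is small, and reads a standard normal area from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


n, p = 1000, 0.002                  # hypothetical: large n, small p
binom_prob = binomial_pmf(3, n, p)
poisson_prob = poisson_pmf(3, n * p)

# Area under the standard normal curve to the left of z = 1.96.
area = NormalDist(mu=0, sigma=1).cdf(1.96)
print(binom_prob, poisson_prob, area)
```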
UNIT-IV: Sampling Theory: 8 hrs
Introduction – Population and Samples – Sampling distribution of Means and Variance
(definition only) – Point and Interval estimations – Maximum error of estimate – Central limit
theorem (without proof) – Estimation using t, chi-square and F-distributions.
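For a large sample, the maximum error of estimate and the resulting confidence interval for a mean can be sketched as follows. The function name `mean_confidence_interval` and the numbers are hypothetical; the sketch assumes a large sample, so the normal critical value is used rather than a t value.

```python
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sd, n, confidence):
    """Large-sample confidence interval: mean +/- z * sd / sqrt(n)."""
    # Two-sided critical value z_{alpha/2} of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    max_error = z * sd / n ** 0.5    # maximum error of estimate E
    return sample_mean - max_error, sample_mean + max_error, max_error


low, high, max_error = mean_confidence_interval(50.0, 6.0, 100, 0.95)
print(round(low, 3), round(high, 3), round(max_error, 3))
```

For small samples the same structure applies with the t-distribution critical value for n - 1 degrees of freedom in place of z.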
UNIT-V: Tests of Hypothesis: 12 hrs
Introduction – Hypothesis – Null and Alternative Hypothesis – Type I and Type II errors –
Level of significance – One-tail and two-tail tests – Tests of significance for large samples and
small samples: single mean and difference of means – single and two proportions – Student’s
t-test, F-test, and Chi-square test.
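The steps of a large-sample test (state the hypotheses, compute the test statistic, compare with the level of significance) can be sketched for a single proportion. The function name `proportion_z_test` and the counts are hypothetical, and the sketch assumes a right-tailed alternative H1: p > p0.

```python
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Right-tailed large-sample test of H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    # The standard error is computed under H0, so it uses p0 rather than p_hat.
    z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


z, p_value = proportion_z_test(540, 1000, 0.5)  # hypothetical counts
reject_h0 = p_value < 0.05                      # 5% level of significance
print(round(z, 4), round(p_value, 4), reject_h0)
```

Rejecting H0 when it is actually true is a Type I error; the level of significance is the probability of that error that the test is willing to accept.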
Text Books:
1. Miller and Freund’s Probability and Statistics for Engineers, 7/e, Pearson, 2008.
2. S. C. Gupta and V. K. Kapoor, Fundamentals of Mathematical Statistics, 11/e, Sultan
Chand & Sons Publications, 2012.
Reference Books:
1. Sharon L. Myers, Keying Ye, Ronald E. Walpole, Probability and Statistics for Engineers
and Scientists, 8th Edition, Pearson, 2007.
2. Jay L. Devore, Probability and Statistics for Engineering and the Sciences, 8th Edition,
Cengage.
3. Sheldon M. Ross, Introduction to Probability and Statistics for Engineers and Scientists,
4th Edition, Academic Foundation, 2011.
4. Johannes Ledolter and Robert V. Hogg, Applied Statistics for Engineers and Physical
Scientists, 3rd Edition, Pearson, 2010.
Micro Syllabus:
UNIT-I: Descriptive statistics and methods for data science:
Data science – Statistics: Introduction – Population vs Sample – Collection of data:
primary and secondary data – Types of variables: dependent and independent;
categorical and continuous variables – Data visualization – Measures of central
tendency – Measures of variability – Skewness – Kurtosis.
Unit 1 | Module | Micro content
Descriptive Statistics | Introduction: Population vs Sample | Collection of data: primary and secondary data; Population; Sample
Descriptive Statistics | Types of variables | Dependent; Independent; Categorical & Discrete; Continuous variables
Descriptive Statistics | Data visualization | Data visualization
Methods for data science | Measures of Central tendency and Measures of Variability | Measures of Central tendency; Measures of Variability; Skewness and Kurtosis
UNIT-II: Correlation and Curve fitting:
Correlation-correlation coefficient-Rank Correlation-Regression coefficient and
properties-regression lines-Multiple Regression-Method of least squares-Straight
line-parabola-Exponential-Power curves.
Unit 2 | Module | Micro content
Correlation and Curve fitting | Correlation | Correlation coefficient; Rank correlation
Correlation and Curve fitting | Regression | Regression coefficient; Properties; Regression lines; Multiple regression
Correlation and Curve fitting | Method of least squares | Straight line; Parabola; Exponential curves; Power curves
UNIT-III: Probability and Distributions:
Probability – Conditional probability and Bayes’ theorem – Random variables –
Discrete and Continuous random variables – Distribution function – Mathematical
Expectation and Variance – Binomial, Poisson, Uniform and Normal distributions.
Unit 3 | Module | Micro content
Probability and Distributions | Probability | Conditional probability; Bayes’ theorem
Probability and Distributions | Random variables | Discrete random variables; Continuous random variables; Distribution function; Mathematical Expectation and variance
Probability and Distributions | Distributions | Binomial distribution; Poisson distribution; Uniform distribution; Normal distribution
UNIT-IV: Sampling Theory:
Introduction – Population and Samples – Sampling distribution of Means and
Variance (definition only) – Point and Interval estimations – Maximum error of
estimate – Central limit theorem (without proof) – Estimation using t,
chi-square and F-distributions.
Unit 4 | Module | Micro content
Sampling Theory | Introduction | Population; Samples; Central limit theorem (without proof)
Sampling Theory | Sampling distributions | Sampling distribution of Means; Sampling distribution of Variance
Sampling Theory | Estimation | Point estimation; Interval estimation; Maximum error of estimate; t-distribution; Chi-square distribution; F-distribution
UNIT-V: Test of Hypothesis:
Introduction – Hypothesis – Null and Alternative Hypothesis – Type I and Type II
errors – Level of significance – One-tail and two-tail tests – Tests of significance
for large samples and small samples: single mean and difference of means – single
and two proportions – Student’s t-test, F-test, and Chi-square test.
Unit 5 | Module | Micro content
Test of Hypothesis | Hypothesis | Null Hypothesis; Alternative Hypothesis; Type I and Type II errors; Level of significance; One-tail and two-tail tests
Test of Hypothesis | Tests for large samples | Tests concerning one mean using Z test; Tests concerning two means using Z test; Tests concerning proportions using Z test
Test of Hypothesis | Tests for small samples | Tests concerning one mean and two means using t test; Chi-square test; F test
R23
Code No:
II B. TECH I SEMESTER REGULAR EXAMINATIONS, DEC-2024
PROBABILITY AND STATISTICS
(COMMON TO CSE, CSM & AIML BRANCHES)
Time: 3 Hours Max. Marks: 70
______________________________________________________________________________
Note: 1. The question paper consists of two parts (Part-A and Part-B)
2. All the questions in Part-A are Compulsory
3. Answer ONE Question from Each Unit in Part-B
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
PART – A (20 Marks)
(each question is tagged [Marks, CO, BL])
1. a) Define nominal variable and give an example. [2M, CO1, L1]
b) Write a formula for kurtosis. [2M, CO1, L1]
c) Write the normal equations for fitting a straight line of the form y = mx + c. [2M, CO2, L1]
d) Write a formula for rank correlation. [2M, CO2, L2]
e) Write the limiting case of Poisson distribution. [2M, CO3, L1]
f) Write the recurrence formula for binomial distribution. [2M, CO3, L2]
g) Define unbiased estimation. [2M, CO4, L2]
h) Write the formula for confidence limits of the true mean. [2M, CO4, L2]
i) Define Null Hypothesis. [2M, CO5, L2]
j) Define Type-II error. [2M, CO5, L1]
PART – B (50 Marks)
UNIT-I (each question is tagged [Marks, CO, BL])
2. a) Discuss the measures of central tendency of statistical data. [5M, CO1, L2]
b) Compute the quartile coefficient of skewness for the following distribution: [5M, CO1, L3]
   x: 0-10  10-20  20-30  30-40  40-50
   f: 14    25     36     11     14
(OR)
3. a) Explain in detail descriptive statistics and inferential statistics with examples. [5M, CO1, L2]
b) The following are the scores of two batsmen A and B in a series of innings. Who is the
   better score getter and who is more consistent? [5M, CO1, L3]
   A: 12  115  6   73  7  19  119  36  84  29
   B: 47  12   16  42  4  51  37   48  13  0
UNIT-II
4. a) Compute the linear multiple regression equation of Z on X and Y from the following data: [5M, CO2, L3]
   x: 3   5   6   8   12  14
   y: 16  10  7   4   3   2
   z: 90  72  54  42  30  12
(OR)
5. a) Compute the coefficient of correlation between X and Y. [5M, CO2, L3]
   X: 1   2   3   4   5   6   7   8   9
   Y: 12  11  13  15  14  17  16  19  18
b) Fit a second-degree polynomial to the following data: [5M, CO2, L3]
   X: 0  1    2    3    4
   Y: 1  1.3  1.8  2.5  6.3
UNIT-III
6. a) A box contains 10 white and 3 black balls, while another box contains 3 white and 5 black
   balls. Two balls are drawn from the first box and put into the second box, and then a ball is
   drawn from the second box. What is the probability that it is a white ball? [5M, CO3, L2]
b) In a Normal distribution, 31% of the items are under 45 and 8% are over 64. Compute the
   mean and variance of the distribution. [5M, CO3, L3]
(OR)
7. a) If x is a random variable having probability density function [expression missing],
   compute the mean and variance. [5M, CO3, L3]
b) State and prove Bayes’ theorem. [5M, CO3, L2]
UNIT-IV
8. a) A population consists of the five numbers 3, 6, 9, 15 and 27. Consider all possible samples
   of size 2 that can be drawn without replacement from this population. Find (i) the mean of
   the sampling distribution of means and (ii) the standard deviation of the sampling
   distribution of means. [5M, CO4, L3]
b) A random sample of size 81 was taken whose variance is 20.25 and mean is 32. Construct a
   98% confidence interval for the population mean. [5M, CO4, L2]
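For question 8(a), the sampling distribution can be checked by listing every sample. The sketch below enumerates the 10 samples of size 2 drawn without replacement from the stated population and compares the result with the finite-population formula, standard error = (sigma / sqrt(n)) * sqrt((N - n) / (N - 1)). The variable names are illustrative, and this is one possible way to verify a hand calculation, not a prescribed solution.

```python
from itertools import combinations
from statistics import fmean, pstdev

population = [3, 6, 9, 15, 27]
n = 2

# Mean of every possible sample of size n drawn without replacement.
sample_means = [fmean(sample) for sample in combinations(population, n)]

mean_of_means = fmean(sample_means)   # (i) equals the population mean
sd_of_means = pstdev(sample_means)    # (ii) standard error of the mean

# Cross-check with the finite population correction factor.
N = len(population)
sigma = pstdev(population)
formula_sd = sigma / n ** 0.5 * ((N - n) / (N - 1)) ** 0.5
print(mean_of_means, sd_of_means, formula_sd)
```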
(OR)
9. a) Discuss the following: (i) Efficiency estimation (ii) Point estimation (iii) Interval
   estimation. [5M, CO4, L2]
b) The mean of a certain normal population is equal to the standard error of the mean of
   samples of size 64 from that distribution. Find the probability that the mean of a sample
   of size 36 will be negative. [5M, CO4, L3]
UNIT-V
10. a) In a big city, 325 men out of 600 men were found to be smokers. Does this information
   support the conclusion that the majority of the men in this city are smokers? [5M, CO5, L4]
b) A random sample of 10 boys had the following I.Q.s: 70, 120, 110, 101, 88, 83, 95, 98, 107
   and 100. Do the data support the assumption of a population mean I.Q. of 100? Test at the
   5% level of significance. [5M, CO5, L4]
(OR)
11. a) A pair of dice is thrown 360 times and the frequency of each sum is indicated below: [5M, CO5, L4]
   Sum:       2  3   4   5   6   7   8   9   10  11  12
   Frequency: 8  24  35  37  44  65  51  42  26  14  14
   Would you say that the dice are fair on the basis of the chi-square test at the 0.05 level
   of significance?
b) The means of two large samples of 1000 and 2000 members are 67.5 inches and 68.0 inches
   respectively. Can the samples be regarded as drawn from the same population of standard
   deviation 2.5 inches? (Test at 5% LOS.) [5M, CO5, L4]
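For question 11(a), the chi-square goodness-of-fit statistic can be computed as sketched below. The sketch assumes the frequencies for the sums 2 to 12 read 8, 24, 35, 37, 44, 65, 51, 42, 26, 14, 14 (they total 360), and uses the tabulated critical value 18.307 for 10 degrees of freedom at the 0.05 level. The variable names are illustrative.

```python
sums = range(2, 13)
observed = [8, 24, 35, 37, 44, 65, 51, 42, 26, 14, 14]  # assumed reading of the table
total = sum(observed)

# Two fair dice give the sum s in 6 - |s - 7| ways out of 36.
ways = [6 - abs(s - 7) for s in sums]
expected = [total * w / 36 for w in ways]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

degrees_of_freedom = len(observed) - 1   # 11 classes, no parameters estimated
critical_value = 18.307                  # chi-square table: 10 d.f., alpha = 0.05
dice_look_fair = chi_square < critical_value
print(round(chi_square, 4), degrees_of_freedom, dice_look_fair)
```

Because the computed statistic is compared with the upper-tail critical value, a statistic below 18.307 means the null hypothesis of fair dice is not rejected at this level.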
BLOOM’S TAXONOMY ATTAINMENT OF THE ABOVE MODEL PAPER IS AS FOLLOWS:
L1: 10 Marks
L2: 45 Marks
L3: 45 Marks
L4: 25 Marks
SIGNATURES OF
COURSE COORDINATOR MODULE COORDINATOR HOD