Birla Institute of Technology & Science, Pilani Work Integrated Learning Programmes I Semester, 2020-21
Course Description
This course covers the statistical techniques that are central to Data Science. It covers
models from inferential statistics, predictive analytics and applied multivariate analytics.
Course Objectives
CO1 Understanding the data representation and analysis techniques that are essential in Data Science
CO2 Understanding the predictive & inferential statistical models used in Data Science
Text Books
T1 Probability and Statistics for Engineering and the Sciences, 8th Edition, Jay L Devore, Cengage
Learning
T2 Business Forecasting, 9th Edition, John E Hanke, Pearson Education
Reference Books
R1 Miller and Freund’s Probability and Statistics for Engineers, 8th Edition, PHI
R2 Statistics for Business and Economics, Anderson, Sweeney and Williams, Cengage
Learning
R3 Applied Logistic Regression, Hosmer and Lemeshow, 3rd Edition, Wiley
R4 Introduction to Time Series and Forecasting, Second Edition, Peter J Brockwell, Richard A
Davis, Springer.
Modular Content Structure
1. Probability
1.1 Probability – Introduction and Basics
1.2 Conditional probability
1.3 Bayes’ theorem
2. Probability Distributions
2.1 Random variables – pmf, pdf, cdf
2.2 Probability Distributions
2.2.1 Discrete distributions, mean and variance
2.2.2 Continuous distributions, mean and variance
2.2.3 Joint probability distributions
3. Generating functions
3.1 Moment generating functions
3.2 Characteristic functions
3.3 Central Limit theorem
3.4 Markov’s and Chebyshev’s inequalities
4. Point Estimation
4.1 General Concepts
4.2 Methods of point estimation – maximum likelihood estimation
5. Testing of Hypothesis
5.1 Sampling, estimation and confidence intervals
5.2 Type I, Type II errors, p-value
5.3 Testing of hypothesis – Means – one and two means
5.4 Testing of hypothesis – Proportions – one and several proportions
5.5 ANOVA
6. Regression
6.1 Covariance
6.2 Correlation
6.3 Sum of Least Squares
6.4 Simple linear regression
6.5 Ridge & Lasso Models
6.6 Model validation
6.7 Multiple linear regression
6.8 Nonlinear regression
6.9 Logistic regression
7. Forecasting Model
7.1 Principles of forecasting
7.2 Time series analysis
7.2.1 Moving averages, smoothing & decomposition methods
7.2.2 ARIMA Model
Learning Outcomes:
No. Learning Outcome
LO1 Clear understanding of the various statistical models to model the data
LO2 Drawing conclusions from the models selected to understand the data
Course Contents
HW: Related problems
Contact Session 7: Module 5 – Sampling
HW: Related problems
Lab: Exemplification in R
Contact Session 12: Module 6 – Regression
Lab: Exemplification in R
HW: Related problems
Contact Session 16: Review session