Birla Institute of Technology & Science, Pilani Work Integrated Learning Programmes I Semester, 2020-21

This document provides details about the course "Introduction to Statistical Methods" including: 1) The course covers statistical techniques important for data science like inferential statistics, predictive analytics, and applied multivariate analytics. 2) The modular content includes probability distributions, sampling, hypothesis testing, analysis of variance (ANOVA), and regression models. 3) The course objectives are for students to understand data representation, analysis, and predictive/inferential statistical models used in data science.

BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI

WORK INTEGRATED LEARNING PROGRAMMES


M.Tech (Data Science & Engineering)
I Semester, 2020-21
Course Handout

Course Title: Introduction to Statistical Methods

Course No(s): DSECL ZC413

Course Description

This course covers statistical techniques that are important in Data Science, including
models for inferential statistics, predictive analytics and applied multivariate analytics.

Course Objectives

CO1 Understand data representation and analysis, which are very important in Data Science

CO2 Understand the predictive and inferential statistical models used in Data Science

Text Books

| No | Author(s), Title, Edition, Publishing House |
|----|---------------------------------------------|
| T1 | Jay L Devore, Probability and Statistics for Engineering and the Sciences, 8th Edition, Cengage Learning |
| T2 | John E Hanke, Business Forecasting, 9th Edition, Pearson Education |

Reference Books

| No | Author(s), Title, Edition, Publishing House |
|----|---------------------------------------------|
| R1 | Miller and Freund’s Probability and Statistics for Engineers, 8th Edition, PHI |
| R2 | Anderson, Sweeney and Williams, Statistics for Business and Economics, Cengage Learning |
| R3 | Hosmer and Lemeshow, Applied Logistic Regression, 3rd Edition, Wiley |
| R4 | Peter J Brockwell, Richard A Davis, Introduction to Time Series and Forecasting, Second Edition, Springer |

Modular Content Structure
1. Probability
1.1 Probability – Introduction and Basics
1.2 Conditional probability
1.3 Bayes’ theorem
2. Probability Distributions
2.1 Random variables – pmf, pdf, cumulative df
2.2 Probability Distributions
2.2.1 Discrete distributions, mean and variance
2.2.2 Continuous distributions, mean and variance
2.2.3 Joint probability distributions
3. Generating functions
3.1 Moment generating functions
3.2 Characteristic functions
3.3 Central Limit theorem
3.4 Markov and Chebyshev’s inequalities
4. Point Estimation
4.1 General Concepts
4.2 Methods of point estimation – maximum likelihood function
5. Testing of Hypothesis
5.1 Sampling, estimation and confidence intervals
5.2 Type I, Type II errors, p-value
5.3 Testing of Hypothesis – Mean – one and two means
5.4 Testing of hypothesis – Proportions – one and several proportions
5.5 ANOVA
6. Regression
6.1 Covariance
6.2 Correlation
6.3 Sum of Least Squares
6.4 Simple linear regression
6.5 Ridge & Lasso Models
6.6 Model validation
6.7 Multiple linear regression
6.8 Nonlinear regression
6.9 Logistic regression
7. Forecasting Model
7.1 Principles of forecasting
7.2 Time series analysis
7.2.1 Moving averages, smoothing & decomposition methods
7.2.2 ARIMA Model

Learning Outcomes:

| No | Learning Outcomes |
|----|-------------------|
| LO1 | Clear understanding of the various statistical models to model the data |
| LO2 | Drawing conclusions from the models selected to understand the data |

Part B: Course Handout

Academic Term: I Semester 2020-21

Course Title: Introduction to Statistical Methods

Course No: DSECL ZC413

Course Contents

Contact Session 1: Probability

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Descriptive statistics, data visualisation | T1: Chapter 1 |
| During contact session 1 | Axioms of probability and the probability space. Events, independence of events, conditional events and Bayes’ theorem | T1: 2.1 to 2.5 |
| HW | Problems on defining events, conditional probability and Bayes’ theorem | |
| Lab | Finding the mean, median and mode of a distribution using Excel / R | |

Contact Session 2: Module 2 – Discrete Probability Distributions

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Binomial theorem, properties of binomial coefficients | |
| During contact session 2 | Random variables, PMF, mean, variance and higher order moments. Discrete distributions: Binomial, Hypergeometric and Poisson | T1: 3.1 to 3.6 |
| HW | Negative binomial distribution and related problems | |
| Lab | Mean and variance computation using Excel / R | |

Contact Session 3: Module 2 – Continuous Probability Distributions

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Review of differential and integral calculus. Fundamental theorem of integral calculus | |
| During contact session 3 | PDF, mean and variance of continuous random variables. Uniform, exponential, Beta, Gamma and Normal distributions | T1: 4.1 to 4.6 |
| HW | Weibull distribution, related problems | |
| Lab | Plot of the distributions in Excel / R | |

Contact Session 4: Module 2 – Joint Probability Distributions

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Session 2 and session 3 contents | |
| During contact session 4 | Joint pmf / pdf, conditional distribution, marginal distribution, conditional expectation, covariance, correlation, distribution of the sample mean and distribution of a linear combination of random variables | T1: 5.1 to 5.5 |
| HW | Problems on joint probability distributions | |
| Lab | | |

Contact Session 5: Module 3 – Moment Generating functions

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Differential and integral calculus | |
| During contact session 5 | Moment generating functions. Distribution of simple functions of random variables through MGF, Central Limit Theorem. Markov and Chebyshev’s inequalities. Weak Law of Large Numbers and the strong law. | Class notes |
| HW | Related problems | |
| Lab | Exemplification of CLT through uniform random variates | |
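
The handout names Excel / R for lab work and does not prescribe a procedure for this lab. As a minimal sketch only, written in Python for illustration, the Central Limit Theorem can be exemplified by averaging n Uniform(0, 1) variates many times and checking how the standardised sample means behave. The function names `sample_means` and `fraction_within`, the seed and the trial count are assumptions made for this sketch, not part of the course material.

```python
import random
import statistics


def sample_means(n, trials, seed=0):
    """Return `trials` sample means, each computed from n Uniform(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]


def fraction_within(means, n, z=1.96):
    """Fraction of standardised sample means that fall inside [-z, z]."""
    mu = 0.5                  # mean of Uniform(0, 1)
    sigma = (1 / 12) ** 0.5   # standard deviation of Uniform(0, 1)
    se = sigma / n ** 0.5     # standard error of the mean of n draws
    return sum(abs((m - mu) / se) <= z for m in means) / len(means)


if __name__ == "__main__":
    # Columns: n, mean of sample means, variance of sample means, coverage of +/-1.96
    for n in (1, 2, 5, 30):
        means = sample_means(n, trials=20000)
        print(n,
              round(statistics.fmean(means), 3),
              round(statistics.variance(means), 5),
              round(fraction_within(means, n), 3))
```

As n grows, the mean of the sample means should stay near 0.5, their variance should track 1/(12n), and the coverage of the interval of plus or minus 1.96 standard errors should settle near the normal value of about 0.95.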

Contact Session 6: Module 4 – Point Estimation

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Review of calculus, pdf and pmf of standard distributions | |
| During contact session 6 | General concepts and methods of point estimation | T1: 6.1 and 6.2 |
| HW | Related problems | |
| Lab | | |

Contact Session 7: Module 5 – Sampling

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | CLT, zα values | |
| During contact session 7 | Sampling and sampling distributions – Sampling, sample mean, Confidence intervals, z, Chi-squared, F, t distributions | T1: Chapter 7 and class notes |
| HW | Related problems | |
| Lab | | |

Contact Session 8: Module 5: Testing of Hypothesis

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Contact session 7 portions | |
| During contact session 8 | Test of hypothesis based on a single sample. Usage of z, Chi-squared, F and t distributions | T1: Chapter 8 |
| HW | | |
| Lab | Usage of R for hypothesis testing | |

MID SEMESTER EXAMINATION

Contact Session 9: Module 5 – Testing of Hypothesis

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Session 8 contents | |
| During contact session 9 | Tests of hypotheses for two samples | T1: Chapter 9 |
| HW | | |
| Lab | | |

Contact Session 10: Module 5 - ANOVA

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Chapter 10 | |
| During contact session 10 | Single factor ANOVA and Multiple comparisons in ANOVA | T1: 10.1 and 10.2 |
| HW | | |
| Lab | Use of R for ANOVA | |

Contact Session 11: Module 6 – Regression

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Chapter 12 | |
| During contact session 11 | Simple linear regression model, assumptions of the model, interpretation of the model | T1: Chapter 12 |
| HW | Problems on correlation and covariance | |
| Lab | Exemplification in R | |

Contact Session 12: Module 6 – Regression

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| During contact session 12 | Multiple linear regression model, nonlinear regression & Logistic regression | T1: Chapter 13 and class notes |
| HW | Problems on Linear regression | |
| Lab | Exemplification in R | |

Contact Session 13: Module 7 – Forecasting Models

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| Pre-reading | Regression | |
| During contact session 13 | Moving Averages and Exponential smoothing models | T2: Chapter 3 |
| HW | Related problems | |
| Lab | | |

Contact Sessions 14 and 15: Module 7 – Forecasting Models

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| During contact sessions 14 and 15 | Principles of Forecasting, Time series models and decomposition methods, AR, MA, ARIMA Models | T2: Chapters 4 and 8 |
| HW | Problems on time series models | |
| Lab | | |

Contact Session 16: Review session

| Contact Session | List of Topic Title | Reference |
|-----------------|---------------------|-----------|
| During contact session 16 | Review and wrap-up | |
| HW | | |
| Lab | | |
