Ca 1 Merged

Data Analytics with Python
Lecture 1: Introduction to data analytics
Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT
IIT ROORKEE
1
Objective of the course
• The principal focus of this course is to build conceptual understanding
using simple, practical examples rather than a repetitive, point-and-click
mentality
• This course should make you comfortable using analytics in your career
and your life
• You will learn how to work with real data. You may already know many
different methodologies, but choosing the right one is what matters
2
Objective of the course Contd…
• The danger in using quantitative methods does not generally
lie in the inability to perform the calculations
• The real threat is a lack of fundamental understanding of:
– Why to use a particular technique or procedure
– How to use it correctly, and
– How to correctly interpret the result
3
Learning objectives
1. Define data and its importance
2. Define data analytics and its types
3. Explain why analytics is important in today’s business environment
4. Explain how statistics, analytics and data science are interrelated
5. Why python?
6. Explain the four different levels of Data:
– Nominal
– Ordinal
– Interval and
– Ratio
4
1. Define Data and its importance
• Variable, Measurement and Data
• What is generating so much data?
• How data add value to the business?
• Why data is important?
5
1.1 Variable, Measurement and Data
• Variable – a characteristic of any entity being studied that is capable of
taking on different values
• Measurement – a standard process used to assign numbers to
particular attributes or characteristics of a variable
• Data – recorded measurements
6
1.2 What is generating so much data?
• Data can be generated by
– Humans,
– Machines, or
– Human-machine combinations
• Data can be generated wherever information is
generated and stored, in structured or unstructured formats
7
1.3 How data add value to business?
Data warehouse: two routes to business value
• Development of Data Product: algorithmic solutions in production, marketing, sales,
etc. (e.g. Recommendation Engines)
• Discovery of Data Insight: quantitative data analysis to help steer
strategic business decisions
Source: https://datajobs.com/
8
Data Products
9
1.4 Why Data is important?
• Data helps in making better decisions
• Data helps in solving problems by finding the reasons for
underperformance
• Data helps one evaluate performance
• Data helps one improve processes
• Data helps one understand consumers and the market
10
2. Define data analytics and its types
• Define data analytics
• Why analytics is important?
• Data analysis
• Data analytics vs. Data analysis
• Types of Data analytics
11
2.1. Define data analytics
• Analytics is defined as “the scientific process of transforming data into
insights for making better decisions”
• Analytics is the use of data, information technology, statistical analysis,
quantitative methods, and mathematical or computer-based models to
help managers gain improved insight about their business operations and
make better, fact-based decisions – James Evans
• Analysis = Analytics?
12
2.2 Why analytics is important?
• Opportunity abounds for the use of analytics and big data

such as:
1. Determining credit risk
2. Developing new medicines
3. Finding more efficient ways to deliver products and services
4. Preventing fraud
5. Uncovering cyber threats
6. Retaining the most valuable customers
13
2.3 Data analysis
• Data analysis is the process of examining, transforming, and
arranging raw data in a specific way to generate useful
information from it
• Data analysis allows for the evaluation of data through
analytical and logical reasoning to lead to some sort of
outcome or conclusion in some context
• Data analysis is a multi-faceted process that involves a
number of steps, approaches, and diverse techniques
14
2.4 Data analytics vs. Data analysis
Analysis
• Looks at the past
• Explains how and why something happened
Analytics
• Looks at the future
• Explores potential future events
Analytics
• Qualitative = Intuition + analysis
• Quantitative = Formulas + algorithms
Analysis
• Qualitative = Explains how and why the story ended the way it did
• Quantitative = Data + how sales decreased last summer
Analysis ≠ Analytics
Data analysis ≠ Data analytics
Business analysis ≠ Business analytics
19
2.5 Classification of Data analytics
Based on the phase of workflow and the kind of analysis required, there are
four major types of data analytics.
• Descriptive analytics
• Diagnostic analytics
• Predictive analytics
• Prescriptive analytics
20
Classification of Data analytics
https://www.governanceanalytics.org/knowledge-base/Main_Tools/Data_classification_and_analysis
21
Descriptive Analytics
• Descriptive Analytics, is the conventional form of Business Intelligence and
data analysis
• It seeks to provide a depiction or “summary view” of facts and figures in
an understandable format
• It either informs or prepares data for further analysis
• Descriptive analysis or statistics can summarize raw data and convert it
into a form that can be easily understood by humans
• It can describe in detail an event that has occurred in the past
22
Example
Common examples of Descriptive Analytics are company reports that simply
provide a historic review, such as:
• Data Queries
• Reports
• Descriptive Statistics
• Data Visualization
• Data dashboard
Source: https://www.linkedin.com/learning/478e9692-d13d-338f-907e-d76f0724d773
23
Diagnostic analytics
• Diagnostic Analytics is a form of advanced analytics that examines data
or content to answer the question “Why did it happen?”
• Diagnostic analytical tools help an analyst dig deeper into an issue so
that they can arrive at the source of a problem
• In a structured business environment, tools for descriptive and
diagnostic analytics are used side by side
24
Example
• It uses techniques such as:
1. Data Discovery
2. Data Mining
3. Correlations
25
Predictive analytics
• Predictive analytics helps to forecast trends based on current events
• Predictive analytical models can estimate the probability of an event
happening in the future, or the time at which it is likely to happen
• Many different but co-dependent variables are analysed to predict a trend
in this type of analysis
26
Source: https://www.logianalytics.com/wp-content/uploads/2017/11/predictive-1.png
27
Example
• Set of techniques that use models constructed from past data to predict
the future or ascertain the impact of one variable on another:
1. Linear regression
2. Time series analysis and forecasting
3. Data mining
Source: https://bigdata-madesimple.com/5-examples-predictive-analytics-travel-industry/
28
Prescriptive analytics
• Set of techniques to indicate the best course of action

• It tells what decision to make to optimize the outcome
• The goal of prescriptive analytics is to enable:
1. Quality improvements
2. Service enhancements
3. Cost reductions and
4. Increasing productivity
29
Prescriptive analytics: Example
• Optimization Model
• Simulation
• Decision Analysis
30
3. Explain why analytics is important
• Demand for Data Analytics

• Element of data Analytics
31
3. Explain why analytics is important
Data Scientist
Search Trends
Statistician, Operations Researcher
32
https://timesofindia.indiatimes.com/india/Data-scientists-earning-more-than-CAs-engineers/articleshow/52171064.cms
33
3.1 Demand for Data Analytics
http://timesofindia.indiatimes.com/articleshow/52171064.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst
34
3.2 Element of data Analytics
35
4. Data analyst and Data scientist
• The requisite skill set
• Difference between Data analyst and Data Scientist
36
4.1 The requisite skill set
[Figure: Venn diagram in which Data Science sits at the intersection of three skill areas]
• Mathematics expertise
• Technology and hacking skills
• Business and strategy acumen
4.2 Difference between Data analyst and Data Scientist
Business Administration
Analyst
Domain specific responsibility : For Example marketing analyst, Financial analyst etc.
Data exploration analysis and insight
Data Scientist
Advanced algorithms and machine learning
Data product engineering
Source:https://datajobs.com/
40
5. Why python?
Features
• Simple and easy to learn
• Freeware and Open source
• Interpreted
• Dynamically Typed
• Extensible
• Embeddable
• Extensive library
41
5. Why python?
Usability
• Desktop and web applications
• Database applications
• Networking applications
• Data analysis (Data Science)
• Machine learning
• IoT and AI applications
• Games
42
Companies using Python
43
Why Jupyter NoteBook?
Why?
• Client – Server Application
• Edit code on web browser
• Easy in documentation
• Easy in demonstration
• User- friendly Interface
44
6. Explain the four different levels of Data
• Types of Variables
• Levels of Data Measurement
• Compare the four different levels of Data:
Nominal
Ordinal
Interval and
Ratio
• Usage Potential of Various Levels of Data
• Data Level, Operations, and Statistical Methods
45
6.1 Types of Variables
Data
• Categorical (defined categories)
– Examples: Marital Status, Political Party, Eye Color
• Numerical
– Discrete (counted items). Examples: Number of Children, Defects per hour
– Continuous (measured characteristics). Examples: Weight, Voltage
6.2 Levels of Data Measurement
• Nominal — Lowest level of measurement

• Ordinal
• Interval
• Ratio — Highest level of measurement
47
6.3.1 Nominal
• A nominal scale classifies data into distinct categories in which no ranking

is implied
• Example : Gender, Marital Status
48
6.3.2 Ordinal scale
• An ordinal scale classifies data into distinct categories in which ranking is

implied
• Example:
– Product satisfaction: Satisfied, Neutral, Unsatisfied
– Faculty rank: Professor, Associate Professor, Assistant Professor
– Student grades: A, B, C, D, F
49
6.3.3. Interval scale
• An interval scale is an ordered scale in which the difference between

measurements is a meaningful quantity but the measurements do not have a
true zero point.
• Example
– Temperature in Fahrenheit and Celsius
– Year
50
6.3.4 Ratio scale
• A ratio scale is an ordered scale in which the difference between the

measurements is a meaningful quantity and the measurements have a true
zero point.
• Example
– Weight
– Age
– Salary
51
6.4 Usage Potential of Various
Levels of Data
Ratio
Interval
Ordinal
Nominal
52
6.5 Impact of choice of measurement scale
Data Level   Meaningful Operations                                Statistical Methods
Nominal      Classifying and counting                             Nonparametric
Ordinal      All of the above plus ranking                        Nonparametric
Interval     All of the above plus addition and subtraction       Parametric
Ratio        All of the above plus multiplication and division    Parametric
53
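The difference between nominal and ordinal data can be sketched with pandas categoricals. This is only an illustration: the sample values are made up, and using `pd.Categorical` is one possible way to encode the levels, not something the slides prescribe.

```python
import pandas as pd

# Nominal: categories without any ranking (made-up sample values)
eye_color = pd.Categorical(["brown", "blue", "brown", "green"], ordered=False)

# Ordinal: categories with an implied ranking, listed lowest first
satisfaction = pd.Categorical(
    ["Neutral", "Satisfied", "Unsatisfied", "Satisfied"],
    categories=["Unsatisfied", "Neutral", "Satisfied"],
    ordered=True,
)

# Classifying and counting is meaningful at every level
color_counts = pd.Series(eye_color).value_counts()

# Ranking is meaningful only from the ordinal level upwards
lowest = satisfaction.min()
highest = satisfaction.max()

print(color_counts)
print(lowest, highest)
```

Asking for the minimum of the unordered `eye_color` categorical would raise an error, which mirrors the table: ranking is not a meaningful operation on nominal data.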
Thank You
54
Lecture 2: Python – Fundamentals
Dr. A. Ramesh
IIT ROORKEE
1
Learning objectives
1. Installing Python
2. Fundamentals of Python
3. Data Visualisation
2
Python Installation Process
Installation Process –
Step 1: Type https://www.anaconda.com in the address bar of a web
browser.
Step 2: Click on the download button
Step 3: Download the Python 3.7 version for Windows
Step 4: Double-click the downloaded file to run the installer
Step 5: Follow the instructions until the installation is complete
3
Why Jupyter NoteBook?
Why?
• Edit code on web browser
• Easy in documentation
• Easy in demonstration
• User- friendly Interface
17
Python and Jupyter
Python Programming Language Jupyter Application
Software Package contains both

python and jupyter application
18
19
About Jupyter NoteBook
Cell -> Access using Enter Key
20
Input Field -> Green color indicates edit mode

Blue color indicates command mode
21
-> It contains documentation

-> Text not executed as code
22
About Jupyter Notebook
• Command mode allows you to edit the notebook as a whole
• To leave edit mode, press the Escape key
• Execution (three ways)
o Ctrl + Enter (runs the cell and keeps it selected)
o Shift + Enter (runs the cell and moves to the next cell)
o Run button on the Jupyter interface
• A comment line begins with the # symbol.
23
About Jupyter Notebook
• Important shortcut keys
o A -> To create cell above

o B -> To create cell below
o D + D -> For deleting cell
o M -> For markdown cell
o Y -> For code cell
24
Fundamentals of Python
• Loading a simple delimited data file

• Counting how many rows and columns were loaded
• Determining which type of data was loaded
• Looking at different parts of the data by subsetting rows
and columns
25
26
Loading a simple delimited data file
Data Source: www.github.com/jennybc/gapminder.
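A minimal sketch of loading a delimited file with pandas. To keep the example self-contained it reads placeholder tab-separated text from memory; the column names follow the Gapminder layout, but the rows are made up. With a real file, its path would be passed to `read_csv` instead.

```python
import io

import pandas as pd

# Placeholder tab-separated text in the Gapminder column layout.
# The values are made up for illustration only.
sample_tsv = io.StringIO(
    "country\tcontinent\tyear\tlifeExp\tpop\tgdpPercap\n"
    "CountryA\tAsia\t1952\t40.0\t1000000\t800.0\n"
    "CountryA\tAsia\t1957\t41.0\t1100000\t820.0\n"
    "CountryA\tAsia\t1962\t42.0\t1200000\t850.0\n"
    "CountryB\tEurope\t1952\t60.0\t2000000\t3000.0\n"
    "CountryB\tEurope\t1957\t61.0\t2100000\t3200.0\n"
    "CountryB\tEurope\t1962\t62.0\t2200000\t3400.0\n"
)

# sep="\t" tells read_csv that the columns are separated by tabs
df = pd.read_csv(sample_tsv, sep="\t")

# head() returns only the first 5 rows by default
print(df.head())
```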
27
28
• head method shows us only the first 5 rows
29
Get the number of rows and columns
30
get column names
31
get the dtype of each column
32
Pandas Types Versus Python Types
33
get more information about data
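The inspection steps above (rows and columns, column names, dtypes, summary information) might look like the following sketch. The small data frame is a made-up stand-in for the loaded data.

```python
import pandas as pd

# Made-up stand-in for the loaded data
df = pd.DataFrame({
    "country": ["CountryA", "CountryA", "CountryB"],
    "year": [1952, 1957, 1952],
    "lifeExp": [40.1, 41.5, 55.2],
})

# shape is an attribute (no parentheses): (number of rows, number of columns)
n_rows, n_cols = df.shape

# column names
column_names = list(df.columns)

# dtype of each column; text columns are reported with a pandas-specific type
column_types = df.dtypes

print(n_rows, n_cols)
print(column_names)
print(column_types)

# info() prints a summary: row count, non-null counts and dtypes
df.info()
```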
34
Looking at Columns, Rows, and Cells
• # get the country column and save it to its own variable
35
# show the first 5 observations
36
# show the last 5 observations
37
# Looking at country, continent, and year
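A sketch of the column operations described above, again on a made-up data frame. The variable name `country_df` is only illustrative.

```python
import pandas as pd

# Made-up stand-in for the loaded data (7 rows)
df = pd.DataFrame({
    "country": ["CountryA"] * 4 + ["CountryB"] * 3,
    "continent": ["Asia"] * 4 + ["Europe"] * 3,
    "year": [1952, 1957, 1962, 1967, 1952, 1957, 1962],
    "lifeExp": [40.0, 41.0, 42.0, 43.0, 60.0, 61.0, 62.0],
})

# get the country column and save it to its own variable (a Series)
country_df = df["country"]

# show the first 5 and the last 5 observations
print(country_df.head())
print(country_df.tail())

# passing a list of names selects several columns (a DataFrame)
subset = df[["country", "continent", "year"]]
print(subset.head())
```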
38
39
Lecture 3: Python – Fundamentals - II
Dr. A. Ramesh
IIT ROORKEE
1
Looking at Columns, Rows, and Cells
• Subset Rows by Index Label: loc
2
get the first row
• Python counts from 0
3
• # get the 100th row
# Python counts from 0
4
• get the last row
5
Subsetting Multiple Rows
• # select the first, 100th, and 1000th rows
6
Subset Rows by Row Number: iloc
• # get the 2nd row
7
• get the 100th row
8
• # using -1 to get the last row
9
With iloc, we can pass in the -1 to get the last row—something we couldn’t do with loc.
10
• # get the first, 100th, and 1000th rows
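The row-subsetting steps with `loc` and `iloc` can be sketched as follows. The data frame is synthetic (1,200 rows with the default integer index), so with this index the labels used by `loc` happen to coincide with the positions used by `iloc`.

```python
import pandas as pd

# Synthetic frame with the default integer index 0..1199
df = pd.DataFrame({
    "row_id": range(1200),
    "value": [i * 2 for i in range(1200)],
})

# loc selects by index label; Python counts from 0
first_row = df.loc[0]
hundredth_row = df.loc[99]

# loc needs the real label of the last row; df.loc[-1] would raise a KeyError
last_row_loc = df.loc[df.shape[0] - 1]

# iloc selects by position and accepts -1 for the last row
second_row = df.iloc[1]
last_row_iloc = df.iloc[-1]

# a list selects several rows: the first, 100th and 1000th
several_loc = df.loc[[0, 99, 999]]
several_iloc = df.iloc[[0, 99, 999]]
```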
11
Subsetting Columns
• The Python slicing syntax uses a colon, :

• If we have just a colon, the attribute refers to everything.
• So, if we just want to get the first column using the loc or iloc syntax,
we can write something like df.loc[:, [columns]] to subset the column(s).
12
• # subset columns with loc
# note the position of the colon
# it is used to select all rows
13
14
• # subset columns with iloc
• # iloc will allow us to use integers
• # -1 will select the last column
15
Subsetting Columns by Range
• # create a range of integers from 0 to 4 inclusive
16
• # subset the dataframe with the range
17
Subsetting Rows and Columns
• # using loc
18
• # using iloc
19
Subsetting Multiple Rows and Columns
• #get the 1st, 100th, and 1000th rows

# from the 1st, 4th, and 6th columns
20
• if we use the column names directly,
# it makes the code a bit easier to read
# note now we have to use loc, instead of iloc
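The column and row-plus-column subsetting described above might be written as below. The data frame is synthetic; its six column names follow the Gapminder layout, and the values are made up.

```python
import pandas as pd

n = 1000
# Synthetic data with six columns in the Gapminder layout
df = pd.DataFrame({
    "country": [f"Country{i % 10}" for i in range(n)],
    "continent": ["Asia"] * n,
    "year": [1952 + 5 * (i % 12) for i in range(n)],
    "lifeExp": [40.0 + i % 40 for i in range(n)],
    "pop": [1_000_000 + i for i in range(n)],
    "gdpPercap": [500.0 + i for i in range(n)],
})

# subset columns with loc: the colon selects all rows
cols_by_name = df.loc[:, ["year", "pop"]]

# subset columns with iloc: integers, and -1 for the last column
cols_by_position = df.iloc[:, [2, 4, -1]]

# a range of integers from 0 to 4 inclusive selects the first five columns
first_five_cols = df.iloc[:, list(range(5))]

# 1st, 100th and 1000th rows from the 1st, 4th and 6th columns
by_position = df.iloc[[0, 99, 999], [0, 3, 5]]

# the same selection by column name is easier to read, but needs loc
by_label = df.loc[[0, 99, 999], ["country", "lifeExp", "gdpPercap"]]
```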
21
22
23
Grouped Means
• # For each year in our data, what was the average life
expectancy?
# To answer this question,
# we need to split our data into parts by year;
# then we get the 'lifeExp' column and calculate the mean
24
25
26
• If you need to “flatten” the dataframe, you can use the
reset_index method.
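A sketch of the grouped mean: split by year, take the `lifeExp` column, compute the mean, then flatten the result. The four rows are made-up values chosen so the means are easy to check by hand.

```python
import pandas as pd

# Made-up values: two observations per year
df = pd.DataFrame({
    "year": [1952, 1952, 1957, 1957],
    "lifeExp": [40.0, 60.0, 42.0, 64.0],
})

# split by year, select lifeExp, and average within each group
mean_by_year = df.groupby("year")["lifeExp"].mean()
print(mean_by_year)

# reset_index turns the grouped result back into a flat DataFrame
flat = mean_by_year.reset_index()
print(flat)
```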
27
Grouped Frequency Counts
• Use the nunique method to count the unique values in a pandas Series.
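For instance, counting distinct countries within each continent could look like this sketch, which uses made-up rows:

```python
import pandas as pd

# Made-up rows: repeated countries within each continent
df = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Europe", "Europe"],
    "country": ["CountryA", "CountryA", "CountryB", "CountryC", "CountryC"],
})

# number of distinct countries in each continent
countries_per_continent = df.groupby("continent")["country"].nunique()
print(countries_per_continent)
```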
28
Basic Plot
29
30
Visual Representation of the Data
• Histogram -- vertical bar chart of frequencies
• Frequency Polygon -- line graph of frequencies
• Ogive -- line graph of cumulative frequencies
• Pie Chart -- proportional representation for categories of a whole
• Stem and Leaf Plot
• Pareto Chart
• Scatter Plot
31
Methods of visual presentation of data
• Table
1st Qtr 2nd Qtr 3rd Qtr 4th Qtr

East 20.4 27.4 90 20.4
West 30.6 38.6 34.6 31.6
North 45.9 46.9 45 43.9
32
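• Graphs
[Figure: chart of the quarterly values for East, West and North; vertical axis 0 to 90]
• Pie chart
[Figure: pie chart with slices for the 1st, 2nd, 3rd and 4th quarters]
• Multiple bar chart
[Figure: horizontal bar chart of East, West and North for each quarter; horizontal axis 0 to 100]
• Simple pictogram
[Figure: pictogram of the North, East and West values; vertical axis 0 to 100]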
36
Frequency distributions
• Frequency tables
Observation Table
Class Interval Frequency Cumulative Frequency
< 20 13 13
<40 18 31
<60 25 56
<80 15 71
<100 9 80
37
Frequency diagrams
[Figure: charts of frequency (vertical axis 0 to 30) and cumulative frequency (vertical axis 0 to 90) for the class intervals < 20, < 40, < 60, < 80 and < 100]
38
Histogram
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
[Figure: histogram of frequency against years; horizontal axis 0 to 80, vertical axis 0 to 20]
Histogram Construction
[Figure: the same histogram constructed from the frequency table above]
Frequency Polygon
[Figure: frequency polygon of the same class intervals; horizontal axis 0 to 80 years, vertical axis 0 to 20]
Ogive
Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50
[Figure: ogive of cumulative frequency against years; horizontal axis 0 to 80, vertical axis 0 to 60]
Relative Frequency Ogive
Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00
[Figure: ogive of cumulative relative frequency against years; horizontal axis 0 to 80, vertical axis 0.00 to 1.00]
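The cumulative and cumulative relative frequencies in the ogive tables can be reproduced from the class frequencies. A minimal sketch using only the standard library:

```python
from itertools import accumulate

# Class frequencies for 20-under 30, 30-under 40, ..., 70-under 80
frequencies = [6, 18, 11, 11, 3, 1]

# Running total of the frequencies gives the cumulative frequencies
cumulative = list(accumulate(frequencies))

# Dividing by the total gives the cumulative relative frequencies
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

print(cumulative)
print([round(r, 2) for r in cumulative_relative])
```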
43
Pareto Chart
[Figure: Pareto chart with frequency on the left axis (0 to 100) and cumulative percentage on the right axis (0% to 100%) for the categories Poor Wiring, Short in Coil, Defective Plug and Other]
44
Scatter Plot
Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
5                              60
15                             120
9                              90
15                             140
7                              60
[Figure: scatter plot of Gasoline Sales (0 to 200) against Registered Vehicles (0 to 20)]
45
Principles of Excellent Graphs
• The graph should not distort the data
• The graph should not contain unnecessary adornments (sometimes
referred to as chart junk)
• The scale on the vertical axis should begin at zero
• All axes should be properly labeled
• The graph should contain a title
• The simplest possible graph should be used for a given set of data
Graphical Errors: Chart Junk
[Figure: two charts of the minimum wage (1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80); the bad presentation is cluttered with chart junk, the good presentation is a plain chart with a vertical axis from $0 to $4]
Graphical Errors:
Compressing the Vertical Axis
[Figure: Quarterly Sales for Q1 to Q4; the bad presentation uses a vertical axis from $0 to $200, the good presentation from $0 to $50]
Graphical Errors: No Zero Point on the Vertical Axis
[Figure: Monthly Sales for the first six months (J to J); the bad presentation starts the vertical axis at $36, the good presentations include a zero point]
Graphing the first six months of sales

Lecture 4: Central Tendency and Dispersion
Dr. A. Ramesh
Department of Management Studies
1
Lecture objectives
• Central tendency
• Measures of Dispersion
2
Measures of Central Tendency
• Measures of central tendency yield information about “particular places or

locations in a group of numbers.”
• A single number to describe the characteristics of a set of data
3
Summary statistics
• Central tendency or measures of location
– Arithmetic mean
– Weighted mean
– Median
– Percentile
• Dispersion
– Skewness
– Kurtosis
– Range
– Interquartile range
– Variance
– Standard score
– Coefficient of variation
4
Arithmetic Mean
• Commonly called ‘the mean’
• It is the average of a group of numbers
• Applicable for interval and ratio data
• Not applicable for nominal or ordinal data
• Affected by each value in the data set, including extreme values
• Computed by summing all values in the data set and dividing the sum by
the number of values in the data set
5
Population Mean
$$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$$
$$\mu = \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$$
6
Sample Mean
$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$$
$$\bar{X} = \frac{57 + 86 + 42 + 38 + 90 + 66}{6} = \frac{379}{6} = 63.167$$
7
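Both mean calculations can be checked with Python's standard `statistics` module. The arithmetic is the same for a population and a sample; only the notation differs.

```python
import statistics

population = [24, 13, 19, 26, 11]
sample = [57, 86, 42, 38, 90, 66]

# Sum of the values divided by the number of values
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(population_mean)
print(round(sample_mean, 3))
```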
Mean of Grouped Data
• Weighted average of class midpoints
• Class frequencies are the weights
$$\mu_{grouped} = \frac{\sum fM}{\sum f} = \frac{\sum fM}{N} = \frac{f_1 M_1 + f_2 M_2 + f_3 M_3 + \dots + f_i M_i}{f_1 + f_2 + f_3 + \dots + f_i}$$
8
Calculation of Grouped Mean
Class Interval Frequency(f) Class Midpoint(M) fM
20-under 30 6 25 150
30-under 40 18 35 630
40-under 50 11 45 495
50-under 60 11 55 605
60-under 70 3 65 195
70-under 80 1 75 75
Total: Σf = 50, ΣfM = 2150
$$\mu = \frac{\sum fM}{\sum f} = \frac{2150}{50} = 43.0$$
9
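The grouped-mean table can be reproduced in a few lines. This sketch uses the class midpoints and frequencies from the table above:

```python
# Class midpoints (M) and frequencies (f) from the table
midpoints = [25, 35, 45, 55, 65, 75]
frequencies = [6, 18, 11, 11, 3, 1]

# Sum of f*M divided by the sum of f
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))
total_f = sum(frequencies)
grouped_mean = sum_fm / total_f

print(sum_fm, total_f, grouped_mean)
```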
Weighted Average
• Sometimes we wish to average numbers, but we want to assign more

importance, or weight, to some of the numbers.
• The average you need is the weighted average.

Formula for Weighted Average
$$\text{Weighted Average} = \frac{\sum xw}{\sum w}$$
where x is a data value and w is the weight assigned to that data value.
The sum is taken over all data values.
Example
Suppose your midterm test score is 83 and your final exam score is 95.
Using weights of 40% for the midterm and 60% for the final exam, compute
the weighted average of your scores. If the minimum average for an A is
90, will you earn an A?
$$\text{Weighted Average} = \frac{83(0.40) + 95(0.60)}{0.40 + 0.60} = \frac{33.2 + 57}{1} = 90.2$$
Since 90.2 is above the minimum of 90, you will earn an A!
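The same calculation as a small helper function. The function name `weighted_average` is only illustrative.

```python
def weighted_average(values, weights):
    """Sum of value*weight divided by the sum of the weights."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


scores = [83, 95]        # midterm and final exam scores
weights = [0.40, 0.60]   # 40% midterm, 60% final

average = weighted_average(scores, weights)
earns_a = average >= 90  # minimum average for an A

print(round(average, 1), earns_a)
```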
Median
• Middle value in an ordered array of numbers
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
• Unaffected by extremely large and extremely small values
13
Median: Computational Procedure
• First Procedure
– Arrange the observations in an ordered array
– If there is an odd number of terms, the median is the middle term of the
ordered array
– If there is an even number of terms, the median is the average of the
middle two terms
• Second Procedure
– The median’s position in an ordered array is given by (n+1)/2.
14
Median: Example with an Odd Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
• There are 17 terms in the ordered array.
• Position of median = (n+1)/2 = (17+1)/2 = 9
• The median is the 9th term, 15.
• If the 22 is replaced by 100, the median is 15.
• If the 3 is replaced by -103, the median is 15.
15
Median: Example with an Even Number of Terms
Ordered Array
3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
• There are 16 terms in the ordered array
• Position of median = (n+1)/2 = (16+1)/2 = 8.5
• The median is between the 8th and 9th terms, 14.5
• If the 21 is replaced by 100, the median is 14.5
• If the 3 is replaced by -88, the median is 14.5
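Both median examples, and the median's insensitivity to extreme values, can be verified with the standard `statistics` module:

```python
import statistics

odd_array = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
even_array = odd_array[:-1]  # drop the 22 to leave 16 terms

median_odd = statistics.median(odd_array)    # middle (9th) term
median_even = statistics.median(even_array)  # average of the 8th and 9th terms

# Replacing the largest value with an extreme value leaves the median unchanged
with_outlier = odd_array[:-1] + [100]
median_with_outlier = statistics.median(with_outlier)

print(median_odd, median_even, median_with_outlier)
```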
16
Median of Grouped Data
$$\text{Median} = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W)$$
Where:
L = the lower limit of the median class
$cf_p$ = cumulative frequency of the class preceding the median class
$f_{med}$ = frequency of the median class
W = width of the median class
N = total of frequencies
17
Median of Grouped Data -- Example
Class Interval   Frequency   Cumulative Frequency
20-under 30      6           6
30-under 40      18          24
40-under 50      11          35
50-under 60      11          46
60-under 70      3           49
70-under 80      1           50
$$Md = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W) = 40 + \frac{\frac{50}{2} - 24}{11}(10) = 40.909$$
N = 50
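The grouped-median formula can be sketched as a function that first locates the median class (the first class whose cumulative frequency reaches N/2) and then interpolates within it. The function name and the equal-width assumption are illustrative choices, not part of the slides.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data, assuming equal-width classes."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    cumulative_before = 0  # cfp: cumulative frequency before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative_before + freq >= total / 2:
            # L + ((N/2 - cfp) / fmed) * W
            return lower + (total / 2 - cumulative_before) / freq * width
        cumulative_before += freq
    raise ValueError("median class not found")


lower_limits = [20, 30, 40, 50, 60, 70]
frequencies = [6, 18, 11, 11, 3, 1]

median = grouped_median(lower_limits, frequencies, width=10)
print(round(median, 3))
```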
18
Mode
• The most frequently occurring value in a data set
• Applicable to all levels of data measurement (nominal, ordinal, interval,

and ratio)
• Bimodal -- Data sets that have two modes
• Multimodal -- Data sets that contain more than two modes
19
Mode -- Example
• The mode is 44
• There are more 44s than any other value
35 41 44 45
37 41 44 46
37 43 44 46
39 43 44 46
40 43 44 46
40 43 45 48
20
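Mode of Grouped Data
• Midpoint of the modal class
• Modal class has the greatest frequency
Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
• An interpolated value within the modal class, where d1 is the modal class frequency minus the frequency of the preceding class (18 - 6 = 12) and d2 is the modal class frequency minus the frequency of the following class (18 - 11 = 7):
$$\text{Mode} = L_{Mo} + \left(\frac{d_1}{d_1 + d_2}\right) w = 30 + \left(\frac{12}{12 + 7}\right)(10) \approx 36.32$$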
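Both views of the grouped mode (the midpoint of the modal class and the interpolated value) can be sketched as below. Treating a missing neighbouring class as having frequency 0 is an assumption made for this example.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Interpolated mode of grouped data, assuming equal-width classes."""
    # Index of the modal class (greatest frequency; first one if tied)
    k = max(range(len(frequencies)), key=frequencies.__getitem__)
    f_mo = frequencies[k]
    # Assumption: a missing neighbouring class counts as frequency 0
    f_prev = frequencies[k - 1] if k > 0 else 0
    f_next = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    d1 = f_mo - f_prev
    d2 = f_mo - f_next
    return lower_limits[k] + d1 / (d1 + d2) * width


lower_limits = [20, 30, 40, 50, 60, 70]
frequencies = [6, 18, 11, 11, 3, 1]

mode = grouped_mode(lower_limits, frequencies, width=10)
modal_midpoint = 30 + 10 / 2  # midpoint of the modal class 30-under 40

print(round(mode, 2), modal_midpoint)
```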
21
22
Percentiles
• Measures of central tendency that divide a group of data into 100 parts
• Example: 90th percentile indicates that at most 90% of the data lie
below it, and at least 10% of the data lie above it
• The median and the 50th percentile have the same value
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
23
Percentiles: Computational Procedure
• Organize the data into an ascending ordered array
• Calculate the pth percentile location:
$$i = \frac{P}{100}(n)$$
• Determine the percentile’s location and its value.
• If i is a whole number, the percentile is the average of the values at the
i and (i+1) positions
• If i is not a whole number, the percentile is at the position given by the
whole-number part of (i+1) in the ordered array
24
Percentiles: Example
• Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
• Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
• Location of 30th percentile:
$$i = \frac{30}{100}(8) = 2.4$$
• The location index, i, is not a whole number; i+1 = 2.4+1 = 3.4;
the whole number portion is 3; the 30th percentile is at the 3rd
location of the array; the 30th percentile is 13.
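The computational procedure above can be sketched as a function. This follows the slide's location rule only; library routines such as `statistics.quantiles` or NumPy's percentile functions use different interpolation rules and may return different values. Restricting p to whole numbers strictly between 0 and 100 is a simplification for this sketch.

```python
def percentile(values, p):
    """p-th percentile using the location rule i = (p/100) * n."""
    if not isinstance(p, int) or not 0 < p < 100:
        raise ValueError("p must be a whole number between 1 and 99")
    ordered = sorted(values)
    n = len(ordered)
    # Integer arithmetic avoids floating-point error in the whole-number check
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is a whole number: average the values at positions i and i+1
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not a whole number: take the value at position int(i) + 1
    return ordered[whole]


raw_data = [14, 12, 19, 23, 5, 13, 28, 17]
p30 = percentile(raw_data, 30)

# The same rule reproduces quartiles as the 25th, 50th and 75th percentiles
quartile_data = [106, 109, 114, 116, 121, 122, 125, 129]
q1 = percentile(quartile_data, 25)
q2 = percentile(quartile_data, 50)
q3 = percentile(quartile_data, 75)

print(p30, q1, q2, q3)
```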
25
Dispersion
• Measures of variability describe the spread or the dispersion of a set of

data
• Reliability of measure of central tendency
• To compare dispersion of various samples
26
Variability
No Variability in Cash Flow Mean
Variability in Cash Flow Mean
27
Measures of Variability or dispersion
Common Measures of Variability
• Range
• Inter-quartile range
• Mean Absolute Deviation
• Variance
• Standard Deviation
• Z scores
• Coefficient of Variation
28
Range – ungrouped data
• The difference between the largest and the smallest values in
a set of data
• Simple to compute
• Ignores all data points except the two extremes
• Example:
35 41 44 45
37 41 44 46
37 43 44 46
39 43 44 46
40 43 44 46
40 43 45 48
Range = Largest – Smallest = 48 - 35 = 13
29
Quartiles
• Measures of central tendency that divide a group of data into four subgroups
• Q1: 25% of the data set is below the first quartile
• Q2: 50% of the data set is below the second quartile
• Q3: 75% of the data set is below the third quartile
• Q1 is equal to the 25th percentile
• Q2 is located at 50th percentile and equals the median
• Q3 is equal to the 75th percentile
• Quartile values are not necessarily members of the data set
30
Quartiles
Q1 Q2 Q3
25% 25% 25% 25%
31
Quartiles: Example
• Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
• Q1:
$$i = \frac{25}{100}(8) = 2, \qquad Q_1 = \frac{109 + 114}{2} = 111.5$$
• Q2:
$$i = \frac{50}{100}(8) = 4, \qquad Q_2 = \frac{116 + 121}{2} = 118.5$$
• Q3:
$$i = \frac{75}{100}(8) = 6, \qquad Q_3 = \frac{122 + 125}{2} = 123.5$$
32
Interquartile Range
• Range of values between the first and third quartiles

• Range of the “middle half”
• Less influenced by extremes
Interquartile Range = Q3 - Q1
33
Deviation from the Mean
• Data set: 5, 9, 16, 17, 18
• Mean:
$$\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$$
• Deviations from the mean: -8, -4, 3, 4, 5
[Figure: number line from 0 to 20 showing the deviations -8, -4, +3, +4 and +5 from the mean]
34
Mean Absolute Deviation
• Average of the absolute deviations from the mean
X       X - μ    |X - μ|
5       -8       +8
9       -4       +4
16      +3       +3
17      +4       +4
18      +5       +5
Total   0        24
$$M.A.D. = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$$
35
Population Variance
• Average of the squared deviations from the arithmetic mean
X       X - μ    (X - μ)²
5       -8       64
9       -4       16
16      +3       9
17      +4       16
18      +5       25
Total   0        130
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$
36
Population Standard Deviation
• Square root of the variance
X       X - μ    (X - μ)²
5       -8       64
9       -4       16
16      +3       9
17      +4       16
18      +5       25
Total   0        130
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} \approx 5.1$$
37
Sample Variance
• Sum of the squared deviations from the sample mean, divided by n - 1
X       X - X̄    (X - X̄)²
2,398   625      390,625
1,844   71       5,041
1,539   -234     54,756
1,311   -462     213,444
7,092   0        663,866
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$
38
Sample Standard Deviation
• Square root of the sample variance
X       X - X̄    (X - X̄)²
2,398   625      390,625
1,844   71       5,041
1,539   -234     54,756
1,311   -462     213,444
7,092   0        663,866
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$
$$S = \sqrt{S^2} = \sqrt{221{,}288.67} = 470.41$$
39
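The deviation-based measures above can be checked with the standard `statistics` module. Note that `pvariance` and `pstdev` divide by N (population), while `variance` and `stdev` divide by n - 1 (sample).

```python
import statistics

population = [5, 9, 16, 17, 18]
mu = statistics.mean(population)

# Mean absolute deviation: average of |X - mu|
mad = sum(abs(x - mu) for x in population) / len(population)

# Population variance and standard deviation (divide by N)
population_variance = statistics.pvariance(population)
population_sd = statistics.pstdev(population)

# Sample variance and standard deviation (divide by n - 1)
sample = [2398, 1844, 1539, 1311]
sample_variance = statistics.variance(sample)
sample_sd = statistics.stdev(sample)

print(mu, mad, population_variance, round(population_sd, 1))
print(round(sample_variance, 2), round(sample_sd, 2))
```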
Uses of Standard Deviation
• Indicator of financial risk
• Quality Control
– construction of quality control charts
– process capability studies
• Comparing populations
– household incomes in two cities
– employee absenteeism at two plants
40
Standard Deviation as an Indicator of Financial Risk
Annualized Rate of Return
Financial Security   μ     σ
A                    15%   3%
B                    15%   7%
41
Lecture 5: Central Tendency and Dispersion- II
Dr. A. Ramesh
1
The Empirical Rule… If the histogram is bell shaped
• Approximately 68% of all observations fall

within one standard deviation of the mean.
• Approximately 95% of all observations fall

within two standard deviations of the mean.
• Approximately 99.7% of all observations fall

within three standard deviations of the mean.
2
Empirical Rule
• Data are normally distributed (or approximately normal)
Distance from the Mean   Percentage of Values Falling Within Distance
μ ± 1σ                   68
μ ± 2σ                   95
μ ± 3σ                   99.7
3
Chebysheff’s Theorem…Not often used because interval is very wide.
• A more general interpretation of the standard deviation is derived

from Chebysheff’s Theorem, which applies to all shapes of histograms
(not just bell shaped).
• The proportion of observations in any sample that lie within k
standard deviations of the mean is at least:
$$1 - \frac{1}{k^2} \quad \text{for } k > 1$$
For k = 2 (say), the theorem states that at least
3/4 of all observations lie within 2 standard
deviations of the mean. This is a “lower bound”
compared to the Empirical Rule’s approximation
(95%).
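The two rules can be compared numerically. This sketch computes Chebysheff's lower bound and, for a normal distribution, the share of values within k standard deviations using the standard library's `NormalDist`. The function names are illustrative.

```python
from statistics import NormalDist


def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations, for any distribution."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def normal_share_within(k):
    """Proportion of a normal distribution within k standard deviations."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (2, 3):
    print(k, chebyshev_lower_bound(k), round(normal_share_within(k), 4))
```

For k = 2 the bound is 0.75, well below the roughly 0.95 that holds for bell-shaped data, which is why the interval from Chebysheff's Theorem is described as very wide.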
41
Coefficient of Variation
• Ratio of the standard deviation to the mean, expressed as a percentage

• Measurement of relative dispersion
$$C.V. = \frac{\sigma}{\mu}(100)$$
5
Coefficient of Variation
$$\mu_1 = 29, \quad \sigma_1 = 4.6, \qquad C.V._1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$$
$$\mu_2 = 84, \quad \sigma_2 = 10, \qquad C.V._2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$$
6
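The comparison can be reproduced with a one-line helper. The function name is illustrative.

```python
def coefficient_of_variation(sigma, mu):
    """Standard deviation as a percentage of the mean."""
    return sigma / mu * 100


cv_1 = coefficient_of_variation(4.6, 29)
cv_2 = coefficient_of_variation(10, 84)

# The second data set has the larger standard deviation
# but the smaller relative dispersion
print(round(cv_1, 2), round(cv_2, 2))
```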
Variance and Standard Deviation
of Grouped Data
Population:
$$\sigma^2 = \frac{\sum f(M - \mu)^2}{N}, \qquad \sigma = \sqrt{\sigma^2}$$
Sample:
$$S^2 = \frac{\sum f(M - \bar{X})^2}{n - 1}, \qquad S = \sqrt{S^2}$$
7
Population Variance and Standard Deviation of
Grouped Data (μ = 43)
Class Interval   f    M    fM     M - μ   (M - μ)²   f(M - μ)²
20-under 30      6    25   150    -18     324        1944
30-under 40      18   35   630    -8      64         1152
40-under 50      11   45   495    2       4          44
50-under 60      11   55   605    12      144        1584
60-under 70      3    65   195    22      484        1452
70-under 80      1    75   75     32      1024       1024
Total            50        2150                      7200
$$\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{7200}{50} = 144$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{144} = 12$$
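The grouped population variance and standard deviation from the table can be reproduced as follows:

```python
import math

# Class midpoints (M) and frequencies (f) from the table
midpoints = [25, 35, 45, 55, 65, 75]
frequencies = [6, 18, 11, 11, 3, 1]

n_total = sum(frequencies)
mu = sum(f * m for f, m in zip(frequencies, midpoints)) / n_total

# Sum of f * (M - mu)^2, divided by N for a population
sum_sq = sum(f * (m - mu) ** 2 for f, m in zip(frequencies, midpoints))
variance = sum_sq / n_total
std_dev = math.sqrt(variance)

print(mu, sum_sq, variance, std_dev)
```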
8
Measures of Shape
• Skewness
– Absence of symmetry
– Extreme values in one side of a distribution
• Kurtosis
Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal shape
– Platykurtic: flat and spread out
• Box and Whisker Plots
– Graphic display of a distribution
– Reveals skewness
9
Skewness
[Figure: three distribution shapes labelled Negatively Skewed, Symmetric (Not Skewed) and Positively Skewed]
10
Skewness..
The skewness of a distribution is measured by comparing the relative positions
of the mean, median and mode.
• Distribution is symmetrical
• Mean = Median = Mode
• Distribution skewed right
• Median lies between mode and mean, and mode is less than mean
• Distribution skewed left
• Median lies between mode and mean, and mode is greater than
mean
11
Skewness
[Figure: relative positions of the mean, median and mode in negatively skewed, symmetric and positively skewed distributions]
12
Coefficient of Skewness
• Summary measure for skewness
$$S = \frac{3(\mu - M_d)}{\sigma}$$
• If S < 0, the distribution is negatively skewed (skewed to the left)
• If S = 0, the distribution is symmetric (not skewed)
• If S > 0, the distribution is positively skewed (skewed to the right)
13
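Coefficient of Skewness
$$\mu_1 = 23, \quad M_{d1} = 26, \quad \sigma_1 = 12.3, \qquad S_1 = \frac{3(\mu_1 - M_{d1})}{\sigma_1} = \frac{3(23 - 26)}{12.3} = -0.73$$
$$\mu_2 = 26, \quad M_{d2} = 26, \quad \sigma_2 = 12.3, \qquad S_2 = \frac{3(\mu_2 - M_{d2})}{\sigma_2} = \frac{3(26 - 26)}{12.3} = 0$$
$$\mu_3 = 29, \quad M_{d3} = 26, \quad \sigma_3 = 12.3, \qquad S_3 = \frac{3(\mu_3 - M_{d3})}{\sigma_3} = \frac{3(29 - 26)}{12.3} = +0.73$$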
14
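The three cases can be checked with a small helper that implements the coefficient of skewness. The function name is illustrative.

```python
def coefficient_of_skewness(mean, median, std_dev):
    """S = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


s1 = coefficient_of_skewness(23, 26, 12.3)  # mean below median
s2 = coefficient_of_skewness(26, 26, 12.3)  # mean equals median
s3 = coefficient_of_skewness(29, 26, 12.3)  # mean above median

print(round(s1, 2), round(s2, 2), round(s3, 2))
```

A negative result indicates a distribution skewed to the left, zero indicates symmetry, and a positive result indicates a distribution skewed to the right.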
Kurtosis
• Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal in shape
– Platykurtic: flat and spread out
Leptokurtic
Mesokurtic
Platykurtic
15
Box and Whisker Plot
• Five specific values are used:
– Median, Q2
– First quartile, Q1
– Third quartile, Q3
– Minimum value in the data set
– Maximum value in the data set
16
Box and Whisker Plot
Minimum Q1 Q2 Q3 Maximum
17
Skewness: Box and Whisker Plots, and Coefficient of
Skewness
S=0 S>0
S<0

18
THANK YOU
19
Lecture 6: Introduction to Probability
Dr. A. Ramesh
1
Lecture objectives
• Comprehend the different ways of assigning probability

• Understand and apply marginal, union, joint, and conditional probabilities
• Solve problems using the laws of probability including the laws of addition,
multiplication and conditional probability
• Revise probabilities using Bayes’ rule
2
Probability
• Probability is the numerical measure of the likelihood that an event will occur.
• The probability of any event must be between 0 and 1, inclusive
– 0 ≤ P(A) ≤ 1 for any event A.
• The sum of the probabilities of all mutually exclusive and collectively exhaustive events is 1.
– P(A) + P(B) + P(C) = 1
– where A, B, and C are mutually exclusive and collectively exhaustive
3
Range of Probability
1 Certain
.5
0 Impossible
4
Methods of Assigning Probabilities
• Classical method of assigning probability (rules and laws)
• Relative frequency of occurrence (cumulated historical data)
• Subjective Probability (personal intuition or reasoning)
5
Classical Probability
• Number of outcomes leading to the event divided by the total number of

outcomes possible
• Each outcome is equally likely
• Determined a priori -- before performing the experiment
• Applicable to games of chance
• Objective -- everyone correctly using the method assigns an identical
probability
6
Classical Probability
$P(E) = \dfrac{n_e}{N}$
Where:
$N$ = total number of outcomes
$n_e$ = number of outcomes in E
7
Relative Frequency Probability
• Based on historical data
• Computed after performing the experiment
• Number of times an event occurred divided by the number of trials
• Objective -- everyone correctly using the method assigns an identical

probability
8
Relative Frequency Probability
$P(E) = \dfrac{n_e}{N}$
Where:
$N$ = total number of trials
$n_e$ = number of outcomes producing E
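The classical and relative frequency methods can be compared with a small simulation. The sketch below assumes a fair six-sided die and the event "an even number is rolled"; the die, the event, the seed, and the function names are illustrative choices, not part of the lecture.

```python
import random


def classical_probability(event, sample_space):
    """n_e / N with equally likely outcomes, determined before any experiment."""
    return len(event & sample_space) / len(sample_space)


def relative_frequency(event, trials, rng):
    """n_e / N where N is the number of trials actually performed."""
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) in event)
    return hits / trials


sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}
rng = random.Random(42)  # fixed seed so the run is repeatable

p_classical = classical_probability(even, sample_space)
p_relative = relative_frequency(even, 10_000, rng)
print(p_classical, p_relative)
```

With more trials the relative frequency tends to settle near the classical value, which is why both methods are described as objective.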
9
Subjective Probability
• Comes from a person’s intuition or reasoning

• Subjective -- different individuals may (correctly) assign different numeric
probabilities to the same event
• Degree of belief
• Useful for unique (single-trial) experiments
– New product introduction
– Initial public offering of common stock
– Site selection decisions
– Sporting events
10
Probability - Terminology
• Experiment
• Event
• Elementary Events
• Sample Space
• Unions and Intersections
• Mutually Exclusive Events
• Independent Events
• Collectively Exhaustive Events
• Complementary Events
11
Experiment, Trial, Elementary Event, Event
• Experiment: a process that produces outcomes
– More than one possible outcome
– Only one outcome per trial
• Trial: one repetition of the process
• Elementary Event: cannot be decomposed or broken down into other
events
• Event: an outcome of an experiment
– may be an elementary event, or
– may be an aggregate of elementary events
– usually represented by an uppercase letter, e.g., A, E1
12
An Example Experiment
• Experiment: randomly select, without replacement, two families from the residents of Tiny Town
• Elementary Event: the sample includes families A and C
• Event: each family in the sample has children in the household
• Event: the sample families own a total of four automobiles

Tiny Town Population
Family   Children in Household   Number of Automobiles
A        Yes                     3
B        Yes                     2
C        No                      1
D        Yes                     2
13
Sample Space
• The set of all elementary events for an experiment

• Methods for describing a sample space
– roster or listing
– tree diagram
– set builder notation
– Venn diagram
14
Sample Space: Roster Example
• Experiment: randomly select, without replacement, two families from the residents of Tiny Town
• Each ordered pair in the sample space is an elementary event, for example (D,C)

Family   Children in Household   Number of Automobiles
A        Yes                     3
B        Yes                     2
C        No                      1
D        Yes                     2

Listing of Sample Space
(A,B), (A,C), (A,D),
(B,A), (B,C), (B,D),
(C,A), (C,B), (C,D),
(D,A), (D,B), (D,C)
15
Sample Space: Tree Diagram for Random Sample of Two
Families
16
Sample Space: Set Notation for Random Sample of Two
Families
• S = {(x,y) | x is the family selected on the first draw, and y is the family
selected on the second draw}
• Concise description of large sample spaces
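The roster of the Tiny Town sample space can be generated rather than typed out. The following sketch uses `itertools.permutations` for ordered draws without replacement; the dictionary layout and variable names are assumptions made for illustration.

```python
from itertools import permutations

# Tiny Town population as given in the lecture table.
families = {
    "A": {"children": True, "autos": 3},
    "B": {"children": True, "autos": 2},
    "C": {"children": False, "autos": 1},
    "D": {"children": True, "autos": 2},
}

# Ordered pairs (first draw, second draw), sampling without replacement.
sample_space = list(permutations(families, 2))

# Event: each family in the sample has children in the household.
with_children = [pair for pair in sample_space
                 if all(families[f]["children"] for f in pair)]

# Event: the sample families own a total of four automobiles.
four_autos = [pair for pair in sample_space
              if sum(families[f]["autos"] for f in pair) == 4]

print(len(sample_space), with_children, four_autos)
```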
17
Sample Space
• Useful for discussion of general principles and concepts
Listing of Sample Space

Venn Diagram
(A,B), (A,C), (A,D),
(B,A), (B,C), (B,D),
(C,A), (C,B), (C,D),
(D,A), (D,B), (D,C)
18
Union of Sets
• The union of two sets contains an instance of each element of the two sets.
$X = \{1, 4, 7, 9\}$
$Y = \{2, 3, 4, 5, 6\}$
$X \cup Y = \{1, 2, 3, 4, 5, 6, 7, 9\}$
$C = \{\text{IBM}, \text{DEC}, \text{Apple}\}$
$F = \{\text{Apple}, \text{Grape}, \text{Lime}\}$
$C \cup F = \{\text{IBM}, \text{DEC}, \text{Apple}, \text{Grape}, \text{Lime}\}$
19
Intersection of Sets
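• The intersection of two sets contains only those elements common to the two sets.
$X = \{1, 4, 7, 9\}$
$Y = \{2, 3, 4, 5, 6\}$
$X \cap Y = \{4\}$
$C = \{\text{IBM}, \text{DEC}, \text{Apple}\}$
$F = \{\text{Apple}, \text{Grape}, \text{Lime}\}$
$C \cap F = \{\text{Apple}\}$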
20
Mutually Exclusive Events
• Events with no common outcomes
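• Occurrence of one event precludes the occurrence of the other event
$C = \{\text{IBM}, \text{DEC}, \text{Apple}\}$
$F = \{\text{Grape}, \text{Lime}\}$
$C \cap F = \emptyset$
$X = \{1, 7, 9\}$
$Y = \{2, 3, 4, 5, 6\}$
$X \cap Y = \emptyset$
$P(X \cap Y) = 0$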
21
Independent Events
• Occurrence of one event does not affect the occurrence or nonoccurrence of the other event
• The conditional probability of X given Y is equal to the marginal probability of X.
• The conditional probability of Y given X is equal to the marginal probability of Y.
$P(X \mid Y) = P(X)$ and $P(Y \mid X) = P(Y)$
22
Collectively Exhaustive Events
• Contains all elementary events for an experiment
[Figure: sample space partitioned into three collectively exhaustive events E1, E2, and E3]
23
Complementary Events
• All elementary events not in the event ‘A’ are in its complementary event.
$P(\text{Sample Space}) = 1$
$P(\bar{A}) = 1 - P(A)$
[Figure: sample space containing event A and its complement $\bar{A}$]
24
Counting the Possibilities
• mn Rule
• Sampling from a Population with Replacement
• Combinations: Sampling from a Population without Replacement
25
mn Rule
• If an operation can be done in m ways and a second operation can be done in n ways, then there are mn ways for the two operations to occur in order.
• This rule is easily extended to k stages, with the number of ways equal to $n_1 \cdot n_2 \cdot n_3 \cdots n_k$
• Example: Toss two coins. The total number of simple events is 2 × 2 = 4
26
Sampling from a Population with Replacement
• A tray contains 1,000 individual tax returns. If 3 returns are randomly selected with replacement from the tray, how many possible samples are there?
• $N^n = (1{,}000)^3 = 1{,}000{,}000{,}000$
27
Combinations
• A tray contains 1,000 individual tax returns. If 3 returns are randomly selected without replacement from the tray, how many possible samples are there?
$\dbinom{N}{n} = \dfrac{N!}{n!\,(N - n)!} = \dfrac{1000!}{3!\,(1000 - 3)!} = 166{,}167{,}000$
28
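The three counting rules above can be checked directly in Python. This is a small sketch using `math.comb` from the standard library; the variable names are illustrative.

```python
import math

# mn rule: two coins, each with 2 outcomes.
coin_toss_events = 2 * 2

# Sampling with replacement: N^n ordered samples of 3 returns from 1,000.
with_replacement = 1000 ** 3

# Combinations: unordered samples of 3 returns from 1,000, without replacement.
without_replacement = math.comb(1000, 3)

print(coin_toss_events, with_replacement, without_replacement)
```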
Four Types of Probability
Type          Notation        Meaning
Marginal      $P(X)$          The probability of X occurring
Union         $P(X \cup Y)$   The probability of X or Y occurring
Joint         $P(X \cap Y)$   The probability of X and Y occurring
Conditional   $P(X \mid Y)$   The probability of X occurring given that Y has occurred
29
General Law of Addition
$P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$
[Figure: Venn diagram of overlapping events X and Y]
30
Design for improving productivity?
31
Problem
• A company conducted a survey for the American Society of Interior
Designers in which workers were asked which changes in office design
would increase productivity.
• Respondents were allowed to answer more than one type of design
change.
Reducing noise would increase productivity: 70%
More storage space would increase productivity: 67%
32
Problem
• If one of the survey respondents was randomly selected and asked what
office design changes would increase worker productivity,
– what is the probability that this person would select reducing noise or
more storage space?
33
Solution
• Let N represent the event “reducing noise.”

• Let S represent the event “more storage/ filing space.”
• The probability of a person responding with N or S can be symbolized
statistically as a union probability by using the law of addition.
34
General Law of Addition -- Example
$P(N \cup S) = P(N) + P(S) - P(N \cap S)$
$P(N) = 0.70$
$P(S) = 0.67$
$P(N \cap S) = 0.56$
$P(N \cup S) = 0.70 + 0.67 - 0.56 = 0.81$
[Figure: Venn diagram of N (.70) and S (.67) with overlap .56]
35
Office Design Problem
Probability Matrix
                        Increase Storage Space
                        Yes     No      Total
Noise Reduction   Yes   .56     .14     .70
                  No    .11     .19     .30
Total                   .67     .33     1.00
36
Joint Probability Using a Contingency Table
Event     B1              B2              Total
A1        P(A1 and B1)    P(A1 and B2)    P(A1)
A2        P(A2 and B1)    P(A2 and B2)    P(A2)
Total     P(B1)           P(B2)           1
The interior cells are joint probabilities; the row and column totals are marginal (simple) probabilities.
37
Office Design Problem - Probability Matrix
                        Increase Storage Space
                        Yes     No      Total
Noise Reduction   Yes   .56     .14     .70
                  No    .11     .19     .30
Total                   .67     .33     1.00
$P(N \cup S) = P(N) + P(S) - P(N \cap S)$
$= .70 + .67 - .56$
$= .81$
38
Law of Conditional Probability
39
Office Design Problem
40
Problem
• A company data reveal that 155 employees worked one of four types of
positions.
• Shown here again is the raw values matrix (also called a contingency table)
with the frequency counts for each category and for subtotals and totals
containing a breakdown of these employees by type of position and by
sex.
41
Contingency Table
42
Solution
• If an employee of the company is selected randomly, what is the

probability that the employee is female or a professional worker?
43
Problem
• Shown here are the raw values matrix and corresponding probability
matrix for the results of a national survey of 200 executives who were
asked to identify the geographic locale of their company and their
company’s industry type.
• The executives were only allowed to select one locale and one industry
type.
44
Lecture 7: Introduction to Probability-II
Dr. A. Ramesh
1
Problem
• A company data reveal that 155 employees worked one of four types of
positions.
• Shown here again is the raw values matrix (also called a contingency table)
with the frequency counts for each category and for subtotals and totals
containing a breakdown of these employees by type of position and by
sex.
2
Contingency Table
3
Solution
• If an employee of the company is selected randomly, what is the

probability that the employee is female or a professional worker?
4
Problem
• Shown here are the raw values matrix and corresponding probability
matrix for the results of a national survey of 200 executives who were
asked to identify the geographic locale of their company and their
company’s industry type.
• The executives were only allowed to select one locale and one industry
type.
5
6
Questions
a. What is the probability that the respondent is from the Midwest (F)?
b. What is the probability that the respondent is from the communications

industry (C) or from the Northeast (D)?
c. What is the probability that the respondent is from the Southeast (E) or
from the finance industry (A)?
7
8
Type of Position   Male   Female   Total
Managerial         8      3        11
Professional       31     13       44
Technical          52     17       69
Clerical           9      22       31
Total              100    55       155
Technical (T) and Clerical (C) are mutually exclusive, so the joint term drops out:
$P(T \cup C) = P(T) + P(C) = \dfrac{69}{155} + \dfrac{31}{155} = .645$
9
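Type of Position   Male   Female   Total
Managerial         8      3        11
Professional       31     13       44
Technical          52     17       69
Clerical           9      22       31
Total              100    55       155
Professional (P) and Clerical (C) are mutually exclusive, so the joint term drops out:
$P(P \cup C) = P(P) + P(C) = \dfrac{44}{155} + \dfrac{31}{155} = .484$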
10
Law of Multiplication
$P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$
11
Problem
• A company has 140 employees, of which 30 are supervisors.

• Eighty of the employees are married, and 20% of the married employees
are supervisors.
• If a company employee is randomly selected, what is the probability that
the employee is married and is a supervisor?
12
                   Married
                   Yes       No     Subtotal
Supervisor   Yes   0.1143           30
             No                     110
Subtotal           80        60     140
13
$P(M) = \dfrac{80}{140} = 0.5714$
$P(S \mid M) = 0.20$
$P(M \cap S) = P(M) \cdot P(S \mid M) = (0.5714)(0.20) = 0.1143$
14
Law of Multiplication
Probability Matrix of Employees
                   Married
                   Yes      No       Total
Supervisor   Yes   .1143    .1000    .2143
             No    .4571    .3286    .7857
Total              .5714    .4286    1.00

$P(\bar{S}) = 1 - P(S) = 1 - 0.2143 = 0.7857$
$P(\bar{M}) = 1 - P(M) = 1 - 0.5714 = 0.4286$
$P(M \cap \bar{S}) = P(M) - P(M \cap S) = 0.5714 - 0.1143 = 0.4571$
$P(\bar{M} \cap S) = P(S) - P(M \cap S) = 0.2143 - 0.1143 = 0.1000$
$P(\bar{M} \cap \bar{S}) = P(\bar{S}) - P(M \cap \bar{S}) = 0.7857 - 0.4571 = 0.3286$
15
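The whole probability matrix for the supervisor problem follows from three inputs: the counts of employees, supervisors, and married employees, plus the conditional probability P(S | M) = 0.20. A minimal sketch, with illustrative variable names:

```python
employees, supervisors, married = 140, 30, 80
p_s_given_m = 0.20  # 20% of the married employees are supervisors

p_m = married / employees
p_s = supervisors / employees

# Law of multiplication: P(M and S) = P(M) * P(S | M).
p_m_and_s = p_m * p_s_given_m

# Remaining cells follow by subtraction from the marginals.
p_m_and_not_s = p_m - p_m_and_s
p_not_m_and_s = p_s - p_m_and_s
p_not_m_and_not_s = 1 - (p_m_and_s + p_m_and_not_s + p_not_m_and_s)

for label, value in [("M and S", p_m_and_s), ("M and not S", p_m_and_not_s),
                     ("not M and S", p_not_m_and_s),
                     ("not M and not S", p_not_m_and_not_s)]:
    print(f"P({label}) = {value:.4f}")
```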
Special Law of Multiplication for Independent Events
• General Law
$P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$
• Special Law
If events X and Y are independent, $P(X) = P(X \mid Y)$ and $P(Y) = P(Y \mid X)$.
Consequently, $P(X \cap Y) = P(X) \cdot P(Y)$
16
Law of Conditional Probability
• The conditional probability of X given Y is the joint probability of X and Y divided by the marginal probability of Y.
$P(X \mid Y) = \dfrac{P(X \cap Y)}{P(Y)} = \dfrac{P(Y \mid X) \cdot P(X)}{P(Y)}$
17
Conditional Probability
• A conditional probability is the probability of one event, given that
another event has occurred:
$P(A \mid B) = \dfrac{P(A \text{ and } B)}{P(B)}$ is the conditional probability of A given that B has occurred
$P(B \mid A) = \dfrac{P(A \text{ and } B)}{P(A)}$ is the conditional probability of B given that A has occurred
Where P(A and B) = joint probability of A and B
P(A) = marginal probability of A
P(B) = marginal probability of B
18
Computing Conditional Probability
• Of the cars on a used car lot, 70% have air conditioning (AC)
and 40% have a CD player (CD). 20% of the cars have both.
• What is the probability that a car has a CD player, given that it
has AC ?
• We want to find P(CD | AC).
Computing Conditional Probability
          CD     No CD   Total
AC        0.2    0.5     0.7
No AC     0.2    0.1     0.3
Total     0.4    0.6     1.0
$P(CD \mid AC) = \dfrac{P(CD \text{ and } AC)}{P(AC)} = \dfrac{.2}{.7} = .2857$
Given AC, we only consider the top row (70% of the cars). Cars with both AC and a CD player make up 20% of all cars, so among AC cars the share with a CD player is .2 / .7, or about 28.57%.
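As a quick check of the used car example, the conditional probability is just the joint probability divided by the marginal probability of the given event. The helper name below is illustrative.

```python
def conditional_probability(p_joint, p_given):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return p_joint / p_given


p_cd_and_ac = 0.2  # cars with both a CD player and AC
p_ac = 0.7         # cars with AC

p_cd_given_ac = conditional_probability(p_cd_and_ac, p_ac)
print(round(p_cd_given_ac, 4))
```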
Computing Conditional Probability: Decision Trees
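[Figure: tree diagram. All cars split into AC (.7) and no AC (.3); the second-level branches carry the conditional probabilities .2/.7 and .5/.7 given AC, and .2/.3 and .1/.3 given no AC]
P(AC and CD) = .2
P(AC and no CD) = .5
P(no AC and CD) = .2
P(no AC and no CD) = .1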
Independent Events
• If X and Y are independent events, the occurrence of Y does not affect the
probability of X occurring.
• If X and Y are independent events, the occurrence of X does not affect the
probability of Y occurring.
If X and Y are independent events, $P(X \mid Y) = P(X)$ and $P(Y \mid X) = P(Y)$.
Statistical Independence
• Two events are independent if and only if: $P(A \mid B) = P(A)$
• Events A and B are independent when the probability of one event is not affected by the other event
Independent Events Demonstration
                       Geographic Location
                       Northeast   Southeast   Midwest   West
                       D           E           F         G       Total
Finance A              .12         .05         .04       .07     .28
Manufacturing B        .15         .03         .11       .06     .35
Communications C       .14         .09         .06       .08     .37
Total                  .41         .17         .21       .21     1.00
Test the matrix for the 200 executive responses to determine whether industry type is independent of geographic location.
Independent Events Demonstration Contd…
$P(A \mid G) = \dfrac{P(A \cap G)}{P(G)} = \dfrac{0.07}{0.21} = 0.33$
$P(A) = 0.28$
$P(A \mid G) = 0.33 \ne P(A) = 0.28$, so industry type is not independent of geographic location.
Independent Events
          D     E     Total
A         8     12    20
B         20    30    50
C         6     9     15
Total     34    51    85
$P(A \mid D) = \dfrac{8}{34} = .2353$
$P(A) = \dfrac{20}{85} = .2353$
$P(A \mid D) = P(A) = 0.2353$
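Both demonstrations can be automated by comparing every joint probability with the product of its marginals. This is a sketch only: the function names and the tolerance are assumptions, and with rounded survey probabilities a looser tolerance might be more appropriate.

```python
def marginals(table):
    """Return (grand total, row marginals, column marginals) as probabilities."""
    total = sum(sum(row.values()) for row in table.values())
    rows = {r: sum(cells.values()) / total for r, cells in table.items()}
    cols = {}
    for cells in table.values():
        for c, value in cells.items():
            cols[c] = cols.get(c, 0) + value / total
    return total, rows, cols


def is_independent(table, tol=1e-9):
    """True when P(row and col) equals P(row) * P(col) for every cell."""
    total, rows, cols = marginals(table)
    return all(abs(value / total - rows[r] * cols[c]) <= tol
               for r, cells in table.items()
               for c, value in cells.items())


# Probability matrix for the 200 executives (industry by region).
executives = {
    "A": {"D": 0.12, "E": 0.05, "F": 0.04, "G": 0.07},
    "B": {"D": 0.15, "E": 0.03, "F": 0.11, "G": 0.06},
    "C": {"D": 0.14, "E": 0.09, "F": 0.06, "G": 0.08},
}
# Raw counts from the second demonstration.
counts = {
    "A": {"D": 8, "E": 12},
    "B": {"D": 20, "E": 30},
    "C": {"D": 6, "E": 9},
}

print(is_independent(executives), is_independent(counts))
```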
Revision of Probabilities: Bayes’ Rule
• An extension to the conditional law of probabilities

• Enables revision of original probabilities with new
information
$P(X_i \mid Y) = \dfrac{P(Y \mid X_i)\,P(X_i)}{P(Y \mid X_1)\,P(X_1) + P(Y \mid X_2)\,P(X_2) + \cdots + P(Y \mid X_n)\,P(X_n)}$
28
29
30
31
Problem
• A particular type of printer ribbon is produced by only
two companies, Alamo Ribbon Company and South
Jersey Products.
• Suppose Alamo produces 65% of the ribbons and
that South Jersey produces 35%.
• Eight percent of the ribbons produced by Alamo are
defective and 12% of the South Jersey ribbons are
defective
• A customer purchases a new ribbon. What is the probability that Alamo produced the ribbon? What is the probability that South Jersey produced the ribbon?
• If the ribbon turns out to be defective (d), how should these probabilities be revised?
Revision of Probabilities
with Bayes' Rule: Ribbon Problem
$P(\text{Alamo}) = 0.65$
$P(\text{South Jersey}) = 0.35$
$P(d \mid \text{Alamo}) = 0.08$
$P(d \mid \text{South Jersey}) = 0.12$
$P(\text{Alamo} \mid d) = \dfrac{P(d \mid \text{Alamo})\,P(\text{Alamo})}{P(d \mid \text{Alamo})\,P(\text{Alamo}) + P(d \mid \text{South Jersey})\,P(\text{South Jersey})} = \dfrac{(0.08)(0.65)}{(0.08)(0.65) + (0.12)(0.35)} = 0.553$
$P(\text{South Jersey} \mid d) = \dfrac{P(d \mid \text{South Jersey})\,P(\text{South Jersey})}{P(d \mid \text{Alamo})\,P(\text{Alamo}) + P(d \mid \text{South Jersey})\,P(\text{South Jersey})} = \dfrac{(0.12)(0.35)}{(0.08)(0.65) + (0.12)(0.35)} = 0.447$
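Bayes' rule for the ribbon problem can be written as a short function that multiplies each prior by its likelihood and normalizes. The function name and dictionary keys are illustrative assumptions; the numbers are those of the problem.

```python
def bayes_posteriors(priors, likelihoods):
    """Revise prior probabilities given the likelihood of the observed evidence."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    evidence = sum(joint.values())  # total probability of the evidence
    return {k: value / evidence for k, value in joint.items()}


priors = {"Alamo": 0.65, "South Jersey": 0.35}
p_defective_given = {"Alamo": 0.08, "South Jersey": 0.12}

posteriors = bayes_posteriors(priors, p_defective_given)
for company, p in posteriors.items():
    print(f"P({company} | defective) = {p:.3f}")
```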
Revision of Probabilities with Bayes' Rule: Ribbon Problem
[Figure: tree diagram]
Alamo (0.65): Defective 0.08, joint probability 0.052; Acceptable 0.92
South Jersey (0.35): Defective 0.12, joint probability 0.042; Acceptable 0.88
P(defective) = 0.052 + 0.042 = 0.094
THANK YOU
36
Lecture 8: Probability Distributions
Dr. A. Ramesh
IIT ROORKEE
1
Lecture Objectives
• Empirical Distribution
• Discrete Distributions
• Continuous Distributions
2
What is a distribution?
• Describes the ‘shape’ of a batch of numbers
• The characteristics of a distribution can sometimes be defined using a

small number of numeric descriptors called ‘parameters’
3
Why distribution?
• Can serve as a basis for standardized comparison of empirical
distributions
• Can help us estimate confidence intervals for inferential statistics
• Form a basis for more advanced statistical methods
– ‘fit’ between observed distributions and certain theoretical
distributions is an assumption of many statistical procedures
4
Random variable
• A variable which contains the outcomes of a chance experiment
• “Quantifying the outcomes”
• Example X= (1 = Head, 0 = Tails)
• A variable that can take on different values in the population
according to some “random” mechanism
• Discrete
– Distinct values, countable
– Year
• Continuous
– Mass
5
Probability Distributions
• The probability distribution function or probability density function (PDF)

of a random variable X means the values taken by that random variable
and their associated probabilities.
• PDF of a discrete r.v. (also known as PMF):

Example 1: Let the r.v. X be the number of heads obtained in two tosses of
a coin.
Sample Space: {HH, HT, TH, TT}
6
PDF of Discrete r.v.
Number of Heads (X):   0     1     2     Sum
PDF (P(X)):            1/4   1/2   1/4   1
[Figure: bar chart "The PDF of the Number of Heads in Two Tosses of a Coin"; x-axis: Number of Heads (0, 1, 2); y-axis: Probability Density (0.25, 0.5, 0.25)]
7
Probability Distribution for the Random Variable X
A probability distribution for a discrete random
variable X:
x          –8     –3     –1     0      1      4      6
P(X = x)   0.13   0.15   0.17   0.20   0.15   0.11   0.09
Find
a. $P(X \le 0) = 0.65$
b. $P(-3 \le X \le 1) = 0.67$
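Both answers are sums of the probabilities of the values that fall in the stated range. A minimal sketch, assuming the distribution is stored as a dictionary from value to probability (the helper name is illustrative):

```python
import math

pmf = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}


def prob_between(pmf, low=-math.inf, high=math.inf):
    """P(low <= X <= high) for a discrete distribution."""
    return sum(p for x, p in pmf.items() if low <= x <= high)


p_a = prob_between(pmf, high=0)          # P(X <= 0)
p_b = prob_between(pmf, low=-3, high=1)  # P(-3 <= X <= 1)
print(round(p_a, 2), round(p_b, 2))
```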
8
Discrete Distribution -- Example
Distribution of Daily Crises
Number of Crises   Probability
0                  0.37
1                  0.31
2                  0.18
3                  0.09
4                  0.04
5                  0.01
[Figure: bar chart; x-axis: Number of Crises (0 to 5); y-axis: Probability (0 to 0.5)]
9
Requirements for a Discrete Probability Function
• Probabilities are between 0 and 1, inclusive
• Total of all probabilities equals 1
$0 \le P(X) \le 1$ for all X
$\sum_{\text{all } x} P(X) = 1$
10
Cumulative Distribution Function
• The CDF of a random variable X (denoted F(X)) is a graph associating all possible values, or the range of possible values, with $P(X \le x)$.
• CDFs always lie between 0 and 1, i.e., $0 \le F(X_i) \le 1$, where $F(X_i)$ is the CDF.
11
The Expected Value of X
Let X be a discrete rv with set of possible values D and pmf p(x). The expected value or mean value of X, denoted $E(X)$ or $\mu_X$, is
$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$
12
Mean and Variance of a Discrete Random Variable
A probability distribution can be viewed as a loading with the

mean equal to the balance point. Parts (a) and (b) illustrate
equal means, but Part (a) illustrates a larger variance.
Mean and Variance of a Discrete Random Variable
The probability distributions illustrated in Parts (a) and (b) differ even though they have equal means and equal variances.
Example – Expected Value
• Use the data below to find the expected number of credit cards that a customer of a retail outlet will possess.
x = number of credit cards
x     P(X = x)
0     0.08
1     0.28
2     0.38
3     0.16
4     0.06
5     0.03
6     0.01
$E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n$
$= 0(.08) + 1(.28) + 2(.38) + 3(.16) + 4(.06) + 5(.03) + 6(.01) = 1.97$
About 2 credit cards
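The expected value is a probability-weighted sum, which maps directly onto a one-line function. The sketch below uses the credit card distribution from the example; the function name is an illustrative choice.

```python
def expected_value(pmf):
    """E(X) = sum of x * p(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())


credit_cards = {0: 0.08, 1: 0.28, 2: 0.38, 3: 0.16, 4: 0.06, 5: 0.03, 6: 0.01}

mean_cards = expected_value(credit_cards)
print(round(mean_cards, 2))
```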
15
The Variance and Standard Deviation
Let X have pmf p(x) and expected value $\mu$. Then the variance of X, denoted V(X) (or $\sigma_X^2$ or $\sigma^2$), is
$V(X) = \sum_{x \in D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2]$
The standard deviation (SD) of X is
$\sigma_X = \sqrt{\sigma_X^2}$
16
The quiz scores for a particular student are given below:
22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18
Find the variance and standard deviation.
Value         12    18    20    22    24    25
Frequency     1     2     4     1     2     3
Probability   .08   .15   .31   .08   .15   .23
$\mu = 21$
$V(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \cdots + p_n (x_n - \mu)^2$
$\sigma = \sqrt{V(X)}$
17
$V(X) = .08(12 - 21)^2 + .15(18 - 21)^2 + .31(20 - 21)^2 + .08(22 - 21)^2 + .15(24 - 21)^2 + .23(25 - 21)^2$
$V(X) = 13.25$
$\sigma = \sqrt{V(X)} = \sqrt{13.25} = 3.64$
18
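The quiz score calculation can be reproduced as follows. The slide rounds each probability to two decimals, which gives V(X) = 13.25; using the exact relative frequencies (count / 13) gives a slightly smaller value of about 13.08. The sketch shows both; the function names are illustrative.

```python
import math
from collections import Counter

scores = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]


def mean(pmf):
    return sum(x * p for x, p in pmf.items())


def variance(pmf, mu):
    """V(X) = sum of p(x) * (x - mu)^2."""
    return sum(p * (x - mu) ** 2 for x, p in pmf.items())


# Exact relative frequencies.
n = len(scores)
pmf_exact = {x: count / n for x, count in Counter(scores).items()}

# Probabilities rounded to two decimals, as on the slide.
pmf_rounded = {12: 0.08, 18: 0.15, 20: 0.31, 22: 0.08, 24: 0.15, 25: 0.23}

mu = mean(pmf_exact)                      # 21
var_slide = variance(pmf_rounded, 21)     # 13.25, matches the slide
var_exact = variance(pmf_exact, mu)       # about 13.08
print(round(var_slide, 2), round(math.sqrt(var_slide), 2), round(var_exact, 2))
```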
Shortcut Formula for Variance
$V(X) = \sigma^2 = \left[\sum_{x \in D} x^2 \cdot p(x)\right] - \mu^2 = E(X^2) - [E(X)]^2$
19
Mean of a Discrete Distribution
$\mu = E(X) = \sum X \cdot P(X)$
X      P(X)   X·P(X)
-1     .1     -.1
0      .2     .0
1      .4     .4
2      .2     .4
3      .1     .3
Total         1.0
20
Variance and Standard Deviation of a Discrete Distribution
$\sigma^2 = \sum (X - \mu)^2 \cdot P(X) = 1.2$
$\sigma = \sqrt{\sigma^2} = \sqrt{1.2} = 1.10$
X      P(X)   X − μ   (X − μ)²   (X − μ)²·P(X)
-1     .1     -2      4          .4
0      .2     -1      1          .2
1      .4     0       0          .0
2      .2     1       1          .2
3      .1     2       4          .4
Total                            1.2
21
Mean of the Data Example
$\mu = E(X) = \sum X \cdot P(X) = 1.15$
X      P(X)   X·P(X)
0      .37    .00
1      .31    .31
2      .18    .36
3      .09    .27
4      .04    .16
5      .01    .05
Total         1.15
[Figure: bar chart; x-axis: Number (0 to 5); y-axis: Probability (0 to 0.5)]
22
Properties of Expected Value
1. $E(b) = b$, where b is a constant.
2. $E(X + Y) = E(X) + E(Y)$.
3. $E\left(\dfrac{X}{Y}\right) \ne \dfrac{E(X)}{E(Y)}$ in general.
4. $E(XY) \ne E(X)\,E(Y)$ unless X and Y are independent.
5. $E(aX) = a\,E(X)$, where a is a constant.
6. $E(aX + b) = a\,E(X) + b$, where a and b are constants.
23
Properties of Variance
1. Var(constant) = 0
2. If X and Y are two independent random variables, then
Var(X + Y) = Var(X) + Var (Y) and
Var(X - Y) = Var(X) + Var (Y)
3. If b is a constant then Var(b+X) = Var(X)
4. If a is a constant then $Var(aX) = a^2\,Var(X)$
5. If a and b are constants then $Var(aX + b) = a^2\,Var(X)$
6. If X and Y are two independent random variables and a and b are constants then $Var(aX + bY) = a^2\,Var(X) + b^2\,Var(Y)$
24
Covariance
Covariance: For two discrete random variables X and Y with $E(X) = \mu_x$ and $E(Y) = \mu_y$, the covariance between X and Y is defined as
$Cov(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - \mu_x \mu_y$.
25
Covariance
• In general, the covariance between two random variables can be
positive or negative.
• If two random variables move in the same direction, then the
covariance will be positive, if they move in the opposite direction
the covariance will be negative.
Properties:
1. If X and Y are independent random variables, their covariance is zero, since then E(XY) = E(X)E(Y).
2. Cov(X, X) = Var(X)
3. Cov(Y, Y) = Var(Y)
26
Correlation Coefficient
• The sign of the covariance shows whether the variables are positively or negatively related, but its magnitude depends on the units of measurement and does not show how strongly they are related. The correlation coefficient provides such a measure of how strongly the variables are related to each other.
• For two random variables X and Y with $E(X) = \mu_x$ and $E(Y) = \mu_y$, the correlation coefficient is defined as
$\rho_{xy} = \dfrac{Cov(X, Y)}{\sigma_x \sigma_y} = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
27
28
Thank You
29
Lecture 9: Probability Distributions-II
Dr. A. Ramesh
IIT ROORKEE
1
Some Special Distributions
• Discrete
– Binomial
– Poisson
– Hyper geometric
• Continuous
– Uniform
– Exponential
– Normal
2
Binomial Distribution
• Let us consider the purchase decisions of the next three customers who
enter a store.
• On the basis of past experience, the store manager estimates the

probability that any one customer will make a purchase is .30.
• What is the probability that two of the next three customers will make a
purchase?
3
Tree diagram for the Martin clothing store problem
4
Trial Outcomes
5
Graphical representation of the probability distribution
for the number of customers making a purchase
x     P(x)
0     0.7 × 0.7 × 0.7 = 0.343
1     0.3 × 0.7 × 0.7 + 0.7 × 0.3 × 0.7 + 0.7 × 0.7 × 0.3 = 0.441
2     0.189
3     0.027
6
Binomial Distribution: Assumptions
• Experiment involves n identical trials
• Each trial has exactly two possible outcomes: success and failure
• Each trial is independent of the previous trials
• p is the probability of a success on any one trial
q = (1-p) is the probability of a failure on any one trial
• p and q are constant throughout the experiment
• X is the number of successes in the n trials
7
Binomial Distribution
• Probability function: $P(X) = \dfrac{n!}{X!\,(n - X)!}\, p^X q^{n - X}$ for $0 \le X \le n$
• Mean value: $\mu = n \cdot p$
• Variance and standard deviation: $\sigma^2 = n \cdot p \cdot q$ and $\sigma = \sqrt{\sigma^2} = \sqrt{n \cdot p \cdot q}$
8
Binomial Table
SELECTED VALUES FROM THE BINOMIAL PROBABILITY TABLE
EXAMPLE: n = 10, x = 3, p = .40; f (3) = .2150
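The binomial probability function is easy to evaluate without a table. The sketch below recomputes the distribution for the three-customer store example (p = .30) and the table entry for n = 10, x = 3, p = .40; the function names are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_mean_sd(n, p):
    """Mean n*p and standard deviation sqrt(n*p*q) with q = 1 - p."""
    return n * p, sqrt(n * p * (1 - p))


# Number of purchases among the next three customers.
store = {x: binomial_pmf(x, n=3, p=0.30) for x in range(4)}
print({x: round(prob, 3) for x, prob in store.items()})

# Entry from the binomial table: n = 10, x = 3, p = .40.
table_entry = binomial_pmf(3, n=10, p=0.40)
print(round(table_entry, 4))
```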
9
Mean and Variance
• Suppose that for the next month the Clothing Store forecasts 1000
customers will enter the store.
• What is the expected number of customers who will make a purchase?
• The answer is μ = np = (1000)(.3) = 300.
• For the next 1000 customers entering the store, the variance and standard deviation for the number of customers who will make a purchase are $\sigma^2 = np(1 - p) = (1000)(.3)(.7) = 210$ and $\sigma = \sqrt{210} = 14.49$.
10
Poisson Distribution
• Describes discrete occurrences over a continuum or interval

• A discrete distribution
• Describes rare events
• Each occurrence is independent of any other occurrence.
• The number of occurrences in each interval can vary from zero to infinity.
• The expected number of occurrences must hold constant throughout the
experiment.
11
Poisson Distribution: Applications
• Arrivals at queuing systems
– airports -- people, airplanes, automobiles, baggage
– banks -- people, automobiles, loan applications
– computer file servers -- read and write operations
• Defects in manufactured goods

– number of defects per 1,000 feet of extruded copper wire
– number of blemishes per square foot of painted surface
– number of errors per typed page
12
Poisson Distribution
• Probability function
$P(X) = \dfrac{\lambda^X e^{-\lambda}}{X!}$ for $X = 0, 1, 2, 3, \ldots$
where:
$\lambda$ = long-run average
$e = 2.718282\ldots$ (the base of natural logarithms)
• Mean value: $\lambda$
• Variance: $\lambda$
• Standard deviation: $\sqrt{\lambda}$
13
Poisson Distribution: Example
$\lambda = 3.2$ customers/4 minutes
Adjusted $\lambda = 6.4$ customers/8 minutes
$P(X) = \dfrac{\lambda^X e^{-\lambda}}{X!}$
For X = 10 customers/8 minutes: $P(X = 10) = \dfrac{6.4^{10}\, e^{-6.4}}{10!} = 0.0528$
For X = 6 customers/8 minutes: $P(X = 6) = \dfrac{6.4^{6}\, e^{-6.4}}{6!} = 0.1586$
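The rate adjustment and the two probabilities can be verified with a few lines of Python. This is a sketch with an illustrative function name; it also checks the Poisson table entry for a mean of 10 and x = 5.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x!"""
    return lam ** x * exp(-lam) / factorial(x)


# The rate is quoted per 4 minutes but the question is about 8 minutes,
# so the rate is rescaled to the same interval first.
lam_per_4_min = 3.2
lam_per_8_min = lam_per_4_min * (8 / 4)

p_10 = poisson_pmf(10, lam_per_8_min)
p_6 = poisson_pmf(6, lam_per_8_min)
print(round(p_10, 4), round(p_6, 4))
```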
14
Poisson Probability Table
Example: μ = 10, x = 5; f (5) = .0378
15
The Hypergeometric Distribution
• The binomial distribution is applicable when selecting from a

finite population with replacement or from an infinite population
without replacement.
• The hypergeometric distribution is applicable when selecting

from a finite population without replacement.
Hyper Geometric Distribution
• Sampling without replacement from a finite population
• The number of objects in the population is denoted N.
• Each trial has exactly two possible outcomes, success and failure.
• Trials are not independent
• X is the number of successes in the n trials
• The binomial is an acceptable approximation if n < N/10 (the sample is less than one tenth of the population); otherwise it is not.
17
Hypergeometric Distribution
• Probability function
$P(x) = \dfrac{{}_{A}C_{x} \cdot {}_{N-A}C_{n-x}}{{}_{N}C_{n}}$
– N is population size
– n is sample size
– A is number of successes in population
– x is number of successes in sample
• Mean value
$\mu = \dfrac{A\,n}{N}$
• Variance and standard deviation
$\sigma^2 = \dfrac{A\,(N - A)\,n\,(N - n)}{N^2\,(N - 1)}$
$\sigma = \sqrt{\sigma^2}$
18
The Hypergeometric Distribution Example
• Three different computers are checked from the 10 in the department. 4 of the 10 computers have illegal software loaded.
• What is the probability that 2 of the 3 selected computers have illegal software loaded?
• So, N = 10, n = 3, A = 4, X = 2
$P(X = 2) = \dfrac{\dbinom{A}{X}\dbinom{N - A}{n - X}}{\dbinom{N}{n}} = \dfrac{\dbinom{4}{2}\dbinom{6}{1}}{\dbinom{10}{3}} = \dfrac{(6)(6)}{120} = 0.3$
• The probability that 2 of the 3 selected computers have illegal
software loaded is .30, or 30%.
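A minimal sketch of the hypergeometric probability function, applied to the computer example (N = 10, n = 3, A = 4, X = 2). The function names and argument names are illustrative.

```python
from math import comb


def hypergeometric_pmf(x, N, A, n):
    """P(X = x) when drawing n items without replacement from N, A of which are successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)


def hypergeometric_mean(N, A, n):
    """Mean A * n / N."""
    return A * n / N


p_two_illegal = hypergeometric_pmf(2, N=10, A=4, n=3)
print(p_two_illegal, hypergeometric_mean(N=10, A=4, n=3))
```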
Continuous Probability Distributions
• A continuous random variable is a variable that can assume any value on

a continuum (can assume an uncountable number of values)
– thickness of an item
– time required to complete a task
– temperature of a solution
– height
• These can potentially take on any value, depending only on the ability to
measure precisely and accurately.
Continuous Distributions
• Uniform
• Normal
• Exponential
The Uniform Distribution
• The uniform distribution is a probability distribution that has equal

probabilities for all possible outcomes of the random variable
• Because of its shape it is also called a rectangular distribution

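Uniform Distribution
$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{for all other values} \end{cases}$
[Figure: rectangle of height 1/(b − a) between a and b on the x-axis; Area = 1]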
Uniform Distribution: Mean and Standard Deviation
Mean
$\mu = \dfrac{a + b}{2}$
Standard Deviation
$\sigma = \dfrac{b - a}{\sqrt{12}}$
The Uniform Distribution
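Example: Uniform probability distribution over the range 2 ≤ X ≤ 6:
$f(X) = \dfrac{1}{6 - 2} = .25$ for 2 ≤ X ≤ 6
$\mu = \dfrac{a + b}{2} = \dfrac{2 + 6}{2} = 4$
$\sigma = \sqrt{\dfrac{(b - a)^2}{12}} = \sqrt{\dfrac{(6 - 2)^2}{12}} = 1.1547$
[Figure: rectangle of height .25 between X = 2 and X = 6]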
Uniform Distribution Example
$f(x) = \begin{cases} \dfrac{1}{47 - 41} = \dfrac{1}{6} & \text{for } 41 \le x \le 47 \\ 0 & \text{for all other values} \end{cases}$
[Figure: rectangle of height 1/6 between x = 41 and x = 47; Area = 1]
Uniform Distribution: Mean and Standard Deviation
Mean
$\mu = \dfrac{a + b}{2} = \dfrac{41 + 47}{2} = \dfrac{88}{2} = 44$
Standard Deviation
$\sigma = \dfrac{b - a}{\sqrt{12}} = \dfrac{47 - 41}{\sqrt{12}} = \dfrac{6}{3.464} = 1.732$
Uniform Distribution Probability
$P(x_1 \le X \le x_2) = \dfrac{x_2 - x_1}{b - a}$
$P(42 \le X \le 45) = \dfrac{45 - 42}{47 - 41} = \dfrac{1}{2}$
[Figure: uniform density f(x) on 41 ≤ x ≤ 47 with the region between 42 and 45 shaded; Area = 0.5]
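The uniform formulas above can be collected into three small functions. This is only a sketch with illustrative names; the interval passed to `uniform_prob` is clipped to [a, b] so that values outside the range contribute nothing.

```python
from math import sqrt


def uniform_prob(x1, x2, a, b):
    """P(x1 <= X <= x2) for X uniform on [a, b]."""
    low, high = max(x1, a), min(x2, b)
    return max(high - low, 0) / (b - a)


def uniform_mean(a, b):
    return (a + b) / 2


def uniform_sd(a, b):
    return (b - a) / sqrt(12)


print(uniform_prob(42, 45, 41, 47), uniform_mean(41, 47), round(uniform_sd(41, 47), 3))
```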
Example : Uniform Distribution
• Consider the random variable x representing the flight time of an airplane

traveling from Delhi to Mumbai.
• Suppose the flight time can be any value in the interval from 120 minutes
to 140 minutes.
• Because the random variable x can assume any value in that interval, x is a
continuous rather than a discrete random variable
29
Example : Uniform Distribution contd….
• Let us assume that sufficient actual flight data are available to conclude
that the probability of a flight time within any 1-minute interval is the
same as the probability of a flight time within any other 1-minute interval
contained in the larger interval from 120 to 140 minutes.
• With every 1-minute interval being equally likely, the random variable x is
said to have a uniform probability distribution.
30
Uniform Probability Distribution for Flight time
31
Probability of a flight time between 120 and 130
minutes
32
Exponential Probability Distribution
• The exponential probability distribution is useful in describing the time it
takes to complete a task.
• Exponential random variables can be used to describe:
– Time between vehicle arrivals at a toll booth
– Time required to complete a questionnaire
– Distance between major defects in a highway
• Density Function
$f(x) = \dfrac{1}{\mu}\, e^{-x/\mu}$ for $x > 0$, $\mu > 0$
where: $\mu$ = mean
e = 2.71828
• Suppose that x represents the loading time for a truck at a loading dock and follows such a distribution.
• If the mean, or average, loading time is 15 minutes (μ = 15), the appropriate probability density function for x is $f(x) = \dfrac{1}{15}\, e^{-x/15}$
Exponential Distribution for the loading Dock Example
• Cumulative Probabilities
$P(x \le x_0) = 1 - e^{-x_0/\mu}$
where:
$x_0$ = some specific value of x
Example: Exponential Probability Distribution
• The time between arrivals of cars at a Petrol pump follows an exponential
probability distribution with a mean time between arrivals of 3 minutes.
• The Petrol pump owner would like to know the probability that the time
between two successive arrivals will be 2 minutes or less.

Example: Petrol Pump Problem
$P(x \le 2) = 1 - 2.71828^{-2/3} = 1 - .5134 = .4866$
[Figure: exponential density f(x) with the area for x ≤ 2 shaded; x-axis: Time Between Successive Arrivals (mins.), 1 to 10; y-axis: f(x), .1 to .4]
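The cumulative probability for the petrol pump example follows directly from the exponential formula. A minimal sketch, with illustrative function names, that also evaluates the loading dock density with mean 15 minutes:

```python
from math import exp


def exponential_pdf(x, mean):
    """f(x) = (1 / mean) * e^(-x / mean) for x >= 0."""
    return exp(-x / mean) / mean if x >= 0 else 0.0


def exponential_cdf(x0, mean):
    """P(x <= x0) = 1 - e^(-x0 / mean)."""
    return 1 - exp(-x0 / mean) if x0 >= 0 else 0.0


# Mean time between arrivals is 3 minutes; probability of 2 minutes or less.
p_within_2 = exponential_cdf(2, mean=3)
print(round(p_within_2, 4))
```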
Relationship between the Poisson and Exponential
Distributions
• The Poisson distribution provides an appropriate description of the number of occurrences per interval
• The exponential distribution provides an appropriate description of the length of the interval between occurrences
Mean of Poisson and Mean of Exponential Distributions
• Because the average number of arrivals is 10 cars per hour, the average time between cars arriving is 1/10 hour per car = 0.1 hour, or 6 minutes.
42
The Normal Distribution: Properties
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are equal
• Location is characterized by the mean, μ
• Spread is characterized by the standard deviation, σ
• The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell curve f(X) centered at μ with spread σ; Mean = Median = Mode]
The Normal Distribution: Density Function
The formula for the normal probability density function is
$f(X) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2}$
Where e = the mathematical constant approximated by 2.71828
π = the mathematical constant approximated by 3.14159
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
The Normal Distribution: Shape
By varying the parameters μ and σ, we obtain different normal

distributions
Lecture 10: Probability Distributions-III
Dr. A. Ramesh
IIT ROORKEE
1
The Normal Distribution: Properties
• ‘Bell Shaped’
• Symmetrical
• Mean, Median and Mode are equal
• Location is characterized by the mean, μ
• Spread is characterized by the standard deviation, σ
• The random variable has an infinite theoretical range: −∞ to +∞
[Figure: bell-shaped curve f(X) centered at μ with spread σ; Mean = Median = Mode]
The Normal Distribution: Density Function
The formula for the normal probability density function is
$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2}$
μ = the population mean
σ = the population standard deviation
X = any value of the continuous variable
By varying the parameters μ and σ, we obtain different normal distributions
• Changing μ shifts the distribution left or right.
• Changing σ increases or decreases the spread.
[Figure: normal curve f(X) against X with μ and σ marked]
The Standardized Normal Distribution
• Any normal distribution (with any mean and standard deviation

combination) can be transformed into the standardized normal
distribution (Z).
• Need to transform X units into Z units.
• The standardized normal distribution has a mean of 0 and a standard

deviation of 1.
The Standardized Normal Distribution
• Translate from X to the standardized normal (the “Z” distribution) by
subtracting the mean of X and dividing by its standard deviation:
$Z = \frac{X - \mu}{\sigma}$
The Standardized Normal Distribution: Density
Function
• The formula for the standardized normal probability density function is
$f(Z) = \frac{1}{\sqrt{2\pi}} e^{-Z^2/2}$
Z = any value of the standardized normal distribution
The Standardized Normal Distribution: Shape
• Also known as the “Z” distribution
• Mean is 0
• Standard Deviation is 1
[Figure: standardized normal curve f(Z) centered at Z = 0]
Values above the mean have positive Z-values, values below the mean have
negative Z-values
The Standardized Normal Distribution: Example
• If X is distributed normally with mean of 100 and standard deviation of
50, the Z value for X = 200 is
$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$
• This says that X = 200 is two standard deviations (2 increments of 50
units) above the mean of 100.
The Standardized Normal Distribution: Example
[Figure: one normal curve with two horizontal scales: X (μ = 100, σ = 50) marked at 100 and 200, and Z (μ = 0, σ = 1) marked at 0 and 2.0]
Note that the distribution is the same, only the scale has changed. We
can express the problem in original units (X) or in standardized units (Z)
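The standardization above can be sketched in Python. This is an illustrative example that assumes only the standard library class `statistics.NormalDist`; the helper name `z_score` is not from the lecture. It shows that the probability below X = 200 is the same whether it is computed in X units or in Z units.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Translate a value from X units into Z units."""
    return (x - mu) / sigma


z = z_score(200, 100, 50)
print(z)  # 2.0

# Same area under the curve, expressed on the two scales.
p_in_x_units = NormalDist(mu=100, sigma=50).cdf(200)
p_in_z_units = NormalDist().cdf(z)
print(round(p_in_x_units, 4), round(p_in_z_units, 4))  # 0.9772 0.9772
```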
Normal Probabilities
Probability is measured by the area under the curve
[Figure: normal curve f(X) with the area between a and b marked as P(a ≤ X ≤ b)]
(Note that the probability of any individual value is zero)
Normal Probabilities
The total area under the curve is 1.0, and the curve is symmetric,
so half is above the mean, half is below.
$P(-\infty < X < \mu) = 0.5$
$P(\mu < X < \infty) = 0.5$
$P(-\infty < X < \infty) = 1.0$
[Figure: normal curve f(X) with area 0.5 on each side of the mean]
Normal Probability Tables
Example:
P(Z < 2.00) = .9772
[Figure: standard normal curve with area .9772 to the left of Z = 2.00]
Normal Probability Tables
• The column gives the value of Z to the second decimal point (0.00, 0.01, 0.02, …)
• The row shows the value of Z to the first decimal point (0.0, 0.1, …, 2.0)
• The value within the table gives the probability from Z = −∞ up to the desired Z value
Example: in row 2.0 and column 0.00 the table entry is .9772, so
P(Z < 2.00) = .9772
Finding Normal Probability
Procedure
To find P(a < X < b) when X is distributed normally:
• Draw the normal curve for the problem in terms of X.
• Translate X-values to Z-values.
• Use the Standardized Normal Table.

Finding Normal Probability: Example
• Let X represent the time it takes (in seconds) to download an image file
from the internet.
• Suppose X is normal with mean 8.0 and standard deviation 5.0
• Find P(X < 8.6)
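[Figure: normal curve for X with 8.0 and 8.6 marked]
• Suppose X is normal with mean 8.0 and standard deviation 5.0. Find
P(X < 8.6).
$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$
[Figure: the area P(X < 8.6) on the X scale (μ = 8, σ = 5) equals the area P(Z < 0.12) on the Z scale (μ = 0, σ = 1)]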

Standardized Normal Probability Table (Portion)
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
P(X < 8.6) = P(Z < 0.12) = .5478

• Find P(X > 8.6)…
P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12) = 1.0 - .5478 = .4522
[Figure: standard normal curve with area .5478 to the left of Z = 0.12 and area .4522 to the right]
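The download-time probabilities can be reproduced without a table. The sketch below assumes `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Download time X is normal with mean 8.0 and standard deviation 5.0 seconds.
download_time = NormalDist(mu=8.0, sigma=5.0)

z = (8.6 - 8.0) / 5.0                # standardized value of 8.6
p_less = download_time.cdf(8.6)      # P(X < 8.6)
p_greater = 1.0 - p_less             # P(X > 8.6), the complement

print(round(z, 2), round(p_less, 4), round(p_greater, 4))  # 0.12 0.5478 0.4522
```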
Finding Normal Probability: Between Two Values
• Suppose X is normal with mean 8.0 and standard deviation 5.0.

Find P(8 < X < 8.6)
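Calculate Z-values:
$Z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0$
$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$
P(8 < X < 8.6) = P(0 < Z < 0.12)
[Figure: the area between 8 and 8.6 on the X scale corresponds to the area between 0 and 0.12 on the Z scale]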
Finding Normal Probability
Between Two Values
Standardized Normal Probability Table (Portion)
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
P(8 < X < 8.6) = P(0 < Z < 0.12)
= P(Z < 0.12) – P(Z ≤ 0)
= .5478 - .5000 = .0478
[Figure: standard normal curve with area .0478 between Z = 0.00 and Z = 0.12, and area .5000 to the left of Z = 0.00]
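A probability between two values is the difference of two cumulative probabilities. A minimal sketch, again assuming `statistics.NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

download_time = NormalDist(mu=8.0, sigma=5.0)

# P(8 < X < 8.6) = P(X < 8.6) - P(X < 8)
p_between = download_time.cdf(8.6) - download_time.cdf(8.0)
print(round(p_between, 4))  # 0.0478
```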
Given Normal Probability: Find the X Value
• Let X represent the time it takes (in seconds) to download an image file
from the internet.
• Suppose X is normal with mean 8.0 and standard deviation 5.0
• Find X such that 20% of download times are less than X.
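[Figure: normal curve with area .2000 to the left of the unknown X value; 8.0 on the X scale corresponds to 0 on the Z scale]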
Given Normal Probability, Find the X Value
• First, find the Z value that corresponds to the known probability
using the table.
Z      ….   .03    .04    .05
-0.9   ….  .1762  .1736  .1711
-0.8   ….  .2033  .2005  .1977
-0.7   ….  .2327  .2296  .2266
The table entry closest to .2000 is .2005, which gives Z = -0.84.
Given Normal Probability,
Find the X Value
• Second, convert the Z value to X units using
the following formula.
$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$
So 20% of the download times from the distribution with mean 8.0
and standard deviation 5.0 are less than 3.80 seconds.
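The reverse lookup (from a probability back to an X value) can be sketched with the inverse cumulative distribution function. This assumes `statistics.NormalDist.inv_cdf` from the standard library; because it does not round Z to two decimals, the result differs slightly from the table-based 3.80.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

# Z value with 20% of the area to its left (the table gives about -0.84).
z = NormalDist().inv_cdf(0.20)

# Convert back to X units: X = mu + Z * sigma.
x = mu + z * sigma
print(round(z, 2), round(x, 2))  # -0.84 3.79
```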
Assessing Normality
• It is important to evaluate how well the data set is approximated by a normal
distribution.
• Normally distributed data should approximate the theoretical normal
distribution:
– The normal distribution is bell shaped (symmetrical) where the mean is
equal to the median.
– The empirical rule applies to the normal distribution.
– The interquartile range of a normal distribution is 1.33 standard deviations.
Assessing Normality
• Construct charts or graphs
– For small- or moderate-sized data sets, do the stem-and-leaf display
and box-and-whisker plot look symmetric?
– For large data sets, does the histogram or polygon appear bell-shaped?
• Compute descriptive summary measures
– Do the mean, median and mode have similar values?
– Is the interquartile range approximately 1.33 σ?
– Is the range approximately 6 σ?
Assessing Normality
• Observe the distribution of the data set

– Do approximately 2/3 of the observations lie within mean ± 1 standard
deviation?
– Do approximately 80% of the observations lie within mean ± 1.28
standard deviations?
– Do approximately 95% of the observations lie within mean ± 2 standard
deviations?
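The descriptive checks listed above can be sketched in Python. This is an illustrative example, not the lecture's own code: the function name `normality_checks` is hypothetical, and the data are simulated normal values (mean 50, standard deviation 10) chosen only for demonstration.

```python
import random
import statistics


def normality_checks(data):
    """Compute the summary ratios used to judge approximate normality."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles

    def share_within(k):
        # Fraction of observations within mean +/- k standard deviations.
        return sum(abs(x - mean) <= k * sd for x in data) / len(data)

    return {
        "iqr_over_sd": (q3 - q1) / sd,                 # about 1.33 if normal
        "range_over_sd": (max(data) - min(data)) / sd,  # roughly 6 if normal
        "within_1_sd": share_within(1.0),              # about 2/3
        "within_1.28_sd": share_within(1.28),          # about 80%
        "within_2_sd": share_within(2.0),              # about 95%
    }


random.seed(1)
sample = [random.gauss(50, 10) for _ in range(5000)]  # simulated data
for name, value in normality_checks(sample).items():
    print(name, round(value, 3))
```

For data that are far from normal (for example strongly skewed data), these ratios would be expected to drift away from the benchmark values.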
Z Table
Second Decimal Place in Z
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
2.00 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
3.00 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.40 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
3.50 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998 0.4998
Table Lookup of a
Standard Normal Probability
$P(0 \le Z \le 1) = 0.3413$
Z 0.00 0.01 0.02
0.00 0.0000 0.0040 0.0080

0.10 0.0398 0.0438 0.0478
0.20 0.0793 0.0832 0.0871
1.00 0.3413 0.3438 0.3461
1.10 0.3643 0.3665 0.3686

1.20 0.3849 0.3869 0.3888
[Figure: standard normal curve over Z = -3 to 3]
Applying the Z Formula
X is normally distributed with μ = 485 and σ = 105
$P(485 \le X \le 600) = P(0 \le Z \le 1.10) = .3643$
For X = 485,
$Z = \frac{X - \mu}{\sigma} = \frac{485 - 485}{105} = 0$
For X = 600,
$Z = \frac{X - \mu}{\sigma} = \frac{600 - 485}{105} = 1.10$
Z     0.00    0.01    0.02
0.00  0.0000  0.0040  0.0080
0.10  0.0398  0.0438  0.0478
1.00  0.3413  0.3438  0.3461
1.10  0.3643  0.3665  0.3686
1.20  0.3849  0.3869  0.3888
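The Z-formula example with μ = 485 and σ = 105 can be checked in Python. The sketch assumes `statistics.NormalDist`; Z is rounded to two decimals to mirror the table lookup, and the table used in that example gives the area from 0 to Z, which is the cumulative probability minus 0.5.

```python
from statistics import NormalDist

mu, sigma = 485, 105

z_low = (485 - mu) / sigma
z_high = round((600 - mu) / sigma, 2)  # 1.10, as read from the table

standard_normal = NormalDist()
p = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(z_low, z_high, round(p, 4))  # 0.0 1.1 0.3643
```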
Lecture 11: Python demo for Distribution
Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT STUDIES
1
Agenda
• Different numerical problems are solved for the following Distribution
using Python:
– Discrete
• Binomial
• Poisson
• Hypergeometric
– Continuous
• Uniform
• Exponential
• Normal
2
THANK YOU
3
Lecture 11: Sampling and Sampling Distribution
Dr. A. Ramesh
DEPARTMENT OF MANAGEMENT STUDIES
IIT ROORKEE
1
Lecture Objectives
After completing this lecture, you should be able to:
• Describe a simple random sample and why sampling is important
• Explain the difference between descriptive and inferential statistics
• Define the concept of a sampling distribution
• Determine the mean and standard deviation for the sampling distribution
of the sample mean, $\bar{X}$
2
Lecture Objectives
• Describe the Central Limit Theorem and its importance

• Determine the mean and standard deviation for the sampling distribution
of the sample proportion, $\hat{p}$
• Describe sampling distributions of sample variances
3
Descriptive vs Inferential Statistics
• Descriptive statistics
– Collecting, presenting, and describing data
• Inferential statistics
– Drawing conclusions and/or making decisions concerning a population
based only on sample data
4
Populations and Samples
• A Population is the set of all items or individuals of interest

– Examples: All likely voters in the next election
All parts produced today
All sales receipts for November
• A Sample is a subset of the population

– Examples: 1000 voters selected at random for interview
A few parts selected for destructive testing
Random receipts selected for audit
5
Population vs. Sample
[Figure: a population shown as the letters a to z, and a sample drawn from it: b, c, g, i, n, o, r, u, y]
6
Why Sample?
• Less time consuming than a census
• Less costly to administer than a census
• It is possible to obtain statistical results of a sufficiently high precision
based on samples.
• Because the research process is sometimes destructive, the sample can
save product
• If accessing the population is impossible, sampling is the only option
7
Reasons for Taking a Census
• Eliminate the possibility that a random sample is not representative of the

population
• The person authorizing the study is uncomfortable with sample

information
Random Versus Nonrandom Sampling
• Random sampling
• Every unit of the population has the same probability of being included in the
sample.
• A chance mechanism is used in the selection process.
• Eliminates bias in the selection process
• Also known as probability sampling
• Nonrandom Sampling
• Not every unit of the population has the same probability of being
included in the sample.
• Open to selection bias
• Not an appropriate data collection method for most statistical methods
• Also known as non-probability sampling
Random Sampling Techniques
• Simple Random Sample
• Stratified Random Sample
– Proportionate
– Disproportionate
• Systematic Random Sample
• Cluster (or Area) Sampling

Simple Random Samples
• Every object in the population has an equal chance of being selected

• Objects are selected independently
• Samples can be obtained from a table of random numbers or computer
random number generators
• A simple random sample is the ideal against which other sample methods
are compared
11
Simple Random Sample:
Numbered Population Frame
01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur
Simple Random Sampling:
Random Number Table
9 9 4 3 7 8 7 9 6 1 4 5 7 3 7 3 7 5 5 2 9 7 9 6 9 3 9 0 9 4 3 4 4 7 5 3 1 6 1 8
5 0 6 5 6 0 0 1 2 7 6 8 3 6 7 6 6 8 8 2 0 8 1 5 6 8 0 0 1 6 7 8 2 2 4 5 8 3 2 6
8 0 8 8 0 6 3 1 7 1 4 2 8 7 7 6 6 8 3 5 6 0 5 1 5 7 0 2 9 6 5 0 0 2 6 4 5 5 8 7
8 6 4 2 0 4 0 8 5 3 5 3 7 9 8 8 9 4 5 4 6 8 1 3 0 9 1 2 5 3 8 8 1 0 4 7 4 3 1 9
6 0 0 9 7 8 6 4 3 6 0 1 8 6 9 4 7 7 5 8 8 9 5 3 5 9 9 4 0 0 4 8 2 6 8 3 0 6 0 6
5 2 5 8 7 7 1 9 6 5 8 5 4 5 3 4 6 8 3 4 0 0 9 9 1 9 9 7 2 9 7 6 9 4 8 1 5 9 4 1
8 9 1 5 5 9 0 5 5 3 9 0 6 8 9 4 8 6 3 7 0 7 9 5 5 4 7 0 6 2 7 1 1 8 2 6 4 4 9 3
Simple Random Sample:
Sample Members
01 Andhra Pradesh      11 Madhya Pradesh
02 Himachal Pradesh    12 Uttar Pradesh
03 Gujarat             13 Bihar
04 Maharashtra         14 Rajasthan
05 Nagaland            15 J & K
06 Goa                 16 Tamil Nadu
07 West Bengal         17 Karnataka
08 Haryana             18 Kerala
09 Punjab              19 Orissa
10 Delhi               20 Manipur
• N = 20
• n=4
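A simple random sample of n = 4 from the N = 20 numbered frame can also be drawn with a computer random number generator instead of a random number table. The sketch below uses `random.sample` from the standard library; the seed value 42 is an arbitrary choice made only so the run is repeatable.

```python
import random

# Numbered population frame (N = 20), in the order 01 to 20.
frame = [
    "Andhra Pradesh", "Himachal Pradesh", "Gujarat", "Maharashtra",
    "Nagaland", "Goa", "West Bengal", "Haryana", "Punjab", "Delhi",
    "Madhya Pradesh", "Uttar Pradesh", "Bihar", "Rajasthan", "J & K",
    "Tamil Nadu", "Karnataka", "Kerala", "Orissa", "Manipur",
]

rng = random.Random(42)             # seeded generator (arbitrary seed)
sample = rng.sample(frame, k=4)     # n = 4, drawn without replacement
print(sample)
```

Every unit of the frame has the same chance of selection, which is the defining property of a simple random sample.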
Stratified Random Sample
• Population is divided into non-overlapping subpopulations called strata

• A random sample is selected from each stratum
• Potential for reducing sampling error
• Proportionate -- the percentage of the sample taken from each stratum
is proportionate to the percentage that each stratum is within the
population
• Disproportionate -- proportions of the strata within the sample are
different from the proportions of the strata within the population
Stratified Random Sample:
Population of FM Radio Listeners
Stratified by Age
• Strata: 20 - 30 years old, 30 - 40 years old, 40 - 50 years old
• Homogeneous (alike) within each stratum
• Heterogeneous (different) between strata
Systematic Sampling
• Convenient and relatively easy to administer
• Population elements are an ordered sequence (at least, conceptually).
• The first sample element is selected randomly from the first k population
elements.
• Thereafter, sample elements are selected at a constant interval, k, from the
ordered sequence frame.
$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval
Systematic Sampling: Example
• Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N =
10,000).
• A sample of fifty (n = 50) purchase orders is needed for an audit.
• k = 10,000/50 = 200
• First sample element randomly selected from the first 200 purchase
orders. Assume the 45th purchase order was selected.
• Subsequent sample elements: 245, 445, 645, . . .
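The purchase-order example can be sketched as a small function. The name `systematic_sample` and its parameters are illustrative; the starting element 45 is the one assumed in the example, and passing no start lets the function pick one at random from the first k elements.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Return the serial numbers chosen by systematic sampling."""
    k = population_size // sample_size       # selection interval k = N / n
    if start is None:
        start = random.randint(1, k)         # random start within the first k
    return [start + i * k for i in range(sample_size)]


orders = systematic_sample(10_000, 50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```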
Cluster Sampling
• Population is divided into non-overlapping clusters or areas
• Each cluster is a miniature of the population.
• A subset of the clusters is selected randomly for the sample.
• If the number of elements in the subset of clusters is larger than the

desired value of n, these clusters may be subdivided to form a new
set of clusters and subjected to a random selection process.
Cluster Sampling
Advantages
• More convenient for geographically dispersed populations
• Reduced travel costs to contact sample elements
• Simplified administration of the survey
• Unavailability of sampling frame prohibits using other random
sampling methods
Disadvantages
• Statistically less efficient when the cluster elements are similar
• Costs and problems of statistical analysis are greater than for simple
random sampling
Nonrandom Sampling
• Convenience Sampling: Sample elements are selected for the convenience
of the researcher
• Judgment Sampling: Sample elements are selected by the judgment of the

researcher
• Quota Sampling: Sample elements are selected until the quota controls are
satisfied
• Snowball Sampling: Survey subjects are selected based on referral from

other survey respondents
Errors
• Data from nonrandom samples are not appropriate for analysis by inferential
statistical methods.
• Sampling Error occurs when the sample is not representative of the
population
• Non-sampling Errors
• Missing Data, Recording, Data Entry, and Analysis Errors
• Poorly conceived concepts , unclear definitions, and defective questionnaires
• Response errors occur when people do not know, will not say, or overstate in their
answers
Sampling Distribution of $\bar{x}$
Proper analysis and interpretation of a sample statistic
requires knowledge of its distribution.
[Figure: process of inferential statistics: select a random sample from the population (parameter μ), calculate the sample statistic $\bar{x}$, and use $\bar{x}$ to estimate μ]
• Making statements about a population by examining sample results

Sample statistics (known) → Inference → Population parameters (unknown, but can be estimated
from sample evidence)
Drawing conclusions and/or making decisions concerning a
population based on sample results.
• Estimation
– e.g., Estimate the population mean weight
using the sample mean weight
• Hypothesis Testing
– e.g., Use sample evidence to test the claim
that the population mean weight is 120
pounds
25
Sampling Distributions
• A sampling distribution is a distribution of all of the possible values of a

statistic for a given size sample selected from a population
26
Types of sampling distributions
Sampling Distributions:
• Sampling Distribution of Sample Mean
• Sampling Distribution of Sample Proportion
• Sampling Distribution of Sample Variance
Sampling Distributions of Sample Means
Developing a Sampling Distribution
• Assume there is a population … A, B, C, D
• Population size N = 4
• Random variable, X, is age of individuals
• Values of X: 18, 20, 22, 24 (years)
(continued)
Summary Measures for the Population Distribution:
$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$
[Figure: Uniform Distribution, P(x) = .25 at each of x = 18, 20, 22, 24 (individuals A, B, C, D)]
(continued)
Now consider all possible samples of size n = 2
16 possible samples (sampling with replacement)
1st    2nd Observation
Obs    18     20     22     24
18     18,18  18,20  18,22  18,24
20     20,18  20,20  20,22  20,24
22     22,18  22,20  22,22  22,24
24     24,18  24,20  24,22  24,24
16 Sample Means
1st    2nd Observation
Obs    18  20  22  24
18     18  19  20  21
20     19  20  21  22
22     20  21  22  23
24     21  22  23  24
31
(continued)
• Sampling Distribution of All Sample Means
16 Sample Means
1st    2nd Observation
Obs    18  20  22  24
18     18  19  20  21
20     19  20  21  22
22     20  21  22  23
24     21  22  23  24
[Figure: Sample Means Distribution, $P(\bar{X})$ against $\bar{X}$ = 18, 19, …, 24 (no longer uniform)]
32
(continued)
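• Summary Measures of this Sampling Distribution:
$E(\bar{X}) = \frac{\sum \bar{X}_i}{N} = \frac{18 + 19 + 20 + \cdots + 24}{16} = 21 = \mu$
$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$
(here N = 16 is the number of possible samples)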
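The whole sampling distribution for this small population can be enumerated directly. The sketch below uses only the standard library (`itertools.product` for the 16 ordered samples drawn with replacement); the variable names are illustrative.

```python
from itertools import product
from statistics import fmean, pstdev

ages = [18, 20, 22, 24]                      # population, N = 4

# All 16 ordered samples of size n = 2 (with replacement) and their means.
sample_means = [fmean(pair) for pair in product(ages, repeat=2)]

print(len(sample_means), fmean(ages), round(pstdev(ages), 3))   # 16 21.0 2.236
print(fmean(sample_means), round(pstdev(sample_means), 2))      # 21.0 1.58
```

The mean of the sample means equals the population mean, while their standard deviation (1.58) is smaller than the population standard deviation (2.236).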
33
Comparing the Population with its Sampling
Distribution
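Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: P(X) for the population (X = 18, 20, 22, 24; individuals A, B, C, D) beside $P(\bar{X})$ for the sample means ($\bar{X}$ = 18 to 24)]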
34
1,800 Randomly Selected Values from an Exponential Distribution
[Figure: histogram of the 1,800 values; x-axis: X (0 to 10), y-axis: frequency]
Means of 60 Samples (n = 2)
[Figure: histogram of the sample means; x-axis: $\bar{x}$ (0.00 to 4.00), y-axis: frequency]
[Figures: two further histograms of sample means; x-axis: $\bar{x}$, y-axis: frequency]
1,800 Randomly Selected Values from a Uniform Distribution
[Figure: histogram of the 1,800 values; x-axis: X (0.0 to 5.0), y-axis: frequency]
[Figures: three histograms of sample means; x-axis: $\bar{x}$ (1.00 to 4.25), y-axis: frequency]
Expected Value of Sample Mean
• Let X1, X2, . . . Xn represent a random sample from a population
• The sample mean value of these observations is defined as
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
43
Standard Error of the Mean
• Different samples of the same size from the same population will yield
different sample means
• A measure of the variability in the mean from sample to sample is given by
the Standard Error of the Mean:
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
• Note that the standard error of the mean decreases as the sample size
increases
44
If sample values are not independent
(continued)
• If the sample size n is not a small fraction of the population size N,

then individual sample members are not distributed independently
of one another
• Thus, observations are not selected independently
• A correction is made to account for this:
$Var(\bar{X}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$ or $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
45
If the Population is Normal
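• If a population is normal with mean μ and standard deviation σ, the
sampling distribution of $\bar{X}$ is also normally distributed with
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
• If the sample size n is not small relative to the population size N, then
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$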
46
Z-value for Sampling Distribution of the Mean
• Z-value for the sampling distribution of $\bar{X}$:
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
where: $\bar{X}$ = sample mean
μ = population mean
$\sigma_{\bar{X}}$ = standard error of the mean
47
Sampling Distribution Properties
$\mu_{\bar{x}} = \mu$ (i.e. $\bar{x}$ is unbiased)
[Figure: Normal Population Distribution centered at μ above the Normal Sampling Distribution of $\bar{x}$, which has the same mean]
48
Sampling Distribution Properties
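• For sampling with replacement:
As n increases, $\sigma_{\bar{x}}$ decreases
[Figure: sampling distributions of $\bar{x}$ centered at μ, narrower for the larger sample size than for the smaller sample size]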
49
If the Population is not Normal- Central Limit Theorem
We can apply the Central Limit Theorem:
– Even if the population is not normal,

– sample means from the population will be approximately normal as
long as the sample size is large enough.
Properties of the sampling distribution:
$\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
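The Central Limit Theorem can be illustrated by simulation. The sketch below is an assumption-laden demonstration rather than the lecture's own code: it draws from an exponential population with mean 1 (so σ = 1), uses 5,000 repetitions per sample size, and fixes an arbitrary seed for repeatability.

```python
import random
import statistics

random.seed(0)
POPULATION_MEAN = 1.0   # exponential population: mean 1, standard deviation 1


def sample_means(n, repetitions=5000):
    """Means of `repetitions` random samples of size n from the population."""
    return [
        statistics.fmean([random.expovariate(1 / POPULATION_MEAN) for _ in range(n)])
        for _ in range(repetitions)
    ]


for n in (2, 5, 30):
    means = sample_means(n)
    # Columns: n, mean of sample means, their std. deviation, sigma / sqrt(n)
    print(n,
          round(statistics.fmean(means), 3),
          round(statistics.stdev(means), 3),
          round(1.0 / n ** 0.5, 3))
```

The mean of the sample means stays near μ for every n, while their spread tracks σ/√n; a histogram of `means` for n = 30 would look close to bell-shaped even though the population is strongly skewed.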
50
Central Limit Theorem
As the sample size n gets large enough, the sampling distribution becomes
almost normal regardless of shape of population
51
If the Population is not Normal
(continued)
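Sampling distribution properties:
• Central Tendency: $\mu_{\bar{x}} = \mu$
• Variation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
[Figure: a non-normal Population Distribution with mean μ above the Sampling Distribution of $\bar{x}$ (becomes normal as n increases), which is narrower for the larger sample size than for the smaller sample size]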
52
How Large is Large Enough?
• For most distributions, n > 25 will give a sampling distribution that is

nearly normal
• For normal population distributions, the sampling distribution of the mean
is always normally distributed
53
Example
• Suppose a large population has mean μ = 8 and standard deviation σ = 3.

Suppose a random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.8 and 8.2?
54
Example
Solution:
• Even if the population is not normally distributed, the central limit
theorem can be used (n > 25)
• … so the sampling distribution of $\bar{x}$ is approximately normal
• … with mean $\mu_{\bar{x}} = 8$
• … and standard deviation $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$
55
Example (continued)
Solution (continued)
$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$
[Figure: the area between 7.8 and 8.2 under the sampling distribution ($\mu_{\bar{X}} = 8$) is standardized to the area between -0.4 and 0.4 under the standard normal distribution ($\mu_z = 0$): .1554 + .1554]
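The example can be verified numerically. With $\sigma_{\bar{x}} = 0.5$, the limits standardize to (7.8 − 8)/0.5 = −0.4 and (8.2 − 8)/0.5 = 0.4. The sketch assumes `statistics.NormalDist` and treats the sampling distribution as normal, as the Central Limit Theorem suggests for n = 36.

```python
from statistics import NormalDist

mu, sigma, n = 8, 3, 36
se = sigma / n ** 0.5                      # standard error of the mean

sampling_dist = NormalDist(mu=mu, sigma=se)
z = (8.2 - mu) / se                        # upper standardized limit
p = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(se, round(z, 2), round(p, 4))  # 0.5 0.4 0.3108
```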
56
Distribution of Sample Mean, proportion,
and variance
Dr. A. Ramesh
IIT ROORKEE
1
2
Acceptance Intervals
Goal: determine a range within which sample means are likely to occur, given a
population mean and variance
• By the Central Limit Theorem, we know that the distribution of $\bar{X}$ is
approximately normal if n is large enough, with mean μ and standard
deviation $\sigma_{\bar{X}}$
• Let $z_{\alpha/2}$ be the z-value that leaves area α/2 in the upper tail of the normal
distribution (i.e., the interval $-z_{\alpha/2}$ to $z_{\alpha/2}$ encloses probability 1 – α)
• Then $\mu \pm z_{\alpha/2}\,\sigma_{\bar{X}}$
is the interval that includes $\bar{X}$ with probability 1 – α
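An acceptance interval can be computed in a few lines. This is a sketch under stated assumptions: the function name `acceptance_interval` is hypothetical, and the usage values (μ = 8, σ = 3, n = 36, α = 0.05) are borrowed from the earlier sample-mean example purely for illustration.

```python
from statistics import NormalDist


def acceptance_interval(mu, sigma, n, alpha):
    """Range mu +/- z_(alpha/2) * sigma / sqrt(n) that contains the
    sample mean with probability 1 - alpha."""
    se = sigma / n ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mu - z * se, mu + z * se


low, high = acceptance_interval(mu=8, sigma=3, n=36, alpha=0.05)
print(round(low, 2), round(high, 2))  # 7.02 8.98
```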
3
Sampling Distributions of Sample Proportions
P = the proportion of the population having some characteristic
• Sample proportion (p̂) provides an estimate of P:
$\hat{p} = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$
• 0 ≤ p̂ ≤ 1
• p̂ has a binomial distribution, but can be approximated by a normal
distribution when nP(1 – P) > 5
5
Sampling Distribution of $\hat{p}$
• Normal approximation:
Properties: $E(\hat{p}) = P$ and $\sigma_{\hat{p}}^2 = Var\left(\frac{X}{n}\right) = \frac{P(1 - P)}{n}$
(where P = population proportion)
[Figure: Sampling Distribution, $P(\hat{p})$ against $\hat{p}$ from 0 to 1]
6
7
Z-Value for Proportions
Standardize $\hat{p}$ to a Z value with the formula:
$Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}} = \frac{\hat{p} - P}{\sqrt{\frac{P(1 - P)}{n}}}$
8
Example
• If the true proportion of voters who support Proposition A is

P = .4, what is the probability that a sample of size 200 yields
a sample proportion between .40 and .45?
• i.e.:
if P = .4 and n = 200, what is
P(.40 ≤ p̂ ≤ .45) ?
9
Example (continued)
• if P = .4 and n = 200, what is P(.40 ≤ p̂ ≤ .45) ?
Find $\sigma_{\hat{p}}$:
$\sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{.4(1 - .4)}{200}} = .03464$
Convert to standard normal:
$P(.40 \le \hat{p} \le .45) = P\left(\frac{.40 - .40}{.03464} \le Z \le \frac{.45 - .40}{.03464}\right) = P(0 \le Z \le 1.44)$
10
Example
(continued)
if P = .4 and n = 200, what is P(.40 ≤ p̂ ≤ .45) ?

Use standard normal table: P(0 ≤ Z ≤ 1.44) = .4251
[Figure: the area between .40 and .45 under the sampling distribution of $\hat{p}$ is standardized to the area .4251 between 0 and 1.44 under the standardized normal distribution]
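The Proposition A example can be reproduced in Python. The sketch assumes `statistics.NormalDist` and rounds Z to two decimals to mirror the table lookup; the variable names are illustrative.

```python
import math
from statistics import NormalDist

P, n = 0.4, 200

# The normal approximation is reasonable here because nP(1 - P) = 48 > 5.
sigma_p = math.sqrt(P * (1 - P) / n)       # standard error of the proportion
z = round((0.45 - P) / sigma_p, 2)         # 1.44, as used with the table

standard_normal = NormalDist()
p = standard_normal.cdf(z) - standard_normal.cdf(0.0)
print(round(sigma_p, 5), z, round(p, 4))  # 0.03464 1.44 0.4251
```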
11
Sampling Distributions of Sample Variance
Sample Variance
• Let x1, x2, . . . , xn be a random sample from a population. The
sample variance is
$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
• the square root of the sample variance is called the sample

standard deviation
• the sample variance is different for different random samples from
the same population
13
Sampling Distribution of Sample Variances
• The sampling distribution of $s^2$ has mean $\sigma^2$
$E(s^2) = \sigma^2$
• If the population distribution is normal then
$\frac{(n - 1)s^2}{\sigma^2}$
has a $\chi^2$ distribution with n – 1 degrees of freedom
14
15
The Chi-square Distribution
• The chi-square distribution is a family of distributions, depending on

degrees of freedom: d.f. = n – 1
[Figure: three chi-square density curves, each plotted against $\chi^2$ from 0 to 28]
16
Degrees of Freedom (df)
Idea: Number of observations that are free to vary after sample
mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
Let X1 = 7
Let X2 = 8
What is X3?
If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2
(2 values can be any numbers, but the third is not free to vary for a
given mean)
17
Chi-square Example
• A commercial freezer must hold a selected temperature with little
variation. Specifications call for a standard deviation of no more than 4
degrees (a variance of 16 degrees²).
• A sample of 14 freezers is to be tested
• What is the upper limit (K) for the sample variance such that the
probability of exceeding this limit, given that the population standard
deviation is 4, is less than 0.05?
18
Finding the Chi-square Value
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$ is chi-square distributed with (n – 1) = 13
degrees of freedom
• Use the chi-square distribution with area 0.05 in the
upper tail:
$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
[Figure: chi-square density with upper-tail probability α = .05 to the right of $\chi^2_{13} = 22.36$]
19
Chi-square Example
(continued)
$\chi^2_{13} = 22.36$ (α = .05 and 14 – 1 = 13 d.f.)
So: $P(s^2 > K) = P\left(\frac{(n - 1)s^2}{16} > \chi^2_{13}\right) = 0.05$
or $\frac{(n - 1)K}{16} = 22.36$ (where n = 14)
so $K = \frac{(22.36)(16)}{(14 - 1)} = 27.52$
If s2 from the sample of size n = 14 is greater than 27.52, there is

strong evidence to suggest the population variance exceeds 16.
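The limit K can be computed and then sanity-checked by simulation. This sketch uses only the standard library, so the chi-square critical value 22.36 is taken from the lecture's table rather than computed; the helper `sample_variance`, the seed, and the 20,000 simulated samples are illustrative choices.

```python
import random

CHI2_13_UPPER_05 = 22.36   # table value quoted above (13 d.f., alpha = .05)
n, sigma_squared = 14, 16

K = CHI2_13_UPPER_05 * sigma_squared / (n - 1)
print(round(K, 2))  # 27.52


def sample_variance(values):
    """Sample variance with the n - 1 divisor."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Simulate samples of 14 freezers from a normal population with sigma = 4
# and estimate how often the sample variance exceeds K.
random.seed(7)
trials = 20_000
exceed = sum(
    sample_variance([random.gauss(0, 4) for _ in range(n)]) > K
    for _ in range(trials)
)
print(exceed / trials)  # should be close to 0.05
```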
20
Summary
• Introduced sampling distributions
• Described the sampling distribution of sample means
– For normal populations
– Using the Central Limit Theorem
• Described the sampling distribution of sample proportions
• Introduced the chi-square distribution
• Examined sampling distributions for sample variances
• Calculated probabilities using sampling distributions
21
Thank You
22
Confidence Interval Estimation: Single
Population
Dr. A. Ramesh
IIT ROORKEE
1
Goals
After completing this lecture, you should be able to:
• Distinguish between a point estimate and a confidence interval estimate
• Construct and interpret a confidence interval estimate for a single
population mean using both the Z and t distributions
• Form and interpret a confidence interval estimate for a single population
proportion
• Create confidence interval estimates for the variance of a normal
population
2
Confidence Intervals
• Confidence Intervals for the Population Mean, μ
– when Population Variance σ2 is Known
– when Population Variance σ2 is Unknown
• Confidence Intervals for the Population Proportion, P (large samples)
• Confidence interval estimates for the variance of a normal population
3
Definitions
• An estimator of a population parameter is

– a random variable that depends on sample information . . .
– whose value provides an approximation to this unknown parameter
• A specific value of that random variable is called an estimate
4
Point and Interval Estimates
• A point estimate is a single number,
• a confidence interval provides additional information about
variability
[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval]
5
Point Estimates
We can estimate a Population Parameter … with a Sample Statistic (a Point Estimate)
• Mean: μ is estimated by $\bar{x}$
• Proportion: P is estimated by $\hat{p}$
6
Unbiasedness
• A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the
parameter θ if the expected value, or mean, of the sampling
distribution of $\hat{\theta}$ is θ,
$E(\hat{\theta}) = \theta$
• Examples:
– The sample mean $\bar{x}$ is an unbiased estimator of μ
– The sample variance s2 is an unbiased estimator of σ2
– The sample proportion p̂ is an unbiased estimator of P
7
Unbiasedness
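(continued)
• $\hat{\theta}_1$ is an unbiased estimator, $\hat{\theta}_2$ is biased:
[Figure: sampling distributions of $\hat{\theta}_1$ and $\hat{\theta}_2$; the distribution of $\hat{\theta}_1$ is centered at θ, the distribution of $\hat{\theta}_2$ is not]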
8
Bias
• Let $\hat{\theta}$ be an estimator of θ
• The bias in $\hat{\theta}$ is defined as the difference between its mean and θ
$Bias(\hat{\theta}) = E(\hat{\theta}) - \theta$
• The bias of an unbiased estimator is 0
9
Most Efficient Estimator
• Suppose there are several unbiased estimators of θ
• The most efficient estimator or the minimum variance unbiased estimator
of θ is the unbiased estimator with the smallest variance
• Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ, based on the same number
of sample observations. Then,
– $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$
– The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of
their variances:
$\text{Relative Efficiency} = \frac{Var(\hat{\theta}_2)}{Var(\hat{\theta}_1)}$
10
• How much uncertainty is associated with a point estimate of a population

parameter?
• An interval estimate provides more information about a population

characteristic than does a point estimate
• Such interval estimates are called confidence intervals
11
Confidence Interval Estimate
• An interval gives a range of values:

– Takes into consideration variation in sample statistics from sample to
sample
– Based on observation from 1 sample
– Gives information about closeness to unknown population
parameters
– Stated in terms of level of confidence
• Can never be 100% confident
12
Confidence Interval and Confidence Level
• If P(a < θ < b) = 1 - α then the interval from a to b is called a
100(1 - α)% confidence interval of θ.
• The quantity (1 - α) is called the confidence level of the interval (α
between 0 and 1)
– In repeated samples of the population, the true value of the
parameter θ would be contained in 100(1 - α)% of intervals
calculated this way.
– The confidence interval calculated in this manner is written as a < θ <
b with 100(1 - α)% confidence
13
Estimation Process
[Figure: a random sample with mean $\bar{X}$ = 50 is drawn from a population whose mean, μ, is unknown, leading to the statement “I am 95% confident that μ is between 40 & 60.”]
14
Confidence Level, (1 - α)
(continued)
• Suppose confidence level = 95%
• Also written (1 - α) = 0.95
• A relative frequency interpretation:
– From repeated samples, 95% of all the confidence intervals that can
be constructed will contain the unknown true parameter
• A specific interval either will contain or will not contain the true
parameter
15
General Formula
• The general formula for all confidence intervals is:
Point Estimate ± (Reliability Factor)(Standard Error)
• The value of the reliability factor depends on the desired level of confidence
16
Confidence Intervals
• Population Mean: σ² Known, σ² Unknown
• Population Proportion
• Population Variance
17
Confidence Interval for μ (σ2 Known)
• Assumptions
– Population variance σ2 is known
– Population is normally distributed
– If population is not normal, use large sample
• Confidence interval estimate:
$\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
(where $z_{\alpha/2}$ is the normal distribution value for a probability of α/2 in each tail)
18
Margin of Error
• The confidence interval,
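$\bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
• Can also be written as $\bar{x} \pm ME$, where ME is called the margin of error:
$ME = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$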
19
Reducing the Margin of Error
$ME = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
The margin of error can be reduced if
• The population standard deviation can be reduced (σ↓)
• The sample size is increased (n↑)
• The confidence level is decreased, (1 – α) ↓
20
Finding the Reliability Factor, $z_{\alpha/2}$
• Consider a 95% confidence interval: 1 - α = .95, so α/2 = .025 in each tail
[Figure: standard normal curve with area .025 in each tail. Z units: z = -1.96, 0, z = 1.96. X units: lower confidence limit, point estimate, upper confidence limit.]
• Find z.025 = 1.96 from the standard normal distribution table

21
Common Levels of Confidence
• Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level    Confidence Coefficient, 1 − α    z_{α/2} value
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.58
99.8%               .998                             3.09
99.9%               .999                             3.29
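The reliability factors in this table can be reproduced from the standard normal distribution. The following is a minimal sketch using only the Python standard library; the function name `reliability_factor` is an illustrative choice, not something defined in the lecture.

```python
from statistics import NormalDist

def reliability_factor(confidence_level: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # inv_cdf gives the z value with area (1 - alpha/2) to its left
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z = {reliability_factor(level):.3f}")
```

Each printed value should agree with the table above after rounding.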
22
Intervals and Level of Confidence
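[Figure: sampling distribution of the mean, centered at $\mu_{\bar{x}} = \mu$, with area 1 − α in the middle and α/2 in each tail. Intervals extend from $LCL = \bar{x} - z\,\sigma/\sqrt{n}$ to $UCL = \bar{x} + z\,\sigma/\sqrt{n}$. 100(1 − α)% of intervals constructed contain μ; 100(α)% do not.]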
23
Example
• A sample of 11 circuits from a large normal population has a mean

resistance of 2.20 ohms. We know from past testing that the population
standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the
population.
24
Example
(continued)
• A sample of 11 circuits from a large normal population has a mean resistance

of 2.20 ohms. We know from past testing that the population standard
deviation is .35 ohms.
• Solution:
$\bar{x} \pm z\,\dfrac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(.35/\sqrt{11}) = 2.20 \pm .2068$
$1.9932 < \mu < 2.4068$
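The same interval can be checked numerically. This is a small sketch with the standard library only; `z_confidence_interval` is a hypothetical helper name, and the inputs are the values given in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean, sigma, n, confidence_level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # margin of error, ME
    return mean - margin, mean + margin

# Circuit example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms
lower, upper = z_confidence_interval(2.20, 0.35, 11)
print(f"{lower:.4f} < mu < {upper:.4f}")
```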
25
Interpretation
• We are 95% confident that the true mean resistance is

between 1.9932 and 2.4068 ohms
• Although the true mean may or may not be in this interval,
95% of intervals formed in this manner will contain the true
mean
26
Confidence
Intervals

27
Confidence Interval Estimation: Single
Population-II
Dr. A. Ramesh
IIT ROORKEE
1
Student’s t Distribution
• Consider a random sample of n observations
– with mean x̄ and standard deviation s
– from a normally distributed population with mean μ
• Then the variable
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
follows the Student’s t distribution with (n - 1) degrees of freedom
2
Confidence Interval for μ (σ2 Unknown)
• If the population standard deviation σ is unknown, we can substitute

the sample standard deviation, s
• This introduces extra uncertainty, since s is variable from sample to
sample
• So we use the t distribution instead of the normal distribution
3
Confidence Interval for μ (σ Unknown)
(continued)
• Assumptions
– Population standard deviation is unknown
– Population is normally distributed
– If population is not normal, use large sample
• Use Student’s t Distribution
• Confidence Interval Estimate:
$\bar{x} - t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}}$
where $t_{n-1,\alpha/2}$ is the critical value of the t distribution with n-1 d.f. and an area of α/2 in each tail
4
Margin of Error
• The confidence interval,
$\bar{x} - t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}}$
• Can also be written as $\bar{x} \pm ME$, where ME is called the margin of error:
$ME = t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}}$
5
• The t is a family of distributions

• The t value depends on degrees of freedom (d.f.)
– Number of observations that are free to vary after sample mean has
been calculated
d.f. = n - 1
6
Note: t → Z as n increases
[Figure: standard normal curve (t with df = ∞) compared with t (df = 13) and t (df = 5). t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal.]
7
Student’s t Table
Upper Tail Area
df    .10      .05      .025
1     3.078    6.314    12.706
2     1.886    2.920    4.303
3     1.638    2.353    3.182
Let: n = 3, so df = n - 1 = 2; α = .10, so α/2 = .05 and $t_{2,.05}$ = 2.920
The body of the table contains t values, not probabilities
8
t distribution values
With comparison to the Z value
Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z
.80                 1.372          1.325          1.310          1.282
.90                 1.812          1.725          1.697          1.645
.95                 2.228          2.086          2.042          1.960
.99                 3.169          2.845          2.750          2.576
Note: t → Z as n increases
9
Example
A random sample of n = 25 has x̄ = 50 and s = 8. Form a 95% confidence interval for μ
– d.f. = n – 1 = 24, so $t_{n-1,\alpha/2} = t_{24,.025} = 2.0639$
The confidence interval is
$\bar{x} - t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}} < \mu < \bar{x} + t_{n-1,\alpha/2}\dfrac{s}{\sqrt{n}}$
$50 - (2.0639)\dfrac{8}{\sqrt{25}} < \mu < 50 + (2.0639)\dfrac{8}{\sqrt{25}}$
$46.698 < \mu < 53.302$
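The arithmetic of this example can be sketched in a few lines. The Python standard library has no t distribution, so the critical value 2.0639 is taken from the example itself; `t_confidence_interval` is an illustrative helper name.

```python
from math import sqrt

def t_confidence_interval(mean, s, n, t_crit):
    """Confidence interval for a mean when sigma is unknown.

    t_crit is the critical value t_{n-1, alpha/2}, supplied by the caller.
    """
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin

T_24_025 = 2.0639  # t_{24, .025} as quoted in the example
lower, upper = t_confidence_interval(50, 8, 25, T_24_025)
print(f"{lower:.3f} < mu < {upper:.3f}")
```

If SciPy is installed, `scipy.stats.t.ppf(0.975, 24)` should give the same critical value without a table lookup.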
10
Confidence
Intervals

11
Confidence Intervals for the
Population Proportion
• An interval estimate for the population proportion ( P ) can

be calculated by adding an allowance for uncertainty to the
sample proportion ( p̂ )
12
Confidence Intervals for the Population
Proportion, p
(continued)
• Recall that the distribution of the sample proportion is approximately

normal if the sample size is large, with standard deviation
$\sigma_P = \sqrt{\dfrac{P(1-P)}{n}}$
• We will estimate this with sample data:
$\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
13
Confidence Interval Endpoints
• Upper and lower confidence limits for the population proportion are calculated with the formula
$\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} < P < \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
• where
– $z_{\alpha/2}$ is the standard normal value for the level of confidence desired
– p̂ is the sample proportion
– n is the sample size
– nP(1−P) > 5
14
Example
• A random sample of 100 people shows that 25 are left-

handed.
• Form a 95% confidence interval for the true proportion of
left-handers
15
Example (continued)
• A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
$\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} < P < \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
$\dfrac{25}{100} - 1.96\sqrt{\dfrac{.25(.75)}{100}} < P < \dfrac{25}{100} + 1.96\sqrt{\dfrac{.25(.75)}{100}}$
$0.1651 < P < 0.3349$
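A short sketch of the same calculation, assuming the large-sample normal approximation used above; `proportion_confidence_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Large-sample (normal approximation) interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 25 left-handers in a sample of 100
lower, upper = proportion_confidence_interval(25, 100)
print(f"{lower:.4f} < P < {upper:.4f}")
```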
16
Interpretation
• We are 95% confident that the true percentage of left-handers in the

population is between
16.51% and 33.49%.
• Although the interval from 0.1651 to 0.3349 may or may not contain the true
proportion, 95% of intervals formed from samples of size 100 in this manner
will contain the true proportion.
17
Confidence
Intervals

18
Confidence Intervals for the Population
Variance
 Goal: Form a confidence interval for the population variance, σ2
• The confidence interval is based on the sample variance, s2
• Assumed: the population is normally distributed
19
Confidence Intervals for the Population Variance
(continued)
The random variable
$\chi^2_{n-1} = \dfrac{(n-1)s^2}{\sigma^2}$
follows a chi-square distribution with (n – 1) degrees of freedom
20
Confidence Intervals for the Population Variance
The 100(1 - α)% confidence interval for the population variance is
$\dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$
21
Example
You are testing the speed of a batch of computer processors. You collect the following data (in MHz):
Sample size       17
Sample mean       3004
Sample std dev    74
Assume the population is normal. Determine the 95% confidence interval for σ²
22
Finding the Chi-square Values
• n = 17 so the chi-square distribution has (n – 1) = 16 degrees of freedom
• α = 0.05, so use the chi-square values with area 0.025 in each tail:
$\chi^2_{n-1,\,\alpha/2} = \chi^2_{16,\,0.025} = 28.85$
$\chi^2_{n-1,\,1-\alpha/2} = \chi^2_{16,\,0.975} = 6.91$
[Figure: chi-square distribution with 16 degrees of freedom, with probability α/2 = .025 below 6.91 and probability α/2 = .025 above 28.85]
23
Calculating the Confidence Limits
• The 95% confidence interval is
$\dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$
$\dfrac{(17-1)(74)^2}{28.85} < \sigma^2 < \dfrac{(17-1)(74)^2}{6.91}$
$3037 < \sigma^2 < 12683$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz
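The limits can be recomputed as a rough check. The standard library has no chi-square quantile function, so this sketch takes the two table values from the example as inputs; `variance_confidence_interval` is an illustrative name. With the rounded table value 6.91 the upper variance limit comes out near 12680 rather than 12683, which does not change the standard-deviation limits at one decimal place.

```python
from math import sqrt

def variance_confidence_interval(s, n, chi2_upper, chi2_lower):
    """Interval for a normal population variance.

    chi2_upper = chi-square value with area alpha/2 in the upper tail,
    chi2_lower = chi-square value with area alpha/2 in the lower tail.
    """
    sum_sq = (n - 1) * s ** 2
    return sum_sq / chi2_upper, sum_sq / chi2_lower

# Processor example: n = 17, s = 74, table values for 16 d.f.
var_lo, var_hi = variance_confidence_interval(74, 17, 28.85, 6.91)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)
print(f"variance: {var_lo:.0f} to {var_hi:.0f}")
print(f"std dev:  {sd_lo:.1f} to {sd_hi:.1f} MHz")
```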
24
Finite Populations
• If the sample size is more than 5% of the population size (and

sampling is without replacement) then a finite population correction
factor must be used when calculating the standard error
25
Finite Population Correction Factor
• Suppose sampling is without replacement and the sample size is large

relative to the population size
• Assume the population size is large enough to apply the central limit
theorem
• Apply the finite population correction factor when estimating the population variance
$\text{finite population correction factor} = \dfrac{N-n}{N-1}$
26
Estimating the Population Mean
• Let a simple random sample of size n be taken from a population

of N members with mean μ
• The sample mean is an unbiased estimator of the population mean
μ
• The point estimate is:
$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
27
Finite Populations: Mean
• If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the sample mean is
$\hat{\sigma}_{\bar{x}}^2 = \dfrac{s^2}{n}\left(\dfrac{N-n}{N-1}\right)$
• So the 100(1-α)% confidence interval for the population mean is
$\bar{x} - t_{n-1,\alpha/2}\,\hat{\sigma}_{\bar{x}} < \mu < \bar{x} + t_{n-1,\alpha/2}\,\hat{\sigma}_{\bar{x}}$
28
Estimating the Population Proportion
• Let the true population proportion be P

• Let p̂ be the sample proportion from n observations from a simple
random sample
• The sample proportion, p̂ , is an unbiased estimator of the population
proportion, P
29
Finite Populations: Proportion
• If the sample size is more than 5% of the population size, an unbiased estimator for the variance of the sample proportion is
$\hat{\sigma}_{\hat{p}}^2 = \dfrac{\hat{p}(1-\hat{p})}{n}\left(\dfrac{N-n}{N-1}\right)$
• So the 100(1-α)% confidence interval for the population proportion is
$\hat{p} - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}} < P < \hat{p} + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}}$
30
Lecture Summary
• Introduced the concept of confidence intervals
• Discussed point estimates
• Developed confidence interval estimates
• Created confidence interval estimates for the mean (σ2
known)
• Introduced the Student’s t distribution
• Determined confidence interval estimates for the mean (σ2
unknown)
31
Lecture Summary
(continued)
• Created confidence interval estimates for the proportion
• Created confidence interval estimates for the variance of a normal
population
• Applied the finite population correction factor to form confidence
intervals when the sample size is not small relative to the population size
32
Summary
• Introduced sampling distributions
• Described the sampling distribution of sample means
– For normal populations
– Using the Central Limit Theorem
• Described the sampling distribution of sample proportions
• Introduced the chi-square distribution
• Examined sampling distributions for sample variances
• Calculated probabilities using sampling distributions
33
Thank You
34
Hypothesis Testing
Class Objectives
• Developing Null and Alternative Hypotheses
• Type I and Type II Errors- Explanation
• Population Mean: Sigma Known
• Population Mean: Sigma Unknown
• Population Proportion
Hypothesis Testing
• Hypothesis testing can be used to determine whether a statement about

the value of a population parameter should or should not be rejected.
• The null hypothesis, denoted by H0 , is a tentative assumption about a

population parameter
• The alternative hypothesis, denoted by Ha, is the opposite of what is stated

in the null hypothesis
• The hypothesis testing procedure uses data from a sample to test the two
competing statements indicated by H0 and Ha.
Developing Null and Alternative Hypotheses
• It is not always obvious how the null and alternative hypotheses should be
formulated
• Care must be taken to structure the hypotheses appropriately so that the test
conclusion provides the information the researcher wants
• The context of the situation is very important in determining how the hypotheses
should be stated
• In some cases it is easier to identify the alternative hypothesis first. In other

cases the null is easier
• Correct hypothesis formulation will take practice

Alternative Hypothesis as a Research Hypothesis

•Many applications of hypothesis testing involve an attempt to gather evidence in support of a research
hypothesis
• In such cases, it is often best to begin with the alternative hypothesis and make it the conclusion that
the researcher hopes to support
• The conclusion that the research hypothesis is true is made if the sample data provide sufficient
evidence to show that the null hypothesis can be rejected
Alternative Hypothesis as a Research Hypothesis
• Example: A new manufacturing method is believed to be better than the current method.
• Alternative Hypothesis:
– The new manufacturing method is better.
• Null Hypothesis:
– The new method is no better than the old method.

Alternative Hypothesis as a Research Hypothesis
• Example: A new bonus plan is developed in an attempt to increase sales
• Alternative Hypothesis:
– The new bonus plan increases sales
• Null Hypothesis:
– The new bonus plan does not increase sales

Alternative Hypothesis as a Research Hypothesis
• Example: A new drug is developed with the goal of lowering cholesterol level more than the existing drug
• Alternative Hypothesis:
– The new drug lowers cholesterol level more than the existing drug
• Null Hypothesis:
– The new drug does not lower cholesterol level more than the existing drug
Null Hypothesis as an Assumption to be Challenged
• We might begin with a belief or assumption that a statement about the value of a population parameter is true
• We then use a hypothesis test to challenge the assumption and determine if there is statistical evidence to conclude that the assumption is incorrect
• In these situations, it is helpful to develop the null hypothesis first

Null Hypothesis as an Assumption to be Challenged
• Example:
– The label on a milk bottle states that it contains 1000 ml
• Null Hypothesis:
– The label is correct. µ ≥ 1000 ml
• Alternative Hypothesis:
– The label is incorrect. µ < 1000 ml

Null and Alternative Hypotheses about a Population Mean μ
• The equality part of the hypotheses always appears in the null hypothesis
• In general, a hypothesis test about the value of a population mean μ must take one of the following three forms (where μ0 is the hypothesized value of the population mean)
H0: μ ≥ μ0      H0: μ ≤ μ0      H0: μ = μ0
Ha: μ < μ0      Ha: μ > μ0      Ha: μ ≠ μ0
One-tailed      One-tailed      Two-tailed
(lower-tail)    (upper-tail)
Null and Alternative Hypotheses
• A major hospital in Chennai provides
one of the most comprehensive
emergency medical services in the
world
• Operating in a multiple hospital
system with approximately 10 mobile
medical units, the service goal is to
respond to medical emergencies with
a mean time of 8 minutes or less
• The director of medical services
wants to formulate a hypothesis test
that could use a sample of
emergency response times to
determine whether or not the
service goal of 8 minutes or less is
being achieved.
Null and Alternative Hypotheses
H0: μ ≤ 8   The emergency service is meeting the response goal; no follow-up action is necessary.
Ha: μ > 8   The emergency service is not meeting the response goal; appropriate follow-up action is necessary.
where: μ = mean response time for the population of medical emergency requests
Type I Error
• Because hypothesis tests are based on sample data, we must allow for the
possibility of errors
• A Type I error is rejecting H0 when it is true
• The probability of making a Type I error when the null hypothesis is true as an equality is called the level of significance
• Applications of hypothesis testing that only control the Type I error are
often called significance tests
Type II Error
• A Type II error is accepting H0 when it is false.
• It is difficult to control for the probability of making a Type II error.
• Statisticians avoid the risk of making a Type II error by using “do not reject H0” and not “accept H0”.
Type I and Type II Errors
                               Population Condition
Conclusion                     H0 True (μ ≤ 8)      H0 False (μ > 8)
Accept H0 (Conclude μ ≤ 8)     Correct Decision     Type II Error
Reject H0 (Conclude μ > 8)     Type I Error         Correct Decision
Three Approaches for Hypothesis Testing
• P- Value
• Critical Value
• Confidence Interval Value

p-Value Approach to One-Tailed Hypothesis Testing
• The p-value is the probability, computed using the test statistic, that measures the support (or lack of support) provided by the sample for the null hypothesis
• If the p-value is less than or equal to the level of significance α, the value of the test statistic is in the rejection region
• Reject H0 if the p-value ≤ α

Lower-Tailed Test About a Population Mean: σ Known
p-Value Approach
[Figure: sampling distribution of $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$ with α = .10 and −z_α = −1.28. The test statistic z = −1.46 has p-value = .072. Because p-value < α, reject H0.]
Upper-Tailed Test About a Population Mean: σ Known
p-Value Approach
[Figure: sampling distribution of $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$ with α = .04 and z_α = 1.75. The test statistic z = 2.29 has p-value = .011. Because p-value < α, reject H0.]
Critical Value Approach to One-Tailed Hypothesis Testing
• The test statistic z has a standard normal probability distribution.
• We can use the standard normal probability distribution table to find the z-value with an area of α in the lower (or upper) tail of the distribution.
• The value of the test statistic that establishes the boundary of the rejection region is called the critical value for the test.
• The rejection rule is:
Lower tail: Reject H0 if z < -z_α
Upper tail: Reject H0 if z > z_α
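Lower-Tailed Test About a Population Mean: σ Known
Critical Value Approach
[Figure: sampling distribution of $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$ with α = .10 in the lower tail. Reject H0 to the left of −z_α = −1.28; do not reject H0 to the right.]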
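Upper-Tailed Test About a Population Mean: σ Known
[Figure: sampling distribution of $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$ with α = .05 in the upper tail. Reject H0 to the right of z_α = 1.645; do not reject H0 to the left.]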
Steps of Hypothesis Testing – P value approach
• Step 1. Develop the null and alternative hypotheses.
• Step 2. Specify the level of significance α.
• Step 3. Collect the sample data and compute the test statistic.
• p-Value Approach
• Step 4. Use the value of the test statistic to compute the p-value.
• Step 5. Reject H0 if p-value ≤ α.

Steps of Hypothesis Testing – Critical value approach
• Step 4. Use the level of significance α to determine the critical value and the rejection rule.
• Step 5. Use the value of the test statistic and the rejection rule to determine whether to reject H0.

Hypothesis Testing
1
Class Objectives
• Population Mean: Sigma Known –Example
2
One-Tailed Tests About a Population Mean: σ Known
• Example: The mean response time for a random sample of 30 pizza deliveries is 32 minutes
• The population standard deviation is believed to be 10 minutes.
• The pizza delivery services director wants to perform a hypothesis test, with α = 0.05 level of significance, to determine whether the service goal of 30 minutes or less is being achieved.
3
Given Values
• Sample
– Sample mean = 32 min
– Sample size = 30
• Population
– α = 0.05
– Hypothesized population mean = 30 min
4
p -Value Approach
5
One-Tailed Tests About a Population Mean: σ Known
1. Develop the hypotheses. H0: μ ≤ 30, Ha: μ > 30
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{32 - 30}{10/\sqrt{30}} = 1.09$
6
7
p-Value Approach
4. Compute the p-value.
For z = 1.09, p-value = 0.137
5. Determine whether to reject H0.
• Because p-value = 0.137 > α = .05, we do not reject H0.
• There is not sufficient statistical evidence to infer that the pizza delivery service is failing to meet the response goal of 30 minutes.
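The p-value and critical-value steps for the pizza example can be sketched with the standard library. `upper_tail_z_test` is a hypothetical helper name; the unrounded test statistic is about 1.095, which the slides report as 1.09.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(sample_mean, mu0, sigma, n, alpha):
    """Upper-tailed z test for a mean with known sigma (H0: mu <= mu0)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)         # area to the right of z
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical value z_alpha
    return z, p_value, z_crit

alpha = 0.05
z, p_value, z_crit = upper_tail_z_test(32, 30, 10, 30, alpha)
reject = p_value <= alpha
print(f"z = {z:.3f}, p-value = {p_value:.3f}, z_alpha = {z_crit:.3f}, reject H0: {reject}")
```

Both rules agree here: the p-value exceeds α and z is below the critical value, so H0 is not rejected.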
8
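p-Value Approach
[Figure: sampling distribution of $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$ with α = .05. The test statistic z = 1.09, with p-value = 0.137, lies to the left of the critical value z_α = 1.645.]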
9
10

4. Determine the critical value and rejection rule.
– For α = .05, z.05 = 1.645
– Reject H0 if z > 1.645
– Because 1.09 < 1.645, we do not reject H0.
11
p-Value Approach to Two-Tailed Hypothesis Testing
12
Compute the p-value using the following three steps:
1. Compute the value of the test statistic z.
2. If z is in the upper tail (z > 0), find the area under the standard normal curve to the right of z. If z is in the lower tail (z < 0), find the area under the standard normal curve to the left of z.
3. Double the tail area obtained in step 2 to obtain the p-value.
The rejection rule:
Reject H0 if the p-value ≤ α.
13
Critical Value Approach to Two-Tailed Hypothesis Testing
• The critical values will occur in both the lower and upper tails of the standard normal curve.
• Use the standard normal probability distribution table to find $z_{\alpha/2}$ (the z-value with an area of α/2 in the upper tail of the distribution).
• The rejection rule is:
Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.
14
Two-Tailed Tests About a Population Mean: σ Known
• Example: Milk Carton
• Assume that a sample of 30 milk cartons provides a sample mean of 505 ml.
• The population standard deviation is believed to be 10 ml.
• Perform a hypothesis test, at the 0.03 level of significance, of whether the population mean is 500 ml, to help determine whether the filling process should continue operating or be stopped and corrected.
15
Given Values
• Sample
– Sample size = 30
– Sample mean = 505 ml
• Population
– Population mean = 500 ml
– Standard deviation = 10 ml
• Significance level = 0.03
16
p –Value approach
17
σ Known
1. Determine the hypotheses. H0: μ = 500, Ha: μ ≠ 500
2. Specify the level of significance. α = .03
3. Compute the value of the test statistic.
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{505 - 500}{10/\sqrt{30}} = 2.74$
18
19
σ Known
p-Value Approach
4. Compute the p-value.
– For z = 2.74, p-value = 2(1 - .9969) = .0062
5. Determine whether to reject H0.
– Because p-value = .0062 < α = .03, we reject H0.
There is sufficient statistical evidence to infer that the null hypothesis is not true (i.e. the mean filling quantity is not 500 ml)
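A minimal sketch of the two-tailed p-value calculation for the milk carton example, using only the standard library; `two_tailed_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n):
    """Two-tailed z test for a mean with known sigma (H0: mu = mu0)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    tail_area = 1 - NormalDist().cdf(abs(z))  # area beyond |z| in one tail
    return z, 2 * tail_area                   # double it for the p-value

alpha = 0.03
z, p_value = two_tailed_z_test(505, 500, 10, 30)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```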
20
Two-Tailed Tests About a Population Mean: σ Known
p-Value Approach
[Figure: standard normal curve with α/2 = .015 in each tail beyond $-z_{\alpha/2} = -2.17$ and $z_{\alpha/2} = 2.17$. The test statistic values z = −2.74 and z = 2.74 each cut off 1/2 p-value = .0031.]
21
22
Two-Tailed Tests About a Population Mean: σ Known
• Critical Value Approach
4. Determine the critical value and rejection rule: for α/2 = .03/2 = .015, z.015 = 2.17
Reject H0 if z < -2.17 or z > 2.17
Because 2.74 > 2.17, we reject H0.
There is sufficient statistical evidence to infer that the null hypothesis is not true
23
24
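Two-Tailed Tests About a Population Mean: σ Known
$z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{505 - 500}{10/\sqrt{30}} = 2.74$
[Figure: sampling distribution of z with α/2 = .015 in each tail. Reject H0 below −2.17, do not reject H0 between −2.17 and 2.17, reject H0 above 2.17.]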
25
Confidence Interval Approach
26
Confidence Interval Approach to
Two-Tailed Tests About a Population Mean
• Select a simple random sample from the population and use the value of the sample mean to develop the confidence interval for the population mean μ.
• If the confidence interval contains the hypothesized value μ0 (here 500 ml), do not reject H0.
• Otherwise, reject H0.
• Actually, H0 should be rejected if μ0 happens to be equal to one of the end points of the confidence interval.
27
Confidence Interval Approach to Two-Tailed Tests About a Population Mean
The 97% confidence interval for μ is
$\bar{x} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 505 \pm 2.17\,\dfrac{10}{\sqrt{30}} = 505 \pm 3.9619$
or 501.03814 to 508.96186
Because the hypothesized value for the population mean, μ0 = 500 ml, is not in this interval, the hypothesis-testing conclusion is that the null hypothesis, H0: μ = 500, is rejected.
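The confidence interval approach can be sketched as follows, assuming the milk carton values above; `ci_two_tailed_decision` is a hypothetical helper. The sketch uses the unrounded critical value, so the limits differ from the hand calculation only in later decimal places.

```python
from math import sqrt
from statistics import NormalDist

def ci_two_tailed_decision(sample_mean, mu0, sigma, n, alpha):
    """Build the 100(1 - alpha)% interval and reject H0 if mu0 lies outside it."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    lower, upper = sample_mean - margin, sample_mean + margin
    reject = not (lower < mu0 < upper)  # an endpoint hit also counts as rejection
    return lower, upper, reject

lower, upper, reject = ci_two_tailed_decision(505, 500, 10, 30, 0.03)
print(f"97% CI: ({lower:.2f}, {upper:.2f}), reject H0: {reject}")
```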
28
Thanks
29
Hypothesis Testing-III
1
Tests About a Population Mean: σ Unknown
• Test Statistic
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
This test statistic has a t distribution with n - 1 degrees of freedom.
2
Tests About a Population Mean: σ Unknown
Rejection Rule: p-Value Approach
Reject H0 if p-value ≤ α
Rejection Rule: Critical Value Approach
H0: μ ≥ μ0    Reject H0 if $t < -t_{\alpha}$
H0: μ ≤ μ0    Reject H0 if $t > t_{\alpha}$
H0: μ = μ0    Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
3
4
One-Tailed Test About a Population Mean: σ Unknown
Example: Ice Cream Demand
• In an ice cream parlor at IIT Roorkee, the following data represent the number of ice-creams sold in 20 days
• Test hypothesis H0: μ ≤ 10
• Use α = .05 to test the hypothesis.
Day    No. of Ice-creams Sold    Day    No. of Ice-creams Sold
1      13                        11     12
2      8                         12     11
3      10                        13     11
4      10                        14     12
5      8                         15     10
6      9                         16     12
7      10                        17     7
8      11                        18     10
9      6                         19     11
10     8                         20     8
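The test statistic for this data can be computed as a sketch with the standard library. Two assumptions are made here: the test is treated as upper-tailed (H0: μ ≤ 10 against Ha: μ > 10), and the critical value 1.729 for 19 degrees of freedom is taken from a standard t table rather than from the slides.

```python
from math import sqrt
from statistics import mean, stdev

# Ice-creams sold on days 1 to 20, read from the table above
sales = [13, 8, 10, 10, 8, 9, 10, 11, 6, 8,
         12, 11, 11, 12, 10, 12, 7, 10, 11, 8]
mu0 = 10

n = len(sales)
x_bar = mean(sales)
s = stdev(sales)  # sample standard deviation (divides by n - 1)
t_stat = (x_bar - mu0) / (s / sqrt(n))

T_19_05 = 1.729  # assumed table value t_{19, .05}; not given on the slide
reject = t_stat > T_19_05
print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Because the sample mean is below 10, the statistic is negative and an upper-tailed test cannot reject H0.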
5
Given Data
6
7
One-Tailed Test About a Population Mean: σ Unknown
[Figure: t distribution centered at 0 showing the "Reject H0" and "Do Not Reject H0" regions]
Hypothesis Testing – proportion
9
Null and Alternative Hypotheses: Population Proportion
• The equality part of the hypotheses always appears in the null hypothesis.
• In general, a hypothesis test about the value of a population proportion p must take one of the following three forms (where p0 is the hypothesized value of the population proportion).
H0: p ≥ p0      H0: p ≤ p0      H0: p = p0
Ha: p < p0      Ha: p > p0      Ha: p ≠ p0
One-tailed      One-tailed      Two-tailed
(lower tail)    (upper tail)
10
Tests About a Population Proportion
Test Statistic
$z = \dfrac{\bar{p} - p_0}{\sigma_{\bar{p}}}$
where:
$\sigma_{\bar{p}} = \sqrt{\dfrac{p_0(1 - p_0)}{n}}$
assuming np > 5 and n(1 – p) > 5
11
Tests About a Population Proportion
Rejection Rule: p-Value Approach
Reject H0 if p-value ≤ α
Rejection Rule: Critical Value Approach
H0: p ≤ p0    Reject H0 if $z > z_{\alpha}$
H0: p ≥ p0    Reject H0 if $z < -z_{\alpha}$
H0: p = p0    Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
12
Two-Tailed Test About a Population Proportion
Example: City Traffic Police
For a New Year’s week, the City Traffic Police claimed that 50% of the accidents would be caused by drunk driving.
A sample of 120 accidents showed that 67 were caused by drunk driving.
Use these data to test the Traffic Police’s claim with α = .05.
13
p –Value Approach
14
1. Determine the hypotheses. H0: p = .5, Ha: p ≠ .5
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
$\sigma_{\bar{p}} = \sqrt{\dfrac{p_0(1 - p_0)}{n}} = \sqrt{\dfrac{.5(1 - .5)}{120}} = .045644$
$z = \dfrac{\bar{p} - p_0}{\sigma_{\bar{p}}} = \dfrac{(67/120) - .5}{.045644} = 1.28$
15
4. Compute the p-value.
For z = 1.28, cumulative probability = .8997
p-value = 2(1 - .8997) = .2006
Because p-value = .2006 > α = .05, we cannot reject H0.
16
17
18
4. Determine the critical value and rejection rule.
For α/2 = .05/2 = .025, z.025 = 1.96
Reject H0 if z < -1.96 or z > 1.96
Because z = 1.278 lies between -1.96 and 1.96, we cannot reject H0.
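The traffic police example can be reproduced with a short sketch; `two_tailed_proportion_test` is an illustrative name. The unrounded p-value is about .2012, slightly different from the .2006 obtained from the rounded z = 1.28.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_proportion_test(successes, n, p0):
    """Two-tailed z test for a population proportion (H0: p = p0)."""
    p_bar = successes / n
    std_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_bar - p0) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.05
z, p_value = two_tailed_proportion_test(67, 120, 0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value <= alpha}")
```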
19
Errors in Hypothesis Testing
Dr. A. Ramesh
Indian Institute of Technology Roorkee
1
Example
• We are interested in burning rate of a solid propellant used to power aircrew escape systems
• Burning rate is a random variable that can be described by a probability distribution
• Suppose our interest focus on mean burning rate
• Ho: µ = 50 centimeters per second
• H1: µ ≠ 50 centimeters per second
Reference: Applied statistics and probability for engineers, Douglas C. Montgomery, George C. Runger, John Wiley &
Sons, 2007
2
Value of the null hypothesis
• The value of the null hypothesis can be obtained by
– Past experience or knowledge of the process, or even from the previous tests or experiments
– From some theory or model regarding the process under study
– From external consideration, such as design or engineering specifications, or from contractual
obligations
3
Note: for this example n = 10
Note: for this example we will assume σ = 2.5
4
Type I Error
• The true mean burning rate of the propellant could be equal to 50 centimeters per second
• However, for the randomly selected propellant specimens that are tested, we could observe a value of the test statistic x̄ that falls into the critical region (rejection region).
• We would then reject the null hypothesis Ho in favor of the alternative H1 when, in fact, Ho is really true
• This type of wrong conclusion is called a type I error
5
Type I Error
• Rejecting the null hypothesis Ho when

it is true is defined as a type I error
6
Type II Error
• Now suppose the true mean burning rate is different from 50 centimeters per second, yet the sample
mean x falls in the acceptance region
• In this case we would fail to reject Ho when it is false
• This type of wrong conclusion is called a type II error
7
Type II Error
• Failing to reject the null

hypothesis when it is false is
defined as a type II error
8
Type I and Type II Errors
                  H0 is correct                            H0 is incorrect
H0 is accepted    Correct decision                         Type II error (β): incorrect acceptance
H0 is rejected    Type I error (α): incorrect rejection    Correct decision
9
Type I error
• In the propellant burning rate example, a type I error will occur when either x̄ > 51.5 or x̄ < 48.5 when the true mean burning rate is µ = 50 centimeters per second
• Suppose the standard deviation of burning rate is σ = 2.5 centimeters per second and n = 10
• The sampling distribution of x̄ then has mean µ = 50 and standard error = 0.79.
• Type I error is
$\alpha = P(\bar{x} < 48.5 \text{ when } \mu = 50) + P(\bar{x} > 51.5 \text{ when } \mu = 50)$
10
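[Figure: sampling distribution of x̄ under H0 with the two rejection regions shaded. Annotations: "Where does this number come from?" and "We will reject the null hypothesis (μ = 50) if our sample mean is in either of these two regions".]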
11
12
Type I error
• Type I error = 0.057434
• This implies that 5.7 % of all random samples would lead to rejection of the hypothesis Ho: µ=50
centimeters per second.
• We can reduce the type I error by widening the acceptance region. If we make the critical values 48 and 52, the value of alpha is 0.0114 (adding 0.0057 and 0.0057).
• Alternatively, keeping the critical values at 48.5 and 51.5 and changing the sample size to 16 gives alpha = 0.0164.
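The three α values quoted above can be checked with a small sketch; `type_i_error` is a hypothetical helper. Without rounding z to 1.90, the first value comes out near 0.0578 rather than 0.057434.

```python
from math import sqrt
from statistics import NormalDist

def type_i_error(mu0, sigma, n, lower, upper):
    """P(sample mean falls outside [lower, upper]) when the true mean is mu0."""
    sampling_dist = NormalDist(mu0, sigma / sqrt(n))
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))

alpha_base = type_i_error(50, 2.5, 10, 48.5, 51.5)   # original setup
alpha_wide = type_i_error(50, 2.5, 10, 48.0, 52.0)   # wider acceptance region
alpha_n16 = type_i_error(50, 2.5, 16, 48.5, 51.5)    # larger sample
print(f"{alpha_base:.4f} {alpha_wide:.4f} {alpha_n16:.4f}")
```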
13
TYPE II ERROR
14
The pink area is
the probability
of a Type II error
if the actual mean
is 52.
15
Type II Error
• Type II error will be committed if the sample mean x̄ falls between 48.5 and 51.5 (critical region boundaries) when µ = 52:
$\beta = P(48.5 \le \bar{x} \le 51.5 \text{ when } \mu = 52) = 0.2643$
• When µ = 50.5, β = 0.8923
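A sketch of the β calculation for the two alternative means, using the same n = 10 and σ = 2.5; `type_ii_error` is an illustrative name. The results agree with the slide values to about three decimal places, since the slides round z before using the table.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(true_mu, sigma, n, lower, upper):
    """P(sample mean falls inside the acceptance region) when the mean is true_mu."""
    sampling_dist = NormalDist(true_mu, sigma / sqrt(n))
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

beta_52 = type_ii_error(52.0, 2.5, 10, 48.5, 51.5)
beta_50_5 = type_ii_error(50.5, 2.5, 10, 48.5, 51.5)
print(f"beta(mu = 52) = {beta_52:.4f}, beta(mu = 50.5) = {beta_50_5:.4f}")
```

The closer the true mean is to the hypothesized 50, the larger β becomes.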
16
17
18
Computing the
probability of a type II
error may be the most
difficult concept
19
For constant n, increasing the acceptance region (hence decreasing α) increases β.
Increasing n can decrease both types of errors.
20
Type I & II Errors Have an Inverse Relationship
If you reduce the probability of one error, the other one increases, provided that everything else is unchanged.
21
Factors Affecting Type II Error
• True value of population parameter
– β increases when the difference between the hypothesized parameter and its true value decreases
• Significance level α
– β increases when α decreases
• Population standard deviation σ
– β increases when σ increases
• Sample size n
– β increases when n decreases
22
How to Choose between Type I and Type II Errors
• Choice depends on the cost of the errors
• Choose smaller Type I Error when the cost of rejecting the maintained hypothesis is high
– A criminal trial: convicting an innocent person
• Choose larger Type I Error when you have an interest in changing the status quo
23
Calculating the probability of Type II Error
Ho: µ = 8.3
H1: µ < 8.3
Determine the probability of Type II error if µ = 7.4 at 5% significance level. σ = 3.1 and n = 60.
24
Solution:
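A Type II error will be made when Z ≥ -1.645, because we then fail to reject Ho even though µ = 7.4.
In terms of the sample mean, we fail to reject Ho when x̄ ≥ 8.3 − 1.645(3.1/√60) = 7.642.
β = P(x̄ ≥ 7.642 when µ = 7.4) = 0.2729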
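The calculation can be sketched in two steps: find the critical sample mean under Ho, then find the probability of landing in the non-rejection region under the true mean. `beta_lower_tail` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def beta_lower_tail(mu0, true_mu, sigma, n, alpha):
    """Type II error probability for a lower-tailed z test (H1: mu < mu0)."""
    std_error = sigma / sqrt(n)
    # Critical sample mean: reject Ho when the sample mean falls below it
    x_crit = mu0 + NormalDist().inv_cdf(alpha) * std_error
    # Probability of NOT rejecting when the true mean is true_mu
    return 1 - NormalDist(true_mu, std_error).cdf(x_crit)

beta = beta_lower_tail(mu0=8.3, true_mu=7.4, sigma=3.1, n=60, alpha=0.05)
print(f"beta = {beta:.4f}")
```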
25
Solving for Type II Errors:
Example
Ho: μ ≥ 12
Ha: μ < 12
$\bar{X}_c = \mu_0 + Z_c\,\dfrac{\sigma}{\sqrt{n}} = 12 + (-1.645)\dfrac{0.10}{\sqrt{60}} = 11.979$
α = .05, $Z_c = -1.645$
Rejection region: if X̄ < 11.979, reject Ho.
Non-rejection region: if X̄ ≥ 11.979, do not reject Ho.
26
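Type II Error for Example with μ = 11.99 Kg
[Figure: two sampling distributions of X̄ with the cutoff at 11.979. When Ho is true (μ = 12): rejecting Ho is a Type I error (α = .05) and not rejecting Ho is the correct decision (95%). When Ho is false (μ = 11.99): rejecting Ho is the correct decision (19.77%) and not rejecting Ho is a Type II error (β = .8023).]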
27
28
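Type II Error for Demonstration with μ = 11.96 Kg
[Figure: two sampling distributions of X̄ with the cutoff at 11.979. When Ho is true (μ = 12): rejecting Ho is a Type I error (α = .05) and not rejecting Ho is the correct decision (95%). When Ho is false (μ = 11.96): rejecting Ho is the correct decision (92.92%) and not rejecting Ho is a Type II error (β = .0708).]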
30
Hypothesis Testing and Decision Making
• We have illustrated hypothesis testing applications referred to as significance tests
• In the tests, we compared the p-value to a controlled probability of a Type I error, α, which is called the level of significance for the test
• With a significance test, we control the probability of making the Type I error, but
not the Type II error
• We recommended the conclusion “do not reject H0” rather than “accept H0”
because the latter puts us at risk of making a Type II error
31
Hypothesis Testing and Decision Making
• With the conclusion “do not reject H0”, the statistical evidence is considered inconclusive
• Usually this is an indication to postpone a decision until further research and testing is
undertaken
• In many decision-making situations the decision maker may want, and in some cases may be
forced, to take action with both the conclusion “do not reject H0 “and the conclusion “reject
H0.”
• In such situations, it is recommended that the hypothesis-testing procedure be extended to

include consideration of making a Type II error
32
Power of a test
• The mean response time for a random sample of 40 food orders is 13.25 minutes
• The population standard deviation is believed to be 3.2 minutes.
• The restaurant owner wants to perform a hypothesis test, with α = 0.05 level of significance, to determine whether the service goal of 12 minutes or less is being achieved.
33
Calculating the Probability of a Type II Error
Hypotheses are: H0: μ ≤ 12 and Ha: μ > 12
Rejection rule is: Reject H0 if z > 1.645
Value of the sample mean that identifies the rejection region:
$\bar{x} = 12 + 1.645\,\dfrac{3.2}{\sqrt{40}} = 12.8323$
We will accept H0 when x̄ ≤ 12.8323
34
Calculating the Probability of a Type II Error
Probabilities that the sample mean will be in the acceptance region:
Values of μ    z = (12.8323 − μ)/(3.2/√40)    β        1 − β
14.0           -2.31                          .0104    .9896
13.6           -1.52                          .0643    .9357
13.2           -0.73                          .2327    .7673
12.8323         0.00                          .5000    .5000
12.8            0.06                          .5239    .4761
12.4            0.85                          .8023    .1977
12.0001         1.645                         .9500    .0500
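The table can be reproduced with a few lines of Python. This is a minimal sketch, assuming the upper-tail test H0: μ ≤ 12 with σ = 3.2, n = 40 and α = 0.05 from the restaurant example; the names `x_crit` and `type_ii_error` are illustrative and not part of the lecture. Because the code does not round z to two decimals, its β values can differ from the table in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values from the restaurant example (upper-tail test, H0: mu <= 12)
mu0, sigma, n, alpha = 12.0, 3.2, 40, 0.05

std_normal = NormalDist()                # standard normal distribution
se = sigma / sqrt(n)                     # standard error of the sample mean
z_crit = std_normal.inv_cdf(1 - alpha)   # about 1.645
x_crit = mu0 + z_crit * se               # boundary of the acceptance region


def type_ii_error(mu):
    """beta = P(sample mean falls in the acceptance region | true mean is mu)."""
    return std_normal.cdf((x_crit - mu) / se)


for mu in (14.0, 13.6, 13.2, 12.8, 12.4):
    beta = type_ii_error(mu)
    print(f"mu = {mu:5.1f}   beta = {beta:.4f}   power = {1 - beta:.4f}")
```

Plotting `1 - type_ii_error(mu)` against `mu` over a grid of values gives the power curve.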
35
36
Power of the Test
• The probability of correctly rejecting H0 when it is false is called the power of the test.
• For any particular value of μ, the power is 1 – β.
• We can show graphically the power associated with each value of μ; such a graph is called a
power curve.
37
Power Curve
[Figure: power curve. Horizontal axis: μ, from 11.5 to 14.5. Vertical axis: probability of correctly rejecting the null hypothesis, from 0.00 to 1.00. The curve covers the values of μ for which H0 is false.]
38
Thank You
39
Hypothesis Testing: Two sample test
Dr. A. Ramesh
IIT ROORKEE
1
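Hypothesis Testing about the Difference in Two
Sample Means
[Figure: a sample of size n1 drawn from Population 1 gives the sample mean x̄1, and a sample of size n2 drawn from Population 2 gives the sample mean x̄2; the statistic of interest is the difference x̄1 − x̄2.]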
2
Two Sample Tests
• Population means, independent samples. Example: Group 1 vs. independent Group 2
• Population means, dependent samples. Example: same group before vs. after treatment
• Population proportions. Example: Proportion 1 vs. Proportion 2
• Population variances. Example: Variance 1 vs. Variance 2
3
Difference Between Two Means
Population means, independent samples
• σ1² and σ2² known: test statistic is a z value
• σ1² and σ2² unknown, assumed equal: test statistic is a value from the Student’s t distribution
• σ1² and σ2² unknown, assumed unequal: test statistic is a value from the Student’s t distribution
4
σ1² and σ2² Known
Population means, independent samples
Assumptions:
• Samples are randomly and independently drawn
• Both population distributions are normal
• Population variances are known
5
σ1² and σ2² Known
Population means, independent samples
When σ1² and σ2² are known and both populations are normal, the variance of x̄1 – x̄2 is
$\sigma^2_{\bar{x}_1 - \bar{x}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$
…and the random variable
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
has a standard normal distribution
6
Test Statistic, σ1² and σ2² Known
Population means, independent samples
H0: μ1 – μ2 = D0
The test statistic for μ1 – μ2 is:
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
7
Hypothesis Tests for Two Population Means
Two Population Means, Independent Samples
Lower-tail test:        Upper-tail test:        Two-tail test:
H0: μ1 ≥ μ2             H0: μ1 ≤ μ2             H0: μ1 = μ2
H1: μ1 < μ2             H1: μ1 > μ2             H1: μ1 ≠ μ2
i.e.,                   i.e.,                   i.e.,
H0: μ1 – μ2 ≥ 0         H0: μ1 – μ2 ≤ 0         H0: μ1 – μ2 = 0
H1: μ1 – μ2 < 0         H1: μ1 – μ2 > 0         H1: μ1 – μ2 ≠ 0
8
Decision Rules
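[Figure: three standard normal curves with shaded rejection regions: area α in the lower tail, area α in the upper tail, and area α/2 in each of the two tails.]
• Lower-tail test: Reject H0 if $z < -z_{\alpha}$
• Upper-tail test: Reject H0 if $z > z_{\alpha}$
• Two-tail test: Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$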
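The three decision rules on this slide can be written as a small helper. This is a sketch only: the function name `z_test_decision` and its `tail` argument are illustrative choices, not part of the lecture, and the critical values come from the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_test_decision(z, alpha=0.05, tail="two"):
    """Return True if H0 is rejected for the observed z statistic."""
    std_normal = NormalDist()
    if tail == "lower":
        # Reject H0 if z < -z_alpha
        return z < -std_normal.inv_cdf(1 - alpha)
    if tail == "upper":
        # Reject H0 if z > z_alpha
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        # Reject H0 if z < -z_{alpha/2} or z > z_{alpha/2}
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


print(z_test_decision(2.52, alpha=0.05, tail="upper"))
print(z_test_decision(1.00, alpha=0.05, tail="two"))
```

With α = 0.05 the critical values are about 1.645 for a one-tail test and 1.96 for a two-tail test.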
9
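Hypothesis Testing about the Difference in Two
Sample Means
[Figure: sampling distribution of x̄1 − x̄2, centered at its mean.]
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$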
10
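Sampling Distribution of x̄1 − x̄2
• Expected Value
$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
• Standard Deviation (Standard Error)
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
where: σ1 = standard deviation of population 1
σ2 = standard deviation of population 2
n1 = sample size from population 1
n2 = sample size from population 2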
11
Interval Estimation of μ1 - μ2: σ1 and σ2 Known
• Interval Estimate
$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2} \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
where: 1 - α is the confidence coefficient
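A minimal sketch of this interval estimate in Python follows. The function name `diff_means_interval` is hypothetical, and the numbers in the usage line are borrowed from the paint drying-time problem in this lecture (x̄1 = 121, x̄2 = 112, σ1 = σ2 = 8, n1 = n2 = 10) purely as an illustration; the lecture itself does not compute this interval.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Interval estimate of mu1 - mu2 when both population sigmas are known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)     # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2                           # point estimate
    return diff - z * se, diff + z * se


low, high = diff_means_interval(121, 112, 8, 8, 10, 10)
print(f"95% interval for mu1 - mu2: {low:.2f} to {high:.2f}")
```

An interval that excludes 0 is consistent with rejecting H0: μ1 = μ2 in a two-tail test at the same α.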
12
Problem (σ1 and σ2 Known)
• A product developer is interested in reducing the drying time of a primer paint.
• Two formulations of the paint are tested; formulation 1 is the standard chemistry, and
formulation 2 has a new drying ingredient that should reduce the drying time.
• From experience, it is known that the standard deviation of drying time is 8 minutes, and this
inherent variability should be unaffected by the addition of the new ingredient.
• Ten specimens are painted with formulation 1, and another 10 specimens are painted with
formulation 2; the 20 specimens are painted in random order.
• The two sample average drying times are x̄1 = 121 minutes and x̄2 = 112 minutes,
respectively.
• What conclusions can the product developer draw about the effectiveness of the new
ingredient, using alpha = 0.05?
Source: Applied Statistics and Probability for Engineers by Douglas C. Montgomery and George C. Runger, John Wiley, 3rd Ed., 2003
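The problem can be worked through in Python as follows. This is a sketch under stated assumptions: the hypotheses are taken to be H0: μ1 − μ2 = 0 against H1: μ1 > μ2 (an upper-tail test, since the new ingredient should reduce drying time), and both populations share the known σ = 8. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the problem statement
xbar1, xbar2 = 121.0, 112.0   # sample mean drying times (minutes)
sigma = 8.0                   # known standard deviation for both formulations
n1 = n2 = 10
alpha = 0.05

std_normal = NormalDist()
se = sigma * sqrt(1 / n1 + 1 / n2)       # standard error of xbar1 - xbar2
z = ((xbar1 - xbar2) - 0) / se           # test statistic with D0 = 0
z_crit = std_normal.inv_cdf(1 - alpha)   # upper-tail critical value
p_value = 1 - std_normal.cdf(z)          # upper-tail p-value
reject = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Since z exceeds the critical value, H0 is rejected at α = 0.05, which supports the claim that the new ingredient reduces the mean drying time.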
13
14
15
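Test statistic:
$z = \dfrac{(121 - 112) - 0}{\sqrt{8^2 \left( \dfrac{1}{10} + \dfrac{1}{10} \right)}} = 2.52$
Critical value: 1.645 (upper-tail test, α = .05)
[Figure: standard normal curve with the rejection region of area .05 to the right of 1.645; the test statistic 2.52 falls in the rejection region.]
Decision:
Reject H0 at α = 0.05
Conclusion:
There is evidence of a difference in
means: the new ingredient reduces the mean drying time.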
16
17
18
σ1² and σ2² Unknown, Assumed Equal
Population means, independent samples
Assumptions:
• Samples are randomly and independently drawn
• Populations are normally distributed
• Population variances are unknown but assumed equal
19
σ1² and σ2² Unknown, Assumed Equal
• The population variances are assumed equal, so use the two sample
standard deviations and pool them to estimate σ
• Use a t value with (n1 + n2 – 2) degrees of freedom
20
Test Statistic, σ1² and σ2² Unknown, Equal
The test statistic for μ1 – μ2 is:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$
where t has (n1 + n2 – 2) d.f., and
$s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
21
Decision Rules
22
Decision Rules
23
σ1² and σ2² Unknown, Assumed Equal
• Two catalysts are being analyzed to determine how they affect the mean
yield of a chemical process.
• Specifically, catalyst 1 is currently in use, but catalyst 2 is acceptable.
• Since catalyst 2 is cheaper, it should be adopted, provided it does not change
the process yield.
• A test is run in the pilot plant and results in the data shown in the table.
• Is there any difference between the mean yields?
• Use α = 0.05, and assume equal variances.

Observation Number    Catalyst 1    Catalyst 2
1                     91.50         89.19
2                     94.18         90.95
3                     92.18         90.46
4                     95.39         93.21
5                     91.79         97.19
6                     89.07         97.04
7                     94.72         91.07
8                     89.21         92.75

x̄1 = 92.255, s1 = 2.39
x̄2 = 92.733, s2 = 2.98
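The pooled-variance t test for this problem can be sketched in Python using only the standard library. The assumptions are a two-tail test of H0: μ1 = μ2 against H1: μ1 ≠ μ2 and a critical value of 2.145 read from a t table for 14 degrees of freedom and α/2 = 0.025 (the standard library has no t distribution). Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Yield data from the table in the problem statement
catalyst1 = [91.50, 94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21]
catalyst2 = [89.19, 90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75]

n1, n2 = len(catalyst1), len(catalyst2)
s1_sq = variance(catalyst1)   # sample variance, n - 1 in the denominator
s2_sq = variance(catalyst2)

# Pooled estimate of the common variance
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Test statistic under H0: mu1 - mu2 = 0
t = (mean(catalyst1) - mean(catalyst2)) / sqrt(sp_sq / n1 + sp_sq / n2)
df = n1 + n2 - 2

T_CRIT = 2.145                # t table value for df = 14, alpha/2 = 0.025 (assumed lookup)
reject = abs(t) > T_CRIT

print(f"sp = {sqrt(sp_sq):.2f}, t = {t:.2f}, df = {df}")
print("Reject H0" if reject else "Do not reject H0")
```

Here |t| is far below the critical value, so H0 is not rejected: the data give no strong evidence that catalyst 2 changes the mean yield. If SciPy is available, `scipy.stats.ttest_ind(catalyst1, catalyst2, equal_var=True)` should give the same statistic together with a p-value.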
24
25
26
27
28
Thank You
29
