Inferential Statistics Final

The document provides an overview of inferential statistics, which allows conclusions about a population based on sample analysis. It covers key concepts such as confidence intervals, hypothesis testing, various statistical tests (t-test, f-test, z-test, ANOVA, and Chi-Square test), and the errors associated with hypothesis testing. Each statistical test is explained with its assumptions, null and alternate hypotheses, and examples of application.

Uploaded by

Ananya Jain

Inferential Statistics

Copyright Intellipaat. All rights reserved.


Agenda

01 What is Inferential Statistics?

02 Confidence Interval

03 Hypothesis Testing

04 Errors in Hypothesis Testing

05 t-Test

06 f-test

07 z-test

08 Chi-Square Test & ANOVA


What is Inferential Statistics?

While descriptive statistics describes the data, inferential statistics is used to draw conclusions about a population based on statistical findings from the analysis of a sample.


Confidence Interval

A confidence interval is a range of values, computed from a sample, that is expected to contain the population parameter at a stated confidence level, e.g. 95% or 99%.

For example: if the point estimate from the sample is 10.0 and the 95% confidence interval is 9.5 to 10.5, we can say with 95% confidence that the true population value lies in that interval. More precisely, intervals built this way from repeated samples would contain the true value about 95% of the time.


We have taken a random sample of size 100 from a normal distribution and calculated the lower and upper confidence limits at a 95% confidence level.
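The calculation for a sample of size 100 could look roughly like the sketch below. The data are hypothetical (drawn from a normal distribution with mean 50 and standard deviation 10), the helper name mean_confidence_interval is an assumption, and a normal approximation is used for the critical value, which is reasonable for a sample this large.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical data: 100 draws from a normal distribution (mean 50, sd 10).
sample = [random.gauss(50, 10) for _ in range(100)]


def mean_confidence_interval(data, confidence=0.95):
    """Return (lower, upper) limits for the mean using a normal approximation."""
    n = len(data)
    centre = mean(data)
    standard_error = stdev(data) / n ** 0.5
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * standard_error, centre + z * standard_error


lower, upper = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: ({lower:.2f}, {upper:.2f})")
```

For small samples, the critical value would normally come from the t distribution instead of the normal distribution.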


According to our analysis, we can be 95% confident that the population mean lies in the computed interval.


Hypothesis Testing

Hypothesis testing is an analysis in which the plausibility of an assumption about a population parameter is tested on a sample, and statistical evidence is used to decide whether the data support that assumption.


Steps involved in Hypothesis Testing

01 Formulate two hypotheses (null and alternate) for analysis

02 Draw samples from the population for analysis

03 Perform the appropriate statistical test

04 Reject or fail to reject the null hypothesis based on the evidence


Hypothesis Testing

Null Hypothesis: states that there is no effect on the population mean.

Alternate Hypothesis: states that there is an effect on the population mean.


Errors in Hypothesis Testing

Type 1 Error: a false positive, where we have rejected the null hypothesis although it is actually true.

Type 2 Error: a false negative, where we have not rejected the null hypothesis although it is actually false.
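Both kinds of error can be observed in a small simulation. The sketch below is only an illustration under assumed values: a hypothetical population with mean 100 and known standard deviation 15, samples of 30, a 0.05 significance level, and a simple two-sided z-test written for this example (the helper names are not from the slides).

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
POP_MEAN, POP_SD, N = 100.0, 15.0, 30  # hypothetical population and sample size


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, trials=2000):
    """Share of trials in which H0 (mean == POP_MEAN) is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = [random.gauss(true_mean, POP_SD) for _ in range(N)]
        if z_test_p_value(sample, POP_MEAN, POP_SD) < ALPHA:
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type 1 error.
type_1_rate = rejection_rate(true_mean=POP_MEAN)
# H0 is false here (the true mean is 105), so every non-rejection is a Type 2 error.
type_2_rate = 1 - rejection_rate(true_mean=POP_MEAN + 5)

print(f"Type 1 error rate: {type_1_rate:.3f}")
print(f"Type 2 error rate: {type_2_rate:.3f}")
```

The Type 1 error rate should land close to the chosen significance level, while the Type 2 error rate depends on how far the true mean is from the hypothesised one and on the sample size.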


T-Test

The t-test is a parametric test that compares means: either a sample mean against a reference value, or the means of two samples. It is typically used when the population variance is unknown and is most useful for small samples (fewer than 30 values), although it remains valid for larger ones. A few assumptions must hold before we can conduct a t-test.

Assumptions

1. The samples are independent.

2. The sample variances are homogeneous (roughly equal).

3. The data is assumed to be normally distributed.


Types of t-test

One-sample: If we are comparing the sample against a standard value.

Two-sample: If both the samples are taken from two different populations.

Paired: If the samples are taken from the same population, for example the same subjects measured twice.


One-Tailed vs Two-Tailed T-Test

One-tailed: If we want to check whether one population mean is greater than (or smaller than) the other, we will use a one-tailed test.

Two-tailed: If we want to check whether the population means differ significantly in either direction, we will use a two-tailed test.


One Sample t-test

The average height of Indian adult males is 165 cm.

Null hypothesis: The average height is 165 cm.

Alternate Hypothesis: The average height is not 165 cm.


We will use Python to perform a one-sample t-test on a random sample of 30 adult Indian males, with each height recorded in cm.
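
A minimal sketch of such a test is shown here, assuming SciPy is available. The 30 heights are simulated, hypothetical values rather than the data behind the slides, so the printed p-value need not match the conclusion reported for the original sample.

```python
import random
from statistics import mean, stdev

from scipy import stats

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical heights (cm) of 30 adult males; not the original data.
heights = [random.gauss(168, 6) for _ in range(30)]

# H0: the population mean height is 165 cm (two-sided test).
t_stat, p_value = stats.ttest_1samp(heights, popmean=165)

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")
```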

Since the p-value is less than 0.05, we can reject the null hypothesis.


Two Sample t-test

We have to check whether the mean height of adult males in the two schools is the same or not.

Null hypothesis: The means are equal.

Alternate Hypothesis: The means are not equal.


We will check the variance of each group. If the variances are equal, we perform a standard two-sample t-test; otherwise we conduct Welch's t-test, which does not assume equal population variances.
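
One way to sketch this decision in Python, assuming SciPy is available, is shown below. The two samples are hypothetical, and the rule used to decide whether the variances are "equal" (ratio of larger to smaller sample variance below 4) is a common rule of thumb adopted here as an assumption, not something stated in the slides.

```python
import random
from statistics import variance

from scipy import stats

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical heights (cm) from two schools; not the original data.
school_a = [random.gauss(170, 7) for _ in range(30)]
school_b = [random.gauss(171, 7) for _ in range(30)]

# Assumed rule of thumb: treat the variances as equal when the ratio of the
# larger to the smaller sample variance is below 4.
var_a, var_b = variance(school_a), variance(school_b)
equal_var = max(var_a, var_b) / min(var_a, var_b) < 4

# equal_var=True runs the standard (pooled) t-test, False runs Welch's t-test.
t_stat, p_value = stats.ttest_ind(school_a, school_b, equal_var=equal_var)

test_name = "pooled two-sample t-test" if equal_var else "Welch's t-test"
print(f"{test_name}: t = {t_stat:.3f}, p = {p_value:.4f}")
```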

We have insufficient evidence to reject the null hypothesis.


Two Sample t-test

We have to check whether the mean heights of males and females in the school are the same.

Null hypothesis: The means are equal.

Alternate Hypothesis: The means are not equal.


Paired t-test

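We will use the paired sample t-test for the groups because the samples come from the same population. Note that a paired test requires each observation in one group to be matched with an observation in the other.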
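
A paired test can be sketched as follows, assuming SciPy is available. The two lists are hypothetical and are assumed to be matched, so that position i in both lists belongs to the same pair; the variable names are illustrative only.

```python
import random

from scipy import stats

random.seed(3)  # fixed seed so the sketch is reproducible

# Hypothetical matched measurements (cm): element i of each list forms one pair.
first = [random.gauss(165, 6) for _ in range(25)]
second = [value + random.gauss(2, 1.5) for value in first]

# H0: the mean of the pairwise differences is zero.
t_stat, p_value = stats.ttest_rel(first, second)

print(f"t = {t_stat:.3f}, p = {p_value:.4g}")
```

A paired t-test is equivalent to a one-sample t-test on the pairwise differences, which is why the pairing of observations matters.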

We have sufficient evidence to reject the null hypothesis.


F-Test

The F-test is a statistical test used to compare the variances of two populations. Several assumptions are made about the data before we can begin the F-test.

Assumptions

1. The data is normally distributed.

2. The data is independent.


We have to check whether the variances of the two populations from which the groups are taken are equal or not.

Null hypothesis: The variances are equal.

Alternate Hypothesis: The variances are not equal.


We will calculate the variances of the two samples and compute the F-statistic and p-value to see whether there is statistical evidence to reject the null hypothesis.
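
The computation could be sketched as below, assuming SciPy is available for the F distribution. The helper f_test and the two groups are hypothetical and written for this example; the p-value is two-sided, obtained by doubling the smaller tail probability.

```python
import random
from statistics import variance

from scipy import stats

random.seed(5)  # fixed seed so the sketch is reproducible

# Hypothetical samples from two populations; not the original data.
group_1 = [random.gauss(50, 5) for _ in range(25)]
group_2 = [random.gauss(50, 5) for _ in range(30)]


def f_test(x, y):
    """Two-sided F-test for equality of two variances."""
    var_x, var_y = variance(x), variance(y)  # sample variances (n - 1 denominator)
    f_stat = var_x / var_y
    dfn, dfd = len(x) - 1, len(y) - 1
    # Double the smaller tail probability to get a two-sided p-value.
    tail = min(stats.f.cdf(f_stat, dfn, dfd), stats.f.sf(f_stat, dfn, dfd))
    return f_stat, min(1.0, 2 * float(tail))


f_stat, p_value = f_test(group_1, group_2)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```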

There is not enough evidence to reject the null hypothesis.


ANOVA

ANOVA, or Analysis of Variance, is a statistical test that compares the means of two or more groups to determine whether they differ significantly from each other.

Assumptions

1. The samples are independent.

2. All populations have a common variance.

3. The samples are drawn from normally distributed populations.


One-Way ANOVA

We have to check whether the effect of 4 different performance enhancers on an electric vehicle is the same or not.

Null hypothesis: The performance averages are equal.

Alternate Hypothesis: The performance averages are not equal.


We have taken 4 random samples of performance values. We will calculate the test statistic and p-value to reject or fail to reject the null hypothesis.
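
A one-way ANOVA on four groups might be run as in the sketch below, assuming SciPy is available. The enhancer labels and performance scores are hypothetical, so the printed result is only illustrative.

```python
import random

from scipy import stats

random.seed(11)  # fixed seed so the sketch is reproducible

# Hypothetical performance scores for four enhancers, 15 test runs each.
assumed_means = {"A": 80, "B": 82, "C": 79, "D": 88}
enhancers = {
    name: [random.gauss(mu, 4) for _ in range(15)]
    for name, mu in assumed_means.items()
}

# H0: all four group means are equal.
f_stat, p_value = stats.f_oneway(*enhancers.values())

print(f"F = {f_stat:.3f}, p = {p_value:.4g}")
if p_value < 0.05:
    print("Reject the null hypothesis: at least one mean differs.")
else:
    print("Fail to reject the null hypothesis.")
```

A significant result only says that at least one group mean differs; it does not say which one.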

The p-value is less than 0.05, so we can reject the null hypothesis.


Two-Way ANOVA

Two-way ANOVA checks how two factors affect the response variable.

Null hypothesis: The two factors have no significant effect on the response variable.

Alternate Hypothesis: The two factors have a significant effect on the response variable.


There is not enough evidence to reject the null hypothesis.


Z-Test

The z-test is a statistical test for comparing population means when the population variances are known and the sample sizes are considerably larger than those used for a t-test.

Assumptions

1. The population standard deviation (and hence the variance) is known.

2. The population should be at least 10 times as large as the sample.

3. Samples are drawn at random from the population.


One Sample z-test for Means

The average weight of high-schoolers pre-pandemic was 55 kg with a standard deviation of 8 kg. Has it changed post-pandemic?

Null hypothesis: The average weight is the same (55 kg).

Alternate Hypothesis: The average weight is not the same.


We will use a one-sample z-test for this problem: we take the weights of 50 randomly chosen high-schoolers and perform the z-test using Python.
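
The test can be sketched with the standard library alone, since the population standard deviation is given. The helper one_sample_z_test and the 50 simulated weights are assumptions made for this illustration, not the original data.

```python
import random
from statistics import NormalDist, mean

random.seed(21)  # fixed seed so the sketch is reproducible

POP_MEAN, POP_SD = 55.0, 8.0  # values from the problem statement

# Hypothetical post-pandemic weights (kg) of 50 high-schoolers.
weights = [random.gauss(56, 8) for _ in range(50)]


def one_sample_z_test(sample, mu0, sigma):
    """Return (z, two-sided p-value) for H0: population mean == mu0."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z_stat, p_value = one_sample_z_test(weights, POP_MEAN, POP_SD)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```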

There is not enough evidence to reject the null hypothesis.


Two Sample z-test for Means

Is the average height of high-schoolers going to school A and school B the same post-pandemic, given that the standard deviations of the populations are known?

Null hypothesis: The mean difference is zero.

Alternate Hypothesis: The mean difference is not zero.


We will take one sample of 50 individuals from each population and then perform a two-sample z-test using Python.
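
A two-sample z-test with known population standard deviations could be sketched as follows. The helper two_sample_z_test, the simulated heights, and the standard deviation of 7 cm assumed for both schools are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(8)  # fixed seed so the sketch is reproducible

SD_A, SD_B = 7.0, 7.0  # assumed known population standard deviations (cm)

# Hypothetical heights (cm) of 50 students from each school.
school_a = [random.gauss(160, SD_A) for _ in range(50)]
school_b = [random.gauss(161, SD_B) for _ in range(50)]


def two_sample_z_test(x, y, sigma_x, sigma_y):
    """Return (z, two-sided p-value) for H0: the mean difference is zero."""
    standard_error = (sigma_x ** 2 / len(x) + sigma_y ** 2 / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z_stat, p_value = two_sample_z_test(school_a, school_b, SD_A, SD_B)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```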

There is not enough evidence to reject the null hypothesis.


One Sample z-test for Proportion

A purchase case study observed that 35% of women spend more than 10000. Is this true for the population in our analysis?

Null hypothesis: The proportion is the same (35%).

Alternate Hypothesis: The proportion is not the same.


We will perform a one-sample z-test for proportion and use the test statistic to reject or fail to reject the null hypothesis. Since the p-value is less than 0.05, we can reject the null hypothesis.
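A one-sample z-test for a proportion can be sketched as below. The helper one_proportion_z_test and the counts (420 of 1000 women spending more than 10000) are hypothetical; only the reference proportion of 35% comes from the problem statement.

```python
from statistics import NormalDist

P0 = 0.35  # reference proportion from the case study

# Hypothetical counts: 420 of 1000 sampled women spent more than 10000.
successes, n = 420, 1000


def one_proportion_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for H0: population proportion == p0."""
    p_hat = successes / n
    # The standard error uses p0 because it is computed under the null hypothesis.
    standard_error = (p0 * (1 - p0) / n) ** 0.5
    z = (p_hat - p0) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z_stat, p_value = one_proportion_z_test(successes, n, P0)
print(f"z = {z_stat:.3f}, p = {p_value:.4g}")
```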


Two Sample z-test for Proportion

Is the percentage of men who have spent more than 10000 the same for the age groups 18-25 and 26-35?

Null hypothesis: The proportion is the same.

Alternate Hypothesis: The proportion is not the same.


We will perform a two-sample z-test for proportion and use the test statistic to reject or fail to reject the null hypothesis. There is not sufficient evidence to reject the null hypothesis.
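The two-sample version can be sketched with a pooled proportion, as below. The helper two_proportion_z_test and the counts for the two age groups are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical counts of men who spent more than 10000 in each age group.
spent_18_25, total_18_25 = 180, 500
spent_26_35, total_26_35 = 190, 500


def two_proportion_z_test(s1, n1, s2, n2):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p1, p2 = s1 / n1, s2 / n2
    # Under H0 both groups share one proportion, estimated by pooling the counts.
    pooled = (s1 + s2) / (n1 + n2)
    standard_error = (pooled * (1 - pooled) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z_stat, p_value = two_proportion_z_test(
    spent_18_25, total_18_25, spent_26_35, total_26_35
)
print(f"z = {z_stat:.3f}, p = {p_value:.4f}")
```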


Chi-Square Test

The Chi-Square test is a test for categorical data that can be used to check goodness of fit or to test independence.

Assumptions

1. The features are categorical in nature.

2. The samples are taken at random.

3. A minimum of five observations is expected in each group.


Chi-Square Test of Independence

Is Purchase independent of Product_Category_1?

Null hypothesis: Purchase and product_category_1 are not related


Alternate Hypothesis: Purchase and product_category_1 are related


We will perform the chi-square test of independence and validate our assumptions based on statistical evidence. The p-value is less than 0.05, so we can reject the null hypothesis.
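A test of independence on a contingency table could be sketched as follows, assuming SciPy is available. The table is hypothetical: it assumes the purchase amount has been binned into two bands (low, high) and cross-tabulated against three product categories, which is not necessarily how the original analysis was set up.

```python
from scipy import stats

# Hypothetical contingency table of counts.
# Rows: purchase band (low, high). Columns: three product categories.
observed = [
    [120, 90, 40],
    [80, 110, 60],
]

# H0: purchase band and product category are independent.
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print(f"chi-square = {chi2:.3f}, dof = {dof}, p = {p_value:.4g}")
# The test assumes every expected count is at least 5.
print("Smallest expected count:", expected.min())
```

With a real dataset, the table of counts would first be built from the two categorical columns before being passed to the test.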

