
LECTURE 7: INFERENTIAL STATISTICS

Inferential Statistics

Inferential statistics are often used to compare the differences between the
treatment groups. Inferential statistics use measurements from the sample
of subjects in the experiment to compare the treatment groups and make
generalizations about the larger population of subjects.

There are many types of inferential statistics, and each is appropriate for a
specific research design and set of sample characteristics. Researchers should
consult the numerous texts on experimental design and statistics to find the
right statistical test for their experiment. Most inferential statistics,
however, are based on the same principle: a test-statistic value is calculated
from a particular formula. That value, together with the degrees of freedom (a
measure related to the sample size) and the rejection criterion, is used to
determine whether differences exist between the treatment groups. The larger
the sample size, the more likely a statistic is to detect differences between
the treatment groups; thus, the larger the sample of subjects, the more
powerful the statistic is said to be.
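To make this concrete, here is a minimal sketch of one common inferential test, a pooled two-sample t-test, applied to two hypothetical treatment groups (the values are invented for illustration only):

```python
import numpy as np
from scipy import stats

# Invented measurements from two hypothetical treatment groups
group_a = np.array([12.1, 10.4, 11.8, 13.0, 12.5, 11.2])
group_b = np.array([9.8, 10.1, 9.5, 10.9, 9.2, 10.3])

n1, n2 = len(group_a), len(group_b)

# Pooled variance across the two groups
sp2 = ((n1 - 1) * group_a.var(ddof=1) + (n2 - 1) * group_b.var(ddof=1)) / (n1 + n2 - 2)

# Test statistic, degrees of freedom, and two-sided p-value
t_stat = (group_a.mean() - group_b.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.4f}")
# A small p-value (commonly p < 0.05) is taken as evidence of a group difference.
```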

Virtually all inferential statistics have an important underlying assumption:
each replication in a condition is assumed to be independent. That is, each
value in a condition is thought to be unrelated to any other value in the
sample. This assumption of independence can create a number of challenges for
animal behavior researchers.

Using descriptive statistics, you can report characteristics of your data:

• The distribution concerns the frequency of each value.
• The central tendency concerns the averages of the values.
• The variability concerns how spread out the values are.

In descriptive statistics, there is no uncertainty – the statistics precisely
describe the data that you collected. If you collect data from an entire
population, you can directly compare these descriptive statistics to those
from other populations.

Example: Descriptive statistics


You collect data on the SAT scores of all 11th graders in a school for three
years.
You can use descriptive statistics to get a quick overview of the school’s
scores in those years. You can then directly compare the mean SAT score
with the mean scores of other schools.
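As a rough sketch of what such a descriptive summary might look like in code (the scores below are invented, not real SAT data):

```python
import numpy as np

# Hypothetical SAT scores for one year of 11th graders
scores = np.array([1050, 1180, 990, 1230, 1100, 1010, 1320, 1140, 1080, 1210])

print("mean:", scores.mean())                      # central tendency
print("median:", np.median(scores))                # central tendency
print("standard deviation:", scores.std(ddof=1))   # variability
print("range:", scores.max() - scores.min())       # variability

# The frequency of each value (the distribution) can be summarized with a histogram
counts, bin_edges = np.histogram(scores, bins=5)
print("histogram counts:", counts)
```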

1. Central Limit Theorem (CLT):

Imagine repeatedly taking small samples from a large population (like coins
from a jar). The CLT states that regardless of the population's original
distribution, as the sample size increases, the distribution of the sample
means will tend towards a normal distribution, also known as a bell curve.
This does not mean the individual values in each sample follow a normal
distribution, but the distribution of the sample means does.

Think of it like flipping a coin multiple times. Even though each flip has a
50/50 chance of heads or tails, the average of many flips will likely be closer
to 50% heads than any extreme (all heads or all tails).

The CLT is crucial because it lets us:

● Estimate population means more accurately with larger samples.
● Use statistical tests based on the normal distribution even if the
population's distribution is unknown.
● Create confidence intervals for population parameters more reliably.
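A small simulation sketch of this idea (the numbers are made up; the population here is an intentionally skewed exponential distribution):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10.0, size=100_000)  # clearly non-normal population

sample_size = 50
n_samples = 2_000

# Take many independent samples and record each sample's mean
sample_means = np.array([
    rng.choice(population, size=sample_size, replace=False).mean()
    for _ in range(n_samples)
])

print("population mean:", round(population.mean(), 2))
print("mean of sample means:", round(sample_means.mean(), 2))
# The spread of the sample means shrinks roughly like sigma / sqrt(n)
print("sd of sample means:", round(sample_means.std(ddof=1), 2))
print("sigma / sqrt(n):", round(population.std() / np.sqrt(sample_size), 2))
```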

2. Point Estimation:

This involves using a single statistic to estimate an unknown population
parameter, like the mean or proportion. For example, calculating the average
height of a small group of people would be a point estimate for the average
height of the entire population.

Several methods exist for point estimation, depending on the data and
parameter:

● Sample mean: estimates the population mean.
● Sample proportion: estimates the population proportion (e.g., percentage) of
a certain characteristic.
● Sample median: estimates the population median (the middle value when data
is ordered).

While point estimates give us a best guess, they don't account for
uncertainty.
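A minimal sketch of these point estimates in code (the heights are invented for illustration):

```python
import numpy as np

heights_cm = np.array([172, 168, 181, 175, 169, 177, 171, 174])

sample_mean = heights_cm.mean()        # point estimate of the population mean
sample_median = np.median(heights_cm)  # point estimate of the population median

# Sample proportion, e.g. the share of people taller than 175 cm
sample_proportion = (heights_cm > 175).mean()

print(sample_mean, sample_median, sample_proportion)
```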
3. Interval Estimation:

This addresses the limitations of point estimates by providing a range of
values within which the true population parameter is likely to fall, with a
certain level of confidence. This range is called a confidence interval.

For example, a 95% confidence interval for the average height of the
population might be 170-175 cm. This means we are 95% confident that the
true average height lies somewhere between these two values.

Constructing confidence intervals involves calculations using the point
estimate, sample size, and standard error. They are valuable because they:

● Acknowledge the inherent uncertainty in sampling.
● Provide a range of plausible values for the population parameter.
● Allow us to assess the precision of our estimation.

Understanding these three concepts together builds a strong foundation for
statistical inference and drawing reliable conclusions from data.

Calculating Central Limit Theorem, Point Estimation, and Interval Estimation

Here are the formulas involved in the calculations for Central Limit
Theorem, Point Estimation, and Interval Estimation:

Central Limit Theorem:

● No specific formula: the CLT is a theoretical result that describes the
behavior of sample means as the sample size increases. It is usually
demonstrated through simulations and observations. In practice, the sampling
distribution of the mean is approximately normal with mean μ and standard
deviation σ / √n (the standard error introduced below).

Point Estimation:

● Sample mean:
x̄ = Σxi / n

where:

● x̄ is the sample mean
● Σxi is the sum of all values in the sample
● n is the sample size

Interval Estimation:

● Standard error of the mean:

SE = σ / √n

where:

● SE is the standard error of the mean
● σ is the population standard deviation (estimated from the sample standard
deviation when unknown)
● n is the sample size

● Margin of error:

ME = z* * SE

where:

● ME is the margin of error
● z* is the critical value (obtained from a standard normal distribution table
based on the desired confidence level)

● Confidence interval:

(x̄ - ME, x̄ + ME)
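Putting the three formulas together, a minimal sketch in code (the function name and arguments are illustrative, not a standard library API):

```python
import math

def confidence_interval(x_bar, sigma, n, z_star=1.96):
    """z-based confidence interval: SE = sigma / sqrt(n), ME = z* * SE."""
    se = sigma / math.sqrt(n)   # standard error of the mean
    me = z_star * se            # margin of error
    return x_bar - me, x_bar + me

# For example, with x̄ = 21.6, σ ≈ 3, n = 10 (the data in the next example):
print(confidence_interval(21.6, 3, 10))  # ≈ (19.74, 23.46)
```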

Example using meteorological data (daily temperatures, in °C, recorded on 10 days in March):

Sample mean:

x̄ = (21 + 18 + 25 + 22 + 17 + 27 + 20 + 19 + 23 + 24) / 10 = 216 / 10 = 21.6°C

Standard error (assuming sample standard deviation σ = 3°C):

SE = 3 / √10 ≈ 0.95°C

● Margin of error (for a 95% confidence interval, z* = 1.96):

ME = 1.96 * 0.95 ≈ 1.86°C

95% confidence interval:

(21.6 - 1.86, 21.6 + 1.86) = (19.74°C, 23.46°C)

Therefore, we are 95% confident that the true average daily temperature in
March lies between 19.74°C and 23.46°C.
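As a quick check, the same numbers can be recomputed directly from the ten readings (a sketch; the sample standard deviation is about 3.2°C, rounded to 3°C above):

```python
import numpy as np

temps = np.array([21, 18, 25, 22, 17, 27, 20, 19, 23, 24])

x_bar = temps.mean()              # 21.6
s = temps.std(ddof=1)             # ≈ 3.2, taken as ≈ 3 in the text above
se = 3 / np.sqrt(len(temps))      # ≈ 0.95
me = 1.96 * se                    # ≈ 1.86
print(x_bar, round(x_bar - me, 2), round(x_bar + me, 2))  # 21.6 19.74 23.46
```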
Rainfall Data (mm)

Month   Rainfall
Jan     50
Feb     30
Mar     70
Apr     40
May     90
Jun     120
Jul     150
Aug     100
Sep     80
Oct     60
Nov     40
Dec     55

Point Estimation:

Average monthly rainfall: (50 + 30 + 70 + 40 + 90 + 120 + 150 + 100 + 80 + 60
+ 40 + 55) / 12 = 885 / 12 ≈ 73.75 mm

Interval Estimation:

Standard deviation: Assume the standard deviation calculated from the data is
35 mm.

Standard error: SE = 35 / √12 ≈ 10.1 mm

90% confidence interval:

● Critical value (z*) for a 90% confidence level: 1.645
● Margin of error (ME): z* * SE = 1.645 * 10.1 mm ≈ 16.6 mm

Lower bound: 73.75 mm - 16.6 mm ≈ 57.1 mm
Upper bound: 73.75 mm + 16.6 mm ≈ 90.4 mm

Interpretation: We are 90% confident that the true average monthly rainfall
for this city lies between approximately 57.1 mm and 90.4 mm.
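A quick check of this interval in code (a sketch, assuming the 35 mm standard deviation stated above):

```python
import numpy as np

rainfall = np.array([50, 30, 70, 40, 90, 120, 150, 100, 80, 60, 40, 55])

x_bar = rainfall.mean()            # ≈ 73.75 mm
se = 35 / np.sqrt(len(rainfall))   # ≈ 10.1 mm
me = 1.645 * se                    # ≈ 16.6 mm
print(round(x_bar, 2), round(x_bar - me, 1), round(x_bar + me, 1))  # 73.75 57.1 90.4
```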

Note: This is just one example. The actual calculations and results will vary
depending on the specific data and chosen confidence level.


Analysis of Variance (ANOVA)

ANOVA, or Analysis of Variance, is a statistical technique used to analyze the
differences among group means in a sample. It is commonly employed when there
are more than two groups in a study and allows researchers to determine
whether there are statistically significant differences among these groups.
ANOVA involves partitioning the total variability in a dataset into different
sources of variation, such as variation within groups and variation between
groups. The goal is to assess whether the variation between groups is
greater than the variation within groups, indicating that there are significant
differences in the means of the groups being compared.
There are different types of ANOVA depending on the design of the study:
1. One-Way ANOVA: Used when there is one independent variable with
more than two levels or groups. It compares the means of the groups
to determine if there are significant differences.
2. Two-Way ANOVA: Involves two independent variables. It examines the
influence of each variable individually as well as their interaction.
3. Repeated Measures ANOVA: Used when the same subjects are used
for each treatment (repeated measures). It is common in experimental
designs where participants are measured under different conditions.
ANOVA is widely applied in various fields, including experimental
psychology, biology, economics, and many others. It helps researchers
understand whether the observed differences among group means are likely
to be due to actual differences in the population or if they could have
occurred by chance. The results of ANOVA are often expressed in terms of a
p-value, which gives the probability of observing differences at least as
large as those in the sample if the group means were in fact equal.
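As a minimal sketch, here is a one-way ANOVA on three invented treatment groups using scipy's f_oneway:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from three treatment groups
group1 = np.array([5.1, 4.8, 5.6, 5.0, 4.9])
group2 = np.array([6.2, 5.9, 6.5, 6.1, 6.3])
group3 = np.array([5.4, 5.2, 5.8, 5.5, 5.3])

# One-way ANOVA: F statistic and p-value
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value (e.g. < 0.05) suggests at least one group mean differs from the others.
```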
In summary, ANOVA is a statistical method used to compare means across
multiple groups, providing insights into the sources of variability in the data
and helping researchers make inferences about population differences.
