Lecture 9 Statistical Significance, Effect Size, and Confidence Intervals
The document discusses statistical methods used by researchers to assess the significance and effect size of sample data in relation to a larger population. It explains the concepts of statistical significance, effect size, and confidence intervals, emphasizing their importance in making meaningful inferences. Additionally, it highlights the role of probability, sampling variability, and degrees of freedom in statistical analysis.
Statistical Significance, Effect
Size, and Confidence Intervals
Introduction
◼ Researchers use statistical methods to determine how meaningful sample data are when making inferences about a larger population.
▪ For example, they assess whether observed differences between sample groups (such as happiness levels between adults from Alaska and New York) are large enough to conclude that the populations differ.
◼ Similarly, they examine whether relationships (such as between education and income) in a sample can be generalized to the larger population.
◼ Statistical techniques help researchers decide how accurately sample statistics reflect the characteristics of the broader population.
◼ Three common tools researchers use to reach such conclusions are testing for statistical significance and calculating effect sizes and confidence intervals.
◼ The focus of this chapter is to describe what these concepts mean and how to interpret them, and to provide general information about how statistical significance and effect sizes are determined.
◼ Statistics are often divided into two types: descriptive statistics and inferential statistics.
◼ Descriptive Statistics:
▪ Describe characteristics of a specific data set.
▪ Example: Collecting weight data for a group of 30 adults (mean, range, standard deviation).
▪ Provide information only about the sample (the group of 30 adults).
◼ Inferential Statistics:
▪ Used to make inferences about a larger population based on sample data.
▪ Researchers often want to generalize sample results to a larger population.
▪ Example: Using a sample of boys and girls to infer differences in physical aggression levels in the general populations of boys and girls.

Statistical Significance
◼ The first step in understanding statistical significance is to understand the difference between a sample and a population.
▪ A sample is an individual or group from whom or from which data are collected.
▪ A population is the individual or group that the sample is supposed to represent.

Definition of statistical significance
◼ Statistical significance refers to the likelihood, or probability, that a statistic derived from a sample represents some genuine phenomenon in the population from which the sample was selected.

Probability
◼ Probability plays a key role in inferential statistics.
◼ When it comes to deciding whether a result in a study is statistically significant, we must rely on probability to make the determination.
◼ Sampling Variability:
▪ When selecting samples from the same population, the calculated statistics (e.g., the mean) can vary between samples due to random differences.
▪ Example: A sample of 1,000 randomly selected U.S. men might have an average shoe size of 10, while another similar random sample might yield a mean of 9.
◼ Sampling Distribution and Standard Error:
▪ If an infinite number of random samples of the same size are taken, the means of these samples form a sampling distribution.
▪ The standard deviation of this sampling distribution is referred to as the standard error of the mean, which quantifies the variability of the sample statistic relative to the population.
◼ Probability and Random Sampling Error:
▪ Random sampling error refers to the likelihood of obtaining differences in sample statistics purely by chance.
▪ Example: If the population mean shoe size is 9, the chance of getting a random sample with an average shoe size of 10 depends on the sample size, the standard deviation, and the sampling error.
◼ Statistical Analysis Using the t Distribution:
▪ A t value measures the deviation of the sample statistic (e.g., the mean) from the population parameter (e.g., the population mean) in standard-error units.
▪ For large sample sizes (typically n > 120), the t distribution closely approximates the normal distribution.
◼ Defining Statistical Significance:
▪ A result is considered statistically significant if the observed difference (e.g., between sample and population means) is unlikely to have occurred by chance.
▪ Statistical significance depends on the p value; typically, p < .05 is considered significant, though smaller thresholds such as p < .001 increase confidence.
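The t calculation described above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the chapter: the sample standard deviation (1.9) and sample size (1,000) are assumed values for the shoe-size example, and the p value uses the normal approximation, which is reasonable for large n.

```python
import math

def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """t value: deviation of the sample mean from the population mean,
    expressed in standard-error units."""
    standard_error = sample_sd / math.sqrt(n)  # SE of the mean = sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

def two_tailed_p_normal(t):
    """Two-tailed p value from the normal approximation to t (large n)."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))

# Shoe-size example: sample mean 10 vs. population mean 9.
# The sd (1.9) and n (1,000) are assumed for illustration.
t = one_sample_t(10, 9, 1.9, 1000)
p = two_tailed_p_normal(t)
print(f"t = {t:.2f}, significant at p < .001: {p < 0.001}")
```

Note how the large sample drives the result: with n = 1,000 the standard error is tiny, so even a one-unit difference yields a very large t and a p value far below .001.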
◼ P Value:
▪ The p value quantifies the probability of obtaining a sample statistic as extreme as the observed one due to random chance alone.
▪ Smaller p values indicate a lower likelihood that the observed result is due to chance, increasing confidence in the findings.
◼ Interpreting the t Value:
▪ In a t distribution table, for degrees of freedom approaching infinity (n > 120), a t value of 16.67 corresponds to p < .001.
▪ This means the likelihood of obtaining a sample mean of 10, when the population mean is 9, is less than 1 in 1,000 under random chance alone.
◼ Conclusion:
▪ Since p < .001, the result is statistically significant.
▪ The observed difference (sample mean = 10 vs. population mean = 9) is unlikely to have occurred by random sampling error alone.

Effect size
◼ Statistical Significance vs. Effect Size:
▪ Statistical significance has traditionally been emphasized in quantitative research, often with undue focus on the "p < .05" rule.
▪ Recently, researchers and journal editors have highlighted its limitations, particularly its sensitivity to sample size, and are advocating for the inclusion of effect size to assess the importance of findings.
◼ The process of determining statistical significance involves comparing sample statistics to population parameters and is influenced heavily by sample size.
◼ Larger samples lead to smaller standard errors, inflating test statistics (e.g., t, F, or z values) and making even minor differences statistically significant.
◼ Effect size provides an alternative measure by standardizing the difference between means relative to the standard deviation, rather than the standard error.

Definition of effect size
◼ Effect size is a measure of the strength or magnitude of a relationship, difference, or effect observed in a study, independent of the sample size.
◼ Unlike p values, which indicate whether an effect exists, effect size quantifies how large the effect is. It helps researchers understand the practical significance of their findings.

Types of Effect Size
◼ For Differences Between Groups:
▪ Cohen's d: Measures the standardized difference between two means.
◼ For Relationships Between Variables:
▪ Correlation Coefficient (r): Indicates the strength and direction of a relationship between two variables.
◼ For Proportions or Variance Explained:
▪ R² (Coefficient of Determination): Shows the proportion of variance explained in a regression analysis.

Why Is Effect Size Important?
◼ Goes Beyond Statistical Significance: Even if a result is statistically significant (e.g., p < .05), the effect size tells us how meaningful or impactful the result is in practical terms.
◼ Helps Compare Studies: Researchers can compare the effect sizes from different studies to assess which intervention or phenomenon has a stronger impact.

Interpreting Effect Sizes
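Cohen's d, mentioned earlier as the standard effect size for group differences, divides the mean difference by a pooled standard deviation rather than a standard error, so it does not grow just because the sample gets larger. A minimal sketch (the two groups of test scores are hypothetical, chosen only for illustration):

```python
import math
import statistics

def cohens_d(group1, group2):
    """Standardized mean difference: (m1 - m2) / pooled sample sd."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # Pooled variance weights each group's variance by its df (n - 1)
    pooled_var = ((n1 - 1) * statistics.variance(group1)
                  + (n2 - 1) * statistics.variance(group2)) / (n1 + n2 - 2)
    return mean_diff / math.sqrt(pooled_var)

# Hypothetical test scores for two groups
group_a = [82, 75, 90, 68, 77, 85, 79, 88]
group_b = [70, 72, 65, 74, 68, 71, 66, 73]
print(f"Cohen's d = {cohens_d(group_a, group_b):.2f}")
```

Unlike a t value, d stays roughly the same if both groups are made ten times larger; only the precision of the estimate improves.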
◼ By widely used convention, effect sizes (such as Cohen's d) of about 0.20 are considered small, about 0.50 moderate, and 0.80 or larger large, though interpretation always depends on context.

Confidence Intervals
◼ Confidence intervals are becoming increasingly common in reports of inferential statistics, in part because they provide another measure of effect size.
◼ The confidence interval provides a range of values that we are confident, to a certain degree of probability, contains the population parameter.
◼ The most commonly reported intervals are the 95% and 99% confidence intervals; these correspond with p values of .05 and .01, respectively.
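A confidence interval for a mean can be sketched as mean ± critical value × standard error. The sketch below is illustrative: the shoe-size data are hypothetical, and the normal critical value 1.96 is used for simplicity (strictly, a t critical value should be used with a sample this small).

```python
import math
import statistics

def mean_confidence_interval(sample, critical=1.96):
    """Interval: mean ± critical * SE (1.96 gives ~95% coverage
    under the normal approximation)."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error
    return mean - critical * se, mean + critical * se

# Hypothetical sample of shoe sizes
sizes = [9.5, 10.0, 8.5, 9.0, 10.5, 9.5, 8.0, 11.0, 9.0, 10.0]
low, high = mean_confidence_interval(sizes)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

If a hypothesized population mean (say, 9) falls inside the interval, the sample is consistent with that population value at the .05 level; if it falls outside, the difference would be statistically significant.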
◼ A 95% confidence interval means that if we repeated the sampling procedure 100 times, approximately 95 of the resulting intervals would contain the true population parameter.

Degrees of Freedom
◼ Degrees of freedom (df) are the number of
values in a calculation that are free to change while still following certain rules.
◼ For example, if you're calculating an average for a set of numbers and the average is fixed, then once you've chosen all but one of the numbers, the last one can't vary; it has to take whatever value makes the average come out correctly.
◼ Degrees of freedom are important in statistics because they help determine how much information you have and how reliable your results are.
◼ They're used in tests like t tests and chi-square tests to find critical values and to decide whether your results are statistically significant.
▪ Imagine you have 3 numbers, and their total must be 10.
▪ You can freely choose the first number (e.g., 4).
▪ You can also freely choose the second number (e.g., 3).
▪ But the third number can't be just anything: it has to make the total 10 (in this case, 3).
◼ So, in this situation, only 2 numbers are free to change. The third one is fixed by the rule. This means there are 2 degrees of freedom.
◼ For an independent-samples t test (comparing two groups), the formula is:
▪ df = n1 + n2 − 2
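Both ideas above can be sketched in a few lines. This is an illustrative sketch (the function names and the group sizes 20 and 25 are made up for the example):

```python
def forced_last_value(free_choices, required_total):
    """With the total fixed, the final value is determined by the
    earlier, freely chosen values: it cannot vary."""
    return required_total - sum(free_choices)

# Choose 4 and 3 freely; the third value is forced to make the total 10,
# so only 2 of the 3 values were free to vary (2 degrees of freedom).
print(forced_last_value([4, 3], required_total=10))  # prints 3

def independent_t_df(n1, n2):
    """df for an independent-samples t test: one df is 'spent' estimating
    each group's mean, leaving n1 + n2 - 2 values free to vary."""
    return n1 + n2 - 2

print(independent_t_df(20, 25))  # prints 43
```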