Unit 5 Study Guide


‭Unit 5, Inferential Statistics‬

‭Topic 19 (W17D3 KC)‬


● A confidence interval communicates how accurate a point estimate is likely to be, based on the standard deviation
● A confidence interval starts with the point estimate and then adds or subtracts a margin of error
● $\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
○ When the population standard deviation is unknown, the sample standard deviation can be used as a close estimate, where the standard error is: $\frac{s}{\sqrt{n}} \approx \frac{\sigma}{\sqrt{n}}$
● The more confident we wish to be, the larger (and less useful) our interval will be
‭●‬ ‭The more information we have about a population (larger sample size), the less new‬
‭information (additional data) affects the confidence interval‬
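● A 95% confidence interval means that we can be 95% confident that the population mean lies within our interval: about 95% of intervals constructed this way from repeated random samples would contain the population mean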
‭●‬ ‭Confidence intervals estimate the value of a population mean; they do not estimate the value‬
‭of a sample statistic or of an individual observation‬
● The whole point of a confidence interval is to estimate the unknown value of μ based on the observed value of x̄
‭●‬ ‭t-distribution‬
‭○‬ ‭Characterized by‬‭degrees of freedom‬
‭○‬ ‭Mound-shaped, centered at 0, wider‬
‭than normal distributions‬
‭○‬ ‭Degrees of freedom = sample size - 1‬
‭■‬ ‭𝑑𝑓‬ = ‭𝑛‬ − ‭1‬
‭●‬ ‭For an x% confidence interval, the critical‬
‭value t* is the value such that x% of the area‬
‭under the curve is between -t* and t*‬
‭○‬ ‭Look up this value in a‬‭t-table‬
● Confidence interval for a population mean (t-interval)
○ $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
■ Where t* is the appropriate critical value from the t-distribution with n-1 degrees of freedom for the desired confidence level
● Conditions to use a t-interval:

‭1.‬ ‭The sample was derived from the population via SRS‬
‭2.‬ ‭Either‬‭the sample size is large (‬‭𝑛‬ ≥ ‭30‬‭)‬‭or‬‭the population‬‭is normally distributed‬
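
The t-interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original guide: the function name `t_interval` and the data are hypothetical, and t* is assumed to be looked up in a t-table (2.262 is the 95% critical value for 9 degrees of freedom).

```python
import math
import statistics

def t_interval(sample, t_star):
    """Return (lower, upper) for x_bar ± t* · s/√n.

    t_star is assumed to come from a t-table with n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample standard deviation (n - 1 divisor)
    margin = t_star * s / math.sqrt(n)     # margin of error
    return x_bar - margin, x_bar + margin

# Hypothetical data: 10 observations, so df = 9 and t* = 2.262 for 95% confidence
sample = [9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.7, 10.0, 10.5]
lower, upper = t_interval(sample, t_star=2.262)
print(f"95% t-interval: ({lower:.2f}, {upper:.2f})")
```

Passing a larger t* (for example, a 99% critical value) widens the interval, which matches the note that more confidence means a larger interval.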

‭well class of 2028 hf ig‬ ‭Page‬‭21‬


‭Study guide courtesy of Catelyn Dao, Patrick Du, Shreyas Jain‬
‭Topic 20 (W18D2 KC)‬
‭Test of Significance‬
1. Give a description of the parameter, being sure to identify the type of number (e.g., a mean or a proportion), the variable, and the population
a. Example: The parameter of interest is the average time in seconds that a student estimated elapsed while the clip of "ABC" was played. We use μ to represent it
‭2.‬ ‭State competing claims about the parameter of interest. The null hypothesis states the‬
‭parameter is equal to a specific value‬
‭a.‬ ‭H‬‭0‬‭: parameter = hypothesized value (statement of no‬‭effect)‬
‭b.‬ ‭Example: H‬‭0‭:‬ μ = 10‬
‭The alternative hypothesis states what the researchers suspect or hope to be true about‬
‭the parameter. The form of the alternative hypothesis is determined by the research‬
‭question before the samples are collected‬
‭a.‬ ‭H‬‭a‭:‬ parameter < hypothesized value‬
‭b.‬ ‭H‬‭a‭:‬ parameter > hypothesized value‬
‭c.‬ ‭H‬‭a‭:‬ parameter ≠ hypothesized value‬
3. Specify the behavior of the sampling distribution under the null hypothesis. This typically involves checking some technical conditions that need to be met, such as those for the Central Limit Theorem (CLT). The first two important conditions to check are a simple random sample (SRS) and the sample size/normality of the distribution. Conditions are checked assuming the null hypothesis is true
a. Conditions for CLT:
i. Simple random sample (assumed implemented if unmentioned)
ii. n ≥ 30 (If false, check whether the distribution is normal. If it is not normally distributed, proceed with caution)
iii. Independent, where n < 10% of the overall population size (assume that it is independent but make sure to state this in conditions)
Draw a well-labeled sketch with a mean of the hypothesized value and a standard deviation of $\frac{s}{\sqrt{n}}$, where s is the standard deviation of the sample. Also mark the observed mean value of your sample
4. Calculate a test statistic to measure the discrepancy between the observed statistic and the hypothesized value of the parameter. If the discrepancy is large, we have evidence against the null hypothesis
a. The test statistic can be found by seeing how many standard errors the sample mean is away from the hypothesized population mean
b. $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, where μ is the hypothesized value from the null hypothesis
5. Calculate the p-value, which is the probability, assuming the null hypothesis to be true, of obtaining a test statistic at least as extreme as the one actually observed. "Extreme" means "in the direction of the alternative hypothesis."
a. An alternative hypothesis of "not equal" means that you need the probability in both tails

b. For a positive t-statistic, find the probability above it in the t-distribution with n − 1 degrees of freedom. For a "not equal" alternative, multiply that by 2 to get the probability in both tails (done with a t-table or a calculator)
6. Summarize your conclusion in context. State a test decision (where one needs to be made) or a comment evaluating the strength of evidence against the null hypothesis.
a. If the p-value is small, reject the null hypothesis
i. p-value below 0.05 but above 0.01 constitutes reasonably strong evidence against the null hypothesis
ii. p-value below 0.01 constitutes very strong evidence against the null hypothesis
b. If the p-value is high, do not reject the null hypothesis
i. p-value above 0.10 constitutes little to no evidence against the null hypothesis
ii. p-value below 0.10 but above 0.05 constitutes moderately strong evidence against the null hypothesis
In some studies, the researcher can decide in advance how small the p-value needs to be to reject the null hypothesis. This cutoff is called a significance level, denoted by α. A smaller significance level indicates stricter standards. If a researcher specifies a level of significance in advance, you have to state whether you reject or fail to reject the null hypothesis at that level.
Another common expression is to say that the data are statistically significant if they are unlikely to have occurred by chance or sampling variability alone.
Proceed with responding to the research question, stating whether or not you have evidence for the alternative hypothesis (restate final conclusions towards research question)
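
Steps 4 and 5 can be sketched in Python using only the standard library. This is an illustrative sketch under stated assumptions: the helper names (`t_pdf`, `upper_tail`, `one_sample_t_test`) and the data are hypothetical, and the tail area is approximated with Simpson's rule in place of the t-table or calculator the guide mentions. The use of `math.gamma` limits it to modest sample sizes.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 via Simpson's rule on [0, t] (steps is even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3             # half the area lies above 0 by symmetry

def one_sample_t_test(sample, mu_0):
    """Return (t, two-sided p-value) for H0: mu = mu_0 against Ha: mu != mu_0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)    # standard error s/√n
    t = (statistics.mean(sample) - mu_0) / se
    p = 2 * upper_tail(abs(t), n - 1)               # area in both tails
    return t, p

# Hypothetical time estimates in seconds, tested against H0: mu = 10
estimates = [9.8, 10.4, 11.1, 9.5, 10.9, 10.2, 9.9, 10.7, 10.0, 10.5]
t, p = one_sample_t_test(estimates, mu_0=10)
print(f"t = {t:.3f}, two-sided p-value = {p:.3f}")
```

For a one-sided alternative, the p-value would instead be the single tail area in the direction of the alternative, without doubling.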
● Hypotheses are always about parameters, not statistics

‭●‬ ‭The alternative hypothesis should be composed before data is collected (unbiased)‬
● The denominator of a test statistic is the standard error, denoted by $\frac{s}{\sqrt{n}}$
● When calculating the p-value for a two-sided alternative, include the total area in both tails of the t-distribution beyond the value of the test statistic. You can calculate this total area by doubling the area in the one tail beyond the test statistic (the right tail when t is positive).
‭●‬ ‭A low p-value indicates strong evidence against a null hypothesis‬
● If the sample was not random, you cannot generalize to the larger population; you can generalize only to a smaller population that the sample does represent
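
The evidence scale from step 6 can be written as a small lookup. This is a sketch only: the function names are hypothetical, and the handling of p-values that land exactly on a cutoff (0.01, 0.05, 0.10, or α) is an assumption, since the guide does not say which side they fall on.

```python
def evidence_against_null(p_value):
    """Describe the strength of evidence against H0 using the guide's cutoffs."""
    if p_value > 0.10:
        return "little to no evidence"
    if p_value > 0.05:
        return "moderately strong evidence"
    if p_value > 0.01:
        return "reasonably strong evidence"
    return "very strong evidence"

def decide(p_value, alpha):
    """Test decision at a significance level alpha chosen in advance.

    Assumption: a p-value equal to alpha counts as a rejection.
    """
    return "reject H0" if p_value <= alpha else "fail to reject H0"

for p in (0.20, 0.07, 0.03, 0.004):
    print(p, evidence_against_null(p), "|", decide(p, alpha=0.05))
```

Note that the same p-value can lead to different decisions at different significance levels, which is why α should be fixed before looking at the data.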

‭Topic 22 (W19D1 KC)‬
‭tl; dr: it’s basically the same as before, just with two variables‬
‭main differences (by step):‬
‭1.‬ ‭state two parameters, one for each sample group‬
2. null hypothesis is always $\mu_1 = \mu_2$; choices for alternative hypothesis are
a. $\mu_1 > \mu_2$
b. $\mu_1 < \mu_2$
c. $\mu_1 \neq \mu_2$
‭3.‬ ‭for the independence and randomness conditions, you now should either have that each‬
‭sample was taken using an SRS or the two sample groups came from one large group‬
‭randomly assigned to two treatment groups‬
4. different formula: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

5. basically the same, just make sure to base the degrees of freedom on the lower of n1 and n2 (df = min(n1, n2) − 1) for your t-test
‭6.‬ ‭again, basically the same, just make sure your context is correct for the scenario‬
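
The two-sample test statistic from step 4, with the degrees of freedom from step 5, might be computed as follows. The function name `two_sample_t` and the data are hypothetical, and the degrees of freedom are taken as the smaller sample size minus 1, which is how this sketch reads the guide's "lower of n1 and n2" rule.

```python
import math
import statistics

def two_sample_t(sample_1, sample_2):
    """Return (t, df) for H0: mu_1 = mu_2 using unpooled sample variances."""
    n1, n2 = len(sample_1), len(sample_2)
    v1 = statistics.variance(sample_1)     # s_1 squared
    v2 = statistics.variance(sample_2)     # s_2 squared
    se = math.sqrt(v1 / n1 + v2 / n2)      # standard error of x_bar_1 - x_bar_2
    t = (statistics.mean(sample_1) - statistics.mean(sample_2)) / se
    df = min(n1, n2) - 1                   # assumed reading of the "lower n" rule
    return t, df

# Hypothetical treatment groups
group_1 = [12, 15, 11, 14, 13]
group_2 = [10, 9, 12, 8, 11, 10]
t, df = two_sample_t(group_1, group_2)
print(f"t = {t:.3f} with df = {df}")
```

The p-value then comes from a t-table or calculator using this t and df, exactly as in the one-sample case.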

confidence interval (measured for $\mu_1 - \mu_2$):

1. different formula: $(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

‭a.‬ ‭make sure to get the t* value corresponding to the lesser degrees of freedom‬
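
A sketch of the two-sample interval from summary statistics is shown here. The function name `two_sample_interval` and the numbers are hypothetical, and t* is assumed to be looked up in a t-table for the lesser degrees of freedom (2.776 is the 95% critical value for df = 4).

```python
import math

def two_sample_interval(x_bar_1, s1, n1, x_bar_2, s2, n2, t_star):
    """Return (lower, upper) for (x_bar_1 - x_bar_2) ± t* · sqrt(s1²/n1 + s2²/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x_bar_1 - x_bar_2
    margin = t_star * se
    return diff - margin, diff + margin

# Hypothetical summaries: n1 = 5 and n2 = 6, so the lesser df is 4
lower, upper = two_sample_interval(13.0, 1.5811, 5, 10.0, 1.4142, 6, t_star=2.776)
print(f"95% interval for mu_1 - mu_2: ({lower:.2f}, {upper:.2f})")
```

If the interval does not contain 0, that is consistent with rejecting $\mu_1 = \mu_2$ in a two-sided test at the matching significance level.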

‭Topic 23 (W19D2 PKC)‬


🥳 Wooo last topic of RS1!‬

If the sampling or experimental design is paired, then use paired t-procedures. If the samples are drawn independently for the two groups, or if randomization is used to assign subjects to separate treatment groups, use two-sample t-procedures.

Data are collected with a paired design when there is a link between each observation in one group with a specific observation in another

‭A‬‭paired t-procedure‬‭applies one sample t-procedures‬‭to the differences within a pair‬

● $H_0: \mu_d = 0$

$t = \frac{\bar{x}_d}{s_d / \sqrt{n}}$

● The p-value is based on a t-distribution with (n-1) degrees of freedom, where n is the number of pairs in the sample

● confidence interval for the population mean difference $\mu_d$: $\bar{x}_d \pm t^* \frac{s_d}{\sqrt{n}}$

The technical conditions for paired t-procedures are the same as with a one-sample t-procedure except observational units are paired and the data are differences.
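
Since a paired t-procedure is a one-sample procedure on the differences, it might be sketched like this. The function name `paired_t` and the before/after data are hypothetical, and t* is assumed to come from a t-table (2.571 is the 95% critical value for 5 degrees of freedom).

```python
import math
import statistics

def paired_t(first, second, t_star):
    """Return (t, df, (lower, upper)) for the mean difference mu_d = second - first."""
    diffs = [b - a for a, b in zip(first, second)]   # one difference per pair
    n = len(diffs)                                   # number of pairs
    d_bar = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)      # s_d / √n
    t = d_bar / se                                   # test statistic for H0: mu_d = 0
    interval = (d_bar - t_star * se, d_bar + t_star * se)
    return t, n - 1, interval

# Hypothetical scores for the same 6 subjects measured twice
before = [72, 68, 75, 80, 66, 71]
after = [75, 70, 74, 84, 69, 73]
t, df, (lower, upper) = paired_t(before, after, t_star=2.571)
print(f"t = {t:.3f}, df = {df}, 95% interval: ({lower:.2f}, {upper:.2f})")
```

Treating the same data as two independent samples would ignore the link between observations, so the two-sample formula is not appropriate here.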
