Stats 201 Midterm Sheet

The document discusses statistical inference concepts including population parameters, sampling distributions, random sampling, standard error, confidence intervals, and the difference between quantitative and categorical variables. Random sampling aims to obtain a representative sample and reduce bias. The standard error quantifies variability in sampling distributions and is used to calculate confidence intervals.

Uploaded by

Lisbon Anderson

1 stat inference

When to use inferential statistics
There must be a population parameter that you would like to estimate, NOT an already-known number. Statistical inference is the act of making a guess about a population using a sample.
The mean, proportion, median, variance, standard deviation, and correlation are parameters often estimated from sample data. Write computer scripts to calculate estimates of these parameters:
mean(), median(), var(), sd(), cor()
# n is the size of each sample (samplesSizesN)
virtual_prop_interest <- samples %>%
  group_by(replicate) %>%
  summarize(Interest = sum(category == "Interest")) %>%
  mutate(propInterest = Interest / n)
virtual_prop_interest

Population
A population is a collection of individuals or observations that we are interested in (a study population).

Explain random and representative sampling and how this can influence estimation.
Sampling methodology is the method used to collect samples. The way you collect samples directly influences their quality. Random sampling aims to obtain a representative sample and reduce bias.
A sample is said to be representative if it roughly “looks like” the population. In other words, if the sample’s characteristics are a “good” representation of the population’s characteristics.
We say a sample is generalizable if any results based on the sample can generalize to the population. In other words, if we can make “good” guesses about the population using the sample.
We say a sampling procedure is biased if certain individuals in a population have a higher chance of being included in a sample than others.
We say a sampling procedure is unbiased if every individual in a population has an equal chance of being sampled.
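
To make the biased versus unbiased distinction concrete, here is a minimal base R sketch. The population of heights, its size, and the rule that taller individuals are more likely to be picked are all made up for illustration; none of them come from the sheet.

```r
set.seed(201)
# Hypothetical population: 5,000 heights in cm (made-up values).
population <- rnorm(5000, mean = 170, sd = 10)
n <- 50  # sample size

# Assumed biased rule: the taller the individual, the higher the chance of selection.
weights <- population - min(population) + 1

# Repeat each sampling procedure 500 times and keep the sample mean each time.
unbiased_means <- replicate(500, mean(sample(population, size = n)))
biased_means <- replicate(500, mean(sample(population, size = n, prob = weights)))

# An unbiased procedure is right on average; a biased one is off on average.
c(parameter = mean(population),
  unbiased_average = mean(unbiased_means),
  biased_average = mean(biased_means))
```

Under these assumptions the biased procedure should overestimate the mean height on average, while the equal-chance procedure should land close to the population mean.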
Sample
Sampling is the act of collecting a sample from the population, which we generally only do when we can’t perform a census. We denote the sample size by lowercase n, as opposed to uppercase N, which denotes the population’s size. Typically the sample size n is much smaller than the population size N, so sampling is a much cheaper alternative to performing a census.

population parameters
A population parameter is a numerical summary of interest about the population, such as the mean, median, or variance.

Estimate
A point estimate, also known as a sample statistic, is a summary statistic computed from a sample that estimates the unknown population parameter.

Define random variables and explain how they relate to sampling.
A random variable is a quantity whose value depends on the outcome of a random process, such as which individuals end up in the sample.
Before we take the sample:
The elements of the sample are random
The sample proportion is random
The sample standard error is random
The boundaries of a confidence interval are random
The population parameter is constant
After we take the sample:
The elements of the sample are constant
The sample proportion is constant
The sample standard error is constant
The boundaries of a confidence interval are constant
The population parameter is constant
The elements of bootstrap samples are random
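
The random-before, constant-after idea can be sketched in base R. The bowl below, with 900 red balls out of 2,400, is a hypothetical stand-in chosen for illustration.

```r
set.seed(1)
# Hypothetical population of 2,400 balls, 900 of them red (made-up counts).
bowl <- rep(c("red", "white"), times = c(900, 1500))

# Population parameter: fixed, whether or not we ever sample.
p <- mean(bowl == "red")

# Before sampling, p-hat is a random variable: each new sample gives a new value.
p_hats <- replicate(5, mean(sample(bowl, size = 50) == "red"))
p_hats

# After sampling, the sample in hand is fixed, so its p-hat is a fixed number.
my_sample <- sample(bowl, size = 50)
p_hat <- mean(my_sample == "red")
identical(p_hat, mean(my_sample == "red"))  # recomputing gives the same value
```

The five values in p_hats differ from one another only because a different sample was drawn each time; p never changes.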
sampling distribution
The distribution of a statistic (here, sample means) obtained by repeated sampling from the population, showing how the estimate of the parameter of interest varies. Its spread is called the standard error, not the standard deviation.
Sampling distribution: Collection of mean heights from repeated samples of 30 students, showing how the average height varies across different samples.

Define standard error and explain its purpose.
The standard deviation of the sampling distribution of a statistic (for example, of sample means). It quantifies how much the statistic varies from sample to sample and is used to calculate confidence intervals.

population distribution
The population distribution refers to the distribution of a particular variable in the entire population of interest.
It describes the frequency or probability of each possible outcome within the entire group.
Often, the entire population distribution is unknown or difficult to measure entirely.

sample distribution
A sample distribution is the distribution of the values of a specific variable within a single sample (a single data set) drawn from the population.
It provides insights into the characteristics of the sample and helps make inferences about the population.
Sample distributions may vary from one sample to another.
Sample distribution: Heights of 30 randomly selected students in a school.

estimator's sampling distribution
When estimating a parameter (e.g., mean or proportion) using a sample statistic (e.g., sample mean or sample proportion), the distribution of the statistic across all possible samples is called the estimator's sampling distribution.

How to draw random samples from a finite population (e.g., census data)
rep_sample_n(reps = x, size = x, replace = TRUE/FALSE)
Use replace = FALSE when drawing from the population and replace = TRUE when resampling from the sample (bootstrap).

How to estimate a sampling distribution for a given statistic and population.
Define Population: Clearly define the population of interest.
Then: Identify Statistic, Determine Sample Size, Sampling Procedure, Generate Multiple Samples, Calculate Statistic, Create Frequency Distribution, Analyze Sampling Distribution, Calculate Confidence Intervals, Assess Precision and Bias.
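
The steps for estimating a sampling distribution can be sketched in base R without any packages. The population of student heights and the values of n and reps are assumptions made up for this example.

```r
set.seed(42)
# Hypothetical finite population: heights (cm) of 5,000 students (made-up data).
population <- rnorm(5000, mean = 165, sd = 8)
n <- 30        # sample size
reps <- 2000   # number of repeated samples

# Generate multiple samples (without replacement) and calculate the statistic for each.
sample_means <- replicate(reps, mean(sample(population, size = n)))

# Centre and spread of the estimated sampling distribution.
mean_of_means <- mean(sample_means)
standard_error <- sd(sample_means)

# Textbook approximation for comparison (ignores the finite-population correction).
theoretical_se <- sd(population) / sqrt(n)

# Frequency distribution of the sample means; set plot = TRUE to draw the histogram.
freq <- hist(sample_means, plot = FALSE)

c(population_mean = mean(population),
  mean_of_means = mean_of_means,
  standard_error = standard_error,
  theoretical_se = theoretical_se)
```

The mean of the sample means should sit close to the population mean, and the simulated standard error should be close to sd(population) / sqrt(n).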
2 cont stat inference
Compare and contrast quantitative and categorical variables.
Categorical variables take words or qualities as values (not numerical); quantitative variables take numbers.

Explain what a sampling distribution is, list its properties, and its purpose in statistical inference.
The sampling distribution is the distribution of a statistic (e.g., mean, variance, proportion) for all possible samples of a given size from a population.

Properties
Central Limit Theorem: Regardless of the shape of the population distribution, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases.
Mean of Sampling Distribution: The mean of the sampling distribution is equal to the population parameter being estimated (for an unbiased estimator).
Standard Deviation of Sampling Distribution (Standard Error): The standard deviation of the sampling distribution is known as the standard error. It quantifies how much the sample statistic is expected to vary from the true population parameter.
Shape: For large sample sizes, the sampling distribution tends to be normal, even if the population distribution is not.
Spread: Larger sample sizes result in smaller standard errors and, therefore, a more precise estimate of the population parameter.

3 bootstrapping
Define bootstrapping
Given a population with some population parameter, you can take samples over and over from it to get the sampling distribution of the statistic that estimates the parameter. With only one sample, bootstrapping instead resamples that sample with replacement, over and over. The bootstrap distribution is going to approximate the sampling distribution, NOT the population.

Calc
# size = the original sample size
Bootstrap <- sampling_dist |>
  rep_sample_n(reps = x, size = x, replace = TRUE)

# Take 1000 virtual samples of size 50 from the bowl:
virtual_samples <- bowl %>%
  rep_sample_n(size = 50, reps = 1000)
# Compute the sampling distribution of 1000 values of p-hat
sampling_distribution <- virtual_samples %>%
  group_by(replicate) %>%
  summarize(red = sum(color == "red")) %>%
  mutate(prop_red = red / 50)
# Visualize sampling distribution of p-hat
ggplot(sampling_distribution, aes(x = prop_red)) +
  geom_histogram(binwidth = 0.05, boundary = 0.4, color = "white") +
  labs(x = "Proportion of 50 balls that were red",
       title = "Sampling distribution")

Infer version
bootstrap_distribution <- sample_1 %>%
  specify(response = parameterInterest, success = "typeInterest") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop")

Why bootstrap
We usually have no access to the whole population, and the bootstrap can approximate the sampling distribution from a single sample.

Sampling vs bootstrap
While both sampling and bootstrap are techniques used in statistics, they differ in their approach, assumptions, and applications. Sampling focuses on drawing representative samples from populations, while bootstrap is a resampling technique used to estimate the sampling distribution of a statistic from observed data.

What is a confidence interval
A confidence interval is a range of plausible values for a population parameter, computed from a sample. The confidence level (90, 95, 99) is the proportion of times that intervals built this way would contain the parameter over repeated sampling; it is not the probability that the parameter falls inside one particular interval.
Given that the distribution is normal, you can use the standard error to calculate the ci. Since the bootstrap distribution approximates the sampling distribution, you can hand calculate the interval:
$\bar{x} \pm 1.96 \cdot SE = (\bar{x} - 1.96 \cdot SE,\ \bar{x} + 1.96 \cdot SE) = (1995.44 - 1.96 \cdot 2.15,\ 1995.44 + 1.96 \cdot 2.15) = (1991.23, 1999.65)$

What data do you use to get ci
Bootstrap resamples drawn from the sample.

How to calculate
# sample from sample
bootstrap <- pennies_sample %>%
  rep_sample_n(size = 50, replace = TRUE, reps = 1000) %>%
  group_by(replicate) %>%
  summarize(mean_year = mean(year))
# visualize boot
Plot <- bootstrap |>
  ggplot(aes(x = mean_year)) +
  geom_histogram() +
  labs()
# get ci (0.05 and 0.95 quantiles, given we want a 90% ci)
ci <- bootstrap |>
  summarize(ci_lower = quantile(mean_year, 0.05),
            ci_upper = quantile(mean_year, 0.95))
# graph ci
# population_mean must already be defined (known only in a simulation or census)
ci_plot <- bootstrap %>%
  ggplot(aes(x = mean_year)) +
  geom_histogram(binwidth = 1) +
  # alpha keeps the shaded interval semi-transparent so the bars stay visible
  annotate("rect", xmin = ci$ci_lower, xmax = ci$ci_upper,
           ymin = 0, ymax = Inf, alpha = 0.3) +
  geom_vline(xintercept = population_mean,
             size = 2,
             colour = "red") +
  labs(title = "Bootstrap distribution with 90% confidence interval",
       x = "Mean year")

Infer version
# bootstrap & visualize
bootstrap <- pennies_sample |>
  specify(response = year) |>
  generate(reps = 1000, type = "bootstrap") |>
  calculate(stat = "mean")
# for a graph: bootstrap |> visualize()
# get ci
percentile_ci <- bootstrap %>%
  get_confidence_interval(level = 0.90, type = "percentile")
# plot ci
Plot <- bootstrap |>
  visualize() +
  shade_confidence_interval(endpoints = percentile_ci)

What can you say about the results from the ci
The effectiveness of a confidence interval is judged by whether or not it contains the true value of the population parameter.

population -> parameters
sample -> point estimate -> estimates -> Parameter(1)
sample -> Estimator(2) -> standard error
bootstrap samples -> bootstrap distribution -> Estimates(3) -> sampling distribution
bootstrap distribution -> Standard deviation(4) -> estimates -> Standard error (5)
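
For comparison with the moderndive and infer pipelines, the same two interval methods (percentile and standard error) can be sketched in base R alone. The penny years below are randomly generated stand-ins, not the sheet's pennies_sample data, and the 90% level is just one choice.

```r
set.seed(2024)
# Hypothetical sample of 50 penny mint years (made-up values).
pennies_years <- sample(1960:2019, size = 50, replace = TRUE)

# Resample the sample itself, with replacement, at the original sample size.
boot_means <- replicate(
  1000,
  mean(sample(pennies_years, size = length(pennies_years), replace = TRUE))
)

# Percentile method: the middle 90% of the bootstrap distribution.
percentile_ci <- quantile(boot_means, probs = c(0.05, 0.95))

# Standard error method: the SD of the bootstrap distribution estimates the SE.
# qnorm(0.95) is the normal multiplier for a 90% interval (1.96 would give 95%).
x_bar <- mean(pennies_years)
se <- sd(boot_means)
se_ci <- c(lower = x_bar - qnorm(0.95) * se, upper = x_bar + qnorm(0.95) * se)

percentile_ci
se_ci
```

When the bootstrap distribution is roughly bell-shaped, the two intervals should nearly agree; a large gap between them suggests the normal assumption behind the standard error method is doubtful.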
