
Cheat Sheet Stats For Exam

This document contains a cheat sheet of key statistical concepts and formulas for an exam on statistical design and analysis. It defines important terms like population, sample, parameter, statistic, and hypothesis testing. It also summarizes common statistical tests and analyses for different variable types, including measures of central tendency, variability, confidence intervals, hypothesis testing, and more. Formulas are provided for calculating z-scores, t-scores, standard errors, margins of error, and confidence intervals. Visualization techniques are also outlined for different variable combinations.

Uploaded by Urbi Roy Barman


Cheat sheet stats for exam

Statistical Design and Analysis (University of Technology Sydney)




Statistical inference is the process of using data from a sample to gain information about a population.

A sample statistic is an estimate of the value of a descriptive characteristic of the population.

A parameter is a number that describes some aspect of a population.

When the mean is larger than the median, the distribution is skewed to the right.

A statistic is a number that is computed from data in a sample.

Sampling bias occurs when the method of selecting a sample causes the sample to differ from the population in some way. If bias occurs, we cannot trust generalizations to the rest of the population. Causes: question wording, context, inaccurate responses, poor sampling methods. Random sampling is needed; control groups, placebos and blinding reduce bias.

Correlation does not equal causation.

Response/explanatory variables: when we use one variable to help us understand or predict values of another variable, we call the former the explanatory variable and the latter the response variable.

Confounding variable: an additional variable that is associated with both the explanatory variable and the response variable. Confounding is minimised by random allocation of subjects to treatment groups.

Bootstrap statistic: the statistic computed on a bootstrap sample.

Bootstrap distribution: the distribution of many bootstrap statistics.

Our interest is in the variability of the statistics from the bootstrap samples, which will be similar to the SE from the true population. The sampling distribution is centred around the population parameter, while the bootstrap distribution is centred around the sample statistic.

Margin of error = 1.96 x SE for a 95% confidence interval.

Hypothesis tests (statistical tests): remember to DEFINE parameters.
Null hypothesis (H0): no effect or difference (status quo); uses =.
Alternative hypothesis (Ha): the claim for which we seek evidence (not equal to, less than, or greater than).
REJECT H0 when the p-value is less than 0.05 for 5% significance.
Hypotheses are always about population parameters and not sample statistics, so only population notation is used.
Never accept the null; only reject or do not reject the null.
Notation              Population    Sample
Mean                  μ             x̄
Proportion            p             p̂
Correlation           ρ             r
Standard deviation    σ             s

Resistance: a metric is resistant when it is not distorted by skewed data. Medians are resistant; means are not.

Standard deviation: in a bell-shaped curve, 95% of the data should fall within 1.96 standard deviations of the mean.

Z-score: how many standard deviations an observation is from the mean: z = (observed − mean) / standard deviation.

IQR = Q3 − Q1, where Q3 = the median of the values above the median.

Range = maximum value − minimum value.

Outliers: use the mean and standard deviation for shape if there are NO outliers; use the IQR and median if there are outliers.

Five-number summary: Min, Q1, Median, Q3, Max.

                Reject H0        Do not reject H0
H0 is true      Type I error     No error
H0 is false     No error         Type II error

One categorical variable
Summary statistics: mode, frequency table, proportions
Visualization: bar chart, pie chart

Statistical significance: if the sample statistic is unlikely to occur just by chance, the results are statistically significant. If the results are not significant, they are inconclusive.

The p-value is the chance of obtaining a sample statistic at least as extreme as the observed sample statistic, if the null hypothesis is true. Reject when the p-value < α (0.05) or when the observed value is in the rejection region.

Randomization distribution: like bootstrapping, but focused on the null rather than the sample. A randomization distribution assumes H0 is true, while a bootstrap distribution has no knowledge of the null hypothesis and is used for confidence intervals.

If a 95% CI misses the parameter in H0, then a two-tailed test should reject H0 at a 5% significance level, and vice versa.
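The z-score and five-number-summary rules above can be checked on a small made-up sample. The crude quartile rule below follows the sheet's "Q3 = median of the values above the median" (other software may use slightly different quartile conventions):

```python
import statistics

# Hypothetical quantitative sample (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
sd = statistics.stdev(data)        # sample standard deviation

# Z-score: (observed - mean) / standard deviation
z = (9 - mean) / sd

# Five-number summary pieces, using the sheet's quartile rule.
s = sorted(data)
median = statistics.median(s)
q1 = statistics.median([x for x in s if x < median])  # median of values below the median
q3 = statistics.median([x for x in s if x > median])  # median of values above the median
iqr = q3 - q1
rng = max(s) - min(s)              # range = max - min
```

For this sample the maximum (9) sits just under two standard deviations above the mean, so it is unusual but not extreme by the 1.96-SD rule.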
Two categorical variables
Summary statistics: two-way table, difference in proportions
Visualization: stacked or clustered bar chart

One quantitative variable
R – range, S – shape, L – location
Summary statistics: median, mean
Visualization: dotplots, histograms or boxplots

Confidence intervals are most useful when you want to estimate population parameters. Hypothesis tests and p-values are most useful when you want to test hypotheses about population parameters.

A density curve is a theoretical model to describe a variable's distribution.
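The one- and two-categorical summaries above can be computed directly. All counts below are invented for illustration:

```python
from collections import Counter

# One categorical variable: hypothetical survey responses.
answers = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

freq = Counter(answers)                      # frequency table
n = len(answers)
props = {k: v / n for k, v in freq.items()}  # table of proportions
mode = freq.most_common(1)[0][0]             # most frequent category

# Two categorical variables: a 2x2 two-way table (made-up counts)
# and the difference in proportions between the two groups.
table = {"A": {"success": 30, "failure": 20},
         "B": {"success": 18, "failure": 32}}
p_A = table["A"]["success"] / sum(table["A"].values())
p_B = table["B"]["success"] / sum(table["B"].values())
diff = p_A - p_B                             # difference in proportions
```

The difference in proportions (here p_A − p_B) is the statistic you would then bootstrap or test, as described elsewhere on this sheet.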
Regression
The observed response value, y, is the response value observed for a particular data point. The predicted response value, ŷ, is the response that would be predicted for a given x value, based on a model.

Shape: if the sample size is large enough, the sampling distribution will be symmetric and bell-shaped (CLT).

An interval estimate gives a range of plausible values for a population parameter: statistic ± margin of error.

The standard error of a statistic, SE, is the standard deviation of the sampling distribution (or bootstrap distribution) of that statistic.

Confidence intervals: 95% CI = sample statistic ± 1.96 x SE. We are 95% confident that the true [population parameter] lies between (lower bound and upper bound).

Bootstrapping: to simulate a population from a sample by drawing samples randomly, with replacement (out of a hat).

A normal distribution has a symmetric bell-shaped density curve. The mean is its centre of symmetry (μ); the standard deviation controls its spread (σ).

Central Limit Theorem: for random samples with a sufficiently large sample size (≥ 30 for quantitative data if not very skewed; ≥ 10 in each category for categorical data), the distribution of sample statistics for a mean is Normal. (Use a t-score if the sample is smaller.)

CI = statistic ± z* x SE, where z* is specific to each %CI.

The standardized test statistic is the number of standard errors a statistic is from the null: z = (sample statistic − null) / SE.

t-distribution: compensates for the added variability of not knowing the standard deviation, and is used for small samples. It is very similar to the normal curve but has fatter tails to reflect the added uncertainty. df = n − 1.

The t-score is used when the standard deviation is not known or the sample size is less than 30. t* is found using the t-distribution.
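The CI and test-statistic formulas above fit together in one short Python sketch. The data (p̂ = 0.54 from n = 400, H0: p = 0.5) are invented, and the SE formula for a proportion, sqrt(p(1 − p)/n), is standard theory rather than something stated on this sheet:

```python
import math

# Hypothetical: sample proportion 0.54 from n = 400, testing H0: p = 0.5.
p_hat, n, p0 = 0.54, 400, 0.5

# SE of a proportion under the null (standard formula, assumed here).
se_null = math.sqrt(p0 * (1 - p0) / n)

# Standardized test statistic: z = (sample statistic - null) / SE
z = (p_hat - p0) / se_null

# 95% CI = statistic +/- 1.96 x SE, with SE based on p-hat.
se = math.sqrt(p_hat * (1 - p_hat) / n)
moe = 1.96 * se                      # margin of error
ci = (p_hat - moe, p_hat + moe)
```

Here z < 1.96, so the two-tailed test does not reject H0 at the 5% level; consistently, the 95% CI contains the null value 0.5 (the CI/test duality noted earlier on this sheet).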


The chi-square statistic quantifies how much observed counts vary from expected counts for a categorical variable.
The expected count is the sample size (n) times the null proportion (pᵢ).
df = number of categories − 1.
Always use the right tail to find the p-value in the χ² distribution.
Single mean: 1 quantitative variable
Difference in means: 1 quantitative and 1 categorical variable
Single proportion p: one categorical variable
Difference in proportions: two categorical variables
Paired difference in means (matched pairs): looks like 2 samples but is really only 1 (an observation for paired data is 2 observations per subject)

Chi-square test for association (two categorical)
Tests for an association between two categorical variables.
Expected = row total x column total / grand total
H0: independent / not associated; Ha: associated / not independent
df = (rows − 1) x (columns − 1) (use the right tail of the χ² distribution)
Ensure the expected count in each group is at least 5 for both chi-square tests.
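The expected-count and df rules for the association test can be sketched on a made-up two-way table. The χ² statistic formula, Σ(observed − expected)²/expected, is the standard one and is assumed here rather than quoted from the sheet:

```python
# Hypothetical 2x3 two-way table of observed counts.
observed = [[20, 30, 50],
            [30, 20, 50]]

row_totals = [sum(row) for row in observed]        # per-row totals
col_totals = [sum(col) for col in zip(*observed)]  # per-column totals
grand = sum(row_totals)                            # grand total

# Expected = row total x column total / grand total
expected = [[r * c / grand for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e
           for orow, erow in zip(observed, expected)
           for o, e in zip(orow, erow))

# df = (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
```

All expected counts here are at least 5, so the sheet's condition for using the χ² distribution is met; the p-value would come from the right tail with df = 2.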
ANOVA: test for a difference in means across more than two samples.
H0: mean1 = mean2 = mean3 = …; Ha: at least one mean is different.
Analysis of variance compares the variability between groups to the variability within groups.
Total variability = variability between groups + variability within groups.
With k = number of groups and n = total sample size:
F = [variability between groups / (k − 1)] / [variability within groups / (n − k)]

Conditions required to use the theoretical F-distribution:
1. Sample sizes are large (nᵢ ≥ 30) OR the data are reasonably normal
2. Variability is similar in all groups (max SD / min SD < 2)
df = n − 1 of the smallest group if there is more than one group.
ANOVA is for a single quantitative variable split over more than 2 levels.
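A minimal sketch of the F-statistic computation, using three made-up groups (the data are illustrative assumptions, not from the sheet):

```python
import statistics

# Hypothetical data: one quantitative variable split over 3 groups.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]

k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total sample size
grand_mean = statistics.mean(x for g in groups for x in g)

# Variability between groups (each group mean vs the grand mean)...
between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# ...and variability within groups (each value vs its own group mean).
within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

# F = (between / (k - 1)) / (within / (n - k))
f_stat = (between / (k - 1)) / (within / (n - k))

# Equal-variability condition check: max SD / min SD < 2.
sds = [statistics.stdev(g) for g in groups]
sd_ratio = max(sds) / min(sds)
```

Each group here has the same spread (sd_ratio = 1), so the condition "max SD / min SD < 2" holds, and the large F value reflects group means that differ far more than the within-group noise.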
Regression is for 2 quantitative variables.
y = α + βX + ε; if either the slope or the correlation is zero, there is no relationship. df = n − 2.

R² (coefficient of determination) = the proportion of variability in the response variable Y that is "explained" by the model based on the predictor X. R² is high if the data are close to the line.

Correlation (r): −1 ≤ r ≤ 1. If we square the correlation to get r², we get a number between 0 and 1 that we can express as a percentage. A good model explains 75% or more.
(*Replace s in the t-score with σ.)

We assume the errors (ε) are randomly distributed above and below the line, the relationship is linear, the variability is even, and there are no outliers/influential points.

Residual (e): the difference between the observed value of the dependent variable (y) and the predicted value (ŷ).
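The regression quantities above (ŷ, residuals, R²) can be computed by hand on a small made-up dataset. The least-squares formulas for the slope and intercept are standard theory; the sheet only gives the model form y = α + βX + ε:

```python
import statistics

# Hypothetical paired quantitative data (x, y).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)

# Least-squares slope and intercept (standard formulas, assumed here).
beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
       sum((x - x_bar) ** 2 for x in xs)
alpha = y_bar - beta * x_bar

y_hat = [alpha + beta * x for x in xs]             # predicted responses
residuals = [y - yh for y, yh in zip(ys, y_hat)]   # e = y - y-hat

# R^2 = proportion of variability in y explained by the model.
ss_tot = sum((y - y_bar) ** 2 for y in ys)
ss_res = sum(e ** 2 for e in residuals)
r2 = 1 - ss_res / ss_tot
```

For this data r² = 0.6, i.e. 60% of the variability in y is explained by x, which falls short of the sheet's "75% or more" rule of thumb for a good model.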
Chi-square goodness of fit (single categorical)
The χ² goodness-of-fit test tests whether the distribution of a single categorical variable differs from a hypothesized distribution.
Previously, to get a p-value, we found how far a p̂ was from the null, in standard errors, on the distribution of a difference in proportions. This won't work for more than 2 groups: there is more than one sample statistic (p̂) and more than one null value. We use the chi-square statistic (a single number) to quantify how much the observed counts vary from the expected counts.
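A short goodness-of-fit sketch with invented data: 120 die rolls tested against H0 that all six faces are equally likely (null proportion pᵢ = 1/6 for every category):

```python
# Hypothetical observed counts for the six faces of a die (120 rolls total).
observed = [25, 18, 22, 15, 20, 20]
n = sum(observed)

# Expected count = sample size (n) x null proportion (p_i).
null_props = [1 / 6] * 6
expected = [n * p for p in null_props]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# df = number of categories - 1; p-value comes from the RIGHT tail
# of the chi-square distribution with this df.
df = len(observed) - 1
```

Every expected count is 20 (well above 5), so the χ² distribution applies; a statistic of 2.9 on df = 5 is small, giving a large right-tail p-value and no evidence against a fair die.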

