CS210 Statistics Notes PDF

Uploaded by

kemalefekolayli

Suppose there was a medical study and the outcomes showed that almost 70% of the treatment group and 19% of the control group showed good results.
->Do the data show a “real” difference between the groups?
The observed difference between the two groups might be real, or it may be due to natural variation.
Since the difference is quite large, it is more believable that the difference is real.
We need statistical tools to determine if the difference is so large that we should reject the notion that it was due to chance.

Data Basics

All variables
-Numerical: discrete or continuous
-Categorical: ordinal or regular categorical

Associated vs. Independent (associated = dependent)
-When two variables show some connection with one another, they are called associated/dependent variables.
-If two variables are not associated, i.e. there is no evident connection between the two, then they are said to be independent.

Overview of Data Collection Principles

Research Question, Population of Interest, Sample, Population to which results can be generalized

*Anecdotal Evidence: “My uncle smokes three packs a day and he’s in perfectly good health”

*Census: Sampling the entire population

->There are problems with it: it can be difficult to complete a census, populations rarely stand still, and a census may be more complex than sampling.

Exploratory Analysis to Inference

Sampling is natural.
Think about sampling something you are cooking: you taste (examine) a small part of what you’re cooking to get an idea about the dish as a whole.
When you taste a spoonful of soup and decide the spoonful you tasted isn’t salty enough, that’s exploratory analysis.
If you generalize and conclude that your entire soup needs salt, that’s an inference.
For your inference to be valid, the spoonful you tasted (the sample) needs to be representative of the entire pot (the population).
If your spoonful comes only from the surface and the salt is collected at the bottom of the pot, what you tasted is probably not representative of the whole pot.

Sampling Bias
Non-Response: If only a small fraction of the randomly sampled people choose to respond to a survey, the sample may no longer be representative of the population.
Voluntary Response: Occurs when the sample consists of people who volunteer to respond because they have strong
opinions on the issue. Such a sample will also not be representative of the population.
Convenience Sample: Individuals who are easily accessible are more likely to be included in the sample.

This study source was downloaded by 100000859810404 from CourseHero.com on 01-04-2024 09:51:09 GMT -06:00

https://www.coursehero.com/file/187263415/CS210-Statistics-Notespdf/
Explanatory variable —(might affect)—> Response variable

Observational Study: Researchers collect data in a way that does not directly interfere with how the data arise, i.e. they merely “observe”, and can only establish an association between the explanatory and response variables.
Experiment: Researchers randomly assign subjects to various treatments in order to establish causal connections between
the explanatory and response variables.
//CORRELATION DOES NOT IMPLY CAUSATION//
—>Main difference between observational studies and experiments:
Most experiments use random assignment while observational studies do not.

Observational Studies and Sampling Strategies

“New study sponsored by General Mills says that eating breakfast makes girls thinner.”
This is an observational study.
The conclusion is that there is an association between girls eating breakfast and being slimmer.
The study is sponsored by General Mills.

3 Possible Explanations
1-Eating breakfast causes girls to be thinner.
2-Being thin causes girls to eat breakfast.
3-A third variable is responsible for both. What could it be?
->Extraneous variables that affect both the explanatory and the response variable, and that make it seem like there is a relationship between the two, are called confounding variables.

Prospective Study: Identifies individuals and collects information as events unfold.
Retrospective Study: Collects data after events have taken place.

Obtaining Good Samples
Almost all statistical methods are based on the notion of implied randomness.
If observational data are not collected in a random framework from a population, these statistical methods - the estimates and
errors associated with the estimates - are not reliable.

Most commonly used random sampling techniques:

-Simple Random Sample: Randomly select cases from the population, where there is no implied connection between the points that are selected.
-Stratified Sample: Strata are made up of similar observations. We take a simple random sample from each stratum.
-Cluster Sample: Clusters are usually not made up of homogeneous observations. We take a simple random sample of clusters, and then sample all observations in that cluster. Usually preferred for economical reasons.
-Multistage Sample: We take a simple random sample of clusters, and then take a simple random sample of observations from the sampled clusters.

Difference between Stratified and Cluster Sample:

Stratified sampling divides a population into groups, then includes some members of all of the groups.
Cluster sampling divides a population into groups, then includes all members of some randomly chosen groups.
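
The four sampling schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 4 groups with 10 units each, the group names and the function names are all hypothetical, and the same groups are treated as strata in one call and as clusters in another purely to show the mechanics.

```python
import random

def simple_random_sample(population, k, rng):
    """Pick k cases uniformly at random, without replacement."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Take a simple random sample of size k_per_stratum from every stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose n_clusters clusters and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def multistage_sample(clusters, n_clusters, k_per_cluster, rng):
    """Randomly choose clusters, then take a simple random sample inside each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], k_per_cluster) for name in chosen}

# Hypothetical population: 4 groups of 10 labelled units each.
groups = {f"g{i}": [f"g{i}-u{j}" for j in range(10)] for i in range(4)}
population = [unit for members in groups.values() for unit in members]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
srs = simple_random_sample(population, 8, rng)
strat = stratified_sample(groups, 2, rng)
clus = cluster_sample(groups, 2, rng)
multi = multistage_sample(groups, 2, 3, rng)
```

Note how the stratified result has an entry for every group, while the cluster and multistage results only contain the randomly chosen groups.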

Principles of Experimental Design
1-Control: Compare treatment of interest to a control group.
2-Randomize: Randomly assign subjects to treatments, and randomly
sample from the population whenever possible.
3-Replicate: Within a study, replicate by collecting a sufficiently large
sample. Or replicate the entire study.
4-Block: If there are variables that are known or suspected to affect the response variable, first group subjects into blocks based on these variables, and then randomize cases within each block to treatment groups.
Ex. We would like to design an experiment to investigate if energy gels make you run faster. It’s suspected that energy gels might affect pro and amateur athletes differently, therefore we block for pro status.

Another ex. A study is designed to test the effect of light level and noise level on exam performance of students. The
researcher also believes that light and noise levels might have different effects on males and females, so wants to
make sure both genders are equally represented in each group.
Explanatory Variable(s): Light level & Noise Level
Response Variable(s): Exam Performance
Blocking Variable(s): Gender
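
Blocking followed by randomization can be sketched as below, using the energy gel example. The runner names, the block sizes and the helper `blocked_assignment` are hypothetical, and dealing treatments round-robin after a shuffle is just one simple way to keep the groups balanced inside each block.

```python
import random

def blocked_assignment(block_of, treatments, rng):
    """Randomly assign subjects to treatments separately within each block."""
    blocks = {}
    for subject, block in block_of.items():
        blocks.setdefault(block, []).append(subject)

    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)  # the randomization happens inside the block
        # Deal shuffled subjects round-robin so treatments stay balanced per block.
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical runners: 4 pros and 6 amateurs, blocked on pro status.
status = {f"pro{i}": "pro" for i in range(4)}
status.update({f"am{i}": "amateur" for i in range(6)})

assignment = blocked_assignment(status, ["gel", "no gel"], random.Random(1))
```

Each block ends up split evenly between the two treatments, so pro status cannot be confounded with the treatment.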

Difference Between Blocking and Explanatory Variables

—Factors are conditions we can impose on the experimental units.
—Blocking variables are characteristics that the experimental units come with, that we would like to control for.
—Blocking is like stratifying, except used in experimental settings when randomly assigning, as opposed to when sampling.

More experimental design terminology:

-Placebo: Fake treatment, often used as the control group for medical studies.
-Placebo effect: Experimental units showing improvement because they believe they are receiving special treatment.
-Blinding: When experimental units do not know whether they are in the control or treatment group.
-Double-blind: When both the experimental units and the researchers who interact with them do not know who is in the control group and who is in the treatment group.

Examining Numerical Data:

—>Data Visualization Stuff

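Mean: The sample mean, denoted $\bar{x}$, is the sum of the observations divided by the number of observations: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$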

The population mean is computed the same way but is denoted $\mu$. It is often not possible to calculate $\mu$ since the population data are rarely available.
The sample mean is a sample statistic, and serves as a point estimate of the population mean. This estimate may not be perfect, but if the sample is good (representative of the population), it is usually a pretty good estimate.

Commonly Observed Shapes of Distributions

Unimodal(Single Prominent Peak), Bimodal/Multimodal(2+ prominent peaks), Uniform(no apparent peaks)
Skewness(Right/Left/Symmetric)

Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ is roughly the average squared deviation from the mean.

Why do we use the squared deviation in the calculation of variance?

->To get rid of negatives so that the observations equally distant from the mean are weighed equally.
->To weigh larger deviations more heavily.

Standard Deviation: $s = \sqrt{s^2}$ is the square root of the variance, and has the same units as the data.

Median is the value that splits the data in half when ordered in ascending order.

Q1, Q3 and IQR

The 25th percentile is also called the first quartile, Q1.
The 50th percentile is also called the median.
The 75th percentile is also called the third quartile, Q3.
Between Q1 and Q3 is the middle 50% of the data. The range these data span is called the interquartile range, or the IQR.
—>IQR = Q3 - Q1
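
The summary statistics above can be computed with Python's standard `statistics` module. The seven data values are hypothetical. Quartile conventions differ between textbooks and software; the "exclusive" method is chosen here as an assumption, and other conventions can give slightly different Q1 and Q3.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance: divides by n - 1
sd = statistics.stdev(data)           # square root of the sample variance
median = statistics.median(data)

# Three cut points that split the ordered data into four parts: Q1, median, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(mean, variance, round(sd, 3), median, q1, q3, iqr)
```

For this data set the mean is 6, the variance is 10 (so the SD is about 3.162), the median is 5, Q1 = 4, Q3 = 9 and IQR = 5.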

Outliers
Why is it important to look for outliers?
-Identify extreme skew in the distribution.
-Identify data collection and entry errors.
-Provide insight into interesting features of the data.

Robust Statistics
Median and IQR are more robust to skewness and outliers than mean and SD. Therefore:
-For skewed distributions it is often more helpful to use median and IQR to describe the center and spread.
-For symmetric distributions it is often more helpful to use the mean and SD to describe the center and spread.

Mean vs. Median

If the distribution is symmetric, center is often defined as the mean (mean ≈ median)
If the distribution is skewed or has extreme outliers, center is often defined as the median
Right-skewed: mean>median
Left-skewed: mean<median

***When data are extremely skewed, transforming them might make modeling easier. A common transformation is the log
transformation.
Pros & Cons of Transformations: Skewed data are easier to model when they are transformed because outliers tend to become far less prominent after an appropriate transformation.
However, results of an analysis might be difficult to interpret because the log of a measured variable usually has no natural interpretation.
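
A small sketch of the mean vs. median behaviour on right-skewed data, and of what a log transformation does to it. The income values are hypothetical, and back-transforming the mean of the logs (which gives the geometric mean) is only one way to compare the two scales.

```python
import math
import statistics

incomes = [20, 25, 30, 35, 40, 50, 400]  # hypothetical, one extreme value

mean_raw = statistics.mean(incomes)      # pulled upward by the outlier
median_raw = statistics.median(incomes)  # robust to the outlier

logs = [math.log10(x) for x in incomes]
mean_log = statistics.mean(logs)

# Back-transform the mean of the logs to the original units (geometric mean).
back_transformed = 10 ** mean_log

print(round(mean_raw, 1), median_raw, round(back_transformed, 1))
```

The raw mean is far above the median (right skew: mean > median), while the back-transformed log-scale mean lands much closer to the median because the outlier is far less prominent on the log scale.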

***A table that summarizes data for two categorical variables is called a contingency table.

Hypothesis Testing Framework:

—We start with a null hypothesis(H0) that represents the status quo.
—We also have an alternative hypothesis(HA) that represents our research question, i.e. what we’re testing for.
—We conduct a hypothesis test under the assumption that the null hypothesis is true, either via simulation or theoretical
methods.
—If the test results suggest that the data do not provide convincing evidence for the alternative hypothesis, we stick with the
null hypothesis. If they do, then we reject the null hypothesis in favor of the alternative.
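
The simulation route can be sketched as a randomization test for a difference in two proportions. The group sizes are hypothetical (the notes only give the percentages): 28 of 40 in the treatment group (70%) and 8 of 42 in the control group (about 19%). The function name and the one-sided comparison are also assumptions of this sketch.

```python
import random

def randomization_test(successes_a, n_a, successes_b, n_b, n_sims, rng):
    """One-sided randomization test for the difference in proportions (A minus B)."""
    observed = successes_a / n_a - successes_b / n_b
    total_successes = successes_a + successes_b
    outcomes = [1] * total_successes + [0] * (n_a + n_b - total_successes)

    extreme = 0
    for _ in range(n_sims):
        # Under H0 the group labels are arbitrary, so reshuffle the outcomes.
        rng.shuffle(outcomes)
        diff = sum(outcomes[:n_a]) / n_a - sum(outcomes[n_a:]) / n_b
        if diff >= observed:
            extreme += 1
    return observed, extreme / n_sims

observed, p_value = randomization_test(28, 40, 8, 42, 2000, random.Random(3))
print(round(observed, 3), p_value)
```

The returned proportion estimates how often chance alone produces a difference at least as large as the observed one. With counts like these it is very small, which is the kind of evidence that leads us to reject H0.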

OS02: Probability
A random process is a situation in which we know what outcomes could happen, but we don’t know which particular outcome
will happen.
P(A)=Probability of event A
0<=P(A)<=1

—>Frequentist interpretation:
The probability of an outcome is the proportion of times the outcome would occur if we observed the random process an
infinite number of times.
—>Bayesian interpretation:
A Bayesian interprets probability as a subjective degree of belief: For the same event, two separate people could have
different viewpoints and so assign different probabilities.
This interpretation was largely popularized by revolutionary advances in computational technology and methods during the last twenty years.

Law of Large Numbers: As more observations are collected, the proportion of occurrences with a particular outcome, $\hat{p}_n$, converges to the probability of that outcome, p.

Law of Averages(Gambler’s Fallacy): The common misunderstanding of the LLN is that random processes are supposed
to compensate for whatever happened in the past; this is just not true.
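
The Law of Large Numbers can be seen in a quick simulation of a fair coin. The checkpoints, the seed and the function name are arbitrary choices for this sketch. Each flip is drawn independently of the earlier ones, so nothing "compensates" for past results; the proportion settles only because early flips carry less and less weight.

```python
import random

def running_proportions(p, checkpoints, rng):
    """Simulate independent trials with success probability p.

    Returns {n: proportion of successes after n trials} for each checkpoint n.
    """
    targets = set(checkpoints)
    successes = 0
    result = {}
    for n in range(1, max(checkpoints) + 1):
        if rng.random() < p:  # each trial ignores everything that came before
            successes += 1
        if n in targets:
            result[n] = successes / n
    return result

props = running_proportions(0.5, [10, 100, 1_000, 10_000, 100_000], random.Random(42))
for n, p_hat in props.items():
    print(n, p_hat)
```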

—Disjoint(Mutually Exclusive) Outcomes: Cannot happen at the same time.

—Non-Disjoint Outcomes: Can happen at the same time.

General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)

Probability Distribution: A probability distribution lists all possible events and the probabilities with which they occur.

Sample Space: The collection of all possible outcomes of a trial.

Ex. Sample Space of the gender of one kid = S = {M,F}
Sample Space of the genders of two kids = S = {MM, FF, MF, FM}
Complementary Events: Two mutually exclusive events whose probabilities add up to 1.
Ex. A couple has two kids, if we know that they are not both girls what are the possible gender combinations for these kids?
{MM, FM, MF}
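
The two-kids sample space is small enough to enumerate, which lets us check the general addition rule and the complement directly. Treating the four outcomes as equally likely is an assumption of this sketch, and the event names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Sample space for the genders of two kids: MM, MF, FM, FF.
sample_space = ["".join(kids) for kids in product("MF", repeat=2)]

def prob(event):
    """Probability of an event (a set of outcomes), assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

first_girl = {o for o in sample_space if o[0] == "F"}
second_girl = {o for o in sample_space if o[1] == "F"}
both_girls = first_girl & second_girl

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_or = prob(first_girl | second_girl)
p_rule = prob(first_girl) + prob(second_girl) - prob(both_girls)

# Complement of "both girls": every remaining outcome.
not_both_girls = set(sample_space) - both_girls
```

Here `p_or` and `p_rule` both come out to 3/4, and `not_both_girls` is the set {MM, MF, FM} from the example above.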

Independence: Two processes are independent if knowing the outcome of one provides no useful information about the
outcome of the other.
—>Checking for independence:
If P(A occurs, given that B is true) = P(A|B) = P(A), then A and B are independent.

*Determining Dependence Based On Sample Data*
If conditional probabilities calculated based on sample data suggest dependence between two variables, the next step is to
conduct a hypothesis test to determine if the observed difference between the probabilities is likely or unlikely to have
happened by chance.
If the observed difference between the conditional probabilities is large, then there is stronger evidence that the difference is
real.
If a sample is large, then even a small difference can provide strong evidence of a real difference.

Product Rule For Independent Events: P(A and B) = P(A) x P(B)

Q: Do the probabilities of two disjoint events always add up to 1?

A: Not necessarily, there may be more than 2 events in the sample space, e.g. party affiliation.

Q: Do the probabilities of two complementary events always add up to 1?

A: Yes, that’s the definition of complementary, e.g. heads and tails.

Conditional Probability: P(A|B) = P(A and B)/P(B)

Independence and conditional probabilities

Generically, if P(A|B) = P(A) then the events A and B are said to be independent.
-Conceptually: Knowing B doesn’t tell us anything about A.
-Mathematically: We know that if events A and B are independent, P(A and B) = P(A) x P(B)
P(A|B) = P(A and B)/P(B) = P(A)xP(B)/P(B) = P(A)
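
The independence check P(A|B) = P(A) can be tried on two fair dice by brute-force enumeration. The dice example and the event names are assumptions of this sketch and are not taken from the notes.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def prob(event):
    """P(event), where event is a function from a roll to True/False."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B)."""
    return prob(lambda r: a(r) and b(r)) / prob(b)

def first_is_six(r):
    return r[0] == 6

def sum_is_seven(r):
    return sum(r) == 7

def sum_is_twelve(r):
    return sum(r) == 12

p_a = prob(first_is_six)                              # 1/6
p_a_given_7 = cond_prob(first_is_six, sum_is_seven)   # 1/6: independent
p_a_given_12 = cond_prob(first_is_six, sum_is_twelve) # 1: dependent
```

Knowing the sum is 7 leaves the probability of a first-die six unchanged, so those two events are independent; knowing the sum is 12 forces the first die to be six, so those two are dependent.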

Bayes’ Theorem

$P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$
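
A minimal sketch of applying Bayes’ theorem, P(A|B) = P(B|A) x P(A) / P(B), to a screening test. All numbers are hypothetical: a 1% prior, a 90% chance of a positive result when the condition is present, and a 5% chance of a positive result when it is absent. P(B) is obtained by summing over the two cases (condition present or absent), a step the notes do not spell out.

```python
from fractions import Fraction

def bayes_posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) = P(B | A) P(A) / P(B).

    P(B) is computed as P(B | A) P(A) + P(B | not A) P(not A).
    """
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b

# Hypothetical screening test: A = has the condition, B = tests positive.
posterior = bayes_posterior(Fraction(1, 100), Fraction(9, 10), Fraction(5, 100))
print(posterior, float(posterior))
```

With these made-up numbers the posterior is 2/13, about 15%: most positive results come from the much larger group without the condition.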
Random Variables:
A random variable is a numeric quantity whose value depends on the outcome of a random event. The probability that a random variable X takes the value $x_i$ is written $P(X = x_i)$.
-Discrete Random Variables: Often take only integer values.
-Continuous Random Variables: Take real (decimal) values.

Expectation
We are often interested in the average outcome of a random variable. We call this the expected value (mean), and it is a weighted average of the possible outcomes: $\mu = E(X) = \sum_{i} x_i \, P(X = x_i)$
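
As a quick illustration, the expected value of a discrete random variable is each outcome weighted by its probability. The fair six-sided die is an assumed example, and `expected_value` is a hypothetical helper name.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X) = sum of x_i * P(X = x_i) over a {value: probability} mapping."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in distribution.items())

# Hypothetical example: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu = expected_value(die)
print(mu)  # 7/2
```

The result, 3.5, is not a value the die can actually show; the expected value is a long-run average, not a typical outcome.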
OS03: Distributions of Random Variables

Normal Distribution

Unimodal and symmetric, bell shaped curve
Many variables are nearly normal, but none are exactly normal
Denoted as $N(\mu, \sigma)$ -> Normal with mean $\mu$ and standard deviation $\sigma$

Standardizing with Z scores
Standardized score or Z score of an observation is the number of standard deviations it falls above or below the mean:
$Z = \frac{\text{observation} - \text{mean}}{\text{SD}}$

Z scores are defined for distributions of any shape, but only when the distribution is normal can we use Z scores to calculate
percentiles.
Observations that are more than 2SD away from the mean (|Z| > 2) are usually considered unusual.

Percentile
Percentile is the percentage of observations that fall below a given data point.
Graphically, percentile is the area below the probability distribution curve to the left of that observation.
>>One can use Z-table.
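
Instead of a Z-table, the percentile can be read off with `statistics.NormalDist` from the Python standard library. The mean of 1500, the SD of 300 and the observation 1800 are hypothetical values chosen for the sketch, and the result is only meaningful if the distribution really is nearly normal.

```python
from statistics import NormalDist

def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies above or below the mean."""
    return (observation - mean) / sd

mean, sd = 1500, 300  # hypothetical, assumed nearly normal
observation = 1800

z = z_score(observation, mean, sd)
# Percentile = area under the normal curve to the left of the observation.
percentile = NormalDist(mean, sd).cdf(observation)
unusual = abs(z) > 2

print(z, round(percentile, 4), unusual)
```

Here Z = 1, which corresponds to roughly the 84th percentile, and the observation is not flagged as unusual.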

Six Sigma:
“The term six sigma process comes from the notion that if one has six standard deviations between the process mean and the nearest specification limit, practically no items will fail to meet specifications.”

Normal Probability Plot: Anatomy of a normal probability plot

Data are plotted on the y-axis of a normal probability plot, and theoretical quantiles (following a normal distribution) on
the x-axis.
If there is a linear relationship in the plot, then the data follow a nearly normal distribution.
Constructing a normal probability plot requires calculating percentiles and corresponding z-scores for each observation,
which is tedious. Therefore we generally rely on software when making these plots.
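
The calculation that software does behind a normal probability plot can be sketched as follows. The plotting position (i - 0.5) / n is one common convention and is an assumption here (packages differ), and the height values are hypothetical. The function only returns the points and leaves the actual plotting to whatever tool is at hand.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (theoretical quantile, observed value) pairs for a normal probability plot."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, SD 1
    # The i-th smallest observation is paired with the z-score of percentile (i - 0.5) / n.
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]

heights = [160, 172, 168, 181, 175, 165, 170]  # hypothetical data
points = normal_plot_points(heights)
for theoretical, observed in points:
    print(round(theoretical, 3), observed)
```

If the observed values rise roughly linearly with the theoretical quantiles, the data are nearly normal.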

OS04: Foundations for inference

Variability in Estimates
“Margin of sampling error is plus or minus 2.9 percentage points for results based on the total sample and 4.4 percentage
points for adults ages 18-34 at the 95% confidence level.”
41% +- 2.9%: We are 95% confident that 38.1% to 43.9% of the public believe young adults, rather than middle-aged or older adults, are having the toughest time in today’s economy.
49% +- 4.4%: We are 95% confident that 44.6% to 53.4% of 18-34 year olds have taken a job they didn’t want just to pay the bills.

—We are often interested in population parameters.

—Since complete populations are difficult (or impossible) to collect data on, we use sample statistics as point estimates for
the unknown population parameters of interest.
—Sample statistics vary from sample to sample.
—Quantifying how sample statistics vary provides a way to estimate the margin of error associated with our point estimate.

Sampling Distribution
A sampling distribution is a probability distribution of a statistic obtained from a large number of samples drawn from a specific population. The sampling distribution of a given population is the distribution of frequencies of a range of different outcomes that could possibly occur for a statistic of a population.

Central Limit Theorem

The distribution of the sample mean is well approximated by a normal model:
$\bar{x} \sim N\left(\text{mean} = \mu,\ SE = \frac{\sigma}{\sqrt{n}}\right)$
where SE represents standard error, which is defined as the standard deviation of the sampling distribution. If $\sigma$ is unknown, use s.

Certain conditions must be met for the CLT to apply (rules of thumb: n > 30, and n < 10% of the population when sampling without replacement):
1-Independence: Sampled observations must be independent. This is difficult to verify, but is more likely if random sampling/assignment is used and, if sampling without replacement, n < 10% of the population.
2-Sample size/skew: Either the population distribution is normal, or if the population distribution is skewed, the sample size is large.
-The more skewed the population distribution, the larger the sample size we need for the CLT to apply.
-For moderately skewed distributions, n > 30 is a widely used rule of thumb.
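
A simulation makes the Central Limit Theorem concrete. The sketch below assumes a right-skewed population (exponential with mean 1 and SD 1), a sample size of n = 40 and 5000 repeated samples; all three choices are arbitrary and only serve the illustration.

```python
import random
import statistics

def sample_means(n, n_samples, rng):
    """Means of n_samples independent samples of size n from an Exponential(1) population."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]

n = 40
means = sample_means(n, 5000, random.Random(7))

center = statistics.fmean(means)   # should sit near the population mean, 1
spread = statistics.stdev(means)   # should sit near sigma / sqrt(n)
theoretical_se = 1 / n ** 0.5

print(round(center, 3), round(spread, 3), round(theoretical_se, 3))
```

Even though individual observations are strongly skewed, the sample means cluster around 1 with a spread close to the standard error of about 0.158.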

Confidence Intervals: A plausible range of values for the population parameter is called a confidence interval.
The approximate 95% confidence interval is defined as (point estimate) +- 2 x SE, where $SE = s/\sqrt{n}$.
Confidence interval, a general formula:
point estimate +- z* x SE

—>Width of an interval: If we want to be more certain that we capture the population parameter, i.e. increase our confidence
level, we should use a wider interval.
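
The general formula can be sketched as below, with z* taken from the standard normal distribution through `statistics.NormalDist`. The summary numbers (sample mean 50, s = 10, n = 100) are hypothetical, and the sketch assumes the CLT conditions hold so that the normal model is reasonable.

```python
from statistics import NormalDist

def confidence_interval(point_estimate, s, n, level=0.95):
    """Return (lower, upper) for point estimate +- z* x SE, with SE = s / sqrt(n)."""
    se = s / n ** 0.5
    # z* leaves (1 - level) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return point_estimate - z_star * se, point_estimate + z_star * se

ci95 = confidence_interval(50, 10, 100)
ci99 = confidence_interval(50, 10, 100, level=0.99)

print([round(v, 2) for v in ci95])
print([round(v, 2) for v in ci99])
```

The 95% interval is about (48.04, 51.96). The 99% interval is wider, matching the point about the width of an interval: more confidence costs precision.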
