Stats Notes

The document discusses different statistical concepts and techniques. It defines random and non-random sampling, explaining that random sampling allows for an equal probability of selection while non-random sampling relies on factors like convenience. It also discusses the differences between statistics and parameters, and types of data like qualitative and quantitative data. Finally, it provides an overview of techniques like simple random sampling, stratified random sampling, and the five steps of the six sigma methodology.

Uploaded by Maryam Hussain

Differences:

Random Sampling (Probability) vs. Non-Random Sampling (Non-Probability):

1. In random sampling, every sample has an equal, known probability of being selected; in non-random sampling, selection is based on factors such as the convenience, judgement, and experience of the researcher, not on probability.
2. Random sampling is unbiased in nature; non-random sampling is biased in nature.
3. A random sample is representative of the entire population; a non-random sample lacks representation of the entire population.
4. In random sampling, no unit has zero probability of being selected; in non-random sampling, some units may have zero probability of ever being selected.
5. Random sampling is the simplest sampling technique; non-random sampling methods are somewhat more complex.

Simple Random Sampling: With Replacement vs. Without Replacement:

1. Sampling with replacement returns each selected unit to the population before the next draw and is used to find probabilities with replacement; sampling without replacement does not return units, and is used to figure out probabilities without replacement.
2. With replacement, successive draws are independent of each other; without replacement, successive draws are dependent on each other.
3. Sampling with replacement allows us to use the same dataset multiple times to build models, as opposed to going out and gathering new data, which can be time-consuming and expensive; sampling without replacement is used when we don't want any given unit (for example, a household) to appear twice in the sample.
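As a minimal sketch (the item list below is invented for illustration), the two schemes can be contrasted with Python's `random` module: `choices` draws with replacement, `sample` without.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
items = ["A", "B", "C", "D", "E"]  # hypothetical households

# With replacement: the same unit can be drawn more than once,
# so successive draws are independent of each other.
with_repl = random.choices(items, k=3)

# Without replacement: a drawn unit cannot appear again,
# so successive draws are dependent on earlier ones.
without_repl = random.sample(items, k=3)

print(with_repl, without_repl)
```

With replacement duplicates are possible; without replacement the three drawn households are guaranteed to be distinct.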

Statistic vs. Parameter:

1. A statistic is a number describing a sample; a parameter is a number describing a whole population.
2. We can use sample statistics to make educated guesses about population parameters; the goal of quantitative research is to understand characteristics of populations by finding parameters.
3. Computing statistics is easy, time-saving, and feasible; measuring parameters directly is often difficult, time-consuming, or unfeasible.
4. Examples of statistics: sample mean, sample variance. Examples of parameters: population mean, population variance.
5. Example: the standard deviation of the weights of avocados from one farm is a statistic; the standard deviation of the weights of all avocados in the region is a parameter.
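A small illustration of the distinction, using a made-up "population" of avocado weights: the population mean is a parameter, and the mean of a random sample is a statistic that estimates it.

```python
import random
import statistics

random.seed(1)
# Hypothetical population: weights (g) of all avocados in a region
population = [random.gauss(200, 15) for _ in range(10_000)]

# Parameter: describes the whole population (often unknowable in practice)
population_mean = statistics.fmean(population)

# Statistic: describes a sample, used to estimate the parameter
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(population_mean, sample_mean)
```

The sample mean will land close to, but not exactly on, the population mean; that gap is what inferential statistics quantifies.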

Simple Random Sampling vs. Stratified Random Sampling:

1. A simple random sample is used to represent the entire population, randomly selecting individuals from it without any other consideration; a stratified random sample, on the other hand, first divides the population into smaller groups, or strata, based on shared characteristics.
2. Simple random sampling is economical in nature and less time-consuming; stratified sampling is also economical and less time-consuming, with less chance of bias and higher accuracy than simple random sampling.
3. Simple random sampling carries a chance of bias and the difficulty of getting a representative sample; stratified sampling requires defining the categorical variable by which subgroups are created (for instance, age group, gender, occupation, income, education, religion, region, etc.).
4. Simple random sampling is often used when very little information is available about the population; stratified sampling draws a selection from each group, the size of which is based on its proportion of the entire population.
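The stratified procedure above can be sketched as follows (the population and the `age_group` stratifying variable are hypothetical): split the population into strata, then draw the same fraction from each stratum.

```python
import random
from collections import defaultdict

random.seed(2)
# Hypothetical population of 1,000 units with a stratifying variable
population = [{"id": i, "age_group": random.choice(["18-30", "31-50", "51+"])}
              for i in range(1000)]

def stratified_sample(pop, key, frac):
    """Draw the same fraction from each stratum (proportional design)."""
    strata = defaultdict(list)
    for unit in pop:
        strata[unit[key]].append(unit)  # group units by stratum
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * frac))  # stratum sample size
        sample.extend(random.sample(group, k))  # SRS within the stratum
    return sample

sample = stratified_sample(population, "age_group", 0.10)
print(len(sample))
```

Unlike a plain simple random sample, this guarantees every age group appears in the sample in roughly its population proportion.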

Proportional Allocation vs. Optimum Allocation:

1. Proportional allocation has higher variance; optimum allocation has the least variance.
2. In proportional allocation, sample units are selected from within each stratum in proportion to the stratum size; in optimum allocation, the cheaper the cost per unit in a stratum, the larger the sample that should be drawn from that stratum.
3. Proportional allocation is appropriate when different parts of the population should be proportionally represented in the sample; allocating the sample to the different strata in accordance with stratum size, variability, and cost is called the principle of optimum allocation.
4. In proportional allocation, each stratum's sample size depends directly on the number of units in the stratum; in optimum allocation, the larger the variability within a stratum, the larger the sample that should be drawn from it.
5. Under optimum allocation, the sample size for stratum h is proportional to N_h * S_h / sqrt(c_h), where N_h is the stratum size, S_h the stratum standard deviation, and c_h the cost per unit in that stratum.
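Assuming hypothetical stratum sizes, standard deviations, and per-unit costs, both allocations can be computed side by side; the optimum (cost-based Neyman) weight for stratum h is N_h * S_h / sqrt(c_h).

```python
import math

# Hypothetical strata: sizes N_h, standard deviations S_h, costs per unit c_h
N = [500, 300, 200]
S = [10.0, 20.0, 5.0]
c = [1.0, 4.0, 1.0]
n = 100  # total sample size to allocate

# Proportional allocation: n_h proportional to stratum size N_h
prop = [n * Nh / sum(N) for Nh in N]

# Optimum allocation: n_h proportional to N_h * S_h / sqrt(c_h),
# so bigger, more variable, cheaper strata get larger samples
w = [Nh * Sh / math.sqrt(ch) for Nh, Sh, ch in zip(N, S, c)]
opt = [n * wh / sum(w) for wh in w]

print(prop, opt)
```

Both lists sum to the total sample size; they differ in how the 100 units are split across the three strata.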

One-Way ANOVA vs. Two-Way ANOVA:

1. One-way ANOVA is a test that allows one to make comparisons between the means of three or more groups of data; two-way ANOVA makes the same comparisons where two independent variables are considered.
2. A one-way ANOVA has one independent variable; a two-way ANOVA has two independent variables.
3. One-way ANOVA tests the effect of three or more groups of one independent variable on a dependent variable; two-way ANOVA tests the effect of multiple groups of two independent variables on a dependent variable and on each other.
4. In one-way ANOVA the number of samples is three or more; in two-way ANOVA each variable should have multiple samples.
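The one-way ANOVA F statistic can be computed by hand from the between-group and within-group sums of squares; the three groups below are invented for illustration.

```python
from statistics import fmean

# Hypothetical measurements for three groups of one independent variable
groups = [
    [4.1, 5.2, 6.0, 5.5],
    [7.9, 8.1, 7.4, 8.6],
    [5.0, 4.8, 5.9, 5.3],
]

k = len(groups)                     # number of groups
n = sum(len(g) for g in groups)     # total number of observations
grand_mean = fmean(x for g in groups for x in g)

# Between-group sum of squares: spread of group means around the grand mean
ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their group mean
ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

# F = mean square between / mean square within
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))
```

A large F (relative to the F distribution with k-1 and n-k degrees of freedom) indicates that at least one group mean differs from the others.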
Questions and Answers:
Statistics and Types:
Statistics:
Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and
presentation of data. In applying statistics to a scientific, industrial, or social problem, it is
conventional to begin with a statistical population or a statistical model to be studied.
Populations can be diverse groups of people or objects such as "all people living in a country" or
"every atom composing a crystal". Statistics deals with every aspect of data, including the
planning of data collection in terms of the design of surveys and experiments. There are two
branches:
Descriptive:
Descriptive statistics deals with the presentation and collection of data. This is usually the first part of a statistical analysis. It is usually not as simple as it sounds, and the statistician needs to be aware of designing experiments, choosing the right focus group, and avoiding biases that can easily creep into the experiment.
Inferential:
Inferential statistics, as the name suggests, involves drawing the right conclusions from the
statistical analysis that has been performed using descriptive statistics. In the end, it is the
inferences that make studies important and this aspect is dealt with in inferential statistics.

Data and its Types:


Data:
In statistics, groups of individual data points may be classified as belonging to any of various
statistical data types, e.g., categorical ("red", "blue", "green"), real number (1.68, -5, 1.7e+6),
odd number (1,3,5) etc. The data type is a fundamental component of the semantic content of the
variable, and controls which sorts of probability distribution can logically be used to describe the
variable, the permissible operations on the variable, the type of regression analysis used to
predict the variable, etc. The concept of data type is similar to the concept of level of
measurement, but more specific: For example, count data require a different distribution (e.g., a
Poisson distribution or binomial distribution) than non-negative real-valued data require, but
both fall under the same level of measurement (a ratio scale).
Types:
There are different types of data in Statistics, that are collected, analysed, interpreted and
presented. The data are the individual pieces of factual information recorded, and it is used for
the purpose of the analysis process. There are two types of data; qualitative data and quantitative
data.
Qualitative Data:
Qualitative data, also known as categorical data, describes data that fits into categories. Qualitative data are not numerical. Categorical information involves categorical variables that describe features such as a person's gender, home town, etc. Categorical measures are defined in terms of natural language specifications, not in terms of numbers. Qualitative data includes nominal data and ordinal data.
Quantitative Data:
Quantitative data, also known as numerical data, represents numerical values (i.e., how much, how often, how many). Numerical data gives information about the quantities of a specific thing. Some examples of numerical data are height, length, size, weight, and so on. Quantitative data can be classified into two types based on the data sets: discrete data and continuous data.

5 steps of six sigma:


The Six Sigma Methodology comprises five data-driven stages: Define, Measure, Analyze, Improve and Control (DMAIC).
1. Define - The “Define” stage seeks to identify all the pertinent information necessary to
break down a project, problem or process into tangible, actionable terms. It emphasizes
the concrete, grounding process improvements in actual, quantifiable and qualifiable
information rather than abstract goals.
2. Measure - In the “Measure” phase, organizations assess where current process
capabilities are. While they understand they need to make improvements and have listed
those improvements concretely in the Define phase, they must first measure current
performance to establish a baseline against which improvement can be judged.
3. Analyze - The “Analyze” phase examines the data amassed during the Measure stage to
isolate the exact root causes of process inefficiencies, defects and discrepancies. In short,
it extracts meaning from your data. Insights gleaned from this analysis begin scaffolding
the tangible process improvements for your team or organization to implement.
4. Improve - The “Improve” phase initiates formal action plans meant to solve the root
problems identified during analysis. Organizations directly address what they’ve
identified as problem root causes, typically deploying a Design of Experiments plan to
isolate different variables and co-factors until the true obstacle is found.
5. Control - In the final phase, “Control,” Six Sigma teams create a control plan and deploy
the new standardized process. The control plan outlines improved daily workflows,
which result in critical business process variables abiding by accepted quality control
variances.

Importance of QC Chart:
QC Chart in Food Safety and Quality:
A quality control chart is a graphical representation of whether a firm's products or processes are
meeting their intended specifications. If problems appear to arise, the quality control chart can be
used to identify the degree by which they vary from those specifications and help in error
correction.
The food industry deals with highly sensitive products. This is one of the key reasons why maintaining quality standards and adhering to quality requirements are imperative for players in the food industry. When it comes to food items, most of us tend to repeatedly buy the same brand, which we perceive to be of good quality and matching our expectations.
Also, in the case of companies in this industry, even a small incident where the quality of
products has been compromised could tarnish the brand image. Consequently, the company’s
profits could go crashing down the hill. This makes having appropriate quality control measures
highly necessary for brands dealing in food products. Quality control (QC) is a reactive process
and aims to identify and rectify the defects in finished products. It can be achieved by identifying
and eliminating sources of quality problems to ensure customer’s requirements are continually
met. It involves the inspection aspect of quality management and is typically the responsibility of
a specific team tasked with testing products for defects.

Scales of Measurement:
In statistics, there are four data measurement scales: nominal, ordinal, interval and ratio. These are simply ways to sub-categorize different types of data.
1. Nominal - The nominal scale of measurement defines the identity property of data. This
scale has certain characteristics, but doesn’t have any form of numerical meaning. The
data can be placed into categories but can’t be multiplied, divided, added or subtracted
from one another. It’s also not possible to measure the difference between data points.
2. Ordinal - The ordinal scale defines data that is placed in a specific order. While each
value is ranked, there’s no information that specifies what differentiates the categories
from each other. These values can’t be added to or subtracted from.
3. Interval - The interval scale contains properties of nominal and ordered data, but the
difference between data points can be quantified. This type of data shows both the order
of the variables and the exact differences between the variables. They can be added to or
subtracted from each other, but not multiplied or divided. For example, 40 degrees is not
20 degrees multiplied by two.
4. Ratio - The ratio scale of measurement includes the properties of the other three scales.
The data is nominal and defined by an identity, can be classified in order,
contains intervals and can be broken down into exact values. Weight, height and distance
are all examples of ratio variables. Data in the ratio scale can be added, subtracted,
divided and multiplied.
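A quick illustration of why ratios are meaningful on a ratio scale (Kelvin, which has a true zero) but not on an interval scale (Celsius), while differences are meaningful on both:

```python
# Celsius is an interval scale: differences are meaningful, ratios are not.
c1, c2 = 20.0, 40.0
ratio_celsius = c2 / c1          # 2.0, but "twice as hot" is NOT meaningful

# Kelvin has a true zero, making it a ratio scale: ratios ARE meaningful.
k1, k2 = c1 + 273.15, c2 + 273.15
ratio_kelvin = k2 / k1           # the physically meaningful ratio

# The interval (difference) is the same on both scales.
print(c2 - c1, k2 - k1, round(ratio_kelvin, 2))
```

This is exactly the "40 degrees is not 20 degrees multiplied by two" point from the interval-scale description above.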
Hypothesis testing:
Hypothesis testing is a form of statistical inference that uses data from a sample to draw
conclusions about a population parameter or a population probability distribution. First, a
tentative assumption is made about the parameter or distribution. This assumption is called the
null hypothesis and is denoted by H0. In a statistical hypothesis test, a null hypothesis and an
alternative hypothesis are proposed for the probability distribution of the data.
Five Steps in Hypothesis Testing:
1. Specify the Null Hypothesis - The null hypothesis (H0) is a statement of no effect,
relationship, or difference between two or more groups or factors. In research studies, a
researcher is usually interested in disproving the null hypothesis.
2. Specify the Alternative Hypothesis - The alternative hypothesis (H1) is the statement that
there is an effect or difference. This is usually the hypothesis the researcher is interested
in proving. The alternative hypothesis can be one-sided (specifying only one direction,
e.g., lower) or two-sided.
3. Set the Significance Level (α) - The significance level (denoted by the Greek letter alpha,
α) is generally set at 0.05. This means there is a 5% chance of rejecting the null
hypothesis when it is actually true (a Type I error).
4. Calculate the Test Statistic and Corresponding P-Value - Hypothesis testing generally
uses a test statistic that compares groups or examines associations between variables.
5. Draw a Conclusion - Reject the null hypothesis if the p-value is below the significance
level; otherwise, fail to reject it.
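The five steps can be sketched as a two-sided one-sample test in pure Python (using a normal approximation for the p-value rather than the t distribution; the sample data is invented for illustration):

```python
import math
from statistics import fmean, stdev

# Steps 1-2: H0: population mean = 50; H1: mean != 50 (two-sided)
mu0 = 50.0
# Step 3: significance level
alpha = 0.05

# Hypothetical sample measurements
sample = [52.1, 49.8, 53.4, 51.0, 50.6, 52.8, 48.9, 51.7, 52.3, 50.9]

# Step 4: test statistic and p-value
n = len(sample)
z = (fmean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
# Two-sided p-value from the standard normal: p = erfc(|z| / sqrt(2))
p_value = math.erfc(abs(z) / math.sqrt(2))

# Step 5: reject H0 if p < alpha, otherwise fail to reject
reject = p_value < alpha
print(round(z, 2), round(p_value, 4), reject)
```

With this (made-up) sample the statistic is large and the p-value falls below 0.05, so H0 would be rejected; a real analysis with n = 10 would normally use the t distribution instead of the normal.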

Definition:
X-Chart:
In statistical process monitoring, the X-chart is a type of scheme, popularly known as a control chart, used to monitor the mean and range of a normally distributed variable simultaneously, when samples are collected at regular intervals from a business or industrial process.
X-Bar Chart:
In industrial statistics, the X-bar chart is a type of Shewhart control chart that is used to monitor
the arithmetic means of successive samples of constant size, n. This type of control chart is used
for characteristics that can be measured on a continuous scale, such as weight, temperature,
thickness etc.
S-Chart:
s charts are used to monitor the mean and variation of a process based on samples taken from the
process at given times (hours, shifts, days, weeks, months, etc.). The measurements of the
samples at a given time constitute a subgroup. Typically, an initial series of subgroups is used to
estimate the mean and standard deviation of a process.
P-Chart:
In statistical quality control, the p-chart is a type of control chart used to monitor the proportion
of nonconforming units in a sample, where the sample proportion nonconforming is defined as
the ratio of the number of nonconforming units to the sample size, n.
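As a rough sketch with invented numbers, the 3-sigma control limits for an X-bar chart (with an assumed known process standard deviation) and for a p-chart can be computed as follows:

```python
import math
from statistics import fmean

# Hypothetical subgroup means from a process (subgroup size n = 5)
subgroup_means = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]
n = 5
sigma = 0.5  # assumed (known) process standard deviation

# X-bar chart: centre line and 3-sigma limits for the subgroup means
centre = fmean(subgroup_means)
ucl = centre + 3 * sigma / math.sqrt(n)
lcl = centre - 3 * sigma / math.sqrt(n)

# p-chart: limits for a proportion nonconforming p_bar with sample size m
p_bar, m = 0.04, 200
se = math.sqrt(p_bar * (1 - p_bar) / m)
p_ucl = p_bar + 3 * se
p_lcl = max(0.0, p_bar - 3 * se)  # a proportion cannot be negative

print(round(centre, 4), round(ucl, 3), round(lcl, 3), round(p_ucl, 4), p_lcl)
```

Points falling outside the upper or lower control limit signal that the process may be out of statistical control and should be investigated.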
