E Book - Unit 4

Uploaded by

65kzmdy4xr

DATA ANALYSIS & TESTING OF HYPOTHESIS

TOPICS TO BE COVERED

• Descriptive analysis/statistics and inferential analysis/statistics,
• Hypothesis testing (concept, types of error, steps, types),
• Parametric tests with SPSS (z-test, t-test, F-test),
• Non-parametric tests with SPSS (Chi-square, Mann-Whitney U test, Kruskal-Wallis test), and
• Multivariate analysis (Factor Analysis, Regression Analysis).

Statistics is the study of the collection, organization, analysis, interpretation, and presentation
of data.

[Diagram: Statistics, a branch of mathematics, divides into descriptive and inferential statistics.]

DESCRIPTIVE STATISTICS

[Diagram: Descriptive statistics comprise measures of central tendency, measures of spread (variability), and measures of shape.]

Descriptive Statistics - Measures of Central Tendency

• Mean: the arithmetic average of a data set, found by adding all the values in the set
and dividing by the number of observations.
MERITS & DEMERITS OF MEAN
Merits:

• The arithmetic mean is rigidly defined.

• It is easy to calculate and simple to understand.

• It is based on all observations of the given data.

• It is suitable for further mathematical and statistical analysis.

Demerits:

• It cannot be determined by inspection or by graph.

• It cannot be computed for qualitative data.

• It cannot be computed for open-ended class intervals.

• It cannot be computed if one or more observations are missing.

• It is highly affected by extreme values (very large or very small values compared to the other
observations).
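The definition of the mean and its sensitivity to extreme values can be illustrated with a minimal Python sketch. The data values and the function name `arithmetic_mean` are hypothetical and chosen only for illustration.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of observations."""
    return sum(values) / len(values)

scores = [10, 12, 14, 16, 18]      # hypothetical data set
with_outlier = scores + [200]      # same data plus one extreme value

print(arithmetic_mean(scores))         # mean of the original data
print(arithmetic_mean(with_outlier))   # mean pulled upward by the single outlier
```

A single extreme observation moves the mean from 14 to 45 here, which is the last demerit listed above.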

• Median: the middle value of a data set when the values are listed in ascending or
descending order. With an even number of values, it is the average of the two middle values.
ADVANTAGES OF MEDIAN

(1) Simplicity: In a simple statistical series, a glance at the ordered data is often enough to
locate the median value.

(2) Free from the effect of extreme values: Unlike the arithmetic mean, the median is not
distorted by extreme values in the series.

(3) Certainty: The median is always a definite, specific value.

(4) Real value: The median is typically an actual value in the series, which can make it a better
representative than the arithmetic mean, whose value may not occur in the series at all.

(5) Graphic presentation: Besides the algebraic approach, the median can also be estimated
through a graphic presentation of the data.

(6) Possible even when data is incomplete: The median can be estimated even for certain
incomplete series. It is enough to know the number of items and the middle item of the series.

DEMERITS OF MEDIAN

(1) Lack of representative character: The median has limited representative character because it
is not based on all the items in the series.

(2) Unrealistic: When the median lies between the two middle values, it is only an approximate
measure, not a precise value.

(3) Lack of algebraic treatment: The arithmetic mean is capable of further algebraic treatment, but
the median is not. For example, multiplying the median by the number of items in the series does
not give the sum of the values of the series.
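As a short sketch of these points, Python's standard `statistics.median` can be used to locate the median of odd and even series and to show that it resists extreme values. All data values below are hypothetical.

```python
from statistics import median

odd_series = [3, 9, 5, 7, 1]     # sorted: 1 3 5 7 9, middle value is 5
even_series = [3, 9, 5, 7]       # sorted: 3 5 7 9, two middle values are 5 and 7
skewed = [1, 3, 5, 7, 900]       # one extreme value

median_odd = median(odd_series)      # an actual value of the series
median_even = median(even_series)    # average of the two middle values
median_skewed = median(skewed)       # unaffected by the extreme value 900

# Lack of algebraic treatment: median * n does not reproduce the total.
product = median_skewed * len(skewed)
total = sum(skewed)
print(median_odd, median_even, median_skewed, product, total)
```

Here the product of the median and the number of items is 25, while the actual sum is 916.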

• Mode: the value that occurs most often in a data set. It always lies between the lowest
and highest values.
ADVANTAGES OF MODE

(1) Simple and popular: The mode is a very simple measure of central tendency. Sometimes
a glance at the series is enough to locate the modal value.

(2) Less effect of marginal values: Compared to the mean, the mode is less affected by
marginal values in the series. The mode is determined only by the value with the highest
frequency.

(3) Graphic presentation: The mode can be located graphically with the help of a histogram.

(4) Best representative: The mode is the value that occurs most frequently in the series.
Accordingly, it is often regarded as the most representative value of the series.

(5) No need of knowing all the items or frequencies: Calculating the mode does not
require knowledge of all the items and frequencies of a distribution. In a simple series, it
is enough to know the items with the highest frequencies in the distribution.

DEMERITS OF MODE

(1) Uncertain and vague: The mode is an uncertain and vague measure of central
tendency.

(2) Not capable of algebraic treatment: Unlike the mean, the mode is not capable of further
algebraic treatment.

(3) Difficult: When the frequencies of all items are identical, it is difficult to identify the
modal value.

(4) Complex procedure of grouping: Calculating the mode for grouped data involves a cumbersome
grouping procedure. If the grouping changes, the modal value changes as well.

(5) Ignores extreme marginal frequencies: The mode ignores extreme marginal frequencies. To
that extent, the modal value is not representative of all the items in a series. Besides,
one can question the representative character of the modal value because its calculation does
not involve all items of the series.
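A minimal sketch of locating the mode, using `mode` and `multimode` from Python's standard `statistics` module (Python 3.8 or later). The data are hypothetical. The second series illustrates the "difficult" case in which all frequencies are identical.

```python
from statistics import mode, multimode

sizes = [7, 8, 8, 9, 8, 10, 9]   # 8 occurs three times
tied = [1, 2, 3, 4]              # every value occurs exactly once

modal_size = mode(sizes)         # the single most frequent value
all_modes = multimode(tied)      # every value ties, so all are returned

print(modal_size)
print(all_modes)
```

When `multimode` returns every value in the series, no single modal value can be identified.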
Descriptive Statistics - Measures of Variability
• Range: the spread of the data from the lowest to the highest value in the
distribution.

R = H − L, where H is the highest value and L is the lowest value.
• Standard deviation: the average amount of variability in the data set, i.e. how far, on
average, each value lies from the mean. A high standard deviation means that values are
generally far from the mean, while a low standard deviation indicates that values are
clustered close to the mean.

• Variance: reflects the degree of spread in the data set. The more spread out the
data, the larger the variance. Variance is the square of the standard deviation, so it is
expressed in squared units rather than in the units of the data themselves.
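The three measures can be computed in a few lines of Python. This is a sketch on hypothetical data. It uses the population formulas (`pvariance`, `pstdev`, dividing by n) as an assumption. The sample versions (`variance`, `stdev`) divide by n − 1 instead.

```python
from math import isclose
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical data set, mean = 5

data_range = max(data) - min(data)    # R = H - L
var = pvariance(data)                 # mean squared deviation from the mean
sd = pstdev(data)                     # square root of the variance

print(data_range, var, sd)
print(isclose(sd ** 2, var))          # variance is the square of the standard deviation
```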
Descriptive Statistics - Measure of Shape

SKEWNESS
• Skewness is a statistic that tells us whether a distribution is symmetric.
• A distribution is symmetric if its right side mirrors its left side.
• If a distribution is symmetric, its skewness value is 0.
• For a symmetric, single-peaked distribution such as the normal distribution, mean = median = mode
(skewness value is 0).
• If skewness is greater than 0, the distribution is right-skewed: the right tail is longer
than the left tail.
• If skewness is less than 0, the distribution is left-skewed: the left tail is longer than
the right tail.

Relationship between Skewness and Mean, Median & Mode

• The mode is the high point (apex) of the curve.
• The median is the middle value.
• The mean tends to be located towards the tail of the distribution.
• The coefficient of skewness compares the mean and the median in light of the magnitude of the
standard deviation:

Sk = 3(X̄ − Md) / S

where X̄ is the mean, S is the standard deviation, and Md is the median.

• If the distribution is symmetrical, the coefficient is equal to zero.
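A coefficient built from the mean, the median, and the standard deviation is commonly written as Pearson's second coefficient of skewness, Sk = 3(mean − median) / S. Assuming that is the coefficient intended here, a minimal Python sketch looks as follows. The function name and the data are hypothetical.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]        # mean equals median
right_skewed = [1, 2, 3, 4, 20]    # mean pulled towards the long right tail
left_skewed = [1, 17, 18, 19, 20]  # mean pulled towards the long left tail

print(pearson_skewness(symmetric))     # zero for a symmetric series
print(pearson_skewness(right_skewed))  # positive
print(pearson_skewness(left_skewed))   # negative
```

The sign of the result follows the position of the mean relative to the median, as described in the bullets above.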

Descriptive Statistics- Measure of Shape

KURTOSIS

• Kurtosis measures the peakedness of a distribution.

• Measured as excess kurtosis (kurtosis minus 3), a distribution similar to the normal
distribution has a kurtosis value of 0.
• If kurtosis is greater than 0, the distribution has a higher peak than the normal
distribution.
• If kurtosis is less than 0, the distribution is flatter than a normal distribution.
• There are three types of distributions:
• Leptokurtic: sharply peaked with fat tails, and less variable.
• Mesokurtic: medium peaked.
• Platykurtic: flattest peak and highly dispersed.
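As a rough illustration, excess kurtosis can be computed from the second and fourth central moments. The sketch below uses the simple population-moment formula (fourth moment divided by the squared second moment, minus 3). That choice is an assumption: statistical packages often apply a small-sample correction, so their output may differ slightly. The data are hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - m) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3

flat = [1, 2, 3, 4, 5]                       # evenly spread values
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]   # most values at the centre, two far out

print(excess_kurtosis(flat))     # negative: platykurtic
print(excess_kurtosis(peaked))   # positive: leptokurtic
```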
**********************************************************************

Inferential analysis

INFERENTIAL STATISTICS

[Diagram: Inferential statistics comprise confidence intervals, hypothesis testing, and regression analysis.]

1. Confidence Interval

• A confidence interval uses the variability around a statistic to come up with an interval
estimate for a parameter.
• The confidence level is the percentage of such intervals that would contain the true
parameter if the study were repeated many times.
• A 95% confidence level means that if you repeat your study with a new sample in
exactly the same way 100 times, you can expect about 95 of the resulting intervals to
contain the true parameter.
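A minimal sketch of a confidence interval for a mean, assuming a sample large enough for the normal approximation (for small samples a t critical value would normally replace the z value). The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval: mean ± z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # about 1.96 for 95%
    margin = z * stdev(sample) / sqrt(len(sample))     # z times the standard error
    centre = mean(sample)
    return centre - margin, centre + margin

# Hypothetical sample of 49 observations with values between 50 and 56.
sample = [50 + (i % 7) for i in range(49)]
low, high = mean_confidence_interval(sample)
print(low, high)
```

Raising the confidence level widens the interval, and a larger sample narrows it.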
2. Regression Test

• Regression tests assess whether changes in predictor variables are associated with changes in an
outcome variable. Regression alone does not establish causation.
• The dependent variable's response to a unit change in the independent variable is
examined through linear regression.
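The "response to a unit change" is the slope of the fitted line. A minimal least-squares sketch for one predictor, with a hypothetical function name and hypothetical data:

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx                      # change in y per unit change in x
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]        # hypothetical independent variable
score = [3, 5, 7, 9, 11]       # hypothetical dependent variable

slope, intercept = fit_line(hours, score)
print(slope, intercept)
```

In this example each additional unit of `hours` is associated with an increase of 2 in `score`.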

3. Hypothesis testing

Null Hypothesis – There is no significant relationship between two variables.

Alternate Hypothesis – There is a significant relationship between the variables.

Example:
H0 – There is no significant relationship between eating sugar and weight gain.
Ha – There is a significant relationship between eating sugar and weight gain.
PARAMETRIC TESTS VS. NON-PARAMETRIC TESTS

1. Parametric test – Z Test

• A z-test is used to check whether the means of two populations are different,
provided the data follow a normal distribution.
• It checks whether the means of two large samples are different when the population
variance is known.
• The null hypothesis and the alternative hypothesis must be set.
• The sample size should be greater than 30.
• The z-test formula can be set up for a one-sample and a two-sample z-test.

Z-test formula for one sample:

z = (x̄ − μ) / (σ / √n)

where x̄ = sample mean, μ = population mean, σ = population standard deviation, and
n = the sample size.
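With those symbols, the standard one-sample z statistic is z = (x̄ − μ) / (σ / √n). A minimal sketch follows, with a two-sided p-value taken from the standard normal distribution. The function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return the z statistic and the two-sided p-value."""
    z = (sample_mean - pop_mean) / (pop_sd / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical study: 36 observations, sample mean 52, H0: mu = 50, sigma = 6.
z, p_value = one_sample_z(sample_mean=52, pop_mean=50, pop_sd=6, n=36)
print(z, p_value)
```

At the 5% significance level, H0 would be rejected when the p-value is below 0.05.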
• A two-sample z-test is used to test whether two population means are equal.
• This test assumes that the standard deviation of each population is known.
• Both samples are assumed to come from normal distributions.
• H0: μ1 = μ2 (the two population means are equal)
• HA: μ1 ≠ μ2 (the two population means are not equal)

FORMULA –

z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2)

where x̄1, x̄2: sample means; σ1, σ2: population standard deviations; n1, n2: sample sizes.
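Using the symbols listed above, the standard two-sample z statistic divides the difference in sample means by √(σ1²/n1 + σ2²/n2). A minimal sketch with a hypothetical function name and hypothetical numbers:

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Return the z statistic and two-sided p-value for H0: mu1 = mu2."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical groups with known population standard deviations.
z, p_value = two_sample_z(mean1=52, mean2=50, sd1=6, sd2=8, n1=36, n2=64)
print(z, p_value)
```

Here the p-value is above 0.05, so H0 (equal population means) would not be rejected at the 5% level.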

2. Parametric test – T Test

• A t-test (also known as Student's t-test) is a tool for evaluating the means of one or two
populations using hypothesis testing.
• When choosing a t-test, you need to consider two things: whether the groups being
compared come from a single population or two different populations, and whether you
want to test the difference in a specific direction (one-tailed) or in either direction (two-tailed).

t-Test assumptions
• The data are continuous.
• The sample data have been randomly sampled from a population.
• There is homogeneity of variance (i.e., the variability of the data in each group is
similar).
• The distribution is approximately normal.
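As an illustration of the two-population case under the homogeneity-of-variance assumption, the sketch below computes the pooled-variance (Student's) t statistic and its degrees of freedom. Python's standard library has no t distribution, so the statistic is compared with a tabulated critical value instead of computing a p-value. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Independent two-sample t statistic assuming equal variances."""
    n1, n2 = len(group_a), len(group_b)
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2            # statistic and degrees of freedom

group_a = [1, 2, 3, 4, 5]            # hypothetical scores, group A
group_b = [3, 4, 5, 6, 7]            # hypothetical scores, group B

t, df = pooled_t_statistic(group_a, group_b)
critical = 2.306                     # two-tailed 5% critical value for df = 8, from a t table
print(t, df, abs(t) > critical)
```

Because |t| does not exceed the critical value in this example, the difference between the two group means is not significant at the 5% level.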

3. Parametric test – F Test

• The F-test uses the F statistic to check whether the variances of two samples
(or populations) are equal.
• The populations should have a normal distribution.
• The F-test is also commonly employed to assess the fit of a proposed regression model to the
data, evaluating how well the model explains the variability in the data.
• The F statistic for comparing two variances is given as follows:

F = s1² / s2²

where s1² and s2² are the two sample variances, conventionally with the larger variance in the numerator.
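For comparing two variances, the F statistic is conventionally the ratio of the two sample variances with the larger one in the numerator. Assuming that form, a minimal sketch with a hypothetical function name and hypothetical data:

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return F = larger sample variance / smaller, with (numerator, denominator) df."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

sample_a = [1, 2, 3, 4, 5]       # hypothetical sample, variance 2.5
sample_b = [2, 4, 6, 8, 10]      # hypothetical sample, variance 10

f_value, df_num, df_den = f_statistic(sample_a, sample_b)
print(f_value, df_num, df_den)
```

The resulting F value is then compared with a critical value from an F table for the given degrees of freedom.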
