Lesson 18 Basic Statistical Tool

This document discusses various statistical tools used for analyzing data, including exploratory data analysis, descriptive data analysis, and inferential data analysis. It describes measures of central tendency such as mean, median, and mode. It also covers measures of dispersion like range, average deviation, and standard deviation. Various statistical tests are mentioned, including t-tests, z-tests, ANOVA, correlation, and regression.

Uploaded by

Christma
Copyright
© All Rights Reserved
BASIC STATISTICAL TOOL USED IN ANALYZING DATA

PRACTICAL RESEARCH 2
Data Analysis Strategies
• Exploratory Data Analysis.
• Descriptive Data Analysis.
• Inferential Data Analysis.
Exploratory Data Analysis
• This type of data analysis is used when it is not
clear what to expect from the data.
• This strategy uses numerical and visual
presentations such as graphs.
Descriptive Data Analysis
• This type of data analysis is used to describe,
show or summarize data in a meaningful way,
leading to a simple interpretation of data
- frequency,
- percentage,
- measures of central tendency, and
- measures of dispersion.
Measures of Central Tendency
• Mean – Often called the arithmetic average of a set of data.
It is the sum of the observed values in the distribution divided by
the number of observations.
• Weighted Mean – The weighted average, used when the values do
not all carry the same weight, for example when each value occurs
with a different frequency.
• Median – The midpoint of the distribution. It
represents the point in the data where 50% of the values fall
below that point and 50% fall above it.
• Mode – The most frequently occurring value in a set of
observations.
Mean
Scores in the National Achievement Test (NAT)
n=4     n=4     n=4     n=4     n=4     Total
90      95      96      87      110     478
102     95      98      87      117     499
115     96      91      95      95      492
93      105     86      103     106     493
Total n = 20, ∑x = 1962
Mean
$$\bar{X} = \frac{\sum x}{n} = \frac{1962}{20} = 98.1$$
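The mean computation above can be checked with a short Python sketch. The scores are copied from the NAT table; the variable names are illustrative only.

```python
# Scores from the NAT table, one inner list per table row.
rows = [
    [90, 95, 96, 87, 110],
    [102, 95, 98, 87, 117],
    [115, 96, 91, 95, 95],
    [93, 105, 86, 103, 106],
]

# Flatten the table into a single list of observations.
scores = [x for row in rows for x in row]

total = sum(scores)   # sum of the observed values
n = len(scores)       # number of observations
mean = total / n      # arithmetic mean

print(total, n, mean)
```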
Weighted Mean
Heights of 50 senior high school students
Respondent   Height in inches (x)   Frequency (f)   fx
A            56                     6               336
B            57                     15              855
C            58                     12              696
D            59                     8               472
E            60                     5               300
F            61                     2               122
G            62                     2               124
                                    ∑f = 50         ∑fx = 2905
Weighted Mean
$$WM = \frac{\sum fx}{\sum f} = \frac{2905}{50} = 58.1$$
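As a rough check of the weighted mean, the following Python sketch multiplies each height by its frequency and divides by the total frequency. The data come from the table of heights; the variable names are illustrative.

```python
# Heights in inches (x) and how many students have each height (f).
heights = [56, 57, 58, 59, 60, 61, 62]
freqs = [6, 15, 12, 8, 5, 2, 2]

sum_f = sum(freqs)                                     # total frequency
sum_fx = sum(f * x for f, x in zip(freqs, heights))    # sum of f times x
weighted_mean = sum_fx / sum_f

print(sum_f, sum_fx, weighted_mean)
```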
Median (odd)

16, 18, 25, 18, 25, 18, 25, 30, 34, 36, and 38.

16, 18, 18, 18, 25, 25, 25, 30, 34, 36, and 38.

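Position of the median:

$$\frac{n+1}{2} = \frac{11+1}{2} = \frac{12}{2} = 6$$

The median is the 6th value in the ordered list, which is 25.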
Median (Even)
16, 18, 25, 18, 25, 18, 25, 30, 34, 36, 37 and 38

16, 18, 18, 18, 25, 25, 25, 30, 34, 36, 37 and 38

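$$\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} + \left(\frac{n}{2}+1\right)^{\text{th}}}{2} = \frac{\left(\frac{12}{2}\right)^{\text{th}} + \left(\frac{12}{2}+1\right)^{\text{th}}}{2} = \frac{6^{\text{th}} + 7^{\text{th}}}{2}$$

$$= \frac{25 + 25}{2} = \frac{50}{2} = 25$$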
Mode

16, 18, 25, 18, 25, 18, 25, 30, 34, 36, and 38.

16, 18, 18, 18, 25, 25, 25, 30, 34, 36, and 38.

Bi-modal: 18 and 25 each occur three times.
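The median and mode examples can be reproduced with Python's standard `statistics` module. This is only a sketch: the two lists are the odd and even data sets from the slides, and `multimode` (available from Python 3.8) is used because the data have more than one mode.

```python
import statistics

# Odd-sized data set (11 values) and even-sized data set (12 values).
odd_data = [16, 18, 25, 18, 25, 18, 25, 30, 34, 36, 38]
even_data = [16, 18, 25, 18, 25, 18, 25, 30, 34, 36, 37, 38]

# median() sorts the data internally; for an even count it averages
# the two middle values.
median_odd = statistics.median(odd_data)
median_even = statistics.median(even_data)

# multimode() returns every value that ties for the highest frequency.
modes = statistics.multimode(odd_data)

print(median_odd, median_even, modes)
```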
Measures of Dispersion
The extent of the spread, or dispersion, of the data is described by
a group of measures called measures of dispersion or measures of variability.
• The Range – is the difference between the largest and the smallest
values in a set of data.
• Average (Mean) Deviation – This measure of spread is defined as
the sum of the absolute differences (deviations) between the values in a
set of data and the mean, divided by the total number of values in the set
of data.
• Standard Deviation – is a measure of the spread or variation of
data about the mean.
Range

6, 18, 23, 12, 15, 18, 25, 10, 24, and 28.

6, 10, 12, 15, 18, 18, 23, 24, 25, and 28.

Range = 28 - 6 = 22
Average (Mean) Deviation

20, 25, 35, 40, and 45

$$\bar{X} = \frac{20+25+35+40+45}{5} = 33$$

Average (Mean) Deviation

$$AD = \frac{\sum |X - \bar{X}|}{n}$$

$$= \frac{|20-33| + |25-33| + |35-33| + |40-33| + |45-33|}{5}$$

$$= \frac{|-13| + |-8| + |2| + |7| + |12|}{5}$$

$$= \frac{13 + 8 + 2 + 7 + 12}{5} = 8.4$$

Thus, on the average, each value is 8.4 units from the mean.
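The average deviation steps can be sketched in Python as follows. The five values are the ones used in the example; the variable names are illustrative.

```python
# Data from the average (mean) deviation example.
values = [20, 25, 35, 40, 45]

n = len(values)
mean = sum(values) / n

# Absolute deviation of each value from the mean.
abs_devs = [abs(x - mean) for x in values]

# Average deviation: sum of absolute deviations divided by n.
avg_dev = sum(abs_devs) / n

print(mean, abs_devs, avg_dev)
```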
Standard Deviation
6, 10, 12, 15, 18, 18, 20, 23, 25 and 28

$$\bar{X} = \frac{6+10+12+15+18+18+20+23+25+28}{10} = \frac{175}{10} = 17.5$$
Standard Deviation
Respondent   Score (X)   Mean (X̄)   X − X̄    (X − X̄)²
A            6           17.5        -11.5     132.25
B            10          17.5        -7.5      56.25
C            12          17.5        -5.5      30.25
D            15          17.5        -2.5      6.25
E            18          17.5        0.5       0.25
F            18          17.5        0.5       0.25
G            20          17.5        2.5       6.25
H            23          17.5        5.5       30.25
I            25          17.5        7.5       56.25
J            28          17.5        10.5      110.25
n = 10       ∑X = 175                          ∑(X − X̄)² = 428.50
Standard Deviation
$$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} = \sqrt{\frac{428.5}{10-1}} = \sqrt{\frac{428.5}{9}} = \sqrt{47.611} \approx 6.900$$
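A minimal Python sketch of the sample standard deviation, using the ten scores from the table and the n − 1 divisor from the formula. Evaluated this way, the sum of squared deviations is 428.5 and the standard deviation comes out at about 6.90. The standard-library `statistics.stdev` uses the same n − 1 formula and serves as a cross-check.

```python
import math
import statistics

# Scores of respondents A to J.
scores = [6, 10, 12, 15, 18, 18, 20, 23, 25, 28]

n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - mean) ** 2 for x in scores)

# Sample standard deviation (divide by n - 1, then take the square root).
sd = math.sqrt(sum_sq_dev / (n - 1))

# Cross-check against the standard library.
sd_check = statistics.stdev(scores)

print(mean, sum_sq_dev, round(sd, 3))
```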
Standard Deviation
For scores that are approximately normally distributed:

1. Approximately 68% of the scores in the sample fall within one standard
deviation of the mean (17.5 ± 6.9, or about 10.6 to 24.4).

2. Approximately 95% of the scores in the sample fall within two standard
deviations of the mean (17.5 ± 13.8, or about 3.7 to 31.3).

3. Approximately 99.7% of the scores in the sample fall within three standard
deviations of the mean (17.5 ± 20.7).
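The percentages above describe an approximately normal distribution, so a small sample will not match them exactly. As an illustration, this Python sketch counts how many of the ten scores from the standard deviation example actually fall within one, two, and three standard deviations of the mean. The variable names are illustrative.

```python
import statistics

scores = [6, 10, 12, 15, 18, 18, 20, 23, 25, 28]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)   # sample standard deviation (n - 1 divisor)

# Share of scores inside mean ± k standard deviations, for k = 1, 2, 3.
shares = {}
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    inside = [x for x in scores if low <= x <= high]
    shares[k] = len(inside) / len(scores)

print(shares)
```

In this sample, 6 of the 10 scores lie within one standard deviation and all 10 lie within two, which is close to, but not the same as, the 68% and 95% figures.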
Inferential Data Analysis
• Inferential statistics tests hypotheses about a set of data to
reach conclusions or make generalizations beyond merely
describing the data.
• It includes
- tests of significance of difference (t-test, z-test, ANOVA);
and
- tests of relationship (Product-Moment Coefficient of
Correlation or Pearson r, Spearman rho, linear
regression, and Chi-square test).
Analysis of Variance (ANOVA)

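• Analysis of Variance – relies on the F-ratio to
test the hypothesis that the two variance estimates (between
groups and within groups) are equal, that is, that the subgroups
are from the same population. "Between groups" refers to the
variation between each group mean and the grand
or overall mean. "Within groups" refers to the variation of
each observation about its own group mean.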
Analysis of Variance (ANOVA)
SOURCE     SUM OF SQUARES     DEGREES OF FREEDOM   VARIANCE ESTIMATE    F-RATIO
BETWEEN    SSB                K - 1                MSB = SSB/(K - 1)    MSB/MSW
WITHIN     SSW                N - K                MSW = SSW/(N - K)
TOTAL      SST = SSB + SSW    N - 1

K is the number of groups and N is the total number of observations.
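The entries of the ANOVA table can be sketched in Python as follows. The three groups below are hypothetical data chosen only to make the arithmetic easy to follow; they do not come from the lesson. The variable names mirror the table (SSB, SSW, SST, MSB, MSW).

```python
# Hypothetical scores for K = 3 groups (illustration only).
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

k = len(groups)                                   # number of groups (K)
n_total = sum(len(g) for g in groups)             # total observations (N)
grand_mean = sum(sum(g) for g in groups) / n_total
group_means = [sum(g) / len(g) for g in groups]

# Between-groups sum of squares: group means about the grand mean.
ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))

# Within-groups sum of squares: observations about their own group mean.
ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

# Total sum of squares: observations about the grand mean.
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

msb = ssb / (k - 1)          # between-groups variance estimate
msw = ssw / (n_total - k)    # within-groups variance estimate
f_ratio = msb / msw

print(round(ssb, 3), round(ssw, 3), round(sst, 3), round(f_ratio, 3))
```

The computed F-ratio is then compared with the critical F value for K − 1 and N − K degrees of freedom at the chosen significance level.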
Test of Significance of Difference

• Between means (t-test)
• Between proportions or percentages (z-test)
• Analysis of Variance (ANOVA) – is used when the
significance of the difference among the means of two or
more groups is to be determined at one time.
Test of Relationship
• Spearman Rank-Order Correlation or Spearman rho – This is
used when data available are expressed in terms of ranks (ordinal
variable)
• Chi-Square Test for Independence – This is used when data are
expressed in terms of frequencies or percentages.
• Product-Moment Coefficient of Correlation or Pearson r – This
is used when data are expressed in terms of scores such as weights
and heights or scores in a test (ratio or interval variables).
• t-test to test the significance of Pearson r – This is used to
determine if the value of the computed coefficient of correlation is
significant.
Types of Correlation

1. Simple Correlation. This is a relationship
between two variables. The relationship
between an independent variable and a dependent
variable is usually measured.
2. Multiple Correlation. This involves more than two
variables. The relationship between a
dependent variable and two or more
independent variables is usually measured.
Types of Correlation

3. Partial Correlation. This is a measure
of the relationship between the dependent variable and
a particular independent variable, with the effect
of the other independent variables under study
held constant.
Simple Correlation
A. Linear Correlation – This means that a change in one
variable is at a constant rate with respect to the change in the
second variable.
a.1 Direct – For every increase in one variable, there is a
corresponding increase in the second variable.
a.2 Inverse – For every increase in one variable, there is
a corresponding decrease in the second variable.
B. Curvilinear Correlation – This means that a change in one
variable is not at a fixed rate. It may be increasing or decreasing
with respect to the change in the other variable
Multiple Correlation

A. Non-linear Correlation – This is similar to curvilinear
correlation. However, in this correlation more than
two variables are involved.
B. Joint Correlation – The correlation between
the dependent variable and two or more independent variables
changes when another independent variable is added.
The Coefficient of Correlation

• To obtain the quantitative value of the extent of
the relationship between two sets of items, it is
necessary to calculate the correlation coefficient.
• The values of the coefficient of correlation range
from +1 to -1. Zero represents no relationship.
• Correlation coefficients between +1 and -1
represent various degrees of relationship
between two variables.
The Coefficient of Correlation

• A positive correlation coefficient means that
individuals obtaining high scores in the first
variable tend to obtain high scores in a second
variable.
• A negative correlation coefficient means that for
every increase in one variable, there is a
corresponding decrease in a second variable.
The Coefficient of Correlation
• Pearson r can be interpreted as follows, as suggested by Garett
(1969):
- r from 0.00 to ± 0.20 denotes indifferent, inverse or
negligible relationship
- r from ± 0.21 to ± 0.40 denotes low but slight relationship
- r from ± 0.41 to ± 0.70 denotes substantial or marked
relationship
- r from ± 0.71 to ± 1.00 denotes high to very high
relationship
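The interpretation scale above can be expressed as a small lookup function. This is only a sketch: `interpret_r` is a hypothetical helper name, the labels are shortened versions of the wording in the list, and values that fall in the gaps between the listed bands (for example 0.205) are assigned by treating each upper bound as an inclusive cutoff, which is an assumption.

```python
def interpret_r(r):
    """Return a verbal label for a correlation coefficient r.

    Bands follow the scale in the list above; the sign of r is ignored
    because the scale applies to both positive and negative values.
    """
    size = abs(r)
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if size <= 0.20:
        return "negligible relationship"
    if size <= 0.40:
        return "low relationship"
    if size <= 0.70:
        return "substantial or marked relationship"
    return "high to very high relationship"


print(interpret_r(0.15))
print(interpret_r(-0.55))
print(interpret_r(0.88))
```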
The Pearson Product-Moment Correlation
Coefficient (Pearson r).
Example: Pearson r
The scores of ten randomly selected senior high school students on the
mathematical portion of the National Achievement Test (NAT) and the mathematical
ability part of a university admission test (labeled CAT in the table)
Respondent   NAT (x)    CAT (y)    x²          y²          xy
A            5          6          25          36          30
B            7          15         49          225         105
C            9          16         81          256         144
D            10         12         100         144         120
E            11         21         121         441         231
F            12         22         144         484         264
G            15         18         225         324         270
H            17         26         289         676         442
I            20         25         400         625         500
J            26         30         676         900         780
n = 10       ∑x = 132   ∑y = 191   ∑x² = 2110  ∑y² = 4111  ∑xy = 2886
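The column totals in the table are the inputs to the usual raw-score formula for Pearson r, r = (n∑xy − ∑x∑y) / √[(n∑x² − (∑x)²)(n∑y² − (∑y)²)]. The formula itself is not shown in this part of the lesson, so its use here is an assumption about the intended computation. The Python sketch below recomputes the totals from the raw scores and evaluates r; the variable names are illustrative.

```python
import math

# Scores of respondents A to J.
nat = [5, 7, 9, 10, 11, 12, 15, 17, 20, 26]     # x
cat = [6, 15, 16, 12, 21, 22, 18, 26, 25, 30]   # y

n = len(nat)
sum_x = sum(nat)
sum_y = sum(cat)
sum_x2 = sum(x * x for x in nat)
sum_y2 = sum(y * y for y in cat)
sum_xy = sum(x * y for x, y in zip(nat, cat))

# Raw-score (computational) formula for Pearson r.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy, round(r, 3))
```

With these data r is about 0.88, which falls in the "high to very high relationship" band of the interpretation scale.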
The Spearman Rank Order Correlation
Coefficient (Spearman Rho)
Example: Spearman Rho rs
