Lesson 4-Analysis-Interpretation-Descriptive Statistics

The document discusses descriptive statistics, focusing on measures of central tendency (mean, median, mode) and variability (variance, standard deviation, interquartile range, range). It explains the characteristics of normally distributed, positively skewed, and negatively skewed data, as well as methods for assessing normality such as skewness and the Shapiro-Wilk test. Additionally, it covers the use of frequency counts and percentages in data analysis using JASP software.

Descriptive Statistics

It is used to accurately describe data and the trends or patterns in the data.

It involves the collection, classification/organization, presentation, analysis, and interpretation of data.

The most commonly and widely used descriptive statistics are the measures of central tendency, which are the mean, median, and mode.
Measures of Central Tendency
Used to describe the center of a data distribution.
The mean, median, and mode are the
measures used to describe the
distribution's center.
A data distribution is described as either normally
distributed or skewed; a skewed distribution is
skewed either to the left or to the right.
Measures of Central Tendency
Normally distributed data, often called a bell curve,
produce a graph that is bell-shaped and
symmetrical. Moreover, the
mean, median, and mode are all the same
value and coincide with the peak of the curve.
The figure of normally distributed data is shown
below:
Measures of Central Tendency
Positively skewed means that the left side of the
graph contains more values, while the right side
has a longer distribution tail with fewer
values. In this distribution the mean is
greater than the median.
The figure of a positively skewed distribution is shown below:
Measures of Central Tendency
Negatively skewed means that the right side of
the graph has more values, while the left side
has a longer distribution tail with fewer values.
Furthermore, the mean is lower than the
median, and the mode is typically the highest of the three.
The figure of a negatively skewed distribution is shown below:
Measures of Central Tendency
Mean
is the arithmetic average of a set of scores.
is obtained by getting the sum of the scores and
dividing it by the number of scores.
is the most reliable measure of central tendency
because it involves all the scores in the
distribution.
is affected by extremely low or high scores. There
is only one value for the mean.
Used when the level of measurement of the data
is interval or ratio, and when the data are
normally distributed.
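The effect of an extreme score on the mean can be seen in a short sketch. The scores below are made-up illustration values, not data from the lesson.

```python
from statistics import mean

# Hypothetical scores, used only to illustrate the definition.
scores = [70, 75, 80, 85, 90]

# Mean = sum of the scores divided by the number of scores.
typical_mean = mean(scores)

# Adding one extremely low score pulls the mean down noticeably.
scores_with_outlier = scores + [10]
shifted_mean = mean(scores_with_outlier)

print(typical_mean)
print(round(shifted_mean, 2))
```

Every score enters the sum, which is why the mean uses all the information in the distribution and also why a single extreme value can move it.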
Measures of Central Tendency
Median
is the middle score.
is the score that divides the distribution of scores
into 2 halves, the top half and the bottom half.
is used when extreme scores/values are present. If
there are two middle scores (where n is even), the
median is the average of the two middle scores.
Just like the mean, there is only one value for the
median.
Used when the level of measurement of the data is
ordinal, interval, or ratio, and when the distribution
is skewed (either negatively skewed or positively
skewed).
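A minimal sketch of the median, covering the odd and even cases and its resistance to extreme values. All numbers are hypothetical.

```python
from statistics import mean, median

# Hypothetical scores; median() sorts the data internally.
odd_scores = [3, 9, 5, 7, 1]
even_scores = [3, 9, 5, 7, 1, 11]

# Odd n: the single middle score. Even n: average of the two middle scores.
median_odd = median(odd_scores)
median_even = median(even_scores)

# One extreme value moves the mean far more than the median.
skewed_scores = [20, 22, 25, 27, 30, 400]
skewed_median = median(skewed_scores)
skewed_mean = mean(skewed_scores)

print(median_odd, median_even, skewed_median, round(skewed_mean, 2))
```

In the skewed list the median stays near the bulk of the scores while the mean is dragged toward the extreme value, which is the reason the median is preferred for skewed data.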
Measures of Central Tendency
Mode
is considered as the most popular score because it is the
score that occurred most frequently.
is the least reliable measure of central tendency.
Unlike the mean and median, it can take several values, or
none at all.
Unimodal - when there is only one mode value in the
distribution.
Bimodal - when there are two mode values in the
distribution.
Trimodal or multi-modal - when there are three or more
mode values in the distribution.
Used when the data levels of measurement are nominal,
ordinal, interval, or ratio.
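The mode categories above can be sketched as follows. The helper names `find_modes` and `describe_modality` are hypothetical, and the sketch assumes the convention that a data set has no mode when every score occurs only once.

```python
from collections import Counter

def find_modes(scores):
    """Return the most frequent score(s), or [] when there is no mode."""
    if not scores:
        return []
    counts = Counter(scores)
    highest = max(counts.values())
    # Assumption: if every score occurs exactly once, report no mode.
    if highest == 1:
        return []
    return sorted(value for value, count in counts.items() if count == highest)

def describe_modality(modes):
    """Label a list of modes using the terms from the lesson."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "trimodal or multimodal")

print(find_modes([1, 2, 2, 3]), describe_modality(find_modes([1, 2, 2, 3])))
print(find_modes([1, 1, 2, 2, 3]), describe_modality(find_modes([1, 1, 2, 2, 3])))
print(find_modes([1, 2, 3]), describe_modality(find_modes([1, 2, 3])))
```

Because the mode only counts occurrences, the same approach also works for nominal data such as program names or year levels.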
Normality of Distribution
Skewness
Skewness quantifies the degree to which the
distribution of statistical data departs from the
symmetry of the normal distribution, which is evenly
distributed on both sides.
Skewness value = 0: the distribution of
statistical data is symmetric, as in a normal distribution
Skewness value < 0: the distribution of
statistical data is negatively skewed
Skewness value > 0: the distribution of
statistical data is positively skewed
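As a rough sketch of how a skewness value is obtained, the function below computes the adjusted Fisher-Pearson sample skewness. This particular formula is an assumption here: it is the one commonly reported by statistical packages, but the lesson does not state which formula its software uses. The function name and data are hypothetical.

```python
from math import sqrt

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed formula)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are needed")
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - center) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment applied to the moment-based skewness g1.
    return sqrt(n * (n - 1)) / (n - 2) * g1

symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 2, 2, 3, 10]
long_left_tail = [1, 8, 9, 9, 10, 10]

for data in (symmetric, long_right_tail, long_left_tail):
    print(round(sample_skewness(data), 3))
```

The sign follows the tail: a long right tail gives a positive value and a long left tail gives a negative value.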
Normality of Distribution
Descriptions of skewness are as follows:
1) The data are fairly symmetrical if the
skewness is between -0.5 and 0.5.
2) The data are substantially skewed if the
skewness is between -1 and -0.5
or between 0.5 and 1.
3) The data are highly skewed if the skewness
is less than -1 or higher than 1.
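The three descriptions can be written as a small lookup. The function name `describe_skewness` is hypothetical, and the handling of the exact boundary values (0.5 and 1) is an assumption, since the rule of thumb does not say which category they belong to.

```python
def describe_skewness(skewness):
    """Map a skewness value to the verbal description used in the lesson."""
    magnitude = abs(skewness)
    # Assumption: exactly 0.5 is treated as fairly symmetrical,
    # and exactly 1 as substantially skewed.
    if magnitude <= 0.5:
        return "fairly symmetrical"
    if magnitude <= 1:
        return "substantially skewed"
    return "highly skewed"

for value in (0.2, -0.7, 1.4):
    print(value, describe_skewness(value))
```

The sign of the value still tells the direction: a negative value points to a negative skew and a positive value to a positive skew.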
Normality of Distribution
The standard errors of skewness and kurtosis can also
be used for checking normality.
That is, the z-scores for skewness and kurtosis are
used as a rule.
If the absolute z-scores of skewness and kurtosis are smaller
than 1.96 (for a 5% type I error rate), the data are
considered normal (Field, 2009; Kim, 2013).
A z-score is obtained by dividing the
skewness value or the excess kurtosis value by its
standard error. For small samples (n < 50), z
values within ±1.96 are sufficient to establish normality of
the data.
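The z-score rule can be sketched as below. The standard-error formulas are the ones commonly used with sample skewness and excess kurtosis; treating them as the formulas behind the software output is an assumption, and the function names and input values are hypothetical.

```python
from math import sqrt

def standard_errors(n):
    """Standard errors of sample skewness and excess kurtosis for sample size n."""
    if n < 4:
        raise ValueError("at least four observations are needed")
    se_skew = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return se_skew, se_kurt

def z_scores_suggest_normality(skewness, excess_kurtosis, n, critical=1.96):
    """Apply the |z| < 1.96 rule to both skewness and excess kurtosis."""
    se_skew, se_kurt = standard_errors(n)
    z_skew = skewness / se_skew
    z_kurt = excess_kurtosis / se_kurt
    return abs(z_skew) < critical and abs(z_kurt) < critical

# Hypothetical values as they might be read from a descriptives table.
print(z_scores_suggest_normality(skewness=0.40, excess_kurtosis=-0.50, n=30))
print(z_scores_suggest_normality(skewness=1.20, excess_kurtosis=0.30, n=30))
```

When the software already reports the standard errors, the division can be done directly with those reported values instead of recomputing them.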
Normality of Distribution
Shapiro-Wilk Test
The Shapiro-Wilk test can be used to decide
whether or not a sample fits a normal distribution,
and it is commonly used for small samples.
If the chosen alpha level is 0.05 and the p-value is
less than 0.05, then the null hypothesis that the
data are normally distributed is rejected (this
means that the distribution is not normal). If the
p-value is greater than 0.05, then the null hypothesis
is not rejected (this means that the data are treated as
normally distributed).
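The decision rule can be sketched as a small function that takes a p-value already produced by the test (for example, read from the software output, or obtained from `scipy.stats.shapiro` in Python). The function name is hypothetical, and treating a p-value exactly equal to alpha as "not rejected" is an assumption.

```python
def shapiro_wilk_decision(p_value, alpha=0.05):
    """Turn a Shapiro-Wilk p-value into the decision described in the lesson."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        # Null hypothesis of normality is rejected.
        return "reject normality"
    # Null hypothesis is not rejected; data are treated as normal.
    return "do not reject normality"

print(shapiro_wilk_decision(0.012))
print(shapiro_wilk_decision(0.340))
```

Failing to reject the null hypothesis does not prove that the data are normal; it only means the sample gives no strong evidence against normality.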
Measures of Variability
Used to describe the spread of the
data, or its variation around a central
value.
The variance, standard deviation, and
interquartile range are the most
commonly used measurements to
describe variability.
Measures of Variability
The measures of variability reported with the
mean are the standard deviation and variance;
with the median, the interquartile range;
and with the mode, the
range.
Measures of Variability
Variance
is a numerical value that indicates the degree to
which individuals within a group differ.
If individual observations differ greatly from the
group mean, the variance is large; conversely, if
individual observations deviate little from the
group mean, the variance is small.
is the expectation of the squared deviation from
the mean.
It measures how far a set of numbers is
spread out from the mean.
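A short sketch of the variance as the average squared deviation from the mean, using two hypothetical groups that share the same mean. The population variance divides by n, while the sample variance divides by n - 1; which one a given software reports is not stated in the lesson.

```python
from statistics import mean, pvariance, variance

# Two hypothetical groups with the same mean (50) but different spread.
tight_group = [48, 49, 50, 51, 52]
spread_group = [10, 30, 50, 70, 90]

def mean_squared_deviation(values):
    """Population variance written out from its definition."""
    center = mean(values)
    return sum((x - center) ** 2 for x in values) / len(values)

print(mean_squared_deviation(tight_group), pvariance(tight_group), variance(tight_group))
print(mean_squared_deviation(spread_group), pvariance(spread_group), variance(spread_group))
```

The group whose observations sit far from the mean has the much larger variance, even though both groups have the same center.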
Measures of Variability
Standard deviation
As with variance, it is a numerical figure that
describes the variability of individual data points
within a group.
When the standard deviation is large, there is a
large variation of the data or observations from
the distribution mean (a heterogeneous data set).
When the standard deviation is small, the
observations are close to the group mean and
there is a small variation in the data
(a homogeneous data set).
Measures of Variability
Standard deviation
is the square root of the variance.
It describes, on average, the
distance of the scores from the
mean.
The higher the value of the
standard deviation, the farther the
scores are from the mean.
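The relationship between the two measures can be checked directly. The scores are hypothetical, and the population versions (`pstdev`, `pvariance`) are used here only to keep the arithmetic simple.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance

# Hypothetical scores; both lists have a mean of 5.
close_scores = [2, 4, 4, 4, 5, 5, 7, 9]
far_scores = [0, 2, 4, 4, 6, 6, 8, 10]

sd_close = pstdev(close_scores)
sd_far = pstdev(far_scores)

# The standard deviation is the square root of the variance.
print(isclose(sd_close, sqrt(pvariance(close_scores))))
print(sd_close, sd_far)
```

Unlike the variance, the standard deviation is expressed in the same units as the scores, which makes it easier to report alongside the mean.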
Measures of Variability
Interquartile range
The difference between the upper and
lower quartile values, that is, the spread of the
middle fifty percent of a set of data.
Indicates the interval in which the
middle half of the values are located.
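A minimal sketch of the interquartile range using Python's `statistics.quantiles`. Quartile conventions differ between software packages, so the exact quartile values may not match other tools; the data are hypothetical.

```python
from statistics import quantiles

# Hypothetical ordered scores.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)

# Interquartile range = upper quartile - lower quartile.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

Because it ignores the lowest and highest quarters of the data, the interquartile range is not affected by extreme scores, which is why it pairs with the median.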
Measures of Variability
Range
is the score distance between the
highest and the lowest value in the
distribution
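The range is a one-line calculation; the sketch below also shows how a single extreme score changes it. The function name and scores are hypothetical.

```python
def score_range(scores):
    """Distance between the highest and the lowest score."""
    return max(scores) - min(scores)

scores = [55, 60, 65, 70, 75]

print(score_range(scores))
# One extreme low score widens the range considerably.
print(score_range(scores + [5]))
```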
Frequency Counts and Percentages in JASP
Most of the time, before evaluating the information
that addresses the primary goal of a study (i.e., testing
a hypothesis, such as finding out if there is a significant
difference in test anxiety between male and female
students), we begin by figuring out how many
respondents in a large data set fall into each category
of a study variable. For example, we might be
interested in finding out "What percentage of the 1000
respondents are women?" or "What percentage of
students strongly agreed with one of the self-efficacy
questionnaire items?", without actually having to count
the responses by hand.
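Outside JASP, the same counting can be sketched in a few lines of Python. The function name `frequency_table` and the responses are hypothetical, not the lesson's data file.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: (frequency, percent)} for a list of responses."""
    counts = Counter(responses)
    total = len(responses)
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.items()
    }

# Hypothetical responses for the variable "sex".
sex = ["Female"] * 6 + ["Male"] * 4

for category, (count, percent) in frequency_table(sex).items():
    print(f"{category}: {count} ({percent}%)")
```

The percentage for each category is its frequency divided by the total number of respondents, multiplied by 100.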
Frequency Counts and Percentages in JASP
To illustrate this, we will use the Excel file Sample
Matrix of Data Lecture CSV, which is shown below:
Frequency Counts, Percentages, Descriptive
Statistics, and Normality Distribution in
JASP
Import the file to JASP.
Determine the frequency and percentages
of the data.
If the table shows 0 Missing, this indicates
that the data set is complete.
Frequency Counts, Percentages, Descriptive
Statistics, and Normality Distribution in
JASP
Present the results:
The frequency table shows that out of ____ respondents, ____ were
males, and that is _____%. There are ____ females comprising _____%.
Similarly, _____ or _____% are enrolled as _________ year level, while
_____ or ____% are enrolled as________ year level. In terms of program,
____(_____%) responded as ________ program, while ____(_____%) were
________ program.
The output shows the descriptive statistics of males and females on the
engagement scale. The mean age of the ___ males is _____ with a
standard deviation of ____. Similarly, the mean age of the ____ females
is ___ with a standard deviation of ____. The output also shows that
the ages of the males are more ________ than the ages of the females. The
overall mean of the ____ students on the engagement scale is _____.
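The figures that fill in a write-up like this come from descriptive statistics split by group. The sketch below shows one way to compute them in Python; the function name, field names, and records are all hypothetical and are not the lesson's data.

```python
from statistics import mean, stdev

def describe_by_group(records, group_key, value_key):
    """Return n, mean, and sample standard deviation for each group."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record[value_key])
    # stdev() needs at least two values per group.
    return {
        group: {"n": len(values), "mean": mean(values), "sd": stdev(values)}
        for group, values in groups.items()
    }

# Hypothetical respondents.
respondents = [
    {"sex": "Male", "age": 18}, {"sex": "Male", "age": 19}, {"sex": "Male", "age": 20},
    {"sex": "Female", "age": 18}, {"sex": "Female", "age": 22}, {"sex": "Female", "age": 26},
]

for group, stats in describe_by_group(respondents, "sex", "age").items():
    print(group, stats["n"], stats["mean"], stats["sd"])
```

In this made-up example the males have the smaller standard deviation, so their ages would be described as more homogeneous than those of the females.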
Frequency Counts, Percentages, Descriptive
Statistics, and Normality Distribution in
JASP
Activity 2. Present the results of the
data that you have gathered.
