SFM Draft
Data can be categorized into two types, qualitative and quantitative, each requiring
different methods of analysis.
The relative frequency of a class is the fraction or proportion of observations that
belong to that class (Anderson, D.R. et al, 2017).
A percent frequency distribution shows the percent frequency of the data for each
class. The percent frequency of a class is calculated by multiplying its relative frequency
by 100 (Anderson, D.R. et al, 2017).
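The two definitions above can be illustrated with a minimal Python sketch. The function name `frequency_table` and the sample responses are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter

def frequency_table(observations):
    """Return {class: (frequency, relative frequency, percent frequency)}."""
    counts = Counter(observations)
    n = len(observations)
    table = {}
    for label, freq in counts.items():
        relative = freq / n        # fraction of observations in the class
        percent = relative * 100   # percent frequency = relative frequency x 100
        table[label] = (freq, relative, percent)
    return table

# Hypothetical sample of 10 categorical responses (not taken from the report).
sample = ["A", "B", "A", "C", "A", "B", "A", "C", "B", "A"]
table = frequency_table(sample)
for label in sorted(table):
    freq, relative, percent = table[label]
    print(f"{label}: frequency={freq}, relative={relative:.2f}, percent={percent:.0f}%")
```

The relative frequencies of all classes always sum to 1, and the percent frequencies sum to 100.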
Graphical methods
Creating a frequency distribution for a quantitative variable involves two stages. In the
first stage, classes are created by defining the ranges that will be used to group the data.
In the second stage, a class width is selected (Anderson, D.R. et al, 2017).
Percent and relative frequency distributions are calculated in the same manner as for
qualitative data.
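The grouping of quantitative data into classes might be sketched as follows. The class-width rule used here (range divided by the number of classes, rounded up) is a common textbook approximation and is an assumption, as are the function name and the sample values.

```python
import math

def quantitative_frequency_distribution(values, num_classes):
    """Group numeric values into equal-width classes and count each class."""
    low, high = min(values), max(values)
    # Assumed rule: class width = (largest - smallest) / number of classes, rounded up.
    width = max(1, math.ceil((high - low) / num_classes))
    counts = [0] * num_classes
    for v in values:
        # Index of the class containing v; the largest value is kept in the last class.
        idx = min(int((v - low) // width), num_classes - 1)
        counts[idx] += 1
    n = len(values)
    rows = []
    for k, freq in enumerate(counts):
        lower = low + k * width
        upper = lower + width
        # (lower limit, upper limit, frequency, relative frequency, percent frequency)
        rows.append((lower, upper, freq, freq / n, 100 * freq / n))
    return rows

# Hypothetical data set of 12 integer observations.
values = [2, 3, 4, 5, 7, 8, 9, 11, 14, 15, 17, 20]
rows = quantitative_frequency_distribution(values, num_classes=3)
for lower, upper, freq, relative, percent in rows:
    print(f"{lower}-{upper}: frequency={freq}, relative={relative:.2f}, percent={percent:.1f}%")
```

Each class covers values from its lower limit up to, but not including, its upper limit, except the last class, which also includes the maximum.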
Grouped: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$
values. This allows quick and effective data aggregation (Hippel, P, 2023).
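The grouped-mean formula can be checked with a short Python sketch. It assumes that each x_i is the midpoint of a class and each f_i is the frequency of that class. The function name and the numbers are hypothetical.

```python
def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    total_frequency = sum(frequencies)
    weighted_sum = sum(f * x for f, x in zip(frequencies, midpoints))
    return weighted_sum / total_frequency

# Hypothetical classes 0-10, 10-20, 20-30, 30-40, represented by their midpoints.
midpoints = [5, 15, 25, 35]
frequencies = [2, 5, 8, 5]
print(grouped_mean(midpoints, frequencies))  # (10 + 75 + 200 + 175) / 20 = 23.0
```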
Mode
Definition: The category or value that occurs most frequently.
Strength: Because the mode is unaffected by outlier values in the data set, it can be
used when the values in the data vary widely (Taylor, S, 2019).
Weakness: If the data set contains only a few values, the mode may not be properly
identified (Taylor, S, 2019).
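Both points about the mode can be seen in a small Python sketch. The helper name `modes` and the sample lists are hypothetical. The function returns every value tied for the highest frequency, so a tie signals that no single mode can be identified.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)

# The outlier 100 does not change the mode.
print(modes([3, 7, 7, 2, 7, 100]))  # [7]

# With only a few distinct values, every value ties and no clear mode emerges.
print(modes([1, 2, 3]))  # [1, 2, 3]
```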
Median
Formula: $x_{M(\min)} + h_M \cdot \frac{\frac{\sum f_i}{2} - S_{M-1}}{f_M}$
Strength: The median is a more accurate measurement when the distribution contains
outliers, since it is unaffected by excessively large or small values (Zach, 2023).
Weakness: The median does not use the complete range of available information, because
its calculation does not take into account all of the values in the dataset (Zach, 2023).
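The median formula for grouped data can be sketched in Python as follows. The text does not define the symbols, so the reading used here is an assumption: x_M(min) is the lower limit of the median class, h_M is its width, S_(M-1) is the cumulative frequency of the classes before it, and f_M is its frequency. The function name and the data are hypothetical, and equal class widths are assumed.

```python
def grouped_median(lower_limits, width, frequencies):
    """Median of grouped data with equal-width classes."""
    total = sum(frequencies)
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    half = total / 2
    cumulative = 0  # cumulative frequency before the current class (S_{M-1})
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= half:
            # This is the median class: interpolate within it.
            return lower + width * (half - cumulative) / f
        cumulative += f
    raise ValueError("lower_limits and frequencies have different lengths")

# Hypothetical classes 0-10, 10-20, 20-30, 30-40.
lower_limits = [0, 10, 20, 30]
frequencies = [2, 5, 8, 5]
print(grouped_median(lower_limits, 10, frequencies))  # 20 + 10 * (10 - 7) / 8 = 23.75
```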
Graphical methods
Histogram
Strength: Histograms allow viewers to quickly compare a wide range of data. Large data
ranges benefit greatly from the consistent intervals, which make it easier to move data
from the frequency table to the graph.
Weakness: Histograms give a broad picture but may overlook specifics and discrete data
points.

Dot Plot
Strength: Dot plots are simple to read and comprehend, so even those with no background
in statistics can use them.
Weakness: Dot plots can become cluttered and less effective with large data sets because
the dots can overlap and become hard to see.

Scatter Diagram
Strength: Scatter diagrams make it simpler to identify patterns, trends, and correlations
by clearly illustrating the relationship between two variables.
Weakness: Scatter diagrams can become congested when working with a lot of data, which
makes it harder to see particular trends or data points.

Ogive
Strength: Ogives are excellent at showing how the data change from class to class, with
the slope indicating an increase or decline.
Weakness: Ogives do not provide much information about the data's sharpness, deviation,
dispersion, or central tendency.

Stem and leaf display
Strength: Stem and leaf displays let you view specific values by presenting the data in
full detail. Other chart styles typically do not provide this degree of detail.
Weakness: Stem and leaf displays can become complicated and hard to read for very large
data sets.
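To illustrate how a stem and leaf display keeps every individual value visible, here is a minimal Python sketch. It assumes a non-empty list of non-negative integers, with the tens as stems and the units as leaves. The function name and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Build a text stem and leaf display (stem = tens, leaf = units)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data set.
for line in stem_and_leaf([24, 12, 38, 21, 15, 30, 24]):
    print(line)
# 1 | 2 5
# 2 | 1 4 4
# 3 | 0 8
```

Each original value can be read back from the display, which is also why the display grows unwieldy as the data set gets larger.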
b. Inferential analysis
b.1. Population and sample
A sample is a subset of the population, while the population is the collection of all
elements of interest in a given study (Anderson, D.R. et al, 2017).
b.2. T-test
Definition: The t-test is a statistical technique for determining whether there is a
statistically significant difference between the means of two groups. It is a standard
research tool for comparing two samples (Kim, T.K., 2015).
Strength: Because the t-test relies on the sample mean and variance rather than on the
population as a whole, it can be applied successfully to small samples. Additionally, the
t-test makes it simple to compare the two groups directly and find differences.
Weakness: The t-test requires the data to follow a normal distribution, which not all
kinds of data do. Furthermore, this approach assumes that the variances of the two
groups are equal, an assumption that may not hold.
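The role of the equal-variance assumption can be seen in a short Python sketch of the two-sample t statistic. It uses only the standard library and computes the test statistic, not a p-value. The function names and the sample data are hypothetical. The second function is Welch's variant, which does not pool the variances.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df

def welch_t_statistic(sample_a, sample_b):
    """Two-sample t statistic without the equal-variance assumption (Welch)."""
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = math.sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical small samples with equal variances.
group_a = [5, 7, 9]
group_b = [1, 3, 5]
t_value, df = pooled_t_statistic(group_a, group_b)
print(f"pooled t = {t_value:.4f}, df = {df}")
print(f"Welch t = {welch_t_statistic(group_a, group_b):.4f}")
```

When the two samples have equal sizes and equal variances, as in this example, both statistics coincide. They diverge as the variances or sample sizes become unequal.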
Strength: The quality of a regression model is simple to evaluate, because correlation
coefficients and related statistics are used for the evaluation.
Weakness: The accuracy of the regression model suffers if the input data contain
errors such as missing values, duplicate records, outliers, or an uneven distribution of
data.
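As an illustration of evaluating a regression model through the correlation coefficient, here is a minimal Python sketch. The text does not specify the type of regression, so this assumes simple linear regression with one predictor fitted by least squares. The function name and the data are hypothetical.

```python
import math

def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept, r)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r = simple_linear_regression(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {r:.4f}")
```

A value of r close to 1 or -1 indicates a strong linear relationship. Re-running the fit after removing or altering a single point is a quick way to see how sensitive the model is to data errors.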