STATISTICS Methods of Research
GROUP 6
DEFINITION OF STATISTICS & NEED OF STATISTICS IN RESEARCH
DATA CHARACTERIZATION & CLASSIFICATION
Data Characterization
Data characterization is a summarization of the general characteristics or features of a target
class of data.
The data corresponding to the user-specified class are typically collected by a query.
The output of data characterization can be presented in various forms.
Examples include pie charts, bar charts, curves, multidimensional data cubes, and
multidimensional tables, including crosstabs. The resulting descriptions can also be presented
as generalized relations or in rule form (called characteristic rules).
Data Classification
What is data classification?
● Data classification is the process of organizing data into categories that make it easy to retrieve,
sort and store for future use.
● It is the process of arranging data into homogeneous groups according to their common
characteristics.
● Heterogeneous data is divided into separate homogeneous classes. Ex. Separating data into
public, internal, and confidential classes.
● Systematic classification of data helps organizations manipulate, track, and analyze individual
pieces of data. Data professionals often have a specific goal when categorizing data. The goal
affects the approach they take and the classification levels they use. Common goals include:
◦ Confidentiality
◦ Data Integrity
◦ Data Availability
Data Classification OBJECTIVES
a. Simplification
b. Improves Utility
c. Brings out Individuality
d. Aids Comparison
e. Increases Reliability
f. Makes Data Attractive
g. Consolidation
Types of Data Classification
§ GEOGRAPHICAL CLASSIFICATION
Data are classified according to location, geographical area, or region.
Example: Palay Production by Region in the Philippines, 2019-2020, in metric tons (Source: psa.gov.ph)
§ CHRONOLOGICAL CLASSIFICATION
Data are classified on the basis of time of existence, such as years, months, weeks, or days.
Data are arranged in either ascending or descending order. It is also known as Temporal
Classification.
Example: Philippine GDP Growth Rate (Source: datacommons.org)
§ QUALITATIVE CLASSIFICATION
Data are classified according to characteristics, attributes, or qualities such as gender,
hair colour, literacy, intelligence, religion, etc.
The attribute under study cannot be measured; it can only be determined whether it is present
or missing in the units under study.
TWO TYPES OF QUALITATIVE DATA:
● Simple Classification
Data is classified into two groups.
Ex. Educational qualification: educated & uneducated.
● Manifold Classification
Data is classified on the basis of two or more qualities.
Ex. Population classified on the basis of sex and religion.
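The difference between simple and manifold classification can be sketched in Python. This is only an illustration: the records below are hypothetical, not data from the source, and `collections.Counter` is just one convenient way to tally groups.

```python
from collections import Counter

# Hypothetical records for illustration only: (sex, religion, educated?)
people = [
    ("F", "Catholic", True),
    ("M", "Muslim", False),
    ("F", "Muslim", True),
    ("M", "Catholic", True),
    ("F", "Catholic", False),
]

# Simple classification: a single attribute splits the data into two groups.
simple = Counter(
    "educated" if educated else "uneducated" for _, _, educated in people
)

# Manifold classification: two (or more) attributes are used together,
# so each record falls into one combined class such as ("F", "Catholic").
manifold = Counter((sex, religion) for sex, religion, _ in people)

print(simple)
print(manifold)
```

Each record lands in exactly one class in both schemes; the manifold version simply has more, finer classes.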
§ QUANTITATIVE CLASSIFICATION
Data are classified according to a quantitative variable, that is, one that can be calculated, measured,
and/or operated on; it presents information on a numerical scale. Ex. Temperature, volume, height,
income, students' results, or any other computed or numerical value.
Also known as classification by variables.
2 TYPES OF QUANTITATIVE DATA:
◦ A. Discrete Variable - A discrete variable can take only certain specific values, typically whole
numbers (integers). E.g. number of children in a family or number of classrooms in a school.
◦ B. Continuous Variable - A continuous variable can take any numerical value within a specific
interval. Example: the weight of a student in a particular class, which may be any value between,
say, 60 and 80 kg.
Parametric Statistics
● It relies on assumptions about the shape of the distribution in the underlying population and
about the form or parameters of the assumed distribution.
Nonparametric Statistics
● It makes few or no assumptions about the distribution of the population; that is, the data can be
collected from a sample that does not follow a specific distribution. The sample data need not be
numerical measurements and may instead be based on other criteria, such as ranking or frequency of
occurrence. Because nonparametric tests involve weaker assumptions, they are generally less powerful
than parametric tests when the parametric assumptions hold, and may require larger samples to
achieve the same power.
Nonparametric tests are used with nominal and ordinal data. Examples are:
● Chi-square test - useful for determining whether a statistically significant relationship exists
between two categorical variables, for example, age group and frequency of library use.
● Spearman rank-order correlation - a nonparametric correlation coefficient that can be calculated
for ranked or ordinal-level data.
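As a rough sketch of how these two statistics are computed, the Python below uses only plain arithmetic. The function names and the example numbers are hypothetical, chosen for illustration; the code returns only the test statistic and the correlation coefficient, not a p-value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical counts: rows = two age groups, columns = frequent / infrequent library use.
print(chi_square_statistic([[20, 30], [30, 20]]))
# Hypothetical rankings of five items by two judges.
print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

To decide significance, the chi-square statistic would still have to be compared against the chi-square distribution with the appropriate degrees of freedom; a statistics library normally handles that step.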
Parametric vs Nonparametric
Parametric                                Nonparametric
Population is well-known                  No information about the population available
Assumptions made about the population     No assumptions made about the population
Sample data based on distribution         Arbitrary sample data
Applicable for continuous variables       Applicable for continuous and discrete variables
More powerful                             Less powerful
DESCRIPTIVE FROM
INFERENTIAL STATISTICS
DESCRIPTIVE STATISTICS
- is the term given to the analysis of data that helps to describe, show, and summarize data in a
meaningful way.
It is important to present raw data in an effective and meaningful way using numerical
calculations or tables.
COMMON FORMS OF DESCRIPTIVE STATISTICS:
1. Summary Statistics - statistics that summarize the data using a single number.
a. Measures of central tendency - these numbers describe where the center of a dataset is located.
Examples:
Mean - the arithmetic average of the values.
Median - the middle value when the data are sorted.
Mode - the value that occurs most frequently in the dataset.
b. Measures of dispersion - these numbers describe how spread out the values are in the dataset.
Examples:
Range - the difference between the largest and smallest values.
Standard deviation - the typical distance of the values from the mean (the square root of the variance).
Variance - the average squared deviation of the values from the mean.
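The summary statistics above can be computed with Python's standard `statistics` module. This is a minimal sketch on a small hypothetical set of scores; note that `statistics.variance` and `statistics.stdev` return the sample versions (dividing by n - 1), which is an assumption about which variant is wanted.

```python
import statistics

# Hypothetical exam scores, for illustration only.
scores = [70, 75, 75, 80, 90]

# Measures of central tendency
mean = statistics.mean(scores)      # arithmetic average: 78
median = statistics.median(scores)  # middle value of the sorted data: 75
mode = statistics.mode(scores)      # most frequent value: 75

# Measures of dispersion
data_range = max(scores) - min(scores)   # largest minus smallest: 20
variance = statistics.variance(scores)   # sample variance: 57.5
std_dev = statistics.stdev(scores)       # square root of the sample variance

print(mean, median, mode, data_range, variance, std_dev)
```

For population (rather than sample) figures, the same module offers `statistics.pvariance` and `statistics.pstdev`.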
2. Regression Analysis - used to quantify how one variable changes with respect to another
variable.
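As one possible illustration, simple linear regression fits a straight line by least squares, and the slope then quantifies how much one variable changes per unit change in the other. The sketch below assumes a single predictor; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data constructed to be exactly linear: score = 50 + 2 * hours.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(study_hours, exam_scores)
print(slope, intercept)  # 2.0 50.0
```

Here the slope of 2.0 reads as "each additional hour is associated with two more points" in this made-up dataset; real data would not fall exactly on the line.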
STATISTICAL TECHNIQUES USED TO TREAT RESEARCH DATA
What is statistical analysis?
- It involves analyzing collected data with the aid of
statistical tools to answer the research problem. Thus, it
gives meaning to the data collected.
Common Statistical Techniques
1. Measure of Central Tendency – This describes the center, or typical value, of the
data using the mean, median, or mode.
2. Parametric Test – This test assumes that the data are on a quantitative (numerical)
scale, with a normal distribution of the underlying population.
· T-test – This is used to test the null hypothesis that there is no difference
between the means of two groups. It cannot be used directly to compare
three or more groups.
· Analysis of Variance (ANOVA) – This is used to test whether there is any
significant difference between the means of two or more groups.
3. Non-parametric Test – This statistical technique is used to analyze ordinal and categorical
data, that is, data measured on a ranked or nominal scale.
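The two parametric tests listed above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the t statistic is the pooled (equal-variance) version for two independent samples, the F statistic is for one-way ANOVA, the function names and data are hypothetical, and only the test statistics are computed, not p-values.

```python
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's t statistic for two independent samples (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def anova_f_statistic(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical measurements for three groups.
group_a = [1, 2, 3]
group_b = [2, 3, 4]
group_c = [3, 4, 5]

t = pooled_t_statistic(group_a, group_b)
f_two = anova_f_statistic(group_a, group_b)
f_three = anova_f_statistic(group_a, group_b, group_c)
print(t, f_two, f_three)
```

With exactly two groups the ANOVA F statistic equals the square of the pooled t statistic, which is why ANOVA is described as covering two or more groups while the t-test covers only two.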