Introduction of Statistics
IMPORTANT TERMS IN STATISTICS
Statistics is defined as a science that studies data to make a decision. Hence, it
is a tool in the decision-making process.
Statistics involves the methods of collecting, processing, summarizing, and
analyzing data to provide answers or solutions to an inquiry.
I. Measures of Frequency
* Count, Percent, Frequency
* Use this when you want to show how often a response is given
* Measures of dispersion: use these to show how "spread out" the data are. It is
helpful to know when your data are so spread out that the spread affects the mean
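The measures of frequency listed above (count and percent) can be sketched in a few lines of Python. This is only a minimal illustration: the survey responses and scores are made-up data, the function names are hypothetical, and the range is used here as one simple way to describe spread.

```python
from collections import Counter

# Hypothetical survey responses (illustrative data, not from any real survey).
responses = ["Yes", "No", "Yes", "Yes", "Undecided", "No", "Yes", "No"]


def frequency_table(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(category, count, 100 * count / total)
            for category, count in counts.most_common()]


def value_range(numbers):
    """Range = highest value minus lowest value, a simple measure of spread."""
    return max(numbers) - min(numbers)


for category, count, percent in frequency_table(responses):
    print(f"{category:<10} {count:>3} {percent:6.1f}%")

# Two hypothetical sets of scores with the same mean but different spread.
tight_scores = [48, 50, 52]
wide_scores = [10, 50, 90]
print(value_range(tight_scores), value_range(wide_scores))
```

The count answers "how often", the percent puts that count in relation to the total number of responses, and the range shows that two data sets can share a mean while being spread out very differently.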
NATURE OF DATA
Data – a collection of facts from experiments, observations, sample surveys,
censuses, and administrative reporting systems.
• Data are facts and figures that are presented, collected, and analyzed.
Data are either numeric or non-numeric and must be contextualized.
• To contextualize data, that is, to put meaning on the data, we must
identify the six W’s of the data.
• Age
Categorical variables
Categorical variables represent groupings of some kind. They are
sometimes recorded as numbers, but the numbers represent categories rather
than actual amounts of things.
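A short Python sketch may help show why numbers that stand for categories should be counted rather than averaged. The coding scheme and the recorded values below are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical coding scheme: the numbers are labels, not amounts.
BLOOD_TYPE_CODES = {1: "A", 2: "B", 3: "AB", 4: "O"}

# Hypothetical recorded data, stored as numeric codes.
recorded = [1, 4, 4, 2, 1, 4, 3]


def count_by_category(codes, labels):
    """Translate each numeric code to its category label and count the labels."""
    return Counter(labels[code] for code in codes)


counts = count_by_category(recorded, BLOOD_TYPE_CODES)
for label, count in counts.most_common():
    print(label, count)
```

Averaging the codes would give a number with no meaning, because code 4 is not "four times" code 1. Counting how many observations fall in each category is the sensible summary.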
• Independent variables (aka treatment variables)
  Definition: Variables you manipulate in order to affect the outcome of an
  experiment.
  Example: The amount of salt added to each plant’s water.
• Dependent variables (aka response variables)
  Definition: Variables that represent the outcome of the experiment.
  Example: Any measurement of plant health and growth: in this case, plant
  height and wilting.
• Control variables
  Definition: Variables that are held constant throughout the experiment.
  Example: The temperature and light in the room the plants are kept in, and
  the volume of water given to each plant.
Example datasheet
In this experiment, we have one independent and three dependent variables.
The other variables in the sheet can’t be classified as independent or
dependent, but they do contain data that you will need in order to interpret
your dependent and independent variables.
LEVEL OF MEASUREMENT
a) Nominal level of measurement is characterized by data that consist of
names, labels, or categories only.
d) Ratio level of measurement is the highest level of measurement. Like interval
data, ratio data can be ordered. What differentiates it from interval data is
that its zero is absolute: a value of zero means the quantity is absent.
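The role of an absolute zero can be illustrated with temperature, a common textbook example that is assumed here rather than taken from these notes. Degrees Celsius has an arbitrary zero, so it is interval data. Kelvin has an absolute zero, so it is ratio data. The sketch below shows that "twice as much" only makes sense on the ratio scale.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvins."""
    return celsius + 273.15


cool_c, warm_c = 10.0, 20.0

# On the Celsius scale the quotient is 2.0, but 20 degrees C is not
# "twice as hot" as 10 degrees C because 0 degrees C is an arbitrary zero.
ratio_celsius = warm_c / cool_c

# On the Kelvin scale zero is absolute, so the quotient is meaningful.
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(ratio_celsius)
print(round(ratio_kelvin, 3))
```

Differences are meaningful on both scales (the two temperatures are 10 units apart either way), but only the Kelvin quotient describes how much warmer one temperature is than the other.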
SAMPLING TECHNIQUE
Sampling refers to the process of selecting individuals who will participate as
part of the study.
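As one possible illustration, the selection process can be sketched with simple random sampling, where every individual has the same chance of being chosen. Simple random sampling is an assumption made for this example, and the population and the function name are hypothetical.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Select sample_size distinct individuals, each equally likely to be chosen.

    The optional seed only makes the selection repeatable for demonstration.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical population of 50 students.
students = [f"Student {i}" for i in range(1, 51)]

sample = simple_random_sample(students, 5, seed=42)
print(sample)
```

Because `random.sample` draws without replacement, no student can be selected twice.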
Data Presentation
Three methods of presenting data
A. Textual or Narrative
B. Tabular
C. Graphical
In presenting data in textual (paragraph or narrative) form, one describes the
data by enumerating some of the highlights of the data set, such as the highest,
lowest, or average values. When there are only a few observations, say fewer than
ten, the values themselves could be enumerated if there is a need to do so. An
example is shown below:
The country’s poverty incidence among families as reported by the
Philippine Statistics Authority (PSA), the agency mandated to release
official poverty statistics, decreased from 21% in 2006 to 19.7% in
2012. For 2012, the regional estimates released by PSA indicate that the
Autonomous Region in Muslim Mindanao (ARMM) is the poorest region,
with poverty incidence among families estimated at 48.7%. The region with
the smallest estimated poverty incidence among families, at 2.6%, is the
National Capital Region (NCR).
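The highlights used in a textual presentation (highest, lowest, and average values) can be computed before the sentence is written. The sketch below assumes a small, made-up set of quiz scores. The names and the function are hypothetical.

```python
from statistics import mean

# Hypothetical quiz scores (illustrative data only).
scores = {"Ana": 88, "Ben": 75, "Carla": 93, "Dino": 81}


def describe(values_by_name, what="score"):
    """Build one narrative sentence giving the highest, lowest, and average value."""
    highest = max(values_by_name, key=values_by_name.get)
    lowest = min(values_by_name, key=values_by_name.get)
    average = mean(values_by_name.values())
    return (f"The highest {what} is {values_by_name[highest]} ({highest}), "
            f"the lowest is {values_by_name[lowest]} ({lowest}), "
            f"and the average is {average:.2f}.")


summary = describe(scores)
print(summary)
```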
The tabular method of presentation is applicable for large data sets. Trends could
easily be seen in this kind of presentation. However, there is a loss of information
when using this kind of presentation. The frequency distribution table is the usual
tabular form of presenting the distribution of the data. The following are the
common parts of a statistical table:
a. Table title includes the number and a short description of what is found inside
the table.
b. Column header provides the label of what is being presented in a column.
c. Row header provides the label of what is being presented in a row.
d. Body is the information in the cell intersecting the row and the column.
In general, a table should have at least three rows and/or three columns.
However, conveying too much information in a single table is also not advisable.
Tables are usually used in written technical reports and in oral presentations.
Table 5.1 is an example of presenting data in tabular form. This example was
taken from the 2015 Philippine Statistics in Brief, a regular publication of the
PSA, which is also the basis for the example of the textual presentation given above.
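The parts of a statistical table described above (title, column headers, row headers, and body) can be seen in a small frequency distribution table built in Python. This is a sketch under stated assumptions: the grades are made-up data, the table title is hypothetical, and `format_table` is an illustrative helper, not a standard function.

```python
from collections import Counter


def format_table(title, column_headers, rows):
    """Return a plain-text table: title, column headers, a rule, then the rows."""
    # Width of each column = its longest cell, header included.
    widths = [max(len(str(cell)) for cell in column)
              for column in zip(column_headers, *rows)]

    def fmt(cells):
        return "  ".join(str(cell).ljust(width)
                         for cell, width in zip(cells, widths))

    lines = [title, fmt(column_headers), fmt(["-" * w for w in widths])]
    lines += [fmt(row) for row in rows]
    return "\n".join(lines)


# Hypothetical letter grades (illustrative data only).
grades = ["B", "A", "C", "B", "B", "A", "C", "B"]
counts = Counter(grades)

# Each row starts with its row header (the grade), followed by the body cell.
rows = [(grade, counts[grade]) for grade in sorted(counts)]

table = format_table("Table 1. Hypothetical grade distribution",
                     ("Grade", "Frequency"), rows)
print(table)
```

The table keeps only the count for each grade, which is the loss of information mentioned above: the order in which the individual grades were recorded can no longer be recovered from it.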
1. LINE GRAPH is used to represent changes in data over a period of time. A line
graph may be curved, broken, or straight.
NOTE: Generally, the horizontal axis is used as the time axis and the vertical axis is
used to show the changes in the other quantity.
The graph above shows the trend in the temperature in New York on a hot
day.
2. BAR GRAPH is a graph that uses horizontal or vertical bars to represent data.
• When the bars extend from left to right, the graph is called a
horizontal bar graph.
• When the bars extend from bottom to top, the graph is called a vertical bar graph.
Bar graphs are good when your data are in categories (such as "Comedy",
"Drama", etc.).
But when you have continuous data (such as a person's height), use
a histogram instead. It is best to leave gaps between the bars of a bar graph, so
that it doesn't look like a histogram.
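The difference between the two graphs shows up in how the bar heights are obtained. For a bar graph, each category is counted as it is. For a histogram, continuous values are first grouped into class intervals of equal width. The sketch below uses made-up data and hypothetical function names, and it only computes the bar heights without drawing anything.

```python
from collections import Counter


def bar_graph_counts(categories):
    """Bar graph: one bar per category, height = number of occurrences."""
    return Counter(categories)


def histogram_counts(values, start, width, bins):
    """Histogram: group continuous values into equal-width class intervals.

    Bin k covers start + k*width up to (but not including) start + (k+1)*width.
    Values outside the covered range are ignored.
    """
    counts = [0] * bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return counts


# Hypothetical data (illustrative only).
genres = ["Comedy", "Drama", "Comedy", "Action", "Drama", "Comedy"]
heights_cm = [151.2, 158.9, 160.0, 163.4, 167.8, 171.5, 172.0, 179.9]

print(bar_graph_counts(genres))
# Class intervals: 150-160, 160-170, 170-180 (in cm).
print(histogram_counts(heights_cm, start=150, width=10, bins=3))
```

Because neighboring class intervals share a boundary, histogram bars touch each other, while the separate categories of a bar graph are drawn with gaps.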