Module-4-Intro-to-stat
4.1 Introduction
When we hear the word Statistics, the first thing that comes to mind is a set
of numerical figures, such as your monthly allowance, the number of hours you
spend in school, the number of hours you spend on Facebook, your vital
statistics, etc.
However, the study of statistics is not limited to knowing and memorizing
numerical figures. This module will give us a better understanding of what
Statistics is about. It also discusses how some of its processes are carried
out.
4.2 Learning Outcomes
After finishing this module, you are expected to:
1. discuss the importance of statistics in your field of study;
2. compare and contrast descriptive statistics and inferential statistics;
3. define data;
4. identify different types of data as well as their levels of measurement;
5. identify appropriate data collection methods based on the data needed; and
6. identify the appropriate type of data presentation for a set of data.
Why are all the processes involved in Statistics important? Statistics provides
us with the tools we need to convert raw data into information that we can use
to make sensible decisions and intelligent choices.
People from various fields of interest need to obtain information to answer
different types of problems. Nowadays, we do this by performing a statistical
inquiry. This allows us to answer problems with a clearer understanding of a
particular collection of information.
Usually, the population of interest is so large that it becomes too expensive
and time-consuming to collect data from every element of the population. Thus,
we have no other option but to get the data we need from only a subset of the
population. We use the term sample to refer to this subset of the population.
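The idea of taking a subset of the population can be sketched in Python. This is only an illustration: the population of 35 numbered students, the sample size of 10, and the fixed seed are assumptions made for the example, and simple random sampling is just one of several ways to choose a sample.

```python
import random

# Hypothetical population: ID numbers of 35 students (assumed for illustration).
population = list(range(1, 36))

def draw_sample(elements, size, seed=None):
    """Return a simple random sample (without replacement) of the given size."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(elements, size)

sample = draw_sample(population, size=10, seed=1)

print("Population size:", len(population))
print("Sample size:", len(sample))
print("Sampled IDs:", sorted(sample))
```

Every element of the sample comes from the population, and no element is picked twice, which is what "subset" means here.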
In any statistical inquiry, we study certain characteristics or attributes of
the elements in the population, which we call variables. Just like in algebra, we
denote variables with letters of the English alphabet. We refer to these
characteristics as variables because their realized values may vary for the
different elements in the sample or population.
Example 2. Below are illustrations of variables together with their possible
values.
Example 4.
A summary measure that we are familiar with is the proportion. The
proportion is the quotient obtained when we divide the magnitude of a part by
the magnitude of the whole. Suppose that among the 35 students, 28 claimed
that they own a cellular phone. We can now compute the proportion of
students in the population with cellular phones.
\[
P = \frac{\text{number of students with cellular phones}}{\text{number of students in the population}} = \frac{28}{35} = 0.8
\]
The proportion of students in our population with cellular phones is an
example of a parameter because it is a summary measure describing a
characteristic of the population.
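The population proportion from Example 4 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `proportion` and the variable names are hypothetical, chosen only for this illustration, while the counts (28 and 35) come from the example.

```python
def proportion(part, whole):
    """Divide the magnitude of a part by the magnitude of the whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole

# Figures from Example 4: 28 of the 35 students own a cellular phone.
owners_in_population = 28
population_size = 35

P = proportion(owners_in_population, population_size)
print(f"P = {P:.1f}")  # prints "P = 0.8"
```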
Suppose we take a sample of 10 students from this class. Among the 10
students in the sample, 7 own cellular phones. We cannot compute the
proportion 𝑃 of students in the population with cellular phones but we can
compute 𝑃̂ (read as “𝑃 hat”), where 𝑃̂ is the proportion of students in the
sample with cellular phones, as follows:
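Using only the counts just given (7 owners among the 10 sampled students) and the same form as the formula for 𝑃, the computation can be sketched as:

```latex
\[
\hat{P} = \frac{\text{number of students in the sample with cellular phones}}{\text{number of students in the sample}} = \frac{7}{10} = 0.7
\]
```

Because it describes the sample rather than the whole population, 𝑃̂ is a statistic, and its value (0.7 here) need not equal the parameter 𝑃 = 0.8.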
Learning Activity 1
problems. On the other hand, mathematical statistics is concerned with the
development of the mathematical foundations of the methods used in applied
statistics.
There are two major areas of interest in applied statistics. These are
descriptive statistics and inferential statistics.
Inferential Statistics includes all the techniques used in analyzing sample
data that lead to generalizations about the population from which the sample
came. It consists of performing hypothesis testing, determining relationships
among variables, and making predictions.
be clear that whatever conclusions we make using inferential statistics are
always subject to some error.
Example 6. Below is an application of inferential statistics.
To determine if reforestation is effective, we can take a representative portion
of denuded forests and use inferential statistics to draw conclusions about the
effect of reforestation in all denuded forests.
Learning Activity 2
2. Quantitative Variables are numerical variables and can be measured.
Learning Activity 3
4.3.2.2 Levels of Measurement
Variables can also be classified according to the level of measurement.
There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio.
1. Nominal Data. In this case, numbers are used to represent an item or
characteristic. Examples include: names, gender, religious affiliation,
civil status, college majors. Note that such data should not be treated as
numerical, since relative size has no meaning.
5 - Outstanding
4 - Very Satisfactory
3 - Satisfactory
2 - Poor
3. Interval Data. At this level, numbers can be ordered and the difference
between any two units is exact, but there is no meaningful zero or starting
point. For example, temperature measured in degrees Celsius or Fahrenheit
is interval data: values can be ordered and there is an exact difference
between two degrees, but zero does not mark a starting point, since there
can be temperatures below zero.
4. Ratio Data. This is the highest level of measurement and allows for
all basic arithmetic operations, including division and multiplication.
Data at this level can be ordered, have exact differences between units,
and have a meaningful zero. Things that are counted are usually at the
ratio level, for example, business data such as cost, revenue, and profit.
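The difference between the interval and ratio levels can be illustrated with a short Python sketch. The temperatures and costs below are made-up values chosen for the illustration, and the helper function names are hypothetical. The sketch checks whether a ratio between two values survives a change of unit, which it does only when the scale has a meaningful zero.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

def pesos_to_centavos(pesos):
    """Convert an amount from pesos to centavos (1 peso = 100 centavos)."""
    return pesos * 100

# Interval level: two temperatures in degrees Celsius (illustrative values).
cool_c, warm_c = 10.0, 20.0
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio level: two costs in pesos (illustrative values).
low_cost, high_cost = 100.0, 200.0
ratio_pesos = high_cost / low_cost
ratio_centavos = pesos_to_centavos(high_cost) / pesos_to_centavos(low_cost)

print("Temperature ratio in Celsius:   ", ratio_celsius)
print("Temperature ratio in Fahrenheit:", ratio_fahrenheit)
print("Cost ratio in pesos:   ", ratio_pesos)
print("Cost ratio in centavos:", ratio_centavos)
```

The temperature ratio changes with the unit (2.0 in Celsius, 1.36 in Fahrenheit), so saying that 20 degrees is "twice as warm" as 10 degrees is not meaningful. The cost ratio stays at 2.0 in either unit, which is why division makes sense for ratio data.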
Learning Activity 4
In the case where data are not properly gathered, the consequences are as
follows:
One can obtain documented data from previous studies of individuals,
written reports of government and nongovernment agencies, periodicals, and
others.
Example 7.
The Philippine Statistics Authority is a major collector of data for
government needs. It provides the public with basic data on various subject
matters. A few of these are household income and expenditure,
employment, and others.
Primary data are data documented by a primary source. The data collectors
themselves documented these data.
4.3.2.4b Surveys
DEFINITION 5.8. (survey, census, sample survey)
we did not. Both pots have the same soil type. We watered the pots at the same
time using the same amount of water. A few weeks later, we observed the heights
of the mongo plants.
In this experiment, the objective is to determine the effect of sunlight on the
height of a mongo plant. The explanatory variable is the amount of sunlight.
Categories for the explanatory variable are called “treatments” or factor levels.
The response variable is the height of the mongo plant and the extraneous
variables are identified to be the soil type and amount of water.
The extraneous variables are usually controlled, making sure that the two
groups receive the same levels or amounts. The use of a randomization
mechanism in assigning the treatments and the control of the identified
extraneous variables make the experiment a more effective method of data
collection in establishing cause and effect.
Example 11.
The school administration wishes to determine which of the two methods is
more effective in training new student leaders. They randomly assigned twenty
student leaders to training method 1 and twenty student leaders to training
method 2. After one month of training, they administered a standardized
achievement test to the two groups and compared their scores.
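The random assignment described in Example 11 can be sketched in Python. This is one possible randomization mechanism, not necessarily the one the school administration used: the leader labels, the function name `assign_to_groups`, and the seed are all hypothetical.

```python
import random

def assign_to_groups(participants, group_names, seed=None):
    """Randomly split participants into equally sized groups."""
    if len(participants) % len(group_names) != 0:
        raise ValueError("participants must divide evenly among the groups")
    rng = random.Random(seed)       # a fixed seed makes the assignment repeatable
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    size = len(shuffled) // len(group_names)
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }

# Hypothetical labels for the forty student leaders in Example 11.
leaders = [f"Leader {n}" for n in range(1, 41)]
groups = assign_to_groups(leaders, ["Method 1", "Method 2"], seed=2024)

for name, members in groups.items():
    print(name, "->", len(members), "student leaders")
```

Shuffling first and then cutting the list in half gives every student leader the same chance of landing in either training method, and no leader appears in both groups.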
4.3.2.4d Observation
DEFINITION 5.10. (observation method)
The table below shows the comparison of survey, experiment, and
observation methods.
Aspect                                Data Collection Method
                                      Survey                Experiment            Observation
Assessing the reliability of          Generally possible    Sometimes difficult   Oftentimes difficult
generalizations about a
well-defined population
Learning Activity 5
4.3.3 Presentation of Data
After data collection, we need to organize and analyze the data. After
organizing and analyzing them, we present the results in forms that reveal
the important information we obtained from the data.
There are three ways to present the information from our data. These
include textual, tabular, and graphical presentations.
4.3.3.1 Textual Presentation
Textual presentation of data incorporates important figures in a paragraph
of text. In this type of presentation, we insert important data figures or summary
measures within the paragraph of text to support our conclusions.
Textual presentation allows us to direct the reader’s attention to the vital
information we want to highlight. Summary measures such as the minimum,
maximum, total, and percentages are just a few of the figures that may be
included in a textual presentation.
It is necessary to select the most important figures we want to focus on.
Whenever we use textual presentation, we must always provide our readers with
additional discussion about the relevance of the figures in our presentation.
Example 12. Here is an illustration of textual presentation.
Excerpts taken from the Isabela Covid-19 Case Updates.
“As of 4PM today, the Department of Health reports a total number of COVID-
19 cases at 290,190, after 3,475 newly-confirmed cases were added to the list of
COVID-19 patients.
DOH likewise announces 400 recoveries. This brings the total number of
recoveries to 230,233.
Twenty-eight duplicates were removed from the total case count. Of these, 19
were recovered cases.
Moreover, 13 cases previously reported as recovered were reclassified as death
(12) and active (1) cases after final validation.”
In the illustration given, the paragraphs highlighted only the most important
figures. Only a few numbers were included; minute details and large quantities
of data were not presented. If we want to refer to other details of the data,
then it would be more appropriate to use tabular presentation.
4.3.3.2 Tabular Presentation
Tabular presentation of data arranges figures in a systematic manner in
rows and columns. It is the most common method of data presentation. We can
use it for various purposes such as description, comparison, and showing
relationships between two or more variables of interest.
In tabular presentation, we arrange the data figures or summary measures
in rows and columns for easy reading. Tables should be simple and easy to
understand. Each row and column must have an appropriate label.
Three types of tabular presentation will be discussed in this module, namely
leader work, text tabulation, and the formal statistical table.
4.3.3.2a Leader Work
Leader work has the simplest layout among all three types of tables. It
contains no table title or column headings and has no table borders. We
incorporate this type within a paragraph presenting one or two columns of
figures as supporting data.
Example 13.
The population in the Philippines for the census years 1975 to 2000 is as
follows:
1975 42,070,660
1980 48,098,460
1990 60,703,206
1995 68,616,536
2000 76,498,735
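A layout like the one in Example 13 can be produced with a short Python sketch. The census figures are the ones quoted above. The row of dots between the year and the figure is a common typographic convention assumed here, not something this module specifies, and the function name `leader_line` and the line width of 32 characters are likewise hypothetical.

```python
# Census figures quoted in Example 13.
census = [
    (1975, 42_070_660),
    (1980, 48_098_460),
    (1990, 60_703_206),
    (1995, 68_616_536),
    (2000, 76_498_735),
]

def leader_line(label, value, width=32):
    """Format one row: label on the left, figure on the right, dots between."""
    left = f"{label} "
    right = f" {value:,}"  # thousands separators, as in the example
    dots = "." * max(width - len(left) - len(right), 0)
    return left + dots + right

lines = [leader_line(year, count) for year, count in census]
print("\n".join(lines))
```

As in leader work, the output has no title, column headings, or borders; it is meant to sit inside a paragraph as supporting data.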
4.3.3.2c Formal Statistical Table
The formal statistical table is the most complex type of table since it has all
the different parts like the table number, table title, head note, box head, stub
head, column headings, and so on. It is a stand-alone table and can be easily
understood even without a description.
The following presents the different parts of a formal statistical table:
Example 14.
Below is an example of a formal statistical table.
4.3.3.3 Graphical Presentation
Graphical presentation of data portrays numerical figures or relationships
among variables in pictorial form. Some statistical charts used in this type of
presentation are given in the following table:
Type of Chart    Description                                        Example
Line Chart       • Useful for presenting historical data            [line chart image]
                 • Effective in showing movement of a series
                   over time
                 • Appropriate when comparing two or more time
                   series data and trends over time
Pictograph       • Like a horizontal bar chart that uses symbols    [pictograph image]
                   or pictures instead of bars
                 • The purpose is to get the attention of the
                   readers
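A text-only pictograph can be sketched in Python to show how symbols replace bars. The data are the census figures quoted in Example 13 of this module. The scale of one symbol per 10 million people, the `*` symbol, and the function name `pictograph_row` are assumptions made for this illustration.

```python
# Census figures quoted in Example 13 of this module.
census = {
    1975: 42_070_660,
    1980: 48_098_460,
    1990: 60_703_206,
    1995: 68_616_536,
    2000: 76_498_735,
}

PEOPLE_PER_SYMBOL = 10_000_000  # assumed scale: one symbol per 10 million people

def pictograph_row(label, value, symbol="*", per_symbol=PEOPLE_PER_SYMBOL):
    """Return one pictograph row, rounded to the nearest whole symbol."""
    count = round(value / per_symbol)
    return f"{label} | {symbol * count}"

rows = [pictograph_row(year, count) for year, count in census.items()]

print(f"Each * represents about {PEOPLE_PER_SYMBOL:,} people")
print("\n".join(rows))
```

Rounding to whole symbols trades precision for visual impact, which matches the stated purpose of a pictograph: getting the attention of the readers rather than reporting exact values.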
3. To help the researcher in making credible decisions based on
quantitative data or arguments.
• Excel Charts & Graphs: Learn the Basics for a Quick Start by Leila
Gharani
https://www.youtube.com/watch?v=DAU0qqh_I-A
A. Short-response Essay
B. Identification
1. The average weekly allowance of students last year at a private high school
was Php 600.00 per week, based on an enrollment of 1,080 students. The
third-year students, who did not have this information, interviewed 50
students and found their average weekly allowance last year to be Php
550.00. Identify the following:
a. Population
b. Sample
c. Variable of interest
d. Parameter
e. Statistic
2. Observe the use of the number seven in the following statements. Classify
each statement according to the level of measurement used to get the value
7.
3. What method of data collection is most appropriate for the following cases?
4. Indicate the type of chart you would choose to present the information
given in each of the following cases.
Your answers in items where you are asked to discuss will be graded according
to the given standards/basis for grading:
Score   Criteria
0       Unable to elicit the ideas and concepts from the learning activity,
        material, or video
1       Able to elicit the ideas and concepts from the learning activity,
        material, or video but shows erroneous understanding
2       Able to elicit the ideas and concepts from the learning activity,
        material, or video and shows correct understanding
3       Able to elicit the correct ideas from the learning activity, material,
        or video and also shows evidence of internalization and consistently
        contributes additional thought to the core idea