Module 1. Introduction: Proel 12
La Carlota City
Education Department
Module in ProEl 12
2nd Semester, AY 2020-2021
MODULE 1. INTRODUCTION
HISTORY & DEFINITION
The word statistics means different things to different people. To a college student,
statistics are the scores on all quizzes, seatwork, assignments and recitations in his
subject. To a biological researcher investigating the effects of pollution on our
environment, statistics are evidence of the success of research efforts. To a school president,
statistics are information on faculty and employee salaries, tardiness and absenteeism, and
increases or decreases in enrollment. To the manager of a food chain, statistics may be the
kinds of food most frequently served to customers, and to the president of a country,
statistics are information on jobs created, housing projects, improvement or decline in the
economic situation, etc.
All of them use statistics correctly, yet in different ways and for different purposes.
The word statistik comes from the Italian word statista, which means “statesman”. The
word was first used by Gottfried Achenwall (1719-1772), a professor at Marburg and
Göttingen, while Dr. E.A.W. Zimmerman introduced it in England. Its use was popularized
by Sir John Sinclair in his work, Statistical Account of Scotland (1791-1799). However,
people had been recording and using data long before the 18th century.
Presently, Statistics is defined as the branch of scientific methodology which deals with the
collection, classification, description and interpretation of data obtained through survey or
experiment.
STATISTICS is a scientific body of knowledge that deals with the collection, organization or
presentation, analysis and interpretation of data.
Functions of Statistics
1. To provide investigators with a means of measuring scientifically the conditions that may
be involved in a given problem and assessing the way in which they are related.
2. To show the laws underlying facts and events that cannot be determined by
individual observations.
3. To show relations of cause and effect that otherwise may remain unknown.
4. To find the trends and behavior in related conditions which otherwise may remain
ambiguous.
The history of statistics can be traced back at least to Biblical times in ancient Egypt,
Babylon and Rome. As early as 3,500 years before the birth of Christ, statistics had been
used in Egypt in the form of recording the number of sheep or cattle owned, the amount of
grain produced, and the number of people living in a particular city. In 3800 B.C., the
Babylonian government used statistics to measure the number of men under the king’s rule
and the vast territory that he occupied. It was his belief that the more men under his
command and the more lands he conquered, the more powerful his kingdom would
become. In 700 B.C., the Romans used statistics by conducting registration to record
the population for the purpose of collecting taxes.
In modern times, statistical methods have been used to record and predict such things as
birth and death rates, employment and inflation rates, sports achievement, and other
economic and social trends. They have even been used to assess opinions from polls and to
unlock secret codes from games of chance.
Modern Statistics is said to have begun with John Graunt (1620-1674), an English
tradesman. Graunt collected published records called “bills of mortality” that included
information about the numbers and causes of deaths in the city of London. Graunt
analyzed more than fifty years of data and created the first mortality table, a table that
shows how long a person may be expected to live after reaching a certain age.
There were many other great men who made important contributions to statistics. One
of them was Karl Friedrich Gauss (1777-1855), the brilliant German mathematician who
used statistical methods in making predictions about the positions of the planets in our
solar system. Adolphe Quetelet (1796-1874), a Belgian astronomer, developed the idea of
the “average man” from his studies of the Belgian census. He was also known as the
“Father of Modern Statistics”. Karl Pearson (1857-1936), an English mathematician, made
important links between probability and statistics. In the 20th century, the British
statistician Sir Ronald Aylmer Fisher developed the F-test in inferential statistics (named
after him). This tool has been very useful in testing improvements of production from
agricultural experiments and improvement of precision of results from medical, biological
and industrial experimentation. The American George Gallup (1901-1984) was
instrumental in making statistical polling a common tool in political campaigns.
In this age of information technology, many computer programs such as Microstat, Soritec
Sampler, SPSS, and others are available on diskette or on websites and can do more
than manual calculations in statistics. People working in some government agencies, in
laboratories, in media, and in business generally use these electronic tools to easily
access data, improve graphics, and obtain ready-made analyses and interpretations of the
data.
APPLICATION OF STATISTICS
In Education
Through statistical tools, a teacher can determine the effectiveness of a particular
teaching method by analyzing test scores obtained by his or her students. Results of this
study may be used to improve teaching-learning activities.
In Business
A business firm collects data or information from its everyday
operation. Statistics is used to summarize and describe those data, such as the amount of
sales, expenditures, and production, to enable the management to understand and
determine the status of the firm. Data that have been organized and analyzed provide the
management with a baseline for making wise decisions pertaining to the operation of the business.
In Psychology
Psychologists are able to interpret aptitude tests, IQ tests and other
psychological tests meaningfully using statistical procedures or tools.
In Politics & Government
Public Opinion and election polls are commonly used to assess the opinions or
preferences of the public for issues or candidates of interest. Statistics plays an important
role in conducting surveys or interviews for that purpose.
In Medicine
Statistics is also used in determining the effectiveness of new drug products in
treating a particular type of disease. To illustrate, a drug company wants to test the
effectiveness of its new drug product in treating tuberculosis. An experiment or a clinical
trial is conducted. Ten tuberculosis patients are treated using the new drug product and
another ten are treated using the existing drug. The results are analyzed statistically to find
out if the new product is more effective in treating tuberculosis.
In Agriculture
Through statistical tools, an agriculturist can determine the effectiveness of a new
fertilizer in the growth of plants or crops. Moreover, crop production and yield can be
better analyzed through the use of statistical methods.
In Industry
The most popular actresses and actors can be determined by using surveys. Ratings
given by the members of the board of judges in a beauty contest are statistically analyzed.
Interviews are used to determine the most widely viewed television show. The top-grossing
movies for the year are reported based on statistical records of movie houses. All these
activities involve the use of statistics.
In everyday life
The number of cars passing through streets or a highway is recorded to enable
traffic enforcers to manage traffic efficiently. Even the number of pedestrians crossing the street,
the number of people entering a warehouse or a department store, and the number of
people engaged in video games involve the use of statistics. In short, statistics is found and
used in everyday life.
BRANCHES OF STATISTICS
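Statistics has two branches: descriptive statistics and inferential statistics.
For example, in descriptive statistics we may describe a collection of persons by stating how many are poor and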
how many are rich, how many are literate and how many are illiterate, how many fall into
various categories of age, height, civil status, IQ, and many more. We may also describe a
particular barangay in terms of the number of families it has, the number of grade-
schoolers, the number of professionals, the number of households with certain kinds of
appliances, the number of siblings in each household, or the rate of unemployment.
Generally, descriptive statistics involve gathering, organizing, presenting and describing
data.
Suppose we want to know the favorite brand of toothpaste in a certain barangay but
we do not have enough time and money to interview all the residents of that barangay. We
may just ask selected residents. With the data obtained from the interviews, we shall draw
conclusions as to the barangay’s favorite brand of toothpaste. This example involves
the use of inferential statistics.
TERMINOLOGIES IN STATISTICS
Some important terms are commonly used in the study of Statistics. These terms should be
understood fully in order to facilitate the study of statistics.
1. Population refers to a large collection of objects, persons, places or things. To illustrate this,
suppose a researcher wants to determine the average income of the residents of a
certain barangay and there are 1500 residents in the barangay. Then all of these
residents comprise the population. The size of a population is usually denoted or represented
by N. Hence, in this case, N = 1500.
2. Sample is a small portion or part of a population. It could also be defined as a
subgroup, subset, or representative of a population. For instance, suppose the
above-mentioned researcher does not have enough time and money to conduct the study
using the whole population and he wants to use only 200 residents. These 200
residents comprise the sample. The size of a sample is usually denoted by n, thus n = 200.
3. Parameter is any numerical or nominal characteristic of a population. It is a value
or measurement obtained from a population. It is usually referred to as the true or
actual value. If in the preceding illustration, the researcher uses the whole
population (N = 1500), then the average income obtained is called a parameter.
4. Statistic is an estimate of a parameter. It is a value or measurement obtained from
the sample. If the researcher in the preceding illustration makes use of the sample
(n = 200), then the average income obtained is called a statistic.
5. Data (singular form is datum) are facts, or a set of information or observations
under study. More specifically, data are gathered by the researcher from a
population or from a sample. Data may be classified into two categories, qualitative
and quantitative.
a. Qualitative data are data which can assume values that manifest the concepts of
attributes. These are sometimes called categorical data. Data falling in this
category cannot be subjected to meaningful arithmetic. They cannot be added,
subtracted or divided. Gender and nationality are qualitative data.
b. Quantitative Data are data which are numerical in nature. These are data
obtained from counting or measuring. In addition, meaningful arithmetic
operations can be done with this type of data. Test scores and height are
quantitative data.
7. Constant refers to a fundamental quantity that does not change in value. Fixed
costs and acceleration due to gravity are examples of constants.
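The difference between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the incomes below are made-up random numbers (the module gives no actual income data), and only the sizes N = 1500 and n = 200 come from the illustration above.

```python
import random
import statistics

# Hypothetical monthly incomes for the N = 1500 residents (made-up data).
rng = random.Random(12)
population = [rng.randint(5_000, 50_000) for _ in range(1500)]

# Parameter: the average computed from the whole population.
parameter = statistics.mean(population)

# Statistic: the average computed from a random sample of n = 200 residents.
sample = rng.sample(population, 200)
statistic = statistics.mean(sample)

print(f"N = {len(population)}, parameter (population mean) = {parameter:.2f}")
print(f"n = {len(sample)}, statistic (sample mean) = {statistic:.2f}")
```

The two averages will usually be close but not equal, which is why the statistic is described as an estimate of the parameter.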
SCALES OF MEASUREMENT
1. Nominal Scale - This is the most primitive level of measurement. The nominal
level of measurement is used when we want to distinguish one object from another
for identification purposes. In this level, we can only say that one object is
different from another, but the amount of difference between them cannot be
determined. We cannot tell that one is better or worse than the other. Gender,
nationality and civil status are of nominal scale.
2. Ordinal Scale - In the ordinal level of measurement, data are arranged in some
specified order or rank. When objects are measured in this level, we can say that
one is better or greater than the other. But we cannot tell how much more or
how much less of the characteristic one object has than the other. The ranking of
contestants in a beauty contest, of siblings in the family, or of honor students in
the class are of ordinal scale.
3. Interval Scale - If data are measured in the interval level, we can say not only that
one object is greater or less than another, but we can also specify the amount of
difference. The scores in an examination are of interval scale of measurement.
To illustrate, suppose Kensly Kyle got 50 in a Math examination while Kwenn
Anne got 40. We can say that Kensly Kyle got a higher score than Kwenn Anne by 10
points.
4. Ratio Scale - The ratio level of measurement is like the interval level. The only
difference is that the ratio level always starts from an absolute or true zero
point. In addition, in the ratio level, there is always the presence of units of
measure. If data are measured in this level, we can say that one object is so
many times as large or as small as the other. For example, suppose Mrs. Reyes
weighs 50 kg, while her daughter weighs 25 kg. We can say that Mrs. Reyes is
twice as heavy as her daughter. Thus, weight is an example of data measured in the
ratio scale.
SOURCES OF DATA
There are two sources of data. One is called the primary source, from which
first-hand information is obtained, usually by means of personal interview and
actual observation. On the other hand, the secondary source of information is taken
from others’ works, news reports, readings, journals, magazines, and records that are
kept by the National Statistics Office, Securities and Exchange Commission, Social
Security System and other government and private agencies.
Data are said to be an asset of a company if they are accurate, updated and available
when needed. Hence, any institution or business organization must have a database
called a Management Information System where all information about its business
is made available in order to facilitate verification of claims and to come up with
wise management decisions.
METHODS OF COLLECTING DATA: Their Advantages and Disadvantages
[Concept map: VARIABLE and DATA]
VARIABLE
Qualitative: Dichotomous, Trichotomous, Multinomous
Quantitative: Discrete, Continuous
Dependent, Independent
DATA
Sources: Primary, Secondary
Methods: Interview, Questionnaire, Registration, Observation, Experimentation
Scales of Measurement: Nominal, Ordinal, Interval, Ratio
Presentation: Textual, Tabular, Graphical/Chart (Line Graph, Bar Graph, Pie Graph,
Pictograph, Map/Cartogram, Scatter Point Diagram)
In research, we seldom use the entire population because of the cost and time involved.
In fact, most researchers do not use the population in their study. Instead, a sample,
which is a small representative part of a population, is used. The characteristics of the
entire population are described using the characteristics observed from the sample.
To illustrate, suppose we want to find out the average age of the students in Manila.
However, due to insufficient time, only the students in three particular schools are
used to estimate the average age. Obviously, the result is not the actual average age but
just an estimate; thus, there is some error when we use the sample instead of the
population.
Study the examples below in finding the sample size.
Example 1. A group of researchers will conduct a survey to find out the opinion of
residents of a particular community regarding the oil price hike. If there are 10,000
residents in the community and the researchers plan to use a sample with a 10%
margin of error, what should the sample size be?
Solution: With N = 10,000 and e = 0.10, the formula n = N / (1 + Ne^2) gives
n = 10,000 / (1 + 10,000(0.10)^2) = 10,000 / 101 = 99.01 or 99
Hence, the researchers will just conduct the survey using 99 residents. A 10% margin of
error means that the researcher is 90% confident that the result obtained using the sample
will closely approximate the result had he used the population.
Example 2. Suppose that in Example 1, the researcher would like to use a 5% margin of
error. What should be the size of the sample?
Solution: n = N / (1 + Ne^2) = 10,000 / (1 + 10,000(0.05)^2) = 10,000 / 26
n = 384.62 or 385
Observe from Examples 1 and 2 that as we reduce the margin of error, the sample size gets
larger. Hence, if we want a more accurate result, we have to use a larger sample.
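The two sample sizes can be checked with a short Python sketch. It assumes the formula n = N / (1 + Ne^2), commonly called Slovin's formula, which reproduces both answers given in the examples (99 and 385); the function name sample_size is only illustrative.

```python
def sample_size(population_size: int, margin_of_error: float) -> float:
    """Return n = N / (1 + N * e**2), before rounding."""
    return population_size / (1 + population_size * margin_of_error ** 2)

# N = 10,000 residents, with a 10% and then a 5% margin of error.
for e in (0.10, 0.05):
    n = sample_size(10_000, e)
    print(f"e = {e:.2f}: n = {n:.2f} or {round(n)}")
```

Trying smaller values of e (for example 0.03 or 0.01) shows how quickly the required sample grows as the margin of error shrinks.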
SAMPLING TECHNIQUES
Sampling technique is a procedure used to determine the individuals or
members of a sample.
A – PROBABILITY OR RANDOM SAMPLING TECHNIQUE is a sampling technique
wherein each member or element of the population has an equal chance of being selected
as a member of the sample.
a. Lottery Method
Suppose Mrs. Cruz wants to send five students to attend a 2-day training or
seminar in basic computer programming. To avoid bias in selecting these five
students from her 40 students, she can use lottery sampling. This is done by
assigning a number to each student and then writing these numbers on
pieces of paper. Then, these pieces of paper will be rolled or folded and placed in
a box called a lottery box. The lottery box should be thoroughly shaken and then
five pieces of paper will be picked or drawn from the box. The students who
were assigned the numbers chosen will be sent to the training. In this case,
the selection of the students is done without bias. Note that we can simply
assign 1 to the first student, 2 to the second student and so on.
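The lottery draw can be imitated on a computer. The sketch below is only an illustration: it uses Python's standard random.sample, which picks items without replacement, in place of the shaken lottery box.

```python
import random

students = list(range(1, 41))         # numbers 1 to 40, one per student
chosen = random.sample(students, 5)   # draw 5 slips without replacement
print(sorted(chosen))
```

Each run gives a different set of five numbers, and no student can be drawn twice.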
Let us illustrate how random numbers are used to select the members of the
sample. Let us consider the preceding example wherein Mrs. Cruz wants to
select 5 students from her 40 students. Again, we will assign a number to each
student, say from 1 to 40.
Since there are 40 students, we will use two-digit numbers from the table of
random numbers when selecting the members of the sample. This is because the
students have been assigned the numbers 01, 02, 03, . . . up to 40. Looking at the
first column of the table of random numbers above, we see that the number
formed by the first two digits is 31; hence, the student assigned to number 31 is
chosen as a member of the sample. If we proceed down the column, we see that
the next number formed is 87, which cannot be used because we have only 40
members. In a similar manner, the third number is 06, so the student
assigned to number 6 is chosen. Notice that the next two numbers from the table
are 95 and 44, numbers we cannot use for the same reason as before. When we
get to the bottom of the column, we move back up the column and merely shift one
digit to the right for the next random number. Thus, we will have 18 as our next
number. This is one of the many alternatives. We can have other ways of
selecting the members of the sample until we complete the 5 students.
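The reading rule in the walk-through above (skip any number greater than 40, keep the rest until 5 students are chosen) can be sketched as follows. The function name and the list of readings are hypothetical: only 31, 87, 06, 95, 44 and 18 appear in the text, and the remaining values are made up to complete the illustration. Skipping a repeated number is an added assumption, since a student cannot be chosen twice.

```python
def pick_from_random_digits(two_digit_numbers, population_size, sample_size):
    """Walk a sequence of two-digit random numbers, keeping usable ones."""
    chosen = []
    for number in two_digit_numbers:
        if number < 1 or number > population_size:
            continue  # e.g. 87, 95, 44: no student has this number
        if number in chosen:
            continue  # assumption: a student cannot be picked twice
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# 31, 87, 06, 95, 44 and 18 come from the walk-through;
# 31 (repeat), 72, 25 and 9 are made-up readings.
readings = [31, 87, 6, 95, 44, 18, 31, 72, 25, 9]
print(pick_from_random_digits(readings, population_size=40, sample_size=5))
# prints [31, 6, 18, 25, 9]
```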
2. Systematic Sampling
Let us use the example wherein Mrs. Cruz wants to select 5 students from her 40
students. First, we compute the sampling interval. This is done by dividing the
number of members in the population by the number of members in the sample.
Hence, in our case we shall have i = 40/5 = 8. The next step is to select a random
starting point: write the numbers 1, 2, 3, 4, 5, 6, 7, and 8 on pieces of paper and
draw one number by lottery. If we were able to get 5, this means that we start with
the 5th student and then select every 8th student after that. Therefore, the 5th,
13th, 21st, 29th, and 37th students shall be the members of the sample. If, for
instance, we were able to obtain the number 6, then the members of the sample will
be the 6th, 14th, 22nd, 30th and 38th students.
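A minimal sketch of systematic selection in Python, assuming the usual convention: the interval i = N/n is the gap between successive picks, and the number drawn by lottery (from 1 to i) is the position of the first pick. The function name is illustrative only.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Pick every i-th member, where i = N / n, from a random start in 1..i."""
    interval = population_size // sample_size      # i = 40 / 5 = 8
    if start is None:
        start = random.randint(1, interval)        # the lottery draw
    return [start + k * interval for k in range(sample_size)]

print(systematic_sample(40, 5, start=5))   # [5, 13, 21, 29, 37]
print(systematic_sample(40, 5, start=6))   # [6, 14, 22, 30, 38]
```

Under this convention every student from 1 to 40 belongs to exactly one of the 8 possible samples, so each has the same chance of being selected.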
To do this, we will use stratified random sampling. The word stratified comes
from the root word strata, which means groups or categories (singular form is
stratum). When we use this method, we are actually dividing the elements of the
population into different categories or subpopulations and then the members of the
sample are drawn or selected proportionally from each subpopulation.
Solution: The first step is to find the percentage of each stratum. This is done by
dividing the number of families in each stratum by the total number of families. Then, we
multiply each percentage by the desired number of families in the sample.
Strata     Number of Families   Percentage                Number of Families in the Sample
High       1000                 1000/5000 = 0.2 or 20%    0.2 x 200 = 40
Average    2500                 2500/5000 = 0.5 or 50%    0.5 x 200 = 100
Low        1500                 1500/5000 = 0.3 or 30%    0.3 x 200 = 60
Total      N = 5000                                       n = 200
From the above table, we see that if we are going to draw 200 members from the
population of 5000, we should draw 40 families belonging to the high-income, 100
from the average, and 60 from the low-income groups. Observe that the number of
families drawn as sample in each stratum is proportional to the number of families
from the population.
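The proportional allocation in the table can be reproduced with a few lines of Python. This is a sketch using the stratum sizes from the table; the variable names are illustrative.

```python
strata = {"High": 1000, "Average": 2500, "Low": 1500}
sample_size = 200

total = sum(strata.values())   # N = 5000
allocation = {
    name: round(count / total * sample_size)   # share of stratum x n
    for name, count in strata.items()
}
print(allocation)   # {'High': 40, 'Average': 100, 'Low': 60}
```

With other figures the products may not be whole numbers, so the rounded counts should be checked to make sure they still add up to n.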
4. Cluster Sampling
Cluster sampling is sampling wherein groups or clusters instead of individuals are
randomly chosen. Recall that in simple random sampling we select members of
the sample individually. In cluster sampling, we select or draw the members of
the sample by group and then we select a sample of elements from each cluster or
group randomly. Cluster sampling is sometimes called area sampling because it is
usually applied when the population is large.
To illustrate the use of this sampling method, let us suppose that we want to
determine the average income of the families in Manila. Let us assume there are
250 barangays in Manila. We can draw a random sample of 20 barangays using
simple random sampling, and then a certain number of families from each of the 20
barangays may be chosen.
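The two steps of this illustration can be sketched in Python. The 250 barangays and the 20 chosen clusters come from the example; the number of families per barangay (100) and the number drawn from each (10) are made-up values, since the text only says "a certain number of families".

```python
import random

rng = random.Random()
barangays = list(range(1, 251))                # 250 barangays, numbered 1 to 250
chosen_barangays = rng.sample(barangays, 20)   # step 1: pick 20 clusters at random

# Hypothetical: each barangay has 100 families and we draw 10 from each.
families_per_barangay = 100
families_drawn = 10
sample = {
    b: rng.sample(range(1, families_per_barangay + 1), families_drawn)
    for b in chosen_barangays                  # step 2: pick families within each cluster
}
print(len(sample), "barangays,", sum(len(f) for f in sample.values()), "families")
```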
5. Multi-Stage Sampling
Multi-stage sampling is a combination of several sampling techniques. This
method is usually used by researchers who are interested in studying a very
large population, say the whole island of Luzon or even the Philippines. This is done
by starting the selection of the members of the sample using cluster sampling and
then dividing each cluster or group into strata. Then, from each stratum,
individuals are drawn using simple random sampling.
1. Convenience Sampling
As the name implies, convenience sampling is used because of the convenience it
offers to the researcher. For example, a researcher who wishes to investigate the
most popular noontime show may just interview the respondents by
telephone. The result of this interview will be biased because the opinions of those
without telephones will not be included. Although convenience sampling may be
used occasionally, we cannot depend on it in making inferences about a population.
2. Quota Sampling
In this type of sampling, the proportions of the various subgroups in the
population are determined and the sample is drawn to have the same percentages in
it. This is very similar to stratified random sampling; the only difference is that
the selection of the members of the sample using quota sampling is not done
randomly. To illustrate this, let us suppose that we want to determine the teenagers’
favorite brand of T-shirt. If there are 1000 female and 1000 male teenagers in
the population and we want to draw 150 members for our sample, we can select 75
female and 75 male teenagers from the population without using randomization.
This is quota sampling.
4. Incidental Sampling
This design is applied to those samples which are taken because they are the
most available. The investigator simply takes the nearest individuals as subjects of
the study until the sample reaches the desired size. In an interview, for instance, an
interviewer can simply choose to ask those people around him or in a coffee shop
where he is taking a break.