Module 1 Introduction To Statistics
Overview
This module introduces the basic topics of applied statistics. Terms pertinent to the study of statistics are
defined to facilitate a better understanding of the course.
Objectives
Learning Focus
Importance of Statistics
Some of the functions of statistics are as follows:
Statistics presents facts in a definite form.
Statistics facilitates comparisons.
Statistics gives guidance in the formulation of suitable policies.
Statistics supports predictions made well in advance.
Statistical methods are helpful in formulating and testing hypotheses and in developing new theories.
Role of Statistics
The following are some applications of Statistics in various disciplines and in real life.
Statistics in Research. Statistics provides the platform and backbone of research. It guides how to go
about one's research: whether to consider a sample or the whole population, which techniques to use in data
collection, how to describe the data using measures of central tendency and variability, and how
to determine the relationships between variables. It helps us not only to collect, organize, and present
data but, more importantly, to analyze and interpret them, draw conclusions, and make inferences in order
to solve a problem. Research needs to validate or disprove processes or determine whether the current approach
is worthwhile, accurate, etc., and to do this, researchers need this very powerful tool called statistics.
Business and Economics. Many activities of a businessperson are based on statistical information. Statistics
helps a businessperson plan production according to the needs and tastes of the customers. The quality of
the products can also be checked more efficiently by using statistical methods. Statistics also supports sound
decisions about the location of the business, the marketing of the products, financial resources, etc. All of these
activities are carried out with the help of information gathered and analyzed using this powerful tool called Statistics.
Politics and Governance. Government depends largely on statistics. Many government policies
are based on statistics. Statistical data are now widely used in making administrative decisions. Statistical
methods are used in preparing the budget and in estimating the expected expenditures and revenue from
different sources. Public opinion and election polls are used to assess the pulse of the public on issues
affecting administrative decisions.
In Banking, Accounting and Auditing. Banks work on the principle that the people who deposit
their money do not all withdraw it at the same time. The bank earns profit from these deposits by
lending them to others at interest. Bankers use statistical approaches based on probability to estimate the
number of depositors and their claims for a certain day. Accounting is impossible without exactness, but for
decision-making purposes such precision is not always essential. A decision may be taken on the
basis of approximation, projection, and estimation, all of which are statistical methods. In auditing, sampling techniques are
commonly used. An auditor determines the sample size of the books to be audited on the basis of the error that can be tolerated.
In Natural and Social Science. Statistics plays an important role in almost all the natural and social
sciences. In Biology, Physics, Chemistry, Business, Public Administration, Communication, Mathematics,
Meteorology, Astronomy, Sociology, and Information Technology, statistical methods are used for gathering,
organizing, and analyzing data, as well as for analyzing the results of experiments.
Sports and Entertainment. The most valuable player is commonly determined by analyzing the per-game
performance of the top players. Interviews and surveys are used to settle the viewership battle of TV
networks, as well as to find out the most widely viewed television programs. Ratings given by the members of the
board of judges in the Miss Universe contest are statistically analyzed. All these activities use Statistics.
In Education. With the help of Statistics, a teacher can determine the effectiveness of a particular
teaching strategy by analyzing the test scores obtained by the students. Teachers record the performance
of students, while students evaluate the teaching methodology of teachers. The information obtained from both
activities is analyzed using Statistics, and then the results are used to improve teaching and learning.
Statistics is a primary tool of research, and research is for development. In order to improve the sorry state
of education, one needs to go deeper and trace the roots of the problem. This can be done through intensive
research, and for this research to be valid, a good statistical design for analysis is a must.
Divisions of Statistics
1. Descriptive Statistics is a statistical procedure concerned with describing the characteristics and
properties of a group of persons, places, or things. It is based on easily verifiable facts.
Descriptive Statistics organizes the presentation, description, and interpretation of data gathered without
trying to infer anything that goes beyond the data.
2. Inferential Statistics is a statistical procedure used to draw inferences about a large group of people, places,
or things on the basis of information obtained from a small portion of that group. It involves
generalizing from samples to populations, performing estimations and hypothesis tests, determining
relationships among variables, and making predictions.
Inferential statistics draws inferences about the population based on the data gathered from samples using
the techniques of descriptive statistics. The backbone of inferential statistics is descriptive statistics.
Definitions
Population refers to the large collection of persons, objects, places, or things under study.
Parameter is any numerical value which describes a population.
Example:
1. There are 8,756 students enrolled in Masagana National High School.
2. The average age of the students is 14.
N = 8,756 and µ = 14 are parameters because each describes the population.
Sample is a small portion or part of a population; it serves as a representative of the population in a research study.
Statistic is any numerical value which describes a sample.
Example:
1. Of the 8,756 students enrolled in Masagana National High School, 2,449 are male.
2. The average age of these 2,449 students is 15.2.
n = 2,449 and 𝑥̅ = 15.2 are statistics because each describes the sample.
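The distinction between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages are randomly generated stand-ins rather than real enrollment data, and the sample size of 500 and the name `population` are hypothetical choices.

```python
import random
import statistics

# Hypothetical population: made-up ages for 8,756 students (not real data).
random.seed(1)  # fixed seed so the run is repeatable
population = [random.randint(12, 17) for _ in range(8756)]

N = len(population)                # population size (a parameter)
mu = statistics.mean(population)   # population mean (a parameter)

# Draw a simple random sample and describe it with statistics.
sample = random.sample(population, 500)
n = len(sample)                    # sample size (a statistic)
x_bar = statistics.mean(sample)    # sample mean (a statistic)

print(f"N = {N}, mu = {mu:.2f}")
print(f"n = {n}, x_bar = {x_bar:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean; drawing a different sample gives a different value of the statistic, while the parameter stays fixed.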
Data are facts, or a set of information gathered or under study.
Quantitative Data are numerical in nature, and therefore meaningful arithmetic can be done on them. They involve
numbers and can be obtained by counting or measuring.
Example: age, weekly allowance, monthly salary, height
Qualitative Data are attributes which cannot be subjected to meaningful arithmetic. These are attributes or
characteristics such as sex, educational attainment, feelings, or opinions.
Example: gender, size of T-shirt, brand of car, educational attainment
Quantitative or numerical data gathered about the population or sample can be further classified as either
discrete or continuous.
Discrete Data assume exact values only and can be obtained by counting.
Example: number of students, score in an examination
Continuous Data can assume infinitely many values within a specified interval and can be obtained by measurement.
Example: height of a PBA player, length of waistline
Constant is a characteristic or property of a population or sample which makes the members similar to each
other.
Example: Gender in a class of all boys is constant
Variable is a characteristic or property of a population or sample which makes the members different from
each other.
Example: Gender in a coed school is variable
Researchers are not interested in constants since they do not make the subjects of research different from
one another. They are specifically interested in variables.
In statistics, variables can also be classified as either independent or dependent.
Dependent. A variable which is affected by another variable.
Example: “test scores” is dependent on the number of hours spent studying, IQ, attitude towards
studying, etc.
Independent. A variable which affects the dependent variable.
Example: “number of hours spent in studying” affects test scores
Levels of Measurement
Level of Measurement or Scale of Measure is a classification that describes the nature of the information
within the values assigned to variables, that is, how those values relate to each other.
Psychologist Stanley Smith Stevens is known for developing the four levels of
measurement: nominal, ordinal, interval, and ratio. The four levels of measurement are defined as follows:
a) Nominal. A nominal variable (sometimes called a categorical variable) is one that has two or more categories, but
there is no intrinsic ordering to the categories. For a nominal variable, numbers do not mean anything; they just
label.
Example: gender, color of hair, religion
Gender is a categorical variable having two categories (male and female), and there is no inherent
ordering to the categories. You can code the two categories if you want (male=1 and female=0), but the
order is arbitrary and any calculation (for example, computing an average) would be meaningless.
b) Ordinal. An ordinal variable is similar to a categorical variable. The difference between the two is that there is a
clear ordering of the categories. With ordinal data, however, you cannot state with certainty whether the intervals
between the values are equal.
Example: size of t-shirt, job position, educational attainment
You might ask patients to express the amount of pain they are feeling on a scale of 1 to 10. A score of 7
means more pain than a score of 5, and that is more than a score of 3. But the difference between 7
and 5 may not be the same as that between 5 and 3. The values simply express order.
c) Interval. An interval variable is similar to an ordinal variable, except that the intervals between the values of the
interval variable are equally spaced. It does not have a true zero value.
Example: temperature (in degrees Celsius or Fahrenheit), grade, pH
For temperature, the difference between 100 degrees and 90 degrees is the same
as the difference between 90 degrees and 80 degrees.
d) Ratio. A ratio variable has all the properties of an interval variable and also has a clear definition of 0.0, that is, a true
zero. When the variable equals 0.0, there is none of that variable, or there is an absence of the attribute. Another
property of such a variable is that it allows ratio statements.
Example: number of votes, number of car accidents, length, dose amount
True Zero Value: If a candidate garnered zero (0) votes, it would mean no one voted for him/her.
Ratio Statement: 100 votes is TWICE as many as 50 votes.
Sometimes it’s hard to distinguish interval from ratio because the two terms are often used interchangeably. Don’t worry, it
won’t make you lose your grasp of other statistical terms. Just remember that interval has no true zero, while ratio
has a true zero.
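The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration; the helper name `c_to_f` and the chosen temperatures are assumptions, while the vote counts come from the ratio statement above. It shows that equal differences in temperature stay equal after a change of unit, but a statement such as "twice as hot" does not survive the change, because neither the Celsius nor the Fahrenheit scale has a true zero.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: differences are meaningful in either unit.
diff_high = c_to_f(100) - c_to_f(90)   # 18.0 Fahrenheit degrees
diff_low = c_to_f(90) - c_to_f(80)     # 18.0 Fahrenheit degrees

# Interval scale: ratios depend on the unit, so they carry no meaning.
ratio_c = 100 / 50                     # 2.0 in Celsius
ratio_f = c_to_f(100) / c_to_f(50)     # 212 / 122, about 1.74 in Fahrenheit

# Ratio scale: vote counts have a true zero, so the ratio is meaningful.
ratio_votes = 100 / 50                 # 2.0, "twice as many votes"

print(diff_high == diff_low)           # True
print(ratio_c, round(ratio_f, 2), ratio_votes)
```

The same two temperatures give a ratio of 2.0 in one unit and about 1.74 in the other, which is why ratio statements are reserved for variables with a true zero.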
Ways to categorize the different types of variables according to level of measurement are shown in Table 1 below
to help us choose the right statistical test and visualization technique and to guide our data analysis.
Level     Properties                    Example                     Descriptive statistics  Graphs
Nominal   Discrete                      Binary Responses            Frequency               Bar
          Orderless                     (True or False)             Percentages             Pie
                                        Names of People             Mode
                                        Color of Paints
Ordinal   Ordered Categories            Likert Scale                Frequency               Bar
          Comparisons                   Grades on an Exam           Percentages             Pie
                                                                    Mode                    Stem and Leaf
                                                                    Median
                                                                    Percentile
Interval  Differences between ordered   Degrees °C or °F            Frequency               Bar
          values have meaning           pH                          Percentages             Pie
          No true zero value                                        Mode                    Stem and Leaf
                                                                    Median                  Box Plot
                                                                    Mean                    Histogram
                                                                    Standard Deviation
Ratio     Continuous                    Money                       Mean                    Histogram
          True Zero Value               Weight                      Standard Deviation      Box Plot
          Allows ratio comparison
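As a rough illustration of how Table 1 can guide analysis, the sketch below picks descriptive statistics according to the level of measurement. It is a minimal sketch under stated assumptions: the names `SUMMARIES` and `summarize` and all sample values are hypothetical, frequencies, percentages, and percentiles are left out for brevity, and Python 3.8 or later is assumed for the behavior of `statistics.mode`.

```python
import statistics

# Summaries suggested for each level, following Table 1
# (frequencies, percentages, and percentiles omitted to keep the sketch short).
SUMMARIES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean", "stdev"],
    "ratio": ["mean", "stdev"],
}

FUNCTIONS = {
    "mode": statistics.mode,
    "median": statistics.median,
    "mean": statistics.mean,
    "stdev": statistics.stdev,  # sample standard deviation
}

def summarize(values, level):
    """Return the Table 1 summaries for `values` measured at `level`."""
    return {name: FUNCTIONS[name](values) for name in SUMMARIES[level]}

# Made-up example data for three of the levels.
hair_colors = ["black", "brown", "black", "red", "black"]   # nominal
pain_scores = [3, 5, 7, 5, 8]                               # ordinal, 1 to 10
temps_c = [30.0, 32.5, 31.0, 30.0, 32.0]                    # interval

print(summarize(hair_colors, "nominal"))
print(summarize(pain_scores, "ordinal"))
print(summarize(temps_c, "interval"))
```

Only the mode is computed for the hair colors, because averaging labels would be meaningless, while the temperature readings also receive a mean and a standard deviation.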