Stat Chapter 1


Chapter 1

INTRODUCTION

What is Statistics?
The word statistics is frequently used to refer to a set of data such as the size of enrolment, the number
of patients treated in the university clinic, or the number of students visiting the library. However,
statistics involves more than the simple recording or collecting of data. Statistics is the science of
collecting, organizing, summarizing and analyzing information in order to draw conclusions.
Statistics is used in all fields of human endeavor. The following are some examples of the uses of
statistics:
• a coach keeps a record of his players’ performance during a basketball game
• surveys are conducted to forecast the outcome of an election
• a professor conducts an experiment to compare a new method of teaching to an old one
• a chemist conducts experiments to determine the effect of a new drug on humans

Two Major Areas of Statistics


The study of statistics has two major areas: descriptive statistics and inferential statistics.

DEFINITION
Descriptive Statistics is the area of statistics that involves techniques for organization,
summarization and description of data.
Inferential Statistics is the area of statistics that involves techniques that use sample data to
draw conclusions about a population. It consists of generalizing from samples to populations,
performing estimations and hypothesis tests, determining significant relationships among
variables and making predictions. A basic tool in inferential statistics is probability.

EXAMPLE : Descriptive and Inferential Statistics


Problem 1: Determine whether the following statements use the area of descriptive or
inferential statistics.
1) The average number of library users for the last 6 months is 650.
Answer: Descriptive
2) 9.5 percent of Filipinos experienced involuntary hunger in the first quarter of 2019
(inquirer.net, 4/25/2019).
Answer: Inferential
3) The current dengue incidence is 85% higher than in 2018.
Answer: Descriptive
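
The contrast between the two areas can be sketched in a few lines of Python. The counts below (114 of 1200 respondents) are hypothetical and are not taken from any survey cited above; the function names are also illustrative. The first function only describes the sample, while the second uses the sample to estimate a range for the population proportion, using the common normal-approximation interval as one possible estimation technique.

```python
import math


def sample_proportion(successes: int, n: int) -> float:
    """Descriptive: summarize what was observed in the sample itself."""
    return successes / n


def proportion_interval(successes: int, n: int, z: float = 1.96):
    """Inferential: normal-approximation interval for the population proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - margin, p_hat + margin)


# Hypothetical survey result (assumed numbers, for illustration only).
p_hat = sample_proportion(114, 1200)
low, high = proportion_interval(114, 1200)

print(f"sample proportion (descriptive): {p_hat:.3f}")
print(f"approximate 95% interval (inferential): ({low:.3f}, {high:.3f})")
```

The first number is a statement about the respondents only. The interval is a statement about the whole population, which is why it carries uncertainty.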
Basic Statistical Terms.

The following are terms commonly used in Statistics:


DEFINITION
A population is a collection of all individuals or items under consideration in a statistical
study.
A sample is a part of the population from which information is collected.
A parameter is a numerical description of a population characteristic.
A statistic is a numerical description of a sample characteristic.
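
To make the four terms concrete, here is a minimal Python sketch. The population of 1,000 test scores is randomly generated and purely hypothetical, and the variable names are illustrative. The mean of all scores plays the role of a parameter, and the mean of 50 randomly chosen scores plays the role of a statistic.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: test scores of all 1,000 students (made-up data).
population = [rng.randint(40, 100) for _ in range(1000)]

# Sample: 50 scores drawn at random, without replacement.
sample = rng.sample(population, 50)

parameter = statistics.mean(population)  # describes the whole population
statistic = statistics.mean(sample)      # describes only the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two means are usually close but not equal. In practice only the statistic is available, and it is used to estimate the parameter.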

EXAMPLE : POPULATION AND SAMPLE


Problem 1:
In the following study, identify the population and sample.
A pollster wants to determine which local news channel is watched most frequently
in the town of Silang. Residents of Puting Kahoy and Pasong Langka were
interviewed.
Answer: Population: Residents of the town of Silang
Sample: Residents of Puting Kahoy and Pasong Langka.
Problem 2:
Decide whether the numerical value describes a population parameter or a sample statistic.
In a random check of a sample of school children, it was found that 54% of the
children had head lice.

Answer: Statistic

Try It!

A. In each of these statements, identify the population and sample.


1. In a recent survey, 1200 adults in the Philippines were asked if they are very
satisfied, fairly satisfied, not very satisfied or not at all satisfied with the life they
are experiencing. (Adapted from Social Weather Stations)
Population: ___________________ Sample:____________________
2. Administrators of a certain university want to determine if students are satisfied with
the new registration procedures. Toward this goal, 300 of the 3,000 students are
selected and each is asked, “Are you satisfied with the new registration procedure?”
Population: ___________________ Sample:____________________
B. Decide whether the numerical value describes a population parameter or a
sample statistic.
________1. The average monthly salary for 30 of a university’s 250 faculty is P15,500.
________2. For the 1st semester of CY 2010-2011, the average score in the math
entrance test for all freshman applicants at AUP is 27.8.
DEFINITION
A variable is a characteristic that varies from one person or object to another. Variables
are the opposite of constants, whose values never change.
Data are actual values of the variable.

When doing a study, it is important to know the kind of variable involved. The nature of the
variables will determine which statistical procedures can be used.
Variables in a statistical study can be classified as either qualitative or quantitative.
Quantitative variables can further be classified as either discrete or continuous.

DEFINITION
A qualitative variable, also called a categorical variable, is a variable which assumes non-
numerical values. It allows for classification of individuals based on some attribute or
characteristic (e.g. sex: male, female; religion: Christianity, Islam, Buddhism, etc.).
A quantitative variable is a variable which assumes numerical values for an individual (e.g.
weight: 150 lbs., 45 kg., etc.; height: 6 cm, 15 inches, 7 ft., etc.).
A discrete variable is a quantitative variable whose possible values can be listed or counted
(e.g. number of students, number of patients).
A continuous variable is a quantitative variable whose possible values form some interval of
numbers (e.g. height, weight, distance).

The values of a variable for one or more people or things are referred to as data. Data, like
variables, can be classified as qualitative, quantitative, discrete or continuous.
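
One way to keep these classifications straight is to record them explicitly. The sketch below is only an illustration: the `VariableType` enum and the `is_quantitative` helper are hypothetical names, and the example variables are the ones used in the definitions above. The type is assigned by what the variable measures, not guessed from how the values happen to be recorded.

```python
from enum import Enum


class VariableType(Enum):
    QUALITATIVE = "qualitative"
    DISCRETE = "quantitative (discrete)"
    CONTINUOUS = "quantitative (continuous)"


# Example variables taken from the definitions above.
VARIABLES = {
    "sex": VariableType.QUALITATIVE,
    "religion": VariableType.QUALITATIVE,
    "number of students": VariableType.DISCRETE,
    "number of patients": VariableType.DISCRETE,
    "height": VariableType.CONTINUOUS,
    "weight": VariableType.CONTINUOUS,
    "distance": VariableType.CONTINUOUS,
}


def is_quantitative(kind: VariableType) -> bool:
    """Discrete and continuous variables are both quantitative."""
    return kind is not VariableType.QUALITATIVE


for name, kind in VARIABLES.items():
    print(f"{name:20s} -> {kind.value}")
```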

The classification of variables can be summarized as follows:

Variable
    Qualitative
    Quantitative
        Discrete
        Continuous
Levels of Measurement
Another way to classify data is by identifying how they are categorized, counted or measured. This
type of classification uses measurement scales. The four types of measurement scales or the levels
of measurement are nominal, ordinal, interval and ratio. These levels of measurement determine the
mathematical operations that can be performed and the statistical tools that can be applied to the data
set.
DEFINITION
1) Data at the nominal level of measurement are classified using names, labels, or
qualities. Values cannot be ranked or ordered. No mathematical computations can be
made at this level.
Example: sex: male, female
religion: Christianity, Islam, Buddhism, …
2) Data at the ordinal level of measurement can be arranged in order or ranked, but
differences between the ranks are not meaningful.
Example: Academic honors: Summa Cum Laude, Magna Cum Laude, Cum Laude
Satisfaction Rating: Highly satisfied, satisfied, moderately satisfied,…
3) Data at the interval level of measurement can be ordered, and meaningful differences
between data entries can be calculated. However, there is no meaningful zero. A zero
entry at this level simply represents a position on a scale; it is not an inherent zero.
Example: IQ: 95, 100, 110, …
Temperature: 29℃, 32℃, 15℃, …
4) Data at the ratio level of measurement can be ordered, meaningful differences can be
calculated, true ratios exist, and there is an absolute zero.
Example: Height: 160 cm, 155 cm, 132 cm, …
Systolic Blood Pressure: 100, 120, 135, …

The table below summarizes which operations are meaningful at each of the four levels of
measurement. When identifying a data set’s level of measurement, use the highest level that applies.

Level of       Put data in    Arrange data    Subtract data    Determine the
Measurement    categories     in order        values           ratio of values
Nominal        ✓              ✗               ✗                ✗
Ordinal        ✓              ✓               ✗                ✗
Interval       ✓              ✓               ✓                ✗
Ratio          ✓              ✓               ✓                ✓
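
Because each level permits everything the lower levels permit plus one more operation, the summary can be expressed as a short lookup. The following Python sketch is illustrative only; the operation labels and function names are assumptions chosen for this example.

```python
# Levels listed from lowest to highest, paired with the operation each one adds.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
OPERATIONS = ["categorize", "order", "subtract", "ratio"]


def meaningful_operations(level: str) -> list[str]:
    """Return the operations that are meaningful at the given level.

    A level allows its own operation and those of every lower level.
    """
    index = LEVELS.index(level)
    return OPERATIONS[: index + 1]


def is_meaningful(level: str, operation: str) -> bool:
    return operation in meaningful_operations(level)


for level in LEVELS:
    print(f"{level:9s}: {', '.join(meaningful_operations(level))}")
```

For example, `is_meaningful("interval", "ratio")` is false: a temperature of 30℃ is not "twice as hot" as 15℃, because zero on that scale is not an absolute zero.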
Two Basic Types of Statistical Studies:

Statistical studies can be classified as either observational studies or designed experiments.


Observational studies can reveal only associations; whereas designed experiments can help establish
causation.
DEFINITION
In an observational study, the researcher observes what is happening or what has happened
in the past and tries to draw conclusions based on these observations.
In a designed experiment, the researcher manipulates one of the variables and tries to
determine how the manipulation influences other variables.

Understanding a statistical study will help you make sense of the many things you read in newspapers,
magazines and the internet. However, it is important that before you interpret the results of any study,
you should know whether the results are valid and reliable. In other words, you should be familiar with
how to design a statistical study.

Guidelines on Designing a Statistical Study:


1. Identify the variable(s) of interest and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is
representative of the population.
3. Collect the data.
4. Describe the data using descriptive statistics techniques.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

Introduction to Experimental Designs


In order to produce meaningful results, experiments should be carefully designed and executed. There
are three key elements of a well-designed experiment: control, randomization, and replication. These
elements enable a researcher to conclude that differences in the results that cannot
reasonably be attributed to chance are likely caused by the treatments.

PRINCIPLES OF EXPERIMENTAL DESIGN


Control. Two or more treatments should be compared.
Randomization. The experimental units should be randomly assigned into groups to avoid
unintentional selection bias in constituting the groups.
Replication. A sufficient number of experimental units should be used to ensure that
randomization creates groups that resemble each other closely and to increase the chances of
detecting any differences among the treatments.
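
Randomization can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name `randomly_assign`, the subject labels, and the seed are all hypothetical. The units are shuffled and then dealt into groups, so no one chooses which unit receives which treatment.

```python
import random


def randomly_assign(units, n_groups=2, seed=None):
    """Shuffle the experimental units, then deal them into n_groups groups."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Dealing round-robin keeps group sizes within one unit of each other.
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Twenty hypothetical subjects split into a treatment and a control group.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(subjects, n_groups=2, seed=7)

print("treatment:", treatment)
print("control:  ", control)
```

With more experimental units (replication), groups formed this way tend to resemble each other more closely.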
In a well-designed experiment there is a control group. Generally, this group is used to account for the
influence of other known or unknown variables that might be an underlying cause of change in
the response of the experimental group. Such variables are called confounding variables. The control
group also helps the researchers to control for the placebo effect. The placebo effect occurs when a subject
receives no treatment, but believes he or she is in fact receiving treatment and responds favorably.
The placebo effect can also be minimized by using blinding. Blinding is a technique where the
subjects do not know whether they are receiving a treatment or a placebo. In a double-blind experiment,
neither the experimenter nor the subjects know if the subjects are receiving a treatment or a placebo.

Independent vs Dependent variable. Statistical studies usually include one or more independent
variables and one dependent variable.
DEFINITION
The independent variable is the variable which is being manipulated by the researcher in an
experimental study. It is also called the explanatory variable.
The dependent variable is the variable which is affected or influenced by another variable. It is
also known as the resultant or outcome variable.

Example 5.
Identify the independent variable(s) (IV) or dependent variable(s) (DV) in the study.
A study of more than 3000 Japanese adults published in the British Medical Journal found
that those who ate their meals quickly were about twice as likely to be obese as their slow-
munching counterparts. (Source: Reader’s Digest)
Answer:
Independent variable: rate of eating
Dependent variable: person’s weight or BMI (Body Mass Index), which determines whether a
person is obese or not.
