Lecture 1-Introduction To Statistics
Elementary Statistics
(STATS 1)
Lecture 1: Introduction to Statistics and Probability
Source: Timor-Leste Agricultural Census 2019
Source: Timor-Leste Population Census 2015
Source: UNICEF, 2020
6. From the information given, comment on the relationship between the variables.
Based on the data, it appears that, in general, the better your attendance, the higher
your grade.
Variables and Types of Data
Qualitative (Categorical) vs Quantitative Variable
Example 6:
• If subjects are classified according to gender (male or female), then the
variable gender is qualitative.
• If subjects are classified according to religious upbringing (Catholic, Muslim,
Hindu, Buddhist), then the variable religious upbringing is qualitative.
• If subjects are classified according to geographical origin (Aileu, Baucau,
Covalima, etc.), then the variable geographical origin is qualitative.
Quantitative variables are variables that can be counted or measured.
Example 7:
• The variable age is numerical, and people can be ranked in order according
to the value of their ages. Therefore, the variable age is quantitative.
• The data collected are the heights of people. Therefore, the variable height
is quantitative.
• The data collected are the weights of people. Therefore, the variable weight
is quantitative.
• The data collected are the body temperatures of people. Therefore, the
variable body temperature is quantitative.
Discrete vs Continuous Variable
Quantitative variables can be further classified into two groups: discrete
and continuous.
Discrete variables assume values that can be counted.
Example 8:
• The number of eggs a hen lays in a month.
• The number of students in a class.
Continuous variables can assume an infinite number of values between any
two specific values. They are obtained by measuring. They often include
fractions and decimals.
Example 9:
• The heights of people in a city.
• The time a marathon runner takes to reach the finish line.
The classification of variables can be summarized as
follows:
EXERCISE 3
Classify each of the following variables as discrete or continuous.
Variable: Type of variable
• Maximum allowable motorcycle speed on a public road: Continuous
• The number of pages in a Statistics book: Discrete
• Volume of gasoline pumped into a car: Continuous
• The number of books on a shelf in the JSU library: Discrete
Level of Measurements
Since the interval scale has no true zero point, you cannot calculate meaningful
ratios. For example, it makes no sense to say that the ratio of 90 to 30 degrees F
is the same as the ratio of 60 to 20 degrees F, because 0 degrees F does not mean
"no temperature".
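One way to see why ratios are not meaningful on an interval scale is to express the same temperatures in a different unit. The short Python sketch below is only an illustration (the function name `fahrenheit_to_celsius` is introduced here and is not part of the lecture). It shows that the two ratios are equal in Fahrenheit but no longer equal in Celsius, while differences between temperatures stay consistent.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Ratios of the raw Fahrenheit readings used in the example.
ratio_f_high = 90 / 30
ratio_f_low = 60 / 20

# Ratios of the same four temperatures after converting to Celsius.
ratio_c_high = fahrenheit_to_celsius(90) / fahrenheit_to_celsius(30)
ratio_c_low = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(20)

# Differences behave well: two equal 30-degree gaps in Fahrenheit
# are also two equal gaps in Celsius.
diff_c_upper = fahrenheit_to_celsius(90) - fahrenheit_to_celsius(60)
diff_c_lower = fahrenheit_to_celsius(60) - fahrenheit_to_celsius(30)

print("Fahrenheit ratios:", ratio_f_high, ratio_f_low)
print("Celsius ratios:   ", round(ratio_c_high, 2), round(ratio_c_low, 2))
print("Celsius gaps:     ", round(diff_c_upper, 2), round(diff_c_lower, 2))
```

Because the ratios change when the unit changes, they describe the choice of zero point rather than the temperatures themselves.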
Ratio
The ratio level of measurement possesses all the characteristics of interval
measurement, and there exists a true zero. In addition, true ratios exist when
the same variable is measured on two different members of the population.
Example 15:
• Salary
• Length
• Weight
• Height
• Time
EXERCISE 4
What level of measurement would be used to measure each of the following
variables?
Disadvantages:
• Some people have no cellphone
• People might not pick up the phone
• Many numbers are unlisted
Mailed Questionnaires
Advantages:
• Cover a wider geographic area
• Respondents remain anonymous
Disadvantages:
• Low number of responses
• Inappropriate answers
• Difficulty reading or understanding the questions
Personal Interview
Advantages:
• In-depth responses can be obtained
Disadvantages:
• Interviewers must be trained
• More costly
• Possible bias in the selection of respondents
Sampling Technique
As stated earlier in our discussion, researchers use samples to collect data and
information about a particular variable from a large population. Using
samples saves time and money and in some cases enables the researcher to
get more detailed information about a particular subject.
• Volunteer sample or self-selected sample: Respondents decide for
themselves if they wish to be included in the sample.
For example, RTTL asks a question about a situation involving violence and then
asks people to call a number if they agree with the action taken by the
government or the police. The results are then announced at the end of the day.
Sampling Error
Since samples are not perfect representatives of the populations from which
they are selected, there is always some error in the results. This error is
called a sampling error.
Sampling error is the difference between the results obtained from
a sample and the results obtained from the population from which
the sample was selected.
For example, suppose you select a sample of full-time students at your
college and find that 56% are female. Then you go to the admissions office,
get the genders of all full-time students that semester, and find that
54% are female. The difference of 2 percentage points is said to be due
to sampling error.
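The arithmetic in this example can be sketched in a few lines of Python. This is a minimal illustration only: the helper `sampling_error`, the population of 10,000 students, the sample size of 100, and the random seed are all assumptions made for the sketch and do not come from the lecture.

```python
import random


def sampling_error(sample_value: float, population_value: float) -> float:
    """Return the sample result minus the population result."""
    return sample_value - population_value


# Figures from the example: 56% female in the sample, 54% in the population.
error_points = sampling_error(56.0, 54.0)
print("Sampling error (percentage points):", error_points)

# Hypothetical population of 10,000 full-time students, 54% female.
# 1 stands for a female student, 0 for a male student.
population = [1] * 5400 + [0] * 4600
population_pct = 100 * sum(population) / len(population)

# Draw one random sample of 100 students and compare it with the population.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 100)
sample_pct = 100 * sum(sample) / len(sample)
simulated_error = sampling_error(sample_pct, population_pct)

print("Population % female:", population_pct)
print("Sample % female:    ", sample_pct)
print("Sampling error:     ", simulated_error)
```

Changing the seed gives a different sample and usually a different sampling error, which is the point of the definition: the error comes from which members happened to be selected.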
EXERCISE 5
Which sampling technique was used in each of the following studies?
a.) Out of 10 high schools in Dili, researchers select one and record
the number of students who are late over a whole month.
Answer: Cluster
b.) A researcher divides a group of students according to gender,
major field, and low, average, and high grade point average. Then
she randomly selects six students from each group to answer
questions in a survey.
Answer: Stratified
c.) The number of people who visited Cristo Rei is recorded. Then a sample
of these people is selected.
Answer: Random
d.) Every 10th bottle of GOTA water is selected, and the amount of
liquid in the bottle is measured. The purpose is to see if the
machines that fill the bottles are working properly.
Answer: Systematic
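The four techniques named in this exercise (random, systematic, stratified, and cluster sampling) can be sketched in Python as follows. This is only an illustrative sketch: the population of 100 students, the field names `id`, `school`, and `gender`, and all function names are hypothetical and were chosen for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students spread evenly over 10 schools.
population = [
    {"id": i, "school": (i - 1) // 10 + 1, "gender": "F" if i % 2 == 0 else "M"}
    for i in range(1, 101)
]


def simple_random_sample(units, n):
    """Random sampling: every unit has the same chance of selection."""
    return rng.sample(units, n)


def systematic_sample(units, k):
    """Systematic sampling: pick a random start, then take every k-th unit."""
    start = rng.randrange(k)
    return units[start::k]


def stratified_sample(units, key, n_per_stratum):
    """Stratified sampling: split into groups, then sample within each group."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen


def cluster_sample(units, key, n_clusters):
    """Cluster sampling: pick whole groups at random and keep all their units."""
    clusters = sorted({unit[key] for unit in units})
    picked = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in picked]


random_pick = simple_random_sample(population, 10)
every_tenth = systematic_sample(population, 10)
by_gender = stratified_sample(population, "gender", 6)
one_school = cluster_sample(population, "school", 1)

print("Random:    ", sorted(u["id"] for u in random_pick))
print("Systematic:", [u["id"] for u in every_tenth])
print("Stratified:", sorted(u["id"] for u in by_gender))
print("Cluster:   ", [u["id"] for u in one_school])
```

Note the difference between the last two: stratified sampling takes a few units from every group, while cluster sampling takes every unit from a few groups.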
Experimental Design
Observational vs Experimental Studies
Observational Study
In an observational study, the researcher merely observes what is
happening or what has happened in the past and tries to draw
conclusions based on these observations.
Example 16:
A researcher wants to know how many people pursue higher education after
finishing high school.
There are three main types of observational studies:
• Cross-sectional study: When all the data are collected at one time.
• Retrospective study: When the data are collected using records obtained
from the past.
• Longitudinal study: When data are collected over a period of time, say,
past and present.
Experimental Study
In an experimental study, the researcher manipulates one of the variables
and tries to determine how the manipulation influences other variables.
Example 17:
A researcher is conducting a study to determine which language of instruction
is better for student learning in one of the elementary schools in Viqueque.
Dependent and Independent Variables
The independent variable in an experimental study is the one that is being
manipulated by the researcher. The independent variable is also called the
explanatory variable. The resultant variable is called the dependent variable
or the outcome variable.
Example 18:
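A researcher is studying the relationship between physical exercise and
health. The study examines various types of physical exercise and their
effects on overall physical health. Here the type of physical exercise is the
independent variable, and overall physical health is the dependent variable.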
Computer Use for Statistics
Microsoft Excel
In the past, statistical calculations were done with pencil and paper.
However, with the invention of computers, numerical computation
has become much easier and faster. In this class we will be using Microsoft
Excel for our statistical calculations.