Introduction of Statistics

Statistics involves collecting, organizing, analyzing, and drawing conclusions from data. It has two main divisions: descriptive statistics, which summarize and describe data through measures like central tendency, variation, and position; and inferential statistics, which make predictions about populations from samples. Key terms include variables, data, samples, populations, levels of measurement, and sampling techniques. Statistics provides tools for decision-making across many domains.

STATISTICS AND PROBABILITY

INTRODUCTION OF STATISTICS
IMPORTANT TERMS IN STATISTICS
Statistics is defined as a science that studies data to make a decision. Hence, it
is a tool in the decision-making process.
Statistics involves the methods of collecting, processing, summarizing, and
analyzing data to provide answers or solutions to an inquiry.

MAJOR DIVISIONS OF STATISTICS


a) Descriptive Statistics are numbers that are used to summarize and
describe data.

Descriptive statistics allow you to characterize your data based on its
properties. There are four major types of descriptive statistics:

I. Measures of Frequency
* Count, Percent, Frequency

* Shows how often something occurs

* Use this when you want to show how often a response is given

II. Measures of Central Tendency

* Mean, Median, and Mode

* Locates the center of the distribution by various points

* Use this when you want to show an average or the most commonly
indicated response

III. Measures of Dispersion or Variation

* Range, Variance, Standard Deviation

* Identifies the spread of scores by stating intervals

Prepared by: Ms. Justine Ann Nasol


* Range = difference between the highest and lowest points

* Variance or Standard Deviation = based on the differences between the
observed scores and the mean

* Use this to show how "spread out" the data are. It is helpful to know
when your data are so spread out that it affects the mean

IV. Measures of Position

* Percentile Ranks, Quartile Ranks

* Describes how scores fall in relation to one another. It relies on
standardized scores

* Use this when you need to compare scores to a normalized score
(e.g., a national norm)
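
The four types of descriptive statistics above can be illustrated with a short Python sketch. The quiz scores are made-up illustrative data, the helper name percentile_rank is hypothetical, and the population versions of variance and standard deviation are used here as an assumption (a sample version would divide by n - 1 instead).

```python
from collections import Counter
import statistics

# Hypothetical quiz scores for ten students (illustrative data only).
scores = [7, 8, 8, 9, 10, 6, 8, 7, 9, 8]

# I. Measures of frequency: how often each score occurs.
frequency = Counter(scores)
percent = {value: 100 * count / len(scores) for value, count in frequency.items()}

# II. Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# III. Measures of dispersion or variation.
data_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)  # population variance (assumption)
std_dev = statistics.pstdev(scores)      # population standard deviation

# IV. Measures of position.
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartile cut points


def percentile_rank(value, data):
    """Percent of observations that are at or below `value`."""
    return 100 * sum(1 for x in data if x <= value) / len(data)


print("Frequency of each score:", dict(frequency))
print("Mean, median, mode:", mean, median, mode)
print("Range, variance, standard deviation:", data_range, variance, round(std_dev, 2))
print("Quartiles:", q1, q2, q3)
print("Percentile rank of a score of 8:", percentile_rank(8, scores))
```

Note that several definitions of percentile rank are in use; the "at or below" version here is only one of them.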

b) Inferential Statistics consist of generalizing from samples to populations,
performing hypothesis testing, determining relationships among variables,
and making predictions.

NATURE OF DATA
Data – a collection of facts from experiments, observations, sample surveys,
censuses, and administrative reporting systems.
• Data are facts and figures that are presented, collected, and analyzed.
Data are either numeric or non-numeric and must be contextualized.
• To contextualize data, that is, to put meaning on the data, we must
identify its six W’s:

1. Who? Who provided the data?
2. What? What information was obtained from the respondents, and what unit
of measurement was used for each data item (if any)?
3. When? When was the data collected?
4. Where? Where was the data collected?
5. Why? Why was the data collected?
6. HoW? HoW was the data collected?

Census is the collection of data from all possible respondents.
Universe is the collection or set of units or entities from which the data are
obtained.
Population is an entire collection of individuals or objects with common
observable characteristics.
• The set of all variable values is referred to as a population.
Sample – a subgroup of a universe or a population.
Data is a specific measurement of a variable – it is the value you record in your
datasheet. Data is generally divided into two categories:
• Quantitative data represents amounts.
• Categorical data represent groupings. (also called qualitative data)
A variable that contains quantitative data is a quantitative variable; a variable
that contains categorical data is a categorical variable. Each of these types of
variables can be broken down into further types.
Variable – a characteristic that is observable or measurable in every unit of the
universe.
Quantitative variables
When you collect quantitative data, the numbers you record represent real
amounts that can be added, subtracted, divided, etc. There are two types of
quantitative variables: discrete and continuous.

CLASSIFICATION OF VARIABLES ACCORDING TO CONTINUITY
A. Discrete variable. A variable that has either a finite number of possible
values or a countable number of possible values. (The values are obtained
by counting)
Example:
number of students enrolled in Statistics class, number of computers at the
computer lab.
B. Continuous variable. A variable that has infinitely many possible values
that can be associated with points on a continuous scale so that there are
no gaps or interruptions. (The values are obtained by measuring)
Example:
height of Carla in centimeters, GPA of Cardo Dalisay in the last semester
attended.

Discrete vs continuous variables

Type of variable          What does the data represent?    Examples

Discrete variables        Counts of individual items or    • Number of students in a class
(aka integer variables)   values.                          • Number of different tree species
                                                             in a forest

Continuous variables      Measurements of continuous or    • Distance
(aka ratio variables)     non-finite values.               • Volume
                                                           • Age

Categorical variables
Categorical variables represent groupings of some kind. They are
sometimes recorded as numbers, but the numbers represent categories rather
than actual amounts of things.

CLASSIFICATION OF VARIABLES ACCORDING TO THE FUNCTIONAL RELATIONSHIP
1. Independent variable. This is sometimes called the predictor
variable. It causes an effect on the dependent variable. The other
variables cannot change it.
2. Dependent variable. This is sometimes called the criterion variable.
Example: Academic performance depends on IQ.
IQ is the independent variable, while academic performance is the
dependent variable.


Independent vs dependent vs control variables

Type of variable            Definition                          Example (salt tolerance experiment)

Independent variables       Variables you manipulate in order   The amount of salt added to each
(aka treatment variables)   to affect the outcome of an         plant’s water.
                            experiment.

Dependent variables         Variables that represent the        Any measurement of plant health
(aka response variables)    outcome of the experiment.          and growth: in this case, plant
                                                                height and wilting.

Control variables           Variables that are held constant    The temperature and light in the
                            throughout the experiment.          room the plants are kept in, and
                                                                the volume of water given to each
                                                                plant.

Example datasheet
In this experiment, we have one independent and three dependent variables.
The other variables in the sheet can’t be classified as independent or
dependent, but they do contain data that you will need in order to interpret
your dependent and independent variables.

LEVEL OF MEASUREMENT
a) Nominal level of measurement is characterized by data that consist of
names, labels, or categories only.

Example: classifying the gender of the student
0 – Female
1 – Male
NOTE: the numbers serve only as labels and have no numerical meaning.

b) Ordinal level of measurement involves data that may be arranged in
some order but differences between data values either cannot be
determined or are meaningless.

Example: The top 10 students of the graduating class, rank in a beauty
contest.

c) Interval level of measurement is like the ordinal level, but meaningful
differences between data values can be determined. The zero point
is arbitrary, meaning a zero score is not a true zero.

Example: Average annual temperatures in Baguio

d) Ratio level of measurement is the highest level of measurement. Like
interval data, ratio data can be ordered and have meaningful differences.
What differentiates it from interval data is that zero is absolute.

Example: weight of garbage discarded by households.
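
The difference between an arbitrary zero (interval) and an absolute zero (ratio) can be checked with a little arithmetic. The sketch below is only an illustration with made-up values: it assumes temperatures recorded in degrees Celsius, converts them to kelvins, and compares the result with weights converted from kilograms to grams.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvins."""
    return celsius + 273.15


# Interval data: temperature in degrees Celsius (zero is arbitrary).
cool_c, warm_c = 10.0, 20.0
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Ratio data: weight of garbage (zero means no garbage at all).
light_kg, heavy_kg = 2.0, 4.0
ratio_kg = heavy_kg / light_kg
ratio_g = (heavy_kg * 1000) / (light_kg * 1000)

print("Temperature difference:", diff_c, "vs", round(diff_k, 2))
print("Temperature ratio:", ratio_c, "vs", round(ratio_k, 3))
print("Weight ratio:", ratio_kg, "vs", ratio_g)
```

The difference between the two temperatures is the same on both scales, but the ratio changes when the zero point moves, so "twice as hot" is not a meaningful statement for Celsius readings. The weight ratio stays the same whatever unit is used, which is why "twice as heavy" is meaningful.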

SAMPLING TECHNIQUE
Sampling refers to the process of selecting individuals who will participate as
part of the study.

Random Sampling is a process in which every member of the population has an
equal chance of being selected. It is also called probability sampling.

a. Simple Random Sampling is a process of selecting a sample of size n
from the population using random numbers or a lottery.

b. Systematic Sampling is a process of selecting every kth element in the
population until the desired number of subjects or respondents is
attained.
c. Stratified Sampling is a process of subdividing the population into
subgroups or strata and drawing members at random from each
subgroup or stratum.
d. Cluster Sampling is a process of selecting clusters from a population
that is very large or widely spread out over a wide geographical
area.
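
The four random sampling techniques can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the population of 100 numbered students, the four sections, the fixed seed, and all function names are assumptions made for the example. The systematic sample uses one common variant, with k taken as the population size divided by the sample size and a random starting point.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch gives repeatable results

# Hypothetical population: students numbered 1 to 100, in 4 sections of 25.
population = list(range(1, 101))
sections = {name: population[i * 25:(i + 1) * 25] for i, name in enumerate("ABCD")}


def simple_random_sample(units, n):
    """Draw n units so that every unit has the same chance (like a lottery)."""
    return rng.sample(units, n)


def systematic_sample(units, n):
    """Pick a random start, then take every kth unit, where k = N // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]


def stratified_sample(strata, n_per_stratum):
    """Draw members at random from each subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters):
    """Select whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


print("Simple random:", simple_random_sample(population, 10))
print("Systematic:", systematic_sample(population, 10))
print("Stratified:", stratified_sample(sections, 5))
print("Cluster:", cluster_sample(sections, 2))
```

In the stratified sample every section contributes respondents, while in the cluster sample only the chosen sections do.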
Non-Random Sampling is a sampling procedure where samples are selected in
a deliberate manner with little or no attention to randomization. It is also
called non-probability sampling.

a. Convenience sampling is a process of selecting a group of
individuals who (conveniently) are available for study.
b. Purposive sampling is a process of using judgment to select a
sample that researchers believe, based on prior information, will
provide the data they need.
c. Quota sampling is applied when an investigator collects
information from an assigned number or quota of individuals from
one of several sample units fulfilling certain prescribed criteria or
belonging to a stratum.
d. Snowball sampling is a technique in which one or more members of
the population are located and used to lead the researcher to
another member of the population.
e. Voluntary sampling is a technique in which the sample is composed
of respondents who self-select into the study/survey.
f. Judgement sampling is a technique wherein a researcher relies on
his/her personal/sound judgment in choosing who will participate in
the study, or the sample is selected based on the opinion of an expert.

Data Presentation
3 Methods to present data

A. Textual or Narrative

B. Tabular

C. Graphical methods of presentation

In presenting the data in textual or paragraph or narrative form, one describes the
data by enumerating some of the highlights of the data set like giving the highest,
lowest, or average values. In case there are only a few observations, say less than
ten observations, the values could be enumerated if there is a need to do so. An
example of which is shown below:
The country’s poverty incidence among families as reported by the
Philippine Statistics Authority (PSA), the agency mandated to release
official poverty statistics, decreased from 21% in 2006 to 19.7% in
2012. For 2012, the regional estimates released by PSA indicate that the
Autonomous Region of Muslim Mindanao (ARMM) is the poorest region
with poverty incidence among families estimated at 48.7%. The region with
the smallest estimated poverty incidence among families at 2.6% is the
National Capital Region (NCR).
The tabular method of presentation is applicable for large data sets. Trends could
easily be seen in this kind of presentation. However, there is a loss of information
when using such kind of presentation. The frequency distribution table is the usual
tabular form of presenting the distribution of the data. The following are the
common parts of a statistical table:
a. Table title includes the number and a short description of what is found inside
the table.
b. Column header provides the label of what is being presented in a column.
c. Row header provides the label of what is being presented in a row.
d. Body is the information in the cell intersecting the row and the column.
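
As an illustration of a frequency distribution table with the parts listed above, the following Python sketch groups a set of scores into classes and prints a small table. The exam scores, the class limits (starting at 50 with a width of 10), the table title, and the function name frequency_table are all assumptions made for this example, and the scores are assumed to be whole numbers.

```python
# Hypothetical exam scores (illustrative data only, whole numbers assumed).
scores = [52, 67, 71, 58, 89, 94, 76, 63, 81, 77, 69, 85, 73, 90, 66]


def frequency_table(data, start, width, n_classes):
    """Count how many values fall in each class, e.g. 50-59, 60-69, ..."""
    rows = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width - 1
        count = sum(1 for x in data if lower <= x <= upper)
        rows.append((f"{lower}-{upper}", count, 100 * count / len(data)))
    return rows


rows = frequency_table(scores, start=50, width=10, n_classes=5)

# Table title
print("Table 1. Distribution of exam scores")
# Column headers
print(f"{'Score':<10}{'Frequency':>10}{'Percent':>10}")
# Row headers (the class labels) followed by the body of the table
for label, count, pct in rows:
    print(f"{label:<10}{count:>10}{pct:>10.1f}")
```

The table shows the shape of the distribution at a glance, but the individual scores can no longer be recovered from it, which is the loss of information mentioned above.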

In general, a table should have at least three rows and/or three columns.
However, conveying too much information in a single table is also not advisable.
Tables are usually used in written technical reports and in oral presentations.
Table 5.1 1 is an example of presenting data in tabular form. This example was
taken from 2015 Philippine Statistics in Brief, a regular publication of the PSA
which is also the basis for the example of the textual presentation given above.

Graphical presentation, on the other hand, is a visual presentation of the data.
Graphs are commonly used in oral presentations. There are several forms of
graphs to use like the pie chart, pictograph, bar graph, line graph, histogram,
and box-plot.
DIFFERENT FORMS OF GRAPH

1. LINE GRAPH is used to represent changes in data over a period of time. A line
graph may be curved, broken, or straight.
NOTE: Generally, the horizontal axis is used as the time axis and the vertical axis is
used to show the changes in the other quantity.

The above graph tells about the trend in the temperature of New York on a hot
day.

2. BAR GRAPH is a graph that uses horizontal or vertical bars to represent data.
• When the bars of a bar graph extend from left to right, it is called a
horizontal bar graph.
• If the bars extend from bottom to top, it is called a vertical bar graph.

3. HISTOGRAM is a graphical display of data using bars of different heights.

Bar graph vs Histogram

Bar Graphs are good when your data is in categories (such as "Comedy",
"Drama", etc).

But when you have continuous data (such as a person's height) then use
a Histogram. It is best to leave gaps between the bars of a Bar Graph, so it
doesn't look like a Histogram.
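
The contrast can be sketched with plain text bars in Python. The genre responses, the heights, and the class width of 5 cm are made-up values for illustration only. The bar graph counts each category separately, while the histogram counts how many measurements fall in adjacent intervals of a continuous scale.

```python
from collections import Counter

# Hypothetical data (illustrative only).
genres = ["Comedy", "Drama", "Comedy", "Action", "Drama", "Comedy"]
heights_cm = [150.2, 153.8, 157.5, 158.1, 161.0, 162.4, 163.9, 166.3, 171.7]

# Bar graph: one bar per category.
bar_counts = Counter(genres)

# Histogram: each bar covers an interval from `lower` up to, but not
# including, `lower + width`, so neighbouring bars share a boundary.
width = 5
hist_counts = Counter(int(h // width) * width for h in heights_cm)

print("Bar graph (gaps between bars)")
for genre, count in bar_counts.items():
    print(f"{genre:<8} {'#' * count}")
    print()  # blank line = gap between categories

print("Histogram (no gaps between bars)")
for lower in range(150, 175, width):
    print(f"{lower}-{lower + width:<4} {'#' * hist_counts[lower]}")
```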

4. PIE GRAPH OR PIE CHART is another visual representation of data. It is used to
show how all the parts of something are related to the whole.
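
To relate each part to the whole, every slice of a pie chart is given a share of the full 360° circle in proportion to its value. The Python sketch below works this out for a made-up monthly budget; the items and amounts are assumptions for illustration only.

```python
# Hypothetical monthly budget in pesos (illustrative data only).
budget = {"Food": 6000, "Rent": 9000, "Transport": 3000, "Savings": 2000}
total = sum(budget.values())

# Each slice gets the same fraction of 100% and of 360 degrees
# as its amount is of the total.
slices = {
    item: {"percent": 100 * amount / total, "angle": 360 * amount / total}
    for item, amount in budget.items()
}

for item, part in slices.items():
    print(f"{item:<10} {part['percent']:5.1f}%  {part['angle']:6.1f} degrees")
```

The percentages add up to 100 and the angles add up to 360, which is a quick check that the parts account for the whole.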

5. PICTOGRAPH is a graph that uses pictures to illustrate data.

