
APPLIED STATISTICS FOR BUSINESS AND ECONOMICS

ECO 2209

Ivan Cassidy F. Villena

TABLE OF CONTENTS
PREFACE
INTRODUCTION
DATA PRESENTATION
MEASURES OF CENTRAL TENDENCY
MEASURES OF VARIABILITY
PROBABILITY
NORMAL DISTRIBUTION
SIMPLE LINEAR REGRESSION
HYPOTHESIS TESTING
ANALYSIS OF VARIANCE
CHI-SQUARE TEST

PREFACE
This module attempts to illustrate statistical concepts and tools through easy-to-understand examples and exercises, so that students can apply them to different business and economic problems. Moreover, statistical principles are, as much as possible, explained in words rather than formulas. The author hopes that this module helps students enrolled in this subject, and as future practitioners in the field of business and economics, to appreciate the importance of "statistical thinking" in transforming data into meaningful information that is indispensable to decision-making.

Week 1

I. INTRODUCTION

1. Definition of Statistics. Statistics (singular) involves the collection, organization, analysis, and presentation of data that are subject to variability, and the ways these data can be processed into useful information in the presence of uncertainty.

2. Basic Concepts about Data. Collected data are classified as either quantitative or qualitative.

a. Quantitative (numerical) – data whose sizes are meaningful. This type of data may be further classified as discrete or continuous.

Discrete – data that can be counted; values that can be put into one-to-one correspondence with a subset of the set of counting numbers (e.g. the number of students enrolled in this subject).

Continuous – data that can be measured (e.g. the exact height of a student enrolled in this subject).

b. Qualitative (categorical) – data that answer the question "what kind." These data can be either ordered or unordered.

3. Importance of Statistics. The primary objective of statistics is to transform data into meaningful and useful information by describing, explaining, predicting, and/or controlling some phenomenon of interest (e.g. determining whether a vaccine is effective).

4. Types of Data and Levels of Measurement. To understand data better, it is important to know what a variable is. A variable is a characteristic of a unit of observation or subject that can take on different values for different units/subjects, or for the same unit/subject at different periods.

a. Nominal – the simplest scale of measurement, where a value or unit of data is assigned to one of at least two qualitative classes or categories (e.g. sex, civil status). The categories are exhaustive and mutually exclusive.

b. Ordinal – involves placing values or codes in some rank or order to create an ordinal-scale variable (e.g. social class: upper, middle, and lower).

c. Interval – data with a zero point that is arbitrary and carries no inherent meaning (e.g. temperature measured in either Celsius or Fahrenheit).

d. Ratio – has all the features of an interval scale plus an absolute, fixed, and non-arbitrary zero point (e.g. per capita GNP or GDP).

It is imperative to understand the level of measurement for two reasons: (1) it helps one decide how to interpret the data; and (2) it helps one decide which statistical analyses are appropriate for the assigned values.

Types of Data

a. Primary (raw) – any set of data or information collected directly from the source (e.g. the IATF's COVID-19 statistics announced via national television).

b. Secondary – data provided by an organization or government agency in a convenient form, such as a written report or census data, or data processed and re-processed by individuals or entities other than the primary source of information (e.g. the PSA's Labor Force Survey and the BSP's Core Banking Statistics).

Weeks 2 to 4

II. DATA PRESENTATION

5. Textual Form. Presents the data in the form of words, sentences, and paragraphs. It allows one to present qualitative data that cannot be shown in graphical or tabular form (e.g. the words most commonly used during an interview). The most common textual presentation is the word cloud (see Figure 1). Recommended software: MS Excel (add-in function) and Polleverywhere.com (web-based analytical tool).

Figure 1. Example of a “Wordcloud”

6. Graphical Form. A method commonly used to analyze numerical data. It presents to readers and audiences the relationships among data, ideas, information, and concepts in a diagram. There are several graphical forms of data presentation:

a. Line Graphs. Used to display continuous data; useful for predicting future events over time (historical or time-period analysis).

b. Bar Graphs. Used to display categories of data; compares the data using solid bars to represent the quantities.

c. Histograms. Used to present the distribution of data. In statistics, the histogram is utilized to assess the "skewness" of data.

d. Frequency Table. Presents the data in summary form by aggregating them: choosing suitable non-overlapping classes, tallying (or counting) the data into these classes, and presenting them in tabular form.

e. Stem and Leaf Display. Organizes the data from the least value to the greatest value. It is constructed by splitting each data point into two parts: a stem (one or more of the leading digits) and a leaf (the remaining digits) (see Figure 2).

Figure 2. Example of a “Stem and Leaf” Graph

Weeks 5 to 6

III. Measures of Central Tendency

A measure of central tendency is an index of the central location of a distribution. It is a single value used to identify the center of the data, or the typical value.

a. Mean. The most commonly used measure of central tendency is the average (also called the mean or arithmetic mean). Simply, the mean or average of a data set is the sum of the data divided by the number of observations.

b. Mode. The value of a variable that occurs most frequently. It is also referred to as the nominal average.

c. Median. The "middle observation" when the data set is sorted (in either increasing or decreasing order); also termed the "central value of a distribution."

d. Percentiles, Quartiles. Quartiles are values that separate the sorted data into four equal groups, and percentiles are values that separate the sorted data into 100 equal groups (see the sketch below).
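As a quick illustration, the sketch below computes these measures with Python's standard statistics module (Python 3.8+ is assumed for quantiles()); the list of scores is hypothetical.

import statistics

scores = [70, 75, 75, 80, 82, 85, 88, 90, 90, 95]   # hypothetical exam scores

print(statistics.mean(scores))     # arithmetic mean: sum of the data / number of data
print(statistics.median(scores))   # middle observation of the sorted list
print(statistics.mode(scores))     # most frequent value (on ties, the first one seen)
print(statistics.quantiles(scores, n=4))         # quartiles Q1, Q2, Q3
print(statistics.quantiles(scores, n=100)[89])   # 90th percentile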

Weeks 7 to 8

IV. Measures of Variability/Variation

Resorting to a measure of central tendency, a single summary number such as the mean, is not enough to provide a clear picture of a distribution. Several lists of data may have the same mean while their spreads differ. Thus, computing other features of the data, namely measures of spread or variation, is also important. This can be done with the following:

a. Range. The difference between the largest and smallest values in the list (Formula: Range = largest (maximum) value – smallest (minimum) value).

b. Inter-quartile range. The difference between the upper and lower quartiles (Formula: IQR = Upper Quartile (Q3) – Lower Quartile (Q1)). A five-number summary consists of the smallest value, the lower quartile, the median, the upper quartile, and the largest value.

c. Mean Absolute Deviation. A measure of spread formed by taking the mean of the absolute deviations from the average.

Formula: Mean Absolute Deviation = ∑ |xᵢ – mean| / no. of observations

d. Variance. It is the mean of the squared deviations from the average.

Formula: σ² = ∑ (xᵢ – µ)² / no. of observations

Standard Deviation – it is the square root of the variance.
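The following sketch computes these measures of spread in plain Python, dividing by n as in the formulas above (population forms); the data list is hypothetical.

import statistics

data = [4, 8, 6, 5, 3, 7, 9, 5]   # hypothetical observations
n = len(data)
mean = sum(data) / n

value_range = max(data) - min(data)                 # Range = max - min
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                                       # Inter-quartile range = Q3 - Q1
mad = sum(abs(x - mean) for x in data) / n          # Mean Absolute Deviation
variance = sum((x - mean) ** 2 for x in data) / n   # Variance (sigma squared)
std_dev = variance ** 0.5                           # Standard deviation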

Week 9

V. Probability

Probability theory, commonly known as "probability," is a branch of mathematics concerned with the analysis of random phenomena. Simply, it tells us how likely an event is to happen. The classic example for understanding probability is flipping a coin, where there are two possible outcomes: heads or tails (Basic Formula: Probability of an event P(A) = (no. of ways it can happen) / (total no. of outcomes)).

a. Permutations and Combinations. The various ways in which objects from a set may be selected, generally without replacement, to form subsets. Such a selection is called a permutation when the order of selection is a factor; otherwise, it is a combination.

Permutation: nPk = n! / (n – k)!

The symbol nPk reads "n permute k." The expression n! (read "n factorial") indicates that all the consecutive positive integers from 1 up to and including n are to be multiplied together, and 0! is defined to equal 1.

Combination: nCk = n! / (k! (n – k)!)

For combinations, k objects are selected from a set of n objects to produce subsets without ordering. The number of such subsets is denoted by nCk, read "n choose k." Since any k objects have k! arrangements, there are k! indistinguishable permutations for each choice of k objects.

The formulas for nPk and nCk are called counting formulas, since they can be used to count the number of possible permutations or combinations in a given situation without listing them all.
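A minimal sketch of the counting formulas, assuming Python 3.8+ (which provides math.perm and math.comb):

import math

n, k = 10, 3
print(math.factorial(n) // math.factorial(n - k))   # nPk = n!/(n-k)! = 720
print(math.perm(n, k))                              # built-in equivalent: 720
print(math.comb(n, k))                              # nCk = n!/(k!(n-k)!) = 120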

Week 9

VI. Normal Distribution

a. The Normal Distribution. When data are continuous, one associates the distribution with a curve rather than a histogram. There are many theoretical continuous distributions, and the normal distribution is one such continuous distribution. The normal distribution is characterized by two parameters: (1) the mean; and (2) the standard deviation of the distribution.

Normal distributions have the following features (see Figure 3):

1) a symmetric bell shape;

2) the mean and median are equal, both located at the center of the distribution;

3) about 68% of the data fall within one standard deviation of the mean;

4) about 95% fall within two standard deviations of the mean; and

5) about 99.7% fall within three standard deviations of the mean.

Figure 3. The Normal Distribution Curve
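The 68-95-99.7 percentages can be verified numerically. The sketch below is a minimal check using scipy.stats.norm (assuming SciPy is installed); the mean of 100 and standard deviation of 15 are arbitrary illustrative values.

from scipy.stats import norm

mu, sigma = 100, 15
for k in (1, 2, 3):
    # probability of falling within k standard deviations of the mean
    p = norm.cdf(mu + k * sigma, mu, sigma) - norm.cdf(mu - k * sigma, mu, sigma)
    print(f"within {k} SD: {p:.4f}")   # ~0.6827, 0.9545, 0.9973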

b. Why the normal distribution is the most important curve in statistics. One reason is that many variables are normally distributed, or at least approximately so, such as heights, weights, and examination scores. Another reason is that it is easy for statisticians to work with, because a number of inferential statistical tools are based on the assumption that the data come from normal distributions. But the most important reason is that the sample mean for large samples tends to be normally distributed regardless of the original population from which the sample values came.

c. Measure of Skewness. To assess whether a distribution is skewed or asymmetric, one may calculate a measure of shape. One method is to compute skewness as the ratio (Upper Quartile – Median) / (Median – Lower Quartile). Another measure of skewness is the difference Mean – Median, which is zero for symmetric data, positive for right-skewed data, and negative for left-skewed data.

d. Kurtosis. Measures whether the data are heavy-tailed or light-tailed relative to the normal distribution. Data sets with high (positive) kurtosis tend to have heavy tails, or outliers, while data sets with low (negative) kurtosis tend to have light tails, or a lack of outliers.

When the value of (excess) kurtosis is zero, the distribution is mesokurtic. A positive value indicates a leptokurtic distribution, and a negative value a platykurtic one.
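A sketch of these shape measures using scipy.stats (assuming SciPy is installed); note that scipy's kurtosis() reports excess kurtosis by default, so zero corresponds to mesokurtic. The data list is hypothetical.

from scipy.stats import skew, kurtosis

data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 12]   # hypothetical right-skewed list
print(skew(data))       # positive here, indicating right (positive) skew
print(kurtosis(data))   # positive here, indicating a leptokurtic (heavy-tailed) shape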

Weeks 11 to 12

VII. Regression and Correlation

a. Pearson Product-Moment Correlation (r). Two variables are said to be associated if knowing the value of one of them tells us something about the value of the other. The degree to which two variables are associated is measured by a quantity known as the correlation coefficient. Two variables that are positively correlated tend to move together in the same direction (x+, y+). On the other hand, two variables are negatively correlated if their values tend to move in opposite directions (x+, y–) (see Figure 4 for the formula).

Figure 4. Pearson Product-Moment Correlation Formula
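A minimal sketch computing r, and a p-value for its significance (anticipating the next subsection), with scipy.stats.pearsonr; the x and y lists are hypothetical figures, e.g. advertising spend and sales.

from scipy.stats import pearsonr

x = [10, 12, 15, 17, 20, 22, 25]   # hypothetical advertising spend
y = [30, 34, 40, 44, 52, 55, 60]   # hypothetical sales

r, p_value = pearsonr(x, y)
print(r, p_value)   # r near +1 indicates a strong positive correlation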

b. Test of Significance of Correlation Coefficient.

b.1. Using a Table of Critical Values. The table of 95% critical values of the sample correlation coefficient can be used to give a good idea of whether the computed value of r is significant. Compare r to the appropriate critical value in the table: if r is not between the positive and negative critical values, then the correlation coefficient is significant, and one may want to use the line for prediction.

Process: Compare the computed r to the critical values associated with the degrees of freedom (df = n – 2). If r < the negative critical value or r > the positive critical value, then r is significant; otherwise, it is not.

URL link for the table of critical values:

https://www.statisticssolutions.com/table-of-critical-values-pearson-correlation/

c. Simple Linear Regression Analysis. When the points in a scatterplot generally cluster about a line, it is of interest to estimate that line in order to estimate the expected level of the continuous dependent variable for a known specific value of the independent variable. This statistical tool is called simple linear regression.

Formula: Ŷ = â + b̂X

b̂ = (n ∑ XᵢYᵢ – ∑ Xᵢ ∑ Yᵢ) / (n ∑ Xᵢ² – (∑ Xᵢ)²) = r (Sy / Sx)

â = ∑ Yᵢ / n – b̂ (∑ Xᵢ / n) = Ȳ – b̂ X̄
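A sketch of these formulas in plain Python; x and y are hypothetical paired observations.

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)

b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope b-hat
a = sy / n - b * (sx / n)                       # intercept a-hat = Ybar - b*Xbar
print(f"Y-hat = {a:.3f} + {b:.3f} X")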

d. Sum of Squares. A statistical tool used to identify the dispersion of the data, as well as how well the data fit the model, in regression analysis. It is one of the most important outputs of regression analysis. The general rule is that a smaller residual sum of squares indicates a better model, as there is less unexplained variation in the data.

Types of Sum of Squares

1. Total Sum of Squares. Measures the variation of the values of the dependent variable around the sample mean of the dependent variable.

Formula: TSS = ∑ (yᵢ – ȳ)² = SSR + SSE

2. Regression Sum of Squares. Describes how well the regression model represents the modeled data. A higher regression sum of squares, relative to the total, indicates that the model explains more of the variation in the data.

Formula: SSR = ∑ (ŷᵢ – ȳ)²

3. Residual Sum of Squares (Sum of Squared Errors). Measures the variation of the modeling errors; in other words, it depicts how much of the variation in the dependent variable cannot be explained by the regression model. Generally, a lower residual sum of squares indicates that the model explains the data well, while a higher residual sum of squares indicates that it explains the data poorly.

Formula: SSE = ∑ (yᵢ – ŷᵢ)²

4. The Standard Error of Estimate. Measures the accuracy of prediction.

Formula: σest = √( ∑ (yᵢ – ŷᵢ)² / n )

5. Coefficient of Determination. Measures the proportion of the variance in the dependent variable that is predictable from the independent variable. Values of 1 or 0 indicate that the regression line represents all or none of the variation in the data, respectively. A higher coefficient indicates a better goodness of fit for the observations.

Formula: r² = SSR / TSS, the square of the correlation coefficient, often expressed as a percentage
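The sketch below is a self-contained illustration, on hypothetical data, of the quantities defined above: it fits a least-squares line and then computes TSS, SSR, SSE, r², and the standard error of estimate.

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# fit the least-squares line (same formulas as in the regression section)
sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi ** 2 for xi in x)
b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a = sy / n - b * (sx / n)

y_hat = [a + b * xi for xi in x]
y_bar = sy / n

tss = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # regression sum of squares
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # residual sum of squares

print(abs(tss - (ssr + sse)) < 1e-9)   # TSS = SSR + SSE holds
print(ssr / tss)                       # coefficient of determination r^2
print((sse / n) ** 0.5)                # standard error of estimate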

Weeks 13 to 14

VIII. Hypothesis Testing

a. Hypothesis Test: Basic Idea. Statistical hypothesis testing involves two competing claims (statements regarding a population parameter of interest) and making a decision to accept one of these claims on the basis of evidence, and of the uncertainty in that evidence. Hypothesis testing is subject to errors, the chances of which one would like to keep small.

1. Hypothesis Testing Using the p-value. This value represents the chance of generating a value as extreme as the observed value of the test statistic, or something more extreme, if the null hypothesis is true. One may use the p-value in combination with the level of significance to decide whether or not to reject the null hypothesis. If the p-value is less than the level of significance, usually 5% (p < 0.05), then one may reject the null hypothesis; in that case, the result is statistically significant at the 5% level.

2. Confidence Interval: Basic Idea. These are viewed as interval estimates for the population mean, as they provide a band (range) of values within which one is confident that the true value of the population mean lies. Such interval estimates also give us a sense of the precision as well as the accuracy of "point estimates." Whenever interval estimates are generated, the confidence level should be stated. Confidence levels, like chances, are between 0 and 100%. Commonly used values of the confidence level are 67%, 90%, 95%, and 99%.
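A minimal sketch of a 95% confidence interval for a population mean, using the t distribution from scipy.stats (assuming SciPy is installed); the sample is hypothetical.

from scipy.stats import t
import statistics

sample = [23, 25, 28, 30, 26, 27, 24, 29]   # hypothetical sample
n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / n ** 0.5    # standard error of the mean
margin = t.ppf(0.975, df=n - 1) * se        # half-width at 95% confidence
print(f"95% CI: ({mean - margin:.2f}, {mean + margin:.2f})")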

3. One-Sample t Test. A statistical procedure used to determine whether a sample of observations could have been generated by a process with a specific mean (steps will be discussed in the class lecture).
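As a preview of those steps, a minimal sketch with scipy.stats.ttest_1samp, testing whether a hypothetical sample could come from a process with mean 25:

from scipy.stats import ttest_1samp

sample = [23, 25, 28, 30, 26, 27, 24, 29]   # hypothetical sample
t_stat, p_value = ttest_1samp(sample, popmean=25)
print(t_stat, p_value)   # reject H0 at the 5% level if p_value < 0.05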

4. Z Test for a Proportion. A statistical procedure used when the sample proportion has a distribution that can be readily approximated by a normal curve. The Z statistic can also be used for situations involving binary classification and counts of the binary classes (steps will be discussed in the class lecture).
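A minimal sketch of the normal-approximation computation behind a one-sample proportion test; the counts and the hypothesized proportion p0 are hypothetical.

from scipy.stats import norm

successes, n, p0 = 60, 100, 0.5
p_hat = successes / n
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5   # z statistic under H0: p = p0
p_value = 2 * (1 - norm.cdf(abs(z)))            # two-sided p-value
print(z, p_value)   # here z = 2.0, p is about 0.0455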

Weeks 15 to 16

IX. Analysis of Variance (ANOVA)

a. ANOVA. Used when one would like to decide whether observed differences among more than two sample means can be attributed to chance, or whether they reflect actual differences among the means of the populations from which the data are sampled. One-way ANOVA is one statistical procedure for analyzing differences among sample means; the other is two-way ANOVA, which is used to compare mean differences between groups that have been split on two independent variables (steps will be discussed in the class lecture).
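As a preview, a minimal sketch of one-way ANOVA with scipy.stats.f_oneway; the three groups are hypothetical samples, e.g. sales under three different promotions.

from scipy.stats import f_oneway

group_a = [20, 22, 19, 24, 25]
group_b = [28, 30, 27, 26, 29]
group_c = [18, 20, 22, 19, 21]

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)   # a small p-value suggests at least one group mean differs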

Weeks 17 to 18

X. Chi-square Tests

a. Chi-square Statistic. A summary measure of how far the observed counts in each category are from their expected values. One advantage of the test is that it may be employed to determine the independence of variables. It may also be used to determine the goodness of fit to a hypothesized distribution (steps will be discussed in the class lecture).
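As a preview, a minimal sketch of a chi-square test of independence on a hypothetical 2x2 contingency table, using scipy.stats.chi2_contingency:

from scipy.stats import chi2_contingency

observed = [[30, 10],   # hypothetical counts, e.g. rows: male / female
            [20, 40]]   # columns: prefers product A / product B
chi2, p_value, dof, expected = chi2_contingency(observed)
print(chi2, p_value)    # a small p-value suggests the variables are not independent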

