
Bus Stat. 11

This document outlines a business statistics course at Apayao State College in the Philippines. The course is 3 credit units and meets twice a week for 1.5 hours each session. The course objectives are to develop students' abilities in creating and interpreting data distributions and graphs, applying probability theory, and using linear correlation, regression, and hypothesis testing. The course covers topics such as measures of central tendency, measures of location and dispersion, probability, correlation, regression, and hypothesis testing. Assignments include exercises to develop computational skills for solving mathematical problems in statistics.

Uploaded by

Shela Ramos
Copyright
© All Rights Reserved

Republic of the Philippines

APAYAO STATE COLLEGE


Conner, Apayao, Philippines 3807
asc.edu.ph/www.facebook.com/asceduofficial.

Course Code: BUS STAT 11 Semester: First


Course Title: Business Statistics Day: TTh
Credit: 3.0 units Time: 1:00 -2:30
Course Objective
At the end of the course, the student should be able to:
CLO 1: Create and interpret frequency distribution and graphs representing data sets

CLO 2: Recognize and apply the basic definitions and rules of probability theory

CLO 3: Read and interpret the results of linear correlation and regression

CLO 4: Independently solve mathematical problems applying computational skills and assessing the results.

Course Coverage

1. Introduction to Statistics
1.1. Definition of Statistical Terms
1.2. Levels of Measurement
2. Organization of Data and Sampling Methods
2.1. Methods of Data Collection
2.2. Sampling
2.3. Methods of Data Presentation
3. Measures of Central Tendency
3.1. Ungrouped Data
3.2. Grouped Data
4. Measures of Location of Data
4.1. Percentile, Quartile, Interquartile Range, and Decile
5. Measures of Dispersion
5.1. Ungrouped Data
5.2. Grouped Data
6. Hypothesis Testing
6.1. Basic Concepts of Statistical Hypothesis Testing
6.2. ANOVA
6.3. Chi-Square
7. Linear Correlation and Regression
7.1. Coefficient of correlation
7.2. Testing the Significance of the Correlation Coefficient
7.3. Linear Regression
7.4. Using Regression to Develop a Forecasting Trend Line
_______________________________________________________________________________________________________________________________________
_______________________________________________________________________________________________________________________________________
BUSINESS STATISTICS ERIC P. SUPANGA
1st Semester, S.Y. 2020-2021 Instructor
Bachelor of Business Administration Cp#: 09752410538
[email protected]
Page 1 of 48
Module 1

Introduction to Business Statistics

At the end of the chapter, the students should be able to:

• define some statistical terms;
• state the importance of statistics;
• classify data as nominal, ordinal, interval, or ratio.

Lesson 1. Definition of Terms Related to Statistics

Statistics is a branch of mathematics that deals with the collection, organization, presentation, analysis,
and interpretation of data with the purpose of describing and drawing inferences about the numerical properties
of a population.

Two Types of Statistics

1. Descriptive Statistics. Methods of organizing, summarizing, and presenting data in an informative way.
2. Inferential Statistics. Methods used to find out something about a population, based on a sample.

Two important terms that you should understand in studying statistics are population and sample.

In statistics, a population does not only mean a group of people; it also means a defined group or aggregate
of objects, animals, materials, measurements, “things”, “events”, or “happenings” of any kind. It is the collection
of all possible individuals, objects, or measurements of interest. Thus, a sack of rice, a whole pizza pie, or a set of
weights and heights is considered a population.

Since it would be impractical to study the whole population, as in the case of a sack of rice, it is
necessary to take just a sample of the population. Thus, a handful of rice is a sample of the population in a sack
of rice. A sample is defined as any subgroup of the population drawn by some appropriate method from the
population. It should be representative of the population; that is, the sample should show the properties of the
population.

Types of Variables

1. A qualitative variable is obtained from a qualitative population. When the characteristic or variable
being studied is nonnumeric, it is called a qualitative variable or an attribute. Examples: civil status, gender,
hair colour, etc.
2. A quantitative variable is one whose values can be reported numerically. Examples:
age, scores, height, length, weight, and anything else that can be quantified.

Quantitative variables can be classified into:

a. Discrete variables can assume only certain values, and there are usually gaps between the values.
Examples: the number of chairs in a room, the number of students in a class, the number of employees
in an office, etc.
b. Continuous variables can assume any value within a specific range. Examples: the weight of a
shipment of apples, the length of a lawn, the height of a man, etc.

Lesson 2: Levels of Measurement

1. Nominal level: In the nominal level of measurement, the observations can only be classified or counted.
There is no particular order to the labels. Examples: the number on the back of a basketball player's jersey,
which helps the referee identify the particular player; gender; civil status; etc.
2. Ordinal level: In the ordinal level of measurement, data categories are ranked or ordered.
Examples: the rating given by students to a professor during an evaluation; the honors
given to students at graduation (first honor, second honor, etc.).
3. Interval level: The interval level of measurement includes all the characteristics of the ordinal
level, but in addition, the difference between values is a constant size. Examples: scores of the students in
an examination, IQ scores, etc.
4. Ratio level: The ratio level is the highest level of measurement. It has all the characteristics
of the interval level, but in addition, the zero point is meaningful and the ratio between two numbers is
meaningful. Examples are wages, height, and weight. Money is a good example, because if one has no
money, the amount is truly zero.

Module 1

Introduction to Business Statistics

Name __________________________________________________________

Course and Year _________________________________________________

Exercise 1.1

Determine the level of measurement of each of the following:

1. _________________________ Civil status of a man
2. _________________________ Students’ scores on the final examination
3. _________________________ The citizenship of a person
4. _________________________ The time a student spends in an internet café
5. _________________________ The classification of students by state of birth
6. _________________________ The rating given by the students to their professor

For each of the following, tell whether it is a population or a sample:

7. _________________________ The total number of students in a mathematics class
8. _________________________ Forty of the students are chosen to represent the student body
9. _________________________ The senior citizens of the city were the recipients of the housing project of
the local government.

Module 2
Organization of Data and Sampling Methods

At the end of the chapter, the students should be able to:

• apply the basic statistical concepts and principles in the collection of data;
• identify the type of sampling used in statements;
• present data using the textual, tabular, and graphical methods.

Lesson 1: Organize Data into a Frequency Distribution

Frequency Distribution. A frequency distribution is a grouping of data into mutually exclusive categories,
showing the number of observations in each category. It is a summary of data presented in the form of class
intervals and frequencies.

Example 1:

Table 1.1 60 Years of Unemployment Data (Raw Data)

2 5 7 8 10
2 5 7 8 10
3 5 7 8 10
3 5 7 8 10
4 6 7 9 10
4 6 8 9 10
4 6 8 9 11
4 6 8 9 12
5 6 8 9 12
5 6 8 9 12
5 6 8 9 12
5 7 8 9 12
Steps in Organizing Data into a Frequency Distribution
Step 1. Determine the range.

The range is the difference between the highest and the lowest number in a set of data.

Based on Table 1.1, the range is 12 – 2 = 10.

Step 2. Determine the number of classes. One rule of thumb is to select between 5 and 15 classes.
To approximate the class width or size, divide the range by the desired number of classes.

For example, if we decide to have 6 classes, we divide the range by 6:

10/6 ≈ 1.67, which is rounded to 2; normally, the class size is rounded to the nearest whole number.

Step 3. Tally. The following table summarizes the raw (ungrouped) data into a frequency distribution.

Table 1.2 Frequency Distribution

Class Interval    Tally                      Frequency
2-3               ////                       4
4-5               /////-/////-//             12
6-7               /////-/////-///            13
8-9               /////-/////-/////-////     19
10-11             /////-//                   7
12-13             /////                      5
Total                                        60
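
The three steps above can be sketched in Python. This is only an illustration and not part of the original module: the function name frequency_distribution is hypothetical, the data are the whole numbers of Table 1.1, and the class width is assumed to be the smallest whole number that lets the chosen number of classes cover every value (which gives a width of 2 here).

```python
# Raw data from Table 1.1 (60 observations), typed row by row.
raw_data = [
    2, 5, 7, 8, 10,
    2, 5, 7, 8, 10,
    3, 5, 7, 8, 10,
    3, 5, 7, 8, 10,
    4, 6, 7, 9, 10,
    4, 6, 8, 9, 10,
    4, 6, 8, 9, 11,
    4, 6, 8, 9, 12,
    5, 6, 8, 9, 12,
    5, 6, 8, 9, 12,
    5, 6, 8, 9, 12,
    5, 7, 8, 9, 12,
]


def frequency_distribution(data, num_classes):
    """Return (lower limit, upper limit, frequency) for each class.

    Hypothetical helper for whole-number data only.
    """
    lowest, highest = min(data), max(data)
    data_range = highest - lowest              # Step 1: range
    # Step 2 (assumption): smallest whole width that covers every value.
    width = data_range // num_classes + 1
    table = []
    for k in range(num_classes):
        lower = lowest + k * width
        upper = lower + width - 1              # inclusive upper class limit
        frequency = sum(1 for x in data if lower <= x <= upper)  # Step 3: tally
        table.append((lower, upper, frequency))
    return table


for lower, upper, frequency in frequency_distribution(raw_data, 6):
    print(f"{lower}-{upper}: {frequency}")
```

With 6 classes, the sketch reproduces the class intervals and frequencies of Table 1.2.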

Class Midpoint. The class midpoint, sometimes called the class mark, is the value halfway
across the class interval. It is calculated as the average of the two class endpoints.

Example: The class mark of the class 2-3 is (2 + 3) / 2 = 2.5; the class mark of the class 4-5 is (4 + 5) / 2 = 4.5,
etc.

Table 1.3 Sample class mark or class midpoint

Class Interval    Class Mark (M)    Frequency
2-3               2.5               4
4-5               4.5               12
6-7               6.5               13
8-9               8.5               19
10-11             10.5              7
12-13             12.5              5
Total                               60

Relative Frequency. The relative frequency is the proportion of the total frequency that falls in a given class
interval of a frequency distribution. Example: the relative frequency of the class with frequency 4 is 4/60 or .0667;
the relative frequency of the class with frequency 12 is 12/60 = .2000, etc.

Table 1.4 Example of class mark, relative frequency, and cumulative frequency

Class Interval    Class Mark (M)    Frequency    Relative Frequency    Cumulative Frequency
2-3               2.5               4            .0667                 4
4-5               4.5               12           .2000                 16
6-7               6.5               13           .2167                 29
8-9               8.5               19           .3167                 48
10-11             10.5              7            .1167                 55
12-13             12.5              5            .0833                 60
Total                               60

The cumulative frequency is a running total of the frequencies through the classes of a frequency
distribution. Example: based on Table 1.4, the cumulative frequencies are 4; 4 + 12 = 16; 16 + 13 = 29; 29 + 19 =
48; 48 + 7 = 55; 55 + 5 = 60.
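
The class mark, relative frequency, and cumulative frequency columns can be checked with a few lines of Python. This is a small sketch rather than a prescribed method; the variable names are illustrative, and the class limits and frequencies are those of the frequency distribution above.

```python
from itertools import accumulate

# Class limits and frequencies taken from the frequency distribution above.
classes = [(2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13)]
frequencies = [4, 12, 13, 19, 7, 5]

total = sum(frequencies)

# Class mark: average of the two class endpoints.
class_marks = [(lower + upper) / 2 for lower, upper in classes]

# Relative frequency: class frequency divided by the total, to 4 decimals.
relative = [round(f / total, 4) for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

for (lower, upper), m, f, rf, cf in zip(classes, class_marks, frequencies,
                                        relative, cumulative):
    print(f"{lower}-{upper}  M={m}  f={f}  rf={rf:.4f}  cf={cf}")
```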

Sampling Method

Lesson 1. Sampling Method

Sampling is widely used in business as a means of gathering useful information about a population.
Data are gathered from a sample, and conclusions are drawn about the population as part of the inferential
statistics process.

Estimating Sample Size

Researchers may use the following formula by Slovin:

$n = \frac{N}{1 + N e^{2}}$

Where

N = total population

n = sample size

e = margin of error (use 5%)

Example: Determine the sample size given a population (N) of 1,000 and a margin of error of 5%.

Using the formula:

$n = \frac{1{,}000}{1 + 1{,}000(0.05)^{2}} = \frac{1{,}000}{1 + 2.5} = \frac{1{,}000}{3.5} \approx 286$

Therefore, the sample size is 286. This means that 286 units will be drawn from the population of 1,000.
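
The same computation can be written as a short Python function. This is a minimal sketch: the function name slovin_sample_size is hypothetical, and rounding the result up to the next whole unit is an assumption (the example above reports 285.71 as 286).

```python
import math


def slovin_sample_size(population, margin_of_error=0.05):
    """Sample size from Slovin's formula, n = N / (1 + N * e**2)."""
    n = population / (1 + population * margin_of_error ** 2)
    # Assumption: round up so the sample is never smaller than the formula asks.
    return math.ceil(n)


print(slovin_sample_size(1000))  # N = 1,000 and e = 5%, as in the example
```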

Two main types of sampling

a. Random sampling
b. Nonrandom sampling
a. Random sampling. In random sampling, every unit of the population has the same probability of being
selected into the sample.

1. Simple random sampling. This can be viewed as the basis for the other sampling techniques. With simple
random sampling, each unit of the frame is numbered from 1 to N (where N is the size of the
population). Then a table of random numbers or a random number generator is used to determine the
units to be included in the sample.
2. Stratified random sampling. A stratified random sampling method first divides the population
into homogeneous subgroups, called strata, from which simple random samples are then drawn.
3. Systematic random sampling. A random sampling method whereby every Kth item is
selected to produce a sample of size n from a population of size N.
Determining the value of K:
K = N/n

Where
N = population size
n = sample size
K = size of interval for selection
4. Cluster (or area) random sampling. Cluster (or area) sampling involves dividing the
population into non-overlapping areas, or clusters.
b. Nonrandom sampling. Sampling techniques that select elements from the population by any
mechanism that does not involve a random selection process are called nonrandom sampling techniques. The
following are the nonrandom sampling techniques.
1. Convenience sampling. In convenience sampling, elements for the sample are selected for the convenience
of the researcher. For example, a convenience sample of homes for door-to-door interviews might include
houses where people are at home, houses near the street, first-floor apartments, houses with
friendly people, etc. Using the telephone directory to gauge the popularity of the president of the
country is also an example of convenience sampling.
2. Judgment sampling or purposive sampling occurs when elements selected for the sample are
chosen by the judgment of the researcher. For example, when a researcher is studying the
activities of the retired employees of a certain company, judgment or purposive sampling is
needed because the subjects of the study must be people who have retired.
3. Quota sampling is similar to stratified sampling, except that instead of
randomly sampling from each stratum, the researcher uses a nonrandom sampling method to
gather data from each stratum until the desired quota of samples is filled. For example, suppose a
researcher wants to stratify the population into owners of different types of cars. The researcher will
interview car owners until the quota for each brand is filled.
4. Snowball sampling. In snowball sampling, subjects are selected through referrals from other
survey respondents. The researcher identifies a person who fits the profile of subjects wanted
for the study. The researcher then asks this person for the names and locations of others who
would also fit the profile of the subjects wanted for the study.
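
Three of the random sampling techniques described above can be sketched with Python's standard random module. This is an illustrative sketch only: the function names are hypothetical, the random starting point in the systematic sample is an assumption (the text only specifies taking every Kth item), and drawing the same number of units from every stratum is also an assumption.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame, each unit equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


def systematic_sample(frame, n, seed=None):
    """Take every Kth unit, where K = N / n (whole-number part)."""
    rng = random.Random(seed)
    k = len(frame) // n                 # K = N / n
    start = rng.randrange(k)            # assumption: random start in first interval
    return [frame[start + j * k] for j in range(n)]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from each stratum (equal allocation assumed)."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum)
            for name, units in strata.items()}


# Usage with a hypothetical frame numbered 1 to 1,000.
frame = list(range(1, 1001))
print(simple_random_sample(frame, 5, seed=1))
print(systematic_sample(frame, 5, seed=1))
print(stratified_sample({"small": frame[:500], "large": frame[500:]}, 3, seed=1))
```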

Module 2

Organization of Data and Sampling Methods

Name __________________________________________________________

Course and Year _________________________________________________


Exercise 2.1.
For the following problems, construct the frequency distribution, reflecting in the same table the class mark,
relative frequency, and cumulative frequency. Construct the graph of each.

1. The following data represent the number of passengers per flight in a sample of 50 flights from Legaspi
to Manila and then to Puerto Princesa. (Use 5 classes.)

23 34 66 67 13 58 19 17 65 17
25 20 47 28 16 38 44 29 48 29
69 34 35 60 37 52 80 59 51 33
48 46 23 38 52 50 17 57 41 77
45 47 49 19 32 64 27 61 70 19

2. For the following data, construct a frequency distribution with 6 classes

57 23 35 18 21
26 51 47 29 21
46 43 29 23 39
50 41 19 36 28
31 42 52 29 18
28 46 33 28 20

Sampling Method

Exercise 2.2.

1. For each of the following research problems, determine what sampling method/s should be used.
___________________a. A citywide study of motels and hotels is being conducted.
___________________b. A study of consumers’ attitudes and behavior.
___________________c. A researcher would like to determine the popularity of a candidate.
___________________d. A study of the retired employees of private educational institutions.
2. For each of the following research problems, list some strata into which the variable can be divided.
___________________a. Age of the respondents
___________________b. Size of the company (sales volume)
___________________c. Geographic location
___________________d. Occupation of the respondents
___________________e. Types of business

3. For each of the following research projects, list at least one area or cluster that could be used in obtaining the
sample.
_________________________ A study of road conditions of the city
_________________________ A study of the effects of the cement factory of the place

Module 3

Measures of Central Tendency

Objective

• Apply operations involving summation;
• Compute and interpret the different measures of central tendency.

Measures of Central Tendency (Ungrouped Data)

Definition. A measure of central tendency is a single value that summarizes a set of data. Measures of central
tendency yield information about the center, or middle part, of a group of numbers.

Mean

The arithmetic mean is the average of a group of numbers and is computed by summing all numbers and
dividing by the number of values. Because the arithmetic mean is so widely used, most statisticians refer to it
simply as the mean.

The population mean is defined as:

Population Mean = (sum of all the values in the population) / (number of values in the population)

Formula (1): Population Mean $\mu = \frac{\sum x}{n}$

Where:

μ represents the population mean. It is the Greek lowercase letter “mu”.

n is the number of items in the population

x represents any particular value

∑ is the Greek capital letter “sigma” and indicates the operation of adding

∑x is the sum of the x values
Example 1. There are 12 automobile companies in Albay. Listed below is the number of patents granted by the
government to each automobile company.

Company    Number of patents granted    Company    Number of patents granted
A          511                          G          210
B          385                          H          97
C          275                          I          50
D          257                          J          36
E          249                          K          23
F          234                          L          13

Is this information a population? What is the mean number of patents granted?

Solution:

This is a population because we are considering all the automobile companies of Albay obtaining patents. To
obtain the mean, we get the total number of patents granted and divide by the number of companies.
Using formula (1), we have:

$\mu = \frac{511 + 385 + 275 + \dots + 13}{12} = \frac{2340}{12} = 195$

How do we interpret the value of 195? The average number of patents received by an automobile company is
195. Because we consider all the companies receiving patents, this value is a population parameter.
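
The arithmetic in this example can be verified with a short Python sketch. The dictionary name patents is illustrative; the values are the twelve counts listed in the table above.

```python
# Patents granted to each of the 12 automobile companies (from the table above).
patents = {
    "A": 511, "B": 385, "C": 275, "D": 257, "E": 249, "F": 234,
    "G": 210, "H": 97, "I": 50, "J": 36, "K": 23, "L": 13,
}

total_patents = sum(patents.values())       # sum of the x values
number_of_companies = len(patents)          # number of items in the population
population_mean = total_patents / number_of_companies

print(total_patents, number_of_companies, population_mean)
```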

Sample Mean

The sample mean is the sum of all the sampled values divided by the total number of sampled values.

Formula (2): Sample Mean $\bar{x} = \frac{\sum x}{n}$

Where

$\bar{x}$ stands for the sample mean

n is the total number of values in the sample

x is any particular value

∑x is the sum of the x values

The mean of a sample, or any other measure based on sample data, is called a statistic.
Example: A random sample of six bonds revealed the following interest rates.

Issue     Interest Rate (%)
Bond A    9.50
Bond B    7.25
Bond C    6.50
Bond D    4.75
Bond E    12.00
Bond F    8.30

What is the mean interest rate on this sample of long-term bonds?

Solution: Using formula (2), the sample mean is:

$\bar{x} = \frac{\sum x}{n} = \frac{9.50 + 7.25 + 6.50 + 4.75 + 12.00 + 8.30}{6} = \frac{48.3}{6} = 8.05$

The mean interest rate of the sample of long-term bonds is 8.05%.

The mean is affected by each and every value, which is considered an advantage. It is also a disadvantage,
because an extremely large or small value can pull the mean toward the extreme value.

The mean is the most commonly used measure of central tendency because it uses every data item in its
computation, it is a familiar measure, and it has mathematical properties that make it attractive to use in
inferential statistical analysis.
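
The pull of an extreme value on the mean can be seen in a small Python sketch. The first list holds the six bond rates from the example above; the seventh rate of 40.00% is a made-up value added only for illustration and does not come from the module.

```python
from statistics import mean

# Interest rates (%) of the six sampled bonds from the example above.
bond_rates = [9.50, 7.25, 6.50, 4.75, 12.00, 8.30]

# Hypothetical extreme value, invented for this illustration only.
rates_with_extreme = bond_rates + [40.00]

mean_original = round(mean(bond_rates), 2)
mean_with_extreme = round(mean(rates_with_extreme), 2)

print(mean_original)       # mean of the six original rates
print(mean_with_extreme)   # mean after one extreme rate is added
```

A single unusually high rate moves the mean well above every one of the six original rates except the largest, which is the disadvantage described above.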

Mode

The mode is the most frequently occurring value in a set of data. If there are two modes in the set of
data, the data are said to be bimodal. A data set with more than two modes is referred to as multimodal.

Example 1. Data set: 15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 19, 16, 5, 7, 16, 8, 9, 20, 4

Solution: 16 is the mode because 16 occurs four times in the data set, more often than any other value.

Example 2. Data set: 15, 11, 14, 3, 21, 17, 22, 16, 19, 22, 16, 5, 22, 7, 16, 8, 9, 20, 4

Solution: The data set is bimodal because 16 and 22 each appear three times in the data set, more often than
any other value.
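
Both mode examples can be checked with Python's standard library. This is a sketch for verification only; it assumes Python 3.8 or later, where statistics.multimode is available, and the variable names are illustrative.

```python
from collections import Counter
from statistics import multimode

data_1 = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 19, 16, 5, 7, 16, 8, 9, 20, 4]
data_2 = [15, 11, 14, 3, 21, 17, 22, 16, 19, 22, 16, 5, 22, 7, 16, 8, 9, 20, 4]

# multimode returns every value that shares the highest count.
modes_1 = sorted(multimode(data_1))
modes_2 = sorted(multimode(data_2))

print(modes_1, Counter(data_1)[16])   # a single mode and how often it occurs
print(modes_2)                        # two modes, so the data set is bimodal
```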

Median

The median is the middle value in an ordered array of numbers. The following steps are used to determine the
median.

Step 1. Arrange the observations in an ordered data array.

Step 2. For an odd number of terms, find the middle term of the ordered array. It is the median.

Step 3. For an even number of terms, find the average of the two middle terms. The average is the median.

Example 1. Suppose a business researcher wants to determine the median for the following numbers:

15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 16, 8, 9, 20, 4

Step 1. The researcher arranges the numbers in an ordered array.

3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 16, 17, 19, 20, 21, 22

Step 2. Since the array contains 17 terms (an odd number of terms), the median is the middle number, or 15.

Step 3. If the number 22 is removed from the data set, the array would contain only 16 terms.

3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 16, 17, 19, 20, 21

Step 4. Now, for an even number of terms, the statistician determines the median by averaging the
two middle values, 14 and 15. The resulting median is (14 + 15)/2 = 14.5.

Note: Another way to locate the median is to find the (n + 1)/2 term in an ordered array.

For example, for the data set above with 17 terms, (n + 1)/2 = 18/2 = 9; that is, the 9th term is the median. The
median is 15.

3, 4, 5, 7, 8, 9, 11, 14, (15), 16, 16, 16, 17, 19, 20, 21, 22

If there is an even number of terms, such as 16, the median location is (16 + 1)/2 = 8.5; the median for these data
is located halfway between the 8th and 9th terms, or the average of 14 and 15. Thus the median is (14 + 15)/2 = 14.5.

3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 16, 17, 19, 20, 21
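
The median steps and the (n + 1)/2 location rule can be sketched as a small Python function. The function name median_by_location is hypothetical and the code is an illustration of the steps above, not a replacement for them.

```python
def median_by_location(values):
    """Median of ungrouped data using the (n + 1) / 2 location rule."""
    ordered = sorted(values)                 # Step 1: ordered array
    n = len(ordered)
    location = (n + 1) / 2                   # position of the median (1-based)
    if n % 2 == 1:
        # Odd number of terms: the middle term is the median.
        return ordered[int(location) - 1]
    # Even number of terms: average the two terms around the location.
    lower_term = ordered[n // 2 - 1]
    upper_term = ordered[n // 2]
    return (lower_term + upper_term) / 2


numbers = [15, 11, 14, 3, 21, 17, 22, 16, 19, 16, 5, 7, 16, 8, 9, 20, 4]
print(median_by_location(numbers))           # 17 terms

numbers_without_22 = [x for x in numbers if x != 22]
print(median_by_location(numbers_without_22))  # 16 terms
```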

Measures of Central Tendency (Grouped Data)

Population MEAN

$\mu_{\text{grouped}} = \frac{\sum fM}{N} = \frac{\sum fM}{\sum f} = \frac{f_1 M_1 + f_2 M_2 + \dots + f_i M_i}{f_1 + f_2 + \dots + f_i}$

Where: i = the number of classes

f = class frequency

N = total frequency

M = class mark

Example 1:

Class Interval    Frequency (f)    Class Mark (M)    fM
1 – 3             4                2                 8
4 – 6             12               5                 60
7 – 9             13               8                 104
10 – 12           19               11                209
13 – 15           7                14                98
16 – 18           5                17                85
Total             ∑f = 60                            ∑fM = 564

$\mu = \frac{\sum fM}{N} = \frac{564}{60} = 9.4$

Step 1. Determine the class mark (M) of each interval.

Step 2. Multiply each class mark by the corresponding frequency (fM).

Step 3. Get the sum of the results from Step 2 (∑fM).

Step 4. Determine the mean by dividing ∑fM by ∑f: 564/60 = 9.4. Hence, μ = 9.4.

Sample Mean

Example 2.

(1)               (2)          (3)    (4)
Class Interval    Frequency    M      fM
10 – 14           6            12     72
15 – 19           22           17     374
20 – 24           35           22     770
25 – 29           29           27     783
30 – 34           16           32     512
35 – 39           8            37     296
40 – 44           4            42     168
45 – 49           2            47     94
Summation         ∑f = 122            ∑fM = 3,069

Solution: The computation is shown in the table of Example 2.

The mean is computed as follows:

Step 1. Determine the class mark (M) of each interval (column 3).

Step 2. Multiply each class mark by the corresponding frequency (column 4).

Step 3. Get the sum of the results from Step 2: ∑fM = 3,069.

Step 4. Substitute the values obtained into the following formula:

Grouped Mean $\bar{x} = \frac{\sum fM}{\sum f} = \frac{3{,}069}{122} = 25.16$
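
The four steps for the grouped mean can be sketched in Python and checked against both examples. The function name grouped_mean is hypothetical; the class limits and frequencies are those of Example 1 and Example 2.

```python
def grouped_mean(class_limits, frequencies):
    """Mean of grouped data: sum of f * M divided by sum of f."""
    # Step 1: class mark (M) of each interval.
    class_marks = [(lower + upper) / 2 for lower, upper in class_limits]
    # Steps 2 and 3: multiply each class mark by its frequency and add up.
    sum_fm = sum(f * m for f, m in zip(frequencies, class_marks))
    # Step 4: divide by the total frequency.
    return sum_fm / sum(frequencies)


example_1_limits = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18)]
example_1_freqs = [4, 12, 13, 19, 7, 5]

example_2_limits = [(10, 14), (15, 19), (20, 24), (25, 29),
                    (30, 34), (35, 39), (40, 44), (45, 49)]
example_2_freqs = [6, 22, 35, 29, 16, 8, 4, 2]

print(grouped_mean(example_1_limits, example_1_freqs))
print(round(grouped_mean(example_2_limits, example_2_freqs), 2))
```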
Mode: The mode for grouped data is the class midpoint of the modal class, which is the class interval with the
greatest frequency. In Example 2 above, the modal class is 20 – 24 (frequency 35). Hence, the mode = 22.

Median (Grouped Data). The middle value in an ordered array of numbers.

Formula: $Md = L + \left[\frac{\frac{N}{2} - cf_p}{f_{med}}\right](i)$

Where: L = lower limit of the median class

N/2 = 50% of the total frequency

cf_p = cumulative total of the frequencies up to but not including the frequency of the median class

f_med = the frequency of the median class

i = the class size

Example 1. Median (Grouped Data)

Class Interval    Frequency (f)    Cumulative Frequency
1 – 3             4                4
4 – 6             12               16
7 – 9             13               29
10 – 12           19               48
13 – 15           7                55
16 – 18           5                60
Total             ∑f = 60

Step 1: Determine N/2 = 60/2 = 30.

Step 2: Determine the lower limit of the median class. The median class is 10 – 12, the first class whose
cumulative frequency reaches 30, so L = 9.5.

Step 3: Determine the cumulative total of the frequencies up to but not including the frequency of the median
class (29).

Step 4: Determine the frequency of the median class (19).

Step 5: Determine the class size (3).

Step 6: Substitute the values obtained from Steps 1 to 5 into the formula:

$Md = 9.5 + \left[\frac{30 - 29}{19}\right](3) = 9.5 + \frac{3}{19} = 9.5 + 0.16 = 9.66$

Hence, Md = 9.66

Example 2: Median (Grouped Data)

Class Interval    Frequency    Cumulative Frequency
10 – 14           6            6
15 – 19           22           28
20 – 24           35           63
25 – 29           29           92
30 – 34           16           108
35 – 39           8            116
40 – 44           4            120
45 – 49           2            122
Total             ∑f = 122

Step 1. Determine the cumulative frequency.

Step 2. Determine N/2; the cumulative frequency up to but not including the frequency of the median class;
the lower limit of the median class; the frequency of the median class; and the class size. The median class is
20 – 24, the first class whose cumulative frequency reaches 61.

N/2 = 122/2 = 61

cf_p = 28

f_med = 35

L = 19.5

i = 5

Step 3. Substitute the above data into the formula:

$Md = L + \left[\frac{\frac{N}{2} - cf_p}{f_{med}}\right](i)$

$Md = 19.5 + \left[\frac{61 - 28}{35}\right](5) = 19.5 + \frac{33}{35}(5) = 19.5 + 4.71 = 24.21$

The median is 24.21.
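
The grouped median procedure used in the two examples can be sketched as one Python function. The function name grouped_median is hypothetical. The sketch assumes whole-number class limits, so the lower limit of the median class is taken as 0.5 below its stated lower limit (9.5 for the class 10 – 12), as in the worked examples.

```python
def grouped_median(class_limits, frequencies):
    """Median of grouped data: L + ((N/2 - cf_p) / f_med) * i."""
    half_total = sum(frequencies) / 2            # N / 2
    cumulative_before = 0                        # cf_p so far
    for (lower, upper), f in zip(class_limits, frequencies):
        if cumulative_before + f >= half_total:  # first class reaching N / 2
            lower_boundary = lower - 0.5         # assumption: whole-number limits
            class_size = upper - lower + 1       # i
            return lower_boundary + (half_total - cumulative_before) / f * class_size
        cumulative_before += f
    raise ValueError("frequencies must not be empty")


example_1 = grouped_median(
    [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18)],
    [4, 12, 13, 19, 7, 5],
)
example_2 = grouped_median(
    [(10, 14), (15, 19), (20, 24), (25, 29),
     (30, 34), (35, 39), (40, 44), (45, 49)],
    [6, 22, 35, 29, 16, 8, 4, 2],
)
print(round(example_1, 2), round(example_2, 2))
```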

Module 3

Measures of Central Tendency

Name __________________________________________________________

Course and Year _________________________________________________

Exercise 3.1

Apply the operations involving summation in each item.

1. 10, 5, 6, 10, 4, 10, 19, 11, 10

∑x =
2. 1, 2, 3, 4, 5, 6, 7, 10, 11, 13, 1, 15, 16
∑x2  ∑x  2 =
3. 16, 18, 10, 14, 18, 19, 10, 22, 44, 55, 12, 16
∑x2  2 =
4. 10, 5, 6, 10, 4, 1, 9, 11, 1
∑x(x)  2 =
5. 12, 17, 10, 16, 13, 19, 15, 16
x
∑x2  ∑x  2 2
=

Measures of Central Tendency (Ungrouped Data )

Exercise 3.2

Compute and interpret the different measures of central tendency. (Show your complete solution.)

1. A sample of households that subscribe to the United Bell Phone Company revealed the following
numbers of calls received last week.

52 43 30 38 30 42 12 46
34 46 32 18 41 5 39 37

2. The following data represent the number of passengers per flight in a sample of 50 flights from Legaspi City to Manila.

23 46 66 67 13 58 19 17 65 17
25 20 47 28 16 38 44 29 48 29
69 34 35 60 37 52 80 59 51 33
48 46 23 38 52 50 17 57 41 77
45 47 49 19 32 64 27 61 70 19

Measures of Central Tendency (Grouped Data)

Exercise 3.3

Compute and interpret the different measures of central tendency. (Show your complete solution.)

1. The Air Transport Association recorded the number of passengers arriving and departing at the busiest airport in Metro Manila. The following frequency distribution has been constructed.

Number of passengers arriving and departing    Frequency
30 – 31                                        5
32 – 33                                        7
34 – 35                                        15
36 – 37                                        21
38 – 39                                        34
40 – 41                                        24
42 – 43                                        17
44 – 45                                        8

Module 4

MEASURES OF LOCATION OF DATA

Objective:

Calculate the measures of location of data; and

Interpret the value of measures of location of data.

Lesson 1.

Percentiles, Quartiles, InterQuartile Range, and Deciles

Steps in Determining the Location of a Percentile

1. Organize the numbers into an ascending or descending order.

2. Calculate the percentile Location (i) using the following formula:

i = (P/100)(N)
Where

P = the percentile of interest

i = percentile location

N = number in the data set

3. Determine the location by either (a) or (b).

a. If i is a whole number, the Pth percentile is the average of the value at the ith location and the value at the (i + 1)th location.

b. If i is not a whole number, the Pth percentile is located at the whole-number part of i + 1.

Examples are provided in the following problems.
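The three steps can be sketched in code as follows. This is an illustrative sketch only: the function name `percentile` is hypothetical, P is assumed to be a whole number between 1 and 99, and the sample values are the eight numbers used in Example 1 of this lesson.

```python
def percentile(values, p):
    """Pth percentile using the location rule i = (P/100)(N).

    Assumptions of this sketch: p is a whole number with 0 < p < 100
    and values is not empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be between 0 and 100 (exclusive)")
    data = sorted(values)  # Step 1: arrange in ascending order
    # Step 2: i = (p * N) / 100, kept as a whole part and a remainder
    whole, remainder = divmod(p * len(data), 100)
    if remainder == 0:
        # Step 3(a): i is a whole number, so average the ith and (i + 1)th values
        return (data[whole - 1] + data[whole]) / 2
    # Step 3(b): i is not a whole number, so take the value at the
    # whole-number part of i + 1 (index `whole` when counting from zero)
    return data[whole]


data = [106, 109, 114, 116, 121, 122, 125, 129]
print(percentile(data, 25))  # 111.5  (i = 2, average of the 2nd and 3rd values)
print(percentile(data, 30))  # 114    (i = 2.4, so the 3rd value)
```

Working with the whole part and remainder of P × N avoids rounding problems that can appear when P/100 is computed first as a decimal.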

A. Quartiles divide a set of observations into four equal parts.

The First Quartile (Q1) separates the lowest one-fourth of the data from the upper three-fourths and is equal to the 25th percentile; that is, it is the value below which 25 percent of the observations occur. The Second Quartile, or Q2 (the median), is the value below which 50 percent of the observations occur, and the Third Quartile, labelled Q3, is the value below which 75 percent of the observations occur.

Deciles divide a set of observations into 10 equal parts. The deciles labelled as (D1, D2, D3, . . . ,D9 ) are the
values below which 10, 20, 30, . . .,90 percent of the observations occur, respectively.

Percentiles divide a set of observations into 100 equal parts. The percentiles (P1, P2, P3, . . . , P99) are the values below which 1, 2, 3, . . . , 99 percent of the observations occur, respectively.

Determine the location of quartiles, deciles, and percentiles in terms of the location of the corresponding percentile.

Location of the measure: i = (P/100)(n)

Where: i = location

n = total number of observations

P = the percentile of interest

Steps in computing the Quartiles, Deciles and Percentiles

Quartiles

Example 1. Determine the Q1, Q2, Q3 of the following numbers:

106 109 114 116 121 122 125 129

Step 1. Arrange the observations or data from the smallest to the largest or from the lowest to the highest value.

Step 2. Determine the location of each quartile using the formula:

a. The value of Q1 is found at the 25th percentile, P25. For N = 8, i = (25/100)(8) = 2. Because i is a whole number, P25 is the average of the 2nd and 3rd values: P25 = (109 + 114)/2 = 111.5. The value of Q1 is P25 = 111.5.

b. The value of Q2 is equal to the median. Because the array contains an even number of terms, the median is the average of the two middle terms.

Q2 = median = (116 + 121)/2 = 118.5

c. The value of Q3 is determined by P75 as follows:

i = (75/100)(8) = 6

Because i is a whole number, P75 is the average of the 6th and the 7th numbers:

Q3 = (122 + 125)/2 = 123.5

d. The value of Q3 is P75 = 123.5. Notice that three-fourths, or six, of the values are less than 123.5 and two of the values are greater than 123.5.

Example 2: The following table shows the top 16 global marketing categories for advertising spending for a recent year according to Advertising Age. Spending is given in millions of pesos. Determine the first, second, and third quartiles for these data.

Category Ad Spending
Automobile 22,195
Personal Care 19,526
Entertainment & Media 9,538
Food 7,793
Drug 7,707
Electronics 4,023
Soft Drinks 3,916
Retail 3,576
Cleaners 3,571
Restaurants 3,553
Computers 3,247
Telephone 2,488
Financial 2,433
Beer, Wine & Liquor 2,050
Candy 1,137
Toys 699

Solution

For the 16 marketing categories, N = 16, and Q1 = P25 is found by:

i = (25/100)(16) = 4

Because i is a whole number, Q1 is the average of the 4th and 5th values from the bottom (counting from the lowest value).

Q1 = (2,433 + 2,488)/2 = 2,460.5

Q2 = P50 = median; with 16 items, the median is the average of the 8th and 9th values.

Q2 = (3,571 + 3,576)/2 = 3,573.5

Q3 = P75 is solved by i = (75/100)(16) = 12

Q3 is found by getting the average of the 12th and 13th values.

Q3 = (7,707 + 7,793)/ 2 = 7,750

Therefore, the first, second, and third quartiles are 2,460.5, 3,573.5, and 7,750, respectively.

Example 3. Shown below are the top 20 companies in the computer industry by sales in 2005.

Compute Q1, Q3, D5, D9, P30, and P60.

Company    Sales (Php millions)


A 81,234
B 86,178
C 80,200
D 74,211
E 65,030
F 60,200
G 55,345
H 42,250
I 40,900
J 38,450
K 34,290
L 29,150
M 20,400
N 18,392
O 15,302
P 11,285
Q 10,908
R 10,230
S 9,990
T 9,375

Solutions:

First Quartile (Q1)


Step 1. For the 20 companies, N = 20, and Q1 = P25 is found by
i = (25/100)(20) = 5
Step 2. Because i is a whole number, Q1 is the average of the 5th and 6th values from the bottom of the distribution.

Step 3. Q1 = (11,285 + 15,302)/2 = 13,293.5; thus, Q1 = Php 13,293.5 million


Third Quartile (Q3)
From the same problem, Q3 = P75 is found by
i = (75/100)(20) = 15

Because i is 15, the third quartile is the average of the 15th and 16th values from the bottom of the distribution.
Q3 = (60,200 + 65,030)/2 = 62,615; thus, Q3 = Php 62,615 million

Decile

5th Decile

Step 1. From the above problem, D5 = P50 is found by


i = (50/100)(20) = 10

Step 2. Because i is 10, a whole number, D5 is the average of the 10th and 11th values from the bottom of the distribution.

Step 3. D5 = (34,290 + 38,450)/2 = 36,370; thus, D5 = Php 36,370 million

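9th Decile

Based on the same problem, D9 = P90 is found by

i = (90/100)(20) = 18

Because i = 18, a whole number, D9 is the average of the 18th and 19th values from the bottom of the distribution. The table is not listed in strict order: arranged from lowest to highest, the three largest values are 80,200 (18th), 81,234 (19th), and 86,178 (20th).

D9 = (80,200 + 81,234)/2 = 80,717; thus, D9 = Php 80,717 million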

Percentile

Based on the problem cited above, find P30.

Step 1. P30 is found by i = (30/100)(20) = 6

Step 2. Because i = 6, P30 is the average of the 6th and 7th values from the bottom of the distribution.

Step 3. P30 = (15,302 + 18,392)/2 = 16,847; thus, P30 = Php 16,847 million

Find P60.

P60 is found by i = (60/100)(20) = 12. Because i = 12, P60 is the average of the 12th and 13th values from the bottom of the distribution.

P60 = (40,900 + 42,250)/2 = 41,575; thus, P60 = Php 41,575 million

The Interquartile Range (IQR) is the distance between the first and the third quartiles: IQR = Q3 − Q1. For example, if Q1 = 23.00 and Q3 = 58.1, the interquartile range is IQR = 58.1 − 23.00 = 35.1.
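As a check on Example 3, the quartiles, deciles, percentiles, and interquartile range of the 20 sales figures can be computed with the same location rule. This is a sketch under stated assumptions: the helper name `percentile` is hypothetical and is restated here so that the block runs on its own, and the data are sorted first because the table is not in strict order.

```python
def percentile(values, p):
    """Pth percentile by the location rule i = (P/100)(N), for whole-number p in 1..99."""
    data = sorted(values)  # ascending order, so positions count "from the bottom"
    whole, remainder = divmod(p * len(data), 100)
    if remainder == 0:
        return (data[whole - 1] + data[whole]) / 2  # average of the ith and (i + 1)th values
    return data[whole]  # value at the whole-number part of i + 1


# Sales in Php millions for companies A to T in Example 3.
sales = [81234, 86178, 80200, 74211, 65030, 60200, 55345, 42250, 40900, 38450,
         34290, 29150, 20400, 18392, 15302, 11285, 10908, 10230, 9990, 9375]

q1 = percentile(sales, 25)   # Q1 = P25
q3 = percentile(sales, 75)   # Q3 = P75
d5 = percentile(sales, 50)   # D5 = P50
d9 = percentile(sales, 90)   # D9 = P90
p30 = percentile(sales, 30)
p60 = percentile(sales, 60)
iqr = q3 - q1                # interquartile range

print(q1, q3)    # 13293.5 62615.0
print(d5, d9)    # 36370.0 80717.0
print(p30, p60)  # 16847.0 41575.0
print(iqr)       # 49321.5
```

Each quartile and decile is expressed as a percentile (Q1 = P25, D9 = P90, and so on), which mirrors how the worked solutions locate them.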

Module 4

MEASURES OF LOCATION OF DATA

Name __________________________________________________________

Course and Year _________________________________________________

Exercise 4.1.

Shown below are the top 10 companies in the computer industry by sales in 2020.

Compute Q2, Q3, D7, D8, P40, and P20, and interpret each value.

Company Sales(Php)
1 50
2 40
3 25
4 38
5 42
6 35
7 30
8 47
9 49
10 28

Module 5

Measures of Dispersion

Objective:

• Compare the different measures of variability;
• Compute measures of relative variability of data; and
• Interpret the measures of variability.

Measures of Variability describe the spread or the dispersion of a set of data.

A. Ungrouped Data

Lesson 1. Range is the difference between the largest (highest) value and the smallest (lowest) value of a data set.

Example 1: From the following set of data, determine the range.

3,4,5,7,8,11,14,15,16,16,16,17,19,20,21

Step 1. Determine the highest (H) value and lowest (L) value from the set of data.

H = 21 and L = 3

Step 2. Get the range (R) by taking the difference between the highest and lowest values: R = H − L = 21 − 3 = 18. Therefore, the range is 18.

Lesson 2. Mean Absolute Deviation (MAD) is the average of the absolute values of the deviations around the mean for a set of numbers.

Formula: $MAD = \frac{\sum |x - \bar{x}|}{N}$

Where MAD = Mean Absolute Deviation

N = number of values

∑ = summation

x = value (score)

x̄ = mean

Example 2. Determine the MAD of X = 5, 9, 16, 17, 18

Step 1. Solve for the mean.

$\bar{x} = \frac{\sum x}{N} = \frac{65}{5} = 13$

Step 2. Subtract the mean from each value, as shown in the following table.

x          x − x̄          |x − x̄|
5          −8             8
9          −4             4
16         +3             3
17         +4             4
18         +5             5
∑x = 65    ∑(x − x̄) = 0   ∑|x − x̄| = 24

Step 3. Get the sum of the absolute values of the differences between each value and the mean.

∑|x − x̄| = 24

Step 4. Solve for the MAD using the formula:

$MAD = \frac{\sum |x - \bar{x}|}{N} = \frac{24}{5} = 4.8$
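The range and MAD computations above can be reproduced with a few lines of code. This is a minimal sketch; the function name `mean_absolute_deviation` is a hypothetical helper, not something defined in the module.

```python
def mean_absolute_deviation(values):
    """MAD = sum(|x - mean|) / N."""
    n = len(values)
    mean = sum(values) / n                           # Step 1: the mean
    return sum(abs(x - mean) for x in values) / n    # Steps 2 to 4


# Lesson 1: range = highest value - lowest value
range_data = [3, 4, 5, 7, 8, 11, 14, 15, 16, 16, 16, 17, 19, 20, 21]
value_range = max(range_data) - min(range_data)
print(value_range)  # 18

# Lesson 2: mean absolute deviation
scores = [5, 9, 16, 17, 18]
mad = mean_absolute_deviation(scores)
print(mad)  # 4.8
```

Taking the absolute value matters: without it the deviations sum to zero, as the middle column of the table shows.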

Lesson 3. Variance is the average of the squared deviation about the arithmetic mean for a set of numbers. The
population variance is denoted by σ².

Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

Step 1. Solve for the mean.

$\mu = \frac{\sum x}{N} = \frac{65}{5} = 13$

Step 2. Get the difference between each value and the mean: x − μ

Step 3. Square each difference: (x − μ)²

Step 4. Get the sum of the results of Step 3: ∑(x − μ)²

Step 5. Solve for the variance using the following formula:

$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{130}{5} = 26$

Example 3. Using the steps cited above, the results are summarized in the following table. Using the data from the table, determine the variance.

x          x − μ         |x − μ|        (x − μ)²
5          −8            8              64
9          −4            4              16
16         +3            3              9
17         +4            4              16
18         +5            5              25
∑x = 65    ∑(x − μ) = 0  ∑|x − μ| = 24  ∑(x − μ)² = 130

Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{130}{5} = 26$
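The five steps for the population variance translate directly into code. The sketch below uses a hypothetical helper name, `population_variance`, and cross-checks the result against `statistics.pvariance` from the Python standard library.

```python
import statistics


def population_variance(values):
    """Population variance: sum((x - mu)^2) / N."""
    n = len(values)
    mu = sum(values) / n                              # Step 1: mean
    squared = [(x - mu) ** 2 for x in values]         # Steps 2 and 3: squared deviations
    return sum(squared) / n                           # Steps 4 and 5: sum, then divide by N


scores = [5, 9, 16, 17, 18]
variance = population_variance(scores)
print(variance)                                   # 26.0
print(statistics.pvariance(scores) == variance)   # True
```

Note that `statistics.pvariance` divides by N, while `statistics.variance` divides by N − 1 and would give the sample variance instead.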

Lesson 4. Standard Deviation is the square root of the variance. The empirical rule is used to state the approximate percentage of values that lie within a given number of standard deviations from the mean of a set of data if the data are normally distributed.

EMPIRICAL RULE

Distance from the Mean    Values within Distance
μ ± 1σ                    68%
μ ± 2σ                    95%
μ ± 3σ                    99.7%

Example 4. A company produces a lightweight valve that is specified to weigh 1,355 grams. Unfortunately,
because of imperfections in the manufacturing process not all of the valves produced weigh exactly 1,355
grams. In fact, the weights of the valves produced are normally distributed with a mean of 1,365 grams and
standard deviation of 294 grams. Within what range of weights would approximately 95% of the valve weights
fall? Approximately 16% of the weights would be more than what value? Approximately 0.15% of the weights
would be less than what value?

Solution

Because the valve weights are normally distributed, the empirical rule applies. According to the
empirical rule, approximately 95% of the weights should fall within μ ± 2σ = 1,365 ± 2(294) = 1,365 ± 588. Thus, approximately 95% should fall between 777 and 1,953. Approximately 68% of the weights should fall within μ ± 1σ, and 32% should fall outside this interval. Because the normal distribution is symmetrical, approximately 16% should be above μ + 1σ = 1,365 + 294 = 1,659. Approximately 99.7% of the weights should fall within μ ± 3σ, and 0.3% should fall outside this interval. Half of these, 0.15%, should lie below μ − 3σ = 1,365 − 3(294) = 1,365 − 882 = 483.
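The three answers in Example 4 are all intervals of the form μ ± kσ, so they can be verified with one small helper. This is an illustrative sketch: `interval` is a hypothetical name, and `statistics.NormalDist` (Python 3.8 or later) is used only to confirm that the two-sigma interval covers roughly 95% of a normal distribution.

```python
from statistics import NormalDist


def interval(mean, sd, k):
    """Return (mean - k*sd, mean + k*sd)."""
    return mean - k * sd, mean + k * sd


mean, sd = 1365, 294

low_95, high_95 = interval(mean, sd, 2)   # about 95% of weights fall in this range
_, upper_16 = interval(mean, sd, 1)       # about 16% of weights lie above this value
lower_015, _ = interval(mean, sd, 3)      # about 0.15% of weights lie below this value

print(low_95, high_95)  # 777 1953
print(upper_16)         # 1659
print(lower_015)        # 483

# Coverage of the two-sigma interval under an exact normal model.
weights = NormalDist(mean, sd)
coverage = weights.cdf(high_95) - weights.cdf(low_95)
print(round(coverage, 3))  # 0.954
```

The exact coverage is slightly above 95%, which is why the empirical rule is stated in approximate terms.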

B. Measures of variability (Grouped Data)

POPULATION

Example 1. Computation of the population mean, the class mark, the deviation from the mean, the squared deviation, and the sum of the products of each frequency and its squared deviation.

From the frequency distribution, determine the following: a.) population mean (μ); b.) class mark (M); c.) difference between each class mark and the mean (M − μ); d.) the square of the difference, (M − μ)²; e.) the product of (M − μ)² and the corresponding frequency; f.) the sum ∑f(M − μ)².

(1)              (2)   (3)   (4)    (5)      (6)        (7)
Class interval   f     M     fM     M − μ    (M − μ)²   f(M − μ)²
1 – 3            4     2     8      −7.4     54.76      219.04
4 – 6            12    5     60     −4.4     19.36      232.32
7 – 9            13    8     104    −1.4     1.96       25.48
10 – 12          19    11    209    1.6      2.56       48.64
13 – 15          7     14    98     4.6      21.16      148.12
16 – 18          5     17    85     7.6      57.76      288.80
                 ∑f = 60     ∑fM = 564                  ∑f(M − μ)² = 962.4

Step 1. Determine the class mark (column 3) by adding the lower and upper limits of each class and dividing the sum by 2. For example, (1 + 3)/2 = 2 is the class mark of the class 1 – 3.

Step 2. Multiply each class mark (M) by its corresponding frequency to obtain fM (column 4 of the table).

Step 3. Determine the population mean (μ) by dividing the sum of fM by the total frequency.

$\mu = \frac{\sum fM}{N} = \frac{564}{60} = 9.4$ (population mean)

To determine the population variance

Step 4. Get the difference between each class mark and the mean (column 5 of the table).

Step 5. Square the result of Step 4.

Step 6. Multiply the result of Step 5 by the corresponding frequency and get the sum: ∑f(M − μ)² = 962.4

Step 7. Determine the population variance using the following formula:

$\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{962.4}{60} = 16.04$ (variance)

Step 8. Determine the population Standard Deviation by getting the square root of the variance (from step 7)

$\sigma = \sqrt{\sigma^2} = \sqrt{16.04} = 4.00$ (standard deviation)
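Steps 1 to 8 for grouped data can be sketched as a single function. The name `grouped_population_stats` is hypothetical, and the sketch follows the module's approach of treating every observation in a class as if it sat at the class mark.

```python
import math


def grouped_population_stats(classes, frequencies):
    """Return (mean, variance, standard deviation) for grouped population data."""
    marks = [(lower + upper) / 2 for lower, upper in classes]      # Step 1: class marks M
    n = sum(frequencies)
    mean = sum(f * m for f, m in zip(frequencies, marks)) / n      # Steps 2 and 3: sum(fM) / N
    variance = sum(f * (m - mean) ** 2                             # Steps 4 to 7
                   for f, m in zip(frequencies, marks)) / n
    return mean, variance, math.sqrt(variance)                     # Step 8: square root


classes = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15), (16, 18)]
frequencies = [4, 12, 13, 19, 7, 5]

mean, variance, sd = grouped_population_stats(classes, frequencies)
print(round(mean, 2), round(variance, 2), round(sd, 2))  # 9.4 16.04 4.0
```

For the sample variance requested in Exercise 5.1, the usual change is to divide by N − 1 instead of N; that variant is not shown here.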

Module 5

Measures of Dispersion
Name __________________________________________________________

Course and Year _________________________________________________

Exercise 5.1

Determine the sample variance and standard deviation for the following data.

CI F
10-14 5
15-19 20
20-24 35
25-29 25
30-34 15
Total 100

Module 6
Hypothesis Testing
Objectives: At the end of the module, the students should be able to:

• Demonstrate the basic concepts of hypothesis testing;
• Construct their own hypotheses;
• Determine the value of the Z score;
• Compute the value of the t test;
• Interpret the values of the Z score and the t test;
• Determine the observed F value;
• Compare the observed F value with the critical (table) F value;
• Interpret the F value;
• Solve for the value of chi-square;
• Execute the different steps in solving the chi-square; and
• Interpret the value of chi-square.

A hypothesis is a tentative, testable assertion regarding the occurrence of certain behaviours, phenomena, or events; a prediction of study outcomes.
Two types of hypotheses will be explored here:
1. Null hypothesis
2. Alternative hypothesis

The null hypothesis states that the null condition exists; that is, there is nothing new happening. A research hypothesis is a statement of what the researcher believes will be the outcome of an experiment or a study. Before studies are undertaken, business researchers often have some idea or theory, based on experience or previous work, as to how the study
will turn out. These ideas, theories, or notions established before an experiment or study is conducted are
research hypotheses. Some examples of research hypotheses:

• Older workers are more loyal to a company.
• Companies with more than P1 billion in assets spend a higher percentage of their annual budget on advertising than do companies with less than P1 billion in assets.
• The implementation of a Six Sigma quality approach in manufacturing will result in greater productivity.
• The price of scrap metal is a good indicator of the industrial production index six months later.
In order to scientifically test research hypotheses, a more formal hypothesis structure needs to be set up using statistical hypotheses.
All statistical hypotheses consist of two parts: a null hypothesis and an alternative hypothesis. The null hypothesis states that there is nothing new happening; the old theory is still true, the old standard is correct, and the system is in control. The alternative hypothesis, on the other hand, states that the new theory is true, there are new standards, the system is out of control, and/or something is happening.
Level of Significance
To establish whether our obtained sample difference is statistically significant (the result of a real population difference and not just sampling error), it is customary to set up a level of significance, which is denoted by the Greek letter α (alpha). The alpha value is the level of probability at which the null hypothesis can be rejected with confidence and the alternative hypothesis can be accepted with confidence. Accordingly, we decide to reject the null hypothesis if the probability is very small that the sample difference is a product of sampling error. Conventionally, we symbolize this small probability by p < .05.
Levels of significance do not give us an absolute statement as to the correctness of the null hypothesis. Whenever we decide to reject or accept the null hypothesis at a certain level of significance, we open ourselves to the chance of making the wrong decision, which could be a Type I or a Type II error.
Type I and Type II Errors
A Type I error is committed if we reject the null hypothesis when in fact it should be accepted. A Type II error is committed if we accept the null hypothesis when in fact it should be rejected.

                    Decision
Reality             Accept Ho           Reject Ho
Ho is true          Correct decision    Type I error
Ho is not true      Type II error       Correct decision

Comparing the same sample measured twice


Example. Social researchers are interested in determining the impact of forced residential mobility on feelings of neighbourliness (that is, positive feelings about neighbours in the pre-relocation neighbourhood compared with neighbours in the post-relocation neighbourhood). The hypotheses are stated as follows:

Ho: (μ1 = μ2): The degree of neighbourliness does not differ before and after the relocation.

H1: (μ1 ≠ μ2): The degree of neighbourliness differs before and after the relocation.

To test the impact of forced relocation on neighbourliness, the researchers interview a random sample of six individuals about their neighbours both before and after they were forced to move. The interviews yield the following scores of neighbourliness (scores range from 1 to 4, with higher scores indicating greater neighbourliness).

Respondent   Before Move (X1)   After Move (X2)   Difference (D)   Difference squared (D²)
A            2                  1                 1                1
B            1                  2                 −1               1
C            3                  1                 2                4
D            3                  1                 2                4
E            1                  2                 −1               1
F            4                  1                 3                9
TOTAL        ∑X1 = 14           ∑X2 = 8                            ∑D² = 20
The test proceeds in the following steps.
Step 1. Find the mean for each point in time.

$\bar{x}_1 = \frac{\sum x_1}{n} = \frac{14}{6} = 2.33$; $\bar{x}_2 = \frac{\sum x_2}{n} = \frac{8}{6} = 1.33$

Step 2. Find the standard deviation for the difference between time 1 and time 2.

$S_D = \sqrt{\frac{\sum D^2}{n} - (\bar{x}_1 - \bar{x}_2)^2}$

Where:
S_D = standard deviation of the distribution of before-after difference scores
D = after-move raw score subtracted from before-move raw score
n = number of cases or respondents in the study

Step 3. Substitute the values into the formula:

$S_D = \sqrt{\frac{20}{6} - (2.33 - 1.33)^2} = \sqrt{3.33 - 1} = \sqrt{2.33} = 1.53$

Step 4. Find the standard error of the mean difference.

$s_{\bar{D}} = \frac{S_D}{\sqrt{N - 1}} = \frac{1.53}{\sqrt{6 - 1}} = \frac{1.53}{\sqrt{5}} = \frac{1.53}{2.24} = 0.68$

Step 5. Translate the sample mean difference into units of standard error of the mean difference.

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{D}}} = \frac{2.33 - 1.33}{.68} = 1.47$

Step 6. Find the number of degrees of freedom.

df = n − 1 = 6 − 1 = 5

Step 7. Compare the obtained t ratio with the appropriate t ratio in Table 1, Appendix A.
Table t = 2.571 with df = 5 at α = .05

Step 8. Decision rule: Because the obtained t (1.47) is less than the critical t (2.571) at the 5% level of significance with df = 5, we accept the null hypothesis and reject the research hypothesis.
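The eight steps can be followed in code using the same formulas. This is a sketch under stated assumptions: `paired_t` is a hypothetical helper name, and the critical value 2.571 is copied from the text rather than computed.

```python
import math


def paired_t(before, after):
    """t ratio for the same sample measured twice, following Steps 1 to 6."""
    n = len(before)
    mean_before = sum(before) / n                                    # Step 1
    mean_after = sum(after) / n
    sum_d_squared = sum((b - a) ** 2 for b, a in zip(before, after))
    # Steps 2 and 3: standard deviation of the difference scores
    sd = math.sqrt(sum_d_squared / n - (mean_before - mean_after) ** 2)
    se = sd / math.sqrt(n - 1)                                       # Step 4: standard error
    t = (mean_before - mean_after) / se                              # Step 5
    return t, n - 1                                                  # Step 6: df = n - 1


before = [2, 1, 3, 3, 1, 4]
after = [1, 2, 1, 1, 2, 1]

t, df = paired_t(before, after)
critical_t = 2.571  # table value quoted in the text for df = 5 at the .05 level

print(round(t, 2), df)      # 1.46 5
print(abs(t) > critical_t)  # False, so the null hypothesis is not rejected
```

The code gives about 1.46 rather than 1.47 because it carries full precision through every step, while the hand computation rounds the means, the standard deviation, and the standard error along the way. The decision is the same.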

Test of Difference Between Proportions
Example 1. A social psychologist is interested in how personality characteristics are expressed in the car someone drives. He wonders whether men express a greater need for control than women by driving big cars. He takes a sample of 200 males and 200 females over 18 and determines whether or not they drive a full-size car. The final sample sizes for analysis were as follows: 180 for men and 150 for women. The following hypotheses were formulated:
Null Hypothesis: Ho: The proportions of men and women who drive big cars are equal.
Research Hypothesis: H1: The proportions of men and women who drive big cars are not equal.

                            Male    Female   Overall
Sample size                 180     150      330
Own big cars                81      48       129
Proportion with big cars    .45     .32      .39

Step 1. Compute the two sample proportions and the combined sample proportion.

$P_1 = \frac{f_1}{N_1} = \frac{81}{180} = .45$; $P_2 = \frac{f_2}{N_2} = \frac{48}{150} = .32$

$P = \frac{N_1 P_1 + N_2 P_2}{N_1 + N_2} = \frac{(180)(.45) + (150)(.32)}{180 + 150} = \frac{81 + 48}{330} = 0.39$

Step 2. Compute the standard error of the difference.


$s_{P_1 - P_2} = \sqrt{P(1 - P)\left(\frac{N_1 + N_2}{N_1 N_2}\right)} = \sqrt{(.39)(1 - .39)\left(\frac{180 + 150}{(180)(150)}\right)} = \sqrt{(.39)(.61)\left(\frac{330}{27,000}\right)} = .0539$

Step 3. Translate the difference between proportions into units of the standard error of the difference.

$Z = \frac{P_1 - P_2}{s_{P_1 - P_2}} = \frac{.45 - .32}{.0539} = 2.41$

Step 4. Compare the obtained Z (2.41) with the critical value of Z = 1.96. Because the obtained value (Z = 2.41) is greater than the critical value (Z = 1.96), we reject the null hypothesis. The difference between the sample proportions is statistically significant; the social psychologist can conclude that men and women generally tend to drive cars of different sizes.

Level of Confidence Z – Value


68% 1.00
95% 1.96
99% 2.58
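The test of the difference between proportions in Example 1 can be sketched as follows. The function name `two_proportion_z` is hypothetical, and the critical value 1.96 is the 95% entry from the table above.

```python
import math


def two_proportion_z(f1, n1, f2, n2):
    """Z for the difference between two sample proportions (pooled standard error)."""
    p1 = f1 / n1                       # Step 1: sample proportions
    p2 = f2 / n2
    pooled = (f1 + f2) / (n1 + n2)     # combined proportion P
    # Step 2: standard error of the difference
    se = math.sqrt(pooled * (1 - pooled) * (n1 + n2) / (n1 * n2))
    return (p1 - p2) / se              # Step 3


z = two_proportion_z(81, 180, 48, 150)
critical_z = 1.96  # two-tailed critical value at the 95% level of confidence

print(round(z, 2))          # 2.41
print(abs(z) > critical_z)  # True, so the null hypothesis is rejected
```

The comparison uses the absolute value of Z because the research hypothesis is two-sided (the proportions are "not equal").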

Example 2. Test about a Proportion.

A survey of the morning beverage market shows that the primary breakfast beverage for 17% of Filipino children is milk. A milk producer believes the figure is higher for a particular city. To test this idea, the researcher contacts a random sample of 550 residents of the city and asks which primary beverage they consumed for breakfast that day. Suppose 115 replied that milk was their primary beverage. Using the 5% level of significance, test the idea that the milk figure is higher for this city.
Solution:
Step 1. State the null (Ho) and alternative (Ha) hypotheses.
Ho: p = .17; Ha: p > .17

Step 2. Compute the sample proportion p̂ using the following formula:

$\hat{p} = \frac{x}{n}$

Where
x = number of respondents who said that milk is their primary breakfast beverage
n = sample size

$\hat{p} = \frac{115}{550} = .209$
Step 3. Compute Z using the following formula:

$Z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n}}}$

Where
p̂ = sample proportion
p = population proportion
q = 1 − p
Substitute the values into the formula:
$Z = \frac{.209 - .17}{\sqrt{\frac{(.17)(.83)}{550}}} = \frac{.039}{.016} = 2.44$

Step 4. Decision Rule
Reject the null hypothesis because the observed Z (2.44) is greater than the critical value. Because the alternative hypothesis is one-sided (p > .17), the critical value at the 5% level is Z = 1.645; the observed value also exceeds the two-tailed critical value of Z = 1.96. The calculated test statistic is often referred to as the observed value.
Step 5. Business Implication. To make a managerial decision, the researcher has enough evidence to reject the null hypothesis that milk is the primary breakfast beverage of 17% of children in the city. The researcher can conclude that the proportion of children in the city whose primary breakfast beverage is milk is more than 17%.
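The test about a single proportion can be checked with a short script. This is a sketch only: `one_proportion_z` is a hypothetical helper name, and the two critical values are standard normal table values entered as constants (1.645 for a one-tailed test and 1.96 for a two-tailed test at the 5% level).

```python
import math


def one_proportion_z(x, n, p0):
    """Z for a sample proportion x/n against a hypothesized proportion p0."""
    p_hat = x / n                         # Step 2: sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)     # standard error under the null hypothesis
    return (p_hat - p0) / se              # Step 3


z = one_proportion_z(115, 550, 0.17)

ONE_TAILED_CRITICAL = 1.645  # alpha = .05, one-tailed (Ha: p > .17)
TWO_TAILED_CRITICAL = 1.96   # alpha = .05, two-tailed, shown for comparison

print(round(z, 2))              # 2.44
print(z > ONE_TAILED_CRITICAL)  # True
print(z > TWO_TAILED_CRITICAL)  # True
```

The null hypothesis is rejected under either critical value, so the conclusion does not depend on which one is used here.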
One-Tailed and Two-Tailed Tests
One-Tailed Test. A one-tailed test rejects the null hypothesis at only one tail of the sampling distribution; that is, the rejection region is located at only one extreme of the range of values for the test statistic.
Two-Tailed Test. A two-tailed test rejects the null hypothesis at both tails of the sampling distribution; that is, the rejection region is located at both extremes of the distribution.
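The difference between the two kinds of rejection region can be illustrated for a Z test. The sketch below is an illustration under stated assumptions: the function names `critical_z` and `reject_null` are hypothetical, and `statistics.NormalDist` (Python 3.8 or later) stands in for a printed Z table.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Positive critical Z value for a one-tailed (tails=1) or two-tailed (tails=2) test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits alpha equally between the two extremes.
    return NormalDist().inv_cdf(1 - alpha / tails)


def reject_null(z, alpha, tail):
    """tail is 'two', 'upper', or 'lower', naming where the rejection region lies."""
    if tail == "two":
        return abs(z) > critical_z(alpha, 2)
    if tail == "upper":
        return z > critical_z(alpha, 1)
    if tail == "lower":
        return z < -critical_z(alpha, 1)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


print(round(critical_z(0.05, tails=1), 3))  # 1.645
print(round(critical_z(0.05, tails=2), 2))  # 1.96

# The same observed Z can lead to different decisions.
print(reject_null(1.8, 0.05, "upper"))  # True
print(reject_null(1.8, 0.05, "two"))    # False
```

The last two lines show why the direction of the alternative hypothesis should be fixed before the data are examined.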
One-Tailed Test for:

a. Statistical Inferences for Two Related Populations

Example. Suppose that an educational researcher wishes to test whether a particular remedial math program significantly improves math skills.

Student   Before (x1)   After (x2)   Difference (D)   Difference squared (D²)
1         58            66           −8               64
2         63            68           −5               25
3         66            72           −6               36
4         70            76           −6               36
5         63            78           −15              225
6         51            56           −5               25
7         44            69           −25              625
8         58            55           3                9
9         50            55           −5               25

One-Tailed Test (Sample)

Null Hypothesis: Math ability does not improve after remediation (μ1 ≥ μ2).

Alternative Hypothesis: Math ability improves after remediation (μ1 < μ2).

Step 1. Find the mean for both the before and after tests.

$\bar{x}_1 = \frac{\sum x_1}{N} = \frac{523}{9} = 58.11$; $\bar{x}_2 = \frac{\sum x_2}{N} = \frac{595}{9} = 66.11$

Step 2. Find the standard deviation of the difference.

$S_D = \sqrt{\frac{\sum D^2}{N} - (\bar{x}_1 - \bar{x}_2)^2}$

$S_D = \sqrt{\frac{1,070}{9} - (58.11 - 66.11)^2} = \sqrt{118.89 - 64} = \sqrt{54.89} = 7.41$ (standard deviation)

Step 3. Find the standard error of the difference between means.

$s_{\bar{D}} = \frac{S_D}{\sqrt{N - 1}} = \frac{7.41}{\sqrt{9 - 1}} = \frac{7.41}{\sqrt{8}} = \frac{7.41}{2.83} = 2.62$ (standard error)

Step 4. Translate the sample mean difference into units of the standard error of the difference.

$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{D}}} = \frac{58.11 - 66.11}{2.62} = \frac{-8}{2.62} = -3.05$

Step 5. Find the degrees of freedom.

df = N - 1
   = 9 - 1
   = 8
Step 6. Compare the obtained t ratio (-3.05) with the critical (table) t value (1.86) at the 5% significance level
(Table 2, Appendix A).
Obtained t = -3.05
Table t = 1.86
α = .05

Step 7. Decision: Reject the null hypothesis, since the computed value is more extreme in the negative
direction than the critical value (-1.86).
Step 8. Interpretation of the result. Since the null hypothesis is rejected, the remedial math program
has produced a statistically significant improvement in the math ability of the students.
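
The eight steps above can be sketched in Python as a check on the hand arithmetic. This is a minimal sketch rather than part of the module: the function name `paired_t` is a hypothetical helper, it follows the same formulas used in Steps 2 to 4 (standard deviation of the differences, then division by the square root of N - 1), and the critical value 1.86 is simply the table value quoted in Step 6.

```python
import math

# Before and after scores of the nine students in the example
before = [58, 63, 66, 70, 63, 51, 44, 58, 50]
after = [66, 68, 72, 76, 78, 56, 69, 55, 55]


def paired_t(x1, x2):
    """Return (t, df) for two related samples (hypothetical helper)."""
    n = len(x1)
    diffs = [a - b for a, b in zip(x1, x2)]
    mean_diff = sum(diffs) / n  # same as mean(x1) - mean(x2)
    # S_D = sqrt(sum(D^2) / N - (mean difference)^2)
    s_d = math.sqrt(sum(d * d for d in diffs) / n - mean_diff ** 2)
    # Standard error of the difference: S_D / sqrt(N - 1)
    std_error = s_d / math.sqrt(n - 1)
    return mean_diff / std_error, n - 1


t, df = paired_t(before, after)
critical_t = 1.86  # one-tailed table value for alpha = .05 and df = 8

print(f"t = {t:.2f}, df = {df}")
# One-tailed test in the negative direction (H1: mu1 < mu2)
print("Reject H0" if t < -critical_t else "Do not reject H0")
```

Because the code does not round the intermediate values, small differences in the last decimal place from a hand computation are to be expected.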

Analysis of Variance (ANOVA)
ANOVA is a statistical test that makes a single overall decision as to whether a significant difference is present among
three or more sample means. It is used to compare several population means simultaneously. The
result of ANOVA is the F ratio, which indicates the size of the between-groups mean square relative to
the size of the within-groups mean square.
Sample Problem 1.
A company has three manufacturing plants, and company officials want to determine whether there is a
difference in the average age of workers at the three locations. The following data are the ages of five randomly
selected workers at each plant. Perform a one-way ANOVA to determine whether there is a significant
difference in the mean ages of the workers at the three plants.
Solution.
Step 1. State the null and alternative hypotheses.
Ho: There is no significant difference in the average age of workers at the three locations.
H1: There is a significant difference in the average age of workers at the three locations.
Step 2. The appropriate test statistic is the F test calculated from ANOVA.
Step 3. The level of significance is 5%.
Step 4. The degrees of freedom for this problem are 3 - 1 = 2 for the numerator and 15 - 3 = 12 for the denominator.

Plant (Employee Ages)

Plant 1              Plant 2              Plant 3
x1       x1²         x2       x2²         x3       x3²
29       841         32       1,024       25       625
27       729         33       1,089       24       576
30       900         31       961         24       576
27       729         34       1,156       25       625
28       784         30       900         26       676
Σx1 = 141   Σx1² = 3,983   Σx2 = 160   Σx2² = 5,130   Σx3 = 124   Σx3² = 3,078
Step 5. Computation process
Step 5.1. Find the mean for each sample:

$$\bar{x}_1 = \frac{\sum x_1}{n} = \frac{141}{5} = 28.2; \qquad \bar{x}_2 = \frac{160}{5} = 32; \qquad \bar{x}_3 = \frac{124}{5} = 24.8$$

Notice that differences do exist: group 2 tends to have a higher age than groups 1 and 3.
Step 5.2. Find the sum of ages, sum of squared ages, number of subjects, and mean age for all groups
combined.

$$\sum x_{total} = \sum x_1 + \sum x_2 + \sum x_3 = 141 + 160 + 124 = 425$$

$$\sum x^2_{total} = \sum x_1^2 + \sum x_2^2 + \sum x_3^2 = 3{,}983 + 5{,}130 + 3{,}078 = 12{,}191$$

$$N_{total} = N_1 + N_2 + N_3 = 5 + 5 + 5 = 15$$

$$\bar{X}_{total} = \frac{\sum x_{total}}{N_{total}} = \frac{425}{15} = 28.33$$

Step 5.3. Find the total sum of squares.

$$SS_{total} = \sum x^2_{total} - N_{total}\,\bar{X}^2_{total}$$

= 12,191 - 15(28.33)²
= 12,191 - 12,038.83
= 152.17
Step 5.4. Find the within-groups sum of squares.

$$SS_{within} = \sum x^2_{total} - \sum N_{group}\,\bar{x}^2_{group}$$

= 12,191 - [(5)(28.2)² + (5)(32)² + (5)(24.8)²]
= 12,191 - [3,976.2 + 5,120 + 3,075.2]
= 12,191 - 12,171.4
= 19.60
Step 5.5. Find the between-groups sum of squares.

$$SS_{between} = \sum N_{group}\,\bar{x}^2_{group} - N_{total}\,\bar{X}^2_{total}$$

= [(5)(28.2)² + (5)(32)² + (5)(24.8)²] - (15)(28.33)²
= [3,976.2 + 5,120 + 3,075.2] - 12,038.83
= 12,171.40 - 12,038.83
= 132.57
Step 5.6. Find the between-groups degrees of freedom.
$$df_{between} = k - 1$$

= 3 - 1
= 2
Step 5.7. Find the within-groups degrees of freedom.
$$df_{within} = N_{total} - k$$

= 15 - 3
= 12

Step 5.8. Find the within-groups mean square, using the within-groups sum of squares from Step 5.4.

$$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{19.60}{12} = 1.63$$

Step 5.9. Find the between-groups mean square, using the between-groups sum of squares from Step 5.5.

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{132.57}{2} = 66.29$$

Step 5.10. Obtain the F ratio.

$$F = \frac{MS_{between}}{MS_{within}} = \frac{66.29}{1.63} = 40.67$$

Step 5.11. Compare the obtained F ratio with the appropriate table F ratio. See Table 3 of Appendix A.
Obtained F ratio = 40.67
Table F ratio = 3.88
df = 2 and 12
α = .05
Step 6. Formulate the decision rule.
To reject the null hypothesis at the 5% significance level with 2 and 12 degrees of freedom, our calculated F
ratio must exceed the table value of 3.88. Because we have obtained an F ratio of 40.67, we reject the null
hypothesis. The results obtained are statistically significant.
Step 7. Interpretation. Since the obtained F ratio (40.67) is greater than the critical F ratio (3.88), we can say that
there is a statistically significant difference in the average age of the workers at the three locations.
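
The raw-score computation in Steps 5.1 to 5.10 can be sketched in Python as a check on the hand arithmetic. This is a minimal sketch, not part of the original module: the function name `one_way_anova` and its return layout are illustrative assumptions, and only the standard library is used. Because the code keeps the grand mean at full precision instead of rounding it to 28.33, its sums of squares differ slightly from the hand-rounded figures (for example, the between-groups sum of squares is about 129.73 rather than 132.57).

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio).

    Hypothetical helper that uses the raw-score formulas of Steps 5.3 to 5.10.
    """
    n_total = sum(len(g) for g in groups)
    sum_x = sum(sum(g) for g in groups)
    sum_x2 = sum(x * x for g in groups for x in g)
    grand_mean = sum_x / n_total
    # Sum over groups of N_group * (group mean)^2
    group_term = sum(len(g) * (sum(g) / len(g)) ** 2 for g in groups)
    ss_within = sum_x2 - group_term
    ss_between = group_term - n_total * grand_mean ** 2
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


# Ages of five workers sampled at each of the three plants
plants = [
    [29, 27, 30, 27, 28],
    [32, 33, 31, 34, 30],
    [25, 24, 24, 25, 26],
]

ss_between, ss_within, df_between, df_within, f_ratio = one_way_anova(plants)
critical_f = 3.88  # table value for alpha = .05 with df = 2 and 12

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"df = {df_between} and {df_within}, F = {f_ratio:.2f}")
print("Reject H0" if f_ratio > critical_f else "Do not reject H0")
```

The decision follows from comparing the printed F ratio with the table value for 2 and 12 degrees of freedom.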
Sample Problem 2.
A professor had students in a large marketing class rate his performance as excellent, good, fair, or poor. A
graduate student collected the ratings and assured the students that the professor would not receive them until after
course grades had been sent to the records office. The sample information is reported below.

         Excellent          Good               Fair               Poor
         x1      x1²        x2      x2²        x3      x3²        x4      x4²
         94      8,836      75      5,625      70      4,900      68      4,624
         90      8,100      68      4,624      73      5,329      70      4,900
         85      7,225      77      5,929      76      5,776      72      5,184
         80      6,400      83      6,889      78      6,084      65      4,225
                            88      7,744      80      6,400      74      5,476
                                               68      4,624      65      4,225
                                               65      4,225
Total    349     30,561     391     30,811     510     37,338     414     28,634

Solution:
Step 1. State the null and the alternative hypotheses.

• Null Hypothesis: The mean scores are the same for the four ratings.
Ho: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
• Alternative Hypothesis: The mean scores are not all the same for the four ratings.
Ha: not all of $\mu_1, \mu_2, \mu_3, \mu_4$ are equal
If the null hypothesis is not rejected, we conclude that there is no difference in the mean course grades
based on the instructor rating. If Ho is rejected, we conclude that there is a difference in at least one pair
of means, but at this point we do not know which pair or how many pairs differ.

Step 2. Select the level of significance. We select the .01 significance level.

Step 3. Determine the test statistic. The test statistic is the F ratio.
Step 4. Computation Process
Step 4.1: Find the mean for each sample.

$$\bar{x}_1 = \frac{349}{4} = 87.25$$

$$\bar{x}_2 = \frac{391}{5} = 78.20$$

$$\bar{x}_3 = \frac{510}{7} = 72.86$$

$$\bar{x}_4 = \frac{414}{6} = 69.00$$

Step 4.2: Find the sum of scores, sum of squared scores, number of subjects, and the
mean for all groups.

$$\sum x_{total} = 349 + 391 + 510 + 414 = 1{,}664$$

$$\sum x^2_{total} = 30{,}561 + 30{,}811 + 37{,}338 + 28{,}634 = 127{,}344$$

$$N_{total} = 4 + 5 + 7 + 6 = 22$$

$$\bar{x}_{total} = \frac{\sum x_{total}}{N_{total}} = \frac{1{,}664}{22} = 75.64$$

Step 4.3: Find the total sum of squares.

$$SS_{total} = \sum x^2_{total} - N_{total}\,\bar{x}^2_{total} = 127{,}344 - (22)(75.64)^2 = 127{,}344 - 125{,}871.01 = 1{,}472.99$$

Step 4.4: Find the within-groups sum of squares.

$$SS_{within} = \sum x^2_{total} - \sum N_{group}\,\bar{x}^2_{group}$$

= 127,344 - [(4)(87.25)² + (5)(78.2)² + (7)(72.86)² + (6)(69)²]
= 127,344 - [30,450.25 + 30,576.2 + 37,160.06 + 28,566]
= 127,344 - 126,752.51
= 591.49
Step 4.5: Find the between-groups sum of squares.

$$SS_{between} = \sum N_{group}\,\bar{x}^2_{group} - N_{total}\,\bar{x}^2_{total} = 126{,}752.51 - 22(75.64)^2 = 126{,}752.51 - 125{,}871.01 = 881.50$$

Step 4.6: Find the between-groups degrees of freedom.

$$df_{between} = k - 1 = 4 - 1 = 3$$

Step 4.7: Find the within-groups degrees of freedom.

$$df_{within} = N_{total} - k = 22 - 4 = 18$$

Step 4.8: Find the within-groups mean square.

$$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{591.49}{18} = 32.86$$

Step 4.9: Find the between-groups mean square.

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{881.5}{3} = 293.83$$

Step 4.10: Obtain the F ratio.

$$F = \frac{MS_{between}}{MS_{within}} = \frac{293.83}{32.86} = 8.94$$
Step 4.11: Compare the obtained F ratio with the table value of the F ratio (critical F).
Obtained F = 8.94
Critical F = 5.09 (Table 3, Appendix A)
df = 3 and 18
α = .01 or 1%
Step 5. Decision Rule
Since the obtained F ratio (8.94) is greater than the critical F ratio (5.09) at the 1% level of significance with
df = 3 and 18, we reject the null hypothesis. Therefore, the result of the test is statistically significant.
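
The same sums of squares can also be obtained from deviations about the means, which works unchanged when the groups have unequal sizes. The sketch below uses this deviation form, which is an equivalent alternative and not the raw-score formula used in the steps above; the variable names are illustrative. Since the group means and the grand mean are not rounded, the F ratio comes out as about 8.99 instead of the hand-computed 8.94, which leaves the decision unchanged.

```python
# Course grades grouped by the rating each student gave the professor
ratings = {
    "excellent": [94, 90, 85, 80],
    "good": [75, 68, 77, 83, 88],
    "fair": [70, 73, 76, 78, 80, 68, 65],
    "poor": [68, 70, 72, 65, 74, 65],
}

scores = [x for group in ratings.values() for x in group]
grand_mean = sum(scores) / len(scores)

# Between groups: squared distance of each group mean from the grand mean,
# weighted by the group size
ss_between = sum(
    len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in ratings.values()
)
# Within groups: squared distance of each score from its own group mean
ss_within = sum(
    (x - sum(g) / len(g)) ** 2 for g in ratings.values() for x in g
)
# Total: squared distance of each score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in scores)

df_between = len(ratings) - 1
df_within = len(scores) - len(ratings)
f_ratio = (ss_between / df_between) / (ss_within / df_within)
critical_f = 5.09  # table value for alpha = .01 with df = 3 and 18

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F = {f_ratio:.2f} with df = {df_between} and {df_within}")
print("Reject H0" if f_ratio > critical_f else "Do not reject H0")
```

A useful self-check is that the between-groups and within-groups sums of squares add up to the total sum of squares.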

Chi-Square (χ²) Test.
A nonparametric test of significance in which expected frequencies are compared against observed frequencies.

One-Way Chi-Square (χ²)

Example 1. Suppose your instructor returns the exam and hands out the answer key. You construct a frequency
distribution of the correct responses to the 50-item test as follows:

Correct answer   fo    fe    (fo - fe)²/fe
A                12    10    0.4
B                14    10    1.6
C                9     10    0.1
D                5     10    2.5
E                10    10    0.0
TOTAL            50    50    4.6

The one-way chi-square test can be used to determine whether the observed frequencies differ
significantly from an even distribution (or any other distribution we might hypothesize).
Null Hypothesis: The instructor shows no tendency to assign any particular correct response from A to E.
Alternative Hypothesis: The instructor shows a tendency to assign particular correct responses from A to E.
Using the formula

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where
χ² = chi-square
fo = observed frequency
fe = expected frequency
Σ = summation

From the table, the computation of χ² is

$$\chi^2 = \frac{(12-10)^2}{10} + \frac{(14-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(5-10)^2}{10} + \frac{(10-10)^2}{10}$$

= 0.4 + 1.6 + 0.1 + 2.5 + 0
= 4.6

Step-by-step illustration

One-way chi-square
To summarize the step-by-step procedure for calculating a one-way chi-square, imagine that a social researcher
is interested in surveying the attitudes of high school students concerning the importance of getting a college degree.
She questions a sample of 60 high school seniors about whether they believe that a college education is
becoming more important, less important, or staying the same. We specify our hypotheses as follows:
Step 1. Formulate the null and research hypotheses.
Null Hypothesis: High school students are equally divided in their beliefs regarding the changing importance of
a college education.
Research Hypothesis: High school students are not equally divided in their beliefs regarding the changing
importance of a college education.
Let us say that of the 60 high school students surveyed, 35 feel that a college education is becoming more
important, 10 feel that it is becoming less important, and 15 feel that the importance is about the same.
Step 2.1. Arrange the data in the form of a frequency distribution.

Category           Observed Frequency (fo)
More Important     35
Less Important     10
About The Same     15
Total              60
Step 2.2: Obtain the expected frequency (fe) for each category, where k is the number of categories,
using the formula:
fe = N / k = 60 / 3 = 20

Category           Observed Frequency (fo)   Expected Frequency (fe)
More Important     35                        20
Less Important     10                        20
About The Same     15                        20
Total              60                        60

Step 3. Set up a summary table to calculate the chi-square value.

Category           fo    fe    fo - fe   (fo - fe)²   (fo - fe)²/fe
More Important     35    20    15        225          11.25
Less Important     10    20    -10       100          5.00
About The Same     15    20    -5        25           1.25
Total              60    60                           χ² = 17.50
Step 4. Find the degrees of freedom.
df = k - 1 = 3 - 1 = 2
Step 5. Compare the calculated chi-square value with the appropriate table value.
The chi-square value required for significance at the 5% level with 2 degrees of freedom is 5.991, which is
referred to as the critical value.
Step 6. Decision. Because the computed chi-square (χ² = 17.50) is greater than the critical value of chi-square
(χ² = 5.991), we reject the null hypothesis.
Step 7. Interpretation or implication. These findings suggest that high school students are not equally
divided in their views concerning the changing importance of pursuing a college education.
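
The one-way procedure can be sketched in a few lines of Python. This is a minimal sketch under the assumption used in both examples, namely that the expected frequency is an even split (fe = N / k); the function name `one_way_chi_square` is a hypothetical helper and the critical value is the table value quoted in Step 5.

```python
def one_way_chi_square(observed):
    """Return (chi_square, df) against an even split, fe = N / k."""
    n = sum(observed)
    k = len(observed)
    expected = n / k  # the same expected frequency for every category
    chi_square = sum((fo - expected) ** 2 / expected for fo in observed)
    return chi_square, k - 1


# College-education survey: more important, less important, about the same
survey_chi, survey_df = one_way_chi_square([35, 10, 15])
critical = 5.991  # table value for alpha = .05 and df = 2

# Answer-key example: correct responses A to E on the 50-item test
key_chi, key_df = one_way_chi_square([12, 14, 9, 5, 10])

print(f"Survey: chi-square = {survey_chi:.2f}, df = {survey_df}")
print("Reject H0" if survey_chi > critical else "Do not reject H0")
print(f"Answer key: chi-square = {key_chi:.2f}, df = {key_df}")
```

For a hypothesized distribution that is not even, the single `expected` value would be replaced by one expected frequency per category.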
Two-way chi-square test
Example: Suppose a researcher investigates the relationship between political orientation and child-rearing
permissiveness.
Step 1. Formulate the null and research hypotheses.
Null hypothesis: The relative frequency or percentage of liberals who are permissive is the same as the relative
frequency of conservatives who are permissive.
Research hypothesis: The relative frequency or percentage of liberals who are permissive is not the same as the
relative frequency of conservatives who are permissive.
Step 2. Arrange the data in the form of a 2 x 2 table containing the observed frequencies for each cell.

                          Political Orientation
Child-rearing method      Liberals    Conservatives    Total
Permissive                5           10               15
Not Permissive            15          10               25
Total                     20          20               40

Step 3. Obtain the expected frequency (fe) for each cell.

$$f_e = \frac{(\text{row marginal total})(\text{column marginal total})}{N}$$

For the upper left cell in the table (permissive liberals):

$$f_e = \frac{(15)(20)}{40} = 7.5$$

For the upper right cell (permissive conservatives):

$$f_e = \frac{(15)(20)}{40} = 7.5$$

For the lower left cell (not permissive liberals):

$$f_e = \frac{(25)(20)}{40} = 12.5$$

For the lower right cell (not permissive conservatives):

$$f_e = \frac{(25)(20)}{40} = 12.5$$

                    Liberals                       Conservatives
Child rearing       fo    fe     (fo - fe)²/fe    fo    fe     (fo - fe)²/fe    Total
Permissive          5     7.5    0.83             10    7.5    0.83             15
Not Permissive      15    12.5   0.50             10    12.5   0.50             25
Total               20           1.33             20           1.33             40
Step 4. Determine χ².

$$\chi^2 = \frac{(5-7.5)^2}{7.5} + \frac{(15-12.5)^2}{12.5} + \frac{(10-7.5)^2}{7.5} + \frac{(10-12.5)^2}{12.5} = 0.83 + 0.50 + 0.83 + 0.50 = 2.66$$

Hence, χ² = 2.66.
Step 5. Determine the degrees of freedom (df).
df = (r - 1)(c - 1)
where
r = number of rows in the table of observed frequencies
c = number of columns in the table of observed frequencies
df = degrees of freedom

df = (2 - 1)(2 - 1) = 1
Step 6. Compare the obtained χ² = 2.66 with the critical χ² (3.84) for df = 1 at the .05 level (Table 5, Appendix A).
Step 7. Decision: Because the computed value of χ² is less than the critical χ², we must retain the null
hypothesis and reject the research hypothesis. In short, the observed frequencies do not differ enough from the
frequencies expected by chance to indicate that an actual population difference exists.
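
The two-way computation generalizes to any table of observed frequencies, as the following sketch suggests. It is an illustration rather than part of the module: `two_way_chi_square` is a hypothetical helper name, and because the cell contributions are not rounded before summing, the result is about 2.67 rather than the 2.66 obtained by adding the rounded terms.

```python
def two_way_chi_square(table):
    """Return (chi_square, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            # fe = (row marginal total)(column marginal total) / N
            fe = row_totals[i] * col_totals[j] / n
            chi_square += (fo - fe) ** 2 / fe
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Rows: permissive, not permissive; columns: liberals, conservatives
observed = [
    [5, 10],
    [15, 10],
]
chi_square, df = two_way_chi_square(observed)
critical = 3.84  # table value for alpha = .05 and df = 1

print(f"chi-square = {chi_square:.2f}, df = {df}")
print("Reject H0" if chi_square > critical else "Do not reject H0")
```

The same function can be applied to larger tables, such as a 3 x 3 table, without modification.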

Module 6
Hypothesis Testing
Name __________________________________________________________

Course and Year _________________________________________________

Exercise 6.1.

1. Suppose you are testing Ho: p = .45 versus Ha: p > .45. A random sample of 310 people produces a value
of p̂ = .465. Use α = .05 to test this hypothesis.

2. Instructions: Use the five-step hypothesis testing procedure.

A criminologist was interested in whether there was disparity in sentencing based on the race of the
defendant. She selected at random 18 burglary convictions and compared the prison terms given
to the 10 whites and 8 blacks sampled. The sentence lengths (in years) are shown for the white
and black offenders. Using the data, test the null hypothesis that whites and blacks convicted of
burglary in this jurisdiction do not differ with respect to prison sentence length.

Black 4 8 7 3 5 4 5 4
White 3 5 4 7 5 5 6 4 3 2

Note: Follow all the steps and round off to two decimal places where needed.

Analysis of Variance (ANOVA)


Exercise 6.2.

1. Compute a one-way ANOVA on the following data. Use the .05 level of significance.

1 2 3
2 5 3
1 3 4
3 6 5
3 4 5
2 5 3
1 5

Determine the observed F value. Compare the observed F value with the critical table F value and
decide whether to reject the null hypothesis.

Note: Use a separate sheet of long bond paper.


Chi-Square
Exercise 6.3
For the following problems, solve for the chi-square value, following the steps shown in the worked
examples.

1. Comparing Several Groups

For purposes of illustrating the step-by-step computation of chi-square with several groups, let us
imagine that we are investigating the relationship between religion and child-rearing methods. In this
example we will be drawing information from three random samples: 32 Protestants; 30 Roman
Catholics; and 27 Jews. We categorize the rearing methods as permissive, moderate, or authoritarian.

                 Protestants   Catholics   Jewish   Total
Permissive       7             9           14       30
Moderate         10            10          8        28
Authoritarian    10            10          8        28
Total            27            29          30       86
Module 7

LINEAR CORRELATION AND REGRESSION

Objectives

• determine the correlation coefficient using the Pearson product-moment formula and interpret the results
• use the Spearman rank method to find the rank-order correlation coefficient and interpret the result
• interpret the results of regression

Correlation measures the association, or the strength of the relationship, between two variables, say x and y.

Definitions.

Two variables are positively correlated if the values of both variables increase together.

Two variables are negatively correlated if the values of one variable increase while the values of
the other decrease.

Two variables are not correlated, or have zero correlation, if one variable neither increases
nor decreases while the other increases.

Verbal Interpretation

The degree of correlation can be determined from the correlation coefficient. Its value is interpreted as
shown in the table below.

r Verbal Interpretation
0.00 No correlation
± 0.01 to ± 0.20 Slight Correlation
± 0.21 to ± 0.40 Low Correlation
± 0.41 to ± 0.70 Moderate Correlation
± 0.71 to ± 0.80 High Correlation
± 0.81 to ± 0.99 Very High Correlation
± 1.0 Perfect Correlation

Pearson Product-Moment Correlation (r)

The Pearson r is the most familiar statistical tool for quantifying the linear relationship between two random
variables, x and y.

Data are parametric (numerical measurement describing a characteristic of a sample).

Formula

$$r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}}$$

Steps in Solving Correlation

1. State the null hypothesis (Ho) and the alternative hypothesis (Ha).
2. Determine the tabular value (TV), using degrees of freedom df = N - 2.
3. Determine the computed value (CV).
4. State the conclusion.
a. Decision: (i) if the absolute value of the computed r is greater than the tabular r (|rc| > rt), reject Ho;
(ii) if |rc| < rt, do not reject Ho.

Example 1.

Calculate and analyze the correlation coefficient between the number of study hours and the
number of sleeping hours of different students at the 0.05 level of significance.

Number of study hours (x)       2     4     6     8     10
Number of sleeping hours (y)    10    9     8     7     6

Solution:

1. Ho: There is no significant relationship between the number of study hours and the number of sleeping
hours of different students.
Ha: There is a significant relationship between the number of study hours and the number of sleeping
hours of different students.
2. Tabular value: α = 0.05 and df = N - 2 = 5 - 2 = 3; r(3, 0.05) = 0.878
3. Computed value:

Student   x         y         xy          x²          y²
1         2         10        20          4           100
2         4         9         36          16          81
3         6         8         48          36          64
4         8         7         56          64          49
5         10        6         60          100         36
N = 5     Σx = 30   Σy = 40   Σxy = 220   Σx² = 220   Σy² = 330

$$r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{5(220) - (30)(40)}{\sqrt{\left[5(220) - (30)^2\right]\left[5(330) - (40)^2\right]}} = \frac{1{,}100 - 1{,}200}{\sqrt{(200)(50)}} = \frac{-100}{100} = -1$$

4. Conclusion
The computed r = -1 has an absolute value of 1, which is greater than the tabulated value of 0.878, so we reject Ho. This
implies that there is a significant relationship between the number of study hours and the number of
sleeping hours of different students. The value r = -1 also indicates a perfect negative correlation: as study hours
increase, sleeping hours decrease.
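
The Pearson computation for this example can be sketched in Python. This is a minimal sketch with a hypothetical helper name, `pearson_r`, that applies the raw-score formula given under "Formula"; the decision compares the absolute value of r with the tabular value 0.878 quoted in the solution.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


study_hours = [2, 4, 6, 8, 10]
sleeping_hours = [10, 9, 8, 7, 6]

r = pearson_r(study_hours, sleeping_hours)
tabular_r = 0.878  # table value for alpha = .05 and df = N - 2 = 3

print(f"r = {r:.2f}")
# The sign of r gives the direction; its absolute value is compared
# with the tabular value to decide on significance.
print("Reject Ho" if abs(r) > tabular_r else "Do not reject Ho")
```

The comparison uses the absolute value because a strong negative correlation is as far from zero as a strong positive one.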

Linear Regression

Regression analysis is a very powerful tool in statistical analysis, specifically for predicting the
value of one variable from a given value of another variable to which it is related.
It is therefore used when predicting the behaviour of a variable. The regression equation explains the amount
of variation in the dependent variable y that is accounted for by the independent variable x. It is the equation of a straight line.

The purpose of regression is to determine the trend of the two variables as related to each other, whether the
trend is rising or falling.

Formula:

y = a + bx
where: y = criterion measure

x = predictor
a = ordinate, or the point where the regression line crosses the y-axis, and
b = beta weight, or the slope of the line.
To get the regression equation, the values of a and b are computed using the formulas below.

$$a = \frac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\sum x^2 - \left(\sum x\right)^2}$$

$$b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$$

where: n = number of pairs

Example 1.

The data in the table represent the membership at a university mathematics club during the past 5 years.

Number of Years (x) Membership (y)


1 25
2 30
3 32
4 45
5 50

Fit a line of the form y = a + bx to predict the membership 5 years from now.

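Solution:

x          y          x²          xy
1          25         1           25
2          30         4           60
3          32         9           96
4          45         16          180
5          50         25          250
Σx = 15    Σy = 182   Σx² = 55    Σxy = 611

$$a = \frac{(182)(55) - (15)(611)}{5(55) - (15)^2} = \frac{10{,}010 - 9{,}165}{275 - 225} = \frac{845}{50} = 16.9$$

$$b = \frac{5(611) - (15)(182)}{5(55) - (15)^2} = \frac{3{,}055 - 2{,}730}{275 - 225} = \frac{325}{50} = 6.5$$

The equation is y = a + bx: y = 16.9 + 6.5x

Since you need to predict the membership five years from now, that is, at year 10, substitute 10 for x in the equation.
Thus, 5 years from now, y = 16.9 + 6.5(10) = 81.9 ≈ 82.

Therefore, five years from now, the club would have about 82 members.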
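
The fit and the prediction can be checked with a short Python sketch. It is illustrative only: `fit_line` is a hypothetical helper that computes a and b with the usual least-squares sums (Σx, Σy, Σx², Σxy), which reproduce the equation y = 16.9 + 6.5x stated in the solution.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator       # slope
    return a, b


years = [1, 2, 3, 4, 5]
membership = [25, 30, 32, 45, 50]

a, b = fit_line(years, membership)
prediction = a + b * 10  # year 10 is five years after the last data point

print(f"y = {a:.1f} + {b:.1f}x")
print(f"Predicted membership at year 10: {prediction:.1f}")
```

Predicting at year 10 extrapolates well beyond the observed range of years 1 to 5, so the figure is best read as a trend estimate.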

References

• Dalisay, Clarenz, LPT, et al. (2018). Mathematics in the Modern World. Our Lady of Fatima University.
• Antivola, Hermelita M., et al. (2015). Business Statistics: A Modular Approach. Philippines: Books Atbp.
Publishing Corp.

Module 7

LINEAR CORRELATION AND REGRESSION

EXERCISES 7.1

Name __________________________________________________________

Course and Year _________________________________________________

1. Solve the following using Pearson r and test the significance of the correlation at the 5% level using the
tabular value. Interpret the results.

X 12 14 10 12 14 12 11
Y 59 60 34 55 77 80 50

2. Use regression analysis to predict the grade of a student in Mathematics if the student's grade in Science is:
A. 77
B. 65
C. 89

Student SCIENCE MATHEMATICS


1 87 70
2 77 75
3 88 80
4 98 95
5 69 70
6 81 89
7 90 95
8 85 80
9 85 83
10 86 84
