
statics 1 and 2 (1)

The document provides an introduction to statistics, defining it as a branch of mathematics focused on data analysis for decision-making under uncertainty. It categorizes statistics into descriptive and inferential types, outlines the stages of statistical investigation, and discusses basic terminologies such as population, sample, and variable. Additionally, it highlights the applications, uses, limitations of statistics, and methods of data collection, emphasizing its importance across various fields including business, education, and finance.

Uploaded by

Yilkal Abere
Copyright
© © All Rights Reserved

Basic Statistics for Accounting and Finance

CHAPTER ONE
INTRODUCTION TO STATISTICS
1.1. Definition of statistics
Statistics has been defined in several ways. Some of these definitions are given below.
➢ Statistics is a branch of mathematics that consists of a set of analytical techniques that can be
applied to data to help in making judgments and decisions in problems involving uncertainty.
➢ Statistics is a scientific discipline that consists of procedures for collecting, describing, analyzing and
interpreting numerical data.
➢ Statistics is a body of principles and methods concerned with extracting useful information from
a set of numerical data.
➢ In general, its meaning can be grouped into two entirely different senses: the plural
sense and the singular sense.
Plural sense (statistical data): statistics is defined as aggregates of numerically expressed facts or
figures collected in a systematic manner for a pre-determined purpose.
Singular sense (statistical methods): statistics refers to the collection, organization, presentation,
analysis, and interpretation of numerical data to make inferences and reach decisions in all branches
of economics, business, medicine, and other social and physical sciences.

1.2 Classification of Statistics


Statistics is broadly divided into two categories based on how the
collected data are used: descriptive and inferential statistics.
1. Descriptive Statistics: It refers to the procedures used to organize and summarize masses of data
without drawing conclusions beyond the data themselves. The methodologies of descriptive statistics include
methods of data organization, such as classification, tabulation and constructing frequency distributions;
methods of data presentation, such as diagrammatic and graphical displays; and calculations of certain
indicators of data, such as measures of central tendency and measures of dispersion (variation).
Example 1.1: The mean salary of a random sample of 60 high school teachers in 2013 was 5,820 birr.

2. Inferential Statistics: It deals with making inferences and/or conclusions about a population based
on data obtained from a sample of observations. It consists of performing hypothesis testing,
determining relationships among variables and making predictions. Inferential statistics includes the
methods used to find out something about a population based on the sample.
Example-1.2:
➢ Based on a survey, the mean weekly hours of TV watched by teenagers in Debre Markos is 8.8 hours.
➢ In 2030, Debre Markos town’s population is expected to be 2.2 million.

Lecture notes DMU Page 1


1.3 Stages In Statistical Investigation
According to the singular sense definition of statistics, a statistical investigation involves five stages: data
collection, organization, presentation, analysis and interpretation of results.
1. Collection of data: Data collection is the first stage in any statistical investigation. It involves the
process of obtaining (gathering) a set of related measurements or counts to meet predetermined
objectives. Data may be obtained from primary or secondary sources.
2. Organization of data: The second stage is describing the properties of the data in a
summary form. Editing is the first step in the organization of data, since there may be omissions,
inconsistencies, ambiguities, irrelevant answers and recording errors. Once the data are edited, the
second step is classification, which is arranging the collected data according to some common
characteristics. Such classified data can more easily be presented. The final step is tabulation, which is presenting
the classified data in tabular form using rows and columns.
3. Presentation of data: The purpose of data presentation is to give an overview of what the data actually
look like, and to facilitate statistical analysis. Data presentation can be done using diagrams and
graphs, which are easy to remember and facilitate comparison.
4. Analysis of data: The analysis of data is the extraction of summarized and comprehensive numerical
descriptions in order to reach conclusions or provide answers to a problem. The basic purpose of data
analysis is to make the data useful for drawing conclusions. The analysis may require techniques ranging from simple to
sophisticated mathematical methods.
5. Interpretation of results: This is the last stage of statistical investigation. Once the data have been
analyzed, some numerical value(s) are obtained. The main job consists of attaching physical
meaning or interpretation to these numerical results. It involves interpreting the values computed in
analyzing the data in order to form valid conclusions and inferences.
1.4 Some Basic Terminologies
1. Population: It is the totality of things, objects, people, etc., about which information is being collected.
It is the totality of observations with which the researcher is concerned.
Examples: All clients of a telephone company
➢ All students of Debre Markos University (DMU)
➢ Population of families, etc.
2. Sample: is a part or subset of the population under study.
3. Sampling frame: is the list of all possible units of the population from which the sample can be drawn.
Example: List of all students of DMU, List of all residential houses in Debre Markos town, etc
4. Survey: is an investigation of a certain population to assess its characteristics. It may be census or
sample.
➢ Census survey: A complete enumeration of the population under study.

➢ Sample survey: the process of collecting data covering a representative part or portion of a
population.
5. Parameter: is a descriptive measure of a population, or summary value calculated from a population.
6. Statistic: is a descriptive measure of a sample, or a summary value calculated from a sample.
7. Sampling: The process or method of sample selection from the population.
8. Sample size: The number of elements or observation to be included in the sample.
9. Variable: It is an item of interest that can take numerical or non-numerical values for different
elements. It may be qualitative or quantitative.
Examples: Marital status, Family size, Price of a commodity, Monthly income, Height of person,
Weight of person, etc.

Variables can be classified into two types:


1. Qualitative variables: are variables that assume non-numerical values. They can be categorized
and they are usually called attributes.
Example: Sex, marital status, ID number, Hair color, Blood group, Residence (urban, rural), etc.
2. Quantitative variables: are variables which assume numerical values.
Example: Age, Height, Weight, Income, Expenditure, Family size, etc.
There are two types of Quantitative Variable
I. Discrete Variable: Discrete variables are those variables that assume whole number values
and consist of distinct and recognizable individual elements that can be counted.
Examples: Number of children in a family, Number of students, Size of shoes
II. Continuous variable: A continuous variable has a set of possible values that includes all values in
an interval of the real line, with no gaps between possible values.
Examples: Height, Weight, Age, Income, Price, time, temperature etc.
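The distinction between a parameter and a statistic, and the role of sampling and sample size, can be illustrated with a small sketch. The population below is made up purely for illustration (1,000 hypothetical monthly incomes in birr); only the terminology comes from the definitions above.

```python
import random
import statistics

# Hypothetical population: monthly incomes (in birr) of 1,000 people.
# These values are invented for illustration only.
random.seed(1)  # fixed seed so the sketch is reproducible
population = [random.randint(2000, 12000) for _ in range(1000)]

# Parameter: a summary value calculated from the whole population.
population_mean = statistics.mean(population)

# Sampling: select `sample_size` elements from the population
# (simple random sampling without replacement).
sample_size = 60
sample = random.sample(population, sample_size)

# Statistic: the same summary value calculated from the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.1f}")
print(f"Statistic (sample mean):     {sample_mean:.1f}")
```

The two printed values are usually close but not equal, which is why a statistic is treated as an estimate of the corresponding parameter.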
1.5 Applications, uses and limitations of Statistics
1.5.1 Application of Statistics in various fields
Statistics can be applied in any field of study which seeks quantitative evidence. For instance:
In Business: In a competitive business environment, business people face problems such as shortages, overstocking,
and economic crises, which can be addressed through statistical analysis. To a great extent, statistics
helps business people maximize their profit.
In Education: Statistics is widely used in education for research purposes. It is used to test past
knowledge and to develop new knowledge.
In Accounting: Correlation analysis between profit and sales is widely used in accounting.
➢ Accounting uses statistical methods to select samples for auditing purposes and to understand the cost
drivers in cost accounting.

In Finance: Finance uses statistical methods to choose between alternative portfolio investments and to track
trends in financial measures over time.
In Banking: With fast-developing technology, the banking sector needs a lot of information about
present and future business development.
In Investment Decision: Statistics helps investors in selecting securities that are safe and that yield a
good return and appreciation in the market price.
In Insurance: Statistics is extensively used in the field of insurance. Most insurance companies use actuarial
statistics to fix premium rates, which are based on mortality tables.
In management: Statistical tools are used widely by business enterprises for the promotion of new
business.
➢ It uses statistical methods to improve the quality of the products manufactured or the services
delivered by an organization.
➢ It also helps in the assessment of quantum of product to be manufactured, the amount of raw
material, labor needed, marketing avenues for the product and the competitive products in the
market and so on.
In Marketing: Marketing uses statistical methods to estimate the proportion of customers who prefer one product
over another (and why), and to draw conclusions about which advertising strategy might be most useful
in increasing sales.
In Industry: Statistics is used in quality control through control charts, which are based on the theory
of probability and the normal distribution, and through inspection, which is based on sampling techniques.
1.5.2 Function/Uses of Statistics
Today the field of statistics is recognized as a highly useful tool in the decision-making process of managers
in modern business and industry, where technology changes frequently. It has many functions in everyday
activities. The following are some uses of statistics:
➢ Statistics condenses and summarizes a mass of data.
➢ Statistics facilitates comparison of data.
➢ Statistics helps to predict future trends.
➢ Statistics helps to formulate and review policies.
➢ Statistics helps in formulating and testing hypotheses.
1.5.3 Limitations of Statistics
The field of statistics, though widely used in all areas of human knowledge and widely applied in a
variety of disciplines such as engineering, economics and research, has its own limitations. Some of
these limitations are:

1. It does not deal with individual values: as discussed earlier, statistics deals with aggregates of facts.
For example, the wage earned by an individual worker at any one time, taken by itself, is not statistics.
2. It does not deal with qualitative characteristics directly: statistics is not applicable to qualitative
characteristics such as beauty, honesty, poverty, standard of living and so on, since these cannot be
expressed in quantitative terms. Such characteristics can be studied only indirectly, through a related
numerical measure. For example, intelligence may be compared to some degree by
comparing IQs or some other scores in certain intelligence tests.
3. Statistical conclusions are not universally true: since statistics is not an exact science, as is the case
with natural sciences, the statistical conclusions are true only under certain assumptions.
4. It can be misused: statistics cannot be used to full advantage in the absence of proper understanding
of the subject matter.
1.6 Scale of Measurement
➢ Measurement is the assignment of values to objects or events in a systematic fashion or according
to rules. Variables can be measured under four levels or scales of measurement.
➢ The measurement scales are:
1. Nominal scale (qualitative data/variable)
2. Ordinal scale (qualitative data/variable)
3. Interval scale (quantitative data/variable)
4. Ratio scale (quantitative data/variable)
1. Nominal scale: The nominal level of measurement classifies the data into categories on which no
meaningful order or ranking can be imposed. The values of a nominal attribute are just
different names, i.e., nominal attributes provide only enough information to distinguish one object
from another. They are qualities with no ranking or ordering and no numerical or quantitative value. These types
of data consist of names, labels and categories. Arithmetic operations (+, −, ∗, ÷) are not
applicable and order comparisons (<, >) are impossible; values can only be checked for being the same or different (=, ≠).
Example 1.3: Religion, Marital Status, Blood Group, Nationality, Eye color, product type
2. Ordinal scale: Ordinal data are nominal data that can be ordered or ranked. The ordinal level of
measurement classifies the data into categories that can be meaningfully ranked or ordered.
➢ The values can be arranged in some order, but the differences between the data values are meaningless.
➢ Data consisting of an ordering or ranking of measurements are said to be on an ordinal scale of
measurement. The values of an ordinal scale provide enough information to order objects.
➢ One value is different from, and greater/better/less than, the other.
➢ Arithmetic operations (+, −, ∗, ÷) are impossible; comparison (<, >, ≠) is possible.
Example 1.4: Economic Status, Level of Education, Beauty, Letter grading (A, B, C, D, F), Rating
scales (excellent, very good, good, fair, poor), military status (general, colonel, lieutenant).

3. Interval scale: Interval data are ordinal data for which the differences between data values are
meaningful. However, there is no true zero, or starting point, and the ratios of data values are
meaningless. Note: Celsius and Fahrenheit temperature readings have no meaningful zero, so their ratios
are meaningless. In this measurement scale:
➢ One value is different from, and better/greater than, another by a certain amount of difference.
➢ Arithmetic operations are possible except multiplication and division.
➢ Comparisons are possible/applicable.
Example: IQ, temperature, dates on a calendar.
4. Ratio scale: Similar to interval, except there is a true zero (absolute absence), or starting point, and
the ratios of data values have meaning.
➢ Arithmetic operations (+, −,∗,÷) and comparison (<, >, ≠) are possible/ applicable.
➢ One value is different from, and larger/taller/better/less than, the other by a certain amount of difference
and by a certain number of times.
➢ This measurement scale provides better information than interval scale of measurement.
Example 1.5: Height, Weight, Age, Income, Price, Expenditure, number of students.
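The difference between the interval and ratio scales can be checked numerically. The following is a minimal sketch with made-up values (two temperatures and two weights, chosen only for illustration): a ratio of interval-scale values changes when the unit changes, whereas a ratio of ratio-scale values does not.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32

# Interval scale: two illustrative temperature readings.
a_c, b_c = 20.0, 10.0
a_f, b_f = celsius_to_fahrenheit(a_c), celsius_to_fahrenheit(b_c)

ratio_c = a_c / b_c   # ratio of the Celsius readings
ratio_f = a_f / b_f   # ratio of the same readings in Fahrenheit

# The two ratios disagree, so "twice as hot" has no meaning here.
print(ratio_c, ratio_f)

# Ratio scale: two illustrative weights, in kilograms and in grams.
w1_kg, w2_kg = 80.0, 40.0
ratio_kg = w1_kg / w2_kg
ratio_g = (w1_kg * 1000) / (w2_kg * 1000)

# The ratio is the same in both units, so "twice as heavy" is meaningful.
print(ratio_kg, ratio_g)
```

Differences on the interval scale remain comparable across units (a 10-degree Celsius gap is always an 18-degree Fahrenheit gap), which is why addition and subtraction are allowed but multiplication and division are not.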
Exercise:
1. In each statement, state whether descriptive or inferential statistics has been used:
A. By 2040 at least 3.5 billion people will run short of water (World Future Society).
B. Nine out of ten of the job fatalities are men.
C. Expenditures for the cable industry were 5.66 billion in 1996.
D. Allergy therapy makes bees go away.
E. Drinking decaffeinated coffee can raise cholesterol levels by 7%.
F. The national average annual medicine expenditure per person is birr 1052.
G. Experts say the mortgage rates may soon hit bottom.
2. A research study is to be conducted to determine the level of language proficiency and numeracy skills
among the 700 education and 300 management graduating students at Debre Markos University.
The researcher wants to select a representative sample of 300 students from the two programs.
A. What is the population of the study?
B. What is the sample in the study?
C. What are the variables of the study? What is the level of measurement of each variable?

CHAPTER TWO
METHODS OF DATA COLLECTION AND PRESENTATION
2.1 Methods of Data Collection
Data: is the raw material of statistics. It can be obtained either by measurement or counting.
Data collection is the process of gathering and measuring information on variables of interest, in an
established systematic fashion that enables one to answer stated research questions, test hypotheses, and
evaluate outcomes.
2.1.1 Sources of data
There are many ways of classifying data. A common classification is based upon who collected the data.
Statistical data may be classified under two categories depending upon the source.
1. Primary data: Data collected by the investigator himself or herself for the purpose of a specific inquiry or
study. Such data are original in character and are mostly generated by surveys conducted by
individuals or research institutions. Primary data are first-hand information collected, compiled and published
by an organization for some purpose. They are the most original data in character and have not undergone
any sort of statistical treatment. They are more reliable and accurate, since the investigator can extract the
correct information by removing doubts, if any, in the minds of the respondents regarding certain
questions. The following are the methods of collecting primary data:
➔ Direct personal investigation
➔ Indirect oral interview
➔ Information from correspondents
➔ Schedule sent through enumerators
➔ Mailed questionnaire method
I. Direct Personal Investigation: In this method data are collected personally by the investigator
(organizing agency) from the source concerned i.e. from the person about whom information is
being collected.
➔ Example: If information about a student is required, it is obtained from the student directly.
II. Indirect oral interview: In this method data are collected by the investigator from the
persons who have the knowledge about the person/unit about whom/which the information is
being collected.
➔ Example: if the information for a student is required, it may be collected from his/her
teacher.
III. Information from correspondents: In this method local agents/ correspondents are
appointed to collect information; normally used by newspapers/government to collect
information about the various events.
➔ Used for reporting case(s) of a disease or death/birth in a village; collection of prices for
construction of indices, etc.

IV. Schedule Sent through Enumerators: In this method enumerators get replies from the
informants and fill in the schedule [Questionnaire filled by enumerator] in their own
handwriting.
V. Mailed questionnaire method: A list of questions relating to the field of enquiry having
space for answers to be filled by the respondents constitutes a questionnaire.
➔ List of Questions (Questionnaire) is prepared and sent to the informants with covering
note by post with a request to supply the relevant information.
2. Secondary data: When an investigator uses data that have already been collected by others, such
data are called secondary data. Secondary data are second-hand information already
collected by someone (an organization) for some purpose and available for the present study.
Examples of secondary data sources: books, reports, magazines, newspapers, research organizations, etc.
When the source is secondary data, check that:
➢ The type and objective of the situation are known.
➢ The purpose for which the data were collected is compatible with the present problem.
➢ The nature and classification of the data are appropriate to the problem.
➢ There are no biases or misreporting in the published data.
Note: Data which are primary for one may be secondary for the other.
2.2 Methods of Data Presentation
Having collected and edited the data, the next important step is to organize it, that is, to present it in a
readily comprehensible, condensed form that helps in drawing inferences from it. The presentation
of data is broadly classified into the following two categories:
✓ Tabular presentation
✓ Diagrammatic and Graphic presentation.
2.2.1. Tabular presentation
Tabulation is the process of placing classified data into tabular form. A table is a systematic arrangement
of statistical data in rows and columns. Rows are horizontal arrangements whereas columns are vertical
arrangements. A table may be simple, double or complex depending upon the type of classification.
The process of arranging data into classes or categories according to similarities is technically called
classification. It eliminates inconsistency and brings out the points of similarity or dissimilarity of
collected data.
Basis of Classification
Statistical data are classified after taking into account the nature, scope, and purpose of an investigation.
Generally, data are classified on the following four bases:

1. Geographical Classification: Data are classified with reference to geographical locations
such as countries, states, cities, regions, and districts.
2. Chronological Classification: Data are grouped according to time.
3. Qualitative Classification: Data are classified on the basis of some attribute or qualitative
phenomenon such as religion, sex, marital status, literacy, occupation, honesty, beauty and the like.
4. Quantitative Classification: Data are classified on the basis of some measurable characteristic
such as height, weight, age, income, marks of students, prices, production, sales and the like.
2.2.1 Frequency distribution
➢ Frequency distribution is the organization of raw data in table form, using classes and
frequencies. For a small data set, the table may simply list each distinct data value and its frequency.
➢ Class is a quantitative or qualitative category. A class may be a range of numerical values (that
acts like a “category”) or an actual category.
➢ Frequency is the number of data values contained in a specific class, or the number
of times a particular data value occurs in the set of data.
➢ Relative frequency is the proportion (or percent) of observations within a category and it is found
by:
$$\text{relative frequency} = \frac{\text{frequency of the category (class)}}{\text{sum of all frequencies}} = \frac{f_i}{n}$$
➢ A relative frequency distribution lists each category of data together with the relative
frequency.
➢ Generally, there are three basic types of frequency distributions:
1. Categorical FD
2. Ungrouped FD
3. Grouped FD
1. Categorical Frequency Distribution: A categorical frequency distribution is used to organize and
summarize qualitative data, such as nominal and ordinal level data, using tables.
For instance data on blood types of people, political affiliation, economic status (low, medium and
high), religious affiliation are presented by categorical frequency distributions.
Example 2.1: Twenty five patients were given a blood test to determine their blood type. The data is
as shown below: A B B AB O O O B AB B B B O A O O O AB AB A O O B A.
Solution:
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, & AB.
These types will be used as the classes for the distribution. The procedure for constructing a frequency
distribution for categorical data is given below.

Step 1: Make a table as shown below.
Class   Tally   Frequency   Percent
A
B
AB
O
Step 2: Tally the data and place the result under the column Tally.
Step 3: Count the tallies and place the result under the column Frequency.
Step 4: Find the percentage of values in each class using the formula % = f/n * 100%, where f is the
frequency of the class and n is the total number of observations.
Class   Tally        Frequency   Percent
A       /////        5           5/25 * 100 = 20%
B       ///// //     7           7/25 * 100 = 28%
AB      ////         4           4/25 * 100 = 16%
O       ///// ////   9           9/25 * 100 = 36%
Example 2.2: A researcher evaluated the taste of four leading brands of instant coffee by having a
sample of 80 individuals taste each coffee and then select their favorite. The results are given:
A D B B D A D B C A B B C A D A B D B B
D B C B B A D C B B A A D C B B A B A B
B B A C D D A B D D B B C B A D D A B D
C D B C B B B D A B B C B D B B D D B B
A. Construct a Frequency table.
B. Add a relative frequency column to the distribution you constructed in part (A).
C. What percent of the people chose Brand C as their favorite?
D. How many people chose Brand A as their favorite?
E. What percent of the people chose Brand B or Brand D as their favorite?
➢ Solution: Since the data are categorical, we can take the four coffee brands as classes and
construct a frequency distribution as shown below for questions A and B.
Coffee brand   Tally                                       Frequency   Relative frequency
A              ///// ///// /////                           15          15/80 = 0.1875
B              ///// ///// ///// ///// ///// ///// /////   35          35/80 = 0.4375
C              ///// /////                                 10          10/80 = 0.125
D              ///// ///// ///// /////                     20          20/80 = 0.25
Total                                                      80          1.00
C. What percent of the people chose Brand C as their favorite?
➢ 0.125 ∗ 100% = 12.5%, so 12.5 percent of the people chose Brand C coffee.
D. How many people chose Brand A as their favorite?
➢ 15 people chose Brand A coffee.

E. What percent of the people chose Brand B or Brand D as their favorite?
➢ (0.4375 + 0.25) ∗ 100% = (0.6875) ∗ 100% = 68.75%
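The tallying in Example 2.2 can also be reproduced programmatically. The sketch below is one possible approach, not part of the original notes: it uses Python's `collections.Counter` to count the 80 responses listed in the example and then derives the relative frequencies and the percentages asked for in parts C and E. The variable names are chosen here for illustration.

```python
from collections import Counter

# The 80 responses from Example 2.2, one string per printed row.
rows = [
    "A D B B D A D B C A B B C A D A B D B B",
    "D B C B B A D C B B A A D C B B A B A B",
    "B B A C D D A B D D B B C B A D D A B D",
    "C D B C B B B D A B B C B D B B D D B B",
]
responses = " ".join(rows).split()
n = len(responses)

# Frequency: number of times each brand was chosen.
freq = Counter(responses)

# Relative frequency: f_i / n for each class.
rel_freq = {brand: freq[brand] / n for brand in sorted(freq)}

print("Brand  Frequency  Relative frequency")
for brand in sorted(freq):
    print(f"{brand:<6} {freq[brand]:<10} {rel_freq[brand]:.4f}")

# Part C: percent choosing Brand C; Part E: percent choosing B or D.
pct_c = rel_freq["C"] * 100
pct_b_or_d = (rel_freq["B"] + rel_freq["D"]) * 100
print(pct_c, pct_b_or_d)
```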
2. Ungrouped Frequency Distribution (UFD)
Ungrouped data are data given as individual data points. An ungrouped frequency distribution is a table of
all raw score values that occur in the data along with their corresponding
frequencies. It is often constructed for a small data set or a discrete variable.
Constructing ungrouped frequency distribution:
To construct an ungrouped frequency distribution,
✓ First find the smallest and largest raw score in the collected data.
✓ Arrange the data in order of magnitude and count the frequency.
✓ To facilitate counting one may include a column of tallies.
Example 2.3: The following data are the ages in years of 20 women who attended a health education program last
year: 30, 41, 39, 41, 32, 29, 35, 31, 30, 36, 33, 36, 32, 42, 30, 35, 37, 32, 30, and 41. Construct a
frequency distribution for these data.
Solution:
Step 1: Find the smallest and the largest raw scores in the collected data. Smallest=29 and largest=42
Step 2: Construct a table, tally the data and complete the frequency column. The frequency distribution
becomes as follows.
Age 29 30 31 32 33 35 36 37 39 41 42
Tally / //// / /// / // // / / /// /
Frequency 1 4 1 3 1 2 2 1 1 3 1
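As a cross-check of Example 2.3, the ungrouped frequency distribution can be built in a few lines. This is a minimal sketch using `collections.Counter`; the variable names are illustrative and the ages are those listed in the example.

```python
from collections import Counter

# Ages of the 20 women from Example 2.3.
ages = [30, 41, 39, 41, 32, 29, 35, 31, 30, 36,
        33, 36, 32, 42, 30, 35, 37, 32, 30, 41]

# Step 1: smallest and largest raw scores.
smallest, largest = min(ages), max(ages)

# Step 2: count each distinct value and list the values in increasing order.
freq = Counter(ages)
ungrouped_fd = sorted(freq.items())

print(f"Smallest = {smallest}, Largest = {largest}")
print("Age  Frequency")
for age, count in ungrouped_fd:
    print(f"{age:<4} {count}")
```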

3. Grouped Frequency Distribution (GFD): Like an ungrouped frequency distribution, a grouped
frequency distribution is used for numerical data, but in a grouped frequency distribution several values of
a variable are grouped into one class, and the number of observations belonging to the class is the
frequency of that class. This frequency distribution is also called a continuous frequency distribution.
Example 2.4: Suppose the table below is the frequency distribution of test score of 50 students.
Class limit Class boundary Frequency
11 − 15 10.5 − 15.5 7
16 − 20 15.5 − 20.5 8
21 − 25 20.5 − 25.5 10
26 − 30 25.5 − 30.5 12
31 − 35 30.5 − 35.5 9
36 − 40 35.5 − 40.5 4
I. Class Limits: The lowest and highest values that can be included in a class are called class limits.
The lowest values are called lower class limits and the highest values are called upper class limits.

For example: Class limit for the first class is 11 − 15, where 11 is the lower-class limit and 15 is the
upper class limit of the first class.
II. Class Boundaries: Class boundaries are class limits adjusted so that there is no gap between the upper
boundary of one class and the lower boundary of the next class. The lowest values are called lower class
boundaries (LCB) and the highest values are called upper class boundaries (UCB). The class boundary for the
first class is 10.5 − 15.5, where the lower class boundary is 10.5 and the upper class boundary is 15.5.
Note that the UCB of one class is the LCB of the next class.
III. Class Width: It is the difference between UCB and LCB of a certain class. It is also the difference
between the lower limits of two consecutive classes or it is the difference between upper limits of
two consecutive classes. That is,
𝑊 = 𝑈𝐶𝐵 − 𝐿𝐶𝐵 𝑜𝑟 𝑊 = 𝐿𝐶𝐿𝑖 − 𝐿𝐶𝐿𝑖−1 𝑜𝑟 𝑊 = 𝑈𝐶𝐿𝑖 − 𝑈𝐶𝐿𝑖−1
The class width of the above frequency distribution is
𝑊 = 15.5 − 10.5 = 5 𝑜𝑟 𝑊 = 16 − 11 = 5 𝑜𝑟 𝑊 = 20 − 15 = 5.
IV. Class Mark: is the halfway point between the class limits or the class boundaries of a certain class.
$$CM_i = \frac{LCL_i + UCL_i}{2} = \frac{LCB_i + UCB_i}{2}$$
Class marks of the above distribution are $CM_1 = 13$, $CM_2 = 18$, $CM_3 = 23$ and $CM_4 = 28$.
Note also that $W = CM_i - CM_{i-1}$.
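The class boundaries, class width and class marks of Example 2.4 can be derived from the class limits alone. The sketch below assumes a unit of measurement of 1 (test scores are whole numbers) and obtains each boundary by moving half a unit outward from the class limit; the variable names are illustrative.

```python
# Class limits from Example 2.4 and the assumed unit of measurement.
unit = 1
class_limits = [(11, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]

# Class boundaries: half a unit below the LCL and half a unit above the UCL.
boundaries = [(lcl - unit / 2, ucl + unit / 2) for lcl, ucl in class_limits]

# Class width: UCB - LCB of any class (all classes have equal width here).
width = boundaries[0][1] - boundaries[0][0]

# Class mark: midpoint of the class limits.
class_marks = [(lcl + ucl) / 2 for lcl, ucl in class_limits]

for (lcl, ucl), (lcb, ucb), mark in zip(class_limits, boundaries, class_marks):
    print(f"{lcl}-{ucl}   {lcb}-{ucb}   {mark}")
print("Class width:", width)
```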
V. Cumulative frequency: A cumulative frequency displays the total number of observations above
(below) a certain value. It is used to determine how many or what proportion of the data values are
below or above a certain value.
➢ The less than cumulative frequency distribution (<CFD) shows the number of observations with
values smaller than the upper class boundary of a given class.
➢ The more or greater than cumulative frequency distribution (>CFD) shows the number of
observations with values larger than the lower class boundary of a given class.
Less than Cumulative Frequency      More than Cumulative Frequency
Class      Frequency                Class      Frequency
< 15.5     7                        > 10.5     50
< 20.5     15                       > 15.5     43
< 25.5     25                       > 20.5     35
< 30.5     37                       > 25.5     25
< 35.5     46                       > 30.5     13
< 40.5     50                       > 35.5     4
➢ Open-ended class: An open-ended class is a class in which at least one of the lower or upper
limits is not defined.
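Both cumulative frequency distributions for Example 2.4 follow from running totals of the class frequencies. The following is a minimal sketch using `itertools.accumulate`; the list names are illustrative.

```python
from itertools import accumulate

# Class boundaries and class frequencies from Example 2.4.
boundaries = [10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
frequencies = [7, 8, 10, 12, 9, 4]

# Less-than CF: running total from the first class, paired with each UCB.
less_than_cf = list(accumulate(frequencies))

# More-than CF: running total from the last class, paired with each LCB.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

for ucb, cf in zip(boundaries[1:], less_than_cf):
    print(f"< {ucb}: {cf}")
for lcb, cf in zip(boundaries[:-1], more_than_cf):
    print(f"> {lcb}: {cf}")
```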

Steps in Constructing a Grouped Frequency distribution
1. Find the smallest and largest observation and arrange the data in an array form (increasing or
decreasing order).
2. Find the Unit of Measurement (U). The unit of measurement is the smallest numerical difference
between any two distinct values of the data. Typical values of U are 1, 0.1, 0.01, …
3. Find the Range (R): Range is the difference between the largest and the smallest values of the
variable.
$$R = \text{Largest value} - \text{Smallest value}$$
4. Determine the number of classes (K), either by:
I. Choosing arbitrarily any number of classes between 5 and 20, or
II. Using Sturges' formula
$$K = 1 + 3.322 \log_{10} n$$
where n = total frequency.
5. Specify the class width (W), rounding up to a convenient value:
$$W = \frac{R}{K} = \frac{\text{Range}}{\text{Number of classes}}$$
6. Put the smallest value of the data set as the LCL of the first class.
➢ Then obtain the LCL of the second class by adding the class width W to the LCL of the
first class.
➢ Continue adding W until you get K classes.
➢ Let X be the smallest observation. Then
$$LCL_1 = X, \quad LCL_i = LCL_{i-1} + W \quad \text{for } i = 2, 3, \dots, K$$
7. Now obtain the UCLs of the FD by adding W − U to the corresponding LCLs.
$$UCL_i = LCL_i + (W - U) \quad \text{for } i = 1, 2, 3, \dots, K$$
8. Generate the class boundaries.
$$LCB_i = LCL_i - \frac{U}{2}, \quad UCB_i = UCL_i + \frac{U}{2} \quad \text{for } i = 1, 2, \dots, K$$
9. If necessary, calculate the class mark, relative frequency (RF) and relative cumulative frequency (RCF).
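The steps above can be sketched as a short function. This is an illustrative implementation under stated assumptions, not a definitive one: the function name and the dictionary layout are invented here, K is taken from Sturges' formula and rounded to the nearest integer, W is rounded up to a multiple of U, and an extra class is added if the largest value would otherwise fall outside the last class. It is applied to the 40 mutual fund prices of Example 2.5.

```python
import math

def grouped_frequency_distribution(data, unit=1):
    """Build a grouped frequency distribution following the steps above.

    `unit` is the unit of measurement U (assumed to be 1 by default).
    Returns a list of dicts, one per class.
    """
    smallest, largest = min(data), max(data)          # step 1
    n = len(data)
    data_range = largest - smallest                   # step 3
    k = round(1 + 3.322 * math.log10(n))              # step 4 (Sturges)
    width = math.ceil(data_range / k / unit) * unit   # step 5, rounded up

    # Assumption: add classes if K classes do not reach the largest value.
    while smallest + k * width - unit < largest:
        k += 1

    table = []
    for i in range(k):
        lcl = smallest + i * width                    # step 6
        ucl = lcl + (width - unit)                    # step 7
        lcb, ucb = lcl - unit / 2, ucl + unit / 2     # step 8
        frequency = sum(1 for x in data if lcl <= x <= ucl)
        table.append({
            "limits": (lcl, ucl),
            "boundaries": (lcb, ucb),
            "class_mark": (lcl + ucl) / 2,            # step 9
            "frequency": frequency,
            "relative_frequency": frequency / n,
        })
    return table

# Mutual fund prices from Example 2.5.
prices = [10, 17, 15, 22, 11, 16, 19, 24, 29, 18, 25, 26, 32, 14, 17, 20, 23, 27, 30, 12,
          15, 18, 24, 36, 18, 15, 21, 28, 33, 38, 34, 13, 10, 16, 20, 22, 29, 29, 23, 31]

table = grouped_frequency_distribution(prices, unit=1)
for row in table:
    print(row["limits"], row["boundaries"], row["class_mark"],
          row["frequency"], row["relative_frequency"])
```

The sketch is intended for whole-number units; with U = 0.1 or 0.01, floating-point rounding in the width and boundary calculations would need extra care.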
Example 2.5: The following set of numbers represents mutual fund prices reported at the end of a week
for selected 40 nationally sold funds.
10 17 15 22 11 16 19 24 29 18 25 26 32 14 17 20 23 27 30 12
15 18 24 36 18 15 21 28 33 38 34 13 10 16 20 22 29 29 23 31
Construct a frequency distribution having a suitable number of classes.
Solution:
1. The array form of the data (increasing order):

10 10 11 12 13 14 15 15 15 16 16 17 17 18 18 18 19 20 20 21 22 22 23 23 24 24
25 26 27 28 29 29 29 30 31 32 33 34 36 38
2. 𝑈 = 11 − 10 = 1
3. 𝑅 = 38 − 10 = 28
4. 𝐾 = 1 + 3.322 log 40 = 1 + 3.322(1.602) = 6.322 ≈ 6
5. 𝑊 = 𝑅/𝐾 = 28/6 = 4.67 ≈ 5
6. 𝑊 − 𝑈 = 5 − 1 = 4

CL        CB            Class Mark   Frequency   CF(<)   CF(>)   RF              RCF(>)
10 – 14   9.5 – 14.5    12           6           6       40      6/40 = 0.150    1.000
15 – 19   14.5 – 19.5   17           11          17      34      11/40 = 0.275   0.850
20 – 24   19.5 – 24.5   22           9           26      23      9/40 = 0.225    0.575
25 – 29   24.5 – 29.5   27           7           33      14      7/40 = 0.175    0.350
30 – 34   29.5 – 34.5   32           5           38      7       5/40 = 0.125    0.175
35 – 39   34.5 – 39.5   37           2           40      2       2/40 = 0.050    0.050
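As a cross-check on Example 2.5, the short Python sketch below recounts the class frequencies from the raw prices and derives the relative and cumulative frequencies. The variable names are illustrative only; the class limits are the ones obtained in the solution.

```python
prices = [10, 17, 15, 22, 11, 16, 19, 24, 29, 18, 25, 26, 32, 14, 17, 20, 23, 27, 30, 12,
          15, 18, 24, 36, 18, 15, 21, 28, 33, 38, 34, 13, 10, 16, 20, 22, 29, 29, 23, 31]
class_limits = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]

n = len(prices)

# Frequency: number of observations between LCL and UCL (inclusive).
frequencies = [sum(lcl <= x <= ucl for x in prices) for lcl, ucl in class_limits]

# Relative frequency: class frequency divided by the total frequency.
relative = [f / n for f in frequencies]

# Less-than CF: running total of the frequencies.
less_cf, running = [], 0
for f in frequencies:
    running += f
    less_cf.append(running)

# More-than CF: observations in the class itself or in any later class.
more_cf = [n - c + f for c, f in zip(less_cf, frequencies)]

for (lcl, ucl), f, rf, lc, mc in zip(class_limits, frequencies, relative, less_cf, more_cf):
    print(f"{lcl}-{ucl}: f={f}, RF={rf:.3f}, CF(<)={lc}, CF(>)={mc}")
```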

2.2.2 Diagrammatic presentation of data:


Graphical presentation is one of the most convenient and popular ways of describing data. Data are
usually easier to understand and interpret when they are presented graphically than when they are
described in words or in a frequency table. A graph presents data simply and clearly and can highlight
their important aspects, which supports better analysis. This section covers the most commonly used
diagrammatic and graphical methods: the pie chart, bar chart, pictogram, histogram, frequency polygon
and cumulative frequency polygon. The three most commonly used diagrammatic presentations for
discrete as well as qualitative data are:
➢ Pie chart
➢ Bar chart
➢ Pictogram
1. Pie chart
A Pie chart is a circle that is divided into sections or wedges according to the percentage of frequencies
in each category of the distribution. The angle of the sector is obtained using:
𝐴𝑛𝑔𝑙𝑒 𝑜𝑓 𝑎 𝑠𝑒𝑐𝑡𝑜𝑟 = (𝑉𝑎𝑙𝑢𝑒 𝑜𝑓 𝑡ℎ𝑒 𝑝𝑎𝑟𝑡 / 𝑇ℎ𝑒 𝑤ℎ𝑜𝑙𝑒 𝑞𝑢𝑎𝑛𝑡𝑖𝑡𝑦) × 360°
Example 2.6: Draw a suitable diagram to represent the following population in a town.
Men Women Girls Boys
2500 2000 4000 1500
Solution:
Step 1: Find the percentage.

Step 2: Find the number of degrees for each class.
Step 3: Using a protractor and compass, graph each section and write its name with corresponding
percentage.
Class Frequency Percent Degree
Men 2500 25 90
Women 2000 20 72
Girls 4000 40 144
Boys 1500 15 54
Total 10000 100 360
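The percentages and angles in the table can be reproduced with a few lines of Python. This is a minimal sketch; the function name sector_angles is an assumption made for illustration.

```python
def sector_angles(values):
    """Return {category: (percent, degrees)} for a pie chart."""
    total = sum(values.values())
    return {
        name: (value * 100 / total, value * 360 / total)
        for name, value in values.items()
    }

population = {"Men": 2500, "Women": 2000, "Girls": 4000, "Boys": 1500}
for name, (percent, degrees) in sector_angles(population).items():
    print(f"{name}: {percent:.0f}% -> {degrees:.0f} degrees")
```

The angles always add up to 360 degrees, which is a quick way to check the arithmetic before drawing the sectors.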

Figure: Pie chart of the population of the town (Men 25%, Women 20%, Girls 40%, Boys 15%)

2. Bar Charts: A bar chart is constructed to show frequencies of different categories of a
categorical variable in the given data. The horizontal axis represents the different categories and
vertical axis is used to show the number of cases (frequencies) of the characteristics in which we
are interested.
➢ Bars can be drawn either vertically or horizontally.
➢ All bars must have equal width and the distance between bars must be equal.
➢ The height or length of each bar indicates the size (frequency) of the figure represented.
There are different types of bar charts. The most common are:
❖ Simple bar chart
❖ Component or sub divided bar chart.
❖ Multiple bar charts.
I. Simple bar chart
✓ They are used to display data on one variable.
✓ They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity
is represented by the height or length of the bar.
Example 2.7: The number of students in the four departments of a science college is given as follows:
Department Physics Maths Chemistry Biology
Number of students 200 400 450 600
Male 170 350 250 200
Female 30 50 200 400
Draw a simple bar chart of the number of students by department.

Solution:
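A simple bar chart of these data can be sketched in text form. The Python snippet below is a minimal illustration, not a replacement for a drawn chart; the function name text_bar_chart and the scale of 50 students per symbol are assumptions made for this sketch.

```python
def text_bar_chart(data, scale=50):
    """Return one text line per category; each '#' stands for `scale` units."""
    width = max(len(name) for name in data)
    return [
        f"{name.ljust(width)} | {'#' * round(value / scale)} {value}"
        for name, value in data.items()
    ]

students = {"Physics": 200, "Maths": 400, "Chemistry": 450, "Biology": 600}
print("\n".join(text_bar_chart(students)))
```

All bars use the same symbol and the same scale, which mirrors the rule that bars have equal width and that only their length reflects the frequency.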

II. Component Bar chart


✓ When we want to show how a total (or aggregate) is divided into its component parts, we use a
component bar chart.
✓ The bars represent the total value of a variable, with each total broken into its component parts;
different colors or designs are used to identify the components.
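To illustrate how the components are stacked, the Python sketch below uses the male and female counts from Example 2.7, whose sum is the total number of students in each department. The function name stack_segments is hypothetical; it returns, for every component, the position where its segment starts and ends within each bar.

```python
def stack_segments(components):
    """Return ({label: [(start, end), ...]}, totals) for a component bar chart."""
    n_bars = len(next(iter(components.values())))
    bottoms = [0] * n_bars
    segments = {}
    for label, values in components.items():
        # Each segment starts where the previous component ended.
        segments[label] = [(b, b + v) for b, v in zip(bottoms, values)]
        bottoms = [b + v for b, v in zip(bottoms, values)]
    return segments, bottoms

departments = ["Physics", "Maths", "Chemistry", "Biology"]
components = {"Male": [170, 350, 250, 200], "Female": [30, 50, 200, 400]}

segments, totals = stack_segments(components)
for i, name in enumerate(departments):
    print(name, segments["Male"][i], segments["Female"][i], "total:", totals[i])
```

The height of each full bar equals the total, so the same chart shows both the totals and their breakdown.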
Example 2.8: Draw a component bar chart for the following data
Year Sales Gross profit Net profit
2010 100 30 10
2011 120 40 15
2012 130 45 25
2013 150 50 25
Total 500 165 75

III. Multiple Bar charts: A multiple bar chart is used for comparisons in two or more dimensions. To
compare the magnitude of one variable in two or three aspects, or the magnitudes of two or three
variables, the rectangles in a group are placed side by side.
➢ These are used to display data on more than one variable.
➢ They are used for comparing different variables at the same time.

Example 2.9: Draw a multiple bar chart to represent the imports and exports of Ethiopia for the years
1982-1988.
Year Imports (in dollars) Exports (in dollars)
1982-83 2000 1700
1983-84 3000 2500
1984-85 1500 2000
1985-86 3000 2500
1986-87 2000 3000
1987-88 600 200
➢ Solution:
Multiple bar chart of import and export of Ethiopia

Example 2.10: The following data represent sales by product, 1957- 1959 of a given company for three
products A, B, C.
Product Sales in ($)
1957 1958 1959
A 12 14 18
B 24 21 18
C 24 35 54
Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solution:

3. Pictograph: In this diagram, data are represented by picture symbols. We choose a suitable
picture and let it stand for a definite number of units in which the variable is measured.
Example 2.11: The following table shows the orange production of a plantation for the production years
1990-1993. Represent the data by a pictogram.

Year 1990 1991 1992 1993
Production 3000 4000 4000 5000
Solution: Pictogram of the orange production from 1990-1993
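The number of symbols needed for each year follows directly from the chosen key. The Python sketch below assumes, purely for illustration, that one symbol stands for 1000 units of production and uses the letter O as the picture symbol; neither choice comes from the notes.

```python
def pictogram(data, units_per_symbol, symbol="O"):
    """Return {label: string of symbols}, one symbol per `units_per_symbol`."""
    return {
        label: symbol * (amount // units_per_symbol)
        for label, amount in data.items()
    }

production = {1990: 3000, 1991: 4000, 1992: 4000, 1993: 5000}
UNITS_PER_SYMBOL = 1000  # assumed key: one symbol = 1000 units

for year, symbols in pictogram(production, UNITS_PER_SYMBOL).items():
    print(year, symbols)
```

Values that are not exact multiples of the key would need a partial symbol, which this sketch does not handle.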

2.2.4 Graphical Presentation of data


The histogram, the frequency polygon and the cumulative frequency graph (ogive) are the most commonly
used graphical representations for continuous data.
Procedures for constructing statistical graphs:
➢ Draw and label the X and Y axis.
➢ Choose a suitable scale for the frequencies or cumulative frequencies and label it on the Y axis.
➢ Represent the class boundaries for the histogram or Ogive and the mid points for the frequency
polygon on the X axis.
➢ Plot the points.
➢ Draw the bars or lines to connect the points.
Histogram: A histogram is a graphical display of a quantitative frequency table. It is constructed by
drawing rectangles for each class of data on the xy-coordinate system. Class boundaries are placed along
the horizontal axis (X–axis). Class marks or class limits are sometimes used on the X axis instead.
➢ The widths of all rectangles are the same.
➢ Adjacent rectangles touch.
A histogram is a useful tool that can quickly communicate many traits about a set of data. A histogram
can be used to get an approximation of:
➢ central tendency
➢ variation in the data
➢ shape of the data
➢ presence of outliers
➢ minimum and maximum
Example 2.12: Construct a histogram to represent the following data.
Class limits 15-24 25-34 35-44 45-54 55-64 65-74 75-84
Frequency 3 4 10 15 12 4 2
Solution:
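Before drawing the rectangles, the class limits have to be converted to class boundaries so that adjacent bars touch. The Python sketch below does this for the data of Example 2.12, assuming a unit of measurement of U = 1; the variable names are illustrative.

```python
class_limits = [(15, 24), (25, 34), (35, 44), (45, 54), (55, 64), (65, 74), (75, 84)]
frequencies = [3, 4, 10, 15, 12, 4, 2]
U = 1  # assumed unit of measurement

# Each bar runs from the lower to the upper class boundary; its height is the frequency.
bars = [
    ((lcl - U / 2, ucl + U / 2), f)
    for (lcl, ucl), f in zip(class_limits, frequencies)
]

for (left, right), height in bars:
    print(f"bar from {left} to {right}, height {height}")
```

Because the upper boundary of one class equals the lower boundary of the next, the rectangles share their edges and all have the same width of 10.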


Frequency polygon: It is a graph that consists of line segments connecting the intersection of the class
marks and the frequencies of a continuous frequency distribution. It can also be constructed from
histogram by joining the mid-points of each bar. It is also called frequency curve if the points are joined
by a smooth free hand sketch.
Steps for creating a frequency polygon
➢ Find the midpoints of each class.
➢ Draw and label the x and y axes. Label the x- axis with the midpoints of each class, the y- axis as
frequency.
➢ Plot the points, x values = midpoints, y values = frequency.
➢ Connect adjacent points with line segments. Connect the polygon to the x-axis at the beginning and end of the graph.
If we join the mid-points of the tops of the adjacent rectangles of the histogram with line segments, a
frequency polygon is obtained. When the polygon is continued to the x-axis at the class marks just
outside the range of the data, the total area under the polygon equals the total area under the histogram.
Example 2.13: Construct a frequency polygon to represent the data in Example 2.12.
Solution:
Class limits   Frequency   Class marks   Class boundaries   R.F.   % R.F.   Less than C.F.   More than C.F.
15 – 24        3           19.5          14.5 – 24.5        0.06   6%       3                50
25 – 34        4           29.5          24.5 – 34.5        0.08   8%       7                47
35 – 44        10          39.5          34.5 – 44.5        0.20   20%      17               43
45 – 54        15          49.5          44.5 – 54.5        0.30   30%      32               33
55 – 64        12          59.5          54.5 – 64.5        0.24   24%      44               18
65 – 74        4           69.5          64.5 – 74.5        0.08   8%       48               6
75 – 84        2           79.5          74.5 – 84.5        0.04   4%       50               2
Total          50                                           1.00   100%
Adding two class marks with 𝑓𝑖 = 0, namely 9.5 at the beginning and 89.5 at the end, the following
frequency polygon is plotted:
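The points of the polygon, and the equality of the areas under the polygon and the histogram, can be checked with a short Python sketch. The variable names are illustrative, and the area under the polygon is computed with the trapezoid rule, which is exact for straight line segments.

```python
class_marks = [19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5]
frequencies = [3, 4, 10, 15, 12, 4, 2]
width = 10  # class width

# Add one class mark with zero frequency at each end of the distribution.
xs = [class_marks[0] - width] + class_marks + [class_marks[-1] + width]
ys = [0] + frequencies + [0]
points = list(zip(xs, ys))

# Area under the histogram: sum of (class width x frequency).
histogram_area = width * sum(frequencies)

# Area under the polygon: sum of the trapezoids between neighbouring points.
polygon_area = sum(
    (x2 - x1) * (y1 + y2) / 2
    for (x1, y1), (x2, y2) in zip(points, points[1:])
)

print(points)
print(histogram_area, polygon_area)
```

Both areas come out as 500, that is, the class width of 10 times the total frequency of 50.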


Figure: Frequency polygon of Student Achievement Scores (x-axis: Achievement Scores; y-axis: Frequency)

Ogive (cumulative frequency polygon): A cumulative frequency polygon is used to determine how
many, or what proportion, of the data values are below or above a certain value. There are two types of
ogive, one for each type of cumulative frequency distribution.
1. Less than Ogive: In a less than ogive, the less than cumulative frequencies are plotted against the
upper class boundaries of the respective classes. It is an increasing curve that slopes upwards
from left to right.
2. More than Ogive: In a more than ogive, the more than cumulative frequencies are plotted against
the lower class boundaries of the respective classes. It is a decreasing curve that slopes downwards
from left to right.
Example 2.14: Using the results of 200 students on an academic achievement test, draw the less than and
more than ogives based on the data reported in the table below.
Class Interval Frequency Less than c.f. More than c.f.
10-20 12 12 200
20- 30 10 22 188
30- 40 35 57 178
40- 50 55 112 143
50- 60 45 157 88
60- 70 25 182 43
70- 80 18 200 18
Solution: The Ogives for the cumulative frequency distributions given in above table are drawn below.
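The coordinates of the two ogives can be listed with a short Python sketch. The variable names are illustrative. Since the class intervals in this example are continuous (10-20, 20-30, ...), the sketch assumes that the class limits also serve as the class boundaries.

```python
intervals = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [12, 10, 35, 55, 45, 25, 18]
total = sum(frequencies)

# Less-than ogive: (upper boundary, number of values below it).
less_than_points, running = [], 0
for (lower, upper), f in zip(intervals, frequencies):
    running += f
    less_than_points.append((upper, running))

# More-than ogive: (lower boundary, number of values above it).
more_than_points, remaining = [], total
for (lower, upper), f in zip(intervals, frequencies):
    more_than_points.append((lower, remaining))
    remaining -= f

print(less_than_points)
print(more_than_points)
```

Plotting the first list gives the rising curve and plotting the second gives the falling curve.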

Exercise
1. For the following distribution:
Class 10-14 15-19 20-24 25-29 30-34 35-39 40-44 Total
frequency 19 24 37 81 43 30 16 250
A. Find the class interval
B. Find the class boundaries
C. Find the class mark
D. Find the class width or size of class
E. Find the cumulative frequency of the distribution
F. Find the relative frequency of the distribution
2. A company administers an aptitude test to 100 applicants for a job with the company. The following
are the times taken by each applicant to complete a simple task, measured to the nearest second.
44 92 72 45 85 61 66 46 59 57
52 40 93 54 52 64 65 44 51 66
92 58 74 42 43 56 46 52 45 56
68 40 48 76 71 99 51 72 52 56
69 58 40 76 70 42 52 46 73 59
41 55 74 66 64 47 58 46 52 54
63 89 87 41 57 68 59 81 82 60
67 68 97 57 47 53 61 52 49 47
86 55 54 48 85 45 84 53 49 47
70 78 58 96 54 62 60 57 58 66
A. Construct a frequency table for the above data using classes of 40 – 49, 50 – 59, 60 – 69, etc.
B. Construct a cumulative frequency distribution.
C. Construct a relative frequency distribution.
D. Construct a cumulative frequency graph
E. Construct a histogram
F. Construct a frequency polygon
G. Construct the Ogive graph
