
Stat Chapter One


CHAPTER ONE: INTRODUCTION

Introduction:
Statistics plays an important role in almost every facet of human life. In the business
context, managers are required to justify decisions based on data. They need statistical
models to support these decisions. Statistical skills enable managers to collect, analyze
and interpret data and make relevant decisions. Statistical concepts and statistical thinking
enable them to solve problems in almost any domain, support their decisions and reduce
guesswork.
Statistical methods are applied to specific problems in various fields such as Biology,
Medicine, Agriculture, Commerce, Business, Economics, Industry, Insurance, Sociology
and Psychology.
In the field of medicine, statistical tools like t-tests are used to test the efficacy of a new
drug or medicine. In the field of economics, statistical tools such as index numbers,
estimation theory and time series analysis are used in solving economic problems related
to wages, prices, production and distribution of income. In the field of agriculture,
analysis of variance is used in agricultural experiments to test the significance of
differences among sample means.
In Biology, Medicine and Agriculture, statistical methods are applied in the study of
plant growth, migration patterns of birds, the effects of newly invented
medicines, theories of heredity, estimation of crop yield, population growth, etc.
Insurance companies decide on the insurance premiums based on the age composition
of the population and the mortality rates. Statistics is a part of Economics, Commerce
and Business. Statistical analysis of the variations in price, demand and production are
helpful to both businessmen and economists. Cost of living index numbers help the
governments in economic planning and fixation of wages. A government’s administrative
system is fully dependent on production statistics, income statistics, labour statistics,
and economic indices of cost and price. Economic planning of any nation is entirely based on
statistical facts. Cost of living index numbers are also used to estimate the value of money.
Management of limited resources and labor needs statistical methods to maximize profit.
Planned recruitments and distribution of staff, proper quality control methods, and careful
study of demand for goods in the market as well as balanced investment help the producer
to extract maximum profit out of minimum capital. In manufacturing industries, statistical
quality control techniques help in increasing and controlling the quality of products at
minimum cost. Hence, statistics is applied in every sphere of human activity.
Definition and Characteristics of Statistics

Statistics is defined as:
The science of collecting, organizing, presenting, analyzing and interpreting
quantitative information.

Professor A.L. Bowley gave several definitions of Statistics. He defined Statistics as:
“i) The science of counting
ii) The science of averages
iii) The science of measurement of social phenomena, regarded as a whole in all its
manifestations.
iv) A subject not confined to any one science”
However, none of these definitions is complete.

According to Horace Secrist, “Statistics may be defined as the aggregate of facts affected
to a marked extent by multiplicity of causes, numerically expressed, enumerated or
estimated according to a reasonable standard of accuracy, collected in a systematic
manner, for a predetermined purpose and placed in relation to each other”. This
definition is both comprehensive and exhaustive.
Prof. Boddington, on the other hand, defined Statistics as ‘The science of
estimates and probabilities’. This definition is also not complete.
According to Croxton and Cowden, ‘Statistics is the science of collection, presentation, analysis
and interpretation of numerical data from logical analyses’.

Characteristics of Statistics

There are several characteristics of Statistics. Let us look at each characteristic in detail.

1. Statistics deals with aggregates of facts


A single figure cannot be analyzed statistically. Thus, the fact ‘Mr Kiran is 170 cm tall’ cannot
be statistically analyzed. On the other hand, if we know the heights of 60 students of a class,
we can comment upon the average height and the variation in height.

2. Statistics are affected to a marked extent by a multiplicity of causes


The Statistics of yield of a crop is the result of several factors such as fertility of soil, amount
of rainfall, quality of seed used, quality and quantity of fertilizer used.

3. Statistics are numerically expressed


Only numerical facts can be statistically analyzed. Therefore, statements such as ‘price decreases with
increasing production’ cannot be called statistics. Qualitative (categorical) data, for example
the eye color of a person or the brand name of an automobile, cannot be called statistics
either.

4. Statistics are collected in a systematic manner
The facts should be collected according to planned and scientific methods. Otherwise,
they are likely to be wrong and misleading.

5. Statistics are collected for a pre-determined purpose


There must be a definite purpose for collecting facts. Otherwise, indiscriminate data
collection might take place which would lead to wrong diagnosis.

6. Statistical values are placed in relation to each other


The facts must be placed in such a way that a comparative and analytical study becomes
possible. Thus, only related facts which are arranged in logical order can be called Statistics.
Statistical analysis cannot be used to compare heterogeneous data.
IMPORTANCE AND LIMITATIONS OF STATISTICS
Importance of Statistics
Let us look at each function of Statistics in detail:

1. Statistics simplifies mass data


The use of statistical concepts helps in simplification of complex data. Using statistical
concepts, the managers can make decisions more easily. The statistical methods help
in reducing the complexity of the data and consequently in the understanding of any huge
mass of data.

2. Statistics makes comparison easier


Without statistical methods and concepts, collection of data and comparison cannot
be done easily. Statistics helps us to compare data collected from different sources. Grand
totals, measures of central tendency, measures of dispersion, graphs and diagrams, and the
coefficient of correlation all provide ample scope for comparison. In particular, visual
representation of numerical data helps you to compare the data with less effort and
to make effective decisions.

3. Statistics brings out trends and tendencies in the data


After data is collected, it is easy to analyze the trend and tendencies in the data by using
the various concepts of Statistics.

4. Statistics brings out the hidden relations between variables


Statistical analysis helps in drawing inferences on data. Statistical analysis brings out
the hidden relations between variables.

5. Decision making becomes easier


With the proper application of Statistics and statistical software packages on the collected
data, managers can take effective decisions, which can increase the profits in a business.
In the modern business environment, because of advanced communication networks, rapid changes
in consumer behavior, the varied expectations of a variety of consumers and new market
openings, managers have the difficult task of making quick and appropriate decisions.
Therefore, there is a need for them to depend more upon quantitative techniques like
mathematical models, statistics, operations research and econometrics.
Decision-making is a key part of our day-to-day life. Even when we wish to purchase a
television, we like to know the price, quality, durability, and maintainability of various
brands and models before buying one. As you can see, in this scenario we are collecting data
and making an optimum decision. In other words, we are using Statistics.
Again, suppose a company wishes to introduce a new product. It has to collect data on market
potential, consumer preferences, availability of raw materials and the feasibility of producing the
product. Hence, data collection is the backbone of any decision-making process.
Many organizations find themselves rich in data but poor at drawing information from
it. Therefore, it is important to develop the ability to extract meaningful information from
raw data to make better decisions. Statistics plays an important role in this respect.
Limitations of Statistics
Despite all its characteristics and functions, Statistics also has certain limitations.

1. Statistics does not deal with qualitative data


Qualitative data deals with meanings while quantitative data deals with numbers.
Qualitative data describes properties or characteristics that are used to identify things.
Quantitative data describes quantity using a numerical figure
accompanied by a unit of measurement. Statistics deals only with quantitative data, that is,
numerical data that can be expressed in terms of quantitative
measurements. Qualitative phenomena such as beauty and intelligence cannot be
expressed numerically, so statistical analysis cannot be applied to them
directly. However, statistical techniques may be applied indirectly by first
reducing the qualitative data to quantitative terms. For example, the intelligence
of a group of students can be studied based on their marks in a particular examination.

2. Statistics does not deal with individual or isolated items


Statistical methods can be applied only to aggregates of facts, because analysis and
interpretation of data is highly difficult in case of individual facts.

3. Statistical inferences (conclusions) are not exact


Statistical inferences are true only on an average. They are probabilistic statements.
For example, if the data consist of the heights of 200 male persons taken from a
graduate school, the inferences obtained may not hold true for any individual male
person in particular.
4. Statistics can be misused and misinterpreted
Lack of sufficient knowledge of statistical science often leads to incorrect conclusions.
Therefore, proper care must be taken when selecting a data collection method and when choosing
appropriate statistical models. Increasing misuse of Statistics has led to increasing distrust
in Statistics. The field of Statistics is so vast that it needs experience as well as
skill to effectively understand and apply the statistical concepts and models. Hence, only
statisticians can handle statistics properly.
TYPES OF STATISTICS
Statistics is broadly divided into two main categories. The figure below illustrates
the two categories, which are, Descriptive statistics and Inferential statistics.

[Figure: STATISTICS divided into two categories]
Descriptive Statistics: collecting, organizing, summarizing, presenting
Inferential Statistics: making inference, hypothesis testing, determining relationships

Descriptive Statistics: Descriptive statistics is used to present the general description of
data, which is summarized quantitatively. This is mostly useful in clinical research, when
communicating the results of experiments.

Example: In a certain firm, the HR manager calculates the average salary of employees in the
production department. The statistical data collected relates only to the production
department and does not give any information about other departments of the firm. The
HR manager displays the summarized data in the form of tables, charts and
diagrams to describe the gathered data. This comes under descriptive statistics.

Inferential Statistics: Inferential statistics is used to make valid inferences from the data,
which are helpful in effective decision making for managers or professionals.
Example: In the above example, if the HR manager uses the average salary of employees of the
production department to estimate the overall average salary of all the departments in
the firm, the method comes under inferential statistics.

Statistical methods such as estimation, prediction and hypothesis testing belong to inferential
statistics. Researchers draw conclusions from the collected sample data
regarding the characteristics of the larger population from which the samples are taken. If
generalizations or decisions are drawn from incomplete information, such as a sample, the
method used is considered inferential statistics.

Exercises:

1. The four lemons, which a person bought at a supermarket, weighed 7, 5, 8 and 12 ounces.
Which of the following conclusions can be obtained from this information by purely
descriptive method and which require generalization?

i. The average weight of the four lemons is 8 ounces.

ii. The average weight of lemons sold at that supermarket is 8 ounces.

2. On three consecutive days, a traffic police officer issued 9, 14, and 10 speeding tickets and
5, 10 and 12 tickets for going through red light. Which of the following conclusions can be
obtained from this information by purely descriptive method and which require
generalization?

i. Altogether, on these three days, the police officer issued more speeding tickets than tickets
for going through red lights.

ii. The police officer issued the smallest number of tickets on the first day because he was new
on the job.

iii. On two of the three days, the police officer issued more speeding tickets than tickets for
going through red lights.

iv. This police officer will seldom give more than 15 speeding tickets on any one day.

v. On the fourth day, it is expected that he will issue 10 speeding tickets and 12 tickets for
going through the red light.
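
The purely descriptive statements in the two exercises above can be checked by direct computation on the observed values. The following is a minimal sketch in Python (a language chosen here only for illustration; the variable names are assumptions, the numbers come from the exercises). Statements about all lemons sold at the supermarket, about the officer's reasons, or about future days cannot be verified this way, because they generalize beyond the data.

```python
# Data taken from the two exercises above.
lemon_weights = [7, 5, 8, 12]   # ounces
speeding = [9, 14, 10]          # speeding tickets on days 1 to 3
red_light = [5, 10, 12]         # red-light tickets on days 1 to 3

# Descriptive summaries: they describe only the observed values.
average_lemon_weight = sum(lemon_weights) / len(lemon_weights)
total_speeding = sum(speeding)
total_red_light = sum(red_light)
days_more_speeding = sum(s > r for s, r in zip(speeding, red_light))

print(average_lemon_weight)             # 8.0
print(total_speeding, total_red_light)  # 33 27
print(days_more_speeding)               # 2
```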

MAJOR STEPS OF ANY STATISTICAL INVESTIGATION


1. Collection of Data
Careful planning is needed while collecting data. Different methods are used for collecting
data, such as the census method and the sampling method. The investigator has to take
care to select an appropriate collection method.

In the census method, every unit or object of the population is included in the investigation.
For example, if we want to study the average annual income of all the families in a given
area, which has 500 families, we must study the income of all 500 families. When
the population is large, census method would be difficult.

Alternatively, a sample of units or objects is taken from the population to describe the overall
characteristics of the population from which the sample was drawn. This method of
collecting data is called sampling. It is helpful when the size of the population is large
or when the results are needed in a short time.
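
The difference between the two collection methods can be sketched as follows. This is only an illustration under stated assumptions: the incomes are made-up random values, the sample size of 50 is arbitrary, and simple random sampling without replacement is just one common sampling scheme, not one prescribed by the text.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical annual incomes for the 500 families (made-up values).
population = [random.randint(20_000, 120_000) for _ in range(500)]

# Census method: every family in the area is included.
census_mean = mean(population)

# Sampling method: a simple random sample of 50 families,
# drawn without replacement (assumed scheme and sample size).
sample = random.sample(population, 50)
sample_mean = mean(sample)

print(f"census mean income: {census_mean:.0f}")
print(f"sample mean income: {sample_mean:.0f}")
```

The sample mean will usually be close to, but not exactly equal to, the census mean, while requiring data from only a tenth of the families.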
2. Organization of data
To remove irregularities and to reduce the bulk of the data, the collected data must be
organized. Organization includes:
i. Editing: The process of checking and correcting the collected data for omissions,
inconsistencies, irrelevant answers and wrong computations.
ii. Classification: The task of grouping the collected data into similar categories based on some
criteria.
3. Presentation of Data
The collected data is condensed, summarized and presented for further analysis in a
tabular, diagrammatic or graphic form.

Tabulation is a systematic arrangement of classified data in rows and columns. For
the representation of data in diagrams, we use different types of diagrams such as
one-dimensional, two-dimensional and three-dimensional diagrams.
Line diagrams and bar diagrams are one-dimensional diagrams. Pie charts are
two-dimensional diagrams in the form of a circle. In a pie chart, the total and its component parts
are shown as a circle divided into sectors.
4. Analysis of data
The data presented has to be carefully analyzed before any inference can be made from it. The analysis
can use various tools, for example measures of central tendency, dispersion,
correlation and regression.

Measures of central tendency quantify the middle of the distribution. When computed
for a population, the measures are parameters; when computed for a sample, they are statistics
that serve as estimates of the population parameters. The three most common measures of
the centre of a distribution are the mean, the median and the mode.

Measures of dispersion are used to quantify the spread of
the distribution. Range, inter-quartile range, mean deviation and standard deviation are
four measures of dispersion.
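
The measures named above can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical data set treated as a complete population; note that several conventions exist for locating quartiles, so the inter-quartile range may differ slightly between textbooks and software.

```python
from statistics import mean, median, mode, pstdev, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

# Measures of central tendency
center_mean = mean(data)      # 5
center_median = median(data)  # 4.5
center_mode = mode(data)      # 4

# Measures of dispersion
data_range = max(data) - min(data)   # 7
q1, _, q3 = quantiles(data, n=4)     # quartiles (default convention of the module)
iqr = q3 - q1                        # inter-quartile range
# Mean deviation: average absolute distance from the mean.
mean_deviation = sum(abs(x - center_mean) for x in data) / len(data)  # 1.5
std_dev = pstdev(data)               # population standard deviation, 2.0

print(center_mean, center_median, center_mode)
print(data_range, iqr, mean_deviation, std_dev)
```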
5. Interpretation of Data
The final step is to draw conclusions from the analyzed data. Interpretation requires
a high degree of skill and experience.

Thus, Statistics contains the tools and techniques required for the collection, presentation,
analysis and interpretation of data, and we see that this definition is precise and
comprehensive.

DESCRIPTIVE STATISTICS

BASIC TERMS USED IN STATISTICS


Statistics, being a specialized subject, has a number of terms that have to be used. You
need to know and understand these terms in order to do any statistical work. Let us be
acquainted with some of the basic terms used in Statistics.

Units or Individuals
In a Statistical survey, the objects on which the characteristics are measured are called units
or individuals.

Population or Universe
The totality of all units or individuals in a survey is called the population or universe. If
the number of objects in a population is finite, it is called a finite population; otherwise
it is known as an infinite population.

Parameter
A numerical measure that describes a characteristic of the population is known as a parameter. It is
a measurable characteristic of the population or an overall summary measure applied to a
population.

Sample
A sample is a part or subset of the population. By studying the sample, you can predict
the characteristics of the entire population from where the sample is taken. It helps to
gain the best possible values of population with less time, energy and expenditure.

Statistic
A numerical measure that describes a measurable characteristic of a sample, that is, a value determined
using sample data, is known as a statistic.

If the population is large, it is hard to collect data. Hence, a part of the population is chosen
to study the characteristics of the entire population. The size of the sample can never be
as large as the size of the population.

Qualitative characteristic (Attribute)


A characteristic, which is not numerically measurable is called a qualitative
characteristic. Qualitative data describe the attributes or properties that objects possess. The
qualitative characteristic that varies from unit to unit is called an attribute.

Quantitative characteristic (Variable)


A characteristic that is numerically measurable is called a quantitative characteristic. In
a population, some characteristics remain the same for all units and some others vary from
unit to unit. The quantitative characteristic that varies from unit to unit is called a variable.

A variable that assumes only some specified values in a given range and is associated
with counting is known as a discrete variable. A variable that can assume any value in the
range and is associated with measurement is known as a continuous variable. For example,
the number of children per family and the number of petals in a flower are discrete
variables. The height and weight of persons are continuous variables.

Exercises:
1. Classify the following as quantitative or qualitative data.
i) Eye color of human beings
ii) Number of pages in a book of various subjects
2. Classify the following as discrete or continuous variables.
i) Number of shares sold each day in a stock market.
ii) Temperatures recorded every half hour at a regional meteorological center.
LEVELS OF MEASUREMENT

You learned that statistics is concerned with measurements of one or more variables. These
measurements are referred to as data. We will generally classify data into one of four
levels of measurement: nominal, ordinal, interval, or ratio.

A. Nominal Scale (Categorical data):

✓ Are measurements that simply classify the units of the sample (or population)
into categories.
Examples: Gender, Political Affiliations, Place of Residence, Codes given to
football players.

✓ Even if the labels may be converted to numbers, as they often are for ease of
computer entry and analysis, the numerical values are simply codes. They cannot
be meaningfully added, subtracted, multiplied, or divided.
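
A short sketch can make this point concrete. The coding scheme and the responses below are hypothetical; the example shows that counting categories is meaningful for nominal data, while arithmetic on the codes is not.

```python
from collections import Counter

# Hypothetical coding scheme for place of residence (nominal data).
codes = {1: "Urban", 2: "Rural", 3: "Suburban"}
responses = [1, 3, 3, 2, 1, 3, 2, 3]  # made-up survey answers

# Counting how often each category occurs is meaningful.
counts = Counter(codes[r] for r in responses)
most_common_category, frequency = counts.most_common(1)[0]

# The mean of the codes can be computed, but it has no meaning:
# relabelling the categories would change it without changing the data.
mean_of_codes = sum(responses) / len(responses)

print(most_common_category, frequency)  # Suburban 4
print(mean_of_codes)                    # 2.25
```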

B. Ordinal Scale

✓ Are measurements that enable the units of the sample (or population) to be
ordered with respect to the variable of interest.
Examples

o Level of satisfaction (very satisfied, satisfied, unsatisfied), which
incorporates an ordered ranking

o Preferences: first choice, second choice, and so on.

o Grade ranks, runners' ranks

o A supervisor’s annual ranking of the performance of her 10
employees using a scale of 1 (worst performance) to 10 (best
performance).

✓ In addition to providing a categorization, the measurements actually rank units.

C. Interval Scale

✓ are measurements that enable the determination of how much more or less of the
measured characteristic is possessed by one unit of the sample (or population)
than another.

✓ Interval data are always numerical, and the numbers assigned to two
units can be subtracted to determine the difference between the units
with respect to the variable measured.

Examples: temperature, scores, arrival time, starting time, etc. Note that in each case
(temperature, score, arrival time) more than a ranking is involved.

✓ Although adding or subtracting interval data is valid, multiplying or
dividing them is not. This is because the zero point (the origin or 0)
does not indicate an absence of the characteristic of interest.

Example: the origin on the temperature scale differs for the Fahrenheit and
Celsius scales and does not indicate an absence of heat on either scale.
Temperatures lower than 0 degree indicate that less heat is present, so 0 degree
does not mean “no heat”.
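
The following sketch illustrates why differences are meaningful on an interval scale while ratios are not. The two readings are hypothetical; the conversion formula between Celsius and Fahrenheit is the standard one.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

warm_c, cool_c = 20.0, 10.0              # hypothetical readings in Celsius
warm_f = celsius_to_fahrenheit(warm_c)   # 68.0
cool_f = celsius_to_fahrenheit(cool_c)   # 50.0

# Ratios depend on the arbitrary zero point, so the two scales disagree.
ratio_c = warm_c / cool_c                # 2.0
ratio_f = warm_f / cool_f                # 1.36

# Differences are meaningful: a gap of 10 Celsius degrees
# always corresponds to a gap of 18 Fahrenheit degrees.
diff_c = warm_c - cool_c                 # 10.0
diff_f = warm_f - cool_f                 # 18.0

print(ratio_c, ratio_f)
print(diff_c, diff_f)
```

Saying that 20 degrees Celsius is "twice as hot" as 10 degrees Celsius is therefore not justified, since the same two temperatures give a ratio of only 1.36 in Fahrenheit.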

D. Ratio Scale

✓ are measurements that enable the determination of how many times as much of
the measured characteristic is possessed by one unit of the sample (or
population) as by another.

✓ Ratio Data are always numerical, and the ratio between the numbers assigned to
two units can be interpreted as the multiple by which the units differ.

Examples:

o The sales revenue for each firm in a sample of 100 U.S. firms.

o The number of female executives employed in each of a sample of 50
manufacturing companies

SUMMARY OF MEASUREMENT SCALES

Measurement Scale | Data Type | Property Description
Nominal | Qualitative | Classification of sample (or population) units into categories. Often uses labels rather than numbers.
Ordinal | Qualitative | Rank-orders the sample (or population) units. May be verbal labels or numbers.
Interval | Quantitative | Enables comparison of sample (or population) units according to differences between values. Always numerical, but the zero point on the scale does not indicate an absence of the measured characteristic.
Ratio | Quantitative | Enables comparison of sample (or population) units according to multiples of the values. Always numerical, and the zero point on the scale denotes an absence of the measured characteristic.

EXERCISE

1. Windows is a computer software product made by Microsoft Corporation. In designing
Windows Version 3.1, Microsoft telephoned 60,000 users of Windows 3.0 (an older version)
and asked them how the product could be improved. Assume customers were asked the
following questions:

a. Are you the most frequent user of Windows 3.0 in your household?

b. What is your age?

c. How would you rate the helpfulness of the Tutorial instructions that accompany
Windows 3.0, on a scale of 1 to 10, where 1 is not helpful?

d. When using a printer with Windows 3.0, do you most frequently use a dot-matrix
printer or another type of printer?

e. If the speed of Windows 3.0 could be changed, which one of the
following would you prefer: slower, unchanged, faster?

f. How many people in your household have used Windows 3.0 at least once?

Each of these questions defines a variable of interest to the company. Classify
the data generated for each variable as nominal, ordinal, interval, or ratio.
Justify your classification.
