Stat Chapter One
Introduction:
Statistics plays an important role in almost every facet of human life. In the business
context, managers are required to justify decisions based on data. They need statistical
models to support these decisions. Statistical skills enable managers to collect, analyze
and interpret data and make relevant decisions. Statistical concepts and statistical thinking
enable them to solve problems in almost any domain, support their decisions and reduce
guesswork.
Statistical methods are applied to specific problems in various fields such as Biology,
Medicine, Agriculture, Commerce, Business, Economics, Industry, Insurance, Sociology
and Psychology.
In the field of medicine, statistical tools such as t-tests are used to test the efficacy of a new
drug or medicine. In the field of economics, statistical tools such as index numbers,
estimation theory and time series analysis are used in solving economic problems related
to wages, prices, production and distribution of income. In the field of agriculture,
analysis of variance is used in agricultural experiments to test the significance of
differences among sample means.
In biology, medicine and agriculture, statistical methods are applied in the study of
plant growth, migration patterns of birds, the effects of newly invented
medicines, theories of heredity, estimation of crop yield, population growth, etc.
Insurance companies decide on insurance premiums based on the age composition
of the population and the mortality rates. Statistics is integral to economics, commerce
and business. Statistical analysis of the variations in price, demand and production is
helpful to both businessmen and economists. Cost of living index numbers help
governments in economic planning and the fixation of wages, and are also used to estimate
the value of money. A government’s administrative system depends heavily on production
statistics, income statistics, labour statistics and economic indices of cost and price.
Economic planning of any nation rests on statistical facts.
Management of limited resources and labor needs statistical methods to maximize profit.
Planned recruitments and distribution of staff, proper quality control methods, and careful
study of demand for goods in the market as well as balanced investment help the producer
to extract maximum profit out of minimum capital. In manufacturing industries, statistical
quality control techniques help in improving and controlling the quality of products at
minimum cost. Hence, statistics is applied in almost every sphere of human activity.
Definition and Characteristics of Statistics
Statistics is defined as:
The science of collecting, organizing, presenting, analyzing and interpreting
quantitative information.
Professor A.L. Bowley gave several definitions of Statistics. He defined Statistics as:
“i) The science of counting
ii) The science of averages
iii) The science of measurement of social phenomena, regarded as a whole in all its
manifestations.
iv) A subject not confined to any one science”
However, none of these definitions are complete.
According to Horace Secrist, “Statistics may be defined as the aggregate of facts affected
to a marked extent by multiplicity of causes, numerically expressed, enumerated or
estimated according to a reasonable standard of accuracy, collected in a systematic
manner, for a predetermined purpose and placed in relation to each other”. This
definition is both comprehensive and exhaustive.
Prof. Boddington, on the other hand, defined Statistics as ‘The science of
estimates and probabilities’. This definition is also not complete.
According to Croxton and Cowden, ‘Statistics is the science of collection, presentation, analysis
and interpretation of numerical data from logical analyses’.
Characteristics of Statistics
There are several characteristics of Statistics. Let us look at each characteristic in detail.
4. Statistics are collected in a systematic manner
The facts should be collected according to planned and scientific methods. Otherwise,
they are likely to be wrong and misleading.
STATISTICS
Example: In a certain firm, the HR manager calculates the average salary of employees in the
production department. The data collected relate only to the production
department and give no information about other departments of the firm. The
HR manager summarizes the collected data in tables, charts and
diagrams. Describing the gathered data in this way comes under descriptive statistics.
Inferential Statistics: Inferential statistics is used to make valid inferences from the data,
which are helpful in effective decision making for managers or professionals.
Example: In the above example, if the HR manager uses the average salary of employees of the
production department to estimate the overall average salary of all the departments in
the firm, the method comes under inferential statistics.
Statistical methods such as estimation, prediction and hypothesis testing belong to inferential
statistics. Researchers draw conclusions from the collected sample data
about the characteristics of the larger population from which the samples are taken. Whenever
generalizations or decisions are drawn from incomplete information, the
method used is considered inferential statistics.
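The distinction in the HR example can be sketched in a few lines of Python. The salary figures below are hypothetical, invented purely for illustration, since the text gives no data.

```python
from statistics import mean

# Hypothetical monthly salaries for the production department only
# (illustrative figures, not taken from the text).
production_salaries = [4200, 3900, 4500, 4100, 4300]

# Descriptive statistics: summarize the data that was actually collected.
production_average = mean(production_salaries)
print(f"Average salary in production: {production_average}")

# Inferential statistics: reuse the same figure as an estimate for
# departments that were never measured. The number is identical;
# what changes is the claim being made about it.
estimated_firm_average = production_average
print(f"Estimated average salary for the whole firm: {estimated_firm_average}")
```

The arithmetic is the same in both cases. Only the second statement generalizes beyond the data in hand, so only it is an inference.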
Exercises:
1. The four lemons, which a person bought at a supermarket, weighed 7, 5, 8 and 12 ounces.
Which of the following conclusions can be obtained from this information by purely
descriptive method and which require generalization?
2. On three consecutive days, a traffic police officer issued 9, 14, and 10 speeding tickets and
5, 10 and 12 tickets for going through red light. Which of the following conclusions can be
obtained from this information by purely descriptive method and which require
generalization?
i. Altogether, on these three days, the police officer issued more speeding tickets than tickets
for going through red lights.
ii. The police officer issued the smallest number of tickets on the first day because he was new
on the job.
iii. On two of the three days, the police officer issued more speeding tickets than tickets for
going through red lights.
iv. This police officer will seldom give more than 15 speeding tickets on any one day.
v. On the fourth day, it is expected that he will issue 10 speeding tickets and 12 tickets for
going through the red light.
In the census method, every unit or object of the population is included in the investigation.
For example, if we want to study the average annual income of all the families in a given
area, which has 500 families, we must study the income of all 500 families. When
the population is large, the census method becomes difficult.
Alternatively, a sample of units or objects is taken from the population to describe the overall
characteristics of the population from which the sample was drawn. This method of
collecting data is called sampling. It is helpful when the size of the population is large
or when the results are needed in a short time.
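The two methods can be compared with a small Python sketch. The 500 incomes below are synthetic values generated only for illustration, and the sample size of 50 is an arbitrary assumption.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical annual incomes for the 500 families in the area
# (synthetic values, generated only for illustration).
population = [random.randint(20_000, 120_000) for _ in range(500)]

# Census method: every family is included in the investigation.
census_mean = mean(population)

# Sampling method: study only 50 families, chosen at random without
# replacement, and use them to describe all 500.
sample = random.sample(population, k=50)
sample_mean = mean(sample)

print(f"Census mean of all {len(population)} families: {census_mean:.2f}")
print(f"Sample mean of {len(sample)} families: {sample_mean:.2f}")
```

The sample mean will usually differ somewhat from the census mean. That difference is the price paid for studying 50 families instead of 500.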
2. Organization of data
To remove irregularities and to reduce the bulk of the data, the collected data must be
organized. Organization includes:
i. Editing (Condensation): The process of checking the collected data and correcting omissions,
inconsistencies, irrelevant answers and wrong computations.
ii. Classification: The task of grouping the collected data into similar categories based on some
criteria.
3. Presentation of Data
The collected data is condensed, summarized and presented for further analysis in a
tabular, diagrammatic or graphic form.
Measures of central tendency quantify the middle of the distribution. When computed
for a population, these measures are parameters; when computed for a sample, they are statistics
that estimate the population parameters. The three most common measures of
the centre of a distribution are the mean, median and mode.
Measures of dispersion quantify the spread of
the distribution. Range, inter-quartile range, mean deviation and standard deviation are
four common measures of dispersion.
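As a worked illustration, the measures named above can be computed for the four lemon weights from the earlier exercise (7, 5, 8 and 12 ounces). This is a minimal sketch using Python's standard `statistics` module. The variable names are illustrative, and the quartile rule is the module's default, which may differ from the hand method used in a given textbook.

```python
from statistics import mean, median, multimode, pstdev, quantiles

# Weights (in ounces) of the four lemons from the exercise above.
weights = [7, 5, 8, 12]

# Measures of central tendency
centre_mean = mean(weights)        # arithmetic mean
centre_median = median(weights)    # middle of the sorted data
centre_modes = multimode(weights)  # most frequent value(s); all four tie here

# Measures of dispersion
data_range = max(weights) - min(weights)
q1, _, q3 = quantiles(weights, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1
mean_deviation = mean(abs(w - centre_mean) for w in weights)
std_dev = pstdev(weights)  # treats the four lemons as the whole population

print(f"mean={centre_mean}, median={centre_median}, modes={centre_modes}")
print(f"range={data_range}, IQR={iqr}, mean deviation={mean_deviation}")
print(f"standard deviation={std_dev:.3f}")
```

Because every weight occurs once, no single mode stands out. With so few observations the mode is not an informative measure of centre.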
5. Interpretation of Data
The final step is to draw conclusions from the analyzed data. Interpretation requires
a high degree of skill and experience.
Thus, Statistics contains the tools and techniques required for the collection, presentation,
analysis and interpretation of data, and we see that this definition is precise and
comprehensive.
DESCRIPTIVE STATISTICS
Units or Individuals
In a Statistical survey, the objects on which the characteristics are measured are called units
or individuals.
Population or Universe
The totality of all units or individuals in a survey is called the population or universe. If
the number of objects in a population is finite, it is called a finite population; otherwise
it is known as an infinite population.
Parameter
A value that describes a characteristic of the population is known as a parameter. It is
a measurable characteristic of the population, or an overall summary measure applied to a
population.
Sample
A sample is a part or subset of the population. By studying the sample, you can estimate
the characteristics of the entire population from which the sample is taken. It helps to
obtain good estimates of population values with less time, energy and expenditure.
Statistic
A value that describes a measurable characteristic of a sample, or that is determined
using sample data, is known as a statistic.
If the population is large, it is hard to collect data on every unit. Hence, a part of the population is chosen
to study the characteristics of the entire population. The sample is normally much
smaller than the population.
A variable that assumes only some specified values in a given range, and is associated
with counting, is known as a discrete variable. A variable that can assume any value in the
range, and is associated with measurement, is known as a continuous variable. For example,
the number of children per family and the number of petals in a flower are discrete
variables. The height and weight of persons are continuous variables.
Exercises:
1. Classify the following as quantitative or qualitative data.
i) Eye color of human beings
ii) Number of pages in a book of various subjects
2. Classify the following as discrete or continuous variable
i) Number of shares sold each day in a stock market.
ii) Temperatures recorded every half hour at a regional meteorological center.
LEVELS OF MEASUREMENT
You learned that statistics is concerned with measurements of one or more variables. These
measurements are referred to as data. We will generally classify data into one of four
levels of measurement: nominal, ordinal, interval, or ratio.
A. Nominal Scale
✓ are measurements that simply classify the units of the sample (or population)
into categories.
Examples: Gender, Political Affiliations, Place of Residence, Codes given to
football players.
✓ Even if the labels may be converted to numbers, as they often are for ease of
computer entry and analysis, the numerical values are simply codes. They cannot
be meaningfully added, subtracted, multiplied, or divided.
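A short Python sketch of this point follows. The category labels, the numeric codes and the ten responses are all hypothetical, chosen only to illustrate nominal coding.

```python
from collections import Counter

# Hypothetical numeric codes for place of residence (labels only).
RESIDENCE_CODES = {1: "Urban", 2: "Rural", 3: "Suburban"}

# Hypothetical coded responses from ten sampled units.
responses = [1, 2, 1, 3, 1, 2, 2, 1, 3, 1]

# Counting how often each category occurs is meaningful for nominal data.
counts = Counter(RESIDENCE_CODES[code] for code in responses)
print(counts.most_common())

# Arithmetic on the codes is not: this "average" would change if the
# same categories were simply numbered in a different order.
meaningless_average = sum(responses) / len(responses)
print(f"Average of the codes (no real meaning): {meaningless_average}")
```

The counts survive any renumbering of the categories. The average of the codes does not, which is why it carries no information.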
B. Ordinal Scale
o Grade ranks, runners’ ranks
C. Interval Scale
✓ are measurements that enable the determination of how much more or less of the
measured characteristic is possessed by one unit of the sample (or population)
than another.
✓ Interval data are always numerical, and the numbers assigned to two
units can be subtracted to determine the difference between the units
with respect to the variable measured.
Example: Temperature, arrival time, starting time, etc. Note that in each case
(temperature, score, and arrival time) more than a ranking is involved.
Example: the origin on the temperature scale differs for the Fahrenheit and
Celsius scales and does not indicate an absence of heat on either scale.
Temperatures lower than 0 degrees indicate that less heat is present, so 0 degrees
does not mean “no heat”.
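The consequence of an arbitrary zero can be checked with a small calculation. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the two temperatures are arbitrary illustrative values.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius temperature to Fahrenheit."""
    return celsius * 9 / 5 + 32


warm_c, cool_c = 20.0, 10.0
warm_f = celsius_to_fahrenheit(warm_c)
cool_f = celsius_to_fahrenheit(cool_c)

# Differences are meaningful on an interval scale: the gap between the
# two readings is 10 degrees Celsius, which is 18 degrees Fahrenheit.
print(warm_c - cool_c, warm_f - cool_f)

# Ratios are not: the first reading is "twice" the second in Celsius,
# but the same two temperatures give a different ratio in Fahrenheit.
print(warm_c / cool_c, warm_f / cool_f)
```

Since the ratio depends on which scale is used, a statement such as "20 degrees is twice as hot as 10 degrees" has no physical meaning for interval data.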
D. Ratio Scale
✓ are measurements that enable the determination of how many times as much of
the measured characteristic is possessed by one unit of the sample (or
population) as by another.
✓ Ratio Data are always numerical, and the ratio between the numbers assigned to
two units can be interpreted as the multiple by which the units differ.
Examples:
o The sales revenue for each firm in a sample of 100 U.S. firms.
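For ratio data such as sales revenue, the multiple by which two units differ does not depend on the unit of measurement. A minimal Python sketch, with hypothetical revenue figures for two firms:

```python
# Hypothetical sales revenue of two firms, in U.S. dollars.
firm_a_usd = 6_000_000
firm_b_usd = 2_000_000

ratio_in_dollars = firm_a_usd / firm_b_usd

# Change the unit (dollars to thousands of dollars). Zero revenue stays
# zero under this change, so the scale keeps its true zero point.
firm_a_thousands = firm_a_usd / 1_000
firm_b_thousands = firm_b_usd / 1_000
ratio_in_thousands = firm_a_thousands / firm_b_thousands

print(ratio_in_dollars, ratio_in_thousands)
```

Firm A has three times the revenue of firm B whichever unit is used, which is what makes the statement "three times as much" meaningful.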
EXERCISE
a. Are you the most frequent user of Windows 3.0 in your household?
c. How would you rate the helpfulness of the Tutorial instructions that accompany
Windows 3.0, on a scale of 1 to 10, where 1 is not helpful?
d. When using a printer with Windows 3.0, do you most frequently use a dot-matrix
printer or another type of printer?
g. How many people in your household have used Windows 3.0 at least once?