Statistics Unit One
In the singular sense, statistics is defined as the process of data collection, data
organization (classification), data presentation, data analysis, and data interpretation. We
therefore consider the following stages of statistical investigation.
Data Collection: This is the stage where we gather information for our purpose.
o If data are needed and are not readily available, then they have to be collected.
o Data may be collected by the investigator directly, using methods such as interviews,
questionnaires, and observation, or may be obtained from published or unpublished sources.
Data Organization: This is the stage where we edit our data. A large mass of figures
collected from surveys frequently needs organization. The collected data may contain irrelevant
figures, incorrect facts, omissions, and mistakes. Errors introduced during collection have to
be corrected. After editing, we may classify (arrange) the data according to their
common characteristics. Classification or arrangement of data in some suitable order makes the
information easier to present.
Data Presentation: The organized data can now be presented in the form of tables and diagrams.
At this stage, large data will be presented in tables in a very summarized and condensed manner.
The main purpose of data presentation is to facilitate statistical analysis. Graphs and diagrams
may also be used to give the data a vivid meaning and make the presentation attractive.
Data Analysis: This is the stage where we critically study the data to draw conclusions about the
population parameter. The purpose of data analysis is to dig out information useful for decision
making. Analysis can involve highly complex and sophisticated mathematical techniques.
However, this material covers only the most commonly used methods of statistical analysis,
such as the calculation of averages, the computation of measures of dispersion, and
regression and correlation analysis.
Data Interpretation: This is the stage where we draw valid conclusions from the results obtained
through data analysis. Interpretation means drawing conclusions from the data, which form the
basis for decision making. The interpretation of data is a difficult task and requires a high
degree of skill and experience. If data that have been analyzed are not properly interpreted, the
whole purpose of the investigation may be defeated and fallacious conclusions may be drawn.
Great care is therefore needed when interpreting data.
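As a rough illustration of the organization, presentation, and analysis stages described above, the following Python sketch edits a small set of raw exam scores, tabulates them, and computes an average and a measure of dispersion. The data values, the valid range of 0 to 30, and the function names are hypothetical and chosen only for this example.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical raw data: exam scores out of 30, with some collection errors.
raw_scores = [18, 22, None, 25, 18, 45, 20, -3, 22, 18]

def edit_data(values, low=0, high=30):
    """Data organization: drop missing entries and values outside the valid range."""
    return [v for v in values if v is not None and low <= v <= high]

def frequency_table(values):
    """Data presentation: count how often each distinct value occurs, in sorted order."""
    return sorted(Counter(values).items())

clean_scores = edit_data(raw_scores)

# Data analysis: an average and a measure of dispersion.
average = mean(clean_scores)
spread = pstdev(clean_scores)  # population standard deviation

print("score | frequency")
for score, count in frequency_table(clean_scores):
    print(f"{score:>5} | {count}")
print(f"mean = {average:.2f}, standard deviation = {spread:.2f}")
```

Interpretation, the final stage, is left to the reader: the numbers alone do not say why the scores are spread the way they are.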
Sampling: The process of selecting a sample from the population is called sampling.
Population: A population is the totality of things, objects, people, etc. about which information is being
collected. It is the totality of observations with which the researcher is concerned.
Sample: A sample is a subset or part of a population selected to draw conclusions about the
population.
Census survey: It is the process of examining the entire population. It is the total count of the
population.
Statistic: It is a measure used to describe the sample. It is a value computed from the sample.
Sampling frame: A list of people, items or units from which the sample is taken.
Data: A collection of related facts and figures from which conclusions may be drawn.
Variable: A characteristic that changes from object to object and from time to time.
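The relationship between a population, a sample, and a statistic can be sketched in a few lines of Python. The population below is made up (1,000 randomly generated ages), and the sample size of 50 and the seed are arbitrary assumptions. The list itself plays the role of the sampling frame.

```python
import random
from statistics import mean

# Hypothetical population: ages of 1,000 people (made-up values for illustration).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(18, 65) for _ in range(1000)]

# Census survey: examine every unit in the population.
population_mean = mean(population)

# Sampling: select a subset of the population without replacement.
sample = rng.sample(population, k=50)

# Statistic: a value computed from the sample.
sample_mean = mean(sample)

print(f"population size = {len(population)}, sample size = {len(sample)}")
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

The sample mean will generally be close to, but not exactly equal to, the value obtained from the full count.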
Application of statistics
Research works.
Providing an important tool for cost management and budgeting.
Estimating the relationship between a dependent variable and one or more independent variables.
Establishing quality standards for industrial products, maintaining the requested quality,
and assuring that the individual products sold meet a given standard of acceptance.
Uses of statistics
Today the field of statistics is recognized as a highly useful tool in the decision-making process of
managers in modern business, industry, and frequently changing technology. It has many functions
in everyday activities. The following are some of the most important ones.
Statistics condenses and summarizes complex data. The original set of data (raw data) is
normally voluminous and disorganized until it is summarized and expressed in a few
numerical values.
Statistics facilitates comparison of data. Measures obtained from different sets of data can
be compared to draw conclusions about those sets. Statistical values such as averages,
percentages, ratios, etc. are the tools that can be used for the purpose of comparing sets of
data.
Statistics helps in predicting future trends. Statistics is extremely useful for analyzing the
past and present data and predicting some future trends.
Statistics influences the policies of government. The results of statistical studies in areas such as
taxation, the unemployment rate, the performance of military equipment, etc.
may convince a government to review its policies and plans with a view to meeting national
needs and aspirations.
Statistical methods are very helpful in formulating and testing hypotheses and in developing
new theories.
Limitations of statistics
Even though statistics is widely used in various fields of the natural and social sciences that are
closely related to human life, it has its own limitations as far as its application is
concerned.
Statistics doesn’t deal with single (individual) values: Statistics deals only with
aggregate values. But in some situations a single individual is highly important to
consider. Examples: the sun, a bus driver, a president, etc.
Statistics can’t deal with qualitative characteristics: It deals only with data that can be
quantified. For example, it does not deal with marital status itself (married, single, divorced,
widowed), but it does deal with the number of married, single, and divorced people.
Statistical conclusions are true only in the majority of cases: Statistical conclusions are true only
under certain conditions or true only on average. The conclusions drawn from the analysis
of the sample may, perhaps, differ from the conclusions that would be drawn from the
entire population. For this reason, statistics is not an exact science.
Example: Assume that there are 40 students in your class. Take the mid-exam results, marked
out of 30, for all 40 students, and suppose that the mean mid-exam result is 20 out of 30.
This value holds only on average, because not every individual scored 20 out of 30. Some
students scored above 20 and others below 20.
Example: More than 80 percent of the females among the 1985 E.C. accounting graduates of M
college graduated with a GPA above 2.50. Therefore, females are better in Accounting than
in any other field. The given information is not sufficient to support this conclusion,
because:
1) The data are taken from 1985 E.C. only and do not include the performance of females in
the other departments.
2) The data do not give the female-to-male proportion; it may be that there were only two
female students in the Accounting department who graduated that year and both of them
graduated with a GPA above 2.50.
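The two examples above can be checked with a short Python sketch. All the numbers are hypothetical: the 40 scores are constructed so that their mean is 20, and the two GPAs stand in for the "only two female students" scenario.

```python
from statistics import mean

# Hypothetical mid-exam results (out of 30) for a class of 40 students.
scores = [15] * 10 + [20] * 20 + [25] * 10

class_mean = mean(scores)
above = sum(1 for s in scores if s > class_mean)
below = sum(1 for s in scores if s < class_mean)
print(f"mean = {class_mean}, above mean = {above}, below mean = {below}")

# Hypothetical GPAs of the only two female graduates in a department.
female_gpas = [2.8, 3.1]
share_above = 100 * sum(1 for g in female_gpas if g > 2.50) / len(female_gpas)
print(f"{share_above:.0f}% above 2.50, based on only {len(female_gpas)} students")
```

In the first case the mean describes the class as a whole, not any particular student. In the second, a percentage that looks impressive rests on only two observations.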
1.5 Types of Variables and Scales of Measurement
The various measurement scales result from the fact that measurement may be carried out
under different sets of rules. Generally, there are four scales of measurement.
Nominal Scale: Consists of ‘naming’ observations or classifying them into various mutually
exclusive categories. Sometimes the variable under study is classified by some quality it
possesses rather than by an amount or quantity. In such cases, the variable is called an attribute.
Example
Religion: Christianity, Islam, Hinduism, etc.
Sex: Male, Female
Eye color: brown, black, etc.
Ordinal Scale: Used whenever observations are not only different from category to category, but can
also be ranked according to some criterion. The variables deal with relative differences rather
than with quantitative differences.
Ordinal data are data that can have meaningful inequalities. The inequality signs < or > may
take any meaning such as ‘stronger than’, ‘softer than’, ‘weaker than’, ‘better than’, etc.
Example
Interval Scale: With this scale it is not only possible to order measurements, but the
distance between any two measurements is also known. Quotients, however, are not meaningful.
There is no true zero point, only an arbitrary one. Interval data are the type of information in
which an increase from one level to the next always reflects the same increase. Interval data can
be added or subtracted, but they may not be multiplied or divided.
Example: A temperature of zero degrees does not indicate a lack of heat. Consider the two common
temperature scales, Celsius (C) and Fahrenheit (F). The same difference exists
between 10°C (50°F) and 20°C (68°F) as between 25°C (77°F) and 35°C (95°F), i.e. the
measurement scale is composed of equal-sized intervals. But we cannot say that a temperature of
20°C is twice as hot as a temperature of 10°C, because the zero point is arbitrary.
Ratio Scale: Characterized by the fact that equality of ratios as well as equality of intervals
may be determined. Fundamental to ratio scales is a true zero point.
Example:
Variables such as age, height, length, volume, rate, time, amount of rainfall, etc. require a
ratio scale.
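The difference between the interval and ratio scales can be checked numerically. The Python sketch below uses the standard Celsius-to-Fahrenheit conversion and, as an assumed ratio-scale example, a length converted from metres to centimetres. The function names are chosen for illustration only.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

def metres_to_centimetres(m):
    """Convert a length in metres to centimetres."""
    return m * 100

# Interval property: equal differences stay equal when the unit changes.
diff_c_1 = 20 - 10
diff_c_2 = 35 - 25
diff_f_1 = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_f_2 = celsius_to_fahrenheit(35) - celsius_to_fahrenheit(25)
print("Celsius differences equal:   ", diff_c_1 == diff_c_2)
print("Fahrenheit differences equal:", diff_f_1 == diff_f_2)

# Ratios are not preserved on an interval scale, because the zero is arbitrary.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)
print("20°C / 10°C =", ratio_c, "but 68°F / 50°F =", ratio_f)

# Ratio scale: with a true zero, the ratio is the same in any unit.
ratio_m = 4 / 2
ratio_cm = metres_to_centimetres(4) / metres_to_centimetres(2)
print("4 m / 2 m =", ratio_m, "and 400 cm / 200 cm =", ratio_cm)
```

Because the temperature ratio changes with the unit, "twice as hot" has no fixed meaning, whereas "twice as long" holds in any unit of length.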
A variable in statistics is any characteristic, which can take on different values when data are
collected.
A) Continuous Variables: These are usually obtained by measurement, not by counting. They are
variables that can take any decimal value when collected. Variables like age,
time, height, income, price, and temperature are all continuous, since the data collected
from such variables can take decimal values.
Example: Variables such as age, height, length, volume, rate, time, amount of rainfall, etc. are
continuous variables.
B) Discrete Variables: These are obtained by counting. A discrete variable always takes
whole-number values.
Example: Variables such as number of students, number of errors per page, number of accidents
on a traffic line, and number of defective or non-defective items produced on a production line.