1.1 Definitions and Classification of Statistics: Chapter One: Introduction

This document provides an introduction to statistics, including definitions, classifications, and the stages of statistical investigation. It defines statistics as the branch of mathematics dealing with the collection, organization, analysis, and interpretation of data. Descriptive statistics are used to summarize and describe data, while inferential statistics are used to make conclusions beyond the immediate data by inferring properties of an underlying population. The stages of statistical investigation are outlined as formulating the problem, collecting data, organizing and classifying data, presenting data, analyzing data, and interpreting the results.

Uploaded by

Yohannis Reta
Copyright
© All Rights Reserved

Chapter one: Introduction

Contents

1. Introduction
1.1 Definitions and classification of Statistics
1.2 Stages in statistical investigation
1.3 Definition of some terms
1.4 Use, scope, limitation & misuse of Statistics
1.4.1 Uses of Statistics
1.4.2 Scope of Statistics
1.4.3 Limitations of Statistics

Introduction
“Statistical thinking will one day be as necessary for efficient citizenship as the ability to
read and write.”
H. G. Wells

In the modern world of computers and information technology, the importance of
statistics is well recognized across all disciplines. Statistics originated as a
science of statehood and slowly and steadily found applications in agriculture,
economics, commerce, biology, medicine, industry, planning, education and so on.
Today there is hardly any walk of human life where statistics cannot be applied.
Hence, we are constantly bombarded with statistics and statistical information.

1.1 Definitions and classification of Statistics

The words “statistics” and “statistical” are both derived from the Latin word status, which
means a political state. Statistics has been defined differently by different authors over
time. In the olden days statistics was confined to state affairs, but in modern days
it embraces almost every sphere of human activity. Therefore, a number of old
definitions, which were confined to a narrow field of enquiry, have been replaced by newer
definitions, which are much more comprehensive and exhaustive. Let us examine
how different authors and dictionaries define statistics.


The American Heritage Dictionary defines statistics as “The mathematics of
collection, organization and interpretation of numerical data, especially the
analysis of population characteristics by inference from sampling.”
The Merriam-Webster’s Collegiate Dictionary defines statistics as “A branch of
mathematics dealing with the collection, analysis, interpretation, and presentation
of masses of numerical data.”
The former American Statistical Association president Jon Kettenring defines
statistics as “… the science of learning from data … It presents exciting
opportunities for those who work as professional statisticians. Statistics is
essential for the proper running of government, central to decision making in
industry and a core component of modern educational curricula at all levels.”

In addition, the word statistics has two different senses depending on whether it is used
in the singular or in the plural. Used in the singular, statistics is the branch of
mathematics that deals with the collection, organization, analysis, and interpretation of
numerical data. It is especially useful for drawing general conclusions about a whole set
of data from a sample of the data. Used in the plural, statistics are numerical data that
have been collected, classified, and interpreted.
Based on how the data are used, statistics is broadly divided into two mutually
exclusive groups: descriptive statistics and inferential statistics.

Descriptive statistics are used to describe the basic features of the data in a study. They
provide simple summaries about the sample and the measures. Together with simple
graphical analysis, they form the basis of virtually every quantitative analysis of data.
The techniques that are commonly used are classified as:

Graphical description, in which we use graphs to summarize data.
Tabular description, in which we use tables to summarize data.
Summary statistics, in which we calculate certain values to summarize data.


Example-1: Of 350 randomly selected people in the town of Addis Ababa, 280 people
had the last name Abebe. An example of descriptive statistics is the following statement:
"80% of these people have the last name Abebe."

Example-2: On the last 3 Sundays, Hiwot, a car salesperson, sold 2, 1, and 0 new cars
respectively. An example of descriptive statistics is the following statement: "Hiwot
averaged 1 new car sold for the last 3 Sundays."

These are both descriptive statements because they can be verified directly from the
information provided.
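
The arithmetic behind Example-1 and Example-2 can be checked with a short Python sketch.
The variable names are illustrative only; the figures are the ones given in the examples.

```python
from statistics import mean

# Example-1: 280 of the 350 sampled people have the last name Abebe.
sample_size = 350
abebe_count = 280
share_abebe = abebe_count / sample_size

# Example-2: new cars sold on each of the last 3 Sundays.
sunday_sales = [2, 1, 0]
average_sales = mean(sunday_sales)

# Both statements describe only the data in hand, nothing beyond it.
print(f"{share_abebe:.0%} of these people have the last name Abebe")
print(f"Hiwot averaged {average_sales} new car sold for the last 3 Sundays")
```

Both results are computed from the observed values alone, which is what makes the
statements descriptive.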

Inferential statistics (statistical induction) is the use of statistics to draw
inferences or conclusions, and to determine relationships, concerning some
unknown aspect of a population (such as its parameters) based on data obtained from
a sample. That is, inferential statistics aims to reach conclusions that go beyond the
data in hand.

Example-3: Of 350 randomly selected people in the town of Addis Ababa, Ethiopia, 280
people had the last name Abebe. An example of inferential statistics is the following
statement: "80% of all people living in Ethiopia have the last name Abebe."

We have no information about all people living in Ethiopia, just about the 350 living in
Addis Ababa. We have taken that information and generalized it to talk about all people
living in Ethiopia.

Example-4: On the last 3 Sundays, Hiwot, a car salesperson, sold 2, 1, and 0 new cars
respectively. An example of inferential statistics is the following statement: "Hiwot
never sells more than 2 cars on a Sunday."

Although this statement is true for the last 3 Sundays, we do not know that this is true for
all Sundays.
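
Returning to Example-3, an inferential statement normally comes with a measure of its
uncertainty. The sketch below uses one common approach that this chapter does not itself
introduce: a 95% confidence interval for a proportion based on the normal approximation.
The value z = 1.96 and the choice of method are assumptions made for illustration. The
interval also applies only to the population that was actually sampled (people in Addis
Ababa), not to all of Ethiopia.

```python
import math

# Sample from Example-3: 280 of 350 randomly selected people are named Abebe.
n = 350
successes = 280
p_hat = successes / n

# Assumption: normal-approximation interval with z = 1.96 (about 95% confidence).
z = 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print(f"Estimated share: {p_hat:.3f}")
print(f"Approximate 95% interval: {lower:.3f} to {upper:.3f}")
```

The interval expresses how far the population share in Addis Ababa could plausibly be
from the 80% observed in the sample.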


1.2 Stages in Statistical Investigation

Before we deal with statistical investigation, let us see what statistical data means. Not
every collection of numerical data can be considered statistical data; it must satisfy the
following criteria:

The data must be an aggregate of facts
They must be affected to a marked extent by a multiplicity of causes
They must be estimated according to reasonable standards of accuracy
The data must be collected in a systematic manner for a predefined purpose
The data should be placed in relation to each other

A statistician should be involved at all stages of a statistical investigation. These
stages are formulating the problem, and then collecting, organizing and classifying,
presenting, analyzing and interpreting the statistical data. Let us see each stage in detail.
Formulating the problem: research starts only when there is a problem to solve. At
this stage the investigator must be sure to understand the problem and then
formulate it in statistical terms. Clarify the objectives very carefully and ask as
many questions as necessary, because “An approximate answer to the right
question is worth a great deal more than a precise answer to the wrong
question.” (the first golden rule of applied mathematics)
Therefore, the first stage in any statistical investigation should be to:
Get a clear understanding of the physical background to the
situation under study;
Clarify the objectives;
Formulate the objectives in statistical terms.
Proper collection of data: in order to draw valid conclusions, it is important to have
‘good’ data. Data are gathered with the aim of meeting predetermined objectives. In
other words, the data must provide answers to the problem. The data form the
foundation of statistical analysis and hence must be carefully and
accurately collected.
Organization and classification of data: in this stage the collected data are
organized in a systematic manner. That means the data must be placed in
relation to each other. The classification or sorting out of data is, by itself, a
kind of organization of data.
Presentation of data: the purpose of putting the organized data in graphs,
charts and tables is two-fold. First, it is a visual way to look at the data, see
what happened and make interpretations. Second, it is usually the best way to
show the data to others. Reading lots of numbers in the text puts people to sleep
and does little to convey information.
Analysis of data: the process of looking at and summarizing data with the
intent to extract useful information and develop conclusions. Data analysis is
closely related to data mining, but data mining tends to focus on larger data sets,
with less emphasis on making inferences, and often uses data that were originally
collected for a different purpose. In this stage different types of inferential
statistical methods are applied, for instance hypothesis tests such as the
chi-square (χ²) test of association.
Interpretation of data: interpretation means drawing valid conclusions from
data, which form the basis of decision making. Correct interpretation requires a
high degree of skill and experience.
Note that analysis and interpretation of data are two sides of the same
coin.
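
As an illustration of the analysis stage, the sketch below computes the statistic for a
chi-square test of association, one common hypothesis test of this kind, on a 2x2 table.
The counts are hypothetical and invented for this example; they do not come from the
chapter. The critical value 3.841 is the standard 5% cut-off for one degree of freedom.

```python
# Hypothetical 2x2 table of counts (rows: group A / group B,
# columns: outcome yes / outcome no). Invented for illustration only.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells, where the expected
# count assumes the row and column variables are independent.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRITICAL_VALUE_5PCT_DF1 = 3.841
reject_independence = chi_square > CRITICAL_VALUE_5PCT_DF1

print(f"chi-square statistic: {chi_square:.3f}")
print(f"evidence of association at the 5% level: {reject_independence}")
```

Computing the statistic is the analysis step; deciding what an association between the
two variables means in context belongs to the interpretation stage.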

1.3 Definition of Some Terms

In this section, we will define those terms which will be used most frequently. These are:
Data: Facts or figures from which conclusions can be drawn.
Data set: Facts or figures collected for a particular study. Each value in the data set is
called a data value or datum.


Raw data: Data in their original form, as first recorded, are called raw data. The data
sheets on which they are recorded are often hand drawn, but they can also be printouts
from spreadsheet programs like Microsoft Excel.
Population: The totality of all subjects with certain common characteristics that are
being studied in a specified time and place.
Sample: A portion of a population selected using some sampling technique.
A sample must be representative of the population, so it should be selected by one of the
established sampling techniques.
Sampling: The process of selecting units (e.g., people, organizations) from a population
of interest so that by studying the sample we may fairly generalize our results back to the
population from which they were chosen. There are two types of sampling techniques,
namely random sampling techniques and non-random sampling techniques.
Random sampling technique, or probability sampling technique, gives every element of the
population a known, non-zero chance of being included in the sample. In other words,
personal bias plays no part in the selection. The five common random sampling techniques are:
Simple Random sampling
Systematic Random sampling
Stratified Random sampling
Cluster Random sampling
Multi-stage sampling
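
Three of the random sampling techniques listed above can be sketched in Python. This is a
minimal illustration under stated assumptions, not a prescribed procedure: the population
of 100 numbered units, the two strata named "urban" and "rural", and the helper function
names are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n units so that every unit has the same chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th unit from the list."""
    k = len(population) // n          # sampling interval
    start = rng.randrange(k)          # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum."""
    sample = []
    for members in strata.values():
        n = round(len(members) * fraction)
        sample.extend(rng.sample(members, n))
    return sample

rng = random.Random(42)               # fixed seed so the run is repeatable
population = list(range(1, 101))      # hypothetical units numbered 1..100

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical strata: units 1-60 are "urban", units 61-100 are "rural".
strata = {"urban": list(range(1, 61)), "rural": list(range(61, 101))}
stratified = stratified_sample(strata, 0.1, rng)

print("simple random:", sorted(srs))
print("systematic:   ", systematic)
print("stratified:   ", sorted(stratified))
```

In each case the selection is left to a random mechanism rather than to the judgment of
the investigator.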
Non-random sampling techniques, also known as non-probability sampling techniques, are
those in which not all elements of the population have a known chance of inclusion, or in
which some elements have a zero chance of being selected. The
most familiar examples of non-random sampling techniques are:
Quota sampling
Convenience sampling
Volunteer sampling
Purposive sampling
Haphazard sampling
Snowball sampling, etc.
Sample size: The number of elements or observations to be included in the sample.


Parameter: Any measure computed from the data of a population.
Example-5: Population mean (μ) and population standard deviation (σ)
Statistic: Any measure computed from the sample.
Example-6: Sample mean (x̄) and sample standard deviation (S)
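
The distinction between a parameter and a statistic can be made concrete with Python's
standard statistics module. The eight population values below are hypothetical, chosen
only so that the population mean and standard deviation come out as round numbers.

```python
import random
from statistics import mean, pstdev, stdev

# Hypothetical population of 8 values (invented for illustration).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Parameters: computed from every member of the population.
mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation (divides by N)

# Statistics: computed from a sample drawn from the population.
rng = random.Random(1)       # fixed seed so the run is repeatable
sample = rng.sample(population, 4)
x_bar = mean(sample)         # sample mean
s = stdev(sample)            # sample standard deviation (divides by n - 1)

print(f"parameters: mu = {mu}, sigma = {sigma}")
print(f"statistics: x_bar = {x_bar}, S = {s:.3f} from sample {sample}")
```

The parameters are fixed for a given population, while the statistics change from one
sample to the next.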
Survey: A collection of quantitative information about members of a population when no
special control is exercised over any of the factors influencing the variable of interest.
Sample survey: A survey that includes only a portion of the population.
Census: A collection of information about every member of a population.
A sample survey has the following advantages over a census:
It saves time and cost
It can give greater accuracy, since fewer units can be measured more carefully
It avoids wastage of material

Variable: An attribute of a physical or an abstract system which may change its value
while it is under observation. Variables are often specified according to their type and
intended use, and can be classified into two groups: qualitative and
quantitative variables.

Quantitative variable: A variable that is naturally measured as a number for which
arithmetic operations make sense. Examples: height, age, crop yield, GPA,
salary, temperature, area, air pollution index (measured in parts per million), etc.
Qualitative variable: Any variable that is not quantitative is qualitative.
Qualitative variables take a value that is one of several possible categories. As
naturally measured, qualitative variables have no numerical meaning. Examples:
hair color, gender, field of study, college attended, political affiliation, status of
disease infection.
Quantitative variables can be classified as discrete or continuous. A discrete
variable can assume only certain numerical values, such as 0, 1, 2, ...; that is, there
are gaps between the possible values. The set of possible values may be finite or
countably infinite. Examples are the number of students in a classroom and the number of
children in a family. A continuous variable can take any value within a specified
interval, given a fine enough measuring device; there are no gaps between possible
values. Continuous values are obtained by measuring. For example, height is a continuous
variable: no matter how close the heights of two people are, we can find another person
whose height falls somewhere between the two.
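
The type of a variable determines which summaries are meaningful. The sketch below uses
four hypothetical student records (invented for illustration) with one qualitative
variable, one discrete variable and one continuous variable.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: field of study is qualitative, number of siblings is
# discrete (counted), and height is continuous (measured).
students = [
    {"field": "Biology", "siblings": 2, "height_cm": 171.5},
    {"field": "Economics", "siblings": 0, "height_cm": 160.0},
    {"field": "Biology", "siblings": 3, "height_cm": 165.5},
    {"field": "Statistics", "siblings": 1, "height_cm": 180.0},
]

# Qualitative variable: categories can be counted, but not averaged.
field_counts = Counter(s["field"] for s in students)
most_common_field = field_counts.most_common(1)[0][0]

# Quantitative variables: arithmetic such as the mean is meaningful.
mean_siblings = mean([s["siblings"] for s in students])
mean_height = mean([s["height_cm"] for s in students])

print("most common field of study:", most_common_field)
print("mean number of siblings:", mean_siblings)
print("mean height (cm):", mean_height)
```

Note that the mean of a discrete variable need not be one of its possible values: no
student can have 1.5 siblings.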

1.4 Use, scope, limitation and misuse of statistics

1.4.1 Uses of Statistics

Statistics presents facts in the form of numerical data
It condenses and summarizes a mass of data into a few presentable and
precise figures
It facilitates comparison of data
It helps in formulating and testing hypotheses
It helps in predicting future trends
It helps in formulating policies

1.4.2 Scope of Statistics

The scope of statistics is indeed very vast. Apart from helping elicit an intelligent
assessment from a body of figures and facts, statistics is an indispensable tool for any
scientific enquiry, right from the planning stage to the stage of conclusion. It
applies to almost all sciences: pure and applied, physical, natural, biological, medical,
agricultural and engineering. It also finds applications in the social and management
sciences, in commerce, business and industry.
Of the social sciences, economics leans most heavily on statistical methods for the
analysis of data relating to micro as well as macro economics, from demand analysis up to
national income analysis.

1.4.3 Limitations of Statistics

Statistics, with all its wide application in every sphere of human activity, has its own
limitations. Some of them are given below.
Statistics is not suitable for the study of qualitative phenomena: Since
statistics is basically a science that deals with sets of numerical data, it is
applicable only to those subjects of enquiry which can be
expressed in terms of quantitative measurements. Qualitative phenomena
like honesty, poverty, beauty, intelligence, etc.
cannot be expressed numerically, so statistical analysis cannot be
applied to them directly. Nevertheless, statistical
techniques may be applied indirectly by first reducing the qualitative
expressions to suitable quantitative terms. For example, the intelligence of a
group of students can be studied on the basis of their marks in a particular
examination.
Statistics does not study individuals: Statistics does not give any specific
importance to individual items; it deals with an aggregate of
objects. Individual items, taken individually, do not constitute
statistical data and do not serve any purpose for a statistical enquiry.
Statistical laws are not exact: It is well known that the mathematical and
physical sciences are exact. Statistical laws, however, are not exact;
they are only approximations. Statistical conclusions are not universally true.
They are true only on average.
Statistics may be misused: Statistics must be used only by experts;
otherwise, statistical methods are the most dangerous tools in the hands of
the inexpert. The use of statistical tools by inexperienced and untrained
persons might lead to wrong conclusions. Statistics can easily be misused by
quoting wrong figures. As King aptly says, ‘statistics are like clay of
which one can make a God or Devil as one pleases.’
Statistics is only one of the methods of studying a problem: Statistical
methods do not provide a complete solution to a problem, because
problems have to be studied taking the background of the country's culture,
philosophy or religion into consideration. Thus a statistical study should be
supplemented by other evidence.
At times, the association or relationship between two or more variables is
studied in statistics, but such a relationship does not indicate a ‘cause and
effect’ relationship. It simply shows the similarity or dissimilarity in the
movement of the two variables. In such cases, it is the user who has to
interpret the results carefully, pointing out the type of relationship obtained.
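
The last limitation can be illustrated with a short Python sketch. The monthly figures
below are hypothetical and invented for this example: ice cream sales and drowning
incidents both rise in hot months, so the two series move together even though neither
causes the other. The helper pearson_r is written out by hand for illustration.

```python
import math

# Hypothetical monthly figures (invented for illustration). Both series tend to
# rise with temperature, which is not recorded here.
ice_cream_sales = [45, 50, 52, 58, 61, 66]
drownings = [3, 4, 4, 6, 6, 8]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(variation_x * variation_y)

r = pearson_r(ice_cream_sales, drownings)
print(f"correlation between ice cream sales and drownings: r = {r:.2f}")
```

The coefficient comes out close to 1, yet it only measures how closely the two series
move together. Whether one causes the other is a question the user has to answer from
knowledge of the subject.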
