Data Sources Descriptive Statistics Statistical Inference Computers and Statistical Analysis


Data and Statistics

Data Sources

Descriptive Statistics

Statistical Inference

Computers and Statistical Analysis

Applications of Statistics

Accounting
Public accounting firms use statistical sampling procedures
when conducting audits for their clients.

Economics
Economists use statistical information in making forecasts
about the future of the economy or some aspect of it.

Applications of Statistics

Marketing
Electronic point-of-sale scanners at retail checkout counters are
used to collect data for a variety of marketing research
applications.

Production
A variety of statistical quality control charts are used to monitor
the output of a production process.

Finance
Financial advisors use price-earnings ratios and dividend
yields to guide their investment recommendations.

Growth and Development of Statistics


The views commonly held about statistics are numerous, and the word has
different meanings to different people depending on its use.
For example,
(i) for a cricket fan, statistics refers to numerical information or data
relating to the runs scored by a cricketer;
(ii) for an environmentalist, statistics refers to information on the quantity
of pollution released into the atmosphere by all types of vehicles in
different cities;
(iii) for the census department, statistics consists of information about
the birth rate per thousand and the sex ratio in different states;
(iv) for a share broker, statistics is the information on changes in share
prices over a period of time; and so on.

Growth and Development of Statistics


Commonly statistics is perceived as various types of graphs, tables and
charts showing the increase and/or decrease in per capita income,
wholesale price index, industrial production, exports, imports, crime rate
and so on.
The sources of such statistics are newspapers, magazines/journals,
reports/bulletins, radio, and television. The relevant data are collected,
the numbers manipulated, and the information presented with the help of
figures, charts and diagrams.

Statistical Thinking and Analysis

Statistical thinking can be defined as the thought process that focuses on
ways to identify, control, and reduce variations present in all phenomena.
A better understanding of a phenomenon through statistical thinking and the
use of statistical methods for data analysis enhances opportunities for
improvement in the quality of products or services.
Statistical thinking also allows us to recognize and interpret the
variations in a process.

06/09/16

Chapter 1 Statistics: An Overview

Steps of Statistical Thinking

Flow Chart of Process Improvement
1. Specify the aim of the study
2. Understand how the process works
3. Assess the current process performance
4. Identify strategies for improvement
5. Test the effectiveness of the proposed strategy
6. Successful?
   Yes: Implement the strategy
   No: Return to an earlier step

Statistics Defined
As Statistical Data it refers to a special discipline or a collection of
procedures useful in gathering and analysis of numerical information
for the purpose of drawing conclusions and making decisions.
"The classified facts respecting the condition of the people in a state
... especially those facts which can be stated in numbers or in tables
of numbers or in any tabular or classified arrangement." - Webster

Statistics Defined
As Statistical Methods it refers to the collection and analysis of
statistical data for the purpose of drawing conclusions and making
decisions.
There are two branches of statistics:
(i) Mathematical statistics, and
(ii) Applied statistics.
Mathematical statistics is a branch of mathematics and deals with the
basic theory about how a particular statistical method is developed.
Applied statistics uses statistical theory in formulating and solving
problems in other subject areas such as economics, sociology, medicine,
business/industry, education, and psychology.

Statistics Defined
"Statistics is the science which deals with the methods of
collecting, classifying, presenting, comparing and interpreting
numerical data collected to throw some light on any sphere of
enquiry." - Seligman
"The science of statistics is the method of judging collective,
natural or social phenomenon from the results obtained from the
analysis or enumeration or collection of estimates." - King

Types of Statistical Methods


Descriptive statistics includes statistical methods involving the collection,
presentation, and characterization of a set of data in order to describe its
various features.
In general, methods of descriptive statistics include graphic methods and
numeric measures. Bar charts, line graphs, and pie charts comprise the graphic
methods, whereas numeric measures include measures of central tendency,
dispersion, skewness, and kurtosis.
Inferential statistics includes statistical methods which facilitate estimating the
characteristics of a population on the basis of sample results.

Limitations of Statistics
Statistics Does Not Study Qualitative Phenomena
Statistics cannot be applied in studying those problems which cannot
be stated and expressed quantitatively.
For example, a statement like "Export volume of India has increased
considerably during the last few years" cannot be analyzed statistically. Also,
qualitative characteristics such as honesty, poverty, welfare, beauty, or health
cannot directly be measured quantitatively.
However, these subjective concepts can be related in an indirect manner to
numerical data after assigning particular scores or quantitative standards. For
example, attributes of intelligence in a class of students can be studied on the
basis of their Intelligence Quotients (IQ).

Statistics Does Not Study Individuals


A single or isolated figure cannot be considered as statistics, unless it is part of
the aggregate of facts relating to any particular field of enquiry.
For example, the price of a single commodity, or an increase or decrease in the
share price of a particular company, does not constitute statistics. However, the
aggregate of figures representing prices, production, sales volume, and profits
over a period of time or for different places does constitute statistics.

Statistics Can be Misused


"Statistics only furnishes a tool, though imperfect, which is dangerous in the
hands of those who do not know its use and deficiencies." - Bowley
For example, the conclusion that smoking causes lung cancer, since 90 per cent of
people who smoke die before the age of 70 years, is statistically invalid
because nothing has been mentioned about the percentage of people who do
not smoke and die before reaching the age of 70 years.
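
The gap in the smoking argument can be sketched numerically. The counts below are invented purely for illustration and do not come from any study. The only point is that the 90 per cent figure for smokers carries no information until it is set beside the same figure for non-smokers.

```python
# Hypothetical counts, made up for illustration only (not real data).
groups = {
    "smokers": {"total": 1000, "died_before_70": 900},
    "non_smokers": {"total": 1000, "died_before_70": 900},
}


def death_rate(group):
    """Share of a group that died before age 70."""
    return group["died_before_70"] / group["total"]


rates = {name: death_rate(group) for name, group in groups.items()}
difference = rates["smokers"] - rates["non_smokers"]

for name, rate in rates.items():
    print(f"{name}: {rate:.0%} died before 70")
print(f"difference between groups: {difference:.0%}")
```

In this made-up case both groups show the same rate, so the figure for smokers alone would support no conclusion about smoking. A valid argument needs both rates.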

Need for Data


The main reasons for collecting data are as listed below:
To provide necessary inputs to a given phenomenon or situation
under study.
To measure performance in an ongoing process such as
production, service, and so on.
To enhance the quality of decision-making by enumerating
alternative courses of action in a decision-making process, and
selecting an appropriate one.
To satisfy the desire to understand an unknown phenomenon.
To assist in guessing the causes and probable effects of certain
characteristics in given situations.

Need for Data


Before relying on any interpreted data, whether from a computer, the internet or
another source, we should study the answers to the following questions:
(i) Have the data come from an unbiased source, that is, a source with no
interest in supplying data that lead to a misleading conclusion?
(ii) Do the data represent the entire population under study, i.e. how many
observations should represent the population?
(iii) Do the data support other evidence already available? Is any
evidence missing that may lead to a different conclusion? and
(iv) Do the data support the logical conclusions drawn? Have we drawn
conclusions which are not supported by the data?

Data and Data Sets


Data are the facts and figures collected, summarized,
analyzed, and interpreted.
The data collected in a particular study are referred
to as the data set.

Elements, Variables, and Observations

The elements are the entities on which data are collected.


A variable is a characteristic of interest for the elements.
The set of measurements collected for a particular
element is called an observation.
The total number of data values in a complete data set is the
number of elements multiplied by the number of variables.

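Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales (Rs M)   EPS (Rs)
Dabur          BSE                     73.10            0.86
Jaypee         NSE                     74.00            1.67
India Cement   NSE                    365.70            0.86
Jindal         BSE                    111.40            0.33
ITC            NSE                     17.60            0.13

Element names: the entries in the Company column
Variables: Stock Exchange, Annual Sales, EPS
Observation: one row of the table (for example, Jaypee: NSE, 74.00, 1.67)
Data set: all rows taken together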
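
The counting rule stated earlier (number of data values = number of elements multiplied by number of variables) can be checked on the company table. This is a small Python sketch. The row values assume the flattened table reads Dabur/BSE/73.10/0.86, Jaypee/NSE/74.00/1.67, and so on, and the names `variables` and `data_set` are illustrative.

```python
# Each element (company) maps to one observation: its value for every variable.
# Row values assume the reading of the slide's table described in the lead-in.
variables = ("stock_exchange", "annual_sales_rs_m", "eps_rs")
data_set = {
    "Dabur": ("BSE", 73.10, 0.86),
    "Jaypee": ("NSE", 74.00, 1.67),
    "India Cement": ("NSE", 365.70, 0.86),
    "Jindal": ("BSE", 111.40, 0.33),
    "ITC": ("NSE", 17.60, 0.13),
}

n_elements = len(data_set)
n_variables = len(variables)
# Count the data values directly, one per (element, variable) pair.
n_values = sum(len(observation) for observation in data_set.values())

print(n_elements, n_variables, n_values)
```

With 5 elements and 3 variables, the direct count and the product agree.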

Qualitative and Quantitative Data


Data can be further classified as being qualitative or quantitative.
The statistical analysis that is appropriate depends on whether
the data for the variable are qualitative or quantitative.
In general, there are more alternatives for statistical
analysis when the data are quantitative.

Quantitative Data
Quantitative data indicate how many or how much:
discrete, if measuring how many
continuous, if measuring how much
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful for
quantitative data.

Cross-Sectional Data

Cross-sectional data are collected at the same or approximately
the same point in time.

Example: Data detailing the number of building permits
issued in June 2010 in each of the counties of SARC

Time Series Data


Time series data are collected over several time periods.

Example: Data detailing the number of building permits
issued in India in each of the last 36 months

Scales of Measurement
Data
  Qualitative
    Numerical: Nominal, Ordinal
    Non-numerical: Nominal, Ordinal
  Quantitative
    Numerical: Interval, Ratio

Scales of Measurement
Scales of measurement include:
Nominal
Ordinal
Interval
Ratio

The scale determines the amount of information contained
in the data.
The scale indicates the data summarization and statistical
analyses that are most appropriate.

Scales of Measurement

Nominal

Data are labels or names used to identify an attribute of the
element.
A nonnumeric label or numeric code may be used.
Example: Students of a university are classified by the
school in which they are enrolled using a nonnumeric
label such as Business, Humanities, Education, and so
on.
Alternatively, a numeric code could be used for the
school variable (e.g. 1 denotes Business, 2 denotes
Humanities, 3 denotes Education, and so on).

Scales of Measurement

Ordinal
The data have the properties of nominal data and
the order or rank of the data is meaningful.
A nonnumeric label or numeric code may be used.
Example: Students of a university are classified by
their class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for the
class standing variable (e.g. 1 denotes Freshman, 2
denotes Sophomore, and so on).

Scales of Measurement

Interval
The data have the properties of ordinal data, and the interval
between observations is expressed in terms of a fixed unit
of measure.
Interval data are always numeric.
Example: Manisha has an MAT score of 1205, while Shreya
has an MAT score of 1090. Manisha scored 115 points
more than Shreya.

Scales of Measurement

Ratio
The data have all the properties of interval data and the ratio
of two values is meaningful.

Variables such as distance, height, weight, and time
use the ratio scale.
This scale must contain a zero value that indicates that nothing
exists for the variable at the zero point.
Example: Manisha's college record shows 36 credit hours
earned, while Shreya's record shows 72 credit hours earned.
Shreya has twice as many credit hours earned as Manisha.
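
The four scales differ in which operations give a meaningful result. The following is a rough Python sketch using the examples from these slides. The variable names are illustrative, and the codes 3 and 4 for Junior and Senior are an assumption that extends the "and so on" in the ordinal example.

```python
# Nominal: codes are only labels, so equality is the only meaningful comparison.
school_codes = {"Business": 1, "Humanities": 2, "Education": 3}
same_school = school_codes["Business"] == school_codes["Humanities"]

# Ordinal: codes carry rank, so ordering comparisons are also meaningful.
# Codes 3 and 4 are assumed; the slide gives only 1 and 2.
class_standing = {"Freshman": 1, "Sophomore": 2, "Junior": 3, "Senior": 4}
senior_outranks_junior = class_standing["Senior"] > class_standing["Junior"]

# Interval: differences are meaningful (MAT scores from the example).
mat_manisha, mat_shreya = 1205, 1090
score_gap = mat_manisha - mat_shreya

# Ratio: a true zero exists, so ratios are meaningful (credit hours earned).
credits_manisha, credits_shreya = 36, 72
credit_ratio = credits_shreya / credits_manisha

print(same_school, senior_outranks_junior, score_gap, credit_ratio)
```

Note that arithmetic on the nominal codes (for example, averaging school codes) would run without error but would mean nothing, which is why the scale, not the data type, decides the analysis.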

Data Sources

Existing Sources
Within a firm - almost any department
Business database services - ORG Group
Government agencies - Ministry of Commerce
Industry associations - Travel Industry Association
Special-interest organizations - All India Management Association
Internet - more and more firms

Data Sources

Statistical Studies
In experimental studies the variable of interest is first
identified. Then one or more other variables are identified
and controlled so that data can be obtained about how they
influence the variable of interest.
In observational (non-experimental) studies no attempt is
made to control or influence the variables of interest.
A survey is a good example.

Data Acquisition Considerations


Time Requirement
Searching for information can be time consuming.
Information may no longer be useful by the time it
is available.

Cost of Acquisition
Organizations often charge for information even
when it is not their primary business activity.

Data Errors
Using any data that happen to be available or were
acquired with little care can lead to misleading
information.

Descriptive Statistics

Descriptive statistics are the tabular, graphical, and
numerical methods used to summarize and present
data.

Example: Hilton Auto Repair

The manager of Hilton Auto would like to have a better
understanding of the cost of parts used in the
engine tune-ups performed in the shop. He
examines 50 customer invoices for tune-ups.
The costs of parts, rounded to the nearest rupee,
are listed on the next slide.

Sample of Parts Cost (Rs) for 50 Tune-ups

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Tabular Summary: Frequency and Percent Frequency

Parts Cost (Rs)   Frequency   Percent Frequency
50-59                  2              4
60-69                 13             26
70-79                 16             32
80-89                  7             14
90-99                  7             14
100-109                5             10
Total                 50            100

Percent frequency for the first class: (2/50)100 = 4
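
The frequency table can be reproduced from the 50 invoice values with a short Python sketch. The class width of 10 starting at 50 follows the table; the names `costs` and `freq` are illustrative.

```python
from collections import Counter

# Parts costs (Rs) for the 50 tune-ups listed in the sample.
costs = [
    91, 71, 104, 85, 62, 78, 69, 74, 97, 82,
    93, 72, 62, 88, 98, 57, 89, 68, 68, 101,
    75, 66, 97, 83, 79, 52, 75, 105, 68, 105,
    99, 79, 77, 71, 79, 80, 75, 65, 69, 69,
    97, 72, 80, 67, 62, 62, 76, 109, 74, 73,
]

# Map each cost to the lower limit of its class, e.g. 57 -> 50 and 104 -> 100.
freq = Counter((cost // 10) * 10 for cost in costs)

for lower in sorted(freq):
    percent = freq[lower] / len(costs) * 100
    print(f"{lower}-{lower + 9}: frequency {freq[lower]}, percent {percent:.0f}")
```

Integer division works here only because every class is 10 wide and starts on a multiple of 10. Unequal class widths would need explicit class limits.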

Graphical Summary: Histogram

[Figure: histogram titled "Tune-up Parts Cost". Horizontal axis: Parts Cost (Rs),
with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110. Vertical axis:
Frequency, scaled from 2 to 18.]

Numerical Descriptive Statistics

The most common numerical descriptive statistic
is the average (or mean).
Hilton's average cost of parts, based on the 50 tune-ups
studied, is Rs 79 (found by summing the 50 cost values
and then dividing by 50).
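
The average can be checked directly. This is a minimal sketch using Python's standard `statistics` module on the 50 values from the sample; the name `costs` is illustrative.

```python
from statistics import mean

# Parts costs (Rs) for the 50 tune-ups listed in the sample.
costs = [
    91, 71, 104, 85, 62, 78, 69, 74, 97, 82,
    93, 72, 62, 88, 98, 57, 89, 68, 68, 101,
    75, 66, 97, 83, 79, 52, 75, 105, 68, 105,
    99, 79, 77, 71, 79, 80, 75, 65, 69, 69,
    97, 72, 80, 67, 62, 62, 76, 109, 74, 73,
]

# Sum the 50 cost values and divide by 50.
average = mean(costs)
print(f"sum = {sum(costs)}, n = {len(costs)}, mean = {average:.2f}")
```

The values sum to 3949, so the mean is 78.98, which rounds to the Rs 79 quoted above.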

Statistical Inference
Population - the set of all elements of interest in a
particular study
Sample - a subset of the population
Statistical inference - the process of using data obtained
from a sample to make estimates and test hypotheses about the
characteristics of a population
Census - collecting data for a population
Sample survey - collecting data for a sample

Process of Statistical Inference

1. Population consists of all tune-ups. Average cost of
parts is unknown.

2. A sample of 50 engine tune-ups is examined.

3. The sample data provide a sample average parts cost
of Rs 79 per tune-up.

4. The sample average is used to estimate the
population average.
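
The four steps can be imitated in a small simulation. Everything in it is an assumption made for illustration: the population of 10,000 tune-ups is synthetic (normally distributed costs with mean 80 and standard deviation 12, chosen arbitrarily), and the seed is fixed only so the run is repeatable. In practice the population average is unknown, which is the reason for sampling.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Step 1: a synthetic "population" of parts costs (hypothetical values).
population = [random.gauss(80, 12) for _ in range(10_000)]

# Step 2: examine a simple random sample of 50 tune-ups.
sample = random.sample(population, 50)

# Step 3: the sample data provide a sample average.
sample_average = mean(sample)

# Step 4: use the sample average as the estimate of the population average.
# The true value is available here only because the population was simulated.
population_average = mean(population)
print(f"estimate = {sample_average:.2f}, actual = {population_average:.2f}")
```

The estimate will usually differ a little from the population average. That sampling error is what the methods of statistical inference are meant to quantify.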

Computers and Statistical Analysis


Statistical analysis typically involves working with
large amounts of data.
Computer software is typically used to conduct the
analysis.
Instructions are provided in chapter appendices for
carrying out many of the statistical procedures
using Minitab and Excel.
