
SQT I

This document provides an introduction to statistics and data representation. It discusses key concepts like frequency distributions, class intervals, cumulative frequency distributions and different types of graphs including histograms, frequency polygons, ogives and bar charts that are used to represent data. Microsoft Excel is mentioned as a tool for statistical analysis and creating graphs for frequency distributions. The objectives are listed as introducing statistics, data representations, and using graphs and Excel for analysis of frequency distributions.

Uploaded by

shruti jain

Unit – I

Introduction to Statistics
Dr. Vishal Thelkar
Learning Objectives
• Introduction to Statistics
• Data Representations and Frequency Distribution
• Graphs: Histogram, Polygon, Ogive, Bar Chart, Pie Chart, Pareto Diagram
• Using Microsoft Excel for the analysis of frequency distributions and graphs
Introduction
• In the modern world of computers and information technology, the importance of
statistics is well recognized across all disciplines.
• Statistics originated as a science of statehood and slowly and steadily found
applications in
o Agriculture
o Economics
o Commerce
o Biology
o Medicine
o Industry
o Planning, education and so on
• Today there is hardly any walk of human life where statistics cannot be applied.
Origin of Statistics
Statistics
• The words ‘Statistics’ and ‘Statistical’ are both derived from the Latin
word status, meaning a political state.

Statistics is concerned with scientific methods for
• collecting,
• organizing,
• summarizing,
• presenting and analyzing data,
• as well as deriving valid conclusions and making reasonable decisions on
the basis of this analysis.
Meaning
• The word ‘statistics’ is used to refer to
• Numerical facts, such as the number of people living in a particular area.
• The study of ways of collecting, analyzing and interpreting these facts.
Definition
• According to Croxton and Cowden, “Statistics may be defined as the
science of collection, presentation, analysis and
interpretation of numerical data from the logical analysis.”
• A. L. Bowley defines it as follows: “Statistics are numerical statements of
facts in any department of enquiry placed in relation to each
other.”
“Statistics may rightly be called the science of averages.”
Limitations

• Statistics is not suitable for the study of qualitative phenomena
• Statistics does not study individuals
• Statistical laws are not exact
• Statistical tables may be misused
• Statistics is only one of the methods of studying a problem
Information and Data

What Comes First?

Data Presentation and Graphical Displays
Objectives
• To discover ways of condensing data so as to make them
suitable for further analysis.
• To use diagrams and graphs effectively in the representation
of data.
• To condense the raw data further into frequency tables,
using tally marks, so as to make them amenable to statistical
techniques.
Data
• Data - A collection of observations or measurements on one
or more variables is called data.
• Data Array - The arrangement of data in ascending or
descending order is called a data array.
Example of Data on sales (in Thousands)
Example of Data Array
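Turning raw data into a data array amounts to sorting it. A minimal sketch in Python; the sales figures below are made up for illustration and are not the values from the slides:

```python
# Hypothetical raw sales figures (in thousands); illustrative values only.
raw_sales = [42, 17, 35, 28, 51, 17, 30]

# A data array is the same data arranged in ascending or descending order.
ascending = sorted(raw_sales)
descending = sorted(raw_sales, reverse=True)

print("Ascending :", ascending)
print("Descending:", descending)
```

Repeated values (here 17) stay in the array as separate observations; sorting only changes their order.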
Types of data
• Primary Data - Data collected first hand by someone specifically
for the purpose of the study are known as primary data
e.g. Interview, Questionnaire
• Secondary Data - Data that have been collected by some other
person or organization for their own use, and that the
investigator then obtains for his/her own use, are known as secondary data
e.g. Newspapers, Periodicals, Research Bureaus
• Qualitative Data - Data expressed as a non-numerical
property, such as customer satisfaction or descriptions like rich, poor and
superior, are called qualitative data
• Quantitative Data - Data expressed in numerical form, such
as weight, height, income or price, are called quantitative data
Frequency & Frequency Distributions
• Frequency refers to the number of occurrences.
• Frequency distributions are formed to condense data
that are in raw form, so as to make them more informative.
• Frequency tables of a variable are formed by using tally
marks.
• A continuous frequency distribution is formed by grouping the data
into class intervals, each interval having a lower limit and an
upper limit.
Frequency & Frequency Distributions
• Frequency – the number of times a value is observed in the data
• Frequency Distribution – a list of all the values obtained in
the data and their corresponding frequencies
• A frequency distribution can be formed for two types of variable:
• Discrete
• Continuous
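For a discrete variable, the frequency distribution can be built by counting how often each distinct value occurs. The sketch below uses Python's `collections.Counter`; the household data are hypothetical and serve only to illustrate the idea:

```python
from collections import Counter

# Hypothetical discrete data: number of children in 12 households (illustrative only).
children = [2, 0, 1, 2, 3, 1, 2, 0, 1, 2, 4, 1]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(children)

# The frequency distribution: values in ascending order with their frequencies.
distribution = sorted(frequency.items())
for value, count in distribution:
    print(f"{value}  {count}")
```

The frequencies always add up to the number of observations, which is a quick check on any hand-made tally.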
Class and Class Limits
• Class - If the observations of a series are divided into groups and
the groups are bounded by limits, then each group is called a class
• Class Limits - The end values of a class are called its class limits
• Upper Class Limit - The higher end value of a class is called its
upper class limit (U)
• Lower Class Limit - The smaller end value of a class is called its lower
class limit (L)
Class Interval
• Class interval (C.I.) is the difference between the upper limit and
the lower limit of a class (which is constant throughout the classes)
Methods of Frequency distribution

• Exclusive Method - The upper class limit of one class is the lower
limit of the next class.
The upper limit is excluded from the class.
• Inclusive Method - The upper class limit of a class is included
in the class itself.
Both the upper limit and the lower limit are included within the class.
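The difference between the two methods comes down to whether the upper limit is compared with `<` or `<=`. A minimal sketch, where the function names and the example classes (10-20, 20-30 and 10-19) are illustrative assumptions rather than anything taken from the slides:

```python
def in_exclusive_class(x, lower, upper):
    """Exclusive method: the lower limit belongs to the class, the upper limit does not."""
    return lower <= x < upper


def in_inclusive_class(x, lower, upper):
    """Inclusive method: both the lower and the upper limit belong to the class."""
    return lower <= x <= upper


# Exclusive classes 10-20 and 20-30: the value 20 is counted in the second class only.
print(in_exclusive_class(20, 10, 20))
print(in_exclusive_class(20, 20, 30))

# Inclusive classes 10-19 and 20-29: the value 19 is counted in the first class.
print(in_inclusive_class(19, 10, 19))
```

With the exclusive method, a value that equals a shared boundary is counted exactly once, in the class where it is the lower limit.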
Examples of series
The data below show the value of orders (in '000s) obtained by 50 salesmen during a particular
week

6.0 5.9 3.5 2.9 8.7 7.9 7.1 5.0 5.2 3.9
3.7 6.1 5.8 4.1 5.8 6.4 3.8 4.9 5.7 5.5
6.9 4.0 4.8 5.1 4.3 5.4 6.8 5.9 6.9 5.4
2.4 4.9 7.2 4.2 6.2 5.8 3.8 6.2 5.7 6.8
3.4 5.0 5.2 5.3 3.0 3.6 3.8 5.8 4.9 3.7
Arrange these data as a frequency distribution (forming about 7 classes).

The values range from 2.4 to 8.7, so a class width of (8.7 − 2.4) / 7 = 0.9,
rounded to 1, gives about 7 classes.
So the classes are 2.0 – 3.0, 3.0 – 4.0, 4.0 – 5.0, 5.0 – 6.0,
6.0 – 7.0, 7.0 – 8.0 and 8.0 – 9.0.

Exclusive Series

Sr. No   Class       Tally                     Frequency
1        2.0 – 3.0   ||                         2
2        3.0 – 4.0   ||||| |||||               10
3        4.0 – 5.0   ||||| |||                  8
4        5.0 – 6.0   ||||| ||||| ||||| ||      17
5        6.0 – 7.0   ||||| ||||                 9
6        7.0 – 8.0   |||                        3
7        8.0 – 9.0   |                          1
         Total                                 50
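The tallying for this exclusive series can be checked with a short script. This is a sketch, not part of the original slides: it uses Python's `bisect_right` so that a value equal to a class boundary (for example 3.0) falls into the higher class, which is what the exclusive method requires.

```python
from bisect import bisect_right

# Value of orders (in '000s) obtained by the 50 salesmen, as listed above.
orders = [
    6.0, 5.9, 3.5, 2.9, 8.7, 7.9, 7.1, 5.0, 5.2, 3.9,
    3.7, 6.1, 5.8, 4.1, 5.8, 6.4, 3.8, 4.9, 5.7, 5.5,
    6.9, 4.0, 4.8, 5.1, 4.3, 5.4, 6.8, 5.9, 6.9, 5.4,
    2.4, 4.9, 7.2, 4.2, 6.2, 5.8, 3.8, 6.2, 5.7, 6.8,
    3.4, 5.0, 5.2, 5.3, 3.0, 3.6, 3.8, 5.8, 4.9, 3.7,
]

# Class boundaries for 2.0-3.0, 3.0-4.0, ..., 8.0-9.0.
edges = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

frequencies = [0] * (len(edges) - 1)
for value in orders:
    # bisect_right sends a value equal to a boundary to the higher class,
    # so the upper limit of each class is excluded. All values here lie
    # strictly between 2.0 and 9.0, so the index is always valid.
    index = bisect_right(edges, value) - 1
    frequencies[index] += 1

for lower, upper, f in zip(edges, edges[1:], frequencies):
    print(f"{lower:.1f} - {upper:.1f}  {f}")
print("Total", sum(frequencies))
```

The total should equal 50, the number of salesmen; a different total would indicate a value that was missed or counted twice.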
Cumulative Frequency Distribution

• A frequency distribution simply tells us how frequently a particular
value of the variable (or class) occurs.
• To find the total number of observations with a value less than
or more than a particular value of the variable (or class), we use the
cumulative frequency distribution. It is a modification of the
given frequency distribution, obtained by successively adding
the individual frequencies of the values (or classes) of the variable.
Cumulative frequency (c.f) Distribution
• The cumulative frequency corresponding to a class is the sum of all
the frequencies up to and including that class.
• Less than c.f. - The less than c.f. for any value (or class) of the variable is
obtained by successively adding the frequencies of all the previous values (or
classes), including the frequency of the value against which the total
is written, provided the values (classes) are arranged in ascending
order of magnitude.
• More than c.f. - The more than c.f. is obtained similarly,
by cumulating the frequencies starting from the
highest value (class) of the variable down to the lowest.
• It is also called the de-cumulative frequency, denoted by de-c.f.
• It can also be obtained by taking the total frequency as the de-c.f. of the first
class and then subtracting the frequency of the previous class from
the de-c.f. of the previous class to get the de-c.f. of the present
class.
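Both kinds of cumulative frequency can be computed as running totals. The sketch below applies the definitions above to the marks data used in this section (frequencies 5, 10, 15, 32, 21); the variable names are illustrative.

```python
from itertools import accumulate

classes = ["5-10", "10-15", "15-20", "20-25", "25-30"]
frequencies = [5, 10, 15, 32, 21]
total = sum(frequencies)

# Less than c.f.: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# More than c.f.: running total from the highest class downward.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# De-cumulative route: start with the total for the first class, then
# subtract the frequency of the previous class each time.
de_cf = [total]
for f in frequencies[:-1]:
    de_cf.append(de_cf[-1] - f)

for row in zip(classes, frequencies, less_than_cf, more_than_cf, de_cf):
    print(*row)
```

The last less than c.f. and the first more than c.f. both equal the total frequency, and the de-cumulative route reproduces the more than c.f. column.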
Less than c.f.
Marks    No. of Students   Less than c.f.
5-10      5                 5
10-15    10                15
15-20    15                30
20-25    32                62
25-30    21                83
Total    83
More than c.f.
Marks    No. of Students   More than c.f.
5-10      5                21 + 32 + 15 + 10 + 5 = 83
10-15    10                21 + 32 + 15 + 10 = 78
15-20    15                21 + 32 + 15 = 68
20-25    32                21 + 32 = 53
25-30    21                21
Total    83
More than c.f. by de-c.f.
Marks    No. of Students   More than c.f.
5-10      5                83
10-15    10                83 − 5 = 78
15-20    15                78 − 10 = 68
20-25    32                68 − 15 = 53
25-30    21                53 − 32 = 21
Total    83
Percentage frequencies:

A frequency expressed as a percentage of the total frequency is known as a
percentage frequency.

$$\text{Percentage frequency} = \frac{\text{Class frequency}}{\text{Total frequency}} \times 100$$

$$\text{Percentage frequency} = \frac{f}{N} \times 100$$
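As a worked instance of the formula, the sketch below computes percentage frequencies for the marks data from the cumulative frequency tables (f = 5, 10, 15, 32, 21, so N = 83). Rounding to two decimals is a presentation choice made here, not something the slides prescribe.

```python
classes = ["5-10", "10-15", "15-20", "20-25", "25-30"]
frequencies = [5, 10, 15, 32, 21]

# N is the total frequency.
N = sum(frequencies)

# Percentage frequency of each class: (f / N) * 100.
percentages = [f / N * 100 for f in frequencies]

for label, f, pct in zip(classes, frequencies, percentages):
    print(f"{label:>6}  f = {f:>2}  {pct:.2f}%")
```

Apart from rounding error, the percentage frequencies add up to 100.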
Percentage frequency
Class      Tally   Freq. (f)   Less than c.f.   % c.f.
50-200
200-350
350-500
500-650
650-800
800-950
Diagram – 1 dimensional
BAR DIAGRAMS - SUB-DIVIDED BAR Diagram
PERCENTAGE BAR DIAGRAM
MULTIPLE BAR DIAGRAM
2 dimensional – Pie chart
Graphs
• Graphs help to study the mathematical relationship between two variables.
• Graphs are more obvious, precise and accurate and are helpful to
statisticians for further study.
• Constructing a graph is easier than constructing a diagram.
The different types of graphs are:
• Frequency polygon
• Frequency curve
• Histogram
• Ogive curves
a) ‘Less than’ type
b) ‘More than’ type
Frequency polygon
Frequency curve
Histogram

A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data.
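To make the idea concrete, here is a rough text-only sketch that groups the salesmen's order values from the earlier example into classes of width 1 and prints one bar per class. A real histogram would be drawn with a charting tool such as Excel; the `#` bars are only a stand-in for the rectangles.

```python
import math
from collections import Counter

# Value of orders (in '000s) for the 50 salesmen from the earlier example.
orders = [
    6.0, 5.9, 3.5, 2.9, 8.7, 7.9, 7.1, 5.0, 5.2, 3.9,
    3.7, 6.1, 5.8, 4.1, 5.8, 6.4, 3.8, 4.9, 5.7, 5.5,
    6.9, 4.0, 4.8, 5.1, 4.3, 5.4, 6.8, 5.9, 6.9, 5.4,
    2.4, 4.9, 7.2, 4.2, 6.2, 5.8, 3.8, 6.2, 5.7, 6.8,
    3.4, 5.0, 5.2, 5.3, 3.0, 3.6, 3.8, 5.8, 4.9, 3.7,
]

width = 1.0  # class width, as chosen in the earlier example

# Map each value to the lower limit of its class (upper limit excluded).
counts = Counter(math.floor(v / width) * width for v in orders)

# One bar per class; the bar length is the class frequency.
for lower in sorted(counts):
    print(f"{lower:.1f}-{lower + width:.1f} | " + "#" * counts[lower])
```

Because the data are continuous, adjacent bars share a boundary, which is why a histogram is drawn without gaps between its rectangles.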
Ogive Curves / Cumulative frequency curve – Less than Ogive
Ogive Curves / Cumulative frequency curve – More than Ogive
Pareto chart/diagram
• A Pareto chart is a type of chart that contains both bars and a line
graph, where individual values are represented in descending order
by bars, and the cumulative total is represented by the line.
• The purpose of the Pareto chart is to highlight the most important
among a (typically large) set of factors. In quality control, it often
represents the most common sources of defects, the most frequently
occurring type of defect, or the most frequent reasons for customer
complaints, and so on.
• It is basically a bar chart showing how much each cause contributes to
an outcome or effect.
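The numbers behind a Pareto chart are a sorted frequency table plus a cumulative percentage. The sketch below builds that table; the defect categories and counts are invented purely for illustration.

```python
from itertools import accumulate

# Hypothetical defect counts by cause (made-up numbers for illustration).
defects = {
    "Misalignment": 15,
    "Scratches": 45,
    "Other": 5,
    "Dents": 25,
    "Discoloration": 10,
}

# Bars: causes sorted by count, largest first.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)
total = sum(defects.values())

# Line: cumulative percentage of the total after each bar.
running = list(accumulate(count for _, count in ordered))
cumulative_pct = [100 * r / total for r in running]

pareto_rows = [
    (cause, count, pct) for (cause, count), pct in zip(ordered, cumulative_pct)
]
for cause, count, pct in pareto_rows:
    print(f"{cause:<14} {count:>3}  {pct:5.1f}%")
```

The bar heights come from the second column and the line from the third; the line always ends at 100%.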
Example of Pareto Diagram
Interpretation
• In applying the 80/20 rule, draw a line starting at 80% on the
percentage scale, running parallel to the x-axis and stopping where it
contacts the cumulative percentage curve. The causes that fall to the
left of this point are the causes (the “vital few”) that contribute to
80% of the problems, while the causes to the right are less
important. This can help you focus improvement efforts on the
causes that can have the most impact on the problems.
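The 80% cut-off can also be found numerically. The sketch below is one possible reading of the rule: it keeps the causes, taken in descending order, whose cumulative share does not exceed the threshold. The function name `vital_few` and the complaint counts are hypothetical, and some practitioners also include the cause at which the line is crossed.

```python
def vital_few(counts, threshold_pct=80):
    """Return the causes, largest first, whose cumulative share stays within the threshold.

    Assumption: a cause is kept only if the cumulative percentage up to and
    including it is at most threshold_pct. If the largest cause alone exceeds
    the threshold, the result is empty under this convention.
    """
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    selected = []
    running = 0
    for cause, count in ordered:
        running += count
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 > threshold_pct * total:
            break
        selected.append(cause)
    return selected


# Hypothetical customer complaint counts (made-up numbers for illustration).
complaints = {
    "Late delivery": 50,
    "Damaged item": 30,
    "Wrong item": 10,
    "Billing error": 6,
    "Other": 4,
}

few = vital_few(complaints)
print(few)
```

In this made-up data set two of the five causes account for 80% of the complaints, so improvement efforts would start with those two.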
Advantages
• To analyze the frequency of problems or defects in a process 
• To analyze broad causes by examining their individual components 
• To help focus efforts on the most significant problems or causes
when there are many
