1. Introduction SMD
INTRODUCTION TO
STATISTICS
Dr. Bharath V, MFM., M.Com., Ph.D.
Assistant Professor
Kristu Jayanti College
Bengaluru
Course Content
• Unit 1: Introduction to Statistics
• Definition, divisions of statistics, importance, functions, scope, limitations
of statistics; collection and classification of data; creation and interpretation
of diagrams and graphs; construction of frequency distribution tables.
• Unit 2: Univariate Data Analysis I – Measures of Central Tendency
• Introduction: measures of central tendency: simple arithmetic mean using
the shortcut method and step deviation method; problems on incorrect
values, missing values, and missing frequencies; combined mean; weighted
arithmetic mean; median: discrete and continuous series problems,
missing frequencies, quartiles; mode: grouping and analysis table
method, interpolation formula.
• Unit 3: Univariate Data Analysis II – Measures of Dispersion
• Introduction: measures of dispersion: range, inter-quartile range, quartile
deviation, standard deviation problems using the assumed mean and step
deviation methods, coefficient of variation; sampling techniques and their
types.
Reference
• Gupta, S.P. (2006). Statistical Methods. New Delhi: Himalaya Publishing
House.
• Sathyaprasad, B.G. & Chikkodi. (2013). Quantitative Methods for Business
- II. New Delhi: Himalaya Publishing House.
• Rajesh S. Rajaghatta & Gangadharappa N.H. (2014). Quantitative
Methods for Business - II. Kalyani Publishers.
• Aggarwal S.L. & Bhardwaj S.L. (2011). Business Statistics. Kalyani
Publishers.
• Sharma J.K. (2015). Fundamentals of Business Statistics. Vikas Publishing
House Pvt. Ltd.
• Srivatsava T.N. & Shailaja Rego (2008). Statistics for Management. Tata
McGraw Hill.
Introduction
• The word ‘Statistics’ is derived from the Latin word
‘Status’, the Italian word ‘Statista’, or the German word
‘Statistik’, all of which relate to "state" or "politics."
Historically, it referred to the collection of data about the
state or government.
• Statistics is a tool in the hands of mankind to translate
complex facts into simple and understandable statements
of facts.
• As a Discipline: Statistics is the science of collecting,
analyzing, interpreting, and presenting data for informed
decision-making.
• As Numerical Data: Statistics refers to quantitative facts
or figures, such as population counts, sales figures, or
survey results.
• Inferential Statistics
• Uses sample data to make generalizations or predictions about a
larger population through hypothesis testing and confidence
intervals.
Characteristics of Statistics
• Quantitative Nature: Statistics primarily deals with
numerical data.
Social Sciences
• Studying population demographics and social behavior.
• Conducting surveys and public opinion polls.
• Measuring economic inequalities, literacy rates, and
employment statistics.
Education
• Analyzing student performance and evaluating educational
policies.
• Developing standardized testing metrics.
Functions of Statistics
Collection of Data
• Statistics provides methods for collecting reliable and relevant
data.
• Ensures systematic and organized approaches like surveys,
experiments, and observational studies.
Organization of Data
• Helps arrange raw data into meaningful formats, such as tables,
charts, and graphs, for easy understanding and analysis.
Summarization of Data
Simplifies complex datasets using descriptive measures:
• Central Tendency: Mean, median, mode.
• Dispersion: Range, variance, standard deviation.
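The descriptive measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `marks` values are a hypothetical sample, not data from the course, and the population formulas (dividing by n) are an assumed choice.

```python
import statistics

# Hypothetical sample: marks scored by eight students (not from the slides).
marks = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)

# Measures of dispersion (population formulas, dividing by n)
data_range = max(marks) - min(marks)
variance = statistics.pvariance(marks)
std_dev = statistics.pstdev(marks)

print("Mean:", mean, "Median:", median, "Mode:", mode)
print("Range:", data_range, "Variance:", variance, "Std. deviation:", std_dev)
```

If the data were treated as a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` (dividing by n - 1) would be the usual alternatives.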
Data Analysis
• Identifies patterns, trends, and relationships within datasets.
• Facilitates hypothesis testing and decision-making
processes.
Interpretation of Results
• Draws meaningful conclusions from analyzed data.
• Provides insights into the underlying trends and variability.
Decision-Making
• Offers quantitative evidence to support strategic and
operational decisions in various fields, including business,
government, and healthcare.
Quality Control
• Assists in maintaining and improving the quality of products
and services.
• Applies statistical methods like control charts in
manufacturing and service industries.
Measuring Uncertainty
• Provides tools to quantify and manage uncertainty using
probability and confidence intervals.
• Enables risk assessment and mitigation.
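As an illustration of quantifying uncertainty, the sketch below builds an approximate 95% confidence interval for a mean. The `sample` values are hypothetical, and the normal (z) approximation is an assumption; for a sample this small a t-based interval would normally be somewhat wider.

```python
import math
import statistics

# Hypothetical sample: daily sales (in units) observed on 10 days.
sample = [52, 48, 50, 55, 47, 51, 49, 53, 50, 45]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Standard error of the mean
standard_error = sample_sd / math.sqrt(n)

# z value for a 95% confidence level from the standard normal distribution
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error
print(f"Sample mean: {sample_mean}")
print(f"Approximate 95% confidence interval: {lower:.2f} to {upper:.2f}")
```

The interval narrows as the sample size grows or as the variability in the data falls, which is one way the uncertainty of an estimate can be managed.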
Designing Experiments
• Offers methods for creating experiments to test hypotheses
effectively.
• Ensures proper sampling and control of variables to reduce
bias.
Understanding Relationships
• Analyzes correlation and possible causal relationships between variables.
• Supports the study of how one factor influences another,
such as in regression analysis.
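A small sketch of how such a relationship might be measured: the Pearson correlation coefficient and a least-squares regression line, computed from their textbook formulas. The `ad_spend` and `sales` figures are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical data: advertising spend and sales for five periods.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products of deviations from the means
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
s_xx = sum((x - mean_x) ** 2 for x in ad_spend)
s_yy = sum((y - mean_y) ** 2 for y in sales)

# Pearson correlation coefficient (lies between -1 and +1)
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares regression line: sales = intercept + slope * ad_spend
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"Correlation coefficient r = {r:.4f}")
print(f"Regression line: sales = {intercept:.2f} + {slope:.2f} * ad_spend")
```

A high value of r shows that the two variables move together; on its own it does not establish that one causes the other.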
Importance of Statistics
Informed Decision-Making
• Statistics provides quantitative evidence for making informed decisions
in business, government, healthcare, and personal life.
• Example: Businesses use sales data to determine product pricing and
marketing strategies.
Understanding Data
• Helps summarize, simplify, and interpret complex datasets into
meaningful information.
• Example: A survey with thousands of responses can be condensed into
averages, percentages, and visual charts for better understanding.
Supports Research
• Forms the backbone of scientific research by validating
hypotheses and analyzing experimental results.
• Example: In medicine, statistics is used to determine the
efficacy of a new drug in clinical trials.
Understanding Relationships
• Explores correlation and possible causal links between variables to
guide actions.
• Example: In marketing, understanding how advertising spend
affects sales.
Limitations of Statistics
• Statistics does not deal with individual measurements; it deals with
aggregates of facts.
Data Collection
• Data collection involves gathering accurate and relevant
information to achieve specific objectives.
• Types of Data
• Primary Data: Data collected firsthand for a specific
purpose.
• Example: A researcher conducting a survey to understand
customer satisfaction in a supermarket.
• Secondary Data: Data already collected by someone
else, used for analysis.
• Example: Using census data to analyze population growth trends.
• Experiments
• Conducting controlled experiments to gather data.
• Example: Testing a new fertilizer on crops and recording the
growth rate.
• Existing Sources
• Utilizing previously published reports, articles, or
databases.
• Example: Using weather data from a government website for
climate change studies.
Comparison Table
Data Classification
• Basis of Classification
• Chronological Classification:
• Grouping data based on time periods.
• Example: Monthly sales data of a store for the year.
• Geographical Classification:
• Grouping data based on location.
• Example: Population distribution by state or region.
• Qualitative Classification:
• Grouping data by non-numeric attributes.
• Example: Categorizing employees by job roles (e.g., manager,
developer).
• Quantitative Classification:
• Grouping data by numeric values.
• Example: Classifying families by income brackets.
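Quantitative classification can be sketched as follows, using the income-bracket example. The `incomes` list and the bracket limits in `income_bracket` are hypothetical assumptions made for illustration.

```python
from collections import Counter

# Hypothetical monthly family incomes (in rupees); not from the slides.
incomes = [18000, 42000, 27000, 65000, 31000, 12000, 58000, 90000]


def income_bracket(income):
    """Return the bracket label for one income (bracket limits are assumed)."""
    if income < 25000:
        return "Below 25,000"
    if income < 50000:
        return "25,000 to 49,999"
    return "50,000 and above"


# Count how many families fall into each bracket
bracket_counts = Counter(income_bracket(income) for income in incomes)

for bracket, count in bracket_counts.items():
    print(f"{bracket}: {count} families")
```

The same pattern applies to the other bases of classification: replace `income_bracket` with a function that returns a time period, a region, or an attribute.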
Quantitative Data
• Quantitative data refers to information that can be
measured or expressed numerically. It is often used in
mathematical and statistical analysis.
• Numerical in nature.
• Can be discrete or continuous.
• Suitable for mathematical computations and comparisons.
• Continuous Data
• Represents measurable quantities.
• Can take any value within a range (includes fractions/decimals).
• Examples:
• Height of individuals (e.g., 5.8 feet).
• Temperature readings (e.g., 23.5°C).
• Time taken to complete a task (e.g., 2.45 hours).
Qualitative Data
• Qualitative data refers to non-numerical information that
describes qualities, characteristics, or categories. It is
often used in descriptive analysis.
• Characteristics:
• Descriptive in nature.
Ordinal Data
• Represents categories with a specific order or ranking.
• Differences between categories may not be measurable or equal.
• Examples:
• Customer satisfaction levels (very satisfied, satisfied, neutral, dissatisfied,
very dissatisfied).
• Educational qualifications (high school, undergraduate, graduate).
• Star ratings for a product (1 star, 2 stars, etc.).
• 3. Diagrammatic presentation.
Textual Presentation
• Textual presentation is a method of presenting statistical
data in the form of written or descriptive text.
Example
• Dataset
• A survey was conducted among 100 women entrepreneurs in a city
regarding challenges in business.
• Results:
• 30% reported financial difficulties.
• 25% faced administrative hurdles.
• 20% cited lack of access to raw materials.
• 15% mentioned legal issues.
• 10% identified market competition as a challenge.
• Textual Presentation:
• "In a recent survey of 100 women entrepreneurs, 30% of respondents
identified financial difficulties as the most significant challenge.
Administrative hurdles were reported by 25% of participants, while
20% faced difficulties in accessing raw materials. Legal issues were
highlighted by 15% of the respondents, and 10% cited market
competition as a barrier to their business success."
Example
• In 2014, out of a total of 2000 students in a college, 1400
were enrolled for Graduation and the rest for Post-Graduation (PG).
Out of the 1400 Graduation students, 100 were girls. However,
in all there were 600 girls in the college.
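The figures in this example can be arranged in a two-way table (course by gender), with the unstated cells obtained by subtraction. The sketch below works out those cells; the variable names are illustrative only.

```python
# Figures stated in the example
total_students = 2000
graduation_total = 1400
graduation_girls = 100
total_girls = 600

# Remaining cells of the table, found by subtraction
pg_total = total_students - graduation_total
graduation_boys = graduation_total - graduation_girls
pg_girls = total_girls - graduation_girls
pg_boys = pg_total - pg_girls
total_boys = total_students - total_girls

# Print the completed two-way table
print(f"{'Course':<12}{'Boys':>8}{'Girls':>8}{'Total':>8}")
print(f"{'Graduation':<12}{graduation_boys:>8}{graduation_girls:>8}{graduation_total:>8}")
print(f"{'PG':<12}{pg_boys:>8}{pg_girls:>8}{pg_total:>8}")
print(f"{'Total':<12}{total_boys:>8}{total_girls:>8}{total_students:>8}")
```

Each row and each column of the completed table adds up to its stated total, which is a quick check on the arithmetic.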
Diagrammatic Presentation
• Diagrammatic presentation refers to the use of visual aids
such as charts, graphs, and diagrams to represent data.
Line graph
• Ex. A test was administered to the students of class X to
demonstrate the effect of practice on learning. The data
so obtained may be studied from the following table:
Trial No.  1  2  3  4  5  6  7  8  9 10 11 12
Score      4  5  8  8 10 13 12 12 14 16 16 16
Draw a line graph for the representation and interpretation of the above data.
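As an aid to interpretation, the sketch below lists the points a line graph would join (trial on the horizontal axis, score on the vertical axis) and summarizes how the score changes between trials. It prints a rough text chart rather than a drawn graph, which is an assumed simplification.

```python
trials = list(range(1, 13))
scores = [4, 5, 8, 8, 10, 13, 12, 12, 14, 16, 16, 16]

# Points that a line graph would join: (trial, score)
points = list(zip(trials, scores))

# Change in score from each trial to the next
changes = [later - earlier for earlier, later in zip(scores, scores[1:])]

# Overall improvement and the trials at which the score fell
total_gain = scores[-1] - scores[0]
drops = [trial for trial, change in zip(trials[1:], changes) if change < 0]

# Rough text rendering: one row per trial, one star per score point
for trial, score in points:
    print(f"Trial {trial:>2}: {'*' * score} ({score})")

print("Total gain from first to last trial:", total_gain)
print("Trials where the score dropped:", drops)
```

The overall rise in scores suggests that practice improves performance, with a small dip at trial 7 and a plateau over the last three trials.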
BAR DIAGRAM
• Simple bar diagram
• Multiple or grouped bar diagram
• Subdivided or component bar diagram
• Percentage subdivided bar diagram
Frequency Distribution
• Frequency distribution is a tabular or graphical
representation of data that shows the number of times
each value or group of values (class) occurs in a dataset.
Class Intervals:
• Represents a range of data values.
• Example: For scores, intervals might be 0–10, 11–20, etc.
Frequency (f):
• The number of observations within each class interval.
Cumulative Frequency:
• The running total of frequencies as you move through
class intervals.
• Percentage Frequency
• The proportion of the total frequency for each class,
expressed as a percentage: (class frequency ÷ total frequency) × 100.
Problem
• A survey was conducted to record the ages
of 30 people. The ages (in years) are as
follows:
23, 25, 30, 21, 24, 28, 27, 23, 25, 22,
29, 30, 26, 31, 28, 24, 22, 21, 29, 23,
25, 26, 28, 30, 27, 22, 31, 29, 28, 24.
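One possible way to build the frequency distribution for these ages is sketched below. The slides do not state the class intervals, so inclusive classes of width 2 starting at the lowest age (21–22, 23–24, and so on) are an assumption; the frequency, cumulative frequency, and percentage frequency columns follow the definitions given earlier.

```python
ages = [
    23, 25, 30, 21, 24, 28, 27, 23, 25, 22,
    29, 30, 26, 31, 28, 24, 22, 21, 29, 23,
    25, 26, 28, 30, 27, 22, 31, 29, 28, 24,
]

# Assumed: inclusive class intervals of width 2, starting at the lowest age.
class_width = 2
lower = min(ages)

rows = []  # (class interval, frequency, cumulative frequency, percentage)
cumulative = 0
while lower <= max(ages):
    upper = lower + class_width - 1
    # Number of observations falling inside this class interval
    frequency = sum(lower <= age <= upper for age in ages)
    cumulative += frequency
    percentage = round(100 * frequency / len(ages), 2)
    rows.append((f"{lower}-{upper}", frequency, cumulative, percentage))
    lower += class_width

print(f"{'Class':<8}{'f':>4}{'Cum. f':>8}{'% f':>8}")
for interval, frequency, cum_frequency, percentage in rows:
    print(f"{interval:<8}{frequency:>4}{cum_frequency:>8}{percentage:>8}")
```

The last cumulative frequency equals the number of observations (30), and the percentage column adds up to roughly 100, which serves as a check on the table.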