SM Session 1 IPL 2024 Post Session Slides
Session 1
• Statistics
• Applications in Business and Economics
• Data
• Data Sources
• Descriptive Statistics
• Statistical Inference
Why do we need to study Mathematics (Probability) and Statistics?
What is Statistics?
• Probability and Statistics are both sciences that help us make better
decisions in business and economics, as well as in other fields.
What is Statistics?
• Statistics can also refer to the art and science of collecting, analyzing,
presenting, and interpreting data.
Applications in Business and Economics
Marketing
• Electronic point-of-sale scanners at retail checkout counters are used
to collect data for a variety of marketing research applications.
Law
• Jury selection and courtroom analytics (e.g. verdict prediction)
• Legal research
• Crime data analysis (e.g. crime trends)
• Case preparation (analyzing evidence, preparing legal arguments, etc.)
Applications in Business and Economics
Accounting
• Public accounting firms use statistical sampling procedures when
conducting audits for their clients. Sampling is also used when checking
income tax returns.
Economics
• Economists use statistical information in making forecasts about the future
of the economy or some aspect of it.
Production
• A variety of statistical quality control charts are used to monitor the output
of a production process.
Data and Data Sets
• Data are the facts and figures collected, analyzed, and summarized
for presentation and interpretation.
• All the data collected in a particular study are referred to as the data
set for the study.
Data, Data Sets, Elements, Variables, and Observations
Variables
Data Set
The World is Data Rich
“Data is the new oil. It’s valuable, but if unrefined, it cannot really be
used. It has to be changed into gas, plastic, chemicals, etc. to create a
valuable entity that drives profitable activity; so must data be broken
down and analyzed for it to have value.”
Categorical and Quantitative Data
Categorical Data
Quantitative Data
Cross-Sectional Data
Example
• Data detailing the number of building permits issued in November
2023 in each of the states of India.
Time Series Data
Example
Data detailing the number of building permits issued in Delhi, in each
of the last 36 months.
Scales of Measurement
Nominal scale
• Data are labels or names used to identify an attribute of the element.
• A nonnumeric label or numeric code may be used.
Example
Students of a university are classified by the school in which they are
enrolled using a nonnumeric label such as Business, Humanities, Education,
and so on.
Alternatively, a numeric code could be used for the school variable (e.g. 1
denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).
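The numeric coding in this example can be sketched in a few lines of Python. The code-to-label mapping comes from the example above; the list of coded students and the name `school_label` are hypothetical, introduced only for illustration.

```python
# Numeric codes for the school variable, as given in the example above.
SCHOOL_CODES = {1: "Business", 2: "Humanities", 3: "Education"}


def school_label(code):
    """Return the school name that a numeric code stands for."""
    return SCHOOL_CODES[code]


# Hypothetical coded observations for four students (illustration only).
coded = [1, 3, 2, 1]
labels = [school_label(c) for c in coded]
print(labels)  # ['Business', 'Education', 'Humanities', 'Business']

# On a nominal scale only equality checks and counts are meaningful;
# averaging the codes (e.g. 1.75) would say nothing about the students.
counts = {name: labels.count(name) for name in SCHOOL_CODES.values()}
print(counts)  # {'Business': 2, 'Humanities': 1, 'Education': 1}
```

The codes are only names in numeric form, so the sketch counts them but never adds or ranks them.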
Scales of Measurement
Ordinal scale
• The data have the properties of nominal data and the order or rank of
the data is meaningful.
• A nonnumeric label or numeric code may be used.
Example
Students of a university are classified by their class standing using a
nonnumeric label such as Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for the class standing
variable (e.g. 1 denotes Freshman, 2 denotes Sophomore, and so on).
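A minimal Python sketch of why order matters on an ordinal scale. Codes 1 and 2 come from the example; codes 3 and 4 are assumed to continue the same pattern, and the student list and the helper `is_ahead` are hypothetical.

```python
# Rank codes for class standing. Codes 1 and 2 are from the example above;
# 3 and 4 are assumed to continue the same order.
STANDING_RANK = {"Freshman": 1, "Sophomore": 2, "Junior": 3, "Senior": 4}


def is_ahead(a, b):
    """True if class standing a ranks above class standing b."""
    return STANDING_RANK[a] > STANDING_RANK[b]


# Hypothetical observations (illustration only).
students = ["Junior", "Freshman", "Senior", "Sophomore"]

# Sorting by rank is meaningful on an ordinal scale.
ordered = sorted(students, key=STANDING_RANK.get)
print(ordered)  # ['Freshman', 'Sophomore', 'Junior', 'Senior']
print(is_ahead("Senior", "Junior"))  # True
```

Comparisons and sorting are valid here, but the gap between codes (say 4 minus 2) has no fixed unit, so differences are not interpreted.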
Scales of Measurement
Interval scale
• The data have the properties of ordinal data, and the interval
between observations is expressed in terms of a fixed unit of measure.
• Interval data are always numeric.
Example
Melissa has an SAT score of 1985, while Kevin has an SAT score of
1880. Melissa scored 105 points more than Kevin.
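The arithmetic in this example, written as a short Python sketch; the variable names are illustrative only.

```python
# SAT scores from the example above.
melissa_sat = 1985
kevin_sat = 1880

# On an interval scale the difference is expressed in a fixed unit (points).
difference = melissa_sat - kevin_sat
print(difference)  # 105
```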
Scales of Measurement
Ratio scale
• Data have all the properties of interval data and the ratio of two values is
meaningful.
• Ratio data are always numeric.
• A zero value is included in the scale.
Example:
The price of a book at a retail store is $200, while the price of the same book
sold online is $100. The ratio property shows that the retail store charges
twice the online price.
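The ratio in this example can be checked with a short Python sketch; the variable names are illustrative only.

```python
# Prices from the example above, in dollars.
retail_price = 200
online_price = 100

# On a ratio scale, dividing one value by another is meaningful
# because a price of zero means no cost at all.
ratio = retail_price / online_price
print(ratio)  # 2.0
```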
Exercise
Scales of Measurement
Data
• Categorical: numeric or non-numeric
• Quantitative: numeric
PRIMARY DATA AND SECONDARY DATA
Secondary Data
• Pre-existing data not gathered for the purposes of the current research
  ◦ Not "new" data; "second-hand" data
Why use secondary data?
• Because you want to find out what is already known about a subject before
you delve into your own investigation.
• Because some of your questions may already have been answered
by other investigators or authors.
Primary Data
Demographic/Socioeconomic
• Age, Gender, Income, Marital Status, Occupation
Psychological/Lifestyle
• Activities, Interests, Personality Traits
Attitudes/Opinions
• Preferences, Views, Feelings, Inclinations
Awareness/Knowledge
• Facts about product, features, price, uses
Intentions
• Planned or Anticipated Behavior
Motivations
• Why People Buy (Needs, Wants, Wishes, Ideal-Self)
Behavior
• Purchase, Use, Timing, Traffic Flow
Primary Data Can Be Gathered By:
• Communication Methods
  ◦ Interacting with respondents
  ◦ Asking for their opinions, attitudes, motivations, characteristics
• Observation Methods
  ◦ No interaction with respondents
  ◦ Letting them behave naturally and drawing conclusions from their actions
…but before we delve deeper
Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it is available.
Cost of Acquisition
• Organizations often charge for information even when it is not their
primary business activity.
Data Errors
• Using any data that happen to be available or were acquired with
little care can lead to misleading information.
Using Statistics (Two Categories)
Descriptive Statistics
Example
The manager of Hudson Auto would like to have a better understanding of
the cost of parts used in the engine tune-ups performed in her shop. She
examines 50 customer invoices for tune-ups. The costs of parts, rounded to
the nearest dollar, are listed on the next slide.
Hudson Auto Repair
71 69 72 89 66 75 79 75 72 76
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary: Frequency and Percent Frequency
Parts Cost ($)   Frequency   Percent Frequency
50-59                 2              4%
60-69                13             26%
70-79                16             32%
80-89                 7             14%
90-99                 7             14%
100-109               5             10%
TOTAL                50            100%
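As a rough check, the percent frequency column can be recomputed from the frequency column with a short Python sketch. The class labels and counts are copied from the tabular summary; the variable names are illustrative only.

```python
# Frequencies per parts-cost class, copied from the tabular summary.
frequencies = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}

# Percent frequency = class frequency / total number of observations * 100.
total = sum(frequencies.values())
percent = {cls: 100 * freq / total for cls, freq in frequencies.items()}

for cls, freq in frequencies.items():
    print(f"{cls:>7} {freq:>3} {percent[cls]:>4.0f}%")
print(f"{'TOTAL':>7} {total:>3} {sum(percent.values()):>4.0f}%")
```

The printed percentages should match the table: 4%, 26%, 32%, 14%, 14%, and 10%, summing to 100% over 50 invoices.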
Graphical Summary: Histogram
[Figure: histogram of Frequency (vertical axis) against Parts Cost ($) (horizontal axis), one bar per class]
Numerical Descriptive Statistics
Software
• MS Excel
• R
• IBM SPSS