Ch01 Business Statistics


Disclaimer

 This presentation is purely for academic purposes and does not carry
any commercial value.
 All non-academic images used in this presentation are the property of
their respective holder(s). Images are used only for indicative
purposes and do not carry any other meaning.

1
Dilbert on Statistics

2
Please follow this…

3
DATA & STATISTICS
CHAPTER-1
Business Statistics
www.pibm.in

4
Text Book
TEXT BOOK
 Anderson, Sweeney, Williams, Camm, Cochran (2014). Business Statistics,
Cengage Learning (12th Edition)
NOTE – Most of the material is copied/adapted from this book. Please read the book for
additional explanation and understanding.

REFERENCE BOOK
 Black, K. (2011). Applied business statistics: Making better business
decisions. Wiley Publication.
5
Table of Contents

1 Applications
2 Data
3 Data Sources
4 Descriptive Statistics
5 Statistical Inference
6
Cellular Phone Use in Japan (Slide - 1/2)
 Communications and Information Network Association of Japan (CIAJ) conducts
an annual study of cellular phone use in Japan.

 A recent survey was taken as part of this study using a sample of 600 cell phone
users split evenly between men and women.

 The survey was administered in the greater Tokyo and Osaka metropolitan
areas. The study produced several interesting findings. It was determined that
62.2% had replaced their handsets in the previous 10 months.

 A little more than 6% owned a second cell phone. Of these, the objective of
about two thirds was to own one for business use and a second one for personal
use.
7
Cellular Phone Use in Japan (Slide - 2/2)
 Some of the everyday uses of cell phones included e-mailing (91.7% of
respondents), camera functions (77.7%), Internet searching (46.7%), and
watching TV (28.0%).

 Of all those surveyed, 18.2% used their handsets to view videos, and another
17.3% were not currently using their handsets to view videos but were
interested in doing so.

 Looking ahead, a mobile company planning its cell phone production for next
year should note that respondents hoped there would be cell phones with high-speed
data transmission that could be used to send and receive PC files (47.7%), for video
services such as YouTube (46.9%), for downloading music albums (45.3%) and music
videos (40.8%), and for downloading long videos such as movies (39.2%).
8
WHAT IS STATISTICS?

9
Scenarios
You manage a portfolio of investments and are presented with a research report
that shows that over the past five years, technology stocks have outperformed
utility stocks by 13.7% to 10.1%. Is this a significantly higher return or just the
results of random chance? Should you think about investing in technology stocks?
What questions would you ask about this research? In this course, you will learn
how to compare two values against each other and determine if they differ in any
significant way.

An online research survey shows you that 57% of your competitor's customers
would switch to your product if you offered a coupon. Should you create a
coupon? What factors would you consider? Can you trust an online survey? In this
course, you will learn how to read survey results and ask the right questions about
them.
Reference: https://learn.saylor.org/mod/page/view.php?id=4683
10
Why Study Statistics?
1 Make informed (data-driven) decisions rather than decisions based on intuition.

2 Read, evaluate, and analyze results published in reports and articles.

3 Identify and evaluate data gathered in the organization and create insights into
consumer behaviour, share market predictions, etc.

4 Conduct research required for business, such as market research, technical
analysis, etc.

5 Develop analytical, logical, and critical thinking skills.

11
Definition of Statistics

DEFINITION Statistics is defined as the art and science of collecting, analyzing,
presenting, and interpreting data.

HOW IS IT HELPFUL?
 It provides methods for analyzing and assessing the significance of data.
 It enables the transformation of data into information that
can then serve as the basis for decision-making.

12
APPLICATIONS

13
Applications in Business and Economics
ACCOUNTING Analysis of inventory churn, valuation, debt analysis, outstanding
analysis and predictions

FINANCE Risk analytics, share market predictions, algorithmic trading

MARKETING Electronic point-of-sale scanners at retail checkout counters are used
to collect data for marketing research.

MANUFACTURING Predictive maintenance of machines; quality control required to
monitor the output of a production process.

ECONOMICS Economists use statistical information in making forecasts about the
future of the economy or some aspect of it.
14
DATA

15
Data
LEARNINGS IN DATA -

Data and data set

Elements, Variables, and Observations

Scales of Measurement

Qualitative and Quantitative Data

Cross-Sectional and Time Series Data

16
Data and Data Set
DATA DATA are the facts & figures that are collected, summarized, analyzed, & interpreted.

DATA SET The data collected in a particular study are referred to as the DATA SET.

Data Set for 5 Companies

Company              Stock Exchange  Share Price ($)  Earnings per Share ($)
Abbott Laboratories  NY              46               2.02
Bank of New York     NY              30               1.85
eBay                 NQ              43               0.57
IBM                  NY              93               4.94
Wells Fargo          NY              59               4.09
NY – New York Stock Exchange, NQ – NASDAQ. Data were collected on April 2, 2005.
17
Elements, Variables, Measurement & Observations
ELEMENTS The ELEMENTS are the entities on which data are collected.
e.g. Each individual company’s stock is an element.
{Abbott Laboratories, Bank of New York, eBay, IBM, Wells Fargo}

VARIABLE A VARIABLE is a characteristic of interest for the elements.
e.g. Stock Exchange, Share Price ($), Earnings per Share ($)

MEASUREMENT Measurements are collected on each variable for every element.
e.g. for Abbott Laboratories, the measurement for the variable Share Price ($)
is {46}

OBSERVATION The set of measurements collected for a particular element is called an
OBSERVATION.
e.g. for Abbott Laboratories the observation is {NYSE, 46, 2.02}
18
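The vocabulary above can be made concrete with a small Python sketch. It assumes the five-company table from this chapter and stores each observation as a dictionary; the names `data_set`, `elements`, and `variables` are illustrative choices and do not come from the textbook.

```python
# Data set for 5 companies (values taken from the table in this chapter).
# Each inner dictionary is one OBSERVATION: all measurements for one element.
data_set = {
    "Abbott Laboratories": {"Stock Exchange": "NYSE", "Share Price ($)": 46, "Earnings per Share ($)": 2.02},
    "Bank of New York": {"Stock Exchange": "NYSE", "Share Price ($)": 30, "Earnings per Share ($)": 1.85},
    "eBay": {"Stock Exchange": "NASDAQ", "Share Price ($)": 43, "Earnings per Share ($)": 0.57},
    "IBM": {"Stock Exchange": "NYSE", "Share Price ($)": 93, "Earnings per Share ($)": 4.94},
    "Wells Fargo": {"Stock Exchange": "NYSE", "Share Price ($)": 59, "Earnings per Share ($)": 4.09},
}

# ELEMENTS: the entities on which data are collected.
elements = list(data_set)

# VARIABLES: the characteristics of interest recorded for every element.
variables = list(next(iter(data_set.values())))

# MEASUREMENT: the value of one variable for one element.
measurement = data_set["Abbott Laboratories"]["Share Price ($)"]

# OBSERVATION: the set of measurements for one element.
observation = tuple(data_set["Abbott Laboratories"].values())

# Total number of measurements = elements x variables.
total_measurements = len(elements) * len(variables)

print(elements)
print(variables)
print(measurement)         # 46
print(observation)         # ('NYSE', 46, 2.02)
print(total_measurements)  # 15
```

With 5 elements and 3 variables, the data set contains 15 measurements in total.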
Elements, Variables, and Observations

Data were collected on April 2, 2005.

Company              Stock Exchange  Share Price ($)  Earnings per Share ($)
Abbott Laboratories  NYSE            46               2.02
Bank of New York     NYSE            30               1.85
eBay                 NASDAQ          43               0.57
IBM                  NYSE            93               4.94
Wells Fargo          NYSE            59               4.09

Callouts on the slide: each company name is an Element, each column heading is a
Variable, each cell is a Measurement, and each row is an Observation.

19
(Table 1.7 from Text Book)

Example
PROBLEM- Identify Elements, Variables,
Measurement & Observations

20
Scales of Measurement
 Data collection requires one of the following scales of measurement:
• Nominal
• Ordinal
• Interval
• Ratio
 The scale determines the amount of information contained in the data.
 The scale indicates the data summarization and statistical analyses that are most
appropriate.

21
Scales of Measurement - NOMINAL
 When data (measurement) are labels or names used to identify an attribute of the
element, the scale of measurement is NOMINAL SCALE.
 A nonnumeric label or a numeric code may be used.

Stock Exchange           Nonnumeric Label  Numeric Code
New York Stock Exchange  NY                1
New York Stock Exchange  NY                1
NASDAQ                   NQ                2
New York Stock Exchange  NY                1
New York Stock Exchange  NY                1
NY – New York Stock Exchange, NQ – NASDAQ
22
(Table 1.7 from Text Book)

Example
PROBLEM - Identify NOMINAL scale

23
Scales of Measurement - ORDINAL
 The data have the properties of nominal data and the order or rank of the data is
meaningful.
 A nonnumeric label or a numeric code may be used.
e.g. Eastside Automotive sends customers a questionnaire designed to obtain data on the
quality of its automotive repair service. Each customer provides a repair service rating of
excellent, good, or poor. Because the data obtained are the labels—excellent, good, or
poor—the data have the properties of nominal data. In addition, the data can be ranked,
or ordered, with respect to the service quality. Data recorded as excellent indicate the
best service, followed by good and then poor. Thus, the scale of measurement is ordinal.
Note that the ordinal data can also be recorded using a numeric code.
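As a minimal sketch of recording ordinal data with a numeric code, the Python snippet below ranks a handful of ratings. The specific codes (1 = poor, 2 = good, 3 = excellent) and the five responses are assumptions made for illustration, not survey data; only the order of the codes carries meaning.

```python
# Hypothetical numeric code for the repair service ratings; only the ORDER matters.
RATING_CODE = {"poor": 1, "good": 2, "excellent": 3}

# Made-up responses from five customers (illustrative only).
responses = ["good", "excellent", "poor", "good", "excellent"]

# Because the scale is ordinal, sorting by code is meaningful.
ranked = sorted(responses, key=RATING_CODE.get, reverse=True)
print(ranked)  # best service first

# Comparisons between codes are meaningful on an ordinal scale...
is_better = RATING_CODE["excellent"] > RATING_CODE["good"]
print(is_better)  # True
# ...but the size of the gap between two codes is not.
```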

24
(Table 1.7 from Text Book)

Example
PROBLEM - Identify ORDINAL scale

25
Scales of Measurement - INTERVAL
 The data have the properties of ordinal data and the interval between observations is
expressed in terms of a fixed unit of measure.
 Interval data are always numeric.
e.g. Three students with CAT scores of 620, 550, and 470 can be ranked or ordered in
terms of best performance to poorest performance. In addition, the differences between
the scores are meaningful.
For instance, student 1 scored 620 - 550 = 70 points more than student 2, while student
2 scored 550 - 470 = 80 points more than student 3.

26
Scales of Measurement - RATIO
 The data have all the properties of interval data and the ratio of two values is
meaningful.
 Variables such as distance, height, weight, and time use the ratio scale.
 This scale must contain a zero value that indicates that nothing exists for the variable at
the zero point.

e.g. consider the cost of an automobile. A zero value for the cost would indicate that the
automobile has no cost and is free.
In addition, if we compare the cost of $30,000 for one automobile to the cost of $15,000 for
a second automobile, the ratio property shows that the first automobile is
$30,000/$15,000 = 2 times, or twice, the cost of the second automobile.
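The arithmetic in the interval and ratio examples can be checked with a few lines of Python. This is only a sketch using the figures quoted on these slides; the variable names are illustrative.

```python
# Interval scale: differences between values are meaningful.
cat_scores = [620, 550, 470]  # students 1, 2, and 3
diff_1_2 = cat_scores[0] - cat_scores[1]
diff_2_3 = cat_scores[1] - cat_scores[2]
print(diff_1_2, diff_2_3)  # 70 80

# Ratio scale: a true zero exists, so ratios are meaningful as well.
cost_first, cost_second = 30_000, 15_000
ratio = cost_first / cost_second
print(ratio)  # 2.0
```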

27
EXAMPLE - DiGiorno Pizza Shop (Slide-1/2)
In the various market research efforts made by Kraft for DiGiorno, some of the
possible measurements appear in the following list. Categorize these by level of
data. Think of some other measurements that Kraft researchers might have made
to help them in this research effort, and categorize them by level of data.
1. Number of pizzas consumed per week per household
2. Age of pizza purchaser
3. Zip code of the survey respondent
4. Dollars spent per month on pizza per person
5. Time in between purchases of pizza

28
EXAMPLE - DiGiorno Pizza Shop (Slide-2/2)
6. Rating of taste of a given pizza brand on a scale from 1 to 10, where 1 is
very poor tasting and 10 is excellent taste
7. Ranking of the taste of four pizza brands on a taste test
8. Number representing the geographic location of the survey respondent
9. Quality rating of a pizza brand as excellent, good, average, below average,
poor
10. Number representing the pizza brand being evaluated
11. Gender of survey respondent

29
Qualitative and Quantitative Data
 Data can also be classified as QUALITATIVE or QUANTITATIVE.

 The STATISTICAL ANALYSIS that is appropriate depends on whether the data for the
variable are qualitative or quantitative.

 In general, there are more alternatives for statistical analysis when the data are
QUANTITATIVE rather than QUALITATIVE.

30
Qualitative Data
 Qualitative data are LABELS or NAMES used to identify an attribute of each element.

 Qualitative data use either the NOMINAL or ORDINAL scale of measurement.

 Qualitative data can be either NUMERIC or NONNUMERIC.

 The statistical analyses for qualitative data are rather LIMITED.

 Qualitative data can be summarized by COUNTING the number of observations in each
qualitative category or by COMPUTING the proportion of the observations in each
qualitative category.

 Qualitative data with a numeric code can’t be used in ARITHMETIC OPERATIONS such
as addition, subtraction, multiplication, and division to get meaningful results.
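Counting and computing proportions can be sketched in Python as follows. The example assumes the Stock Exchange labels of the five companies used earlier in this chapter; `Counter` comes from Python's standard library, and the variable names are illustrative.

```python
from collections import Counter

# Stock Exchange labels for the five companies in this chapter's data set.
exchange = ["NY", "NY", "NQ", "NY", "NY"]

# COUNTING: number of observations in each qualitative category.
counts = Counter(exchange)

# COMPUTING the proportion of observations in each category.
n = len(exchange)
proportions = {label: count / n for label, count in counts.items()}

print(dict(counts))  # {'NY': 4, 'NQ': 1}
print(proportions)   # {'NY': 0.8, 'NQ': 0.2}

# Numeric codes (NY = 1, NQ = 2) are still only labels.
codes = [1, 1, 2, 1, 1]
mean_of_codes = sum(codes) / len(codes)  # 1.2 does not correspond to any exchange
```

The last two lines show why arithmetic on coded labels is not meaningful: the computation runs, but the result has no interpretation.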

31
Quantitative Data
 QUANTITATIVE DATA indicate either how many or how much.

 Quantitative data that measure how many are DISCRETE.

 Quantitative data that measure how much are CONTINUOUS because there is no
separation between the possible values for the data.

 Quantitative data are always NUMERIC.

 All ARITHMETIC OPERATIONS can be performed on quantitative data to get meaningful
output.

32
(Table 1.7 from Text Book)

Example - The Wall Street Journal


The Wall Street Journal subscriber survey (October 13, 2003) asked 46 questions about
subscriber characteristics and interests. State whether each of the following questions
provided qualitative or quantitative data and indicate the measurement scale appropriate
for each.
1. What is your age?
2. Are you male or female?
3. When did you first start reading the WSJ? High school, college, early career,
midcareer, late career, or retirement?
4. How long have you been in your present job or position?
5. What type of vehicle are you considering for your next purchase? Nine response
categories include sedan, sports car, SUV, minivan, and so on.
33
Cross-Sectional Data
Cross-sectional Data are collected at the same or approximately the same point in time.

e.g. The data given below are cross-sectional because they describe the three variables for
the 5 companies at the same point in time.

Data Set for 5 Companies

Company              Stock Exchange  Share Price ($)  Earnings per Share ($)
Abbott Laboratories  NY              46               2.02
Bank of New York     NY              30               1.85
eBay                 NQ              43               0.57
IBM                  NY              93               4.94
Wells Fargo          NY              59               4.09
NY – New York Stock Exchange, NQ – NASDAQ. Data were collected on April 2, 2005.
34
Time Series Data
TIME SERIES DATA are collected over several time periods.
e.g. The figure below provides a graph of the U.S. city average price per gallon for
unleaded regular gasoline. The graph shows the gasoline price in a fairly stable band
between $1.80 and $2.00 from May 2004 through February 2005. After that, the gasoline
price became more volatile. It rose significantly, culminating with a sharp spike in
September 2005.
35
Variety of Graphs of Time Series Data

36
(Table 1.7 from Text Book)

Example - Earnings for Volkswagen


Figure 1.8 provides a bar graph summarizing the earnings for Volkswagen for the years
1997 to 2005 (BusinessWeek, December 26, 2005).

 Are the data qualitative or quantitative?
 Are the data time series or cross-sectional?
 What is the variable of interest?
 Comment on the trend in Volkswagen’s earnings over time.
 What warning does this graph suggest about projecting data such as
Volkswagen’s earnings into the future?

37
Example - CSM Worldwide

CSM Worldwide forecasts global production for all automobile manufacturers. The
following CSM data show the forecast of global auto production for General Motors, Ford,
DaimlerChrysler, and Toyota for the years 2004 to 2007 (USA Today, December 21, 2005).
Data are in millions of vehicles.
 Construct a time series graph for the years 2004 to 2007 showing the number of
vehicles manufactured by each automotive company. Show the time series for all four
manufacturers on the same graph.

38
HOMEWORK PROBLEMS

39
Homework : Question-1
State whether each of the following variables is categorical or quantitative and
indicate its measurement scale.

1. Annual sales

2. Soft drink size (small, medium, large)

3. Employee classification (GS1 through GS18)

4. Earnings per share

5. Method of payment (cash, check, credit card)

40
Homework : Question-2
Classify each of the following as nominal, ordinal, interval, or ratio data.
1. The time required to produce each tire on an assembly line
2. The number of quarts of milk a family drinks in a month
3. The ranking of four machines in your plant after they have been designated as
excellent, good, satisfactory, and poor
4. The telephone area code of clients in the United States
5. The age of each of your employees
6. The dollar sales at the local pizza shop each month
7. An employee’s identification number
8. The response time of an emergency unit
41
DATA SOURCES

42
Data Sources
Statistical studies can be classified as either experimental or observational.

EXISTING SOURCES Data already exist in the databases of companies, industry
associations, special interest organizations, or government agencies.

EXPERIMENTAL STUDY In an experimental study, a variable of interest is first
identified. Then one or more other variables are identified and controlled so that data
can be obtained about how they influence the variable of interest.

OBSERVATIONAL STUDY Nonexperimental, or observational, statistical studies make no
attempt to control the variables of interest. A SURVEY is perhaps the most common type
of observational study.

43
Data Sources – Existing Internal Sources

44
Data Sources – Existing Government Sources

45
Data Sources – INTERNET
 The Internet has become an important source of data.

 Most government agencies, like the Bureau of the Census (www.census.gov),
make their data available through a web site.

 More and more companies are creating web sites and providing public access to
them.

 A number of companies now specialize in making information available over the
Internet.

46
Data Sources – EXPERIMENTAL STUDY
 In an experimental study, a variable of interest is first identified. Then one or
more other variables are identified and controlled so that data can be obtained
about how they influence the variable of interest.
 e.g. a pharmaceutical firm might be interested in conducting an experiment to
learn about how a new drug affects blood pressure. Blood pressure is the
variable of interest in the study. The dosage level of the new drug is another
variable that is hoped to have a causal effect on blood pressure. To obtain data
about the effect of the new drug, researchers select a sample of individuals. The
dosage level of the new drug is controlled, as different groups of individuals are
given different dosage levels. Before and after data on blood pressure are
collected for each group. Statistical analysis of the experimental data can help
determine how the new drug affects blood pressure.
47
Data Sources – OBSERVATIONAL STUDY
 Nonexperimental, or observational, statistical studies make no attempt to control the
variables of interest. A SURVEY is perhaps the most common type of observational
study.
 For instance, in a personal interview survey, research questions are first identified.
Then a questionnaire is designed and administered to a sample of individuals. Some
restaurants use observational studies to obtain data about their customers’ opinions of
the quality of food, service, atmosphere, and so on.
 A questionnaire used by the Lobster Pot Restaurant in Redington Shores, Florida, is
shown in Figure (on next slide). Note that the customers completing the questionnaire
are asked to provide ratings for five variables: food quality, friendliness of service,
promptness of service, cleanliness, and management. The response categories of
excellent, good, satisfactory, and unsatisfactory provide ordinal data that enable
Lobster Pot’s managers to assess the quality of the restaurant’s operation.
48
49
Data Acquisition Considerations
 TIME REQUIREMENT
• Searching for information can be time consuming.
• Information might no longer be useful by the time it is available.

 COST OF ACQUISITION
• Organizations often charge for information even when it is not their primary
business activity.

 DATA ERRORS
• Using any data that happen to be available or that were acquired with little
care can lead to poor and misleading information.

50
DESCRIPTIVE STATISTICS

51
Descriptive Statistics
Most of the statistical information in newspapers, magazines, company reports, and other
publications consists of data that are summarized and presented in a form that is easy for
the reader to understand. Such summaries of data, which may be tabular, graphical, or
numerical, are referred to as DESCRIPTIVE STATISTICS.

For example, a summary of the data for the qualitative variable Exchange is shown in two
different formats: a FREQUENCY TABLE and a BAR GRAPH.
Frequencies & Percent Frequencies for the Exchange Variable

52
Descriptive Statistics
In addition to tabular and graphical displays, numerical descriptive statistics are used to
summarize data. The most common numerical descriptive statistic is the average, or mean.
The average provides a measure of the central tendency, or central location, of the data
for a variable.

Using the data on the variable Earnings per Share for the S&P stocks in Table 1.1, we can
compute the average by adding the earnings per share for all 25 stocks and dividing the sum
by 25. Doing so provides an average earnings per share of $2.49.
53
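The same add-and-divide calculation can be sketched in Python. Table 1.1 with its 25 stocks is not reproduced in these slides, so the sketch below uses the five-company data set from earlier in the chapter as a stand-in. Its result therefore differs from the $2.49 quoted for the 25 stocks.

```python
import statistics

# Earnings per Share ($) for the five companies in this chapter's data set
# (a stand-in for the 25 stocks of Table 1.1, which are not listed here).
earnings_per_share = [2.02, 1.85, 0.57, 4.94, 4.09]

# Average = sum of the values divided by the number of values.
average_manual = sum(earnings_per_share) / len(earnings_per_share)

# The standard library offers the same calculation.
average_stdlib = statistics.mean(earnings_per_share)

print(f"average EPS of the 5 companies = ${average_manual:.2f}")
```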
STATISTICAL INFERENCE

54
Statistical Inference
Many situations require information about a large group of elements (individuals,
companies, voters, households, products, customers, and so on). But, because of time, cost,
and other considerations, data can be collected from only a SMALL PORTION of the group.
The larger group of elements in a particular study is called the POPULATION, and the
smaller group is called the SAMPLE.

POPULATION A population is the set of all elements of interest in a particular study.

SAMPLE A sample is a subset of the population.

55
Statistical Inference

CENSUS The process of conducting a survey to collect data for the entire
population is called a CENSUS.

SURVEY The process of conducting a survey to collect data for a sample is called
a sample SURVEY.

STATISTICAL INFERENCE Statistics uses data from a sample to make estimates and test
hypotheses about the characteristics of a population through a process
referred to as STATISTICAL INFERENCE.
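The relationship between a census, a sample survey, and statistical inference can be illustrated with a short Python simulation. Everything in it is hypothetical: the population of 1,000 values is randomly generated, and the sample size of 50, the seed, and the distribution parameters are arbitrary choices for the sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 made-up measurements (not real data).
population = [random.gauss(75, 10) for _ in range(1000)]

# CENSUS: measure every element to obtain the population average.
population_mean = statistics.mean(population)

# SAMPLE SURVEY: measure only a subset of the population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

# STATISTICAL INFERENCE: the sample average serves as an estimate
# of the population average, which is usually unknown in practice.
print(f"population average = {population_mean:.1f}")
print(f"sample average (estimate) = {sample_mean:.1f}")
```

In a real study the census is usually not feasible, so only the sample average is available and the population average must be estimated from it.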

56
Example - Norris Electronics
Norris manufactures a high-intensity lightbulb used in a variety of electrical products. In an
attempt to increase the useful life of the lightbulb, the product design group developed a
new lightbulb filament. In this case, the population is defined as all lightbulbs that could be
produced with the new filament. To evaluate the advantages of the new filament, 200 bulbs
with the new filament were manufactured and tested. Data collected from this sample
showed the number of hours each lightbulb operated before filament burnout.
Suppose Norris wants to use the sample data to make an inference about the average hours
of useful life for the population of all lightbulbs that could be produced with the new
filament. Adding the 200 values given in Table (on next slide) and dividing the total by 200
provides the sample average lifetime for the lightbulbs: 76 hours. We can use this sample
result to estimate that the average lifetime for the lightbulbs in the population is 76 hours.

57
Example - Norris Electronics

58
Statistical Inference Process for Norris Electronics
1 Population consists of all bulbs manufactured with the new filament. Average
lifetime is unknown.
2 A sample of 200 bulbs is examined.
3 The sample data provide a sample average lifetime of 76 hours per bulb.
4 The sample average is used to make an estimate of the population average.
59
Example - Nielsen Media Research

60
HOMEWORK PROBLEMS

61
Home Work : Question-1
The Rathburn Manufacturing Company makes electric wiring, which it sells to contractors in
the construction industry. Approximately 900 electric contractors purchase wire from
Rathburn annually. Rathburn’s director of marketing wants to determine electric
contractors’ satisfaction with Rathburn’s wire. He developed a questionnaire that yields a
satisfaction score between 10 and 50 for participant responses. A random sample of 35 of
the 900 contractors is asked to complete a satisfaction survey. The satisfaction scores for
the 35 participants are averaged to produce a mean satisfaction score.
1. What is the population for this study?
2. What is the sample for this study?
3. What is the statistic for this study?
4. What would be a parameter for this study?

62
Home Work : Question-2
The Food and Drug Administration (FDA) reported the number of new drugs approved over
an eight-year period (The Wall Street Journal, January 12, 2004). Figure 1.9 provides a bar
chart summarizing the number of new drugs approved each year.
 Are the data categorical or quantitative?
 Are the data time series or cross-sectional?
 How many new drugs were approved in 2003?
 In what year were the fewest new drugs approved? How many?
 Comment on the trend in the number of new drugs approved by the FDA over the
eight-year period.

63
Home Work : Question-2

64
65
