Statistics Week 2 T40i
Definition of statistics:
Statistics is the art and science of collecting,
analyzing, presenting, and interpreting data, and of
using data to make predictions.
The objective of statistics is to give managers and
decision makers a better understanding of the business
and economic environment and thus enable them to
make better-informed decisions.
Descriptive statistics and statistical inference are ways
of converting data into meaningful and easily
interpreted statistical information.
2. Applications in Business and Economics
Economics
Economists frequently provide forecasts about the future
of the economy or some aspect of it. They use a variety of
statistical information in making such forecasts. For
instance, in forecasting inflation rates, economists use
statistical information on such indicators as the Producer
Price Index, the unemployment rate, and manufacturing
capacity utilization. Often these statistical indicators are
entered into computerized forecasting models that predict
inflation rates.
The Statistics in Practice applications show the importance
of statistics in a wide variety of business and economic
situations.
3. Data
Data are the facts and figures collected, analyzed, and
summarized for presentation and interpretation.
All the data collected in a particular study are referred
to as the data set for the study.
Elements are the entities on which data are collected.
A variable is a characteristic of interest for the
elements.
Measurements collected on each variable for every
element in a study provide the data. The set of
measurements obtained for a particular element is
called an observation.
Example 1:
Consider a data set for the following five stocks:
Stock              Annual Sales     Earnings per    Exchange
                   (in millions)    share ($)       (where to trade)
Cache Inc.              86.6           0.25         OTC
Koss Corp               36.1           0.89         OTC
Par Technology          81.2           0.32         NYSE
Scientific Tech.        17.3           0.46         OTC
Western Beef           273.7           0.78         OTC
Note: OTC stands for “over the counter” while NYSE stands for “New York Stock Exchange”.
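As a rough sketch, the terms element, variable, and observation can be mapped onto the five-stock data set in Python. The dictionary layout and the names `data_set`, `elements`, `variables`, and `observation` are illustrative choices, not part of the original notes.

```python
# Data set from Example 1: each key is an element (a stock) and each
# value holds the measurements for that element, one per variable.
data_set = {
    "Cache Inc.":       {"annual_sales": 86.6,  "earnings_per_share": 0.25, "exchange": "OTC"},
    "Koss Corp":        {"annual_sales": 36.1,  "earnings_per_share": 0.89, "exchange": "OTC"},
    "Par Technology":   {"annual_sales": 81.2,  "earnings_per_share": 0.32, "exchange": "NYSE"},
    "Scientific Tech.": {"annual_sales": 17.3,  "earnings_per_share": 0.46, "exchange": "OTC"},
    "Western Beef":     {"annual_sales": 273.7, "earnings_per_share": 0.78, "exchange": "OTC"},
}

# Elements: the entities on which data are collected (the stocks).
elements = list(data_set)

# Variables: the characteristics recorded for every element.
variables = list(next(iter(data_set.values())))

# Observation: the set of measurements for one particular element.
observation = data_set["Koss Corp"]

# Total number of measurements = elements x variables.
total_measurements = len(elements) * len(variables)

print("elements:", len(elements))
print("variables:", len(variables))
print("observation for Koss Corp:", observation)
```

With five elements and three variables, the data set contains 15 measurements in total.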
Example 1 (continued)
Qualitative data: OTC, OTC, NYSE, OTC and OTC
Quantitative data: 86.6, 36.1, 81.2, 17.3, 273.7, 0.25, 0.89, 0.32,
0.46 and 0.78.
The variable “Exchange” is referred to as a qualitative variable.
The variables “Annual Sales” and “Earnings per share” are
referred to as quantitative variables.
Note: quantitative data are always numeric, but qualitative
data may be either numeric or nonnumeric. For example, ID
numbers and automobile license plate numbers are qualitative
data.
Note: ordinary arithmetic operations are meaningful only with
quantitative data and are not meaningful with qualitative
data.
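The following minimal sketch shows why the distinction matters in practice, using the Example 1 values: an average is meaningful for the quantitative variable "Annual Sales", while the qualitative variable "Exchange" is summarized with counts and percentages. The variable names are illustrative assumptions.

```python
from collections import Counter
from statistics import mean

# Quantitative variable from Example 1: arithmetic is meaningful.
annual_sales = [86.6, 36.1, 81.2, 17.3, 273.7]

# Qualitative variable from Example 1: only counting is meaningful.
exchange = ["OTC", "OTC", "NYSE", "OTC", "OTC"]

# Average annual sales across the five stocks.
average_sales = mean(annual_sales)

# Frequency of each exchange, and the same frequencies as percentages.
exchange_counts = Counter(exchange)
exchange_percent = {
    name: 100 * count / len(exchange)
    for name, count in exchange_counts.items()
}

print("average annual sales:", round(average_sales, 2))
print("exchange counts:", dict(exchange_counts))
print("exchange percentages:", exchange_percent)
```

Trying to average the exchange labels would fail or be meaningless, which is the point of the note above.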
4. Cross-sectional and Time Series Data
For purposes of statistical analysis, distinguishing between
cross-sectional data and time series data is important.
Cross-sectional data are data collected at a single point in time
(the moment of data collection), rather than over a period of time.
They provide information on the characteristics of, and statistical
relationships between, individual units of study at that moment.
Time series data, by contrast, are data collected over several
time periods. Please see Table 1.2 below for an illustration.
Table 1.2 Cross-sectional vs. time series data (sales)
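One way to see the difference is to store sales records as (company, year, sales) and slice them two ways. This is only a sketch: the companies, years, and sales figures below are made up for illustration and are not the contents of Table 1.2, and the helper names `cross_section` and `time_series` are hypothetical.

```python
# Hypothetical sales records: (company, year, sales). All values are invented.
records = [
    ("Company A", 2021, 10.0), ("Company A", 2022, 12.0), ("Company A", 2023, 15.0),
    ("Company B", 2021, 20.0), ("Company B", 2022, 19.0), ("Company B", 2023, 22.0),
    ("Company C", 2021, 5.0),  ("Company C", 2022, 7.0),  ("Company C", 2023, 6.0),
]

def cross_section(records, year):
    """Many units observed at one point in time."""
    return {company: sales for company, y, sales in records if y == year}

def time_series(records, company):
    """One unit observed over several time periods, in time order."""
    return sorted((y, sales) for c, y, sales in records if c == company)

# Cross-sectional data: sales of every company in 2022.
sales_2022 = cross_section(records, 2022)

# Time series data: sales of Company A across all years.
company_a_sales = time_series(records, "Company A")

print("cross-section for 2022:", sales_2022)
print("time series for Company A:", company_a_sales)
```

The cross-section fixes the time and varies the unit; the time series fixes the unit and varies the time.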
5. Statistical Inference
Many situations require data for a large group of elements (individuals,
companies, voters, households, products, customers and so on). Because
of time, cost, and other constraints, data can be collected from only a
small portion of the group. The larger group of elements in a particular
study is called the population.
The smaller group is called the sample.
A population is the set of all elements of interest in a particular study.
A sample is a subset of the population.
The process of conducting a survey to collect data for the entire
population is called a census.
The process of conducting a survey to collect data for a sample is called
a sample survey.
As one of its major contributions, statistics uses data from a sample to
make estimates and test hypotheses about the characteristics of a
population through a process referred to as statistical inference.
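The idea can be sketched in a few lines of Python: draw a sample from a population and use the sample mean as an estimate of the population mean. The population below is hypothetical (the integers 1 to 1000 stand in for one measurement per element), and the sample size of 50 and the fixed seed are arbitrary assumptions made so the run is repeatable.

```python
import random
from statistics import mean

# Hypothetical population: one numeric measurement for each of 1000 elements.
population = list(range(1, 1001))

# A census uses every element, so it gives the true population mean.
population_mean = mean(population)

# A sample survey uses only a subset, drawn here without replacement.
rng = random.Random(0)  # fixed seed, assumed only for repeatability
sample = rng.sample(population, 50)

# Statistical inference: the sample mean serves as an estimate of the
# population mean, and the difference is the sampling error.
sample_mean = mean(sample)
sampling_error = sample_mean - population_mean

print("population mean (census):", population_mean)
print("sample mean (estimate):", sample_mean)
print("sampling error:", sampling_error)
```

In real studies the population mean is unknown, which is exactly why the sample estimate is needed; it is computed here only to show how close the estimate comes.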
Exercises
1. Discuss the differences between statistics as numerical facts and statistics
as a discipline or field of study.
2. The U.S. Department of Energy provides fuel economy information for a
variety of motor vehicles. A sample of 10 automobiles is shown in Table 1.6
(Fuel Economy website, February 22, 2008). Data show the size of the
automobile (compact, midsize, or large), the number of cylinders in the
engine, the city driving miles per gallon, the highway driving miles per
gallon, and the recommended fuel (diesel, premium, or regular).
a. How many elements are in this data set?
b. How many variables are in this data set?
c. Which variables are categorical and which variables are quantitative?
3. Refer to Table 1.6.
a. What is the average miles per gallon for city driving?
b. On average, how much higher is the miles per gallon for highway
driving as compared to city driving?
c. What percentage of the cars have four-cylinder engines?
d. What percentage of the cars use regular fuel?
4. Table 1.7 shows data for seven colleges and universities. The endowment
(in billions of dollars) and the percentage of applicants admitted are
shown (USA Today, February 3, 2008). The state each school is located in,
the campus setting, and the NCAA Division for varsity teams were
obtained from the National Center for Education Statistics website,
February 22, 2008.
a. How many elements are in the data set?
b. How many variables are in the data set?
c. Which of the variables are categorical and which are quantitative?
5. Consider the data set in Table 1.7
a. Compute the average endowment for the sample.
b. Compute the average percentage of applicants admitted.
c. What percentage of the schools have NCAA Division III varsity
teams?
d. What percentage of the schools have a City: Midsize campus
setting?
7. The Financial Times/Harris Poll is a monthly online poll of adults from six
countries in Europe and the United States. A January poll included 1015
adults in the United States. One of the questions asked was, “How would
you rate the Federal Bank in handling the credit problems in the financial
markets?” Possible responses were Excellent, Good, Fair, Bad, and Terrible
(Harris Interactive website, January 2008).
a. What was the sample size for this survey?
b. Are the data categorical or quantitative?
c. Would it make more sense to use averages or percentages as a
summary of the data for this question?
d. Of the respondents in the United States, 10% said the Federal
Bank is doing a good job. How many individuals provided this
response?
9. The Wall Street Journal (WSJ) subscriber survey (October 13, 2003) asked 46
questions about subscriber characteristics and interests. State whether each
of the following questions provided categorical or quantitative data.
a. What is your age?
b. Are you male or female?
c. When did you first start reading the WSJ? High school, college, early
career, midcareer, late career, or retirement?
d. How long have you been in your present job or position?
e. What type of vehicle are you considering for your next purchase?
Nine response categories include sedan, sports car, SUV, minivan, and
so on.
10. The Hawaii Visitors Bureau collects data on visitors to Hawaii. The
following questions were among 16 asked in a questionnaire handed out
to passengers during incoming airline flights in June 2003.
• This trip to Hawaii is my: 1st, 2nd, 3rd, 4th, etc;
• The primary reason for this trip is: (10 categories including vacation,
convention, honeymoon);
• Where I plan to stay: (11 categories including hotel, apartment,
relatives, camping) and
• Total days in Hawaii
a. What is the population being studied?
b. Is the use of a questionnaire a good way to reach the population
of passengers on incoming airline flights?
c. Comment on each of the four questions in terms of whether it
will provide categorical or quantitative data.
11. Figure 1.8 provides a bar chart showing the amount of federal
spending for the years 2002 to 2008 (USA Today, February 5, 2008).
a. What is the variable of interest?
b. Are the data categorical or quantitative?
c. Are the data time series or cross-sectional?
d. Comment on the trend in federal spending over time.
12. A survey of 430 business travelers found 155 business travelers used a
travel agent to make the travel arrangements (USA Today, November 20,
2003).
a. Develop a descriptive statistic that can be used to estimate the
percentage of all business travelers who use a travel agent to make
travel arrangements.
b. The survey reported that the most frequent way business
travelers make travel arrangements is by using an online travel site.
If 44% of business travelers surveyed made their arrangements this
way, how many of the 430 business travelers used an online travel
site?
c. Are the data on how travel arrangements are made categorical or
quantitative?