BA Notes

1. BUSINESS ANALYTICS:

OVERVIEW OF BUSINESS ANALYTICS:


Business analytics, or simply analytics, is the use of data, information technology,
statistical analysis, quantitative methods, and mathematical or computer-based models to help
managers gain improved insight about their business operations and make better, fact-based
decisions. Business analytics is “a process of transforming data into actions through analysis and
insights in the context of organizational decision making and problem solving.” Business analytics
is supported by various tools such as Microsoft Excel and various Excel add-ins, commercial
statistical software packages such as SAS or Minitab, and more complex business intelligence
suites that integrate data with analytical software.

USAGE OF BA: Leading banks use analytics to predict and prevent credit fraud. Manufacturers
use analytics for production planning, purchasing, and inventory management. Retailers use
analytics to recommend products to customers and optimize marketing promotions.
Pharmaceutical firms use it to get life-saving drugs to market more quickly. Airlines and hotels use
analytics to dynamically set prices over time to maximize revenue. Even sports teams are using
business analytics to determine both game strategy and optimal ticket prices.
One of the emerging applications of analytics is helping businesses learn from social media and
exploit social media data for strategic advantage. Using analytics, firms can integrate social media
data with traditional data sources such as customer surveys, focus groups, and sales data;
understand trends and customer perceptions of their products; and create informative reports to
assist marketing managers and product designers.
The evolution of analytics began with the introduction of computers in the late 1940s and their
development through the 1960s and beyond. Early computers provided the ability to store and
analyze data in ways that were very difficult or impossible to do manually. This facilitated
the collection, management, analysis, and reporting of data, which is often called business
intelligence (BI), a term that was coined in 1958 by an IBM researcher, Hans Peter Luhn.
Using BI, we can create simple rules to flag exceptions automatically. BI has since evolved into the
modern discipline we now call information systems (IS).

STATISTICS: Statistical methods allow us to gain a richer understanding of data that goes
beyond business intelligence reporting by not only summarizing data succinctly but also finding
unknown and interesting relationships among the data. Statistical methods include the basic tools
of description, exploration, estimation, and inference, as well as more advanced techniques like
regression, forecasting, and data mining. Many OR/MS (operations research/management
science) applications use modeling and optimization—techniques for translating real problems into
mathematics, spreadsheets, or other computer languages, and using them to find the best
(“optimal”) solutions and decisions.
Decision support systems (DSS) began to evolve in the 1960s by combining business
intelligence concepts with OR/MS models to create analytical-based computer systems to support
decision making.
DSSs include three components:
1. Data management. The data management component includes databases for storing data and
allows the user to input, retrieve, update, and manipulate data.
2. Model management. The model management component consists of various statistical tools
and management science models and allows the user to easily build, manipulate, analyze, and solve
models.
3. Communication system. The communication system component provides the interface
necessary for the user to interact with the data and model management components. DSSs have
been used for many applications, including pension fund management, portfolio management,
work-shift scheduling, global manufacturing and facility location, advertising-budget allocation,
media planning, distribution planning, airline operations planning, inventory control, library
management, classroom assignment, nurse scheduling, blood distribution, water pollution control,
ski-area design, police-beat design, and energy planning.
Data mining is focused on better understanding characteristics and patterns among variables in
large databases using a variety of statistical and analytical tools. Many standard statistical tools as
well as more advanced ones are used extensively in data mining. Simulation and risk analysis rely
on spreadsheet models and statistical analysis to examine the impacts of uncertainty in the
estimates and their potential interaction with one another on the output variable of interest.
Spreadsheets and formal models allow one to manipulate data to perform what-if analysis—how
specific combinations of inputs that reflect key assumptions will affect model outputs. What-if
analysis is also used to assess the sensitivity of optimization models to changes in data inputs and
provide better insight for making good decisions.
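As a rough illustration of what-if analysis outside a spreadsheet, the short Python sketch below (the profit model, prices, and costs are invented for illustration, not taken from these notes) evaluates a simple model for several combinations of input assumptions:

# A minimal what-if sketch: evaluate a simple profit model over
# several combinations of key input assumptions (all numbers hypothetical).
def profit(units_sold, unit_price, unit_cost, fixed_cost):
    # Spreadsheet-style model: profit = revenue - variable cost - fixed cost
    return units_sold * (unit_price - unit_cost) - fixed_cost

for units in (800, 1000, 1200):           # demand scenarios
    for price in (40.0, 45.0):            # pricing scenarios
        p = profit(units, price, unit_cost=25.0, fixed_cost=12000.0)
        print(f"units={units:5d}  price={price:5.1f}  profit={p:10.2f}")

Each row of output corresponds to one what-if scenario, which is the same kind of table a spreadsheet what-if analysis would produce.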
Visualization:
Visualizing data and the results of analyses provides a way of easily communicating data at all
levels of a business and can reveal surprising patterns and relationships.
Although many good analytics software packages are available to professionals, we use Microsoft
Excel and a powerful add-in called Analytic Solver Platform.
---------------------------------------------------------------------------------------------------------------------

SCOPE OF BUSINESS ANALYTICS:


Business analytics begins with the collection, organization, and manipulation of data and is
supported by three major components:
1. Descriptive analytics:
the use of data to understand past and current business performance and make informed
decisions. Descriptive analytics is the most commonly used and most well-understood type of
analytics. These techniques categorize, characterize, consolidate, and classify data to convert it into
useful information for the purposes of understanding and analyzing business performance.
Descriptive analytics summarizes data into meaningful charts and reports, for example,
about budgets, sales, revenues, or cost. This process allows managers to obtain standard and
customized reports and then drill down into the data and make queries to understand the impact of
an advertising campaign. Descriptive analytics also helps companies to classify customers into
different segments, which enables them to develop specific marketing campaigns and advertising
strategies.
2. Predictive analytics: seeks to predict the future by examining historical data, detecting patterns or
relationships in these data, and then extrapolating these relationships forward in time. Predictive
analytics can predict risk and find relationships in data not readily apparent with traditional
analyses. Using advanced techniques, predictive analytics can help to detect hidden patterns in
large quantities of data to segment and group data into coherent sets to predict behavior and detect
trends.
3. Prescriptive analytics: uses optimization to identify the best alternatives to minimize or
maximize some objective. Prescriptive analytics is used in many areas of business, including
operations, marketing, and finance. The mathematical and statistical techniques of predictive
analytics can also be combined with optimization to make decisions that take into account the
uncertainty in the data.
A wide variety of tools are used to support business analytics. These include:
• Database queries and analysis
• “Dashboards” to report key performance measures
• Data visualization
• Statistical methods
• Spreadsheets and predictive models
• Scenario and “what-if” analyses
• Simulation
• Forecasting
• Data and text mining
• Optimization
• Social media, Web, and text analytics

Software Support:
Many companies, such as IBM, SAS, and Tableau have developed a variety of software
and hardware solutions to support business analytics. For example, IBM’s Cognos Express, an
integrated business intelligence and planning solution designed to meet the needs of midsize
companies, provides reporting, analysis, dashboard, scorecard, planning, budgeting, and
forecasting capabilities. It’s made up of several modules, including Cognos Express Reporter, for
self-service reporting and ad hoc query; Cognos Express Advisor, for analysis and visualization;
and Cognos Express Xcelerator, for Excel-based planning and business analysis. Information is
presented to the business user in a business context that makes it easy to understand; with an
easy-to-use interface, users can quickly gain the insight they need from their data to make the right
decisions and then take action for effective and efficient business outcomes. SAS
provides a variety of software that integrates data management, business intelligence, and analytics
tools. SAS Analytics covers a wide range of capabilities, including predictive modeling and data
mining, visualization, forecasting, optimization and model management, statistical analysis, text
analytics, and more. Tableau Software provides simple drag-and-drop tools for visualizing data
from spreadsheets and other databases.
BUSINESS ANALYTICS PROCESS:

Fig: Business Analytic Process


The size of some data sources can be unmanageable, overly complex, and generally confusing.
Sorting out data and trying to make sense of its informational value requires the application of
descriptive analytics as a first step in the BA process.
One might begin simply by sorting the data into groups using the four possible classifications
presented in Table below.

Table : Types of Data Measurement Classification Scales


From Step 1 in the Descriptive Analytic analysis some patterns or variables of business
behavior should be identified representing targets of business opportunities and possible (but not
yet defined) future trend behavior. Additional effort (more mining) might be required, such as the
generation of detailed statistical reports narrowly focused on the data related to targets of business
opportunities to explain what is taking place in the data (what happened in the past). This is like a
statistical search for predictive variables in data that may lead to patterns of behavior a firm might
take advantage of if the patterns of behavior occur in the future. For example, a firm might find in
its general sales information that during economic downtimes, certain products are sold to
customers of a particular income level if certain advertising is undertaken. The sales, customers,
and advertising variables may be in the form of any of the measurable scales for data in the table, but
they have to meet the three conditions of BA previously mentioned: clear relevancy to business,
an implementable resulting insight, and performance and value measurement capabilities.
To determine whether observed trends and behavior found in the relationships of the
descriptive analysis of Step 1 actually exist or hold true and can be used to forecast or predict the
future, more advanced analysis is undertaken in Step 2, Predictive Analytic analysis, of the BA
process. There are many methods that can be used in this step of the BA process. A commonly used
methodology is multiple regression. This methodology is ideal for establishing whether a statistical
relationship exists between the
predictive variables found in the descriptive analysis. The relationship might be to show that a
dependent variable is predictively associated with business value or performance of some kind.
For example, a firm might want to determine which of several promotion efforts (independent
variables measured and represented in the model by dollars in TV ads, radio ads, personal selling,
and/or magazine ads) is most efficient in generating customer sale dollars (the dependent variable
and a measure of business performance). Care would have to be taken to ensure the multiple
regression model was used in a valid and reliable way, which is why ANOVA and other statistical
confirmatory analyses are used to support the model development. Exploring a database using
advanced statistical procedures to verify and confirm the best predictive variables is an important
part of this step in the BA process. This answers the questions of what is currently happening and
why it happened between the variables in the model.

A single or multiple regression model can often forecast a trend line into the future. When
regression is not practical, other forecasting methods (exponential smoothing, smoothing
averages) can be applied as predictive analytics to develop needed forecasts of business trends.

The identification of future trends is the main output of Step 2 and the predictive
analytics used to find them. This helps answer the question of what will happen. If a firm
knows where the future lies by forecasting trends as they would in Step 2 of the BA process,
it can then take advantage of any possible opportunities predicted in that future state. In Step
3, Prescriptive Analytics analysis, operations research methodologies can be used to
optimally allocate a firm’s limited resources to take best advantage of the opportunities it
found in the predicted future trends. Limits on human, technology, and financial resources
prevent any firm from going after all opportunities they may have available at any one time.
Using prescriptive analytics allows the firm to allocate limited resources to optimally achieve
objectives as fully as possible.
In summary, the three major components of descriptive, predictive, and prescriptive analytics
arranged as steps in the BA process can help a firm find opportunities in data, predict trends that
forecast future opportunities, and aid in selecting a course of action that optimizes the firm’s
allocation of resources to maximize value and performance.
-------------------------------------------------------------------------------------------------------------------
Relationship of BA Process and Organization Decision-Making Process:
The BA process can solve problems and identify opportunities to
improve business performance. In the process, organizations may also determine strategies to
guide operations and help achieve competitive advantages. Typically, solving problems and
identifying strategic opportunities to follow are organization decision-making tasks. The
latter, identifying opportunities, can be viewed as a problem of strategy choice requiring a
solution. It should come as no surprise that the BA process described above closely parallels
classic organization decision-making processes. As depicted below, the business analytic
process has an inherent relationship to the steps in typical organization decision-making
processes.

Figure. Comparison of business analytics and organization decision-making processes.


The organization decision-making process (ODMP) developed by Elbing (1970) and
presented in the figure was designed for solving problems but could also be applied to finding
opportunities in data and deciding what is the best course of action to take advantage of them. The five-step ODMP begins
with the perception of disequilibrium, or the awareness that a problem exists that needs a
decision. Similarly, in the BA process, the first step is to recognize that databases may contain
information that could both solve problems and find opportunities to improve business
performance. Then in Step 2 of the ODMP, an exploration of the problem to determine its size,
impact, and other factors is undertaken to diagnose what the problem is. Likewise, the BA
descriptive analytic analysis explores factors that might prove useful in solving problems and
offering opportunities. The ODMP problem statement step is similarly structured to the BA
predictive analysis to find strategies, paths, or trends that clearly define a problem or opportunity
for an organization to solve.
Finally, the ODMP’s last steps of strategy selection and implementation involve the same kinds
of tasks that the BA process requires in the final prescriptive step (make an optimal selection of
resource allocations that can be implemented for the betterment of the organization).

The decision-making foundation that has served ODMP for many decades parallels the BA
process. The same logic serves both processes and supports organization decision-making skills
and capacities.
------------------------------------------------------------------------------------------------------------------
COMPETITIVE ADVANTAGES OF BUSINESS ANALYTICS:

Companies that make plans that generate successful outcomes are winners in the marketplace.
Companies that do not effectively plan tend to be losers in the marketplace. Planning is a critical
part of running any business. If it is done right, it obtains the results that the planners desire.

Business organization planning is typically segmented into three types, presented in fig.
below. The planning process usually follows a sequence from strategy, down to tactical, and then
down to operational, although Fig shows arrows of activities going up and down the depicted
hierarchal structure of most business organizations.

The upward flow in fig, represents the information passed from lower levels up, and the
downward flow represents the orders that are passed from higher levels of management down to
lower levels for implementation. It can be seen in the Teece (2007) study and more recently in
Rha (2013) that the three steps in the BA process and strategic planning embody the same efforts
and steps.
Fig. Types of organization planning*
Effectively planning and passing down the right orders in hopes of being a business winner
requires good information on which orders can be decided. Some information can become so
valuable that it provides the firm a competitive advantage (the ability of one business to perform
at a higher level, staying ahead of present competitors in the same industry or market). Business
analytics can support all three types of planning with useful information that can give a firm a
competitive advantage. Examples of the ways BA can help firms achieve a competitive
advantage are presented in table below:
Table : Ways BA Can Help Achieve a Competitive Advantage
1.2. STATISTICAL TOOLS:
Statistical Notation:
A population consists of all items of interest for a particular decision or investigation, for
example, all individuals in the United States who do not own cell phones, all subscribers to
Netflix, or all stockholders of Google. A company like Netflix keeps extensive records on its
customers, making it easy to retrieve data about the entire population of customers. However, it
would probably be impossible to identify all individuals who do not own cell phones.

A sample is a subset of a population. For example, a list of individuals who rented a
comedy from Netflix in the past year would be a sample from the population of all customers.
Whether this sample is representative of the population of customers (which depends on how the
sample data are intended to be used) may be debatable; nevertheless, it is a sample. Most
populations, even if they are finite, are generally too large to deal with effectively or practically.
For instance, it would be impractical as well as too expensive to survey the entire population of
TV viewers in the United States. Sampling is also clearly necessary when data must be obtained
from destructive testing or from a continuous production process. Thus, the purpose of sampling
is to obtain sufficient information to draw a valid inference about a population. Market
researchers, for example, use sampling to gauge consumer perceptions on new or existing goods
and services; auditors use sampling to verify the accuracy of financial statements; and quality
control analysts sample production output to verify quality levels and identify opportunities for
improvement.

Understanding Statistical Notation:

We typically label the elements of a data set using subscripted variables, x1, x2, … , and
so on. In general, xi represents the ith observation. It is a common practice in statistics to use
Greek letters, such as μ (mu), σ (sigma), and π (pi), to represent population measures and italic
letters such as x̄ (x-bar), s, and p to represent sample statistics. We will use N to represent the
number of items in a population and n to represent the number of observations in a sample.
Statistical formulas often contain a summation operator, Σ (Greek capital sigma), which means
that the terms that follow it are added together.
Thus, Σ xi for i = 1 to n means x1 + x2 + … + xn. Understanding these conventions and mathematical
notation will help you to interpret and apply statistical formulas.
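As a tiny illustration of this notation (the numbers are made up), the following Python lines compute x1 + x2 + … + xn for a sample of n observations:

# Summation notation: sum of x_i for i = 1 to n
x = [3, 7, 2, 9, 4]                  # x1 ... xn, so n = 5
n = len(x)
total = sum(x[i] for i in range(n))  # x1 + x2 + ... + xn
print(n, total)                      # prints: 5 25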
-------------------------------------------------------------------------------------------------------------------
DESCRIPTIVE STATISTICAL METHODS:
Descriptive statistics describes or summarizes the basic features or characteristics of the
data. It assigns numerical values to describe the trends in the samples collected. It converts large
volumes of data into a simpler, more meaningful format that is easier to understand
and interpret. Paired with graphs and tables, descriptive statistics offer a clear summary of the
data’s complete collection.

In descriptive statistics, interpretation of the data at hand is the primary purpose, while inferential
statistics make predictions for a larger set of data based on the descriptive values obtained.
Hence, descriptive statistics form the first step and the basis of quantitative data analysis.
METHODS USED IN DESCRIPTIVE STATISTICS:

(1)Measures of location: provide estimates of a single value that in some fashion represents the
“centering” of a set of data. The most common is the average. We all use averages routinely in our
lives, for example, to measure student accomplishment in college (e.g., grade point average), to
measure the performance of sports teams (e.g., batting average), and to measure performance in
business (e.g., average delivery time).
(a)Arithmetic Mean :
The average is formally called the arithmetic mean (or simply the mean), which is the sum of
the observations divided by the number of observations. Mathematically, the mean of a population
is denoted by the Greek letter μ, and the mean of a sample is denoted by x̄. If a population consists
of N observations x1, x2, …, xN, the population mean, μ, is calculated as

μ = (x1 + x2 + … + xN) / N = Σ xi / N

The mean of a sample of n observations, x1, x2, …, xn, denoted by x̄, is calculated as

x̄ = (x1 + x2 + … + xn) / n = Σ xi / n

Note that the calculations for the mean are the same whether we are dealing with a
population or a sample; only the notation differs. We may also calculate the mean in Excel using
the function AVERAGE(data range). One property of the mean is that the sum of the deviations of
each observation from the mean is zero:

Σ (xi − x̄) = 0

This simply means that the sum of the deviations above the mean are the same as the sum
of the deviations below the mean; essentially, the mean “balances” the values on either side of it.
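As a small check of these ideas (with hypothetical data, not from the notes), the Python sketch below computes the mean the same way Excel's AVERAGE would and verifies that the deviations from the mean sum to zero:

import statistics

data = [12.0, 15.0, 9.0, 20.0, 14.0]     # hypothetical observations

# Same calculation for a population or a sample (Excel: AVERAGE(data range))
mean = statistics.mean(data)             # (12 + 15 + 9 + 20 + 14) / 5 = 14.0
print(mean)

# Property of the mean: deviations from it sum to zero (up to rounding)
print(sum(xi - mean for xi in data))     # 0.0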
(b) Median:
The measure of location that specifies the middle value when the data are arranged from
least to greatest is the median. Half the data are below the median, and half the data are above it.
For an odd number of observations, the median is the middle of the sorted numbers. For an even
number of observations, the median is the mean of the two middle numbers. We could use the Sort
option in Excel to rank-order the data and then determine the median. The Excel function
MEDIAN(data range) could also be used. The median is meaningful for ratio, interval, and ordinal
data. As opposed to the mean, the median is not affected by outliers.
(c) Mode:
A third measure of location is the mode. The mode is the observation that occurs most
frequently. The mode is most useful for data sets that contain a relatively small number of unique
values.
For data sets that have few repeating values, the mode does not provide much practical
value. You can easily identify the mode from a frequency distribution by identifying the value
having the largest frequency or from a histogram by identifying the highest bar. You may also use
the Excel function MODE.SNGL(data range).
For frequency distributions and histograms of grouped data, the mode is the group with the
greatest frequency. Some data sets have multiple modes; to identify these, you can use the Excel
function MODE.MULT(data range), which returns an array of modal values.
(d) Midrange:
A fourth measure of location that is used occasionally is the midrange. This is simply the
average of the greatest and least values in the data set. Caution must be exercised when using the
midrange because extreme values easily distort the result. This is
because the midrange uses only two pieces of data, whereas the mean uses all the data; thus, it is
usually a much rougher estimate than the mean and is often used for only small sample sizes.
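A rough Python counterpart of the Excel functions mentioned above (MEDIAN, MODE.SNGL, MODE.MULT), plus the midrange, using made-up data:

import statistics

data = [1, 2, 3, 4, 7, 7, 7, 9]          # hypothetical observations

print(statistics.median(data))           # mean of the two middle values -> 5.5
print(statistics.mode(data))             # most frequent value -> 7
print(statistics.multimode(data))        # all modes (Python 3.8+) -> [7]

midrange = (max(data) + min(data)) / 2   # average of greatest and least values
print(midrange)                          # (9 + 1) / 2 = 5.0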
(2) Measures of Dispersion:
Dispersion refers to the degree of variation in the data, that is, the numerical spread (or
compactness) of the data. Several statistical measures characterize dispersion: the range, variance,
and standard deviation.
(a)Range:
The range is the simplest and is the difference between the maximum value and the
minimum value in the data set. Although Excel does not provide a function for the range, it can be
computed easily by the formula = MAX(data range) - MIN(data range). Like the midrange, the
range is affected by outliers and, thus, is often only used for very small data sets.
(b)Interquartile Range:
The difference between the first and third quartiles, Q3 - Q1, is often called the interquartile
range (IQR), or the midspread. This includes only the middle 50% of the data and, therefore, is not
influenced by extreme values. Thus, it is sometimes used as an alternative measure of dispersion.
(c) Variance:
A more commonly used measure of dispersion is the variance, whose computation depends
on all the data. The larger the variance, the more the data are spread out from the mean and the
more variability one can expect in the observations. The formula used for calculating the variance
is different for populations and samples.
The formula for the variance of a population is

σ² = Σ (xi − μ)² / N

where xi is the value of the ith item,
N is the number of items in the population, and
μ is the population mean.

Essentially, the variance is the average of the squared deviations of the observations from
the mean. A significant difference exists between the formulas for computing the variance of a
population and that of a sample.
The variance of a sample is calculated using the formula

s² = Σ (xi − x̄)² / (n − 1)

where n is the number of items in the sample and
x̄ is the sample mean.

The Excel function VAR.S(data range) may be used to compute the sample variance, s²,
whereas the Excel function VAR.P(data range) is used to compute the variance of a population, σ².

(d) Standard Deviation:


The standard deviation is the square root of the variance. For a population, the standard
deviation is computed as

σ = √( Σ (xi − μ)² / N )

and for samples, it is

s = √( Σ (xi − x̄)² / (n − 1) )

The Excel function STDEV.P(data range) calculates the standard deviation for a population
(σ); the function STDEV.S(data range) calculates it for a sample (s).
The standard deviation is generally easier to interpret than the variance because its units of
measure are the same as the units of the data. Thus, it can be more easily related to the mean or
other statistics measured in the same units.
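The population-versus-sample distinction can be checked quickly with Python's statistics module, which mirrors Excel's VAR.P/VAR.S and STDEV.P/STDEV.S (the data are hypothetical):

import statistics

data = [12.0, 15.0, 9.0, 20.0, 14.0]

print(statistics.pvariance(data))   # population variance (divide by N)
print(statistics.variance(data))    # sample variance (divide by n - 1)
print(statistics.pstdev(data))      # population standard deviation
print(statistics.stdev(data))       # sample standard deviation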

(e) Standardized Values:


A standardized value, commonly called a z-score, provides a relative measure of the
distance an observation is from the mean, which is independent of the units of measurement.
The z-score for the ith observation in a data set is calculated as follows:

zi = (xi − x̄) / s
We subtract the sample mean from the ith observation, xi, and divide the result by the
sample standard deviation. In this formula, the numerator represents the distance that xi is from the
sample mean; a negative value indicates that xi lies to the left of the mean, and a positive value
indicates that it lies to the right of the mean. By dividing by the standard deviation, s, we scale the
distance from the mean to express it in units of standard deviations. Thus, a z-score of 1.0 means
that the observation is one standard deviation to the right of the mean; a z-score of -1.5 means that
the observation is 1.5 standard deviations to the left of the mean.
Thus, even though two data sets may have different means and standard deviations, the
same z-score means that the observations have the same relative distance from their respective
means. Z-scores can be computed easily on a spreadsheet; however, Excel has a function that
calculates it directly, STANDARDIZE(x, mean, standard_dev).
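A short sketch of the same calculation in Python, equivalent in spirit to Excel's STANDARDIZE (the data are made up):

import statistics

data = [12.0, 15.0, 9.0, 20.0, 14.0]
mean = statistics.mean(data)
s = statistics.stdev(data)                 # sample standard deviation

# z_i = (x_i - mean) / s
z_scores = [(xi - mean) / s for xi in data]
print([round(z, 2) for z in z_scores])     # relative distances from the mean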

(f) Coefficient of Variation:


The coefficient of variation (CV) provides a relative measure of the dispersion in data
relative to the mean and is defined as

CV = standard deviation / mean
Sometimes the coefficient of variation is multiplied by 100 to express it as a percent. This
statistic is useful when comparing the variability of two or more data sets when their scales differ.
The coefficient of variation provides a relative measure of risk to return. The smaller the coefficient
of variation, the smaller the relative risk is for the return provided. The reciprocal of the coefficient
of variation, called return to risk, is often used because it is easier to interpret.
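A small illustration (with invented return figures) comparing two hypothetical investments by coefficient of variation and return to risk:

import statistics

fund_a = [5.0, 7.0, 6.0, 8.0, 6.5]        # hypothetical annual returns (%)
fund_b = [2.0, 15.0, -4.0, 20.0, 1.0]

def coefficient_of_variation(data):
    # CV = standard deviation / mean (multiply by 100 for a percentage)
    return statistics.stdev(data) / statistics.mean(data)

for name, returns in (("Fund A", fund_a), ("Fund B", fund_b)):
    cv = coefficient_of_variation(returns)
    print(name, round(cv, 3), round(1 / cv, 3))   # CV and return to risk

The smaller CV (and larger return to risk) indicates the less risky alternative for the return it provides.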
(3) MEASURES OF SHAPE:

(a) Skewness:
describes the lack of symmetry of data. The coefficient of skewness (CS) measures
the degree of asymmetry of observations around the mean.

The coefficient of skewness is computed from the cubed deviations about the mean; in its simplest (population) form, CS = [Σ (xi − μ)³ / N] / σ³.
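A rough sketch of the calculation in Python, using the simple population form of the coefficient given above (the data are hypothetical; adjusted sample formulas differ slightly):

import statistics

data = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 10.0]
mean = statistics.mean(data)
sigma = statistics.pstdev(data)              # population standard deviation

# CS = average cubed deviation from the mean / sigma^3
cs = sum((xi - mean) ** 3 for xi in data) / (len(data) * sigma ** 3)
print(round(cs, 3))                          # positive value -> right (positive) skew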

REVIEW OF PROBABILITY DISTRIBUTION AND DATA MODELLING:


Chapter 5 (full) – Business Analytics by James Evans – page 157

SAMPLING AND ESTIMATION METHODS OVERVIEW:


Chapter 6 (full) – Business Analytics by James Evans – page 207
2. TRENDINESS AND REGRESSION ANALYSIS

MODELLING RELATIONSHIPS & TRENDS IN DATA:


Common types of mathematical functions used in predictive analytical models include the
following:

Linear function: y = a + bx. Linear functions show steady increases or decreases over the range of
x. This is the simplest type of function used in
predictive models. It is easy to understand, and over small ranges of values, can approximate
behavior rather well.
Logarithmic function: y = ln(x). Logarithmic functions are used when the rate
of change in a variable increases or decreases quickly and then levels out, such as
with diminishing returns to scale. Logarithmic functions are often used in marketing models where
constant percentage increases in advertising, for instance, result in constant, absolute increases in
sales.
Polynomial function: y = ax² + bx + c (second order—quadratic function),
y = ax³ + bx² + dx + e (third order—cubic function), and so on. A second-order
polynomial is parabolic in nature and has only one hill or valley; a third-order polynomial
has one or two hills or valleys. Revenue models that incorporate price elasticity are often
polynomial functions.
Power function: y = ax^b. Power functions define phenomena that increase at a specific rate.
Learning curves that express improving times in performing a task
are often modeled with power functions having a > 0 and b < 0.
Exponential function: y = ab^x. Exponential functions have the property that y
rises or falls at constantly increasing rates. For example, the perceived brightness
of a lightbulb grows at a decreasing rate as the wattage increases. In this case,
a would be a positive number and b would be between 0 and 1. The exponential function is often
defined as y = ae^x, where b = e, the base of natural logarithms (approximately 2.71828).
R² (R-squared) is a measure of the “fit” of the line to the data. The value of R² will be between 0
and 1. The larger the value of R², the better the fit.
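As a sketch of how R² is computed for a fitted straight line (the x and y values are invented, and numpy is assumed to be available):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

b, a = np.polyfit(x, y, deg=1)          # slope b and intercept a of y = a + b*x
y_hat = a + b * x                       # fitted values

ss_res = np.sum((y - y_hat) ** 2)       # unexplained (residual) variation
ss_tot = np.sum((y - y.mean()) ** 2)    # total variation about the mean
r_squared = 1 - ss_res / ss_tot         # closer to 1 means a better fit
print(round(r_squared, 4))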
----------------------------------------------------------------------------------------------------------------
SIMPLE LINEAR REGRESSION:
Is a tool for building mathematical and statistical models that characterize relationships
between a dependent variable (which must be a ratio variable and not categorical) and one or more
independent, or explanatory, variables, all of which are numerical (but may be either ratio or
categorical).
Two broad categories of regression models are used often in business settings:
(1) regression models of cross-sectional data and
(2) regression models of time-series data, in which the independent variables are time or
some function of time and the focus is on predicting the future.
Time-series regression is an important tool in forecasting. A regression model that involves
a single independent variable is called simple regression. A regression model that involves two or
more independent variables is called multiple regression.
Simple linear regression involves finding a linear relationship between one independent
variable, X, and one dependent variable, Y. The relationship between two variables can assume
many forms, as illustrated in Figure 8.5. The relationship may be linear or nonlinear, or there may
be no relationship at all. Because we are focusing our discussion on linear regression models, the
first thing to do is to verify that the relationship is linear, as in Figure 8.5(a). We would not expect
to see the data line up perfectly along a straight line; we simply want to verify that the general
relationship is linear. If the relationship is clearly nonlinear, as in Figure 8.5(b), then alternative
approaches must be used, and if no relationship is evident, as in Figure 8.5(c), then it is pointless
to even consider developing a linear regression model. To determine if a linear relationship exists
between the variables, we recommend that you create a scatter chart that can show the relationship
between variables visually.

Fig. 8.5 Examples of Variable Relationships: (a) linear, (b) nonlinear, (c) no relationship


The idea behind simple linear regression is to express the relationship between the
dependent and independent variables by a simple linear equation, such as
market value = a + b * square feet
where a is the y-intercept and b is the slope of the line.
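A minimal sketch of fitting this equation with scipy (the square-footage and market-value figures below are invented for illustration, not taken from the notes):

from scipy import stats

square_feet = [1500, 1750, 2000, 2200, 2500, 2800]
market_value = [88000, 95000, 105000, 112000, 124000, 135000]

result = stats.linregress(square_feet, market_value)
a, b = result.intercept, result.slope

print(f"market value = {a:.0f} + {b:.2f} * square feet")
print(f"R-squared = {result.rvalue ** 2:.3f}")

# Estimated market value of a hypothetical 2,400-square-foot house
print(round(a + b * 2400))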
---------------------------------------------------------------------------------------------------------------

IMPORTANT RESOURCES:
It is necessary to understand the resource needs of a BA program to better comprehend the
value of the information that BA provides. The need for BA resources varies from firm to firm to meet
particular decision support requirements. Some firms may choose to have a modest investment,
whereas other firms may have BA teams or a department of BA specialists. Regardless of the level
of resource investment, at minimum, a BA program requires resource investments in BA personnel,
data, and technology.
(1) Business Analytics Personnel
(2) Business analytics technology
(3) Business Analytics Data:
Structured and unstructured data is needed to generate analytics. As a beginning for
organizing data into an understandable framework, statisticians usually categorize data into
meaningful groups.
(A) Categorizing Data:

There are many ways to categorize business analytics data. Data is commonly
categorized by either internal or external sources. Typical examples of internal data sources
include those presented in Table 3.4. When firms try to solve internal production or
service operations problems, internally sourced data may be all that is needed. Typical
external sources of data (see Table 3.5) are numerous and provide great diversity and
unique challenges for BA to process. Data can be measured quantitatively (for example,
sales dollars) or qualitatively by preference surveys (for example, products compared
based on consumers preferring one product over another) or by the amount of consumer
discussion (chatter) on the Web regarding the pluses and minuses of competing products.

Table 3.4 Typical Internal Sources of Data on Which Business Analytics Can Be Based
Table 3.5 Typical External Sources of Data on Which Business Analytics Can Be Based

A major portion of the external data sources are found in the literature. For example, the
US Census and the International Monetary Fund (IMF) are useful data sources at the
macroeconomic level for model building.

(B) DATA ISSUES:
There are a couple of data issues that are critical to the usability of any database or data file. Those
issues are data quality and data privacy.

(A) Data quality:
can be defined as data that serves the purpose for which it is collected. It
means different things for different applications, but there are some commonalities of high-
quality data. These qualities usually include accurately representing reality, measuring what it
is supposed to measure, timeliness, and completeness. When data is of high quality,
it helps ensure competitiveness, aids customer service, and improves profitability. When data
is of poor quality, it can provide information that is contradictory, leading to misguided
decision-making.

For example, having missing data in files can prohibit some forms of statistical
modeling, and incorrect coding of information can completely render databases useless. Data
quality requires effort on the part of data managers to cleanse data of erroneous information
and repair or replace missing data.

(B) Data privacy:

refers to the protection of shared data such that access is permitted only to those
users for whom it is intended. It is a security issue that requires balancing the need to know
with the risks of sharing too much.
There are many risks in leaving unrestricted access to a company’s database. For
example, competitors can steal a firm’s customers by accessing addresses. Data leaks on
product quality failures can damage brand image, and customers can become distrustful
of a firm that shares information given in confidence. To avoid these issues, a firm needs
to abide by the current legislation regarding customer privacy and develop a program
devoted to data privacy.
Collecting and retrieving data and computing analytics requires the use of computers
and information technology. A large part of what BA personnel do is related to managing
information systems to collect, process, store, and retrieve data from various sources.
-------------------------------------------------------------------------------------------------------------
BUSINESS ANALYTICS PERSONNEL:

One way to identify personnel needed for BA staff is to examine what is required for
certification in BA by organizations that provide BA services. INFORMS, a major academic
and professional organization, announced the startup of its Certified Analytics Professional
(CAP) program in 2013.
Another more established organization, Cognizure, offers a variety of service products,
including business analytic services. It offers a general certification Business Analytics
Professional (BAP) exam that measures existing skill sets in BA staff and identifies areas
needing improvement. This is a tool to validate technical proficiency, expertise, and
professional standards in BA. The certification consists of three exams covering the content
areas listed in Table 3.1.

Table 3.1 Cognizure Organization Certification Exam Content Areas

Most of the content areas in Table 3.1 will be discussed and illustrated in
subsequent chapters and appendixes. The three exams required in the Cognizure certification
program can easily be understood in the context of the three steps of the BA process (descriptive,
predictive, and prescriptive).
The topics in Figure 3.1 of the certification program are applicable to the three major
steps in the BA process. The basic statistical tools apply to the descriptive analytics step, the
more advanced statistical tools apply to the predictive analytics step, and the operations research
tools apply to the prescriptive analytics step. Some of the tools can be applied to both the
descriptive and the predictive steps.

Likewise, tools like simulation can be applied to answer questions in both the predictive
and the prescriptive steps, depending on how they’re used. At the conjunction of all the tools is
the reality of case studies. The use of case studies is designed to provide practical experience
where all tools are employed to answer important questions or seek opportunities.

Figure 3.1 Certification content areas and their relationship to the steps in
BA
They also include specialized skill sets related to BA personnel (administrators, designers,
developers, solution experts, and specialists), as presented in Table 3.2.

Table 3.2 Types of BA Personnel


The variety of positions and roles participants play in the BA process leads to the question of what skill
sets or competencies are needed to function in BA. In a general sense, BA positions require competencies in business,
analytic, and information systems skills. As listed in Table 3.3, business skills involve basic management of people and
processes. BA personnel must communicate with BA staffers within the organization (the BA team members) and the
other functional areas within a firm (BA customers and users) to be useful. Because they serve a variety of functional
areas within a firm, BA personnel need to possess customer service skills so they can interact with the firm’s personnel
and understand the nature of the problems they seek to solve. BA personnel also need to sell their services to users
inside the firm. In addition, some must lead a BA team or department, which requires considerable interpersonal
management leadership skills and abilities.

Table 3.3 Select Types of BA Personnel Skills or CompetencyRequirements


Fundamental to BA is an understanding of analytic methodologies listed in Table 3.1 and others not listed. In
addition to any tool sets, there is a need to know how they are integrated into the BA process to leverage data (structured
or unstructured) and obtain information that customers who will be guided by the analytics desire.
------------------------------------------------------------------------------------------------------------------

DATA FOR BUSINESS ANALYTICS:


Data are numerical facts and figures that are collected through some type of measurement process. Information comes
from analyzing data—that is, extracting meaning from data to support evaluation and decision making.
Data are used in virtually every major function in a business. Modern organizations— which include not only for-profit
businesses but also nonprofit organizations—need good data to support a variety of company purposes, such as planning,
reviewing company performance, improving operations, and comparing company performance with competitors’ or
best-practice benchmarks. Some examples of how data are used in business include the following:

 Annual reports summarize data about companies’ profitability and market share both in numerical form and in
charts and graphs to communicate with shareholders.
 Accountants conduct audits to determine whether figures reported on a firm’s balance sheet fairly represent the
actual data by examining samples (that is, subsets) of accounting data, such as accounts receivable.
 Financial analysts collect and analyze a variety of data to understand the contribution that a business provides to
its shareholders. These typically include profitability, revenue growth, return on investment, asset utilization,
operating margins, earnings per share, economic value added (EVA), shareholder value, and other relevant
measures.
 Economists use data to help companies understand and predict population trends, interest rates, industry
performance, consumer spending, and international trade. Such data are often obtained from external sources such
as Standard & Poor’s Compustat data sets, industry trade associations, or government databases.
 Marketing researchers collect and analyze extensive customer data. These data often consist of demographics,
preferences and opinions, transaction and payment history, shopping behavior, and a lot more. Such data may be
collected by surveys, personal interviews, focus groups, or from shopper loyalty cards.
 Operations managers use data on production performance, manufacturing quality, delivery times, order accuracy,
supplier performance, productivity, costs, and environmental compliance to manage their operations.
 Human resource managers measure employee satisfaction, training costs, turnover, market innovation, training
effectiveness, and skills development.

Data Sets and Databases:


A data set is simply a collection of data. Marketing survey responses, a table of historical stock prices, and a collection
of measurements of dimensions of a manufactured item are examples of data sets. A database is a collection of related
files containing records on people, places, or things. The people, places, or things for which we store and maintain
information are called entities. A database for an online retailer that sells instructional fitness books and DVDs, for
instance, might consist of a file for three entities: publishers from which goods are purchased, customer sales
transactions, and product inventory. A database file is usually organized in a two-dimensional table, where the columns
correspond to each individual element of data (called fields, or attributes), and the rows represent records of related data
elements. A key feature of computerized databases is the ability to quickly relate one set of files to another.
Databases are important in business analytics for accessing data, making queries, and other data and information
management activities. Software such as Microsoft Access provides powerful analytical database capabilities. However,
in this book, we won’t be delving deeply into databases or database management systems but will work with individual
database files or simple data sets. Because spreadsheets are convenient tools for storing and manipulating data sets and
database files, we will use them for all examples and problems. Very large data sets, however, require advanced analytics
tools such as data mining and text analytics, and new technologies such as cloud computing, faster multi-core processors,
large memory spaces, and solid-state drives.

Metrics and Data Classification:


A metric is a unit of measurement that provides a way to objectively quantify performance. For example, senior
managers might assess overall business performance using such metrics as net profit, return on investment, market share,
and customer satisfaction. A plant manager might monitor such metrics as the proportion of defective parts produced or
the number of inventory turns each month. For a Web-based retailer, some useful metrics are the percentage of orders
filled accurately and the time taken to fill a customer’s order. Measurement is the act of obtaining data associated with a
metric. Measures are numerical values associated with a metric.
Metrics can be either discrete or continuous. A discrete metric is one that is derived from counting
something. For example, a delivery is either on time or not; an order is complete or incomplete; or an invoice can
have one, two, three, or any number of errors. Some discrete metrics associated with these examples would be the
proportion of on-time deliveries, the number of incomplete orders each day, and the number of errors per invoice.
Continuous metrics are based on a continuous scale of measurement. Any metrics involving dollars, length, time,
volume, or weight, for example, are continuous.
Another classification of data is by the type of measurement scale. Data may be classified into four groups:
(A) Categorical (nominal) data, which are sorted into categories according to specified characteristics. For
example, a firm’s customers might be classified by their geographical region (North America, South America,
Europe, and Pacific); employees might be classified as managers, supervisors, and associates. The categories bear
no quantitative relationship to one another, but we usually assign an arbitrary number to each category to ease
the process of managing the data and computing statistics. Categorical data are usually counted or expressed as
proportions or percentages.
(B) Ordinal data, which can be ordered or ranked according to some relationship to one another. College football or
basketball rankings are ordinal; a higher ranking signifies a stronger team but does not specify any numerical
measure of strength. Ordinal data are more meaningful than categorical data because data can be compared to one
another. A common example in business is data from survey scales—for example, rating a service as poor,
average, good, very good, or excellent. Such data are categorical but also have a natural order (excellent is better
than very good) and, consequently, are ordinal. However, ordinal data have no fixed units of measurement, so we
cannot make meaningful numerical statements about differences between categories. Thus, we cannot say that
the difference between excellent and very good is the same as between good and average, for example. Similarly,
a team ranked number 1 may be far superior to the number 2 team, whereas there may be little difference between
teams ranked 9th and 10th.
(C) Interval data, which are ordinal but have constant differences between observations and have arbitrary zero
points. Common examples are time and temperature. Time is relative to global location, and calendars have
arbitrary starting dates (compare, for example, the standard Gregorian calendar with the Chinese
calendar). Both the Fahrenheit and Celsius scales represent a specified measure of distance—degrees—but have
arbitrary zero points. Thus we cannot take meaningful ratios; for example, we cannot say that 50 degrees is twice
as hot as 25 degrees. However, we can compare differences. Another example is SAT or GMAT scores. The scores
can be used to rank students, but only differences between scores provide information on how much better one
student performed over another; ratios make little sense. In contrast to ordinal data, interval data allow meaningful
comparison of ranges, averages, and other statistics.
(D) Ratio data, which are continuous and have a natural zero. Most business and economic data, such as dollars and
time, fall into this category. For example, the measure dollars has an absolute zero. Ratios of dollar figures are
meaningful. For example, knowing that the Seattle region sold $12 million in March whereas the Tampa region
sold $6 million means that Seattle sold twice as much as Tampa.
DATA RELIABILITY AND VALIDITY:
Poor data can result in poor decisions. In one situation, a distribution system design
model relied on data obtained from the corporate finance department. Transportation costs were
determined using a formula based on the latitude and longitude of the locations of plants and
customers. But when the solution was represented on a geographic information system (GIS)
mapping program, one of the customers was in the Atlantic Ocean.
Thus, data used in business decisions need to be reliable and valid. Reliability means
that data are accurate and consistent. Validity means that data correctly measure what they are
supposed to measure.
----------------------------------------------------------------------------------------------------------------
MODELS FOR BUSINESS ANALYTICS:
Many decision problems can be formalized using a model. A model is an abstraction
or representation of a real system, idea, or object. Models capture the most important features
of a problem and present them in a form that is easy to interpret. A model can be as simple
as a written or verbal description of some phenomenon, a visual representation such as a graph
or a flowchart, or a mathematical or spreadsheet representation.
Models can be descriptive, predictive, or prescriptive, and therefore are used in a wide
variety of business analytics applications. A simple descriptive model is a visual
representation called an influence diagram because it describes how various elements of the
model influence, or relate to, others. An influence diagram is a useful approach for
conceptualizing the structure of a model and can assist in building a mathematical or
spreadsheet model. The elements of the model are represented by circular symbols called nodes.
Arrows called branches connect the nodes and show which elements influence others. Influence
diagrams are quite useful in the early stages of model building when we need to understand and
characterize key relationships.
DECISION MODELS:
A decision model is a logical or mathematical representation of a problem or business
situation that can be used to understand, analyze, or facilitate making a decision. Most decision
models have three types of input:
(A) Data, which are assumed to be constant for purposes of the model. Some examples
would be costs, machine capacities, and intercity distances.
(B) Uncontrollable variables, which are quantities that can change but cannot be directly
controlled by the decision maker. Some examples would be customer demand, inflation
rates, and investment returns. Often, these variables are uncertain.
(C) Decision variables, which are controllable and can be selected at the discretion of the
decision maker. Some examples would be production quantities, staffing levels, and
investment allocations.

Decision models characterize the relationships among the data, uncontrollable variables,
and decision variables, and the outputs of interest to the decision maker. Decision models can
be represented in various ways, most typically with mathematical functions and spreadsheets.
Spreadsheets are ideal vehicles for implementing decision models because of their versatility in
managing data, evaluating different scenarios, and presenting results in a meaningful fashion.
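A minimal decision-model sketch in Python that separates the three input types described above (all figures are hypothetical):

def total_profit(order_quantity, demand, unit_price, unit_cost, fixed_cost):
    # We can sell no more than we ordered or than customers demand
    units_sold = min(order_quantity, demand)
    return units_sold * unit_price - order_quantity * unit_cost - fixed_cost

# Data (assumed constant):    unit_price, unit_cost, fixed_cost
# Uncontrollable variable:    demand (uncertain)
# Decision variable:          order_quantity
for q in (400, 500, 600):                 # candidate decisions
    for d in (450, 550):                  # demand scenarios
        p = total_profit(q, d, unit_price=50.0, unit_cost=30.0, fixed_cost=2000.0)
        print(f"order={q}  demand={d}  profit={p:8.1f}")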
UNCERTAINTY AND RISK:
Uncertainty is imperfect knowledge of what will happen; risk is associated with the
consequences and likelihood of what might happen. Risk is evaluated by the magnitude of the
consequences and the likelihood that they would occur. To try to eliminate risk in business
enterprise is futile. Risk is inherent in the commitment of present resources to future
expectations. Indeed, economic progress can be defined as the ability to take greater risks. The
attempt to eliminate risks, even the attempt to minimize them, can only make them irrational
and unbearable. It can only result in the greatest risk of all.
A PRESCRIPTIVE DECISION MODEL:
helps decision makers identify the best solution to a decision problem. Optimization
is the process of finding a set of values for decision variables that minimize or maximize some
quantity of interest (profit, revenue, cost, time, and so on), called the objective function. Any
set of decision variables that optimizes the objective function is called an optimal solution.
Prescriptive decision models can be either deterministic or stochastic. A deterministic model
is one in which all model input information is either known or assumed to be known with
certainty. A stochastic model is one in which some of the model input information is
uncertain.
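A small sketch of a deterministic prescriptive model, assuming scipy is available: choose production quantities x1 and x2 to maximize profit subject to two resource limits (all coefficients are hypothetical):

from scipy.optimize import linprog

# Maximize 40*x1 + 30*x2, i.e. minimize -(40*x1 + 30*x2)
c = [-40.0, -30.0]                      # objective coefficients (negated)
A_ub = [[2.0, 1.0],                     # machine hours used per unit of x1, x2
        [1.0, 3.0]]                     # labor hours used per unit of x1, x2
b_ub = [100.0, 90.0]                    # available machine and labor hours

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)                  # optimal quantities and maximum profit

Here all the model input data are assumed known with certainty, so the model is deterministic; if demand or prices were uncertain, the model would be stochastic.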
----------------------------------------------------------------------------------------------------------------

PROBLEM SOLVING WITH ANALYTICS:


Problem solving is the activity associated with defining, analyzing, and solving a
problem and selecting an appropriate solution that solves the problem. Problem solving consists
of several phases:
(A) recognizing a problem
(B) defining the problem
(C) structuring the problem
(D) analyzing the problem
(E) interpreting results and making a decision
(F) implementing the solution

(A) RECOGNIZING PROBLEM:


Managers at different organizational levels face different types of problems. In
a manufacturing firm, for instance, top managers face decisions of allocating financial
resources, building or expanding facilities, determining product mix, and strategically
sourcing pro- duction. Middle managers in operations develop distribution plans,
production and inventory schedules, and staffing plans. Finance managers analyze
risks, determine investment strategies, and make pricing decisions. Marketing
managers develop advertising plans and make sales force allocation decisions. In
manufacturing operations, problems involve the size of daily production runs,
individual machine schedules, and worker assignments. Whatever the problem, the first
step is to realize that it exists.

(B) DEFINING THE PROBLEM:


The next step is to clearly define the problem. Finding the real problem and distinguishing it
from symptoms that are observed is a critical step. For example, high distribution costs
might stem from inefficiencies in routing trucks, poor location of distribution centers, or
external factors such as increasing fuel costs. The problem might be defined as
improving the routing process, redesigning the entire distribution system, or optimally
hedging fuel purchases.
Defining problems is not a trivial task. The complexity of a problem increases
when the following occur:
 The number of potential courses of action is large.
 The problem belongs to a group rather than to an individual.
 The problem solver has several competing objectives.
 External groups or individuals are affected by the problem.
 The problem solver and the true owner of the problem—the person who
experiences the problem and is responsible for getting it solved—are not the
same.
 Time limitations are important.

(C) STRUCTURING THE PROBLEM:


This usually involves stating goals and objectives, characterizing the possible
decisions, and identifying any constraints or restrictions.

(D) ANALYZING THE PROBLEM:


This is the phase in which analytics plays a major role. Analysis involves some sort of experimentation or solution process, such as evaluating different scenarios, analyzing risks associated with various decision alternatives, finding a solution that meets certain goals, or determining an optimal solution. Analytics professionals have spent decades developing and refining a variety of approaches to address different types of problems.

(E) INTERPRETING THE RESULTS & MAKING DECISIONS:


Interpreting the results from the analysis phase is crucial in making good decisions.
Models cannot capture every detail of the real problem, and managers must understand the
limitations of models and their underlying assumptions and often incorporate judgment into
making a decision.

(F) IMPLEMENTING THE SOLUTION:


This simply means making the solution work in the organization, or translating the results of a model back to the real world. This generally requires providing adequate resources, motivating
employees, eliminating resistance to change, modifying organizational policies, and developing
trust. Problems and their solutions affect people: customers, suppliers, and employees. All must
be an important part of the problem-solving process.

Sensitivity to political and organizational issues is an important skill that managers and
analytical professionals alike must possess when solving problems.

--------------------------------------------------------------------------------------------------------------------------------------

VISUALIZING & EXPLORING DATA:


See Chapter 3 (full) of Business Analytics by James Evans, page 79.
BUSINESS ANALYTICS TECHNIQUE:
Firms need an information technology (IT) infrastructure that supports personnel in
the conduct of their daily business operations. The general requirements for such a
system are stated in Table 3.6. These types of technology are elemental needs for
business analytics operations.

Table 3.6 General Information Technology (IT) Infrastructure

Of particular importance for BA are the data management technologies listed in Table 3.6. A database management system (DBMS) is data management software that permits firms to centralize data, manage it efficiently, and provide access to stored data by application programs. A DBMS usually serves as an interface between application programs and the physical data files of structured data, and it makes the task of understanding where and how the data is actually stored more efficient. In addition, some DBMS systems can handle unstructured data. For example, object-oriented DBMS systems are able to store and retrieve unstructured data, like drawings, images, photographs, and voice data.
These types of technology are necessary to handle the load of big data that most firms currently collect.

DBMS includes capabilities and tools for organizing, managing, and accessing data in databases. Four of the more important capabilities are its data definition language, data dictionary, database encyclopedia, and data manipulation language. DBMS has a data definition capability to specify the structure of content in a database. This is used to create database tables and the characteristics used in fields to identify content. These tables and characteristics are critical success factors for search efforts as the database grows in size. These characteristics are documented in the data dictionary (an automated or manual file that stores the size, descriptions, format, and other properties needed to characterize data).
The database encyclopedia is a table of contents listing a firm's current data inventory and what data files can be built or purchased. The typical content of the database encyclopedia is presented in Table 3.7. Of particular importance for BA are the data manipulation language tools included in DBMS. These tools are used to search databases for specific information. An example is Structured Query Language (SQL), which allows users to find specific data through a session of queries and responses in a database.
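A minimal sketch of such a query session, run here through Python's built-in sqlite3 module; the customer table, its columns, and the figures are hypothetical and only illustrate the kind of question a data manipulation language answers.

import sqlite3

# Hypothetical customer table used to illustrate an SQL query session.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, region TEXT, sales REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "West", 1200.0), (2, "East", 800.0), (3, "West", 950.0)])

# Query: total sales by region, a typical specific-information request against a DBMS.
for row in conn.execute("SELECT region, SUM(sales) FROM customers GROUP BY region"):
    print(row)
conn.close()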

Table 3.7 Database Encyclopedia Content


Data warehouses are databases that store current and historical data of potential interest
to decision makers. What a data warehouse does is make data available to anyone who
needs access to it. In a data warehouse, the data is prohibited from being altered. Data
warehouses also provide a set of query tools, analytical tools, and graphical reporting
facilities. Some firms use intranet portals to make data warehouse information widely
available throughout a firm.

Data marts are focused subsets or smaller groupings within a data warehouse.
Firms often build enterprise-wide data warehouses where a central data warehouse
serves the entire organization and smaller, decentralized data warehouses (called data
marts) are focused on a limited portion of the organization's data that is placed in a separate database for a specific population of users. For example, a firm might develop a smaller database on just product quality to focus efforts on quality customer and product issues. A data mart can be constructed more quickly and at lower cost than an enterprise-wide data warehouse to concentrate effort in the areas of greatest concern.

Online analytical processing (OLAP) is software that allows users to view data in multiple dimensions. For example, employees can be viewed in terms of their age, sex, geographic location, and so on. OLAP would allow identification of the number of employees who are age 35, male, and in the western region of a country. OLAP allows users to obtain online answers to ad hoc questions quickly, even when the data is stored in very large databases.
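The multidimensional view OLAP provides can be imitated on a small scale with a pivot table; the sketch below uses pandas on a hypothetical employee file, with the column names and records assumed for illustration only.

import pandas as pd

# Hypothetical employee records (columns are illustrative assumptions).
employees = pd.DataFrame({
    "age":    [35, 35, 42, 35, 29],
    "sex":    ["M", "M", "F", "M", "F"],
    "region": ["West", "West", "East", "East", "West"],
})

# An OLAP-style slice: employee counts by age, sex, and region.
cube = employees.pivot_table(index=["age", "sex"], columns="region",
                             aggfunc="size", fill_value=0)
print(cube)

# Answering the ad hoc question above: how many employees are age 35, male, in the West?
print(len(employees.query("age == 35 and sex == 'M' and region == 'West'")))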
Data mining is a software-based, discovery-driven process that provides insights into business data by finding hidden patterns and relationships in big data or large databases and inferring rules from them to predict future behavior. The observed patterns and rules are used to guide decision making. They can also be used to forecast the impact of those decisions.

Table 3.8 Types of Information Obtainable with Data Mining Technology

Text mining is a software application used to extract key elements from unstructured data sets, discover patterns and relationships in the text materials, and summarize the information.

Web mining seeks to find patterns, trends, and insights into customer behavior from
users of the Web.

Analysis ToolPak is an Excel add-in that contains a variety of statistical tools (for
example, graphics and multiple regression) for the descriptive and predictive BA process
steps. Another Excel add-in, Solver, contains operations research optimization tools (for
example, linear programming) used in the prescriptive step of the BA process.
UNIT-3

3.1.____________________________________________________________
ORGANIZATION STRUCTURES OF B.A.:
To successfully implement business analytics (BA) within organizations, BA in whatever organizational form it takes must be fully integrated throughout a firm. This requires BA resources to be aligned in a way that permits a view of customer information within and across all departments, access to customer information from multiple sources (internal and external to the organization), access to historical analytics from a central repository, and alignment of technology resources that are accountable for analytic success. The commonality of these requirements is the desire for an alignment that maximizes the flow of information into and through the BA operation, which in turn processes and shares information with desired users throughout the organization.
(A) Most organizations are hierarchical, with senior managers making the strategic planning decisions, middle-level managers making tactical planning decisions, and lower-level managers making operational planning decisions. Within the hierarchy, other organizational structures exist to support the development and existence of groupings of resources like those needed for BA. These additional structures include programs, projects, and teams. A program in this context is the process that seeks to create an outcome and usually involves managing several related projects with the intention of improving organizational performance. A program can also be a large project. A project tends to deliver outcomes and can be defined as having temporary rather than permanent social systems within or across organizations to accomplish particular and clearly defined tasks, usually under time constraints. Projects are often composed of teams. A team consists of a group of people with skills to achieve a common purpose. Teams are especially appropriate for conducting complex tasks that have many interdependent subtasks.

The relationship of programs, projects, and teams with a business hierarchy is presented in
Figure 4.1. Within this hierarchy, the organization’s senior managers establish a BA program
initiative to mandate the creation of a BA grouping within the firm as a strategic goal. A BA
program does not always have an end-time limit. Middle-level managers reorganize or break
down the strategic BA program goals into doable BA project initiatives to be undertaken in a
fixed period of time. Some firms have only one project (establish a BA grouping) and others,
depending on the organization structure, have multiple BA projects requiring the creation of
multiple BA groupings. Projects usually have an end date by which to judge the success of the project. The projects in some cases are further reorganized into smaller
assignments, called BA team initiatives, to operationalize the broader strategy of the BA
program. BA teams may have a long-standing time limit (for example, to exist as the main source
of analytics for an entire organization) or have a fixed period (for example, to work on a specific
product quality problem and then end).
Figure 4.1 Hierarchical relationships of program, project, and team planning

In summary, one way to look at the alignment of BA resources is to view it as a progression of assigned planning tasks from a BA program, to BA projects, and eventually to BA teams for implementation. As shown in Figure 4.1, this hierarchical relationship is a way to examine how firms align planning and decision-making workload to fit strategic needs and requirements.

BA organization structures usually begin with an initiative that recognizes the need to
use and develop some kind of program in analytics. Fortunately, most firms today recognize
this need. The question then becomes how to match the firm’s needs within the organization to
achieve its strategic, tactical, and operations objectives within resource limitations. Planning
the BA resource allocation within the organizational structure of a firm is a starting place for the
alignment of BA to best serve a firm’s needs.

Aligning the BA resources requires a determination of the amount of resources a firm wants
to invest. The outcome of the resource investment might identify only one individual to compute
analytics for a firm. Because of the varied skill sets in information systems, statistics, and operations
research methods, a more common beginning for a BA initiative is the creation of a BA team
organization structure possessing a variety of analytical and management skills.
(B) Another way of aligning BA resources within an organization is to use a project structure.
Most firms undertake projects, and some firms actually use a project structure for their entire
organization.
In organizations where functional departments are structured on a strict hierarchy, separate
BA departments or teams have to be allocated to each functional area, as presented in Figure 4.2.
This functional organization structure may have the benefit of stricter functional control by the
VPs of an organization and greater efficiency in focusing on just the analytics within each
specialized area. On the other hand, this structure does not promote the cross-department access
that is suggested as a critical success factor for the implementation of a BA program.
Figure 4.2 Functional organization structure with BA
The needs of each firm for BA sometimes dictate positioning BA within existing organization
functional areas. Clearly, many alternative structures can house a BA grouping. For example,
because BA provides information to users, BA could be included in the functional area of
management information systems, with the chief information officer (CIO) acting as both the
director of information systems (which includes database management) and the leader of the
BA grouping.

(C) A third structure, found in large organizations, aligns resources by project or product and is called a
matrix organization. As illustrated in Figure 4.3, this structure allows the VPs some indirect
control over their related specialists, which would include the BA specialists but also allows
direct control by the project or product manager. This, similar to the functional organizational
structure, does not promote the cross-department access suggested for a successful
implementation of a BA program.

Figure 4.3 Matrix organization structure


The literature suggests that the organizational structure that best aligns BA resources is one
in which a department, project, or team is formed in a staff structure where access to and from
the BA grouping of resources permits access to all areas within a firm, as illustrated in Figure
4.4. The dashed line indicates a staff (not line management) relationship. This centralized BA
organization structure minimizes investment costs by avoiding duplications found in both the
functional and the matrix styles of organization structures. At the same time, it maximizes
information flow between and across functional areas in the organization. This is a logical
structure for a BA group in its advisory role to the organization.

Other benefits of this centralized structure include a reduction in the filtering of information traveling upward through the organization, insulation from political interests, breakdown of the siloed functional-area communication barriers, a more central platform for reviewing important analyses that require a broader field of specialists, analytic-based group decision-making efforts, separation of the line management leadership from potential clients (for example, the VP of marketing would not necessarily come between the BA group working on customer service issues for a department within marketing), and better connectivity between BA and all personnel within the area of problem solving.

Figure 4.4 Centralized BA department, project, or team organization structure


Given the advocacy and logic recommending a centralized BA grouping, there are reasons why not all BA groupings can or should be centralized. These reasons help explain why BA initiatives that seek to integrate and align BA resources into any type of BA group within the organization sometimes fail.
----------------------------------------------------------------------------------------------------------------
TEAM MANAGEMENT:
When it comes to getting the BA job done, the work tends to fall to a BA team. For firms that employ BA teams, the participants can be defined by the roles they play in the team effort. Some of the roles BA team participants undertake and their typical backgrounds are presented in Table 4.2.
Table 4.2 BA Team Participant Roles*
Aligning BA teams to achieve their tasks requires collaboration efforts from team members
and from their organizations. Like BA teams, collaboration involves working with people to
achieve a shared and explicit set of goals consistent with their mission. BA teams also have a
specific mission to complete. Collaboration through teamwork is the means to accomplish their
mission.
Team members’ need for collaboration is motivated by changes in the nature of work (no
more silos to hide behind, much more open environment, and so on), growth in professions
(for example, interactive jobs tend to be more professional, requiring greater variety in
expertise sharing), and the need to nurture innovation (creativity and innovation are fostered
by collaboration with a variety of people sharing ideas). To keep one's job and to progress in any business career, particularly in BA, team members must be willing to work with other members inside the team and out.
For organizations, collaboration is motivated by the changing nature of information flow
(that is, hierarchical flows tend to be downward, whereas in modern organizations, flow is in
all directions) and changes in the scope of business operations (that is, going from domestic
to global allows for a greater flow of ideas and information from multiple sources in multiple
locations).

-----------------------------------------------------------------------------------------------------------------
MANAGEMENT ISSUES:
Aligning organizational resources is a management function. There are general
management issues that are related to a BA program, and some are specifically important to
operating a BA department, project, or team. The ones covered in this section include
establishing an information policy, outsourcing business analytics, ensuring data quality,
measuring business analytics contribution, and managing change.
 Establishing an Information Policy:
There is a need to manage information. This is accomplished by establishing an information
policy to structure rules on how information and data are to be organized and maintained and who is allowed to view the data or change it. The information policy specifies organizational rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying all types of information and data. It defines the specific procedures and accountabilities that identify which users and organizational units can share information, where the information can be distributed,
and who is responsible for updating and maintaining the information.
In small firms, business owners might establish the information policy. For larger firms, data administration may be responsible for the specific policies and procedures for data management. Responsibilities could include developing the information policy, planning data collection and storage, overseeing database design, developing the data dictionary, as well as monitoring how information systems specialists and end-user groups use data.
 Outsourcing Business Analytics:
Outsourcing can be defined as a strategy by which an organization chooses to allocate some
business activities and responsibilities from an internal source to an external source. Outsourcing
business operations is a strategy that an organization can use to implement a BA program, run
BA projects, and operate BA teams. Any business activity can be outsourced, including BA.
Outsourcing is an important BA management activity that should be considered as a viable
alternative in planning an investment in any BA program.
BA is a staff function that is easier to outsource than other line management tasks, such as
running a warehouse. To determine if outsourcing is a useful option in BA programs,
management needs to balance the advantages of outsourcing with its disadvantages. Some of
the advantages of outsourcing BA include those listed in Table 4.4.

Table 4.4 Advantages of Outsourcing BA


Some of the disadvantages of outsourcing are presented in Table 4.5.

Table 4.5 Disadvantages of Outsourcing BA

 Ensuring Data Quality:


Business analytics, to be relevant, must be based on data assumed to be of high quality. Data quality refers to the accuracy, precision, and completeness of data. High-quality data is considered to correctly reflect the real world from which it is extracted. Poor-quality data caused by data entry errors, poorly maintained databases, out-of-date data, and incomplete data usually leads to bad decisions and undermines BA within a firm. Organizationally, the database management system (DBMS) personnel are managerially responsible for ensuring data quality. Because of its importance and the possible location of the BA department outside of the management information systems department (which usually hosts the DBMS), it is imperative that whoever leads the BA program should seek to ensure that data quality efforts are undertaken.

An organization needs to identify and correct faulty data and establish routines and
procedures for editing data in the database. The analysis of data quality can begin with a data
quality audit, where a structured survey or inspection of accuracy and level of completeness of
data is undertaken. This audit may be of the entire database, just a sample of files, or a survey
of end users for perceptions of the data quality. If during the data quality audit files are found
that have errors, a process called data cleansing or data scrubbing is undertaken to eliminate or
repair data. Some of the areas in a data file that should be inspected in the audit and suggestions on how to correct them are presented in Table 4.6.
Table 4.6 Quality Data Inspection Items and Recommendations

 Measuring Business Analytics Contribution:

The investment in BA must continually be justified by communicating the BA contribution to the organization for ongoing projects. This means that performance analytics should be computed for every BA project and BA team initiative. These analytics should provide an estimate of the tangible and intangible values being delivered to the organization. This should also involve establishing a communication strategy to promote the value being estimated.
Measuring the value and contributions that BA brings to an organization is essential to helping the firm understand why the application of BA is worth the investment. Some BA contribution estimates can be computed using standard financial methods, such as payback period (how long it takes for the initial costs to be recovered through profits) or return on investment (ROI), where dollar values or quantitative analysis are possible. When intangible contributions are a major part of the contribution being delivered to the firm, other methods, such as cost/benefit analysis that includes intangible benefits, should be used.
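A minimal sketch of the two standard financial measures just mentioned, using hypothetical investment and benefit figures that stand in for a BA project's contribution estimate.

def payback_period(initial_cost, annual_profit):
    """Years until cumulative profit recovers the initial investment."""
    return initial_cost / annual_profit

def roi(total_benefit, total_cost):
    """Return on investment expressed as a fraction of total cost."""
    return (total_benefit - total_cost) / total_cost

# Hypothetical BA project: $200,000 investment returning $80,000 per year in benefits.
print(payback_period(200_000, 80_000))   # 2.5 years
print(roi(3 * 80_000, 200_000))          # 0.2, i.e., a 20% return over three years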

 Managing Change:

Wells (2000) found that what is critical in changing organizations is organizational culture and the use of change management. Organizational culture is how an organization supports cooperation, coordination, and empowerment of employees. Change management is defined as an approach for transitioning the organization (individuals, teams, projects, departments) to a changed and desired future state. Change management is a means of implementing change in an organization, such as adding a BA department. Changes in an organization can be either planned (a result of specific and planned efforts at change with direction by a change leader) or unplanned (spontaneous changes without direction of a change leader).
The application of BA invariably will result in both types of changes because of BA's specific problem-solving role (a desired, planned change to solve a problem) and the opportunity-finding, exploratory nature of BA (i.e., unplanned new-knowledge opportunity changes). Change management can also target almost everything that makes up an organization (see Table 4.7).

Table 4.7 Change Management Targets


Some of these activities that lead to change management success are presented as best
practices in Table 4.8.
Table 4.8 Change Management Best Practices
---------------------------------------------------------------------------------------------------------------

3.2.____________________________________________________________
DESCRIPTIVE ANALYTICS:
See Chapter 5 (full) of Business Analytics Principles and Concepts by Marc J. Schniederjans.
----------------------------------------------------------------------------------------------------------------
PREDICTIVE ANALYSIS:
Predictive analytics is the use of data, statistical algorithms and machine learning
techniques to identify the likelihood of future outcomes based on historical data. The goal is to go
beyond knowing what has happened to providing a best assessment of what will happen in the
future. Predictive analytics is an area of statistics that deals with extracting information from data
and using it to predict trends and behavior patterns. The enhancement of predictive web analytics
calculates statistical probabilities of future events online. Predictive analytics statistical techniques
include data modeling, machine learning, AI, deep learning algorithms and data mining. Often the
unknown event of interest is in the future, but predictive analytics can be applied to any type of unknown, whether it be in the past, present, or future.
TYPES:
(1) Predictive models
Predictive modelling uses predictive models to analyze the relationship between the specific
performance of a unit in a sample and one or more known attributes or features of that unit. The
objective of the model is to assess the likelihood that a similar unit in a different sample will exhibit
the specific performance. This category encompasses models in many areas, such as marketing,
where they seek out subtle data patterns to answer questions about customer performance, or fraud
detection models. Predictive models often perform calculations during live transactions, for
example, to evaluate the risk or opportunity of a given customer or transaction, in order to guide a
decision. With advancements in computing speed, individual agent modeling systems have become
capable of simulating human behaviour or reactions to given stimuli or scenarios.
(2) Descriptive models
Descriptive models quantify relationships in data in a way that is often used to classify
customers or prospects into groups. Unlike predictive models that focus on predicting a single
customer behavior (such as credit risk), descriptive models identify many different relationships
between customers or products. Descriptive models do not rank-order customers by their likelihood
of taking a particular action the way predictive models do. Instead, descriptive models can be used,
for example, to categorize customers by their product preferences and life stage. Descriptive
modeling tools can be utilized to develop further models that can simulate a large number of
individualized agents and make predictions.
(3) Decision models
Decision models describe the relationship between all the elements of a decision—the
known data (including results of predictive models), the decision, and the forecast results of the
decision—in order to predict the results of decisions involving many variables. These models can
be used in optimization, maximizing certain outcomes while minimizing others. Decision models
are generally used to develop decision logic or a set of business rules that will produce the desired
action for every customer or circumstance.
----------------------------------------------------------------------------------------------------------------

PREDICTIVE MODELLING:
Predictive modeling means developing models that can be used to forecast or predict
future events. In business analytics, models can be developed based on logic or data.
(A) Logic-Driven Models:
A logic-driven model is one based on experience, knowledge, and logical relationships of
variables and constants connected to the desired business performance outcome situation. The
question here is how to put variables and constants together to create a model that can predict
the future. Doing this requires business experience. Model building requires an understanding
of business systems and the relationships of variables and constants that seek to generate a
desirable business performance outcome. To help conceptualize the relationships inherent in a
business system, diagramming methods can be helpful. For example, the cause-and-effect
diagram is a visual aid diagram that permits a user to hypothesize relationships between
potential causes of an outcome (see Figure 6.1). This diagram lists potential causes in terms of
human, technology, policy, and process resources in an effort to establish some basic
relationships that impact business performance. The diagram is used by tracing contributing and
relational factors from the desired business performance goal back to possible causes, thus
allowing the user to better picture sources of potential causes that could affect the performance.
This diagram is sometimes referred to as a fishbone diagram because of its appearance.
Figure 6.1 Cause-and-effect diagram
Another useful diagram to conceptualize potential relationships with business performance variables is called the influence diagram. According to Evans, influence diagrams can be useful to conceptualize the relationships of variables in the development of models. An example of an influence diagram is presented in Figure 6.2. It maps the relationship of variables and a constant to the desired business performance outcome of profit. From such a diagram, it is easy to convert the information into a quantitative model with constants and variables that define profit in this situation:
Profit = Revenue − Cost, or
Profit = (Unit Price × Quantity Sold) − [(Fixed Cost) + (Variable Cost × Quantity Sold)], or
P = (UP × QS) − [FC + (VC × QS)]

Figure 6.2 An influence diagram
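The influence-diagram relationships above translate directly into a small calculation; the sketch below simply evaluates the profit function for hypothetical values of unit price, quantity sold, and costs, much as a spreadsheet cell formula would.

def profit(unit_price, quantity_sold, fixed_cost, variable_cost):
    # P = (UP x QS) - [FC + (VC x QS)]
    revenue = unit_price * quantity_sold
    total_cost = fixed_cost + variable_cost * quantity_sold
    return revenue - total_cost

# Hypothetical values for illustration only.
print(profit(unit_price=25.0, quantity_sold=1_000, fixed_cost=8_000, variable_cost=12.0))
# 25*1000 - (8000 + 12*1000) = 5000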


(B) Data-Driven Models:
Logic-driven modeling is often used as a first step to establish relationships through data-
driven models (using data collected from many sources to quantitatively establish model
relationships). To avoid duplication of content and focus on conceptual material in the chapters,
most of the computational aspects and some computer usage content are relegated to the
appendixes. In addition, some of the methodologies are illustrated in the case problems presented
in this book. Please refer to the Additional Information column in Table 6.1 to obtain further
information on the use and application of the data-driven models.

Table 6.1 Data-Driven Models


------------------------------------------------------------------------------------------------------------------

Predictive Analytics Analysis:

An ideal multiple variable modeling approach that can be used in this situation to
explore variable importance in this case study and eventually lead to the development of a
predictive model for product sales is correlation and multiple regression. We will use both
Excel and IBM’s SPSS statistical packages to compute the statistics in this step of the BA
process.

First, we must consider the four independent variables—radio, TV, newspaper, POS—
before developing the model.
One way to see the statistical direction of the relationship (which is better than just comparing graphic charts) is to compute the Pearson correlation coefficient r between each of the independent variables and the dependent variable (product sales). The SPSS correlation coefficients and their levels of significance are presented in Table 6.4. The comparable Excel correlations are presented in Figure 6.5.

Table 6.4 SPSS Pearson Correlation Coefficients: Marketing/Planning

Figure 6.5 Excel Pearson correlation coefficients: marketing/planning case study
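Outside of SPSS or Excel, the same correlation matrix can be sketched in a few lines of pandas; the monthly advertising and sales figures below are hypothetical stand-ins for the case-study file, so the coefficients will not match those in Table 6.4 or Figure 6.5.

import pandas as pd

# Hypothetical monthly data (thousands of dollars), standing in for the case-study file.
data = pd.DataFrame({
    "radio":     [10, 15, 12, 20, 18, 25],
    "tv":        [30, 35, 32, 45, 40, 50],
    "newspaper": [ 8,  6,  9,  5,  7,  4],
    "pos":       [ 2,  3,  2,  4,  3,  5],
    "sales":     [120, 150, 135, 190, 170, 220],
})

# Pearson correlation of each independent variable with product sales.
print(data.corr(method="pearson")["sales"])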

Although it can be argued that the positive or negative correlation coefficients should not
automatically discount any variable from what will be a predictive model, the negative
correlation of newspapers suggests that as a firm increases investment in newspaper ads, it will
decrease product sales. This does not make sense in this case study. Given the illogic of such a
relationship, its potential use as an independent variable in a model is questionable. Also, this
negative correlation poses several questions that should be considered. Was the data set
correctly collected? Is the data set accurate? Was the sample large enough to have included
enough data for this variable to show a positive relationship? Should it be included for further
analysis? Although it is possible that a negative relationship can statistically show up like this,
it does not make sense in this case. Based on this reasoning and the fact that the correlation is
not statistically significant, this variable (i.e., newspaper ads) will be removed from further
consideration in this exploratory analysis to develop a predictive model.
Some researchers might also exclude POS based on the insignificance (p = 0.479) of its relationship with product sales. However, for purposes of illustration, we continue to consider it a candidate for model inclusion. Also, the other two independent variables (radio and TV) were both found to be significantly related to product sales, as reflected in the correlation coefficients in the tables.
The procedure by which multiple regression can be used to evaluate which independent
variables are best to include or exclude in a linear model is called step-wise multiple
regression. It is based on an evaluation of regression models and their validation statistics—
specifically, the multiple correlation coefficients and the F-ratio from an ANOVA. SPSS
software and many other statistical systems build in the step-wise process. Some are called
backward step-wise regression and some are called forward step-wise regression. The
backward step-wise regression starts with all the independent variables placed in the model,
and the step-wise process removes them one at a time based on worst predictors first until a
statistically significant model emerges. The forward step-wise regression starts with the best
related variable (using correlation analysis as a guide), and then step-wise adds other variables
until adding more will no longer improve the accuracy of the model. The forward step-wise
regression process will be illustrated here manually. The first step is to generate individual
regression models and statistics for each independent variable with the dependent variable one
at a time. These three models are presented in Tables 6.5, 6.6, and 6.7 for the POS, radio, and
TV variables, respectively. The comparable Excel regression statistics are presented in Tables
6.8, 6.9 and 6.10 for the POS, radio, and TV variables, respectively.

Table 6.5 SPSS POS Regression Model: Marketing/Planning Case Study


Table 6.6 SPSS Radio Regression Model: Marketing/Planning Case Study

Table 6.7 SPSS TV Regression Model: Marketing/Planning Case Study


Table 6.8 Excel POS Regression Model: Marketing/Planning Case Study

Table 6.9 Excel Radio Regression Model: Marketing/Planning Case Study
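A hedged sketch of the same forward step-wise idea using the statsmodels library: fit each single-variable regression, keep the best, then add variables only while the adjusted R-square improves. The column names and data are hypothetical, so the result illustrates the procedure rather than reproducing the SPSS or Excel output shown in the tables.

import pandas as pd
import statsmodels.api as sm

# Hypothetical case-study data (thousands of dollars).
data = pd.DataFrame({
    "radio": [10, 15, 12, 20, 18, 25, 22, 14],
    "tv":    [30, 35, 32, 45, 40, 50, 47, 33],
    "pos":   [ 2,  3,  2,  4,  3,  5,  4,  2],
    "sales": [120, 150, 135, 190, 170, 220, 205, 140],
})
y = data["sales"]

def adj_r2(columns):
    """Adjusted R-square of an OLS model using the given predictor columns."""
    X = sm.add_constant(data[list(columns)])
    return sm.OLS(y, X).fit().rsquared_adj

# Forward step-wise selection: add the variable that most improves adjusted R-square.
candidates, selected = {"radio", "tv", "pos"}, []
while candidates:
    best = max(candidates, key=lambda c: adj_r2(selected + [c]))
    if selected and adj_r2(selected + [best]) <= adj_r2(selected):
        break  # adding another variable no longer improves the model
    selected.append(best)
    candidates.remove(best)
print("Selected predictors:", selected)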


----------------------------------------------------------------------------------------------------------------
DATA MINING:
It is a discovery-driven software application process that provides insights into business data by finding hidden patterns and relationships in big or small data and inferring rules from them to predict future behavior. These observed patterns and rules guide decision making. This is not just numbers, but text and social media information from the Web.
A Simple Illustration of Data Mining:
Suppose a grocery store has collected a big data file on what customers put into their baskets
at the market (the collection of grocery items a customer purchases at one time). The grocery
store would like to know if there are any associated items in a typical market basket. (For
example, if a customer purchases product A, she will most often associate it or purchase it with
product B.) If the customer generally purchases product A and B together, the store might only
need to advertise product A to gain both product A’s and B’s sales.
The value of knowing this association of products can improve the performance of the store
by reducing the need to spend money on advertising both products. The benefit is real if the
association holds true. Finding the association and proving it to be valid requires some analysis.
From the descriptive analytics analysis, some possible associations may have been
uncovered, such as product A’s and B’s association. With any size data file, the normal
procedure in data mining would be to divide the file into two parts. One is referred to as a training
data set, and the other as a validation data set. The training data set develops the association rules, and the validation data set tests and proves that the rules work. Starting with the training data set, a common data mining methodology is what-if analysis using logic-based software.
Excel and SPSS both have what-if logic-based software applications, and so do a number of
other software vendors. These software applications allow logic expressions. (For example, if
product A is present, then is product B present?) The systems can also provide frequency and
probability information to show the strength of the association. These software systems have
differing capabilities, which permit users to deterministically simulate different scenarios to
identify complex combinations of associations between product purchases in a market basket.
Once a collection of possible associations is identified and their probabilities are computed,
the same logic associations (now considered association rules) are rerun using the validation
data set. A new set of probabilities can be computed, and those can be statistically compared
using hypothesis testing methods to determine their similarity. Other software systems compute
correlations for testing purposes to judge the strength and the direction of the relationship. In
other words, if the consumer buys product A first, it could be referred to as the Head and product
B as the Body of the association. If the same basic probabilities are statistically significant, it
lends validity to the association rules and their use for predicting market basket item purchases
based on groupings of products.
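A minimal sketch of the product A / product B association check described above: split hypothetical baskets into training and validation sets, then compare the conditional probability (the rule's confidence) of B given A in each. The basket contents are invented for illustration.

# Hypothetical market baskets; each set holds the items bought together in one visit.
baskets = [
    {"A", "B", "milk"}, {"A", "B"}, {"A", "bread"}, {"B", "eggs"},
    {"A", "B", "eggs"}, {"A", "B", "bread"}, {"A"}, {"A", "B", "milk"},
]

def confidence(data, head, body):
    """P(body in basket | head in basket): strength of the head -> body rule."""
    with_head = [b for b in data if head in b]
    return sum(body in b for b in with_head) / len(with_head)

# Develop the rule on the first half, then test it on the validation half.
training, validation = baskets[:4], baskets[4:]
print("Training confidence A -> B:  ", confidence(training, "A", "B"))
print("Validation confidence A -> B:", confidence(validation, "A", "B"))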

DATA MINING METHODOLOGIES:


Data mining is an ideal predictive analytics tool used in the BA process. Table 6.2 lists a
small sampling of data mining methodologies to acquire different types of information. Some of
the same tools used in the descriptive analytics step are used in the predictive step but are employed
to establish a model (either based on logical connections or quantitative formulas) that may be
useful in predicting the future.

Table 6.2 Types of Information and Data Mining Methodologies


Several computer-based methodologies listed in Table 6.2 are briefly
introduced here. Neural networks are used to find associations where connections between
words or numbers can be determined. Specifically, neural networks can take large volumes of
data and potential variables and explore variable associations to express a beginning variable
(referred to as an input layer), through middle layers of interacting variables, and finally to an
ending variable (referred to as an output). More than just identifying simple one-on-one
associations, neural networks link multiple association pathways through big data like a
collection of nodes in a network. These nodal relationships constitute a form of classifying
groupings of variables as related to one another, but even more, related in complex paths with
multiple associations. SPSS has two versions of neural network software functions: Multilayer
Perceptron (MLP) and Radial Basis Function (RBF). Both procedures produce a predictive
model for one or more dependent variables based on the values of the predictive variables. Both
allow a decision maker to develop, train, and use the software to identify particular traits (such
as bad loan risks for a bank) based on characteristics from data collected on past customers.
Discriminant analysis is similar to a multiple regression model except that it permits
continuous independent variables and a categorical dependent variable. The analysis generates a
regression function whereby values of the independent variables can be incorporated to generate
a predicted value for the dependent variable. Similarly, logistic regression is like multiple
regression. Like discriminant analysis, its dependent variable can be categorical. The independent
variables, though, in logistic regression can be either continuous or categorical.
Hierarchical clustering is a methodology that establishes a hierarchy of clusters that can be
grouped by the hierarchy. Two strategies are suggested for this methodology: agglomerative
and divisive. The agglomerative strategy is a bottom-up approach, where one starts with each
item in the data and begins to group them. The divisive strategy is a top-down approach, where
one starts with all the items in one group and divides the group into clusters.
K-means clustering is a classification methodology that permits a set of data to be reclassified into K groups, where K can be set as the number of groups desired. The algorithmic process identifies initial candidates for the K groups and then iteratively searches other candidates in the data set to be averaged into a mean value that represents a particular K group.
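A minimal K-means sketch with scikit-learn; the two-feature customer records and the choice of K = 2 are illustrative assumptions only.

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customer records: [annual spend, visits per month].
X = np.array([[200, 2], [220, 3], [210, 2],
              [900, 12], [950, 10], [880, 11]])

# Reclassify the data into K = 2 groups around iteratively updated mean values.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster labels:  ", kmeans.labels_)
print("Cluster centers:\n", kmeans.cluster_centers_)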
-------------------------------------------------------------------------------------------------------
Prescriptive Analytics Step in the BA Process:
(A) Case Study Background Review:

The case study firm had collected a random sample of monthly sales information presented
in Figure 6.4 listed in thousands of dollars. What the firm wants to know is, given a fixed budget
of $350,000 for promoting this service product, when offered again, how best should the
company allocate budget dollars in hopes of maximizing the future estimated month’s product
sales? Before making any allocation of budget, there is a need to understand how to estimate
future product sales. This requires understanding the behavior of product sales relative to sales
promotion efforts using radio, newspaper, TV, and point-of-sale (POS) ads.
Figure 6.4 Data for marketing/planning case study
The analysis also revealed little regarding the relationship of newspaper and POS ads to
product sales. So although radio and TV commercials are most promising, a more in-depth
predictive analytics analysis is called for to accurately measure and document the degree of
relationship that may exist in the variables to determine the best predictors of product sales.

(B) PREDICTIVE ANALYTICS ANALYSIS:

-------------------------------------------------------------------------------------------------------------------------

PRESCRIPTIVE MODELLING:
After undertaking the descriptive and predictive analytics steps in the BA process, one
should be positioned to undertake the final step: prescriptive analytics analysis. The prior
analysis should provide a forecast or prediction of what future trends in the business may
hold. For example, there may be significant statistical measures of increased (or decreased)
sales, profitability trends accurately measured in dollars for new market opportunities, or
measured cost savings from a future joint venture.
Step 3 of the BA process, prescriptive analytics, involves the application of decision
science, management science, or operations research methodologies to make best use of
allocable resources. These are mathematically based methodologies and algorithms
designed to take variables and other parameters into a quantitative framework and generate
an optimal or near-optimal solution to complex problems. These methodologies can be used
to optimally allocate a firm’s limited resources to take best advantage of the opportunities
it has found in the predicted future trends. Limits on human, technology, and financial
resources prevent any firm from going after all the opportunities. Using prescriptive
analytics allows the firm to allocate limited resources to optimally or near-optimally achieve
the objectives as fully as possible.
The listing of the prescriptive analytic methodologies, as they are in some cases utilized in the BA process, is again presented in Figure 7.1 to form the basis of this chapter's content.

Figure 7.1 Prescriptive analytic methodologies

Prescriptive Modeling:
The listing of prescriptive analytic methods and models in Figure 7.1 is but a small
grouping of many operations research, decision science, and management science
methodologies that are applied in this step of the BA process. The explanation and use of
most of the methodologies in Table 7.1 are explained throughout this book. (See the Additional Information column in Table 7.1.)
NON-LINEAR OPTIMIZATION:
When business performance cost or profit functions become too complex for simple
linear models to be useful, exploration of nonlinear functions is a standard practice in BA.
Although the predictive nature of exploring for a mathematical expression to denote a trend
or establish a forecast falls mainly in the predictive analytics step of BA, the use of the
nonlinear function to optimize a decision can fall in the prescriptive analytics step.
There are many mathematical programming nonlinear methodologies and solution
procedures designed to generate optimal business performance solutions. Most of them
require careful estimation of parameters that may or may not be accurate, particularly given
the precision required of a solution that can be so precariously dependent upon parameter
accuracy. This precision is further complicated in BA by the large data files that should be
factored into the model-building effort.
To overcome these limitations and be more inclusive in the use of large data, regression software can be applied. Curve-fitting software can be used to generate predictive analytic models that can also be utilized to aid in making prescriptive analytic decisions. For purposes of illustration, SPSS's Curve Fitting software will be used in this chapter.
Suppose that a resource allocation decision is being faced whereby one must decide how
many computer servers a service facility should purchase to optimize the firm’s costs of
running the facility. The firm’s predictive analytics effort has shown a growth trend. A new
facilityis called for if costs can be minimized. The firm has a history of setting up large and
small service facilities and has collected the 20 data points in Figure 7.2.

Figure 7.2 Data and SPSS Curve Fitting function selection window

In this server problem, the basic data has a u-shaped function, as presented in Figure 7.3. This is a classic shape for most cost functions in business. In this problem, it represents the balancing of having too few servers (resulting in a costly loss of customer business through dissatisfaction and complaints with the service) or too many servers (excessive waste in investment costs as a result of underutilized servers). Although this is an overly simplified example with little and nicely ordered data for clarity purposes, in big data situations, cost functions are considerably less obvious.
Figure 7.3 Server problem basic data cost function

The first step in using the curve-fitting methodology is to generate the best-fitting
curve to the data. By selecting all the SPSS models in Figure 7.2, the software applies each
point of data using the regression process of minimizing distance from a line. The result is
a series of regression models and statistics, including ANOVA and other testing statistics.
It is known from the previous illustration of regression that the adjusted R-Square statistic
can reveal the best estimated relationship between the independent (number of servers) and
dependent (total cost) variables. These statistics are presented in Table 7.2. The best adjusted R-Square value (the largest) occurs with the quadratic model, followed by the cubic model. The more detailed supporting statistics for both of these models are presented in Table 7.3. The graph for all the SPSS curve-fitting models appears in Figure 7.4.

Table 7.2 Adjusted R-Square Values of All SPSS Models


Table 7.3 Quadratic and Cubic Model SPSS Statistics

Figure 7.4 Graph of all SPSS curve-fitting models


From Table 7.3, the resulting two statistically significant curve-fitted models follow:
Yp = 35417.772 − 5589.432 X + 268.445 X^2 [Quadratic model]
Yp = 36133.696 − 5954.738 X + 310.895 X^2 − 1.347 X^3 [Cubic model]
where Yp = the forecasted or predicted total cost, and
X = the number of computer servers.
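The same kind of quadratic fit can be reproduced outside SPSS with numpy's polynomial fitting; the server counts and costs below are hypothetical stand-ins for the 20 data points in Figure 7.2, so the fitted coefficients will differ from those reported above.

import numpy as np

# Hypothetical (number of servers, total cost) observations with a u-shaped pattern.
servers = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
cost    = np.array([30500, 26000, 22500, 20000, 18500,
                    18000, 18400, 19800, 22100, 25300])

# Fit a quadratic cost function: cost = b0 + b1*x + b2*x^2.
b2, b1, b0 = np.polyfit(servers, cost, deg=2)
print(f"cost ≈ {b0:.1f} + {b1:.1f}*x + {b2:.1f}*x^2")

# The prescriptive step: minimizing the quadratic gives x* = -b1 / (2*b2) servers.
print("Cost-minimizing number of servers ≈", -b1 / (2 * b2))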
UNIT-4
4.1……………………………………..FORECASTING TECHNIQUES
----------------------------------------------------------------------------------------------------------------

QUALITATIVE & JUDGMENTAL FORECASTING:


Qualitative and judgmental techniques rely on experience and intuition; they are
necessary when historical data are not available or when the decision maker needs to forecast
far into the future. Another use of judgmental methods is to incorporate nonquantitative
information, such as the impact of government regulations or competitor behavior, in a
quantitative forecast. Judgmental techniques range from such simple methods as a manager’s
opinion or a group-based jury of executive opinion to more structured approaches such as
historical analogy and the Delphi method.
(A) HISTORICAL ANALOGY:
One judgmental approach is historical analogy, in which a forecast is obtained
through a comparative analysis with a previous situation. For example, if a new
product is being introduced, the response of consumers to marketing campaigns
to similar, previous products can be used as a basis to predict how the new
marketing campaign might fare.
(B) DELPHI METHOD:
A popular judgmental forecasting approach, called the Delphi method,
uses a panel of experts, whose identities are typically kept confidential from
one another, to respond to a sequence of questionnaires. After each round of
responses, individual opinions, edited to ensure anonymity, are shared,
allowing each to see what the other experts think. The Delphi method
promotes unbiased exchanges of ideas and discussion and usually results in
some convergence of opinion. It is one of the better approaches to forecasting
long-range trends and impacts.
----------------------------------------------------------------------------------------------------------------

STATISTICAL FORECASTING MODELS:


Statistical time-series models find greater applicability for short-range forecasting problems. A time series is a stream of historical data, such as weekly sales. We characterize the values of a time series over T periods as At, t = 1, 2, ..., T. Time-series models assume that whatever forces have influenced sales in the recent past will continue into the near future; thus, forecasts are developed by extrapolating these data into the future. Time series generally have one or more of the following components: random behavior, trends, seasonal effects, or cyclical effects. Time series that do not have trend, seasonal, or cyclical effects but are relatively constant and exhibit only random behavior are called stationary time series.
Many forecasts are based on analysis of historical time-series data and are predicated
on the assumption that the future is an extrapolation of the past. A trend is a gradual upward
or downward movement of a time series over time.
Time series may also exhibit short-term seasonal effects (over a year, month, week, or
even a day) as well as longer-term cyclical effects, or nonlinear trends. A seasonal effect is one
that repeats at fixed intervals of time, typically a year, month, week, or day. At a neighborhood
grocery store, for instance, short-term seasonal patterns may occur over a week, with the
heaviest volume of customers on weekends; seasonal patterns may also be evident during the
course of a day, with higher volumes in the mornings and late afternoons. Figure 9.2 shows
seasonal changes in natural gas usage for a homeowner over the course of a year (Excel file
Gas & Electric). Cyclical effects describe ups and downs over a much longer time frame, such
as several years. Figure 9.3 shows a chart of the data in the Excel file Federal Funds Rates.
We see some evidence of long-term cycles in the time series driven by economic factors, such
as periods of inflation and recession.

Figure 9.2 Seasonal Effects in Natural Gas Usage

Figure 9.3 Cyclical Effects in Federal Funds Rates


Although visual inspection of a time series to identify trends, seasonal, or cyclical effects may
work in a naïve fashion, such unscientific approaches may be a bit unsettling to a manager making
important decisions. Subtle effects and interactions of seasonal and cyclical factors may not be evident
from simple visual extrapolation of data. Statistical methods, which involve more formal analyses of
time series, are invaluable in developing good forecasts. A variety of statistically-based forecasting
methods for time series are commonly used. Among the most popular are moving average methods,
exponential smoothing, and regression analysis. These can be implemented very easily on a spreadsheet using basic functions and Data Analysis tools available in Microsoft Excel, as well as with
more powerful software such as XLMiner. Moving average and exponential smoothing models work
best for time series that do not exhibit trends or seasonal factors. For time series that involve
trends and/or seasonal factors, other techniques have been developed. These include double moving
average and exponential smoothing models, seasonal additive and multiplicative models, and Holt-
Winters additive and multiplicative models.
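As a minimal illustration of the two simplest methods named above, the sketch below computes a three-period simple moving average forecast and a simple exponentially smoothed forecast for a short hypothetical stationary series; the figures and the smoothing constant are illustrative assumptions.

# Hypothetical weekly sales (a roughly stationary series).
sales = [102, 98, 105, 101, 99, 104, 100]

def moving_average_forecast(series, k=3):
    """Forecast the next period as the mean of the last k observations."""
    return sum(series[-k:]) / k

def exponential_smoothing_forecast(series, alpha=0.3):
    """Simple exponential smoothing: blend each new observation with the prior forecast."""
    forecast = series[0]
    for value in series[1:]:
        forecast = alpha * value + (1 - alpha) * forecast
    return forecast

print(moving_average_forecast(sales, k=3))           # (99 + 104 + 100) / 3 = 101.0
print(exponential_smoothing_forecast(sales, alpha=0.3))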

-----------------------------------------------------------------------------------------------------------
FORECASTING MODELS FOR TIME SERIES WITH LINEAR TREND:
For time series with a linear trend but no significant seasonal components, double
moving average and double exponential smoothing models are more appropriate than using
simple moving average or exponential smoothing models. Both methods are based on the linear
trend equation:
Ft+k = at + bt·k ................................ (9.6)

That is, the forecast for k periods into the future from period t is a function of a base
value at, also known as the level, and a trend, or slope, bt. Double moving average and double
exponential smoothing differ in how the data are used to arrive at appropriate values for at and
bt. Because the calculations are more complex than for simple moving average and exponential
smoothing models, it is easier to use forecasting software than to try to implement the models directly on a spreadsheet. Therefore, we do not discuss the theory or formulas underlying the
methods. XLMiner does not support a procedure for double moving average; however, it does
provide one for double exponential smoothing.
DOUBLE EXPONENTIAL SMOOTHING:
In double exponential smoothing, the estimates of at and bt are obtained from the following equations:
at = αAt + (1 − α)(at−1 + bt−1)
bt = β(at − at−1) + (1 − β)bt−1 ................................ (9.7)

In essence, we are smoothing both parameters of the linear trend model. From the first equation, the estimate of the level in period t is a weighted average of the observed value at time t and the predicted value at time t, at−1 + bt−1, based on simple exponential smoothing. For large values of α, more weight is placed on the observed value. Lower values of α put more weight on the smoothed predicted value. Similarly, from the second equation, the estimate of the trend in period t is a weighted average of the differences in the estimated levels in periods t and t − 1 and the estimate of the trend in period t − 1.
Larger values of β place more weight on the differences in the levels, but lower values of β put more emphasis on the previous estimate of the trend. Initial values are chosen for a1 as A1 and b1 as A2 − A1. Equations (9.7) must then be used to compute at and bt for the entire time series to be able to generate forecasts into the future. As with simple exponential smoothing, we are free to choose the values of α and β. However, it is easier to let XLMiner optimize these values using historical data.
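A hedged Python sketch of equations (9.7), assuming a short hypothetical trending series and fixed (not optimized) values of the smoothing constants.

def double_exponential_smoothing(series, alpha=0.5, beta=0.5, k=1):
    """Double exponential smoothing; returns the forecast k periods ahead."""
    level, trend = series[0], series[1] - series[0]   # a1 = A1, b1 = A2 - A1
    for value in series[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend * k                           # F(t+k) = at + bt*k

# Hypothetical series with an upward linear trend.
sales = [100, 104, 109, 113, 118, 122]
print(double_exponential_smoothing(sales, alpha=0.5, beta=0.5, k=1))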
-------------------------------------------------------------------------------------------------------------
FORECASTING TIME SERIES WITH SEASONALITY:
Quite often, time-series data exhibit seasonality, especially on an annual basis. When
time series exhibit seasonality, techniques that explicitly account for the seasonal pattern
provide better forecasts than simple smoothing models.
(A) REGRESSION-BASED SEASONAL FORECASTING MODELS:
One approach is to use linear regression. Multiple linear regression models with
categorical variables can be used for time series with seasonality. To do this, we use
dummy categorical variables for the seasonal components (a small sketch appears after item (B) below).
(B) HOLT-WINTERS FORECASTING FOR SEASONAL FORECASTING:
These methods are based on the work of two researchers, C.C. Holt, who developed the basic
approach, and P.R. Winters, who extended Holt's work. Hence, these approaches are commonly
referred to as Holt-Winters models. Holt-Winters models are similar to exponential
smoothing models in that smoothing constants are used to smooth out variations in the
level and seasonal patterns over time. For time series with seasonality but no trend,
XLMiner supports a Holt-Winters method but does not have the ability to optimize the
parameters.
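As referenced in item (A) above, a seasonal regression model uses 0/1 dummy variables for all but one season. The sketch below uses numpy's least-squares routine in place of Excel's Regression tool; the quarterly sales figures are hypothetical and serve only to show how the design matrix is built.

```python
import numpy as np

# Sketch: multiple regression with a trend term and dummy variables for quarters 2-4.
# Quarter 1 is the baseline season; the data values are hypothetical.

sales = np.array([120, 180, 150, 210, 130, 195, 160, 225], dtype=float)  # 2 years, quarterly
t = np.arange(1, len(sales) + 1)                 # time index 1..8
quarter = (t - 1) % 4 + 1                        # 1,2,3,4,1,2,3,4

X = np.column_stack([
    np.ones_like(t, dtype=float),                # intercept
    t.astype(float),                             # linear trend
    (quarter == 2).astype(float),                # dummy: quarter 2
    (quarter == 3).astype(float),                # dummy: quarter 3
    (quarter == 4).astype(float),                # dummy: quarter 4
])

coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
b0, b1, d2, d3, d4 = coef

# Forecast period 9 (a quarter-1 period): only the intercept and trend apply.
forecast_q1 = b0 + b1 * 9
print(coef, forecast_q1)
```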
HOLT-WINTERS MODELS FOR FORECASTING TIME SERIES WITH
SEASONALITY AND TREND:
Many time series exhibit both trend and seasonality. Such might be the case for growing
sales of a seasonal product. These models combine elements of both the trend and seasonal
models. Two types of Holt-Winters smoothing models are often used.

Holt-Winters additive model is based on the equation


F_{t+1} = a_t + b_t + S_{t-s+1} (9.8)

and the Holt-Winters multiplicative model is

F_{t+1} = (a_t + b_t) S_{t-s+1} (9.9)


The additive model applies to time series with relatively stable seasonality, whereas
the multiplicative model applies to time series whose amplitude increases or decreases over
time. Therefore, a chart of the time series should be viewed first to identify the appropriate
type of model to use. Three parameters, α, β, and γ, are used to smooth the level, trend, and
seasonal factors in the time series. XLMiner supports both models.
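These notes assume XLMiner for Holt-Winters smoothing. Purely as an alternative sketch, the statsmodels library provides a Holt-Winters implementation in which the smoothing parameters α, β, and γ are optimized from the data; the quarterly series below is hypothetical.

```python
# Sketch: Holt-Winters seasonal smoothing fit with statsmodels (an alternative to XLMiner).
# trend="add" smooths the trend b_t; seasonal="mul" would give the multiplicative model (9.9).
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical three years of quarterly sales exhibiting both trend and seasonality.
sales = pd.Series([110, 150, 130, 170, 125, 168, 148, 190, 142, 186, 165, 212])

fit = ExponentialSmoothing(
    sales,
    trend="add",          # additive trend component
    seasonal="add",       # additive seasonal factors; use "mul" for the multiplicative model
    seasonal_periods=4,   # quarterly seasonality
).fit()                   # alpha, beta, gamma are chosen by optimization over the data

print(fit.params)         # fitted smoothing parameters
print(fit.forecast(4))    # forecasts for the next four quarters
```

A chart of the series should still be inspected first, as noted above, to decide between the additive and multiplicative forms.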
-------------------------------------------------------------------------------------------------------------

REGRESSION FORECASTING WITH CAUSAL VARIABLES:


In many forecasting applications, other independent variables besides time, such as economic
indexes or demographic factors, may influence the time series. For example, a manufacturer
of hospital equipment might include such variables as hospital capital spending and
changes in the proportion of people over the age of 65 in building models to forecast future
sales. Explanatory/causal models, often called econometric models, seek to identify factors
that explain statistically the patterns observed in the variable being forecast, usually with
regression analysis. We will use a simple example of forecasting gasoline sales to illustrate
econometric modeling.
FORECASTING GASOLINE SALES USING A SIMPLE REGRESSION MODEL:
Figure 9.27 shows gasoline sales over 10 weeks during June through August, along with the
average price per gallon, and a chart of the gasoline sales time series with a fitted trendline
(Excel file Gasoline Sales). During the summer months, it is not unusual to see an increase in
sales as more people go on vacations. The chart shows a linear trend, although R² is not very
high. The trendline is:
sales = 4,790.1 + 812.99 × week
Using this model, we would predict sales for week 11 as
sales = 4,790.1 + 812.99(11) = 13,733 gallons

FIG. 9.27. Gasoline Sales Data and Trendline

Examining the gasoline sales data, we also see that the average price per gallon changes each week, and
this may influence consumer sales. Therefore, the sales trend might not simply be a factor of steadily
increasing demand, but it might also be influenced by the average price per gallon. The average price
per gallon can be considered as a causal variable. Multiple linear regression provides a technique for
building forecasting models that incorporate not only time but other potential causal variables as well.

INCORPORATING CAUSAL VARIABLES IN A REGRESSION FORECASTING MODEL:
For the gasoline sales data, we can incorporate the price per gallon by using two
independent variables. This results in the multiple regression model
sales = β0 + β1 × week + β2 × price/gallon
The results are shown in Figure 9.28, and the regression model is
sales = 72,333.08 + 508.67 × week - 16,463.2 × price/gallon
Notice that the R² value is higher when both variables are included, explaining more
than 86% of the variation in the data. If the company estimates that the average price
for the next week will drop to $3.80, the model would forecast the sales for week 11
as sales = 72,333.08 + 508.67(11) - 16,463.2(3.80) = 15,368 gallons
FIG. 9.28. Regression Results for Gasoline Sales
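The two fitted equations above can be checked with a few lines of Python. The coefficient values are taken directly from the text; only the function names are illustrative.

```python
# Sketch: evaluating the two fitted regression models reported above for week 11.

def trend_only_forecast(week):
    """Simple trendline from the text: sales = 4,790.1 + 812.99 * week."""
    return 4790.1 + 812.99 * week

def trend_plus_price_forecast(week, price_per_gallon):
    """Multiple regression from the text that adds the causal price variable."""
    return 72333.08 + 508.67 * week - 16463.2 * price_per_gallon

print(round(trend_only_forecast(11)))                # about 13,733 gallons
print(round(trend_plus_price_forecast(11, 3.80)))    # about 15,368 gallons
```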

----------------------------------------------------------------------------------------------------------------

SELECTING APPROPRIATE FORECASTING MODELS:


Table 9.1 summarizes the choice of forecasting approaches that can be implemented
by XLMiner based on characteristics of the time series.

TABLE 9.1. Forecasting Model Choice

-------------------------------------------------------------------------------------------- ----------------
4.2………………MONTE CARLO SIMULATION & RISK ANALYSIS:
MONTE CARLO SIMULATION USING ANALYTIC SOLVER PLATFORM:
To use Analytic Solver Platform, you must perform the following steps:

(1) Develop the spreadsheet model.


(2) Determine the probability distributions that describe the uncertain inputs in your
model.
(3) Identify the output variables that you wish to predict.
(4) Set the number of trials or repetitions for the simulation.
(5) Run the simulation.
(6) Interpret the results.

 DEFINING UNCERTAIN MODEL INPUTS:


When model inputs are uncertain, we need to characterize them by some probability
distribution. For many decision models, empirical data may be available, either in historical records
or collected through special efforts. For example, maintenance records might provide data on
machine failure rates and repair times, or observers might collect data on service times in a bank or
post office. This provides a factual basis for choosing the appropriate probability distribution to
model the input variable.
There are two ways to define uncertain variables in Analytic Solver Platform. One is to use
the custom Excel functions for generating random samples from probability distributions. The
second way to define an uncertain variable is to use the Distributions button in the Analytic Solver
Platform ribbon. First, select the cell in the spreadsheet for which you want to define a distribution.
Click on the Distributions button as shown in Figure 12.3. Choose a distribution from one of the
categories in the list that pops up. This will display a dialog in which you may define the parameters
of the distribution.

FIG. 12.3. Analytic Solver Platform Distributions Options


 DEFINING OUTPUT CELLS:
To define a cell you wish to predict and create a distribution of output values
from your model (which Analytic Solver Platform calls an uncertain function cell),
first select it, and then click on the Results button in the Simulation Model group in the
Analytic Solver Platform ribbon. Choose the Output option and then In Cell.

 RUNNING A SIMULATION:
To run a simulation, first click on the Options button in the Options group in the
Analytic Solver Platform ribbon. This displays a dialog (see Figure 12.7) in which you can
specify the number of trials and other options to run the simulation (make sure the Simulation
tab is selected). Trials per Simulation allows you to choose the number of times that Analytic
Solver Platform will generate random values for the uncertain cells in the model and recalculate
the entire spreadsheet. Because Monte Carlo simulation is essentially statistical sampling, the
larger the number of trials you use, the more precise will be the result. Unless the model is
extremely complex, a large number of trials will not unduly tax today’s computers, so we
recommend that you use at least 5,000 trials (the educational version restricts this to a maximum
of 10,000 trials). You should use a larger number of trials as the number of uncertain cells in
your model increases so that the simulation can generate representative samples from all
distributions for assumptions. You may run more than one simulation if you wish to examine
the variability in the results.

FIG. 12.7. Analytic Solver Platform Options Dialog

The procedure that Analytic Solver Platform uses generates a stream of random numbers
from which the values of the uncertain inputs are selected from their probability distributions.
A random number seed option controls this stream and therefore the assumption values that are
generated; as long as you use the same seed number, the assumptions generated will be the
same for all simulations.
Analytic Solver Platform has alternative sampling methods; the two most common
are Monte Carlo and Latin Hypercube sampling. Monte Carlo sampling selects random variates
independently over the entire range of possible values of the distribution. With Latin
Hypercube sampling, the uncertain variable's probability distribution is divided into intervals
of equal probability, and a value is generated randomly within each interval. Latin Hypercube
sampling results in a more even distribution of output values because it samples the entire
range of the distribution in a more consistent manner, thus achieving more accurate forecast
statistics (particularly the mean) for a fixed number of Monte Carlo trials. However, Monte
Carlo sampling is more representative of reality and should be used if you are interested in
evaluating the model performance under various what-if scenarios. Unless you are an advanced
user, we recommend leaving the other options at their default values.
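Outside of Analytic Solver Platform, the difference between the two sampling methods can be sketched with numpy and scipy; the normal cost distribution, its parameters, and the trial count below are illustrative assumptions. Plain Monte Carlo draws each trial independently, while Latin Hypercube sampling stratifies the unit interval and maps the draws through the distribution's inverse CDF.

```python
# Sketch: Monte Carlo vs. Latin Hypercube sampling of a normally distributed input.
# The unit-cost distribution and trial count are illustrative assumptions.
import numpy as np
from scipy import stats
from scipy.stats.qmc import LatinHypercube

trials = 5000
rng = np.random.default_rng(seed=1)            # fixed seed: same assumption values every run

# Plain Monte Carlo: independent draws over the entire distribution.
mc_cost = rng.normal(loc=20, scale=3, size=trials)

# Latin Hypercube: stratified uniform draws mapped through the normal inverse CDF.
lhs_uniform = LatinHypercube(d=1, seed=1).random(trials).ravel()
lhs_cost = stats.norm.ppf(lhs_uniform, loc=20, scale=3)

# Compare how well each sample reproduces the true mean of 20.
print("MC mean: ", mc_cost.mean())
print("LHS mean:", lhs_cost.mean())
```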
The last step is to run the simulation by clicking the Simulate button in the Solve Action
group. When the simulation finishes, you will see a message “Simulation finished successfully”
in the lower-left corner of the Excel window.

 VIEWING & ANALYZING RESULTS:


You may specify whether you want output charts to automatically appear after a
simulation is run by clicking the Options button in the Analytic Solver Platform ribbon, and
either checking or unchecking the box Show charts after simulation in the Charts tab. You may
also view the results of the simulation at any time by double-clicking on an output cell that
contains the PsiOutput() function or by choosing Simulation from the Reports button in the
Analysis group in the Analytic Solver Platform ribbon. This displays a window with various
tabs showing different charts to analyze results.

--TOPICS TO REFER TO IN CHAPTER 11 FROM THE JAMES EVANS BOOK-----


-------------------------------

PART-2:
(1) NEW PRODUCT DEVELOPMENT MODEL -----PG.NO. 414
(2) NEWS VENDOR MODEL---------------------------------------421
(3) OVERBOOKING MODEL---------------------------------------424
(4) CASH BUDGET MODEL----------------------------------------426
----------------------------------------------------------------------------------------------------------------
UNIT-5
5.1………………………………………………DECISION ANALYSIS
FORMULATING DECISION PROBLEMS:
Many decisions involve a choice from among a small set of alternatives with uncertain
consequences. We may formulate such decision problems by defining three things:
1. the decision alternatives that can be chosen,
2. the uncertain events that may occur after a decision is made along with their possible
outcomes, and
3. the consequences associated with each decision and outcome, which are usually
expressed as payoffs.
The outcomes associated with uncertain events (which are often called states of nature)
are defined so that one and only one of them will occur. They may be quantitative or qualitative.
For instance, in selecting the size of a new factory, the future demand for the product would be
an uncertain event. The demand outcomes might be expressed quantitatively in sales units or
dollars. On the other hand, suppose that you are planning a spring-break vacation to Florida in
January; you might define an uncertain event as the weather; these outcomes might be
characterized qualitatively: sunny and warm, sunny and cold, rainy and warm, rainy and cold,
and so on. A payoff is a measure of the value of making a decision and having a particular
outcome occur. This might be a simple estimate made judgmentally or a value computed from
a complex spreadsheet model. Payoffs are often summarized in a payoff table, a matrix whose
rows correspond to decisions and whose columns correspond to events. The decision maker
first selects a decision alternative, after which one of the outcomes of the uncertain event occurs,
resulting in the payoff.
----------------------------------------------------------------------------------------------------------------
DECISION STRATEGIES WITHOUT OUTCOME PROBABILITIES:
 DECISION STRATEGIES FOR A MINIMIZE OBJECTIVE:
Aggressive (Optimistic) Strategy An aggressive decision maker might seek the option
that holds the promise of minimizing the potential loss. For a minimization objective, this
strategy is also often called a minimin strategy; that is, we choose the decision that minimizes
the minimum payoff that can occur among all outcomes for each decision. Aggressive decision
makers are often called speculators, particularly in financial arenas, because they increase their
exposure to risk in hopes of increasing their return; while a few may be lucky, most will not do
very well.
Conservative (Pessimistic) Strategy A conservative decision maker, on the other
hand, might take a more-pessimistic attitude and ask, “What is the worst thing that might
result from my decision?” and then select the decision that represents the “best of the worst.”
Such a strategy is also known as a minimax strategy because we seek the decision that
minimizes the largest payoff that can occur among all outcomes for each decision.
Conservative decision makers are willing to forgo high returns to avoid undesirable losses.
This rule typically models the rational behavior of most individuals.
Opportunity-Loss Strategy A third approach that underlies decision choices for many
individuals is to consider the opportunity loss associated with a decision. Opportunity loss
represents the “regret” that people often feel after making a nonoptimal decision (I should have
bought that stock years ago!). In general, the opportunity loss associated with any decision and
event is the absolute difference between the payoff of the best decision for that particular outcome
and the payoff for the decision that was chosen. Opportunity losses can be only nonnegative values.
If you get a negative number, then you made a mistake. Once opportunity losses are computed,
the decision strategy is similar to a conservative strategy. The decision maker would select the
decision that minimizes the largest opportunity loss among all outcomes for each decision. For
these reasons, this is also called a minimax regret strategy.

 DECISION STRATEGIES FOR MAXIMIZE OBJECTIVE:


When the objective is to maximize the payoff, we can still apply aggressive, conservative,
and opportunity loss strategies, but we must make some key changes in the analysis.
(1) For the aggressive strategy, the best payoff for each decision would be the largest
value among all outcomes, and we would choose the decision corresponding to the
largest of these, called a maximax strategy.
(2) For the conservative strategy, the worst payoff for each decision would be the
smallest value among all outcomes, and we would choose the decision
corresponding to the largest of these, called a maximin strategy.
(3) For the opportunity-loss strategy, we need to be careful in calculating the opportunity
losses. With a maximize objective, the decision with the largest value for a particular event
has an opportunity loss of zero. The opportunity losses associated with other decisions are
the absolute difference between their payoff and the largest value. The actual decision is
the same as when payoffs are costs: Choose the decision that minimizes the maximum
opportunity loss.

 DECISION WITH CONFLICTING OBJECTIVES:


Many decisions require some type of tradeoff among conflicting objectives, such as risk versus
reward. A simple decision rule can be used whenever one wishes to make an optimal tradeoff between
any two conflicting objectives, one of which is good, and one of which is bad, that maximizes the ratio
of the good objective to the bad. First, display the tradeoffs on a chart with the “good” objective on
the x-axis, and the “bad” objective on the y-axis, making sure to scale the axes properly to display the
origin (0,0). Then graph the tangent line to the tradeoff curve that goes through the origin. The point at
which the tangent line touches the curve (which represents the smallest slope) represents the best return
to risk tradeoff.
TABLE 16.1. Summary of Decision Strategies Under Uncertainty

Minimize objective:
Aggressive strategy: Find the smallest payoff for each decision among all outcomes, and
choose the decision with the smallest of these (minimin).
Conservative strategy: Find the largest payoff for each decision among all outcomes, and
choose the decision with the smallest of these (minimax).
Opportunity-loss strategy: For each outcome, compute the opportunity loss for each decision
as the absolute difference between its payoff and the smallest payoff for that outcome. Find
the maximum opportunity loss for each decision, and choose the decision with the smallest
opportunity loss (minimax regret).

Maximize objective:
Aggressive strategy: Find the largest payoff for each decision among all outcomes, and
choose the decision with the largest of these (maximax).
Conservative strategy: Find the smallest payoff for each decision among all outcomes, and
choose the decision with the largest of these (maximin).
Opportunity-loss strategy: For each outcome, compute the opportunity loss for each decision
as the absolute difference between its payoff and the largest payoff for that outcome. Find
the maximum opportunity loss for each decision, and choose the decision with the smallest
opportunity loss (minimax regret).

(When outcome probabilities are available, one may instead choose the decision with the
largest average payoff; see the next section.)
Table 16.1 summarizes the decision rules for both minimize and maximize objectives.
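A compact Python sketch of the three strategies in Table 16.1 for a maximize objective is shown below; the payoff table is invented purely for illustration.

```python
# Sketch: maximax, maximin, and minimax-regret decisions for a maximize objective.
# The payoff table (rows = decisions, columns = outcomes) is hypothetical.

payoffs = {
    "Small plant":  [50, 60, 70],
    "Medium plant": [30, 80, 110],
    "Large plant":  [-20, 60, 160],
}

# Aggressive (maximax): best of the best payoffs.
maximax = max(payoffs, key=lambda d: max(payoffs[d]))

# Conservative (maximin): best of the worst payoffs.
maximin = max(payoffs, key=lambda d: min(payoffs[d]))

# Opportunity loss: for each outcome, regret = best payoff for that outcome minus the payoff.
n_outcomes = len(next(iter(payoffs.values())))
best_per_outcome = [max(payoffs[d][j] for d in payoffs) for j in range(n_outcomes)]
max_regret = {d: max(best_per_outcome[j] - payoffs[d][j] for j in range(n_outcomes))
              for d in payoffs}
minimax_regret = min(max_regret, key=max_regret.get)

print(maximax, maximin, minimax_regret)   # the three strategies can disagree
```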

--------------------------------------------------------------------------------------------------------------------------------------

DECISION STRATEGIES WITH OUTCOME PROBABILITIES:


The aggressive, conservative, and opportunity-loss strategies assume no knowledge of the probabilities
associated with future outcomes.

 AVERAGE PAYOFF STRATEGY:


If we can assess a probability for each outcome, we can choose the best decision based
on the expected value. For any decision, the expected value is the summation of
the payoffs multiplied by their probability, summed over all outcomes. The simplest case is to
assume that each outcome is equally likely to occur; that is, the probability of each outcome is
simply 1/N, where N is the number of possible outcomes. This is called the average payoff
strategy.

 EXPECTED VALUE STRATEGY:


A more general case of the average payoff strategy is when the probabilities of the
outcomes are not all the same. This is called the expected value strategy.
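A minimal sketch of the expected value strategy (which reduces to the average payoff strategy when all probabilities are equal); the payoffs and probabilities below are assumed for illustration.

```python
# Sketch: expected value strategy over a hypothetical payoff table.
payoffs = {"Bank CD": [400, 400, 400], "Bond fund": [100, 600, 900], "Stock fund": [-500, 700, 2000]}
probabilities = [0.5, 0.3, 0.2]   # assumed probabilities of the three outcomes

expected_value = {d: sum(p * v for p, v in zip(probabilities, payoffs[d])) for d in payoffs}
best = max(expected_value, key=expected_value.get)
print(expected_value, best)
```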

 EVALUATING RISK:
An implicit assumption in using the average payoff or expected value strategy is that
the decision is repeated a large number of times.
----------------------------------------------------------------------------------------------------------------
DECISION TREES:
A useful approach to structuring a decision problem involving uncertainty is to use a
graphical model called a decision tree. Decision trees consist of a set of nodes and branches.
Nodes are points in time at which events take place. The event can be a selection of a decision
from among several alternatives, represented by a decision node, or an outcome over which
the decision maker has no control, an event node. Event nodes are conventionally depicted
by circles, and decision nodes are expressed by squares. Branches are associated with
decisions and events. Many decision makers find decision trees useful because sequences of
decisions and outcomes over time can be modeled easily.
Decision trees may be created in Excel using Analytic Solver Platform. Click the
Decision Tree button. To add a node, select Add Node from the Node drop down list, as shown
in Figure 16.2. Click on the radio button for the type of node you wish to create (decision or
event). This displays one of the dialogs shown in Figure 16.3. For a decision node, enter
the name of the node and names of the branches that emanate from the node (you may also
add additional ones). The Value field can be used to input cash flows, costs, or revenues that
result from choosing a particular branch. For an event node, enter the name of the node and
branches. The Chance field allows you to enter the probabilities of the events.

FIG. 16.2. Decision Tree Menu in Analytic Solver Platform

FIG. 16.3. Decision Tree Dialogs for Decisions and Events
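The rollback logic that Analytic Solver Platform performs on a decision tree can be sketched by hand for a tiny tree. Every probability and cash flow below is a hypothetical value, used only to illustrate evaluating event nodes by expected value and decision nodes by choosing the better branch.

```python
# Sketch: expected-value rollback of a tiny decision tree (all numbers are hypothetical).
# A firm decides whether to launch a product; if launched, demand is either high or low.

p_high, p_low = 0.4, 0.6
payoff_high, payoff_low = 500_000, -200_000   # net payoffs after launch costs
payoff_no_launch = 0

# Event node: expected value of launching.
ev_launch = p_high * payoff_high + p_low * payoff_low

# Decision node: choose the branch with the larger expected value.
best_choice = "Launch" if ev_launch > payoff_no_launch else "Do not launch"
print(ev_launch, best_choice)
```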


DECISION TREES AND MONTE CARLO SIMULATION:
Because all computations use Excel formulas, you could easily perform what-if
analysis or create data tables to analyze changes in the assumptions of the model. One of
the interesting features of decision trees in Analytic Solver Platform is that you can also
use the Excel model to develop a Monte Carlo simulation or an optimization model using
the decision tree.

DECISION TREES AND RISK:


The decision tree approach is an example of expected value decision making. Thus, in
the drug-development example, if the company’s portfolio of drug-development projects has
similar characteristics, then pursuing further development is justified on an expected value
basis. However, this approach does not explicitly consider risk.
Each decision strategy has an associated payoff distribution, called a risk profile.
Risk profiles show the possible payoff values that can occur and their probabilities.
SENSITIVITY ANALYSIS IN DECISION TREES:
We may use Excel data tables to investigate the sensitivity of the optimal decision to changes in
probabilities or payoff values.
---------------------------------------------------------------------------------------------------------------------------

VALUE OF INFORMATION:
When we deal with uncertain outcomes, it is logical to try to obtain better information
about their likelihood of occurrence before making a decision. The value of information
represents the improvement in the expected return that can be achieved if the decision maker
is able to acquire, before making a decision, additional information about the future event that will
take place. In the ideal case, we would like to have perfect information, which tells us with
certainty what outcome will occur. Although this will never occur, it is useful to know the value
of perfect information because it provides an upper bound on the value of any information that
we may acquire. The expected value of perfect information (EVPI) is the expected value
with perfect information (assumed at no cost) minus the expected value without any
information; again, it represents the most you should be willing to pay for perfect information.
The expected opportunity loss represents the average additional amount the decision
maker would have achieved by making the right decision instead of a wrong one. To find the
expected opportunity loss, we create an opportunity-loss table, as discussed earlier in this
chapter, and then find the expected value for each decision. It will always be true that the
decision having the best expected value will also have the minimum expected opportunity loss.
The minimum expected opportunity loss is the EVPI.
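A short sketch of the EVPI computation described above, using an invented payoff table and assumed outcome probabilities.

```python
# Sketch: expected value of perfect information (EVPI) for a maximize objective.
payoffs = {"Small": [50, 60, 70], "Medium": [30, 80, 110], "Large": [-20, 60, 160]}
probs = [0.3, 0.4, 0.3]   # assumed outcome probabilities

# Expected value without information: best decision using expected payoffs.
ev = {d: sum(p * v for p, v in zip(probs, payoffs[d])) for d in payoffs}
ev_best = max(ev.values())

# Expected value with perfect information: pick the best decision for each outcome.
ev_perfect = sum(p * max(payoffs[d][j] for d in payoffs) for j, p in enumerate(probs))

print(ev_perfect - ev_best)   # EVPI, which also equals the minimum expected opportunity loss
```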

 DECISIONS WITH SAMPLE INFORMATION:


Sample information is the result of conducting some type of experiment, such as
a market research study or interviewing an expert. Sample information is always
imperfect. Often, sample information comes at a cost. Thus, it is useful to know
how much we should be willing to pay for it. The expected value of sample
information (EVSI) is the expected value with sample information (assumed at no
cost) minus the expected value without sample information; it represents the most
you should be willing to pay for the sample information.
BAYES' RULE:
Bayes' rule extends the concept of conditional probability to revise historical
probabilities based on sample information. Suppose that A1, A2, …, Ak is a set of mutually
exclusive and collectively exhaustive events, and we seek the probability that some event Ai
occurs given that another event B has occurred. Bayes' rule is stated as follows:
P(Ai | B) = P(B | Ai) P(Ai) / [P(B | A1) P(A1) + P(B | A2) P(A2) + … + P(B | Ak) P(Ak)]

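A small numeric sketch of this revision process; the prior probabilities and survey reliabilities below are assumptions chosen only for illustration.

```python
# Sketch: revising prior probabilities with Bayes' rule (all probabilities are hypothetical).
prior = {"High demand": 0.6, "Low demand": 0.4}
p_favorable_given = {"High demand": 0.8, "Low demand": 0.3}   # P(favorable survey | state)

p_favorable = sum(p_favorable_given[a] * prior[a] for a in prior)          # denominator
posterior = {a: p_favorable_given[a] * prior[a] / p_favorable for a in prior}

print(p_favorable)   # 0.6
print(posterior)     # high demand revised up to 0.8, low demand down to 0.2
```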
----------------------------------------------------------------------------------------------------------------

UTILITY & DECISION MAKING:


An approach for assessing risk attitudes quantitatively is called utility theory.
This approach quantifies a decision maker’s relative preferences for particular outcomes. We
can determine an individual’s utility function by posing a series of decision scenarios.

 CONSTRUCTING A UTILITY FUNCTION:


A utility function may be used instead of the actual monetary payoffs in a decision analysis
by simply replacing the payoffs with their equivalent utilities and then computing expected
values. The expected utilities and the corresponding optimal decision strategy then reflect the
decision maker's preferences toward risk. For example, if we use the average payoff strategy
(because no probabilities of events are given) for the data in Table 16.2, the best decision would
be to choose the stock fund. However, if we replace the payoffs in Table 16.2 with the (risk-averse)
utilities that we defined and again use the average payoff strategy, the best decision
would be to choose the bank CD as opposed to the stock fund, as shown in the following table.

Decision/Event Rates Rise Rates Stable Rates Fall Average Utility


Bank CD 0.75 0.75 0.75 0.75
Bond fund 0.35 0.85 0.9 0.70
Stock fund 0 0.80 1.0 0.60

 EXPONENTIAL UTILITY FUNCTION:


If assessments of event probabilities are available, these can be used to compute the expected
utility and identify the best decision. It can be rather difficult to compute a utility function,
especially for situations involving a large number of payoffs. Because most decision makers
typically are risk averse, we may use an exponential utility function to approximate the true
utility function.
The exponential utility function is

U(x) = 1 - e^(-x/R) (16.2)


where e is the base of the natural logarithm (2.71828 …) and
R is a shape parameter that is a measure of risk tolerance.
Figure 16.14 shows several examples of U(x) for different values of R. Notice that all
these functions are concave and that as R increases, the functions become flatter, indicating
more tendency toward risk neutrality.
One approach to estimating a reasonable value of R is to find the maximum payoff
$R for which the decision maker is willing to take an equal chance on winning $R or losing
$R/2. The smaller the value of R, the more risk averse is the individual. For instance, would
you take a bet on winning $10 versus losing $5? How about winning $10,000 versus losing
$5,000? Most people probably would not worry about taking the first gamble but might
definitely think twice about the second. Finding one’s maximum comfort level establishes the
utility function.
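As a quick numerical sketch of equation (16.2), assume a risk tolerance of R = 10,000 (an illustrative value, not one given in the notes) and evaluate the two equal-chance bets just described. The small bet has positive expected utility, while the $10,000 versus $5,000 bet sits almost exactly at the indifference point, which matches the definition of R above.

```python
# Sketch: exponential utility U(x) = 1 - exp(-x/R) applied to the two bets described above.
# R = 10,000 is an assumed risk tolerance, not a value given in the notes.
import math

def utility(x, R=10_000):
    return 1 - math.exp(-x / R)

def expected_utility(win, lose, R=10_000):
    """Equal chance (0.5 / 0.5) of winning `win` or losing `lose`."""
    return 0.5 * utility(win, R) + 0.5 * utility(-lose, R)

print(expected_utility(10, 5))          # small bet: slightly positive, so take it
print(expected_utility(10_000, 5_000))  # large bet: roughly zero, the indifference point
```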

Fig. 16.14. Examples of Exponential Utility Functions
