BA Notes
1 BUSINESS ANALYTICS:
USAGE of B.A: Leading banks use analytics to predict and prevent credit fraud. Manufacturers
use analytics for production planning, purchasing, and inventory management. Retailers use
analytics to recommend products to customers and optimize marketing promotions.
Pharmaceutical firms use it to get life-saving drugs to market more quickly. Airlines and hotels use
analytics to dynamically set prices over time to maximize revenue. Even sports teams are using
business analytics to determine both game strategy and optimal ticket prices.
One of the emerging applications of analytics is helping businesses learn from social media and
exploit social media data for strategic advantage. Using analytics, firms can integrate social media
data with traditional data sources such as customer surveys, focus groups, and sales data;
understand trends and customer perceptions of their products; and create informative reports to
assist marketing managers and product designers.
The evolution of analytics began with the introduction of computers in the late 1940s and their
development through the 1960s and beyond. Early computers provided the ability to store and
analyze data in ways that were very difficult or impossible to do manually. This facilitated
the collection, management, analysis, and reporting of data, which is often called business
intelligence (BI), a term that was coined in 1958 by an IBM researcher, Hans Peter Luhn.
Using BI, we can create simple rules to flag exceptions automatically. BI has since evolved into the
modern discipline we now call information systems (IS).
STATISTICS: Statistical methods allow us to gain a richer understanding of data that goes
beyond business intelligence reporting by not only summarizing data succinctly but also finding
unknown and interesting relationships among the data. Statistical methods include the basic tools
of description, exploration, estimation, and inference, as well as more advanced techniques like
regression, forecasting, and data mining. Many OR/MS (operations research/management
science) applications use modeling and optimization—techniques for translating real problems into
mathematics, spreadsheets, or other computer languages, and using them to find the best
(“optimal”) solutions and decisions.
Decision support systems (DSS) began to evolve in the 1960s by combining business
intelligence concepts with OR/MS models to create analytical-based computer systems to support
decision making.
DSSs include three components:
1. Data management. The data management component includes databases for storing data and
allows the user to input, retrieve, update, and manipulate data.
2. Model management. The model management component consists of various statistical tools
and management science models and allows the user to easily build, manipulate, analyze, and solve
models.
3. Communication system. The communication system component provides the interface
necessary for the user to interact with the data and model management components. DSSs have
been used for many applications, including pension fund management, portfolio management,
work-shift scheduling, global manufacturing and facility location, advertising-budget allocation,
media planning, distribution planning, airline operations planning, inventory control, library
management, classroom assignment, nurse scheduling, blood distribution, water pollution control,
ski-area design, police-beat design, and energy planning.
Data mining is focused on better understanding characteristics and patterns among variables in
large databases using a variety of statistical and analytical tools. Many standard statistical tools as
well as more advanced ones are used extensively in data mining. Simulation and risk analysis rely
on spreadsheet models and statistical analysis to examine the impacts of uncertainty in the
estimates and their potential interaction with one another on the output variable of interest.
Spreadsheets and formal models allow one to manipulate data to perform what-if analysis—how
specific combinations of inputs that reflect key assumptions will affect model outputs. What-if
analysis is also used to assess the sensitivity of optimization models to changes in data inputs and
provide better insight for making good decisions.
Visualization:
Visualizing data and results of analyses provides a way of easily communicating data at all
levels of a business and can reveal surprising patterns and relationships.
Although many good analytics software packages are available to professionals, we use Microsoft
Excel and a powerful add-in called Analytic Solver Platform.
---------------------------------------------------------------------------------------------------------------------
Software Support:
Many companies, such as IBM, SAS, and Tableau have developed a variety of software
and hardware solutions to support business analytics. For example, IBM’s Cognos Express, an
integrated business intelligence and planning solution designed to meet the needs of midsize
companies, provides reporting, analysis, dashboard, scorecard, planning, budgeting, and
forecasting capabilities. It’s made up of several modules, including Cognos Express Reporter, for
self-service reporting and ad hoc query; Cognos Express Advisor, for analysis and visualization;
and Cognos Express Xcelerator, for Excel-based planning and business analysis. Information is
presented to the business user in a business context that makes it easy to understand; with an
easy-to-use interface, users can quickly gain the insight they need from their data to make the right
decisions and then take action for effective and efficient business optimization and outcomes. SAS
provides a variety of software that integrate data management, business intelligence, and analytics
tools. SAS Analytics covers a wide range of capabilities, including predictive modeling and data
mining, visualization, forecasting, optimization and model management, statistical analysis, text
analytics, and more. Tableau Software provides simple drag and drop tools for visualizing data
from spreadsheets and other databases.
BUSINESS ANALYTICS PROCESS:
A single or multiple regression model can often forecast a trend line into the future. When
regression is not practical, other forecasting methods (exponential smoothing, moving
averages) can be applied as predictive analytics to develop needed forecasts of business trends.
The identification of future trends is the main output of Step 2 and the predictive
analytics used to find them. This helps answer the question of what will happen. If a firm
knows where the future lies by forecasting trends, as it would in Step 2 of the BA process,
it can then take advantage of any possible opportunities predicted in that future state. In Step
3, prescriptive analytics, operations research methodologies can be used to
optimally allocate a firm’s limited resources to take best advantage of the opportunities it
found in the predicted future trends. Limits on human, technology, and financial resources
prevent any firm from going after all opportunities they may have available at any one time.
Using prescriptive analytics allows the firm to allocate limited resources to optimally achieve
objectives as fully as possible.
In summary, the three major components of descriptive, predictive, and prescriptive analytics
arranged as steps in the BA process can help a firm find opportunities in data, predict trends that
forecast future opportunities, and aid in selecting a course of action that optimizes the firm’s
allocation of resources to maximize value and performance.
-------------------------------------------------------------------------------------------------------------------
Relationship of BA Process and Organization Decision-Making Process:
The BA process can solve problems and identify opportunities to
improve business performance. In the process, organizations may also determine strategies to
guide operations and help achieve competitive advantages. Typically, solving problems and
identifying strategic opportunities to follow are organization decision-making tasks. The
latter, identifying opportunities, can be viewed as a problem of strategy choice requiring a
solution. It should come as no surprise that the BA process described here closely parallels
classic organization decision-making processes. As depicted below, the business analytics
process has an inherent relationship to the steps in typical organization decision-making
processes.
The decision-making foundation that has served the organization decision-making process (ODMP) for many decades parallels the BA
process. The same logic serves both processes and supports organization decision-making skills
and capacities.
------------------------------------------------------------------------------------------------------------------
COMPETITIVE ADVANTAGES OF BUSINESS ANALYTICS:
Business organization planning is typically segmented into three types, presented in the figure
below. The planning process usually follows a sequence from strategic, down to tactical, and then
down to operational planning, although the figure shows arrows of activities going up and down the depicted
hierarchical structure of most business organizations.
The upward flow in the figure represents the information passed from lower levels up, and the
downward flow represents the orders that are passed from higher levels of management down to
lower levels for implementation. It can be seen in the Teece (2007) study and more recently in
Rha (2013) that the three steps in the BA process and strategic planning embody the same efforts
and steps.
Fig. Types of organization planning*
Effectively planning and passing down the right orders in hopes of being a business winner
requires good information on which orders can be decided. Some information can become so
valuable that it provides the firm a competitive advantage (the ability of one business to perform
at a higher level, staying ahead of present competitors in the same industry or market). Business
analytics can support all three types of planning with useful information that can give a firm a
competitive advantage. Examples of the ways BA can help firms achieve a competitive
advantage are presented in table below:
Table : Ways BA Can Help Achieve a Competitive Advantage
1.2. STATISTICAL TOOLS:
Statistical Notation:
A population consists of all items of interest for a particular decision or investigation: for
example, all individuals in the United States who do not own cell phones, all subscribers to
Netflix, or all stockholders of Google. A company like Netflix keeps extensive records on its
customers, making it easy to retrieve data about the entire population of customers. However, it
would probably be impossible to identify all individuals who do not own cell phones.
We typically label the elements of a data set using subscripted variables, x1, x2, …, and
so on. In general, xi represents the ith observation. It is a common practice in statistics to use
Greek letters, such as μ (mu), σ (sigma), and π (pi), to represent population measures and italic
letters such as x̄ (x-bar), s, and p to represent sample statistics. We will use N to represent the
number of items in a population and n to represent the number of observations in a sample.
Statistical formulas often contain a summation operator, Σ (Greek capital sigma), which means
that the terms that follow it are added together.
Thus, Σxi for i = 1 to n equals x1 + x2 + … + xn. Understanding these conventions and mathematical
notation will help you to interpret and apply statistical formulas.
-------------------------------------------------------------------------------------------------------------------
DESCRIPTIVE STATISTICAL METHODS:
Descriptive statistics describes or summarizes the basic features or characteristics of the
data. It assigns numerical values to describe the trend of the samples collected. It condenses large
volumes of data into a simpler, more meaningful format that is easier to understand
and interpret. Paired with graphs and tables, descriptive statistics offer a clear summary of the
complete collection of data.
For descriptive statistics, interpretation of the data at hand is the primary purpose, while inferential
statistics make predictions for a larger set of data based on the descriptive values obtained.
Hence, descriptive statistics form the first step and the basis of quantitative data analysis.
METHODS USED IN DESCRIPTIVE STATISTICS:
(1) Measures of location: provide estimates of a single value that in some fashion represents the
“centering” of a set of data. The most common is the average. We all use averages routinely in our
lives, for example, to measure student accomplishment in college (e.g., grade point average), to
measure the performance of sports teams (e.g., batting average), and to measure performance in
business (e.g., average delivery time).
(a) Arithmetic Mean:
The average is formally called the arithmetic mean (or simply the mean), which is the sum of
the observations divided by the number of observations. Mathematically, the mean of a population
is denoted by the Greek letter μ, and the mean of a sample is denoted by x̄. If a population consists
of N observations x1, x2, …, xN, the population mean, μ, is calculated as
μ = (x1 + x2 + … + xN)/N = Σxi/N.
Note that the calculations for the mean are the same whether we are dealing with a
population or a sample; only the notation differs. We may also calculate the mean in Excel using
the function AVERAGE(data range). One property of the mean is that the sum of the deviations of
each observation from the mean is zero: Σ(xi − x̄) = 0.
This simply means that the total deviation above the mean equals the total deviation below the
mean in magnitude; essentially, the mean “balances” the values on either side of it.
(b) Median:
The measure of location that specifies the middle value when the data are arranged from
least to greatest is the median. Half the data are below the median, and half the data are above it.
For an odd number of observations, the median is the middle of the sorted numbers. For an even
number of observations, the median is the mean of the two middle numbers. We could use the Sort
option in Excel to rank-order the data and then determine the median. The Excel function
MEDIAN(data range) could also be used. The median is meaningful for ratio, interval, and ordinal
data. As opposed to the mean, the median is not affected by outliers.
(c) Mode:
A third measure of location is the mode. The mode is the observation that occurs most
frequently. The mode is most useful for data sets that contain a relatively small number of unique
values.
For data sets that have few repeating values, the mode does not provide much practical
value. You can easily identify the mode from a frequency distribution by identifying the value
having the largest frequency or from a histogram by identifying the highest bar. You may also use
the Excel function MODE.SNGL(data range).
For frequency distributions and histograms of grouped data, the mode is the group with the
greatest frequency. Some data sets have multiple modes; to identify these, you can use the Excel
function MODE.MULT(data range), which returns an array of modal values.
(d) Midrange:
A fourth measure of location that is used occasionally is the midrange. This is simply the
average of the greatest and least values in the data set. Caution must be exercised when using the
midrange because extreme values easily distort the result. This is
because the midrange uses only two pieces of data, whereas the mean uses all the data; thus, it is
usually a much rougher estimate than the mean and is often used for only small sample sizes.
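As a rough illustration of these measures of location, the following minimal Python sketch (the sample values are hypothetical and not from the text) computes each measure and notes the equivalent Excel function mentioned above.

import statistics

# Hypothetical sample of delivery times (days)
data = [4, 7, 5, 6, 5, 9, 5, 6, 8, 5]

mean = statistics.mean(data)            # AVERAGE(data range)
median = statistics.median(data)        # MEDIAN(data range)
mode = statistics.mode(data)            # MODE.SNGL(data range)
midrange = (max(data) + min(data)) / 2  # (MAX(...) + MIN(...)) / 2

print(mean, median, mode, midrange)     # 6 5.5 5 6.5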
(2) Measures of Dispersion:
Dispersion refers to the degree of variation in the data, that is, the numerical spread (or
compactness) of the data. Several statistical measures characterize dispersion: the range, variance,
and standard deviation.
(a) Range:
The range is the simplest and is the difference between the maximum value and the
minimum value in the data set. Although Excel does not provide a function for the range, it can be
computed easily by the formula = MAX(data range) - MIN(data range). Like the midrange, the
range is affected by outliers and, thus, is often only used for very small data sets.
(b) Interquartile Range:
The difference between the first and third quartiles, Q3 - Q1, is often called the interquartile
range (IQR), or the midspread. This includes only the middle 50% of the data and, therefore, is not
influenced by extreme values. Thus, it is sometimes used as an alternative measure of dispersion.
(c) Variance:
A more commonly used measure of dispersion is the variance, whose computation depends
on all the data. The larger the variance, the more the data are spread out from the mean and the
more variability one can expect in the observations. The formula used for calculating the variance
is different for populations and samples.
The formula for the variance of a population is σ² = Σ(xi − μ)²/N; for a sample, the variance
is s² = Σ(xi − x̄)²/(n − 1). The standard deviation is the square root of the variance.
The Excel function STDEV.P(data range) calculates the standard deviation for a population
(σ); the function STDEV.S(data range) calculates it for a sample (s).
The standard deviation is generally easier to interpret than the variance because its units of
measure are the same as the units of the data. Thus, it can be more easily related to the mean or
other statistics measured in the same units.
(d) Standardized Values (z-scores):
To compute a z-score, we subtract the sample mean from the ith observation, xi, and divide
the result by the sample standard deviation: zi = (xi − x̄)/s. In this formula, the numerator
represents the distance that xi is from the
sample mean; a negative value indicates that xi lies to the left of the mean, and a positive value
indicates that it lies to the right of the mean. By dividing by the standard deviation, s, we scale the
distance from the mean to express it in units of standard deviations. Thus, a z-score of 1.0 means
that the observation is one standard deviation to the right of the mean; a z-score of -1.5 means that
the observation is 1.5 standard deviations to the left of the mean.
Thus, even though two data sets may have different means and standard deviations, the
same z-score means that the observations have the same relative distance from their respective
means. Z-scores can be computed easily on a spreadsheet; however, Excel has a function that
calculates it directly, STANDARDIZE(x, mean, standard_dev).
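A minimal Python sketch of the dispersion measures and z-scores described above, again using hypothetical data; statistics.variance and statistics.stdev apply the sample (n − 1) formulas, matching Excel's STDEV.S, and the z-score line mirrors STANDARDIZE(x, mean, standard_dev).

import statistics

# Hypothetical sample
data = [4, 7, 5, 6, 5, 9, 5, 6, 8, 5]

data_range = max(data) - min(data)   # = MAX(...) - MIN(...)
xbar = statistics.mean(data)
var_s = statistics.variance(data)    # sample variance, n - 1 divisor
s = statistics.stdev(data)           # sample standard deviation

# z-score of each observation: z_i = (x_i - xbar) / s
z_scores = [(x - xbar) / s for x in data]
print(data_range, round(var_s, 3), round(s, 3))
print([round(z, 2) for z in z_scores])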
(a) Skewness:
Skewness describes the lack of symmetry of data. The coefficient of skewness (CS) measures
the degree of asymmetry of observations around the mean.
Linear function: y = a + bx. Linear functions show steady increases or decreases over the range of
x. This is the simplest type of function used in
predictive models. It is easy to understand, and over small ranges of values, can approximate
behavior rather well.
Logarithmic function: y = ln(x). Logarithmic functions are used when the rate
of change in a variable increases or decreases quickly and then levels out, such as
with diminishing returns to scale. Logarithmic functions are often used in marketing models where
constant percentage increases in advertising, for instance, result in constant, absolute increases in
sales.
Polynomial function: y = ax^2 + bx + c (second order—quadratic function),
y = ax^3 + bx^2 + dx + e (third order—cubic function), and so on. A second-order
polynomial is parabolic in nature and has only one hill or valley; a third-order polynomial
has one or two hills or valleys. Revenue models that incorporate price elasticity are often
polynomial functions.
Power function: y = ax^b. Power functions define phenomena that increase at a specific rate.
Learning curves that express improving times in performing a task
are often modeled with power functions having a > 0 and b < 0.
Exponential function: y = ab^x. Exponential functions have the property that y
rises or falls at constantly increasing rates. For example, the perceived brightness
of a lightbulb grows at a decreasing rate as the wattage increases. In this case,
a would be a positive number and b would be between 0 and 1. The exponential function is often
defined as y = ae^x, where b = e, the base of natural logarithms (approximately 2.71828).
R^2 (R-squared) is a measure of the “fit” of the line to the data. The value of R^2 will be between 0
and 1; the larger the value of R^2, the better the fit.
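As a hedged illustration of fitting one of these function families and judging the fit with R^2, the sketch below fits a power function y = ax^b to hypothetical learning-curve data with scipy.optimize.curve_fit; the data values and starting guesses are assumptions made for the example.

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical learning-curve data: task time falls with repetitions,
# so a power function with a > 0 and b < 0 is a plausible form.
x = np.array([1, 2, 4, 8, 16, 32], dtype=float)
y = np.array([40.0, 33.5, 28.0, 23.8, 20.1, 16.9])

def power(x, a, b):
    return a * x ** b

(a, b), _ = curve_fit(power, x, y, p0=(40.0, -0.2))

# R-squared: 1 - (residual sum of squares / total sum of squares)
y_hat = power(x, a, b)
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(f"a = {a:.2f}, b = {b:.3f}, R^2 = {r2:.3f}")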
----------------------------------------------------------------------------------------------------------------
SIMPLE LINEAR REGRESSION:
Is a tool for building mathematical and statistical models that characterize relationships
between a dependent variable (which must be a ratio variable and not categorical) and one or more
independent, or explanatory, variables, all of which are numerical (but may be either ratio or
categorical).
Two broad categories of regression models are used often in business settings:
(1) regression models of cross-sectional data and
(2) regression models of time-series data, in which the independent variables are time or
some function of time and the focus is on predicting the future.
Time-series regression is an important tool in forecasting. A regression model that involves
a single independent variable is called simple regression. A regression model that involves two or
more independent variables is called multiple regression.
Simple linear regression involves finding a linear relationship between one independent
variable, X, and one dependent variable, Y. The relationship between two variables can assume
many forms, as illustrated in Figure 8.5. The relationship may be linear or nonlinear, or there may
be no relationship at all. Because we are focusing our discussion on linear regression models, the
first thing to do is to verify that the relationship is linear, as in Figure 8.5(a). We would not expect
to see the data line up perfectly along a straight line; we simply want to verify that the general
relationship is linear. If the relationship is clearly nonlinear, as in Figure 8.5(b), then alternative
approaches must be used, and if no relationship is evident, as in Figure 8.5(c), then it is pointless
to even consider developing a linear regression model. To determine if a linear relationship exists
between the variables, we recommend that you create a scatter chart that can show the relationship
between variables visually.
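To make this concrete, here is a minimal sketch of simple linear regression on hypothetical advertising-versus-sales data using scipy.stats.linregress; plotting the scatter chart first, as recommended above, would be the normal check that a linear model is appropriate.

import numpy as np
from scipy import stats

# Hypothetical data: advertising spend (X, $000) versus sales (Y, $000)
x = np.array([10, 15, 20, 25, 30, 35, 40], dtype=float)
y = np.array([25, 31, 38, 41, 50, 54, 61], dtype=float)

# Fit Y = b0 + b1*X by least squares; rvalue**2 is the R^2 measure of fit.
result = stats.linregress(x, y)
print(f"intercept = {result.intercept:.2f}, slope = {result.slope:.2f}, "
      f"R^2 = {result.rvalue ** 2:.3f}")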
IMPORTANT RESOURCES:
It is necessary to understand the resource needs of a BA program to better comprehend the
value of the information that BA provides. The need for BA resources varies by firm to meet
particular decision support requirements. Some firms may choose to have a modest investment,
whereas other firms may have BA teams or a department of BA specialists. Regardless of the level
of resource investment, at minimum, a BA program requires resource investments in BA personnel,
data, and technology.
(1) Business Analytics Personnel
(2) Business analytics technology
(3) Business Analytics Data:
Structured and unstructured data is needed to generate analytics. As a beginning for
organizing data into an understandable framework, statisticians usually categorize data into
meaningful groups.
(A) Categorizing Data:
There are many ways to categorize business analytics data. Data is commonly
categorized by either internal or external sources. Typical examples of internal data sources
include those presented in TABLE 3.4. When firms try to solve internal production or
service operations problems, internally sourced data may be all that is needed. Typical
external sources of data (SEE TABLE 3.5) are numerous and provide great diversity and
unique challenges for BA to process. Data can be measured quantitatively (for example,
sales dollars) or qualitatively by preference surveys (for example, products compared
based on consumers preferring one product over another) or by the amount of consumer
discussion (chatter) on the Web regarding the pluses and minuses of competing products.
Table 3.4 Typical Internal Sources of Data on Which Business AnalyticsCan Be Based
Table 3.5 Typical External Sources of Data on Which Business AnalyticsCan Be Based
A major portion of the external data sources are found in the literature. For example, the
US Census and the International Monetary Fund (IMF) are useful data sources at the
macroeconomic level for model building.
(B) DATA ISSUES:
There are a couple of data issues that are critical to the usability of any database or data file.
Those issues are data quality and data privacy.
(A) Data quality:
Data quality can be defined as data that serves the purpose for which it is collected. It
means different things for different applications, but there are some commonalities of high-
quality data. These qualities usually include accurately representing reality, measuring what it
is supposed to measure, being timely, and having completeness. When data is of high quality,
it helps ensure competitiveness, aids customer service, and improves profitability. When data
is of poor quality, it can provide information that is contradictory, leading to misguided
decision-making.
For example, having missing data in files can prohibit some forms of statistical
modeling, and incorrect coding of information can completely render databases useless. Data
quality requires effort on the part of data managers to cleanse data of erroneous information
and repair or replace missing data.
(B) Data privacy:
Data privacy refers to the protection of shared data such that access is permitted only to those
users for whom it is intended. It is a security issue that requires balancing the need to know
with the risks of sharing too much.
There are many risks in leaving unrestricted access to a company’s database. For
example, competitors can steal a firm’s customers by accessing addresses. Data leaks on
product quality failures can damage brand image, and customers can become distrustful
of a firm that shares information given in confidence. To avoid these issues, a firm needs
to abide by the current legislation regarding customer privacy and develop a program
devoted to data privacy.
Collecting and retrieving data and computing analytics requires the use of computers
and information technology. A large part of what BA personnel do is related to managing
information systems to collect, process, store, and retrieve data from various sources.
-------------------------------------------------------------------------------------------------------------
BUSINESS ANALYTICS PERSONNEL:
One way to identify personnel needed for BA staff is to examine what is required for
certification in BA by organizations that provide BA services. INFORMS, a major academic
and professional organization, announced the startup of a Certified Analytics Professional
(CAP) program in 2013.
Another more established organization, Cognizure, offers a variety of service products,
including business analytic services. It offers a general certification Business Analytics
Professional (BAP) exam that measures existing skill sets in BA staff and identifies areas
needing improvement. This is a tool to validate technical proficiency, expertise, and
professional standards in BA. The certification consists of three exams covering the content
areas listed in Table 3.1.
Most of the content areas in Table 3.1 will be discussed and illustrated in
subsequent chapters and appendixes. The three exams required in the Cognizure certification
program can easily be understood in the context of the three steps of the BA process (descriptive,
predictive, and prescriptive).
The topics in Figure 3.1 of the certification program are applicable to the three major
steps in the BA process. The basic statistical tools apply to the descriptive analytics step, the
more advanced statistical tools apply to the predictive analytics step, and the operations research
tools apply to the prescriptive analytics step. Some of the tools can be applied to both the
descriptive and the predictive steps.
Likewise, tools like simulation can be applied to answer questions in both the predictive
and the prescriptive steps, depending on how they're used. At the intersection of all the tools are
case studies, which are designed to provide practical experience
where all the tools are employed to answer important questions or seek opportunities.
Figure 3.1 Certification content areas and their relationship to the steps in
BA
The certification content areas also include specialized skill sets related to BA personnel (administrators, designers,
developers, solution experts, and specialists), as presented in Table 3.2.
Annual reports summarize data about companies’ profitability and market share both in numerical form and in
charts and graphs to communicate with shareholders.
Accountants conduct audits to determine whether figures reported on a firm’s balance sheet fairly represent the
actual data by examining samples (that is, subsets) of accounting data, such as accounts receivable.
Financial analysts collect and analyze a variety of data to understand the contribution that a business provides to
its shareholders. These typically include profitability, revenue growth, return on investment, asset utilization,
operating margins, earnings per share, economic value added (EVA), shareholder value, and other relevant
measures.
Economists use data to help companies understand and predict population trends, interest rates, industry
performance, consumer spending, and international trade. Such data are often obtained from external sources such
as Standard & Poor’s Compustat data sets, industry trade associations, or government databases.
Marketing researchers collect and analyze extensive customer data. These data often consist of demographics,
preferences and opinions, transaction and payment history, shopping behavior, and a lot more. Such data may be
collected by surveys, personal interviews, focus groups, or from shopper loyalty cards.
Operations managers use data on production performance, manufacturing quality, delivery times, order accuracy,
supplier performance, productivity, costs, and environmental compliance to manage their operations.
Human resource managers measure employee satisfaction, training costs, turnover, market innovation, training
effectiveness, and skills development.
Decision models characterize the relationships among the data, uncontrollable variables,
and decision variables, and the outputs of interest to the decision maker. Decision models can
be represented in various ways, most typically with mathematical functions and spreadsheets.
Spreadsheets are ideal vehicles for implementing decision models because of their versatility in
managing data, evaluating different scenarios, and presenting results in a meaningful fashion.
UNCERTAINTY AND RISK:
Uncertainty is imperfect knowledge of what will happen; risk is associated with the
consequences and likelihood of what might happen. Risk is evaluated by the magnitude of the
consequences and the likelihood that they would occur. To try to eliminate risk in business
enterprise is futile. Risk is inherent in the commitment of present resources to future
expectations. Indeed, economic progress can be defined as the ability to take greater risks. The
attempt to eliminate risks, even the attempt to minimize them, can only make them irrational
and unbearable. It can only result in the greatest risk of all.
A PRESCRIPTIVE DECISION MODEL:
A prescriptive decision model helps decision makers identify the best solution to a decision
problem. Optimization is the process of finding a set of values for decision variables that
minimize or maximize some quantity of interest—profit, revenue, cost, time, and so on, called
the objective function. Any set of decision variables that optimizes the objective function is
called an optimal solution.
Prescriptive decision models can be either deterministic or stochastic. A deterministic model
is one in which all model input information is either known or assumed to be known with
certainty. A stochastic model is one in which some of the model input information is
uncertain.
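To illustrate a deterministic prescriptive model, here is a minimal sketch, assuming scipy is available, of a tiny product-mix linear program solved with scipy.optimize.linprog; the two products, their profit coefficients, and the resource limits are hypothetical.

from scipy.optimize import linprog

# Hypothetical deterministic model: maximize 30*x1 + 20*x2 profit subject to
# labor and material limits. linprog minimizes, so the objective is negated.
c = [-30, -20]                 # negated profit per unit of products 1 and 2
A_ub = [[2, 1],                # labor hours used per unit
        [1, 3]]                # material units used per unit
b_ub = [100, 90]               # available labor hours and material units

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
x1, x2 = res.x
print(f"optimal mix: x1 = {x1:.1f}, x2 = {x2:.1f}, profit = {-res.fun:.1f}")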
----------------------------------------------------------------------------------------------------------------
Sensitivity to political and organizational issues is an important skill that managers and
analytical professionals alike must possess when solving problems.
--------------------------------------------------------------------------------------------------------------------------------------
DBMS includes capabilities and tools for organizing, managing, and accessing data
in databases. Four of the more important capabilities are its data definition language, data
dictionary, database encyclopedia, and data manipulation language. DBMS has a data
definition capability to specify the structure of content in a database. This is used to create
database tables and characteristics used in fields to identify content. These tables and
characteristics are critical success factors for search efforts as the database grows in size.
These characteristics are documented in the data dictionary (an automated or manual file
that stores the size, descriptions, format, and other properties needed to characterize data).
The database encyclopedia is a table of contents listing a firm’s current data
inventory and what data files can be built or purchased. The typical content of the database
encyclopedia is presented in Table 3.7. Of particular importance for BA are the data
manipulation language tools included in a DBMS. These tools are used to search databases
for specific information. An example is Structured Query Language (SQL), which allows
users to find specific data through a session of queries and responses in a database.
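As an illustration of a data manipulation language at work, the sketch below uses Python's built-in sqlite3 module to run a simple SQL query against a small in-memory table; the table and column names are invented for the example.

import sqlite3

# Hypothetical sales table held in an in-memory SQLite database
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [("West", "A", 1200.0), ("West", "B", 800.0), ("East", "A", 950.0)])

# SQL query: total sales by region, largest first
for row in con.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY 2 DESC"):
    print(row)
con.close()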
Data marts are focused subsets or smaller groupings within a data warehouse.
Firms often build enterprise-wide data warehouses where a central data warehouse
serves the entire organization and smaller, decentralized data warehouses (called data
marts) are focused on a limited portion of the organization’s data that is placed in a
separate database for a specific population of users. For example, a firm might develop
a smaller database on just product quality to focus efforts on quality customer and
product issues. A data mart can be constructed more quickly and at lower cost than an
enterprise-wide data warehouse to concentrate effort in the areas of greatest concern.
Online analytical processing (OLAP) is software that allows users to view data in
multiple dimensions. For example, employees can be viewed in terms of their age, sex,
geographic location, and so on. OLAP would allow identification of the number of
employees who are age 35, male, and in the western region of a country. OLAP allows
users to obtain online answers to ad hoc questions quickly, even when the data is stored in
very large databases.
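A minimal sketch, assuming pandas is installed, of the OLAP-style view described above: a small hypothetical employee table is cross-tabulated by region and sex, then drilled down to the 35-year-old males in the western region.

import pandas as pd

# Hypothetical employee records
df = pd.DataFrame({
    "age":    [35, 35, 42, 29, 35],
    "sex":    ["M", "F", "M", "F", "M"],
    "region": ["West", "West", "East", "East", "West"],
})

# Counts of employees by region and sex (one "slice" of the data cube)
print(pd.crosstab(df["region"], df["sex"]))

# Drill down: employees who are age 35, male, and in the western region
print(len(df[(df["age"] == 35) & (df["sex"] == "M") & (df["region"] == "West")]))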
Data mining is the application of a software-based, discovery-driven process that provides
insights into business data by finding hidden patterns and relationships in big data or
large databases and inferring rules from them to predict future behavior. The observed
patterns and rules are used to guide decision-making. They can also act to forecast the
impact of those decisions.
Text mining is a software application used to extract key elements from unstructured
data sets, discover patterns and relationships in the text materials, and summarize the
information.
Web mining seeks to find patterns, trends, and insights into customer behavior from
users of the Web.
Analysis ToolPak is an Excel add-in that contains a variety of statistical tools (for
example, graphics and multiple regression) for the descriptive and predictive BA process
steps. Another Excel add-in, Solver, contains operations research optimization tools (for
example, linear programming) used in the prescriptive step of the BA process.
UNIT-3
3.1.____________________________________________________________
ORGANIZATION STRUCTURES OF B.A.:
To successfully implement business analytics (BA) within organizations, BA in
whatever organizational form it takes must be fully integrated throughout a firm. This requires BA
resources to be aligned in a way that permits a view of customer information within and across all
departments, access to customer information from multiple sources (internal and external to the
organization), access to historical analytics from a central repository, and making technology
resources align to be accountable for analytic success. The commonality of these requirements is
the desire for an alignment that maximizes the flow of information into and through the BA
operation, which in turn processes and shares information to desired users throughout the
organization.
(A) Most organizations are hierarchical, with senior managers making the strategic planning
decisions, middle-level managers making tactical planning decisions, and lower-level managers
making operational planning decisions. Within the hierarchy, other organizational structures
exist to support the development and existence of groupings of resources like those needed for
BA. These additional structures include programs, projects, and teams. A program in this context
is the process that seeks to create an outcome and usually involves managing several related
projects with the intention of improving organizational performance. A program can also be a
large project. A project tends to deliver outcomes and can be defined as having temporary rather
than permanent social systems within or across organizations to accomplish particular and
clearly defined tasks, usually under time constraints. Projects are often composed of teams. A
team consists of a group of people with skills to achieve a common purpose. Teams are especially
appropriate for conducting complex tasks that have many interdependent subtasks.
The relationship of programs, projects, and teams with a business hierarchy is presented in
Figure 4.1. Within this hierarchy, the organization’s senior managers establish a BA program
initiative to mandate the creation of a BA grouping within the firm as a strategic goal. A BA
program does not always have an end-time limit. Middle-level managers reorganize or break
down the strategic BA program goals into doable BA project initiatives to be undertaken in a
fixed period of time. Some firms have only one project (establish a BA grouping) and others,
depending on the organization structure, have multiple BA projects requiring the creation of
multiple BA groupings. Projects usually have an end date by which to judge the
success of the project. The projects in some cases are further reorganized into smaller
assignments, called BA team initiatives, to operationalize the broader strategy of the BA
program. BA teams may have a long-standing time limit (for example, to exist as the main source
of analytics for an entire organization) or have a fixed period (for example, to work on a specific
product quality problem and then end).
Figure 4.1 Hierarchical relationships of program, project, and team planning
BA organization structures usually begin with an initiative that recognizes the need to
use and develop some kind of program in analytics. Fortunately, most firms today recognize
this need. The question then becomes how to match the firm’s needs within the organization to
achieve its strategic, tactical, and operations objectives within resource limitations. Planning
the BA resource allocation within the organizational structure of a firm is a starting place for the
alignment of BA to best serve a firm’s needs.
Aligning the BA resources requires a determination of the amount of resources a firm wants
to invest. The outcome of the resource investment might identify only one individual to compute
analytics for a firm. Because of the varied skill sets in information systems, statistics, and operations
research methods, a more common beginning for a BA initiative is the creation of a BA team
organization structure possessing a variety of analytical and management skills.
(B) Another way of aligning BA resources within an organization is to use a project structure.
Most firms undertake projects, and some firms actually use a project structure for their entire
organization.
In organizations where functional departments are structured on a strict hierarchy, separate
BA departments or teams have to be allocated to each functional area, as presented in Figure 4.2.
This functional organization structure may have the benefit of stricter functional control by the
VPs of an organization and greater efficiency in focusing on just the analytics within each
specialized area. On the other hand, this structure does not promote the cross-department access
that is suggested as a critical success factor for the implementation of a BA program.
Figure 4.2 Functional organization structure with BA
The needs of each firm for BA sometimes dictate positioning BA within existing organization
functional areas. Clearly, many alternative structures can house a BA grouping. For example,
because BA provides information to users, BA could be included in the functional area of
management information systems, with the chief information officer (CIO) acting as both the
director of information systems (which includes database management) and the leader of the
BA grouping.
(C) A third structure, found in large organizations, aligns resources by project or product and is called a
matrix organization. As illustrated in Figure 4.3, this structure allows the VPs some indirect
control over their related specialists, which would include the BA specialists but also allows
direct control by the project or product manager. This, similar to the functional organizational
structure, does not promote the cross-department access suggested for a successful
implementation of a BA program.
-----------------------------------------------------------------------------------------------------------------
MANAGEMENT ISSUES:
Aligning organizational resources is a management function. There are general
management issues that are related to a BA program, and some are specifically important to
operating a BA department, project, or team. The ones covered in this section include
establishing an information policy, outsourcing business analytics, ensuring data quality,
measuring business analytics contribution, and managing change.
Establishing an Information Policy:
There is a need to manage information. This is accomplished by establishing an information
policy to structure rules on how information and data are to be organized and maintained and who
is allowed to view the data or change it. The information policy specifies organizational rules for
sharing, disseminating, acquiring, standardizing, classifying, and inventorying all types of
information and data. It defines the specific procedures and accountabilities that identify which
users and organizational units can share information, where the information can be distributed,
and who is responsible for updating and maintaining the information.
In small firms, business owners might establish the information policy. For larger firms,
data administration may be responsible for the specific policies and procedures for data
management. Responsibilities could include developing the information policy, planning
data collection and storage, overseeing database design, developing the data dictionary, as
well as monitoring how information systems specialists and end-user groups use data.
Outsourcing Business Analytics:
Outsourcing can be defined as a strategy by which an organization chooses to allocate some
business activities and responsibilities from an internal source to an external source. Outsourcing
business operations is a strategy that an organization can use to implement a BA program, run
BA projects, and operate BA teams. Any business activity can be outsourced, including BA.
Outsourcing is an important BA management activity that should be considered as a viable
alternative in planning an investment in any BA program.
BA is a staff function that is easier to outsource than other line management tasks, such as
running a warehouse. To determine if outsourcing is a useful option in BA programs,
management needs to balance the advantages of outsourcing with its disadvantages. Some of
the advantages of outsourcing BA include those listed in Table 4.4.
Ensuring Data Quality:
An organization needs to identify and correct faulty data and establish routines and
procedures for editing data in the database. The analysis of data quality can begin with a data
quality audit, where a structured survey or inspection of accuracy and level of completeness of
data is undertaken. This audit may be of the entire database, just a sample of files, or a survey
of end users for perceptions of the data quality. If during the data quality audit files are found
that have errors, a process called data cleansing or data scrubbing is undertaken to eliminate or
repair data. Some of the areas in a data file that should be inspected in the audit and suggestions
on how to correct them are presented in Table 4.6.
Table 4.6 Quality Data Inspection Items and Recommendations
Managing Change:
Wells (2000) found that what is critical in changing organizations is organizational culture
and the use of change management. Organizational culture is how an organization supports
cooperation, coordination, and empowerment of employees. Change management is defined as
an approach for transitioning the organization (individuals, teams, projects, departments) to a
changed and desired future state. Change management is a means of implementing change in an
organization, such as adding a BA department. Changes in an organization can be either planned
(a result of specific and planned efforts at change with direction by a change leader) or unplanned
(spontaneous changes without direction of a change leader).
The application of BA invariably will result in both types of changes because of BA’s specific
problem-solving role (a desired, planned change to solve a problem) and its opportunity-finding,
exploratory nature (i.e., unplanned new-knowledge opportunity changes). Change
management can also target almost everything that makes up an organization (see Table 4.7).
3.2.____________________________________________________________
DESCRIPTIVE ANALYTICS:
Chapter- 5 full -B.A. principles, concepts- MARC J –
----------------------------------------------------------------------------------------------------------------
PREDICTIVE ANALYSIS:
Predictive analytics is the use of data, statistical algorithms and machine learning
techniques to identify the likelihood of future outcomes based on historical data. The goal is to go
beyond knowing what has happened to providing a best assessment of what will happen in the
future. Predictive analytics is an area of statistics that deals with extracting information from data
and using it to predict trends and behavior patterns. The enhancement of predictive web analytics
calculates statistical probabilities of future events online. Predictive analytics statistical techniques
include data modeling, machine learning, AI, deep learning algorithms and data mining. Often the
unknown event of interest is in the future, but predictive analytics can be applied to any type of
unknown whether it be in the past, present or future.
TYPES:
(1) Predictive models
Predictive modelling uses predictive models to analyze the relationship between the specific
performance of a unit in a sample and one or more known attributes or features of that unit. The
objective of the model is to assess the likelihood that a similar unit in a different sample will exhibit
the specific performance. This category encompasses models in many areas, such as marketing,
where they seek out subtle data patterns to answer questions about customer performance, or fraud
detection models. Predictive models often perform calculations during live transactions, for
example, to evaluate the risk or opportunity of a given customer or transaction, in order to guide a
decision. With advancements in computing speed, individual agent modeling systems have become
capable of simulating human behaviour or reactions to given stimuli or scenarios.
(2) Descriptive models
Descriptive models quantify relationships in data in a way that is often used to classify
customers or prospects into groups. Unlike predictive models that focus on predicting a single
customer behavior (such as credit risk), descriptive models identify many different relationships
between customers or products. Descriptive models do not rank-order customers by their likelihood
of taking a particular action the way predictive models do. Instead, descriptive models can be used,
for example, to categorize customers by their product preferences and life stage. Descriptive
modeling tools can be utilized to develop further models that can simulate large number of
individualized agents and make predictions.
(3) Decision models
Decision models describe the relationship between all the elements of a decision—the
known data (including results of predictive models), the decision, and the forecast results of the
decision—in order to predict the results of decisions involving many variables. These models can
be used in optimization, maximizing certain outcomes while minimizing others. Decision models
are generally used to develop decision logic or a set of business rules that will produce the desired
action for every customer or circumstance.
----------------------------------------------------------------------------------------------------------------
PREDICTIVE MODELLING:
Predictive modeling means developing models that can be used to forecast or predict
future events. In business analytics, models can be developed based on logic or data.
(A) Logic-Driven Models:
A logic-driven model is one based on experience, knowledge, and logical relationships of
variables and constants connected to the desired business performance outcome situation. The
question here is how to put variables and constants together to create a model that can predict
the future. Doing this requires business experience. Model building requires an understanding
of business systems and the relationships of variables and constants that seek to generate a
desirable business performance outcome. To help conceptualize the relationships inherent in a
business system, diagramming methods can be helpful. For example, the cause-and-effect
diagram is a visual aid diagram that permits a user to hypothesize relationships between
potential causes of an outcome (see Figure 6.1). This diagram lists potential causes in terms of
human, technology, policy, and process resources in an effort to establish some basic
relationships that impact business performance. The diagram is used by tracing contributing and
relational factors from the desired business performance goal back to possible causes, thus
allowing the user to better picture sources of potential causes that could affect the performance.
This diagram is sometimes referred to as a fishbone diagram because of its appearance.
Figure 6.1 Cause-and-effect diagram
Another useful diagram to conceptualize potential relationships with business
performance variables is called the influence diagram. According to Evans, influence diagrams
can be useful to conceptualize the relationships of variables in the development of models. An
example of an influence diagram is presented in Figure 6.2. It maps the relationship of variables
and a constant to the desired business performance outcome of profit. From such a diagram, it is
easy to convert the information into a quantitative model with constants and variables that define
profit in this situation:
Profit = Revenue − Cost, or
Profit = (Unit Price × Quantity Sold) − [(Fixed Cost) + (Variable Cost × Quantity Sold)], or
P = (UP × QS) − [FC + (VC × QS)]
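The influence-diagram model above translates directly into code. The sketch below uses hypothetical values for unit price, fixed cost, and variable cost and runs a simple what-if analysis over quantity sold.

# Logic-driven model from the influence diagram:
# P = (UP x QS) - [FC + (VC x QS)]; the input values below are hypothetical.
def profit(unit_price, quantity_sold, fixed_cost, variable_cost):
    revenue = unit_price * quantity_sold
    total_cost = fixed_cost + variable_cost * quantity_sold
    return revenue - total_cost

# What-if analysis: how does profit respond to different sales volumes?
for qs in (500, 1000, 1500, 2000):
    print(qs, profit(unit_price=40.0, quantity_sold=qs,
                     fixed_cost=5000.0, variable_cost=24.0))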
An ideal multiple variable modeling approach that can be used in this situation to
explore variable importance in this case study and eventually lead to the development of a
predictive model for product sales is correlation and multiple regression. We will use both
Excel and IBM’s SPSS statistical packages to compute the statistics in this step of the BA
process.
First, we must consider the four independent variables—radio, TV, newspaper, POS—
before developing the model.
One way to see the statistical direction of the relationship (which is better than just
comparing graphic charts) is to compute the Pearson correlation coefficient r between each of
the independent variables and the dependent variable (product sales). The SPSS correlation
coefficients and their levels of significance are presented in Table 6.4. The comparable Excel
correlations are presented in Figure 6.5.
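A minimal sketch of the same kind of correlation check, assuming scipy and substituting hypothetical monthly figures for the case-study file; scipy.stats.pearsonr returns both r and its significance level, analogous to the SPSS output in Table 6.4.

from scipy import stats

# Hypothetical monthly figures ($000): radio ad spend versus product sales
radio = [10, 12, 15, 18, 20, 22, 25, 28]
sales = [110, 118, 126, 140, 151, 158, 170, 184]

r, p_value = stats.pearsonr(radio, sales)
print(f"r = {r:.3f}, p = {p_value:.4f}")   # strong, significant positive correlation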
Although it can be argued that the positive or negative correlation coefficients should not
automatically discount any variable from what will be a predictive model, the negative
correlation of newspapers suggests that as a firm increases investment in newspaper ads, it will
decrease product sales. This does not make sense in this case study. Given the illogic of such a
relationship, its potential use as an independent variable in a model is questionable. Also, this
negative correlation poses several questions that should be considered. Was the data set
correctly collected? Is the data set accurate? Was the sample large enough to have included
enough data for this variable to show a positive relationship? Should it be included for further
analysis? Although it is possible that a negative relationship can statistically show up like this,
it does not make sense in this case. Based on this reasoning and the fact that the correlation is
not statistically significant, this variable (i.e., newspaper ads) will be removed from further
consideration in this exploratory analysis to develop a predictive model.
Some researchers might also exclude POS based on the insignificance (p = 0.479) of its
relationship with product sales. However, for purposes of illustration, we continue to consider it
a candidate for model inclusion. Also, the other two independent variables (radio and TV)
were both found to be significantly related to product sales, as reflected in the correlation
coefficients in the tables.
The procedure by which multiple regression can be used to evaluate which independent
variables are best to include or exclude in a linear model is called step-wise multiple
regression. It is based on an evaluation of regression models and their validation statistics—
specifically, the multiple correlation coefficients and the F-ratio from an ANOVA. SPSS
software and many other statistical systems build in the step-wise process. Some are called
backward step-wise regression and some are called forward step-wise regression. The
backward step-wise regression starts with all the independent variables placed in the model,
and the step-wise process removes them one at a time based on worst predictors first until a
statistically significant model emerges. The forward step-wise regression starts with the best-
related variable (using correlation analysis as a guide) and then step-wise adds other variables
until adding more will no longer improve the accuracy of the model. The forward step-wise
regression process will be illustrated here manually. The first step is to generate individual
regression models and statistics for each independent variable with the dependent variable one
at a time. These three models are presented in Tables 6.5, 6.6, and 6.7 for the POS, radio, and
TV variables, respectively. The comparable Excel regression statistics are presented in Tables
6.8, 6.9 and 6.10 for the POS, radio, and TV variables, respectively.
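A rough sketch of the forward step-wise idea on hypothetical data: at each pass the candidate predictor that most improves R^2 is added, and the process stops when the gain becomes negligible. (The procedure described in the text judges models with significance tests and F-ratios; the simple R^2-improvement rule used here is only a stand-in.)

import numpy as np

def r_squared(X, y):
    # Fit y = b0 + X*b by least squares and return R^2.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Hypothetical monthly data ($000): candidate predictors and product sales
predictors = {
    "radio": np.array([10, 12, 15, 18, 20, 22, 25, 28], dtype=float),
    "tv":    np.array([30, 28, 35, 40, 38, 45, 50, 55], dtype=float),
    "pos":   np.array([5, 7, 6, 8, 9, 7, 10, 9], dtype=float),
}
sales = np.array([110, 118, 126, 140, 151, 158, 170, 184], dtype=float)

selected, best_r2 = [], 0.0
while True:
    remaining = [name for name in predictors if name not in selected]
    if not remaining:
        break
    scores = {name: r_squared(np.column_stack([predictors[v] for v in selected + [name]]), sales)
              for name in remaining}
    name, r2 = max(scores.items(), key=lambda kv: kv[1])
    if r2 - best_r2 < 0.01:     # negligible improvement: stop adding variables
        break
    selected.append(name)
    best_r2 = r2

print(selected, round(best_r2, 3))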
The case study firm had collected a random sample of monthly sales information presented
in Figure 6.4 listed in thousands of dollars. What the firm wants to know is, given a fixed budget
of $350,000 for promoting this service product, when offered again, how best should the
company allocate budget dollars in hopes of maximizing the future estimated month’s product
sales? Before making any allocation of budget, there is a need to understand how to estimate
future product sales. This requires understanding the behavior of product sales relative to sales
promotion efforts using radio, newspaper, TV, and point-of-sale (POS) ads.
Figure 6.4 Data for marketing/planning case study
The analysis also revealed little regarding the relationship of newspaper and POS ads to
product sales. So although radio and TV commercials are most promising, a more in-depth
predictive analytics analysis is called for to accurately measure and document the degree of
relationship that may exist in the variables to determine the best predictors of product sales.
-------------------------------------------------------------------------------------------------------------------------
PRESCRIPTIVE MODELLING:
After undertaking the descriptive and predictive analytics steps in the BA process, one
should be positioned to undertake the final step: prescriptive analytics analysis. The prior
analysis should provide a forecast or prediction of what future trends in the business may
hold. For example, there may be significant statistical measures of increased (or decreased)
sales, profitability trends accurately measured in dollars for new market opportunities, or
measured cost savings from a future joint venture.
Step 3 of the BA process, prescriptive analytics, involves the application of decision
science, management science, or operations research methodologies to make best use of
allocable resources. These are mathematically based methodologies and algorithms
designed to take variables and other parameters into a quantitative framework and generate
an optimal or near-optimal solution to complex problems. These methodologies can be used
to optimally allocate a firm’s limited resources to take best advantage of the opportunities
it has found in the predicted future trends. Limits on human, technology, and financial
resources prevent any firm from going after all the opportunities. Using prescriptive
analytics allows the firm to allocate limited resources to optimally or near-optimally achieve
the objectives as fully as possible.
The prescriptive analytic methodologies that are in some cases utilized in the BA process
are again listed in Figure 7.1, which forms the basis of this chapter's content.
Prescriptive Modeling:
The prescriptive analytic methods and models listed in Figure 7.1 are but a small
grouping of the many operations research, decision science, and management science
methodologies applied in this step of the BA process. Most of the methodologies in Table 7.1
are explained throughout this book (see the Additional Information column in Table 7.1).
NON-LINEAR OPTIMIZATION:
When business performance cost or profit functions become too complex for simple
linear models to be useful, exploration of nonlinear functions is a standard practice in BA.
Although the predictive nature of exploring for a mathematical expression to denote a trend
or establish a forecast falls mainly in the predictive analytics step of BA, the use of the
nonlinear function to optimize a decision can fall in the prescriptive analytics step.
There are many nonlinear mathematical programming methodologies and solution
procedures designed to generate optimal business performance solutions. Most of them
require careful estimation of parameters that may or may not be accurate, particularly since
a solution's precision can depend so precariously on parameter accuracy. This challenge is
further complicated in BA by the large data files that should be factored into the
model-building effort.
To overcome these limitations and be more inclusive in the use of large data, regression
software can be applied. Curve-fitting software can be used to generate predictive
analytic models that can also be utilized to aid in making prescriptive analytic decisions.
For purposes of illustration, SPSS's Curve Fitting software will be used in this chapter.
Suppose that a resource allocation decision is being faced whereby one must decide how
many computer servers a service facility should purchase to optimize the firm’s costs of
running the facility. The firm’s predictive analytics effort has shown a growth trend. A new
facility is called for if costs can be minimized. The firm has a history of setting up large and
small service facilities and has collected the 20 data points in Figure 7.2.
Figure 7.2 Data and SPSS Curve Fitting function selection window
The first step in using the curve-fitting methodology is to generate the best-fitting
curve to the data. By selecting all the SPSS models in Figure 7.2, the software fits each model
to the data using the regression process of minimizing the distance of the data points from the
fitted line. The result is
a series of regression models and statistics, including ANOVA and other testing statistics.
It is known from the previous illustration of regression that the adjusted R-Square statistic
can reveal the best estimated relationship between the independent (number of servers) and
dependent (total cost) variables. These statistics are presented in Table 7.2. The best adjusted
R-Square value (the largest) occurs with the quadratic model, followed by the cubic model.
The more detailed supporting statistics for both of these models are presented in the table
below. The graph for all the SPSS curve-fitting models appears in Figure 7.4.
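As a rough illustration of this curve-fitting idea, the Python sketch below stands in for the SPSS Curve Fitting procedure: the (number of servers, total cost) pairs are hypothetical placeholders for the Figure 7.2 data. It fits linear, quadratic, and cubic polynomials, compares their adjusted R-Square values, and then uses the best-fitting quadratic curve prescriptively to find the cost-minimizing number of servers.

```python
# Sketch of curve fitting plus a prescriptive step; data values are hypothetical
# and numpy.polyfit substitutes for SPSS's Curve Fitting procedure.
import numpy as np

servers = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20], dtype=float)       # hypothetical
cost    = np.array([95, 70, 55, 48, 46, 49, 57, 70, 88, 110], dtype=float)  # hypothetical, $000s

def adjusted_r2(y, y_hat, k):
    """Adjusted R-square for a model with k predictors (excluding the intercept)."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Fit linear, quadratic, and cubic polynomials and compare adjusted R-square.
for degree in (1, 2, 3):
    coefs = np.polyfit(servers, cost, degree)
    y_hat = np.polyval(coefs, servers)
    print("degree", degree, "adjusted R-square:", round(adjusted_r2(cost, y_hat, degree), 4))

# Prescriptive step: minimize the fitted quadratic cost curve c(x) = a*x^2 + b*x + c0.
a, b, c0 = np.polyfit(servers, cost, 2)
best_x = -b / (2 * a)            # vertex of the parabola (valid when a > 0)
print("cost-minimizing number of servers is approximately", round(best_x, 1))
```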
-----------------------------------------------------------------------------------------------------------
FORECASTING MODELS FOR TIME SERIES WITH LINEAR TREND:
For time series with a linear trend but no significant seasonal components, double
moving average and double exponential smoothing models are more appropriate than using
simple moving average or exponential smoothing models. Both methods are based on the linear
trend equation:
F_{t+k} = a_t + b_t k …………………………………… (9.6)
That is, the forecast for k periods into the future from period t is a function of a base
value at, also known as the level, and a trend, or slope, bt. Double moving average and double
exponential smoothing differ in how the data are used to arrive at appropriate values for at and
bt. Because the calculations are more complex than for simple moving average and exponential
smoothing models, it is easier to use forecasting software than to try to implement the models
directly on a spreadsheet. Therefore, we do not discuss the theory or formulas underlying the
methods. XLMiner does not support a procedure for double moving average; however, it does
provide one for double exponential smoothing.
DOUBLE EXPONENTIAL SMOOTHING:
In double exponential smoothing, the estimates of a_t and b_t are obtained from the following
equations:
a_t = αA_t + (1 - α)(a_{t-1} + b_{t-1})
b_t = β(a_t - a_{t-1}) + (1 - β)b_{t-1} …………………………………… (9.7)
In essence, we are smoothing both parameters of the linear trend model. From the first
equation, the estimate of the level in period t is a weighted average of the observed value at
time t and the predicted value at time t, a_{t-1} + b_{t-1}, based on simple exponential smoothing.
For large values of α, more weight is placed on the observed value; lower values of α put more
weight on the smoothed predicted value. Similarly, from the second equation, the estimate of
the trend in period t is a weighted average of the difference between the estimated levels in periods
t and t - 1 and the estimate of the trend in period t - 1.
Larger values of β place more weight on the difference in the levels, while lower values
of β put more emphasis on the previous estimate of the trend. Initial values are chosen as
a_1 = A_1 and b_1 = A_2 - A_1. Equations (9.7) must then be used to compute a_t and b_t for the entire time
series to be able to generate forecasts into the future. As with simple exponential smoothing,
we are free to choose the values of α and β. However, it is easier to let XLMiner optimize these
values using historical data.
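A minimal Python sketch of the double exponential smoothing recursions in equations (9.7) is shown below; the sales series and the smoothing constants α = 0.5 and β = 0.3 are hypothetical, since in practice XLMiner (or other software) would optimize them from the historical data.

```python
# Sketch of double (Holt's) exponential smoothing per equations (9.7);
# the series and smoothing constants are hypothetical.
def double_exponential_smoothing(series, alpha, beta, horizon=1):
    """Return one-step-ahead fitted forecasts and a forecast `horizon` periods beyond the series."""
    a = series[0]                 # initial level:  a1 = A1
    b = series[1] - series[0]     # initial trend:  b1 = A2 - A1
    fitted = []
    for t in range(1, len(series)):
        fitted.append(a + b)                              # forecast made from the prior level and trend
        a_prev = a
        a = alpha * series[t] + (1 - alpha) * (a + b)     # level update (first equation of 9.7)
        b = beta * (a - a_prev) + (1 - beta) * b          # trend update (second equation of 9.7)
    return fitted, a + b * horizon                        # forecast k = horizon periods ahead

sales = [12, 14, 15, 17, 20, 21, 23, 26]   # hypothetical series with a roughly linear trend
fitted, next_forecast = double_exponential_smoothing(sales, alpha=0.5, beta=0.3)
print(round(next_forecast, 2))
```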
-------------------------------------------------------------------------------------------------------------
FORECASTING TIME SERIES WITH SEASONALITY:
Quite often, time-series data exhibit seasonality, especially on an annual basis. When
time series exhibit seasonality, techniques that explicitly account for the seasonal pattern
provide better forecasts than techniques that ignore it.
(A) REGRESSION-BASED SEASONAL FORECASTING MODELS:
One approach is to use linear regression. Multiple linear regression models with
categorical variables can be used for time series with seasonality. To do this, we use
dummy categorical variables for the seasonal components.
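A small Python sketch of this regression approach is given below, assuming hypothetical quarterly data; each quarter gets a dummy variable (one is dropped to avoid perfect collinearity), and a time index captures any trend.

```python
# Sketch of a regression-based seasonal model using dummy (categorical) variables;
# the quarterly data and column names here are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "t":       range(1, 13),                                    # time index
    "quarter": [1, 2, 3, 4] * 3,                                # season label
    "sales":   [10, 14, 18, 12, 11, 15, 20, 13, 12, 16, 21, 14],
})

# One dummy per quarter, dropping the first to avoid perfect collinearity.
dummies = pd.get_dummies(df["quarter"], prefix="Q", drop_first=True, dtype=float)
X = sm.add_constant(pd.concat([df["t"], dummies], axis=1).astype(float))
model = sm.OLS(df["sales"].astype(float), X).fit()
print(model.params)   # intercept, trend coefficient, and quarter effects
```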
(B) HOLT-WINTERS FORECASTING FOR SEASONAL FORECASTING:
These models are based on the work of two researchers, C.C. Holt, who developed the basic
approach, and P.R. Winters, who extended Holt's work. Hence, these approaches are commonly
referred to as Holt-Winters models. Holt-Winters models are similar to exponential
smoothing models in that smoothing constants are used to smooth out variations in the
level and seasonal patterns over time. For time series with seasonality but no trend,
XLMiner supports a Holt-Winters method but does not have the ability to optimize the
parameters.
HOLT-WINTERS MODELS FOR FORECASTING TIME SERIES WITH
SEASONALITY AND TREND:
Many time series exhibit both trend and seasonality. Such might be the case for growing
sales of a seasonal product. These models combine elements of both the trend and seasonal
models. Two types of Holt-Winters smoothing models are often used.
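The sketch below fits one such model, an additive Holt-Winters model with both trend and seasonal components, using the statsmodels library as a stand-in for XLMiner; the quarterly sales series is hypothetical.

```python
# Sketch of an additive Holt-Winters model (trend + seasonality);
# statsmodels stands in for XLMiner and the quarterly series is hypothetical.
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

sales = pd.Series([10, 14, 18, 12, 12, 16, 21, 14, 14, 19, 25, 17])  # 3 years of quarterly data

model = ExponentialSmoothing(sales, trend="add", seasonal="add",
                             seasonal_periods=4).fit()   # smoothing constants optimized by fit()
print(model.forecast(4))   # forecasts for the next four quarters
```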
In the gasoline sales data, we also see that the average price per gallon changes each week, and
this may influence consumer sales. Therefore, the sales trend might not simply be a factor of steadily
increasing demand; it might also be influenced by the average price per gallon. The average price
per gallon can be considered a causal variable. Multiple linear regression provides a technique for
building forecasting models that incorporate not only time but also other potential causal variables.
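The following Python sketch illustrates this kind of causal regression model with hypothetical weekly figures: sales are regressed on both the week number (time) and the average price per gallon, and the fitted model is then used to forecast the next week under an assumed price.

```python
# Sketch: regression forecasting with time plus a causal variable (price per gallon);
# all weekly figures below are hypothetical.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "week":  range(1, 11),
    "price": [3.10, 3.15, 3.05, 3.20, 3.25, 3.18, 3.30, 3.28, 3.35, 3.40],
    "sales": [98, 96, 101, 93, 90, 94, 88, 89, 86, 84],      # thousands of gallons
})
X = sm.add_constant(df[["week", "price"]].astype(float))
model = sm.OLS(df["sales"].astype(float), X).fit()

# Forecast week 11 under an assumed average price of $3.45 per gallon.
new_X = pd.DataFrame({"const": [1.0], "week": [11.0], "price": [3.45]})
print(model.predict(new_X))
```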
----------------------------------------------------------------------------------------------------------------
4.2………………MONTE CARLO SIMULATION & RISK ANALYSIS:
MONTE CARLO SIMULATION USING ANALYTIC SOLVER
PLATFORM:
To use Analytic Solver Platform, you must perform the following steps:
RUNNING A SIMULATION:
To run a simulation, first click on the Options button in the Options group in the
Analytic Solver Platform ribbon. This displays a dialog (see Figure 12.7) in which you can
specify the number of trials and other options to run the simulation (make sure the Simulation
tab is selected). Trials per Simulation allows you to choose the number of times that Analytic
Solver Platform will generate random values for the uncertain cells in the model and recalculate
the entire spreadsheet. Because Monte Carlo simulation is essentially statistical sampling, the
larger the number of trials you use, the more precise the result will be. Unless the model is
extremely complex, a large number of trials will not unduly tax today's computers, so we
recommend that you use at least 5,000 trials (the educational version restricts this to a maximum
of 10,000 trials). You should use a larger number of trials as the number of uncertain cells in
your model increases, so that the simulation can generate representative samples from all the
assumption distributions. You may run more than one simulation if you wish to examine
the variability in the results.
The procedure that Analytic Solver Platform uses generates a stream of random numbers
from which the values of the uncertain inputs are selected from their probability distributions.
Specifying a random seed makes this stream reproducible: as long as you use the same seed
number, the assumption values generated will be the same for all simulations.
Analytic Solver Platform has alternative sampling methods; the two most common
are Monte Carlo and Latin Hypercube sampling. Monte Carlo sampling selects random variates
independently over the entire range of possible values of the distribution. With Latin
Hypercube sampling, the uncertain variable's probability distribution is divided into intervals
of equal probability, and a value is generated randomly within each interval. Latin Hypercube
sampling results in a more even distribution of output values because it samples the entire
range of the distribution in a more consistent manner, thus achieving more accurate forecast
statistics (particularly the mean) for a fixed number of Monte Carlo trials. However, Monte
Carlo sampling is more representative of reality and should be used if you are interested in
evaluating the model performance under various what-if scenarios. Unless you are an advanced
user, we recommend leaving the other options at their default values.
The last step is to run the simulation by clicking the Simulate button in the Solve Action
group. When the simulation finishes, you will see a message “Simulation finished successfully”
in the lower-left corner of the Excel window.
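For readers who want to see the underlying idea outside of Excel, the Python sketch below runs a simple Monte Carlo simulation by hand; the profit model, the assumed input distributions, and all parameter values are hypothetical. Each of the 5,000 trials draws the uncertain inputs from their distributions and recalculates the output, after which summary statistics are computed from the trial results. (Latin Hypercube sampling, mentioned above, is also available in Python via scipy.stats.qmc, but plain Monte Carlo sampling is shown here.)

```python
# Sketch of the core Monte Carlo idea; the profit model, distributions,
# and parameter values are all hypothetical.
import numpy as np

rng = np.random.default_rng(seed=42)   # fixed seed => the same random stream on every run
trials = 5000

# Uncertain inputs (assumed distributions).
demand     = rng.normal(loc=1000, scale=150, size=trials)   # units sold
unit_cost  = rng.uniform(low=6.0, high=8.0, size=trials)    # $ per unit
unit_price = 12.0
fixed_cost = 2500.0

# The output cell, recalculated once per trial.
profit = demand * (unit_price - unit_cost) - fixed_cost

print("mean profit:   ", round(profit.mean(), 2))
print("std deviation: ", round(profit.std(ddof=1), 2))
print("P(profit < 0): ", round((profit < 0).mean(), 4))
```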
PART-2:
(1) NEW PRODUCT DEVELOPMENT MODEL -----PG.NO. 414
(2) NEWS VENDOR MODEL---------------------------------------421
(3) OVERBOOKING MODEL---------------------------------------424
(4) CASH BUDGET MODEL----------------------------------------426
----------------------------------------------------------------------------------------------------------------
UNIT-5
5.1………………………………………………DECISION ANALYSIS
FORMULATING DECISION PROBLEMS:
Many decisions involve a choice from among a small set of alternatives with uncertain
consequences. We may formulate such decision problems by defining three things:
1. the decision alternatives that can be chosen,
2. the uncertain events that may occur after a decision is made along with their possible
outcomes, and
3. the consequences associated with each decision and outcome, which are usually
expressed as payoffs.
The outcomes associated with uncertain events (which are often called states of nature)
are defined so that one and only one of them will occur. They may be quantitative or qualitative.
For instance, in selecting the size of a new factory, the future demand for the product would be
an uncertain event. The demand outcomes might be expressed quantitatively in sales units or
dollars. On the other hand, suppose that you are planning a spring-break vacation to Florida in
January; you might define an uncertain event as the weather; these outcomes might be
characterized qualitatively: sunny and warm, sunny and cold, rainy and warm, rainy and cold,
and so on. A payoff is a measure of the value of making a decision and having a particular
outcome occur. This might be a simple estimate made judgmentally or a value computed from
a complex spreadsheet model. Payoffs are often summarized in a payoff table, a matrix whose
rows correspond to decisions and whose columns correspond to events. The decision maker
first selects a decision alternative, after which one of the outcomes of the uncertain event occurs,
resulting in the payoff.
----------------------------------------------------------------------------------------------------------------
DECISION STRATEGIES WITHOUT OUTCOME PROBABILITIES:
DECISION STRATEGIES FOR A MINIMIZE OBJECTIVE:
Aggressive (Optimistic) Strategy An aggressive decision maker might seek the option
that holds the promise of minimizing the potential loss. For a minimization objective, this
strategy is also often called a minimin strategy; that is, we choose the decision that minimizes
the minimum payoff that can occur among all outcomes for each decision. Aggressive decision
makers are often called speculators, particularly in financial arenas, because they increase their
exposure to risk in hopes of increasing their return; while a few may be lucky, most will not do
very well.
Conservative (Pessimistic) Strategy A conservative decision maker, on the other
hand, might take a more pessimistic attitude and ask, “What is the worst thing that might
result from my decision?” and then select the decision that represents the “best of the worst.”
Such a strategy is also known as a minimax strategy because we seek the decision that
minimizes the largest payoff that can occur among all outcomes for each decision.
Conservative decision makers are willing to forgo high returns to avoid undesirable losses.
This rule typically models the rational behavior of most individuals.
Opportunity-Loss Strategy A third approach that underlies decision choices for many
individuals is to consider the opportunity loss associated with a decision. Opportunity loss
represents the “regret” that people often feel after making a nonoptimal decision (I should have
bought that stock years ago!). In general, the opportunity loss associated with any decision and
event is the absolute difference between the payoff of the best decision for that particular outcome
and the payoff for the decision that was chosen. Opportunity losses can be only nonnegative values.
If you get a negative number, then you made a mistake. Once opportunity losses are computed,
the decision strategy is similar to a conservative strategy. The decision maker would select the
decision that minimizes the largest opportunity loss among all outcomes for each decision. For
these reasons, this is also called a minimax regret strategy.
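The three strategies are easy to compute once the payoff table is in hand. The Python sketch below uses a hypothetical cost (minimize-objective) payoff table with three plant-size decisions and three demand outcomes and applies the aggressive (minimin), conservative (minimax), and opportunity-loss (minimax regret) rules.

```python
# Sketch of the three strategies for a minimize (cost) objective with a hypothetical
# payoff table: rows = decisions, columns = outcomes, entries = costs in $000s.
import numpy as np

decisions = ["small plant", "medium plant", "large plant"]
payoffs = np.array([
    [100, 120, 140],        # small
    [ 90, 110, 160],        # medium
    [ 70, 130, 200],        # large
])

aggressive   = decisions[np.argmin(payoffs.min(axis=1))]   # minimin: best possible cost
conservative = decisions[np.argmin(payoffs.max(axis=1))]   # minimax: best of the worst costs

# Opportunity loss (regret): cost incurred minus the best (lowest) cost for that outcome.
regret = payoffs - payoffs.min(axis=0)
minimax_regret = decisions[np.argmin(regret.max(axis=1))]  # minimize the largest regret

print(aggressive, conservative, minimax_regret)
```

With these hypothetical numbers each rule picks a different decision, which is exactly why the decision maker's attitude toward risk matters.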
--------------------------------------------------------------------------------------------------------------------------------------
EVALUATING RISK:
An implicit assumption in using the average payoff or expected value strategy is that
the decision is repeated a large number of times.
----------------------------------------------------------------------------------------------------------------
DECISION TREES:
A useful approach to structuring a decision problem involving uncertainty is to use a
graphical model called a decision tree. Decision trees consist of a set of nodes and branches.
Nodes are points in time at which events take place. The event can be a selection of a decision
from among several alternatives, represented by a decision node, or an outcome over which
the decision maker has no control, an event node. Event nodes are conventionally depicted
by circles, and decision nodes are expressed by squares. Branches are associated with
decisions and events. Many decision makers find decision trees useful because sequences of
decisions and outcomes over time can be modeled easily.
Decision trees may be created in Excel using Analytic Solver Platform. Click the
Decision Tree button. To add a node, select Add Node from the Node drop down list, as shown
in Figure 16.2. Click on the radio button for the type of node you wish to create (decision or
event). This displays one of the dialogs shown in Figure 16.3. For a decision node, enter
the name of the node and the names of the branches that emanate from the node (you may also
add additional ones). The Value field can be used to input cash flows, costs, or revenues that
result from choosing a particular branch. For an event node, enter the name of the node and
branches. The Chance field allows you to enter the probabilities of the events.
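Beneath the graphical interface, evaluating a decision tree amounts to "rolling back" expected values from the event nodes to the decision node. The Python sketch below does this for a tiny hypothetical tree (one decision node with two alternatives, each followed by an event node with high/low demand outcomes); the payoffs and probabilities are assumed for illustration.

```python
# Sketch of rolling back a small decision tree; payoffs and probabilities are hypothetical.
# Decision: build a large or small facility; uncertain event: demand is high or low.

p_high, p_low = 0.6, 0.4                    # assumed outcome probabilities

# Expected value at each event node (payoffs in $000s).
ev_large = p_high * 300 + p_low * (-50)
ev_small = p_high * 150 + p_low * 80

# The decision node takes the branch with the best expected value.
best = max(("large facility", ev_large), ("small facility", ev_small), key=lambda x: x[1])
print(best)   # ('large facility', 160.0)
```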
VALUE OF INFORMATION:
When we deal with uncertain outcomes, it is logical to try to obtain better information
about their likelihood of occurrence before making a decision. The value of information
represents the improvement in the expected return that can be achieved if the decision maker
is able to acquire, before making a decision, additional information about the future event that will
take place. In the ideal case, we would like to have perfect information, which tells us with
certainty what outcome will occur. Although this will never occur, it is useful to know the value
of perfect information because it provides an upper bound on the value of any information that
we may acquire. The expected value of perfect information (EVPI) is the expected value
with perfect information (assumed at no cost) minus the expected value without any
information; again, it represents the most you should be willing to pay for perfect information.
The expected opportunity loss represents the average additional amount the decision
maker would have achieved by making the right decision instead of a wrong one. To find the
expected opportunity loss, we create an opportunity-loss table, as discussed earlier in this
chapter, and then find the expected value for each decision. It will always be true that the
decision having the best expected value will also have the minimum expected opportunity loss.
The minimum expected opportunity loss is the EVPI.
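A short worked sketch in Python is shown below using a hypothetical payoff table (maximize objective) and assumed outcome probabilities; it computes EVPI directly and confirms that it equals the minimum expected opportunity loss.

```python
# Worked sketch of EVPI with a hypothetical payoff table (maximize objective):
# rows = decisions, columns = outcomes, with assumed outcome probabilities.
import numpy as np

payoffs = np.array([          # hypothetical profits in $000s
    [ 50,  40,  30],
    [ 80,  60, -10],
    [120,  70, -40],
])
probs = np.array([0.3, 0.5, 0.2])

expected_values = payoffs @ probs
ev_best_decision = expected_values.max()               # expected value without added information

# With perfect information we would always pick the best decision for each outcome.
ev_perfect_info = (payoffs.max(axis=0) * probs).sum()
evpi = ev_perfect_info - ev_best_decision

# Cross-check: EVPI equals the minimum expected opportunity loss.
regret = payoffs.max(axis=0) - payoffs
min_expected_opportunity_loss = (regret @ probs).min()

print(round(evpi, 2), round(min_expected_opportunity_loss, 2))   # the two values agree
```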
----------------------------------------------------------------------------------------------------------------