Business Analytics 3

The document provides an overview of business forecasting and predictive analytics, highlighting the importance of accurate forecasting in guiding business strategy and decision-making. It discusses various forecasting techniques, including qualitative, time series, causal, and quantitative methods, and outlines the steps involved in the forecasting process. Additionally, it emphasizes the role of predictive analytics in leveraging historical data to anticipate future events and improve business outcomes.


UNIT III BUSINESS FORECASTING

Introduction to Business Forecasting and Predictive Analytics – Logic and Data-Driven Models
– Data Mining and Predictive Analytics Modeling – Machine Learning for Predictive Analytics.

INTRODUCTION TO BUSINESS FORECASTING AND PREDICTIVE ANALYTICS

Companies conduct business forecasts to determine their goals, targets, and project plans for
each new period, whether quarterly, annually, or over a 2–5-year horizon. Some companies
utilize predictive analytics software to collect and analyze the data necessary to make an
accurate business forecast. Predictive analytics solutions give you the tools to store data,
organize information into comprehensive datasets, develop predictive models to forecast
business opportunities, adapt datasets to data changes, and import and export data from other
channels.

Forecasting helps managers guide strategy and make informed decisions about critical business
operations such as sales, expenses, revenue, and resource allocation. When done right,
forecasting adds a competitive advantage and can be the difference between successful and
unsuccessful companies.

What is business forecasting?


Business forecasting is a projection of future developments of a business or industry based on
trends and patterns of past and present data.
Choosing the right forecasting method depends on many factors:
 significance and accessibility of historical data,
 the background of the prediction,
 time available for analysis,
 the degree of precision anticipated,
 value to the company,
 and the desired time period for the forecast.

Downloaded by P. AJITHA CSE


Managers and forecasters need to work together to achieve successful forecasting; they must
try to answer the following questions:

1. How is the forecast going to be used? (precisely its purpose)
2. What are the mechanisms and sensitivities of the system for which the forecast is made?
3. How relevant is the past in predicting the future?
TYPES OF FORECASTING MODELS



Qualitative Techniques:

Qualitative techniques are applied when enough data is not available – for example, when a
product is launched in the market for the first time. They use human evaluation and rating
schemes to convert qualitative judgments into quantitative estimates.

The goal is to gather all information and considerations related to the factors being evaluated in a
logical, impartial, and systematic manner. Such methods are often used in the field of new
technologies, where the development of product ideas may require more “invention”, making it
difficult to estimate research and development requirements, and where market perception and
entry are very uncertain.

Qualitative business forecasting relies on predictions and projections based on experts' and
customers' opinions. This method is best when there is insufficient past data to analyze to reach a
quantitative forecast. In these cases, industry experts and forecasters piece together available data
to make qualitative predictions.

Qualitative models are most successful with short-term projections. They are expert-driven,
drawing on contrasting opinions and relying on judgment rather than calculable data.
Examples of qualitative models in business forecasting include:

• Market research: This involves polling people – experts, customers, employees – to
get their preferences, opinions, and feedback on a product or service.
• Delphi method: The Delphi method relies on asking a panel of experts for their opinions
and recommendations and compiling them into a forecast.



Time Series Analysis

A time series is a set of observations on the values that a variable takes at different times.
Examples: sales trends, stock market prices, weather forecasts, etc. In simple terms, consider
sales data recorded month by month: in January you sold 150 units, in February a bit more, say
about 300, and so on for all 12 months. That monthly sales record is a time series, and given
that there is a pattern in it, we can predict future sales of the same unit.
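The monthly sales pattern above can be treated as a time series in code. Below is a minimal Python sketch: the January and February figures are the ones from the paragraph, the remaining ten months are assumed values for illustration, and the forecast is a simple moving average of the most recent months.

```python
# Hypothetical monthly sales (units): January = 150, February ≈ 300,
# and assumed values for the remaining months of the year.
sales = [150, 300, 320, 280, 310, 350, 330, 360, 400, 380, 420, 450]

def moving_average_forecast(series, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

next_month = moving_average_forecast(sales)
print(round(next_month, 1))  # mean of the last three months: (380 + 420 + 450) / 3
```

A wider window smooths out noise but reacts more slowly to trends; picking it is itself a forecasting decision.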

Causal Methods:

Causal forecasting recognizes that the dependent variable being predicted is affected by one or
more independent variables. Causal methods take into account all possible factors that may affect
the dependent variable. Consequently, the data necessary for such forecasting can vary from
internal data to external data, such as surveys, macroeconomic indicators, product characteristics,
social chatter, etc. Typically, causal models are continually revised to ensure that the latest data
are included in the model.



Quantitative business forecasting

Use quantitative forecasting when there is accurate past data available to analyze patterns and
predict the probability of future events in your business or industry.

Quantitative forecasting extracts trends from existing data to determine the more probable
results. It connects and analyzes different variables to establish cause and effect between events,
elements, and outcomes. An example of data used in quantitative forecasting is past sales
numbers.

Quantitative models work with data, numbers, and formulas. There is little human interference in
quantitative analysis. Examples of quantitative models in business forecasting include:

• The indicator approach: This approach depends on the relationship between specific
indicators being stable over time, e.g., GDP and the unemployment rate. By following the
relationship between these two factors, forecasters can estimate a business's performance.
• The average approach: This approach infers that the predictions of future values are equal
to the average of the past data. It is best to use this approach only when assuming that the
future will resemble the past.
• Econometric modeling: Econometric modeling is a mathematically rigorous approach to
forecasting. Forecasters assume the relationships between indicators stay the same and test
the consistency and strength of the relationship between datasets.
• Time-series methods: Time-series methods use historical data to predict future outcomes.
By tracking what happened in the past, forecasters expect to get a near-accurate view of the
future.
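The average approach from the list above is simple enough to sketch in a few lines of Python (the past sales figures are invented for illustration):

```python
# Illustrative past sales data for five periods.
past_sales = [120, 135, 150, 160, 155]

def average_forecast(series):
    """Average approach: the forecast for every future period is the mean of the past."""
    return sum(series) / len(series)

print(average_forecast(past_sales))  # (120 + 135 + 150 + 160 + 155) / 5 = 144.0
```

As the text notes, this is only sensible when the future is expected to resemble the past; a trending series would make the average a poor forecast.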

How do you choose the right business forecasting technique?

Choosing the right business forecasting technique depends on many factors. Some of these are:



• Context of the forecast
• Availability and relevance of past data

• Degree of accuracy required

• Allocated time to conduct the forecast

• Costs and benefits of the forecast

• Stage of the product or business needing the forecast

Managers and forecasters must consider the stage of the product or business, as this influences
the availability of data and how you establish relationships between variables. A new startup
with no previous revenue data would be unable to use quantitative methods in its forecast. The
more you understand the use, capabilities, and impact of different forecasting techniques, the
better equipped you are to choose the right one.
What are the integral elements of business forecasting?

While there are different forecasting techniques and methods, all forecasts follow the same
process on a conceptual level. Standard elements of business forecasting include:
• Prepare the stage: Before you begin, develop a system to investigate the current state of
business.

• Choose a data point: An example for any business could be "What is our sales projection
for next quarter?"

• Choose indicators and data sets: Identify the relevant indicators and data sets you need
and decide how to collect the data.

• Make initial assumptions: To kick start the forecasting process, forecasters may
make some assumptions to measure against variables and indicators.

• Select forecasting technique: Pick the technique that fits your forecast best.

• Analyze data: Analyze available data using your selected forecasting technique.

• Estimate forecasts: Estimate future conditions based on data you've gathered to reach data-
backed estimates.

• Verify forecasts: Compare your forecast to the eventual results. This helps you identify any
problems, tweak errant variables, correct deviations, and continue to improve your
forecasting technique.
• Review forecasting process: Review any deviations between your forecasts and actual
performance data.

Choose the best forecasting methods based on the stage of the product or business life cycle, availability
of past data, and skills of the forecasters and managers leading the project.
When you have these answers, you can start collecting data from two main sources:

• Primary sources: These sources are gathered first-hand using reporting tools — you or
members of your team source data through interviews, surveys, research, or observations.

• Secondary sources: Secondary sources are second-hand information or data that others
have collected. Examples include government reports, publications, financial statements,
competitors' annual reports, journals, and other periodicals.
BUSINESS FORECASTING PROCESS

The way a company forecasts is always unique to its needs and resources, but the primary
forecasting process can be summed up in five steps. These steps outline how business forecasting
starts with a problem and ends with not only a solution but valuable learnings.

1. Choose an issue to address

The first step in predicting the future is choosing the problem you’re trying to solve or the
question you’re trying to answer. This can be as simple as determining whether your audience
will be interested in a new product your company is developing. Because this step doesn’t yet
involve any data, it relies on internal considerations and decisions to define the problem at hand.



2. Create a data plan

The next step in forecasting is to collect as much data as possible and decide how to use it. This
may require digging up some extensive historical company data and examining the past and
present market trends. Suppose your company is trying to launch a new product. In this case, the
gathered data can be a culmination of the performance of your previous product and the current
performance of similar competing products in the target market.

3. Pick a forecasting technique

After collecting the necessary data, it’s time to choose a business forecasting technique that
works with the available resources and the type of prediction. All the forecasting models are
effective and get you on the right track, but one may be more favorable than others in creating a
unique, comprehensive forecast.
For example, if you have extensive data on hand, quantitative forecasting is ideal for
interpretation. Qualitative forecasting is best if you have less hard data available and are willing
to invest in extensive market research.

4. Analyze the data

Once the ball starts rolling, you can begin identifying patterns in the past and predict the
probability of their repetition. This information will help your company’s decision-makers
determine what to do beforehand to prepare for the predicted scenarios.
5. Verify your findings

The end of business forecasting is simple. You wait to see if what you predicted actually
happens. This step is especially important in determining not only the success of your forecast
but also the effectiveness of the entire process. Having done some forecasting, you can compare
the present experience with these forecasts to identify potential areas for growth.
When in doubt, never throw away “old” data. The final information of one forecasting process
can also be used as the past data for another forecast. It’s like a life cycle of business
development predictions.

Business forecasting examples

Some forecasting examples for business include:

1. Calculating cash flow forecasts, i.e., predicting your financial needs within a timeframe
2. Estimating the threat of new entrants into your market

3. Measuring the opportunity of developing a new product or service


4. Estimating the costs of recurring bills

5. Predicting future sales growth based on past sales performance

Business Forecasting: How it Works & Real-Life Examples

The rapidly evolving modern business climate has shown how fast things can change, and
businesses must evolve alongside it to succeed. Today’s world requires agile strategy and
management.

This is where business forecasting can help, enabling businesses to plan for unexpected
events. In this section, you’ll learn the basic principles of business forecasting and how to implement
forecasting techniques in your business planning.

Examples of business forecasting in action

Now that you understand the basics of business forecasting, it’s time to see how it works in
practice. Read the following examples to better understand the different approaches to
business forecasting.
1. A company forecasting its sales through the end of the year

Let’s suppose a small greeting card company wants to forecast its sales through the end of
the year. The company has just a year and a half of experience and limited data to use for
predictions. Though the first few quarters were slow to start, they have gained a great
reputation in the last three quarters. For this reason, sales are on the rise.

Since the business has limited historical data, they might consider a qualitative model for
predicting future sales. By polling their customers, the greeting card company can gauge the
willingness of their audience to buy new cards and pricing for the remaining quarters of the
year. Market surveys are a type of qualitative forecasting, which utilizes questionnaires to
estimate future customer behavior.

2. A company forecasting sales for the next quarter

In this example, let’s suppose a well-established shoe brand is forecasting profits for the next
quarter. Normally, this company would use the time series forecasting technique to estimate
profits for the next quarter. However, economic conditions have shifted, and the
unemployment rate is higher than normal. As a result, the company chooses the indicator
approach to predict the actual performance of its product.



In this scenario, the company might compare two variables: employment rate and spending
rates. With this business forecasting approach, the company predicts it will have a decrease
in profits for the upcoming quarter. Following this prediction, it chooses to produce fewer
items in response to economic changes and adjust budgets accordingly.

3. A company forecasting returns on a new product

In this next example, let’s suppose a loungewear company plans on rolling out a new product:
slippers. Since this product is new to the company, there are no official metrics for pricing
and popularity. For this reason, the company needs to gauge the interest level of its target
audience.

In this case, demand forecasting would be a great approach to gauge how much customers
are willing to spend and how much the company will need to invest in terms of materials. By
using this forecasting process, the loungewear company can decide if the product will
perform well and what kind of demand exists. Ultimately, this will help the team make
informed business decisions for production as well as sales.

PREDICTIVE ANALYTICS

Predictive analytics uses historical data to predict future events. Typically, historical data is used
to build a mathematical model that captures important trends. That predictive model is then used
on current data to predict what will happen next, or to suggest actions to take for optimal
outcomes.

Predictive analytics has received a lot of attention in recent years due to advances in supporting
technology, particularly in the areas of big data and machine learning.

Rise of Big Data

Predictive analytics is often discussed in the context of big data. Engineering data, for example,
comes from sensors, instruments, and connected systems out in the world. Business system data
at a company might include transaction data, sales results, customer complaints, and marketing
information. Increasingly, businesses make data-driven decisions based on this valuable trove of
information.

Increasing Competition

With increased competition, businesses seek an edge in bringing products and services to
crowded markets. Data-driven predictive models can help companies solve long-standing
problems in new ways.

Equipment manufacturers, for example, can find it hard to innovate in hardware alone. Product
developers can add predictive capabilities to existing solutions to increase value to the customer.
Using predictive analytics for equipment maintenance, or predictive maintenance, can anticipate
equipment failures, forecast energy needs, and reduce operating costs. For example, sensors
that measure vibrations in automotive parts can signal the need for maintenance before the
vehicle fails on the road.
Companies also use predictive analytics to create more accurate forecasts, such as forecasting the
demand for electricity on the electrical grid. These forecasts enable resource planning (for
example, scheduling of various power plants) to be done more effectively.

Predictive Analytics Examples


Predictive analytics helps teams in industries as diverse as finance, healthcare, pharmaceuticals,
automotive, aerospace, and manufacturing.

 Automotive – Breaking new ground with autonomous vehicles


Companies developing driver assistance technology and new autonomous vehicles use predictive
analytics to analyze sensor data from connected vehicles and to build driver assistance
algorithms.
 Aerospace – Monitoring aircraft engine health
To improve aircraft up-time and reduce maintenance costs, an engine manufacturer created a
real-time analytics application to predict subsystem performance for oil, fuel, liftoff, mechanical
health, and controls.
 Energy Production – Forecasting electricity price and demand
Sophisticated forecasting apps use models that monitor plant availability, historical trends,
seasonality, and weather.
 Financial Services – Developing credit risk models
Financial institutions use machine learning techniques and quantitative tools to predict credit
risk.
 Industrial Automation and Machinery – Predicting machine failures
A plastic and thin film producer saves 50,000 Euros monthly using a health monitoring and
predictive maintenance application that reduces downtime and minimizes waste.
 Medical Devices – Using pattern-detection algorithms to spot asthma and COPD
An asthma management device records and analyzes patients' breathing sounds and provides
instant feedback via a smart phone app to help patients manage asthma and COPD.

How Predictive Analytics Works

Predictive analytics is the process of using data analytics to make predictions based on data.
This process uses data along with analysis, statistics, and machine learning techniques to create
a predictive model for forecasting future events.

The term “predictive analytics” describes the application of a statistical or machine learning
technique to create a quantitative prediction about the future. Frequently, supervised machine
learning techniques are used to predict a future value (How long can this machine run before
requiring maintenance?) or to estimate a probability (How likely is this customer to default on a
loan?).
Predictive analytics starts with a business goal: to use data to reduce waste, save time, or cut
costs. The process harnesses heterogeneous, often massive, data sets into models that can
generate clear, actionable outcomes to support achieving that goal, such as less material waste,
less stocked inventory, and manufactured product that meets specifications.

Predictive Analytics Workflow

We are all familiar with predictive models for weather forecasting. A vital industry application of
predictive models relates to energy load forecasting to predict energy demand. In this case,
energy producers, grid operators, and traders need accurate forecasts of energy load to make
decisions for managing loads in the electric grid. Vast amounts of data are available, and using
predictive analytics, grid operators can turn this information into actionable insights.



Step-by-Step Workflow for Predicting Energy Loads
Typically, the workflow for a predictive analytics application follows these basic steps:

1. Import data from varied sources, such as web archives, databases, and
spreadsheets. Data sources include energy load data in a CSV file and national weather
data showing temperature and dew point.
2. Clean the data by removing outliers and combining data sources.
Identify data spikes, missing data, or anomalous points to remove from the data. Then
aggregate different data sources together – in this case, creating a single table including
energy load, temperature, and dew point.
3. Develop an accurate predictive model based on the aggregated data using
statistics, curve fitting tools, or machine learning.
Energy forecasting is a complex process with many variables, so you might choose to use
neural networks to build and train a predictive model. Iterate through your training data set
to try different approaches. When the training is complete, you can try the model against
new data to see how well it performs.



4. Integrate the model into a load forecasting system in a production environment.
Once you find a model that accurately forecasts the load, you can move it into your
production system, making the analytics available to software programs or devices, including
web apps, servers, or mobile devices.
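The four workflow steps above can be sketched end to end in plain Python. This is a toy illustration, not production code: in-memory dictionaries stand in for the CSV and weather feeds, dropping an obvious spike stands in for cleaning, and a one-variable least-squares fit stands in for the neural network the text mentions; every number is made up.

```python
# Step 1: "import" -- hourly energy load (MW) and temperature (°C); values illustrative.
load_data   = {"09:00": 310.0, "10:00": 330.0, "11:00": 9999.0, "12:00": 360.0}
temperature = {"09:00": 18.0,  "10:00": 21.0,  "11:00": 23.0,   "12:00": 26.0}

# Step 2: clean (drop the obvious spike at 11:00) and aggregate the two sources by hour.
clean = {t: v for t, v in load_data.items() if v < 1000.0}
table = [(temperature[t], clean[t]) for t in clean]

# Step 3: develop a model -- a one-variable least-squares fit of load on temperature.
mean_temp = sum(x for x, _ in table) / len(table)
mean_load = sum(y for _, y in table) / len(table)
slope = (sum((x - mean_temp) * (y - mean_load) for x, y in table)
         / sum((x - mean_temp) ** 2 for x, _ in table))

# Step 4: "integrate" -- expose the model as a function a production system could call.
def predict_load(temp):
    return mean_load + slope * (temp - mean_temp)

print(round(predict_load(24.0), 1))  # forecast load for a 24 °C hour
```

In a real deployment, steps 1–2 would read from databases and sensor feeds, step 3 would iterate over training data with a richer model, and step 4 would serve the model behind a web or grid-management application.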

Types of Predictive Models


While data analysts are required to make decisions regarding which mathematical model to use
in a given situation, they are not actually the ones crunching the data. Statisticians and
programmers develop computer programs that carry out these processes, each of which operates
using a different mathematical model.



“The tools we’re using for predictive analytics now have improved and become much more
sophisticated,” Goulding says, explaining that these advanced models have allowed us to “handle
massive amounts of data in ways we couldn’t before.”
The advancement of these tools has also resulted in the use of predictive analytics to identify
“unknowns” that previously could not be addressed, leading to an overall need for analysts that
can succinctly identify which model best aligns with the type of unknown in each scenario.
Below, we explore four common predictive models and the types of questions they can be best
used to answer.
1. Linear Regression
Linear regression is one of the most famous and historic modeling tools, according to Goulding.
This model considers all the known data points on a graph and fits a straight line through the
center of those data points – the line that minimizes the total (squared) distance between itself
and the points. A linear regression modeling tool can then base predictions about unseen data
on the relationship between this line and the existing data points.
Real-World Example

A linear regression model would be useful when a doctor wants to predict a new patient’s
cholesterol based only on their body mass index (BMI). In this example, the analyst would know
to put the data the doctor gathered from his 5,000 other patients—including each of their BMIs
and cholesterol levels—into the linear regression model. They are hoping to predict an unknown
based on a predetermined set of quantifiable data.



The linear regression model would take the data, plot it onto a graph, and fit a line through the
center that minimizes the distance to the plotted data points. In this scenario, when a new
patient arrives knowing only that their BMI is 31, a data analyst can predict the patient’s
cholesterol by looking at that line and seeing what cholesterol level most closely aligns with
other patients who have a BMI of 31.
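The doctor's scenario can be sketched with ordinary least squares. The six patient records below are invented for illustration, standing in for the 5,000-patient dataset in the text:

```python
# Invented illustrative data: BMI vs. total cholesterol (mg/dL) for six patients.
bmi         = [22, 25, 28, 31, 34, 37]
cholesterol = [180, 195, 205, 220, 235, 250]

def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(bmi, cholesterol)
predicted = slope * 31 + intercept   # the new patient with BMI 31
print(round(predicted, 1))           # ≈ 221.1 mg/dL on this toy data
```

Reading the prediction off the fitted line is exactly the "look at the line at BMI 31" step described above, just done numerically.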
2. Text Mining
Whereas linear regression uses only numeric data, mathematical models can also be used to
make predictions about non-numerical factors. Text mining is a perfect example.
“Text mining is part of predictive analytics in the sense that analytics is all about finding the
information I previously knew nothing about,” Goulding says. In this scenario, the tool takes
data points in the form of text-based words or phrases and searches a giant database for those
specific points.
Sound Familiar? The algorithm used by Google or other search engines to bring up relevant links
when you search for a specific keyword is an example of text mining.
Real-World Example
Although tools like search engines—or even the “find” function you may use when searching for
a word in a digital body of text—represent some common examples of text mining, there are also
industry-specific instances where this type of predictive analytics comes into play.
Goulding describes another medical application of predictive analytics, explaining how doctors
rely on text mining when analyzing patient symptoms and trying to determine the root cause. “If
I’m a doctor and I have 50 children in front of me with flu symptoms, my brain can figure out
that the next patient to walk in the door [with similar symptoms] also has the flu,” he says. “But
if I see an unusual set of symptoms from just one patient, I may need the case history of patients
from all over the world to make a correct diagnosis. My brain can’t help me do this; analytics,
however, can.”
Especially in complex patient cases, an analyst can use text mining modeling tools to comb
databases, locate similar symptoms among patients of the past, and generate a prediction as to
what this new patient is “most likely” suffering from based on that data.
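A toy version of this symptom lookup can be built from bag-of-words overlap. Everything below (the diagnoses and symptom strings) is invented for illustration; real text mining systems use far richer text representations than raw word counts:

```python
# Toy "case database": each known diagnosis maps to reported symptom text.
cases = {
    "influenza":   "fever cough fatigue body aches",
    "common cold": "sneezing runny nose sore throat cough",
    "measles":     "fever rash cough red eyes",
}

def best_match(query, database):
    """Return the record whose symptom text shares the most words with the query."""
    query_words = set(query.lower().split())
    scores = {name: len(query_words & set(text.split()))
              for name, text in database.items()}
    return max(scores, key=scores.get)

print(best_match("fever and rash with cough", cases))  # "measles" shares the most words
```

Scaled up to millions of case histories, the same idea – score text records against a query and surface the closest matches – is the core of the search-engine and diagnostic examples above.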
3. Optimal Estimation
Optimal estimation is a modeling technique that is used to make predictions based on observed
factors. This model has been used in analytics for over 50 years and has laid the groundwork for
many of the other predictive tools used today. According to Goulding, past applications of this
method include determining “how to best recalibrate equipment on a manufacturing floor…[and]
estimating where a bullet might go when shot,” as well as in other aspects of the defense
industry.
Real-World Example



If two planes were flying toward one another, an analyst might use the optimal estimation model
to predict if or when they will collide. To do this, the analyst would put a variety of observed
factors into the mathematical modeling tool, including each airplane’s position, altitude, speed,
heading, and more. The mathematical model would then help predict at which point, if any, the
planes would meet.
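Full optimal estimation usually involves machinery such as Kalman filtering, which is beyond a few lines, but the geometric core of the two-plane example – estimating the time of closest approach from observed positions and velocities – can be sketched directly. All positions and speeds below are invented, and the motion is simplified to constant-velocity 2-D flight:

```python
# Simplified 2-D sketch: each aircraft has a position (km) and a constant velocity (km/min).
p1, v1 = (0.0, 0.0),    (10.0, 0.0)
p2, v2 = (100.0, 5.0),  (-10.0, 0.0)

def closest_approach(p1, v1, p2, v2):
    """Time t >= 0 minimizing the distance between two linearly moving points."""
    dp = (p2[0] - p1[0], p2[1] - p1[1])   # relative position
    dv = (v2[0] - v1[0], v2[1] - v1[1])   # relative velocity
    dv2 = dv[0] ** 2 + dv[1] ** 2
    if dv2 == 0:                           # same velocity: distance never changes
        return 0.0
    t = -(dp[0] * dv[0] + dp[1] * dv[1]) / dv2
    return max(t, 0.0)

t = closest_approach(p1, v1, p2, v2)
print(t)  # time (minutes) at which the two aircraft are closest
```

If the distance at that time is near zero the model predicts a collision; a real system would feed noisy radar observations through an estimator rather than trust exact positions.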
4. Clustering Models
Clustering models are focused on finding different groups with similar qualities or elements
within the data. Many mathematical modeling tools fall within this category, including:
• K-Means
• Hierarchical Clustering
• TwoStep
• Density-Based Scan Clustering
• Gaussian Clustering Model
• Kohonen
Real-World Example
If a fast-food restaurant wanted to open a new location in a new city, the corporate team may
work with a data analyst to figure out exactly where that new location should go. The analyst
would start by gathering an array of specific, relevant data about each location—including
factors like demographics, where the high-end houses are, how close the location is to a college,
etc.—then input all of that data into a clustering mathematical model. This model would most
efficiently analyze this particular type of data and predict where the most strategic location in the
city for that restaurant is based on the data alone.
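A minimal k-means – the first algorithm in the list above – can be sketched in pure Python. The candidate-site "features" below are invented placeholders (imagine, say, an income index and distance to campus), not real location data:

```python
import statistics

# Toy candidate sites described by two illustrative features each.
sites = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0), (5.0, 5.1), (5.2, 4.8), (4.9, 5.0)]

def kmeans(points, centers, iterations=10):
    """Minimal k-means: alternate nearest-center assignment and centroid update."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # assign each point to its nearest center (squared Euclidean distance)
            i = min(range(len(centers)),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        centers = [(statistics.mean(q[0] for q in cl),
                    statistics.mean(q[1] for q in cl))
                   for cl in clusters if cl]
    return centers, clusters

centers, clusters = kmeans(sites, centers=[(0.0, 0.0), (6.0, 6.0)])
print(centers)  # two cluster centroids, one per group of similar sites
```

The analyst's job in the restaurant scenario is then interpretation: which cluster of sites looks most attractive, given what its centroid says about the features.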
5. Neural Networks
Neural networks are complex algorithms inspired by the structure of the human brain. They
process historical and current data and identify complex relationships within the data to predict
the future, similar to how the human brain can spot trends and patterns.
A typical neural network is composed of artificial neurons, called units, arranged in layers.
Input units receive the data the network learns from; output units, on the opposite side,
represent the network’s response to that input. Between the two are hidden layers – layers of
mathematical functions that transform the inputs to produce a specific output.
Real-World Example
If an e-commerce retailer wants to accurately predict which products its customers are likely to
consider purchasing in the future, a data analyst or data scientist might use neural networks to



inform the company’s product recommendation algorithm. The analyst will pull purchase data
and feed it to the neural network, giving the network real examples to learn from. This data will
travel through the neural network through various mathematical functions until the output is
produced and a product recommendation populates.
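The unit-and-layer structure described above can be illustrated with a single forward pass through a tiny network. The weights here are arbitrary placeholders; a real recommendation network would learn its weights from purchase data rather than have them written by hand:

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, w_output):
    """One forward pass: input layer -> hidden layer -> single output unit."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_output, hidden)))

# Two input features (e.g., recency and frequency of purchases, scaled 0-1),
# two hidden units, one output unit; all weights are invented placeholders.
score = forward(inputs=[1.0, 0.5],
                w_hidden=[[0.4, -0.2], [0.3, 0.8]],
                w_output=[1.5, -0.7])
print(round(score, 3))  # a "likelihood of interest" between 0 and 1
```

Training consists of adjusting those weight lists so the output matches real purchase outcomes; the forward pass itself is all the deployed recommender runs per product.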
Other Common Predictive Models
In addition to the mathematical models above, there are additional models that data analysts use
to make predictions, including:
• Decision trees
• Random forests
• Logistic regression
• Bayesian methods

LOGIC AND DATA DRIVEN MODELS


Predictive modeling means developing models that can be used to forecast or
predict future events. Models can be developed either through logic or through data.
1. Logic-driven models are based on experience, knowledge, and logical relationships
of variables and constants connected to the desired business performance outcome.
2. Data-driven models are models in which data is collected from many
sources to quantitatively establish model relationships. Logic-driven modeling is often used as a
first step to establish relationships that are then confirmed with data-driven models. Data-driven
models include sampling and estimation, regression analysis, correlation analysis, forecasting
models, and simulation.

Logic-Driven Models

Logic-driven models are created on the basis of inferences and postulations that the sample space and existing
conditions provide. Creating logical models requires a solid understanding of business functional areas, logical
skills to evaluate propositions, and knowledge of business practices and research.

To understand this better, consider a customer who visits a restaurant around six times a year and
spends around ₹5,000 per visit. The restaurant earns roughly a 40% margin on each bill. The annual
gross profit on that customer is therefore 5000 × 6 × 0.40 = ₹12,000. Each year, 30% of customers do not
return, while 70% do return to provide more business to the restaurant.

A logic-driven model is one based on experience, knowledge, and logical relationships of variables and
constants connected to the desired business performance outcome situation. The question here is how to put
variables and constants together to create a model that can predict the future. Doing this requires business
experience. Model building requires an understanding of business systems and the relationships of variables
and constants that seek to generate a desirable business performance outcome. To help conceptualize the
relationships inherent in a business system, diagramming methods can be helpful. For example, the cause-and-
effect diagram is a visual aid diagram that permits a user to hypothesize relationships between potential causes
of an outcome. This diagram lists potential causes in terms of human, technology, policy, and process
resources in an effort to establish some basic relationships that impact business performance. The diagram is
used by tracing contributing and relational factors from the desired business performance goal back to possible
causes, thus allowing the user to better picture sources of potential causes that could affect the performance.
This diagram is sometimes referred to as a fishbone diagram because of its appearance.

Another useful diagram to conceptualize potential relationships with business performance variables is called the influence
diagram. According to Evans, influence diagrams can be useful to conceptualize the relationships of variables in the
development of models. It maps the relationship of variables and a constant to the desired business performance outcome of
profit. From such a diagram, it is easy to convert the information into a quantitative model with constants and variables that
define profit in this situation:

Profit = Revenue − Cost, or

Profit = (Unit Price × Quantity Sold) - [(Fixed Cost) + (Variable Cost × Quantity Sold)], or

P = (UP × QS) - [FC + (VC × QS)]
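The profit model above translates directly into code; here is a minimal sketch with hypothetical numbers:

```python
def profit(unit_price, quantity_sold, fixed_cost, variable_cost):
    # P = (UP × QS) - [FC + (VC × QS)]
    revenue = unit_price * quantity_sold
    total_cost = fixed_cost + variable_cost * quantity_sold
    return revenue - total_cost

# Hypothetical figures: 20 per unit, 1,000 units sold, 5,000 fixed cost, 8 variable cost
print(profit(unit_price=20, quantity_sold=1000, fixed_cost=5000, variable_cost=8))  # 7000
```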

Data-Driven Models
Logic-driven modeling is often used as a first step to establish relationships through data-driven models (using
data collected from many sources to quantitatively establish model relationships). Common types include:

Sampling & Estimation


Sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a
statistical population to estimate characteristics of the whole population. The subset is meant to reflect the whole
population and statisticians attempt to collect samples that are representative of the population. Sampling has
lower costs and faster data collection compared to recording data from the entire population, and thus, it can
provide insights in cases where it is infeasible to measure an entire population.
Estimation in statistics refers to any procedure used to calculate the value of a population parameter from
observations within a sample drawn from that population.

Regression Analysis

Regression analysis is a set of statistical methods used for the estimation of relationships between a dependent
variable and one or more independent variables. It can be utilized to assess the strength of the relationship
between variables and for modeling the future relationship between them.
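A minimal sketch of simple linear regression (one independent variable) fitted by ordinary least squares; the data points are invented for illustration:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented data: advertising spend (x) versus sales (y)
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```

The fitted intercept and slope can then be used to predict the dependent variable for new values of the independent variable.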

Correlation Analysis
Correlation analysis is a statistical method used to discover whether there is a relationship between two
variables or datasets, and how strong that relationship may be.
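A minimal sketch of the Pearson correlation coefficient in plain Python; the data is invented (y is exactly double x, so the correlation is perfect):

```python
import math

def pearson_r(xs, ys):
    # Pearson r: covariance scaled by both standard deviations
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(round(r, 6))  # 1.0, a perfect positive correlation
```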

Probability Distribution
The probability distribution gives the possibility of each outcome of a random experiment or event. It provides
the probabilities of different possible occurrences.
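As a small worked example, the probability distribution of the sum of two fair dice can be tabulated by counting all 36 equally likely outcomes:

```python
from collections import Counter

# Count how often each sum occurs across all 36 equally likely outcomes
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
dist = {s: c / 36 for s, c in counts.items()}

print(round(dist[7], 4))             # 0.1667, seven is the most likely sum
print(round(sum(dist.values()), 4))  # 1.0, the probabilities sum to one
```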

Example –

A restaurant customer dines 6 times a year and spends an average of $50 per visit. The restaurant
realizes a 40% margin on the average bill for food and drinks.

Annual gross profit on a customer = $50(6)(0.40) = $120

30% of customers do not return each year. Average lifetime of a customer = 1/.3 = 3.33

years. Average gross profit for a customer = $120(3.33) = $400

OR Average gross profit for a customer = $120/.3 = $400

Thus, the economic value of a customer can be expressed as V = (R × F × M) / D, where:

• V = value of a loyal customer
• R = revenue per purchase
• F = purchase frequency (number of visits per year)
• M = gross profit margin
• D = defection rate (proportion of customers not returning each year)
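Putting these variables together (the formula implied by the arithmetic above is V = R × F × M / D), a quick sketch:

```python
def customer_value(revenue_per_purchase, frequency, margin, defection_rate):
    # Annual gross profit (R × F × M) spread over the average customer
    # lifetime (1 / D) gives the economic value of a loyal customer.
    return (revenue_per_purchase * frequency * margin) / defection_rate

# Figures from the example above: $50/visit, 6 visits/year, 40% margin, 30% defection
v = customer_value(50, 6, 0.40, 0.3)
print(round(v, 2))  # 400.0
```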

DATA MINING AND PREDICTIVE ANALYSIS MODELLING



PREDICTIVE MODELING

Predictive modeling is a method of predicting future outcomes by using data modeling. It’s one
of the premier ways a business can see its path forward and make plans accordingly. While not
foolproof, this method tends to have high accuracy rates, which is why it is so commonly used.

In short, predictive modeling is a statistical technique using machine learning and data mining to
predict and forecast likely future outcomes with the aid of historical and existing data. It works
by analyzing current and historical data and projecting what it learns on a model generated to
forecast likely outcomes. Predictive modeling can be used to predict just about anything, from
TV ratings and a customer’s next purchase to credit risks and corporate earnings.

A predictive model is not fixed; it is validated or revised regularly to incorporate changes in the
underlying data. In other words, it’s not a one-and-done prediction. Predictive models make
assumptions based on what has happened in the past and what is happening now. If incoming,
new data shows changes in what is happening now, the impact on the likely future outcome must
be recalculated, too. For example, a software company could model historical sales data against
marketing expenditures across multiple regions to create a model for future revenue based on the
impact of the marketing spend.

Most predictive models work fast and often complete their calculations in real time. That’s why
banks and retailers can, for example, calculate the risk of an online mortgage or credit card
application and accept or decline the request almost instantly based on that prediction.

Some predictive models are more complex, such as those used in computational
biology and quantum computing; the resulting outputs take longer to compute than a credit card
application but are done much more quickly than was possible in the past thanks to advances in
technological capabilities, including computing power.
Top 5 Types of Predictive Models

Fortunately, predictive models don’t have to be created from scratch for every application.
Predictive analytics tools use a variety of vetted models and algorithms that can be applied to a
wide spread of use cases.



Predictive modeling techniques have been refined over time. As we add more data, more
powerful computing, AI and machine learning, and overall advancements in analytics, we are
able to do more with these models.

The top five predictive analytics models are:

1. Classification model: Considered the simplest model, it categorizes data for simple and
direct query response. An example use case would be to answer the question “Is this a
fraudulent transaction?”
2. Clustering model: This model nests data together by common attributes. It works by
grouping things or people with shared characteristics or behaviors and plans strategies for
each group at a larger scale. An example is in determining credit risk for a loan applicant
based on what other people in the same or a similar situation did in the past.
3. Forecast model: This is a very popular model, and it works on anything with a numerical
value based on learning from historical data. For example, in answering how much
lettuce a restaurant should order next week or how many calls a customer support agent
should be able to handle per day or week, the system looks back to historical data.
4. Outliers model: This model works by analyzing abnormal or outlying data points. For
example, a bank might use an outlier model to identify fraud by asking whether a
transaction is outside of the customer’s normal buying habits or whether an expense in a
given category is normal or not. For example, a $1,000 credit card charge for a washer
and dryer in the cardholder’s preferred big box store would not be alarming, but $1,000
spent on designer clothing in a location where the customer has never charged other
items might be indicative of a breached account.
5. Time series model: This model evaluates a sequence of data points based on time. For
example, the number of stroke patients admitted to the hospital in the last four months is
used to predict how many patients the hospital might expect to admit next week, next
month or the rest of the year. A single metric measured and compared over time is thus
more meaningful than a simple average.
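As a toy illustration of the forecast/time-series idea, a moving-average forecast over recent history; the admission counts below are invented:

```python
def moving_average_forecast(history, window=3):
    # Predict the next value as the mean of the last `window` observations
    recent = history[-window:]
    return sum(recent) / len(recent)

# Hypothetical monthly hospital admissions for the last four months
admissions = [120, 130, 125, 135]
print(moving_average_forecast(admissions))  # (130 + 125 + 135) / 3 = 130.0
```

Real forecast models add trend and seasonality terms on top of this basic averaging idea.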

Common Predictive Algorithms


Predictive algorithms use one of two things: machine learning or deep learning. Both are
subsets of artificial intelligence (AI). Machine learning (ML) involves structured data, such as
spreadsheet or machine data. Deep learning (DL) deals with unstructured data such as video,
audio, text, social media posts and images—essentially the stuff that humans communicate with
that are not numbers or metric reads.
Some of the more common predictive algorithms are:

1. Random Forest: This algorithm is derived from a combination of decision trees, none of
which are related, and can use both classification and regression to classify vast amounts
of data.



2. Generalized Linear Model (GLM) for Two Values: This algorithm narrows down the
list of variables to find “best fit.” It can work out tipping points and change data
capture and other influences, such as categorical predictors, to determine the “best fit”
outcome, thereby overcoming drawbacks in other models, such as a regular linear
regression.
3. Gradient Boosted Model: This algorithm also uses several combined decision trees, but
unlike Random Forest, the trees are related. It builds out one tree at a time, thus enabling
the next tree to correct flaws in the previous tree. It’s often used in rankings, such as on
search engine outputs.
4. K-Means: A popular and fast algorithm, K-Means groups data points by similarities and
so is often used for the clustering model. It can quickly render things like personalized
retail offers to individuals within a huge group, such as a million or more customers with
a similar liking of lined red wool coats.
5. Prophet: This algorithm is used in time-series or forecast models for capacity planning,
such as for inventory needs, sales quotas and resource allocations. It is highly flexible
and can easily accommodate heuristics and an array of useful assumptions.
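To make the K-Means idea concrete, here is a minimal one-dimensional sketch in pure Python (real implementations, such as scikit-learn's, handle many dimensions and smarter initialization); the spend values are invented:

```python
def kmeans_1d(points, k=2, iters=10):
    # Minimal 1-D k-means: seed centroids with the first k points
    centroids = points[:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Invented customer spend values forming two natural groups
print(kmeans_1d([1, 2, 3, 10, 11, 12]))  # [2.0, 11.0]
```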
Benefits of Predictive Modeling
In a nutshell, predictive analytics reduce time, effort and costs in forecasting business outcomes.
Variables such as environmental factors, competitive intelligence, regulation changes and market
conditions can be factored into the mathematical calculation to render more complete views at
relatively low costs.
Examples of specific types of forecasting that can benefit businesses include demand forecasting,
headcount planning, churn analysis, external factors, competitive analysis, fleet and IT hardware
maintenance and financial risks.



Challenges of Predictive Modeling
It’s essential to keep predictive analytics focused on producing useful business insights because
not everything this technology digs up is useful. Some mined information is of value only in
satisfying a curious mind and has few or no business implications. Getting side-tracked is a
distraction few businesses can afford.
Also, being able to use more data in predictive modeling is an advantage only to a point. Too
much data can skew the calculation and lead to a meaningless or an erroneous outcome. For
example, more coats are sold as the outside temperature drops, but only to a point. People do not
buy more coats when it is -20 degrees Fahrenheit outside than they do when it is 5 degrees below
freezing. At a certain point, it is cold enough to spur the purchase of coats, and more frigid
temperatures no longer appreciably change that pattern.
And with the massive volumes of data involved in predictive modeling, maintaining security and
privacy will also be a challenge. Further challenges rest in machine learning’s limitations.
Limitations of Predictive Modeling
According to a McKinsey report, common limitations and their “best fixes” include:

1. Errors in data labeling: These can be overcome with reinforcement learning or generative adversarial networks (GANs).
2. Shortage of massive data sets needed to train machine learning: A possible fix is
“one-shot learning,” wherein a machine learns from a small number of demonstrations
rather than from a massive data set.
3. The machine’s inability to explain what and why it did what it did: Machines do not
“think” or “learn” like humans. Likewise, their computations can be so exceptionally
complex that humans have trouble finding, let alone following, the logic. All this makes it
difficult for a machine to explain its work, or for humans to do so. Yet model
transparency is necessary for a number of reasons, with human safety chief among them.
Promising potential fixes: local-interpretable-model-agnostic explanations (LIME) and
attention techniques.
4. Generalizability of learning, or rather lack thereof: Unlike humans, machines have
difficulty carrying what they’ve learned forward. In other words, they have trouble
applying what they’ve learned to a new set of circumstances. Whatever it has learned is
applicable to one use case only. This is largely why we need not worry about the rise of
AI overlords anytime soon. For predictive modeling using machine learning to be
reusable— that is, useful in more than one use case—a possible fix is transfer learning.
5. Bias in data and algorithms: Non-representation can skew outcomes and lead to
mistreatment of large groups of humans. Further, baked-in biases are difficult to find and
purge later. In other words, biases tend to self-perpetuate. This is a moving target, and no
clear fix has yet been identified.



The Future of Predictive Modeling
Predictive modeling, also known as predictive analytics, and machine learning are still young and
developing technologies, meaning there is much more to come. As techniques, methods, tools
and technologies improve, so will the benefits to businesses and societies.
However, these are not technologies that businesses can afford to adopt later, after the tech reaches
maturity and all the kinks are worked out. The near-term advantages are simply too strong for a
late adopter to overcome and remain competitive.

DATA MINING
The data mining process is used to extract patterns and probabilities from large datasets, which is why
it is widely used in business for forecasting trends. It is also used in fields such as marketing,
manufacturing, finance, and government to make predictions and analyses, using tools and techniques
like the R language and Oracle Data Mining. The process involves a flow of six distinct steps.

Advantages
Data mining offers advantages in business as well as in fields such as medicine, weather forecasting,
healthcare, transportation, insurance, and government. Some of the advantages include:

1. Marketing/Retail: It helps marketing companies and firms build models based on historical
data to predict responsiveness to today's marketing campaigns, such as online marketing
campaigns, direct mail, etc.

2. Finance/Banking: In financial institutions, data mining provides information about loans and
credit reporting. When a model is built on historical information, good or bad loans can then
be determined by the financial institutions. Furthermore, fraudulent and suspicious
transactions are monitored by the banks.

3. Manufacturing: Faulty equipment and the quality of manufactured products can be
determined by using optimal control parameters. For example, in some semiconductor
industries, water hardness and quality are a major challenge because they affect the quality
of the product.

4. Government: Governments can benefit by monitoring and gauging suspicious activities to
detect money-laundering.

Different Stages of Data Mining Process



The different stages of the data mining process are as follows:

1. Data cleansing: This is the initial stage in data mining, where classifying the data is an
essential component of obtaining the final data analysis. It involves identifying and removing
inaccurate and incomplete data from tables, databases, and record sets. Techniques include
ignoring the tuple, used mainly when the class label is missing, and filling in missing values,
either manually or by replacing missing and incorrect values with a global constant, a
predicted value, or the mean.

2. Data integration: This technique involves merging a new set of information with the existing
set. The sources may include many data sets, databases, or flat files. The customary
implementation of data integration is creating an EDW (enterprise data warehouse), which
involves two concepts, tight coupling and loose coupling, the details of which are beyond
this discussion.

3. Data transformation: This requires transforming data between formats, generally from the
source system to the required destination system. Strategies include smoothing, aggregation,
normalization, generalization, and attribute construction.

4. Data discretization: The technique that splits the domain of a continuous attribute into
intervals is called data discretization. The datasets are stored in small chunks, making
analysis much more efficient. The two strategies are top-down discretization and bottom-up
discretization.

5. Concept hierarchies: These reduce the data by replacing low-level concepts with high-level
concepts. Concept hierarchies define multi-dimensional data with multiple levels of
abstraction. Methods include binning, histogram analysis, cluster analysis, etc.

6. Pattern evaluation and data presentation: If the data is presented efficiently, clients and
customers can make use of it in the best possible way. After the stages above, the data is
presented in graphs and diagrams so that it can be understood with minimal statistical
knowledge.
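As an illustrative sketch of top-down discretization via equal-width binning (one of the binning methods mentioned above); the values are made up:

```python
def equal_width_bins(values, n_bins=3):
    # Split a continuous attribute into n_bins equal-width intervals
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # Map each value to its interval index; clamp the maximum into the last bin
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

print(equal_width_bins([1, 4, 7, 10, 13, 16]))  # [0, 0, 1, 1, 2, 2]
```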

Tools and Techniques


Data mining tools and techniques involve how these data can be mined and be put to fair and

effective use. The following two are among the most popular set of tools and techniques of data

mining:



1. R-language: It is an open-source tool that is used for graphics and statistical computing. It has

a wide variety of classical statistical tests, classification, graphical techniques, time-series

analysis, etc. It makes use of effective storage facilities and data handling.

2. Oracle Data Mining: Popularly known as ODM, it is part of the Oracle Advanced Analytics

database, generating detailed insights and predictions used to detect customer behavior,

develop customer profiles, and identify cross-selling opportunities.

One drawback is the training of staff on the software, which can be a complicated and

time-consuming task. Data mining has become a necessary component of systems today, and by

making efficient use of it, businesses can grow and predict their future sales and revenue.

KEY DIFFERENCES OF PREDICTIVE ANALYTICS VS DATA MINING
Below is the difference between predictive analytics and data mining

HOW DATA MINING WORKS
Imagine that you have gathered three friends and decided which pizza to buy - vegetarian, meat,
or fish? You just poll everyone and conclude what exactly needs to be ordered in your favorite
pizzeria. But what if, for example, you have three million friends and several hundred varieties
of pizza from several dozen establishments? It's not so easy to deal with an order, is it?
Nevertheless, that is exactly what data mining specialists do.

According to this principle, when you go to an online store to buy earrings, you will immediately
be offered a bracelet, pendant, and rings to match. And to the swimsuit - a straw hat, sunglasses,
and sandals.
It is precisely this ideally structured array of specific information that makes it possible to identify
a suspicious declaration of income among millions of others of the same kind.
Data mining is conventionally divided into three stages:
• Exploration, in which the data is sorted into essential and non-essential (cleaning, data
transformation, selection of subsets)
• Model building or hidden pattern identification, in which the same datasets are applied to different
models, allowing better choices. This is called competitive evaluation of models
• Deployment - the selected data model is used to predict the results

Data mining is handled by highly qualified mathematicians and engineers as well as AI/ML
experts.

HOW PREDICTIVE ANALYTICS WORKS


According to a report by Zion Market Research, the global predictive analytics market was
valued at approximately $3.49 billion in 2016 and is expected to reach approximately $10.95
billion by 2022, with a CAGR between 2016 and 2022 at about 21%.
Predictive analytics works with behavioral factors, making it possible to predict customer
behavior in the future - how many will come, how many will go, how to change the product, and
what promotions to offer to prevent consumer churn.
You can make predictions based on one person's behavior or a group united by a specific
criterion (gender, age, place of residence, etc.). Predictive analytics uses not only statistics but
also ML, which teaches itself as it goes.
Business analysts interpret forecasts from inferred patterns. If you don't predict how your regular
and hypothetical customers will behave, you will lose the battle with your competitors.



Data Mining and Predictive Analytics in Healthcare
The healthcare system was one of the first to adopt AI technologies, including data mining and
predictive analytics. It includes detecting fraud, managing customer relationships, and measuring
the effectiveness of specific treatments. And, of course, there is such a massive layer of
developments as predictive medicine based on predictive analytics.
Using the example of the latter, we will explain how it works. Let's say you have a cancer patient
like thousands of other patients in your hospital. Based on their treatment, you decide which
regimen to choose for this particular patient, taking into account all of their characteristics. The
more patients you add to the database, the more relevant the solutions the self-learning
application will provide for future patients.
Another example: you can adjust the number of medical personnel in a hospital depending on the
reasons for the visit. If most of the patients who come to you are kids, it's time to expand the
pediatric ward. AI will help the HR department see an impending problem before it becomes
urgent. Also, such a system can predict peak loads in hours/days/months of hospital operation,
which will make it possible to intelligently plan the shifts of doctors and nurses.
Clustering patients into groups will help assign a patient to a risk group for a particular disease
before getting sick. For example, those prone to diabetes or disseminated sclerosis need to stick
to diets so as not to worsen their health. If the patient prepares in advance, the course of the
disease will be far less intense and more effectively treated.
But data analysis tools can be helpful not only for doctors. A special application can remind
the patient that it is time to replenish the supply of prescription drugs and, if necessary,
automatically pay for them at the nearest pharmacy and order home delivery.



PREDICTIVE MODELING ANALYTICS AND MACHINE LEARNING
For many organisations, big data – incredible volumes of raw structured, semi-structured and
unstructured data – is an untapped resource of intelligence that can support business decisions
and enhance operations. As data continues to diversify and change, more and more organisations
are embracing predictive analytics, to tap into that resource and benefit from data at scale.

What is predictive analytics?

A common misconception is that predictive analytics and machine learning are the same thing.
This is not the case. (Where the two do overlap, however, is predictive modelling – but more
on that later.)

At its core, predictive analytics encompasses a variety of statistical techniques (including
machine learning, predictive modelling and data mining) and uses statistics (both historical and
current) to estimate, or ‘predict’, future outcomes. These outcomes might be behaviours a
customer is likely to exhibit or possible changes in the market, for example. Predictive analytics
help us to understand possible future occurrences by analysing the past.

Machine learning, on the other hand, is a subfield of computer science that, as per Arthur
Samuel’s definition from 1959, gives ‘computers the ability to learn without being explicitly
programmed’. Machine learning evolved from the study of pattern recognition and explores
the notion that algorithms can learn from and make predictions on data. And, as they begin to
become more ‘intelligent’, these algorithms can overcome program instructions to make highly
accurate, data-driven decisions.
The most widely used predictive models are:

 Decision trees:
Decision trees are a simple, but powerful form of multiple variable analysis. They are produced
by algorithms that identify various ways of splitting data into branch-like segments. Decision
trees partition data into subsets based on categories of input variables, helping you to understand
someone’s path of decisions.

 Regression (linear and logistic)


Regression is one of the most popular methods in statistics. Regression analysis estimates
relationships among variables, finding key patterns in large and diverse data sets and how they
relate to each other.

 Neural networks
Patterned after the operation of neurons in the human brain, neural networks (also called
artificial neural networks) are a variety of deep learning technologies. They’re typically used to
solve complex pattern recognition problems – and are incredibly useful for analyzing large data
sets. They are great at handling nonlinear relationships in data – and work well when certain
variables are unknown.

Other classifiers:

 Time Series Algorithms: Time series algorithms sequentially plot data and are useful for
forecasting continuous values over time.

 Clustering Algorithms: Clustering algorithms organise data into groups whose members are
similar.

 Outlier Detection Algorithms: Outlier detection algorithms focus on anomaly detection,


identifying items, events or observations that do not conform to an expected pattern or standard
within a data set.

 Ensemble Models: Ensemble models use multiple machine learning algorithms to obtain better
predictive performance than what could be obtained from one algorithm alone.



 Factor Analysis: Factor analysis is a method used to describe variability and aims to find
independent latent variables.

 Naïve Bayes: The Naïve Bayes classifier allows us to predict a class/category based on a given
set of features, using probability.

 Support vector machines: Support vector machines are supervised machine learning techniques
that use associated learning algorithms to analyze data and recognize patterns.
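As a hedged sketch of the Naïve Bayes approach described above, predicting a category from features via probabilities (with add-one smoothing to avoid zero likelihoods); the restaurant-traffic data is invented:

```python
from collections import Counter

def naive_bayes_predict(train, features):
    # train: list of (feature_tuple, label); features: tuple to classify.
    # Assumes features are conditionally independent given the label (the "naive" part).
    labels = Counter(label for _, label in train)
    scores = {}
    for label, count in labels.items():
        prob = count / len(train)  # prior P(label)
        rows = [f for f, l in train if l == label]
        for i, value in enumerate(features):
            # Likelihood P(feature_i = value | label) with add-one smoothing
            matches = sum(1 for r in rows if r[i] == value)
            prob *= (matches + 1) / (len(rows) + 2)
        scores[label] = prob
    return max(scores, key=scores.get)

# Invented (weather, is_weekend) -> "busy"/"quiet" restaurant data
train = [(("sunny", True), "busy"), (("sunny", False), "busy"),
         (("rainy", True), "quiet"), (("rainy", False), "quiet")]
pred = naive_bayes_predict(train, ("sunny", True))
print(pred)  # busy
```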

Each classifier approaches data in a different way; therefore, to get the results they need,
organizations must choose the right classifiers and models.

Applications of predictive analytics and machine learning

For organizations overflowing with data but struggling to turn it into useful insights, predictive
analytics and machine learning can provide the solution. No matter how much data an
organization has, if it can’t use that data to enhance internal and external processes and meet
objectives, the data becomes a useless resource.

Predictive analytics is most commonly used for security, marketing, operations, risk and fraud
detection. Here are just a few examples of how predictive analytics and machine
learning are utilised in different industries:

1. Banking and Financial Services


In the banking and financial services industry, predictive analytics and machine learning are
used in conjunction to detect and reduce fraud, measure market risk, identify opportunities
and much, much more.
2. Security
With cybersecurity at the top of every business’ agenda in 2017, it should come as no
surprise that predictive analytics and machine learning play a key part in security. Security
institutions typically use predictive analytics to improve services and performance, but also
to detect anomalies, fraud, understand consumer behaviour and enhance data security.
3. Retail
Retailers are using predictive analytics and machine learning to better understand consumer
behaviour; who buys what and where? These questions can be readily answered with the
right predictive models and data sets, helping retailers to plan ahead and stock items based on
seasonality and consumer trends – improving ROI significantly.

Want to find out more about getting Predictive Analytics to work?



Developing the right environment

While machine learning and predictive analytics can be a boon for any organisation,
implementing these solutions haphazardly, without considering how they will fit into everyday
operations, will drastically hinder their ability to deliver the insights the organisation needs.

To get the most out of predictive analytics and machine learning, organisations need to ensure
they have the architecture in place to support these solutions, as well as high-quality data to feed
them and help them to learn. Data preparation and quality are key enablers of predictive
analytics. Input data, which may span multiple platforms and contain multiple big data sources,
must be centralised, unified and in a coherent format.

In order to achieve this, organisations must develop a sound data governance program to police
the overall management of data and ensure only high-quality data is captured and recorded.
Secondly, existing processes will need to be altered to include predictive analytics and machine
learning as this will enable organisations to drive efficiency at every point in the business. Lastly,
organisations need to know what problems they are looking to solve, as this will help them to
determine the best and most applicable model to use.

Understanding predictive models

Typically, an organisation’s data scientists and IT experts are tasked with choosing the right
predictive models – or building their own to meet the organisation’s needs.
Today, however, predictive analytics and machine learning are no longer just the domain
of mathematicians, statisticians and data scientists, but also that of business analysts and
consultants. More and more of a business’ employees are using them to develop insights and
improve business operations – but problems arise when employees do not know which model to
use, how to deploy it, or need information right away.

At SAS, we develop sophisticated software to support organisations with their data governance
and analytics. Our data governance solutions help organisations to maintain high-quality data, as
well as align operations across the business and pinpoint data problems within the
same environment. Our predictive analytics solutions help organisations to turn their data into
timely insights for better, faster decision making. These predictive analytics solutions are



designed to meet the needs of all types of users and enable them to deploy predictive models
rapidly.
MACHINE LEARNING

What is Machine learning?

Machine learning methods enable computers to operate autonomously without explicit
programming. ML applications are fed with new data, and they can independently learn, grow,
develop, and adapt.

Machine learning derives insightful information from large volumes of data by leveraging
algorithms to identify patterns and learn in an iterative process. ML algorithms use computation
methods to learn directly from data instead of relying on any predetermined equation that may
serve as a model.

The performance of ML algorithms adaptively improves with an increase in the number of
available samples during the ‘learning’ process. For example, deep learning is a sub-domain of
machine learning that trains computers to imitate natural human traits like learning from
examples. It offers better performance parameters than conventional ML algorithms.

While machine learning is not a new concept – dating back to World War II, when computing
machines were used to break the Enigma cipher – the ability to apply complex mathematical
calculations automatically to growing volumes and varieties of available data is a relatively
recent development.

Today, with the rise of big data, IoT, and ubiquitous computing, machine learning has become
essential for solving problems across numerous areas, such as

• Computational finance (credit scoring, algorithmic trading)
• Computer vision (facial recognition, motion tracking, object detection)
• Computational biology (DNA sequencing, brain tumor detection, drug discovery)
• Automotive, aerospace, and manufacturing (predictive maintenance)
• Natural language processing (voice recognition)
How does machine learning work?

Machine learning algorithms are trained on a dataset to create a model. As new input
data is introduced to the trained ML algorithm, it uses the developed model to make a prediction.
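The fit-then-predict workflow described above can be sketched in a few lines of Python. The `NearestMeanClassifier` below is a hypothetical, deliberately simple model (it memorises one mean “prototype” per class), chosen only to make the two phases – training on labelled samples and predicting on new input – explicit.

```python
from statistics import mean

class NearestMeanClassifier:
    """Toy model: learn one mean value per class, predict the nearest class."""

    def fit(self, samples, labels):
        # Training phase: compute a prototype (mean feature value) per class.
        self.prototypes = {}
        for label in set(labels):
            values = [x for x, y in zip(samples, labels) if y == label]
            self.prototypes[label] = mean(values)
        return self

    def predict(self, x):
        # Prediction phase: choose the class whose prototype is closest to x.
        return min(self.prototypes, key=lambda c: abs(self.prototypes[c] - x))

# Train on a small labelled dataset, then predict on new input data.
model = NearestMeanClassifier().fit([1.0, 1.2, 4.8, 5.1],
                                    ["small", "small", "large", "large"])
print(model.predict(1.1))  # → small
```

Any real ML library follows this same contract: a training step that produces a model, and a prediction step that applies it to unseen data.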



Types of Machine Learning

Machine learning algorithms can be trained in many ways, with each method having its pros and
cons. Based on these methods and ways of learning, machine learning is broadly categorized into
four main types:

1. Supervised machine learning

This type of ML involves supervision, where machines are trained on labeled datasets and
enabled to predict outputs based on the provided training. The labeled dataset specifies that some
input and output parameters are already mapped. Hence, the machine is trained with the input
and corresponding output. In subsequent phases, the machine uses a test dataset to predict
outcomes.

For example, consider an input dataset of parrot and crow images. Initially, the machine is
trained to understand the pictures, including the parrot and crow’s color, eyes, shape, and size.
Post-training, an input picture of a parrot is provided, and the machine is expected to identify the
object and predict the output. The trained machine checks for the various features of the object,
such as color, eyes, shape, etc., in the input picture, to make a final prediction. This is the process
of object identification in supervised machine learning.

The primary objective of the supervised learning technique is to map the input variable (a) with
the output variable (b). Supervised machine learning is further classified into two broad
categories:

• Classification: These refer to algorithms that address classification problems
where the output variable is categorical; for example, yes or no, true or false, male
or female, etc. Real-world applications of this category are evident in spam
detection and email filtering.

Some known classification algorithms include the Random Forest Algorithm, Decision Tree
Algorithm, Logistic Regression Algorithm, and Support Vector Machine Algorithm.
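A minimal sketch of the parrot/crow-style classification task described above, using a one-nearest-neighbour rule in plain Python. The two feature values per bird (hypothetical “colourfulness” and “size” scores) and the labels are invented for illustration; real classifiers such as Random Forests or SVMs would be trained on far richer data.

```python
import math

# Toy labelled dataset: (features, label) pairs with invented feature scores.
training = [
    ((9.0, 3.0), "parrot"),
    ((8.5, 2.5), "parrot"),
    ((2.0, 4.0), "crow"),
    ((1.5, 4.5), "crow"),
]

def classify(features):
    # 1-nearest-neighbour: predict the label of the closest training example.
    _, label = min(training, key=lambda item: math.dist(item[0], features))
    return label

print(classify((8.8, 2.8)))  # → parrot
```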

• Regression: Regression algorithms handle regression problems, where the output
variable is continuous and is modeled as a function of the input variables (often a
linear relationship). Examples include weather prediction, market trend analysis, etc.

Popular regression algorithms include the Simple Linear Regression Algorithm, Multivariate
Regression Algorithm, Decision Tree Algorithm, and Lasso Regression.
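The simplest of these, simple linear regression, can be written out directly from the ordinary least-squares formulas. The advertising-spend numbers below are made up for illustration:

```python
from statistics import mean

def fit_line(xs, ys):
    # Ordinary least squares for y = a + b*x.
    xbar, ybar = mean(xs), mean(ys)
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    a = ybar - b * xbar
    return a, b

# Hypothetical data: advertising spend vs. sales, lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(a, b)          # → 1.0 2.0
print(a + b * 5)     # predicted continuous output for x = 5 → 11.0
```

The fitted intercept and slope are then used to predict a continuous output for any new input, which is exactly what regression algorithms provide.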

2. Unsupervised machine learning

Unsupervised learning refers to a learning technique that’s devoid of supervision. Here, the
machine is trained using an unlabeled dataset and is enabled to predict the output without any
supervision. An unsupervised learning algorithm aims to group the unsorted dataset based on the
input’s similarities, differences, and patterns.

For example, consider an input dataset of images of a fruit-filled container. Here, the images are
not known to the machine learning model. When we input the dataset into the ML model, the



task of the model is to identify the pattern of objects, such as color, shape, or differences seen
in the input images and categorize them. Upon categorization, the machine then predicts the
output as it gets tested with a test dataset.

Unsupervised machine learning is further classified into two types:

• Clustering: The clustering technique refers to grouping objects into clusters based
on parameters such as similarities or differences between objects. For example,
grouping customers by the products they purchase.

Some known clustering algorithms include the K-Means Clustering Algorithm, Mean-Shift
Algorithm, DBSCAN Algorithm, Principal Component Analysis, and Independent Component
Analysis.
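As a rough sketch of how K-Means clustering groups objects, the plain-Python function below alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its cluster. The one-dimensional customer-spend values and starting centroids are invented for illustration:

```python
from statistics import mean

def k_means(points, centroids, iterations=10):
    # Lloyd's algorithm, 1-D version: assign points to nearest centroid,
    # then move each centroid to the mean of its assigned points.
    for _ in range(iterations):
        clusters = {c: [] for c in centroids}
        for p in points:
            nearest = min(centroids, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        centroids = [mean(members) if members else c
                     for c, members in clusters.items()]
    return sorted(centroids)

# Hypothetical customer-spend values forming two obvious groups.
spend = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print(k_means(spend, centroids=[0.0, 5.0]))  # → [1.0, 9.0]
```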

• Association: Association learning refers to identifying typical relations between
the variables of a large dataset. It determines the dependency of various data items
and maps associated variables. Typical applications include web usage mining and
market data analysis.

Popular algorithms obeying association rules include the Apriori Algorithm, Eclat Algorithm,
and FP-Growth Algorithm.
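The first step of an Apriori-style analysis – counting how often item pairs co-occur and keeping those above a minimum support threshold – can be sketched as follows; the market-basket transactions are hypothetical:

```python
from itertools import combinations
from collections import Counter

# Hypothetical market-basket data: each transaction is a set of items bought.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]

def frequent_pairs(transactions, min_support=2):
    # Count co-occurring item pairs and keep those meeting minimum support –
    # the candidate-pruning idea behind the Apriori Algorithm.
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

print(frequent_pairs(transactions))
```

Full association-rule miners go further, deriving rules such as “bread → butter” with confidence scores from these frequency counts.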

3. Semi-supervised learning

Semi-supervised learning comprises characteristics of both supervised and unsupervised machine
learning. It uses the combination of labeled and unlabeled datasets to train its algorithms. Using
both types of datasets, semi-supervised learning overcomes the drawbacks of the options
mentioned above.

Consider the example of a college student. A student learning a concept under a teacher’s
supervision in college is engaged in supervised learning. In unsupervised learning, a student
self-learns the same concept at home without a teacher’s guidance. Meanwhile, a student revising
the concept after learning under the direction of a teacher in college is engaged in a
semi-supervised form of learning.
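One common semi-supervised approach, self-training, mirrors the student analogy above: a model fitted on the labelled data assigns pseudo-labels to the unlabelled data, and the combined set then trains the final model. The points, labels, and nearest-neighbour rule below are illustrative assumptions:

```python
import math

# A few labelled points (feature tuple, label) and some unlabelled points.
labeled = [((1.0,), "a"), ((2.0,), "a"), ((8.0,), "b"), ((9.0,), "b")]
unlabeled = [(1.5,), (8.5,)]

def nearest_label(point, data):
    # 1-nearest-neighbour prediction over whatever (point, label) pairs we have.
    _, label = min(data, key=lambda item: math.dist(item[0], point))
    return label

# Step 1: pseudo-label the unlabelled data using the labelled-only model.
pseudo = [(p, nearest_label(p, labeled)) for p in unlabeled]
# Step 2: the combined labelled + pseudo-labelled set trains the final model.
combined = labeled + pseudo
print(nearest_label((1.4,), combined))  # → a
```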

4. Reinforcement learning

Reinforcement learning is a feedback-based process. Here, the AI component automatically takes
stock of its surroundings by trial and error, takes action, learns from experience, and
improves its performance. The component is rewarded for each good action and penalized for every
wrong move. Thus, the reinforcement learning component aims to maximize the rewards by
performing good actions.

Unlike supervised learning, reinforcement learning lacks labeled data, and the agents learn via
experiences only. Consider video games. Here, the game specifies the environment, and each
move of the reinforcement agent defines its state. The agent is entitled to receive feedback via



punishment and rewards, thereby affecting the overall game score. The ultimate goal of the agent
is to achieve a high score.
Reinforcement learning is applied across different fields such as game theory, information
theory, and multi-agent systems. Reinforcement learning is further divided into two types of
methods or algorithms:

• Positive reinforcement learning: This refers to adding a reinforcing stimulus after
a specific behavior of the agent, which makes it more likely that the behavior will
occur again in the future, e.g., adding a reward after a behavior.
• Negative reinforcement learning: This refers to strengthening a specific behavior
by removing or avoiding a negative outcome.
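A reward-and-penalty loop of this kind can be sketched with tabular Q-learning on a toy “corridor” environment (states 0–3, where reaching state 3 earns a reward and every other step incurs a small penalty). The environment, rewards, and hyperparameters are invented for illustration:

```python
import random

random.seed(0)                      # fixed seed for reproducibility
ACTIONS = (-1, 1)                   # move left or right along the corridor
Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for episode in range(200):
    s = 0
    while s != 3:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), 3)
        reward = 1.0 if s2 == 3 else -0.01   # reward at the goal, penalty elsewhere
        # Q-update: nudge the estimate toward reward + discounted future value.
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s2, act)] for act in ACTIONS)
                              - Q[(s, a)])
        s = s2

# After training, the greedy policy in every non-terminal state is "move right".
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(3)]
print(policy)  # → [1, 1, 1]
```

The agent starts with no knowledge, wanders by trial and error, and the accumulated rewards and penalties gradually shape a policy that maximizes its score – the essence of the feedback loop described above.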

Key Differences between Machine Learning and Predictive Modelling


Below is a list of points describing the key differences between Machine Learning and
Predictive Modelling.

Machine learning is an AI technique in which algorithms are given data and asked to process
it without a predetermined set of rules and regulations, whereas predictive analysis is the
analysis of historical data, as well as existing external data, to find patterns and behaviors.

1. Machine learning algorithms are trained to learn from their past mistakes to improve
future performance, whereas predictive analysis makes informed predictions about future
events based only on historical data.

2. Machine learning is a new-generation technology that works on better algorithms and
massive amounts of data, whereas predictive analysis is a practice, not a particular
technology, and existed long before machine learning came into being; Alan Turing had
already made use of this technique to decode messages during World War II.

3. Related practices and learning techniques for machine learning include supervised and
unsupervised learning, while for predictive analysis they include descriptive analysis,
diagnostic analysis, predictive analysis, prescriptive analysis, etc.

4. Once a machine learning model is trained and tested on a relatively small dataset, the
same method can be applied to unseen data. The data must not be biased, as bias would
result in bad decision making. In the case of predictive analysis, data is useful only when
it is complete, accurate, and substantial, so data quality needs to be taken care of when
the data is first ingested. Organizations use predictive analysis to produce forecasts,
anticipate consumer behaviors, and make rational decisions based on their findings; a
successful case will surely boost the business and the firm’s revenues.

