BI Module 4
Data mining is not a completely new field, but rather a combination of different subjects like statistics, artificial
intelligence, machine learning, databases, and management science. It focuses on finding useful patterns and
information from large amounts of data.
Large Data Storage: Data is often stored in huge databases that may contain information from many years. Sometimes, this
data is cleaned and organized into a data warehouse.
Modern Systems: Data mining usually runs on a client/server architecture or a web-based platform.
Advanced Tools: Special tools, including visualization software, help find hidden patterns in data stored in company
records or public files. Some tools can even analyze unstructured text from emails, websites, and business networks.
User-Friendly: Many data mining tools allow users (even those without programming skills) to explore data and ask
questions to get quick answers.
Unexpected Discoveries: Sometimes, valuable insights come from unexpected results, so users need to think creatively
while analyzing the findings.
Easy Integration: Data mining tools can be easily combined with spreadsheets and other software, making it simple to
analyze and use the results.
Fast Processing: Since data mining deals with a huge amount of data, parallel processing (using multiple processors at once)
is often needed to speed up the process.
Overall, data mining helps businesses and researchers find useful patterns in data, leading to better decisions
and insights.
Data mining is like solving a big puzzle using data. It helps businesses find useful information from large
amounts of data. The process involves six main steps:
1. Business Understanding
Before starting data mining, we need to understand why we are doing it. Businesses need answers to questions
like:
Why are customers leaving for competitors?
Which customers are the most valuable?
Once the goal is clear, a plan is created. This plan includes assigning tasks, collecting data, analyzing data, and
reporting the findings. A budget is also set.
2. Data Understanding
Now, we find and collect the right data. Different business problems need different types of data. For example,
a clothing store might look at customer age, income, and purchase history to understand shopping habits.
To understand the data better, analysts use tools like graphs, tables, and statistics.
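As a minimal sketch of what "statistics" can mean at this stage, the snippet below summarizes a few customer fields for the
clothing-store example. The records and the summarize helper are made up for illustration; they are not from any real
dataset or tool.

```python
from statistics import mean, median, stdev

# Hypothetical customer records for the clothing-store example (made-up values).
customers = [
    {"age": 23, "income": 32000, "purchases": 4},
    {"age": 35, "income": 58000, "purchases": 9},
    {"age": 41, "income": 61000, "purchases": 7},
    {"age": 29, "income": 45000, "purchases": 12},
    {"age": 52, "income": 83000, "purchases": 3},
]


def summarize(records, field):
    """Return basic descriptive statistics for one numeric field."""
    values = [r[field] for r in records]
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),
    }


for field in ("age", "income", "purchases"):
    print(field, summarize(customers, field))
```

A summary like this is usually the first look at the data; graphs and tables build on the same numbers.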
3. Data Preparation
Real-world data is often messy. It may have missing values, errors, or duplicate records. Cleaning and
organizing the data takes a lot of time, often estimated at about 80% of the total project time.
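The three problems named above (missing values, errors, duplicates) can be sketched in a few lines. This is only an
illustration: the field names, the validity thresholds, and the clean_records helper are assumptions, and real projects
often repair or fill in bad values instead of simply dropping the rows.

```python
def clean_records(records, required=("customer_id", "age", "income")):
    """Drop rows with missing values, implausible values, or repeated IDs."""
    seen_ids = set()
    cleaned = []
    for row in records:
        # Missing values: skip rows where a required field is absent or None.
        if any(row.get(field) is None for field in required):
            continue
        # Errors: skip rows with implausible values (thresholds are assumptions).
        if not (0 < row["age"] < 120) or row["income"] < 0:
            continue
        # Duplicates: keep only the first record seen for each customer_id.
        if row["customer_id"] in seen_ids:
            continue
        seen_ids.add(row["customer_id"])
        cleaned.append(row)
    return cleaned


# Made-up raw data showing each kind of problem.
raw = [
    {"customer_id": 1, "age": 34, "income": 52000},
    {"customer_id": 2, "age": None, "income": 41000},  # missing value
    {"customer_id": 3, "age": 230, "income": 38000},   # error: impossible age
    {"customer_id": 1, "age": 34, "income": 52000},    # duplicate record
    {"customer_id": 4, "age": 27, "income": 46000},
]

cleaned = clean_records(raw)
print(f"kept {len(cleaned)} of {len(raw)} records")
```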
4. Model Building
Now, we apply different modeling techniques to find useful patterns in the data. Different types of models are
tested to find the best one for the business problem.
5. Testing and Evaluation
Once the models are built, we check how well they work. The goal is to see if they provide useful and accurate
results.
If the models do not perform well, they may need to be improved or replaced. Businesses also look for any
unexpected insights that can help them make better decisions.
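One simple way to check how well models work is to compare their predictions with known outcomes on data that was held
back from model building. The sketch below does this for the customer-churn question. The holdout data and both
rule-based "models" are hypothetical stand-ins; accuracy is only one of several possible measures.

```python
# Hypothetical holdout set: (months_as_customer, complaints, churned)
holdout = [
    (3, 2, True),
    (24, 0, False),
    (6, 1, True),
    (36, 1, False),
    (12, 3, True),
    (48, 0, False),
    (5, 0, False),
    (18, 2, True),
]


def model_a(months, complaints):
    """Predict churn for customers with less than one year of tenure."""
    return months < 12


def model_b(months, complaints):
    """Predict churn for customers with two or more complaints."""
    return complaints >= 2


def accuracy(model, data):
    """Fraction of holdout records where the prediction matches the outcome."""
    correct = sum(model(m, c) == churned for m, c, churned in data)
    return correct / len(data)


candidates = {"model_a": model_a, "model_b": model_b}
scores = {name: accuracy(fn, holdout) for name, fn in candidates.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("best model:", best)
```

If even the best score is too low for the business problem, the models go back for improvement or replacement.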
Quantitative Model
When making decisions, different types of variables influence the outcome. Let's break them down in simple
terms:
1. Result (Outcome) Variables
These variables show how successful a system is. They are the final results of a process.
Example:
A company wants to improve customer satisfaction. The result variable is the customer satisfaction score.
In a business, profit is a result variable—it shows how well the company is doing.
These are dependent variables because they depend on other factors like decisions made and external
conditions.
2. Decision Variables
These are the choices we can control. Decision-makers use these variables to take action.
Example:
A business decides how much money to invest in stocks or bonds. The investment amount is a decision variable.
A company schedules work shifts for employees. People, work hours, and schedules are decision variables.
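3. Uncontrollable Variables
These are factors that affect the results but cannot be controlled by decision-makers.
Example:
Interest rates, tax regulations, and competitors' prices are typical factors that a company cannot control.
These variables set limits and constraints on what decisions can be made.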
4. Intermediate Result Variables
These are middle steps that affect the final result. They help understand how one factor leads to another.
Example:
A factory wants to maximize profit. Spoilage (wasted materials) is an intermediate result—it affects total profit.
A company pays high employee salaries → Employees feel satisfied → Productivity increases. Here, employee satisfaction
is an intermediate result before the final result (higher productivity).
Final Thoughts
Result variables show success.
Decision variables are choices we can control.
Uncontrollable variables are external factors we can't change.
Intermediate result variables are steps that lead to the final outcome.
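The four kinds of variables can be seen together in a small quantitative model of the factory example. This is a sketch
under stated assumptions: the formula, the numbers, and the choice to treat price, cost, and spoilage rate as
uncontrollable are all illustrative.

```python
def factory_profit(units_produced, unit_price, unit_cost, spoilage_rate):
    """Toy quantitative model of a factory.

    units_produced -- decision variable (chosen by the manager)
    unit_price, unit_cost, spoilage_rate -- treated as uncontrollable here
    """
    # Intermediate result variables: steps between the decision and the outcome.
    spoiled_units = units_produced * spoilage_rate
    sellable_units = units_produced - spoiled_units

    revenue = sellable_units * unit_price
    total_cost = units_produced * unit_cost

    # Result (outcome) variable: depends on everything above.
    profit = revenue - total_cost
    return {
        "spoiled_units": spoiled_units,
        "sellable_units": sellable_units,
        "profit": profit,
    }


# Made-up inputs: produce 1000 units, sell at 12, cost 7 each, 5% spoilage.
result = factory_profit(1000, 12.0, 7.0, 0.05)
print(result)
```

Changing units_produced changes the decision; changing spoilage_rate shows how an outside factor moves the result.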
Data mining is used in many industries to solve problems and find new opportunities. Here are some examples
of how different sectors use data mining:
Banking
1. Helps approve or reject loan applications by predicting who might not repay.
2. Detects fraud in credit card and online banking transactions.
3. Identifies products that customers are most likely to buy.
4. Predicts how much cash is needed at ATMs and bank branches.
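To give a feel for the ATM cash prediction mentioned above, here is a deliberately simple sketch that forecasts the next
day's demand as the average of the most recent days. The moving-average method, the forecast_cash helper, and the
withdrawal figures are assumptions for illustration; real banks are likely to use richer models that account for
weekdays, paydays, and holidays.

```python
def forecast_cash(daily_withdrawals, window=3):
    """Forecast next-day cash demand as the mean of the last `window` days."""
    if len(daily_withdrawals) < window:
        raise ValueError("not enough history for the chosen window")
    recent = daily_withdrawals[-window:]
    return sum(recent) / window


# Made-up daily withdrawal totals for one ATM.
history = [42000, 39000, 45000, 47000, 44000, 50000]

forecast = forecast_cash(history)
print(f"forecast for tomorrow: {forecast:.0f}")
```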
Insurance
Entertainment Industry
Sports
Data mining helps businesses and organizations make better decisions by analyzing large amounts of data to
find useful patterns.
1. Decision-Making Under Certainty
Here, the decision-maker knows exactly what will happen for each choice. This means there is complete
information about the outcomes.
This type of decision-making is common in structured problems and short-term decisions (up to 1 year).
2. Decision-Making Under Uncertainty
Here, there are multiple possible outcomes, but the decision-maker does not know their probability.
Example:
A company launching a new product doesn’t know how customers will respond.
Managers try to reduce uncertainty by gathering more information. If that's not possible, they have to make a
decision with limited knowledge.
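When probabilities are unknown, a decision can still be made from the payoffs alone. The sketch below applies two
textbook rules that these notes do not name: maximin (pick the option with the best worst case, a cautious rule) and
maximax (pick the option with the best best case, an optimistic rule). The payoff table for the product launch is made
up.

```python
# Hypothetical payoffs (in thousands) for each alternative and market response.
payoffs = {
    "large launch": {"strong demand": 200, "weak demand": -120},
    "small launch": {"strong demand": 90, "weak demand": -20},
    "no launch": {"strong demand": 0, "weak demand": 0},
}


def maximin(table):
    """Cautious rule: choose the alternative whose worst payoff is highest."""
    return max(table, key=lambda alt: min(table[alt].values()))


def maximax(table):
    """Optimistic rule: choose the alternative whose best payoff is highest."""
    return max(table, key=lambda alt: max(table[alt].values()))


print("maximin choice:", maximin(payoffs))
print("maximax choice:", maximax(payoffs))
```

The two rules can disagree, which is why managers prefer to gather more information before deciding.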
3. Decision-Making Under Risk
Here, the decision-maker knows the possible outcomes and their probabilities. This makes it possible to calculate
the expected value of each alternative and weigh the risk before making a decision.
Example:
A company can estimate the probability of profit or loss when launching a new product based on past market trends.
Most big business decisions are made under assumed risk, where risks are analyzed and calculated before
deciding.
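With known probabilities, each alternative can be scored by its expected value: each payoff multiplied by its
probability, then summed. The sketch below shows the calculation for a product launch. The payoffs and probabilities
are made-up numbers standing in for estimates from past market trends.

```python
# Hypothetical probabilities estimated from past market trends.
probabilities = {"strong demand": 0.6, "weak demand": 0.4}

# Hypothetical payoffs (in thousands) for each alternative and outcome.
payoffs = {
    "large launch": {"strong demand": 200, "weak demand": -120},
    "small launch": {"strong demand": 90, "weak demand": -20},
    "no launch": {"strong demand": 0, "weak demand": 0},
}


def expected_value(outcomes, probs):
    """Sum of payoff times probability over all possible outcomes."""
    return sum(probs[state] * payoff for state, payoff in outcomes.items())


expected = {alt: expected_value(out, probabilities) for alt, out in payoffs.items()}
best = max(expected, key=expected.get)

for alt, value in expected.items():
    print(f"{alt}: expected payoff {value:.1f}")
print("best by expected value:", best)
```

Expected value is an average over many similar decisions; a single launch can still end in the loss shown in the table.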