Topic 10-Data Mining

Uploaded by

bhattibaba118
Copyright
© All Rights Reserved

Topic 10: Data Mining

ICT601 Business Analytics


Dr Saeed Shariati
Resources for this topic

Pal, C.J., Hall, M.A., Frank, E., Witten, I.H., 2016, Data Mining, 4th
Edition, Morgan Kaufmann. Chapter 1, available from Topic 08
Readings
Learning outcomes

At the completion of this topic, you should be able to:


• Define and give examples of data mining as an enabling technology for business
analytics
• Understand and give examples of the objectives and benefits of data mining
• Give examples of a wide range of data mining applications
Lecture outline

• Data Mining Defined


• Data Mining Applications
• Data Mining Issues
Topic 10: Part 01
Data Mining Defined
What is data mining?

“…a process that uses statistical, mathematical and artificial intelligence
techniques to extract and identify useful information ... from large sets of
data.” Sharda et al. (2014, p. 222).

• Also sometimes known as:
• Knowledge extraction (KDD)
• Pattern analysis
• Data Archaeology
• Information harvesting
• Pattern searching
• Data dredging
• …and many others
…another definition

“the nontrivial process of identifying valid, novel, potentially useful, and
ultimately understandable patterns in data stored in structured databases”
Sharda et al. (2014, p. 223)

[Figure: DATA MINING at the intersection of Pattern Recognition, Machine
Learning, Mathematical Modeling, Databases, and Management Science &
Information Systems]
Why bother?

“We are overwhelmed with data. The amount of data in the world, in our
lives, seems ever-increasing—and there’s no end in sight.” Pal et al (2014)

• Global competition
• Untapped value of organisational data
• Increasing consolidation of data
• Vast improvements in processing and storage capabilities and reduction in
cost
Why bother?

“Data mining is about solving problems by analyzing data already present in
databases.” Pal et al (2014)

• Every year, dairy farmers in New Zealand have to make a tough business
decision: which cows to retain in their herd and which to sell off to an
abattoir. Typically, one-fifth of the cows in a dairy herd are culled each
year near the end of the milking season as feed reserves dwindle. Each cow’s
breeding and milk production history influences this decision. Other factors
include age (a cow is nearing the end of its productive life at 8 years),
health problems, history of difficult calving, undesirable temperament traits
(kicking or jumping fences), and not being in calf for the following season.
About 700 attributes for each of several million cows have been recorded over
the years. Machine learning has been investigated as a way of ascertaining
what factors are taken into account by successful farmers—not to automate the
decision but to propagate their skills and experience to others. (Pal et al,
2014)
Topic 10: Part 02
Data Mining Applications
Data Mining Applications

• Data mining has been, and continues to be, used in a wide variety of
contexts. Some examples are:
• Customer relationship management
• Banking and other financial services
• Retailing/logistics
• Insurance
• Brokerage and securities trading
• Manufacturing and Maintenance
CRM

• Customer relationship management


• http://searchcrm.techtarget.com/definition/CRM
• Maximize return on marketing campaigns
• Improve customer retention (churn analysis)
• Maximize customer value (cross-, up-selling)
• Identify and treat most valued customers
Banking and other financial services

• Automate the loan application process


• Prediction of most likely defaulters
• Detecting fraudulent transactions
• https://www.youtube.com/watch?v=1zDwIfSDQiE
• Maximize customer value (cross-, up-selling)
• Optimizing cash reserves with forecasting
• Enter machine learning. The input was 1000 training examples of borderline cases for which a loan had been
made that specified whether the borrower had finally paid off or defaulted. For each training example, about
20 attributes were extracted from the questionnaire, such as age, years with current employer, years at
current address, years with the bank, and other credit cards possessed. A machine learning procedure was
used to produce a small set of classification rules that made correct predictions on two-thirds of the
borderline cases in an independently chosen test set. Not only did these rules improve the success rate of the
loan decisions, but the company also found them attractive because they could be used to explain to
applicants the reasons behind the decision. Although the project was an exploratory one that took only a
small development effort, the loan company was apparently so pleased with the result that the rules were
put into use immediately. (Pal et al, 2014)
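
The passage above does not say which rule-learning procedure the loan company used. As a minimal sketch only, the following applies the simple 1R ("one rule") idea: for each attribute, predict the majority outcome for each of its values, then keep the attribute whose rules make the fewest training errors. The records, attribute names and value ranges are all invented for illustration and are not from the source.

```python
from collections import Counter, defaultdict

# Hypothetical borderline loan cases (invented, not from the source):
# (years_with_employer, has_other_cards, outcome)
TRAIN = [
    ("<2", "yes", "defaulted"),
    ("<2", "no", "defaulted"),
    ("<2", "yes", "defaulted"),
    ("2-5", "yes", "repaid"),
    ("2-5", "no", "defaulted"),
    ("2-5", "yes", "repaid"),
    (">5", "no", "repaid"),
    (">5", "yes", "repaid"),
]
ATTRIBUTES = ["years_with_employer", "has_other_cards"]


def one_r(rows, attributes):
    """Return (attribute, rules, errors) for the single attribute whose
    value -> majority-outcome rules make the fewest training errors."""
    best = None
    for idx, name in enumerate(attributes):
        # Count outcomes separately for each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[idx]][row[-1]] += 1
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Every case that is not in its value's majority class is an error.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (name, rules, errors)
    return best


attribute, rules, errors = one_r(TRAIN, ATTRIBUTES)
print(f"Best single attribute: {attribute} ({errors} training errors)")
for value, outcome in rules.items():
    print(f"  if {attribute} = {value} then {outcome}")
```

Rules of this form are easy to read back to an applicant, which is the property the loan company valued. A real project would evaluate them on an independent test set, as the passage describes, rather than on the training data.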
Retailing and logistics

• Optimize inventory levels at different locations


• Improve the store layout and sales promotions
• Optimize logistics by predicting seasonal effects
• Minimize losses due to limited shelf life

https://www.linkedin.com/pulse/20140403185417-4785379-diapers-and-beer
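
The "diapers and beer" story is the usual illustration of market-basket analysis, which can inform store layout and promotions. As a rough sketch, the measures commonly used to judge a rule such as "diapers -> beer" (support, confidence and lift) can be computed as follows. The baskets are invented for illustration, and the function names are this example's own.

```python
# Hypothetical shopping baskets (invented for illustration only).
BASKETS = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence divided by the consequent's overall frequency;
    values above 1 suggest the items occur together more than by chance."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


a, c = {"diapers"}, {"beer"}
print("support   :", support(a | c, BASKETS))
print("confidence:", round(confidence(a, c, BASKETS), 3))
print("lift      :", round(lift(a, c, BASKETS), 3))
```

With only five baskets these numbers mean little. The point is the definitions, which stay the same on real transaction data.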
Manufacturing and maintenance

• Predict/prevent machinery failures


• http://www.manufacturing.net/article/2014/12/using-big-data-iot-predict-machine-failure

• Identify anomalies in production systems to optimize the use of
manufacturing capacity
• Discover novel patterns to improve product quality
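
One simple way to flag anomalies in production data, shown here only as a sketch, is to mark readings that lie far from the mean in units of standard deviation (a z-score). The sensor readings and the threshold of 2.5 are assumptions made for this example. Real systems typically use more robust methods.

```python
import statistics

# Hypothetical temperature readings from one machine (invented values).
READINGS = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 70.0, 85.0, 70.1, 69.9]


def find_anomalies(readings, threshold=2.5):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold."""
    mean = statistics.mean(readings)
    spread = statistics.pstdev(readings)
    if spread == 0:
        return []  # all readings identical, so nothing stands out
    return [
        (i, value)
        for i, value in enumerate(readings)
        if abs(value - mean) / spread > threshold
    ]


anomalies = find_anomalies(READINGS)
print(anomalies)
```

Note that a single extreme reading inflates both the mean and the standard deviation, so with small samples the z-score of even a gross outlier stays modest. That is why the threshold here is lower than the often-quoted value of 3.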
Insurance

• Forecast claim costs for better business planning


• Determine optimal rate plans
• Optimize marketing to specific customers
• Identify and prevent fraudulent claim activities
Web Mining

• Page ranking
• Social media
• “And then there are social networks and other personal data. We live in the
age of self-revelation: people share their innermost thoughts in blogs and
tweets, their photographs, their music and movie tastes, their opinions of
books, software, gadgets, and hotels, their social life. They may believe they
are doing this anonymously, or pseudonymously, but often they are
incorrect... There is huge commercial interest in making money by mining the
Web.” Pal et al
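
Page ranking can be illustrated with a simplified version of the PageRank idea: a page is important if important pages link to it. The sketch below uses plain power iteration on a tiny, made-up link graph. The page names, the damping factor of 0.85 and the iteration count are assumptions for this example, not details from the slides.

```python
# Hypothetical site structure: page -> pages it links to (invented).
LINKS = {
    "home": ["products", "blog"],
    "products": ["home"],
    "blog": ["home", "products"],
    "contact": ["home"],
}


def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by repeated redistribution of rank."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


ranks = pagerank(LINKS)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:10s} {score:.3f}")
```

In this graph "home" ranks highest because every other page links to it, while "contact" ranks lowest because nothing links to it.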
Topic 10: Part 03
Data Mining Issues
Privacy issues

• Any time that transactional data is stored, it may contain identifying
information
• Name, address etc
• Purchasing habits
• Loan details etc
• The ownership of that data is questionable
• Data mining uses these data
Myths…

• Data mining …
• provides instant solutions/predictions
• is not yet viable for business applications
• requires a separate, dedicated database
• can only be done by those with advanced degrees
• is only for large firms that have lots of customer data
• is another name for good old statistics
Blunders

• Selecting the wrong problem for data mining


• Ignoring what your sponsor thinks data mining is and what it really
can/cannot do
• Not leaving sufficient time for data acquisition, selection and
preparation
• Looking only at aggregated results and not at individual
records/predictions
• Being sloppy about keeping track of the data mining procedure and
results
Ethics

• Re-identification
• >85% of Americans can be identified from publicly available records using
three pieces of information: ZIP code, birthdate and gender
• There are many examples of companies releasing “de-identified” data in good
faith, but finding they were re-identifiable
• Using Personal Information
• When information is collected from individuals and used, they should be told
what is being done to protect their data
• Wider issues
• E.g., Cambridge Analytica
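
The re-identification risk above can be checked on a dataset before release. As a sketch, the following counts what fraction of records are unique on a chosen set of quasi-identifiers (here ZIP code, birthdate and gender). The records and field names are invented for illustration.

```python
from collections import Counter

# Hypothetical "de-identified" records: names removed, other fields kept.
RECORDS = [
    {"zip": "60001", "birthdate": "1945-07-31", "gender": "F", "diagnosis": "A"},
    {"zip": "60001", "birthdate": "1945-07-31", "gender": "F", "diagnosis": "B"},
    {"zip": "60001", "birthdate": "1962-03-14", "gender": "M", "diagnosis": "C"},
    {"zip": "60002", "birthdate": "1962-03-14", "gender": "M", "diagnosis": "A"},
    {"zip": "60002", "birthdate": "1980-11-02", "gender": "F", "diagnosis": "B"},
]


def unique_fraction(records, keys):
    """Fraction of records whose combination of key values occurs only once."""
    combos = Counter(tuple(record[k] for k in keys) for record in records)
    unique = sum(
        1 for record in records
        if combos[tuple(record[k] for k in keys)] == 1
    )
    return unique / len(records)


print("ZIP only:               ", unique_fraction(RECORDS, ("zip",)))
print("ZIP + birthdate + gender:",
      unique_fraction(RECORDS, ("zip", "birthdate", "gender")))
```

No record here is unique on ZIP code alone, but most become unique once birthdate and gender are added. Anyone holding a public list with those three fields plus names could then link a unique record back to a person.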
Topic 10: Part 04
Topic Summary
Learning outcomes

At the completion of this topic, you should be able to:


• Define and give examples of data mining as an enabling technology for business
analytics
• Understand and give examples of the objectives and benefits of data mining
• Give examples of a wide range of data mining applications
