1.1 - Intro DM

This document provides an introduction to data mining, including definitions and examples. It discusses how data mining is used to extract useful patterns and knowledge from large amounts of data. Specifically, it describes how data mining can be used by banks to identify customers who may be interested in loans or credit cards. The document also outlines some common data mining techniques like artificial neural networks, decision trees, and genetic algorithms. It provides examples of how data mining is used in various domains such as finance, science, and e-commerce. Both advantages and disadvantages of data mining are discussed.

Uploaded by

dssd
DATA MINING

UNIT – I:
Introduction to Data Mining: Introduction, What is Data Mining, Definition, KDD, Challenges, Data Mining
Tasks, Data Preprocessing, Data Cleaning, Missing Data, Dimensionality Reduction, Feature Subset Selection,
Discretization and Binarization, Data Transformation; Measures of Similarity and Dissimilarity - Basics.
1.1 Introduction – What is Data Mining, Definition
In general terms, "mining" is the process of extracting some valuable material from the earth, e.g. coal
mining or diamond mining. In the context of computer science, "Data Mining" refers to the extraction of
useful information from large volumes of data or from data warehouses.
In coal or diamond mining, the result of the extraction process is coal or diamond. In Data Mining,
however, the result of the extraction process is not data. Instead, the result is the patterns and
knowledge that we gain at the end of the extraction process.
Thus, Data Mining is also known as Knowledge Discovery or Knowledge Extraction.
Currently, Data Mining and Knowledge Discovery are used interchangeably.
Data Mining refers to the nontrivial extraction of implicit, previously unknown and potentially useful
information from data in databases.
Nowadays, data mining is used almost everywhere that a large amount of data is stored and
processed.
For example, banks typically use data mining to identify prospective customers who could be
interested in credit cards, personal loans or insurance. Since banks hold the transaction details and
detailed profiles of their customers, they analyze this data to find patterns that help them predict
which customers could be interested in a personal loan or a similar product.
Main Purpose of Data Mining
The information gathered from Data Mining helps to uncover hidden patterns and to predict future trends and
behaviors, allowing businesses to make informed decisions.
Technically, data mining is the computational process of analyzing data from different perspectives, dimensions
and angles, and categorizing or summarizing it into meaningful information.
Data Mining can be applied to any type of data e.g. Data Warehouses, Transactional Databases, Relational
Databases, Multimedia Databases, Spatial Databases, Time-series Databases, World Wide Web.
Real life example of Data Mining – Market Basket Analysis
Market Basket Analysis is a technique for the careful study of the purchases made by customers in a
supermarket. It is applied to identify the items that customers buy together. For example, if a
person buys bread, what are the chances that he/she will also purchase butter? This analysis helps
companies decide which offers and deals to promote, and it is carried out with the help of data mining.
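
The bread-and-butter question above can be made concrete with two commonly used measures: support (how often
the items appear together) and confidence (how often butter is bought when bread is bought). The following is
only a minimal sketch. The transactions are made-up illustrative data, and the function names support and
confidence are hypothetical helpers, not part of any particular data mining tool.

```python
# Hypothetical shopping baskets, invented purely for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def count_containing(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset, transactions):
    """Fraction of all transactions that contain the whole itemset."""
    return count_containing(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing antecedent, the fraction that also contain consequent."""
    n_antecedent = count_containing(antecedent, transactions)
    if n_antecedent == 0:
        return 0.0
    n_both = count_containing(set(antecedent) | set(consequent), transactions)
    return n_both / n_antecedent


print(support({"bread", "butter"}, transactions))       # 3 of 5 baskets -> 0.6
print(confidence({"bread"}, {"butter"}, transactions))  # 3 of 4 bread baskets -> 0.75
```

In this toy data, a confidence of 0.75 means that three out of every four customers who bought bread also
bought butter, which is the kind of figure a store might use when deciding which items to promote together.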

BEYOND SYLLABUS (Topics Not Covered in the Syllabus):


Type of Data Gathered Data Mining :

USES OF DATA MINING


The main uses of Data Mining are:
a. Automated Prediction of Trends and Behaviors
b. Automated Discovery of Previously Unknown Patterns

DATA MINING TECHNIQUES
a. Artificial Neural Networks
Non-linear predictive models that learn through training and resemble biological
neural networks in structure.
b. Decision Trees
Tree-shaped structures that represent sets of decisions. These decisions generate rules for the
classification of a dataset.
c. Genetic Algorithms
Optimization techniques that use processes such as genetic combination, mutation, and natural selection.
d. Nearest Neighbor Method
A technique that classifies each record in a dataset based on a combination of the classes of the k record(s)
most similar to it. Sometimes called the k-nearest neighbor technique.
e. Rule Induction
The extraction of useful if-then rules from data based on statistical significance.
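
As an illustration of one technique from the list above, the nearest neighbor method can be sketched in a few
lines. This is only a minimal sketch under stated assumptions: the customer records are invented, the function
name knn_classify is hypothetical, similarity is measured with Euclidean distance, and the "combination of
the classes" is taken to be a simple majority vote. Other distance measures and voting schemes are possible.

```python
import math
from collections import Counter


def knn_classify(query, records, k=3):
    """Classify query by majority vote among its k nearest labeled records."""
    # Sort the labeled records by Euclidean distance to the query point.
    by_distance = sorted(records, key=lambda rec: math.dist(query, rec[0]))
    # Count the class labels of the k closest records and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical historical records, invented for illustration:
# (age, yearly income in thousands) -> whether the customer took a personal loan.
history = [
    ((25, 30), "no"),
    ((30, 45), "no"),
    ((45, 90), "yes"),
    ((50, 110), "yes"),
    ((40, 85), "yes"),
]

print(knn_classify((42, 88), history, k=3))  # the three closest records are all "yes"
```

In practice the features are usually scaled first, because a feature with a large numeric range (income here)
would otherwise dominate the distance.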
APPLICATIONS OF DATA MINING
1. Financial Analysis
2. Biological Analysis
3. Scientific Analysis
4. Intrusion Detection
5. Fraud Detection
6. Research Analysis
7. Weather Forecasting
8. E-commerce
9. Self-driving Cars
10. Hazards of New Medicines
11. Space Research
12. Stock Trade Analysis
13. Business Forecasting
14. Social Networks
15. Customer Likelihood
AREAS WHERE DATA MINING HAD GOOD AND BAD EFFECTS:
a. Good Effects
• Predict future trends, customer purchase habits
• Help with decision making
• Improve company revenue and lower costs
• Market basket analysis
• Fraud detection
b. Bad Effects
• User privacy/security
• Amount of data is overwhelming
• Great cost at the implementation stage
• Possible misuse of information
• Possible inaccuracy of data

DATA MINING ADVANTAGES
• Banks and financial institutions use data mining to find probable defaulters. This is done based on
past transactions, user behavior and data patterns.
• It helps advertisers push the right advertisements on the internet. In this way data mining benefits both
prospective buyers and sellers of various products.
• Retail malls and grocery stores use data mining to arrange and keep the most sellable
items in the most attention-grabbing positions. This has become possible due to inputs obtained from data
mining software. In this way data mining helps to increase revenue.
• Data mining offers several different methods and is cost-effective compared to other applications.
• Data mining is used in many areas, such as bioinformatics, medicine and genetics.
• Law enforcement agencies use data mining to identify criminal suspects.
DATA MINING DISADVANTAGES
• Security: Users spend a lot of time online for various purposes, and the systems that collect their data
do not always have security measures in place to protect them.
• Some data mining analytics software is difficult to operate, so users require
knowledge-based training.
• The techniques of data mining are not 100% accurate. Hence, they may cause serious consequences in certain
conditions.
