Module-1 DM

The document discusses data mining, including its definition, why it is used, the knowledge discovery process, types of data mining, and applications of data mining such as in healthcare, market basket analysis, education, manufacturing, customer relationship management, fraud detection, and lie detection.

Uploaded by

prathammsr192003

DATA MINING

Module-1 Topics: Introduction to Data Mining: Definition, KDD, Challenges, Data Mining
Tasks, Data Mining Goals, Stages of the Data Mining Process, Data Mining Techniques,
Applications, Major Issues in Data Mining.

Introduction to Data Mining:

In general terms, “mining” is the process of extracting some valuable material from the
earth, e.g. coal mining or diamond mining. In computer science, “data mining” is also
referred to as knowledge mining from data, knowledge extraction, data/pattern analysis,
data archaeology, and data dredging. It is the process of extracting useful information
from a large volume of data or from data warehouses. The term itself is a little confusing:
in coal or diamond mining, the result of the extraction process is coal or diamond, but in
data mining the result is not data. Instead, the results are the patterns and knowledge
gained at the end of the extraction process. In that sense, data mining can be thought of
as a step in the process of knowledge discovery (knowledge extraction).

Why Data Mining?

• The Explosive Growth of Data: from terabytes to petabytes
   o Data collection and data availability
      - Automated data collection tools, database systems, Web, computerized society
   o Major sources of abundant data
      - Business: Web, e-commerce, transactions, stocks, …
      - Science: Remote sensing, bioinformatics, scientific simulation, …
      - Society and everyone: news, digital cameras, YouTube

What Is Data Mining?

• Data mining is the process of extracting information from huge data sets to identify
patterns, trends, and useful data that allow a business to make data-driven decisions.

• Data mining is the process of extracting useful information from large sets of data. It
involves using various techniques from statistics, machine learning, and database
systems to identify patterns, relationships, and trends in the data.

• Data mining is often treated as a synonym for Knowledge Discovery in Databases (KDD),
although strictly it is one step of that process. The knowledge discovery process includes
data cleaning, data integration, data selection, data transformation, data mining, pattern
evaluation, and knowledge presentation.

• Data mining is a process used by organizations to extract specific information from huge
databases to solve business problems. It primarily turns raw data into useful information.

Types of Data Mining :

Data mining can be performed on the following types of data:

1. Relational Database:

A relational database is a collection of data sets formally organized into tables, records
(rows), and columns, from which data can be accessed in various ways without having to
reorganize the database tables. Tables convey and share information, which facilitates data
searchability, reporting, and organization.

2. Data warehouses:

A data warehouse is a technology that collects data from various sources within the
organization to provide meaningful business insights. Large volumes of data come from
multiple places, such as Marketing and Finance. The extracted data is used for analytical
purposes and supports decision-making in a business organization.

3. Data Repositories:

A data repository generally refers to a destination for data storage. However, many IT
professionals use the term more specifically to refer to a particular kind of setup within
an IT structure, for example a group of databases in which an organization keeps various
kinds of information.

4. Object-Relational Database:

A combination of an object-oriented database model and a relational database model is called
an object-relational model. It supports classes, objects, inheritance, etc.

5. Transactional Database:

A transactional database refers to a database management system (DBMS) that has the
potential to undo a database transaction if it is not performed appropriately.
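
The "undo" behaviour described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `accounts` table, the `transfer` and `balances` helpers, and the sample balances are assumptions made up for the example, not part of these notes.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; undo both updates if anything fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
    except ValueError:
        return False
    return True

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

# Hypothetical sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

transfer(conn, "alice", "bob", 500)  # fails, so both updates are rolled back
transfer(conn, "alice", "bob", 30)   # succeeds and is committed
```

After the failed transfer the balances are unchanged, which is the property that makes a database transactional in the sense used here.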

Knowledge Discovery (KDD) Process

• This is the view typical of the database systems and data warehousing communities.
• Data mining plays an essential role in the knowledge discovery process.
• Although the two terms are often used interchangeably, data mining is strictly one step
of Knowledge Discovery in Databases (KDD).

The following steps are included in the KDD process:

1. Data Cleaning
Data cleaning is defined as the removal of noisy and irrelevant data from the collection.
   1. Cleaning in the case of missing values.
   2. Cleaning noisy data, where noise is a random error or variance in a measured variable.
   3. Cleaning with data discrepancy detection and data transformation tools.
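
As a rough illustration of the first two cases, the sketch below fills missing values with the mean of the observed values and then smooths noise by replacing each value with the mean of its equal-frequency bin. The function names, the sample `prices` list, and the bin size are illustrative assumptions; real cleaning tools offer many other strategies.

```python
from statistics import mean

def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-frequency bins, and replace
    every value in a bin with that bin's mean (result is in sorted order)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed

# Hypothetical data with two missing readings.
prices = [4, 8, None, 15, 21, 21, 24, 25, None, 28, 34]
cleaned = fill_missing(prices)
smoothed = smooth_by_bin_means(cleaned, bin_size=3)
```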

2. Data Integration
Data integration is defined as combining heterogeneous data from multiple sources into a
common store (a data warehouse). It is carried out using data migration tools, data
synchronization tools, and the ETL (Extract, Transform, Load) process.
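
A toy version of integration might look like the following, where two sources describe the same customers but name the identifier column differently. The `integrate` function, the two source lists, and the field names are hypothetical; a real warehouse load would also handle conflicts, types, and duplicates.

```python
def integrate(sources, key="customer_id"):
    """Combine records from several sources into one store keyed by customer.

    `sources` is a list of (records, key_field) pairs, because each source
    may name its identifier column differently.
    """
    warehouse = {}
    for records, key_field in sources:
        for record in records:
            merged = warehouse.setdefault(record[key_field], {})
            for field, value in record.items():
                # Store the identifier under one common name.
                merged[key if field == key_field else field] = value
    return warehouse

# Hypothetical source systems.
marketing = [{"cust": 1, "segment": "retail"}, {"cust": 2, "segment": "online"}]
finance = [{"customer_no": 1, "revenue": 1200.0}, {"customer_no": 3, "revenue": 450.0}]

warehouse = integrate([(marketing, "cust"), (finance, "customer_no")])
```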

3. Data Selection
Data selection is defined as the process in which the data relevant to the analysis is
identified and retrieved from the data collection. Methods such as neural networks, decision
trees, Naive Bayes, clustering, and regression can be used for this.

4. Data Transformation
Data transformation is defined as the process of transforming data into the appropriate form
required by the mining procedure. Data transformation is a two-step process:
   1. Data mapping: assigning elements from the source base to the destination to capture
      transformations.
   2. Code generation: creation of the actual transformation program.
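
The two steps can be mimicked in a few lines: a mapping specification is written first, and a transformation function is then produced from it. Here "code generation" is approximated by building a closure, and the field names and conversions are invented for the example.

```python
def build_transformer(mapping):
    """Step 2 (code generation, approximated): turn a mapping specification
    into a function that converts one source record to the destination form."""
    def transform(record):
        return {dest: convert(record[src]) for dest, (src, convert) in mapping.items()}
    return transform

# Step 1 (data mapping): destination field -> (source field, conversion).
mapping = {
    "age_years": ("age", int),
    "income_thousands": ("income", lambda v: float(v) / 1000),
    "city": ("town", str.title),
}

to_mining_form = build_transformer(mapping)
row = to_mining_form({"age": "42", "income": "58000", "town": "new delhi"})
```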

5. Data Mining
Data mining is defined as the application of techniques to extract potentially useful
patterns. It transforms task-relevant data into patterns and decides the purpose of the
model using classification or characterization.

6. Pattern Evaluation
Pattern evaluation is defined as identifying the truly interesting patterns representing
knowledge, based on given measures. It finds an interestingness score for each pattern and
uses summarization and visualization to make the data understandable to the user.
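
One commonly used interestingness measure is lift. The sketch below scores a few candidate patterns and keeps those above a threshold; the `patterns` list, its support figures, and the threshold of 1.2 are made-up assumptions, and other measures could be substituted.

```python
def lift(support_ab, support_a, support_b):
    """Lift above 1 suggests A and B occur together more often than
    they would if they were independent."""
    return support_ab / (support_a * support_b)

def interesting(patterns, min_lift=1.2):
    """Score each pattern and return (score, rule) pairs above the
    threshold, highest score first."""
    scored = [
        (lift(p["support_ab"], p["support_a"], p["support_b"]), p["rule"])
        for p in patterns
    ]
    return sorted((s for s in scored if s[0] >= min_lift), reverse=True)

# Hypothetical candidate patterns with precomputed supports.
patterns = [
    {"rule": "bread -> butter", "support_a": 0.5, "support_b": 0.4, "support_ab": 0.3},
    {"rule": "bread -> milk", "support_a": 0.5, "support_b": 0.8, "support_ab": 0.4},
]
kept = interesting(patterns)
```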

7. Knowledge Presentation
This involves presenting the results in a way that is meaningful and can be used to make
decisions.

KDD is an iterative process in which evaluation measures can be enhanced, mining can be
refined, and new data can be integrated and transformed in order to get different and more
appropriate results. Preprocessing of databases consists of data cleaning and data
integration.

Data Mining Applications :

Data mining is primarily used by organizations with intense consumer demands, such as
retail, communication, financial, and marketing companies, to determine prices, consumer
preferences, and product positioning, and to assess the impact on sales, customer
satisfaction, and corporate profits.

1. Data Mining in Healthcare:

Data mining in healthcare has excellent potential to improve the health system. It uses data and
analytics for better insights and to identify best practices that will enhance health care services and
reduce costs.

2. Data Mining in Market Basket Analysis:

Market basket analysis is a modeling method based on the hypothesis that if you buy a
specific group of products, then you are more likely to buy another group of products. This
technique may enable the retailer to understand the purchase behavior of a buyer. This data
may assist the retailer in understanding the requirements of the buyer and altering the
store's layout accordingly.

3. Data mining in Education:

Educational data mining (EDM) is a newly emerging field concerned with developing techniques
that discover knowledge from the data generated in educational environments. The objectives
of EDM are recognized as predicting students' future learning behavior, studying the impact
of educational support, and promoting learning science.

4. Data Mining in Manufacturing Engineering:

Knowledge is the best asset possessed by a manufacturing company. Data mining tools can be
beneficial to find patterns in a complex manufacturing process.

5. Data Mining in CRM (Customer Relationship Management):

Customer Relationship Management (CRM) is all about obtaining and retaining customers,
enhancing customer loyalty, and implementing customer-oriented strategies. To build a good
relationship with its customers, a business organization needs to collect data and analyze
it. With data mining technologies, the collected data can be used for analytics.

6. Data Mining in Fraud detection:

Billions of dollars are lost to fraud. Traditional methods of fraud detection are
time-consuming and complex. Data mining finds meaningful patterns and turns data into
information. An ideal fraud detection system should protect the data of all users. In
supervised methods, a collection of sample records is gathered and each record is classified
as fraudulent or non-fraudulent; a model built from these records is then used to judge
whether a new record is fraudulent.

7. Data Mining in Lie Detection:

Apprehending a criminal is comparatively easy, but bringing out the truth from a suspect is
a very challenging task. Law enforcement may use data mining techniques to investigate
offenses, monitor suspected terrorist communications, etc. This includes text mining, which
seeks meaningful patterns in data that is usually unstructured text. The information
collected from previous investigations is compared, and a model for lie detection is
constructed.

8. Data Mining in Financial Banking:

The digitalization of the banking system generates an enormous amount of data with every new
transaction. Data mining techniques can help bankers solve business-related problems in
banking and finance by identifying trends, causalities, and correlations in business
information and market costs that are not instantly evident to managers or executives,
because the data volume is too large or the data is produced too rapidly for experts to
screen.

9. Customer Segmentation: Businesses use data mining techniques to understand their
customers.

What Technologies Are Used?

Major Issues in Data Mining :

• Mining Methodology
   o Mining various and new kinds of knowledge
   o Mining knowledge in multi-dimensional space
   o Data mining: An interdisciplinary effort
   o Boosting the power of discovery in a networked environment
   o Handling noise, uncertainty, and incompleteness of data
   o Pattern evaluation and pattern- or constraint-guided mining
• User Interaction
   o Interactive mining
   o Incorporation of background knowledge
   o Presentation and visualization of data mining results
• Efficiency and Scalability
   o Efficiency and scalability of data mining algorithms
   o Parallel, distributed, stream, and incremental mining methods
• Diversity of data types
   o Handling complex types of data
   o Mining dynamic, networked, and global data repositories
• Data mining and society
   o Social impacts of data mining
   o Privacy-preserving data mining
   o Invisible data mining

Challenges of Implementation in Data mining :

1. Incomplete and noisy data:

Data mining is the process of extracting useful information from large volumes of data, but
real-world data is heterogeneous, incomplete, and noisy.

2. Data Distribution:

Real-world data is usually stored on various platforms in a distributed computing
environment. Data mining requires the development of tools and algorithms that allow the
mining of distributed data.

3. Complex Data:

Real-world data is heterogeneous and may include multimedia data (audio, video, and images),
spatial data, time series, and other complex types. Managing these various types of data and
extracting useful information from them is a tough task.

4. Performance:

The data mining system's performance relies primarily on the efficiency of algorithms and
techniques used. If the designed algorithm and techniques are not up to the mark, then the
efficiency of the data mining process will be affected adversely.

5. Data Privacy and Security:

Data mining can lead to serious issues in terms of data security, governance, and privacy.
For example, if a retailer analyzes the details of purchased items, it reveals the buying
habits and preferences of customers without their permission.

6. Data Visualization:

In data mining, data visualization is a very important process because it is the primary method that
shows the output to the user in a presentable way. The extracted data should convey the exact
meaning of what it intends to express.

Tasks /Goals of Data Mining:

Data mining tasks are designed to run semi-automatically or fully automatically on large
data sets to uncover patterns such as groups or clusters, unusual records (anomaly
detection), and dependencies such as associations and sequential patterns. Once patterns
are uncovered, they can be thought of as a summary of the input data, and further analysis
may be carried out using machine learning and predictive analytics.

The primary goal of data mining is to discover hidden patterns and relationships in the data
that can be used to make informed decisions or predictions. This involves exploring the data
using various techniques such as clustering, classification, regression analysis,
association rule mining, and anomaly detection.

Data mining tasks can be classified into two types, descriptive and predictive:

o Descriptive Data Mining: It summarizes the data to show what is happening within it,
without any prior hypothesis. The common features of the data set are highlighted, for
example counts and averages.
o Predictive Data Mining: It helps estimate the values of attributes that are not yet known
(unlabeled). Using previously available or historical data, data mining can make predictions
about critical business metrics based on trends in the data, for example predicting the
volume of business next quarter from performance in the previous quarters over several
years, or judging from the findings of a patient's medical examinations whether the patient
is suffering from a particular disease.

Functionalities of Data Mining:

Data mining functionalities are used to represent the type of patterns that have to be discovered in
data mining tasks.

Data Mining Techniques:

Data mining involves the use of sophisticated data analysis tools to find previously
unknown, valid patterns and relationships in huge data sets. These tools can incorporate
statistical models, machine learning techniques, and mathematical algorithms, such as neural
networks or decision trees. Thus, data mining incorporates analysis and prediction.

1. Association

Association analysis is the discovery of association rules showing attribute-value
conditions that frequently occur together in a given set of data. It is widely used for
market basket and transaction data analysis. Association rule mining is a significant and
exceptionally active area of data mining research.
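
Two basic quantities behind association rules are support and confidence. The sketch below computes them for a handful of made-up shopping baskets; the `transactions` list and the function names are assumptions for illustration only, and real algorithms such as Apriori add an efficient search over candidate itemsets.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical market baskets.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]

rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})
```

With these baskets, the rule "bread -> butter" holds in half of all transactions and in two of the three transactions that contain bread.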

2. Classification

Classification is the process of finding a set of models (or functions) that describe and
distinguish data classes or concepts, so that the model can be used to predict the class of
objects whose class label is unknown. The derived model is based on the analysis of a set of
training data (i.e. data objects whose class label is known).

Data mining uses several types of classifiers:

• Decision Tree
• SVM (Support Vector Machine)
• Generalized Linear Models
• Bayesian Classification
• Classification by Backpropagation
• K-NN Classifier
• Rule-Based Classification
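
To make the idea of learning from labeled training data concrete, here is a minimal k-nearest-neighbour (K-NN) classifier, one of the classifier types listed above. The training points, the labels "low" and "high", and the choice of k = 3 are illustrative assumptions.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points. `training` is a list of (features, label) pairs."""
    nearest = sorted(training, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: objects whose class label is known.
training = [
    ((1.0, 1.0), "low"), ((1.5, 2.0), "low"), ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"), ((7.0, 6.0), "high"), ((6.5, 7.0), "high"),
]

label = knn_predict(training, (1.2, 1.4))
```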

3. Prediction

Data prediction is a two-step process, similar to data classification. For prediction,
however, we do not use the term “class label attribute”, because the attribute whose values
are being predicted is continuous-valued (ordered) rather than categorical (discrete-valued
and unordered).

4. Clustering

Unlike classification and prediction, which analyze class-labeled data objects or
attributes, clustering analyzes data objects without consulting a known class label. In
general, the class labels are not present in the training data simply because they are not
known to begin with. Clustering can be used to generate these labels. The objects are
clustered based on the principle of maximizing the intra-class similarity and minimizing the
inter-class similarity. That is, clusters are formed so that objects within a cluster are
highly similar to one another but very dissimilar to objects in other clusters.
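
A very small k-means routine shows how unlabeled points can be grouped by similarity. It is only a sketch: the sample points are invented, the initial centroids are simply the first k points, and production implementations use better initialisation and stopping rules.

```python
from math import dist

def kmeans(points, k, iterations=10):
    """Assign each point to its nearest centroid, move each centroid to the
    mean of its points, and repeat for a fixed number of iterations."""
    centroids = list(points[:k])  # naive initialisation: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Hypothetical unlabeled objects forming two visible groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 1.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, k=2)
```

The index of the cluster each point falls into can then serve as a generated class label.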

5. Regression

Regression can be defined as a statistical modeling method in which previously obtained data
is used to predict a continuous quantity for new observations. Because it predicts
continuous values rather than class labels, it is sometimes called a continuous-value
classifier. Two common types of regression model are simple linear regression and multiple
linear regression.
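
Simple linear regression can be written out directly with the least-squares formulas. In the sketch below the quarterly `sales` figures are made up, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: sales observed in six consecutive quarters.
quarters = [1, 2, 3, 4, 5, 6]
sales = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(quarters, sales)
forecast = slope * 7 + intercept  # predicted value for quarter 7
```

Multiple linear regression follows the same idea with more than one input variable.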

6. Artificial Neural network (ANN) Classifier Method

An artificial neural network (ANN), also referred to simply as a “neural network” (NN), is a
computational model inspired by biological neural networks. It consists of an interconnected
collection of artificial neurons. A neural network is a set of connected input/output units
where each connection has a weight associated with it. During the learning phase, the
network learns by adjusting the weights so that it can predict the correct class label of
the input samples.
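
The weight-adjustment idea can be seen in the simplest possible network, a single neuron trained with the perceptron rule. This is a sketch under stated assumptions: the AND-gate samples, the learning rate, and the epoch count are chosen for illustration, and multi-layer networks are trained with backpropagation instead.

```python
def predict(weights, bias, inputs):
    """Output 1 if the weighted sum of the inputs plus the bias is positive."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=50, rate=0.1):
    """Adjust the weights after every sample in proportion to the error.
    `samples` is a list of (inputs, target) pairs with target 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical training set: the logical AND of two inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
```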

7. Outlier Detection

A database may contain data objects that do not comply with the general behavior or model of
the data. These data objects are outliers, and the analysis of outlier data is known as
outlier mining. An outlier may be detected using statistical tests that assume a
distribution or probability model for the data, or using distance measures, where objects
that have only a small fraction of “close” neighbors in space are considered outliers.
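
The distance-based approach can be sketched as follows: a point is flagged when too few of the other points lie within a given radius of it. The sample points, the radius of 1.0, and the threshold of 0.5 are assumptions chosen for the example.

```python
from math import dist

def distance_outliers(points, radius, min_fraction):
    """Flag points that have fewer than `min_fraction` of the other
    points within `radius` of them."""
    outliers = []
    for i, p in enumerate(points):
        close = sum(
            1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius
        )
        if close / (len(points) - 1) < min_fraction:
            outliers.append(p)
    return outliers

# Hypothetical data: four nearby points and one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (9.0, 9.0)]
found = distance_outliers(points, radius=1.0, min_fraction=0.5)
```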

Stages Of Data Mining Process :

Data mining is a systematic process of discovering previously unknown findings hidden within
large datasets. The data mining process generally involves six main phases:

• Business understanding (problem statement)
• Data understanding
• Data preparation
• Data analysis
• Evaluation
• Deployment

Problem Definition

The first stage of data mining is problem definition, which involves identifying a specific
business problem or objective to be achieved through data analysis. This could range from
improving customer retention rates to identifying opportunities for cost savings.

Data Collection

Once the problem has been clearly defined, data collection is the next stage of data mining. This
involves gathering relevant data from a variety of sources, including both internal and external
sources.

Data Analysis

Once you have collected and organized your data, the next data mining stage is data analysis. In
this step, statistical methods and algorithms are used to analyze the data in order to uncover
patterns, relationships, and insights. This can involve using techniques such as regression
analysis, clustering analysis, or decision trees to identify relationships between variables and
make predictions about future outcomes.

Evaluation

After completing the analysis stage of data mining, it’s important to evaluate the results against
your original problem definition. This allows you to determine whether your analysis has
addressed the initial problem or needs further refinement.

Deployment

The deployment stage is the final step in the data mining process. Once analysis has been
completed, it’s essential to integrate the results into business practice by incorporating them into
decision-making processes.

Recommended Reference Books

• S. Chakrabarti. Mining the Web: Statistical Analysis of Hypertext and Semi-Structured
  Data. Morgan Kaufmann, 2002.
• R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed.
  Wiley-Interscience, 2000.
• T. Dasu and T. Johnson. Exploratory Data Mining and Data Cleaning. John Wiley & Sons, 2003.
• U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge
  Discovery and Data Mining. AAAI/MIT Press, 1996.
• U. Fayyad, G. Grinstein, and A. Wierse. Information Visualization in Data Mining and
  Knowledge Discovery. Morgan Kaufmann, 2001.
• J. Han and M. Kamber. Data Mining: Concepts and Techniques, 3rd ed. Morgan Kaufmann, 2011.
• D. J. Hand, H. Mannila, and P. Smyth. Principles of Data Mining. MIT Press, 2001.
• T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data
  Mining, Inference, and Prediction, 2nd ed. Springer-Verlag, 2009.
• B. Liu. Web Data Mining. Springer, 2006.
• T. M. Mitchell. Machine Learning. McGraw Hill, 1997.
• G. Piatetsky-Shapiro and W. J. Frawley. Knowledge Discovery in Databases. AAAI/MIT
  Press, 1991.
• P.-N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining. Wiley, 2005.
• S. M. Weiss and N. Indurkhya. Predictive Data Mining. Morgan Kaufmann, 1998.
• I. H. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques
  with Java Implementations, 2nd ed. Morgan Kaufmann, 2005.

Extra Questions Related To Module -1 :
