Data Mining
Data mining is the process of extracting useful information from large sets of
data. It involves using various techniques from statistics, machine learning, and
database systems to identify patterns, relationships, and trends in the data. This
information can then be used to make data-driven decisions, solve business
problems, and uncover hidden insights. Applications of data mining include
customer profiling and segmentation, market basket analysis, anomaly
detection, and predictive modeling. Data mining tools and technologies are
widely used in various industries, including finance, healthcare, retail, and
telecommunications.
In general terms, “mining” is the process of extracting some valuable material from the earth, e.g., coal mining, diamond mining. In the context of computer science, “data mining” can be referred to as knowledge mining from data, knowledge extraction, data/pattern analysis, data archaeology, and data dredging. It is essentially the process of extracting useful information from a bulk of data or from data warehouses. One can see that the term itself is a little confusing: in the case of coal or diamond mining, the result of the extraction process is coal or diamond, but in the case of data mining, the result of the extraction process is not data. Instead, the results are the patterns and knowledge we gain at the end of the extraction process. In that sense, we can think of data mining as one step in the larger process of Knowledge Discovery or Knowledge Extraction.
Gregory Piatetsky-Shapiro coined the term “Knowledge Discovery in
Databases” in 1989. However, the term ‘data mining’ became more popular
in the business and press communities. Currently, Data Mining and Knowledge
Discovery are used interchangeably.
Nowadays, data mining is used almost everywhere large amounts of data are stored and processed. For example, banks typically use data mining to identify prospective customers who might be interested in credit cards, personal loans, or insurance. Since banks hold transaction details and detailed profiles of their customers, they analyze this data to find patterns that help predict which customers are likely to be interested in such products.
Main Purpose of Data Mining
Data mining integrates techniques from many other domains, such as statistics, machine learning, pattern recognition, database and data warehouse systems, information retrieval, and visualization, to gather more information about the data, uncover hidden patterns, predict future trends and behaviors, and help businesses make decisions.
Technically, data mining is the computational process of analyzing data from different perspectives and dimensions and categorizing or summarizing it into meaningful information.
Data mining can be applied to any type of data, e.g., data warehouses, transactional databases, relational databases, multimedia databases, spatial databases, time-series databases, and the World Wide Web.
Data Mining as a Whole Process
The whole process of Data Mining consists of three main phases:
1. Data Pre-processing – data cleaning, integration, selection, and transformation take place.
2. Data Extraction – the actual mining step, in which patterns are extracted from the prepared data.
3. Data Evaluation and Presentation – the results are analyzed, evaluated, and presented.
Data mining offers several benefits to organizations, including:
1. Improved customer service: Data mining can help organizations better understand their customers and tailor their products and services to meet their needs.
2. Fraud detection: Data mining can be used to identify fraudulent activities by detecting unusual patterns and anomalies in data.
3. Predictive modeling: Data mining can be used to build models that predict future events and trends, which can be used to make proactive decisions.
4. New product development: Data mining can be used to identify new product opportunities by analyzing customer purchase patterns and preferences.
5. Risk management: Data mining can be used to identify potential risks by analyzing data on customer behavior, market conditions, and other factors.
Data Attributes
Data attributes are the properties or characteristics that describe data objects. Attribute values may be numerical measurements (e.g., age, height), categorical labels (e.g., color, type), textual descriptions (e.g., name, description), or any other measurable or qualitative aspect of the data objects.
Types of attributes:
Categorizing attributes into different types is an initial step in data preprocessing and serves as a foundation for subsequent processing steps. Attributes can be broadly classified into two main types:
1. Qualitative (Nominal (N), Ordinal (O), Binary (B))
2. Quantitative (Numeric, Discrete, Continuous)
Qualitative Attributes:
1. Nominal Attributes:
Nominal attributes (“relating to names”) refer to categorical data whose values represent different categories or labels without any inherent order or ranking. These attributes are often used to represent names or labels associated with objects, entities, or concepts.
Example: hair color with values {black, brown, blond, gray}, or marital status with values {single, married, divorced}.
2. Binary Attributes: Binary attributes are a type of qualitative attribute where
the data can take on only two distinct values or states. These attributes are often
used to represent yes/no, presence/absence, or true/false conditions within a
dataset. They are particularly useful for representing categorical data where
there are only two possible outcomes. For instance, in a medical study, a binary
attribute could represent whether a patient is affected or unaffected by a
particular condition.
Symmetric: In a symmetric attribute, both values or states are
considered equally important or interchangeable. For example, in the
attribute “Gender” with values “Male” and “Female,” neither value holds
precedence over the other, and they are considered equally significant for
analysis purposes.
Asymmetric: In an asymmetric attribute, the two values are not equally important. For example, for a medical test with outcomes “positive” and “negative,” the positive (rarer) outcome is typically treated as the more significant one.
3. Ordinal Attributes: Ordinal attributes are qualitative attributes whose values have a meaningful order or ranking, although the magnitude of the difference between successive values is not known. Examples include grades {A, B, C} and size {small, medium, large}.
Quantitative Attributes:
1. Numeric: A numeric attribute is quantitative because it is a measurable quantity, represented by integer or real values. Numeric attributes are of two types: interval-scaled and ratio-scaled.
An interval-scaled attribute has values whose differences are interpretable, but it lacks a true reference point, or zero point. Interval-scaled data can be added and subtracted but cannot meaningfully be multiplied or divided. Consider temperature in degrees Centigrade: if one day's temperature is numerically twice that of another day, we cannot say that the first day is twice as hot.
A ratio-scaled attribute is a numeric attribute with a fixed zero point. If a measurement is ratio-scaled, we can speak of one value as being a multiple (or ratio) of another. The values are ordered, differences between values can be computed, and the mean, median, mode, quantile range, and five-number summary can be given.
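To see why ratios are meaningless on an interval scale, here is a minimal Python sketch (the temperatures are illustrative) contrasting Centigrade with Kelvin, a ratio scale with a true zero:

```python
# A minimal sketch of the interval vs. ratio distinction using temperature.
day1_c, day2_c = 10.0, 20.0  # degrees Centigrade (interval scale)

# The naive ratio on an interval scale is misleading:
print(day2_c / day1_c)  # 2.0, but day 2 is NOT "twice as hot"

# Converting to Kelvin (ratio scale, true zero at absolute zero)
# gives the physically meaningful ratio:
day1_k = day1_c + 273.15
day2_k = day2_c + 273.15
print(day2_k / day1_k)  # ~1.035 -- only about 3.5% hotter in absolute terms
```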
2. Discrete: Discrete data can take on only specific, separate values rather than a continuous range. These values are distinct from one another, and they can be either numerical or categorical in nature.
Example: the number of students in a class, or the number of items in a transaction.
3. Continuous: Continuous data can take on any value within a given range and are typically represented as real numbers. Examples include height, weight, and temperature.
Data Preprocessing in Data Mining
Data preprocessing is an important step in the data mining process. It refers to
the cleaning, transforming, and integrating of data in order to make it ready for
analysis. The goal of data preprocessing is to improve the quality of the data and
to make it more suitable for the specific data mining task.
Steps of Data Preprocessing
Data preprocessing is an important step in the data mining process that involves
cleaning and transforming raw data to make it suitable for analysis. Some
common steps in data preprocessing include:
1. Data Cleaning (to remove noise and inconsistent data): This involves identifying and correcting errors or inconsistencies in the data, such as missing values, outliers, and duplicates. Various techniques can be used for data cleaning, such as imputation, removal, and transformation.
2. Data Integration (where multiple data sources may be
combined): This involves combining data from multiple sources to create
a unified dataset. Data integration can be challenging as it requires
handling data with different formats, structures, and semantics.
Techniques such as record linkage and data fusion can be used for data
integration.
3. Data Transformation (where data are transformed and
consolidated into forms appropriate for mining by performing
summary or aggregation operations): This involves converting the
data into a suitable format for analysis. Common techniques used in data
transformation include normalization, standardization, and discretization.
Normalization is used to scale the data to a common range, while
standardization is used to transform the data to have zero mean and unit
variance. Discretization is used to convert continuous data into discrete
categories.
4. Data Reduction: This involves reducing the size of the dataset while
preserving the important information. Data reduction can be achieved
through techniques such as feature selection and feature extraction.
Feature selection involves selecting a subset of relevant features from the
dataset, while feature extraction involves transforming the data into a
lower-dimensional space while preserving the important information.
5. Data Discretization: This involves dividing continuous data into discrete
categories or intervals. Discretization is often used in data mining and
machine learning algorithms that require categorical data. Discretization
can be achieved through techniques such as equal width binning, equal
frequency binning, and clustering.
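As a brief illustration of the binning techniques just mentioned, here is a hedged sketch using pandas; the ages series and bin labels are hypothetical:

```python
# A minimal discretization sketch, assuming pandas is available.
import pandas as pd

ages = pd.Series([22, 25, 31, 38, 44, 47, 52, 61, 65, 70])

# Equal-width binning: each bin spans an equal range of the value domain.
equal_width = pd.cut(ages, bins=3, labels=["young", "middle", "senior"])

# Equal-frequency binning: each bin holds roughly the same number of points.
equal_freq = pd.qcut(ages, q=3, labels=["young", "middle", "senior"])

print(equal_width.value_counts())
print(equal_freq.value_counts())
```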
Data preprocessing plays a crucial role in ensuring the quality of data and the
accuracy of the analysis results. The specific steps involved in data
preprocessing may vary depending on the nature of the data and the analysis
goals.
By performing these steps, the data mining process becomes more efficient and
the results become more accurate.
Preprocessing in Data Mining
Data preprocessing is a data mining technique used to transform raw data into a useful and efficient format.
Steps Involved in Data Preprocessing
1. Data Cleaning: The data can have many irrelevant and missing parts. Data cleaning is done to handle these issues, which involves dealing with missing data, noisy data, etc.
Missing Data: This situation arises when some values are missing from the data. It can be handled in various ways.
Some of them are:
o Ignore the tuples: This approach is suitable only when the dataset
we have is quite large and multiple values are missing within a
tuple.
o Fill the Missing Values: There are various ways to do this task. You can choose to fill the missing values manually, with the attribute mean, or with the most probable value (a brief sketch follows).
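A minimal sketch of the two strategies above, assuming pandas is available; the DataFrame and its "income" column are hypothetical:

```python
# Handling missing values: dropping tuples vs. filling with the attribute mean.
import numpy as np
import pandas as pd

df = pd.DataFrame({"income": [52000, np.nan, 61000, 58000, np.nan]})

# Option 1: ignore (drop) tuples with missing values -- suitable when the
# dataset is large and only a few tuples are affected.
dropped = df.dropna()

# Option 2: fill missing values with the attribute mean.
df["income"] = df["income"].fillna(df["income"].mean())
print(df)
```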
Noisy Data: Noisy data is meaningless data that cannot be interpreted by machines. It can be generated by faulty data collection, data entry errors, etc. It can be handled in the following ways:
o Binning Method: This method works on sorted data in order to smooth it. The data is divided into segments of equal size, and each segment is handled separately: one can replace all data in a segment by its mean, or boundary values can be used instead (a sketch appears after this list).
o Regression: Here, data can be smoothed by fitting it to a regression function. The regression used may be linear (having one independent variable) or multiple (having multiple independent variables).
o Clustering: This approach groups similar data into clusters. Outliers may go undetected, or they fall outside the clusters.
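As referenced above, here is a minimal sketch of smoothing by bin means with NumPy; the data values and bin size are illustrative:

```python
# Smoothing noisy data by bin means: sort, split into equal-size bins,
# and replace each value by the mean of its bin.
import numpy as np

data = np.sort(np.array([4, 8, 15, 21, 21, 24, 25, 28, 34]))

bins = data.reshape(3, 3)           # three equal-size bins of three values
bin_means = bins.mean(axis=1)       # mean of each bin: [ 9. 22. 29.]
smoothed = np.repeat(bin_means, 3)  # replace each value by its bin mean
print(smoothed)  # [ 9.  9.  9. 22. 22. 22. 29. 29. 29.]
```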
2. Data Transformation: This step transforms the data into forms appropriate for the mining process. It involves the following:
Normalization: Done in order to scale the data values into a specified range, such as -1.0 to 1.0 or 0.0 to 1.0 (a sketch appears after this list).
Attribute Selection: In this strategy, new attributes are constructed
from the given set of attributes to help the mining process.
Discretization: This is done to replace the raw values of a numeric attribute with interval labels or conceptual labels.
Concept Hierarchy Generation: Here, attributes are converted from a lower level to a higher level in a hierarchy. For example, the attribute “city” can be generalized to “country.”
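As referenced above, a minimal sketch of min-max normalization to the range 0.0 to 1.0; the values are illustrative:

```python
# Min-max normalization to [0.0, 1.0], a minimal NumPy sketch.
import numpy as np

values = np.array([200.0, 300.0, 400.0, 600.0, 1000.0])

v_min, v_max = values.min(), values.max()
normalized = (values - v_min) / (v_max - v_min)
print(normalized)  # [0.    0.125 0.25  0.5   1.   ]
```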
3. Data Reduction: Data reduction is a crucial step in the data mining process
that involves reducing the size of the dataset while preserving the important
information. This is done to improve the efficiency of data analysis and to avoid
overfitting of the model. Some common steps involved in data reduction are:
Feature Selection: This involves selecting a subset of relevant features
from the dataset. Feature selection is often performed to remove
irrelevant or redundant features from the dataset. It can be done using
various techniques such as correlation analysis, mutual information, and
principal component analysis (PCA).
Feature Extraction: This involves transforming the data into a lower-
dimensional space while preserving the important information. Feature
extraction is often used when the original features are high-dimensional
and complex. It can be done using techniques such as PCA, linear discriminant analysis (LDA), and non-negative matrix factorization (NMF); a PCA sketch appears after this list.
Sampling: This involves selecting a subset of data points from the
dataset. Sampling is often used to reduce the size of the dataset while
preserving the important information. It can be done using techniques
such as random sampling, stratified sampling, and systematic sampling.
Clustering: This involves grouping similar data points together into
clusters. Clustering is often used to reduce the size of the dataset by
replacing similar data points with a representative centroid. It can be done
using techniques such as k-means, hierarchical clustering, and density-
based clustering.
Compression: This involves compressing the dataset while preserving
the important information. Compression is often used to reduce the size of
the dataset for storage and transmission purposes. It can be done using
techniques such as wavelet compression, JPEG compression, and GIF compression.
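As referenced in the Feature Extraction item above, here is a hedged PCA sketch assuming scikit-learn; the data is synthetic and the choice of three components is arbitrary:

```python
# Dimensionality reduction with PCA, a minimal scikit-learn sketch.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 10))        # 100 samples, 10 original features

pca = PCA(n_components=3)             # project onto a 3-dimensional subspace
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (100, 3)
print(pca.explained_variance_ratio_)  # variance retained by each component
```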
Class/Concept Description: Characterization and Discrimination
Data entries can be associated with classes or concepts. Such class/concept descriptions can be derived by (1) data characterization, by summarizing the data of the class under study (often called the target class) in general terms, or (2) data discrimination, by comparison of the target class
with one or a set of comparative classes (often called the contrasting classes), or
(3) both data characterization and discrimination. Data characterization is a
summarization of the general characteristics or features of a target class of data.
The data corresponding to the user-specified class are typically collected by a
query. For example, to study the characteristics of software products with sales
that increased by 10% in the previous year, the data related to such products
can be collected by executing an SQL query on the sales database.
Data characterization. A customer relationship manager at AllElectronics may
order the following data mining task: Summarize the characteristics of customers
who spend more than $5000 a year at AllElectronics. The result is a general
profile of these customers, such as that they are 40 to 50 years old, employed,
and have excellent credit ratings. The data mining system should allow the
customer relationship manager to drill down on any dimension, such as occupation, to view these customers according to their type of employment.
Data discrimination is a comparison of the general features of the target class
data objects against the general features of objects from one or multiple
contrasting classes. The target and contrasting classes can be specified by a
user, and the corresponding data objects can be retrieved through database
queries. For example, a user may want to compare the general features of
software products with sales that increased by 10% last year against those with
sales that decreased by at least 30% during the same period. The methods used
for data discrimination are similar to those used for data characterization. “How
are discrimination descriptions output?” The forms of output presentation are
similar to those for characteristic descriptions, although discrimination
descriptions should include comparative measures that help to distinguish
between the target and contrasting classes. Discrimination descriptions
expressed in the form of rules are referred to as discriminant rules.
Example 1.6
Data discrimination. A customer relationship manager at AllElectronics may
want to compare two groups of customers—those who shop for computer
products regularly (e.g., more than twice a month) and those who rarely shop for
such products (e.g., less than three times a year). The resulting description
provides a general comparative profile of these customers, such as that 80% of
the customers who frequently purchase computer products are between 20 and
40 years old and have a university education, whereas 60% of the customers
who infrequently buy such products are either seniors or youths, and have no
university degree. Drilling down on a dimension like occupation, or adding a new
dimension like income level, may help to find even more discriminative features
between the two classes.
Mining Frequent Patterns, Associations, and Correlations
Frequent patterns, as the name suggests, are patterns that occur frequently in data. There are
many kinds of frequent patterns, including frequent itemsets, frequent
subsequences (also known as sequential patterns), and frequent substructures. A
frequent itemset typically refers to a set of items that often appear together in a
transactional data set—for example, milk and bread, which are frequently bought
together in grocery stores by many customers. A frequently occurring
subsequence, such as the pattern that customers tend to purchase first a
laptop, followed by a digital camera, and then a memory card, is a (frequent)
sequential pattern. A substructure can refer to different structural forms (e.g.,
graphs, trees, or lattices) that may be combined with itemsets or subsequences.
If a substructure occurs frequently, it is called a (frequent) structured pattern.
Mining frequent patterns leads to the discovery of interesting associations and
correlations within data.
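To make the idea of a frequent itemset concrete, here is a minimal sketch that counts the support of item pairs in a toy transactional data set; the transactions and support threshold are illustrative:

```python
# Counting the support of 2-itemsets in a toy transaction set.
from itertools import combinations
from collections import Counter

transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "eggs"},
]
min_support = 2  # a pair is "frequent" if it occurs in >= 2 transactions

# Count every 2-item combination that occurs within a transaction.
counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        counts[pair] += 1

frequent = {pair: c for pair, c in counts.items() if c >= min_support}
print(frequent)  # e.g. ('bread', 'milk') appears together in 3 transactions
```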
Cluster Analysis
Unlike classification and regression, which analyze class-labeled (training) data sets, clustering analyzes data objects without consulting class labels. In many cases, class-labeled data may simply not exist at the beginning. Clustering can be used to generate class labels for a group of data.
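As a minimal illustration, the sketch below clusters unlabeled 2-D points with k-means (assuming scikit-learn); the generated cluster assignments can then serve as class labels:

```python
# Clustering without class labels: a minimal k-means sketch.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two unlabeled blobs of points in 2-D.
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5, 5), scale=0.5, size=(50, 2)),
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5])        # cluster assignments act as generated labels
print(kmeans.cluster_centers_)   # one centroid per discovered cluster
```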