DATA MINING
Q1. What are the different tasks of Data Mining? (5 marks)
Data mining involves various tasks aimed at discovering patterns,
relationships, and valuable information from large datasets. The key tasks include:
Classification: Assigning predefined labels or categories to
instances based on their attributes. Example: Predicting whether an email is spam or not.
Regression: Predicting a continuous numerical value based on
other attributes. Example: Predicting the price of a house based on its features.
Clustering: Grouping similar instances or data points together
based on their characteristics. Example: Segmenting customers into groups with similar purchasing behavior.
Association Rule Mining: Discovering relationships and
associations between different variables in a dataset. Example: Finding associations between products frequently bought together.
Anomaly Detection: Identifying unusual patterns or outliers in the
data that may indicate errors or fraud. Example: Detecting unusual activity in credit card transactions.
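Two of the tasks above, classification and clustering, can be sketched in a few lines. This is a minimal illustration assuming scikit-learn is available; the toy feature values and labels are invented for the example.

```python
# Illustrative sketch of two data mining tasks with scikit-learn.
# The toy data below is made up: each row is [link_count, has_greeting].
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X = [[0, 1], [1, 1], [5, 0], [6, 0]]
y = ["ham", "ham", "spam", "spam"]

# Classification: assign a predefined label to a new instance.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
pred = clf.predict([[5, 0]])[0]

# Clustering: group the same points without using the labels at all.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_  # cluster index assigned to each point
```

Note that the classifier needs the labels `y` (supervised), while K-Means groups the rows using only their feature values (unsupervised), which is exactly the distinction drawn in Q3 below.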
Q2. Explain the Decision Tree Classifier. (5 marks)
The Decision Tree Classifier is a supervised machine learning algorithm used for both classification and regression tasks. It works by recursively partitioning the dataset into subsets based on the values of input features, ultimately assigning a class label or predicting a target value for each instance. The tree-like structure is composed of nodes, where each node represents a decision or a test on a specific feature. Here's an overview of how the Decision Tree Classifier operates:
Root Node: The topmost node in the tree, representing the entire dataset. It is associated with the feature that, at this level, is deemed the most significant for making decisions.
Internal Nodes: Nodes within the tree, representing tests or decisions based on specific features. Each internal node has branches leading to child nodes, corresponding to the possible outcomes of the associated test.
Branches: Each branch emanating from an internal node
corresponds to a possible value or range of values for the associated feature.
Leaf Nodes: Terminal nodes at the end of the branches, representing the final predicted class label or regression value. Instances reaching a leaf node are assigned the class label or value associated with that leaf.
Decision Making: To classify a new instance, start at the root node and navigate the tree based on the instance's feature values. Follow the branches according to the test outcomes until a leaf node is reached; the class label or value associated with that leaf is then assigned.
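The decision-making walk described above can be written as a short loop. This is a hand-rolled sketch, not a real library's API: the tree, feature names, and thresholds are invented for illustration.

```python
# Hand-rolled sketch of decision-tree prediction.
# Internal nodes hold a feature test; leaf nodes hold only a "label".

def predict(node, instance):
    """Start at the root and follow branches until a leaf is reached."""
    while "label" not in node:                 # internal node: apply its test
        feature, threshold = node["feature"], node["threshold"]
        branch = "left" if instance[feature] <= threshold else "right"
        node = node[branch]                    # descend along the branch
    return node["label"]                       # leaf: final class label

# Illustrative tree: the root tests 'income', one internal node tests 'age'.
tree = {
    "feature": "income", "threshold": 50,
    "left": {"label": "deny"},
    "right": {
        "feature": "age", "threshold": 25,
        "left": {"label": "review"},
        "right": {"label": "approve"},
    },
}

print(predict(tree, {"income": 80, "age": 40}))  # prints "approve"
```

The structure mirrors the node roles above: the dict at the top is the root, nested dicts with a "feature" key are internal nodes, their "left"/"right" entries are branches, and dicts with only a "label" are leaves.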
Q3. Define Clustering. (5 marks)
Clustering is a data mining technique that involves grouping a set
of data points or objects based on their similarity. The goal is to create clusters or groups where objects within the same cluster are more similar to each other than to those in other clusters. Clustering is an unsupervised learning approach, meaning it doesn't require predefined labels for the data.
Key Characteristics:
Unsupervised Learning: No predefined categories; the algorithm discovers patterns on its own.
Similarity Measure: Clusters are formed based on the similarity or distance between data points.
Noise Handling: Some algorithms (e.g., DBSCAN) are robust to noise and outliers.
Applications:
Market Segmentation: Grouping customers with similar
preferences.
Image Segmentation: Grouping pixels with similar characteristics.
Anomaly Detection: Identifying unusual patterns by recognizing deviations from normal clusters.
Algorithms:
K-Means: Divides data into k clusters based on centroids.
Hierarchical Clustering: Builds a tree of clusters by merging or splitting them.
DBSCAN (Density-Based Spatial Clustering of Applications with Noise): Clusters dense regions of data points.
Clustering is valuable for exploratory data analysis and pattern recognition, helping to uncover hidden structures within datasets.
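As a concrete example, the K-Means algorithm listed above can be run in a few lines. This is a minimal sketch assuming scikit-learn is available; the 2-D points are invented so the two groups are obvious.

```python
# Minimal K-Means sketch: divide points into k=2 clusters based on centroids.
import numpy as np
from sklearn.cluster import KMeans

# Two visibly separate groups of 2-D points (illustrative data).
X = np.array([[1.0, 1.0], [1.2, 0.8],
              [8.0, 8.0], [8.2, 7.9]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

labels = km.labels_            # cluster index assigned to each point
centers = km.cluster_centers_  # one centroid per cluster
```

No labels are supplied to `fit`, which is what makes this unsupervised: the algorithm places the two centroids and assigns each point to its nearest one.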