
1. Data Mining Functionalities:


Data mining functions are used to identify the trends and correlations present in data. Data mining activities can be divided into two categories:
1. Descriptive Data Mining:
It describes what is happening within the data, without any prior hypothesis. The common features of the data set are highlighted.
Examples: count, average, etc.
2. Predictive Data Mining:
It estimates values for attributes whose labels are absent, based on previous observations. For example, judging from the findings of a patient's medical examinations whether he is suffering from a particular disease.
Data Mining Functionalities:
1. Class/Concept Descriptions:
Data entries can be associated with classes or concepts. It is often useful to describe individual classes and concepts in summarized, concise, and yet precise terms. Such descriptions are called class/concept descriptions.
• Data Characterization:
This refers to summarizing the general characteristics or features of the class under study. For example, to study the characteristics of a software product whose sales increased by 15% two years ago, one can collect data related to such products by running SQL queries.
• Data Discrimination:
It compares the general features of the class under study with those of one or more contrasting classes. The output of this process can be represented in many forms, e.g., bar charts, curves, and pie charts.
2. Mining Frequent Patterns, Associations, and Correlations:
Frequent patterns are patterns that occur frequently in data. There are many kinds of frequent patterns, including itemsets, subsequences, and substructures.
3. Classification and Prediction:
Classification is the process of finding a model that describes and distinguishes
data classes for the purpose of being able to use the model to predict the class of
objects whose class label is unknown.
4. Cluster Analysis:
Classification and prediction analyse class-labelled data objects, whereas clustering analyses data objects without consulting a known class label.
5. Outlier Analysis:
A database may contain data objects that do not comply with the general
behaviour or model of the data. These data objects are outliers.
Most data mining methods discard outliers as noise or exceptions. The analysis of
outlier data is referred to as outlier mining.
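The outlier-analysis idea above can be sketched with a simple z-score rule. This is an illustrative sketch, not from the original text; the function name and data are hypothetical, and real systems typically use more robust methods:

```python
# Illustrative sketch: flag values far from the mean as outliers
# (a simple z-score rule with a configurable threshold).
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    mu = mean(values)
    sigma = stdev(values)
    # A value is flagged when it lies more than `threshold`
    # standard deviations away from the mean.
    return [v for v in values if abs(v - mu) > threshold * sigma]

data = [10, 12, 11, 13, 12, 95]  # 95 clearly deviates from the rest
print(find_outliers(data, threshold=1.5))
```

Note that a single extreme value inflates the standard deviation itself, which is one reason clustering-based or distance-based outlier mining is often preferred.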

2. Data Mining Task Primitives


Each user will have a data mining task in mind, that is, some form of data analysis that he
or she would like to have performed. A data mining task can be specified in the form of
a data mining query, which is input to the data mining system. A data mining query is
defined in terms of data mining task primitives. These primitives allow the user to
interactively communicate with the data mining system during discovery in order to
direct the mining process, or examine the findings from different angles or depths.

The data mining primitives specify the following:


The set of task-relevant data to be mined:
This specifies the portions of the database or the set of data in which the user is
interested. This includes the database attributes or data warehouse dimensions of
interest (referred to as the relevant attributes or dimensions).
The kind of knowledge to be mined:
This specifies the data mining functions to be performed, such as characterization,
discrimination, association or correlation analysis, classification, prediction, clustering,
outlier analysis, or evolution analysis.
The background knowledge to be used in the discovery process:
This knowledge about the domain to be mined is useful for guiding the knowledge
discovery process and for evaluating the patterns found.
The interestingness measures and thresholds for pattern evaluation:
They may be used to guide the mining process or, after discovery, to evaluate the
discovered patterns. Different kinds of knowledge may have different interestingness
measures. For example, interestingness measures for association rules include support
and confidence. Rules whose support and confidence values are below user-specified
thresholds are considered uninteresting.
The expected representation for visualizing the discovered patterns:
This refers to the form in which discovered patterns are to be displayed, which may
include rules, tables, charts, graphs, decision trees, and cubes.
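The support and confidence measures mentioned among the interestingness primitives can be computed directly from a transaction list. The following sketch (with made-up transactions) shows both measures for one rule:

```python
# Illustrative sketch: support and confidence of the association rule
# {bread} -> {butter} over a small, made-up list of transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"milk", "eggs"},
]

n = len(transactions)
# Transactions containing both the antecedent and the consequent.
both = sum(1 for t in transactions if {"bread", "butter"} <= t)
# Transactions containing the antecedent only.
antecedent = sum(1 for t in transactions if "bread" in t)

support = both / n              # fraction of all transactions with both items
confidence = both / antecedent  # of those with bread, fraction also with butter

print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule would then be kept only if both values exceed the user-specified thresholds.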

3. Major Issues in Data Mining:


• Mining different kinds of knowledge in databases:
Different users have different needs and may be interested in different kinds of knowledge. Therefore, data mining needs to cover a broad range of knowledge discovery tasks.

• Interactive mining of knowledge at multiple levels of abstraction:


The data mining process needs to be interactive because it allows users to
focus the search for patterns, providing and refining data mining requests
based on returned results.
• Incorporation of background knowledge:
Background knowledge can be used to guide the discovery process and to express the discovered patterns, not only in concise terms but at multiple levels of abstraction.
• Data mining query languages and ad hoc data mining:
A data mining query language that allows the user to describe ad hoc mining tasks should be integrated with a data warehouse query language and optimized for efficient and flexible data mining.
• Presentation and visualization of data mining results:
Once patterns are discovered, they need to be expressed in high-level languages and visual representations that are easily understandable by users.
• Handling noisy or incomplete data:
Data cleaning methods are required that can handle noise and incomplete objects while mining data regularities. Without such methods, the accuracy of the discovered patterns will be poor.
• Pattern evaluation:
This refers to the interestingness of the discovered patterns. Patterns are uninteresting if they merely represent common knowledge or lack novelty.
• Efficiency and scalability of data mining algorithms:
To effectively extract information from the huge amounts of data in databases, data mining algorithms must be efficient and scalable.
• Parallel, distributed, and incremental mining algorithms:
Factors such as the huge size of databases, the wide distribution of data, and the complexity of data mining methods motivate the development of parallel and distributed data mining algorithms. These algorithms divide the data into partitions, which are processed in parallel; the results from the partitions are then merged. Incremental algorithms update existing mining results when the database changes, without having to mine the data again from scratch.
4. Data Pre-processing:

Data pre-processing is a data mining technique used to transform raw data into a useful and efficient format.

Steps Involved in Data Pre-processing:


1. Data Cleaning:
The data can have many irrelevant and missing parts. To handle this part, data
cleaning is done. It involves handling of missing data, noisy data etc.
• (a) Missing Data:
This situation arises when some values are missing in the data. It can be handled in various ways.
Some of them are:
1. Ignore the tuples:
This approach is suitable only when the dataset is quite large and multiple values are missing within a tuple.
2. Fill the Missing values:
There are various ways to do this task. You can choose to fill the
missing values manually, by attribute mean or the most probable
value.
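Filling missing values by the attribute mean, as described above, can be sketched as follows (an illustrative sketch; the function name and data are hypothetical):

```python
# Illustrative sketch: fill missing values (represented as None)
# in one attribute with the mean of the observed values.
def fill_with_mean(values):
    observed = [v for v in values if v is not None]
    mu = sum(observed) / len(observed)
    return [mu if v is None else v for v in values]

ages = [25, None, 30, 35, None]
print(fill_with_mean(ages))  # the None entries become the mean, 30.0
```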
• (b) Noisy Data:
Noisy data is meaningless data that cannot be interpreted by machines. It can be generated due to faulty data collection, data entry errors, etc. It can be handled in the following ways:
1. Binning Method:
This method works on sorted data in order to smooth it. The whole data is divided into segments of equal size, and each segment is handled separately. One can replace all values in a segment by the segment mean, or use the segment's boundary values.
2. Regression:
Here data can be made smooth by fitting it to a regression function.
The regression used may be linear (having one independent
variable) or multiple (having multiple independent variables).
3. Clustering:
This approach groups similar data into clusters. Values that fall outside the clusters can be treated as outliers (though some outliers may go undetected).
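The binning method described above (smoothing by bin means over equal-size bins of sorted data) can be sketched like this; the data values are a standard textbook-style example:

```python
# Illustrative sketch: smooth sorted data by dividing it into
# equal-size bins and replacing each value with its bin mean.
def smooth_by_bin_means(sorted_values, bin_size):
    result = []
    for i in range(0, len(sorted_values), bin_size):
        bin_ = sorted_values[i:i + bin_size]
        mu = sum(bin_) / len(bin_)
        result.extend([mu] * len(bin_))  # every value takes the bin mean
    return result

data = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # already sorted
print(smooth_by_bin_means(data, 3))
```

Smoothing by bin boundaries would instead replace each value with the nearer of the bin's minimum and maximum.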
2. Data Transformation:
This step is taken in order to transform the data in appropriate forms suitable for
mining process. This involves following ways:
1. Normalization:
It is done in order to scale the data values into a specified range, such as -1.0 to 1.0 or 0.0 to 1.0.
2. Attribute Selection:
In this strategy, new attributes are constructed from the given set of
attributes to help the mining process.
3. Discretization:
This is done to replace the raw values of a numeric attribute by interval levels or conceptual levels.
4. Concept Hierarchy Generation:
Here attributes are converted from a lower level to a higher level in the hierarchy. For example, the attribute "city" can be converted to "country".
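Normalization to a specified range, mentioned in step 1 above, is commonly done with the min-max formula. A minimal sketch (function name and salary values are illustrative):

```python
# Illustrative sketch: min-max normalization, rescaling values
# linearly into [new_min, new_max] (default [0.0, 1.0]).
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]

salaries = [30000, 50000, 70000, 90000]
print(min_max_normalize(salaries))  # smallest maps to 0.0, largest to 1.0
```

A real implementation would also have to handle the degenerate case where all values are equal (hi == lo).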
3. Data Reduction:
Data mining handles huge amounts of data, and analysis becomes harder when working with such volumes. To address this, data reduction techniques are used. They aim to increase storage efficiency and reduce data storage and analysis costs.
The various steps of data reduction are:

1. Data Cube Aggregation:
The aggregation operation is applied to the data to construct the data cube.
2. Attribute Subset Selection:
Only the highly relevant attributes should be used; the rest can be discarded. For performing attribute selection, one can use the significance level and p-value of the attribute: an attribute whose p-value is greater than the significance level can be discarded.
3. Numerosity Reduction:
This enables storing a model of the data instead of the whole data, for example, regression models.
4. Dimensionality Reduction:
This reduces the size of the data through encoding mechanisms. It can be lossy or lossless: if the original data can be retrieved after reconstruction from the compressed data, the reduction is called lossless; otherwise it is called lossy. Two effective methods of dimensionality reduction are wavelet transforms and PCA (Principal Component Analysis).
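To make the PCA idea concrete, here is a minimal sketch that projects 2-D points onto their single principal axis, using the closed-form eigendecomposition of the 2x2 covariance matrix. The function name and data are illustrative, and real implementations use a linear-algebra library:

```python
# Illustrative sketch: PCA for 2-D data, reducing each point to its
# coordinate along the leading principal component (2-D -> 1-D).
import math

def pca_2d(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # sample covariance matrix entries
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # leading eigenvalue of [[sxx, sxy], [sxy, syy]] (closed form)
    mean_diag = (sxx + syy) / 2
    delta = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam = mean_diag + delta
    # corresponding eigenvector, normalized to unit length
    vx, vy = (sxy, lam - sxx) if sxy != 0 else (1.0, 0.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # project each centered point onto the principal axis
    return [x * vx + y * vy for x, y in centered]

points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
print(pca_2d(points))  # points lie on a line, so nothing is lost in 1-D
```

Because the sample points lie exactly on a line, the 1-D projection preserves all of their variance; this is the lossless extreme of a generally lossy reduction.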

5. Data Cleaning:
Data cleaning is one of the important parts of data mining and plays a significant role in building a model; the success or failure of a project often relies on proper data cleaning.
With a well-cleaned dataset, even simple algorithms can achieve good results, which is especially beneficial in terms of computation when the dataset is large.
Obviously, different types of data will require different types of cleaning. However, the following systematic approach can always serve as a good starting point.

Steps involved in Data Cleaning:


1. Removal of unwanted observations
This includes deleting duplicate/redundant or irrelevant values from your dataset. Duplicate observations most frequently arise during data collection, and irrelevant observations are those that don't actually fit the specific problem that you're trying to solve.
• Redundant observations greatly reduce efficiency, since the repeated data may bias the results one way or the other, producing unreliable results.
• Irrelevant observations are any type of data that is of no use to us and can be removed directly.
2. Fixing structural errors
The errors that arise during measurement, transfer of data, or other similar situations are called structural errors. Structural errors include typos in feature names, the same attribute appearing under different names, mislabelled classes (i.e., separate classes that should really be the same), and inconsistent capitalization.
• For example, the model will treat "america" and "America" as different classes or values, though they represent the same value; similarly, "red", "yellow", and "red-yellow" may be treated as three different classes, though one class can be included in the other two. Such structural errors make our model inefficient and give poor-quality results.
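A common first pass at fixing such capitalization errors is to normalize every class label to one canonical form. A minimal sketch (function name and labels are illustrative):

```python
# Illustrative sketch: normalize class labels so that spelling variants
# like 'america', 'America ' and 'AMERICA' collapse into one class.
def normalize_labels(labels):
    # strip stray whitespace and lowercase every label
    return [label.strip().lower() for label in labels]

raw = ["America", "america ", "AMERICA", "Canada"]
print(normalize_labels(raw))
```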
3. Managing unwanted outliers
Outliers can cause problems with certain types of models. For example, linear
regression models are less robust to outliers than decision tree models.
Generally, we should not remove outliers unless we have a legitimate reason to remove them. Sometimes removing them improves performance, sometimes not. So one must have a good reason to remove an outlier, such as suspicious measurements that are unlikely to be part of the real data.
4. Handling missing data
Missing data is a deceptively tricky issue in machine learning. We cannot just
ignore or remove the missing observation. They must be handled carefully as
they can be an indication of something important. The two most common
ways to deal with missing data are:
1. Dropping observations with missing values.
Dropping missing values is sub-optimal because when you drop observations, you drop information.
• The fact that the value was missing may be informative in itself.
• Plus, in the real world, you often need to make predictions on new data even if some of the features are missing!
2. Imputing the missing values from past observations.
Imputing missing values is sub-optimal because the value was originally missing but you filled it in, which always leads to a loss of information, no matter how sophisticated your imputation method is.
• Again, "missingness" is almost always informative in itself, and you should tell your algorithm if a value was missing.
• Even if you build a model to impute your values, you're not adding any real information. You're just reinforcing the patterns already provided by other features.
Both of these approaches are sub-optimal: dropping an observation means dropping information, thereby reducing the data, while imputing values fills in data that was not present in the actual dataset, which also leads to a loss of information.
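The two strategies, plus the "tell your algorithm a value was missing" advice, can be sketched together on a toy table (all names and values here are illustrative):

```python
# Illustrative sketch: dropping vs. imputing missing values, with a
# missing-value indicator flag recorded alongside the imputed value.
rows = [
    {"age": 25, "income": 40000},
    {"age": None, "income": 52000},
    {"age": 31, "income": None},
]

# Strategy 1: drop every observation with any missing value.
dropped = [r for r in rows if None not in r.values()]

# Strategy 2: impute missing ages with the attribute mean, and keep
# a flag so downstream algorithms know which values were missing.
observed = [r["age"] for r in rows if r["age"] is not None]
mean_age = sum(observed) / len(observed)
imputed = [
    {**r,
     "age": mean_age if r["age"] is None else r["age"],
     "age_was_missing": r["age"] is None}
    for r in rows
]

print(len(dropped), [r["age_was_missing"] for r in imputed])
```

Note how dropping shrinks three rows to one, while imputation keeps all rows but invents a value, which is exactly the trade-off described above.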

6. Classification of Data Mining Systems:


Data Mining is considered as an interdisciplinary field. It includes a set of various
disciplines such as statistics, database systems, machine learning, visualization and
information sciences. Classification of the data mining system helps users to
understand the system and match their requirements with such systems.
Data mining systems can be categorized according to various criteria, as follows:

1. Classification according to the application adapted:
This involves domain-specific applications. For example, data mining systems can be tailored for telecommunications, finance, stock markets, e-mail, and so on.
2. Classification according to the type of techniques utilized:
This considers the degree of user interaction or the technique of data analysis involved, for example machine learning, visualization, pattern recognition, neural networks, and database-oriented or data-warehouse-oriented techniques.
3. Classification according to the types of knowledge mined:
This is based on functionalities such as characterization, association,
discrimination and correlation, prediction etc.
4. Classification according to the types of databases mined:
A data mining system can be classified according to the database it mines, e.g., by the type of data, the data model used, or the application of the data.

7. Challenges of Data Mining:


Nowadays, data mining and knowledge discovery are evolving into a crucial technology for businesses and researchers in many domains. Although data mining is developing into an established and trusted discipline, many pending challenges still have to be solved. Some of these challenges are given below.

1. Security and Social Challenges:
Decision-making strategies rely on data collection and sharing, which requires considerable security. Private information about individuals and other sensitive information is collected for customer profiling and for understanding user behaviour patterns. Illegal access to information and the confidentiality of information are becoming important issues.
2. User Interface:
The knowledge discovered using data mining tools is useful only if it is interesting and, above all, understandable by the user. Good visualization eases the interpretation of mining results and helps users better understand their requirements. Much research is being carried out on good visualization for big data sets, to display and manipulate mined knowledge.
(i) Mining based on level of abstraction: the data mining process needs to be interactive, because this allows users to concentrate on finding patterns, and to present and refine data mining requests based on returned results.
(ii) Integration of background knowledge: previous knowledge may be used to direct the discovery process and to express the discovered patterns.
3. Mining Methodology Challenges:
These challenges are related to data mining approaches and their limitations. Different approaches may perform differently depending on the data under consideration: some algorithms require noise-free data, yet most real data sets contain exceptions and invalid or incomplete information, which complicates the analysis process and in some cases compromises the precision of the results.

4. Complex Data:
Real-world data is heterogeneous: it could be multimedia data containing images, audio, and video, complex data, temporal data, spatial data, time series, natural-language text, etc. It is difficult to handle these various kinds of data and extract the required information, and new tools and methodologies are being developed to do so.
(i) Complex data types: the database can include complex data elements, objects with graphical data, spatial data, and temporal data. It is not practical to mine all these kinds of data with a single system.
(ii) Mining from Varied Sources: The data is gathered from different sources
on Network. The data source may be of different kinds depending on how they
are stored such as structured, semi-structured or unstructured.
5. Performance:
The performance of a data mining system depends on the efficiency of the algorithms and techniques used. Poorly designed algorithms and techniques adversely affect the performance of the data mining process.
(i) Efficiency and Scalability of the Algorithms: The data mining algorithm
must be efficient and scalable to extract information from huge amounts of
data in the database.
(ii) Improvement of mining algorithms: factors such as the enormous size of databases, the wide distribution of data, and the complexity of data mining approaches inspire the creation of parallel and distributed data mining algorithms.
8. Architecture of Data Mining:
Data mining refers to the detection and extraction of new patterns from the already
collected data. Data mining is the amalgamation of the field of statistics and
computer science aiming to discover patterns in incredibly large datasets and then
transforming them into a comprehensible structure for later use.
Architecture of Data Mining:
Basic Working:
1. It all starts when the user submits certain data mining requests; these requests are then sent to the data mining engine for pattern evaluation.
2. These components try to find the answer to the query using the already present database.
3. The extracted metadata is then sent for proper analysis to the data mining engine, which sometimes interacts with the pattern evaluation modules to determine the result.
4. This result is then sent to the front end in an easily understandable manner
using a suitable interface.
The parts of the data mining architecture are described below:

1. Data Sources:
Databases, the WWW, and data warehouses are data sources. The data in these sources may be plain text, spreadsheets, or other forms of media like photos or videos. The WWW is one of the biggest sources of data.
2. Database Server:
The database server contains the actual data ready to be processed. It
performs the task of handling data retrieval as per the request of the user.
3. Data Mining Engine:
It is one of the core components of the data mining architecture that
performs all kinds of data mining techniques like association, classification,
characterization, clustering, prediction, etc.
4. Pattern Evaluation Modules:
They are responsible for finding interesting patterns in the data and
sometimes they also interact with the database servers for producing the
result of the user requests.
5. Graphical User Interface:
Since the user cannot fully understand the complexity of the data mining process, a graphical user interface helps the user communicate effectively with the data mining system.
6. Knowledge Base:
Knowledge Base is an important part of the data mining engine that is quite
beneficial in guiding the search for the result patterns. Data mining engine
may also sometimes get inputs from the knowledge base. This knowledge
base may contain data from user experiences. The objective of the
knowledge base is to make the result more accurate and reliable.
