5
Why Data Mining?
❑ The Explosive Growth of Data: from terabytes to petabytes
❑ Data collection and data availability
❑ Automated data collection tools, database systems, Web, computerized
society
❑ Major sources of abundant data
❑ Business: Web, e-commerce, transactions, stocks, …
❑ Science: Remote sensing, bioinformatics, scientific simulation, …
❑ Society and everyone: news, digital cameras, YouTube
❑ We are drowning in data, but starving for knowledge!
❑ “Necessity is the mother of invention”—Data mining—Automated
analysis of massive data sets
6
What Is Data Mining?
❑ Data mining (knowledge discovery from data)
❑ Extraction of interesting (non-trivial, implicit, previously unknown, and
potentially useful) patterns or knowledge from huge amounts of data
❑ Data mining: a misnomer?
❑ Alternative names
❑ Knowledge discovery (mining) in databases (KDD), knowledge extraction,
data/pattern analysis, data archeology, data dredging, information
harvesting, business intelligence, etc.
❑ Watch out: Is everything “data mining”?
❑ Simple search and query processing
❑ (Deductive) expert systems
7
Knowledge Discovery (KDD) Process
❑ This is a view from typical database systems and data warehousing communities
[Figure: the KDD process pipeline (Databases → Data Cleaning → Data Integration → Task-relevant Data → Data Mining → Pattern Evaluation)]
8
Example: A Web Mining Framework
❑ Web mining usually involves
❑ Data cleaning
❑ Data integration from multiple sources
❑ Warehousing the data
❑ Data cube construction
❑ Data selection for data mining
❑ Data mining
❑ Presentation of the mining results
❑ Patterns and knowledge to be used or stored in a knowledge base
9
Data Mining in Business Intelligence
[Figure: BI pyramid showing increasing potential to support business decisions, from data exploration (statistical summary, querying, and reporting) up to decision making by the end user]
11
Data Mining vs. Data Exploration
❑ Which view do you prefer?
❑ KDD vs. ML/Stat. vs. Business Intelligence
❑ Depending on the data, applications, and your focus
12
Multi-Dimensional View of Data Mining
❑ Data to be mined
❑ Database data (extended-relational, object-oriented, heterogeneous), data warehouse, transactional data, stream, spatiotemporal, time-series, sequence, text and web, multimedia, graphs & social and information networks
❑ Knowledge to be mined (or: Data mining functions)
❑ Characterization, discrimination, association, classification, clustering, trend/deviation,
outlier analysis, …
❑ Descriptive vs. predictive data mining
❑ Multiple/integrated functions and mining at multiple levels
❑ Techniques utilized
❑ Data-intensive, data warehouse (OLAP), machine learning, statistics, pattern recognition,
visualization, high-performance, etc.
❑ Applications adapted
❑ Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.
13
Data Mining: On What Kinds of Data?
❑ Database-oriented data sets and applications
❑ Relational database, data warehouse, transactional database
❑ Object-relational databases, Heterogeneous databases and legacy databases
❑ Advanced data sets and advanced applications
❑ Data streams and sensor data
❑ Time-series data, temporal data, sequence data (incl. bio-sequences)
❑ Structured data, graphs, social networks, and information networks
❑ Spatial data and spatiotemporal data
❑ Multimedia database
❑ Text databases
❑ The World-Wide Web
14
Data Mining Functions: (1) Generalization
❑ Information integration and data warehouse construction
❑ Data cleaning, transformation, integration, and
multidimensional data model
❑ Data cube technology
❑ Scalable methods for computing (i.e., materializing)
multidimensional aggregates
❑ OLAP (online analytical processing)
❑ Multidimensional concept description: Characterization
and discrimination
❑ Generalize, summarize, and contrast data
characteristics, e.g., dry vs. wet regions
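To make “materializing multidimensional aggregates” concrete, here is a minimal sketch (not from the slides) that computes every group-by of a tiny, made-up sales table using only the Python standard library, i.e., a full data cube with 2^3 cuboids; real OLAP engines do this at scale with far smarter algorithms.

```python
from collections import defaultdict
from itertools import combinations

# Toy fact table (made-up data): (region, year, product, sales amount)
rows = [
    ("East", 2023, "TV", 100), ("East", 2024, "PC", 150),
    ("West", 2023, "TV", 200), ("West", 2024, "TV", 120),
]
dims = ("region", "year", "product")

# Materialize every group-by (cuboid); "*" stands for "all values" (ALL)
cube = defaultdict(float)
for *vals, measure in rows:
    for k in range(len(dims) + 1):
        for kept in combinations(range(len(dims)), k):
            key = tuple(vals[i] if i in kept else "*" for i in range(len(dims)))
            cube[key] += measure

print(cube[("East", "*", "*")])  # total sales in the East: 250.0
print(cube[("*", "*", "*")])     # grand total: 570.0
```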
15
Data Mining Functions: (2) Pattern Discovery
❑ Frequent patterns (or frequent itemsets)
❑ What items are frequently purchased together in your Walmart?
❑ Association and Correlation Analysis
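As a minimal illustration (not from the slides) of what “frequent itemset” means, the sketch below brute-force counts the support of all 1- and 2-itemsets over the small transaction table shown later in these slides; the min_support threshold of 3 is an arbitrary choice, and real miners such as Apriori or FP-growth prune this search instead of enumerating everything.

```python
from collections import Counter
from itertools import combinations

# The five market baskets from the transaction-data example later in the deck
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]
min_support = 3  # an itemset must appear in at least 3 baskets

# Brute force: count every itemset of size 1 and 2 (enough for a demo)
counts = Counter()
for basket in transactions:
    for size in (1, 2):
        counts.update(combinations(sorted(basket), size))

frequent = {itemset: c for itemset, c in counts.items() if c >= min_support}
print(frequent)  # e.g., ('Diaper', 'Milk') is bought together in 3 baskets
```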
18
Data Mining Functions: (5) Outlier Analysis
❑ Outlier analysis
❑ Outlier: A data object that does not comply with the
general behavior of the data
❑ Noise or exception?―One person’s garbage could be
another person’s treasure
❑ Methods: by-product of clustering or regression analysis, …
❑ Useful in fraud detection, rare events analysis
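As one minimal sketch (not from the slides) of outlier analysis, the snippet below flags values that lie far from the mean of a made-up measurement series; the 2.5-standard-deviation cutoff is arbitrary, and the clustering- or regression-based methods mentioned above follow the same idea of flagging points that deviate from the general behavior of the data.

```python
# Flag values far from the mean of a made-up measurement series
values = [98, 102, 99, 101, 97, 103, 100, 250]  # 250 looks suspicious

mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

# Simple z-score rule; with tiny samples a single outlier inflates std,
# so the cutoff here is deliberately modest (2.5)
outliers = [v for v in values if abs(v - mean) / std > 2.5]
print(outliers)  # [250]
```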
19
Data Mining Functions: (6) Time and Ordering:
Sequential Pattern, Trend and Evolution Analysis
❑ Sequence, trend and evolution analysis
❑ Trend, time-series, and deviation analysis
❑ e.g., regression and value prediction
❑ Sequential pattern mining
❑ e.g., buy digital camera, then buy large memory cards
❑ Periodicity analysis
❑ Motifs and biological sequence analysis
❑ Approximate and consecutive motifs
❑ Similarity-based analysis
❑ Mining data streams
❑ Ordered, time-varying, potentially infinite, data streams
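A tiny sketch (not from the slides) of the sequential-pattern example above: it counts how many made-up customer purchase sequences contain “digital camera” followed later by “memory card”; sequential pattern miners generalize this to discovering such patterns automatically.

```python
def contains_in_order(sequence, pattern):
    """True if the items of `pattern` occur in `sequence` in that order."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)

# Made-up purchase histories, oldest purchase first
sequences = [
    ["phone", "digital camera", "tripod", "memory card"],
    ["digital camera", "memory card"],
    ["memory card", "digital camera"],   # wrong order: does not match
    ["laptop", "mouse"],
]
pattern = ["digital camera", "memory card"]

support = sum(contains_in_order(s, pattern) for s in sequences)
print(f"{support} of {len(sequences)} customers bought a camera, then a memory card")
```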
20
Data Mining Functions: (7) Structure and
Network Analysis
❑ Graph mining
❑ Finding frequent subgraphs (e.g., chemical compounds), trees (XML),
substructures (web fragments)
❑ Information network analysis
❑ Social networks: actors (objects, nodes) and relationships (edges)
❑ e.g., author networks in CS, terrorist networks
❑ Multiple heterogeneous networks
❑ A person could be in multiple information networks: friends, family, classmates, …
❑ Links carry a lot of semantic information: Link mining
❑ Web mining
❑ Web is a big information network: from PageRank to Google
❑ Analysis of Web information networks
❑ Web community discovery, opinion mining, usage mining, …
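Since the slide mentions “from PageRank to Google”, here is a minimal power-iteration sketch (not from the slides) of PageRank on a tiny made-up link graph; the damping factor 0.85 is the conventional choice, and dangling pages and convergence checks are ignored to keep it short.

```python
# Tiny made-up web graph: page -> pages it links to
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
pages = list(links)
d = 0.85                                   # damping factor
rank = {p: 1 / len(pages) for p in pages}  # start from a uniform distribution

for _ in range(50):                        # a fixed number of power iterations
    new = {p: (1 - d) / len(pages) for p in pages}
    for p, outlinks in links.items():
        for q in outlinks:
            new[q] += d * rank[p] / len(outlinks)
    rank = new

print({p: round(rank[p], 3) for p in sorted(rank)})  # C ranks highest
```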
21
Evaluation of Knowledge
❑ Is all mined knowledge interesting?
❑ One can mine a tremendous number of “patterns”
❑ Some may fit only certain dimension space (time, location, …)
❑ Some may not be representative, may be transient, …
❑ Evaluation of mined knowledge → directly mine only interesting knowledge?
❑ Descriptive vs. predictive
❑ Coverage
❑ Typicality vs. novelty
❑ Accuracy
❑ Timeliness
❑ …
22
Data Mining: Confluence of Multiple Disciplines
[Figure: data mining at the confluence of machine learning, pattern recognition, statistics, database technology, algorithms, and high-performance computing]
23
Why Confluence of Multiple Disciplines?
❑ Tremendous amount of data
❑ Algorithms must be scalable to handle big data
❑ High-dimensionality of data
❑ Microarray data may have tens of thousands of dimensions
❑ High complexity of data
❑ Data streams and sensor data
❑ Time-series data, temporal data, sequence data
❑ Structured data, graphs, social and information networks
❑ Spatial, spatiotemporal, multimedia, text and Web data
❑ Software programs, scientific simulations
❑ New and sophisticated applications
24
Applications of Data Mining
❑ Web page analysis: classification, clustering, ranking
❑ Collaborative analysis & recommender systems
❑ Basket data analysis to targeted marketing
❑ Biological and medical data analysis
❑ Data mining and software engineering
❑ Data mining and text analysis
❑ Data mining and social and information network analysis
❑ Built-in (invisible data mining) functions in Google, MS, Yahoo!, LinkedIn, Facebook, …
❑ Major dedicated data mining systems/tools
❑ SAS, MS SQL-Server Analysis Manager, Oracle Data Mining tools
25
Major Issues in Data Mining (1)
❑ Mining Methodology
❑ Mining various and new kinds of knowledge
❑ Mining knowledge in multi-dimensional space
❑ Data mining: An interdisciplinary effort
❑ Boosting the power of discovery in a networked environment
❑ Handling noise, uncertainty, and incompleteness of data
❑ Pattern evaluation and pattern- or constraint-guided mining
❑ User Interaction
❑ Interactive mining
❑ Incorporation of background knowledge
❑ Presentation and visualization of data mining results
26
Major Issues in Data Mining (2)
❑ Efficiency and Scalability
❑ Efficiency and scalability of data mining algorithms
❑ Parallel, distributed, stream, and incremental mining methods
❑ Diversity of data types
❑ Handling complex types of data
❑ Mining dynamic, networked, and global data repositories
❑ Data mining and society
❑ Social impacts of data mining
❑ Privacy-preserving data mining
❑ Invisible data mining
27
Types of Data Sets: (1) Record Data
❑ Relational records
❑ Relational tables, highly structured
❑ Data matrix, e.g., numerical matrix, crosstabs
    Document-term matrix example (term counts per document):
                  team  coach  play  ball  score  game  win  lost  timeout  season
    Document 1      3      0     5     0      2     6    0     2        0       2
    Document 2      0      7     0     2      1     0    0     3        0       0
    Document 3      0      1     0     0      1     2    2     0        3       0
❑ Transaction data
    TID   Items
    1     Bread, Coke, Milk
    2     Beer, Bread
    3     Beer, Coke, Diaper, Milk
    4     Beer, Bread, Diaper, Milk
    5     Coke, Diaper, Milk
❑ Molecular Structures
❑ Image data
❑ Video data
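As a small sketch (not part of the slides), a document-term matrix like the one above can be built directly from raw text by counting terms per document; the two sentences below are made up, and no stop-word removal or stemming is done.

```python
from collections import Counter

# Two made-up documents
docs = [
    "the team won the game after a timeout",
    "the coach lost the ball early in the game",
]

# One term-frequency Counter per document (a sparse row of the matrix)
sparse_rows = [Counter(doc.split()) for doc in docs]
vocab = sorted(set().union(*sparse_rows))  # the matrix columns

# Dense view: one row of counts per document, in vocabulary order
matrix = [[row.get(term, 0) for term in vocab] for row in sparse_rows]
print(vocab)
print(matrix)
```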
31
Important Characteristics of Structured Data
❑ Dimensionality
❑ Curse of dimensionality
❑ Sparsity
❑ Only presence counts
❑ Resolution
❑ Patterns depend on the scale
❑ Distribution
❑ Centrality and dispersion
32
Data Objects
❑ Data sets are made up of data objects
❑ A data object represents an entity
❑ Examples:
❑ sales database: customers, store items, sales
❑ medical database: patients, treatments
❑ university database: students, professors, courses
❑ Also called samples, examples, instances, data points, objects, tuples
❑ Data objects are described by attributes
❑ Database rows → data objects; columns → attributes
33
Attributes
❑ Attribute (or dimensions, features, variables)
❑ A data field, representing a characteristic or feature of a data object.
❑ E.g., customer_ID, name, address
❑ Types:
❑ Nominal (e.g., red, blue)
❑ Binary (e.g., {true, false})
❑ Ordinal (e.g., {freshman, sophomore, junior, senior})
❑ Numeric: quantitative
❑ Interval-scaled: 100°C is interval-scaled
❑ Ratio-scaled: 100°K is ratio-scaled since it is twice as high as 50°K
❑ Q1: Is student ID a nominal, ordinal, or interval-scaled attribute?
❑ Q2: What about eye color? Or color in the color spectrum of physics?
34
Attribute Types
❑ Nominal: categories, states, or “names of things”
❑ Hair_color = {auburn, black, blond, brown, grey, red, white}
❑ marital status, occupation, ID numbers, zip codes
❑ Binary
❑ Nominal attribute with only 2 states (0 and 1)
❑ Symmetric binary: both outcomes equally important
❑ e.g., gender
❑ Asymmetric binary: outcomes not equally important.
❑ e.g., medical test (positive vs. negative)
❑ Convention: assign 1 to most important outcome (e.g., HIV positive)
❑ Ordinal
❑ Values have a meaningful order (ranking) but magnitude between successive
values is not known
❑ Size = {small, medium, large}, grades, army rankings
35
Numeric Attribute Types
❑ Quantity (integer or real-valued)
❑ Interval-scaled
❑ Values have order and equal-sized units, but no inherent zero-point (e.g., temperature in °C or °F, calendar dates)
❑ Ratio-scaled
❑ Inherent zero-point
❑ We can speak of values as being an order of magnitude larger than the unit of measurement (10 K˚ is twice as high as 5 K˚)
❑ e.g., temperature in Kelvin, length, counts, monetary quantities
36
Discrete vs. Continuous Attributes
❑ Discrete Attribute
❑ Has only a finite or countably infinite set of values
❑ E.g., zip codes, profession, or the set of words in a collection of documents
❑ Sometimes, represented as integer variables
❑ Note: Binary attributes are a special case of discrete attributes
❑ Continuous Attribute
❑ Has real numbers as attribute values
❑ E.g., temperature, height, or weight
❑ Practically, real values can only be measured and represented using a finite
number of digits
❑ Continuous attributes are typically represented as floating-point variables
37
Visualizing Complex Data and Relations: Social Networks
❑ Visualizing non-numerical data: social and information networks
[Figures: a social network; organizing information networks]
38
What is Data Preprocessing? — Major Tasks
❑ Data cleaning
❑ Handle missing data, smooth noisy data, identify or remove outliers, and
resolve inconsistencies
❑ Data integration
❑ Integration of multiple databases, data cubes, or files
❑ Data reduction
❑ Dimensionality reduction
❑ Numerosity reduction
❑ Data compression
❑ Data transformation and data discretization
❑ Normalization
❑ Concept hierarchy generation
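As a quick preview of the transformation step listed above, here is a minimal sketch (not from the slides) of min-max normalization of a numeric attribute to [0, 1]; the income values are made up, and other schemes (z-score, decimal scaling) follow the same pattern.

```python
# Min-max normalization of a numeric attribute to the range [0, 1]
incomes = [12_000, 35_000, 54_000, 73_000, 98_000]  # made-up values

lo, hi = min(incomes), max(incomes)
normalized = [(v - lo) / (hi - lo) for v in incomes]
print([round(v, 3) for v in normalized])  # 0.0 ... 1.0
```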
39
Why Preprocess the Data? — Data Quality Issues
❑ Measures for data quality: A multidimensional view
❑ Accuracy: correct or wrong, accurate or not
❑ Completeness: not recorded, unavailable, …
❑ Consistency: some modified but some not, dangling, …
❑ Timeliness: timely update?
❑ Believability: how much are the data trusted to be correct?
❑ Interpretability: how easily the data can be understood?
40
Data Cleaning
❑ Data in the Real World Is Dirty: lots of potentially incorrect data, e.g., faulty
instruments, human or computer error, and transmission errors
❑ Incomplete: lacking attribute values, lacking certain attributes of interest, or containing
only aggregate data
❑ e.g., Occupation = “ ” (missing data)
❑ Noisy: containing noise, errors, or outliers
❑ e.g., Salary = “−10” (an error)
❑ Inconsistent: containing discrepancies in codes or names, e.g.,
❑ Age = “42”, Birthday = “03/07/2010”
❑ Was rating “1, 2, 3”, now rating “A, B, C”
❑ discrepancy between duplicate records
❑ Intentional (e.g., disguised missing data)
❑ Jan. 1 as everyone’s birthday?
41
Incomplete (Missing) Data
❑ Data is not always available
❑ E.g., many tuples have no recorded value for several attributes, such as
customer income in sales data
❑ Missing data may be due to
❑ Equipment malfunction
❑ Inconsistent with other recorded data and thus deleted
❑ Data were not entered due to misunderstanding
❑ Certain data may not be considered important at the time of entry
❑ Did not register history or changes of the data
❑ Missing data may need to be inferred
42
How to Handle Missing Data?
❑ Ignore the tuple: usually done when class label is missing (when doing
classification)—not effective when the % of missing values per attribute varies
considerably
❑ Fill in the missing value manually: tedious + infeasible?
❑ Fill it in automatically with
❑ a global constant: e.g., “unknown”, a new class?!
❑ the attribute mean
❑ the attribute mean for all samples belonging to the same class: smarter
❑ the most probable value: inference-based such as Bayesian formula or decision
tree
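A minimal sketch (not from the slides) of the automatic fill-in options above, comparing the overall attribute mean with the “smarter” class-conditional mean; the records and class labels are made up.

```python
# Records of (income, class label); None marks a missing income (made-up data)
records = [(30_000, "low"), (None, "low"), (50_000, "high"),
           (90_000, "high"), (None, "high")]

known = [x for x, _ in records if x is not None]
overall_mean = sum(known) / len(known)

def class_mean(label):
    vals = [x for x, c in records if c == label and x is not None]
    return sum(vals) / len(vals)

# Option 1: fill with the attribute mean for all samples
filled_overall = [(x if x is not None else overall_mean, c) for x, c in records]
# Option 2 (smarter): fill with the mean of samples in the same class
filled_by_class = [(x if x is not None else class_mean(c), c) for x, c in records]

print(filled_overall)
print(filled_by_class)
```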
43
Noisy Data
❑ Noise: random error or variance in a measured variable
❑ Incorrect attribute values may be due to
❑ Faulty data collection instruments
❑ Data entry problems
❑ Data transmission problems
❑ Technology limitation
❑ Inconsistency in naming conventions
❑ Other data problems
❑ Duplicate records
❑ Incomplete data
❑ Inconsistent data
44
How to Handle Noisy Data?
❑ Binning
❑ First sort data and partition into (equal-frequency) bins
❑ Then one can smooth by bin means, smooth by bin median, smooth by bin
boundaries, etc.
❑ Regression
❑ Smooth by fitting the data into regression functions
❑ Clustering
❑ Detect and remove outliers
❑ Semi-supervised: Combined computer and human inspection
❑ Detect suspicious values and check by human (e.g., deal with possible outliers)
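A minimal sketch (not from the slides) of the binning approach above: sort the values, split them into equal-frequency bins, then smooth each bin by its mean; the prices are made up, and smoothing by medians or bin boundaries, as noted above, works the same way.

```python
# Equal-frequency binning, then smoothing by bin means (made-up prices)
prices = sorted([8, 24, 4, 21, 28, 15, 21, 34, 25])
n_bins = 3
size = len(prices) // n_bins                 # 3 values per bin

bins = [prices[i * size:(i + 1) * size] for i in range(n_bins)]
smoothed = []
for b in bins:
    mean = sum(b) / len(b)
    smoothed.extend([round(mean, 1)] * len(b))

print(bins)      # [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
print(smoothed)  # [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```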
45
Data Cleaning as a Process
❑ Data discrepancy detection
❑ Use metadata (e.g., domain, range, dependency, distribution)
❑ Check field overloading
❑ Check uniqueness rule, consecutive rule and null rule
❑ Use commercial tools
❑ Data scrubbing: use simple domain knowledge (e.g., postal code, spell-check) to
detect errors and make corrections
❑ Data auditing: by analyzing data to discover rules and relationship to detect violators
(e.g., correlation and clustering to find outliers)
❑ Data migration and integration
❑ Data migration tools: allow transformations to be specified
❑ ETL (Extraction/Transformation/Loading) tools: allow users to specify transformations
through a graphical user interface
❑ Integration of the two processes
❑ Iterative and interactive (e.g., Potter's Wheel)
46
END OF UNIT - I
47