CH 1

Uploaded by

afifrafsan111

Lecture Outline

• What Is Data Mining?

• Why Data Mining?

• Data Mining: On what kind of data?

• Data Mining Functionality

• Are all the patterns interesting?

• Classification and data mining systems

• Major Issues in Data Mining

1
What Is Data Mining
• Data mining (knowledge discovery from data)
• Refers to extracting or mining knowledge from large amounts of data
• Extraction of interesting (non-trivial, implicit, previously unknown and
potentially useful) patterns or knowledge from huge amounts of data

• Alternative names
• Knowledge discovery (mining) in databases (KDD), knowledge extraction,
data/pattern analysis, data archeology, data dredging, information
harvesting, business intelligence, etc.
• What is not Data Mining?
• Simple search and query processing
• (Deductive) expert systems

2
Why Data Mining

3
Motivation
• Data rich but information poor!
• Data explosion problem
• Automated data collection tools and mature database
technology lead to tremendous amounts of data stored
in databases, data warehouses and other information
repositories
• Solution: Data Mining
• Extraction of interesting knowledge (rules, regularities, patterns,
constraints) from data in large databases

4
Evolution of Database
Technology
• 1960s:
• Data collection, database creation, IMS and network DBMS
• 1970s:
• Relational data model, relational DBMS implementation
• 1980s:
• RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
• Application-oriented DBMS (spatial, scientific, engineering, etc.)
• 1990s:
• Data mining, data warehousing, multimedia databases, and Web databases
• 2000s:
• Stream data management and mining
• Data mining and its applications
• Web technology (XML, data integration) and global information systems
5
WHY DATA MINING?-
POTENTIAL APPLICATIONS
• Database analysis and decision support
• Market analysis and management
• target marketing, market basket analysis,...
• Risk analysis and management
• Forecasting, quality control, competitive analysis,....
• Fraud detection and management
• Other Applications
• Text mining (newsgroups, email, documents) and Web
analysis
• Spatial data (e.g., map data) mining
• Intelligent query answering
6
MARKET ANALYSIS AND MANAGEMENT

• Data sources for analysis?

• Credit card transactions, discount coupons, customer
complaint calls, etc.
• Target marketing
• Find clusters of "model" customers who share the same
characteristics: interest, income level, spending habits, etc.
• Customer purchasing patterns over time
• Conversion of a single bank account to a joint account after
marriage
• Cross-market analysis
• Associations between product sales and prediction based
on associations
7
CUSTOMER BUYING/PURCHASING PATTERN
• Buying patterns refer to the why and how behind consumer
purchase decisions. They are habits and routines that consumers
establish through the products and services they buy.
• Buying patterns are defined by the frequency, timing, quantity,
etc. of said purchases.
• These patterns are determined by factors such as:
• Where someone lives, where they work
• How much money they make
• What they enjoy and prefer
• What their friends and family recommend
• Their goals and motivations
• The price of the product or service they're interested in
• Any product displays
• The necessity of the product or service
• Festivals, holidays, ceremonies/rituals, or celebrations
8
MARKET ANALYSIS AND MANAGEMENT
• Customer Profiling
• Which customers buy which products?
• Customer Requirements
• Best products for different customers
• Summary information
• Multidimensional summary reports

9
RISK ANALYSIS AND MANAGEMENT
• Finance planning and asset evaluation
• cash flow analysis and prediction
• cross-sectional and time series analysis (see slide-12)
• Resource planning
• summarize and compare the resources and spending
• Competition
• monitor competitors and market directions
• group customers into classes and set up a class-based pricing
procedure
• set pricing strategy in a highly competitive market

10
CROSS-SECTIONAL
ANALYSIS
• A type of analysis that an investor, analyst, or
portfolio manager may conduct on a company in
relation to its industry or industry peers.
• The analysis compares one company against the
industry it operates within, or directly against
certain competitors within the same industry, in an
attempt to identify the best business
methods.
• Time series analysis comprises methods for
analyzing time series data in order to extract
meaningful statistics and other characteristics of
the data.
11
FRAUD DETECTION AND MANAGEMENT

• Applications
• Health care, retail, credit card services, telecommunications etc.
• Approach
• Use historical data to build models of normal and fraudulent
behavior and use data mining to help identify fraudulent
instances
• Examples
• auto insurance: detect groups who stage accidents to collect
insurance
• money laundering: detect suspicious money transactions
• medical insurance: detect professional patients, rings of
doctors, and inappropriate medical treatment
• telephone fraud: build a telephone call model (destination of
the call, duration, time of day/week) and analyze patterns that
deviate from the expected norm
12
DISCOVERY OF MEDICAL/BIOLOGICAL KNOWLEDGE

• Discovery of structure-function associations

• Structure of proteins and their function
• Human Brain Mapping (lesion-deficit, task-activation
associations)
• Breast structure and pathology
• Cell structure (cytoskeleton) and functionality or
pathology
• Discovery of causal relationships
• Symptoms and medical conditions
• DNA sequence analysis
• Bioinformatics (microarrays, etc)
13
OTHER APPLICATIONS
• Sports
• IBM Advanced Scout analyzed basketball game statistics
(shots blocked, assists, and fouls) to gain a competitive
advantage for the New York Knicks and Miami Heat
• Astronomy
• Jet Propulsion Laboratory (JPL) and the Palomar
Observatory discovered 22 quasars (extremely luminous
active galactic nuclei) with the help of data mining
• Internet Web Surf-Aid
• IBM Surf-Aid applies data mining algorithms to Web
access logs for market-related pages to discover customer
preferences and behavior, analyze the effectiveness of
Web marketing, improve Web site organization, etc.
14
DATA MINING: A KDD
PROCESS
• Data mining: the core of the
knowledge discovery process.

15
KDD Steps

Knowledge discovery from data consists of an iterative
sequence of the following steps:
1. Data cleaning (to remove noise and inconsistent
data)
2. Data integration (where multiple data sources may
be combined)
3. Data selection (where data relevant to the analysis
task are retrieved from the database)
4. Data transformation (where data are transformed
or consolidated into forms appropriate for mining
by performing summary or aggregation operations,
for instance)
5. Data mining (an essential process where
intelligent methods are applied in order to extract
data patterns)
6. Pattern evaluation (to identify the truly
interesting patterns representing knowledge
based on some interestingness measures; Section
1.5)
7. Knowledge presentation (where visualization and
knowledge representation techniques are used to
present the mined knowledge to the user)
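
The steps listed above can be sketched as a chain of small functions. This is only an illustrative toy, not a real KDD system: the record layout, the two "stores", and every function name are assumptions made up for the example, and the "mining" step is just a count of how many baskets contain each item.

```python
from collections import Counter

# Hypothetical raw records from two sources (all names and values are illustrative).
store_a = [
    {"customer": "c1", "item": "milk", "amount": 2.0},
    {"customer": "c2", "item": "bread", "amount": None},  # noisy: missing amount
    {"customer": "c1", "item": "bread", "amount": 1.5},
]
store_b = [
    {"customer": "c3", "item": "milk", "amount": 2.5},
    {"customer": "c3", "item": "bread", "amount": 1.0},
]

def clean(records):
    """Data cleaning: drop records that have a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: aggregate items per customer into a basket."""
    baskets = {}
    for r in records:
        baskets.setdefault(r["customer"], set()).add(r["item"])
    return baskets

def mine(baskets):
    """Data mining: count how many baskets contain each item."""
    return Counter(item for items in baskets.values() for item in items)

def evaluate(counts, min_count):
    """Pattern evaluation: keep only patterns that meet a minimum count."""
    return {item: n for item, n in counts.items() if n >= min_count}

def present(patterns):
    """Knowledge presentation: format the surviving patterns for the user."""
    return [f"{item}: {n} basket(s)" for item, n in sorted(patterns.items())]

integrated = integrate(clean(store_a), clean(store_b))
baskets = transform(select(integrated, ["customer", "item"]))
report = present(evaluate(mine(baskets), min_count=2))
```

In practice the sequence is iterative: a disappointing `report` would typically send the analyst back to an earlier step with different selections or thresholds.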

17
MULTIDIMENSIONAL
ANALYSIS
• In statistics, econometrics, and related fields,
multidimensional analysis is
• a data analysis process that groups data into two
categories: data dimensions and measurements.
• For example:
• A data set consisting of the number of football
teams in each department over several years is a
two-dimensional data set (dimensions: department and year).

18
DBMS
• A database system, also called a database
management system (DBMS), consists of a collection
of interrelated data, known as a database, and a set
of software programs to manage and access the data.

19
ARCHITECTURE OF A
TYPICAL DATA MINING
SYSTEM

20
Data Mining System
Components
• Database, Data Warehouse, WWW and Other
repository
• All of these act as data sources for data mining
• Data cleaning and integration techniques may be
applied.
• Database or data warehouse server
• Responsible for fetching the data the user requests
• Knowledge base
• Used to guide the search or evaluate the interestingness
of the resulting patterns

21
Data Mining System
Components
• Data mining engine
• Consists of a set of modules which are responsible for characterization,
association and correlation analysis, classification, prediction, cluster
analysis, outlier analysis and evolution analysis.
• Pattern evaluation module
• typically employs interestingness measures and interacts with the data
mining modules so as to focus the search toward interesting patterns.
• User Interface
• communicates between users and the data mining system,
• allows the user to interact with the system by specifying a data mining
query or task,
• provides information to help focus the search, and
• performs exploratory (=discover more) data mining based on the
intermediate data mining results

22
DATA MINING: ON WHAT
KIND OF DATA
• Relational databases
• Data warehouses
• Transactional databases
• Advanced DB and information repositories
• Object-oriented (OO) and object-relational (OR) databases
• Spatial databases (medical, satellite image DBS, GIS)
• Temporal databases
• Text databases
• Multimedia databases (Image, Video, etc)
• Heterogeneous and legacy databases
• WWW
23
RELATIONAL DATABASE
• A relational database is a collection of tables, each of
which is assigned a unique name.
• Each table consists of a set of attributes (columns or
fields) and usually stores a large set of tuples (records
or rows).
• Each tuple in a relational table represents an object
identified by a unique key and described by a set of
attribute values.
• A relational database can be described by an E-R (entity-relationship)
data model, which represents the database as a set of entities and their
relationships.
24
DATA WAREHOUSE
• a repository of information collected from multiple
sources, stored under a unified schema, and that
usually resides at a single site.
• constructed via a process of data cleaning, data
integration, data transformation, data loading, and
periodic data refreshing.

25
DATA CUBE
• Provides a multidimensional view of data and allows the
pre-computation and fast accessing of summarized data.
• Example: the cube has three dimensions:
• address (with city values Chicago, New York, Toronto,
Vancouver),
• time (with quarter values Q1, Q2, Q3, Q4), and
• item (with item type values home entertainment, computer,
phone, security).
• The aggregate value stored in each cell of the cube is
the sales amount (in thousands). For example, the total sales for the
first quarter, Q1, for items relating to security systems in
Vancouver is $400,000, as stored in cell (Vancouver, Q1,
security).
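
A minimal sketch of the cube idea, assuming the cube is held as a plain dictionary keyed by (city, quarter, item type). Only the Vancouver/Q1/security value comes from the slide; the other cells and the helper name `roll_up` are made up for illustration.

```python
from collections import defaultdict

# Cells keyed by (city, quarter, item type); values are sales in thousands.
# Only the first cell is taken from the slide; the others are invented.
cube = {
    ("Vancouver", "Q1", "security"): 400,
    ("Vancouver", "Q1", "computer"): 825,
    ("Vancouver", "Q2", "security"): 512,
    ("Toronto", "Q1", "security"): 350,
}

def roll_up(cube, keep):
    """Sum the cube over every dimension whose index is not listed in `keep`."""
    totals = defaultdict(int)
    for key, sales in cube.items():
        totals[tuple(key[i] for i in keep)] += sales
    return dict(totals)

by_city = roll_up(cube, keep=(0,))             # summed over quarter and item
by_city_quarter = roll_up(cube, keep=(0, 1))   # summed over item only
```

A real data cube would precompute and store such summaries so that they can be read back quickly rather than recomputed on each query.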
26
27
TRANSACTIONAL
DATABASES
• A transactional database consists of a file where
each record represents a transaction.
• A transaction typically includes a unique
transaction identity number (trans ID) and a list of
the items making up the transaction (such as items
purchased in a store).

28
OBJECT RELATIONAL DATA
MODEL
• Inherits the essential concepts of object-oriented
database where each entity is considered as an
object.
• For example, for the XYZ Electronics company, objects can be
individual employees, customers, or product items.

29
OBJECT RELATIONAL
DATABASE
• Constructed based on the object-relational data model.
• Here the data and code relating to an object are encapsulated
into a single unit.
• Each object has the following features:
• A set of variables that describe the object. These correspond to
the attributes of the entity.
• A set of messages that the object can use to communicate with
other objects or the rest of the database system.
• A set of methods, where each method holds the code to
implement a message. Upon receiving a message, the method
returns a value in response.
• For example, the method for get_national_id(employee) will retrieve
and return the national ID of the given employee object.
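
The variables/messages/methods split can be illustrated with an ordinary Python class. This is a sketch of the encapsulation idea only, not of how an object-relational DBMS stores objects; the class, the `send` helper, and the sample values are assumptions.

```python
class Employee:
    """An object that bundles its data (variables) with its code (methods)."""

    def __init__(self, name, national_id):
        # Variables: these correspond to the attributes of the entity.
        self.name = name
        self._national_id = national_id

    def get_national_id(self):
        # Method: the code that implements the "get_national_id" message.
        return self._national_id


def send(obj, message):
    """Deliver a message to an object by calling the method of the same name."""
    return getattr(obj, message)()


employee = Employee("A. Rahman", "1234567890")  # illustrative values
national_id = send(employee, "get_national_id")
```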

30
TEMPORAL DATABASE
• Typically stores data that include time-related
attributes.
• These attributes may involve several
timestamps, each having different semantics.

31
SEQUENCE DATABASE
• Stores a sequence of ordered events, with or
without a concrete notion of time.
• Examples: customer shopping sequences, Web click
streams, and biological sequences

32
TIME SERIES DATABASE
• Stores sequences of values or events obtained from
repeated measurements over time (e.g., hourly, daily,
weekly)
• Examples: data collected from a stock exchange,
inventory control, and the observation of natural
phenomena such as temperature and wind.

33
SPATIAL DATABASE
• Contains spatial-related information.
• Examples: geographic map databases, VLSI or
CAD databases, and medical and satellite image databases

34
SPATIOTEMPORAL
DATABASE
• Stores spatial objects that change with time
• Interesting information can be mined from a
spatiotemporal database.
• Example: a database of moving objects, such as a
moving car with a GPS device attached.

35
TEXT DATABASE
• Text databases are databases that contain word descriptions for
objects.
• These word descriptions are usually not simple keywords but
rather long sentences or paragraphs, such as product
specifications, error or bug reports, warning messages, summary
reports, notes, or other documents. Text databases may be highly
unstructured (such as some Web pages on the World Wide Web).
• Some text databases may be somewhat structured, that is, semi-
structured (such as e-mail messages and many HTML/XML Web
pages), whereas others are relatively well structured (such as
library catalogue databases).
• Text databases with highly regular structures typically can be
implemented using relational database systems.
36
MULTIMEDIA DATABASE
• Multimedia databases store image, audio, and video data.
• They are used in applications such as picture content-based
retrieval, voice-mail systems, video-on-demand systems, the
World Wide Web, and speech-based user interfaces that
recognize spoken commands.
• Multimedia databases must support large objects, because
data objects such as video can require gigabytes of storage.
• Specialized storage and search techniques are also required.
• Because video and audio data require real-time retrieval at a
steady and predetermined rate in order to avoid picture or
sound gaps and system buffer overflows, such data are
referred to as continuous-media data.
37
HETEROGENEOUS
DATABASE
• A heterogeneous database consists of a set of
interconnected, autonomous component
databases.
• The components communicate in order to
exchange information and answer queries.
• Objects in one component database may differ
greatly from objects in other component databases,
making it difficult to incorporate their semantics
into the overall heterogeneous database.

38
LEGACY DATABASE
• A legacy database is a group of heterogeneous
databases
• It combines different kinds of data systems, such as
relational or object-oriented databases, hierarchical
databases, network databases, spreadsheets,
multimedia databases, or file systems.

39
DATA MINING TASK
(CLASSIFICATION)
• Descriptive
• Here mining tasks characterize the general properties of
the data in the database.
• Predictive
• Here mining tasks perform inference on
the current data in order to make predictions.

40
CLASS AND CONCEPTS OF
DATA
• Data can be associated with classes or concepts.
• For example, in the AllElectronics store:
• classes of items for sale include computers and printers, and
• concepts of customers include bigSpenders and budgetSpenders.
• Descriptions of individual classes and concepts in
summarized, concise, and yet precise terms are
called class/concept descriptions.

41
HOW IS A CLASS/CONCEPT
DESCRIPTION DERIVED?
A class/concept description can be derived via:
1) data characterization, by summarizing the data of
the class under study (often called the target
class) in general terms, or
2) data discrimination, by comparison of the target
class with one or a set of comparative classes
(often called the contrasting classes), or
3) both data characterization and discrimination.

42
DATA CHARACTERIZATION
• Data characterization is a summarization of the
general characteristics or features of a target class
of data.
• The data corresponding to the user-specified class
are typically collected by a database query.

43
EXAMPLE OF DATA
CHARACTERIZATION (CONT.)
• A data mining system should be able to produce a
description summarizing the characteristics of
customers who spend more than $1,000 a year at
XYZ Electronics.
• The result could be a general profile of the
customers, such as they are 40-50 years old,
employed, and have excellent credit ratings.
• The system should allow users to drill down on any
dimension, such as on occupation in order to view
these customers according to their type of
employment.
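
As a rough sketch of this example, the target class can be collected with a filter and then summarized and drilled down with simple counts. The customer records, field names, and the `age_band` helper below are all hypothetical.

```python
from collections import Counter

# Hypothetical customer records; "spend" is yearly spending in dollars.
customers = [
    {"age": 45, "occupation": "engineer", "credit": "excellent", "spend": 1800},
    {"age": 42, "occupation": "teacher", "credit": "excellent", "spend": 1200},
    {"age": 23, "occupation": "student", "credit": "fair", "spend": 300},
    {"age": 48, "occupation": "engineer", "credit": "good", "spend": 2500},
]

# Target class: collected with a query-like filter.
target = [c for c in customers if c["spend"] > 1000]

def age_band(age):
    """Generalize an exact age to a ten-year band such as '40-49'."""
    low = age // 10 * 10
    return f"{low}-{low + 9}"

# General profile of the target class.
profile = {
    "age_band": Counter(age_band(c["age"]) for c in target),
    "credit": Counter(c["credit"] for c in target),
}

# Drill down on the occupation dimension.
by_occupation = Counter(c["occupation"] for c in target)
```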
44
DATA DISCRIMINATION
• Data discrimination is a comparison of the general
features of target class data objects with the general
features of objects from one or a set of contrasting
classes.
• The target and contrasting classes can be specified by
the user, and the corresponding data objects retrieved
through database queries.
For example, the user may like to compare the
general features of software products whose sales
increased by 10% in the last year with those whose
sales decreased by at least 30% during the same period.
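
A small sketch of the comparison described above, assuming each product is a (name, sales change in percent, price) tuple. The data and the choice of average price as the compared feature are illustrative assumptions.

```python
# Hypothetical products: (name, sales change in %, price in dollars).
products = [
    ("A", 15, 40),
    ("B", 12, 50),
    ("C", -35, 120),
    ("D", -40, 100),
    ("E", 2, 70),
]

# Target class: sales increased by at least 10%.
target = [p for p in products if p[1] >= 10]
# Contrasting class: sales decreased by at least 30%.
contrast = [p for p in products if p[1] <= -30]

def average_price(group):
    """Mean price of the products in a group."""
    return sum(price for _, _, price in group) / len(group)

comparison = {
    "increased": average_price(target),
    "decreased": average_price(contrast),
}
```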
45
STRUCTURED VS.
UNSTRUCTURED DATA
• Structured Data
• Databases
• XML data
• Data warehouses
• Enterprise systems (CRM, ERP, etc)
• Unstructured Data
• Excel spreadsheets
• Word documents
• Email messages
• RSS feeds
• Audio files
• Video files
46
MINING FREQUENT
PATTERNS
• Frequent patterns are patterns that occur frequently in data.
• Examples: itemsets, subsequences, and substructures.
• A frequent itemset typically refers to a set of items that frequently
appear together in a transactional data set, such as milk and bread.
• A frequently occurring subsequence, such as the pattern that
customers tend to purchase first a PC, followed by a digital camera,
and then a memory card, is a (frequent) sequential pattern.
• A substructure can refer to different structural forms, such as
graphs, trees, or lattices, which may be combined with itemsets or
subsequences.
• Mining frequent patterns leads to the discovery of interesting
associations and correlations within data.
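
To make the frequent-itemset idea concrete, here is a brute-force sketch that counts item pairs across transactions. It is not Apriori or any other optimized algorithm, and the transactions and the `frequent_pairs` name are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

def frequent_pairs(transactions, min_count):
    """Return item pairs that co-occur in at least `min_count` transactions."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_count}

pairs = frequent_pairs(transactions, min_count=2)
```

Enumerating every pair is fine for a handful of transactions but grows quickly with basket size, which is why dedicated frequent-pattern algorithms exist.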

47
ASSOCIATION RULE
• Association rules are if/then statements that help uncover
relationships between seemingly unrelated data in a relational
database or other information repository. For example: "If a
customer buys a dozen eggs, he is 80% likely to also purchase milk."
• An association rule has two parts: an antecedent (if) and a
consequent (then).
• An antecedent is an item found in the data.
• A consequent is an item that is found in combination with the antecedent.
• Association rules are created by analyzing data for frequent if/then
patterns and using the criteria support and confidence to identify
the most important relationships.
• Support is an indication of how frequently the items appear in the
database.
• Confidence indicates how often the if/then statement has been
found to be true.

48
49
SINGLE-DIMENSIONAL ASSOCIATION
• buys(X, "computer") → buys(X, "software")
[support = 1%, confidence = 50%]
• where X is a variable representing a customer.
• A confidence, or certainty, of 50% means that if a
customer buys a computer, there is a 50% chance that
she will buy software as well.
• A 1% support means that 1% of all of the transactions
under analysis showed that computer and software
were purchased together.
• This association rule involves a single attribute or
predicate (i.e., buys) that repeats.
• Association rules that contain a single predicate are
referred to as single-dimensional association rules.
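
Support and confidence for a rule like the one above can be computed directly from a list of transactions. The sketch below uses a tiny made-up data set, so the numbers differ from the 1% and 50% on the slide; the function names are assumptions.

```python
def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent. Assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical transactions.
transactions = [
    {"computer", "software"},
    {"computer"},
    {"printer"},
    {"software", "printer"},
]

rule_support = support(transactions, {"computer", "software"})
rule_confidence = confidence(transactions, {"computer"}, {"software"})
```

Here one of four transactions contains both items, and one of the two computer purchases also includes software.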

50
MULTIDIMENSIONAL ASSOCIATION
RULE
• age(X, "20...29") ∧ income(X, "20K...29K") → buys(X, "CD player")
[support = 2%, confidence = 60%]
• The rule indicates that of the XYZ Computers customers
under study, 2% are 20 to 29 years of age with an
income of 20,000 to 29,000 and have purchased a CD
player at XYZ Computers.
• There is a 60% probability that a customer in this age
and income group will purchase a CD player.
• Note that this is an association between more than one
attribute, or predicate (i.e., age, income, and buys).
• Because each attribute is referred to as a dimension, the
above rule is called a multidimensional
association rule.
51
CLASSIFICATION
• Classification is a form of data analysis that can be used to extract
models describing important data classes.
• Classification predicts categorical labels.
• Finding models (e.g., if-then rules, decision trees, mathematical
formulae, neural networks, classification rules) that describe
and distinguish classes or concepts for future prediction, e.g.,
classify cars based on gas mileage
• Example:
• A bank loans officer needs analysis of his/her data in order to learn
which loan applicants are "safe" and which are "risky" for the bank.
• In this example, the data analysis task is classification, where a
model or classifier is constructed to predict categorical labels, such as
"safe" or "risky" for the loan application data.

52
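
The loan example from the classification slide can be sketched with the simplest possible learned model: a single if-then rule on income whose threshold is chosen from labeled training data. The training data, the single feature, and the function names are assumptions; real classifiers such as decision trees use many attributes.

```python
# Hypothetical training data: (annual income in thousands, label).
training = [
    (20, "risky"), (25, "risky"), (30, "risky"),
    (40, "safe"), (48, "safe"), (55, "safe"),
]

def predict(income, threshold):
    """If-then rule: incomes at or above the threshold are labeled 'safe'."""
    return "safe" if income >= threshold else "risky"

def fit_threshold(examples):
    """Pick the income threshold that misclassifies the fewest training examples."""
    best_threshold, best_errors = None, len(examples) + 1
    for threshold in sorted(income for income, _ in examples):
        errors = sum(predict(income, threshold) != label for income, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

threshold = fit_threshold(training)
new_applicant_label = predict(60, threshold)
```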
CLASSIFICATION (NN = INPUT + HIDDEN +
OUTPUT LAYER)

53
PREDICTION
• Prediction models continuous-valued functions
• Predicts unknown or missing numerical values
• Suppose that the marketing manager would like to
predict how much a given customer will spend during a
sale at XYZ Electronics.
• This data analysis task is an example of numeric
prediction.
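
One common way to do numeric prediction is to fit a straight line by ordinary least squares. The sketch below assumes, purely for illustration, that spending during a sale can be predicted from a customer's past-year spending; the data and names are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical history: past-year spending vs. spending during a sale.
past_spend = [100, 200, 300, 400]
sale_spend = [30, 50, 70, 90]

slope, intercept = fit_line(past_spend, sale_spend)
predicted = slope * 250 + intercept   # prediction for a customer who spent 250
```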

54
DATA EXPLORATION
• Data exploration is a common process in data warehouses, which are
characterized by large bulks of data coming from
disparate systems.

• Disparate system (or disparate data system)

• A computer data processing system that
was designed to operate as a fundamentally distinct data
processing system without exchanging data or interacting
with other computer data processing systems.
• Example: legacy systems (such as Windows 7, which has not
received support from Microsoft since 2020).
55
CLUSTERING
• The process of grouping a set of physical or abstract
objects into classes of similar objects is called clustering.
• A cluster is a collection of data objects that are similar to
one another within the same cluster and are dissimilar
to the objects in other clusters.
• Clustering principle: maximize intra-class similarity and
minimize interclass similarity.
• Cluster analysis can be performed on some customer
data in order to identify homogeneous sub-populations
of customers. These clusters may represent individual
target groups for marketing.
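
A very small k-means sketch on one-dimensional data shows the grouping idea. k-means is just one of many clustering methods and is used here as an assumption, as are the spending values and the fixed starting centers.

```python
def k_means_1d(values, centers, iterations=10):
    """Tiny k-means on one-dimensional data with given starting centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical yearly spending per customer.
spend = [120, 150, 130, 900, 950, 1000]
centers, clusters = k_means_1d(spend, centers=[100, 1000])
```

The two resulting groups (low spenders and high spenders) are the kind of homogeneous sub-populations that could serve as marketing target groups.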
56
OUTLIER ANALYSIS
• Outliers: data objects that do not comply with the
general behavior of the data (can be detected using
statistical tests that assume a probability model)
• Often considered noise, but useful in fraud detection and
rare-event analysis
• Example:
• Outlier analysis may uncover fake usage of credit cards by
detecting purchases of extremely large amounts for a given
account number in comparison to regular charges incurred
by the same account.
• Outlier values may also be detected with respect to the
location and type of purchase, or the purchase frequency.
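
The credit card example can be sketched with a simple robust rule: flag a charge if it lies far from the account's median charge, measured in units of the median absolute deviation (MAD). This is a heuristic chosen for illustration rather than a formal statistical test; the charges and the threshold of 5 are assumptions.

```python
from statistics import median

def flag_outliers(amounts, threshold=5.0):
    """Flag amounts whose distance from the median exceeds `threshold` times the MAD.
    If the MAD is zero, every amount that differs from the median is flagged."""
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    return [a for a in amounts if abs(a - center) > threshold * mad]

# Hypothetical charges on one account.
charges = [25, 40, 32, 28, 35, 30, 2500]
flagged = flag_outliers(charges)
```

The median and MAD are used instead of the mean and standard deviation because a single very large charge would otherwise inflate the spread and can hide itself in a small sample.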
57
EVOLUTION ANALYSIS
• Data evolution analysis describes and models
regularities or trends for objects whose behavior
changes over time.
• Example
• Suppose that you have the major stock market (time-series)
data of the last several years available from the
Dhaka Stock Exchange and you would like to invest in
shares of highly profitable garment-industry companies.
• A data mining study of the stock exchange data may identify
evolution regularities for stocks overall and for the stocks of
particular companies.
• Such regularities may help predict future trends in stock
market prices, contributing to your decision making
regarding stock investments.
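
A moving average is one simple way to expose a trend in time-series data. The sketch below uses made-up closing prices and a crude up/down comparison; it illustrates the idea only and is not a forecasting method.

```python
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical closing prices for one stock.
prices = [10, 12, 11, 13, 15, 14, 16]

smoothed = moving_average(prices, window=3)
trend = "up" if smoothed[-1] > smoothed[0] else "down or flat"
```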

58
WHEN IS A "DISCOVERED"
PATTERN INTERESTING?
• A data mining system/query may generate thousands of
patterns, but not all of them are interesting.
• Suggested approach: human-centered, query-based, focused
mining
• Interestingness measures: a pattern is interesting if it is
easily understood by humans, valid on new or test data with
some degree of certainty, potentially useful, novel, or
validates some hypothesis that a user seeks to confirm
• Objective vs. subjective interestingness measures:
• Objective: based on statistics and structures of patterns, e.g.,
support, confidence, etc.
• Subjective: based on the user's beliefs about the data, e.g.,
unexpectedness, novelty, actionability, etc.
59
60
DATA MINING:
CONFLUENCE OF MULTIPLE
DISCIPLINES

61
MAJOR ISSUES IN DATA
MINING
• Issues relating to mining methodology and user
interaction
• Issues relating to performance
• Issues relating to the diversity of database types

62
MAJOR ISSUES IN DATA MINING
• Issues relating to mining methodology and user interaction
• Mining different kinds of knowledge from databases
• Knowledge discovery tasks include data characterization,
discrimination, association and correlation analysis, classification, prediction,
clustering, outlier analysis, and evolution analysis
• Pattern evaluation: the interestingness problem
• Incorporation of background knowledge:
• integrity constraints and deduction rules
• Handling noise and incomplete data
• Data mining query languages and ad hoc mining
• Data mining query languages, such as DMQL, can be designed to support ad
hoc and interactive data mining
• Presentation and visualization of data mining results
• knowledge representation techniques, such as trees, tables, rules, graphs,
charts, crosstabs, matrices, or curves
• Interactive mining of knowledge
• Interactive mining allows users to focus the search for patterns, providing and
refining data mining requests based on returned results.

63
MAJOR ISSUES IN DATA MINING
• Issues relating to performance
• Efficiency and scalability of data mining algorithms
• Parallel, distributed and incremental mining algorithms
• Issues relating to the diversity of database types
• Handling of relational and complex types of data
• complex data objects, hypertext and multimedia data, spatial
data, temporal data, or transaction data
• Mining information from heterogeneous databases and
global information systems

64
DATA MINING
CLASSIFICATION
• Databases to be mined
• Relational, transactional, object-oriented, object-relational, active,
spatial, time-series, text, multi-media, heterogeneous, legacy, WWW,
etc.
• Knowledge to be mined
• Characterization, discrimination, association, classification, clustering,
trend, deviation and outlier analysis, etc.
• Multiple/integrated functions and mining at multiple levels
• Techniques
• Database-oriented, data warehouse, machine learning, statistics,
visualization, pattern recognition, neural network, etc.
• Applications adapted
• Retail, telecommunication, banking, fraud analysis, DNA mining, stock
market analysis, Web mining, Weblog analysis, etc.
65
CONFERENCES AND
JOURNALS ON DATA
MINING
• KDD Conferences
• ACM SIGKDD Int. Conf. on Knowledge Discovery in Databases and Data Mining (KDD)
• SIAM Data Mining Conf. (SDM).
• (IEEE) Int. Conf. on Data Mining (ICDM)
• Conf. on Principles and practices of Knowledge Discovery and Data Mining (PKDD)
• Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD)
• Other related conferences
• ACM SIGMOD
• VLDB
• (IEEE) ICDE
• WWW, SIGIR
• ICML, CVPR, NIPS
• Journals
• Data Mining and Knowledge Discovery (DAMI or DMKD)
• IEEE Trans. on Knowledge and Data Eng. (TKDE)
• KDD Explorations
• ACM Trans. on KDD
66

35 pages
SQL Triggers: Prepared By: Rahim Suwal (29) Shyam Rajak
100% (1)
SQL Triggers: Prepared By: Rahim Suwal (29) Shyam Rajak
55 pages
DBMS Important
No ratings yet
DBMS Important
9 pages
Lecture 3-Access
No ratings yet
Lecture 3-Access
32 pages
Aca Exam
No ratings yet
Aca Exam
13 pages
Unit 4 (Database Architecture)
No ratings yet
Unit 4 (Database Architecture)
15 pages
BIT 1201 Database Lesson 1
No ratings yet
BIT 1201 Database Lesson 1
40 pages
SQL Commands
No ratings yet
SQL Commands
13 pages
Unit - 1 - Part3 - DBMS Architecture
No ratings yet
Unit - 1 - Part3 - DBMS Architecture
4 pages
Assignment 1 - Node Modules, Express, MongoDB and REST API
No ratings yet
Assignment 1 - Node Modules, Express, MongoDB and REST API
6 pages
Lecture 1.1.1 Intro To Database
No ratings yet
Lecture 1.1.1 Intro To Database
24 pages
Data Pipeline Pharmarack
No ratings yet
Data Pipeline Pharmarack
3 pages
Gokhale Institute of Politics and Economics: B.Sc. (Sem - 01) Database Management System Assignment 02
No ratings yet
Gokhale Institute of Politics and Economics: B.Sc. (Sem - 01) Database Management System Assignment 02
7 pages
DBMS Reviewer
No ratings yet
DBMS Reviewer
8 pages
Kenlm: Faster and Smaller Language Model Queries
No ratings yet
Kenlm: Faster and Smaller Language Model Queries
11 pages
DataMan2 Project Problems
No ratings yet
DataMan2 Project Problems
7 pages
Assignment 04
No ratings yet
Assignment 04
7 pages
Data Fabric Corp
No ratings yet
Data Fabric Corp
2 pages
9 GraphQL Variables
No ratings yet
9 GraphQL Variables
4 pages
Introduction to Robotics
From Everand
Introduction to Robotics
Swarnalata Verma
No ratings yet


