Web Mining: Faculty of Information Technology Department of Software Engineering and Information Systems

This document provides an overview of web mining and data mining. It discusses the motivation for data mining due to vast amounts of stored data. Data mining aims to extract useful patterns from data and has applications in business management, production control, and market analysis. The document outlines the evolution of database technology and different types of data mining applications, including market analysis, fraud detection, financial analysis, and analysis for retail and telecommunications industries. It provides examples of how data mining can be used to target customers, detect fraud, and improve business operations. Finally, it depicts data mining as the core of the knowledge discovery process.

Uploaded by

Michael Cormier


Web Mining

Faculty of Information Technology

Department of Software Engineering
and Information Systems

PART 1

1
Outline
Introduction
 Motivation: Why data mining?
 What is data mining?
 Business Applications of data mining
 Data Mining: On what kind of data?
 Data mining functionality
 Are all the patterns interesting?
 Classification of data mining systems
 Major issues in data mining

2
Motivation:
“Necessity is the Mother of Invention”

Data explosion problem

 Automated data collection tools and mature database
technology lead to tremendous amounts of data stored in
databases, data warehouses and other information
repositories
Need to convert such data into knowledge and information
Applications
 Business management
 Production control
 Market analysis
 Engineering design
 Science exploration
3
Evolution of Database Technology

Data collection, database creation

Data management
 data storage and retrieval
 database transaction processing

Data analysis and understanding

 Data mining and data warehousing

4
Evolution of Database Technology
1960s:
 Data collection, database creation, IMS and network DBMS
1970s:
 Relational data model, relational DBMS implementation
1980s:
 RDBMS, advanced data models (extended-relational, OO, deductive,
etc.)
 Application-oriented DBMS (spatial, scientific, engineering, etc.)
1990s:
 Data mining, data warehousing, multimedia databases, and Web
databases
2000s:
 Stream data management and mining
 Data mining with a variety of applications
 Web technology and global information systems
5
What Is Data Mining?
Data mining (knowledge discovery from data)
 Extraction of interesting (non-trivial, implicit, previously
unknown and potentially useful) patterns or knowledge from
huge amounts of data
 Data mining: a misnomer?
Alternative names
 Knowledge discovery (mining) in databases (KDD), knowledge
extraction, data/pattern analysis, data archeology,
information harvesting, business intelligence, etc.
What is not data mining?
 (Deductive) query processing.
 Expert systems or small ML/statistical programs
6
Why Data Mining?—Potential Applications

Data analysis and decision support

 Market analysis and management
 Target marketing, customer relationship management (CRM),
market basket analysis, cross selling, market segmentation
 Risk analysis and management
 Forecasting, customer retention, quality control, competitive
analysis
 Fraud detection and detection of unusual patterns
Other Applications
 Text mining (news group, email, documents) and Web mining
 Intelligent query answering
 DNA and bio-data analysis
7
Market Analysis and Management
Where does the data come from?
 Credit card transactions, loyalty cards, discount coupons,
customer complaint calls, plus (public) lifestyle studies
Target marketing
 Find clusters of “model” customers who share the same
characteristics: interest, income level, spending habits, etc.
 Determine customer purchasing patterns over time
Cross-market analysis
 Associations/correlations between product sales, and prediction
based on such associations

8
Market Analysis and Management
Customer profiling
 What types of customers buy what products (clustering or
classification)

Customer requirement analysis

 identifying the best products for different customers
 predict what factors will attract new customers

Provision of summary information

 multidimensional summary reports
 statistical summary information (data central tendency and
variation)

9
Corporate Analysis & Risk
Management
Finance planning and asset evaluation
 cash flow analysis and prediction
 contingent claim analysis to evaluate assets
 cross-sectional and time series analysis (financial-ratio, trend
analysis, etc.)
Resource planning
 summarize and compare the resources and spending
Competition
 monitor competitors and market directions
 group customers into classes and apply a class-based pricing
procedure
 set pricing strategy in a highly competitive market

10
Fraud Detection &
Mining Unusual Patterns
Approaches: Clustering & model construction for frauds, outlier
analysis, based on historical data
Applications: Health care, retail, credit card service,
telecomm.
 Auto insurance: detect a group of people who stage accidents
to collect insurance
 Money laundering: suspicious monetary transactions
 Medical insurance
 Professional patients, ring of doctors, and ring of references
 Unnecessary or correlated screening tests

11
Fraud Detection &
Mining Unusual Patterns
 Detecting inappropriate medical treatment
 The Australian Health Insurance Commission found that in many
cases blanket screening tests were requested (saving Australian
$1m/yr)
 Telecommunications: phone-call fraud
 Phone call model: destination of the call, duration, time of day or
week. Analyze patterns that deviate from an expected norm
 British Telecom identified discrete groups of callers with frequent
intra-group calls, especially mobile phones, and broke a
multimillion dollar fraud.
 Retail industry
 Analysts estimate that 38% of retail shrink is due to dishonest
employees
 Anti-terrorism
12
Financial Data Analysis
Financial data
 complete
 reliable
 high quality

Loan payment prediction and customer
credit policy analysis

13
Loan payment prediction and
customer credit policy analysis
Factors influencing loan payment performance
 loan-to-value ratio
 term of the loan
 debt ratio (total monthly debt/total monthly income)
 payment-to-income ratio
 income level
 education level
 residence region
 credit history

Analysis may find that

 payment-income ratio is a dominant factor while
 education level and debt ratio are not
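The ratio factors listed above are simple quotients. A minimal sketch, assuming monthly figures in one currency: the function names and sample amounts are hypothetical, and the payment-to-income definition (monthly loan payment divided by monthly income) is an assumption, since the slide spells out only the debt ratio.

```python
def debt_ratio(total_monthly_debt, total_monthly_income):
    """Debt ratio as defined on the slide: total monthly debt / total monthly income."""
    if total_monthly_income <= 0:
        raise ValueError("monthly income must be positive")
    return total_monthly_debt / total_monthly_income


def payment_to_income_ratio(monthly_loan_payment, total_monthly_income):
    """Assumed definition: monthly loan payment / total monthly income."""
    if total_monthly_income <= 0:
        raise ValueError("monthly income must be positive")
    return monthly_loan_payment / total_monthly_income


# Hypothetical applicant: 1500 of monthly debt, 800 of it the loan payment, 5000 income.
applicant_debt_ratio = debt_ratio(1500, 5000)
applicant_pti = payment_to_income_ratio(800, 5000)
```

Ratios like these would typically be computed per applicant and then fed to the analysis as candidate factors.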

14
Data Mining for the Retail Industry
Multidimensional analysis of sales, customers,
products, time and region
 OLAP cubes
Effectiveness of sales campaigns
 Advertisements, coupons, discounts, bonuses
 promote products and attract customers
 can help improve profits
 Compare amount of sales and number of transactions
 during the sales period versus before or after the sales campaign
 Association analysis
 which items are likely to be purchased together with the items
on sale

15
Data Mining for the Retail Industry
Customer retention: analysis of customer loyalty
 sequences of purchases of particular customers
 goods purchased at different periods by the same customers
can be grouped into sequences
 changes in customer consumption or loyalty
 suggests adjustments on the pricing and variety of goods
 to retain old customers and attract new customers
Purchase recommendation and cross-reference of
items
 associations from sales records
 a customer who buys a PC is likely to buy a printer
 purchase recommendations

16
Data Mining for the
Telecommunication Industry
Telecommunication data are multidimensional
 calling time
 duration
 location of caller
 location of callee
 type of call
Used to identify and compare
 data traffic
 system workload
 resource usage
 user group behaviour
 profit
Fraudulent pattern analysis and identification of
unusual patterns
To achieve customer loyalty
 characteristics of customers affecting line usage

17
Other Applications
Sports
 IBM Advanced Scout analyzed NBA game statistics (shots
blocked, assists, and fouls) to gain competitive advantage for
New York Knicks and Miami Heat
Internet Web Surf-Aid
 IBM Surf-Aid applies data mining algorithms to Web access
logs for market-related pages to discover customer
preference and behavior pages, analyzing effectiveness of
Web marketing, improving Web site organization, etc.

18
Data Mining: A KDD Process
Data mining: the core of the knowledge discovery process

[Figure: the KDD process. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation]
19
Steps of a KDD Process
Learning the application domain
 relevant prior knowledge and goals of application
Creating a target data set: data selection
Data cleaning and preprocessing: (may take 60% of effort!)
Data reduction and transformation
 Find useful features, dimensionality/variable reduction, invariant
representation.
Choosing functions of data mining
 summarization, classification, regression, association, clustering.
Choosing the mining algorithm(s)
Data mining: search for patterns of interest
Pattern evaluation and knowledge presentation
 visualization, transformation, removing redundant patterns, etc.
Use of discovered knowledge
20
Data Mining and Business
Intelligence
[Figure: layered pyramid; the potential to support business decisions increases towards the top. Layers from top to bottom, with the role shown beside each where the figure gives one:]
 Making Decisions (End User)
 Data Presentation: Visualization Techniques (Business Analyst)
 Data Mining: Information Discovery (Data Analyst)
 Data Exploration: Statistical Analysis, Querying and Reporting
 Data Warehouses / Data Marts: OLAP, MDA (DBA)
 Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
21
Architecture: Typical Data Mining System

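[Figure: layered architecture, top to bottom: Graphical user interface; Pattern evaluation; Data mining engine; Database or data warehouse server; Data cleaning, data integration and filtering; Databases and Data Warehouse. A Knowledge-base sits beside the pattern evaluation and data mining engine layers.]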

22
Data Mining: On What Kinds of Data?
Relational database
Data warehouse
Transactional database
Advanced database and information repository
 Object-relational database

 Spatial and temporal data

 Time-series data

 Multimedia database

 Heterogeneous and legacy database

 Text databases & WWW

23
Data Mining Functionalities
Concept description: Characterization and discrimination
 Generalize, summarize, and contrast data characteristics, e.g., dry vs. wet regions

Association (correlation and causality)

 Multi-dimensional vs. single-dimensional association
 age(X, “20..29”) ∧ income(X, “20..29K”) → buys(X, “PC”) [support = 2%, confidence =
60%]
 contains(T, “computer”) → contains(T, “software”) [1%, 75%]
 Diaper → Milk [0.5%, 75%]

Classification and Prediction

 Construct models (functions) that describe and distinguish classes or concepts for future
prediction
 E.g., classify countries based on climate, or classify cars based on gas mileage
 Presentation: decision-tree, classification rule, neural network
 Predict some unknown or missing numerical values

24
Data Mining Functionalities
Cluster analysis
 Class label is unknown: Group data to form new classes, e.g., cluster
houses to find distribution patterns

 Maximizing intra-class similarity & minimizing interclass similarity

Outlier analysis
 Outlier: a data object that does not comply with the general
behavior of the data
 Noise or exception? No! useful in fraud detection, rare events
analysis
Trend and evolution analysis
 Trend and deviation: regression analysis

 Sequential pattern mining, periodicity analysis

 Similarity-based analysis

Other pattern-directed or statistical analyses

25
Are All the “Discovered” Patterns
Interesting?

Data mining may generate thousands of patterns: not all of them
are interesting
 Suggested approach: Human-centered, query-based, focused mining
Interestingness measures
 A pattern is interesting if it is easily understood by humans, valid on
new or test data with some degree of certainty, potentially useful,
novel, or validates some hypothesis that a user seeks to confirm
Objective vs. subjective interestingness measures
 Objective: based on statistics and structures of patterns, e.g.,
support, confidence, etc.
 Subjective: based on user’s belief in the data, e.g., unexpectedness,
novelty, actionability, etc.
26
Can We Find All and Only Interesting
Patterns?

Find all the interesting patterns: Completeness

 Can a data mining system find all the interesting patterns?
 Heuristic vs. exhaustive search
 Association vs. classification vs. clustering
Search for only interesting patterns: An optimization problem
 Can a data mining system find only the interesting patterns?
 Approaches
 First generate all the patterns and then filter out the uninteresting
ones.
 Generate only the interesting patterns—mining query
optimization

27
Data Mining:
Confluence of Multiple Disciplines

[Figure: data mining at the centre, drawing on database systems, statistics, machine learning, visualization, algorithms and other disciplines]

28
Data Mining: Classification Schemes

General functionality
 Descriptive data mining
 Predictive data mining

Different views, different classifications

 Kinds of data to be mined
 Kinds of knowledge to be discovered
 Kinds of techniques utilized
 Kinds of applications adapted

29
Two Styles of Data Mining
Descriptive data mining
 Characterize the general properties of the data in
the database
 finds patterns in data and
 user determines which ones are important
Predictive data mining
 perform inference on the current data to make
predictions
 we know what to predict
Not mutually exclusive
 used together

30
Descriptive Data Mining
Discovering new patterns inside the data
Used during the data exploration steps
 what is in the data
 what does it look like
 are there any unusual patterns
 what does the data suggest for customer
segmentation
users may have no idea
 which kind of patterns may be interesting

31
Descriptive Data Mining
Patterns at various granularities
 geography
 country - region - city - street
 student
 university - faculty - department - minor

Functionalities of descriptive data mining

 clustering
 summarization
 visualization
 market basket analysis

32
A Model is a Black Box
X: vector of independent variables
Y = f(X): an unknown function

[Figure: inputs X1, X2 enter the Model box, which produces the output Y]

 the user does not care what the model is doing

 it is a black box
 interested in the accuracy of its predictions

33
Predictive Data Mining
Using known examples the model is trained
 the unknown function is learned from data

The more data with known outcomes is available

 the better the predictive power of the model

Used to predict outcomes whose inputs are known but whose output
values are not yet realized

Never 100% accurate

The performance of a model on past data, where the outcomes
are already known, is not what matters

Its performance on unknown data is much more important

34
Typical Questions Answered by
Predictive Models

Who is likely to respond to our next offer

 based on history of previous marketing campaigns

Which customers are likely to leave in the next
six months

What transactions are likely to be fraudulent

 based on known examples of fraud

35
Multi-Dimensional View of Data
Mining
Data to be mined
 Relational, data warehouse, transactional, stream, object-
oriented/relational, active, spatial, time-series, text, multi-media,
heterogeneous, legacy, WWW
Knowledge to be mined
 Characterization, discrimination, association, classification, clustering,
trend/deviation, outlier analysis, etc.
 Multiple/integrated functions and mining at multiple levels
Techniques utilized
 Database-oriented, data warehouse (OLAP), machine learning,
statistics, visualization, etc.
Applications adapted
 Retail, telecommunication, banking, fraud analysis, bio-data mining, stock
market analysis, Web mining, etc.

36
OLAP Mining: Integration of Data Mining and
Data Warehousing

Coupling of data mining systems with DBMS and data warehouse systems

 No coupling, loose-coupling, semi-tight-coupling, tight-coupling
On-line analytical mining of data
 integration of mining and OLAP technologies
Interactive mining of multi-level knowledge
 Necessity of mining knowledge and patterns at different levels of
abstraction by drilling/rolling, pivoting, slicing/dicing, etc.
Integration of multiple mining functions
 Characterized classification, first clustering and then association

37
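An OLAM Architecture
[Figure: four-layer OLAM architecture. Layer 4, User Interface: the user submits a mining query and receives the mining result through the User GUI API. Layer 3, OLAP/OLAM: the OLAM Engine and the OLAP Engine, on top of the Data Cube API. Layer 2, MDDB: the multidimensional database and Meta Data, reached through the Database API with filtering and integration. Layer 1, Data Repository: Databases and the Data Warehouse, prepared by data cleaning and data integration.]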

38
Major Issues in Data Mining
Mining methodology
 Mining different kinds of knowledge from diverse data types, e.g., bio, stream, Web
 Performance: efficiency, effectiveness, and scalability
 Pattern evaluation: the interestingness problem
 Incorporation of background knowledge
 Handling noise and incomplete data
 Parallel, distributed and incremental mining methods
 Integration of the discovered knowledge with existing one: knowledge fusion
User interaction
 Data mining query languages and ad-hoc mining
 Expression and visualization of data mining results
 Interactive mining of knowledge at multiple levels of abstraction
Applications and social impacts
 Domain-specific data mining & invisible data mining
 Protection of data security, integrity, and privacy

39
An Example Problem
All Electronics (AE) is a multi-branch retail company
Relational tables include
customer
 ID, name, address, age, income, education, sex, m_status
items
 ID, name, brand, category, type, price, place_made, supplier, cost
employee
 ID, name, department, education, salary
branch
purchases
 transID, item_sold, customer_ID, emp_ID, date, time, method_paid,
amount

40
Concept Description
Characterization
Discrimination
Data can be associated with
 classes or
 concepts
Classes of items for sale
 computers, printers
Concepts of customers:
 BigSpenders
 BudgetSpenders

41
Data Characterization
Summarizing the data of the class under
study (target class)
Methods
 OLAP roll-up operation
 user-controlled data summarization
 along a specified dimension
 attribute-oriented induction
 without step-by-step user interaction
The output of characterization
 pie charts, bar charts, curves, multidimensional data
cubes, or cross tabs
 in rule form as characteristic rules

42
Characterization Example
Description summarizing the characteristics of
customers who spend more than $1000 a year
at All Electronics
 age, employment, income
 drill down on any dimension
 e.g., on occupation, to view these customers according to their type of
employment

43
Data Discrimination
Comparing the target class with one or a set
of comparative classes (contrasting classes)
 these classes can be specified by the user
through database queries
Methods and output
 similar to those used for characterization
 include comparative measures to distinguish
between the target and contrasting classes

44
Discrimination Examples
Compare the general features of software products
 whose sales increased by 10% in the last year
 whose sales decreased by at least 30% during the same period
Compare two groups of AE customers
 I) who shop for computer products regularly
 more than two times a month
 II) who rarely shop for such products
 less than three times a year
The resulting description:
80% of group I customers
 university education
 ages 20-40
60% of group II customers
 seniors or young
 no university degree

45
Multidimensional Data
Sales according to region, month and product type
Dimensions: Product, Location, Time
Hierarchical summarization paths
 Product: Industry - Category - Product
 Location: Region - Country - City - Office
 Time: Year - Quarter - Month or Week - Day
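Rolling up along one of these hierarchies amounts to re-keying each record by its parent level and summing the measure. A small sketch, assuming sales are recorded per city: the city names, amounts and the `roll_up` helper are hypothetical illustrations, not data from the slides.

```python
from collections import defaultdict


def roll_up(rows, parent_of):
    """Aggregate (key, amount) rows one level up a hierarchy.

    `parent_of` maps each lower-level key (e.g. a city) to its parent
    (e.g. a country); amounts that share a parent are summed.
    """
    totals = defaultdict(float)
    for key, amount in rows:
        totals[parent_of[key]] += amount
    return dict(totals)


# Hypothetical city-level sales and a City -> Country mapping.
city_sales = [("Istanbul", 120.0), ("Ankara", 80.0), ("Paris", 200.0), ("Lyon", 50.0)]
city_to_country = {
    "Istanbul": "Turkey",
    "Ankara": "Turkey",
    "Paris": "France",
    "Lyon": "France",
}

country_sales = roll_up(city_sales, city_to_country)
```

Drilling down is the reverse view: returning from the country totals to the city-level rows they were built from.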
46
Association Analysis
Discovery of association rules showing
attribute-value conditions that occur frequently
together in a given set of data
Widely used
 market basket
 transaction data analysis
More formally
 $X \Rightarrow Y$, that is
 $A_1 \wedge A_2 \wedge \dots \wedge A_k \Rightarrow B_1 \wedge B_2 \wedge \dots \wedge B_l$
 $A_i$, $B_j$ are attribute-value pairs

47
Example: Association Analysis
From the All Electronics (AE) database
 age(X, “20..29”) ∧ income(X, “20K..40K”) → buys(X, “CD player”)
 (support = 2%,
 confidence = 60%)
X is a variable representing a customer
2% of the AE customers
 are between 20 and 29 years of age
 have incomes ranging from 20K to 40K
 and have bought a CD player
There is a 60% probability that a customer in those age and
income groups will buy a CD player
A multidimensional association rule
 contains more than one attribute or predicate

48
Market Basket Analysis
Customers' buying behaviour is
investigated
Based only on the transaction data
 no information about customer properties
such as age or income
Managers
 are interested in which products or product
groups are sold together

49
Example: Basket Analysis Rule
 buy(computer) → buy(printer)
 (support = 1%, confidence = 60%)
 1% of all transactions contain
 both a computer and a printer
 if a transaction contains a computer
 there is a 60% chance that it contains a printer as well
a single dimensional association rule
 contains a single predicate
an association rule is interesting if
 its support exceeds a minimum threshold and
 its confidence exceeds a min threshold
These min values are set by specialists
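Support and confidence for a rule such as buy(computer) → buy(printer) can be counted directly from the transactions. The sketch below is illustrative only: the toy transactions, the threshold values and the helper names are assumptions, not data from the slides.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent): support(A and B) / support(A)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(transactions, set(antecedent) | set(consequent)) / base


def is_interesting(transactions, antecedent, consequent, min_sup, min_conf):
    """A rule is kept only if both measures exceed their minimum thresholds."""
    sup = support(transactions, set(antecedent) | set(consequent))
    conf = confidence(transactions, antecedent, consequent)
    return sup > min_sup and conf > min_conf


# Hypothetical market baskets.
transactions = [
    {"computer", "printer"},
    {"computer", "printer", "software"},
    {"computer"},
    {"printer"},
    {"milk"},
]

rule_support = support(transactions, {"computer", "printer"})          # 2 of 5 baskets
rule_confidence = confidence(transactions, {"computer"}, {"printer"})  # 2 of 3 computer baskets
```

Here the thresholds are passed in by the caller, which mirrors the point that the minimum values are chosen by specialists rather than derived from the data.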

50
Classification and Prediction
Finding models (functions) that describe and distinguish classes
or concepts for future prediction
The derived model is based on the analysis of a set of training
data (objects whose class labels are known)
E.g., classify countries based on climate, or classify cars based on
gas mileage
Presentation: decision-tree, classification rule, neural network
Prediction: Predict some unknown or missing numerical values
May need to be preceded by relevance analysis which attempts to
identify attributes that do not contribute to the classification or
prediction process
These attributes can be excluded

51
Steps of Classification Process
Train the model
 using a training set
Test the model
 on a test sample
 whose class labels are known but not used
for training the model
Use the model for classification
 on new data whose class labels are
unknown
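The three steps can be sketched with a deliberately simple model. The example below assumes a single numeric feature and uses a one-threshold classifier (a decision stump), chosen here only for brevity; the data values and function names are hypothetical.

```python
def predict(q, x):
    """Classify as 1 when the feature exceeds threshold q, otherwise 0."""
    return 1 if x > q else 0


def train_stump(examples):
    """Step 1: pick the threshold with the fewest errors on the training set."""
    best_q, best_errors = None, None
    for q in sorted({x for x, _ in examples}):
        errors = sum(1 for x, label in examples if predict(q, x) != label)
        if best_errors is None or errors < best_errors:
            best_q, best_errors = q, errors
    return best_q


def accuracy(q, examples):
    """Step 2: share of labelled examples the model classifies correctly."""
    return sum(1 for x, label in examples if predict(q, x) == label) / len(examples)


# Hypothetical (feature, class label) pairs.
training_set = [(10, 0), (20, 0), (30, 0), (40, 1), (50, 1), (60, 1)]
test_set = [(15, 0), (25, 0), (45, 1), (55, 1)]  # labels known, not used in training

q = train_stump(training_set)
test_accuracy = accuracy(q, test_set)
new_prediction = predict(q, 70)  # Step 3: new data whose label is unknown
```

Keeping the test sample out of training is what makes the measured accuracy an estimate of performance on unseen data.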

52
Example
[Figure: scatter plot of wealth (vertical axis) against yearly income (horizontal axis); each point is labelled OK or DEFAULT]

53
Decision Trees

x1: yearly income
x2: wealth
y = 0: DEFAULT
y = 1: OK

x1 > q1?
  no: y = 0
  yes: x2 > q2?
    no: y = 0
    yes: y = 1

Numerical values of q1 and q2 are estimated by the algorithm

54
Solution
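[Figure: the scatter plot of wealth (x2) against yearly income (x1), with the threshold q1 marked on the income axis separating the OK region from the DEFAULT region]

Rule: IF yearly income > q1 AND wealth > q2
THEN OK ELSE DEFAULT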
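The rule reads directly as nested tests. A minimal sketch follows; the threshold values are hypothetical placeholders, since in practice q1 and q2 are estimated from the training data.

```python
OK, DEFAULT = 1, 0


def classify(yearly_income, wealth, q1, q2):
    """Apply the rule: OK only if income exceeds q1 AND wealth exceeds q2."""
    if yearly_income > q1:
        if wealth > q2:
            return OK
        return DEFAULT
    return DEFAULT


# Hypothetical thresholds, not values from the slides.
Q1, Q2 = 30_000, 50_000

high_both = classify(45_000, 80_000, Q1, Q2)
high_income_only = classify(45_000, 10_000, Q1, Q2)
high_wealth_only = classify(20_000, 80_000, Q1, Q2)
```

Each path through the nested tests corresponds to one leaf of the tree, so the function returns DEFAULT on two of its three paths.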
55
Exercise

x1: yearly income
x2: wealth
y = 0: DEFAULT
y = 1: OK

x1 < q1?
  no: y = 0
  yes: x2 < q2?
    no: y = 0
    yes: y = 1

Numerical values of q1 and q2 are estimated by the algorithm

 Construct a data set separated by the above tree

56
Artificial Neural Nets: Perceptron

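[Figure: a perceptron with inputs x_1, x_2, ..., x_d, a bias input x_0 = +1, weights w_0, w_1, w_2, ..., w_d and a single output y]

$y = g(x_1 w_1 + x_2 w_2 + \dots + x_d w_d + w_0)$
$y = g(\mathbf{w}^T \mathbf{x})$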
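The perceptron output is a weighted sum of the inputs, plus the bias weight w0 attached to the constant input x0 = +1, passed through the function g. The slide does not say which g is used; the sketch below assumes a sigmoid by default, and the weights shown are arbitrary.

```python
import math


def sigmoid(a):
    """One common choice for g (an assumption; the slide leaves g unspecified)."""
    return 1.0 / (1.0 + math.exp(-a))


def perceptron_output(weights, inputs, g=sigmoid):
    """Compute y = g(w^T x) with weights [w0, w1, ..., wd] and inputs [x1, ..., xd].

    The constant bias input x0 = +1 is prepended automatically.
    """
    x = [1.0] + list(inputs)
    if len(x) != len(weights):
        raise ValueError("expected one weight per input plus the bias weight w0")
    activation = sum(w_i * x_i for w_i, x_i in zip(weights, x))
    return g(activation)


# Arbitrary illustrative weights.
y_zero_weights = perceptron_output([0.0, 0.0, 0.0], [3.0, -2.0])
y_linear = perceptron_output([1.0, 2.0, -1.0], [1.0, 3.0], g=lambda a: a)
```

Passing a different `g` (for example the identity) changes the output function without touching the weighted sum.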

57
Training ANNs

$o = g(\mathbf{w}^T \mathbf{x}) = g\left(\sum_{i=0}^{d} w_i x_i\right)$

Learning set: $X = \{\mathbf{x}^t, y^t\}$

Find $\mathbf{w}$ which minimizes the error on $X$

$E(\mathbf{w} \mid X) = \sum_{t \in X} \left(y^t - o^t\right)^2 = \sum_{t \in X} \left(y^t - g\left(\sum_{i} w_i x_i^t\right)\right)^2$
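One way to search for the weights that minimize the squared error over the learning set is batch gradient descent. The slide does not name an optimizer, so this choice is an assumption, as are the identity output function g(a) = a, the learning rate and the toy data.

```python
def predict(w, x):
    """Output o = w^T x with the bias input x0 = +1 prepended (g is the identity)."""
    xa = [1.0] + list(x)
    return sum(w_i * x_i for w_i, x_i in zip(w, xa))


def squared_error(w, data):
    """E(w | X): sum over the learning set of (y - o)^2."""
    return sum((y - predict(w, x)) ** 2 for x, y in data)


def train(data, d, learning_rate=0.02, epochs=2000):
    """Batch gradient descent on E(w | X), starting from all-zero weights."""
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, y in data:
            xa = [1.0] + list(x)
            residual = y - predict(w, x)
            for i in range(d + 1):
                grad[i] -= 2.0 * residual * xa[i]  # dE/dw_i for this example
        w = [w_i - learning_rate * g_i for w_i, g_i in zip(w, grad)]
    return w


# Hypothetical learning set generated from y = 1 + 2x.
data = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0), ((3.0,), 7.0)]
w = train(data, d=1)
final_error = squared_error(w, data)
```

With a non-linear g the same loop applies, except that each gradient term is also multiplied by the derivative of g at the weighted sum.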

58
ANN for Classification
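[Figure: a network with inputs x_0 = +1, x_1, x_2, ..., x_d and K outputs o_1, o_2, ..., o_K; the weight w_{Kd} connects input x_d to output o_K]

$o_j^t = g(\mathbf{w}_j^T \mathbf{x}^t) = g\left(\sum_{i=0}^{d} w_{ji} x_i^t\right)$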
59
Prediction Methods
linear regression
 $Y_i = a_0 + a_1 X_{1,i} + a_2 X_{2,i} + \dots + a_k X_{k,i} + u_i$
non-linear regression
 $Y_i = f(X_{1,i}, X_{2,i}, \dots, X_{k,i}; a_1, a_2, \dots, a_k, u_i)$
generalized linear regression
 logistic
 logit, probit
 when the dependent variable is categorical
 good customer / bad customer, or employed / unemployed
 Poisson regression
 for count variables
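For the linear case with a single predictor, the coefficients a0 and a1 have a closed-form least-squares solution. The sketch below assumes ordinary least squares, which the slide does not name explicitly, and uses made-up data points.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for Y = a0 + a1 * X with one predictor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx              # slope
    a0 = mean_y - a1 * mean_x   # intercept
    return a0, a1


# Hypothetical observations lying exactly on Y = 1 + 2X.
a0, a1 = fit_simple_linear([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
predicted = a0 + a1 * 5.0
```

With real data the points do not lie on the line, and the error term u_i absorbs the residuals.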

60
Example: Prediction and Classification
Classification is used to classify customers applying for
credit cards
 known class labels: risky, reliable
 when a new customer applies, looking at his/her
characteristics
 income, age, education, wealth, region, ...
 the customer class is predicted

Prediction: The monthly expense of a new customer (a
real continuous variable) is predicted based on
personal information
 independent variables
 income, education, wealth, profession, ...
 some are numeric, some categorical

61
Cluster Analysis
Class label is unknown: Group data to form new
classes, e.g., cluster houses to find distribution
patterns
Clustering based on the principle: maximizing the
intra-class similarity and minimizing the interclass
similarity
 Objects within a cluster have high similarity in comparison to
one another
 but are very dissimilar to objects in other clusters
There may be a hierarchy of classes

62
Example: Clustering
Can be performed on AE customer data
to identify homogeneous subpopulations of
customers
represent individual target groups for
marketing

63
Example
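[Figure: scatter plot of customers by income (horizontal axis) and distance to the store (vertical axis), showing three groups labelled Type 1, Type 2 and Type 3]
Clustering according to income and distance to store
Three clusters of data points are evident
The + signs indicate group centers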
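Group centers like the ones marked in this example can be found with k-means, one common clustering algorithm. The slides do not say which algorithm produced the figure, so k-means is an assumption here, and the points and starting centers are invented for illustration.

```python
import math


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members (unchanged if it has none)."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers


def k_means(points, initial_centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iter):
        new_centers = update(points, assign(points, centers), centers)
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers


# Hypothetical customers: (income in thousands, distance to store in km).
points = [(20, 1), (22, 2), (21, 1.5), (60, 10), (62, 11), (61, 9),
          (100, 3), (102, 4), (98, 2)]
labels, centers = k_means(points, [(20, 1), (60, 10), (100, 3)])
```

Because income and distance are on different scales, a real analysis would normally rescale the features first so that one does not dominate the distance.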
64
Outlier Analysis
Outlier: a data object that does not comply with the
general behavior of the data
It can be considered as noise or exception but is quite
useful in fraud detection, rare events analysis
Detected using
 statistical tests
 distance measures
 visually inspecting the data
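A simple statistical screen flags values that lie many standard deviations from the mean. The sketch below assumes a z-score rule with a hypothetical threshold of 2.0; both the threshold and the sample ages are illustrative choices, not part of the slides.

```python
import statistics


def z_score_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds `threshold`."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / sd > threshold]


# Hypothetical ages, including one data-entry error.
ages = [23, 35, 41, 29, 52, 38, 44, 31, 27, 999]
flagged = z_score_outliers(ages, threshold=2.0)
```

In a small sample a single extreme value inflates the standard deviation itself, so the threshold has to be fairly low; distance-based checks or visual inspection are useful cross-checks.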

65
Reasons for Outliers
Measurement errors
Coding errors
 age is entered as 999
Nature of data
 salary of the general manager is much higher
than that of the other employees
 in some countries in crisis the interest rate was
in the order of 1000s

66
Evolution Analysis
Describes and models regularities or trends for objects
whose behavior changes over time
Distinct features include
 Trend and deviation: time-series data analysis
 Sequential pattern mining, periodicity analysis
 Similarity-based analysis

Example
 Stock market predictions: future stock prices
 For overall stocks: indexes or individual company stocks

17 pages
MAIN+ +YMP+Facebook+Ads+Portfolio+
No ratings yet
MAIN+ +YMP+Facebook+Ads+Portfolio+
41 pages
Subject: Management Science (MS) Class: III /IV B..Tech - VI Semester Faculty: Dr. K Visweswara Rao
No ratings yet
Subject: Management Science (MS) Class: III /IV B..Tech - VI Semester Faculty: Dr. K Visweswara Rao
2 pages
TSLA Balance Sheet: Collapse All
No ratings yet
TSLA Balance Sheet: Collapse All
6 pages
M4 Appendix 1 & Act 4 - Compliance Monitoring Plan
No ratings yet
M4 Appendix 1 & Act 4 - Compliance Monitoring Plan
4 pages
What Are The SEO Benefits For Small Business by Professionals
No ratings yet
What Are The SEO Benefits For Small Business by Professionals
3 pages
Yash Makadiya - Essilor
No ratings yet
Yash Makadiya - Essilor
3 pages
GTU DOM Paper 3
No ratings yet
GTU DOM Paper 3
1 page
E-Factory CAD Foundry Model1
No ratings yet
E-Factory CAD Foundry Model1
3 pages
August of Money: The Quest for Cashless Society
From Everand
August of Money: The Quest for Cashless Society
Mehul Desai
No ratings yet
Customer 360: How Data, AI, and Trust Change Everything
From Everand
Customer 360: How Data, AI, and Trust Change Everything
Martin Kihn
No ratings yet


