DM Unit 1 PDF

OLAP stands for Online Analytical Processing and allows users to analyze multidimensional data from multiple database systems simultaneously. OLAP databases are divided into cubes that can be analyzed using five basic operations: drill down, roll up, dice, slice, and pivot. These operations allow users to view data at different levels of granularity. Data mining techniques like association, classification, clustering, sequential patterns, and decision trees are used to extract useful knowledge and patterns from large amounts of data. These techniques help organizations make better decisions. Descriptive techniques characterize data properties while predictive techniques infer patterns to make predictions.

Uploaded by

Ayush

OLAP Operations

OLAP stands for Online Analytical Processing. It is a software technology
that allows users to analyze information from multiple database systems at the same
time. It is based on a multidimensional data model and lets the user query
multidimensional data (e.g. Delhi -> 2018 -> Sales data).

OLAP databases are divided into one or more cubes, and these cubes are known
as hypercubes.

OLAP OPERATIONS:
There are five basic analytical operations that can be performed on an OLAP cube:

1. Drill down: The drill-down operation converts less detailed data into more
detailed data. It can be done by:

• Moving down in the concept hierarchy

• Adding a new dimension

In the cube given in the overview section, the drill-down operation is performed by
moving down in the concept hierarchy of the Time dimension (Quarter -> Month).

2. Roll up: It is the opposite of the drill-down operation. It performs aggregation on the
OLAP cube. It can be done by:

• Climbing up in the concept hierarchy

• Reducing the dimensions

In the cube given in the overview section, the roll-up operation is performed by
climbing up in the concept hierarchy of the Location dimension (City -> Country).
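A minimal sketch of roll-up and drill-down in plain Python may make the two operations concrete. The fact table, the city-to-country mapping, and the helper name `aggregate` are hypothetical illustrations and are not taken from the cube in the overview section.

```python
from collections import defaultdict

# Hypothetical fact table: (city, quarter, month, item, sales).
FACTS = [
    ("Delhi",   "Q1", "Jan", "Car", 10),
    ("Delhi",   "Q1", "Feb", "Car", 20),
    ("Delhi",   "Q2", "Apr", "Bus", 5),
    ("Kolkata", "Q1", "Jan", "Car", 7),
    ("Toronto", "Q1", "Mar", "Bus", 3),
]

# Hypothetical concept hierarchy for the Location dimension (City -> Country).
CITY_TO_COUNTRY = {"Delhi": "India", "Kolkata": "India", "Toronto": "Canada"}


def aggregate(facts, key):
    """Sum the sales measure over the groups defined by key(record)."""
    totals = defaultdict(int)
    for record in facts:
        totals[key(record)] += record[4]
    return dict(totals)


# Starting view: sales per (city, quarter).
by_city_quarter = aggregate(FACTS, lambda r: (r[0], r[1]))

# Roll up: climb the Location hierarchy from City to Country.
by_country_quarter = aggregate(FACTS, lambda r: (CITY_TO_COUNTRY[r[0]], r[1]))

# Drill down: descend the Time hierarchy from Quarter to Month.
by_city_month = aggregate(FACTS, lambda r: (r[0], r[1], r[2]))
```

Roll-up merges the Delhi and Kolkata cells into one India cell per quarter, while drill-down splits each quarter cell into its months.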
3. Dice: It selects a sub-cube from the OLAP cube by applying selection criteria on two
or more dimensions. In the cube given in the overview section, a sub-cube is selected
using the following criteria:

• Location = “Delhi” or “Kolkata”

• Time = “Q1” or “Q2”

• Item = “Car” or “Bus”

4. Slice: It fixes a single value on one dimension of the OLAP cube, which produces a
new sub-cube. In the cube given in the overview section, slice is performed on the
dimension Time = “Q1”.

5. Pivot: It is also known as the rotation operation because it rotates the current view
to give a new representation of the data. Performing the pivot operation on the
sub-cube obtained after the slice operation gives a new view of it.
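Dice, slice, and pivot can be sketched the same way. The cube cells below and the function names `dice`, `slice_time`, and `pivot` are hypothetical; the selection criteria mirror the ones listed for the dice operation.

```python
# Hypothetical cube cells: (location, time, item) -> sales.
CUBE = {
    ("Delhi", "Q1", "Car"): 30,
    ("Delhi", "Q2", "Bus"): 5,
    ("Kolkata", "Q1", "Car"): 7,
    ("Kolkata", "Q3", "Bus"): 4,
    ("Mumbai", "Q1", "Truck"): 9,
}


def dice(cube, locations, times, items):
    """Keep cells whose value on every dimension is in the allowed set."""
    return {
        cell: sales
        for cell, sales in cube.items()
        if cell[0] in locations and cell[1] in times and cell[2] in items
    }


def slice_time(cube, quarter):
    """Fix Time to one value, leaving a (location, item) sub-cube."""
    return {
        (loc, item): sales
        for (loc, time, item), sales in cube.items()
        if time == quarter
    }


def pivot(table):
    """Rotate a two-dimensional view by swapping its axes."""
    return {(b, a): value for (a, b), value in table.items()}


diced = dice(CUBE, {"Delhi", "Kolkata"}, {"Q1", "Q2"}, {"Car", "Bus"})
q1 = slice_time(CUBE, "Q1")
q1_rotated = pivot(q1)
```

Here `diced` keeps three cells, `q1` is indexed by (location, item), and `q1_rotated` holds the same values indexed by (item, location).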

DATA MINING TECHNIQUES

Extracting important knowledge from a very large amount of data can be crucial to
organizations for the process of decision-making.

Some data mining techniques are:

1 Association

2 Classification

3 Clustering

4 Sequential patterns

5 Decision tree

1 Association Technique

The association technique finds patterns in large data sets based on relationships
between two or more items in the same transaction. It is used in market analysis,
where it helps us analyze people's buying habits. For example, you might find that a
customer always buys ice cream when he comes to watch a movie, so when that
customer comes to watch a movie again, he is likely to want ice cream again.

2 Classification Technique

Classification is one of the most common data mining techniques. In the classification
method we use mathematical techniques such as decision trees, neural networks, and
statistics to predict the class of unknown records. This technique helps in deriving
important information about data.

Assume you have a set of records, each containing a set of attributes; based on these
attributes you can predict the class of unseen or unknown records. For example,
given the records of all employees who left the company, the classification
technique can predict who will probably leave the company in a future period.
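As a rough illustration of predicting the class of an unseen record, here is a sketch that uses a 1-nearest-neighbour rule. This is a deliberately simple stand-in, not one of the techniques named above (decision trees, neural networks, statistics), and the attributes, records, and labels are all hypothetical.

```python
# Hypothetical training records: (years_at_company, satisfaction_score) -> label.
TRAINING = [
    ((1, 2), "left"),
    ((2, 3), "left"),
    ((8, 9), "stayed"),
    ((10, 7), "stayed"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two attribute tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(record):
    """Return the label of the closest training record (1-nearest neighbour)."""
    _, label = min(TRAINING, key=lambda pair: squared_distance(pair[0], record))
    return label
```

For example, `predict((2, 2))` falls next to the records labelled "left", while `predict((9, 8))` falls next to those labelled "stayed".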

3 Clustering Technique

Clustering is one of the oldest techniques used in data mining. The main aim of the
clustering technique is to form clusters (groups) from pieces of data that share
common characteristics. Clustering helps to identify the differences and
similarities between the data.

Take the example of a shop in which many items are for sale; the challenge is how to
arrange those items so that a customer can easily find the item he needs. Using the
clustering technique, you can keep items that share some similarities in one corner
and items that share other similarities in another corner.
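One common way to form such groups is k-means, chosen here only for illustration since the text does not name a specific algorithm. The sketch below clusters hypothetical item prices in one dimension; the starting centroids are an assumption, and different starting points can give different clusters.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster numbers around centroids with the assign/update loop of k-means."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # Centroids stopped moving, so the clusters are stable.
        centroids = updated
    return clusters, centroids


# Hypothetical item prices and hypothetical starting centroids.
prices = [1, 2, 3, 10, 11, 12]
clusters, centers = k_means_1d(prices, centroids=[1, 12])
```

With these inputs the cheap items and the expensive items end up in separate groups, with centroids at 2.0 and 11.0.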

4 Sequential patterns

Sequential pattern analysis is a useful method for identifying trends and recurring
patterns. For example, if customer data shows that a customer buys a particular
product at a particular time of year, you can use this information to suggest that
product to the customer at that time of year.
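The seasonal example can be approximated by counting how many distinct years a customer bought the same product in the same month. This is a simplified recurrence check under assumed data, not a full sequential-pattern mining algorithm; the purchase log and the `min_years` cutoff are hypothetical.

```python
from collections import defaultdict

# Hypothetical purchase log: (customer, product, year, month).
PURCHASES = [
    ("alice", "heater", 2021, 11),
    ("alice", "heater", 2022, 11),
    ("alice", "heater", 2023, 11),
    ("alice", "umbrella", 2022, 6),
    ("bob", "heater", 2021, 1),
    ("bob", "heater", 2023, 12),
]


def seasonal_purchases(purchases, min_years=2):
    """Return (customer, product, month) triples seen in at least min_years distinct years."""
    years_seen = defaultdict(set)
    for customer, product, year, month in purchases:
        years_seen[(customer, product, month)].add(year)
    return sorted(key for key, years in years_seen.items() if len(years) >= min_years)


recurring = seasonal_purchases(PURCHASES)
```

Only alice's November heater purchases repeat across years, so that is the one triple a recommender could act on.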

5 Decision tree

A decision tree is one of the most commonly used data mining techniques because its
model is easy for users to understand. In a decision tree you start with a simple
question that has two or more answers. Each answer leads to further questions, which
help us make a final decision. The root node of a decision tree is a simple question.

Take the example of a flood warning system.

[Figure: Decision tree]

First check the water level. If the water level is > 50 ft, an alert is sent. If the
water level is < 50 ft, check it again: if the water level is > 30 ft, a warning is
sent, and if the water level is < 30 ft, the water is in the normal range.
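The flood warning tree maps directly onto nested conditions. In the sketch below the function name and return labels are hypothetical, and because the example does not say what happens at exactly 50 ft or 30 ft, those boundary values are assumed to fall into the lower branch.

```python
def flood_status(water_level_ft):
    """Walk the two-level decision tree from the flood warning example."""
    # Root node: is the water level above 50 ft?
    if water_level_ft > 50:
        return "alert"
    # Second node: is the water level above 30 ft?
    if water_level_ft > 30:
        return "warning"
    # Leaf: assumed to include levels of exactly 30 ft.
    return "normal"
```

For instance, a level of 60 ft gives an alert, 40 ft gives a warning, and 10 ft is in the normal range.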

Data Mining Functionalities

Data mining functionalities are used to specify the kind of patterns to be found in data
mining tasks. Data mining tasks can be classified into two categories: descriptive and
predictive.

Descriptive mining tasks characterize the general properties of the data in the database.

Predictive mining tasks perform inference on the current data in order to make
predictions.

Concept/Class Description: Characterization and Discrimination

Data can be associated with classes or concepts. For example, in an electronics store,
classes of items for sale include computers and printers, and concepts of customers
include bigSpenders and budgetSpenders.

Data characterization

Data characterization is a summarization of the general characteristics or features of a
target class of data.

Data discrimination

Data discrimination is a comparison of the general features of target class data objects
with the general features of objects from one or a set of contrasting classes.
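A small sketch can show the difference between the two tasks: characterization summarizes one class, and discrimination compares that summary with a contrasting class. The customer records, the chosen attributes, and the use of means as the summary are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical customers: (class_label, age, yearly_spend).
CUSTOMERS = [
    ("bigSpenders", 45, 5000),
    ("bigSpenders", 50, 7000),
    ("budgetSpenders", 22, 400),
    ("budgetSpenders", 30, 600),
]


def characterize(rows, target):
    """Summarize the target class by the mean of each numeric attribute."""
    members = [r for r in rows if r[0] == target]
    return {
        "count": len(members),
        "mean_age": mean(r[1] for r in members),
        "mean_spend": mean(r[2] for r in members),
    }


def discriminate(rows, target, contrast):
    """Compare the target class summary with a contrasting class summary."""
    t, c = characterize(rows, target), characterize(rows, contrast)
    return {k: t[k] - c[k] for k in ("mean_age", "mean_spend")}


profile = characterize(CUSTOMERS, "bigSpenders")
gap = discriminate(CUSTOMERS, "bigSpenders", "budgetSpenders")
```

`profile` describes bigSpenders on their own, while `gap` reports how much older and higher-spending they are than budgetSpenders in this sample.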

Mining Frequent Patterns, Associations, and Correlations

Frequent patterns are patterns that occur frequently in data. There are many kinds of
frequent patterns, including itemsets, subsequences, and substructures.

Association analysis

Suppose, as a marketing manager, you would like to determine which items are frequently
purchased together within the same transactions. An example of such a rule is:

buys(X, “computer”) ⇒ buys(X, “software”) [support = 1%, confidence = 50%]

where X is a variable representing a customer. Confidence = 50% means that if a customer
buys a computer, there is a 50% chance that she will buy software as well.

Support = 1% means that 1% of all of the transactions under analysis showed that
computer and software were purchased together.
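Support and confidence can be computed directly from transaction counts. The sketch below uses a hypothetical list of five transactions, so its numbers differ from the 1% and 50% in the rule above; the function names are also illustrative.

```python
# Hypothetical transactions, each a set of purchased items.
TRANSACTIONS = [
    {"computer", "software"},
    {"computer"},
    {"printer"},
    {"computer", "software", "printer"},
    {"software"},
]


def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions with the antecedent that also contain the consequent."""
    both = support(transactions, antecedent | consequent)
    return both / support(transactions, antecedent)


rule_support = support(TRANSACTIONS, {"computer", "software"})
rule_confidence = confidence(TRANSACTIONS, {"computer"}, {"software"})
```

Two of the five transactions contain both items, and two of the three computer purchases include software, which is what the two measures report.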

Classification and Prediction

Classification is the process of finding a model that describes and distinguishes data
classes for the purpose of being able to use the model to predict the class of objects whose
class label is unknown.

“How is the derived model presented?” The derived model may be represented in various
forms, such as classification (IF-THEN) rules, decision trees, mathematical formulae, or
neural networks.

A decision tree is a flow-chart-like tree structure, where each node denotes a test on an
attribute value, each branch represents an outcome of the test, and tree leaves represent
classes or class distributions.

[Figure: Decision tree]

A neural network, when used for classification, is typically a collection of neuron-like
processing units with weighted connections between the units.

[Figure: Neural network]

Cluster Analysis

Unlike classification and prediction, which analyze class-labeled data objects, clustering
analyzes data objects without consulting a known class label.

[Figure: Cluster analysis]

The objects are grouped based on the principle of maximizing the intraclass similarity and
minimizing the interclass similarity. That is, clusters of objects are formed so that objects
within a cluster have high similarity in comparison to one another, but are very dissimilar
to objects in other clusters.

Outlier Analysis

A database may contain data objects that do not comply with the general behavior or
model of the data. These data objects are outliers. Most data mining methods discard
outliers as noise or exceptions. The analysis of outlier data is referred to as outlier mining.
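One simple way to flag objects that do not comply with the general behavior of the data is a z-score test. This is only one of many possible outlier criteria; the readings and the threshold of 2 standard deviations below are assumptions made for the sketch.

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # All values are identical, so nothing stands out.
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical measurements with one value far from the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = z_score_outliers(readings)
```

Here only the reading of 50 is flagged; whether such a value is discarded as noise or studied further is the choice that outlier mining makes explicit.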
