Data Mining using SQL Server Analysis Server

Article by Shamaran Satkunanathan

Software applications are moving from traditional information systems to business intelligence
systems. The growing volume of data, and of the information held within it, has created a need for
new kinds of applications in organizations. To address this need, data mining has become an
integral part of many software solutions.

Data mining is the process of discovering actionable information from large sets of data. It helps
organizations analyze very large amounts of data in order to detect common patterns or learn new
things. It uses mathematical analysis to derive patterns and trends from existing data. Typically
these patterns cannot be discovered by traditional data exploration methods, because the
relationships are too complex or because there is too much data. Before a mining technique is
applied, however, the existing data needs to be organized through an ETL (Extract, Transform, Load)
process.

There are many data mining techniques available to analyze data and derive useful knowledge and
patterns from it. These techniques range from basic to extremely complex, and each serves a
slightly different purpose or goal. Here are a few example approaches to data mining.

Clustering
Cluster detection is a type of pattern recognition used to find groups within large data sets. It
arranges a large amount of information into categories that are not defined in advance but emerge
from the data during analysis.
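
As a rough illustration of how categories can emerge from the data, the sketch below runs a
simplified one-dimensional k-means on a handful of numbers. The function name, the sample ages and
the starting centers are all made up for this example; it is not the clustering algorithm that
Analysis Services ships.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest of the given starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical customer ages; the starting centers are arbitrary guesses.
ages = [22, 25, 27, 61, 64, 66]
centers, clusters = kmeans_1d(ages, centers=[20, 70])
print(clusters)  # two groups: younger and older customers
```

No labels were supplied: the two groups come only from how close the values are to each other.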

Classification
Classification analysis is a systematic process for obtaining important and relevant information
about data and metadata (data about data). It helps identify which of a set of known categories a
data item belongs to. Classification is closely linked to cluster analysis, since both assign data
items to groups; the difference is that classification uses categories that are known in advance.
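
To make the idea of assigning items to known categories concrete, here is a minimal sketch of a
nearest-centroid classifier. The training rows, the feature choice (age and income) and the function
names are invented for illustration; the technique is a deliberately simple stand-in, not one of the
Analysis Services algorithms.

```python
def train_centroids(rows):
    """rows: list of (features, label). Returns label -> mean feature vector."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=dist)


# Hypothetical training data: (age, yearly income in thousands) -> bought a bike?
training = [((25, 30), "no"), ((30, 35), "no"), ((45, 80), "yes"), ((50, 90), "yes")]
centroids = train_centroids(training)
prediction = classify(centroids, (48, 85))
print(prediction)
```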

Regression
Regression is a technique that aims to predict a numeric outcome from a set of existing variables.
It is used to predict, for example, future user engagement, customer retention and property
prices.
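
A minimal sketch of the idea, assuming a single input variable and an ordinary least-squares line.
The floor areas and prices are invented sample data chosen so that the fit is easy to follow.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: floor area in square metres and price in thousands.
areas = [50, 70, 90, 110]
prices = [150, 210, 270, 330]
slope, intercept = fit_line(areas, prices)

# Use the fitted line to predict the price of a property not in the data.
predicted = slope * 100 + intercept
print(predicted)
```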

Anomaly or Outlier Detection

Anomaly detection refers to the search for data items in a dataset that do not match a projected
pattern or expected behaviour. Anomalies are also called outliers, exceptions, surprises or
contaminants and they often provide critical and actionable information.
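
One simple way to flag values that do not match the expected pattern is a z-score test, sketched
below. The threshold of 2 standard deviations and the order counts are assumptions made for this
example; real projects usually need a method suited to their data.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily order counts with one unusual day.
daily_orders = [100, 102, 98, 101, 99, 103, 97, 250]
outliers = find_outliers(daily_orders)
print(outliers)
```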

Association Rule Learning
Association rule learning discovers interesting relations between variables in large databases. It
uncovers hidden patterns by identifying which variables occur together most frequently, for
example items that are often bought in the same transaction.
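
Two standard measures behind association rules are support (how often a set of items occurs) and
confidence (how often the rule holds when its left-hand side occurs). The sketch below computes
both for a few invented shopping baskets; the helper names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction also containing `consequent`.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```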

Microsoft SQL Server Data Tools

There are many data mining tools available for analyzing data and making predictions. Here we look
at Microsoft SQL Server Data Tools and how its features help us build and deploy a mining model.

SQL Server Data Tools supports projects for SQL Server relational databases, Azure SQL databases,
Integration Services packages, Analysis Services data models and Reporting Services reports.
Analysis Services contains data mining algorithms and query tools that make it easy to build a
comprehensive solution for a variety of projects. SQL Server Management Studio contains tools for
browsing models and managing data mining objects.

Developing a mining application generally takes the following steps:

1. Defining the problem
2. Preparing Data
3. Exploring Data
4. Building Models
5. Exploring and Validating Models
6. Deploying and updating Models

The following diagram shows the cyclical flow of creating a data mining model. Each step in the
process may need to be repeated many times in order to create a good model.

Image source: https://i-msdn.sec.s-msft.com/dynimg/IC125015.jpeg

Defining the Problem
The first step is to determine the scope of the problem and analyze the business requirements in
order to define specific goals for the data mining project. Here we might need to conduct a data
availability study.

Preparing Data
In this step, we typically work with a very large data set and cannot examine every transaction
for data quality. We might therefore need data profiling and automated data cleansing and filtering
tools, such as Microsoft SQL Server Master Data Services or SQL Server Data Quality Services, to
explore the data and find inconsistencies.

Exploring Data
In this step, we can use tools such as Master Data Services to canvass the available sources of
data and determine their availability for data mining. We can use tools such as SQL Server Data
Quality Services, or the Data Profiler in Integration Services, to analyze the distribution of data
and repair issues such as incorrect or missing data.
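
To give a feel for what profiling a column involves, the sketch below counts missing values and
tallies the distribution of the remaining ones. It is a hand-rolled illustration with invented
customer rows, not the output or API of the Data Profiler or Data Quality Services.

```python
from collections import Counter


def profile_column(rows, column):
    """Count missing values and tally the distribution of the remaining ones."""
    values = [row.get(column) for row in rows]
    # Treat an absent key, None and the empty string as missing.
    missing = sum(v is None or v == "" for v in values)
    distribution = Counter(v for v in values if v not in (None, ""))
    return {"rows": len(values), "missing": missing, "distribution": distribution}


# Hypothetical customer rows; two of them have no usable region.
customers = [
    {"id": 1, "region": "North"},
    {"id": 2, "region": "South"},
    {"id": 3, "region": None},
    {"id": 4, "region": "North"},
    {"id": 5},
]
report = profile_column(customers, "region")
print(report)
```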

Building Models
In this step, we define the columns of data that we want to use by creating a mining structure.
When we process the mining structure, Analysis Services generates aggregates and other statistical
information that can be used for analysis. This information is available to any mining model that
is based on the structure.

Processing a model is also called training. Training applies a specific mathematical algorithm to
the data in the structure in order to extract patterns. The patterns that we find during training
depend on three things:
• the selection of training data,
• the algorithm we choose,
• how we configure the algorithm.

We can also use parameters to adjust each algorithm and apply filters to the training data to use
just a subset of the data. After data is passed through the model, the mining model object holds
summaries and patterns that can be queried or used for prediction.

Exploring and Validating Models

With Analysis Services we can separate the data into a training dataset and a testing dataset, so
that we can assess the performance of all models on the same data. We use the training dataset to
build the model and the testing dataset to test the accuracy of the model by creating prediction
queries.
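
The holdout idea can be sketched in a few lines: shuffle the rows, set a fraction aside for testing,
build a model on the rest, and measure accuracy only on the rows the model has not seen. The 30%
test fraction, the fixed seed, the toy data and the threshold "model" are all assumptions made for
this example; Analysis Services performs its own partitioning.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and return (training, testing)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def train_threshold(training_rows):
    """Toy model: learn a cut-off halfway between the two classes."""
    lows = [f[0] for f, label in training_rows if label == "low"]
    highs = [f[0] for f, label in training_rows if label == "high"]
    cut = (max(lows) + min(highs)) / 2
    return lambda features: "high" if features[0] >= cut else "low"


def accuracy(predict, test_rows):
    """Fraction of test rows whose predicted label matches the actual label."""
    correct = sum(predict(features) == label for features, label in test_rows)
    return correct / len(test_rows)


# Hypothetical labelled rows: one numeric feature and a "low"/"high" label.
rows = [((x,), "high" if x >= 50 else "low") for x in range(0, 100, 5)]
training, testing = split_rows(rows)
model = train_threshold(training)
score = accuracy(model, testing)
print(len(training), len(testing), score)
```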

We can explore the trends and patterns that the algorithms discover by using the viewers in Data
Mining Designer in SQL Server Data Tools. We can also test how well the models create predictions
by using tools in the designer such as the lift chart and classification matrix. To verify whether the
model is specific to our data or might be used to make inferences on the general population, we can
use the statistical technique called cross-validation to automatically create subsets of the data and
test the model against each subset.
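
The two ideas in this paragraph can be sketched as follows: a k-fold splitter that uses every row
for testing exactly once, and a classification matrix that counts actual versus predicted labels.
Both helpers are hypothetical illustrations of the concepts, not the designer's implementation.

```python
def k_folds(rows, k):
    """Yield (training, testing) pairs; each row is used for testing exactly once."""
    folds = [rows[i::k] for i in range(k)]
    for i, testing in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield training, testing


def classification_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    matrix = {}
    for a, p in zip(actual, predicted):
        matrix[(a, p)] = matrix.get((a, p), 0) + 1
    return matrix


# Ten hypothetical rows split into five folds of two rows each.
splits = list(k_folds(list(range(10)), k=5))

# Hypothetical actual and predicted labels for four test cases.
matrix = classification_matrix(
    actual=["yes", "yes", "no", "no"],
    predicted=["yes", "no", "no", "no"],
)
print(matrix)
```

In the matrix, the pairs where both labels agree are correct predictions; the remaining pairs show
which kind of mistake the model makes.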

Deploying and Updating Models

Here we can use the models to create predictions, which we can then use to make business
decisions. SQL Server provides the DMX (Data Mining Extensions) language for creating prediction
queries, and the Prediction Query Builder to help us build them.
