Data Platform Vision: Hanoi University of Technology
Vu Tuyet Trinh
[email protected]
Hanoi University of Technology
Outline
Microsoft
MS. SQL Server 2008
Data Platform Vision (diagram)
Types of data: XML, time/calendar, file, document, geospatial
Services to interact: search, query, data analysis, reporting, data integration, robust synchronization
• Microsoft Data Platform Vision
Improved Productivity
LINQ
LINQ to SQL
LINQ to Entities
LINQ to DataSet
LINQ to XML
LINQ to Objects
Visual Studio
Provides features such as source code control, tracking, and
deployment tools.
SERVICES
Data Platform services: Analysis Services; Reporting Services; Integration Services
1. Analysis Services
Analysis Services: Build Enterprise-Scale Solutions; Extend Reach with Comprehensive Analytics; Drive Actionable Insight through Familiar Tools
Build Enterprise-Scale Solutions
Build Enterprise-Scale Solutions: High Developer Productivity; Scalable Infrastructure; Superior Performance
High Developer Productivity
Figure 1 shows an alert on the Time dimension and Calendar hierarchy.
Figure 2 shows the current alerts on a design.
SQL Server 2008 Analysis Services further increases developer productivity
with new, enhanced cube, dimension, and attribute designers.
Figure 3
Scalable Infrastructure
Superior Performance
SQL Server provides attribute-based hierarchies that avoid the need
for any duplication and improve performance and scalability.
SQL Server 2008 Analysis Services allows writeback data to be
stored in MOLAP format, resulting in significantly better performance
for query and writeback operations.
Analysis Services prevents users from overloading the relational database by
providing a high-performance, transparent, synchronized aggregate
cache.
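The aggregate-cache idea can be illustrated with a minimal Python sketch. This is not how Analysis Services is implemented: the `AggregateCache` class, the `sales` table, and its columns are hypothetical, and SQLite stands in for the relational source. Repeated aggregate requests are answered from memory, so the relational database is queried once per grouping until the cache is invalidated.

```python
import sqlite3


class AggregateCache:
    """Hypothetical in-memory cache of SUM aggregates over one table."""

    def __init__(self, conn):
        self.conn = conn
        self._cache = {}      # group-by column -> {group value: total}
        self.db_queries = 0   # counts round trips to the relational source

    def totals_by(self, column):
        # Only a fixed set of column names is accepted, so the f-string
        # below cannot be used to inject SQL.
        if column not in ("region", "product"):
            raise ValueError(f"unsupported grouping column: {column}")
        if column not in self._cache:
            self.db_queries += 1
            rows = self.conn.execute(
                f"SELECT {column}, SUM(amount) FROM sales GROUP BY {column}"
            ).fetchall()
            self._cache[column] = dict(rows)
        return self._cache[column]

    def invalidate(self):
        # Call after the source data changes to keep the cache synchronized.
        self._cache.clear()


def build_demo_connection():
    """Create a small in-memory table of made-up sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [("North", "Bike", 100.0), ("North", "Helmet", 20.0), ("South", "Bike", 50.0)],
    )
    return conn
```

Calling `totals_by("region")` twice on the same cache touches the database only once; `invalidate()` forces the next call to re-read the source.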
Extend Solutions with Comprehensive Analytics
Analysis Services offers more advanced features than those traditionally
associated with OLAP, so organizations can meet multiple analytical needs
within one solution. The Unified Dimensional Model (UDM) plays a central
role in this effort, providing extensive analytical capabilities.
Extend Reach with Comprehensive Analytics: Unified Dimensional Model; Central Manageability of Key Enterprise Metrics; Predictive Analysis
Unified Dimensional Model
The UDM was a new concept for Analysis Services that was introduced with
the release of SQL Server 2005. The UDM provides an intermediate logical
layer between the physical relational database used as the data source and
the proprietary cube and dimension structures that are used to resolve user
queries.
In this way, you can think of the UDM as the centerpiece of the OLAP
solution.
Central Manageability of Key Enterprise Metrics
Predictive Analysis
Microsoft SQL Server Data Mining Add-Ins for Office 2007:
The Data Mining Add-Ins for Office 2007 empower end users to perform
advanced analysis directly in Microsoft Excel and Microsoft Visio.
There are three individual components:
Data Mining Client for Excel enables you to create and manage an entire
Analysis Services data mining project from within Excel 2007.
Table Analysis Tools for Excel enables you to use the powerful Analysis
Services data mining capabilities to analyze data stored in Excel spreadsheets.
Data Mining Templates for Visio enables you to render decision trees,
regression trees, cluster diagrams, and dependency nets in Visio diagrams.
Drive Actionable Insight through Familiar Tools
Drive Actionable Insight through Familiar Tools: Optimized Office Interoperability; Rich Partner Extensibility; Open Embeddable Architecture
Office tools shown: MS Office Excel, MS Office Word, MS Office Visio, MS Office SharePoint, MS Office PerformancePoint
2. Reporting Services
Microsoft SQL Server 2008 Reporting Services provides a complete server-based
platform that is designed to support a wide variety of reporting needs,
including managed enterprise reporting, ad hoc reporting, embedded
reporting, and web-based reporting, so that organizations can deliver
relevant information where needed across the entire enterprise.
Reporting Services 2008 provides the tools and features necessary to
author a variety of richly formatted reports from a wide range of data
sources and provides a comprehensive set of familiar tools used to manage
and secure an enterprise reporting solution.
Reporting Services: Authoring Reports; Managing Reporting Services; Delivering Reports
Authoring Reports
Authoring Reports: Using Report Development Tools; Accessing Data Sources for Report Creation; Creating Compelling Reports; Charts; Tablix; Interactive Features
Managing Reporting Services
Managing Reporting Services: Extending Management Capabilities; Configuring a Reporting Services Instance; MS Office SharePoint Services Integration; Securing Reporting Services
Delivering Reports
Delivering Reports: High Performance Report Processing; Caching; Snapshots; Multiple File Formats; Delivering Reports through Subscriptions; Embedding Reports into Business Applications
3. Integration Services
SQL Server 2008 Integration Services (SSIS) helps Information
Technology departments to meet data integration requirements in
their enterprises.
SQL Server 2008 Integration Services meets the challenges of
cleansing, transforming, and mapping multiple data sources with
large volumes into a useful format.
New features improve its ability to scale up and improve
performance while speeding development and lowering the total cost
of ownership (TCO).
Integration Services: Technology Challenges; Organizational Challenges; Economic Challenges; SSIS Architecture; Integration Scenarios
Technology Challenges
Pipeline architecture
ADO.NET connectivity
Thread pooling
Persistent lookups
Integration Scenarios: SSIS for data transfer operations; SSIS for data warehouse loading; SSIS and Data Quality; Application of SSIS Beyond Traditional ETL; SSIS, the Integration Platform
SSIS for data transfer operations
SSIS for data warehouse loading
SQL Server 2008 includes support for Change Data Capture (CDC).
SSIS can consume data from (and load data into) a variety of
sources including managed (ADO.NET), OLE DB, ODBC, flat file,
Microsoft Office Excel®, and XML by using a specialized set of
components called adapters.
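The adapter idea can be sketched in a few lines of Python: each source-specific component converts its input into a common row format, so the loading step does not need to know where the rows came from. This is an illustration only, not the SSIS API; the function names `flat_file_adapter`, `xml_adapter`, and `load`, and the `<row>` element name, are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def flat_file_adapter(text):
    """Yield one dict per row of a comma-separated flat file with a header line."""
    yield from csv.DictReader(io.StringIO(text))


def xml_adapter(text, row_tag="row"):
    """Yield one dict per <row> element, using child tags as column names."""
    root = ET.fromstring(text)
    for element in root.iter(row_tag):
        yield {child.tag: child.text for child in element}


def load(rows, destination):
    """Append rows from any adapter to a destination list; return the row count."""
    count = 0
    for row in rows:
        destination.append(dict(row))
        count += 1
    return count
```

Because both adapters yield plain dictionaries, `load` can consume either one, and a new source type only needs a new adapter function.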
Figure 3 shows an example of such a flow.
Figure 4 shows a page from the SCD Wizard.
Figure 5 shows the data flow that is generated by this Wizard.
SSIS and Data Quality
In addition to integrating data, a key feature of SSIS is its ability to integrate
different technologies to manipulate that data. This has allowed SSIS
to include innovative “fuzzy logic”-based data cleansing components.
SSIS integrates deeply with the data mining functionality in Analysis Services.
Data mining abstracts the patterns in a dataset and encapsulates them in a
mining model.
Support for complex data routing in SSIS helps you not only to identify
anomalous data, but also to automatically correct it and replace it with better
values. This enables “closed loop” cleansing scenarios.
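A closed-loop cleansing step can be sketched in Python as follows. This is a simplified stand-in, not the SSIS fuzzy components: it uses the standard library's `difflib` similarity ratio in place of fuzzy logic, and the `cleanse` function, its `cutoff` parameter, and the sample city names are assumptions made for illustration. Values are routed to one of three outputs: already clean, automatically corrected, or rejected for review.

```python
import difflib


def cleanse(values, reference, cutoff=0.8):
    """Split values into (clean, corrected, rejected) using approximate matching.

    `reference` is the list of known-good values; `cutoff` is the minimum
    difflib similarity ratio (between 0 and 1) accepted as a correction.
    """
    clean, corrected, rejected = [], [], []
    for value in values:
        if value in reference:
            clean.append(value)
            continue
        matches = difflib.get_close_matches(value, reference, n=1, cutoff=cutoff)
        if matches:
            # Closed loop: pair the anomalous value with its replacement.
            corrected.append((value, matches[0]))
        else:
            rejected.append(value)
    return clean, corrected, rejected
```

For example, with the reference list `["Hanoi", "Da Nang", "Hue"]`, the input `"Hanoii"` is paired with `"Hanoi"` as a correction, while `"Tokyo"` has no close match and is rejected.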
Figure 6 shows an example of a closed loop cleansing data flow.
Application of SSIS Beyond Traditional ETL
Application of SSIS Beyond Traditional ETL: Service Oriented Architecture; Data and text mining; On-demand data source
On-demand data source
Figure 7 shows an SSIS package that sources data from RSS feeds over the
Internet and integrates it with data from a Web service.
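The same pattern can be sketched in Python, assuming an RSS 2.0 document that has already been downloaded as text. This is not the package from the figure: `rss_items` and `enrich` are hypothetical names, and the `lookup` callable is only a stand-in for a Web service call.

```python
import xml.etree.ElementTree as ET


def rss_items(rss_text):
    """Return one dict per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def enrich(items, lookup):
    """Add a 'category' column by calling `lookup`, a stand-in for a Web service."""
    return [dict(item, category=lookup(item["title"])) for item in items]
```

The resulting list of dictionaries plays the role of the integrated row set that a downstream consumer, such as a report, would read.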
Figure 8 shows the use of the SSIS package as a data source in the Report Wizard.
SSIS, the Integration Platform
SSIS goes beyond being an ETL tool not only in terms of enabling
nontraditional scenarios, but also in being a true platform for data
integration. SSIS is part of the SQL Server Business Intelligence
(BI) platform that enables the development of end-to-end BI
applications.
SSIS, the Integration Platform: Integrated development platform; Programmability; Scripting
Integrated development platform
Figure 9 shows a BI Development Studio solution that consists of Integration, Analysis, and Reporting projects.
Figure 10 shows an example of geographic data visualized using a scatter plot and a text grid.
Figure 11 shows an example of a script that checks for the existence of an Office Excel file.
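The script in the figure is not reproduced here, and SSIS Script Tasks are written in .NET languages rather than Python. As a rough illustration of the same check, a minimal Python sketch might look like this; the function name `excel_file_exists` and the accepted extensions are assumptions.

```python
from pathlib import Path


def excel_file_exists(path):
    """Return True if `path` points to an existing .xls or .xlsx file."""
    candidate = Path(path)
    return candidate.is_file() and candidate.suffix.lower() in {".xls", ".xlsx"}
```

A package could use the result of such a check to decide whether to run or skip the data flow that reads the workbook.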
Making Data Integration Approachable