7 - Informatica Data Cleanse
Chris Phillips
Agenda
1. Product Platform
2. Data Quality Positioning
3. Informatica Data Quality
4. Data Quality Assistant
5. Informatica Data Explorer
Access
Any system in batch or real-time
Discover
Search and profile any data from any source
Cleanse
Validate, correct and standardize all data types
Integrate
Transform and reconcile all data types
Deliver
Provide right data, at the right time, in the right format
PowerExchange
PowerCenter
Data Quality Profiling
Data Enrichment and Cleansing
Data Standardisation
Data Matching and Deduplication / Consolidation
Data Quality Monitoring
"business leadership needs to take responsibility for identification of data quality issues, establishing minimum acceptable levels for data quality and facilitating data quality improvement initiatives" (Gartner, 2006)
All Master Data Types: The same infrastructure can be deployed to support customer and product data
Data Quality Metrics and Reports: Easy access to metrics to identify, categorize and quantify low quality data
Enterprise Data Quality Deployment: Scalable infrastructure for high performance installations (reusable for DI and DQ)
Business Imperatives
IT Initiative
Regulatory Reporting
Systems Consolidation
Data Standardization
Profile
Cleanse
Enrich
Match
Scorecard
Business Case and Benefit Statement - Improved Sales and Customer Service
Business Imperatives
Improve Sales and Customer Service
Global Operational Efficiency
Regulatory Compliance
Benefit Statement
DB
Data Integration
Source Reconciliation
Fuzzy Matching
Scorecarding
Cleansing
Enrichment
Data Sources
Workbench
Consumer
Portals, Dashboards, and Reports
Packaged Applications
Rules
Repository
Reference Data
Scorecards
Global
Local
Relational and Flat Files
Enhance
Conformity
Consistency
Accuracy
Duplicates
Integrity
Data Exploration
Relationship
Redundancy
Completeness
Conformity
Consistency
Accuracy
Duplication
Integrity
Range
Data Quality
Duplication: Fuzzy matching
Completeness: Missing key values
Conformity: Incorrect format
Range: Identify outliers
Integrity: Relationship identification
Accuracy: Using reference data to validate
COMPLETENESS
CONFORMITY
CONSISTENCY
DUPLICATION
INTEGRITY
ACCURACY
RANGE
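The checks named for these dimensions can be illustrated with a minimal sketch in plain Python. This is not Informatica Data Quality code: the sample records, field names, postcode pattern, reference country list, age bounds and the 0.85 similarity threshold are all assumptions chosen for illustration, and the fuzzy match uses the standard library's `difflib` rather than any product matching algorithm.

```python
import re
from difflib import SequenceMatcher

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "name": "John Smith", "postcode": "SW1A 1AA", "country": "UK", "age": 34},
    {"id": 2, "name": "Jon Smith", "postcode": "SW1A 1AA", "country": "UK", "age": 34},
    {"id": 3, "name": "Ann Lee", "postcode": "12345", "country": "XX", "age": 212},
    {"id": None, "name": "Raj Patel", "postcode": "M1 1AE", "country": "UK", "age": 51},
]

# Assumed reference data and format rule (simplified UK-style postcode).
REFERENCE_COUNTRIES = {"UK", "US", "DE"}
POSTCODE_PATTERN = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? \d[A-Z]{2}$")


def completeness_issues(rows, key="id"):
    """Completeness: row indexes where the key value is missing."""
    return [i for i, r in enumerate(rows) if r.get(key) is None]


def conformity_issues(rows):
    """Conformity: row indexes whose postcode has an incorrect format."""
    return [i for i, r in enumerate(rows) if not POSTCODE_PATTERN.match(r["postcode"])]


def accuracy_issues(rows):
    """Accuracy: row indexes whose country is not in the reference data."""
    return [i for i, r in enumerate(rows) if r["country"] not in REFERENCE_COUNTRIES]


def range_issues(rows, low=0, high=120):
    """Range: row indexes whose age falls outside the assumed bounds."""
    return [i for i, r in enumerate(rows) if not low <= r["age"] <= high]


def duplicate_pairs(rows, threshold=0.85):
    """Duplication: index pairs whose names are similar above the threshold."""
    pairs = []
    for a in range(len(rows)):
        for b in range(a + 1, len(rows)):
            score = SequenceMatcher(
                None, rows[a]["name"].lower(), rows[b]["name"].lower()
            ).ratio()
            if score >= threshold:
                pairs.append((a, b))
    return pairs
```

Each function returns the positions of the offending rows, so the results could feed a per-dimension count on a scorecard. The pairwise comparison in `duplicate_pairs` is quadratic and is only suitable for small samples.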
Before
Deployment
Target application database
Business
IT
Data Integration
PowerCenter
Report / Monitor
Data Sources
Point of Entry Data Quality - delivered via PowerCenter Web Services
Interactive Consolidation
Select final record
Select attributes from other records to populate final record
Audit Trail
Status of records stored here
Updated for changed/final master records
Merged for associate records to final
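The consolidation and audit trail steps above can be sketched in a few lines of Python. This is an illustration only, not the Data Quality Assistant's implementation: the `consolidate` function, the record layout and the audit entry fields are hypothetical, while the "Updated" and "Merged" status values follow the slide.

```python
from datetime import datetime, timezone


def consolidate(records, master_id, overrides):
    """Build a final record from a cluster of duplicate records.

    records   -- list of dicts, each with a unique "id" (assumed layout)
    master_id -- id of the record selected as the final record
    overrides -- maps attribute name -> id of the record supplying its value
    """
    by_id = {r["id"]: r for r in records}

    # Start from a copy of the selected final record.
    final = dict(by_id[master_id])

    # Populate chosen attributes from the other records.
    for attr, source_id in overrides.items():
        final[attr] = by_id[source_id][attr]

    # Audit trail: the final master is "Updated", associates are "Merged".
    stamp = datetime.now(timezone.utc).isoformat()
    audit = [
        {
            "record_id": r["id"],
            "status": "Updated" if r["id"] == master_id else "Merged",
            "final_id": master_id,
            "at": stamp,
        }
        for r in records
    ]
    return final, audit


# Hypothetical cluster of two matching customer records.
cluster = [
    {"id": "A", "name": "J. Smith", "phone": None},
    {"id": "B", "name": "John Smith", "phone": "555-0100"},
]
final, audit = consolidate(cluster, "A", {"name": "B", "phone": "B"})
```

Here record A survives as the final record, takes its name and phone from record B, and the audit list records which records were updated or merged into it.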
IDQ Rules
Data Fixing
Managing Exceptions
Interactive Consolidation
Audit Trail
Consumer
Data Quality Reporting
Packaged Applications
Repository
Data Integration
Metadata Management
Targets
CRM
Flat Files
BI
DB/2
Target Systems
ERP CRM
M&A
BI
SCM
DW
Investigate and profile data in all source systems to assess actual state of data and identify issues
View results & drill downs
Connect to data sources & step through analyses
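As a rough illustration of what column profiling measures, the sketch below computes a few basic statistics per column in plain Python. It is not Informatica Data Explorer output or API: the function names, the chosen statistics and the sample rows are assumptions, and the min/max calculation assumes the non-empty values in a column are mutually comparable.

```python
from collections import Counter


def profile_column(values):
    """Summarise one column: row count, nulls, distinct values, extremes."""
    # Treat None and empty strings as missing (an assumption of this sketch).
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


def profile_table(rows):
    """Profile every column that appears in a list of dict rows."""
    columns = sorted({key for row in rows for key in row})
    return {c: profile_column([row.get(c) for row in rows]) for c in columns}


# Hypothetical source rows.
rows = [
    {"city": "Leeds", "tier": "A"},
    {"city": "", "tier": "B"},
    {"city": "Leeds", "tier": "A"},
]
report = profile_table(rows)
```

A high null count or an unexpectedly low distinct count in `report` is the kind of finding that would then be drilled into at the row level.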
Analyst Statements
"a core component to creating master data is the ability to first perform data quality profiling and then apply standardization, matching, merging and enrichment logic" (Rob Karel, Forrester, Mar. 2007)
Questions?