SAP Data Architecture - NEW
Data Architecture
Rahul Padgaonkar
October, 2021
Unit Objectives
Data
Architecture
SAP Data Architecture is not concerned with data modeling, database design or physical data
storage - this is covered in Solution Architecture
© 2020 SAP SE or an SAP affiliate company. All rights reserved. ǀ INTERNAL 5
Contextualize
Data Architecture Objectives
Objectives
Provide a comprehensive set of artifacts with relevant attributes of all data and information entities that exist or are
planned in the business environment.
Based on the gaps between the current and future state data and information entities, help to develop a roadmap
outlining how the major additions, changes and retirements to data and information will be achieved
Benefits
Having a clear and transparent plan for all data and information that is driven by and aligned to the business and IT
strategies.
Ensuring that all data and information can be defined, managed and governed, with clear ownership by the
business.
Realizing cost savings through the consolidation and retirement of redundant data and information, reducing
unnecessary integration, and ensuring a "single version of the truth" for data and information across the business.
Improved SAP Architecture efficiency, delivering higher value to SAP Solution design and implementation
Improved SAP Solution Portfolio Management process
Enable Solution Architects to better design the data model change
Role: Description
Business Stakeholders (Business Unit Representative): Define the rules and ownership of the organization's data
Data Stakeholders (Data Steward, Data Owner): Play key roles in the governance of data, i.e. safeguard consistency and coherence of data in the organization
Legend: = Many connections
Key Requirements
Data Architecture
[Figure: overview of architecture artifacts. Recoverable labels: Strategy Map, Balanced Scorecard, Capability Map, Process Map, Business Footprint Diagram, Organization Diagram, Metadata Catalog, Data Process Matrix, Data Distribution Diagram, Application Catalog, Application Architecture Diagram, SW Comp Deployment Model, Instance Strategy, System Landscape Model, Interface Catalog]
Data Architecture core artifacts: Metadata Catalog, Data Process Matrix, Data Distribution Diagram
A Meta Data Catalog (or Repository) provides data that describes any aspect of an enterprise’s information
assets and enables the organization to use and manage these assets.
It is a repository of Data Object and Field definitions and various business system related characteristics. This includes:
Business rules and process relevance, data activities (data quality, etc.), people and organizations involved, locations of
data such as master & secondary systems, access controls, limitations (security, SOX, etc.), timing and events and the
data lifecycle components (create => maintain => delete).
It provides an easy-to-read and current view of “Data Reality” from Business and IT viewpoints. It bridges the borders
between Business and IT, allows impact analysis of all new data requirements, identifies gaps regarding the target data
architecture and defines corresponding roadmaps.
It helps to clearly understand how and where enterprise data entities are created, stored, transported, and reported.
It provides a holistic view of the entire enterprise environment that is useful not only for a specific business function
but across any type of business.
– Key (for sorting)
– LoB
– Data Object
– Data Object Variant
– Data Type
– Business Definition
– Process Relevance (according to Process Map - Level 1)
– Process Relevance (according to Process Map - Level 2)
– Data Ownership - Business Owner (LoB Level)
– Data Ownership - Process Owner
– Rules for Creation
– Rules for Updates
– Business Rules for Archiving and Deletion
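The attribute list above could be captured as a typed record. The following is a minimal sketch, not an SAP data model: the class name `CatalogEntry`, the field names, the helper `missing_governance_fields` and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogEntry:
    """One row of a Meta Data Catalog (hypothetical field names)."""
    key: str                      # Key (for sorting)
    lob: str                      # Line of Business
    data_object: str
    data_object_variant: str
    data_type: str
    business_definition: str
    process_relevance_l1: List[str] = field(default_factory=list)
    process_relevance_l2: List[str] = field(default_factory=list)
    business_owner: str = ""      # Data Ownership - Business Owner (LoB level)
    process_owner: str = ""       # Data Ownership - Process Owner
    rules_for_creation: str = ""
    rules_for_updates: str = ""
    rules_for_archiving_and_deletion: str = ""


def missing_governance_fields(entry):
    """Return the ownership and rule attributes that are still empty."""
    required = [
        "business_owner",
        "process_owner",
        "rules_for_creation",
        "rules_for_updates",
        "rules_for_archiving_and_deletion",
    ]
    return [name for name in required if not getattr(entry, name).strip()]


# Illustrative sample entry; all values are made up.
customer = CatalogEntry(
    key="0010",
    lob="Sales",
    data_object="Customer",
    data_object_variant="B2B Customer",
    data_type="Master data",
    business_definition="A legal entity that buys goods or services.",
    process_relevance_l1=["Order to Cash"],
    business_owner="Head of Sales",
)
print(missing_governance_fields(customer))
```

A check like `missing_governance_fields` is one way to track which entries still lack clear ownership while the catalog is being filled.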
The creation of a Meta Data Catalog is primarily based on discussions with Data Architecture Stakeholders,
such as the Business Unit Representative, the Business Process Owner or the Data Steward. It usually follows
4 phases:
Phase A: Catalog Definition (Based on discussion results with all Data Architecture Stakeholders)
– Agree on structure and level of details for initial version, i.e. gathering of stakeholder requirements
– Define Meta Data Catalog Usage scenarios
– Define roadmap for Meta Data Catalog content delivery and evolution
Phase B: Deliver Baseline Catalog Version
– In this version the High Priority (Prio 1) Data Objects plus Business and IT Attributes are named, and the catalog starts to be filled with content
Phase C: Deliver Catalog Version 2 – extend to all data objects
– Extend the Baseline Version with Data Objects having specifically High Priority 1 Business and IT Attributes. These
Priority Attributes have been identified based on the Phase B exercise.
Phase D: Deliver Final Version – non priority objects to complete the overall version
– Include non priority Objects to complete the overall Version having all Data Objects and all Attributes captured in the
Meta Data Catalog
The Data Process Matrix development is also basically a mapping exercise, done together with subject matter
experts such as the Business Process Owner, the Business Architect, the Business Unit Representative, etc.
The key inputs are the Process Map (as discussed in Unit 05) and the Meta Data Catalog.
– Together with the Business Stakeholders mentioned above, the processes are analyzed and compared with the Objects in
the Meta Data Catalog. A “hit” between a process and a Meta Data Object is then plotted in the matrix (as shown on the
Example slide).
– During the cross-reference activities a clear structure should be established for:
▫ Master (Data) Process – the processes which trigger the creation of new Master Data entities, based on the Meta
Data Objects (e.g. the process “create new customer”).
▫ Process Hierarchy – a structure of processes outlining where Master Data is used and consumed as a reference for the
data in the organization.
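The mapping exercise described above can be sketched as a small cross-reference structure. This is an illustration under assumptions: the usage codes "C" (process creates the object, i.e. a master data process) and "U" (process uses or consumes it) and the function names are hypothetical, not part of the SAP method.

```python
from collections import defaultdict


def build_data_process_matrix(hits):
    """Build {process: {data_object: usage}} from (process, data_object, usage) hits.

    usage is an assumed code: "C" = creates the object, "U" = uses/consumes it.
    """
    matrix = defaultdict(dict)
    for process, data_object, usage in hits:
        matrix[process][data_object] = usage
    return {process: dict(row) for process, row in matrix.items()}


def master_processes(matrix, data_object):
    """Processes that trigger the creation of the given data object."""
    return sorted(p for p, row in matrix.items() if row.get(data_object) == "C")


# Illustrative hits between Process Map entries and Meta Data Objects.
hits = [
    ("Create new customer", "Customer", "C"),
    ("Create sales order", "Customer", "U"),
    ("Create sales order", "Material", "U"),
]
matrix = build_data_process_matrix(hits)
print(master_processes(matrix, "Customer"))
print(master_processes(matrix, "Material"))
```

An empty result, as for "Material" here, flags a data object for which no creating process has been identified yet.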
Provides a clear definition of which application components in the landscape will serve as the system
of record or reference for enterprise data
Core Artifact – Data Distribution Diagram
Approach for creating the Core Artifact
The Data Distribution Diagram is basically a mapping exercise, done together with subject matter experts such
as the Application Architect, the Technology Architect, Solution Architects, Database Administrators, etc.
The key Inputs: Application Architecture Diagram (as discussed in Unit 06), the Software Component Deployment Model
(as discussed in Unit 08) and the Meta Data Catalog.
– Based on discussions and analysis, the result will be “putting the color on the map”, i.e. the Application Architecture
Diagram plus the related Software Component Deployment Model form the map in which the Meta Data Objects are
colored in (as shown on the Example slide).
– While coloring the plot:
▫ Provide a clear definition of which application components in the diagram will serve as the system of record or the
system of reference for the Meta Data Objects, i.e. where the single source of truth for master data is, based on meta
data.
▫ Establish clear (naming) conventions for Applications / Systems (together with the Architects), Replication Technology,
Replication Frequency, Create Read Update Delete definitions, etc., as these will affect the data life cycle.
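The "coloring" step could be recorded as one placement per data object and application, which then allows a simple consistency check. This is a minimal sketch; the `Placement` fields, the role values "record" / "reference" and the application names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One colored cell: a data object placed on an application component."""
    data_object: str
    application: str        # hypothetical application / system name
    role: str               # assumed values: "record" or "reference"
    crud: str               # subset of "CRUD" allowed in this application
    replication: str = ""   # free text, e.g. technology and frequency


def system_of_record_issues(placements):
    """Return {data_object: count} for objects without exactly one system of record."""
    counts = {}
    for p in placements:
        counts.setdefault(p.data_object, 0)
        if p.role == "record":
            counts[p.data_object] += 1
    return {obj: n for obj, n in counts.items() if n != 1}


placements = [
    Placement("Customer", "ERP", "record", "CRUD"),
    Placement("Customer", "CRM", "reference", "R", "nightly replication"),
    Placement("Material", "ERP", "reference", "R"),
    Placement("Material", "PLM", "reference", "R"),
]
print(system_of_record_issues(placements))
```

Here "Material" is reported with a count of 0 because no application has been named as its single source of truth.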
1. Data Strategy is the overarching practice to consider all domains from holistic perspective
ensuring an alignment between Business and IT strategy.
2. Data Governance provides direction and oversight for data management by establishing a
system of decision rights over data that accounts for the needs of the enterprise.
3. Data Architecture defines the blueprint for managing data assets by aligning with
organizational strategy to establish strategic data requirements and designs to meet these
requirements.
4. Data Modeling and Design is the process of discovering, analyzing, representing, and
communicating data requirements in a precise form called the data model.
5. Data Storage and Operations includes the design, implementation, and support of stored data
to maximize its value. Operations provide support throughout the data lifecycle from planning
for to disposal of data.
6. Data Security ensures that data privacy and confidentiality are maintained, that data is not
breached, and that data is accessed appropriately.
7. Data Integration and Interoperability includes processes related to the movement and
consolidation of data within and between data stores, applications, and organizations.
8. Document and Content Management includes planning, implementation, and control activities
used to manage the lifecycle of data and information found in a range of unstructured media,
especially documents needed to support legal and regulatory compliance requirements.
9. Reference and Master Data includes ongoing reconciliation and maintenance of core critical
shared data to enable consistent use across systems of the most accurate, timely, and relevant
version of truth about essential business entities.
10. Data Warehousing and Business Intelligence includes the planning, implementation, and
control processes to manage decision support data and to enable knowledge workers to get
value from data via analysis and reporting.
11. Metadata includes planning, implementation, and control activities to enable access to high
quality, integrated Metadata, including definitions, models, data flows, and other information
critical to understanding data and the systems through which it is created, maintained, and
accessed.
12. Data Quality includes the planning and implementation of quality management techniques to
measure, assess, and improve the fitness of data for use within an organization.
[Figure: data management domain wheel, original version and modified version. Domain labels: Data Strategy, Data Governance, Data Architecture, Data Modeling & Design, Data Storage & Operations incl. Retention, Data Security incl. Privacy, Data Integration & Interoperability, Document & Content Management, Reference & Master Data, Data Warehousing & Business Intelligence, Metadata, Data Quality. Product labels: C/4H, SAC, S/4H]
SAP Data Management Domains and related products *derived from DMBOK2
Potentially not all SAP Products are shown
[Figure labels: Business Processes, Data Integration, Operationalization, Data Insights, Data Processing]
1 Forrester Consulting, 2019 Hybrid Data Management Drives Innovation and Growth
2 NewVantage Partners’ 2019 Big Data and AI Executive Survey
Why is Data Management so difficult?
[Figure: data management landscape. Recoverable labels: Data cataloging, Data Quality, Data ingestion, Data masking, Data replication, Data cleansing, Data profiling, Meta Data management, ETL, ELT, Machine Learning, Streaming Analytics, Event Stream processing, Video Processing, Image Processing, Graph processing, Time series, Speech Recognition, Text analytics, Geospatial Processing, SAP Applications, Non-SAP Applications, Cloud Data Lakes, 3rd party databases, 3rd party Data Warehouses, Semi-structured & unstructured data]
Modern Data Management needs Data Orchestration, beyond data integration
[Figure: integration maturity chart. Axis labels: Data Diversity, Data Value]
Application Integration: MFT, EAI, ESB, SOA, B2B, BPM/BRM…
Analytics Integration: ETL, EDW, BI, MIS, Data Marts…
Data Orchestration (HTAP, Big Data, Cloud etc.): Streaming, IoT, ML, Big Data, Advanced Analytics, Insight 2 Action…
Further labels: Multi-model Engines, Integration, Tiered Data Storage
Find more details in the appendix
SAP Data Intelligence – Core Capabilities
There is Governed Data and Ungoverned Data, and both require Data Management. For Governed Data,
governance is the enforcement of Data Policy on the Create, Maintain, Archive and Distribution processes of
specific data domains and data objects.
In this section, we’ll provide an overview of Data Management relevant to SAP S/4HANA and connected SAP Products within
a System Landscape. The important point is to architect data quality within an SAP landscape.
Organization
Explicit ownership for data objects and processes. Assigned responsibility for data objects and for
processes. Assigned responsibility for Data Quality. Defined communication and decision channels
involving Business and IT
SimpleMDG is a partner-developed solution by Laidon Group for cloud-only usage and is available via the SAP Store. It is
positioned for budget-constrained master data governance initiatives with low to medium governance complexity.
SimpleMDG runs exclusively on SAP BTP.
Both solutions include standard SAP Data Domain/Object Models and integration capability via SAP Cloud Integrator
(including Cloud Platform Integration Suite). UI is exclusively Fiori based. SAP MDG additionally utilizes SAP Data
Replication Framework (DRF) for replication to both SAP and non-SAP business applications.
These are both Application Level Solutions and do not support direct database to database connectivity.
LoB customer LoB procurement LoB finance LoB production Other LoB
All typical approaches are supported: central governance with distribution, decentralized ownership with consolidation, data quality monitoring with remediation
All systems always in synch
Real-life customer example: SAP MDG managing prospect and customer data between cloud and on premise systems
Master data creation can happen in on-demand systems, on-premise
systems, or in SAP MDG on SAP S/4HANA
De-central creation triggers a process in SAP MDG on SAP S/4HANA
After approval, enriched high-quality master data is replicated to all
relevant cloud and on-premise systems
SAP MDG and SimpleMDG provide adaptable integration
[Figure labels: Opportunities, Customers, Prospects, Cloud, constraints, Fulfillment …]
Data type hierarchy (each level relates 1 to 1..* with the next):
– Global data type (SAP). Example: BankAccountContractID
– Core data type (CCTS). Example: Identifier
– Primitive data type (XSD). Examples: float, string, token, and binary
SAP developed a global data type catalog (GDT) as part of the SOA-Enablement of the SAP Business Suite and other SAP products. Most
extensive usage is in SAP Banking products.
GDTs are SAP-wide defined and reconciled data types with business related content as they occur in standards.
The GDTs are used to define the web service interfaces which expose functionality of SAP applications.
Characteristics
– Standard (ISO 15000-5 and UN/CEFACT CCTS)
– Defined in SAP Enterprise Service Repository (ESR)
– Semantic building blocks for interfaces (reuse)
The complete GDT catalog is a 16,840-page PDF that is publicly available on this SAP Community wiki site:
https://wiki.scn.sap.com/wiki/display/GDT/SAP+Global+Data+Type+Catalog+-+pdf+version
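The three levels (primitive, core, global) can be pictured as a derivation chain. The sketch below is illustrative only: the slide lists BankAccountContractID, Identifier and token as examples of each level, and it is an assumption here that they derive from one another in this order. The `DataType` class and `derivation_chain` helper are hypothetical and not part of the ESR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """A data type at one of three assumed levels: primitive, core, global."""
    name: str
    level: str                          # "primitive" (XSD), "core" (CCTS), "global" (SAP GDT)
    base: Optional["DataType"] = None   # the type this one is built on, if any


def derivation_chain(dt):
    """Walk from a data type down to its primitive base."""
    chain = []
    while dt is not None:
        chain.append(f"{dt.name} ({dt.level})")
        dt = dt.base
    return chain


# Assumed derivation for illustration: GDT -> core data type -> primitive type.
token = DataType("token", "primitive")
identifier = DataType("Identifier", "core", base=token)
bank_account_contract_id = DataType("BankAccountContractID", "global", base=identifier)

print(derivation_chain(bank_account_contract_id))
```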
Data-Driven Architecture
with SAP Data Intelligence
Why is Data Management so difficult?
Enterprise IT is challenged to a whole new degree!
– Create powerful data pipelines for data integration and to orchestrate the data processing
– Manage metadata across a diverse data landscape and create a data catalog and business glossary
– Harness machine learning to discover hidden insights as part of your data pipelines
Access & connect data · Govern & discover data · Prepare & label data · Build scalable & flexible data pipelines · Operationalize machine learning · Monitor & scale
[Figure labels: Orchestration, SAP Data Services, SAP Information Steward, SAP Data Intelligence, Common Connections, Support Wide Span of Data, Data Lake]
[Figure: employee matching example, before and after SAP Data Intelligence]
Before: SAP HR data with manual point-to-point integrations and weekly data updates
– Inputs: Employee Personal Data & Job Position Data, Employee Seniority Data, Best Fit Areas & Knowledge Data, Employee Knowledge Scoring Data
– Point-to-point data extract; data extracts to Excel files
– Result: Excel sheet with Best Fitting Employees, refreshed weekly
After: real time, with SAP Data Intelligence pipelines
– Inputs: Employee Personal Data & Job Position Data, Employee Seniority Data, Best Fit Areas & Knowledge Data, Employee Knowledge Scoring Data, Distance Data (employee – job posting location)
– Matching by Job Posting Code; Score Scaling according to Location
– Result: Best Fitting Employee List related to input job posting
SAP Data Intelligence – Deployment Options
[Figure: deployment options. Recoverable labels: SAP Business Technology Platform, Gardener, Kubernetes, Hyperscaler 1 Infrastructure, Hyperscaler 2 Infrastructure, Customer Network 1, Customer Network 2, VPN, SAP Cloud Connector]
Business Processes
5. Adapt: Adaptation / enrichment in local systems
SAP Master Data Governance, Consolidation
Process flow
Data Load → Initial Check → Standardize → Match → Calculate Best Record → Validate → Activate
Data Load
– Open to SAP HANA smart data integration, non-HANA based SAP ETL mechanisms, non-SAP ETL options, or data import from file
Initial Check
– View loaded data and check data quality based on backend customizing
Standardize
– Validate and enrich address data
– Possibility to connect to 3rd party tools for standardization and enrichment
– Usage of BRF+ for standardization and enrichment
Match
– Find duplicates based on customer-specific matching rules
– Review match result
Calculate Best Record
– Create “Best Records” based on approved match groups
– BRF+ can be used for customer-specific Best Record Calculation
– Review Best Record Calculation result
Validate
– Validate best records against backend customizing to verify whether records can be activated
– Validate against central governance checks (BAdI, BRF+)
Activate
– Provide consolidated master data for analytical or operational use
– Option to activate directly, or indirectly triggering post processing using central governance
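The process flow above can be illustrated with a toy pipeline. This is a sketch under stated assumptions and not SAP MDG's implementation or API: the function names, the matching key (standardized name plus city), the "first non-empty value wins" best record rule and the sample records are all hypothetical, and the Initial Check step is omitted for brevity.

```python
from collections import defaultdict


def standardize(record):
    """Normalize whitespace and casing so that equal values compare equal."""
    out = dict(record)
    out["name"] = " ".join(record["name"].split()).upper()
    out["city"] = record.get("city", "").strip().title()
    return out


def match(records):
    """Group potential duplicates; the key (name, city) is an assumed matching rule."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["name"], r["city"])].append(r)
    return list(groups.values())


def best_record(group):
    """Assumed rule: for each field, take the first non-empty value in the group."""
    best = {}
    for r in group:
        for key, value in r.items():
            if value and not best.get(key):
                best[key] = value
    return best


def validate(record):
    """Assumed check: a record can be activated only if name and city are filled."""
    return all(record.get(f) for f in ("name", "city"))


def consolidate(loaded):
    """Standardize -> match -> calculate best record -> validate -> return activatable records."""
    standardized = [standardize(r) for r in loaded]
    candidates = [best_record(g) for g in match(standardized)]
    return [r for r in candidates if validate(r)]


loaded = [
    {"source": "CRM", "name": "acme  corp", "city": "berlin", "phone": ""},
    {"source": "ERP", "name": "ACME Corp", "city": "Berlin ", "phone": "+49 30 1234"},
    {"source": "ERP", "name": "Globex", "city": "", "phone": ""},
]
result = consolidate(loaded)
print(result)
```

In this sample the two ACME records collapse into one best record that carries the phone number from the second source, while the Globex record fails validation and is not activated.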
Requirements are defined based on your company’s business processes
Priorities are set according to value, impact, and quality evolution
► Experts collaborate to define needed quality level and required checks
► The system helps to identify additional meaningful rules

Ensure quality at point of entry
Consider all entry-points: single changes, mass changes, load scenarios, in daily business, projects, …
► Rule-based checks in all processes of SAP MDG

Operational motivation: detect issues before processes fail
Tactical: ensure progress and performance of current activities
Strategic: enable achievements, define new initiatives
► Easy to consume monitoring and trend reporting

Correct data and drive the correction process
Fix data entry processes
Evolve the definition of quality
► Tools to fix data and to improve checks at point of entry
In a nutshell …
Business user application to define master data quality rules, striving for consistent usage across all points of
entry and enabling data quality monitoring and remediation.
Business value
Define Quality Enter Quality Monitor Quality Improve Quality
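The idea of defining quality rules once and reusing them at the point of entry and for monitoring can be sketched as follows. This is a minimal illustration, not the SAP MDG rule framework: the rule names, the `check` and `monitor` helpers and the sample records are hypothetical.

```python
# Define quality: each rule is a named predicate over a record (assumed rules).
RULES = {
    "country_is_iso2": lambda r: len(r.get("country", "")) == 2 and r["country"].isupper(),
    "postal_code_present": lambda r: bool(r.get("postal_code", "").strip()),
}


def check(record, rules=RULES):
    """Enter quality: return the names of the rules a single record violates."""
    return [name for name, rule in rules.items() if not rule(record)]


def monitor(records, rules=RULES):
    """Monitor quality: count violations per rule across existing records."""
    failures = {name: 0 for name in rules}
    for record in records:
        for name in check(record, rules):
            failures[name] += 1
    return failures


records = [
    {"id": 1, "country": "DE", "postal_code": "10115"},
    {"id": 2, "country": "Germany", "postal_code": ""},
]
print(check(records[0]))
print(monitor(records))
```

Using the same rule set in `check` and `monitor` reflects the goal of consistent usage across all points of entry; the violation counts give a starting point for remediation.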
In a nutshell …
Highly effective mass change process enabling master data stewards to perform bulk changes in business partner,
customer, supplier, and product data.
Business value
Highly effective data processing option for master data specialists
Efficiently edit individual fields or make bulk changes by using a tabular UI with
the ability to filter and sort data
Out-of-the box delivery of proven data models for operational and financial master data
Financials Material Supplier & Customer Enterprise Asset Mgmt. ** Retail & Fashion Mgmt. ***
Asset
Linear
Cost Element* / Units of Measure Bank Details Identification Textile Components
Accounting ,Controlling and Consolidation
Capabilities
Maintenance Plans
Work Management
P&L Statement Classification Business Partner Relationships
& Procurement
Maintenance Items Purchase Info Record
Cost Center / Document Link
Hierarchies Supplier Attributes Customer Attributes Purchasing Org / Vendor / Site
Measuring Points
Profit Center / Sales Data General Data General Data Stores Distribution Cntrs.
Task Lists
Bill of Material
Purchasing Data Tax Indicators Additionals Article Hierarchy
Validity Data, Material Data, Texts
Consolidation: Production Version Substitution Additional Texts
MRO
Item, Group & Contract Account Attributes Item: Detail, Quantity, Status, Document Management System
Hierarchies, Unit, MRP and Purchasing Data,
Value
Valuation, Costing,
Break Down Category Material Ledger General Data Partner-specific Data Text and Document Assignment Segmentation Seasons
* User interface integration between GL Account and Cost Element only supported in SAP MDG on SAP S/4HANA
** SAP MDG, enterprise asset mgmt. extension by Utopia
*** SAP MDG, retail & fashion mgmt. extension by Utopia
Cloud based master data sharing in context of Enterprise master data mgmt.
Apply Master Data capabilities along the business use cases
[Figure: master data capabilities along business use cases. Labels recoverable from the extraction (grouping partly uncertain): SAP MDG, Cloud apps, Legacy SAP, Legacy nonSAP, Acquired business ERP, Digital Core, SCP, IOT, SAP, Non-SAP; risk labels: Lost innovation, Multi-year projects, Business disruption, Data loss, High risk of failure, Slow value delivery]
[Figure: Enterprise Data Hub. Labels recoverable from the extraction: SAP MDG, SAP BW on HANA, SAP BPC, SAP Applications, Non-SAP Applications, Service Layer, Micro-Services Layer, oData Services, SAP Integration, Raw Data Store, SAP HANA (platform), Instant visibility for the business, Synergy realization support]
Next generation transformation using Data-In Hub
How it works
One central set of transformation rules for both analytics and data migration
End-to-end coverage of the transformation process with one integrated tool set for
[Figure labels: BW/4 HANA, S/4 HANA, Data Lake, Security, MDG, Excel, DiH On Prem, DiH Cloud]
2015-2019: On premise/private cloud; Static; Limited
2020+: Private or public cloud; Elastic; Unlimited
TECHNOLOGIES
– SAP Analytics Cloud: Analytics technology (business intelligence (BI), planning, and predictive analytics) in a single solution
– SAP Data Warehouse Cloud: End-to-end data warehouse in the cloud that combines data management processes with advanced analytics
– SAP HANA Cloud: Next-gen database Platform as-a-Service with full capabilities to manage OLTP, OLAP and HTAP workloads
3. Data-In Hub is a highly flexible concept that adapts to the very diverse
requirements and existing tools in each customer’s landscape
www.sap.com/contactsap