Data Model Onboarding Slide 20230329
Internal
[Open]
Content
[Process flow diagram. Steps shown: Requirement Analysis; Create Conceptual Data Model; Validate Conceptual Data Model & Logical Data Model; Create Logical Data Model; Create Physical Data Model; QAQC Review; Completed Data Template; Data Modelling Approval; Maintain Versioning Logical Model; Maintain Versioning Physical Model; Create Change Document; Validate Physical Data Model; Technical Review & Approval; Raise Object Request Form (ORF) & Approval; Generate Script (DDL); Create Data Model in Development; Harvest technical metadata; Upload business metadata; Stitch business and technical metadata; Stitch Data Lineage]
© 2021 Petroliam Nasional Berhad (PETRONAS) |
During business requirement gathering, the project team may gather the information below, which provides better context when
developing data models.
Business Requirement Gathering
1. Business Process & Requirement (Contextual)
• Define scope
• Define business objective
Facts:
• Business Function & Process
• Data Requirement
• Business Operation Location
• People/Organization Structure
• Business event/cycles
2. Business Model (Conceptual)
• Develop Data Entity / Business Function Matrix
• Develop Business Model (Conceptual Data Diagram)
Data Warehouse:
• Develop BUS Matrix
• Develop Fact Qualifier Matrix
System Migration:
• Data Migration Diagram
Requirement Document
Serve as a blueprint for the business to better understand what to expect out of the project.
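As an illustration of the "Develop BUS Matrix" step, the sketch below builds a simple grid of business processes against the dimensions they use, which is the usual shape of a bus matrix in data warehouse design. The process and dimension names are hypothetical examples, and the function names `build_bus_matrix` and `shared_dimensions` are assumptions made for this sketch, not part of any EDH template.

```python
from typing import Dict, List, Set


def build_bus_matrix(process_dims: Dict[str, Set[str]]) -> List[List[str]]:
    """Return a grid: one row per business process, one column per dimension."""
    dimensions = sorted({d for dims in process_dims.values() for d in dims})
    grid = [["Business process"] + dimensions]
    for process in sorted(process_dims):
        used = process_dims[process]
        # Mark "X" where the process uses the dimension, blank otherwise.
        grid.append([process] + ["X" if d in used else "" for d in dimensions])
    return grid


def shared_dimensions(process_dims: Dict[str, Set[str]]) -> List[str]:
    """Dimensions used by more than one process (candidates for conforming)."""
    counts: Dict[str, int] = {}
    for dims in process_dims.values():
        for d in dims:
            counts[d] = counts.get(d, 0) + 1
    return sorted(d for d, n in counts.items() if n > 1)


# Hypothetical example input, not taken from any real project.
EXAMPLE = {
    "Production Reporting": {"Date", "Facility", "Product"},
    "Maintenance Work Order": {"Date", "Facility", "Equipment"},
}

if __name__ == "__main__":
    for row in build_bus_matrix(EXAMPLE):
        print(" | ".join(cell or "-" for cell in row))
```

Dimensions that appear in more than one row are the ones worth agreeing on early, since several fact tables will share them.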
The technical solution review is a platform where the project solution design (e.g. data model design) is reviewed.
Objective
• Review the project solution designs (incl. Technology, Data, Reporting, Infrastructure, Security, Third Party or Local Market specific designs)
• Provide EDH platform architecture, solution, global and local data related decisions and design recommendations
• Escalate to the Data Architecture Review Board as appropriate
Chairman & Members
• Head Data Architecture, Enterprise Data
• Head Data Design Development, ED
• Head Master Data Management, ED
• Enterprise Architecture/ EDH Core Rep
• Other reps (if required)
Refer to this link to get the latest Naming Convention for EDH
Topic: Letters (Data Lake Folders: Raw & Process)
General Rule: It is best to use capital letters, except where the format of a file name MUST NOT be all capital letters.
Correct: 1. AZ_ADLS_RW_SAPECC 2. AZ_ADLS_PR_SAPECC 3. AZ_ADLS_ER_FIN
Incorrect: 1. az_adls_rw_sapecc 2. az_adls_pr_sapecc 3. az_adls_er_fin
Explanation: The file name will be identifiable by using capital letters to differentiate between the words.

Topic: Letters (Data Model Object: Enrich, Dwarehouse, Tenant, Mart)
General Rule: For data model objects use PascalCase (uppercase for the beginning of each word, with no spaces in between). The data model objects that must adhere to PascalCase are as follows:
i. Table/ Entity
ii. Attribute/ Column
iii. View
iv. Index
Correct: ElectricalComponent
Incorrect: 1. electricalComponent 2. Electrical_Component 3. ELECTRICAL_COMPONENT*
Explanation: To standardize the name format of the data model object.
*For Physical Data Model, if the database does not allow mixed case, use upper case with hyphen in between words.

Topic: For custom table/ attributes of standard models
General Rule: For a data model that uses standards (e.g. IDM, CHIFOS etc.) as reference, prefix P_ to indicate that the table or attribute is custom.
Correct: P_ElectricalNetwork
Incorrect: ElectricalNetwork
Explanation: To be able to identify that the data model object is custom.
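The naming rules above can be checked mechanically. The following is a minimal sketch, assuming a strict reading of PascalCase (each word starts with one capital followed by lowercase letters or digits, so acronyms written in all capitals would be rejected) and assuming data lake folder names are capital letters and digits separated by underscores, as in the examples. The function names are hypothetical and not part of any EDH tooling.

```python
import re

# Strict PascalCase: one or more words, each a capital followed by lowercase/digits.
PASCAL_CASE = re.compile(r"(?:[A-Z][a-z0-9]+)+")
# Folder names as in the examples: capital letters/digits separated by underscores.
UPPER_WITH_UNDERSCORES = re.compile(r"[A-Z0-9]+(?:_[A-Z0-9]+)*")


def is_valid_folder_name(name: str) -> bool:
    """Check a Raw/Process data lake folder name, e.g. AZ_ADLS_RW_SAPECC."""
    return UPPER_WITH_UNDERSCORES.fullmatch(name) is not None


def is_valid_model_object(name: str, custom: bool = False) -> bool:
    """Check a table, attribute, view or index name.

    When custom=True the name must carry the P_ prefix that marks a custom
    object added to a standard model.
    """
    if custom:
        if not name.startswith("P_"):
            return False
        name = name[len("P_"):]
    return PASCAL_CASE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["ElectricalComponent", "electricalComponent",
                      "Electrical_Component", "ELECTRICAL_COMPONENT"]:
        print(candidate, is_valid_model_object(candidate))
```

A check like this could run against a draft data dictionary before the model goes to review; the upper-case fallback for databases that do not allow mixed case is deliberately left out of the sketch.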
• Should show entities and relationships between entities
• Use modeling notations to denote type and description of relationship
• TIPS: Indicator of good model: more than one entity, at least 1 master data and 1 transaction

• Fully attributed
• Primary key and foreign key identified
• Ensure unit of measure is included (for required numerical attribute)
• Resolve many-to-many relationships
• Denote the use of supertype/ subtype/ recursive if used and tag the entity type using color coding standard

• Designate a primary key and enforce constraint
• Select data type and length that matches the data, e.g:
Standard:
• Time data to be formatted to UTC 0
Not allowed:
• FLOAT is not allowed for decimal in EDH
Recommendation:
• Numerical data use SMALLINT, BIGINT, INTEGER, DECIMAL
• Character data use CHAR or VARCHAR
• Date/ Timestamp data use DATE, DATETIME or TIMESTAMP
* To accommodate Protegrity constraint
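The data type guidance above lends itself to a simple automated check. The sketch below classifies a declared column type against the recommended and disallowed lists and converts a timezone-aware timestamp to UTC. The function names and the three result labels ("recommended", "not allowed", "review") are assumptions made for illustration, not an EDH standard.

```python
from datetime import datetime, timedelta, timezone

# Types named in the recommendation list.
RECOMMENDED_TYPES = {
    "SMALLINT", "BIGINT", "INTEGER", "DECIMAL",
    "CHAR", "VARCHAR",
    "DATE", "DATETIME", "TIMESTAMP",
}
# FLOAT is stated as not allowed for decimal values.
DISALLOWED_TYPES = {"FLOAT"}


def base_type(declared: str) -> str:
    """Strip length/precision, e.g. 'decimal(18, 2)' -> 'DECIMAL'."""
    return declared.split("(", 1)[0].strip().upper()


def check_column_type(declared: str) -> str:
    """Return 'not allowed', 'recommended', or 'review' for anything else."""
    name = base_type(declared)
    if name in DISALLOWED_TYPES:
        return "not allowed"
    if name in RECOMMENDED_TYPES:
        return "recommended"
    return "review"


def to_utc(value: datetime) -> datetime:
    """Convert a timezone-aware datetime to UTC; naive values are rejected."""
    if value.tzinfo is None:
        raise ValueError("timestamp must carry a timezone before conversion")
    return value.astimezone(timezone.utc)


if __name__ == "__main__":
    print(check_column_type("DECIMAL(18, 2)"))
    malaysia = timezone(timedelta(hours=8))  # Malaysia time is UTC+8
    print(to_utc(datetime(2023, 3, 29, 8, 0, tzinfo=malaysia)))
```

Rejecting naive timestamps is a design choice in this sketch: guessing the source timezone silently is a common cause of shifted time data.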
Physical (2/2)
• Append versioning columns to table
• Adhere to the physical data model standards & guideline
Decision Tree for Industry domain data model adoption
(Business tenant has the option to not adopt industry standards; however, the tenant may use this decision tree as reference)
[Decision tree diagram, starting from PROJECT INPUT. Decision points shown, each with Yes/No branches:]
1. Is the project requirement to serve information to Open Subsurface Data Universe (OSDU) Data Platform?
2. Do the project's requirements, business process and data type match considerably with the table below?
(Note: refer to appendix for more comprehensive list of data type)
Data Type:
• Upstream data (snippets from full list in appendix): Additives, Applications, Areas, Contracts/ Agreements, Entitlement, Production, Rate Schedule
• Tags and Equipment
• Capital Facility Data Entities
• Master Document Register
3. Is the data type a master/ reference data?
4. Is the data a normalized transaction data?
Templates
Refer to this link for the latest template:
https://petronas.sharepoint.com/:f:/r/teams/ts_GD_enterprisedatadataar/Shared%20Documents/Data%20Modeling/2.%20Document%20Templates?csf=1&web=1&e=ZbFWGc
Templates and attachments:
• Source to Target Mapping (STM)/ NoSQL Design Document (attachment: Data Model STM)
• Object Request Form (ORF) (attachment: Object Request Form)
• Data Model Detailed Design Document (attachment: Detail Design Document)
• Technical Review (TR) Presentation (attachment: TRB Presentation)
• As data modelers in the project, we will work mostly with the Data Onboarding and ETL teams, and occasionally with the API, master
data, architecture and metadata teams.
POD of EDH projects
fulltime members
All projects will undergo this end-to-end enablement process flow upon finalizing the critical data elements,
prior to ingesting into EDH.
Data Demand Planning
Key Tasks:
1. Identify business pain points
2. Establish business roadmap
3. Data pre-discovery
4. Data prioritisation
5. EDH data check
6. Raise Service Request (SR)
7. Prepare Solution Proposal (SP)
Output:
1. Identified Critical Data Elements (CDE)
2. Data enablement strategy
3. Approved Service Request (SR)
4. Approved Solution Proposal (SP)
Key Roles:
1. Business Steward
2. Business Data Analysts
3. ED Data Delivery
4. ED Data Architecture
5. ED Metadata

Data Discovery
Key Tasks:
1. Project & resource planning
2. Establish project pod
3. Identify data types & attributes
4. Data classification
5. Catalogue business & technical metadata
6. Define business DQ rules
Output:
1. Project Management Committee (PMC) endorsement
2. Sprint planning
3. Approved data dictionary
Key Roles:
1. Business Data Analysts
2. ED Data Delivery
3. Business Subject Matter Experts

Data Model
Key Tasks:
1. Design conceptual, logical & physical data model
2. Prepare Source-to-Target Mapping (STM) document
3. Prepare for Technical Review Session (TRS)
Output:
1. Conceptual model
2. Physical model
3. Approved TRS
4. Source-to-Target Mapping document
5. Approved Object Request Form (ORF)
Key Roles:
1. Data Modeler
2. ED Data Delivery
3. Business Data Analysts
4. ED Metadata

ETL
Key Tasks:
1. Integration with source
2. Design & build data pipeline
3. Ingest data
4. DQ profiling
5. Develop DQ dashboard
6. Perform User Acceptance Test (UAT)
Output:
1. Platform provision
2. ETL pipeline
4. DQ profiling/score
5. Data tokenization (for Secret & Confidential data)
Key Roles:
1. Platform Engineer
2. Data Engineer
3. DQ Engineer
4. Data Modeler

Data Serving
Key Tasks:
1. Design & create view for Data+ Self Serve Platform
2. PBI-Query to EDH Datawarehouse
3. Develop API Connection to Front End Application or Digital Solution
Output:
1. Dataset published
2. API URL
3. API Support
Key Roles:
1. EDH Core Support
2. API Engineer
3. Data Modeler
4. ED Data Analyst

Handover & Go-Live
Key Tasks:
Complete handover documents and processes, release to Production, and Run & Maintain monitor
Output:
1. Handover documents
2. Briefing to EDH Core
3. Change requests
4. Incident logs
Key Roles:
1. EDH Core Support
2. Data Analyst
Business Requirement Gathering
Solution Architect/ Technical Lead: Advise and steer project team for technical matters.
Support other team members (metadata, API etc.) and support testing when required.
• Note: For the Data Model team, all data modelers meet every week, usually on Friday, to share project status
and discuss any data model or relevant data architecture management matters.
Data Architecture Management
Head: Haris Syukri
Metadata Management | Data Modeling
Head: TBD | Head: Chong Yee Onn
1. Working hours:
• Follow Petronas working hours: 8 working hours (between 8/9AM Malaysia time – 5/6PM Malaysia time) + 1
hour break.
• Follow Malaysia Public Holidays timetable
3. For other administrative issues (e.g. PETRONAS ID, access to the general PETRONAS network via VPN, security exception
for personal device, contract-related matters etc.), please contact:
• Emil ‘Aizat Elis (Data Platform Services)
• Wong Kim Chuan (Head, Data Platform Services)
4. To request access to an EDH environment (e.g. EDH ADLS, EDH S3, EDH Redshift, EDH Synapse etc., depending on project
requirement):
• Step 1: Request an a-ID via ICT2U. Follow the instructions attached below.
• Step 2: Send the request for access to the EDH environment via email to Sahrilnizam Ismail (Head, Data Platform
Services), cc Farah Hanum M Ariffin, and provide the a-ID.
Next step
Appendix: Decision Tree for Industry domain data model adoption
PPDM
Legend: Anchor Tables
Appendix: Decision Tree for Industry domain data model adoption
CHIFOS
Legend: Anchor Tables
Data Model: CHIFOS
Business Requirement/ Process: designing, operating, maintaining or decommissioning industrial facilities
Data Domain/ Business Data Type:
Tags and Equipment
Capital Facility Data Entities, e.g:
• Geographical Surface (Site and Area);
• Function (Plant - Process Unit - Tag);
• Plant Breakdown Structure:
• Commissioning Unit - Commissioning System - Tag;
• Maintenance Unit - Maintenance System - Tag and Equipment;
• Construction Assembly - Tag;
• Process Unit - Equipment;
• Tag Physical Connection;
• Corrosion (Corrosion Loop Type - Corrosion Loop);
• Master Data (Company, Purchase Order, Property, and Property Pick List);
Facility related Master Document Register
Example of documents:
• Project Plans and Procedures;
• Engineering deliverables, including those supplied by Supplier/Manufacturers;
• Engineering documents;
• Engineering drawings;
• Procurement related documents;
• Construction related documents;
• Commissioning related documents;
• QA and QC related documents;
• Any other documents required by Principal.

Data Model: OSDU
Data Domain/ Business Data Type: Subsurface data and wells
Business Requirement/ Process: application integration between our domain applications and the OSDU Data Platform.
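The appendix rows above can be read as a lookup from business data type to candidate industry data model. The sketch below encodes only the CHIFOS and OSDU rows listed here; the dictionary and the function name `suggest_standard` are hypothetical, and data types not listed return `None` rather than a guess.

```python
from typing import Optional

# Data domain / business data type -> candidate industry data model.
# Only the CHIFOS and OSDU rows of the appendix are encoded here.
STANDARD_BY_DATA_TYPE = {
    "tags and equipment": "CHIFOS",
    "capital facility data entities": "CHIFOS",
    "facility related master document register": "CHIFOS",
    "subsurface data and wells": "OSDU",
}


def suggest_standard(data_type: str) -> Optional[str]:
    """Return the candidate data model for a data type, or None if unlisted."""
    return STANDARD_BY_DATA_TYPE.get(data_type.strip().lower())


if __name__ == "__main__":
    for data_type in ["Tags and Equipment", "Subsurface data and wells", "Payroll"]:
        print(data_type, "->", suggest_standard(data_type))
```

A `None` result means the decision tree still has to be walked by hand for that data type.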
Appendix: Decision Tree for Industry domain data model adoption
IDM
Legend: Anchor Tables
Note: If IDM is identified as the Data Model standard to be used, IDM would be enhanced if the data domain/ business data type are not available
PETRONAS
Passionate about Progress