
How To Read A Data Model

The document provides an overview of how to read and understand a data model. It defines a data model as a graphical representation of data elements and their relationships that describes data for an information system. The document outlines different types of data models, including entity-relationship diagrams and logical vs physical models. It explains that a data model provides a standardized way to communicate how data is structured and related to various stakeholders.

Uploaded by

Sanjay Sharma
Copyright
© Attribution Non-Commercial (BY-NC)

How to read a data model?

By: Sanjay Sharma, Consulting Enterprise and Data Architect

Goal
To develop basic literacy about data models.
To understand what a data model contains.
To understand how the information in it can be used more effectively.
We will not touch upon the technicalities of developing a data model.

Session Structure
What is a data model, its need and context
Different types of data models
Semantics of data models
How to read a data model
How to use data models more effectively
Questions and answers

Why Model?
John Boyd (1927-1997), military strategist and thinker, often described as the most original military thinker since Sun Tzu (c. 6th century BC).
OODA loop (Observe, Orient, Decide, Act): every organization or organism uses an OODA loop to adapt to its surroundings and survive.

Why model?
Observation is information gathering.
Orientation is developing a mental framework of information by understanding its structure and relationships.
Models are observation as well as orientation tools; they use symbols to represent real-world facts.
Models are effective because the human mind absorbs more information visually than textually.
Models in business and IT: enterprise models, business process models, workflow models, interaction models, network models etc.

Why model data?


Data is a distinct component of an information system; the other component is application logic.
It needs to be described so that it is clearly and precisely communicated to all stakeholders: information analysts, application developers, data analysts, database administrators etc.
Every data element must have a defined business purpose.
A data model is an unambiguous and precise description of data, its structure and relationships, agreed upon by all stakeholders.

What is a data model?


At first glance it is a sheet of paper with coloured rectangles joined by a tangled web of crow's-feet lines.
For a given information system, it is a graphical representation of the data elements, their relationships, and the constraints governing the data.
[Figure: excerpt of a physical ER diagram. The entities InReportHospital, InReportOther, HospitalConsultationReport, HospitalOtherReport and HospitalImagingReport are drawn as boxes that list each column with its data type and nullability, for example "inReportID: int NOT NULL (FK)" and "findings: text NULL".]

What is the context?

Types of data models


Data can be described from different perspectives: Object-Role Models, Entity-Relationship Diagrams (ERDs), Data Flow Diagrams (DFDs), UML Class Diagrams etc.
Entity-Relationship (ER) diagrams are the most popular for data modeling because they can easily be converted into relational database designs.
[Figure: ER diagram of medical report entities. Entities: InReportHospital, HospitalConsultationReport, HospitalOtherReport, InReportPsychTest, HospitalImagingReport, GlasgowComaScale, InReportAmb, NeuroPsychTest. Every entity carries inReportID as a foreign key, and each attribute is listed with its data type and nullability, for example "admissionDate: datetime NULL".]


Types of ERD - domain model


Domain Model (Subject Area Model): a very high-level (10,000 feet) conceptual model showing the major entities and their relationships in a business or problem domain.
Only entities are shown.


Scope of domain models


Business Domain Models (Business Subject Area Models): very high level, covering the entire business.
Application Domain Models (Application Subject Area Models): covering a single application or package.


Types of ERD - logical models


Logical Models: show the entities and their logical relationships for a given information system.
[Figure: logical model built around CLAIM FILE. Entities: TOTAL LOSS REQUEST RECORD, CLAIM FILE, CLAIM FILE ESTIMATE GROUP, ESTIMATE DAIS CHUNKS, VEHICLE REPAIR LOG, ESTIMATE PRINT IMAGE LINE. Attributes carry business names such as "ICBC Claim Number", "LossDate" and "Vehicle Repair Log Car In Date"; each of the other entities carries "Claim File Id (FK)" pointing back to CLAIM FILE.]


Types of ERD-physical models


Physical Models: show the physical implementation of the logical model at the data storage level.
They contain the columns that implement relationships and support fast data access.
Most tools can create schema scripts from physical models.
[Figure: physical model with the tables CLAIM_FILE (columns prefixed CLF_), AUTOSOURCE_REQUEST (ASR_), AS_REQ_CONDITION (ASRC_) and VEHICLE_REPAIR_LOG (VRL_). Each column is listed with its data type and nullability, for example "CLF_ID: DECIMAL(15,0) NOT NULL", "ASR_CLF_ID: DECIMAL(15,0) NOT NULL (FK)", "VRL_SEC_ID: SMALLINT NOT NULL" and "VRL_CAR_OUT_DTE: DATE NULL".]
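
The point that tools can generate schema scripts from a physical model can be illustrated with a minimal Python sketch. It assumes the model is available as a list of (column, type, nullable) tuples. The function name create_table_sql and the choice of primary key columns are hypothetical, since the slide does not say which columns form the key; the three columns themselves are taken from the VEHICLE_REPAIR_LOG table in the figure.

```python
def create_table_sql(table, columns, primary_key):
    """Render a CREATE TABLE statement from a simple physical-model description.

    columns: list of (name, sql_type, nullable) tuples.
    primary_key: list of column names (an assumption here, not given in the slide).
    """
    lines = []
    for name, sql_type, nullable in columns:
        null_clause = "NULL" if nullable else "NOT NULL"
        lines.append(f"    {name} {sql_type} {null_clause}")
    lines.append(f"    PRIMARY KEY ({', '.join(primary_key)})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


# Three columns copied from the VEHICLE_REPAIR_LOG box in the physical model.
vehicle_repair_log = [
    ("VRL_CLF_ID", "DECIMAL(15,0)", False),
    ("VRL_SEC_ID", "SMALLINT", False),
    ("VRL_CUST_CNTCT_DTE", "DATE", True),
]

ddl = create_table_sql(
    "VEHICLE_REPAIR_LOG",
    vehicle_repair_log,
    primary_key=["VRL_CLF_ID", "VRL_SEC_ID"],  # hypothetical key
)
print(ddl)
```

Real modeling tools also emit indexes, foreign key constraints and vendor-specific storage options; this sketch covers only column definitions.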


Semantics of data models


Data models use graphical notations and text strings called verb phrases.
The semantics of the notations depend on the modeling technique followed and the tool being used.


Entities
A thing of significance to the business, about which data has to be stored and manipulated.
Entities are nouns representing objects, events, concepts, relationships or actions.
In data models they are represented as rectangles.
Examples: Insurance Policy, Claim, Vehicle, Event etc.


Entity sub-types
Some entities have sub-types.
PERSON and ORGANIZATION are sub-types of the PARTY entity.
FULL TIME EMPLOYEE and CONTRACT EMPLOYEE are sub-types of the EMPLOYEE entity.
Sub-types are depicted either as contained in the main entity or as children of the main entity.
[Figure: Party with sub-types Person and Organization; Employee with sub-types Full Time and Contract.]

Attributes
The properties of entities for which data has to be collected and stored.
Attributes are represented as text strings contained inside the entities in data models.
Examples: policy holder's name, event date, claim amount etc.


Relationships
Relationships represent how entities interact with each other: how one creates, uses, modifies or deletes another.
They are represented by different types of lines going from one entity to another.


Cardinality of relationship
The cardinality of a relationship is the number of instances of the entities at the two ends of the relationship.
It takes one of three values: zero, one or many.
It may be shown as a circle (zero), a vertical line (one) or a crow's foot (many) at the end of the relationship line, or as some other symbol.
Sometimes it is written as 0, 1 or n on the relationship line.
[Figure: Policy 1 ---- 0..n Claim; a second example relates Product and Line Item.]

Optionality of relationships
The optionality of a relationship states whether an entity may be present or must be present in the relationship.
It may be shown by drawing part of the relationship line solid and part broken (or in some other way).
[Figure: Policy joined to Claim by a line that is solid at the Policy end and broken at the Claim end.]
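
Cardinality and optionality together bound how many child instances each parent instance may have: optionality gives the minimum (0 for "may", 1 for "must") and cardinality gives the maximum (one or many). A minimal Python sketch of checking sample data against such a rule follows. The function name and the sample policy and claim identifiers are hypothetical.

```python
from collections import Counter


def violations(parent_ids, child_parent_refs, minimum, maximum):
    """Return the parent ids whose number of children breaks the rule.

    minimum: 0 when the relationship is optional, 1 when it is mandatory.
    maximum: 1 for "one", None for "many" (no upper bound).
    """
    counts = Counter(child_parent_refs)
    bad = []
    for parent_id in parent_ids:
        n = counts[parent_id]
        if n < minimum or (maximum is not None and n > maximum):
            bad.append(parent_id)
    return bad


policies = ["P1", "P2"]
claim_policy_refs = ["P1", "P1"]  # two claims, both against policy P1

# "A POLICY may have zero, one or many CLAIM": every policy passes.
print(violations(policies, claim_policy_refs, minimum=0, maximum=None))
# "A POLICY must have one or more CLAIM": P2 has no claim and is reported.
print(violations(policies, claim_policy_refs, minimum=1, maximum=None))
```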


Self Referencing Relationships


Verb phrases
Verb phrases describe the relationship between two entities, read from one entity to the other in both directions.
[Figure: ORGANIZATION employs EMPLOYEE; EMPLOYEE works for ORGANIZATION. POLICY HOLDER makes CLAIM; CLAIM is paid to POLICY HOLDER.]


Keys
Keys are for navigating through data (information retrieval).
Primary keys: a primary key is a group of one or more attributes that uniquely identifies an entity instance. Every entity has exactly one primary key.
Foreign keys: a foreign key lets you navigate from one entity to the attributes of another. FK attributes implement relationships; they are owned by the parent entity and migrate into the child entity.
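
A minimal sketch of navigating by keys, using Python's built-in sqlite3 module. The table and column names (policy, claim, policy_number, claim_id) are hypothetical, chosen to echo the Policy and Claim examples in these slides. The foreign key in claim is what lets a query travel from a claim to its policy holder, and it also stops a claim from pointing at a policy that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE policy (
    policy_number TEXT PRIMARY KEY,
    holder_name   TEXT NOT NULL
);
CREATE TABLE claim (
    claim_id      INTEGER PRIMARY KEY,
    policy_number TEXT NOT NULL REFERENCES policy(policy_number),
    claim_amount  REAL
);
""")
conn.execute("INSERT INTO policy VALUES ('P-100', 'A. Holder')")
conn.execute("INSERT INTO claim VALUES (1, 'P-100', 2500.0)")


def holder_for_claim(claim_id):
    """Follow the foreign key from a claim to its parent policy."""
    row = conn.execute(
        "SELECT p.holder_name FROM claim c "
        "JOIN policy p ON p.policy_number = c.policy_number "
        "WHERE c.claim_id = ?",
        (claim_id,),
    ).fetchone()
    return row[0] if row else None


def orphan_claim_rejected():
    """Try to store a claim whose policy does not exist."""
    try:
        conn.execute("INSERT INTO claim VALUES (2, 'P-999', 10.0)")
    except sqlite3.IntegrityError:
        return True
    return False


print(holder_for_claim(1))
print(orphan_claim_rejected())
```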


Relationships- identifying vs. nonidentifying


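In an identifying relationship the parent entity is needed to identify the child entity: the parent's primary key forms part of the child's primary key.
In a non-identifying relationship the child is identified on its own, and the parent's key appears in it only as an ordinary foreign key attribute.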


Domains
A domain is a named set of data values, all of the same data type, from which the actual value of an attribute instance is drawn.
Every attribute must be defined on exactly one underlying domain.
Multiple attributes may be based on the same underlying domain.
Examples of domains:
Gender: M, F
Province: Varchar(2); BC, AB, ON, NF, QC, MN, SC, YU
Short Description: Varchar(40)
Long Description: Varchar(2000)
Unique Identifier: Integer(9)
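
One way to picture a domain is as a named validation rule that several attributes share. The Python sketch below is an illustration under that assumption; the Domain class and the attribute names product_name and claim_summary are hypothetical, while the Gender and Short Description rules follow the examples above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    """A named set of permitted values, expressed as a membership check."""
    name: str
    check: Callable[[object], bool]

    def validate(self, value):
        if not self.check(value):
            raise ValueError(f"{value!r} is not in domain {self.name}")
        return value


GENDER = Domain("Gender", lambda v: v in {"M", "F"})
SHORT_DESCRIPTION = Domain(
    "Short Description", lambda v: isinstance(v, str) and len(v) <= 40
)

# Two different attributes drawn from the same underlying domain.
product_name = SHORT_DESCRIPTION.validate("Windshield")
claim_summary = SHORT_DESCRIPTION.validate("Rear bumper damage")
```

Calling GENDER.validate("X") raises ValueError, which is the behaviour a database would provide through a check constraint or a lookup table.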


Cost of wrong domains


NASA's Mars Climate Orbiter, launched in 1998, was lost on arrival at Mars in 1999. One piece of software produced values in a domain based on US customary units (pound-force seconds), whereas the software consuming them expected a domain based on SI units (newton seconds). Total cost: $327.6 million.
The European Ariane 5 expendable launch system blew up 37 seconds after launch in 1996. Wrong use of a domain (integer vs. float) caused an integer overflow. Total cost: $8 billion.
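
Both failures come down to a value crossing from one domain into another without a check. The Python sketch below illustrates the two guards in general terms; it is not a reconstruction of either system's software. The class and function names are hypothetical, and the 16-bit range is an illustrative choice of a narrow integer domain.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse value whose domain is fixed to SI units (newton seconds)."""
    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value):
        # Convert at the boundary so the rest of the code sees one domain only.
        return cls(value * LBF_S_TO_N_S)


INT16_MIN, INT16_MAX = -32768, 32767  # illustrative narrow integer domain


def to_int16(value):
    """Truncate a float to an integer, refusing values outside the 16-bit range."""
    number = int(value)
    if not INT16_MIN <= number <= INT16_MAX:
        raise OverflowError(f"{value} does not fit the 16-bit integer domain")
    return number


impulse = Impulse.from_pound_force_seconds(10.0)
print(impulse.newton_seconds)
print(to_int16(123.9))
```

Passing 40000.0 to to_int16 raises OverflowError instead of silently wrapping around, so the domain mismatch surfaces where it happens.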


Types of notations
Different notations are available for ER diagramming:
Chen notation
IDEF1X
Information Engineering (IE)
Barker notation


Types of notations-IDEF1X
[Figure: IDEF1X notation legend]
Entities: independent entities; dependent entities
Relationships: identifying (solid line); non-identifying (dashed line); many-to-many
Cardinality markers: zero, one or many (no letter); Z (zero or one); P (one or more)
Categories: discriminator; complete category; incomplete category
Attributes: mandatory; optional

Types of notations-IDEF1X
Supported by most of the available tools.
Geared more towards developing the physical database design.
Needs combinations of notations to capture rules.
These combinations are not easily understood by business people, which makes IDEF1X difficult to use in JAD sessions.


IDEF1X model


Types of notations- Information Engineering (IE)

[Figure: Information Engineering notation legend]
Entities, with attributes listed inside
Relationships: identifying; non-identifying
Cardinality: one to many; many to many; zero or one; zero, one or many; one and only one
Super type with sub types
Exclusive OR (in the Finkelstein variation)

Types of notations-Information Engineering ( IE)


Two variations: Clive Finkelstein's and James Martin's.
Different tools implement different variations of the notation.
In the original version, attributes are not shown on the entities but in a separate document, such as Martin's bubble chart.
Supported by most of the available modeling tools.
Easy-to-understand notation.
Suitable for JAD sessions.


IE model


Types of notations- Barker


[Figure: Barker notation legend]
Entities
Solid and dashed line halves show optionality
Cardinality: one or more; zero or one; one to one; zero or more
Exclusive OR
Super type with sub types

Types of notations: Barker


A # before an attribute marks it as part of the unique identifier.
Solid circles mark required attributes.
Blank circles mark optional attributes.
Sub-types are mutually exclusive.
Sub-types are always complete.
A line across a relationship means the relationship is identifying.


Types of notations- Barker


Developed by Richard Barker in the UK in 1986.
Adopted by Oracle for its CASE methodology.
Simple and easily understood by business people.
Not supported by all tools.


Barker model


Reading business rules


Each <Entity 1> {may be | must be} <verb phrase> {zero | only one | one or more} <Entity 2>
{may be | must be} expresses optionality, the verb phrase names the relationship, and {zero | only one | one or more} expresses cardinality.
An EMPLOYEE must be staff of only one DEPARTMENT.
A DEPARTMENT may be composed of one or more EMPLOYEE.
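
The sentence template can be applied mechanically once the optionality, verb phrase and cardinality of a relationship end are known. A minimal Python sketch follows; the RelationshipEnd class and read_rule function are hypothetical names, and the two examples reproduce the EMPLOYEE and DEPARTMENT rules above.

```python
from dataclasses import dataclass


@dataclass
class RelationshipEnd:
    """One reading direction of a relationship between two entities."""
    source: str
    verb_phrase: str
    target: str
    optional: bool  # True reads as "may be", False as "must be"
    many: bool      # True reads as "one or more", False as "only one"


def read_rule(end):
    """Turn one relationship end into a business-rule sentence."""
    modal = "may be" if end.optional else "must be"
    quantity = "one or more" if end.many else "only one"
    return f"Each {end.source} {modal} {end.verb_phrase} {quantity} {end.target}"


staff_of = RelationshipEnd("EMPLOYEE", "staff of", "DEPARTMENT",
                           optional=False, many=False)
composed_of = RelationshipEnd("DEPARTMENT", "composed of", "EMPLOYEE",
                              optional=True, many=True)

print(read_rule(staff_of))
print(read_rule(composed_of))
```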


Reading business rules


A CLAIM FILE may contain zero, one or more TOTAL LOSS REQUEST RECORD.
A TOTAL LOSS REQUEST RECORD must be on only one CLAIM FILE.


Reading business rules


A CLAIM FILE may have vehicle detail in zero, one or more VEHICLE RECORD.
A VEHICLE RECORD must be (..?..) one and only one CLAIM FILE.


Reading a data model


Find out what notation is being used. Get a chart of the notation giving the graphical symbols and their descriptions.
Look at the important entities in the model: the entities that are at the center of many relationships.
Look at the definition of each entity. The definition should convey the role the entity plays in the business.
Following relationship lines and reading verb phrases, move from one entity to another.
Note the relationships implemented in the model.
Note the cardinality and optionality rules.
Read the business rules implemented for the entities.
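
The step of finding the entities at the center of many relationships can be sketched as a simple count. The Python example below assumes the relationships have already been read off the diagram as (child, parent) pairs; the four pairs follow the foreign keys in the CLAIM FILE logical model shown earlier, and the function name is hypothetical.

```python
from collections import Counter

# (child entity, parent entity) pairs read off the foreign keys in a model.
relationships = [
    ("TOTAL LOSS REQUEST RECORD", "CLAIM FILE"),
    ("CLAIM FILE ESTIMATE GROUP", "CLAIM FILE"),
    ("ESTIMATE DAIS CHUNKS", "CLAIM FILE"),
    ("VEHICLE REPAIR LOG", "CLAIM FILE"),
]


def relationship_counts(pairs):
    """Count how many relationships each entity takes part in."""
    counts = Counter()
    for child, parent in pairs:
        counts[child] += 1
        counts[parent] += 1
    return counts


counts = relationship_counts(relationships)
central_entity, degree = counts.most_common(1)[0]
print(central_entity, degree)
```

The entity with the highest count is a sensible place to start reading, since its definition frames most of the rules around it.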


Let us read a data model


Reading a data model-gleaning the business rules


It is an attributed logical model.
It uses Information Engineering (IE) notation.
A PARTY may place zero, one or many PURCHASE ORDER.
A PURCHASE ORDER must be received from only one PARTY.
A PARTY must be of either PERSON or ORGANIZATION type.
A PURCHASE ORDER may contain zero, one or many LINE ITEM.
A LINE ITEM must be placed on only one PURCHASE ORDER.
A PRODUCT may be on zero, one or more LINE ITEM.
A LINE ITEM must show only one PRODUCT.
A PRODUCT may be of SOURCED PRODUCT or SERVICE type.
Party Identifier is the key identifier for PARTY.
Product Identifier is the key identifier for PRODUCT.


Reading a data model-gleaning the business rules


Purchase Order Number combined with Party Identifier is the primary identifier for PURCHASE ORDER.
Line Item Number, Product Identifier, Party Identifier and Purchase Order Number combined are the primary identifier for LINE ITEM.
Surname is an attribute of PERSON only.
Business Number is an attribute of ORGANIZATION only.
Sourced From is an attribute of SOURCED PRODUCT only.
Cost Amount is an attribute of SOURCED PRODUCT only.
Service Location is an attribute of SERVICE only.
Rate Per Hour is an attribute of SERVICE only.
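
The composite identifiers read from the model translate directly into composite primary keys. The sketch below expresses them with Python's built-in sqlite3 module. The table and column names are hypothetical renderings of the entity and attribute names, and the PERSON, ORGANIZATION, SOURCED PRODUCT and SERVICE sub-types are left out to keep the example short.

```python
import sqlite3

SCHEMA = """
CREATE TABLE party (
    party_id INTEGER PRIMARY KEY
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY
);
CREATE TABLE purchase_order (
    party_id  INTEGER NOT NULL REFERENCES party(party_id),
    po_number TEXT    NOT NULL,
    PRIMARY KEY (party_id, po_number)
);
CREATE TABLE line_item (
    party_id         INTEGER NOT NULL,
    po_number        TEXT    NOT NULL,
    product_id       INTEGER NOT NULL REFERENCES product(product_id),
    line_item_number INTEGER NOT NULL,
    PRIMARY KEY (party_id, po_number, product_id, line_item_number),
    FOREIGN KEY (party_id, po_number)
        REFERENCES purchase_order(party_id, po_number)
);
"""


def build_model():
    """Create an in-memory database holding the purchase order model."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_model()
try_insert(conn, "INSERT INTO party VALUES (?)", (1,))
try_insert(conn, "INSERT INTO purchase_order VALUES (?, ?)", (1, "PO-1"))
# A line item for an order that was never placed is rejected.
print(try_insert(conn, "INSERT INTO line_item VALUES (?, ?, ?, ?)",
                 (1, "PO-9", 10, 1)))
```

Because the party identifier is part of the purchase order key, two parties can each hold an order numbered PO-1 without conflict.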


Reading a data model- deriving real value


A very important exercise for flushing out hidden and missing business rules, which minimizes later change requests.
The value is in the critical examination of business rules.

A PURCHASE ORDER must be received from only one PARTY:
Can a party transfer its purchase order to another party?
What if a party is dissolved, merged or acquired by another party after placing a purchase order? Do we need to know about the original party?
Can two parties place a combined order to obtain a volume discount?

Business Number is an attribute of ORGANIZATION only:
There are individuals who are incorporated and have a business number. Should we capture their business number?

A PRODUCT may be of SOURCED PRODUCT or SERVICE type:
What about sourced products requiring installation service and support?
Should we invoice service on a separate purchase order?


Avoiding high cost of change



Data models maximizing ROI.


Make data modeling a mandatory part of the development life cycle.
Standardize on one data modeling tool so that everybody is familiar with its semantics.
Provide training to users in the modeling tool and its semantics.
Capture additional business rules in separate documents for completeness.
Keep data models up to date.

Further readings
Help section of the data modeling tools: most tools come with good supporting documentation on modeling methodology and notations.
Data Model Patterns: Conventions of Thought, by David C. Hay
Data Modeling Made Simple: A Practical Guide for Business and IT Professionals, by Steve Hoberman
Data Modeling for the Business: A Handbook for Aligning the Business with IT using High-Level Data Models, by Steve Hoberman, Donna Burbank and Chris Bradley


Thank you for joining
