Conceptual Data Model

The document discusses conceptual data models which provide a high-level business view of the data needed to support business processes independent of any underlying applications or storage structures. A conceptual model focuses on identifying business terms, entities, attributes and their relationships without physical implementation details. It allows business users to view integrated business data outside of any specific applications.

Uploaded by

Ichwan Habibie

Conceptual Data Model

Related terms:

Physical Data Model, Data Modelling, Relational Database, Data Model, Business
Data Glossary, Logical Data Model

Foundational Data Modeling


Rick Sherman, in Business Intelligence Guidebook, 2015

Conceptual Data Model


The conceptual data model is a structured business view of the data required to
support business processes, record business events, and track related performance
measures. This model focuses on identifying the data used in the business but not its
processing flow or physical characteristics. This model’s perspective is independent
of any underlying business applications. For example, it allows business people to
view sales data, expense data, customers, and products—business subjects that are
in the integrated model and outside of the applications themselves.

The conceptual data model represents the overall structure of data required to
support the business requirements independent of any software or data storage
structure. The characteristics of the conceptual data model include:

• An overall view of the structure of the data in a business context.

• Features that are independent of any database or physical storage structure.

• Objects that may not ever be implemented in physical databases. There are
some concepts and processes that will not find their way into models, but they
are needed for the business to understand and explain what is needed in the
enterprise.
• Data needed to perform business processes or enterprise operations.

The conceptual data model is a tool for business and IT to define:


• Data requirements scope.

• Business terms and measures across different business units and those that
are agreed upon for enterprise-wide usage.
• Names, data types, and characteristics of entities and their attributes.


Models for Phase B


Philippe Desfray, Gilbert Raymond, in Modeling Enterprise Architecture with TOGAF, 2014

Using conceptual data diagrams


A pertinent conceptual data model is a legacy of knowledge upon which many
enterprise architecture models can be based:

• Data models obviously derive from the conceptual data diagram.

• Service data diagrams will be based on this model.

• “Entity” application components will be derived from the most important key
business entities of this model, as well as their access interfaces.
• Business processes can share the definition of their information flows or
products exchanged with the business entities defined in conceptual data
diagrams.

The use of a modeling tool based on a central repository provides great consistency
to all diagrams, some of whose model elements are shared and others derived
(Figure 8.17).
Figure 8.17. Lifecycle diagram for the “Order” business entity.


DATA
David C. Hay, in Data Model Patterns, 2006

Entity Class Views


The Row Three conceptual data model is constrained to entity classes that represent
fundamental, often abstract, things of significance to the business. The things seen
by Row Two's business owners, however, tend to be more concrete, and are usually
examples, combinations, or subcategories of the entity classes in the conceptual
model.

We have previously specified that multiple business terms can be used to represent a
business concept (entity class or attribute), but this is often not adequate to describe
the detailed interactions between the two kinds of language. What we need are
virtual entity classes that represent the business owner's language, but which can be
explicitly mapped to more fundamental classes.

In the original ANSI three-schema architecture, the external schema was considered
a “view” of the conceptual model, and relational database management vendors
have implemented the concept of a view derived from one or more tables that in
all respects behave like a table.

No modeling tool currently (as of this writing in 2005) supports this, but it is
reasonable to hypothesize the similar concept of an entity view. In Figure 2-26, a
virtual entity class is a view derived from underlying elementary entity classes.
Fig. 2-26. Entity class views.

Both virtual entity class and elementary entity class are sub-types of standard entity
class, which means that a previously cited business rule should actually read as
follows.

Business Rule

Each information engineering role must be played by a standard entity class or by an
other entity class.

Each virtual entity class is defined in terms of the entity classes and attributes that
make it up, plus the selection criteria used to select occurrences of the underlying
entity classes. Specifically, each virtual entity class must be defined in terms of one
or more entity class selections, each of which must be of one and only one other
(other) entity class. In addition, each entity class selection must be composed of one
or more attribute selections, each of which must be of one and only one attribute.

In addition, each virtual entity class may be populated by rows defined in terms of one
or more selection conditions, each of which must be either in terms of an attribute
or in terms of a relationship role. Attributes of selection condition define the criteria
for selecting rows to be part of the virtual entity class: an “Operator” evaluates the
attribute or relationship role involved, comparing each occurrence's value with the
selection condition's “Value”. For example, rows from the entity class that is “Part”
could be selected if the selection condition is in terms of the attribute “Length” (which
is also about the entity class “Part”), the operator is “Less than”, and the value is “24”.
That is, the view will be populated with all parts that are less than 24 (units)* long.
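
The selection mechanism described above can be sketched in Python. This is only an illustration under stated assumptions: the class names `SelectionCondition` and `VirtualEntityClass`, the operator table, and the sample parts are hypothetical and are not taken from Hay's metamodel or from any modeling tool.

```python
import operator
from dataclasses import dataclass

# Hypothetical mapping from operator names (as used in the text) to Python functions.
OPERATORS = {
    "Less than": operator.lt,
    "Equal to": operator.eq,
    "Greater than": operator.gt,
}


@dataclass
class SelectionCondition:
    attribute: str  # the attribute the condition is "in terms of"
    operator: str   # the condition's "Operator", e.g. "Less than"
    value: object   # the condition's "Value"

    def matches(self, occurrence: dict) -> bool:
        compare = OPERATORS[self.operator]
        return compare(occurrence[self.attribute], self.value)


@dataclass
class VirtualEntityClass:
    name: str
    attribute_selections: list  # attributes exposed by the view
    conditions: list            # every condition must hold for a row to be selected

    def populate(self, occurrences):
        rows = []
        for occurrence in occurrences:
            if all(c.matches(occurrence) for c in self.conditions):
                rows.append({a: occurrence[a] for a in self.attribute_selections})
        return rows


# Made-up occurrences of the elementary entity class "Part".
parts = [
    {"Part ID": 1, "Length": 12},
    {"Part ID": 2, "Length": 24},
    {"Part ID": 3, "Length": 30},
]

short_part = VirtualEntityClass(
    name="Short Part",
    attribute_selections=["Part ID", "Length"],
    conditions=[SelectionCondition("Length", "Less than", 24)],
)
short_parts = short_part.populate(parts)
```

Only part 1 is selected here: a length of exactly 24 does not satisfy “Less than” 24.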

As another example, in the sample model shown in Figure 2-17, suppose you wanted
to define the view customer which was the set of rows from the entity class party
(which were buyers in an order). To do so, the virtual entity customer would be defined
in terms of the entity class selection that is of the entity class party. This in turn would
be composed of the attribute selections shown in Table 2-2.

Table 2-2. Virtual entity class example.

About       Which One               Which Implies

attribute   Party ID                entity class Party
attribute   Person given name       entity class Person
attribute   Person surname          entity class Person
attribute   Person middle initial   entity class Person
attribute   Organization name       entity class Organization

In addition, the virtual entity customer would also be defined in terms of one selection
criterion that is in terms of the relationship role “buyer in” (which in turn is played by
the same party). The Operator in selection criterion would be “equal to”. The process
of constructing a virtual entity class, then, is controlled by the following business
rule.

Business Rule

Each occurrence of attribute selection must be of an attribute that is about an entity
class. Each occurrence of attribute selection must also be part of an entity class selection
that is of an entity class. For each attribute selection, the same entity class must be at the
end of both of those navigations.


Model Constructs and Model Types


Charles D. Tupper, in Data Architecture, 2011

Conceptual Business Model


Explained simply, a conceptual data model (CDM) shows (in graphic and text form)
how the business world sees information it uses. It often suppresses or blurs details
in order to emphasize the big picture. Conceptual data modeling is one of the
most powerful and effective analytical techniques for understanding and organizing
the information required to support any organization. This form of model focuses on
the big picture, and the really important strategic objectives that will ensure prosperity
for the organization. Data are shared across both functional and organizational
boundaries in the business. As a result, this is critical for removing redundant data
and process in the conduct of the organization’s processes by increasing shared data
use and encouraging process reuse.

There are a number of basic steps involved in conceptual business modeling. It is,
of course, an exercise in the gathering of requirements from a user environment.
The difference between conceptual models and lower-level models is detail. To put
it simply, conceptual models are highly abstracted, architectural-type views of the
business area. At their level they capture the major entities and how they might be
related together. The conceptual data model is not specific in nature but is generic.
The relationships within it are not made explicit as to type or cardinality. They are just
present. Domain constraint data (that set of limits placed upon reference domain
data or validation data) are not included. This model is only intended to capture the
highest level of business use so there is an understanding of what the process is. It is
accompanied by a high-level activity hierarchy or functional decomposition diagram
that depicts the major functionality that is accomplished in the business problem
area.

The functional decomposition diagram (FDD) is a hierarchical structure that identifies,
defines, and logically groups the business functions that are performed by the
current system. It isolates the processes; it shows no data inputs, outputs, data
stores, or sources of information. The principal objective of the FDD is to show the
primitive functions of the system for which logic is to be specified. It will be further
examined and analyzed in much greater detail in the next phase of this project:
logical model development.


Architecture Approach
Charles T. Betz, in Architecture and Patterns for IT Service Management, Resource
Planning, and Governance: Making Shoes for the Cobbler's Children (Second Edition),
2011
Intersection Entities
Most of the entity relationships in the conceptual data model are many to many.
These relationships must be resolved with an intermediate entity. Such intersection
entities require the same CRUD analysis as the major IT concepts, and in fact some
of the most challenging problems emerge in attempting to manage them.

For example, an Application may use many Servers and a Server may support many
Applications:

Figure 2.45. Many to many

In order to actually turn these language concepts into an operable system, an
intersection entity is required:

Figure 2.46. Resolved many to many

When analyzing process to data (or any many to many, for that matter), one eventually
needs to consider all three entities as shown in the previous figure.
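
As a rough sketch of what resolving a many to many looks like in practice, the Python snippet below builds the three entities in an in-memory SQLite database. The table and column names are assumptions made for illustration; the application and server names are borrowed from the dialog later in this excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    create table application (app_name text primary key);
    create table server (server_name text primary key);
    -- Intersection entity: one row per (application, server) pairing.
    create table application_server (
        app_name    text not null references application (app_name),
        server_name text not null references server (server_name),
        primary key (app_name, server_name)
    );
""")
conn.executemany("insert into application values (?)", [("Quadrex",), ("PLV",)])
conn.executemany("insert into server values (?)", [("QDXAPP02",), ("UNXDB001",)])
conn.executemany(
    "insert into application_server values (?, ?)",
    [("Quadrex", "QDXAPP02"), ("Quadrex", "UNXDB001"), ("PLV", "UNXDB001")],
)


def servers_for(app_name):
    """Servers used by one application, read through the intersection entity."""
    rows = conn.execute(
        "select server_name from application_server "
        "where app_name = ? order by server_name",
        (app_name,),
    )
    return [row[0] for row in rows]


def applications_on(server_name):
    """Applications supported by one server, read through the same entity."""
    rows = conn.execute(
        "select app_name from application_server "
        "where server_name = ? order by app_name",
        (server_name,),
    )
    return [row[0] for row in rows]
```

Both directions of the relationship are answered from the same intersection table, which is why someone has to own the rows in it.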

If you look at the main data model and imagine all the many to manys being
elaborated with their intersection entities, you'll see that it would be far too complex
to represent as one diagram. That's the beauty of a well-scoped conceptual data
model: it should be able to represent a substantial problem domain on one page.

The intersection entities are where the devil emerges from the details. For example,
it is likely your database administration team has a list (at least a spreadsheet) of all
their databases. Perhaps you have an application management group with their own
spreadsheet. Therefore you might be able to say that you can populate the Application
Service and Datastore entities. But who is responsible for the relationship, as
represented by the Application/Datastore intersection entity?

Questions of this nature permeate the problem of configuration management
(and data management more generally, in any business domain). For any entity,
documented processes are required for the creation, reading, updating, and deleting
of data in the Application/Datastore intersection entity. Would it be your application
team? Your DBA team? A separate team of configuration analysts?

The current state of most IT organizations is much less formal. What we often see
is uncoordinated spreadsheets, which do not handle the challenge of many to many
data well at all.
Dialog: Spreadsheet Silos

Chris: What's so bad about people maintaining their own spreadsheets?

Kelly: Well, let's look at your organization. Here are some extracts from
spreadsheets maintained by your application support, database, and server
teams:

Server team:

Server name   Notes
WNAPPL01      Supports FirstTime and X-time Batch. FRED?
UXPLV01       PLV server. See Scott Armstrong.
WINWEB03      External Web server
UNXDB001      PLV databases
WINDB2        SQL Server
TXEMLA        Email server
QDXAPP02      Quadrex App Server

Applications team:

              Servers              Databases
Quadrex       QDXAPP02, UNXDB001   Oracle
X-Time        WNAPPL01             SQL Server
PLV           UNXDB001, UXPLV01    Oracle

Database team:

Database      Server     App
PDBX01        UNXDB001   Qaudrex
LVDBX01       UNXDB001   PLV/X-time
ARGDBX02      WINDB2     Argent
GDBX01        WINDB2     GuardSys

Chris: Ouch. This data makes my head hurt.

Kelly: Well, stick with me. There are some serious issues here. Let's focus on Quadrex.
The server team knows that Quadrex uses QDXAPP02 as an application server, but
doesn't seem to realize that Quadrex also uses UNXDB001 through its use of the
PDBX01 database. They think that UNXDB001 is only used for PLV. (Perhaps there
was surplus capacity on that server and Quadrex came later.)

The application team knows that Quadrex is using QDXAPP02, and UNXDB001, but
doesn't have the level of detail that the DBAs do, that Quadrex is using specifically
the PDBX01 database on that server. Quadrex does not own that server – the PLV
team is also using it. This is important from a cost allocation and support impact
standpoint.

Chris: Actually, no application team “owns” their server according to our VP for
systems engineering, even if that server is currently allocated 100 percent to them.
It's a “hosting” relationship. But some of them haven't quite bought into that point
of view.

Kelly: Right – common argument nowadays! Now, the database team knows that
Quadrex is using the PDBX01 database on UNXDB001 – but isn't tracking Quadrex's
use of QDXAPP02, as that is an application server that they don't manage. Finally,
notice that someone fumble-fingered the Quadrex name on the first row of the DBA
spreadsheet, misspelling it “Qaudrex.” This means that when we go to consolidate
all this data into one database, we're going to have to manually identify and clean that
up.

Chris: Why didn't the DBAs pick from a list of application names?

Kelly: Has that list been shared with them? Do they agree with how those applications
are represented? Is there confidence in the process for keeping the list up to date?
(For that matter, is there even a process?!) Do they have a technical approach on
how they can integrate that list from another system? Excel can pull a list from a live
database, but you start to get into advanced features and go too far down that road
and you're looking at real system development.
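
The cleanup problem Kelly describes can be illustrated with a small Python sketch. It assumes the applications team's names are treated as the reference list and uses the standard library's `difflib` to flag near misses; the function name `reconcile` and the 0.8 similarity cutoff are illustrative choices, not part of any tool discussed here.

```python
import difflib

# Application names as recorded by the applications team (from the extracts above).
known_apps = ["Quadrex", "X-Time", "PLV"]

# (database, server, app) rows taken from the database team's spreadsheet.
dba_rows = [
    ("PDBX01", "UNXDB001", "Qaudrex"),
    ("ARGDBX02", "WINDB2", "Argent"),
]


def reconcile(name, candidates, cutoff=0.8):
    """Return (canonical_name, status) for one application name."""
    if name in candidates:
        return name, "exact"
    close = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    if close:
        return close[0], "probable typo"
    return None, "unknown"


report = {app: reconcile(app, known_apps) for _db, _server, app in dba_rows}
```

A fuzzy match only proposes a correction. “Argent” is reported as unknown rather than forced onto an existing name, so a person still has to decide whether it is a missing application or an error.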

The same issues need to be thought through for every many-to-many relationship:

▪ Event/Incident/Problem

▪ Application/Technology Product

▪ Application/Process

▪ Change/CI

▪ Change/Incident

and so forth. The complexities of doing this are why vendor products are marketed;
this problem domain is both complex and yet common to many industries.

But data architecture is a critical area in which to review the vendor product – a
frequent vendor mistake is to put in a one to many where a many to many is required!
For example:

▪ A Problem might be addressed by several Releases, but your problem management
tool allows you to identify only one Release that fixes it.
▪ A Datastore may be shared by many Applications, but a configuration management
tool allows you to identify it with only one.
▪ A Machine may support multiple Servers, but your Asset Management tool
allows you to associate it with only one.

Author's Note

These kinds of data architecture flaws have become fewer over the last few years in
my experience with these tools.

These are the kinds of details that are critical to review in assessing any vendor
product – and it all starts with having good, specific, clear requirements for what you
need to track and how it needs to relate. Even when purchasing a vendor product, a
conceptual data model is needed. (Emphasis on conceptual. The physical data model
is irrelevant; the purpose of asking for a data model is to assess the business rules
that the application is based on – not to assess their technical architecture.)


An Example of Logical Database Design


Toby Teorey, ... H.V. Jagadish, in Database Modeling and Design (Fifth Edition), 2011

Logical Design
Our first step is to develop a conceptual data model diagram and a set of functional
dependencies (FDs) to correspond to each of the assertions given. Figure 7.1 presents the diagram for
the entity–relationship (ER) model and Figure 7.2 shows the equivalent diagram
for the Unified Modeling Language (UML). Normally, the conceptual data model is
developed without knowing all the FDs, but in this example the nonkey attributes are
omitted so that the entire database can be represented with only a few statements
and FDs. The results of this analysis, relative to each of the assertions given, are
shown in Table 7.2.

Figure 7.1. Conceptual data model diagram for the ER model.


Figure 7.2. Conceptual data model diagram for UML.

Table 7.2. Results of the Analysis of the Conceptual Data Model

ER Construct                                        FDs

Customer(many): Job(one)                            cust-no -> job-title
Order(many): Customer(one)                          order-no -> cust-no
Salesperson(many): Department(one)                  sales-id -> dept-no
Item(many): Department(one)                         item-no -> dept-no
Order(many): Item(many): Salesperson(one)           order-no, item-no -> sales-id
Order(many): Department(many): Salesperson(one)     order-no, dept-no -> sales-id
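
One way to check what a set of FDs like those in Table 7.2 implies is to compute an attribute closure. The sketch below uses the standard textbook closure loop; the function name `closure` and the encoding of each FD as a pair of frozensets are assumptions made for this illustration, not notation from the book.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes` under `fds`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already determined.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# The six FDs of Table 7.2, each written as (left side, right side).
FDS = [
    (frozenset({"cust-no"}), frozenset({"job-title"})),
    (frozenset({"order-no"}), frozenset({"cust-no"})),
    (frozenset({"sales-id"}), frozenset({"dept-no"})),
    (frozenset({"item-no"}), frozenset({"dept-no"})),
    (frozenset({"order-no", "item-no"}), frozenset({"sales-id"})),
    (frozenset({"order-no", "dept-no"}), frozenset({"sales-id"})),
]

order_closure = closure({"order-no"}, FDS)
order_item_closure = closure({"order-no", "item-no"}, FDS)
```

Under these FDs, `order-no` alone determines only the customer and job title, while `order-no` together with `item-no` determines all six attributes.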

The candidate tables needed to represent the semantics of this problem can be
derived easily from the constructs for entities and relationships. Primary keys and
foreign keys are explicitly defined.

-- Referenced tables are created before the tables that reference them.
-- job_title must be unique in job to be the target of customer's foreign key.
create table job (job_no char(6), job_title varchar(256) unique,
  primary key (job_no));
create table customer (cust_no char(6), job_title varchar(256),
  primary key (cust_no),
  foreign key (job_title) references job (job_title)
    on delete set null on update cascade);
-- "order" is a reserved word in SQL, so it is written as a quoted identifier.
-- cust_no is mandatory, so deleting a customer also deletes its orders.
create table "order" (order_no char(9), cust_no char(6) not null,
  primary key (order_no),
  foreign key (cust_no) references customer
    on delete cascade on update cascade);
create table department (dept_no char(2), dept_name varchar(256),
  manager_name varchar(256), primary key (dept_no));
create table salesperson (sales_id char(10), sales_name varchar(256),
  dept_no char(2), primary key (sales_id),
  foreign key (dept_no) references department
    on delete set null on update cascade);
create table item (item_no char(6), dept_no char(2),
  primary key (item_no),
  foreign key (dept_no) references department
    on delete set null on update cascade);
-- sales_id uses the same type as salesperson.sales_id in both tables below.
create table order_item_sales (order_no char(9), item_no char(6),
  sales_id char(10) not null, primary key (order_no, item_no),
  foreign key (order_no) references "order"
    on delete cascade on update cascade,
  foreign key (item_no) references item
    on delete cascade on update cascade,
  foreign key (sales_id) references salesperson
    on delete cascade on update cascade);
create table order_dept_sales (order_no char(9), dept_no char(2),
  sales_id char(10) not null, primary key (order_no, dept_no),
  foreign key (order_no) references "order"
    on delete cascade on update cascade,
  foreign key (dept_no) references department
    on delete cascade on update cascade,
  foreign key (sales_id) references salesperson
    on delete cascade on update cascade);

Note that it is often better to put foreign key definitions in separate (alter)
statements. This prevents the possibility of getting circular definitions with very
large schemas.

This process of decomposition and reduction of tables moves us closer to a minimum
set of normalized (BCNF) tables, as shown in Table 7.3.

Table 7.3. Decomposition and Reduction of Tables

Table              Primary Key         Likely Nonkeys

customer           cust_no             job_title, cust_name, cust_address
order              order_no            cust_no, item_no, date_of_purchase, price
salesperson        sales_id            dept_no, sales_name, phone_no
item               item_no             dept_no, color, model_no
order_item_sales   order_no, item_no   sales_id
order_dept_sales   order_no, dept_no   sales_id

The reductions shown in this section have decreased storage space and update costs
and have maintained the normalization of BCNF (and thus 3NF). On the other
hand, however, we have potentially higher retrieval cost—for example, given the
transaction “list all job_titles”—and have increased the potential for loss of integrity
because we have eliminated simple tables with only key attributes. Resolution of
these trade-offs depends on your priorities for your database.

The details of indexing are covered in the companion book Physical Database Design
(Lightstone et al., 2007). However, during the logical design phase of defining SQL
tables, it makes sense to start considering where to create indexes. At a minimum,
all primary keys and all foreign keys should be indexed. Indexes are relatively easy
to implement and store, and make a significant difference in reducing the access
time to stored data.
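
The indexing advice can be tried out with a short Python sketch against an in-memory SQLite database. The index name `idx_item_dept_no` is a made-up example, and the behavior shown is SQLite's; other database systems report indexes and query plans differently. Many systems create the primary key index automatically, so the sketch adds only the foreign key index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table department (dept_no char(2) primary key, dept_name varchar(256));
    create table item (
        item_no char(6) primary key,
        dept_no char(2) references department (dept_no)
    );
    -- Explicit index on the foreign key column.
    create index idx_item_dept_no on item (dept_no);
""")

# Ask SQLite how it would find the items of one department.
plan = conn.execute(
    "explain query plan select item_no from item where dept_no = ?", ("10",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the plan detail

# Names of all indexes SQLite maintains on the item table.
index_names = [row[1] for row in conn.execute("pragma index_list('item')")]
```

Inspecting `plan_text` shows whether the lookup goes through the foreign key index instead of scanning the whole table.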


Some Types and Uses of Data Models


Matthew West, in Developing High Quality Data Models, 2011

3.1.3 Conceptual Data Model


As with logical data models, there are some differing opinions about what a conceptual
data model is. So again, I will state the way that I understand the term and then
identify some key variations I have noticed.

A conceptual data model is a model of the things in the business and the relationships
among them, rather than a model of the data about those things. So in a
conceptual data model, when you see an entity type called car, then you should think
about pieces of metal with engines, not records in databases. As a result, conceptual
data models usually have few, if any, attributes. What would often be attributes may
well be treated as entity types or relationship types in their own right, and where
information is considered, it is considered as an object in its own right, rather than
as being necessarily about something else. A conceptual data model may still be
sufficiently attributed to be fully instantiable, though usually in a somewhat generic
way.

Variations in view seem to focus on the level of attribution and therefore whether or
not a conceptual data model is instantiable.

A conceptual data model might include some rules, but it would not place limits on
the data that can be held about something (whether or not it was instantiable) or
include derived data.

The result of this is that it is possible for a conceptual data model and a logical data
model to be very similar, or even the same for the same subject area, depending on
the approach that is taken with each.

Information Architecture
James V. Luisi, in Pragmatic Enterprise Architecture, 2014

4.1.5.1 Conceptual Data Models


The somewhat less traditional view of data modeling begins with conceptual data
modeling. Conceptual data models utilize a standard system of symbols that form
a formal, although uncomplicated language that communicates an abundance of
knowledge about the information being modeled. This uncomplicated visual language
is effective for communicating the business users’ view of the data they work
with.

The system of symbols employed in conceptual data models borrows a number of the
basic modeling constructs found in entity relationship diagrams (ERDs), containing
entities, attributes, and relationships.

The characteristics that are specific to conceptual data models include the
following:

▪ The objective of the model is to communicate business knowledge to any
individuals who are unfamiliar with the business.
▪ The scope of the model is from the perspective of a business subject area of data,
as opposed to the scope of an automation project, automation application,
automation database, or automation interface.
▪ The names of the objects in the model are strictly restricted to language used
within the business, excluding any and all technical terminology related to
automation jargon.
▪ Diagramming conventions are those that emphasize what an individual can
comfortably view and comprehend on an individual page.
▪ Business data points are simply associated with the data objects they would
belong to and are not taken through the data engineering process called
“normalization” to separate attributes into code tables.
▪ Data abstractions, such as referring to business objects in a more generic and
general way, are not performed as they often lose the business intent and then
become less recognizable to the business.
▪ Technical details, frequently found within ERDs, such as optionality and specific
numerical cardinalities, are omitted.
The modern approach to conceptual data models is to incorporate them as a natural
extension of the LDA. In fact, each conceptual data model should correspond to one
business subject area of data and should be developed by business users who have
been mentored by information architects to assist in the upkeep of the LDA.


Data Management, Models, and Metadata
Laura Sebastian-Coleman, in Measuring Data Quality for Ongoing Improvement,
2013

Types of Data Models


Different types of data models depict data at different levels of abstraction. Conceptual
data models present the entities (ideas or logical concepts) that are represented
in the database and have little if any detail about attributes. Logical data models
include detail about attributes (characteristics in columns) needed to represent a
concept, such as key structure (the attributes needed to define a unique instance
of an entity), and they define details about the relationships within and between
data entities. Relationships between entities can be optional or mandatory. They
differ in terms of cardinality (one-to-one, one-to-many, many-to-many). Physical
data models represent the way that data are physically stored in a database. They
describe the physical characteristics of data elements that are required to set up and
store actual data about the entities represented. In addition to models that differ
by levels of abstraction, there can also be models of data consumer-facing views of
the data. Technically, a view is a dataset generated through a query that produces a
virtual table. A more mundane definition is that a view is what a data consumer
sees. At its simplest, a view can have exactly the same structure as a physical table.
Views can also be used to display a subset of data from a table, to aggregate data
within a table, or combine data from multiple tables. As with other data models,
models of views enable data consumers to understand how data is organized.
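
The three uses of views mentioned above (a subset, an aggregate, and a combination of tables) can be sketched with Python's built-in `sqlite3` module. The tables, view names, and rows are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (cust_no text primary key, cust_name text);
    create table purchase (
        order_no text primary key,
        cust_no  text references customer (cust_no),
        price    integer
    );
    insert into customer values ('C1', 'Ada'), ('C2', 'Bo');
    insert into purchase values ('O1', 'C1', 50), ('O2', 'C1', 150), ('O3', 'C2', 200);

    -- A subset of the rows of one table.
    create view big_purchase as
        select order_no, price from purchase where price >= 100;
    -- An aggregate over one table.
    create view purchase_total as
        select cust_no, sum(price) as total from purchase group by cust_no;
    -- A combination of two tables.
    create view customer_purchase as
        select c.cust_name, p.order_no, p.price
        from customer c join purchase p on p.cust_no = c.cust_no;
""")

# Each view is queried exactly as if it were a table.
subset = conn.execute("select * from big_purchase order by order_no").fetchall()
totals = conn.execute("select * from purchase_total order by cust_no").fetchall()
combined = conn.execute("select * from customer_purchase order by order_no").fetchall()
```

None of the views stores data of its own; each is recomputed from the underlying tables when queried, which is what makes it a virtual table.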

The process of data modeling involves a series of decisions about how to represent
concepts and relate them to each other. Data modeling uses tools and conventions
of representation that convey meaning in a consistent way, regardless of the content
of the data being modeled. Like all forms of representation, data models are limited.
They can be articulated to different levels of detail for different purposes. They focus
on representing those aspects of the things represented that are important to a
particular purpose of the representation (West, 2003).
To understand the implications of purpose and representation in data modeling, let’s
consider how these choices affect other kinds of modeling. All models are built for
particular purposes and must be understood in light of those purposes. A house in
a subdivision will be depicted differently in different representations or models built
for different purposes. In a street plan for the subdivision, it will be represented as a
box on a parcel of land. The purpose of such a plan is to convey information about
the size and shape of the subdivision and the location of houses and lots relative to
each other. Such a plan might be shared with a town planning commission charged with
making decisions about land use or with potential buyers wanting to understand
the general characteristics of a neighborhood. In architectural drawings, the house
will be depicted in a set of views showing the size, shape, and details of its structure.
The purpose of an architectural drawing is to enable people to see what the house
would look like and to make decisions about how it will be built. The accompanying
floor plan, another model of the house, contributes to the process of understanding
the size and shape of the house and is also necessary input to building the house.
It contains details such as room sizes, the number of windows and doors, and
the like that will influence the construction of the house. None of these models
is the house itself, but all of them depict the house. Each represents a subset of
characteristics important to the purpose of the representation. The same idea applies
to data models.

When working with data models, it is important to recognize that there is not
one-and-only-one way to model any given dataset. Put this way, models present
a kind of chicken-and-egg problem: Do data define models, or do models define
data? The answer is both. To be understandable at all, data require context and
structure. Data models provide a means of understanding this context. In doing so,
they also create context. If data stakeholders find that models are understandable
representations of data, then they can become a primary means of defining data.

For most databases, especially data warehouses, models are critical to data management.
The conceptual and logical models allow data managers to know what data
resides in the data asset. The physical model has a direct bearing on how data is
moved within the database, as well as how it is accessed.


Applying the Principles for Attributes


Matthew West, in Developing High Quality Data Models, 2011

Publisher Summary
This chapter illustrates some practical examples of problems that arise with attributes
in data models and how the principles for conceptual, integration, and enterprise
data models can help overcome or avoid these problems, because they lead to data
models that are more stable and regular in their structure. The clue to look for is
a relatively large number of attributes or unexpected attributes. This means that a
particular business view is being modeled rather than the underlying nature of the
problem. The process that is followed when resolving a complex entity type is to
examine each attribute in turn, discover what it means, and determine whether it is
really an attribute of the entity type in question. A key consequence of this approach
to attributes and identifiers is that data is broken down into small elements. This
is particularly useful for data models aimed at data integration. One cannot control
the granularity of the data models he or she needs to integrate, but if the integration
data model has the finest granularity, then the data from other data models is always
able to be broken down to that level, and one then has the pieces to reassemble for
more coarsely granular data models.



Copyright © 2018 Elsevier B.V. or its licensors or contributors. ScienceDirect ® is a registered trademark of Elsevier B.V. Terms and conditions apply.