Data Modelling - Extra Material

Uploaded by

Gabriela Barra

28/09

Data Modeling 101


The goal of this article is to overview the fundamental data modeling skills that all developers should have, skills that
can be applied both on traditional projects that take a serial approach and on agile projects that take an evolutionary
approach. My personal philosophy is that every IT professional should have a basic understanding of data
modeling. They don’t need to be experts at data modeling, but they should be prepared to be involved in the creation of
such a model, be able to read an existing data model, understand when and when not to create a data model, and
appreciate fundamental data design techniques. This article is a brief introduction to these skills. The primary audience
for this article is application developers who need to gain an understanding of some of the critical activities performed
by an Agile DBA. This understanding should lead to an appreciation of what Agile DBAs do and why they do it, and it
should help to bridge the communication gap between these two roles.
1. What is Data Modeling?
Data modeling is the act of exploring data-oriented structures. Like other modeling artifacts, data models can be used for
a variety of purposes, from high-level conceptual models to physical data models. From the point of view of an
object-oriented developer, data modeling is conceptually similar to class modeling. With data modeling you identify entity
types, whereas with class modeling you identify classes. Data attributes are assigned to entity types just as you would
assign attributes and operations to classes. There are associations between entities, similar to the associations between
classes: relationships, inheritance, composition, and aggregation are all applicable concepts in data modeling.
Traditional data modeling is different from class modeling because it focuses solely on data. Class models allow you to
explore both the behavior and data aspects of your domain, whereas with a data model you can only explore data issues.
Because of this focus, data modelers have a tendency to be much better at getting the data “right” than object
modelers. However, some people will model database methods (stored procedures, stored functions, and triggers) when
they are physical data modeling. It depends on the situation of course, but I personally think that this is a good idea and
promote the concept in my UML data modeling profile (more on this later).
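The contrast between an entity type and a class can be sketched in a few lines of Python. This is only an illustration and is not taken from the article: the names `CustomerEntity`, `Customer`, `add_address`, and `is_located_at` are hypothetical, chosen to show data-only structure next to data plus behavior.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerEntity:
    """Entity-type view: data attributes only (hypothetical example)."""
    customer_number: int
    name: str


@dataclass
class Customer:
    """Class view: the same data attributes plus operations (hypothetical)."""
    customer_number: int
    name: str
    addresses: list = field(default_factory=list)

    def add_address(self, address: str) -> None:
        # Behavior is part of a class model; a traditional data model
        # would record only the attributes and the relationship.
        self.addresses.append(address)

    def is_located_at(self, address: str) -> bool:
        return address in self.addresses
```

A data modeler would stop at the attributes of `CustomerEntity`; an object modeler would also decide what a `Customer` can do.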
Although the focus of this article is data modeling, there are often alternatives to data-oriented artifacts (never forget
Agile Modeling’s Multiple Models principle). For example, when it comes to conceptual modeling, ORM diagrams aren’t
your only option. In addition to LDMs, it is quite common for people to create UML class diagrams and even Class
Responsibility Collaborator (CRC) cards instead. In fact, my experience is that CRC cards are superior to ORM diagrams
because it is very easy to get project stakeholders actively involved in the creation of the model. Instead of a
traditional, analyst-led drawing session, you can facilitate stakeholders through the creation of CRC cards.

1.1 How are Data Models Used in Practice?


Although methodology issues are covered later, we need to discuss how data models can be used in practice to better
understand them. You are likely to see three basic styles of data model:
• Conceptual data models. These models, sometimes called domain models, are typically used to explore domain
concepts with project stakeholders. On Agile teams, high-level conceptual models are often created as part of
your initial requirements envisioning efforts, as they are used to explore the high-level static business structures
and concepts. On traditional teams, conceptual data models are often created as the precursor to LDMs or as
alternatives to LDMs.
• Logical data models (LDMs). LDMs are used to explore the domain concepts, and their relationships, of your
problem domain. This could be done for the scope of a single project or for your entire enterprise. LDMs depict
the logical entity types, typically referred to simply as entity types, the data attributes describing those entities,
and the relationships between the entities. LDMs are rarely used on Agile projects, although they often are on
traditional projects (where they rarely seem to add much value in practice).
• Physical data models (PDMs). PDMs are used to design the internal schema of a database, depicting the data
tables, the data columns of those tables, and the relationships between the tables. PDMs often prove to be
useful on both Agile and traditional projects, and as a result the focus of this article is on physical modeling.
Although LDMs and PDMs sound very similar, and in fact they are, the level of detail that they model can be significantly
different. This is because the goals for each diagram are different: you can use an LDM to explore domain concepts with
your stakeholders and the PDM to define your database design. Figure 1 presents a simple LDM and Figure 2 a simple
PDM, both modeling the concept of customers and addresses as well as the relationship between them. Both diagrams
apply the Barker notation, summarized below. Notice how the PDM shows greater detail, including an associative table
required to implement the association as well as the keys needed to maintain the relationships. More on these concepts
later. PDMs should also reflect your organization’s database naming standards; in this case an abbreviation of the entity
name is appended to each column name and an abbreviation for “Number” was consistently introduced. A PDM should
also indicate the data types for the columns, such as integer and char(5). Although Figure 2 does not show them, lookup
tables (also called reference tables or description tables) for how the address is used as well as for states and countries
are implied by the attributes ADDR_USAGE_CODE, STATE_CODE, and COUNTRY_CODE.
Figure 1. A simple logical data model.

Figure 2. A simple physical data model.
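To make the level of detail in a PDM concrete, here is a minimal sketch of a physical schema for customers and addresses, built with Python's standard `sqlite3` module. It is not a transcription of Figure 2. Only ADDR_USAGE_CODE, STATE_CODE, and COUNTRY_CODE come from the text; every other table name, column name, and data type is an assumption made for illustration, and only the state lookup table is shown (country and address-usage lookups would presumably follow the same pattern).

```python
import sqlite3

# Hypothetical schema: only ADDR_USAGE_CODE, STATE_CODE and COUNTRY_CODE
# are named in the article; all other names and types are assumed.
SCHEMA = """
CREATE TABLE STATE (                      -- lookup (reference) table
    STATE_CODE    CHAR(2) PRIMARY KEY,
    STATE_NAME    VARCHAR(40) NOT NULL
);
CREATE TABLE CUSTOMER (
    CUST_NUM      INTEGER PRIMARY KEY,
    CUST_NAME     VARCHAR(40) NOT NULL
);
CREATE TABLE ADDRESS (
    ADDR_ID       INTEGER PRIMARY KEY,
    ADDR_STREET   VARCHAR(40) NOT NULL,
    STATE_CODE    CHAR(2) NOT NULL REFERENCES STATE (STATE_CODE),
    COUNTRY_CODE  CHAR(3) NOT NULL
);
CREATE TABLE CUSTOMER_ADDRESS (           -- associative table
    CUST_NUM         INTEGER NOT NULL REFERENCES CUSTOMER (CUST_NUM),
    ADDR_ID          INTEGER NOT NULL REFERENCES ADDRESS (ADDR_ID),
    ADDR_USAGE_CODE  CHAR(1) NOT NULL,
    PRIMARY KEY (CUST_NUM, ADDR_ID, ADDR_USAGE_CODE)
);
"""


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def addresses_for(conn: sqlite3.Connection, cust_num: int) -> list:
    """Return (street, usage code) pairs for one customer."""
    rows = conn.execute(
        "SELECT a.ADDR_STREET, ca.ADDR_USAGE_CODE "
        "FROM CUSTOMER_ADDRESS ca "
        "JOIN ADDRESS a ON a.ADDR_ID = ca.ADDR_ID "
        "WHERE ca.CUST_NUM = ? ORDER BY a.ADDR_ID",
        (cust_num,),
    )
    return rows.fetchall()
```

The associative table `CUSTOMER_ADDRESS` is what turns the many-to-many "has" relationship of the logical model into something a relational database can store, and its foreign keys are the keys needed to maintain that relationship.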

An important observation about Figures 1 and 2 is that I’m not slavishly following Barker’s approach to naming
relationships. For example, between Customer and Address there really should be two names: “Each CUSTOMER may be
located in one or more ADDRESSES” and “Each ADDRESS may be the site of one or more CUSTOMERS”. Although these
names explicitly define the relationship, I personally think that they’re visual noise that clutters the diagram. I prefer
simple names such as “has” and then trust my readers to interpret the name in each direction. I’ll only add more
information where it’s needed, and in this case I think that it isn’t. However, a significant advantage of describing the
names the way that Barker suggests is that it’s a good test to see if you actually understand the relationship: if you
can’t name it then you likely don’t understand it.
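The naming test can be tried mechanically. The sketch below assembles a relationship sentence for one direction from its parts. The sentence template is inferred from the two example names quoted above, not from a formal statement of Barker's rules, and `RelationshipEnd` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEnd:
    """One direction of a relationship, read from source to target."""
    source: str       # singular entity name, e.g. "CUSTOMER"
    optionality: str  # e.g. "may be" or "must be"
    role: str         # e.g. "located in"
    cardinality: str  # e.g. "one or more"
    target: str       # entity name as it should read, e.g. "ADDRESSES"

    def sentence(self) -> str:
        # Template inferred from the article's two examples (an assumption).
        return (f"Each {self.source} {self.optionality} {self.role} "
                f"{self.cardinality} {self.target}")


forward = RelationshipEnd(
    "CUSTOMER", "may be", "located in", "one or more", "ADDRESSES")
backward = RelationshipEnd(
    "ADDRESS", "may be", "the site of", "one or more", "CUSTOMERS")

print(forward.sentence())
print(backward.sentence())
```

If you cannot fill in the role for both directions, that is a hint that the relationship itself is not yet understood.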
Data models can be used effectively at both the enterprise level and on projects. Enterprise architects will often create
one or more high-level LDMs that depict the data structures that support your enterprise, models typically referred to as
enterprise data models or enterprise information models. An enterprise data model is one of several views that your
organization’s enterprise architects may choose to maintain and support – other views may explore your
network/hardware infrastructure, your organization structure, your software infrastructure, and your business
processes (to name a few). Enterprise data models provide information that a project team can use both as a set of
constraints and as a source of important insights into the structure of their system.
Project teams will typically create LDMs as a primary analysis artifact when their implementation environment is
predominantly procedural in nature, for example when they are using structured COBOL as an implementation
language. LDMs are also a good choice when a project is data-oriented in nature, perhaps because a data warehouse or
reporting system is being developed (having said that, experience seems to show that usage-centered approaches appear to
work even better). However, LDMs are often a poor choice when a project team is using object-oriented or component-based
technologies, because the developers would rather work with UML diagrams, or when the project is not data-oriented in
nature. As Agile Modeling advises, apply the right artifact(s) for the job. Or, as your grandfather likely advised you, use
the right tool for the job. It’s important to note that traditional approaches to Master Data Management (MDM) will
often motivate the creation and maintenance of detailed LDMs, an effort that is rarely justifiable in practice when you
consider the total cost of ownership (TCO) when calculating the return on investment (ROI) of those sorts of efforts.
When a relational database is used for data storage, project teams are best advised to create a PDM to model its
internal schema. My experience is that a PDM is often one of the critical design artifacts for business application
development projects.
1- What is the main goal of this article?
2- Translate the paragraph under point 1: What is Data Modeling?
3- Describe the three styles of data model presented in the article.
4- Does the text mention benefits of one style over the others? If so, explain briefly.
Extra themes:

https://www.umsl.edu/~sauterv/analysis/Fall2013Papers/Sirpur/index.html

https://www.umsl.edu/~sauterv/analysis/Fall2010Papers/varuni/
