Chapter 4 - Data Architecture Management
Within the framework, Data Architecture Management is the first function that interacts
with and is influenced by the Data Governance function. Chapter 4 defines and explains the
concepts and activities involved in managing Data Architecture.
4.1 Introduction
Data Architecture Management is the process of defining and maintaining specifications
that:
• Provide a common vocabulary of business standards;
• Express strategic data requirements;
• Outline integrated designs at a high level to meet these requirements; and
• Align with the business strategy and related business architecture.
Data architecture is an integrated set of specification artifacts used to define data
requirements, guide integration and control of data assets, and align data investments with
business strategy. It is also an integrated collection of master plans at different levels of
abstraction. Data architecture includes formal data names, complete data definitions,
efficient data structures, precise data integrity rules, and robust data documentation.
Data Architecture is most valuable when it supports the information needs of the entire
enterprise and enables standardization and integration of data across the enterprise.
Although this chapter focuses on enterprise data architecture, the same techniques can be
scaled down for use in a specific function or department of the organization.
Data Architecture is part of the larger enterprise architecture, and integrates with technology and
business architectures. Enterprise Architecture integrates data, processes, organizations,
applications, and technology architecture. It helps organizations manage change and
improve effectiveness, agility, and accountability.
Context for this function is shown in Figure 4.1.
Figure 4.1 Data Architecture Management Context Diagram
2. Data Architecture Management
Definition: Define the data needs of the enterprise and design master plans to meet these needs.
Goals:
1. Plan with vision and foresight to provide high-quality data.
2. Identify and define common data requirements.
3. Design conceptual structures and plans to meet current and long-term data requirements of the enterprise.
Inputs:
• Business Objectives
• Business Strategies
• Business Architecture
• Process Architecture
• IT Objectives
• IT Strategies
• Data Problems
• Data Needs
• Technical Architecture
Suppliers:
• Executives
• Data Administrators
• Data Producers
• Information Consumers
Activities:
1. Understand Enterprise Information Needs (P)
2. Develop and Maintain the Enterprise Data Model (P)
3. Analyze and Align with Other Business Models (P)
4. Define and Maintain the Data Technology Architecture (P)
5. Define and Maintain the Data Integration Architecture (P)
6. Define and Maintain the DW/BI Architecture (P)
7. Define and Maintain Enterprise Taxonomies and Namespaces (P)
8. Define and Maintain the Metadata Architecture (P)
Activity types: (P) - Planning (C) - Control (D) - Development (O) - Operational
Participants:
• Data Administrators
• Subject Matter Experts (SMEs)
• Data Architects
• Analysts and Data Modelers
• Other Enterprise Architects
• DM Administrators and Executives
• CIO and Other Executives
• Database Administrators
• Data Model Manager
Tools:
• Data Modeling Tools
• Model Management Tools
• Data Warehouse
• Office Productivity Tools
Primary Deliverables:
• Enterprise Data Model
• Information Value Chain Analysis
• Data Technology Architecture
• Data Integration/MDM Architecture
• DW/BI Architecture
• Metadata Architecture
• Enterprise Taxonomies and Namespaces
• Document Management Architecture
Consumers:
• Data Administrators
• Data Architects
• Data Analysts
• Database Administrators
• Software Developers
• Project Managers
• Data Producers
• Knowledge Workers
• Managers and Executives
4.2.2 Activities
The data architecture management function contains several activities related to defining the model for
managing data assets. An overview of these activities is presented in the following sections.
4.2.2.1 Understanding Business Information Needs
In order to create the enterprise data architecture, the business first needs to define its information needs.
An enterprise data model is a way to capture and define information needs and data requirements. It
represents a master plan for data integration across the enterprise. The enterprise data model is a critical
input for future systems development, data requirements analysis, and data modeling projects.
Conceptual and logical data models for specific projects are based on the applicable parts of the enterprise
data model. Depending on the scope, some projects may benefit from integrating the enterprise data model
into solution design. Virtually every major project has the potential to impact the enterprise data model.
Designers can determine the information needs of the enterprise by evaluating inputs, outputs, internal and
external data sources, current system documentation, reports, interviews with system stakeholders, and
other artifacts required by the organization. These materials, organized and classified by business unit and
subject area, provide important entities, data, data attributes, and calculations. The list becomes the basic
requirements of the enterprise data model.
4.2.2.2 Develop and Maintain the Enterprise Data Model
Business entities are the classes of real things and concepts that describe the business. Data are the facts
that describe business entities. Data models define these business entities and the kinds of facts about
them, i.e., data attributes, necessary to operate and guide the business. Data modeling is an analysis and
design method used to:
1. Define and analyze data requirements.
2. Design the logical and physical data structures that support these
requirements.
A data model is a set of data specifications and related diagrams that reflect data requirements and designs.
The Enterprise Data Model (EDM) provides an integrated, subject-oriented view of essential data produced
and consumed by the organization.
• Integrated means that all of an organization's data and rules are represented once and fit together
seamlessly. An important goal of the model is to provide a view of the enterprise as a whole, which
also reflects functional and departmental views. Each business entity such as a “Customer” or
“Order” must be uniquely identifiable. The data attributes that define a business entity must be
complete, accurate, and provide clear definitions. In addition, the data model can identify common
synonyms and important distinctions between different subtypes of the same common business
entities.
• Subject-oriented means that the model is divided into commonly recognized subject areas that
span across multiple business functions and application systems. Subject areas focus on the most
essential business entities.
• Essential means data that is critical to the effective functioning and decision making of the
organization. Few enterprise data models define all the data within the enterprise. Essential data
requirements may or may not be common across multiple applications and projects. Multiple systems
can share some data defined in the enterprise data model. Other data may be critically important, yet
created and used in only a single system. Over time, the enterprise data model should define all data
that is important to the business. The definition of essential data changes with changes in the
business. The enterprise data model must keep up with those changes.
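The integrated and subject-oriented properties can be illustrated with a small Python sketch. It is an
illustration under stated assumptions, not a description of any data modeling tool: the BusinessEntity and
EnterpriseDataModel classes, their methods, and the sample entities are all hypothetical. The registry
rejects a second definition of any name or synonym, so each business entity is represented once, and it
can group entities by subject area.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BusinessEntity:
    """One business entity, defined once in the model (hypothetical structure)."""
    name: str            # canonical business term, e.g. "Customer"
    subject_area: str    # subject area the entity belongs to
    definition: str      # business definition
    synonyms: set[str] = field(default_factory=set)


class EnterpriseDataModel:
    """Hypothetical registry that keeps every name and synonym unique."""

    def __init__(self) -> None:
        self._entities: dict[str, BusinessEntity] = {}
        self._aliases: dict[str, str] = {}  # lower-cased term -> canonical name

    def add(self, entity: BusinessEntity) -> None:
        terms = {entity.name, *entity.synonyms}
        # Check every term before changing anything, so a failed add leaves no trace.
        for term in terms:
            if term.lower() in self._aliases:
                raise ValueError(f"'{term}' is already defined in the model")
        self._entities[entity.name] = entity
        for term in terms:
            self._aliases[term.lower()] = entity.name

    def resolve(self, term: str) -> BusinessEntity:
        """Return the single canonical entity for a name or synonym."""
        return self._entities[self._aliases[term.lower()]]

    def by_subject_area(self) -> dict[str, list[str]]:
        """Group entity names by subject area."""
        areas: dict[str, list[str]] = {}
        for entity in self._entities.values():
            areas.setdefault(entity.subject_area, []).append(entity.name)
        return areas
```

In this sketch, registering "Customer" with the synonym "Client" means a later lookup of "client" resolves
to the Customer entity, and an attempt to add a separate "Client" entity raises an error instead of
creating a second definition.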
Data modeling is an important technique used in data architecture management and data development. Data
development implements data architecture in a way that extends and adapts enterprise data models to meet
the needs of specific business applications and project requirements.
4.2.2.2.1 The Enterprise Data Model
The enterprise data model is an integrated set of closely related deliverables. Many deliverables are
generated using a data modeling tool. The central repository for the data model may be in the form of a file
or repository created and maintained by a data modeling tool. This artifact provides metadata for enterprise
data assets. [See Chapter 11 for more details.]
An enterprise data model represents an investment in defining and documenting vocabulary, business rules,
and business knowledge. Creating, maintaining, and enriching the model requires ongoing investments of
time and effort, even when architects begin the design process with an industry-standard model. The results
of data modeling activities include a common view and understanding of entities, data attributes, and
their relationships across the enterprise.
Organizations can purchase an enterprise data model, or build one. Several vendors provide industry-
standard logical data models. Both options require some customization.
Enterprise data models differ widely in terms of levels of detail. When a business first recognizes the need
for a data model, it must make decisions about the time and effort that can be devoted to building the
model. As business needs dictate, the scope and level of detail captured within the enterprise data model
typically expands. Successful enterprise data models are built gradually and incrementally.
An enterprise data model is organized as a hierarchy of layers, as shown in Figure 4.3, and is generally
built from the top down. The content at the highest level of the hierarchy is fundamental and broad, while
the lower levels define the details and dependencies between the data. Model inputs also come from the
analysis and synthesis of insights and details from existing logical and physical data models. Integrating
business perspectives and influences from existing models can enhance the development of an
enterprise-wide view.
4.2.2.2.2 The Subject Area Model
The highest layer of an enterprise data model is the subject area model (SAM). This model is a list of the
main subject areas that together express the essential scope of the company. The list represents a “scope”
view of data, as described in the Zachman Framework. At a more detailed level, business entities and object
classes are also displayed as lists.
There are two main ways to communicate a subject area model:
• An outline that organizes smaller subject areas within larger ones.
• A diagram that presents and organizes subject areas visually for easy reference.
Designating the core subject areas of the enterprise is important to the success of the entire enterprise data
model. The list of subject areas is essential for developing critical taxonomies, and allows for further
refining of entities and data in the business model. The subject area model is optimal when it is accepted
by all participants and constituents of the company. Furthermore, the subject area model should be useful
as an organizing construct for data governance, data management, and enterprise data modeling.
Subject areas typically share the same name with a central business entity. Some subject areas align closely
with core business functions. Other subject areas cover a super-type business entity and its family of
subtypes.
Also, subject areas are important for data management and governance. They define the scope of
responsibilities for data management teams assigned to specific subject areas.
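The outline form of a subject area model can be sketched as a nested structure. The following Python
fragment is a minimal illustration only: the subject area names and the render_outline function are
hypothetical, chosen to show how a hierarchy of subject areas can be turned into a numbered list.

```python
from __future__ import annotations

# Hypothetical subject areas; nested dicts place smaller areas inside larger ones.
SUBJECT_AREAS: dict[str, dict] = {
    "Customer": {"Account": {}, "Contact": {}},
    "Product": {"Catalog": {}, "Pricing": {}},
    "Order": {},
}


def render_outline(areas: dict[str, dict], prefix: str = "") -> list[str]:
    """Return numbered outline lines for a nested subject area structure."""
    lines: list[str] = []
    for number, (name, children) in enumerate(areas.items(), start=1):
        label = f"{prefix}{number}"
        # Top-level items get a trailing period ("1. Customer"); nested ones do not ("1.1 Account").
        lines.append(f"{label}. {name}" if not prefix else f"{label} {name}")
        lines.extend(render_outline(children, prefix=f"{label}."))
    return lines


for line in render_outline(SUBJECT_AREAS):
    print(line)
```

The same nested structure could also drive a diagram or the assignment of responsibilities per subject
area; the outline is simply the easiest form to review in text.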
4.2.2.2.3 The Conceptual Data Model
The next level down in the enterprise data model hierarchy is the conceptual data model for each subject
area, which defines business entities and the relationships between them.
Business entities are the basic organizational structures in a conceptual data model. They represent the
concepts and kinds of things, people, and places that are important to the company. Business entities are
referred to using business terms. A single example of a business entity is an instance. For example, for the
business entity called “Account”, one particular account is an instance.
The scope boundaries of subject areas generally overlap with some business entities included in other
subject areas. For governance and administration purposes, each business entity should have a primary
subject area that “owns” the master version of that entity.
Conceptual data model diagrams do not show the data attributes of business entities. They can include
many-to-many and other types of relationships between essential entities. Because no attributes are shown,
conceptual data models do not attempt to normalize data.
The conceptual data model should include a glossary with business definitions and other metadata
associated with all business entities, and their relationships. Other metadata may include entity synonyms,
instance examples, and security classifications.
A conceptual data model can foster improved business understanding and semantic reconciliation. It can
serve as a framework for developing integrated information systems that support both transactional
processing and business intelligence. [See Chapter 5 for more details.]
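As a rough sketch of these ideas, the Python fragment below records business entities with a glossary
definition and a primary subject area, allows many-to-many relationships, and runs a simple consistency
check. All class names, field names, and the check_model function are hypothetical and are introduced here
only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualEntity:
    name: str                  # business term, e.g. "Account"
    definition: str            # glossary definition
    primary_subject_area: str  # subject area that owns the master version
    also_appears_in: tuple[str, ...] = ()  # other subject areas that include the entity


@dataclass(frozen=True)
class Relationship:
    source: str
    verb: str
    target: str
    cardinality: str  # "1:1", "1:N", or "M:N"; many-to-many is allowed at this level


def check_model(entities: list[ConceptualEntity],
                relationships: list[Relationship]) -> list[str]:
    """Return a list of problems; an empty list means the sketch is consistent."""
    problems: list[str] = []
    names = [e.name for e in entities]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"entity '{name}' is defined more than once")
    for e in entities:
        if e.primary_subject_area in e.also_appears_in:
            problems.append(f"entity '{e.name}' lists its owner as a secondary area")
    known = set(names)
    for r in relationships:
        for end in (r.source, r.target):
            if end not in known:
                problems.append(
                    f"relationship '{r.source} {r.verb} {r.target}' uses unknown entity '{end}'"
                )
    return problems
```

Note that the entities carry no data attributes, which matches the level of detail of a conceptual model.
Attributes belong to the logical level.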
4.2.2.2.4 Enterprise Logical Data Models
Some enterprise data models include diagrams of the logical data models for each subject area. This level
of detail below the conceptual data model addresses the essential data attributes for each instance of the
business entity. Essential data attributes consist of common data requirements and standardized definitions
that are necessary for the enterprise. Determining which data attributes to include in the enterprise data
model is a very subjective decision.
Enterprise logical data model diagrams continue to reflect an enterprise perspective. They are neutral and
independent of any particular need, use, or context of application. Other more traditional logical models
reflect specific usage and application requirements.
Enterprise logical data models are only partially attributed. They can be normalized to some extent, but
they do not have to be as normalized as logical data models being designed for solutions.
Enterprise Logical Data Models should include a glossary of all business terms, other types of metadata
about entities, their data attributes, and the data domains for the attributes. [See Chapter 5 for more details.]
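The link between data attributes and their data domains can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions: DataDomain, DataAttribute, and invalid_attributes are
hypothetical names, and the two constraint kinds shown (an enumerated value list and a maximum length) are
only examples of what a domain might specify.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataDomain:
    """A named set of permitted values (hypothetical, simplified)."""
    name: str
    allowed_values: Optional[frozenset] = None  # None means no enumerated list
    max_length: Optional[int] = None            # None means no length limit

    def accepts(self, value: str) -> bool:
        if self.allowed_values is not None and value not in self.allowed_values:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        return True


@dataclass(frozen=True)
class DataAttribute:
    entity: str         # business entity the attribute describes
    name: str           # attribute name
    definition: str     # standardized business definition
    domain: DataDomain  # data domain that constrains the attribute's values


def invalid_attributes(attributes: list[DataAttribute], record: dict[str, str]) -> list[str]:
    """Names of modeled attributes whose value in the record falls outside the domain.

    Keys in the record that the model does not define are ignored, because an
    enterprise-level logical model is only partially attributed.
    """
    return [
        a.name
        for a in attributes
        if a.name in record and not a.domain.accepts(record[a.name])
    ]
```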
4.2.2.2.5 Other Components of the Enterprise Data Model
Some enterprise data models include other optional components such as:
• Assignments of data stewardship responsibility for subject areas, entities, attribute sets, or
reference data. [See Chapter 3 for more details.]
• Reference data values: controlled sets of values for codes and labels, together with their
business meaning. These sets of business values are sometimes used to cross-reference equivalent
data in other departments, divisions, or regions. [See Chapter 8 for more details.]
• Additional data quality specifications and rules for essential data attributes, such as accuracy,
precision requirements, timeliness (of data), integrity rules, nullability, format,
match/merge rules, and audit requirements. [See Chapter 12 for more details.]
• Entity life cycles are state transition diagrams that represent the different states of major entities
and the events that cause changes in states. Lifecycles are very useful for determining a rational set
of state values, e.g. codes or labels, for a business entity. [See Section 4.2.2.5 for more details.]
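An entity life cycle of the kind described in the last item can be written down as a transition table.
The Python sketch below is illustrative only: the "Order" states, the events, and the function names are
hypothetical. It shows how the valid state values for an entity can be derived from its transitions.

```python
from __future__ import annotations

# Hypothetical life cycle for an "Order" entity: (current state, event) -> next state.
ORDER_LIFECYCLE: dict[tuple[str, str], str] = {
    ("Draft", "submit"): "Submitted",
    ("Submitted", "approve"): "Approved",
    ("Submitted", "reject"): "Cancelled",
    ("Approved", "ship"): "Shipped",
    ("Approved", "cancel"): "Cancelled",
}


def next_state(state: str, event: str) -> str:
    """Apply an event to a state, rejecting transitions the life cycle does not define."""
    try:
        return ORDER_LIFECYCLE[(state, event)]
    except KeyError:
        raise ValueError(f"event '{event}' is not allowed in state '{state}'") from None


def state_values(lifecycle: dict[tuple[str, str], str]) -> list[str]:
    """Derive the set of valid state values from the transitions."""
    states: set[str] = set()
    for (source, _event), target in lifecycle.items():
        states.update((source, target))
    return sorted(states)
```

Deriving the state values from the transitions, and not listing them separately, keeps the set of codes
consistent with the events that the business actually recognizes.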
4.3 Summary
Defining and maintaining data architecture is a collaborative effort that requires active participation from
data stewards and other subject matter experts, facilitation, and support from data architects and other data
analysts. Data architects and analysts must work to optimize the valuable experience provided by data
stewards. The Data Management executive must frequently communicate the business case for defining
and maintaining the data architecture. In addition, the executive must ensure that critical resources are
available and committed to project goals.
Data architecture is driven by changes in the business. Maintaining data architecture requires periodic
reviews by data stewards. Routine updates to existing data architecture, such as reference data, can solve
many problems quickly. The most significant issues often require project authorizations.
The value of data architecture is limited until data stewards actively manage data architecture, or
management designates data architecture as a best practice for systems implementation. The data
governance council or other body that can approve the enterprise data architecture is critical to coordinating
data, business processes, systems, and technology architectures.
Data architecture is only one part of the overall enterprise architecture. It serves as a guide for integration.
It is useful to consult the data architecture during:
• Defining and evaluating new information systems projects: The enterprise data architecture serves
as a zoning plan for the long-term integration of information systems. It affects project goals and
objectives, and influences the priority of projects in the project portfolio. Enterprise data
architecture also influences project scope boundaries and system versions.
• Defining project data requirements: Enterprise data architecture provides enterprise data
requirements for individual projects, which speeds up the identification and definition of these
requirements.
• Project Data Design Review: Design reviews ensure that conceptual, logical, and physical data
models fit and contribute to the long-term implementation of the enterprise data architecture.
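One small part of a project data design review, checking a project model against the enterprise data
model, can be sketched as a set comparison. The function name, the result keys, and the sample entity
names below are hypothetical. A real review also covers definitions, relationships, and attributes, which
this sketch does not attempt.

```python
from __future__ import annotations


def review_project_model(enterprise_entities: set[str],
                         project_entities: set[str]) -> dict[str, list[str]]:
    """Split project entities into those found in the enterprise model and those not."""
    return {
        "aligned": sorted(project_entities & enterprise_entities),
        # Candidates for discussion: either project-specific data or gaps in the enterprise model.
        "not_in_enterprise_model": sorted(project_entities - enterprise_entities),
    }


result = review_project_model(
    enterprise_entities={"Customer", "Order", "Product"},
    project_entities={"Customer", "Order", "Shipment"},
)
print(result)
```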