TDWI Checklist Report SLMDA AtScale Russom Web
Many of the most exciting innovations and advancements in data management today are occurring within the semantic layer of data architectures and data stacks. For example, we are witnessing new or improved approaches to semantic modeling, data cataloging, and data lineage. Even older forms of managing semantics, such as metadata and virtualization, are being infused with new techniques for agile modeling, performance optimization for logical and virtual data environments, and intelligent augmentation (i.e., tool algorithms driven by machine learning and graph analytics).

The innovations of the semantic layer also play a role in improving large-scale data and analytics architectures. For example, the new definition of data fabric is not possible without a modern semantic layer, and the semantic layer can be a backbone for unifying new data and analytics architectures in the cloud. Furthermore, a well-designed semantic layer allows analytics teams to define business metrics, hierarchies, and dimensions on top of big data while providing a means to centrally govern data access and deliver high-performance interactive queries.

This TDWI Checklist educates data and analytics leaders about modern platforms and practices for the semantic layer. It does so by discussing five beneficial characteristics of the modern semantic layer in the context of the semantic layer’s critical roles in modern data architectures.

NOTE: This report assumes the reader is familiar with data architectures. Readers needing more details can read the TDWI Checklist Report: Six Requirements for the Modern Data and Analytics Cloud Stack.

Understanding the semantic layer’s critical roles:
1 Understand what a semantic layer should be and do
2 Consider why your data architecture needs a modern semantic layer
3 Recognize where a semantic layer fits in today’s data architectures
4 Understand how the semantic layer helps remodel distributed data
5 Appreciate how a semantic layer contributes to common data architectures

1 Understand what a semantic layer should be and do

Today, a truly modern semantic layer is a standalone tool type that provides data semantics services for multiple tools within a multitool and multiplatform data architecture. This gives the modern semantic layer the ability to serve many architectural layers, tool types, data platforms, use cases, and business departments without favoring one over another.

The data descriptions (or “data about data”) created and managed by a semantic layer may take the form of older techniques (such as metadata management, federation, and virtualization) or newer ones (such as data catalogs, data lineages, dimensional models, or automated generation of data descriptions via knowledge graphs). Whether the semantic layer is a unified environment from a vendor or assembled by technical users as a “best of breed” collection of multiple tools from multiple vendors, it should support many semantic tool functions.

A semantic layer platform must go beyond data definitions to provide rich capabilities in semantic modeling and data modeling. In other words, a tool for the semantic layer should actively support the creation of new data structures and data products (whether federated, virtual, logical, or data sets in storage), not just descriptions of source data and its characteristics.

A semantic layer must translate data consumer requests to the flavor of SQL preferred by the source data platform. It must accommodate multiple inbound protocols (not just SQL) because tools themselves support different protocols. For example, to support Excel, the semantic layer should support MDX. For data science, it should support Python. For application developers, it should support REST.

In production, a semantic model solution must deliver “speed of thought” direct query performance through automated performance optimization. Query performance is imperative because without it, many end users will make redundant and ungoverned copies of data in the form of data extracts (TDEs in Tableau) or data imports (Power BI, Qlik, etc.). One of the greatest benefits of a modern semantic layer, extended with query optimization and virtualization, is that the semantic layer serves as an abstraction layer for governance, security, and a “single source of truth.”

2 Consider why your data architecture needs a modern semantic layer

The semantic layer provides a collection of data descriptions (and tools to create and maintain them) to make a single, centralized, and standardized architectural layer for most data semantics. Being centralized, a semantic layer delivers a standardized and consistent way of representing enterprise data to different types of users, tools, and data management processes.

This centralized approach simplifies many things, and it delivers important architectural benefits:

• Provides a governed, standardized, and consistent way of representing distributed enterprise data
• Can be a single point of entry, with single sign-on and other security for systems the semantic layer accesses
• Friendly descriptions of data that simplify and improve modern data practices (e.g., self-service, dashboard customization)
• Reuse of composable data objects and data products listed in the semantic layer
• Automation for data governance, monitoring via data, audit, and data observability
• When done well, can elevate data literacy and democracy

Today’s data architectures are trending toward centralized data organization paradigms (databases, data lakes, data warehouses, data science labs) within both cloud and on-premises architectures. Even when physically consolidated, many data environments are not logically organized to support direct analytics use.

A semantic layer helps to unify far-flung data architectures in that it:

• Describes data consistently for all layers of the architecture and beyond
• Reaches all platforms: multiple brands, on multiple clouds or on premises, etc.
• Can provide data views that incorporate data from many architectural layers and elsewhere
• Facilitates data access and interfacing for many users and tools
• Can enable virtualization for the logical data warehouse and virtual data lake

Because it is an abstracted layer, the semantic layer creates a kind of future proofing. By decoupling the layers of data consumption tools and data storage platforms, a semantic layer provides IT with the freedom to consolidate, move, or transition its data without disrupting end-user data consumption. A semantic layer also provides an open platform for plugging in new or different data consumption tools as they arise.

Essentially, a modern, independent semantic layer can support just about any variation of data architecture available today.

[...] layers; the data fabric gets the advanced semantic functionality and automation it needs; but the whole data architecture still has access to standalone and independent semantic layer tooling. This also creates a unification effect, which is beneficial to multiplatform architectures.

[Figure: data consumption tools (data science, business intelligence, self-service)]

[...] exposed multiple ways to multiple users, teams, and tools. This is so that users can avail themselves of multiple data consumption styles, unlike the limited approaches typical of embedded semantic layers or traditional data warehouses.
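
The decoupling of data consumption tools from data storage platforms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SemanticCatalog` class, its `bind` and `resolve` methods, and the table names are all hypothetical, and the placeholder substitution stands in for the much richer query rewriting a real semantic layer would perform.

```python
class SemanticCatalog:
    """Maps logical dataset names to physical locations (hypothetical sketch)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, logical_name, physical_table):
        """Point a logical name at a physical table; rebinding replaces the old target."""
        self._bindings[logical_name] = physical_table

    def resolve(self, query_template):
        """Replace {logical_name} placeholders with the current physical tables."""
        return query_template.format(**self._bindings)


catalog = SemanticCatalog()
catalog.bind("sales", "onprem_dw.fact_sales")

# The consumer's query refers only to the logical name "sales".
consumer_query = "SELECT region, SUM(amount) FROM {sales} GROUP BY region"
print(catalog.resolve(consumer_query))

# IT moves the data to another platform and rebinds the logical name.
catalog.bind("sales", "cloud_lake.curated_sales")
print(catalog.resolve(consumer_query))
```

The consumer query never changes; only the binding does, which is the kind of future proofing the abstraction is meant to provide.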

[...] higher quality. Similarly, some DWs operate almost exclusively with technical metadata; a semantic layer helps the DW team embrace business metadata and more advanced forms of semantics, such as the data catalog and data lineage.

The business-friendly semantics of the semantic layer enable DWs to participate in more use cases. This is particularly useful when less-technical users want to access the data in a DW for self-service or when users need to personalize their management dashboards with metrics and KPIs from a DW.

The semantic layer is a key enabler for the logical DW. A logical DW is inherently multiplatform, hybrid, and distributed. To make this complicated microarchitecture seem more unified and usable, DW professionals use data virtualization and data views. Because a semantic layer is inherently logical, virtual, and view-driven, it can be a natural addition to a DW to make it a true logical DW.

THE SEMANTIC LAYER HELPS THE DATA LAKE

The semantic layer helps a data lake avoid becoming a data swamp. A lack of metadata and other data semantics is the leading cause of swamps; a semantic layer provides ample metadata management for this situation.

THE SEMANTIC LAYER HELPS THE DATA FABRIC

The most recent definition of the data fabric is “an architecture for unifying and governing multiple data management and data semantics disciplines, from data integration and quality to metadata and data cataloging.” Among other requirements, a data fabric requires sophisticated semantics that are centralized, standardized, and shared for the many tool types found in modern data fabrics. The semantic layer satisfies this requirement.

THE SEMANTIC LAYER HELPS DATAOPS

All data-driven development processes benefit from better semantics and DataOps is such a process. In fact, the semantic layer can help DataOps achieve many of its key objectives by providing:

• Centralized, standardized, and shared data semantics for data engineering but with more features than a metadata repository or catalog
• Automation for data semantics to accelerate the delivery of data products
• Semantic modeling (faster than models based on aggregation)
• Reduced data prep and design work
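
One way automation for data semantics might look in a DataOps pipeline is an automated check that metric definitions still match the source schemas before a data product is released. The Python sketch below is an assumption made for illustration only: the `validate_model` function, the dictionary layout of the metric definitions, and the sample tables are hypothetical and do not describe any specific tool.

```python
def validate_model(metrics, source_columns):
    """Return a list of problems found in metric definitions.

    metrics:        {metric_name: {"table": str, "column": str}}
    source_columns: {table_name: set of column names}
    """
    problems = []
    for name, spec in metrics.items():
        table, column = spec["table"], spec["column"]
        if table not in source_columns:
            problems.append(f"{name}: unknown table {table!r}")
        elif column not in source_columns[table]:
            problems.append(f"{name}: unknown column {table}.{column}")
    return problems


# Hypothetical shared definitions and a snapshot of the source schema.
metrics = {
    "total_revenue": {"table": "orders", "column": "amount"},
    "refund_total": {"table": "refunds", "column": "amount"},
    "avg_discount": {"table": "orders", "column": "discount"},
}
source_columns = {"orders": {"order_id", "amount", "region"}}

for problem in validate_model(metrics, source_columns):
    print(problem)
```

Run as part of a build, a check like this would flag definitions that have drifted from the underlying data before they reach dashboards or other consumers.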

Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies. Inclusion of a vendor, product, or service in TDWI research does not constitute an endorsement by TDWI or its management. Sponsorship of a publication should not be construed as an endorsement of the sponsor organization or validation of its claims.

tdwi.org