The Practical Guide to Using a Semantic Layer for Data and Analytics

Table of Contents
Introduction: What is a Semantic Layer?

Trends Driving the Need for a Semantic Layer

Use Cases for a Semantic Layer

Healthcare

Retail

Consumer Packaged Goods

Financial Services

Understanding the Data and Analytics Maturity Model

Rising Up the Data and Analytics Maturity Scale

Why Adopt a Semantic Layer

The top five challenges a semantic layer can solve

#1 Business units have preferences for different analytics tools

#2 Users complain about a lack of access to data

#3 The slow pace of data integration drives businesses to DIY

#4 Reports from different BI tools use similar terms but show different results

#5 Business execs express doubt in the numbers

Super-powering decisions with the data and analytics flywheel

Key Considerations for Choosing a Semantic Layer

Using a Semantic Layer in Practice vs. a DIY Approach

Using Power BI without a semantic layer

Defining the semantic layer

Consuming from the semantic layer using Power BI

Workbook: How a Semantic Layer Works for You

Assessment: Where do you fall on the maturity scale?

The Best Business Case for a Semantic Layer

© 2021 AtScale Inc. All rights reserved. 1



Introduction: What is a Semantic Layer?


You may have heard the term semantic layer before; it’s been around for some time. People invented semantic layers
to mold relational databases and their SQL dialects into an approachable interface for business users. In 1992,
Business Objects patented the concept and formalized its implementation as the Business Objects Universe™. Since
then, measures and dimensions, as an abstraction of SQL, have become the preferred language
for business users.

Until recently, however, the semantic layer was always closely tied to a business intelligence (BI) platform. As long as
enterprises remained within the confines of their BI vendor of choice, everything worked well. Today, there are more
ways than ever to analyze data. Long gone are the days when one BI platform ruled them all. Tightly coupling a
semantic layer to one analytics consumption style no longer makes sense.

To expand on that, the explosion of self-service BI has freed business users from relying on IT-prepared analytics,
but at the expense of data consistency and trust in analytics’ output. Business definitions and terms have become
mutable, malleable, and subject to interpretation. While it’s great that business users now have self-service BI tools,
they also need to be working off of consistent, high-quality data. The cost of bad data is enormous: according to IBM,
poor data quality costs the U.S. economy a staggering $3.1 trillion annually.

Luckily, a semantic layer that’s decoupled from the point of consumption can help ease these problems with data
quality and empower self-service analytics. A well-designed semantic layer can lead to better data-driven decisions.
It’s a critical part of the modern analytics stack.

Using a semantic layer simplifies many complexities of business data and creates flexibility among new data
platforms and tools. Perhaps most importantly, these solutions can empower everyone on your team to be a data
analyst, by ensuring that people are playing by the same rules when it comes to data.

Making all of this work involves a series of building blocks.

Key trends driving a need for a semantic layer

Real use-cases for a semantic layer across industries

Best practices and key considerations for choosing a semantic layer for your business.

Let’s get started!


Trends Driving the Need for a Semantic Layer


Cloud data lakes and cloud data warehouses like Snowflake, BigQuery, Redshift, Databricks and more have become
well-accepted data platform architectures. According to the AtScale 2020 Big Data & Analytics Maturity Survey, 61%
of respondents currently operate cloud data platforms, and 48% plan on deploying them soon. In the meantime,
Hadoop didn’t become the be-all end-all data solution but just one solution for managing data.

As the volume of data in the cloud grows, data architects are becoming more comfortable with data
living in different locations and in different platform architectures. However, this gives rise to a new challenge for
IT: managing data access and quality across multiple silos. A semantic layer becomes a critical piece in a cloud data
platform strategy (or a blended cloud and on-prem strategy).

Both data scientists and BI users need access to clean, understandable data. Today’s self-service architectures often
force analytics consumers to become data wranglers and data engineers. In fact, the average data scientist spends
over 45% of their time preparing data rather than modeling it.

Asking business users and data scientists to program their own metrics and business terms is both a massive waste
of time and a recipe for chaos and inconsistency. A semantic layer solves this problem by defining business metrics,
data access, and transformations in one place. That way, analytics consumers are almost guaranteed to speak the
same language, regardless of their use case or toolsets.
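To make "defining business metrics in one place" concrete, here is a minimal Python sketch. It is not how any particular semantic layer product works; the `sales` table, the column names, and the `build_query` helper are all hypothetical and exist only for illustration. The point is that each metric's formula is written once and every consumer compiles queries from the same definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A business metric defined once and reused by every consumer."""
    name: str
    expression: str  # SQL aggregate expression over the fact table


# Hypothetical central definitions; table and column names are illustrative.
FACT_TABLE = "sales"
METRICS = {
    "revenue": Metric("revenue", "SUM(quantity * unit_price)"),
    "order_count": Metric("order_count", "COUNT(DISTINCT order_id)"),
}
# Business-friendly dimension names mapped to physical column names.
DIMENSIONS = {"region": "region", "month": "order_month"}


def build_query(metric_names, dimension_names):
    """Compile business terms into a single SQL statement."""
    selects = [f"{DIMENSIONS[d]} AS {d}" for d in dimension_names]
    selects += [f"{METRICS[m].expression} AS {m}" for m in metric_names]
    sql = f"SELECT {', '.join(selects)} FROM {FACT_TABLE}"
    if dimension_names:
        sql += " GROUP BY " + ", ".join(DIMENSIONS[d] for d in dimension_names)
    return sql


# A BI dashboard and a data science notebook would both call this,
# so "revenue" means the same thing in each.
print(build_query(["revenue"], ["region"]))
```

Because the formula for `revenue` lives only in `METRICS`, changing it there changes it for every tool that asks for it, which is the consistency the paragraph above describes.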

Finally, a semantic layer can serve as a central governance gateway across the enterprise, which is crucial as the
number of silos and data access points explodes. A semantic layer serves as a single point of access so IT can secure
data and control access across the organization. The same Big Data & Analytics Maturity Survey referenced above
shows that nearly 80% of enterprises rank security and governance as critical to their success in the cloud.



As you can see by the diagram above, the semantic layer sits between the point of analytics consumption and the
data warehouse and data lake. A semantic layer hides the physical complexity from end users and provides them with
understandable business terms and user-friendly data, instead of raw SQL and database schemas. This level of data
virtualization makes data access possible for any analytics consumer.


Use Cases for a Semantic Layer

Regardless of the industry you’re in, a semantic layer can be an effective solution to democratize data access and
create a culture in which everyone can be a data analyst. Let’s look at some key use cases across industries.

Healthcare
Many pharmaceutical and healthcare companies operate in highly complex and heavily regulated industries. As
you might imagine, their businesses depend on data for success. Some teams choose to build their own data and
analytics platforms or make use of pre-existing components. In either scenario, a semantic layer helps to democratize
access to data across the company. One of the many benefits of this approach is that it lets healthcare companies
focus data and analytics efforts on activities that impact profit and loss. For a pharmaceutical company, even
a single percentage point of efficiency improvement could have a tremendous impact on margins. The goal is to
take a forward-facing, predictive approach to data, rather than simply looking back on reports of what has already
happened.

This approach also dramatically improves data accuracy and reduces replication of data across multiple data
stores. In addition, it provides common controls and a shared backlog so that business and IT teams can define work
in big room planning sessions and pull work from a common backlog for sprints. Finally, a semantic layer provides
crucial security and governance controls, so that sensitive information remains protected (but more on that later).

Retail and eCommerce


Retailers and eCommerce providers rely on their data and infrastructure to compete. With a plethora of options
available to shoppers both online and in-store, the retailers with the best data-driven strategies can provide highly
tailored recommendations and adapt to changing customer preferences.

This agility stems from the ability of everyone on the team to be a data analyst. When dealing with a high volume
of traffic and the resulting mountain of data, the data team’s top priority is empowering business users to leverage
whichever data tools they like best while enabling them to get reliable, accurate answers quickly. Adopting
a semantic layer from AtScale helps teams accelerate time to insight from data, agnostic of their underlying
infrastructure.


Second, data producers need a set of technologies in order to do their jobs well. This could include underlying
dimensional models or training sets for a machine learning model. Finally, infrastructure powers the activities of both
user groups (this includes compute engines and storage systems for data).

Many large retailers have undertaken a transformation to cloud-based infrastructure, which provides a perfect test
case to use AtScale. The goal is to drive end user adoption of cloud technologies through the implementation of a
semantic layer that democratizes data access.

Treating “Data as Code” with a Semantic Layer

Architecting your data as code abstracts your data out of proprietary applications and into a semantic layer. In a
perfect world, data models can be viewed and shared as open source code or via APIs, which creates an ecosystem
where data consumers can leverage common data models without reinventing the wheel.

Let’s look at an example of this idea in action. For a major home improvement retailer, it was difficult for employees to
make certain store-level calculations without a common data model.

The data engineering team created an API for common, hard-to-calculate business metrics (e.g., store SKU gross
margins) for both internal and external use, including by supply chain partners. With the data model extended through
a common API, supply chain partners can plug in and access analytics that conform to a standard way of talking
about data.

By exposing SKU-level data by store, market or region back to their suppliers, the home improvement retailer could
plan better, making sure those shipments go to stores that need them most. The company built a vendor portal that
embeds AtScale’s semantic layer to expose data with the right level of security and governance. Suppliers can now do
live queries at the SKU level and know exactly where to ship their products. This level of partner data sharing creates
data self-service not only within your own company, but with your trusted vendors and suppliers.
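As a rough illustration of the kind of shared calculation described above, the Python sketch below computes gross margin per store and SKU from raw sales rows. The retailer's actual API and data model are not described in this guide, so the function name, the row keys (`store`, `sku`, `revenue`, `cost`), and the sample data are all assumptions. Gross margin is taken here as (revenue - cost) / revenue, the standard definition.

```python
from collections import defaultdict


def gross_margin_by_store_sku(sales_rows):
    """Return {(store, sku): gross margin ratio} from raw sales rows.

    Each row is a dict with hypothetical keys: store, sku, revenue, cost.
    Margin is computed on summed totals, not as an average of row margins.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # (store, sku) -> [revenue, cost]
    for row in sales_rows:
        key = (row["store"], row["sku"])
        totals[key][0] += row["revenue"]
        totals[key][1] += row["cost"]
    return {
        key: (revenue - cost) / revenue
        for key, (revenue, cost) in totals.items()
        if revenue  # skip zero-revenue keys to avoid division by zero
    }


# Illustrative sample data only.
rows = [
    {"store": "S1", "sku": "HAMMER", "revenue": 100.0, "cost": 60.0},
    {"store": "S1", "sku": "HAMMER", "revenue": 100.0, "cost": 80.0},
    {"store": "S2", "sku": "HAMMER", "revenue": 50.0, "cost": 25.0},
]
margins = gross_margin_by_store_sku(rows)
```

Publishing one function like this behind an API means internal analysts and suppliers all get the same margin figure. Summing before dividing also avoids the "average of averages" error that tends to creep in when each team calculates the metric separately.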

Consumer Packaged Goods


For data teams at consumer packaged goods (CPG) companies, it’s a natural fit to institutionalize the idea of data
as a product. In other words, their team treats data as a path to unlocking value for the business user. Smart CPG
companies leverage concepts from engineering and product management in the software world and apply those
approaches to data, with impressive results.


One major CPG company has reduced data silos and enabled business users to consume data using a
semantic layer. They have developed a logical model for the business that serves as a sort of “digital twin” for the
physical business. This semantic, logical model makes it possible for business users to query data and get answers
using terms that the team already understands.

Plus, by leveraging AtScale’s semantic layer, this company can separate the consumption of data from where that
data actually lives and how it is stored. This way, data can live anywhere and in any format without slowing people
down. Any business user can ask questions and feel confident that they are receiving correct, consistent answers.

Financial Services
With a semantic layer, financial services organizations can save millions in total cost of ownership for analytics,
while simultaneously avoiding the risk of regulatory penalties. Due to federal reporting requirements, hundreds of
analysts may need to drill down across thousands of business calculations to properly sign off and file reports on a
regular basis.

Legacy data structures, however, can create excessive silos. As data volumes grow, business intelligence (BI),
development, and database engineering teams spend significant time managing caches and manually joining data
from various sources. Meanwhile, the business has to bear the risk associated with penalties due to inaccurate or late
reporting.

Investing in a semantic layer can automate the management of data engineering previously done by busy BI teams.
Automation can take the manpower out of data preparation, by aggregating raw data based on end user behavior.
All of this can be done while enhancing existing security and governance controls, and mitigating risk of inaccurate
reporting.

As a result, analysts have performant access via a single source of truth, meeting regulatory requirements. This
restores trust in analytics and allows engineering, BI and data engineering teams to spend their time on more
productive activities.


Understanding the Data and Analytics Maturity Model

Building an effective data and analytics operation requires adopting a maturity scale that grades your team’s efforts.
The AtScale Data & Analytics Maturity Model covers six capabilities and four levels of maturity for each capability,
with the end goal of empowering everyone in your organization to make data-driven decisions.

Data begins with assessing how and where you store data and the steps to enhance it for consumption. But for data
to be useful, it needs to be easily accessible to people within your organization to make data-driven decisions at
scale. This involves providing for atomic data access, timely access, and dual access to raw and normalized versions of
data.

Next, data needs context — descriptions about what it is, where it’s from, and how it was collected. Such a business-
friendly data model makes the data usable and enables self-service data usage without needing specialists to
interpret the data.

From the data, we shift our focus to the person consuming the data — who they are, what they need, and their
required skill level to be productive. Analyzing this information helps organizations make their data and analytics
programs accessible to their staff members, regardless of their skill level or data capabilities.

But how does the end user consume that data? What levels of sophistication are required, and how will the data be
used and shared with other people? What are the ramifications of sharing potentially sensitive information for data
and analytics users? What guardrails should you put in place to prevent that data from falling into the wrong hands?
Finally, it’s important to remember that users want to consume data with the tools they feel most comfortable using.

Insights are the last step before we can empower users to make data-driven decisions. In this stage, you transform
data into actionable insights.


See the workbook section below for a data and analytics maturity assessment scale to determine
where your organization stands and identify areas for improvement.

Rising Up the Data and Analytics Maturity Scale


The AtScale Data & Analytics Maturity Model isn’t a hard-and-fast declaration of where your organization stands,
but rather a guide to follow on your way to Level 3 (the highest maturity level) of data and analytics strategy and
implementation.

There are four levels on the maturity scale:

Level 0: Initial

Level 1: Procedural

Level 2: Proactive

Level 3: Leading

LEVEL 0: INITIAL

At Level 0, silos are the name of the game. Analytically speaking, your teams work in isolation to choose their
technical stacks and how they integrate data — much of which is done on an ad hoc basis inside business intelligence
tools.

Data at this level also tends to be siloed in customized data marts or accessed with little to no automation for timely
updates and analysis. For organizations at Level 0, only advanced specialists can wrangle data, analyze it, or write the
necessary SQL code to make sense of it.

LEVEL 1: PROCEDURAL

A step up from the initial level, Level 1 attempts to bring some order to data access and analytics by establishing a
core team of data engineers who curate the organization’s data warehouse. This central team typically uses a range
of commercial and homegrown tools to transform raw data into database tables.

Level 1 maturity means the data team is dictating the data analysis tool sets. Business users and data scientists will
often depend on this team for access to new datasets — often subject to a development roadmap or queue.


Business users are also typically responsible for authoring their reports using the central team’s star schema, which
is the most widely used approach to developing data warehouses and dimensional data marts. The challenges at this level
include the limited speed of access and data use (because of the development queues) and pre-defined data schemas
that may not apply to individual teams’ needs.

LEVEL 2: PROACTIVE

At Level 2, the focus shifts to the user. Proactive organizations at this level go beyond just providing carefully curated
data toward introducing more atomic, user-driven data access for business users and data scientists using a
self-serve model. By augmenting their ETL data pipelines with data virtualization tools, organizations at this level are
more agile in responding to business data needs.

At an advanced version of Level 2, organizations may even augment their proprietary data with third-party data to
provide richer datasets for deeper insights. These organizations may also support access to AutoML tools so that any
data user can build predictive business forecasts and customer experience models.

LEVEL 3: LEADING

Three words exemplify Level 3 organizations: universal data access. Combining the best parts of Levels 1 and 2 (order
and self-serve access), Level 3 organizations typically introduce a semantic layer to simplify self-service data access.
Even better, data at this level is available to anyone in the organization to use for data-driven decisions — not just
data analysts and scientists. They can also access much of this data using the tools and interfaces of their choice.
Introducing a semantic layer to the data tech stack simplifies data access, drives analytics consistency, and promotes
good data governance. It also expands analytics from just BI and AI tools to apps inside and outside the company —
fully embeddable and shareable with third parties.

Why Adopt a Semantic Layer?


The above levels can help you gauge your organization’s data and analytics maturity and provide a roadmap for
improvement. But why bother investing in the tools, training, and thinking to get to Level 3’s semantic layer-backed
maturity?


The top five challenges a semantic layer can solve


There are common problems that crop up without a semantic layer facilitating decision-making in an organization.
We can group these problems into five areas:

1 Different analytics tool preferences

2 Lack of data access

3 Slow data integration leading to siloed solutions

4 Inconsistent BI reports across different business units

5 Low data confidence

What follows is a deep dive into each of these challenges and an explanation of how a semantic layer can help solve
it.

#1 BUSINESS UNITS HAVE PREFERENCES FOR DIFFERENT ANALYTICS TOOLS

Larger organizations have a tougher time imposing a single analytics standard across the board. This can be because
of the disruption of an acquisition, resistance to change, or factors that limit management’s ability to enforce unified
standards.

Dresner reports that many enterprises use three or more BI tools, with each tool having its own source of truth. Throw
in possibilities of inaccurate reports from business analysts or misleading predictions from data scientists, and it’s
easy to see how multiple tools can lead to multiple truths — and that’s not a good thing!

And the pace of change in cloud data warehousing, BI, and AI/ML has resulted in a constant cycle of upgrades, re-
platforms, and re-factors across different organizations. From a time, cost, and business impact perspective, it’s hard
to keep up with these changes.

A semantic layer neatly solves this problem by providing analytics-as-a-service (AaaS) to your business users and
data scientists. This lets you grant data access to your end users via their tools of choice while maintaining data
governance and semantic consistency.

#2 USERS COMPLAIN ABOUT A LACK OF ACCESS TO DATA

Data is plentiful, but coherent data is another story. Business analysts and data scientists can’t rely on just any data.
They need to understand the data in log files, relational tables, and other data stores through metadata. If that’s
missing, it leads to time wasted on interpretation and even inaccurate results that can hurt business performance.

The research supports this, too — Gartner reports that 87% of organizations have low BI and analytics maturity. You
might have abundant data, but your data consumers struggle to make sense of it — and it’s hampering the speed at
which they can make accurate decisions. A semantic layer eases this pain by powering your data model with crucial
context to aid decision-making.


#3 THE SLOW PACE OF DATA INTEGRATION DRIVES BUSINESSES TO DIY

Business today moves quickly, and waiting for a centralized data team to produce reports and dashboards for different
departmental use cases is not a good option. There’s a clear link between data-driven decision-making and business
performance: MIT research found that companies in the top third of their industry in the use of data-driven
decision-making were, on average, 5% more productive and 6% more profitable than their peers.

The move to the cloud and the rise of big data have powered a BI revolution, leading business users to take reporting
and data engineering into their own hands. This is a positive shift. But it also has its drawbacks, with many data
platforms and data marts proliferating everywhere and making data governance difficult. Such a situation shows the
need for a semantic layer to simplify and streamline data access and use.

#4 REPORTS FROM DIFFERENT BI TOOLS USE SIMILAR TERMS BUT SHOW DIFFERENT RESULTS

Of course, having multiple BI tools across the organization produces differing results for similar queries. Each BI
tool comes with its own modeling layer, and all of them support custom calculations, so it’s easy enough to create
wildly divergent reports off of the same data. That’s not even accounting for table join errors, flawed time-based
calculations, or just simple formula mistakes. This leads to a common consequence:

#5 BUSINESS EXECS EXPRESS DOUBT IN THE NUMBERS

Experian reports that six in 10 companies believe that high-quality data increases business efficiency, 44% believe
it raises consumer trust, 43% conclude it enhances customer satisfaction, 42% believe it drives more informed
decision-making, and 41% report that good data cuts costs.

However, this isn’t the reality for most businesses today. Many companies cannot be sure of the reliability of their
data. This introduces doubt and delays in decision-making — a significant drawback considering that trust in data is a
major competitive advantage.

Using one source of truth naturally leads to more trust in the data, so if you find your business users employing
different analytics tools to do their analyses, you may be suffering from a confidence crisis that a semantic layer
could solve.

There are several approaches to implementing a semantic layer in your organization. Below is a table
with the pros and cons of each:


Business Intelligence Platforms
Traditional BI platforms that bundle data modeling, query management and visualization
EXAMPLE VENDORS: Tableau, Power BI, IBM Cognos, SAP Business Objects, Looker
PROS
+ No extra technology needed
+ Tight integration
+ Business user friendly
CONS
- Semantic layer specific to BI tool only (not reusable)
- Vendor lock-in

Data Virtualization Platforms
Platforms that abstract away the physical source and location in a tabular format
EXAMPLE VENDORS: Denodo, Dremio
PROS
+ Provides flexibility in how/where data is stored
+ Semantic layer can be used across a variety of tools
CONS
- Not friendly for business users (tables, columns)
- Data models need to be built before accessing data
- Query performance is not guaranteed and/or needs manual tuning

Data Warehouse / Data Marts
A database of information from a variety of data sources
EXAMPLE VENDORS: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse SQL Analytics
PROS
+ Single source of truth
+ Widest array of tool/query access
+ Easy to secure
CONS
- Not friendly for business users (tables, columns)
- Slow to integrate new data sources
- Dependence on IT

Semantic Layer Solution
A platform that presents a business data view that helps users access data autonomously using common business terms
EXAMPLE VENDORS: AtScale, SQL Server Analysis Services
PROS
+ Business user friendly
+ Single source of truth
+ Provides flexibility in how/where data is stored
+ Semantic layer can be used across a variety of tools
+ Easy to use
CONS
- Extra technology layer required
- Data models need to be built before accessing data


Super-powering decisions with a data and analytics “flywheel”

Now that we’ve covered the data and analytics maturity model and top challenges a semantic layer can solve, let’s
discuss one of the biggest reasons to invest in a semantic layer. It can create a data-driven decisions “flywheel”
to super-power your organization’s ability to use data in every decision you make. Investing in the right tools and
processes will serve as the “grease” to making your analytics flywheel spin.

Amazon is obviously a great success story and their leadership principles are admired by many across a variety of
industries. The Amazon Virtuous Cycle is a strategy that leverages a great customer experience to drive traffic to the
platform and third-party sellers. This in turn improves the selection of goods to further lower Amazon’s cost structure
so it can decrease prices, which then spins the flywheel. This is the Amazon Flywheel.

This virtuous cycle principle can also work as a strategy for accelerating data-driven decisions in your organization.
The illustration below explains how this flywheel effect can drive more, higher quality data to analyze and, most
importantly, smarter decisions.

THE ROLE OF THE SEMANTIC LAYER IN THE FLYWHEEL

A semantic layer is critical to powering the flywheel effect because it creates a logical view of your data. By
translating raw, physical data into business-friendly terms, a semantic layer creates “analytics ready” data, making
data accessible to an audience beyond data engineers and analytics experts. By making data consumable by everyone
in an organization, the semantic layer becomes the “grease” for the flywheel, making it spin easier and faster.

Besides serving as a single source of truth, the independent semantic layer also insulates the organization from
future technology changes, including new data platforms and consumption tools. By decoupling query tooling from
the physical data platform, you can effectively “future proof” your analytics stack. Even better, the semantic layer
also hides the location and format of data from users – whether the data lives in a data warehouse, data lake or SaaS
applications. This makes finding and accessing data trivial for all users, freeing them to make more decisions with
less data wrangling.
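The decoupling described above can be sketched in a few lines of Python. This is an assumption-laden toy, not any vendor's implementation: the two "platforms" are simply two tables in an in-memory SQLite database, and the `PHYSICAL_MODELS` mapping and `logical_query` helper are hypothetical names. It shows a consumer asking for the same business fields while the layer resolves them to different physical tables and columns.

```python
import sqlite3

# Hypothetical physical layouts for the same business entity on two platforms.
PHYSICAL_MODELS = {
    "warehouse": {
        "table": "dw_orders",
        "columns": {"customer": "cust_nm", "amount": "ord_amt"},
    },
    "lake": {
        "table": "raw_orders",
        "columns": {"customer": "customer_name", "amount": "total"},
    },
}


def logical_query(platform, fields):
    """Translate business field names into SQL for the chosen platform."""
    model = PHYSICAL_MODELS[platform]
    select = ", ".join(f"{model['columns'][f]} AS {f}" for f in fields)
    return f"SELECT {select} FROM {model['table']}"


# Stand-ins for two separate data platforms, using illustrative data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dw_orders (cust_nm TEXT, ord_amt REAL)")
conn.execute("CREATE TABLE raw_orders (customer_name TEXT, total REAL)")
conn.execute("INSERT INTO dw_orders VALUES ('Acme', 120.0)")
conn.execute("INSERT INTO raw_orders VALUES ('Acme', 120.0)")

# The consumer asks for the same business fields either way.
results = {
    platform: conn.execute(logical_query(platform, ["customer", "amount"])).fetchall()
    for platform in PHYSICAL_MODELS
}
```

If the data later moves from one platform to the other, only the mapping changes. The consumer's request for `customer` and `amount` stays the same, which is the "future proofing" the section refers to.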


HOW TO CREATE A DATA-DRIVEN DECISIONS FLYWHEEL

Now, let’s map each of the AtScale Data & Analytics Maturity Model capabilities to our flywheel and discuss how
mastering each capability is key to making our flywheel spin faster.

DATA

Of course we have to start with the data. To enable a flywheel, data needs to be stored in a form that’s
accessible by a variety of query languages and APIs. In other words, data needs to be reachable in situ in
order to provide live, up-to-date access for analytics consumers.

ACCESS

Besides making data available via ETL-driven data pipelines, it’s imperative that data is accessible via a
“live” interface to power an analytics flywheel. Data virtualization is a key technology for providing a real
time (or near real time) view of data to support the most demanding analytics use cases that power our
analytics flywheel.

MODEL

To summarize, a logical model of an organization’s physical data is crucial in making data easier to
understand and use by a wide range of users. In particular, a dimensional data model tends to provide
the most business-friendly interface and supports the widest range of consumption tools. By making
data consumable by more users, we create a larger audience making data-driven decisions to power our
analytics flywheel.

ANALYZE

By freeing users from the time-consuming drudgery of data engineering tasks (wrangling and modeling
data themselves), the semantic layer allows users to spend more time applying data to make decisions.
Coupled with data literacy, more users spending more time on making informed decisions spins our
flywheel faster.

CONSUME

Freedom to choose the best data tool for the job is key to driving more consumers to use data to make
decisions. The independent semantic layer makes this possible by delivering consistent, governed data
access to a variety of tools and applications. Some users may prefer using Excel for their analytics, others
a BI tool like Power BI or Tableau. Data scientists may prefer a Jupyter notebook. By allowing users to
leverage the tools they are most proficient in, we spin the flywheel faster through more productive users
making more data-driven decisions.

INSIGHTS

Now we arrive at the good part: turning data into meaningful insights. With our flywheel spinning, we
now have (1) more users, (2) who are more productive, (3) spending more time on analytics and less on
data wrangling, (4) using the tools of their choice to make more decisions. Even better, more data-driven
decisions beget even more data, generated from the output of AutoML and machine learning tools, which
feeds right back into our virtuous cycle. Our flywheel is now spinning.

Key Considerations for Choosing a Semantic Layer


Now that you have a sense of how a semantic layer can solve common data challenges, let’s talk about how to go
about selecting and implementing one. Choosing a semantic layer vendor can be daunting, but there are eight key
considerations to keep in mind as you pick the best one for your organization.

#1 Not tied to a single consumption style


As analytics have spread more within organizations, relying on one BI or AI/ML platform to meet everyone’s needs is
becoming less realistic. Also, a semantic layer tied to one set of consumption tools is by design not “universal” — and
in a landscape of many tools and analytics user personas, it’s crucial to choose a semantic layer decoupled from a
single consumption style or analytics tool.

#2 Offers tabular and multidimensional views


Semantic layers come in two flavors: tabular and multidimensional.

The tabular (or relational) model became popular in the 70s and 80s and relied on concepts like fact and dimension
tables. Tools based on this model were designed to make relational databases or data warehouses easier to query.
Multidimensional data layers go one step further by defining relationships and aggregation rules and adding
business-friendly context while negating the need for SQL.

It’s essential to choose a semantic layer tool that offers both views to cover a broader range of uses and consumption
styles.

#3 Supports data platform virtualization


Data has lived in lots of different homes over the years. First it was the mainframe, then the relational database,
followed by the data warehouse, the MPP database, the data lake, and back again to the (this time, cloud-hosted)
data warehouse.

These evolutions have brought significant changes to how data is accessed and used, and savvy organizations hedge
against data obsolescence through virtualization. Virtualization eliminates the cost of data migrations every time
a new trend grips the industry. A semantic layer vendor should offer data virtualization to abstract away platform
differences and minimize lock-in.

#4 Easy model development and sharing


Raw data is near-useless, but adding a data model to it makes it consumable information. The ideal semantic layer
vendor should enable easy authoring, sharing, and collaborating on data models. It should also allow the reuse of
common objects and conformed dimensions, the ability to model data visually, and a code-based approach that’s
compatible with your organization’s software development life cycle.

#5 Ability to express different business concepts and functions


Relational data is flexible and powerful, but it is often difficult to express high-level business constructs with it. These
constructs include time-based calculations (e.g., period-over-period), semi-additive metrics, ancestor/predecessor
functions, etc. Expressing these computations in SQL is challenging, so choose a semantic layer that supports
business constructs and core analytics requirements around time intelligence and hierarchical roll-ups.
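
To make two of these constructs concrete, here is a minimal sketch in plain Python. It is only an illustration of the calculations themselves, not of how any particular semantic layer implements them; the sample figures and the function names period_over_period and closing_inventory are hypothetical.

```python
from collections import defaultdict

# Hypothetical monthly sales totals, keyed by (year, month).
monthly_sales = {
    (2021, 1): 100.0,
    (2021, 2): 120.0,
    (2021, 3): 90.0,
}

def period_over_period(series):
    """Return {period: fractional change versus the previous period}."""
    periods = sorted(series)
    changes = {}
    for prev, curr in zip(periods, periods[1:]):
        changes[curr] = (series[curr] - series[prev]) / series[prev]
    return changes

# Hypothetical inventory snapshots: (month, warehouse, units on hand).
inventory = [
    (1, "east", 50), (1, "west", 30),
    (2, "east", 40), (2, "west", 35),
]

def closing_inventory(rows):
    """Semi-additive roll-up: sum across warehouses, but take the
    latest month's value across time instead of summing months."""
    by_month = defaultdict(int)
    for month, _warehouse, units in rows:
        by_month[month] += units
    return by_month[max(by_month)]

pop = period_over_period(monthly_sales)
closing = closing_inventory(inventory)
```

Inventory is a typical semi-additive metric: adding January's and February's balances together would double count stock, so the roll-up sums across warehouses and keeps only the last period across time.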

#6 Query performance and caching


Query performance and caching are critical considerations in the selection process. A semantic layer needs
to be consistent and performant to be of any use to its users, who expect blazingly fast performance from proprietary
databases.

This isn’t easy considering that many of today’s queries often include heterogeneous database joins that further
tax query performance. To overcome this challenge, choose a semantic layer vendor that includes a comprehensive
performance management system beyond simple caching techniques.
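
One general technique that goes beyond caching individual query results is aggregate-aware routing: answering a query from the smallest pre-aggregated table that still contains every dimension the query asks for. The sketch below illustrates that idea only. The table names, row counts, and the pick_table function are hypothetical, and this is not a description of any vendor's actual optimizer.

```python
# Hypothetical pre-aggregated tables: name -> (dimensions covered, row count).
# The raw fact table covers every dimension, so a candidate always exists
# for queries over these dimensions.
AGGREGATES = {
    "agg_country": ({"country"}, 200),
    "agg_country_state": ({"country", "state"}, 5_000),
    "sales_fact": ({"country", "state", "city", "product"}, 50_000_000),
}

def pick_table(query_dimensions):
    """Return the smallest table whose dimensions cover the query."""
    candidates = [
        (rows, name)
        for name, (dims, rows) in AGGREGATES.items()
        if set(query_dimensions) <= dims
    ]
    # Tuples sort by row count first, so min() picks the cheapest table.
    return min(candidates)[1]

country_table = pick_table({"country"})
state_table = pick_table({"country", "state"})
city_table = pick_table({"city"})
```

A query by country alone can be answered from a 200-row aggregate rather than the full fact table, which is where the savings in time and compute come from.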

#7 Support for BI and data science workloads


The need for clean, usable data doesn’t end with just the business analyst — as referenced above, data scientists
spend approximately 45% of their time just prepping data for use. A common data language and business terms are
more likely to ensure business analysts and data scientists have the same context and produce consistent results
and predictions. Choose a semantic layer that supports various workloads, including business intelligence and data
science.

#8 Security & governance


Because the semantic layer sits between the organization’s data and the analytics tools that access that data, the
platform must integrate with your organization’s security infrastructure. This can happen in two ways: authentication
and authorization.

First, the semantic layer must integrate with any existing single sign-on infrastructure to authenticate users, whether
through Active Directory, LDAP, OAuth, or any other authentication platform. Second, the semantic layer must
include the ability to mask sensitive columns, limit data rows based on user access rules, and, crucially, impersonate
users when querying underlying sources. Choose a semantic layer that incorporates these two critical security and
governance protocols.
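
The authorization half can be sketched in a few lines of Python. This is a simplified illustration of row filtering and column masking only. It does not cover authentication or impersonation, and the rule format, user names, and secure_view function are hypothetical rather than any product's API.

```python
# Hypothetical access rules. A real deployment would derive these from the
# organization's identity provider and governance policies.
ACCESS_RULES = {
    "analyst_americas": {
        "regions": {"Americas"},
        "masked_columns": {"customer_email"},
    },
    "admin": {
        "regions": {"Americas", "EMEA"},
        "masked_columns": set(),
    },
}

ROWS = [
    {"region": "Americas", "country": "U.S.", "customer_email": "a@example.com", "sales": 100},
    {"region": "Americas", "country": "Canada", "customer_email": "b@example.com", "sales": 60},
    {"region": "EMEA", "country": "France", "customer_email": "c@example.com", "sales": 80},
]

def secure_view(user, rows):
    """Apply row-level security, then mask sensitive columns."""
    rules = ACCESS_RULES[user]
    visible = []
    for row in rows:
        if row["region"] not in rules["regions"]:
            continue  # row-level security: drop rows outside the user's regions
        visible.append({
            column: ("***" if column in rules["masked_columns"] else value)
            for column, value in row.items()
        })
    return visible

americas_view = secure_view("analyst_americas", ROWS)
admin_view = secure_view("admin", ROWS)
```

Because the rules are applied in one place, every tool that queries through that layer sees the same filtered and masked result.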

Using a Semantic Layer in Practice vs. a DIY Approach

Perhaps the best way to truly understand how to use a semantic layer is to compare it to a “do it yourself” or DIY
approach. Let’s look at an example of using a Power BI dashboard connected to Snowflake with and without a
semantic layer.

Using Power BI without a semantic layer


Without a semantic layer, a Power BI user would connect to Snowflake directly using the
Snowflake Power BI database driver. From there, they’d need to find the right data warehouse or specify their data
warehouse size.

Allowing end users to define Snowflake compute configurations (data warehouse size) is very dangerous, likely
resulting in unpredictable compute costs whenever someone opens or views the Power BI workbook.

Once connected to Snowflake in Power BI, the end user may get access to data they may not have permission to see.
Next, they are forced to navigate and manually find the tables they need. Consider a simple example where the user
wants to analyze “sales by country.”

They’d need to locate and choose the sales table with the right level of granularity, then they’d need to find the
location table that has the country field, find its required foreign key, and map it back to the sales table. This is a very
error-prone process that could take minutes or hours, depending on the user’s familiarity with how the data is stored.
Even worse, this manual data wrangling process would need to be repeated for each additional Power BI workbook,
likely resulting in inconsistent reports.
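
To see what this manual work amounts to, here is a small sketch of the “sales by country” join. It uses Python's built-in sqlite3 module as a stand-in for Snowflake so it can run anywhere; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (location_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, location_id INTEGER, amount REAL);
    INSERT INTO location VALUES
        (1, 'Boston', 'U.S.'), (2, 'Austin', 'U.S.'), (3, 'Toronto', 'Canada');
    INSERT INTO sales VALUES
        (1, 1, 100.0), (2, 2, 50.0), (3, 3, 75.0), (4, 1, 25.0);
""")

# The user has to know which tables to use, which key joins them,
# and how the amount should be aggregated.
sales_by_country = conn.execute("""
    SELECT l.country, SUM(s.amount) AS sales_amount
    FROM sales AS s
    JOIN location AS l ON l.location_id = s.location_id
    GROUP BY l.country
    ORDER BY l.country
""").fetchall()
```

Every decision in that query (the table, the key, the aggregation) is one that each report author would have to get right independently.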

Defining the semantic layer

The alternative approach is to define a semantic layer once and use it many times.

With the drag-and-drop canvas of AtScale’s Design Center, a subject matter expert can connect directly to Snowflake
(or any data warehouse for that matter), select the right tables, define their relationships and expose columns for
consumption. Design Center users can leverage AtScale’s library to re-use conformed dimensions and existing
calculations to make the definition of the semantic layer easier and more consistent.

By exposing business-friendly dimensions, measures, hierarchies and complex calculations, the subject matter
expert can leverage AtScale Design Center to make raw data “analytics ready” for everyone, freeing consumers from
complicated and error-prone data engineering work.

In other words, by creating a semantic layer once, the complexity of the underlying data model is hidden from the
analytics consumer in Power BI. The semantic layer delivers pre-modeled data, ready for analysis.
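
The “define once, use many times” idea can be sketched as a tiny model that turns business-friendly names into SQL. This is a conceptual illustration only: the Dimension, Measure, and Model classes are hypothetical and are not AtScale's modeling format or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dimension:
    name: str       # business-friendly name shown to consumers
    table: str      # physical dimension table
    column: str     # physical column behind the name
    join_key: str   # key shared by the fact and dimension tables

@dataclass(frozen=True)
class Measure:
    name: str
    column: str
    aggregation: str  # e.g. SUM, AVG

@dataclass
class Model:
    fact_table: str
    dimensions: dict
    measures: dict

    def to_sql(self, measure_name, dimension_name):
        """Translate a (measure, dimension) request into a SQL query."""
        m = self.measures[measure_name]
        d = self.dimensions[dimension_name]
        return (
            f"SELECT {d.table}.{d.column}, {m.aggregation}({self.fact_table}.{m.column}) "
            f"FROM {self.fact_table} "
            f"JOIN {d.table} ON {d.table}.{d.join_key} = {self.fact_table}.{d.join_key} "
            f"GROUP BY {d.table}.{d.column}"
        )

# Defined once by a subject matter expert (hypothetical tables and columns).
sales_model = Model(
    fact_table="sales",
    dimensions={"Country": Dimension("Country", "location", "country", "location_id")},
    measures={"Sales Amount": Measure("Sales Amount", "amount", "SUM")},
)

# Requested many times by consumers, using business names only.
sql = sales_model.to_sql("Sales Amount", "Country")
```

The consumer asks for “Sales Amount” by “Country” and never sees the join key or the aggregation rule, which is why every consumer ends up running the same query.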

Consuming from the semantic layer using Power BI


Users can connect to the AtScale semantic layer with tools like Power BI and Excel just as they connect to SQL Server
Analysis Services (SSAS). Using the built-in SSAS connector in Power BI, end users can connect live (vs. importing
data) to Snowflake using an AtScale endpoint.

Within seconds, users have access to rich, atomic data with all the business-friendly metadata defined in AtScale
Design Center – no data modeling or data engineering required.

In the example below, to analyze “sales by country,” the user simply drags in the ‘Sales Amount’ metric and the
‘Country’ attribute and they get the correct, governed results. This Power BI user only has access to sales data for the
Americas, which includes the U.S. and Canada. From there, this user can drill down into state and city-level sales,
since the location hierarchy was already defined in the semantic layer.

Since the model came along with the connection to the AtScale semantic layer, there’s no way users can get different
answers to the same questions.

Workbook: How a Semantic Layer Works for You


You might be considering a semantic layer implementation, but how do you calculate the tangible benefits to
your organization? As champions of data-driven decisions, we understand the need for hard data to support the
buying process. That’s why we created a set of calculators to help you figure out exactly how much you’d save by
implementing a semantic layer.

For example, let’s say you want to calculate the cost savings on data engineering by implementing a semantic layer.
By entering a few details into the Data Engineering ROI calculator on our website, you can get an instant look at how
much money you’d save each year from the solution. See an example below and try it out for yourself — the results
are instantaneous!

With ten full-time analytics employees earning an average of $120,000 per year and spending two hours per day
on manual data engineering, you could save $150,000 each year by implementing a semantic layer.
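
The guide does not spell out the formula behind this figure, so the sketch below is one plausible reconstruction rather than the calculator's actual logic. It assumes an eight-hour workday and that a semantic layer removes half of the manual data engineering time; both are assumptions, chosen because together they reproduce the $150,000 result. The function name data_engineering_savings is hypothetical.

```python
def data_engineering_savings(analysts, avg_salary, manual_hours_per_day,
                             workday_hours=8.0, reduction=0.5):
    """Estimate yearly savings on manual data engineering.

    workday_hours and reduction are assumptions, not values from the guide.
    """
    # Share of each analyst's paid time spent on manual data engineering.
    share = manual_hours_per_day / workday_hours
    # Yearly salary cost currently going to that manual work.
    current_cost = analysts * avg_salary * share
    # Portion of that cost assumed to be eliminated.
    return current_cost * reduction

savings = data_engineering_savings(analysts=10, avg_salary=120_000,
                                   manual_hours_per_day=2)
```

Under these assumptions, two hours a day is a quarter of each analyst's time, or $300,000 a year across the team, and halving it gives the $150,000 quoted above.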

It’s also vital to run each vendor through a checklist to determine if they have what you’re looking for. Use the
checklist below to benchmark each solution against your needs.

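| FEATURE CATEGORY | FEATURE | SCORE (1-5, 5 = BEST) | WEIGHT (1-5, 5 = BEST) | WEIGHTED SCORE (1-5, 5 = BEST) |
| --- | --- | --- | --- | --- |
| Use Cases | Supports analytical workloads | | | |
| Use Cases | Supports data science workloads | | | |
| Connectivity (northbound & southbound) | Supports legacy, on-premise data warehouses | | | |
| Connectivity (northbound & southbound) | Supports cloud data warehouses | | | |
| Connectivity (northbound & southbound) | Supports on-premise and cloud data lakes | | | |
| Connectivity (northbound & southbound) | Supports SaaS data sources (Salesforce, Workday) | | | |
| Connectivity (northbound & southbound) | Supports tools that speak SQL via JDBC or ODBC | | | |
| Connectivity (northbound & southbound) | Supports tools that speak MDX or DAX and live Excel connections | | | |
| Connectivity (northbound & southbound) | Supports custom applications via REST or Python interfaces | | | |
| Connectivity (northbound & southbound) | Supports zero client install for data consumers | | | |
| Development Environment | Supports web based development (versus client application) | | | |
| Development Environment | Supports multiple, simultaneous editors for virtual view development | | | |
| Development Environment | Supports reusable objects and model component sharing | | | |
| Development Environment | Supports development lifecycle (dev/test/prod) | | | |
| Calculations and Analytical Functions (OLAP) | Supports Time intelligence (period over period, period to date) | | | |
| Calculations and Analytical Functions (OLAP) | Supports MDX, DAX, pre and post query calculations | | | |
| Calculations and Analytical Functions (OLAP) | Supports aggregation functions (SUM, AVG, MAX, MIN) | | | |
| Calculations and Analytical Functions (OLAP) | Supports non-additive metrics (Distinct Count, First, Last) | | | |
| Calculations and Analytical Functions (OLAP) | Supports live Excel pivot tables and Excel CUBE functions | | | |
| Query Performance & Caching | Supports automated query performance management | | | |
| Query Performance & Caching | Supports dialect specific optimizations | | | |
| Security & Governance | Supports single sign on for all data consumers | | | |
| Security & Governance | Supports user impersonation and delegated authorization | | | |
| Security & Governance | Supports and respects native data platform security constructs | | | |
| Security & Governance | Supports row level security for users and groups | | | |
| Security & Governance | Supports column hiding and masking for users and groups | | | |
| TOTAL | | | | |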
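
The checklist does not say how the weighted score should be calculated. One common approach, assumed here, is a weighted average, which keeps the total on the same 1-5 scale as the individual scores. The vendor scores and weights below are made-up placeholders, and the weighted_average function is hypothetical.

```python
def weighted_average(rows):
    """rows: list of (feature, score, weight), with score and weight in 1-5."""
    total_weight = sum(weight for _feature, _score, weight in rows)
    weighted_sum = sum(score * weight for _feature, score, weight in rows)
    return weighted_sum / total_weight

# Placeholder scores for a hypothetical vendor; fill in your own.
vendor_a = [
    ("Supports analytical workloads", 5, 5),
    ("Supports data science workloads", 3, 4),
    ("Supports row level security for users and groups", 4, 1),
]

vendor_a_total = weighted_average(vendor_a)
```

Running the same function over each vendor's rows gives totals that can be compared directly, with the features you weighted most heavily counting the most.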

Assessment: Where do you fall on the maturity scale?


Knowing how mature your organization is in terms of data and analytics gives you an overview of your strengths and
growth areas. The AtScale Data & Analytics Maturity Scale aids such an exercise, and we’ve presented the four levels
below to help you determine where you fall.

Level 0: Initial
Organizations in the initial level of data and analytics maturity will tend to have an “every person for themselves”
strategy toward data and analytics. Typically, each business unit chooses their own technical stack and data
integration is performed on an ad hoc basis inside the business intelligence (BI) tools using the tools’ native import
or extract functionality. Data will tend to be siloed in customized data marts or data may be accessed at the file
level with little automation for refreshing analysis on a timely basis. For organizations at the initial level of maturity,
analysis is typically left to those who have a fairly advanced skill level for writing SQL and wrangling data.

If you can answer “yes” to many of the following questions, your organization may fall into an Initial maturity level:

Level 1: Procedural
Organizations at the procedural level of data and analytics maturity will tend to have a centralized BI or data team
that’s responsible for curating and loading a corporate data warehouse. This centralized data team will tend to
be staffed with data engineers who use a variety of commercial and home grown ETL/ELT tools for transforming
raw data into database tables. At this level, it’s likely that the central data team dictates the toolsets for analyzing
data, including BI and AI tools. Business users and data scientists will often be dependent on this central data
team for getting access to new datasets and be subject to a development or roadmap queue. Business users are
usually responsible for authoring their own reports using a star schema defined by the central data team in the data
warehouse.

If you can answer “yes” to many of the following questions, your organization may fall into a Procedural maturity
level:

Level 2: Proactive
Organizations at the proactive level of data and analytics maturity have typically advanced beyond just providing
carefully curated data access and have introduced more atomic level, user-driven data access. By augmenting their
ETL data pipelines with data virtualization tools, organizations at this level can be more agile in responding to the
business data needs by providing more self-service data access.

The more advanced organizations at this level may augment their first party data with third party data to provide
richer datasets for deeper insights. In addition to providing self-service access for business users, organizations at
the proactive level may support AutoML tool access so that citizen data scientists can create predictive models for
improving business forecasts and customer experience.

If you can answer “yes” to many of the following questions, your organization may fall into a Proactive maturity level:

Level 3: Leading
Organizations at the leading level of data and analytics maturity typically have introduced a business-oriented
semantic layer to simplify self-service data access to most, if not all, enterprise data. At this level, data analysis is
no longer suitable just for advanced data analysts and data scientists. Instead, anyone in the organization can use
data and analytics to make data-driven decisions using the tools and interfaces of their choice. To drive analytics
consistency, promote data governance and simplify data access, organizations at the leading level leverage a
semantic layer in their data technology stack. With a semantic layer, analytics are not just limited to BI and AI tools
but are also embedded in applications and shared both inside and outside the company with strategic business
partners.

If you can answer “yes” to many of the following questions, your organization may fall into a Leading maturity level:

Performance Benchmarks: How AtScale Performs with Popular Cloud Data Lakes and Data Warehouses

Data model advancements for data lakes and cloud data warehouses (such as nested data types) are a game changer.
However, existing BI and AI toolsets are not geared to take advantage of these innovations. They expect
to see data in a traditional star schema, in fixed rows and columns. As a result, most people bring their star schemas
with them into the cloud and are disappointed with the resulting performance and agility.

Enter the AtScale semantic layer. AtScale’s accelerated query structures will readily accept your existing star
schemas and optimize them automatically for the data denormalization and full table scans these data platforms
prefer. If you want to take advantage of these new nested data types (and you should), AtScale has you covered there
as well. We built AtScale with these new data model innovations in mind so our modeling tools and query optimizer
take advantage of these new data warehouse capabilities.

Whether you’re old school or new school, rest assured that leveraging AtScale’s semantic layer will give you the cloud
boost you hoped for without the disruption of redesigning your data models or throwing out your existing BI and AI
tools.

To demonstrate this, we ran 20 queries both with and without AtScale, using the standard TPC-DS benchmark v2.11.0
from the Transaction Processing Performance Council (TPC) for our tests. AtScale’s Acceleration Structures showed major benefits
in accelerating query performance, improving user concurrency and reducing compute costs. The illustrations below
show summary results, and full results for each platform are available in these benchmarking reports.

[Figures: benchmark summary charts for Amazon Redshift, Azure Synapse Analytics SQL, Google BigQuery, and Snowflake]

The Best Business Case for a Semantic Layer


Data and analytics play crucial roles in helping you make more confident and accurate decisions. The correct
infrastructure deployment empowers your teams to trust the data they have, apply it to their use cases in logically
consistent ways, and maintain proper data security and governance. A semantic layer also future-proofs your data
against new data storage and consumer technologies while improving data query speed and performance.

AtScale helps companies speed up and simplify their analytics through a universal semantic layer that simplifies data
access and use. Learn more about how your organization can benefit from this at atscale.com.

TO LEARN MORE ABOUT HOW A SEMANTIC LAYER CAN HELP MAKE EVERYONE A DATA ANALYST, SCHEDULE A DEMO
