Data Warehousing Reduced


Data Stored Well is Data Used Well
A Prescience Decision Solutions Whitepaper

Harnessing advanced
data warehousing capabilities
to deliver superior business outcomes

Over the years, data has evolved into an indispensable resource, offering valuable insights
that help in making critical decisions. Its importance is reflected in the fact that it is now
often compared to oil, soil, water and oxygen. Equally important is the need to store and
organize data in a manner that aids seamless, accurate and efficient decision making.

Data warehousing helps in consolidating, storing and organizing data. While data warehousing
has undergone significant changes in its architecture and methodology, there is an urgent need
to establish a paradigm to support next-gen technologies such as big data and cloud computing
along with modern analytics and reporting requirements.

Data warehouses, when deployed effectively, are valuable in organizing data and eliminating
redundancies. A well-designed warehouse can provide information in a timely manner to drive
effective decision making. With a significant surge in analytical applications and prescriptive
analytics, the importance of organized and clean data has definitely increased.

A modern data warehouse architecture must have the following functions to support the
evolving needs of an enterprise:
• Manage and integrate both structured and unstructured data types
• Integrate support for advanced analytics processing to support new, advanced analytics
use cases
• Support near-real-time or real-time access and analysis at a scale (and cost) that was not
previously practical
• Move data to and from cloud services as easily as with on-premises data sources and services
• Transparently integrate multiple platforms in a unified data warehouse architecture
• Support ad-hoc data / reporting requests


KEY INGREDIENTS FOR A POWERFUL DATA WAREHOUSE


A modern data warehouse must have the capabilities to store all kinds of data: structured,
unstructured, semi-structured or streaming. In addition, a data warehouse must also
perform functions such as ingestion, storage, processing and reporting under one umbrella.
Here are some of the key aspects that a modern data warehouse must have in today’s
data-driven business landscape:

DESIGN
Data warehouses should be designed to allow data structures to be
accessed, connected, processed, and stored easily. Modelling standards,
master data management, usage of concepts such as slowly changing
dimensions and change data capture are important to keep the data
warehouse relevant.
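
As an illustration of the change data capture concept mentioned above, the following is a minimal sketch that derives change events by comparing two keyed snapshots of a source table. The function name `capture_changes` and the sample customer rows are hypothetical. Production CDC tools commonly read the database transaction log rather than comparing snapshots, so this should be read as a simplified variant, not a reference implementation.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Change = Tuple[str, int, Row]


def capture_changes(previous: Dict[int, Row], current: Dict[int, Row]) -> List[Change]:
    """Compare two snapshots keyed by primary key and list what changed."""
    changes: List[Change] = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))   # new key in the source
        elif row != previous[key]:
            changes.append(("update", key, row))   # same key, different values
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))   # key disappeared from the source
    return changes


# Hypothetical snapshots of a customer table on two consecutive days.
yesterday = {
    1: {"name": "Asha", "city": "Pune"},
    2: {"name": "Ravi", "city": "Delhi"},
}
today = {
    1: {"name": "Asha", "city": "Mumbai"},
    3: {"name": "Meera", "city": "Chennai"},
}

changes = capture_changes(yesterday, today)
for operation, key, row in changes:
    print(operation, key, row)
```

Only the three changed rows would need to be sent to the warehouse, which is the resource saving that CDC aims for.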

INFRASTRUCTURE
To deploy a data warehouse on cloud or on-premises is a decision that should be taken
based on the specific business needs. Cloud offers flexible deployment options, keeps initial
investment to minimum and makes available toolsets at a fast pace that help data analytics
functions run better.

ARCHITECTURE
A modern data warehouse architecture includes various components like data lake/store,
database/data warehouse, analytical engine and reporting.

IMPLEMENTATION METHODOLOGY
A robust methodology must include models,
policies, rules or standards that govern which
data is collected and how it is stored,
arranged, integrated and used in data
systems and organizations.

DATA WAREHOUSING PITFALLS


While data warehouse projects are among the most visible and expensive initiatives an
organization can undertake, they are also among the most likely to fail. According to Gartner,
more than 50 percent of data warehouses fail to make it to user acceptance. With data
becoming a critical element for an enterprise’s business operations today, it is imperative that
data warehousing projects are executed and implemented successfully. Some of the reasons
why data warehousing projects fail are:
• Not answering the big question – Why does an organization need a data warehouse?
• Using the Big Bang approach – attempting to deliver everything at once instead of delivering
usable business functionality and building the data warehouse incrementally.
• Shortening testing and involving business at a much later stage for validation
• Neglecting maintenance – With enterprise data and analytics requirements changing
constantly, a data warehouse project has no end date.


DATA WAREHOUSING BEST PRACTICES


Every data warehousing engagement should identify and implement certain best practices for
optimal technological and business returns on investment.

DESIGN

Define standards upfront
Put in place technical as well as methodology-related standards to ensure there is no
confusion during the implementation stage.

Integration layer
A source-agnostic integration layer can be used to pull together information from multiple
sources, allowing better business reporting and alignment with the business model.

Choose ELT and data lake instead of ETL
It is imperative to consider a data lake architecture along with ELT logic to ensure optimized
data storage and retrieval capabilities, to manage data of multiple types easily.

Enable ad-hoc querying and self-service BI
A self-service BI with an analytical model to back it up makes it easy for the analytical layer
to generate reports without affecting the sanctity of the underlying data model.
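
To illustrate the ELT recommendation above, here is a minimal sketch in which raw records are loaded untouched into a landing table and only then typed and aggregated inside the database engine. It uses Python's built-in `sqlite3` module as a stand-in for a warehouse engine. The table names `raw_orders` and `sales_by_channel` and the sample rows are assumptions made for the example.

```python
import sqlite3

# Raw extracts arrive as untyped text, exactly as the source produced them.
raw_rows = [
    ("1", "120.50", "web"),
    ("2", "80.00", "mobile"),
    ("3", "42.25", "web"),
]

conn = sqlite3.connect(":memory:")

# Load: land the data as-is, with no cleansing or typing on the way in.
conn.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT, channel TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# Transform: cast and aggregate inside the engine, after loading.
conn.execute(
    """
    CREATE TABLE sales_by_channel AS
    SELECT channel,
           SUM(CAST(amount AS REAL)) AS total_amount,
           COUNT(*) AS order_count
    FROM raw_orders
    GROUP BY channel
    """
)

result = conn.execute(
    "SELECT channel, total_amount, order_count FROM sales_by_channel ORDER BY channel"
).fetchall()
print(result)
```

Because the raw table is preserved, a new transformation can be added later without extracting the data from the source again, which is the main practical difference from a transform-before-load (ETL) flow.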

ARCHITECTURE

DATA MODEL
Getting a common understanding of what information is important to the business is vital to
the success of the data warehouse. Adopting recognized data warehouse architecture
standards can go a long way, such as:

Star, Snowflake or Constellation Schema: Adopting the right approach based on facts,
dimensions and reporting is a key factor in meeting business requirements.

Slowly Changing Dimensions (SCD): Tracking changes to keep a historical reporting
perspective and adopting the right SCD type (Type 0, 1, 2, 3 or 4) based on the data is
essential.

Normalization: Adoption of the right database design technique and the appropriate normal
form is what differentiates a practical and usable warehouse from a defunct one.

Change Data Capture (CDC): CDC minimizes the resources required for ETL processes and
ensures data synchronicity in an optimal manner.

MASTER DATA MANAGEMENT (MDM)
By providing a single point of reference for critical information, MDM eliminates costly
redundancies that occur when organizations rely upon multiple, conflicting sources of
information.
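
To make the Slowly Changing Dimensions point above more concrete, the following is a minimal sketch of a Type 2 dimension, where a change closes the current row and adds a new one so that history stays queryable. The `CustomerVersion` class, the `valid_from` and `valid_to` column names, and the sample customer are hypothetical. A Type 1 dimension would simply overwrite the city instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CustomerVersion:
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


def apply_scd2(history: List[CustomerVersion], customer_id: int,
               new_city: str, change_date: date) -> None:
    """Close the current version (if the value changed) and append a new one."""
    current = next(
        (v for v in history if v.customer_id == customer_id and v.valid_to is None),
        None,
    )
    if current is not None:
        if current.city == new_city:
            return  # nothing changed, keep the existing version open
        current.valid_to = change_date
    history.append(CustomerVersion(customer_id, new_city, change_date))


def city_as_of(history: List[CustomerVersion], customer_id: int,
               as_of: date) -> Optional[str]:
    """Return the city that was valid for the customer on the given date."""
    for v in history:
        if (v.customer_id == customer_id and v.valid_from <= as_of
                and (v.valid_to is None or as_of < v.valid_to)):
            return v.city
    return None


history: List[CustomerVersion] = []
apply_scd2(history, 7, "Pune", date(2020, 1, 1))
apply_scd2(history, 7, "Mumbai", date(2021, 6, 1))

print(city_as_of(history, 7, date(2020, 12, 31)))
print(city_as_of(history, 7, date(2021, 6, 1)))
```

A report dated before the move still sees the old city, which is the historical reporting perspective that Type 2 is chosen for.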


METHODOLOGY

Data Lineage
Identifying the critical data elements (CDEs), the authoritative data (AD) sources and how
data traverses across systems (including capture of transformations) allows everyone to
determine corrective actions in case of a data mismatch.

Data Governance Council
The council is responsible for the integrity and quality of data before it is ingested into the
data warehouse. The members of the council should be the data team, data owners and data
specialists from relevant parts of the organization who own the data.
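
For the data lineage practice described above, here is a minimal sketch of how lineage could be recorded and walked back from a report field to its authoritative source. The `LineageStep` structure and all element names are hypothetical, and the sketch assumes each element has a single upstream source.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineageStep:
    target: str          # element that was produced
    source: str          # element it was derived from
    transformation: str  # what happened on the way


def trace_to_source(element: str, steps: List[LineageStep]) -> List[LineageStep]:
    """Walk upstream from an element until no further source is recorded."""
    by_target = {step.target: step for step in steps}
    path: List[LineageStep] = []
    seen = set()
    while element in by_target and element not in seen:
        seen.add(element)            # guards against accidental cycles
        step = by_target[element]
        path.append(step)
        element = step.source
    return path


steps = [
    LineageStep("report.net_revenue", "warehouse.fact_sales.net_amount", "SUM by month"),
    LineageStep("warehouse.fact_sales.net_amount", "lake.orders.amount", "amount minus refunds"),
    LineageStep("lake.orders.amount", "crm.orders.amount", "copied as-is"),
]

path = trace_to_source("report.net_revenue", steps)
for step in path:
    print(f"{step.target} <- {step.source} ({step.transformation})")
```

When a figure in a report does not match the source system, a trace like this shows which transformation to inspect first.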

Data Management Program


The entire methodology needs to be integrated into the Data Management Program by
breaking it down into three steps:

Analyze: Before creating governance policies, it is necessary for the project owner to define
what data quality means for the organization, in addition to profiling and quantifying the
current data landscape.
Improve: Construct the framework (for example, data governance) and run the utilities to
continuously cleanse and enrich the data.
Control: Continuous monitoring and reporting of ambiguities helps in maintaining the quality
of data. With greater access to high-quality data, one can finally start to monetize this
information by increasing productivity, reducing waste and driving additional revenue.
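
The Analyze and Control steps above can be sketched as a small profiling routine that quantifies missing values and duplicate keys, followed by a threshold check. The function names, the 25 percent threshold and the sample rows are assumptions for illustration only. What counts as acceptable quality is for each organization to define.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def profile(rows: List[Row], key: str) -> Dict[str, object]:
    """Quantify missing values per column and duplicated key values."""
    total = len(rows)
    if total == 0:
        return {"row_count": 0, "null_rate": {}, "duplicate_keys": 0}
    columns = sorted({column for row in rows for column in row})
    null_rate = {
        column: sum(1 for row in rows if row.get(column) in (None, "")) / total
        for column in columns
    }
    keys = [row.get(key) for row in rows]
    return {
        "row_count": total,
        "null_rate": null_rate,
        "duplicate_keys": len(keys) - len(set(keys)),
    }


def columns_over_threshold(report: Dict[str, object], max_null_rate: float) -> List[str]:
    """Control step: list columns whose missing-value rate exceeds the limit."""
    return [c for c, rate in report["null_rate"].items() if rate > max_null_rate]


rows = [
    {"customer_id": "1", "email": "a@example.com"},
    {"customer_id": "2", "email": ""},
    {"customer_id": "2", "email": "b@example.com"},
    {"customer_id": "3", "email": None},
]

report = profile(rows, key="customer_id")
print(report)
print(columns_over_threshold(report, max_null_rate=0.25))
```

Running the same profile on every load and tracking the numbers over time is one simple way to turn the Control step into continuous monitoring.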

IMPLEMENTATION METHODOLOGY - AGILE

One of the major reasons why data warehouse projects often fail is because of the traditional
waterfall approach. On the other hand, the nature of data warehouse projects at times makes
it difficult to adopt truly agile practices. Hence a modification of the approach is required as
highlighted in the diagram below:

Diagram: modified agile approach
• Building context: understanding business, technology and data
• Architecture design, infrastructure build and POC
• Warehouse design, warehouse development and testing, and BI reporting and analytics,
shown for each business function: Sales, Marketing, Finance, Planning and Operations
(extendable to all other functions)

With proper planning and alignment to a single integration layer, a data warehouse project can
be broken down into smaller, faster deliverables to ensure quicker, high-value returns. This
also gives the teams the flexibility to adapt as business needs change.


REFERENCE ARCHITECTURE OF A MODERN DATA WAREHOUSE

A robust, modern data warehouse must have the capabilities to easily consolidate all the data
at any scale and provide deep, comprehensive insights through analytical dashboards,
operational reports or advanced analytics for all users.

1. SOURCE
The data can be sourced as structured, semi-structured or unstructured data.

2. EXTRACT
Automated data pipelines transport the data from one system to another, not necessarily
transforming it. The data can also be processed in real time (streaming) instead of in batches.

3. STORE
A data lake can hold vast amounts of raw data in its native format until it is needed. Cloud
object stores (AWS S3, Azure Blob, Google Cloud Storage, and more) offer high-availability
access for big data computing at an extremely low cost.

4. TRANSFORM
In this phase, the data is prepared, cleaned and transformed as needed for loading into the
data warehouse. An important point to keep in mind while building a data warehouse is to
integrate data from multiple heterogeneous sources that support analytical reporting,
structured and/or ad hoc queries and decision making.

5. DATA WAREHOUSING
In-memory models and a semantic layer help in addressing detailed and specific reporting
needs.

6. ANALYTICS
By building, training and deploying artificial intelligence and machine learning models, users
can derive comprehensive insights to take critical business decisions.

7. REPORTING
BI reports and dashboards can be created to show data insights to business users, while
self-service BI can be used to serve requests on the fly.

BUSINESS BENEFITS

With a robust data quality process and data governance framework in place, data management
and quality will improve over time. Some of the multiple benefits organizations can reap by
putting in place robust data governance capabilities are:

• Better insights from data analytics
• Accountability for data
• Reduction in rework and costs
• Ability to track lineage and hence better business and IT agility
• Better compliance and reduced costs for compliance reporting

CASE STUDY
DERIVE DEEP INSIGHTS THROUGH EVOLVED DATA WAREHOUSING

Business challenge
The client is India’s only payments company with multichannel transaction processing
capabilities—web, mobile, in-store or at the time of delivery. The company sought to re-design
and create an enterprise data warehouse to improve the overall system, ensure better BI
experience, drive scalability, and more importantly, leverage cloud capabilities.

Solution offered
Since the aim was to drive a high level of efficiency in the data management and warehousing
practices, Prescience made a detailed study of the client’s existing transaction data, operational
data stores, reporting infrastructure and analytics and reporting needs. Keeping the usage
considerations and scope of growth in mind, the team designed the architecture with the
following considerations:
• Support for structured/semi-structured/unstructured data
• Customer-centric process
• Support for analytics and machine learning – easy availability of historical data
• Varied high-performance dashboards and reports

Business outcomes
Armed with meaningful and actionable insights through data-driven business models, the client
is now on the path to achieving significant cost reductions and RoI improvements. In addition to
this, the decision makers now have:
• Access to good quality and consistent data
• Holistic view of the organization’s financial health, productivity and growth
• Ability to derive deep insights in the areas of customer demand trends, churn behavior and
relation of churn to customer support

About Prescience Decision Solutions


Prescience is a fast-growing advanced analytics company that helps enterprises become more PRESCIENT by providing meaningful
business insights and recommendations generated through careful analyses of data. We leverage our tools and frameworks, deep
expertise in machine learning, advanced data analytics techniques and domain knowledge along with the business knowledge of
our customers to create tangible data-driven solutions.

Visit us at www.prescienceds.com or send us an email at [email protected] to know more about us.


You can also follow us on LinkedIn.