DWM Chap2
1. Data marts:
A data mart is a condensed version of a data warehouse, designed for
use by a specific department, unit or set of users in an organization,
e.g., Marketing, Sales, HR or Finance. It is often controlled by a single
department in the organization.
A data mart is focused on a single functional area of an organization and
contains a subset of the data stored in a data warehouse.
Data marts are small in size and are more flexible than a data warehouse.
A data mart improves user response time because it holds a smaller volume of data.
It provides easy access to frequently requested data.
Data marts are simpler to implement than a corporate data warehouse, and
the cost of implementing a data mart is lower than that of implementing a
full data warehouse.
Compared with a data warehouse, a data mart is agile. If the model changes,
a data mart can be rebuilt more quickly because of its smaller size.
A data mart is defined by a single subject matter expert. Hence, a data mart is more
open to change than a data warehouse.
Data is partitioned, which allows very granular access control privileges.
Data can be segmented and stored on different hardware/software platforms.
1. Dependent: Dependent data marts are created by drawing data from an existing
central data warehouse.
2. Independent: Independent data marts are created without the use of a central data
warehouse, drawing data directly from operational sources, external sources or both.
3. Hybrid: This type of data mart can take data from data warehouses or operational
systems.
Dependent Data Mart
A dependent data mart sources an organization's data from a single data warehouse. It
offers the benefit of centralization. If you need to develop one or more physical data marts,
you should configure them as dependent data marts.
Dependent data marts can be built in two different ways: either the user can access
both the data mart and the data warehouse, depending on need, or access is limited
to the data mart only. The second approach is not optimal, as it produces what is sometimes
referred to as a data junkyard. In a data junkyard, all data begins with a common source, but
it is scrapped and mostly junked.
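
As a rough illustration of a dependent data mart, the sketch below builds a small mart table from a warehouse table. It uses Python's built-in sqlite3 module with an in-memory database; the table names `warehouse_sales` and `mart_marketing`, the columns and the sample rows are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical warehouse: one fact table covering every department.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (dept TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("Marketing", "North", 120.0),
        ("Marketing", "South", 80.0),
        ("HR", "North", 15.0),
        ("Finance", "South", 300.0),
    ],
)

# The dependent mart is sourced only from the warehouse and keeps
# just the rows that the Marketing department needs.
conn.execute(
    "CREATE TABLE mart_marketing AS "
    "SELECT region, amount FROM warehouse_sales WHERE dept = 'Marketing'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM mart_marketing ORDER BY region"
).fetchall()
print(mart_rows)
```

The mart holds two of the four warehouse rows, which is the "subset of a single warehouse" idea in miniature.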
Independent Data Mart
An independent data mart is created without the use of a central data warehouse. This kind of
data mart is an ideal option for smaller groups within an organization.
An independent data mart has no relationship with the enterprise data warehouse or
with any other data mart. In an independent data mart, the data is input separately, and its
analyses are also performed autonomously.
Hybrid Data Mart
A hybrid data mart combines input from a data warehouse with input from other sources. This
could be helpful when you want ad-hoc integration, for example after a new group or product is
added to the organization.
It is best suited for multiple database environments and for organizations that need a fast
implementation turnaround. It also requires the least data cleansing effort. A hybrid data mart
also supports large storage structures, and its flexibility makes it well suited to smaller
data-centric applications.
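
A minimal sketch of the hybrid idea, assuming a warehouse table plus an operational feed for a newly added product. The names `warehouse_sales`, `mart_sales` and `operational_feed`, and all of the data, are hypothetical; the `source` column is an illustrative choice for keeping track of where each row came from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Source 1 (assumed): rows already held in the data warehouse.
conn.execute("CREATE TABLE warehouse_sales (product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?)",
    [("Widget", 100.0), ("Gadget", 250.0)],
)

# Source 2 (assumed): an operational feed for a newly added product
# that has not reached the warehouse yet.
operational_feed = [("Gizmo", 40.0), ("Gizmo", 60.0)]

# The hybrid mart records where each row came from.
conn.execute(
    "CREATE TABLE mart_sales (product TEXT, amount REAL, source TEXT)"
)
conn.execute(
    "INSERT INTO mart_sales "
    "SELECT product, amount, 'warehouse' FROM warehouse_sales"
)
conn.executemany(
    "INSERT INTO mart_sales VALUES (?, ?, 'operational')",
    operational_feed,
)

totals = conn.execute(
    "SELECT source, SUM(amount) FROM mart_sales GROUP BY source ORDER BY source"
).fetchall()
print(totals)
```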
4. Steps in Implementing a Datamart
Implementing a Data Mart is a rewarding but complex procedure. Here are the detailed steps
to implement a Data Mart:
Designing
Gathering the business and technical requirements and identifying data sources.
Selecting the appropriate subset of data.
Designing the logical and physical structure of the data mart.
The selected data can be partitioned by:
Date
Business or Functional Unit
Geography
Any combination of the above
For the design itself, simple pen and paper would suffice, though tools that help you create
UML or ER diagrams would also add metadata to your logical and physical designs.
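
The criteria listed above (date, business or functional unit, geography, or a combination) can be tried out as grouping keys before committing to a physical design. The sketch below does this in plain Python; the record layout and the `partition` helper are assumptions for illustration, not part of any particular tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (sale date, business unit, geography, amount).
records = [
    (date(2023, 1, 15), "Sales", "EU", 100.0),
    (date(2023, 7, 2), "Sales", "US", 200.0),
    (date(2024, 3, 9), "HR", "EU", 50.0),
    (date(2024, 5, 21), "Sales", "EU", 75.0),
]


def partition(rows, key):
    """Group rows into partitions using the supplied key function."""
    parts = defaultdict(list)
    for row in rows:
        parts[key(row)].append(row)
    return dict(parts)


by_year = partition(records, lambda r: r[0].year)
by_unit = partition(records, lambda r: r[1])
by_geography = partition(records, lambda r: r[2])
# "Any combination of the above": business unit together with geography.
by_unit_and_geo = partition(records, lambda r: (r[1], r[2]))

print(sorted(by_year))
print({k: len(v) for k, v in by_unit_and_geo.items()})
```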
Constructing
This is the second phase of implementation. It involves creating the physical database and
the logical structures.
This means implementing the physical database designed in the earlier phase. For instance,
database schema objects such as tables, indexes and views are created.
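
As a small sketch of this phase, the following creates one table, one index and one view in an in-memory SQLite database, then lists the resulting schema objects. The object names (`sales`, `idx_sales_region`, `sales_by_region`) and the column layout are assumptions chosen for the example; a real data mart would follow the design produced in the previous phase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Table holding the mart's facts.
    CREATE TABLE sales (
        sale_id   INTEGER PRIMARY KEY,
        sale_date TEXT NOT NULL,
        region    TEXT NOT NULL,
        amount    REAL NOT NULL
    );
    -- Index to speed up filters on region.
    CREATE INDEX idx_sales_region ON sales (region);
    -- View exposing a pre-aggregated summary to end users.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total_amount
        FROM sales
        GROUP BY region;
    """
)

# List the schema objects that now exist in the database.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()
print(objects)
```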
You need a relational database management system (RDBMS) to construct a data mart. An RDBMS
has several features that are required for the success of a data mart.
Storage management: An RDBMS stores and manages the data and lets you create, add and
delete data.
Fast data access: With a SQL query you can easily access data based on certain
conditions/filters.
Data protection: The RDBMS also offers a way to recover from system failures
such as power failures. It also allows data to be restored from backups in case
the disk fails.
Multiuser support: The data management system offers concurrent access.
Security: The RDBMS also provides a way to regulate access by users to
objects and certain types of operations.
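
One mechanism behind the data protection feature is the transaction: a group of changes is applied completely or not at all. The sketch below does not simulate a power failure; it only shows that all-or-nothing behaviour by forcing an error part-way through a transaction in SQLite. The `sales` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
)
conn.execute("INSERT INTO sales VALUES (1, 100.0)")
conn.commit()

try:
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception escapes the block.
    with conn:
        conn.execute("INSERT INTO sales VALUES (2, 50.0)")
        # Duplicate primary key: this statement fails.
        conn.execute("INSERT INTO sales VALUES (1, 999.0)")
except sqlite3.IntegrityError:
    failed = True
else:
    failed = False

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(failed, row_count)
```

The insert of row 2 is undone along with the failed statement, so only the originally committed row remains.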
Populating
You accomplish the population tasks using an ETL (Extract, Transform, Load) tool. This tool
allows you to look at the data sources, perform source-to-target mapping, extract the data,
transform and cleanse it, and load it into the data mart.
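
The extract, transform and load stages can be sketched by hand in a few lines, which may help when reading what an ETL tool does. This is not a substitute for such a tool; the raw rows, the `transform` function, its cleansing rules and the `mart_sales` table are all assumptions for illustration.

```python
import sqlite3

# Extract: hypothetical raw rows as they might arrive from a source system.
raw_rows = [
    {"region": " north ", "amount": "120.50"},
    {"region": "SOUTH", "amount": "80"},
    {"region": "east", "amount": "n/a"},   # not a number, rejected
    {"region": "", "amount": "10"},        # missing region, rejected
]


def transform(row):
    """Cleanse one row; return (region, amount) or None if it is invalid."""
    region = row["region"].strip().title()
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    if not region:
        return None
    return region, amount


clean_rows = [t for t in map(transform, raw_rows) if t is not None]

# Load: write the cleansed rows into the mart table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mart_sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO mart_sales VALUES (?, ?)", clean_rows)
conn.commit()

loaded = conn.execute(
    "SELECT region, amount FROM mart_sales ORDER BY region"
).fetchall()
print(loaded)
```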
Accessing
Accessing is the fourth step, which involves putting the data to use: querying the data,
creating reports and charts, and publishing them. End users submit queries to the database
and display the results of the queries.
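
As an example of this step, the sketch below runs an aggregate query against a small mart table and formats the result as a plain-text report. The `mart_sales` table, its rows and the column widths are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mart_sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO mart_sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 30.0)],
)

# End-user query: total sales per region, largest first.
report = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM mart_sales GROUP BY region ORDER BY total DESC"
).fetchall()

# Render the result as a small plain-text report.
lines = [f"{region:<10}{total:>10.2f}" for region, total in report]
print("\n".join(lines))
```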
Managing
This is the last step of the data mart implementation process. It covers the ongoing
management tasks for the data mart.
You could use the GUI or command line for data mart management.
The following are best practices to follow during the data mart
implementation process:
Advantages
Disadvantages
Enterprises often create too many disparate and unrelated data marts
without much benefit. These can become a big hurdle to maintain.
Data marts cannot provide company-wide data analysis, as their data sets are limited.