DWM Unit-1 Notes
2] Middle Tier:
▪ The middle tier of a data warehouse is an OLAP server, which is
implemented using either the ROLAP or the MOLAP model.
▪ This application tier presents an abstracted view of the database. This layer
also acts as a mediator between the end-user and the database.
▪ It includes summary data, raw data and metadata.
3] Top Tier:
▪ The top tier is the front-end client layer. It consists of the tools and APIs
that users use to get useful data out of the data warehouse.
▪ The different tools are Query tools, reporting tools, managed query tools,
Analysis tools and Data mining tools.
Q.6] Advantages and Disadvantages of Data Warehouse:
Advantages of Data Warehouse:
1. Better decision-making with consolidated data.
2. Faster queries due to optimized storage.
3. Consistent data through ETL processes.
4. Time-saving with centralized access.
5. Helps with trend forecasting.
Disadvantages of Data Warehouse:
1. High setup and maintenance costs.
2. Complex data integration from multiple sources.
3. Data latency due to periodic updates.
4. Scalability issues with growing data volumes.
5. Requires specialized knowledge and skills.
Q.7] Explain Metadata repository.
Metadata Repository:
(Metadata: data about data, repository: big container)
▪ The metadata repository is responsible for physically storing and
categorizing metadata. The data in the metadata repository should be
generic, integrated, current and historical.
▪ Metadata is information about the structures that contain the actual
data. It may describe the structure of any data, of any subject, stored in
any format.
▪ The metadata repository holds the structures of all data in one place, which
provides more than enough information for decision making.
▪ The metadata repository is used for building, maintaining and managing the
data warehouse.
Concept example: a line in a sales database may contain: 4030 KJ732 299.90
This data is meaningless until we consult the metadata that tells us what each
value is. The metadata for this line is:
• Model number: 4030
• Sales Agent ID: KJ732
• Total sales amount: $299.90
▪ Therefore, Metadata are essential ingredients in the transformation of data
into knowledge.
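The concept example above can be sketched in code. This is a minimal illustration, not a real repository: the field names, types and the `describe` helper are assumptions made up for this sketch.

```python
# Hypothetical metadata for one line of the sales file:
# each entry gives a field name and the type used to parse the value.
SALES_METADATA = [
    ("model_number", int),
    ("sales_agent_id", str),
    ("total_sales_amount", float),
]


def describe(raw_line, metadata):
    """Pair each whitespace-separated value with its metadata entry."""
    values = raw_line.split()
    if len(values) != len(metadata):
        raise ValueError("record does not match the metadata")
    return {name: cast(value) for (name, cast), value in zip(metadata, values)}


# The raw line is meaningless on its own; the metadata gives it meaning.
record = describe("4030 KJ732 299.90", SALES_METADATA)
print(record)
```

Without `SALES_METADATA`, the program has no way to know that 4030 is a model number rather than, say, a quantity.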
Example: Metadata of a Book Store:
1. Name of book
2. Summary of book
3. Publication of book
4. Edition of book
5. Author of book
6. Date of publication
7. Availability of book
8. Reviews of book
The above information (metadata) helps to search for the book, access the book, etc.
Advantages of Metadata Repository:
1. Centralizes and simplifies metadata management.
2. Ensures data consistency.
3. Supports better decision-making.
4. Enhances data governance.
5. Tracks data lineage for quality.
6. Eases data integration.
Disadvantages of Metadata Repository:
1. Complex to set up and maintain.
2. High initial costs for implementation.
3. Can become overloaded with data.
4. Security risks if unprotected.
5. Requires ongoing updates.
6. Needs skilled personnel.
Q.8] Describe Extraction, Transformation and Loading.
ETL:
ETL stands for Extract, Transform and Load. It is a data integration
process that cleans, combines and organizes data from multiple sources into
one consistent store, such as a data warehouse, data lake or other similar
system.
The second step of the ETL process is transformation. In this step, a set of
rules or functions is applied to the extracted data to convert it into a single
standard format.
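A minimal sketch of the transformation step, assuming two hypothetical sources that store the same kind of sales record in different formats. The source rows, field names and the in-memory `warehouse` list are illustrative assumptions, not part of any real ETL tool.

```python
from datetime import datetime

# Hypothetical extracted rows: two sources with different conventions.
source_a = [{"agent": "kj732", "amount": "299.90", "date": "2024-01-15"}]
source_b = [{"agent": " AB101 ", "amount": "1,250.00", "date": "15/01/2024"}]


def transform(row, date_format):
    """Apply transformation rules to convert one row to the standard format."""
    return {
        # Rule 1: agent IDs are trimmed and upper-cased.
        "agent_id": row["agent"].strip().upper(),
        # Rule 2: amounts become floats without thousands separators.
        "amount": float(row["amount"].replace(",", "")),
        # Rule 3: dates are stored as ISO 8601 strings (YYYY-MM-DD).
        "sale_date": datetime.strptime(row["date"], date_format).date().isoformat(),
    }


def run_etl():
    warehouse = []  # stands in for the load target
    for row in source_a:  # extract from source A
        warehouse.append(transform(row, "%Y-%m-%d"))
    for row in source_b:  # extract from source B
        warehouse.append(transform(row, "%d/%m/%Y"))
    return warehouse


warehouse = run_etl()
for row in warehouse:
    print(row)
```

After the transformation, rows from both sources share the same field names, types and date format, so they can be loaded and queried together.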
Q.10] List and explain data warehouse models with suitable examples
Data warehouse models:
1) Enterprise Data Warehouse
2) Data mart
3) Virtual Warehouse
1] Enterprise Data Warehouse:
▪ Enterprise Data Warehouse is a centralized warehouse, which aggregates
the information or data automatically.
▪ It offers a unified approach for organizing and representing data. It also
provides the ability to classify data according to subject and to give users
access accordingly.
▪ It provides decision support service across the enterprise.
Example: All Polytechnic data available at MSBTE
2] Data Marts:
▪ A data mart is a subset of the data warehouse.
▪ It is specially designed for a particular line of business, such as sales
or finance. In an independent data mart, data can be collected
directly from sources.
▪ Due to the large amount of data, a single warehouse can become overburdened.
So, to prevent the warehouse from becoming impossible to navigate,
subdivisions are created, called Data Marts.
▪ These data marts divide the information saved in the warehouse into
categories or specific groups of users.
▪ In simple words, a data mart is a subsidiary of a data warehouse.
Example: Five regions of MSBTE: One region may be referred as Data Mart.
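The idea of dividing a warehouse into data marts can be sketched as follows. The region names, polytechnic names and student counts are made-up placeholders, and `build_data_marts` is a hypothetical helper written only for this illustration.

```python
from collections import defaultdict

# Hypothetical warehouse rows; all values are placeholders.
warehouse = [
    {"region": "Region 1", "polytechnic": "Poly A", "students": 1200},
    {"region": "Region 1", "polytechnic": "Poly B", "students": 900},
    {"region": "Region 2", "polytechnic": "Poly C", "students": 1500},
]


def build_data_marts(rows, key):
    """Split warehouse rows into one subset (data mart) per value of `key`."""
    marts = defaultdict(list)
    for row in rows:
        marts[row[key]].append(row)
    return dict(marts)


marts = build_data_marts(warehouse, "region")
region_1_total = sum(row["students"] for row in marts["Region 1"])
print(sorted(marts), region_1_total)
```

A user who is interested only in one region works with that region's mart and does not have to navigate the whole warehouse.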
3] Virtual Warehouse:
▪ The view over an operational data warehouse is known as a virtual
warehouse.
▪ A virtual warehouse is essentially a separate business database, which
contains only the data required by the operational system.
▪ The data found in a virtual warehouse is usually copied from multiple
sources throughout an operational system.
▪ Virtual warehouse is used to search the data quickly and without accessing
the entire system. It speeds up the overall access process.
Example: It may contain the data of only one or two Polytechnics.
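The "view" idea can be sketched with an SQL view, here using Python's built-in sqlite3 module. The table, view and column names and the sample rows are assumptions chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical operational table holding results for all polytechnics.
conn.execute(
    "CREATE TABLE polytechnic_results (polytechnic TEXT, student TEXT, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO polytechnic_results VALUES (?, ?, ?)",
    [
        ("Poly A", "Ravi", 74),
        ("Poly A", "Asha", 82),
        ("Poly B", "Meena", 91),
    ],
)

# The view exposes only the data of one polytechnic.
conn.execute(
    "CREATE VIEW poly_a_results AS "
    "SELECT student, marks FROM polytechnic_results WHERE polytechnic = 'Poly A'"
)

rows = conn.execute(
    "SELECT student, marks FROM poly_a_results ORDER BY student"
).fetchall()
print(rows)
```

Queries against `poly_a_results` see only the required rows, which matches the description of a virtual warehouse as a restricted view over operational data.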
Q.11] State any four Benefits of Data warehouse.
1. Delivers enhanced business intelligence: By having access to
information from various sources in a single platform, decision makers will
no longer need to rely on limited data.
2. Saves time: A data warehouse standardizes, preserves and stores data
from different sources, and integrates all the data in one place. So, all
critical data is available to all users simultaneously.
3. Enhances data quality and consistency: A data warehouse converts data
from multiple sources into a consistent format. The data from different
sources can be filtered, sorted and cleaned. This leads to more accurate
data, which becomes the basis for solid decisions.
4. Generates a high Return on Investment (ROI): Companies that invest in a
data warehouse experience higher revenues and cost savings than those that
haven't.
5. Provides competitive advantage: Data warehouses help companies get a
holistic (as a whole, not in parts) view of their current standing and
evaluate opportunities and risks, thus providing them with a competitive
advantage.
6. Improves the decision-making process: Data warehousing provides
better insights (detailed understanding) to decision makers by maintaining
a related database of current and historical data.
Q.12] Applications of Data Warehouse
Applications of Data Warehouse:
1. Airlines
2. Banking
3. Healthcare
4. Public Sector
5. Telecommunication
6. Investment