Module 1
2020-2021
MODULE 1: 6 hrs. Introduction to Data Warehousing
Course
Instructor: Alan S. Brillantes, CPA, MBA
Contact Details:
FB Messenger: Alan Brillantes
Email Ad: [email protected]
Phone No./s: 0932-9543932
Consultation Hours: 8:00-10:00 am MWF; 2:30-4:00 pm TTH
Learning Objectives
At the end of this module, you must be able to:
1. Describe what a data warehouse is, including its key characteristics and
properties.
2. Discuss the value and applications of data warehousing in business.
3. Differentiate between an online transaction processing system and a data
warehousing system.
4. Describe the parts of a data warehouse environment and their
interrelationships.
5. Discuss data model concepts.
6. Describe the data warehouse design and development process, and key
principles of data warehouse administration and management.
7. Discuss the major challenges of integrating a data warehouse into the
global information environment.
MODULE GUID
Flexible Learn
Learning Evidence
1. Accomplished Assignment
2. Accomplished Quiz
Rubric/Evaluation Tool
The following rubrics shall be utilized in evaluating and grading your work:
LE1: Accomplished Assignment
Area to Assess | Weight | Excellent | Above Average | Average | Passing | Failure
Completeness | 60% | All required contents are present | 86-99% of required contents are present | 71-85% of required contents are present | 50%-70% of required contents are present | <50% of required contents are present
Substance | 40% | Depth & elaboration are exemplary | Depth & elaboration are very good | Depth & elaboration are good | Depth & elaboration are wanting in some parts | Generally lacks depth & elaboration
LE2: Quiz
Area to Assess | Superior | Above Average | Average | Below Average | Poor
Number of Correct Answers | 91-100% correct | 61-90% correct | 51-60% correct | 41-50% correct | <40% correct
CONTENTS:
“"We have heaps of data, but we cannot access it!" This shows the frustration of those
who are responsible for the future of their enterprises but have no technical tools to help them
extract the required information in a proper format.
"How can people playing the same role achieve substantially different results?" In
midsize to large enterprises, many databases are usually available, each devoted to a specific
business area. They are often stored on different logical and physical media that are not
conceptually integrated. For this reason, the results achieved in every business area are likely
to be inconsistent.
"We want to select, group, and manipulate data in every possible way!" Decision-making
processes cannot always be planned before the decisions are made. End users need a tool that
is user-friendly and flexible enough to conduct ad hoc analyses. They want to choose which
new correlations they need to search for in real time as they analyze the information retrieved.
"Show me just what matters!" Examining data at the maximum level of detail is not only
useless for decision-making processes, but is also self-defeating, because it does not allow
users to focus their attention on meaningful information.
"Everyone knows that some data is wrong!" This is another sore point. An appreciable
percentage of transactional data is not correct—or it is unavailable. It is clear that you cannot
achieve good results if you base your analyses on incorrect or incomplete data.”
Without a centralized database allowing ease of access, there was a lot of speculation:
even when people knew where the data was or how to get it, they often could not. With a
central database providing the information at a moment's notice, gathering information is
much easier.
Even with meticulous organization, data accumulates over time. Before there were data
warehouses, people had to manually sift through stored records and hope that the
information they wanted had been kept.
Data is not always valid. Data is entered by people, so there is always a chance that
errors occur, whether accidental or deliberate. Before DWs, validation of data was not
guaranteed, which could lead to faulty data during information gathering.
… It [the DW] is the central point of data integration for business intelligence and is the
source of data for the data marts, delivering a common view of enterprise data.
Business intelligence is the set of processes and data structures used to analyze
data and information used in strategic decision support. The components of business
intelligence are the data warehouse, data marts, the DSS (decision support system)
interface, and the processes to ‘get data in’ to the data warehouse and to ‘get
information out’.
Subject Oriented
Integrated
The warehouse contains integrated data about a particular subject instead of the
ongoing operations of the organization (Debevoise, 1999; Inmon, 1996a; Rahm and Do,
2000). Data is integrated as the data moves from operational systems into the data
warehouse. In a data warehouse the data not only is integrated across different
functional units of the organization but also includes external entities such as customers
and suppliers. For example, feeds from the stock market may be integrated with
financial data from operational systems in a data warehouse for a comprehensive
financial analysis.
Because data warehouses are targeted for decision support, they contain consolidated
data rather than detailed, individual transactional records. Data in the warehouses is
integrated from several operational databases, over potentially long periods of time into
one repository. Data is integrated to support a corporate view of the data. Integration is
not the mere gathering of data into a single large database. Integration of data requires
Nonvolatile
Data in the data warehouse is nonvolatile. Once the data enters the data warehouse, it
remains unchanged. In an operational system, data can be changed by deleting or
modifying it. The data in the data warehouse is not updated. Any change to the
information is done by adding a new record to reflect the changed status of the data.
The existing records are not modified. For example, say a person’s contact details are
stored in the customer database as Record No. 1. In an operational system, if the
person’s telephone number changes, this change is made to the Record No. 1 in the
customer database by modifying the entry. However, in a data warehouse no change
will be made to Record No. 1. Instead, a new record (Record No. 2) will be created and
inserted into the data warehouse to reflect the changed telephone number. The
warehouse data is nonvolatile in that the data that enter the database are rarely, if ever,
changed.
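The contrast between the two update styles can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names, and the sample phone numbers are assumptions made for the example and do not come from any particular system.

```python
# Hypothetical customer record; field names and values are made up for illustration.
operational_db = {1: {"name": "Sample Customer", "phone": "555-0100"}}

# Operational system: the existing record is overwritten in place.
operational_db[1]["phone"] = "555-0199"

# Data warehouse: existing rows are never modified; a change arrives as a new row.
warehouse = [
    {"record_no": 1, "customer_id": 1, "phone": "555-0100"},
]


def record_phone_change(rows, customer_id, new_phone):
    """Append a new record instead of updating an old one."""
    rows.append({
        "record_no": len(rows) + 1,
        "customer_id": customer_id,
        "phone": new_phone,
    })


record_phone_change(warehouse, customer_id=1, new_phone="555-0199")

# The operational system only knows the current number...
print(operational_db[1]["phone"])
# ...while the warehouse still holds Record No. 1 next to the new Record No. 2.
for row in warehouse:
    print(row)
```

Because nothing is overwritten, the warehouse can still answer the question "what was this customer's number before the change?", which the operational system cannot.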
Time Variant
A major strength of the data warehouse lies in the time variance of its data (Pedersen
and Jensen, 1998; Han et al., 1998). The value of the operational data archived in the
data warehouse is a function of time and changes on the basis of time. A data
warehouse gives an accurate picture of operational data for a given time, and changes
in the data in the warehouse are based on the time-based changes in operational data.
The data from the operational systems is extracted at a specific moment in time,
creating a snapshot of the data. The data warehouse consists of snapshots of the
operational data taken at intervals of time. Data can be viewed in the data warehouse
across the field of time in different levels of detail. This time variant characteristic of the
data warehouse allows complex analysis along the time dimension, allowing patterns
and trends to be viewed over time.
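The snapshot idea can be illustrated with a minimal Python sketch. The month-end dates, the balance values, and the helper names `balance_as_of` and `changes_over_time` are hypothetical and chosen only for the example.

```python
from datetime import date

# Hypothetical snapshots of one account balance, extracted at month end.
snapshots = [
    (date(2020, 1, 31), 1200.0),
    (date(2020, 2, 29), 1350.0),
    (date(2020, 3, 31), 1100.0),
]


def balance_as_of(snaps, when):
    """Return the value from the latest snapshot taken on or before `when`."""
    earlier = [(taken, value) for taken, value in snaps if taken <= when]
    if not earlier:
        return None  # no snapshot existed yet
    return max(earlier)[1]  # tuples sort by snapshot date first


def changes_over_time(snaps):
    """Difference between consecutive snapshots, i.e. the trend over time."""
    ordered = sorted(snaps)
    return [(later[0], later[1] - earlier[1])
            for earlier, later in zip(ordered, ordered[1:])]


print(balance_as_of(snapshots, date(2020, 3, 15)))
print(changes_over_time(snapshots))
```

The first function gives the picture of the data for a given point in time; the second shows the kind of analysis along the time dimension that a single, constantly overwritten operational value would not support.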
Data warehousing is much bigger than simply delivering reports in a timely manner. It is
not the data, the technologies, or the reports that impact the business. Rather, it is the ability of
your staff to harness the information to make better, fact‐based, insightful decisions. The data
warehouse is simply a tool that enables your staff to be more effective. The types of things that
can be done using a data warehouse include the following:
Measuring business performance: Using reports from the data warehouse, actual and
forecasted performance can be compared. For example, claims managers can see how close
they are to reaching the target of making a first payment on claims within the first ten days of
opening a new claim.
Reporting and understanding financial results: A data warehouse can help identify
departments that have exceeded their monthly budgets, highlight suppliers who have
consistently met profitability goals, and single out products that have contributed the most (or
least) to the bottom line.
Understanding customers and their behavior: Exception reports that highlight changes in
consumer purchase patterns can help identify shifts in the marketplace or erosion of brand
loyalty. For example, early identification of changes in payment patterns might indicate that a
customer is under financial pressure and could benefit from a courtesy call to prevent more
serious problems.
Identifying high‐value customers: Using the data warehouse to identify the lifetime value of
customers helps with the development of loyalty programs and improves customer service.
Some customers may generate many business transactions, but they may not actually be
profitable. Other customers may contribute consistently to the organization's profits without a lot
of hands‐on interaction or support.
Attracting and retaining high‐value customers: Data warehouse reports can help you to
develop a profile of high‐value customers so that initiatives can be created to seek out new
customers with a similar profile. This may mean offering low‐cost incentives early on so that the
organization has the opportunity to develop a strong long‐term relationship.
Understanding which products should be scaled back or eliminated: Using the data
warehouse, reports can be generated to highlight products with lagging sales. Additional
analyses can be run to determine the cost effectiveness of continuing to carry these items in
stores. The data warehouse can also be used to help develop plans identifying when trendy
items should be marked down to clear out any remaining inventory.
Understanding business competitors: The data warehouse can provide reports to compare
internal sales volumes with external competitor sales figures. This can help identify fluctuations
in the overall marketplace and how well the organization is maintaining its market share.
Identifying opportunities to improve business flow and processes: The data warehouse
can be used to track how business transactions flow within the organization to identify
bottlenecks, the need for more training, or when systems capacity can no longer keep up with
demand.
Understanding the impact of highly qualified professionals: A data warehouse can also be
used effectively in not‐for‐profit scenarios. For example, data warehouse reports can help
identify teachers who meet specific criteria and to track how a teacher's students perform on
educational assessments over time.
Applications that run the business are called online transaction processing systems
(OLTPs). These OLTPs involve transaction processing that occurs interactively with the
end user. Online transactions are familiar to most people. Examples include:
ATM machine transactions such as deposits, withdrawals, inquiries, and
transfers
Supermarket payments with debit or credit cards
Purchase of merchandise over the Internet
One of the main characteristics of a transaction system is that the interactions between
the user and the system are very short. The user will perform a complete business
transaction through short interactions, with immediate response time required for each
interaction. These systems are currently supporting mission-critical applications;
OLTP systems are geared toward functions such as processing incoming orders,
getting products shipped out, and transferring funds as requested. These applications
must ensure that transactions are handled accurately and efficiently. No one wants to
wait minutes to get cash from an automated teller machine, or to enter sales orders into
a company's system.
In contrast, the purpose and characteristics of a data warehousing environment are to
provide data in a format easily understood by the business community in order to
support decision‐making processes. The data warehouse (DW) supports looking at the
business data over time to identify significant trends in buying behavior, customer
retention, or changes in employee productivity. Table 1 lays out the primary differences
between these two types of systems.
The inherent differences between the functions performed in OLTP and DW systems
result in methodology, architecture, tool, and technology differences. Data warehousing
emerged as an outgrowth of necessity, but has blossomed into a full‐fledged industry
that serves a valuable function in the business community.
Now that the differences between data warehouse and OLTP systems have been
reviewed, it is time to look deeper into the makeup of the data warehouse itself.
[Figure 1: Data flow through the data warehouse environment: OLTP Source Systems -> Extract, Transform, and Load Process -> Data Organized to Support the Business -> Access & Use of Data]
There are many different parts of a data warehouse environment, which encompasses
everything from where the data lives today through where it is ultimately used on reports
and for analysis. Each of the main parts of the data warehousing environment, shown in
Figure 1, is described in the following sections. This figure indicates how the data
flows throughout the environment.
Source systems, shown on the left side of Figure 1, are where data is created or
collected by operational application systems that run the business. These are often
large applications that have been in place for a long time. Examples of source systems
include the following:
Order processing
Production scheduling
Financial trading systems
Policy administration
The entire midsection of Figure 1 is devoted to the preparation and organization of data.
First, the data must be extracted from the source systems. Next, the data needs to be
transformed to prepare it for business use. It must be cleansed, validated, integrated
together, and reorganized. Finally, the data is loaded into structures that are designed
to deliver it to the business community. The entire process is referred to as the extract,
transform, and load (ETL) process.
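The three ETL steps can be sketched as three small Python functions. This is a minimal illustration under stated assumptions, not a production design: the source rows, the `orders` table, and the cleansing rules (trim and upper-case the customer name, skip rows with a non-numeric amount) are all hypothetical, and an in-memory SQLite database stands in for the target structure.

```python
import sqlite3

# Hypothetical rows as they might arrive from two source systems.
source_rows = [
    {"order_id": "A-1", "customer": " acme corp ", "amount": "120.50"},
    {"order_id": "A-2", "customer": "ACME CORP", "amount": "79.50"},
    {"order_id": "B-7", "customer": "Beta Ltd", "amount": "n/a"},  # invalid amount
]


def extract():
    """Extract: read the raw rows from the source systems."""
    return list(source_rows)


def transform(rows):
    """Transform: cleanse, validate, and standardize the rows."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # validation: skip rows whose amount is not a number
        customer = row["customer"].strip().upper()  # cleansing and integration
        clean.append((row["order_id"], customer, amount))
    return clean


def load(conn, rows):
    """Load: write the prepared rows into the target structure."""
    conn.execute("CREATE TABLE orders (order_id TEXT, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
).fetchall()
print(totals)
```

After the transform step, the two spellings of the same customer are integrated into one, so the total is reported once. In a real ETL process, rejected rows would normally be logged for follow-up rather than silently skipped.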
The database in which the data is organized to support the business is called a data
mart. A data mart includes all of the data that is loaded into a single database and used
together for analysis. Data marts are often developed to meet the needs of a business
group such as marketing or finance. The key to a successful data mart is to create it in
an integrated manner. It is also recommended that data be loaded into only one data
mart and then shared across the organization to ensure data consistency.
Data Models
There is one more critical concept that warrants some attention: the mechanism used to
help organize data, which is called a data model.
A data model is an abstraction of how individual data elements relate to each other. It
visually depicts how the data is to be organized and stored in a database. A data model
provides the mechanism for documenting and understanding how data is organized.
There are many different types of data modeling, each with a specific goal and purpose.
As organizations modified how data was structured to support reporting and analysis, a
new data modeling technique, now called dimensional modeling, emerged. Ralph
Kimball, a pioneer in data warehousing, can be credited with crystallizing these
techniques and publishing them for the benefit of the industry.
Data modeling is the process used to define data requirements to support business
structures.
Here is a brief overview of how the IT industry views and defines data modeling.
Conceptual Data Model: describes the scope of the model and the business
structures. It is the first step in organizing the data requirements. A Conceptual Data
Model (CDM) is a structured business view of the data required to support current
business processes, business events, and related performance measures. It is a single
integrated data structure which reflects the structure of business functions rather than
the processing flow or the physical arrangement of data.
Characteristics:
- Represents overall logical structure of data
- Independent of software or data storage structure
- Often contains objects not implemented in physical databases
- Represents data needed to run an enterprise or a business activity
Logical Data Model: describes the technical details of how the data that business users
conceptualized will be structured. This includes describing which tables are to be used,
their relationships, and so on.
Transactional Logical Data Model: used for transactional data modeling (transactional
data includes ledgers, sales, history logs, and data matrices).
Analytical Logical Data Model: used for analytic or generic logical data modeling
(analytical data includes business strategies, data consolidation, profiling, and
fact-finding solutions).
Logical Data Model (LDM) builds upon the business requirements and includes a further
level of detail that supports both the business and system requirements. Business rules
are incorporated into the LDM and it loses some of the “generalities” from the Enterprise
CDM.
Characteristics:
– Independent of specific software and data storage structure
– Includes more specific entities and attributes
– Includes business rules and relationships
– Includes foreign keys, alternate keys
Physical Data Model: defines the physical aspects of the data warehouse: where you
will put all of your records, what software is to be used, and how the storage will be used.
Physical Data Model (PDM) is specific to the software and performance constraints of
the specific database management system to be used in the implementation. Both
software and data storage structures are considered and the model is often modified to
meet performance or physical constraints.
Characteristics:
– Dependent on specific software and data storage structure
– Includes tables and columns
– Includes physical database objects (triggers, stored procedures, table spaces)
– Includes referential integrity rules that restrict relationships between tables
Multidimensional data model in a data warehouse is a model that represents data in the form
of data cubes. It allows data to be modeled and viewed in multiple dimensions, and it is defined
by dimensions and facts. A multidimensional data model is generally organized around a central
theme and represented by a fact table.
Dimension: provides the context surrounding a business process event. In simple terms, dimensions
give the who, what, and where of a fact. In the Sales business process, for the fact quarterly sales
number, dimensions would be
Fact Examples: Sales, Shipments, Hospital Admissions
Measure Examples: Sale Receipts, Amount Shipped, Hospital Admission Costs
The multidimensional model begins with the observation that the factors affecting decision-
making processes are enterprise-specific facts, such as sales, shipments, hospital admissions,
surgeries, and so on. Instances of a fact correspond to events that occurred. For example,
every single sale or shipment carried out is an event. Each fact is described by the values of a
set of relevant measures that provide a quantitative description of events. For example, sales
receipts, amounts shipped, hospital admission costs, and surgery time are measures.
Perhaps the best starting point to approach the multidimensional model effectively is a definition
of the types of queries for which this model is best suited. Section 1.7 offers more details on
typical decision-making queries such as those listed here (Jarke et al., 2000):
"What is the total amount of receipts recorded last year per state and per product
category?"
"What is the relationship between the trend of PC manufacturers' shares and quarter
gains over the last five years?"
"Which orders maximize receipts?"
"Which one of two new treatments will result in a decrease in the average period of
admission?"
"What is the relationship between profit gained by the shipments consisting of less than
10 items and the profit gained by the shipments of more than 10 items?"
It is clear that using traditional languages, such as SQL, to express these types of queries can
be a very difficult task for inexperienced users. It is also clear that running these types of
queries against operational databases would result in an unacceptably long response time.
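To make the first query in the list concrete, here is a sketch of how it might look in SQL, run from Python against an in-memory SQLite database. The table and column names (`sales`, `store`, `product`, `receipts`), the sample rows, and the choice of 2020 as "last year" are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store   (store_id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE sales   (store_id INTEGER, product_id INTEGER,
                      sale_date TEXT, receipts REAL);
INSERT INTO store   VALUES (1, 'CA'), (2, 'NY');
INSERT INTO product VALUES (10, 'Food'), (11, 'Toys');
INSERT INTO sales   VALUES
    (1, 10, '2020-03-01', 100.0),
    (1, 10, '2020-07-15', 50.0),
    (1, 11, '2020-07-15', 30.0),
    (2, 10, '2020-09-09', 80.0),
    (2, 11, '2019-12-31', 999.0);
""")

# "What is the total amount of receipts recorded last year
#  per state and per product category?"  (last year assumed to be 2020)
query = """
SELECT st.state, p.category, SUM(s.receipts) AS total_receipts
FROM sales AS s
JOIN store   AS st ON st.store_id  = s.store_id
JOIN product AS p  ON p.product_id = s.product_id
WHERE s.sale_date BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY st.state, p.category
ORDER BY st.state, p.category
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Even this simple question needs two joins, a date filter, and a grouping clause, which illustrates why such queries are hard for inexperienced users to write by hand.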
[Figure: Sales Cube Example, with dimensions Dates, Stores, and Products]
How it Works: Cubes
The cube metaphor came up as a way to visualize this model.
The concept of dimension gave life to the broadly used metaphor of cubes to represent
multidimensional data. According to this metaphor, events are associated with cube cells and
cube edges stand for analysis dimensions. If more than three dimensions exist, the cube is
called a hypercube. Each cube cell is given a value for each measure. Figure 6 shows an
intuitive representation of a cube in which the fact is a sale in a store chain. Its analysis
dimensions are store, product and date. An event stands for a specific item sold in a specific
store on a specific date, and it is described by two measures: the quantity sold and the receipts.
This figure highlights that the cube is sparse—this means that many events did not actually take
place. Of course, you cannot sell every item every day in every store.
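One simple way to picture a sparse cube in code is a dictionary keyed by the three dimensions, holding only the cells where a sale actually happened. The store, product, and date values below are hypothetical, as are the measure values.

```python
# Hypothetical dimension values for a tiny store chain.
stores = ["S1", "S2"]
products = ["milk", "bread", "soap"]
dates = ["2020-01-01", "2020-01-02"]

# Only events that actually happened are stored:
# (store, product, date) -> (quantity sold, receipts)
cube = {
    ("S1", "milk", "2020-01-01"): (10, 25.0),
    ("S1", "bread", "2020-01-02"): (4, 8.0),
    ("S2", "soap", "2020-01-01"): (1, 3.5),
}

potential_cells = len(stores) * len(products) * len(dates)
sparsity = 1 - len(cube) / potential_cells


def cell(store, product, date):
    """Return the measures of a cell, or None if no sale took place."""
    return cube.get((store, product, date))


print(potential_cells, len(cube), sparsity)
print(cell("S1", "milk", "2020-01-01"))
print(cell("S2", "milk", "2020-01-01"))
```

Here only 3 of the 12 possible cells hold an event, so three quarters of the cube is empty.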
Multidimensional Model
Events:
Unique events in the multidimensional model (cube cells) do not always correspond to
unique events in the application domain.
Each cell represents the entire sales in a store in a day; it does not take individual
transactions into account.
[Figure labels: Store, Day (App Domain), Sales]
To avoid any misunderstanding of the term event, you should realize that the group of
dimensions selected for a fact representation singles out a unique event in the multidimensional
model, but the group does not necessarily single out a unique event in the application domain.
To make this statement clearer, consider once again the sales example. In the application
domain, one single sales event is supposed to be a customer's purchase of a set of products
from a store on a specific date. In practice, this corresponds to a sales receipt. From the
viewpoint of the multidimensional model, if the sales fact has the product, store, and date
dimensions, an event will be the daily total amount of an item sold in a store. It is clear that the
difference between both interpretations depends on sales receipts that generally include various
items, and on individual items that are generally sold many times every day in a store. In the
following sections, we use the terms event and fact to make reference to the granularity taken
by events and facts in the multidimensional model.
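The difference between the two notions of event can be shown with a short sketch. The receipt lines below are invented for illustration; the code groups them by product, store, and date, which is the granularity of the cube described above.

```python
from collections import defaultdict

# Hypothetical receipt lines: (receipt_no, store, date, product, quantity, amount)
receipt_lines = [
    (1001, "S1", "2020-01-01", "milk", 2, 5.0),
    (1001, "S1", "2020-01-01", "bread", 1, 2.0),
    (1002, "S1", "2020-01-01", "milk", 1, 2.5),
    (1003, "S2", "2020-01-01", "milk", 3, 7.5),
    (1003, "S2", "2020-01-01", "soap", 1, 3.5),
]

# One cube event per (product, store, date); receipt numbers disappear.
events = defaultdict(lambda: [0, 0.0])
for _receipt, store, date, product, qty, amount in receipt_lines:
    event = events[(product, store, date)]
    event[0] += qty      # measure: quantity sold
    event[1] += amount   # measure: receipts

receipts_in_domain = len({line[0] for line in receipt_lines})
print(receipts_in_domain, len(events))
print(events[("milk", "S1", "2020-01-01")])
```

Three receipts in the application domain become four events in the cube: receipt 1001 is split across two products, while the milk sold on receipts 1001 and 1002 is merged into a single daily total.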
For the marketing manager, his business dimensions are product, product category,
time (day, week, month), sales district, and distribution channel. For the financial controller,
the business dimensions are budget line, time (month, quarter, year), district, and
division.
If your users of the data warehouse think in terms of business dimensions for decision
making, you should also think of business dimensions while collecting requirements.
Although the actual proposed usage of a data warehouse could be unclear, the business
dimensions used by the managers for decision making are not nebulous at all. The users
will be able to describe these business dimensions to you. You are not totally lost in the process
of requirements definition. You can find out about the business dimensions.
Let us try to get a good grasp of the dimensional nature of business data. Figure 8 shows
Cube Management:
Assume: 100 Stores, 100 Items, 3 Years (Roughly 1000 Days)
Cube Size = 100 × 100 × 1000 = 10,000,000 Potential Events
The information in a multidimensional cube is very difficult for users to manage because of its
quantity, even if it is a concise version of the information stored in operational databases. If, for
example, a store chain includes 50 stores selling 1000 items, and a specific data warehouse
covers three-year-long transactions (approximately 1000 days), the number of
potential events totals 50 × 1000 × 1000 = 5 × 10^7. Assuming that each store can sell only 10
percent of all the available items per day, the number of events totals 5 × 10^6. This is still too
much data to be analyzed by users without relying on automatic tools.
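The arithmetic in both examples can be checked with a few lines of Python. The variable names are arbitrary; the figures are the ones given in the text.

```python
# Slide example: 100 stores, 100 items, roughly 1000 days.
slide_cube_size = 100 * 100 * 1000

# Text example: 50 stores, 1000 items, roughly 1000 days.
stores, items, days = 50, 1000, 1000
potential_events = stores * items * days

# Assumption stated in the text: each store sells only 10 percent
# of the available items per day.
actual_events = potential_events // 10

print(slide_cube_size)
print(f"{potential_events:.0e}", f"{actual_events:.0e}")
```

Even after the 10 percent assumption, five million events remain, which is the motivation for reducing the data before presenting it to users.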
You have essentially two ways to reduce the quantity of data and obtain useful information:
restriction and aggregation.
The cube metaphor offers an easy-to-use and intuitive way to understand both of these
methods.
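Both methods can be sketched on a small dictionary-based cube. In this sketch, restriction is taken to mean keeping only the cells that match chosen dimension values (a slice of the cube), and aggregation is taken to mean summing a measure over one dimension. The data and function names are hypothetical.

```python
from collections import defaultdict

# (store, product, date) -> receipts; hypothetical data
cube = {
    ("S1", "milk", "2020-01-01"): 25.0,
    ("S1", "milk", "2020-01-02"): 20.0,
    ("S1", "bread", "2020-01-01"): 8.0,
    ("S2", "milk", "2020-01-01"): 12.5,
}


def restrict(cells, store=None, product=None, date=None):
    """Restriction: keep only the cells matching the given dimension values."""
    wanted = (store, product, date)
    return {
        key: value for key, value in cells.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    }


def aggregate_by_store_and_product(cells):
    """Aggregation: sum receipts over the date dimension."""
    totals = defaultdict(float)
    for (store, product, _date), receipts in cells.items():
        totals[(store, product)] += receipts
    return dict(totals)


slice_s1 = restrict(cube, store="S1")
rolled_up = aggregate_by_store_and_product(cube)
print(slice_s1)
print(rolled_up)
```

Restriction reduces the data by looking at fewer cells (only store S1), while aggregation reduces it by collapsing a dimension (all dates become one total per store and product).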
Earlier in this module, you looked at how data flows through the data warehouse environment.
While this correctly illustrates how data flows in the completed environment, this is not the
recommended sequence for designing and developing a data warehouse. A better way to
design the environment is to start from the business user perspective. Business questions and
problems arise first, and the related data is collected by the source systems (a source system
can be flat files, spreadsheets, personal folders, pictures, etc.). This source data is then
processed and organized in a way that helps support the business. Once it is organized, users
can access the information they need.
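[Figure: Order for designing and implementing the data warehousing environment. Recoverable box labels: Access & Use of Data; Data Organized to Support the Business; Design and Build Process; Process the Data]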
This figure shows the correct order to successfully design and implement a data warehousing
environment. Both the technical and business team members play a role throughout.
An understanding of what the business is trying to accomplish and how success is measured
should be the foundation for all data warehousing initiatives. The starting point for designing the
data warehouse is with the business community.
Once the business requirements are understood, the data in the underlying source systems
needs to be studied. Many business people have a vision for what they want to do, but it is not
always tied to the reality of the organization's actual data.
The foundation for successful data warehousing, now and into the future, is properly structuring
the data. Data must be organized to support the business perspective. This provides ease of
use and improved query performance. This design is created based on a knowledge of the
business requirements, as well as the reality of the existing data.
After defining how the data will be organized, the design for getting the data from the source
systems to the database can be created. Decisions about the architecture and tools needed to
prepare the data can be made in the proper context. Too often these decisions are made before
you know what is to be delivered.
While the data is being prepared, the data access and application layer can be designed. This
includes the design of basic reports, business intelligence, and analytical applications, and
performance dashboards or other end user tools.
Project Methodology
Many different project methodologies are available for all systems' development efforts. There
are even multiple methodologies specifically targeted toward data warehousing. These have
evolved over several decades. Most organizations already have adopted some type of project
methodology or project life cycle. It is important to understand how your organization runs
projects to ensure that the data warehouse project is adhering to the strategic direction for all
information systems. Several basic building blocks are found in any methodology. These
primary components are as follows:
A well-managed data warehouse can assist a corporation in its strategy to gain competitive
advantages. This can be achieved by using an exploration warehouse, which is a direct product
of the data warehouse, to identify environmental factors, formulate strategic plans, and determine
business-specific objectives:
While managing a data warehouse for business strategy, what needs to be taken into
consideration is the difference between companies. No one formula fits every organization.
Avoid using so-called "templates" from other companies. The data warehouse is used for your
company's competitive advantages. You need to follow your company's user information
requirements for strategic advantages.
Manager/Director: responsible for the overall management of the entire team to ensure that the
team follows the guiding principles, business requirements, and corporate strategic plans.
Project Manager: responsible for data warehouse project development, including matching
each team member's skills and aspirations to tasks on the project plan.
Executive Sponsor: responsible for garnering and retaining adequate resources for the
construction and maintenance of the data warehouse.
Business Analyst: responsible for determining what information is required from a data
warehouse to manage the business competitively.
System Architect: responsible for developing and implementing the overall technical
architecture of the data warehouse, from the backend hardware and software to the client
desktop configurations.
ETL Specialist: responsible for routine work on data extraction, transformation, and loading for
the warehouse databases.
Front End Developer: responsible for developing the front-end, whether it is client-server or
over the Web.
OLAP Specialist: responsible for the development of data cubes, a multidimensional view of
data in OLAP.
Trainer: responsible for training the end-users to use the system so that they can benefit from
the data warehouse system.
End User: responsible for providing feedback to the data warehouse team.
Process Management
Developing a data warehouse has become a popular but exceedingly demanding and costly
activity in information systems development and management. Data warehouse vendors are
competing intensively for their customers because so much of their money and prestige are at
stake. Consulting vendors have redirected their attention toward this rapidly expanding market
segment. User companies are faced with a serious question: which product should they buy?
As mentioned before, data warehouse development is a large system development process.
Process management is not required in every step of the development processes.
Security Management
In recent years, information technology (IT) security has become one of the hottest and most
important topics facing both users and providers. The goal of database security is the protection
of data from accidental or intentional threats to its integrity and access. The same is true for a
data warehouse. However, higher security methods, in addition to the common practices such
as view-based control, integrity control, processing rights, and DBMS security, need to be used
for the data warehouse due to the differences between a database and data warehouse. One of
the differences that demand a higher level of security for a data warehouse is the scope of and
detail level of data in the data warehouse, such as financial transactions, personal medical
records, and salary information. One method that can be used to protect data requiring a high
level of security in a data warehouse is encryption and decryption.
Confidential and sensitive data can be stored in a separate set of tables where only authorized
users can have access. These data can be encrypted while they are being written into the data
warehouse. In this way, the data captured and stored in the data warehouse are secure and can
only be accessed on an authorized basis. Three levels of security can be offered by using
encryption and decryption. The first level is that only authorized users can have access to the
data in the data warehouse. Each group of users, internal or external, ranging from executives
to information consumers should be granted different rights for security reasons. Unauthorized
users are totally prevented from seeing the data in the data warehouse. The second level is the
protection from unauthorized dumping and interpretation of data. Without the right key, an
unauthorized user will not be allowed to write anything into the tables, nor can the
existing data in the tables be decrypted. The third level is the protection from
unauthorized access during the transmission process. Even if unauthorized access occurs
during transmission, there is no harm to the encrypted data unless the user has the decryption
code.
The availability of the opportunities described will completely depend on the availability, quality,
and organization of the information in the external sources as well as on the organization's
ability to complete complicated projects and its commitment to flexible and well-grounded
decision making. That is why such a company will have to adopt new organizational approaches
as well as new software engineering solutions and technologies that could provide a solid base
for efficient and adequate accumulation of information from scattered sources in a global
information environment for its further integration into corporate decision support systems.
The problem of development of data warehouses in a global information environment has much
in common with the development of data warehouses within the scope of corporate information
systems (we have already mentioned those difficulties and tasks). But on the whole it
significantly differs from the implementation of data warehouses that use information from local
databases:
The external information environment and the data in it (data sources, data formats, access
interfaces) may change significantly, and these changes may be made without taking
into account how they influence the behavior of the entities that consume this
information.
A. In your own words, answer the following questions (DO NOT copy verbatim from the text of
this module; you may use Word or simply write your solution):
1. Describe what a data warehouse is, including its key attributes. (10 pts.)
2. Justify how a large organization like San Miguel Corporation (SMC) may benefit from
having a data warehouse. You may cite actual products and other real-life aspects of
SMC operations to illustrate your points (you may do some research on this). (15 pts.)
3. Identify the parts of the data warehouse environment and discuss their interrelationships.
(10 pts.)
4. Construct a multidimensional model of possible data cubes for SMC (DO NOT use the
same examples mentioned in the module). Draw the same; illustrate and label the
dimensions, facts, measures, events, and slices. (15 pts.)
5. Discuss briefly the data warehouse design and development process. (5 pts.)