Lesson 22: An introduction to data warehousing

22.1 Introduction: Data access for decision makers

When you used the ACCESS report writer to create an invoice report in Section 16.4, you may have made an important observation:

Creating a report from a relational database requires a solid understanding of where the data is stored and how the tables are related.

In this case, you are the person who implemented the database, so you know that each Order consists of many OrderDetails, that OrderDetails.ActualPrice (not Products.UnitPrice) is what the items sold for, and so on. But who else in your organization could realistically be expected to know all these details?

22.1.1 Database specialists versus business specialists

A real-world business application typically contains hundreds or thousands of tables and a mind-boggling web of relationships between tables. The good news is that within each organization, there is an individual who understands how all the data fits together: the database administrator (DBA). The bad news is that the DBA is a database specialist, not a business specialist.

The decision makers who most need the information locked up inside the database (marketing managers, executives) seldom have any training (or even interest) in the subtleties of third-normal form, concatenated keys, or referential integrity. Similarly, as a class of individuals, DBAs are not known for their marketing instincts or general business acumen.

22.1.2 Dimensional data modeling

In the early 90s, a new class of database application, the data warehouse, emerged to address the problems encountered by managers as they tried to access information locked up inside transaction processing systems. At the simplest level, a data warehouse is a read-only copy of the data in a transaction processing system. However, instead of being optimized for transaction processing, a data warehouse is optimized for reporting and decision support. Specifically, a data warehouse is based on a “dimensional” data model rather than a “normalized” data model.

© Michael Brydon ([email protected])


Last update: 02-May-01

An example of a dimensional data model is shown in Figure 22.1. The center table, Sales, contains the “fact” of primary importance to decision makers: the dollar amount sold. The other tables (Customers, Time, and Products) are the “dimensions” along which sales vary. For example, a manager may want to know who her best customers are this quarter and what they are buying. Finding an answer using a normalized database would require some reasonably sophisticated knowledge of both a query language and the table structure of the database. However, as you will see, it is very easy to answer this type of question when the source of the data is organized into facts and dimensions. The ease with which business users can create complex queries is an important benefit of the data warehousing approach to decision support.

FIGURE 22.1: A dimensional data model. (Labels: “fact” table at the center; “dimension” tables around it.)

22.1.3 Building a data warehouse

In this lesson, you will build a very small-scale data warehouse. Although your warehouse will be implemented in ACCESS and will be a fraction of the size of a real-world warehouse, even a small warehouse is sufficient to illustrate the critical elements of data extraction and dimensional data modeling. In Lesson 23 you will use your data warehouse to explore your data in greater detail and answer complex questions about your business.

22.2 Learning objectives

• understand the difference between data models for transaction processing and data models for decision support
• denormalize data to create dimension tables
• extract data to create a fact table
• build a star schema
• use grouping to change the granularity of a fact table

22.3 Exercises

Since you may not have enough data in your order entry application to yield interesting query results, we will use the NORTHWIND TRADERS sample database (recall Section 4.3.2) as the source for our data warehouse. The NORTHWIND TRADERS database is also small by real-world standards; however, it contains enough orders (just over 800) to make the querying exercises in this lesson and Lesson 23 worthwhile.

22.3.1 Preliminaries

Rather than alter the NORTHWIND TRADERS sample database, we are going to create links to its tables (recall Section 8.3.3) and extract the data into a new database file.

➨ Create a new blank database in ACCESS called OrderEntryWarehouse.

➨ From the main menu, select File → Get External Data → Link Tables.

➨ Use the search feature of the “Link” dialog box to find the NORTHWIND TRADERS database (recall Section 4.3.2).

➨ When asked which tables you would like to create links to, select all the tables, as shown in Figure 22.2.

FIGURE 22.2: Select tables in the operational system to link to.
(1) Select all the tables in the operational database and press OK.

Your data warehouse database should now contain links to all the tables in a transaction processing system application.

➨ Since you are still playing the role of the DBA at this point, bring up the relationships window to get a sense for the structure of the NORTHWIND TRADERS application.

22.3.2 Extraction, cleaning, and transformation

Data extraction is the process of copying the data from the operational system (i.e., the NORTHWIND TRADERS order entry system) to the
data warehouse. During extraction, two things normally happen to the data:

1. Cleaning: Data cleaning (or scrubbing) involves dealing with incorrect or inconsistent data, missing values, and so on. As you can imagine, this is a costly and difficult process that involves a combination of specially written programs and manual intervention.

2. Transformation: Data from the transaction processing system must be transformed into a format suitable for reporting, analysis, and other decision support activities. Typically, transformation involves de-normalizing dimension tables and pre-computing fact tables. Both these processes are illustrated in the following sections.

Note: To keep things simple, we are going to assume that the NORTHWIND TRADERS order entry application has been designed to minimize the possibility of errors entering the database. As such, the data cleaning stage is assumed to be unnecessary.

In the next few sections, you will use action queries to copy data from the linked tables you created in Section 22.3.1 to new tables in your OrderEntryWarehouse database.

22.3.3 Creating dimension tables

Dimension tables are relatively static: they contain lists of products, customers, and so on. However, since the dimensions determine the types of questions you can ask of your data warehouse, it is important to put some thought into their design. There are two important questions to answer before you dive in:

1. What dimensions are important for making business decisions?

2. What is the appropriate level of granularity for each dimension?

These questions are addressed on a case-by-case basis in the following sections.

22.3.3.1 The product dimension

Clearly, you are going to want to look at sales by product. The granularity issue is whether you need to look at individual SKUs (stock keeping units) or whether a coarser-grained approach (e.g., product category) is sufficient.

➨ Create a new query called qryProductDimensionExtract based on the Products table.

➨ From the Query menu, select Make Table Query.

➨ When prompted for the name of the new table, enter dimProducts.

Important: Since you already have a linked table called Products, you cannot use the same name. In addition, the dim prefix is an easy way to indicate that the table contains denormalized dimension data.

In the NORTHWIND TRADERS database, each product is assigned to a category (beverages, condiments, and so on). If we analyze products by individual SKUs (in this case, ProductID), then the granularity is quite fine. However, if we lump all products within a particular category together and perform our analyses at the category level, then the granularity is coarser.

Important: If your dimension is too fine-grained, your data warehouse will be very large and may involve excessive processing to respond to user queries. However, if your dimension is too coarse-grained, you will be unable to ask certain kinds of questions.

In this case, we are going to include both product and category information. This will permit the user of the warehouse to drill down to the appropriate level of granularity.

➨ Include the Categories table in the query. There should be a one-to-many relationship between Categories and Products.

➨ Project the ProductName and CategoryName fields into the query. These will be the values the user sees.

➨ Project the ProductID field into the query. This value will be used as a key to link the dimension table to the fact table.

➨ Finally, project the UnitPrice field into the query.

Note: The rationale for including UnitPrice in the dimension table is that users may want to limit their analysis to high (or low) valued items. Having the UnitPrice field in the data warehouse allows users to apply price-related constraints to their queries. More generally, knowing what fields to include in a data warehouse requires a good understanding of how users make decisions. When in doubt, it is probably best to err on the side of including too much.

➨ Select Query → Run to execute the query. Examine the contents of dimProducts, as shown in Figure 22.3.
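
For readers who think more easily in SQL than in the QBE grid, the product dimension extract can be sketched outside ACCESS. The following is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for ACCESS, the sample rows are made up, and only the table and field names are taken from the lesson.

```python
import sqlite3

# Assumption: SQLite stands in for ACCESS; the rows below are made-up samples.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Categories (CategoryID INTEGER PRIMARY KEY, CategoryName TEXT);
    CREATE TABLE Products (
        ProductID   INTEGER PRIMARY KEY,
        ProductName TEXT,
        CategoryID  INTEGER REFERENCES Categories(CategoryID),
        UnitPrice   REAL
    );
    INSERT INTO Categories VALUES (1, 'Beverages'), (2, 'Condiments');
    INSERT INTO Products VALUES
        (1, 'Chai', 1, 18.0),
        (2, 'Chang', 1, 19.0),
        (3, 'Aniseed Syrup', 2, 10.0);
""")

# Rough equivalent of the make-table query qryProductDimensionExtract:
# join Products to Categories and store the denormalized result as dimProducts.
con.execute("""
    CREATE TABLE dimProducts AS
    SELECT p.ProductID, p.ProductName, c.CategoryName, p.UnitPrice
    FROM Products AS p
    JOIN Categories AS c ON c.CategoryID = p.CategoryID
""")

dim_products = con.execute(
    "SELECT ProductID, ProductName, CategoryName, UnitPrice "
    "FROM dimProducts ORDER BY ProductID"
).fetchall()
for row in dim_products:
    print(row)
```

Note that the category name is repeated on every product row. That redundancy is deliberate: it lets warehouse queries group by category without a join.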

FIGURE 22.3: The extraction query for the product dimension.
(1) Create a query to extract product data from the operational database.
(2) Verify the resulting dimension table.

22.3.3.2 Taking a closer look at the products dimension table

There are a couple of things to notice about the dimProducts table:

1. Denormalization: When designing transaction processing systems, we make every effort to eliminate redundancy in our tables (recall the discussion of normalization in Section 7.1.1). In data warehousing, database design logic is turned on its head: in order to save the computational effort of making a join when running queries against the data warehouse, the dimension table includes information from multiple entities (e.g., products and categories). Since data in the warehouse is
never changed or edited, this denormalized structure does not lead to the anomalies discussed in Section 7.1.

2. User-friendly values: Since the extracted data is ultimately going to be used for creating reports, meaningful field values (such as ProductName and CategoryName) are used instead of key fields (like CategoryID). Key fields (such as ProductID) are only added when necessary for linking to a fact table.

22.3.3.3 The customer dimension

The customer dimension is similar to the product dimension in that a hierarchical relationship is implicit in the data. In this case, assume that users require data right down to the level of the individual customers, but may also want to aggregate across cities, regions, and countries.

➨ Create a new make-table query called qryCustomerDimensionExtract.

➨ Use dimCustomers as the name of the target table.

➨ Include the Customers table. In the NORTHWIND TRADERS database, the region information is included within the Customers table so there is no need to create a join to another table.

➨ Project the following fields into the query: CompanyName, City, Region, and Country.

➨ Project CustomerID to enable the table to be linked to the fact table.

➨ Execute the query and verify the contents of dimCustomers, as shown in Figure 22.4.

22.3.3.4 The time dimension

The time dimension is different from the products and customers dimensions in that time exists independently of any particular data warehouse application. As a consequence, it is possible to create a generic time dimension table consisting of a date ID plus days of the week, months, quarters, years and so on. This dimension table could be used for all data warehouse applications.

In this lesson, we are going to take a different approach for two reasons.

1. Event time vs. calendar time: Although sales per day (or hour or minute) may be a meaningful piece of information, we are also interested in sales per order. One may not normally think of OrderID as a measure of time; however, it is important to remember that each order is an event.
Since many orders can occur per day, the granularity desired in this context is finer than the granularity of a generic date-based dimension table.

2. Date manipulation functions: Since each order has an order date, it is possible to derive the coarser-grained values of time using specialized date manipulation functions.

FIGURE 22.4: The extraction query for the customer dimension.
(1) Create a query to extract customer data from the operational database.
(2) Verify the resulting dimension table.

➨ Create a new make-table query called qryTimeDimensionExtract.

➨ Use dimTime as the name of the target table.

➨ Include the Orders table and project the OrderDate field.

➨ Project the OrderID field so that a link can be made to the fact table.

22.3.3.5 Transforming the OrderDate field

The OrderDate field is defined as a Date/Time data type and contains all the information we require about day, month, year and so on. The trick is to extract this information and display it in a user-friendly format. To do this, we will use calculated fields and the built-in DatePart() function.

➨ Create a new calculated field in qryTimeDimensionExtract called Year and define it as follows:
Year: DatePart("yyyy", OrderDate)

The DatePart() function takes two arguments: a special set of characters that determines what part of the date is returned and a valid date. For example, the argument "yyyy" tells the function to return a four-digit year.

Note: Use the on-line help system and search under “datepart” to learn more about the function and its arguments.

➨ Select View → Datasheet View to preview the new table and verify the DatePart() function, as shown in Figure 22.5.

FIGURE 22.5: Use the DatePart() function to show the year of an order.
(1) Create a calculated field to extract the year of the order from the OrderDate field.
(2) Use Query → Datasheet to preview the action query.

Important: MICROSOFT updates the NORTHWIND TRADERS database from time to time. As a
consequence, the dates returned by your queries may not correspond exactly to those shown in Figure 22.5.

The procedure for transforming the month is the same except that DatePart() only returns the ordinal number of the month, not the name. Since the whole purpose of this exercise is to provide user-friendly, readable values for each dimension, a means of mapping month numbers to month names is required.

One approach is to create a lookup table of month numbers and months. A somewhat simpler approach (which we will use here) is to use the Choose() function within ACCESS.

➨ Create a new calculated field called Month and define it as follows:
Month: DatePart("m", OrderDate)

➨ Preview the results and verify that the result is a number from 1 to 12.

➨ Modify the Month field so that the DatePart() function provides the first argument for the Choose() function:
Month: Choose(DatePart("m", OrderDate), "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

Note: The Choose() function maps an index number (1, 2, …) to the corresponding value in a list of choices. For example, the index number 2 maps to the second choice, and so on. See on-line help for more information about the Choose() function.

➨ Create a new calculated field called Quarter as follows:
Quarter: "Q" & DatePart("q", OrderDate)

Note: When the first argument for the DatePart() function is "q", the function returns a value from 1 to 4 corresponding to the quarter. To make it more readable, the letter “Q” is added to the front of each value.

➨ Include a final calculated field called Holiday:
Holiday: False

Note: The Holiday expression creates a new field in the dimTime table to indicate whether a particular date is a holiday. This type of information is often useful in a retail context for interpreting spikes in demand. Of course, someone has to go through the dimTime table and change Holiday = True where appropriate.
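
The calculated fields above can be mirrored in a few lines of ordinary code, which may help clarify what each expression returns. This is a minimal sketch in Python rather than ACCESS: the function name time_dimension_row and the sample order are hypothetical, and the comments indicate which ACCESS expression each line is meant to imitate.

```python
from datetime import date

# Plays the role of the list of choices passed to Choose().
MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def time_dimension_row(order_id, order_date):
    """Build one dimTime-style row (hypothetical helper, not part of ACCESS)."""
    month_number = order_date.month              # DatePart("m", OrderDate): 1 to 12
    quarter_number = (month_number - 1) // 3 + 1  # DatePart("q", OrderDate): 1 to 4
    return {
        "OrderID": order_id,
        "OrderDate": order_date,
        "Year": order_date.year,                  # DatePart("yyyy", OrderDate)
        "Month": MONTH_NAMES[month_number - 1],   # Choose(...) is 1-based, lists are 0-based
        "Quarter": "Q" + str(quarter_number),     # "Q" & DatePart("q", OrderDate)
        "Holiday": False,                         # default; set to True by hand later
    }


# Made-up sample order used only to exercise the function.
row = time_dimension_row(1, date(1996, 7, 4))
print(row)
```

The off-by-one adjustment in the Month line is the main thing to watch: Choose() counts its choices from 1, whereas a Python list is indexed from 0.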

➨ Execute the query and verify the results, as shown in Figure 22.6.

FIGURE 22.6: The extraction query for the time dimension.
(1) Create a query to extract time data from the operational database.
(2) Verify the resulting dimension table.

22.3.4 Creating a fact table

A fact table contains one or more results of interest for each unique combination of dimensions. For example, if we were interested in the value of sales of products to customers per day, then we would compute this value for each product × customer × day and store it in a table. In the case of NORTHWIND TRADERS, the result would be 77 × 91 × 365 = 2,557,555, or roughly 2.5 million, facts per year.
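
The size estimate is simple multiplication, and it is worth seeing how quickly it grows as dimensions are added. The sketch below just reproduces the arithmetic from the text; the variable names are illustrative.

```python
# Dimension sizes quoted in the text for NORTHWIND TRADERS.
products = 77
customers = 91
days_per_year = 365

# One potential fact for every product x customer x day combination.
potential_facts = products * customers * days_per_year
print(f"{potential_facts:,} potential facts per year")
```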

Note: The number of non-zero facts is much smaller than 2.5 million since not every customer orders every product every day of the year. Despite this sparseness, however, fact tables tend to be very large.

22.3.4.1 Determining foreign keys

Through the selection of keys in the dimension tables, we have already determined the foreign keys that must be included in the fact table: ProductID, CustomerID, and OrderID.

➨ Create a new make-table query called qryFactExtract.

➨ Use factSales as the name of the target table.

➨ Add the Orders and OrderDetails tables to the query.

Note: The Orders and OrderDetails tables contain all the fields we need to create joins to the dimension tables: ProductID, CustomerID, and OrderID.

➨ Project the foreign keys into the query definition.

22.3.4.2 Calculating sales

What remains to be determined is the dollar value of sales for each combination of the dimensional values. To calculate the total sale for each product × customer × order combination, we have to know a couple of things about the data:

1. The total value of an order is the sum of extended prices of the line items in the order.

2. The OrderDetails.UnitPrice value can be discounted by the amount in OrderDetails.Discount. The extended price calculation must therefore include the discount.

➨ Create a calculated field called TotalSale as follows:
TotalSale: Quantity * (1-Discount) * UnitPrice

➨ Preview the results and examine them as shown in Figure 22.7.

22.3.5 Refresh intervals

Let’s summarize what you have done to this point: You have extracted and transformed data from one database and stored it in another database. The new database (OrderEntryWarehouse.mdb) is simply a static
copy (or “snapshot”) of the NORTHWIND TRADERS application.

Clearly, your warehouse data is out-of-date as soon as a new transaction is added to the operational system. But since we plan to use the data warehouse to see the big picture (e.g., sales trends over the last four quarters), an order here or there does not make that much difference. You can refresh the data warehouse daily, weekly, monthly, or according to whatever schedule makes sense.

FIGURE 22.7: The fact table for order-level analyses of sales.
(1) Project the necessary foreign keys.
(2) Calculate the total sale as a function of information in the Order Details table.
(3) Verify the resulting fact table.
Note: NORTHWIND uses a short textual code for CustomerID rather than an AutoNumber.
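
Because the warehouse is a snapshot, a refresh amounts to re-running the extraction. The sketch below illustrates this for the fact table of Section 22.3.4, again assuming SQLite (via Python's sqlite3 module) in place of ACCESS. The function refresh_fact_sales and all sample rows are hypothetical; the TotalSale expression is the one given in the lesson.

```python
import sqlite3

# Assumption: SQLite stands in for ACCESS; the rows below are made-up samples.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT);
    CREATE TABLE OrderDetails (
        OrderID INTEGER, ProductID INTEGER,
        UnitPrice REAL, Quantity INTEGER, Discount REAL
    );
    INSERT INTO Orders VALUES (1, 'ALFKI'), (2, 'BONAP');
    INSERT INTO OrderDetails VALUES
        (1, 11, 10.0, 5, 0.0),
        (1, 42, 20.0, 2, 0.25),
        (2, 11, 10.0, 1, 0.5);
""")


def refresh_fact_sales(connection):
    """Rebuild factSales from the operational tables (hypothetical helper)."""
    connection.execute("DROP TABLE IF EXISTS factSales")
    connection.execute("""
        CREATE TABLE factSales AS
        SELECT d.ProductID, o.CustomerID, o.OrderID,
               d.Quantity * (1 - d.Discount) * d.UnitPrice AS TotalSale
        FROM Orders AS o
        JOIN OrderDetails AS d ON d.OrderID = o.OrderID
    """)


refresh_fact_sales(con)
fact_sales = con.execute(
    "SELECT ProductID, CustomerID, OrderID, TotalSale "
    "FROM factSales ORDER BY OrderID, ProductID"
).fetchall()

# A new transaction leaves the snapshot stale until the next refresh.
con.execute("INSERT INTO Orders VALUES (3, 'ALFKI')")
con.execute("INSERT INTO OrderDetails VALUES (3, 42, 20.0, 1, 0.0)")
stale_count = con.execute("SELECT COUNT(*) FROM factSales").fetchone()[0]
refresh_fact_sales(con)
fresh_count = con.execute("SELECT COUNT(*) FROM factSales").fetchone()[0]
print(fact_sales, stale_count, fresh_count)
```

Dropping and rebuilding the whole table is the simplest possible refresh strategy and is adequate at this scale; it is used here only because it matches the make-table approach of the exercises.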

22.3.6 Creating a star schema

A star schema is a set of relationships between a fact table and several dimension tables. Since the fact table is at the center and many dimension tables are around the perimeter (recall Figure 22.1), the configuration resembles a star, hence the name.

➨ Create a new select query called qryStarSchema.

➨ Add the factSales, dimCustomers, dimTime, and dimProducts tables to the query.

➨ Drag the primary keys onto the corresponding foreign keys in the fact table to create query-level relationships, as shown in Figure 22.8.

Note: Since the data warehouse is read-only, there is no need to create relationships in the relationship window or specify referential integrity.

➨ Project the finest-grained field from each dimension table into the query (specifically: OrderID, CompanyName, ProductName).

➨ Project the TotalSale field.

Since the TotalSale field has a numeric data type, you may want to show it in the query formatted as currency.

➨ Right-click anywhere on the TotalSale field, bring up the properties sheet, and enter “Currency” in the Format property.

Note: Changing the format of a query field is merely an aesthetic enhancement; the underlying representation of the number remains the same. You may also change the data type of the TotalSale field in the factSales table; the result is the same.

➨ View the star schema query in datasheet mode, as shown in Figure 22.8.

➨ Ensure you understand the meaning of each row: the TotalSale represents the amount per product per order per customer. This is identical to the granularity of the OrderDetails table.

22.3.7 Aggregating data using the GroupBy operator

The level of granularity in Figure 22.8 is probably too fine to be useful for many decision making purposes. In this section, you are going to use the “totals” feature in QBE (which is
identical to the GroupBy operator in SQL) to aggregate the data.

FIGURE 22.8: Create a star schema to join the fact table with several dimension tables.
(1) Create a star schema query based on the fact table and the dimension tables.
(2) Link the tables in the usual way (drag the primary key for each dimension onto the corresponding foreign key of the fact table).
(3) Note the result: total sale for each order, customer, and product.

22.3.7.1 Setting up grouping and totals

➨ Switch back to the design view of qryStarSchema.

➨ Select View → Totals from the main menu. Alternatively, press the sigma (Σ) button on the toolbar.

➨ Notice that a “Total” row is added to the query definition grid and that the term “Group By” appears in the row for every field.

➨ Leave the “Group By” entry for the grouping fields (OrderID, CompanyName, and ProductName), but change it to “Sum” for the TotalSale field, as shown in Figure 22.9.

➨ Preview the results. You will note no change from the previous result since grouping on unique values of OrderID, CompanyName, and ProductName results in individual order details.

22.3.7.2 Different levels of aggregation

There are two ways to change the level of aggregation in a star schema: change the level of granularity for a dimension or drop the dimension from the results set altogether.

➨ Before switching back to design view, make a mental note of the number of records in the results set (2155 records are shown in Figure 22.9; however, the number you see may vary depending on the version of the NORTHWIND TRADERS database you are using).

➨ Switch to design view, click on the grey bar above the OrderID field in the query definition grid, and press Delete.

➨ Preview the results and make a mental note of the number of records in the results set.

In this modified query, you are grouping on CustomerID and ProductID and summing the extended price. What this means is that the value of TotalSale reflects the sum of sales for each unique combination of product and customer regardless of when (i.e., in which order) the products were ordered.

➨ Return to design mode and delete the ProductID field from the query definition grid.

➨ Preview the results and make a mental note of the number of records in the results set.

In this case, you are grouping on CustomerID only. The TotalSale field represents the total value of all products in all orders to the customer in question.

➨ Return to query design mode and delete the CustomerID field.

➨ Preview the results.
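
The effect of dropping one grouping field at a time can also be seen in a small SQL sketch. The example below assumes SQLite (through Python's sqlite3 module) rather than ACCESS and uses a made-up four-row fact table. For brevity it groups directly on the key fields of factSales instead of joining to the dimension tables, and the helper total_sales_by is hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for ACCESS; the rows below are made-up samples.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE factSales (
        ProductID INTEGER, CustomerID TEXT, OrderID INTEGER, TotalSale REAL
    );
    INSERT INTO factSales VALUES
        (11, 'ALFKI', 1, 50.0),
        (42, 'ALFKI', 1, 30.0),
        (11, 'ALFKI', 2, 20.0),
        (11, 'BONAP', 3, 5.0);
""")


def total_sales_by(*fields):
    """Sum TotalSale for each unique combination of the given fields.

    The field names are fixed literals supplied by this script, so building
    the SQL text with an f-string is acceptable here.
    """
    if not fields:
        return con.execute("SELECT SUM(TotalSale) FROM factSales").fetchall()
    cols = ", ".join(fields)
    return con.execute(
        f"SELECT {cols}, SUM(TotalSale) FROM factSales "
        f"GROUP BY {cols} ORDER BY {cols}"
    ).fetchall()


by_order_customer_product = total_sales_by("OrderID", "CustomerID", "ProductID")
by_customer_product = total_sales_by("CustomerID", "ProductID")
by_customer = total_sales_by("CustomerID")
grand_total = total_sales_by()

# The record count shrinks as grouping fields are removed.
for result in (by_order_customer_product, by_customer_product,
               by_customer, grand_total):
    print(len(result), result)
```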

FIGURE 22.9: Use the GroupBy operator to aggregate the TotalSale measure across dimension values.
(1) Select View → Totals or click the sigma on the tool bar to show the “total” row.
(2) Group on unique combinations of OrderID, CompanyName, and ProductName.
(3) Calculate the sum of TotalSale for each unique group.
Note: No aggregation occurs in this example since each order detail has a unique combination of OrderID and ProductName.

Now, the total sales for all customers, products, and orders is collapsed into a single value. To
get subtotals for particular values of one or more fields, reverse the process by adding them to the query and using the GroupBy operator.

➨ Project Year, City, and CategoryName into the query definition grid.

➨ Verify the results as shown in Figure 22.10.

FIGURE 22.10: Total sales by year, city, and category.
(1) Project coarser-grained dimension fields into the query definition.
(2) Use the GroupBy operator to aggregate the results for each unique combination of year, city, and category.
(3) All values of TotalSale for a particular year, city, and category are summed into a single amount.

22.3.8 Using aggregation and a star schema to answer a business question

The top part of the query in Figure 22.10 is a dimensional data model. To test the hypothesis that it is easier for decision makers to create their own queries using dimensional data
models, we can start by considering a business question:

• What were the total sales of each product in each city in Canada for 1994? Break the results down by quarter and sort the results in descending order of importance.

➨ Create a new query called qryQuestion.

➨ Repeat the steps in Section 22.3.6 to create a star schema query.

➨ Project City, Quarter, ProductName, and TotalSale into the query.

➨ Ensure the “Totals” feature is on and that you are grouping by unique combinations of City, Quarter, and ProductName.

➨ Sum the TotalSale field.

➨ Set the query to sort on TotalSale in descending order.

➨ To constrain the results to Canada in 1994, project the Country and Year fields into the query. For both fields, ensure the “Show” box is unchecked and that the “Group By” entry is replaced by “Where”.

➨ View the results as shown in Figure 22.11.

Note: Based on the results of the query, you may re-evaluate the effectiveness of your sales programs in Canadian cities other than Montreal.

Hopefully you agree that a reasonably query-literate individual could construct this type of query and interpret its results. In Lesson 23 you will perform more sophisticated queries and analysis using your new data warehouse.

22.4 Discussion

22.4.1 Rationale for data warehousing

Data warehousing is based on three basic observations:

1. Normalized data models are difficult for business users to understand and navigate. There are better methods of storing and representing data for decision support applications.

2. The computational load placed on an operational system by a decision support application may be considerable. In other words, it is conceivable that a middle manager tucked away in a cubicle somewhere could bring an organization’s primary operational system to its knees with a well-intentioned but poorly designed query. It is better to isolate mission-critical
transaction processing systems from the end-user computing revolution.

3. Many aspects of an organization’s operations have an implicit time element that is ignored by the transaction processing system. An example of this in the order entry system is the inventory level (QtyOnHand) of each product.

Up until the early 1990s, vendors of databases and transaction processing applications were insisting that their transaction processing systems could do both (transaction processing and decision support). This is slowly changing as users and vendors adopt a more pragmatic stance.

FIGURE 22.11: Answering a business question using a dimensional data model.
(1) Select the appropriate dimension fields to answer the question.
(2) Constrain the results by using the Where operator.
(3) Uncheck the Show box to ensure that the constraint fields are not shown in the results set.
(4) Verify the results of the query.
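
For comparison with the QBE version, the business question from Section 22.3.8 can be written as a single SQL statement over the star schema. The sketch below assumes SQLite (through Python's sqlite3 module) instead of ACCESS; the customer codes, cities, products, and amounts are all made up, and the dimension tables are trimmed to the fields the question needs.

```python
import sqlite3

# Assumption: SQLite stands in for ACCESS; every row below is made-up sample data.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dimCustomers (CustomerID TEXT PRIMARY KEY, City TEXT, Country TEXT);
    CREATE TABLE dimProducts  (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE dimTime      (OrderID INTEGER PRIMARY KEY, Year INTEGER, Quarter TEXT);
    CREATE TABLE factSales    (ProductID INTEGER, CustomerID TEXT,
                               OrderID INTEGER, TotalSale REAL);

    INSERT INTO dimCustomers VALUES
        ('MONT', 'Montreal', 'Canada'),
        ('VANC', 'Vancouver', 'Canada'),
        ('BERL', 'Berlin', 'Germany');
    INSERT INTO dimProducts VALUES (11, 'Chai'), (42, 'Tofu');
    INSERT INTO dimTime VALUES
        (1, 1994, 'Q1'), (2, 1994, 'Q1'), (3, 1994, 'Q2'),
        (4, 1995, 'Q1'), (5, 1994, 'Q1');
    INSERT INTO factSales VALUES
        (11, 'MONT', 1, 100.0),
        (11, 'MONT', 2, 50.0),
        (42, 'MONT', 2, 20.0),
        (42, 'VANC', 3, 30.0),
        (11, 'MONT', 4, 999.0),   -- 1995 order: excluded by the Year constraint
        (11, 'BERL', 5, 500.0);   -- German customer: excluded by the Country constraint
""")

# Group By on City, Quarter, ProductName; Sum on TotalSale;
# "Where" rows for Country and Year; sort on the summed TotalSale.
answer = con.execute("""
    SELECT c.City, t.Quarter, p.ProductName, SUM(f.TotalSale) AS SumOfTotalSale
    FROM factSales AS f
    JOIN dimCustomers AS c ON c.CustomerID = f.CustomerID
    JOIN dimTime      AS t ON t.OrderID    = f.OrderID
    JOIN dimProducts  AS p ON p.ProductID  = f.ProductID
    WHERE c.Country = 'Canada' AND t.Year = 1994
    GROUP BY c.City, t.Quarter, p.ProductName
    ORDER BY SUM(f.TotalSale) DESC
""").fetchall()
for row in answer:
    print(row)
```

Every join follows the same pattern (dimension key equals fact foreign key), which is a large part of why star schema queries are approachable for non-specialists.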

22.4.2 The first law of data warehousing

There is one simple design rule that dominates the design and implementation of data warehouses: disk space is cheap; time is expensive.

Time, as it is used here, does not mean computational processing time. It means the time of the decision maker who is waiting for a query to return a result. One of the implications is that hardware vendors sell a lot of expensive gear. Very large arrays of hard drives and parallel processing machines are all the rage in data warehousing.

22.4.3 Multiple fact tables

Assume that your firm processes several thousand orders per day and that many of the decision makers in the organization are concerned with monthly measures of performance broken down by region. Although it is certainly possible to get this information by projecting a coarse-grained measure of time (e.g., month) into the query and using the GroupBy operator to calculate monthly sales totals, this approach requires a considerable amount of processing.

Given the first law of data warehousing, a better approach involves a straight trade-off between disk space and query performance: create a second fact table with pre-computed monthly totals. In practice, it is not uncommon to see multiple fact tables containing the same “fact” but with different levels of aggregation precomputed. This is one reason that firms often have multiple terabyte data warehouses.

22.5 Application to the assignment

➨ Ensure you have implemented all the extraction queries discussed in this lesson.

➨ Set the primary key for each dimension table and the fact table.

Note: ACCESS automatically creates indexes for primary keys so you do not need to worry about indexing your tables manually. Although indexes increase the size of your database, the retrieval of records in an indexed table can be orders of magnitude faster than in an un-indexed table.
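
The pre-computed fact table described in Section 22.4.3 can be sketched as one more make-table query. As before, this assumes SQLite (through Python's sqlite3 module) in place of ACCESS and uses made-up rows; the table name factMonthlySales is hypothetical and does not appear in the lesson.

```python
import sqlite3

# Assumption: SQLite stands in for ACCESS; every row below is made-up sample data.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dimCustomers (CustomerID TEXT PRIMARY KEY, Region TEXT);
    CREATE TABLE dimTime (OrderID INTEGER PRIMARY KEY, Year INTEGER, Month TEXT);
    CREATE TABLE factSales (ProductID INTEGER, CustomerID TEXT,
                            OrderID INTEGER, TotalSale REAL);

    INSERT INTO dimCustomers VALUES ('A', 'West'), ('B', 'East');
    INSERT INTO dimTime VALUES (1, 1996, 'Jul'), (2, 1996, 'Jul'), (3, 1996, 'Aug');
    INSERT INTO factSales VALUES
        (11, 'A', 1, 50.0),
        (42, 'A', 2, 30.0),
        (11, 'B', 3, 5.0);
""")

# Pay the aggregation cost once, at extraction time, and store the result.
# factMonthlySales is a hypothetical name for the second, coarser fact table.
con.execute("""
    CREATE TABLE factMonthlySales AS
    SELECT t.Year, t.Month, c.Region, SUM(f.TotalSale) AS TotalSale
    FROM factSales AS f
    JOIN dimTime      AS t ON t.OrderID    = f.OrderID
    JOIN dimCustomers AS c ON c.CustomerID = f.CustomerID
    GROUP BY t.Year, t.Month, c.Region
""")

# Decision makers now read the small table directly; no GroupBy is needed.
monthly = con.execute(
    "SELECT Year, Month, Region, TotalSale FROM factMonthlySales "
    "ORDER BY Year, Month, Region"
).fetchall()
for row in monthly:
    print(row)
```

The trade-off is the one named by the first law: the monthly table duplicates information already held in factSales, but queries against it avoid re-aggregating thousands of order-level rows each time.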
