
Tut w3s

The document discusses data warehousing concepts including star schemas, ETL processes, OLAP queries, and data cubes. It provides examples of creating a star schema for a gym to analyze revenue, describes tasks involved in ETL like extracting and aggregating data, and discusses formulating queries using aggregate functions on cuboids within a data cube. It also covers modeling a data warehouse using different schema types, performing OLAP operations like roll-up and slice, and writing an equivalent SQL query.
Copyright
© Attribution Non-Commercial (BY-NC)

COMP9318 Tutorial Week-3 - Data Warehouse and OLAP

Q1.

(1) Create a star schema diagram that will enable FIT-WORLD GYM INC. to analyze their revenue. The fact table will include, for every instance of revenue taken, attribute(s) useful for analyzing revenue; the star schema will include all dimensions that can be useful for analyzing revenue; and the only two data sources are shown on the next page.

(2) Appreciate the ETL process involved in populating the data warehouse.

(3) Appreciate the difference in formulating queries: find the percentage of revenue generated by members in the last year.

(4) How many cuboids are there in the complete data cube?

S1.

(1) As presented in the figures. Note that this is not the only possible answer.

(2) Several tasks are involved in importing the data into the data warehouse. For example, we need to extract zipcode information from CorpCustNameLoc; we need to perform the aggregation [price × Quantity] for tuples in the merchandise table; and we might also need to deal with (near-)duplicate object detection (e.g., the same member appearing in both data sources).

(3) "Find the percentage of revenue generated by members in the last year" can be easily answered on the star schema by two aggregate queries on the fact table. Specifically, if the complete data cube has been built, the queries can be answered efficiently from the cuboid [Year] and the cuboid [Year, Category].

(4) Since CustName is not likely to be a good level for analysis (rather, it is a descriptive attribute), there are 4 levels on the Calendar dimension, 3 on Item, and 3 on Customer. Therefore, there are [count missing in the source] cuboids in total. Note that we could have different hierarchies on a dimension, e.g., we could consider one of several alternative hierarchies on the Customer dimension. They have different semantics, but do not affect the number of cuboids.
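The two aggregate queries mentioned in part (3) can be sketched as follows. This is only an illustration under stated assumptions: the table names (calendar, customer, revenue), the column names, the idea that Category is a customer attribute with the value 'member', and all sample rows are hypothetical, because the tutorial's schema figure is not reproduced here.

```python
import sqlite3

# Hypothetical, heavily simplified star schema (all names are assumptions):
# one fact table `revenue` and two dimension tables `calendar` and `customer`.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (cal_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE customer (cust_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE revenue  (cal_key INTEGER, cust_key INTEGER, amount REAL);
INSERT INTO calendar VALUES (1, 2023), (2, 2024);
INSERT INTO customer VALUES (1, 'member'), (2, 'non-member');
INSERT INTO revenue  VALUES (1, 1, 100.0), (2, 1, 300.0), (2, 1, 100.0), (2, 2, 100.0);
""")

LAST_YEAR = 2024  # made-up value standing in for "the last year"

# Query 1: total revenue in the year (the value a cuboid [Year] would hold).
total = conn.execute(
    """SELECT SUM(r.amount)
       FROM revenue r JOIN calendar c ON r.cal_key = c.cal_key
       WHERE c.year = ?""",
    (LAST_YEAR,),
).fetchone()[0]

# Query 2: revenue from members in the year (a cell of a cuboid [Year, Category]).
member = conn.execute(
    """SELECT SUM(r.amount)
       FROM revenue r
       JOIN calendar c  ON r.cal_key  = c.cal_key
       JOIN customer cu ON r.cust_key = cu.cust_key
       WHERE c.year = ? AND cu.category = 'member'""",
    (LAST_YEAR,),
).fetchone()[0]

member_pct = 100.0 * member / total
print(f"Members generated {member_pct:.1f}% of {LAST_YEAR} revenue")
```

With a materialized data cube, each of the two sums would be a single lookup instead of a scan and join over the fact table.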

Q2. Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit.

(1) Enumerate three classes of schemas that are popularly used for modeling data warehouses.

(2) Draw a schema diagram for the above data warehouse using one of the schema classes listed in (1).

(3) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2004?

(4) To obtain the same list, write an SQL query assuming the data is stored in a relational database with the schema fee (day, month, year, doctor, hospital, patient, count, charge).

S2.

(1) Three classes of schemas popularly used for modeling data warehouses are the star schema, the snowflake schema, and the fact constellation schema.

(2) A star schema is shown in the figure.
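As a rough textual stand-in for a star schema over these three dimensions, the sketch below declares one fact table surrounded by three dimension tables. It is not the tutorial's figure: the table names, the surrogate keys (*_key), and the column types are assumptions; only the attribute names day, month, year, doctor, hospital, patient, count, and charge come from the fee relation in Q2 (4).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (table names and *_key surrogate keys are assumptions).
CREATE TABLE time_dim    (time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE doctor_dim  (doctor_key INTEGER PRIMARY KEY, doctor TEXT, hospital TEXT);
CREATE TABLE patient_dim (patient_key INTEGER PRIMARY KEY, patient TEXT);

-- Fact table: one foreign key per dimension plus the two measures.
CREATE TABLE visit_fact (
    time_key    INTEGER REFERENCES time_dim(time_key),
    doctor_key  INTEGER REFERENCES doctor_dim(doctor_key),
    patient_key INTEGER REFERENCES patient_dim(patient_key),
    "count"     INTEGER,
    charge      REAL
);
""")

# List the tables that make up the star: the fact table and its dimensions.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

In a snowflake variant, doctor_dim would be normalized further, for example by moving hospital into its own table.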

(3) The operations to be performed are: Roll-up on time from day to year; Slice for time = 2004; Roll-up on patient from individual patient to all.
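The three operations can be traced on a tiny in-memory cube. This is a minimal sketch, assuming a cube stored as a Python dict from cell coordinates to charge; the helper roll_up and all sample cells are hypothetical and only meant to show what each step does to the cuboid.

```python
from collections import defaultdict

# Base cuboid [day, doctor, patient] -> charge (made-up sample cells).
base = {
    ("2004-01-05", "Dr. A", "p1"): 50.0,
    ("2004-03-10", "Dr. A", "p2"): 70.0,
    ("2004-03-10", "Dr. B", "p1"): 40.0,
    ("2005-02-01", "Dr. A", "p1"): 90.0,
}

def roll_up(cube, key_fn):
    """Sum charges after mapping every cell key to a coarser key."""
    out = defaultdict(float)
    for key, charge in cube.items():
        out[key_fn(key)] += charge
    return dict(out)

# Step 1: roll-up on time from day to year -> cuboid [year, doctor, patient].
by_year = roll_up(base, lambda k: (int(k[0][:4]), k[1], k[2]))

# Step 2: slice for time = 2004 -> cuboid [doctor, patient] for that year.
sliced = {(d, p): c for (y, d, p), c in by_year.items() if y == 2004}

# Step 3: roll-up on patient from individual patient to all -> cuboid [doctor].
per_doctor = roll_up(sliced, lambda k: k[0])
print(per_doctor)
```

The final dict holds one total per doctor for 2004, which is the list the question asks for.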

(4) select doctor, SUM(charge)
    from fee
    where year = 2004
    group by doctor
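The query in (4) can be tried out against a small in-memory table. This is a sketch only: the column types and every sample row are made up, and an ORDER BY clause is added purely so the printed result has a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Flat relation from Q2 (4); column types are assumptions.
conn.execute("""
CREATE TABLE fee (
    day INTEGER, month INTEGER, year INTEGER,
    doctor TEXT, hospital TEXT, patient TEXT,
    "count" INTEGER, charge REAL
)""")

# Made-up sample visits: three in 2004 and one in 2005.
rows = [
    (5, 1, 2004, "Dr. A", "H1", "p1", 1, 50.0),
    (10, 3, 2004, "Dr. A", "H1", "p2", 1, 70.0),
    (10, 3, 2004, "Dr. B", "H2", "p1", 1, 40.0),
    (1, 2, 2005, "Dr. A", "H1", "p1", 1, 90.0),
]
conn.executemany("INSERT INTO fee VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Total fee collected by each doctor in 2004.
result = conn.execute("""
    SELECT doctor, SUM(charge)
    FROM fee
    WHERE year = 2004
    GROUP BY doctor
    ORDER BY doctor
""").fetchall()
print(result)
```

The WHERE clause plays the role of the slice on year, and GROUP BY doctor aggregates away both the finer time levels and the patient dimension.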
