Data Warehouse Creation 1

This document outlines the steps for creating a data warehouse using a sales database, focusing on the implementation of a star schema with fact and dimension tables. It details the use of SSIS and SSMS for creating tables, adding surrogate keys, and generating the fact table through lookup transformations. The process includes specific instructions for handling employee data, creating dimension tables, and ensuring proper data flow and structure within the warehouse.

Uploaded by

arij nasri
1BA

Data Warehouse Creation

For this lab we will use the sales database (the sales.xlsx file is available on Moodle).

Step 1: Create the tables using SSIS and SSMS

Step 2: Data modeling

This data warehouse follows a star schema: one fact table surrounded by several
dimension tables, as presented below:

Step 3: Creation of the dimension tables.

Open SSIS and create a new project called “sales_Dwh”, then open SSMS and create a new database
called “sales_DWH”.

1 2023/2024

The tables “customers”, “products” and “order_line” do not need any transformation, so
we create their dimensions directly. We only have to add a surrogate key to each
dimension:

Add an OLE DB Source, double-click it, and add a new connection manager to the source database.

Add a new OLE DB Destination and create a new connection manager to “sales_Dwh”. Create the
table, adding the surrogate key to its definition.
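The idea behind the surrogate key can be sketched in a few lines of Python: it is a new sequential integer column, independent of the identifier that comes from the source. This is only an illustration. The column names `customer_id` and `name` and the sample rows are hypothetical, and in the package itself the key is produced by the destination table (commonly an identity column, which is an assumption here since the table definition is not shown).

```python
from itertools import count


def add_surrogate_key(rows, key_name, start=1):
    """Return copies of rows with a sequential integer surrogate key added."""
    counter = count(start)
    return [{key_name: next(counter), **row} for row in rows]


# Hypothetical source rows; the real columns come from sales.xlsx.
customers = [
    {"customer_id": "C-017", "name": "Alice"},
    {"customer_id": "C-042", "name": "Bob"},
]

dim_customers = add_surrogate_key(customers, "customer_key")
for row in dim_customers:
    print(row)
```

The source identifier is kept next to the new key, so the fact table can later be matched on the identifier and store only the key.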


Apply the same steps for tables “products” and “order_line”:

After adding the surrogate key, the data in the tables should look as follows:


In SSMS, define customer_key, order_line_key and product_key as primary keys (and do the same
for the rest of the tables):


For the Employees dimension, we have to divide the data into “sellers” and “employees” (all
remaining employees) using the Conditional Split component, which routes rows according to a condition.

Add a Conditional Split component and two OLE DB Destinations.
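What the Conditional Split does to the rows can be sketched as follows. This is a simplified illustration, not the SSIS component itself: the `job_title` column and the value `"Seller"` are assumed names, since the actual split condition depends on the columns in the employees table.

```python
def conditional_split(rows, predicate):
    """Send each row to exactly one of two outputs, like a two-way Conditional Split."""
    matched, default = [], []
    for row in rows:
        (matched if predicate(row) else default).append(row)
    return matched, default


# Hypothetical employee rows; "job_title" is an assumed column name.
employees_src = [
    {"employee_id": 1, "job_title": "Seller"},
    {"employee_id": 2, "job_title": "Accountant"},
    {"employee_id": 3, "job_title": "Seller"},
]

sellers, employees = conditional_split(
    employees_src, lambda r: r["job_title"] == "Seller"
)
print(len(sellers), len(employees))
```

Every input row ends up in one and only one of the two lists, which is why two destinations are needed.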


Now we will create “dim_order” and “dim_time” using the table “order”:


In this part of the project, we will use the Multicast component, which distributes its input to one
or more outputs. The difference between the Conditional Split component and the Multicast is that
the latter directs every row to every output, whereas the former directs each row to a single output.
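The behaviour of the Multicast can be sketched in Python as below. The sketch only illustrates the row routing; the column names `order_id` and `order_date` and the sample values are assumptions.

```python
import copy


def multicast(rows, n_outputs):
    """Return n_outputs independent copies of the full input, like a Multicast."""
    return [copy.deepcopy(rows) for _ in range(n_outputs)]


# Hypothetical rows from the "order" table.
orders = [
    {"order_id": 10, "order_date": "2023-11-05"},
    {"order_id": 11, "order_date": "2023-12-24"},
]

# One copy of the flow per dimension to build.
to_dim_order, to_dim_date = multicast(orders, 2)
print(len(to_dim_order), len(to_dim_date))
```

Both outputs receive all the rows, so each branch can be transformed on its own without affecting the other.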

To create the “dim_date” dimension, we add a Derived Column component that
splits each date into day, month, and year, as follows:


Double-click the Derived Column component to add the necessary functions (i.e., DAY, MONTH, YEAR) that split
the date.
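The result of the three derived columns can be sketched in Python. In the Derived Column editor the expressions would take a form such as `DAY(order_date)`, where `order_date` is an assumed column name; the sketch below mirrors that with the standard `datetime.date` attributes and hypothetical sample dates.

```python
from datetime import date


def derive_date_parts(d):
    """Mimic three derived columns: the day, month and year of a date."""
    return {"full_date": d, "day": d.day, "month": d.month, "year": d.year}


# Hypothetical order dates.
order_dates = [date(2023, 11, 5), date(2024, 1, 17)]

dim_date_rows = [derive_date_parts(d) for d in order_dates]
for row in dim_date_rows:
    print(row)
```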

Double-click the output “dim_order” to add the connection manager, and do the same for
“dim_date”.


Creation of “dim_time”:

The resulting tables in SSMS are:


Important: do not forget to truncate all of the dimension tables:


Step 4: The generation of the fact table

To generate the fact table, we use “Order_line_dim” as the source and add lookups. A
Lookup transformation joins the data in its input columns with columns in a
reference dataset.

We need one lookup for each dimension:

Double-click the first lookup and open Connection: specify the connection manager and the dimension (in
this case, the product dimension). Then go to Columns, choose product_key as the lookup column,
and match the product column of the table “Dim_order_line” to the attribute “product_id” in the lookup.
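The matching performed by one lookup can be sketched in Python. This is a simplified model, not the SSIS component: the sample rows and the `quantity` column are hypothetical, and the sketch sets unmatched rows aside, whereas the real Lookup treats them according to its no-match setting.

```python
def lookup(rows, reference, join_col, key_col):
    """Add key_col from the reference row whose join_col matches; set unmatched rows aside."""
    index = {ref[join_col]: ref[key_col] for ref in reference}
    matched, unmatched = [], []
    for row in rows:
        if row[join_col] in index:
            matched.append({**row, key_col: index[row[join_col]]})
        else:
            unmatched.append(row)
    return matched, unmatched


# Hypothetical reference dataset (the product dimension).
dim_product = [
    {"product_key": 1, "product_id": "P-100"},
    {"product_key": 2, "product_id": "P-200"},
]
# Hypothetical input rows from the order line dimension.
order_lines = [
    {"order_line_key": 1, "product_id": "P-200", "quantity": 3},
    {"order_line_key": 2, "product_id": "P-999", "quantity": 1},
]

matched, unmatched = lookup(order_lines, dim_product, "product_id", "product_key")
print(matched)
print(unmatched)
```

Each matched row leaves the lookup with the surrogate key `product_key` attached, which is the value the fact table stores.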


Now link the first lookup with the second one as follows:


Double-click the order lookup:


Double-click “lookup_seller” and follow these steps:


Dr. Rihab BOUSLAMA 1BA

Complete the remaining lookups as follows:


Now add an OLE DB destination and create the new fact table as follows:


Now the fact table is created!
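As a way to check the logic outside SSIS, the chain of lookups can be sketched as a series of joins, here with Python's built-in `sqlite3` module. Everything in the sketch is an assumption made for illustration: the table name `fact_sales`, the columns, and the sample rows. Only two dimensions are shown; the package uses one lookup per dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimensions; INTEGER PRIMARY KEY acts as the surrogate key.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_id TEXT);
CREATE TABLE dim_order   (order_key   INTEGER PRIMARY KEY, order_id   TEXT);
CREATE TABLE dim_order_line (
    order_line_key INTEGER PRIMARY KEY,
    order_id TEXT, product_id TEXT, quantity INTEGER
);
INSERT INTO dim_product (product_id) VALUES ('P-100'), ('P-200');
INSERT INTO dim_order (order_id) VALUES ('O-1'), ('O-2');
INSERT INTO dim_order_line (order_id, product_id, quantity)
VALUES ('O-1', 'P-200', 3), ('O-2', 'P-100', 1);

-- Each JOIN plays the role of one lookup: it swaps a source id for a surrogate key.
CREATE TABLE fact_sales AS
SELECT ol.order_line_key, p.product_key, o.order_key, ol.quantity
FROM dim_order_line AS ol
JOIN dim_product AS p ON p.product_id = ol.product_id
JOIN dim_order   AS o ON o.order_id   = ol.order_id;
""")

fact_rows = conn.execute(
    "SELECT order_line_key, product_key, order_key, quantity "
    "FROM fact_sales ORDER BY order_line_key"
).fetchall()
print(fact_rows)
```

The resulting rows contain only surrogate keys and the measure, which is the shape expected of a fact table in a star schema.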
