Assignment3 ETL PDF
DELIVERABLES
Submit soft copies of the following through iTaleem:
1. Updated dimensional modeling Excel workbook
(matricno-dimensional-modeling-ETLRule.xlsx) with the ETL rule.
2. A ZIP file of all files in your SSIS Integration Project solution and its associated folder. Name
the ZIP file Matricno-Integration-Project. To find the folder, check the location shown when you
open the project solution in Microsoft SQL Server Data Tools, for example C:\Users\Lili\source\repos.
3. A PDF file of Microsoft SQL Server screenshots showing the data loaded into the dimension tables
and the fact table of your first Data Warehouse (Data Mart). For each table, a screenshot with a
visible sample of rows is enough to show that the load succeeded. Name the PDF file
MatricnoDW-Data.pdf.
OBJECTIVES
This assignment gives you a clearer picture of how ETL development is done with an actual ETL
tool. The tool you will use is Microsoft SQL Server Integration Services (SSIS). The objective is to
help you see how the concepts you have learned in class apply in a real-world scenario using
actual ETL tooling rather than hand-written SQL.
ASSIGNMENT REQUIREMENTS
For this assignment, you will need the following:
1. A connection to Microsoft SQL Server and Microsoft SQL Server Management Studio.
2. Access to Microsoft SQL Server Data Tools (used to author SSIS packages).
3. An active connection to Microsoft SQL Server before you run the SSIS packages.
4. The Employees or Pets data warehouse schema from Assignment 2, running on Microsoft SQL
Server.
1. Add a column named ETL Rule to the dimensional modeling Excel workbook
(matricno-dimensional-modeling.xlsx) submitted in Assignment 2 and identify the ETL rule for each
attribute in each dimension and fact table. Save the updated workbook as
matricno-dimensional-modeling-ETLRule.xlsx. (4 points)
IN-CLASS EXAMPLE:
2. Add a package (*.DTSX) for every dimension table to load source data into the data warehouse
dimension tables. An example of the control flow and data flow is shown below. Your ETL rules
may call for other transformations, so add them as needed. You may delete from or truncate
each table at the start of its package so that you can run the package repeatedly without
clearing the data manually in the SQL Server data warehouse.
(8 points)
EXAMPLE:
i. Execute SQL Task → SQL - Truncate DimProducts Table.
ii. Data Flow Task → DF - Extract From Source to DimProducts. Press CTRL + S
to save your work.
iii. Click Start to run the package and load the data.
c. General → SQLStatement → click the ellipsis button (…) and, in the Enter SQL Query box, type
Truncate Table DimProducts
d. Click OK
c. OLE DB Destination – DimProducts
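The truncate-then-load pattern in the example above can also be sketched outside SSIS. The following is only an illustration of why emptying the table first makes the package safe to rerun; it is not a substitute for the SSIS package you must submit. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, and the DimProducts columns and sample rows are hypothetical. SQLite has no TRUNCATE TABLE statement, so an unfiltered DELETE plays that role here.

```python
import sqlite3


def reload_dim_products(conn, source_rows):
    """Empty DimProducts, reload it from source_rows, and return the row count."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Stand-in for the Execute SQL Task (Truncate Table DimProducts).
        conn.execute("DELETE FROM DimProducts")
        # Stand-in for the Data Flow Task (source to OLE DB Destination).
        conn.executemany(
            "INSERT INTO DimProducts (ProductID, ProductName) VALUES (?, ?)",
            source_rows,
        )
    return conn.execute("SELECT COUNT(*) FROM DimProducts").fetchone()[0]


# Hypothetical dimension table: ProductKey is the surrogate key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DimProducts ("
    "ProductKey INTEGER PRIMARY KEY, "
    "ProductID INTEGER NOT NULL, "
    "ProductName TEXT NOT NULL)"
)

# Hypothetical source rows: (ProductID, ProductName).
source_rows = [(1, "Chai"), (2, "Chang"), (3, "Aniseed Syrup")]

first_run = reload_dim_products(conn, source_rows)
second_run = reload_dim_products(conn, source_rows)  # rerun does not duplicate rows
```

Because the table is emptied at the start of every run, the second run leaves the same number of rows as the first instead of doubling them.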
3. Add a package (*.DTSX) to load data into the data warehouse fact table following your ETL
rules. You may also need to truncate the fact table. Your ETL rules will likely require more
transformations for the fact table than for the dimension tables.
(4 points)
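One transformation that fact loads often need, and that dimension loads usually do not, is replacing each source business key with the surrogate key of the matching dimension row (in SSIS this is typically done with a Lookup transformation). The sketch below illustrates that idea only and assumes your ETL rules call for it. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table names, columns, and sample rows (FactSales, DimProducts, ProductKey, ProductID) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimProducts (
        ProductKey INTEGER PRIMARY KEY,  -- surrogate key
        ProductID INTEGER,               -- business key from the source
        ProductName TEXT
    );
    CREATE TABLE FactSales (ProductKey INTEGER, Quantity INTEGER, Amount REAL);
    INSERT INTO DimProducts (ProductKey, ProductID, ProductName)
    VALUES (10, 1, 'Chai'), (11, 2, 'Chang');
    """
)

# Hypothetical source rows: (ProductID, Quantity, Amount). ProductID 99 has no dimension row.
source_sales = [(1, 5, 90.0), (2, 3, 57.0), (99, 1, 10.0)]


def load_fact_sales(conn, rows):
    """Swap business keys for surrogate keys, reload FactSales, report unmatched rows."""
    # Build the lookup once: business key -> surrogate key.
    key_by_id = dict(conn.execute("SELECT ProductID, ProductKey FROM DimProducts"))
    matched, unmatched = [], []
    for product_id, quantity, amount in rows:
        product_key = key_by_id.get(product_id)
        if product_key is None:
            unmatched.append((product_id, quantity, amount))
        else:
            matched.append((product_key, quantity, amount))
    with conn:  # one transaction for the empty-and-reload step
        conn.execute("DELETE FROM FactSales")  # SQLite stand-in for TRUNCATE TABLE
        conn.executemany(
            "INSERT INTO FactSales (ProductKey, Quantity, Amount) VALUES (?, ?, ?)",
            matched,
        )
    return len(matched), unmatched


loaded_count, rejected_rows = load_fact_sales(conn, source_sales)
```

Returning the unmatched rows separately, rather than silently dropping them, makes it easier to see whether a dimension package needs to run before the fact package.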
4. Run a query against each dimension table and the fact table in Microsoft SQL Server to check
that every table has been populated with data. Capture screenshots of the data loaded into the
dimension tables and the fact table of your first Data Warehouse (Data Mart). For each table, a
screenshot with a visible sample of rows is enough to show that the load succeeded. Save the
screenshots as MatricnoDW-Data.pdf. (4 points)
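The check described above amounts to a row count and a small sample per table. The sketch below shows that logic under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table names and rows are hypothetical. In SQL Server Management Studio you would run the equivalent queries directly, using SELECT TOP (n) where SQLite uses LIMIT.

```python
import sqlite3


def sample_tables(conn, table_names, limit=5):
    """Return {table: (row_count, sample_rows)} for each named table."""
    report = {}
    for name in table_names:
        # Table names come from our own fixed list, so quoting them inline is acceptable here.
        count = conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        rows = conn.execute(f'SELECT * FROM "{name}" LIMIT ?', (limit,)).fetchall()
        report[name] = (count, rows)
    return report


# Hypothetical warehouse: one loaded dimension table and a fact table that was never loaded.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE DimProducts (ProductKey INTEGER PRIMARY KEY, ProductName TEXT);
    CREATE TABLE FactSales (ProductKey INTEGER, Quantity INTEGER);
    INSERT INTO DimProducts (ProductKey, ProductName) VALUES (1, 'Chai'), (2, 'Chang');
    """
)

report = sample_tables(conn, ["DimProducts", "FactSales"])
empty_tables = [name for name, (count, _) in report.items() if count == 0]
```

Any table that appears in empty_tables has no rows, which would mean its package needs to be checked and rerun before you take screenshots.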