Homework
1.
///
USE [2S23SalesDWH];  -- brackets are required because the name starts with a digit
///
- DimProd table:
///
CREATE TABLE DimProd (
    ProdID INT PRIMARY KEY,  -- assumed key column, referenced by FactSale.ProdID
    Size VARCHAR(50),
    Category VARCHAR(50)
);
///
- DimCust table:
///
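CREATE TABLE DimCust (
    CustID INT PRIMARY KEY,  -- assumed key column, referenced by FactSale.CustID
    Age INT,
    Gender VARCHAR(10),
    AnnualIncome DECIMAL(15,2),
    NumChildren INT
);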
///
- DimAdrs table:
///
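CREATE TABLE DimAdrs (
    AdrsID INT PRIMARY KEY,  -- assumed key column, referenced by FactSale.AdrsID
    CountryRegion VARCHAR(50)
);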
///
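- Basic fact table, linked to the dimension tables, with the sales amount as its measure:
///
CREATE TABLE FactSale (
    ProdID INT,                -- links to DimProd
    CustID INT,                -- links to DimCust
    AdrsID INT,                -- links to DimAdrs
    SaleAmount DECIMAL(15,2)   -- the measure
);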
///
You might need to alter the `FactSale` table later to meet the Part 2 requirements. Also adjust the column
types and sizes to match your actual data.
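As a rough illustration of both kinds of change, the statements below add a column to the existing fact table and widen a dimension column. The choice of `CountryRegion` follows the partitioned table in Part 2. The new length of 100 is an arbitrary assumption, not a value taken from the data.

```sql
-- Add the partitioning column to the existing fact table
ALTER TABLE FactSale
    ADD CountryRegion VARCHAR(50) NULL;

-- Widen a column whose source values turn out to be longer than expected
-- (100 is an illustrative size only)
ALTER TABLE DimProd
    ALTER COLUMN Category VARCHAR(100);
```

Adding the column does not by itself place the table on a partition scheme. For that, the table still has to be recreated on the scheme, or its clustered index rebuilt on it.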
2.
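- Filegroups:
///
-- Create one filegroup per country
ALTER DATABASE [2S23SalesDWH] ADD FILEGROUP USA;
ALTER DATABASE [2S23SalesDWH] ADD FILEGROUP Canada;
ALTER DATABASE [2S23SalesDWH] ADD FILEGROUP Mexico;

-- Add a data file to each filegroup; each ADD FILE ... TO FILEGROUP
-- clause needs its own ALTER DATABASE statement
ALTER DATABASE [2S23SalesDWH]
ADD FILE (
    NAME = 'USAData',
    FILENAME = 'C:\path\USAData.ndf',
    SIZE = 5MB,
    MAXSIZE = 100MB,
    FILEGROWTH = 5MB
) TO FILEGROUP USA;

ALTER DATABASE [2S23SalesDWH]
ADD FILE (
    NAME = 'CanadaData',
    FILENAME = 'C:\path\CanadaData.ndf',
    SIZE = 5MB,
    MAXSIZE = 100MB,
    FILEGROWTH = 5MB
) TO FILEGROUP Canada;

ALTER DATABASE [2S23SalesDWH]
ADD FILE (
    NAME = 'MexicoData',
    FILENAME = 'C:\path\MexicoData.ndf',
    SIZE = 5MB,
    MAXSIZE = 100MB,
    FILEGROWTH = 5MB
) TO FILEGROUP Mexico;
///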
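- The partition function. Note that you need to decide on the boundaries for partitioning:
///
-- The boundary values are an assumption; choose them to match your CountryRegion data.
-- RANGE LEFT with two boundaries yields three partitions.
CREATE PARTITION FUNCTION CountryRegionPF (VARCHAR(50))
AS RANGE LEFT FOR VALUES ('Canada', 'Mexico');
///
- The partition scheme, which maps the partitions to the filegroups in boundary order:
///
CREATE PARTITION SCHEME CountryRegionPS
AS PARTITION CountryRegionPF
TO (Canada, Mexico, USA);
///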
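One way to check where a given value would land is the built-in `$PARTITION` function, which returns the partition number that a partition function assigns to a value. The sketch below assumes `CountryRegionPF` already exists and takes a character value. The three sample country names are assumptions about the data.

```sql
-- Show the partition number assigned to each sample value.
-- The sample values are illustrative; use values that occur in your data.
SELECT v.CountryRegion,
       $PARTITION.CountryRegionPF(v.CountryRegion) AS PartitionNumber
FROM (VALUES ('USA'), ('Canada'), ('Mexico')) AS v(CountryRegion);
```

If two countries share a partition number, the boundary values need adjusting before any data is loaded.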
- Recreate the `FactSale` table on the partition scheme (drop it first if it already exists):
///
DROP TABLE IF EXISTS FactSale;  -- requires SQL Server 2016 or later
CREATE TABLE FactSale (
    ProdID INT,
    CustID INT,
    AdrsID INT,
    SaleAmount DECIMAL(15,2),
    CountryRegion VARCHAR(50)
) ON CountryRegionPS (CountryRegion);
///
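Once rows have been loaded, the row count per partition can be read from the `sys.partitions` catalog view. This is a minimal sketch that assumes `FactSale` lives in the default `dbo` schema.

```sql
-- Rows stored in each partition of FactSale.
-- index_id 0 is a heap and 1 is a clustered index, so other indexes are not counted.
SELECT p.partition_number,
       p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.FactSale')   -- schema name is an assumption
  AND p.index_id IN (0, 1)
ORDER BY p.partition_number;
```

Each partition number corresponds to one filegroup of the partition scheme, in the order the filegroups were listed when the scheme was created.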
3.
The first part of the task, creating the views, is done in SQL. The Data Flow tasks are built in SQL
Server Integration Services (SSIS), a graphical tool, so they are described as steps rather than code.
- The views:
///
FROM AdventureWork.Product;
FROM AdventureWork.Address;
FROM AdventureWork.Sales;
///
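For orientation, a complete view definition over one of these source tables might look like the sketch below. The view name `vw_Product` and the column list are hypothetical, since neither is given here. Only the source table `AdventureWork.Product` comes from the statements above.

```sql
-- Hypothetical view name and column list; adjust both to the real source table.
CREATE VIEW vw_Product AS
SELECT ProductID AS ProdID,   -- assumed source key, renamed to match the dimension
       Size,
       Category
FROM AdventureWork.Product;
GO
```

`CREATE VIEW` has to be the first statement in its batch, which is why each view definition is normally followed by the `GO` batch separator when run from SSMS or sqlcmd.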
You would create the Data Flow tasks in SSIS as follows:
1. Open SQL Server Data Tools (SSDT) and create a new Integration Services Project.
2. In the Control Flow tab, drag a Data Flow Task from the Toolbox onto the design surface.
3. Double-click the Data Flow Task to open it in the Data Flow tab.
4. Drag a Source Assistant from the Toolbox onto the design surface.
5. Double-click the Source Assistant and configure it to use the AdventureWork connection and select
the appropriate view.
6. Drag a Destination Assistant from the Toolbox onto the design surface.
7. Connect the output of the source to the destination.
8. Double-click the Destination Assistant and configure it to use the 2S23SalesDWH connection and
select the appropriate table.
To extract the customer data from an Excel file, you would use an Excel Source in SSIS, configure it to
use an Excel connection manager, and select the appropriate sheet.
If you need to do transformations on the data, you would add Transformations between the Source and
Destination. For example, you might add a Lookup Transformation to match up identifiers between the
AdventureWork database and the 2S23SalesDWH database.
Once your Data Flow tasks are set up, you can run the package to load the data into the 2S23SalesDWH
database.
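A quick way to confirm that the package loaded something into every table is to compare row counts after the run. The query below is only a sanity check using the table names from Part 1. It does not verify that the values themselves are correct.

```sql
-- Row counts for each warehouse table after the SSIS package has run
SELECT 'DimProd' AS TableName, COUNT(*) AS LoadedRows FROM DimProd
UNION ALL
SELECT 'DimCust', COUNT(*) FROM DimCust
UNION ALL
SELECT 'DimAdrs', COUNT(*) FROM DimAdrs
UNION ALL
SELECT 'FactSale', COUNT(*) FROM FactSale;
```

A count of zero for any table usually points to a misconfigured source, destination, or column mapping in the corresponding Data Flow task.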
4.
///
FROM FactSale fs
///
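A view of this kind typically joins the fact table to the dimensions that hold the attributes used for prediction. The sketch below is one possible shape, not the original definition. The selected columns and the join keys `CustID` and `ProdID` on the dimension tables are assumptions.

```sql
-- Sketch only: the column list and the join keys are assumptions.
CREATE VIEW vw_PredictiveModel AS
SELECT dc.Age,
       dc.Gender,
       dc.AnnualIncome,
       dc.NumChildren,
       dp.Category,
       fs.SaleAmount
FROM FactSale fs
JOIN DimCust dc ON dc.CustID = fs.CustID
JOIN DimProd dp ON dp.ProdID = fs.ProdID;
GO
```

Which column serves as the class attribute in Weka depends on the prediction target, so the view should expose that column along with the candidate predictors.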
Create a Data Flow task in SSIS to export the data from this view to an Excel file. As with the earlier
tasks, this is done in the SSIS designer rather than in code.
Follow the same steps as for the other Data Flow tasks, but use the vw_PredictiveModel view as your Source
and an Excel Destination with an Excel connection manager as your Destination.
The rest of the steps use Weka, a graphical tool for machine learning and data mining, so they are listed
as steps in the Weka Explorer rather than as code:
1. Start Weka and open the Explorer.
2. Click Open file and select the Excel file you created (you may need to convert it to CSV first).
3. Switch to the Classify tab.
4. Click `Choose` and select the classifier, for example `J48` under `trees` for a decision tree.
5. In the `Test options` section, choose `Cross-validation` and enter `10` for the number of folds.
6. Click `Start`.
After the DecisionTree model has been built, you can view the tree by right-clicking its entry in the
`Result list` and choosing `Visualize tree`.
Once you have also built the Bayesian model (for example `NaiveBayes` under `bayes`) with the same test
options, you can compare the DecisionTree and Bayesian models using the output in the classifier output
area. You would typically look at measures like accuracy, precision, recall, and the F-measure to compare
the performance of the two models.
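For reference, these measures follow from the confusion matrix of each model. The definitions below are the standard ones for a single class, where $TP$, $FP$, $TN$, and $FN$ are the counts of true positives, false positives, true negatives, and false negatives. The symbols are introduced here for illustration only.

```latex
\begin{align*}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F\text{-measure} &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```

Because both models are evaluated with the same 10-fold cross-validation, their values for these measures can be compared directly.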