Lab Terminal Data Warehousing and Data Mining: Part-I (CLO-C1, C2, C3)
Suppose we have built a star schema for sales-related data to be loaded into the data warehouse. The
schema contains the following three dimension tables (Products, Customers, Employees) and a fact table
(Sales):
You have also been provided with .sql files containing the data to be inserted into each of the above tables.
Follow the steps below and paste the screenshots as instructed:
Step-1: Using SQL Server Management Studio, create a database [1]. The database name must be your
registration number without dashes.
[1] https://datatofish.com/database-sql-server/
<Paste here the screenshot showing that the database has been created>
Step-2: Using SQL Server Management Studio, create the three dimension tables and the fact table
within the database created in Step-1 [2].
<Paste here the screenshot showing that the four tables have been created>
[2] https://datatofish.com/table-sql-server/
Employees
Customers
Products
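As a rough illustration of Step-2, the sketch below creates a comparable star schema from Python. It uses an in-memory SQLite database as a stand-in for SQL Server, so the DDL is not T-SQL. Only ProductID, CustomerID, Name and Quantity appear in this document; every other column (SalesID, EmployeeID, FullName) is an assumption made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server database of Step-1.
conn = sqlite3.connect(":memory:")

# Column lists are assumptions: only ProductID, CustomerID, Name and Quantity
# are named in the document; the remaining columns are hypothetical.
conn.executescript("""
CREATE TABLE Products  (ProductID  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FullName TEXT NOT NULL);
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FullName TEXT NOT NULL);
CREATE TABLE Sales (
    SalesID    INTEGER PRIMARY KEY,
    EmployeeID INTEGER NOT NULL REFERENCES Employees(EmployeeID),
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
    ProductID  INTEGER NOT NULL REFERENCES Products(ProductID),
    Quantity   INTEGER NOT NULL
);
""")

# List the tables that now exist, sorted by name.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The fact table holds one foreign key per dimension plus the measure (Quantity), which is what makes the layout a star.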
Step-3: Insert the provided data to the corresponding tables of your database.
<Paste here the four screenshots, each showing the count of rows in one table>
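The row counts asked for in Step-3 could be checked along the following lines. This is a minimal sketch that assumes an in-memory SQLite database and a few made-up rows in place of the provided .sql files; the helper row_count and the FullName column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FullName TEXT)")

# Made-up rows standing in for the contents of the provided .sql files.
conn.executemany("INSERT INTO Products VALUES (?, ?)",
                 [(1, "Bolt"), (2, "Nut"), (3, "Washer")])
conn.executemany("INSERT INTO Customers VALUES (?, ?)",
                 [(1, "Customer A"), (2, "Customer B")])


def row_count(connection, table):
    """Return the number of rows in one of the known tables."""
    # Table names cannot be bound as parameters, so check against a whitelist.
    if table not in {"Products", "Customers"}:
        raise ValueError(f"unexpected table: {table}")
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


counts = {t: row_count(conn, t) for t in ("Products", "Customers")}
print(counts)
```

In SQL Server Management Studio the equivalent check is a SELECT COUNT(*) against each of the four tables.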
Step-4: Answer the following queries over the star schema:
where
where Sel.ProductID=P.ProductID
) group by Name,Quantity
<Paste here the screenshot showing the result of the query>
Q2: Product-wise count of all the products that the customer Trevor C Coleman bought.
SELECT COUNT(Prod.Name),Prod.Name
Prod.ProductID = Sel.Product
Group by Name
<Paste here the screenshot showing the result of the query>
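One way a Q2-style query could be put together is sketched below. It runs against an in-memory SQLite database with made-up rows, reuses the aliases Sel, Prod and Cust from the fragments in this document, and assumes the customer's name is stored in a single hypothetical FullName column; the real schema may well split the name across several columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FullName TEXT);
CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, CustomerID INTEGER,
                    ProductID INTEGER, Quantity INTEGER);
-- Made-up sample rows; the real data comes from the provided .sql files.
INSERT INTO Products  VALUES (1, 'Bolt'), (2, 'Nut');
INSERT INTO Customers VALUES (1, 'Trevor C Coleman'), (2, 'Other Customer');
INSERT INTO Sales VALUES (1, 1, 1, 5), (2, 1, 1, 2), (3, 1, 2, 1), (4, 2, 2, 9);
""")

# Join the fact table to both dimensions, keep one customer's sales,
# then count the sales rows per product.
query = """
SELECT Prod.Name, COUNT(*) AS PurchaseCount
FROM Sales AS Sel
JOIN Products  AS Prod ON Prod.ProductID  = Sel.ProductID
JOIN Customers AS Cust ON Cust.CustomerID = Sel.CustomerID
WHERE Cust.FullName = ?
GROUP BY Prod.Name
ORDER BY Prod.Name
"""
rows = conn.execute(query, ("Trevor C Coleman",)).fetchall()
print(rows)
```

The count here is the number of sales rows per product; summing Quantity instead would give the number of units bought.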
Sorry sir, no light (laptop battery dead)
AS SELECT COUNT(Prod.Name),Prod.Name
group by Name
Write an SQL script that adds a clustered index to it on the attributes ProductID and
CustomerID.
WITH SCHEMABINDING
COUNT_BIG(*) as RecordCount
group by Prod.Name,Prod.ProductID,Cust.CustomerID
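The fragments above (WITH SCHEMABINDING, COUNT_BIG(*), the GROUP BY list) suggest a schema-bound view followed by an index on it. A possible shape for the full T-SQL script is sketched below, held in Python strings so it can be printed and pasted into SQL Server Management Studio. The view name, the index name and the dbo schema are assumptions; the join columns follow the aliases used elsewhere in this document. In SQL Server, the first index on a view has to be a unique clustered index, and schema-bound views have to reference tables by two-part names.

```python
# Hypothetical names; the original view and index names are not shown.
VIEW_NAME = "dbo.ProductCustomerSales"
INDEX_NAME = "IX_ProductCustomerSales"

# Schema-bound view: two-part table names and COUNT_BIG(*) are required
# for a grouped view that is going to be indexed.
create_view = f"""
CREATE VIEW {VIEW_NAME}
WITH SCHEMABINDING
AS
SELECT Prod.Name, Prod.ProductID, Cust.CustomerID,
       COUNT_BIG(*) AS RecordCount
FROM dbo.Sales AS Sel
JOIN dbo.Products  AS Prod ON Prod.ProductID  = Sel.ProductID
JOIN dbo.Customers AS Cust ON Cust.CustomerID = Sel.CustomerID
GROUP BY Prod.Name, Prod.ProductID, Cust.CustomerID;
""".strip()

# The first index on a view must be UNIQUE CLUSTERED.
create_index = (
    f"CREATE UNIQUE CLUSTERED INDEX {INDEX_NAME}\n"
    f"ON {VIEW_NAME} (ProductID, CustomerID);"
)

# CREATE VIEW must be the first statement in its batch, hence the GO separator.
script = "\nGO\n".join([create_view, create_index])
print(script)
```

Because the view groups by ProductID and CustomerID, each (ProductID, CustomerID) pair appears once, which is what allows the index to be unique.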
You are provided with the housing prices dataset, which has the following six attributes:
Age of the house
Distance to the nearest metro station
Number of convenience stores in the locality
Latitude
Longitude
Unit area house price
The dataset contains the prices of 414 houses. Your task is to fit a linear regression model to it and,
once fitted, use it to predict the price of the following house:
For this part, you also need to provide your code and the linear regression model (i.e., the equation) that
has been fitted to the dataset.
Code
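import pandas as pd
from sklearn import linear_model

# Load the housing prices dataset (414 houses, six attributes).
Givendata = pd.read_csv("Downloads\\housingprices.csv")

# Fit Price as a linear function of the five predictor attributes.
Reg = linear_model.LinearRegression()
Reg.fit(Givendata[['Age','Distance','Stores','Latitude','Longitude']],Givendata.Price)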
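The task also asks for the fitted equation and a prediction. The sketch below shows how both can be read off an ordinary least-squares fit. It uses NumPy's least-squares solver rather than scikit-learn, and synthetic data generated from a known linear equation rather than housingprices.csv, so the coefficients and the house it predicts for are made up. With the scikit-learn model above, the same numbers are available as Reg.coef_ and Reg.intercept_, and Reg.predict gives the prediction.

```python
import numpy as np

# Synthetic stand-in for the dataset: 50 rows, 5 predictors, generated
# from a known linear equation (all values here are made up).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(50, 5))
true_coef = np.array([-0.3, -0.005, 1.1, 2.0, -1.5])
true_intercept = 40.0
y = X @ true_coef + true_intercept

# Append a column of ones so the intercept is estimated with the slopes.
A = np.column_stack([X, np.ones(len(X))])
solution, *_ = np.linalg.lstsq(A, y, rcond=None)
coef, intercept = solution[:-1], solution[-1]

# Write the model out as an equation, using the column names from the code above.
names = ["Age", "Distance", "Stores", "Latitude", "Longitude"]
terms = " + ".join(f"({c:.4f} * {n})" for c, n in zip(coef, names))
equation = f"Price = {intercept:.4f} + {terms}"
print(equation)

# Predict for one hypothetical house: intercept plus the weighted attributes.
new_house = np.array([10.0, 5.0, 3.0, 2.0, 1.0])
predicted = float(new_house @ coef + intercept)
print(round(predicted, 4))
```

Since the synthetic targets contain no noise, the fit recovers the generating coefficients up to floating-point error; on the real dataset the coefficients are only estimates.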