Lab Terminal Data Warehousing and Data Mining: Part-I (CLO-C1, C2, C3)

The document describes setting up a star schema in SQL Server for a data warehouse containing sales data. It includes instructions to:
1. Create dimension tables for products, customers, and employees and a fact table for sales.
2. Insert sample data into the tables.
3. Write SQL queries to analyze the data, such as finding the best-selling product and the purchases made by a specific customer.
4. Create a view for a frequent query and add a clustered index to improve performance.
It also describes using linear regression to predict housing prices based on house characteristics.

Uploaded by

Ar. Raja


Lab Terminal

Data Warehousing and Data Mining

Part-I [CLO-C1, C2, C3]

Suppose we have built a star schema for sales-related data to be loaded into the data warehouse. The
schema contains the following three dimension tables (Products, Customers, Employees) and a fact table
(Sales):

• Products (ProductID, Name, Price)
• Customers (CustomerID, FirstName, MiddleInitial, LastName)
• Employees (EmployeeID, FirstName, MiddleInitial, LastName)
• Sales (SalesID, EmployeeID, CustomerID, ProductID, Quantity)

You have also been provided with .sql files containing the data to be inserted into each of the above tables.

Now follow the steps below and paste the screenshots as instructed:

Step-1: Using SQL Server Management Studio, create a database [1]. The database name must be your
registration number without dashes.

[1] https://datatofish.com/database-sql-server/

<Paste here the screenshot showing that the database has been created>

Step-2: Using SQL Server Management Studio, create the three dimension tables and the fact table
within the database created in Step-1 [2].

[2] https://datatofish.com/table-sql-server/

<Screenshots of the created tables: Employees, Customers, Products>
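
The table creation in Step-2 can be sketched in code. The example below is only an illustration: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of SQL Server, and the column types and NOT NULL constraints are assumptions, since the lab lists only column names. The helper names build_schema and table_names are hypothetical.

```python
import sqlite3

# Column types and constraints are assumptions; the lab gives column names only.
SCHEMA = """
CREATE TABLE Products (
    ProductID INTEGER PRIMARY KEY,
    Name      TEXT NOT NULL,
    Price     REAL
);
CREATE TABLE Customers (
    CustomerID    INTEGER PRIMARY KEY,
    FirstName     TEXT NOT NULL,
    MiddleInitial TEXT,
    LastName      TEXT NOT NULL
);
CREATE TABLE Employees (
    EmployeeID    INTEGER PRIMARY KEY,
    FirstName     TEXT NOT NULL,
    MiddleInitial TEXT,
    LastName      TEXT NOT NULL
);
-- Fact table: one row per sale, pointing at the three dimension tables.
CREATE TABLE Sales (
    SalesID    INTEGER PRIMARY KEY,
    EmployeeID INTEGER NOT NULL REFERENCES Employees(EmployeeID),
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID),
    ProductID  INTEGER NOT NULL REFERENCES Products(ProductID),
    Quantity   INTEGER NOT NULL
);
"""

def build_schema(conn):
    """Create the three dimension tables and the fact table."""
    conn.executescript(SCHEMA)

def table_names(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_schema(conn)
print(table_names(conn))  # ['Customers', 'Employees', 'Products', 'Sales']
```

In SQL Server the same statements would use types such as INT and VARCHAR, chosen to match the provided .sql data files.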

Step-3: Insert the provided data into the corresponding tables of your database.

<Paste here the four screenshots, each showing the number of rows in one table>
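
The row-count check asked for in Step-3 can be sketched as follows. This is a minimal illustration on an in-memory SQLite database; the inserted rows are made-up placeholders, not the provided lab data, and row_counts is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products  (ProductID INTEGER PRIMARY KEY, Name TEXT, Price REAL);
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT,
                        MiddleInitial TEXT, LastName TEXT);
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                        MiddleInitial TEXT, LastName TEXT);
CREATE TABLE Sales     (SalesID INTEGER PRIMARY KEY, EmployeeID INTEGER,
                        CustomerID INTEGER, ProductID INTEGER, Quantity INTEGER);
""")

# Placeholder rows (assumed for illustration only).
conn.executemany("INSERT INTO Products VALUES (?, ?, ?)",
                 [(1, "Widget", 2.5), (2, "Gadget", 4.0)])
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?, ?)",
                 [(1, "Trevor", "C", "Coleman")])
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)",
                 [(1, "Ann", "B", "Lee"), (2, "Raj", "K", "Shah")])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 1, 3), (2, 2, 1, 2, 1), (3, 1, 1, 1, 5)])

def row_counts(conn):
    """Run SELECT COUNT(*) on each table of the star schema."""
    tables = ("Products", "Customers", "Employees", "Sales")
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
            for t in tables}

print(row_counts(conn))
```

The same SELECT COUNT(*) FROM <table> statement works unchanged in SQL Server Management Studio, run once per table.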
Step-4: Answer the following queries over the star schema:

• Q1: Which product has been sold the most?

-- Total quantity sold per product, largest first; TOP(1) keeps the best seller
SELECT TOP(1) P.Name, SUM(S.Quantity) AS TotalQuantity
FROM Sales S
JOIN Products P ON S.ProductID = P.ProductID
GROUP BY P.ProductID, P.Name
ORDER BY TotalQuantity DESC

Sorry sir, no light (laptop battery dead)

• Q2: Product-wise count of all the products that the customer Trevor C Coleman bought.

<Write your sql query here>

-- One row per product, with the number of sales rows for this customer
SELECT Prod.Name, COUNT(*) AS PurchaseCount
FROM Customers Cs
JOIN Sales Sel ON Cs.CustomerID = Sel.CustomerID
JOIN Products Prod ON Prod.ProductID = Sel.ProductID
WHERE Cs.FirstName = 'Trevor' AND Cs.MiddleInitial = 'C' AND Cs.LastName = 'Coleman'
GROUP BY Prod.Name

<Paste here the screenshot showing the result of the query>
Sorry sir, no light (laptop battery dead)

• Q3: Which employee has made the least number of sales?

<Write your sql query here>
<Paste here the screenshot showing the result of the query>
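
One possible approach to Q3 is sketched below, under stated assumptions: "number of sales" is read as the number of rows in Sales per employee (not the summed Quantity), employees with no sales at all count as zero, and ties are broken by the lowest EmployeeID. The sketch runs on an in-memory SQLite database with made-up rows, so it uses LIMIT 1 where SQL Server would use TOP(1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT,
                        MiddleInitial TEXT, LastName TEXT);
CREATE TABLE Sales     (SalesID INTEGER PRIMARY KEY, EmployeeID INTEGER,
                        CustomerID INTEGER, ProductID INTEGER, Quantity INTEGER);
""")

# Placeholder rows (assumed for illustration only).
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)", [
    (1, "Ann", "B", "Lee"),
    (2, "Raj", "K", "Shah"),
    (3, "Mia", None, "Cruz"),
])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 10, 100, 5),
    (2, 1, 11, 101, 1),
    (3, 2, 10, 100, 7),
    (4, 3, 12, 102, 2),
    (5, 3, 10, 101, 3),
])

# LEFT JOIN keeps employees with zero sales; COUNT(S.SalesID) ignores
# the NULLs such employees produce, so their count is 0.
LEAST_SALES = """
SELECT E.EmployeeID, E.FirstName, E.LastName, COUNT(S.SalesID) AS SalesCount
FROM Employees E
LEFT JOIN Sales S ON S.EmployeeID = E.EmployeeID
GROUP BY E.EmployeeID, E.FirstName, E.LastName
ORDER BY SalesCount ASC, E.EmployeeID ASC
LIMIT 1
"""

least = conn.execute(LEAST_SALES).fetchone()
print(least)  # (2, 'Raj', 'Shah', 1)
```

If the question is meant to rank employees by units sold instead, replacing COUNT(S.SalesID) with SUM(S.Quantity) changes the interpretation accordingly.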
Step-5: Suppose that Trevor C Coleman is a VIP customer and the query Q2 in Step-4 is run frequently
in the data warehouse.

<Write your sql script here>

Create a simple view in SQL

-- Every column of a view needs a name, so the count is given an alias
CREATE VIEW PriorityCustomer AS
SELECT Prod.Name, COUNT(Prod.Name) AS ProductCount
FROM Customers Cust
JOIN Sales Sal ON Cust.CustomerID = Sal.CustomerID
JOIN Products Prod ON Prod.ProductID = Sal.ProductID
WHERE Cust.FirstName = 'Trevor' AND Cust.MiddleInitial = 'C' AND Cust.LastName = 'Coleman'
GROUP BY Prod.Name

• Write an SQL script that adds a clustered index to the view on the attributes ProductID and
CustomerID.

<Write your sql script here>

-- An indexed view needs SCHEMABINDING, two-part table names, and COUNT_BIG(*) when it uses GROUP BY
ALTER VIEW PriorityCustomer
WITH SCHEMABINDING
AS
SELECT Prod.Name, Prod.ProductID, Cust.CustomerID, COUNT_BIG(*) AS RecordCount
FROM dbo.Customers Cust
JOIN dbo.Sales Sal ON Cust.CustomerID = Sal.CustomerID
JOIN dbo.Products Prod ON Prod.ProductID = Sal.ProductID
WHERE Cust.FirstName = 'Trevor' AND Cust.MiddleInitial = 'C' AND Cust.LastName = 'Coleman'
GROUP BY Prod.Name, Prod.ProductID, Cust.CustomerID
GO

-- The first index on a view must be unique and clustered
CREATE UNIQUE CLUSTERED INDEX IX_PriorityCustomer
ON PriorityCustomer (ProductID, CustomerID)

• Now run Q2 again and report the difference in the time taken to execute it.
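
The before-and-after timing comparison can be sketched as follows. This is an illustration under assumptions, not the lab's SQL Server setup: SQLite has no indexed views, so the sketch swaps in an ordinary index on Sales (CustomerID, ProductID), the rows are synthetic, and the index name IX_Sales_Customer_Product and the helper timed are hypothetical. In SQL Server Management Studio, SET STATISTICS TIME ON reports comparable elapsed times for a query.

```python
import sqlite3
import time

def timed(conn, sql, params=()):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Sales (SalesID INTEGER PRIMARY KEY, EmployeeID INTEGER,
                    CustomerID INTEGER, ProductID INTEGER, Quantity INTEGER)
""")

# Synthetic fact rows (assumed): 50,000 sales spread over 500 customers.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
    ((i, i % 20, i % 500, i % 50, 1 + i % 9) for i in range(1, 50001)),
)

# Product-wise count of purchases for one customer, in the spirit of Q2.
QUERY = """
SELECT ProductID, COUNT(*) FROM Sales
WHERE CustomerID = ?
GROUP BY ProductID
ORDER BY ProductID
"""

before_rows, before_s = timed(conn, QUERY, (7,))
conn.execute(
    "CREATE INDEX IX_Sales_Customer_Product ON Sales (CustomerID, ProductID)"
)
after_rows, after_s = timed(conn, QUERY, (7,))

print(f"before index: {before_s:.6f} s")
print(f"after index:  {after_s:.6f} s")
print("same result:", before_rows == after_rows)
```

A single run is noisy, so repeating each measurement several times and comparing averages gives a fairer picture; the important check is that the result set is identical before and after.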

Part-II [CLO-C4, C5]

You are provided with the housing prices dataset, which has the following six attributes:
• Age of the house
• Distance to the nearest metro station
• Number of convenience stores in the locality
• Latitude
• Longitude
• Unit area house price

The dataset contains the prices of 414 houses. Your task is to fit a linear regression model to it and,
once fitted, use it to predict the price for the following house:

Age     Distance    Number of stores    Latitude    Longitude
14.7    1717.193    2                   24.96447    121.5165

For this part, you also need to provide your code and the linear regression model (i.e., the equation) that’s
been fitted to the dataset.
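
For reference, the general form of the equation being asked for is written out below. The symbols are assumptions introduced here for illustration: \hat{y} is the predicted unit area house price, \beta_0 is the intercept, and \beta_1 to \beta_5 are the coefficients of the five predictors. The numeric values come only from fitting the model to the dataset and are not given here.

```latex
\hat{y} = \beta_0
        + \beta_1 \, x_{\text{Age}}
        + \beta_2 \, x_{\text{Distance}}
        + \beta_3 \, x_{\text{Stores}}
        + \beta_4 \, x_{\text{Latitude}}
        + \beta_5 \, x_{\text{Longitude}}
```

With scikit-learn's LinearRegression, \beta_0 is available as the intercept_ attribute of the fitted model and \beta_1 to \beta_5 as the entries of coef_, in the order of the feature columns passed to fit.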

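Code

import pandas as pd
from sklearn import linear_model

# Load the dataset and give the six columns short names
Givendata = pd.read_csv("Downloads\\housingprices.csv")
Givendata.columns = ['Age', 'Distance', 'Stores', 'Latitude', 'Longitude', 'Price']
Features = ['Age', 'Distance', 'Stores', 'Latitude', 'Longitude']

# Fit the linear regression model on the five predictors
Reg = linear_model.LinearRegression()
Reg.fit(Givendata[Features], Givendata.Price)

# Report the fitted equation: Price = intercept + sum(coefficient * feature)
print("Intercept: ", Reg.intercept_)
print("Coefficients: ", dict(zip(Features, Reg.coef_)))

# Predict the price of the house given in the task
House = pd.DataFrame([[14.7, 1717.193, 2, 24.96447, 121.5165]], columns=Features)
print("The Price: ", Reg.predict(House))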
