Data Engineering Lab Manual (CSL234)
2021-22
Data Engineering
(CSL234)
LAB WORKBOOK
Faculty name: Ankita Gupta
Student name: Harshita Bhatia
Roll No.: 20csu305
Semester: Vth
Group: B
Department of Computer Science and Engineering
The NorthCap University, Gurugram- 122001, India
Session 2021-22
INDEX
S.No.   Experiment   Page No.   Date of Experiment   Date of Submission   Marks   Signature
1. With the objective of making sure that students have access to SQL Server long before they need to work with it for an assignment, install SQL Server with Management Studio.
2. To apply SQL integrity constraints as per the DDL statements for the SAMPLE database.
3. To learn how to query data from a SQL Server database.
4. To learn about views and how to manage views. To introduce stored procedures.
5. To learn about user-defined functions (scalar and table-valued) and table variables.
EXPERIMENT NO. 1
Student Name and Roll Number: Harshita Bhatia 20csu305
Semester /Section: 5th B-1
Link to Code:
Date: 15/09/22
Faculty Signature:
Marks:
Objective:
To install SQL Server with SQL Server Management Studio (SSMS), so that students have access to SQL Server well before they need it for an assignment.
Outcome:
Students will understand how to interact with the SQL Server interface.
Problem Statement:
Install SQL Server 2017 Developer Edition and SQL Server Management Studio (SSMS) from the link below and perform the following tasks:
https://www.microsoft.com/en-us/sql-server/sql-server-downloads
Install SQL Server
Install SQL Server 2017 Developer Edition on your computer or a local server.
Connect to SQL Server
Connect to the SQL Server instance using SQL Server Management Studio (SSMS).
Explore the SQL Server database
Get familiar with the SQL Server sample database called SAMPLE.
Load the SQL Server sample database
Load the SAMPLE database into SQL Server for practice.
Background Study:
SQL Server is a relational database management system (RDBMS) developed and marketed by Microsoft. Like other RDBMS software, SQL Server is built on top of SQL, the standard language for interacting with relational databases. SQL Server is tied to Transact-SQL (T-SQL), Microsoft's implementation of SQL, which adds a set of proprietary programming constructs.
SQL Server ran exclusively on Windows for more than 20 years. In 2016, Microsoft announced SQL Server on Linux. SQL Server 2017, which became generally available in October 2017, runs on both Windows and Linux.
SQL Server Services and Tools
Microsoft provides both data management and business intelligence (BI) tools and services together
with SQL Server.
For data management, SQL Server includes SQL Server Integration Services (SSIS), SQL Server Data Quality Services, and SQL Server Master Data Services. To develop databases, SQL Server provides SQL Server Data Tools (SSDT); to manage, deploy, and monitor databases, it provides SQL Server Management Studio (SSMS).
For data analysis, SQL Server offers SQL Server Analysis Services (SSAS). SQL Server Reporting Services (SSRS) provides reports and visualization of data. Machine Learning Services first appeared in SQL Server 2016 as R Services and was renamed in SQL Server 2017.
SQL Server Editions
SQL Server has four primary editions with different bundled services and tools. Two editions are available free of charge:
SQL Server Developer edition, for use in database development and testing.
SQL Server Express edition, for small databases of up to 10 GB in size.
For larger and more critical applications, SQL Server offers the Enterprise edition, which includes all of SQL Server's features.
SQL Server Standard edition has a partial feature set of the Enterprise edition and limits the number of processor cores and the amount of memory that the server can use.
Question Bank:
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs
EXPERIMENT NO. 2
Student Name and Roll Number: Harshita Bhatia 20csu305
Semester /Section: 5th B-1
Link to Code:
Date:
Faculty Signature:
Marks:
Objective:
To apply SQL integrity constraints as per the DDL statements for SAMPLE database.
The following illustrates the SAMPLE database diagram:
Outcome:
The students will understand how a database is created and then populated with relevant data.
The students will understand the need for applying various types of integrity constraints, such as primary key, foreign key, unique key, NOT NULL, DEFAULT, and CHECK.
Problem Statement:
Perform the basic DDL queries for the sample database.
Background Study:
1) Primary Key Constraint: A column or group of columns that uniquely identifies every row in a table is called a primary key. A primary key value cannot be duplicated: the same value can't appear more than once in the table.
Syntax to define a Primary key at column level:
column_name datatype [CONSTRAINT constraint_name] PRIMARY KEY
Syntax to define a Primary key at table level:
[CONSTRAINT constraint_name] PRIMARY KEY (column_name1, column_name2, ...)
Rules for defining Primary key:
o Two rows can't have the same primary key value.
o Every row must have a primary key value.
o The primary key field cannot be null.
o By default, the value in a primary key column cannot be modified or updated while any foreign key refers to that value.
2) Foreign Key (Referential integrity constraint): This constraint identifies a column that references the PRIMARY KEY of another table. It establishes a relationship between two columns in the same table or between different tables. For a column to be defined as a Foreign Key, the column it refers to must be defined as a Primary Key (SQL Server also accepts a column with a UNIQUE constraint) in the referenced table. One or more columns can be defined as a Foreign Key.
Syntax to define a Foreign key at column level:
[CONSTRAINT constraint_name] REFERENCES referenced_table_name(column_name)
Syntax to define a Foreign key at table level:
[CONSTRAINT constraint_name] FOREIGN KEY (column_name) REFERENCES referenced_table_name(column_name)
3) SQL Not Null Constraint: This constraint ensures that every row in the table contains a definite value for the column specified as NOT NULL; that is, a null value is not allowed in that column.
Syntax to define a Not Null constraint:
[CONSTRAINT constraint_name] NOT NULL
4) SQL Unique Key: This constraint ensures that a column or a group of columns has a distinct value in each row. The column(s) can hold a null value, but values cannot be duplicated; in SQL Server this means a single-column unique key allows at most one NULL.
Syntax to define a Unique key at column level:
[CONSTRAINT constraint_name] UNIQUE
Syntax to define a Unique key at table level:
[CONSTRAINT constraint_name] UNIQUE (column_name)
5) SQL Check Constraint: This constraint defines a business rule on a column. All the rows must satisfy this rule. The constraint can be applied to a single column or a group of columns.
Syntax to define a Check constraint:
[CONSTRAINT constraint_name] CHECK (condition)
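The five constraint types above, plus DEFAULT, can be combined in one pair of tables. The following is a minimal T-SQL sketch; the tables dbo.departments and dbo.employees and all constraint names are hypothetical and are not taken from the SAMPLE database diagram.

```sql
-- Hypothetical tables for illustration only; not part of the SAMPLE database.
CREATE TABLE dbo.departments (
    department_id   INT         NOT NULL
        CONSTRAINT pk_departments PRIMARY KEY,          -- column-level primary key
    department_name VARCHAR(50) NOT NULL
        CONSTRAINT uq_departments_name UNIQUE           -- column-level unique key
);

CREATE TABLE dbo.employees (
    employee_id   INT           NOT NULL,
    email         VARCHAR(100)  NOT NULL,
    salary        DECIMAL(10,2) NOT NULL
        CONSTRAINT df_employees_salary DEFAULT (0),     -- default value
    department_id INT           NULL,
    -- table-level constraints
    CONSTRAINT pk_employees PRIMARY KEY (employee_id),
    CONSTRAINT uq_employees_email UNIQUE (email),
    CONSTRAINT ck_employees_salary CHECK (salary >= 0),
    CONSTRAINT fk_employees_departments FOREIGN KEY (department_id)
        REFERENCES dbo.departments (department_id)
);

-- Valid rows: the salary column falls back to its default of 0.
INSERT INTO dbo.departments (department_id, department_name)
VALUES (1, 'Engineering');
INSERT INTO dbo.employees (employee_id, email, department_id)
VALUES (100, 'a@example.com', 1);

-- Each statement below is expected to be rejected; uncomment one at a time to see the error.
-- INSERT INTO dbo.employees (employee_id, email) VALUES (100, 'b@example.com');         -- duplicate primary key
-- INSERT INTO dbo.employees (employee_id, email) VALUES (101, 'a@example.com');         -- duplicate unique key
-- INSERT INTO dbo.employees (employee_id, email, salary) VALUES (102, 'c@example.com', -5);       -- CHECK violated
-- INSERT INTO dbo.employees (employee_id, email, department_id) VALUES (103, 'd@example.com', 9); -- no such department

-- Clean up (the referencing table must be dropped first).
DROP TABLE dbo.employees;
DROP TABLE dbo.departments;
```

Naming each constraint explicitly is a design choice rather than a requirement; it makes error messages and later ALTER TABLE ... DROP CONSTRAINT statements easier to read.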
Question Bank: Perform the following basic DDL queries
Create a SAMPLE database.
CREATE SCHEMA – create a new schema in the database.
CREATE TABLE – create a new table in a specific schema of the database.
Identity column – use the IDENTITY property to create an identity column for a table.
ALTER TABLE ADD column – add one or more columns to an existing table.
ALTER TABLE ALTER COLUMN – change the definition of existing columns in a table.
ALTER TABLE DROP COLUMN – drop one or more columns from a table.
Computed columns – use computed columns to reuse calculation logic in multiple queries.
DROP TABLE – delete tables from the database.
TRUNCATE TABLE – delete all data from a table faster and more efficiently than DELETE.
SELECT INTO – create a table and insert data from a query into it.
Rename a table – rename an existing table.
Temporary tables – use temporary tables to store intermediate data temporarily in stored procedures or a database session.
Synonym – understand synonyms and create synonyms for database objects.
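Most of the DDL tasks listed above can be tried in one short script. This is a sketch under stated assumptions: the schema lab, the tables lab.products and lab.product_totals, and the synonym lab.items are hypothetical names chosen for illustration. GO is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement.

```sql
-- All object names here are hypothetical; run inside a scratch database.
CREATE SCHEMA lab;                                   -- must be alone in its batch
GO

CREATE TABLE lab.products (
    product_id   INT IDENTITY(1,1) PRIMARY KEY,      -- identity column: 1, 2, 3, ...
    product_name VARCHAR(100)  NOT NULL,
    quantity     INT           NOT NULL,
    unit_price   DECIMAL(10,2) NOT NULL,
    line_total   AS (quantity * unit_price)          -- computed column
);
GO

ALTER TABLE lab.products ADD notes VARCHAR(50) NULL;            -- ALTER TABLE ADD column
GO
ALTER TABLE lab.products ALTER COLUMN notes VARCHAR(200) NULL;  -- ALTER TABLE ALTER COLUMN
GO
ALTER TABLE lab.products DROP COLUMN notes;                     -- ALTER TABLE DROP COLUMN
GO

INSERT INTO lab.products (product_name, quantity, unit_price)
VALUES ('Keyboard', 2, 25.00), ('Mouse', 3, 10.00);

-- SELECT INTO: create a new table from a query result.
SELECT product_id, product_name, line_total
INTO lab.product_totals
FROM lab.products;

-- Temporary table: visible only to the current session.
CREATE TABLE #cheap_products (product_name VARCHAR(100));
INSERT INTO #cheap_products (product_name)
SELECT product_name FROM lab.products WHERE unit_price < 20;
GO

-- Rename a table: the new name is given without the schema prefix.
EXEC sp_rename 'lab.product_totals', 'product_summary';
GO

-- Synonym: an alternative name for a database object.
CREATE SYNONYM lab.items FOR lab.products;
GO
SELECT product_name, line_total FROM lab.items;
GO

-- TRUNCATE removes all rows; DROP removes the objects themselves.
TRUNCATE TABLE lab.product_summary;
DROP SYNONYM lab.items;
DROP TABLE #cheap_products;
DROP TABLE lab.product_summary, lab.products;
DROP SCHEMA lab;
```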
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs
Creating a new database using the CREATE DATABASE statement
The CREATE DATABASE statement creates a new database. The following shows the minimal syntax of
the CREATE DATABASE statement:
CREATE DATABASE database_name;
In this syntax, you specify the name of the database after the CREATE DATABASE keyword.
The database name must be unique within an instance of SQL Server. It must also comply with the SQL
Server identifier’s rules. Typically, the database name has a maximum of 128 characters.
The following statement creates a new database named TestDb:
CREATE DATABASE TestDb;
Once the statement executes successfully, you can view the newly created database in the Object Explorer. If the new database does not appear, click the Refresh button or press the F5 key to update the object list.
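Running CREATE DATABASE a second time with the same name raises an error because the name must be unique within the instance. One way to make the script safe to re-run, shown here as a small sketch using the TestDb name from the text, is to check DB_ID first:

```sql
-- DB_ID returns NULL when no database with that name exists on the instance.
IF DB_ID(N'TestDb') IS NULL
    CREATE DATABASE TestDb;
GO

-- Optional clean-up once the exercise is finished:
-- DROP DATABASE TestDb;
```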
This statement lists all databases in the SQL Server:
SELECT
name
FROM
master.sys.databases
ORDER BY
name;
Or you can execute the stored procedure sp_databases:
EXEC sp_databases;
Creating a new database using SQL Server Management Studio
First, right-click the Databases node and choose the New Database… menu item.
Second, enter the name of the database, e.g., SampleDb, and click the OK button.
Third, view the newly created database from the Object Explorer:
EXPERIMENT NO. 3
Student Name and Roll Number: Harshita Bhatia 20csu305
Semester /Section: 5th B-1
Link to Code:
Date: 30-08-22
Faculty Signature:
Marks:
Objective:
To learn how to query data from a SQL Server database.
Outcome:
Students will understand how to sort, filter, modify, and group data.
Students will learn how to apply CTEs (Common Table Expressions) and recursive CTEs.
Problem Statement:
Perform the following queries for the sample database.
Select distinct values in one or more columns of a table.
Filter rows in the output of a query based on one or more conditions.
Combine two Boolean expressions and return true if all expressions are true.
Combine two Boolean expressions and return true if either of conditions is true.
Check whether a value matches any value in a list or a subquery.
Test if a value is between a range of values.
Check if a character string matches a specified pattern.
Use column aliases to change the headings of the query output and table aliases to improve the readability of a query.
Use common table expressions to make complex queries more readable.
Query hierarchical data using recursive CTE.
Convert rows to columns
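The queries listed above can be sketched against a small temporary table. The table #staff, its columns, and its rows are hypothetical and are used only because the SAMPLE tables are not reproduced in this workbook.

```sql
-- Hypothetical data; #staff is a session-local temporary table.
CREATE TABLE #staff (
    staff_id   INT PRIMARY KEY,
    first_name VARCHAR(50),
    city       VARCHAR(50),
    salary     DECIMAL(10,2),
    manager_id INT NULL
);
INSERT INTO #staff (staff_id, first_name, city, salary, manager_id) VALUES
    (1, 'Asha',  'Delhi',    90000, NULL),
    (2, 'Bilal', 'Gurugram', 60000, 1),
    (3, 'Chen',  'Gurugram', 55000, 1),
    (4, 'Divya', 'Mumbai',   40000, 2);

SELECT DISTINCT city FROM #staff;                                      -- distinct values
SELECT first_name FROM #staff WHERE salary > 50000;                    -- filter rows
SELECT first_name FROM #staff WHERE city = 'Gurugram' AND salary > 56000;  -- both must be true
SELECT first_name FROM #staff WHERE city = 'Mumbai' OR salary > 80000;     -- either may be true
SELECT first_name FROM #staff WHERE city IN ('Delhi', 'Mumbai');       -- match a list
SELECT first_name FROM #staff WHERE salary BETWEEN 50000 AND 60000;    -- inclusive range
SELECT first_name FROM #staff WHERE first_name LIKE 'A%';              -- pattern match

-- Column alias (employee, pay) and table alias (s).
SELECT s.first_name AS employee, s.salary AS pay FROM #staff AS s;

-- Common table expression: name an intermediate result, then query it.
WITH city_totals AS (
    SELECT city, SUM(salary) AS total_salary
    FROM #staff
    GROUP BY city
)
SELECT city, total_salary FROM city_totals WHERE total_salary > 50000;

-- Recursive CTE: walk the manager hierarchy from the top down.
WITH org AS (
    SELECT staff_id, first_name, manager_id, 0 AS depth
    FROM #staff
    WHERE manager_id IS NULL                     -- anchor member: no manager
    UNION ALL
    SELECT s.staff_id, s.first_name, s.manager_id, o.depth + 1
    FROM #staff AS s
    INNER JOIN org AS o ON s.manager_id = o.staff_id   -- recursive member
)
SELECT staff_id, first_name, depth FROM org ORDER BY depth, staff_id;

-- PIVOT: convert rows (one per staff member) to columns (one per city).
SELECT [Delhi], [Gurugram], [Mumbai]
FROM (SELECT city, staff_id FROM #staff) AS src
PIVOT (COUNT(staff_id) FOR city IN ([Delhi], [Gurugram], [Mumbai])) AS p;

DROP TABLE #staff;
```

A statement that comes before WITH must end with a semicolon, which is why every statement in the sketch is terminated explicitly.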
Background Study:
Students should know how to change the contents of tables in the database. The SQL commands for modifying data, such as INSERT, DELETE, and UPDATE, are referred to as data manipulation language (DML).
INSERT – insert a row into a table
INSERT multiple rows – insert multiple rows into a table using a single INSERT statement
INSERT INTO SELECT – insert data into a table from the result of a query.
UPDATE – change the existing values in a table.
UPDATE JOIN – update values in a table based on values from another table using JOIN clauses.
DELETE – delete one or more rows of a table.
MERGE – perform a mixture of insertion, update, and deletion using a single statement.
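The DML statements listed above can be sketched with three temporary tables. The tables #target, #source, and #archive and their contents are hypothetical.

```sql
-- Hypothetical temporary tables for illustration.
CREATE TABLE #target  (id INT PRIMARY KEY, name VARCHAR(50), price DECIMAL(10,2));
CREATE TABLE #source  (id INT PRIMARY KEY, name VARCHAR(50), price DECIMAL(10,2));
CREATE TABLE #archive (id INT PRIMARY KEY, name VARCHAR(50), price DECIMAL(10,2));

-- INSERT: one row, then several rows in a single statement.
INSERT INTO #target (id, name, price) VALUES (1, 'Pen', 10.00);
INSERT INTO #target (id, name, price) VALUES (2, 'Pencil', 5.00), (3, 'Eraser', 3.00);
INSERT INTO #source (id, name, price) VALUES (2, 'Pencil', 6.00), (4, 'Ruler', 12.00);

-- INSERT INTO SELECT: copy the rows returned by a query.
INSERT INTO #archive (id, name, price)
SELECT id, name, price FROM #target WHERE price < 6;

-- UPDATE: change existing values.
UPDATE #target SET price = price * 1.10 WHERE id = 1;

-- UPDATE JOIN: take the new price from the matching #source row.
UPDATE t
SET t.price = s.price
FROM #target AS t
INNER JOIN #source AS s ON s.id = t.id;

-- DELETE: remove selected rows.
DELETE FROM #target WHERE id = 3;

-- MERGE: update matches, insert missing rows, delete rows absent from the source.
MERGE #target AS t
USING #source AS s ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.name = s.name, t.price = s.price
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name, price) VALUES (s.id, s.name, s.price)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;                                   -- MERGE must end with a semicolon

SELECT id, name, price FROM #target ORDER BY id;   -- now mirrors #source

DROP TABLE #target, #source, #archive;
```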
Question Bank:
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs
OFFSET and FETCH
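The original sample output for this heading is not reproduced here. As an illustrative sketch only (the table variable @products and its rows are hypothetical), OFFSET and FETCH page through an ordered result:

```sql
-- Hypothetical data held in a table variable.
DECLARE @products TABLE (product_name VARCHAR(50), list_price DECIMAL(10,2));
INSERT INTO @products (product_name, list_price)
VALUES ('A', 10), ('B', 20), ('C', 30), ('D', 40), ('E', 50);

-- OFFSET and FETCH require an ORDER BY clause.
SELECT product_name, list_price
FROM @products
ORDER BY list_price DESC
OFFSET 2 ROWS               -- skip the two most expensive rows (E, D)
FETCH NEXT 2 ROWS ONLY;     -- return the next two rows (C, B)
```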
NULL
AND
INNER JOIN
LEFT JOIN
RIGHT JOIN
FULL JOIN
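The original sample outputs for the join headings are not reproduced here. The following sketch, using hypothetical table variables @customers and @orders, shows how the four join types differ on the same data:

```sql
-- Hypothetical data: Bilal and Chen have no orders; order 12 has no matching customer.
DECLARE @customers TABLE (customer_id INT, customer_name VARCHAR(50));
DECLARE @orders    TABLE (order_id INT, customer_id INT);
INSERT INTO @customers (customer_id, customer_name) VALUES (1, 'Asha'), (2, 'Bilal'), (3, 'Chen');
INSERT INTO @orders (order_id, customer_id) VALUES (10, 1), (11, 1), (12, 4);

-- INNER JOIN: only rows that match on both sides (2 rows).
SELECT c.customer_name, o.order_id
FROM @customers AS c
INNER JOIN @orders AS o ON o.customer_id = c.customer_id;

-- LEFT JOIN: every customer, with NULL order_id where there is no order (4 rows).
SELECT c.customer_name, o.order_id
FROM @customers AS c
LEFT JOIN @orders AS o ON o.customer_id = c.customer_id;

-- RIGHT JOIN: every order, with NULL customer_name where there is no customer (3 rows).
SELECT c.customer_name, o.order_id
FROM @customers AS c
RIGHT JOIN @orders AS o ON o.customer_id = c.customer_id;

-- FULL JOIN: all rows from both sides, matched where possible (5 rows).
SELECT c.customer_name, o.order_id
FROM @customers AS c
FULL JOIN @orders AS o ON o.customer_id = c.customer_id;
```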
UNION
SUBQUERY
NESTED SUBQUERY
CORRELATED SUBQUERY
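The original sample outputs for the subquery headings are not reproduced here. As an illustrative sketch with a hypothetical table variable @staff, the three kinds of subquery can be written as:

```sql
-- Hypothetical data.
DECLARE @staff TABLE (first_name VARCHAR(50), city VARCHAR(50), salary DECIMAL(10,2));
INSERT INTO @staff (first_name, city, salary) VALUES
    ('Asha', 'Delhi', 90000), ('Bilal', 'Gurugram', 60000),
    ('Chen', 'Gurugram', 55000), ('Divya', 'Mumbai', 40000);

-- Subquery: staff paid more than the overall average.
SELECT first_name
FROM @staff
WHERE salary > (SELECT AVG(salary) FROM @staff);

-- Nested subquery: a subquery inside another subquery.
-- Staff who work in the city of the highest-paid person.
SELECT first_name
FROM @staff
WHERE city IN (SELECT city
               FROM @staff
               WHERE salary = (SELECT MAX(salary) FROM @staff));

-- Correlated subquery: the inner query refers to the current outer row (s.city).
-- Staff paid more than the average of their own city.
SELECT s.first_name
FROM @staff AS s
WHERE s.salary > (SELECT AVG(i.salary) FROM @staff AS i WHERE i.city = s.city);
```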
INTERSECT
ALL
EXCEPT
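The original sample outputs for the set-operator headings in this work area are not reproduced here. A small sketch with two hypothetical table variables shows how UNION, INTERSECT, and EXCEPT combine results:

```sql
-- Hypothetical data: @a holds 1, 2, 3 and @b holds 2, 3, 4.
DECLARE @a TABLE (n INT);
DECLARE @b TABLE (n INT);
INSERT INTO @a (n) VALUES (1), (2), (3);
INSERT INTO @b (n) VALUES (2), (3), (4);

SELECT n FROM @a UNION     SELECT n FROM @b;   -- 1, 2, 3, 4 (duplicates removed)
SELECT n FROM @a UNION ALL SELECT n FROM @b;   -- six rows (duplicates kept)
SELECT n FROM @a INTERSECT SELECT n FROM @b;   -- 2, 3 (rows present in both)
SELECT n FROM @a EXCEPT    SELECT n FROM @b;   -- 1 (rows in @a that are not in @b)
```

Both sides of a set operator must return the same number of columns with compatible data types.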
EXPERIMENT NO. 4
Student Name and Roll Number: Harshita Bhatia 20csu305
Semester /Section: 5th B-1
Link to Code:
Date:
Faculty Signature:
Marks:
Objective:
To learn about views and how to manage views.
To introduce stored procedures.
Outcome:
Students will learn to create a new view, remove a view, and update data of the underlying tables through a view.
Students will learn how to develop flexible stored procedures to optimize database access.
Problem Statement:
Perform the following queries to manage views in SQL Server:
Creating a new view – create a new view in a SQL Server database.
Renaming a view – rename a view using SQL Server Management Studio (SSMS) or a Transact-SQL command.
Listing views in SQL Server – list all views in a SQL Server database.
Getting view information – get information about a view.
Removing a view – use the DROP VIEW statement to remove one or more views from the database.
Create, modify, and delete a stored procedure.
Make use of control-of-flow statements in a stored procedure.
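The view-management tasks listed above can be sketched as follows. The table dbo.lab_customers and the view names are hypothetical; a permanent table is used because a view cannot reference a temporary table. GO is the SSMS/sqlcmd batch separator, and CREATE VIEW must be the first statement in its batch.

```sql
-- Hypothetical base table.
CREATE TABLE dbo.lab_customers (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(50),
    phone         VARCHAR(20),
    bank_account  VARCHAR(30)
);
GO

-- Creating a view: expose only the non-sensitive columns.
CREATE VIEW dbo.vw_customer_contacts
AS
SELECT customer_id, customer_name, phone
FROM dbo.lab_customers;
GO

SELECT customer_name, phone FROM dbo.vw_customer_contacts;
GO

-- Renaming a view with T-SQL. Microsoft's documentation cautions that
-- dropping and re-creating the view is the safer route in production.
EXEC sp_rename 'dbo.vw_customer_contacts', 'vw_contacts';
GO

-- Listing views in the current database.
SELECT OBJECT_SCHEMA_NAME(object_id) AS schema_name, name
FROM sys.views;

-- Getting view information: the definition text of the view.
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.vw_contacts')) AS view_definition;
GO

-- Removing the view (IF EXISTS needs SQL Server 2016 or later), then the table.
DROP VIEW IF EXISTS dbo.vw_contacts;
DROP TABLE dbo.lab_customers;
```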
Background Study:
By definition, views do not store data except for indexed views.
A view may consist of columns from multiple tables using joins or just a subset of columns of a single table.
This makes views useful for abstracting or hiding complex queries.
The following picture illustrates a view that includes columns from multiple tables:
Advantages of views
Generally speaking, views provide the following advantages:
Security
You can prevent users from accessing a table directly and instead allow them to access a subset of its data via views.
For example, you can allow users to access customer name, phone, and email via a view but prevent them from accessing the bank account and other sensitive information.
Simplicity
A relational database may have many tables with complex relationships e.g., one-to-one and one-to-many
that make it difficult to navigate.
However, you can simplify the complex queries with joins and conditions using a set of views.
Consistency
Sometimes, you need to write a complex formula or logic in every query.
To make it consistent, you can hide the complex query logic and calculations in views.
Once views are defined, you can reference the logic from the views rather than rewriting it in separate queries.
Stored Procedure
SQL Server stored procedures are used to group one or more Transact-SQL statements into logical units.
The stored procedure is stored as a named object in the SQL Server Database Server.
When you call a stored procedure for the first time, SQL Server creates an execution plan and stores it in
the cache. In the subsequent executions of the stored procedure, SQL Server reuses the plan to execute the
stored procedure very fast with reliable performance.
This experiment introduces you to the stored procedures and shows you how to develop flexible stored
procedures to optimize database access.
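The life cycle of a stored procedure (create, execute, modify, delete), including a control-of-flow statement, can be sketched as below. The table dbo.lab_products and the procedure dbo.usp_find_products are hypothetical names. CREATE PROCEDURE and ALTER PROCEDURE must each be the first statement in a batch, hence the GO separators.

```sql
-- Hypothetical table and data.
CREATE TABLE dbo.lab_products (product_name VARCHAR(50), list_price DECIMAL(10,2));
INSERT INTO dbo.lab_products (product_name, list_price)
VALUES ('Pen', 10.00), ('Notebook', 25.00), ('Bag', 60.00);
GO

-- Create a procedure with one optional parameter.
CREATE PROCEDURE dbo.usp_find_products
    @min_price DECIMAL(10,2) = 0
AS
BEGIN
    SET NOCOUNT ON;
    SELECT product_name, list_price
    FROM dbo.lab_products
    WHERE list_price >= @min_price
    ORDER BY list_price;
END;
GO

-- Execute the procedure.
EXEC dbo.usp_find_products @min_price = 20;
GO

-- Modify the procedure: add control of flow with IF ... BEGIN ... END.
ALTER PROCEDURE dbo.usp_find_products
    @min_price DECIMAL(10,2) = 0
AS
BEGIN
    SET NOCOUNT ON;
    IF NOT EXISTS (SELECT 1 FROM dbo.lab_products WHERE list_price >= @min_price)
    BEGIN
        PRINT 'No products at or above the requested price.';
        RETURN;                       -- leave the procedure early
    END;
    SELECT product_name, list_price
    FROM dbo.lab_products
    WHERE list_price >= @min_price
    ORDER BY list_price;
END;
GO

EXEC dbo.usp_find_products @min_price = 1000;   -- takes the IF branch and prints the message
GO

-- Delete the procedure and clean up.
DROP PROCEDURE dbo.usp_find_products;
DROP TABLE dbo.lab_products;
```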
Question Bank:
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs
Creating a new view– show you how to create a new view in a SQL Server database
Create, modify and delete a stored procedure
Create a procedure
Executing a procedure
Modifying a procedure
Deleting a procedure
Make use of control-of-flow statements in stored procedure
EXPERIMENT NO. 5
Student Name and Roll Number: Harshita Bhatia 20csu305
Semester /Section: 5th B-1
Link to Code:
Date: 14-09-22
Faculty Signature:
Marks:
Objective:
To learn about user-defined functions in SQL Server, including scalar functions, table variables, and table-valued functions.
Outcome:
Students will learn to create, call, modify, and remove scalar and table-valued user-defined functions.
Students will learn how to use table variables, including as the return value of a user-defined function.
Problem Statement:
Perform the following queries using user-defined functions in SQL Server:
Create, call, modify, and remove a scalar function.
Declare a table variable named @product_table which consists of three columns: product_name, brand_id, and list_price.
Insert data into the table variable.
Query data from the table variable.
Create a user-defined function named ufnSplit() that returns a table variable.
Create a table-valued function and a multi-statement table-valued function.
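The table-variable tasks above can be sketched as follows. The column names of @product_table come from the problem statement; the data types, the sample rows, and the signature and body of ufnSplit (a string and a one-character delimiter) are assumptions made for illustration, not the workbook's original solution.

```sql
-- Declare the table variable; data types are assumed.
DECLARE @product_table TABLE (
    product_name VARCHAR(255)  NOT NULL,
    brand_id     INT           NOT NULL,
    list_price   DECIMAL(10,2) NOT NULL
);

-- Insert data into the table variable (hypothetical rows).
INSERT INTO @product_table (product_name, brand_id, list_price)
VALUES ('Sample Bike', 1, 499.99), ('Sample Helmet', 2, 49.50);

-- Query the table variable; it exists only for the duration of this batch.
SELECT product_name, brand_id, list_price
FROM @product_table
WHERE list_price > 100;
GO

-- A multi-statement table-valued function that returns a table variable.
CREATE FUNCTION dbo.ufnSplit (@string VARCHAR(MAX), @delimiter CHAR(1))
RETURNS @parts TABLE (idx INT IDENTITY(1,1) PRIMARY KEY, val VARCHAR(MAX))
AS
BEGIN
    DECLARE @pos INT = CHARINDEX(@delimiter, @string);
    WHILE @pos > 0
    BEGIN
        -- Store the text before the delimiter, then continue with the remainder.
        INSERT INTO @parts (val) VALUES (LEFT(@string, @pos - 1));
        SET @string = SUBSTRING(@string, @pos + 1, LEN(@string));
        SET @pos = CHARINDEX(@delimiter, @string);
    END;
    INSERT INTO @parts (val) VALUES (@string);   -- the final piece
    RETURN;
END;
GO

SELECT idx, val FROM dbo.ufnSplit('red,green,blue', ',');   -- red, green, blue
GO

DROP FUNCTION dbo.ufnSplit;
```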
Background Study:
SQL Server user-defined functions include scalar-valued functions, which return a single value, and table-valued functions, which return rows of data.
User-defined functions help you simplify development by encapsulating complex business logic and making it available for reuse in every query.
User-defined scalar functions – encapsulate a complex formula or business logic and reuse it in every query.
Table variables – use table variables as the return value of user-defined functions.
Table-valued functions – use inline table-valued functions and multi-statement table-valued functions to develop user-defined functions that return data of table types.
Removing user-defined functions – drop one or more existing user-defined functions from the database.
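As a minimal sketch of the function types described above, the script below creates, calls, modifies, and removes a scalar function and an inline table-valued function. The names dbo.udfNetSale and dbo.udfPriceTiers, their parameters, and the hard-coded tier rows are hypothetical. CREATE FUNCTION and ALTER FUNCTION must each be the first statement in a batch.

```sql
-- Scalar function: returns a single value.
CREATE FUNCTION dbo.udfNetSale (
    @quantity   INT,
    @list_price DECIMAL(10,2),
    @discount   DECIMAL(4,2)
)
RETURNS DECIMAL(10,2)
AS
BEGIN
    RETURN @quantity * @list_price * (1 - @discount);
END;
GO

-- Calling a scalar function requires the schema prefix.
SELECT dbo.udfNetSale(10, 100.00, 0.10) AS net_sale;   -- 900.00
GO

-- Modifying the function: treat a NULL discount as no discount.
ALTER FUNCTION dbo.udfNetSale (
    @quantity   INT,
    @list_price DECIMAL(10,2),
    @discount   DECIMAL(4,2)
)
RETURNS DECIMAL(10,2)
AS
BEGIN
    RETURN @quantity * @list_price * (1 - ISNULL(@discount, 0));
END;
GO

-- Inline table-valued function: a single SELECT, used like a parameterized view.
CREATE FUNCTION dbo.udfPriceTiers (@min_price DECIMAL(10,2))
RETURNS TABLE
AS
RETURN
    SELECT t.tier_name, t.tier_price
    FROM (VALUES ('Basic', 10.00), ('Standard', 25.00), ('Premium', 60.00))
         AS t (tier_name, tier_price)
    WHERE t.tier_price >= @min_price;
GO

-- Calling a table-valued function: it is used in the FROM clause.
SELECT tier_name, tier_price FROM dbo.udfPriceTiers(20.00);
GO

-- Removing one or more functions.
DROP FUNCTION dbo.udfNetSale, dbo.udfPriceTiers;
```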
Question Bank:
Student Work Area
Algorithm/Flowchart/Code/Sample Outputs
Create, call, modify and remove a scalar function.
Calling a scalar function
Table valued functions
Creating a table valued function
Calling a table valued function
Modifying the table valued function
Multi-statement table valued function
Drop function