DBMS Record


Ex. No.: 1 BASIC SQL

Date: 06.01.17 Student Database

Relation for Student Scenario

Suppose you are given a relation grade_points(grade, points), which provides a conversion from
letter grades in the takes relation to numeric scores; for example, an “A” grade could be specified to correspond
to 4 points, an “A−” to 3.7 points, a “B+” to 3.3 points, a “B” to 3 points, and so on. The grade points earned
by a student for a course offering (section) are defined as the number of credits for the course multiplied by the
numeric points for the grade that the student received. Given the above relation, and our university schema,
write each of the following queries in SQL. You can assume for simplicity that no takes tuple has the null
value for grade.

For the above schema, perform the following –

a. Find the total grade-points earned by the student with ID 12345, across all courses taken by the
student.

b. Find the grade-point average (GPA) for the above student, that is, the total grade-points divided
by the total credits for the associated courses.

c. Find the ID and the grade-point average of every student.

Aim

To create a student database and write SQL queries for different operations using basic SQL commands.

Theory

SQL

SQL is a standard language for accessing and manipulating databases.

• SQL stands for Structured Query Language
• SQL lets you access and manipulate databases
• SQL is an ANSI (American National Standards Institute) standard
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views

CSP602: Advanced Database Management System MTECH CSE| NITPY 1

SQL CREATE TABLE

The CREATE TABLE statement is used to create a new table in a database.


Syntax -
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
...
);

SQL INSERT INTO

The INSERT INTO statement is used to insert new records in a table.


Syntax -
INSERT INTO table_name VALUES (value1, value2, value3, ...);

SQL SELECT

The SELECT statement is used to select data from a database.


The data returned is stored in a result table, called the result-set.
Syntax –
SELECT column1, column2, ... FROM table_name;
To select whole table –
SELECT * FROM table_name;

Program/Procedure :

Database Creation:

create database University;

use University;

Table Creation:

GradeP :

create table GradeP(Grade varchar(3),Points float,primary key(Grade));

Student :

create table Student(ID varchar(10),Name varchar(20),Sub1G varchar(3),Sub2G varchar(3),Sub3G varchar(3),Sub4G varchar(3),Primary key(ID));

CourseD :

create table CourseD(CName varchar(5),MaxCredits int,Primary Key(CName));

Values Insertion:

GradeP Table:

insert into GradeP values('A+',5);

insert into GradeP values('A',4);

insert into GradeP values('A-',3.7);

insert into GradeP values('B+',3.3);

insert into GradeP values('B',3);

insert into GradeP values('B-',2.7);

insert into GradeP values('C+',2.3);

insert into GradeP values('C',2);

insert into GradeP values('F',0);

Select * from GradeP;

Student Table:

insert into Student values('CS01','Shashank','A-','A+','B+','A');

insert into Student values('CS02','Pramod','A','A+','B+','A');

insert into Student values('CS03','Pavan','A+','A+','A+','A+');

insert into Student values('CS04','Pradeep','A-','A','A','A');

insert into Student values('CS05','Amulya','A+','A+','B+','B');

insert into Student values('CS06','Anila','A-','A','B','B-');

insert into Student values('CS07','Ananya','A','B+','B+','B');

insert into Student values('CS08','Ashmith','A-','A+','B+','A');

insert into Student values('CS09','Krithansh','A+','A','B+','B');

insert into Student values('12345','Kaira','B','C','A','A+');

Select * from Student;

CourseD Table:

insert into CourseD values('Sub1',4);

insert into CourseD values('Sub2',3);

insert into CourseD values('Sub3',2);

insert into CourseD values('Sub4',4);

Select * from CourseD;

Answer:

a) select (g1.Points * c1.MaxCredits) + (g2.Points * c2.MaxCredits)
+ (g3.Points * c3.MaxCredits) + (g4.Points * c4.MaxCredits) as GradePoints
from Student,GradeP g1,CourseD c1,GradeP g2,CourseD c2,GradeP g3,CourseD c3,GradeP g4,CourseD c4
where ID='12345' and
Sub1G=g1.Grade and c1.CName='Sub1' and
Sub2G=g2.Grade and c2.CName='Sub2' and
Sub3G=g3.Grade and c3.CName='Sub3' and
Sub4G=g4.Grade and c4.CName='Sub4';

b) select ((g1.Points * c1.MaxCredits) + (g2.Points * c2.MaxCredits)
+ (g3.Points * c3.MaxCredits) + (g4.Points * c4.MaxCredits))
/ (c1.MaxCredits + c2.MaxCredits + c3.MaxCredits + c4.MaxCredits) as AGPA
from Student,GradeP g1,CourseD c1,GradeP g2,CourseD c2,GradeP g3,CourseD c3,GradeP g4,CourseD c4
where ID='12345' and
Sub1G=g1.Grade and c1.CName='Sub1' and
Sub2G=g2.Grade and c2.CName='Sub2' and
Sub3G=g3.Grade and c3.CName='Sub3' and
Sub4G=g4.Grade and c4.CName='Sub4';

c) select ID, ((g1.Points * c1.MaxCredits) + (g2.Points * c2.MaxCredits)
+ (g3.Points * c3.MaxCredits) + (g4.Points * c4.MaxCredits))
/ (c1.MaxCredits + c2.MaxCredits + c3.MaxCredits + c4.MaxCredits) as AGPA
from Student,GradeP g1,CourseD c1,GradeP g2,CourseD c2,GradeP g3,CourseD c3,GradeP g4,CourseD c4
where Sub1G=g1.Grade and c1.CName='Sub1' and
Sub2G=g2.Grade and c2.CName='Sub2' and
Sub3G=g3.Grade and c3.CName='Sub3' and
Sub4G=g4.Grade and c4.CName='Sub4';
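
As a cross-check on the arithmetic, the sketch below recomputes the total grade points and GPA for student 12345 using the problem statement's definition (credits multiplied by points). It is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than MySQL, and it uses a hypothetical Takes table (one row per student and course) in place of the four grade columns, so that SUM and GROUP BY can replace the eight-way join.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL University database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GradeP (Grade TEXT PRIMARY KEY, Points REAL);
CREATE TABLE CourseD (CName TEXT PRIMARY KEY, MaxCredits INTEGER);
-- Hypothetical normalized table (not in the record): one row per student and course.
CREATE TABLE Takes (ID TEXT, CName TEXT, Grade TEXT);

INSERT INTO GradeP VALUES ('A+', 5), ('A', 4), ('B', 3), ('C', 2);
INSERT INTO CourseD VALUES ('Sub1', 4), ('Sub2', 3), ('Sub3', 2), ('Sub4', 4);
-- Grades of student 12345 as inserted in the record: B, C, A, A+.
INSERT INTO Takes VALUES
  ('12345', 'Sub1', 'B'), ('12345', 'Sub2', 'C'),
  ('12345', 'Sub3', 'A'), ('12345', 'Sub4', 'A+');
""")

# Grade points per course = points * credits; GPA = total points / total credits.
student_id, total_points, gpa = conn.execute("""
SELECT t.ID,
       SUM(g.Points * c.MaxCredits) AS TotalPoints,
       SUM(g.Points * c.MaxCredits) / SUM(c.MaxCredits) AS GPA
FROM Takes t
JOIN GradeP g ON g.Grade = t.Grade
JOIN CourseD c ON c.CName = t.CName
GROUP BY t.ID
""").fetchone()

print(student_id, total_points, round(gpa, 3))  # 12345 46.0 3.538
```

By hand: 3*4 + 2*3 + 4*2 + 5*4 = 46 grade points over 4 + 3 + 2 + 4 = 13 credits, which gives a GPA of about 3.54.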

Result:

Student database created successfully and queries executed.

Ex. No.: 2 BASIC SQL

Date: 13.01.17 Insurance Database

Database Schema for Insurance scenario

person (driver_id, name, address)

car (license, model, year)

accident (report_number, date, location)

owns (driver_id, license)

participated (report_number, license, driver_id, damage_amount)

For the above schema, perform the following –

a. Find the total number of people who owned cars that were involved in accidents in 2009.

b. Add a new accident to the database; assume any values for required attributes.

c. Delete the Mazda belonging to “John Smith”.

Aim

To create an insurance database and write SQL queries for different operations using basic SQL commands.

Theory

SQL

SQL is a standard language for accessing and manipulating databases.

• SQL stands for Structured Query Language
• SQL lets you access and manipulate databases
• SQL is an ANSI (American National Standards Institute) standard
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views

SQL CREATE TABLE

The CREATE TABLE statement is used to create a new table in a database.


Syntax -
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
...
);

SQL INSERT INTO

The INSERT INTO statement is used to insert new records in a table.


Syntax -
INSERT INTO table_name VALUES (value1, value2, value3, ...);

SQL SELECT

The SELECT statement is used to select data from a database.


The data returned is stored in a result table, called the result-set.
Syntax –
SELECT column1, column2, ... FROM table_name;
To select whole table –
SELECT * FROM table_name;

SQL COUNT() Function

The COUNT() function returns the number of rows that match the specified criteria.

Syntax -

SELECT COUNT(*) FROM table_name;

SELECT COUNT(DISTINCT column_name) FROM table_name;

SQL MIN() and MAX() Functions

The MIN() function returns the smallest value of the selected column.

The MAX() function returns the largest value of the selected column.

Syntax –

SELECT MIN(column_name) FROM table_name WHERE condition;

SELECT MAX(column_name) FROM table_name WHERE condition;

SQL AVG() Function

The AVG() function returns the average value of a numeric column.

Syntax –

SELECT AVG(column_name) FROM table_name WHERE condition;

SQL SUM() Function

The SUM() function returns the total sum of a numeric column.

Syntax –

SELECT SUM(column_name) FROM table_name WHERE condition;
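
To make the aggregate functions concrete, the sketch below applies COUNT, MIN, MAX, AVG and SUM to the ten damage amounts that this exercise later inserts into the Participated table. It is an illustration only: it uses SQLite via Python's sqlite3 module as a stand-in for MySQL, and the table is cut down to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the Participated table: report number and damage amount only.
conn.execute("CREATE TABLE Participated (ReportNo INTEGER, DamAmount INTEGER)")

# Damage amounts for reports 1..10, copied from the inserts in this exercise.
amounts = [20000, 10000, 15000, 25000, 2000, 20000, 20000, 20000, 20000, 20000]
conn.executemany(
    "INSERT INTO Participated VALUES (?, ?)",
    list(enumerate(amounts, start=1)),
)

# All aggregates are evaluated over the same set of rows in one pass.
stats = conn.execute("""
SELECT COUNT(*), COUNT(DISTINCT DamAmount),
       MIN(DamAmount), MAX(DamAmount), AVG(DamAmount), SUM(DamAmount)
FROM Participated
""").fetchone()

print(stats)  # (10, 5, 2000, 25000, 17200.0, 172000)
```

COUNT(*) counts every row, while COUNT(DISTINCT DamAmount) counts each repeated amount once, which is why the two values differ here.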

Program/Procedure

Database Creation:

create database InsurDB;

use InsurDB;

Table Creation:

Person:

create table Person(Driver_ID varchar(20),Name varchar(30),Address varchar(50), Primary Key (Driver_ID));

Car:

create table Car(License varchar(20),Model Varchar(20),Year int ,Primary Key (License));

Acc:

create table Acc(ReportNo int ,Dat date,Loc Varchar(20),Primary Key (ReportNo));

Owns:

create table Owns(Driver_ID varchar(20),License varchar(20),foreign key (Driver_ID) references Person(Driver_ID) on delete cascade,foreign key (License) references Car(License) on delete cascade);

Participated:

create table Participated(ReportNo int,license varchar(20),Driver_ID varchar(20),DamAmount int, foreign key(ReportNo) references Acc(ReportNo) on delete cascade,foreign key(license) references Car(License) on delete cascade,foreign key(Driver_ID) references Person(Driver_ID) on delete cascade);

alter table Participated add foreign key(ReportNo) references Acc(ReportNo);

alter table Participated add foreign key(license) references Car(License);

alter table Participated add foreign key(Driver_ID) references Person(Driver_ID);

Values Insertion:

Person Table :

insert into Person values('Driv01','AnuDeep','Hyderabad');

insert into Person values('Driv02','Gagan','Banglore');

insert into Person values('Driv03','Yash','Chennai');

insert into Person values('Driv04','Rishi','Karaikal');

insert into Person values('Driv05','Pradeep','Puducherry');

insert into Person values('Driv06','Ram','Warangal');

insert into Person values('Driv07','Robert','Khammam');

insert into Person values('Driv08','Vansh','Nasik');

insert into Person values('Driv09','Yug','Lucknow');

insert into Person values('Driv10','Pratheek','Mumbai');

insert into Person values('Driv11','Pavan','Tirchy');

insert into Person values('Driv12','Pramod','Jangaon');

insert into Person values('Driv13','Shashank','Ranga Reddy');

insert into Person values('Driv14','Praneeth','Medak');

insert into Person values('Driv15','Pranav','Tirupati');

insert into Person values ('Driv20','John Smith','Medak');

select * from Person;

Car Table :

insert into Car values('Mar1001','Maruthi',2009);

insert into Car values('San01','Santro',2011);

insert into Car values('Bmw1010','Bmw',2013);

insert into Car values('Audi007','Audi',2015);

insert into Car values('Vista022','Vista',2009);

insert into Car values('Toy202','Toyota',2016);

insert into Car values('Shi111','Shift',2008);

insert into Car values('Hun103','Hundai',2010);

insert into Car values('Alt126','Alto',2009);

insert into Car values('Hon225','Honda',2013);

insert into Car values('Mazda','Mazda',2010);

Select * from Car;

Acc Table:

insert into Acc values(01,'2009-07-20','Hyderabad');

insert into Acc values(02,'2009-08-18','Chennai');

insert into Acc values(03,'2008-09-19','Banglore');

insert into Acc values(04,'2010-06-06','Hyderabad');

insert into Acc values(05,'2013-07-27','Eluru');

insert into Acc values(06,'2014-12-23','Lucknow');

insert into Acc values(07,'2015-02-28','Warangal');

insert into Acc values(08,'2013-02-22','Nasik');

insert into Acc values(09,'2011-05-05','Banglore');

insert into Acc values(10,'2016-08-08','Karaikal');

select * from Acc;

Owns Table:

insert into Owns values('Driv10','Mar1001');

insert into Owns values('Driv03','San01');

insert into Owns values('Driv05','Toy202');

insert into Owns values('Driv09','Shi111');

insert into Owns values('Driv08','Hun103');

insert into Owns values('Driv01','Alt126');

insert into Owns values('Driv07','Hon225');

insert into Owns values('Driv02','Bmw1010');

insert into Owns values('Driv04','Audi007');

insert into Owns values('Driv06','Vista022');

Select * from Owns;

Participated Table :

insert into Participated Values(01,'San01','Driv03',20000);

insert into Participated Values(02,'Mar1001','Driv10',10000);

insert into Participated Values(03,'Toy202','Driv05',15000);

insert into Participated Values(04,'Shi111','Driv09',25000);

insert into Participated Values(05,'Hun103','Driv08',2000);

insert into Participated Values(06,'Alt126','Driv01',20000);

insert into Participated Values(07,'Hon225','Driv07',20000);

insert into Participated Values(08,'Bmw1010','Driv02',20000);

insert into Participated Values(09,'Audi007','Driv04',20000);

insert into Participated Values(10,'Vista022','Driv06',20000);

Select * from Participated;

Answer :

a) Select count(distinct Owns.Driver_ID) as People from Acc,Participated,Owns where year(Acc.Dat)=2009 and
Participated.ReportNo=Acc.ReportNo and Owns.License=Participated.license;
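
The question asks for owners, so the count has to go through the Owns table: accident, then participating car, then that car's owner. The sketch below illustrates this join on the first three accidents and their cars from the record. It is an illustration only, run on SQLite via Python's sqlite3 module; SQLite has no year() function, so strftime('%Y', ...) is used in its place as an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Acc (ReportNo INTEGER PRIMARY KEY, Dat TEXT, Loc TEXT);
CREATE TABLE Owns (Driver_ID TEXT, License TEXT);
CREATE TABLE Participated (ReportNo INTEGER, License TEXT,
                           Driver_ID TEXT, DamAmount INTEGER);

-- First three accidents from the record: two in 2009, one in 2008.
INSERT INTO Acc VALUES (1, '2009-07-20', 'Hyderabad'),
                       (2, '2009-08-18', 'Chennai'),
                       (3, '2008-09-19', 'Banglore');
INSERT INTO Owns VALUES ('Driv03', 'San01'), ('Driv10', 'Mar1001'),
                        ('Driv05', 'Toy202');
INSERT INTO Participated VALUES (1, 'San01', 'Driv03', 20000),
                                (2, 'Mar1001', 'Driv10', 10000),
                                (3, 'Toy202', 'Driv05', 15000);
""")

# Count each owner once, even if several of their cars had accidents in 2009.
owners_2009 = conn.execute("""
SELECT COUNT(DISTINCT o.Driver_ID)
FROM Acc a
JOIN Participated p ON p.ReportNo = a.ReportNo
JOIN Owns o ON o.License = p.License
WHERE strftime('%Y', a.Dat) = '2009'
""").fetchone()[0]

print(owners_2009)  # 2
```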

b) Adding New Accident

insert into Acc Values(11,'2015-06-23','chennai');

Select * from Acc;

c) Deleting the Mazda belonging to "John Smith"

delete from Car where Model='Mazda' and license in (select License from Owns where Driver_ID in(select
Driver_ID from Person where Name='John Smith'));

select * from Car;
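
The nested-subquery delete only removes a car when an Owns row links it to John Smith. The sample inserts above contain no such row for 'Driv20', so the sketch below adds one as an explicit assumption to show the statement taking effect. It is an illustration on SQLite via Python's sqlite3 module, with the tables reduced to a few rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (Driver_ID TEXT PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Car (License TEXT PRIMARY KEY, Model TEXT, Year INTEGER);
CREATE TABLE Owns (Driver_ID TEXT, License TEXT);

INSERT INTO Person VALUES ('Driv20', 'John Smith', 'Medak'),
                          ('Driv10', 'Pratheek', 'Mumbai');
INSERT INTO Car VALUES ('Mazda', 'Mazda', 2010), ('Mar1001', 'Maruthi', 2009);
INSERT INTO Owns VALUES ('Driv10', 'Mar1001');
-- Assumed row (not in the record): John Smith owns the Mazda.
INSERT INTO Owns VALUES ('Driv20', 'Mazda');
""")

# Innermost query finds John Smith's driver id, the middle one his licenses,
# and the outer DELETE removes only those of his cars whose model is Mazda.
cur = conn.execute("""
DELETE FROM Car
WHERE Model = 'Mazda'
  AND License IN (SELECT License FROM Owns
                  WHERE Driver_ID IN (SELECT Driver_ID FROM Person
                                      WHERE Name = 'John Smith'))
""")
deleted = cur.rowcount
remaining = [row[0] for row in conn.execute("SELECT License FROM Car ORDER BY License")]

print(deleted, remaining)  # 1 ['Mar1001']
```

Without the assumed Owns row the subquery returns no licenses and the DELETE affects zero rows.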

Result

Insurance database created successfully; queries executed and results obtained.

Ex. No.: 3 BASIC SQL

Date: 20.01.17 Computer Service Database -Sequence

Database Schema for Computer service Database

Employee (ename, eno, center_id, address, gen_workhours, working_hours, designation, salary)
Customer (cname, cu_id, address, join_date, phone_no)
Center (cename, center_id, location_address, no_of_employees)
Services (pname, it_id, center_id, service_date, scharge)
Inventory (iname, it_id, idate, quantity, cost)
Purchase (pname, pid, owner, pdate, center_id, it_id, price)
Billing (invoice_num, cu_id, pname, pid, paid_amt, service_chrg, cus_bal)

Database Question

The SHROY computer service center is owned by the SPARK computer dealer; SHROY
services only SPARK computers. Five SHROY computer service centers provide servicing and assembly for
the entire state. Each center is independently managed by a shop manager, a receptionist, and at least ten
service engineers. Each center also maintains the history of services performed, parts used, costs, service
dates, owners, etc. There is also a need to keep track of inventory, purchasing, billing, employees’ hours,
and payroll.

CREATING SEQUENCE
Write a query that will automatically produce ‘customer id’ and ‘invoice number’ as sequences in the
customer and billing relations, respectively.

Aim:
To create a computer service database and execute SQL queries to achieve the desired results.

Theory

SQL
SQL is a standard language for accessing and manipulating databases.
• SQL stands for Structured Query Language
• SQL lets you access and manipulate databases
• SQL is an ANSI (American National Standards Institute) standard
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views

SQL CREATE TABLE

The CREATE TABLE statement is used to create a new table in a database.


Syntax -
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
...
);

SQL INSERT INTO


The INSERT INTO statement is used to insert new records in a table.
Syntax -
INSERT INTO table_name VALUES (value1, value2, value3, ...);

SQL SELECT

The SELECT statement is used to select data from a database.


The data returned is stored in a result table, called the result-set.
Syntax –
SELECT column1, column2, ... FROM table_name;
To select whole table –
SELECT * FROM table_name;

SQL PRIMARY KEY

The PRIMARY KEY constraint uniquely identifies each record in a database table.
Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only one primary key, which may consist of single or multiple fields.

SQL FOREIGN KEY

A FOREIGN KEY is a key used to link two tables together.


A FOREIGN KEY in a table points to a PRIMARY KEY in another table.
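
The two constraints can be seen in action in the sketch below, which tries a duplicate primary key and a dangling foreign key and reports what the database does. It is an illustration only: it uses SQLite via Python's sqlite3 module (where foreign key checks must be switched on with a PRAGMA), the tables are cut down to a few columns, and the REFERENCES clause on employee.center_id is a hypothetical addition that the record's own employee table does not declare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE center (center_id TEXT PRIMARY KEY, cename TEXT);
-- Hypothetical foreign key: each employee must belong to an existing center.
CREATE TABLE employee (
    Eno INTEGER PRIMARY KEY,
    Ename TEXT,
    center_id TEXT REFERENCES center(center_id)
);
INSERT INTO center VALUES ('Hyd01', 'SparkHyd');
INSERT INTO employee VALUES (1, 'Pradeep', 'Hyd01');
""")


def try_insert(sql):
    """Run one INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Eno 1 already exists, so the primary key constraint rejects this row.
dup_pk = try_insert("INSERT INTO employee VALUES (1, 'Pavan', 'Hyd01')")
# No center 'ACD103' exists, so the foreign key constraint rejects this row.
bad_fk = try_insert("INSERT INTO employee VALUES (2, 'Kumar', 'ACD103')")
# A new Eno pointing at an existing center is accepted.
good = try_insert("INSERT INTO employee VALUES (2, 'Kumar', 'Hyd01')")

print(dup_pk)
print(bad_fk)
print(good)
```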

Program/Procedure

Database Creation:

create database shroy_db;

use shroy_db;

Table Creation:

Center:

create table center (center_id varchar(10),cename varchar(50), location_address varchar(50), no_of_employees int, primary key(center_id));

Employee:

create table employee (Ename char(20),Eno int,center_id varchar(20),Address varchar(50),gen_workhours float,working_hours float,Designation char(20),Salary int,Primary Key(Eno));

Customer:

create table Customer(Cname Char(20),Cu_id int,Address varchar(50),Join_Date date,Phone_no bigint,primary key (Cu_id));

Inventory :

create table Inventory(Iname char(20),It_id int,Idate date,Quantity int,Cost int,Primary key(It_id));

Services:

create table Services(Pname char(20),It_id int,Center_id varchar(10),Service_date date,Scharge int);

Purchase :

Create table Purchase(Pname char(20),Pid varchar(20),Owner char(30),Pdate date,Center_id varchar(20), Primary Key(Pid));

Billing:

create table Billing(Invoice_num int,Cu_id varchar(10),Pname char(20),Pid varchar(20),Paid_amt int,Service_Chrg int,Cus_Bal int,primary key (Invoice_num));

Values Insertion:

Center Table:

insert into center values ('Hyd01','SparkHyd','Hyderabad',15);

insert into center values ('Mum10','SparkMum','Mumbai',18);

insert into center values ('Che19','SparkChe','Chennai',13);

select * from center;

Employee Table:

insert into Employee Values('Pradeep',1,'ACD102','Hyderabad',8,8.5,'Manager',50000);

insert into Employee Values('Pavan',2,'ACD101','Hyderabad',8,8,'Service Engineer',30000);

insert into Employee Values('Kumar',3,'ACD103','Hyderabad',8,8,'Service Engineer',30000);

insert into Employee Values('Anjali',4,'ACD110','Hyderabad',8,8,'Receptionist',20000);

insert into Employee Values('Deepak',5,'ACD105','Hyderabad',8,8,'Service Engineer',30000);

select * from Employee;

Customer Table:

insert into Customer values('Pallavi','201','Chennai','2015-02-21',9447568491);

insert into Customer values('Pavani','203','Banglore','2012-01-12',9557568495);

insert into Customer values('Priya','102','Hyderabad','2013-07-17',9927568492);

insert into Customer values('santhosh','100','Bombai','2014-08-19',9907568499);

insert into Customer values('divya','001','Warangal','2016-02-20',7847568478);

select * from Customer;

Inventory Table:

insert into Inventory values('Hyd','2','2015-06-21',6,75000);

insert into Inventory values('Che','3','2014-06-05',5,5000);

insert into Inventory values('Mum','4','2013-03-15',4,15000);

insert into Inventory values('Wgl','5','2010-06-12',3,51000);

select * from Inventory;

Billing Table:

insert into Billing values(12,'Cu100','Karan','Per12',25000,500,25500);

select * from Billing;

Services Table:

insert into Services values('Deepak','20','HYD01','2016-02-21','300');

insert into Services values('Vinay','21','MUM10','2015-06-06','300');

insert into Services values('Vikranth','2','HYD01','2016-05-02','300');

insert into Services values('Dhivyank','3','MUM10','2014-01-16','300');

insert into Services values('Dhruvansh','5','CHE19','2010-09-10','300');

select * from Services;

Purchase Table :

insert into Purchase values('Karan','Per12','Pratheek','2016-07-21','HYD01');

insert into Purchase values('Kiran','Pur3','Pranay','2016-05-2','MUM06');

insert into Purchase values('Kamal','Pel1','Pavan','2016-02-22','CHE19');

select * from Purchase;

Answer:
a) alter table Billing modify Invoice_num int auto_increment;

insert into Billing (Cu_id, Pname, Pid, Paid_amt, Service_Chrg, Cus_Bal) values ('Cu20', 'Pandu', 'P000', 2000, 100, 2100);

select * from Billing;

b)

alter table Customer modify Cu_id int auto_increment;

insert into Customer(Cname, Address, Join_Date, Phone_no) Values ('Vidya', 'Bombai', '2015-07-07', 9685321467);

select * from Customer;
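
The effect of an auto-incrementing key can be illustrated as follows: once the column generates its own values, an insert that omits the key receives the next number after the largest one already present. The sketch is an analogue rather than the MySQL statement itself: it runs on SQLite via Python's sqlite3 module, where the equivalent of auto_increment is declared as INTEGER PRIMARY KEY AUTOINCREMENT at table creation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite analogue of MySQL's "Invoice_num int auto_increment" primary key.
conn.execute("""
CREATE TABLE Billing (
    Invoice_num INTEGER PRIMARY KEY AUTOINCREMENT,
    Cu_id TEXT, Pname TEXT, Pid TEXT,
    Paid_amt INTEGER, Service_Chrg INTEGER, Cus_Bal INTEGER
)
""")

# Existing row from the record, inserted with an explicit invoice number.
conn.execute(
    "INSERT INTO Billing VALUES (12, 'Cu100', 'Karan', 'Per12', 25000, 500, 25500)"
)

# New row without Invoice_num: the database assigns the next value.
cur = conn.execute("""
INSERT INTO Billing (Cu_id, Pname, Pid, Paid_amt, Service_Chrg, Cus_Bal)
VALUES ('Cu20', 'Pandu', 'P000', 2000, 100, 2100)
""")
new_invoice = cur.lastrowid

print(new_invoice)  # 13
```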

Result:

Queries were executed successfully and required results were obtained.

Ex. No.: 4 BASIC SQL

Date: 27.01.17 Computer Service Database -Constraints

Database Schema for Computer Service scenario

Employee (ename, eno, center_id, address, gen_workhours, working_hours, designation, salary)
Customer (cname, cu_id, address, join_date, phone_no)
Center (cename, center_id, location_address, no_of_employees)
Services (pname, it_id, center_id, service_date, scharge)
Inventory (iname, it_id, idate, quantity, cost)
Purchase (pname, pid, owner, pdate, center_id, it_id, price)
Billing (invoice_num, cu_id, pname, pid, paid_amt, service_chrg, cus_bal)

Database Question

The SHROY computer service center is owned by the SPARK computer dealer; SHROY
services only SPARK computers. Five SHROY computer service centers provide servicing and assembly for
the entire state. Each center is independently managed by a shop manager, a receptionist, and at least ten
service engineers. Each center also maintains the history of services performed, parts used, costs, service
dates, owners, etc. There is also a need to keep track of inventory, purchasing, billing, employees’ hours,
and payroll.

Creating Constraints
Create a table for the center relation that automatically assigns a default value (say ‘20’) to the
attribute ‘no_of_employees’ in every tuple where no value is supplied.

Aim:
To create a computer service database and execute SQL queries to achieve the desired results.

Theory
SQL
SQL is a standard language for accessing and manipulating databases.

• SQL stands for Structured Query Language
• SQL lets you access and manipulate databases
• SQL is an ANSI (American National Standards Institute) standard
• SQL can execute queries against a database
• SQL can retrieve data from a database
• SQL can insert records in a database
• SQL can update records in a database
• SQL can delete records from a database
• SQL can create new databases
• SQL can create new tables in a database
• SQL can create stored procedures in a database
• SQL can create views in a database
• SQL can set permissions on tables, procedures, and views

SQL CREATE TABLE

The CREATE TABLE statement is used to create a new table in a database.


Syntax -
CREATE TABLE table_name (
column1 datatype,
column2 datatype,
column3 datatype,
...
);

SQL INSERT INTO


The INSERT INTO statement is used to insert new records in a table.
Syntax -
INSERT INTO table_name VALUES (value1, value2, value3, ...);

SQL SELECT

The SELECT statement is used to select data from a database.


The data returned is stored in a result table, called the result-set.
Syntax –
SELECT column1, column2, ... FROM table_name;
To select whole table –
SELECT * FROM table_name;

SQL PRIMARY KEY

The PRIMARY KEY constraint uniquely identifies each record in a database table.
Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only one primary key, which may consist of single or multiple fields.

SQL FOREIGN KEY

A FOREIGN KEY is a key used to link two tables together.


A FOREIGN KEY in a table points to a PRIMARY KEY in another table.

Program/Procedure
Database Creation:

create database shroy_db;

use shroy_db;

Table Creation:

Center:

create table center (center_id varchar(10),cename varchar(50), location_address varchar(50), no_of_employees int, primary key(center_id));

Employee:

create table employee (Ename char(20),Eno int,center_id varchar(20),Address varchar(50),gen_workhours float,working_hours float,Designation char(20),Salary int,Primary Key(Eno));

Customer:

create table Customer(Cname Char(20),Cu_id int,Address varchar(50),Join_Date date,Phone_no bigint,primary key (Cu_id));

Inventory :

create table Inventory(Iname char(20),It_id int,Idate date,Quantity int,Cost int,Primary key(It_id));

Services:

create table Services(Pname char(20),It_id int,Center_id varchar(10),Service_date date,Scharge int);

Purchase :

Create table Purchase(Pname char(20),Pid varchar(20),Owner char(30),Pdate date,Center_id varchar(20), Primary Key(Pid));

Billing:

create table Billing(Invoice_num int,Cu_id varchar(10),Pname char(20),Pid varchar(20),Paid_amt int,Service_Chrg int,Cus_Bal int,primary key (Invoice_num));

Values Insertion:

Center Table:

insert into center values ('Hyd01','SparkHyd','Hyderabad',15);

insert into center values ('Mum10','SparkMum','Mumbai',18);

insert into center values ('Che19','SparkChe','Chennai',13);

select * from center;

Employee Table:

insert into Employee Values('Pradeep',1,'ACD102','Hyderabad',8,8.5,'Manager',50000);

insert into Employee Values('Pavan',2,'ACD101','Hyderabad',8,8,'Service Engineer',30000);

insert into Employee Values('Kumar',3,'ACD103','Hyderabad',8,8,'Service Engineer',30000);

insert into Employee Values('Anjali',4,'ACD110','Hyderabad',8,8,'Receptionist',20000);

insert into Employee Values('Deepak',5,'ACD105','Hyderabad',8,8,'Service Engineer',30000);

select * from Employee;

Customer Table:

insert into Customer values('Pallavi','201','Chennai','2015-02-21',9447568491);

insert into Customer values('Pavani','203','Banglore','2012-01-12',9557568495);

insert into Customer values('Priya','102','Hyderabad','2013-07-17',9927568492);

insert into Customer values('santhosh','100','Bombai','2014-08-19',9907568499);

insert into Customer values('divya','001','Warangal','2016-02-20',7847568478);

select * from Customer;

Inventory Table:

insert into Inventory values('Hyd','2','2015-06-21',6,75000);

insert into Inventory values('Che','3','2014-06-05',5,5000);

insert into Inventory values('Mum','4','2013-03-15',4,15000);

insert into Inventory values('Wgl','5','2010-06-12',3,51000);

select * from Inventory;

Billing Table:

insert into Billing values(12,'Cu100','Karan','Per12',25000,500,25500);

select * from Billing;

Services Table:

insert into Services values('Deepak','20','HYD01','2016-02-21','300');

insert into Services values('Vinay','21','MUM10','2015-06-06','300');

insert into Services values('Vikranth','2','HYD01','2016-05-02','300');

insert into Services values('Dhivyank','3','MUM10','2014-01-16','300');

insert into Services values('Dhruvansh','5','CHE19','2010-09-10','300');

select * from Services;

Purchase Table :

insert into Purchase values('Karan','Per12','Pratheek','2016-07-21','HYD01');

insert into Purchase values('Kiran','Pur3','Pranay','2016-05-2','MUM06');

insert into Purchase values('Kamal','Pel1','Pavan','2016-02-22','CHE19');

select * from Purchase;

Answer :

create table Centerr(Cename char(20),Center_id varchar(10),Location_Address varchar(30),No_of_employees int DEFAULT 20,primary key (Center_id));

insert into Centerr (Cename,Center_id,Location_Address) values('Deepak','HYD01','Hyderabad');

Select * from Centerr;
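
The behaviour of the DEFAULT clause can be sketched as follows: the default applies only when the insert leaves the column out, and an explicitly supplied value still wins. This is an illustration on SQLite via Python's sqlite3 module rather than MySQL; the second row reuses the 'Mum10' center values from the earlier inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Centerr (
    Cename TEXT,
    Center_id TEXT PRIMARY KEY,
    Location_Address TEXT,
    No_of_employees INTEGER DEFAULT 20
)
""")

# Column omitted: the database fills in the default value 20.
conn.execute("""
INSERT INTO Centerr (Cename, Center_id, Location_Address)
VALUES ('Deepak', 'HYD01', 'Hyderabad')
""")
# Column supplied: the explicit value 18 overrides the default.
conn.execute("INSERT INTO Centerr VALUES ('SparkMum', 'Mum10', 'Mumbai', 18)")

rows = conn.execute(
    "SELECT Center_id, No_of_employees FROM Centerr ORDER BY Center_id"
).fetchall()

print(rows)  # [('HYD01', 20), ('Mum10', 18)]
```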

Result:

Database created and queries were executed successfully.

Ex. No.: 5 INTERMEDIATE SQL

Date: 03.02.17 Online Store Database – Views

Database Schema for Online Store scenario

Product(BarCode, PName, Price, QuantityInStock)

Customer (CustomerID, Phone, Email, FirstName, LastName)

Sale (SaleID, DeliveryAddress, CreditCard, CustomerID)

SaleItem (SaleID, BarCode, Quantity)

For the above schema, perform the following

1. List all products, with barcode and name, and the sale id and quantity for the sales containing
that product, if any (products that were never sold should still be listed in your result). Hint: use a left join.
2. Create a view called AllProductsSales based on the query in Q.No.1.
3. Write a query against the AllProductsSales view as if it were a table:
select everything from AllProductsSales, ordered by BarCode and SaleID.
5. Write a query on the ProductProfit view to show the top 5 products based on total profit, in order
from highest profit to lowest profit.

Aim:

To create an Online Store Database and execute SQL queries to achieve the desired results.

Theory

SQL VIEWS

In SQL, a view is a virtual table based on the result-set of an SQL statement.

A view contains rows and columns, just like a real table. The fields in a view are fields from one or more
real tables in the database.

You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data
were coming from one single table.

CREATE VIEW

Syntax –

CREATE VIEW view_name AS

SELECT column1, column2, ... FROM table_name WHERE condition;

SQL CREATE OR REPLACE VIEW

Syntax –

CREATE OR REPLACE VIEW view_name AS

SELECT column1, column2, ... FROM table_name WHERE condition;

SQL DROP VIEW

You can delete a view with the DROP VIEW command.

Syntax –

DROP VIEW view_name;
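
The view statements above can be tried out on a small scale as sketched below: a view is created over a left join of products and sale items, queried like a table, and then dropped. This is an illustration only, run on SQLite via Python's sqlite3 module. The three products come from this exercise's inserts, while the saleitem rows are hypothetical values invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (barcode INTEGER PRIMARY KEY, pname TEXT,
                      price REAL, quantityInStock INTEGER);
CREATE TABLE saleitem (sid INTEGER, barcode INTEGER, quantity INTEGER);

INSERT INTO product VALUES (123, 'phone', 5000, 19),
                           (125, 'camera', 10000, 48),
                           (128, 'speakers', 1000, 96);
-- Hypothetical sale items: the speakers (128) are never sold.
INSERT INTO saleitem VALUES (1, 123, 2), (2, 123, 1), (1, 125, 1);

-- LEFT JOIN keeps every product, with NULLs where no sale item matches.
CREATE VIEW AllProductsSales AS
SELECT p.barcode, p.pname, s.sid, s.quantity
FROM product p
LEFT JOIN saleitem s ON s.barcode = p.barcode;
""")

# The view is queried exactly like a table.
rows = conn.execute(
    "SELECT * FROM AllProductsSales ORDER BY barcode, sid"
).fetchall()
for row in rows:
    print(row)

# Dropping the view removes only the stored query, not the underlying data.
conn.execute("DROP VIEW AllProductsSales")
products_left = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

The unsold product appears once in the result with NULL (None in Python) for the sale id and quantity.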

DDL

Data Definition Language (DDL) statements are used to define the database structure or schema. Some
examples:

CREATE - to create objects in the database

ALTER - alters the structure of the database

DROP - delete objects from the database

TRUNCATE - remove all records from a table and release all the space allocated for those records

COMMENT - add comments to the data dictionary

RENAME - rename an object

DML

Data Manipulation Language (DML) statements are used for managing data within schema objects. Some
examples:

SELECT - retrieve data from a database

INSERT - insert data into a table

UPDATE - updates existing data within a table

DELETE - deletes records from a table (all of them if no WHERE clause is given); the space for the records remains

MERGE - UPSERT operation (insert or update)

CALL - call a PL/SQL or Java subprogram

EXPLAIN PLAN - explain access path to data

LOCK TABLE - control concurrency

DCL

Data Control Language (DCL) statements are used to control access to the database. Some examples:

GRANT - gives users access privileges to the database

REVOKE - withdraw access privileges given with the GRANT command

TCL

Transaction Control (TCL) statements are used to manage the changes made by DML statements. They allow
statements to be grouped together into logical transactions.

COMMIT - save work done

SAVEPOINT - identify a point in a transaction to which you can later roll back

ROLLBACK - restore the database to its state as of the last COMMIT

SET TRANSACTION - change transaction options such as the isolation level and which rollback segment to use

Program/Procedure

Database Creation

create database online_store;

use online_store;

Table Creation:

Product :

create table product(barcode int, primary key(barcode), pname varchar(50),price float, quantityInStock int);

Customer :

create table customer (cid int ,primary key(cid),phone varchar(20),email varchar(50), fname varchar(50), lname
varchar(50));

Sale :

create table sale(sid int, primary key(sid), deliveryAddress varchar(50), credit_card varchar(50), cid int, foreign
key(cid) references customer(cid));

Saleitem :

create table saleitem (sid int, barcode int,quantity int,foreign key(sid) references sale(sid), foreign key(barcode) references
product(barcode));

Values Insertion:

Product Table :

insert into product values(123,'phone',5000,19);

insert into product values(125,'camera',10000,48);

insert into product values(128,'speakers',1000,96);

insert into product values(129,'keyboard',1000,150);

insert into product values(130,'pendrive',200,50);

insert into product values(131,'hdd',4000,20);

select * from product;

Customer Table :

insert into customer values (1,8542569365, '[email protected]', 'john', 'smith');

insert into customer values (2,1236547896, '[email protected]', 'kate', 'Austen');

insert into customer values (3,1475869365, '[email protected]', 'Abraham', 'Wilson');

select * from customer;

Sale Table :

insert into sale values(1,'24,MidTown',563214523698,1);

insert into sale values(2,'79, Central Street', 854536963212, 2);

insert into sale values(3,'DownTown',785841412363,3);

insert into sale values(4,'DownTown',785841412363,3);

insert into sale values(5,'DownTown',785841412363,3);

insert into sale values(6,'24,MidTown',563214523698,1);

select * from sale;

Saleitem Table :

insert into saleitem values(1,123,1);

insert into saleitem values(2,128,2);

insert into saleitem values(3,125,1);

insert into saleitem values(4,129,1);

insert into saleitem values(5,130,1);

insert into saleitem values(6,131,1);

select * from saleitem;

Answers

a) select product.barcode,product.pname,quantityInStock,sid,quantity from product natural left join saleitem;

b) create view allproduct_sales as

select product.barcode,product.pname,quantityInStock,sid,quantity from product natural left join saleitem;

c) select * from allproduct_sales order by barcode,sid;

d) create view productprofit as

select product.barcode,pname,((price*.10)*saleitem.quantity) as total_profit from product natural left join saleitem;

select barcode,pname,total_profit from productprofit order by total_profit desc limit 5;

Result:

Queries were executed successfully and required results were obtained.

Ex. No.: 6 ADVANCED SQL

Date: 10.02.17 Online Store Database - Triggers

The following shows Online Store database:

Product(BarCode, PName, Price, QuantityInStock)


Customer (CustomerID, Phone, Email, FirstName, LastName)
Sale (SaleID, DeliveryAddress, CreditCard, CustomerID)
SaleItem (SaleID, BarCode, Quantity)

Write MySQL queries to accomplish the following tasks.


Create a trigger called updateAvailableQuantity that updates the quantity in stock in the Product
table, for every product sold. The trigger should be executed after each insert operation on the
SaleItem table: for the product with the given barcode (the one inserted into SaleItem), update the available
quantity in the Product table to be the old quantity minus the sold quantity.

Aim:
To create an Online Store Database and execute SQL queries to achieve the desired results.

Theory:
Triggers
Triggers are stored programs, which are automatically executed or fired when some events occur. Triggers
are, in fact, written to be executed in response to any of the following events −

A database manipulation (DML) statement (DELETE, INSERT, or UPDATE)

A database definition (DDL) statement (CREATE, ALTER, or DROP).

A database operation (SERVERERROR, LOGON, LOGOFF, STARTUP, or SHUTDOWN).

Triggers can be defined on the table, view, schema, or database with which the event is associated.

Triggers can be written for the following purposes −

 Generating some derived column values automatically


 Enforcing referential integrity
 Event logging and storing information on table access
 Auditing
 Synchronous replication of tables
 Imposing security authorizations
 Preventing invalid transactions

The syntax for creating a trigger (shown here in Oracle PL/SQL form) is −

CREATE [OR REPLACE ] TRIGGER trigger_name


{BEFORE | AFTER | INSTEAD OF }
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
DECLARE
Declaration-statements
BEGIN
Executable-statements
EXCEPTION
Exception-handling-statements
END;

Program/Procedure :

Table Creation:

Product :

create table product(barcode int, primary key(barcode), pname varchar(50),price float, quantityInStock int);

Customer :

create table customer (cid int auto_increment,primary key(cid),phone varchar(20),email varchar(50), fname
varchar(50), lname varchar(50));

Sale :

create table sale(sid int auto_increment, primary key(sid), deliveryAddress varchar(50), credit_card
varchar(50), cid int, foreign key(cid) references customer(cid));

Saleitem :

create table saleitem (sid int, barcode int,quantity int,foreign key(sid) references sale(sid), foreign
key(barcode) references product(barcode));

Values Insertion:

Product Table :

insert into product values(123,'phone',5000,19);

insert into product values(125,'camera',10000,48);

insert into product values(128,'speakers',1000,96);

insert into product values(129,'keyboard',1000,150);

insert into product values(130,'pendrive',200,50);

insert into product values(131,'hdd',4000,20);

select * from product;

Customer Table :

insert into customer (phone,email,fname,lname) values (8542569365, '[email protected]', 'john', 'smith');

insert into customer (phone,email,fname,lname) values (1236547896, '[email protected]', 'kate', 'Austen');

insert into customer (phone,email,fname,lname) values (1475869365, '[email protected]', 'Abraham', 'Wilson');

select * from customer;

Sale Table :

insert into sale(deliveryAddress,credit_card,cid) values('24,MidTown',563214523698,1);

insert into sale(deliveryAddress,credit_card,cid) values('79, Central Street', 854536963212, 2);

insert into sale(deliveryAddress,credit_card,cid) values('DownTown',785841412363,3);

insert into sale(deliveryAddress,credit_card,cid) values('DownTown',785841412363,3);

insert into sale(deliveryAddress,credit_card,cid) values('DownTown',785841412363,3);

insert into sale(deliveryAddress,credit_card,cid) values('24,MidTown',563214523698,1);

select * from sale;

Saleitem Table :

insert into saleitem values(1,123,1);

insert into saleitem values(2,128,2);

insert into saleitem values(3,125,1);

insert into saleitem values(4,129,1);

insert into saleitem values(5,130,1);

insert into saleitem values(6,131,1);

select * from saleitem;

Answer :

delimiter //

create trigger UpdateAvailableQuantity
after insert on saleitem
for each row
begin
    -- reduce the stock of the sold product by the quantity just inserted
    update product
    set quantityInStock = quantityInStock - NEW.quantity
    where barcode = NEW.barcode;
end//

delimiter ;

Result:

Ex. No.: 7 E-R DIAGRAM

Date: 17.02.17

Aim:
To draw the ER diagram for the company database.

Construct an ER diagram for a company having the following details:

 The company is organized into DEPARTMENTs. Each department has a unique name, a number and
a particular employee who manages the department. The start date for the manager is recorded.
A department may have several locations.
 A department controls a number of PROJECTs. Projects have a unique name, a number and a
single location.
 The company's EMPLOYEE name, ssno, address, salary, gender and birth date are recorded.
An employee is assigned to one department, but may work on several projects (not
necessarily controlled by her department). The number of hours per week an employee works on each
project is recorded by the immediate supervisor of the employee.
 The employee's DEPENDENTs are tracked for health insurance purposes (dependent name,
birth date, relationship to employee).

Theory:

An entity relationship model, also called an entity-relationship (ER) diagram, is a graphical
representation of entities and their relationships to each other, typically used in computing in regard to the
organization of data within databases or information systems. An entity is a piece of data: an object or
concept about which data is stored.

A relationship is how the data is shared between entities. There are three types of relationships between
entities:
1. One-to-One
One instance of an entity (A) is associated with one other instance of another entity (B). For example, in a
database of employees, each employee name (A) is associated with only one social security number (B).
2. One-to-Many
One instance of an entity (A) is associated with zero, one or many instances of another entity (B), but for
one instance of entity B there is only one instance of entity A. For example, for a company with all
employees working in one building, the building name (A) is associated with many different employees
(B), but those employees all share the same singular association with entity A.
3. Many-to-Many
One instance of an entity (A) is associated with one, zero or many instances of another entity (B), and one
instance of entity B is associated with one, zero or many instances of entity A. For example, for a company
in which all of its employees work on multiple projects, each instance of an employee (A) is associated
with many instances of a project (B), and at the same time, each instance of a project (B) has multiple
employees (A) associated with it.
ER Diagram Symbols
In an ER diagram, symbols are commonly used to represent the types of information. Most diagrams will
use the following:
Boxes represent entities.
Diamonds represent relationships.
Circles (ovals) represent attributes.
Procedure
Entities and Attributes
1. Department
 Name
 Number
 Location
Primary Key(s): Name, number
2. Project
 Name
 Number
 Location
Primary Key(s): Name, number
3. Employee
 ssnumber
 name
 address
 salary
 gender
 birthdate
Primary Key: ssnumber
4. Dependent
 Name
 Birthdate
 Relationship

Answer :

Result:

The ER diagram was drawn and the output was verified successfully.

Ex. No.: 8 NORMALIZATION

Date: 10.03.17

Question: Reduce the following to BCNF, showing all the steps involved.

 Identify any repeating groups and functional dependencies in the PATIENT relation. Show all
the intermediate steps to derive the third normal form for PATIENT.

PATIENT(Patno,Patname,Gpno,Gpname,Appdate,Consultant,Conaddr,Sample)

Patno  Patname  Gpno  Gpname    Appdate    Consultant  Conaddr     Sample
1      Rosy     10    Robinson  31.3.2016  Sreenath    Karaikal    Blood
2      Jacky    11    Ruby      30.3.2016  Russell     Puducherry  None
3      John     12    Tane      01.4.2016  Peter       Nagoor      None
4      Jane     13    Alex      31.3.2016  Sreenath    Karaikal    Blood
5      Daniel   14    Raju      31.3.2016  Sreenath    Karaikal    Sputum
6      Edward   15    July      01.4.2016  Peter       Nagoor      Blood

 Supplier(sno,sname,saddress,(partno, partdesc,(custid,custname,custaddr,quantity)))
sno -> sname,saddr
partno -> partdesc
sno,partno,custid -> quantity
sname -> sno
custid -> custname,custaddr
Suppliers supply many parts to many customers. Each customer deals with only one supplier. Supplier
names are unique. Customer names are not unique.

Aim
To perform normalization on the given database table.

Theory

1NF:
An attribute (column) of a table cannot hold multiple values. It should hold only atomic values.
2NF:
A table is said to be in 2NF if both the following conditions hold.
 Table is in 1NF
 No non-prime attribute is dependent on a proper subset of any candidate key of the table.
3NF:
A table design is said to be in 3NF if both the following conditions hold.
 Table must be in 2NF
 No non-prime attribute is transitively dependent on any super key.

BCNF:
BCNF is a stricter version of 3NF, which is why it is also referred to as 3.5NF. A table
complies with BCNF if it is in 3NF and, for every non-trivial functional dependency X->Y, X is a super key of the table.
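
The BCNF condition above can be checked mechanically by computing attribute closures. The following is a minimal sketch for illustration only and is not part of the original record: the class FdClosure and its methods (attrs, fd, closure, isSuperkey, isBcnf) are hypothetical names, and the functional dependencies are the ones stated for the Supplier relation in the question, treated as one flat relation.

```java
import java.util.*;

/** Hypothetical helper (illustration only): attribute closure and a BCNF check. */
final class FdClosure {

    /** A functional dependency lhs -> rhs over attribute names. */
    static final class Fd {
        final Set<String> lhs;
        final Set<String> rhs;
        Fd(Set<String> lhs, Set<String> rhs) { this.lhs = lhs; this.rhs = rhs; }
    }

    /** Parses a comma-separated attribute list such as "sno,partno". */
    static Set<String> attrs(String csv) {
        return new TreeSet<>(Arrays.asList(csv.split(",")));
    }

    /** Builds an FD from two comma-separated lists, e.g. fd("sno", "sname,saddr"). */
    static Fd fd(String lhs, String rhs) {
        return new Fd(attrs(lhs), attrs(rhs));
    }

    /** Returns every attribute functionally determined by x under fds. */
    static Set<String> closure(Set<String> x, List<Fd> fds) {
        Set<String> result = new TreeSet<>(x);
        boolean changed = true;
        while (changed) {                 // repeat until no FD adds a new attribute
            changed = false;
            for (Fd f : fds) {
                if (result.containsAll(f.lhs) && result.addAll(f.rhs)) {
                    changed = true;
                }
            }
        }
        return result;
    }

    /** x is a superkey if its closure covers all attributes of the relation. */
    static boolean isSuperkey(Set<String> x, Set<String> all, List<Fd> fds) {
        return closure(x, fds).containsAll(all);
    }

    /** BCNF: the left-hand side of every non-trivial FD must be a superkey. */
    static boolean isBcnf(Set<String> all, List<Fd> fds) {
        for (Fd f : fds) {
            boolean trivial = f.lhs.containsAll(f.rhs);
            if (!trivial && !isSuperkey(f.lhs, all, fds)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Fd> fds = Arrays.asList(
            fd("sno", "sname,saddr"),
            fd("partno", "partdesc"),
            fd("sno,partno,custid", "quantity"),
            fd("sname", "sno"),
            fd("custid", "custname,custaddr"));
        Set<String> all = attrs(
            "sno,sname,saddr,partno,partdesc,custid,custname,custaddr,quantity");

        System.out.println(closure(attrs("sno"), fds));
        System.out.println(isSuperkey(attrs("sno,partno,custid"), all, fds));
        System.out.println(isBcnf(all, fds));
    }
}
```

With these dependencies the closure of {sno} contains only saddr, sname and sno, so sno alone is not a superkey and the undecomposed relation fails the BCNF test, while {sno, partno, custid} is a superkey.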

Procedure:
1) Functional Dependencies
a) Patno -> Patname,Gpno,Gpname,Appdate,Consultant,Conaddr,Sample
b) Patname -> Patno,Gpno,Gpname,Appdate,Consultant,Conaddr,Sample
c) Gpno -> Patname,Patno,Gpname,Appdate,Consultant,Conaddr,Sample
d) Gpname -> Patname,Gpno,Patno,Appdate,Consultant,Conaddr,Sample
e) Appdate -> Consultant,Conaddr
f) Consultant -> Appdate,Conaddr
g) Conaddr -> Appdate,Consultant

FDs a, b, c and d satisfy 2NF, as each LHS is a whole key and not a part of any key.

In FDs e, f and g, no LHS attribute is part of any key.

So the relation is in 2NF.

3) For 3NF, the relation must satisfy 2NF and have no transitive dependency.

An FD X->Y in a relation schema R is a transitive dependency if there exists a set of attributes Z in R that is
neither a candidate key nor a subset of any key of R, and both X->Z and Z->Y hold.

Therefore, for every FD, either the LHS should be a superkey or the RHS should be a prime attribute.

In FDs a, b, c and d the LHS is a key, so they are in 3NF. In e, f and g, on the other hand, the LHS is not a key, hence
they are not in 3NF, and there is also a transitive dependency.

Appdate->Consultant,Conaddr

Consultant->Conaddr

Conaddr->Appdate

We need to decompose the table:

Table 1:

(Patno,Patname,Gpno,Gpname,Appdate,Sample)

Table 2:

(Appdate,Consultant,Conaddr)

We check the decomposed tables for 3NF again. In Table 1, FDs a, b, c and d are in 3NF as they have
keys on the LHS.

In Table 2, FDs e, f and g are in 3NF as they have keys on the LHS.

2) For any relation to be in BCNF, it should be in 1NF, have no partial dependency and no transitive dependency, and
whenever a non-trivial FD X->A holds in R, X must be a superkey of R.

Therefore, in the FDs given above:

1) 1NF -> No multivalued or composite attributes. So, it is in 1NF.

2) In the given FD Sno,Partno,Custid->quantity, Sno is a part of the key.

In the other FDs there is no partial dependency.

a) Sno.->Sname,Saddr

b)Partno->Partdesc

c)Sno.,partno,custid->quantity

d)Sname->sno.

e)Custid->Custname,Custaddr

and the relation Supplier(Sno,Sname,Saddress,(partno,partdesc,custid,custname,custaddr,quantity))

There are composite attributes, so the relation is not in 1NF. The composite attributes (partno,
partdesc,custid,custname,custaddr,quantity) should therefore be separated from the Supplier relation.

TABLE1:Supplier(Sno,Sname,Saddr)

TABLE2:Supplier2(partno,partdesc,sno,custid,custname,custaddr,quantity)

Again, Table 2 is not in 2NF.

So the composite attributes need to be separated.

TABLE1:Supplier(Sno,Sname,Saddr)

TABLE2:Supplier2(partno,partdesc,sno)

TABLE3:Supplier3(sno,custid,custname,custaddr,quantity,partdesc)

3) Checking whether there is any partial dependency

TABLE1:

a) Sno->Sname,Saddr

d) Sname->Sno

No partial dependency, as Sno is a key and Sname is unique.

TABLE2: b) Partno->Partdesc

Partial dependency is there.

TABLE3: c) Sno,Partno,Custid->quantity

The left side is a key, so there is no partial dependency.

e) Custid->Custname,Custaddr

Custid->Custaddr: no partial dependency

So Supplier2 and Supplier3 again need to be decomposed to bring them into 2NF.

TABLE1: Sno.,Sname,Saddr

TABLE2: Partno.,Sno.

TABLE3: Partno,Partdesc

TABLE4:partno.,custid,quantity

TABLE5:custid,custname,custaddr,Sno.

Finally it is in 2NF

4)check whether they are in 3NF

Sno.->Sname,Saddr

Sname->Sno,

Partno->partdesc

Sno.,partno.,custid->quantity

Custid->custname,custaddr

Answer
1) To bring the given relation into 3NF, we need to decompose it into:

RELATION1(Patno,Patname,Gpno,Gpname,Appdate,Sample)

RELATION2(Appdate,Consultant,Conaddr)

Therefore, the tables have been brought into 3NF successfully.

2) It is in BCNF, because the left-hand-side attributes of every FD are superkeys.

Result:

The given tables were converted to BCNF and the output was verified successfully.

Ex. No.: 10 STUDY OF BIG DATA TECHNIQUES (HADOOP)

Date: 01.04.17

Aim
This practical explains the basic concepts of big data techniques and the implementation
of mapper and reducer programs.

Software Required

Desktop PC, 4 GB RAM, Ubuntu, MS SQL Server 2000, Hadoop 2.6.0

Theory

‘Big Data’ is the application of specialized techniques and technologies to process very large sets of
data. These data sets are often so large and complex that they become difficult to process using on-hand
database management tools. Examples include web logs, call records, medical records, military
surveillance, photography archives, video archives and large-scale e-commerce.

Big data and analytics capabilities include:

 Data Management & Warehouse: Gain industry-leading database performance across
multiple workloads while lowering administration, storage, development and server costs;
realize extreme speed with capabilities optimized for analytics workloads such as deep
analytics, and benefit from workload-optimized systems that can be up and running in
hours.

 Hadoop System: Bring the power of Apache Hadoop to the enterprise with application
accelerators, analytics, visualization, development tools, performance and security features.

 Stream Computing: Efficiently deliver real-time analytic processing on constantly
changing data in motion and enable descriptive and predictive analytics to support
real-time decisions. Capture and analyze all data, all the time, just in time. With stream
computing, store less, analyze more and make better decisions faster.

 Content Management: Enable comprehensive content lifecycle and document
management with cost-effective control of existing and new types of content with scale,
security and stability.

 Information Integration & Governance: Build confidence in big data with the ability to
integrate, understand, manage and govern data appropriately across its lifecycle.

 Problem Scope: Hadoop is a large-scale distributed batch processing infrastructure. While
it can be used on a single machine, its true power lies in its ability to scale to hundreds or
thousands of computers, each with several processor cores. Hadoop is also designed to
efficiently distribute large amounts of work across a set of machines.

Hadoop Systems software helps

 Support large-scale, multi-structured data processing and analysis with application
accelerators, visualization, dashboards, development tools and security features.
 Quickly analyze diverse data sets such as text, log records, social media, news feeds, email
and electronic sensor output.
 Augment your data warehouse environment with a flexible archive for storing, processing
and querying big data.

Hadoop is designed to efficiently process large volumes of information by connecting many
commodity computers together to work in parallel. A hypothetical 1000-CPU machine
would cost a very large amount of money, far more than 1,000 single-CPU or 250 quad-core machines.
Hadoop ties these smaller and more reasonably priced machines together into a single cost-effective
compute cluster.

Comparison to Existing Techniques:

Performing computation on large volumes of data has been done before, usually in a distributed
setting. What makes Hadoop unique is its simplified programming model, which allows the user to quickly
write and test distributed systems, and its efficient, automatic distribution of data and work across machines,
which in turn utilizes the underlying parallelism of the CPU cores.

Data Distribution

In a Hadoop cluster, data is distributed to all the nodes of the cluster as it is being loaded in. The
Hadoop Distributed File System (HDFS) will split large data files into chunks which are managed by
different nodes in the cluster. In addition, each chunk is replicated across several machines, so that a
single machine failure does not result in any data being unavailable. An active monitoring system then
re-replicates the data in response to system failures which can result in partial storage. Even though
the file chunks are replicated and distributed across several machines, they form a single namespace, so
their contents are universally accessible.

Data is conceptually record-oriented in the Hadoop programming framework. Individual input files
are broken into lines or into other formats specific to the application logic. Each process running on a node
in the cluster then processes a subset of these records. The Hadoop framework then schedules these
processes in proximity to the location of data/records using knowledge from the distributed file system.
Since files are spread across the distributed file system as chunks, each compute process running on a node
operates on a subset of the data. Which data a node operates on is chosen based on its locality to the
node: most data is read from the local disk straight into the CPU, alleviating strain on network bandwidth
and preventing unnecessary network transfers. This strategy of moving computation to the data, instead of
moving the data to the computation allows Hadoop to achieve high data locality which in turn results in
high performance.
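
The chunking and replication described above can be sketched with a toy model. This is an illustration under stated assumptions, not HDFS code: the class BlockPlacementSketch, its methods, the node names, the 128 MB block size and the round-robin replica assignment are all hypothetical choices made for brevity (real HDFS uses its own, rack-aware placement policy).

```java
import java.util.*;

/** Toy model of splitting a file into blocks and replicating them (not real HDFS code). */
final class BlockPlacementSketch {

    /** Number of fixed-size blocks needed to hold fileSize bytes. */
    static int blockCount(long fileSize, long blockSize) {
        return (int) ((fileSize + blockSize - 1) / blockSize);   // ceiling division
    }

    /**
     * Assigns each block to `replication` distinct nodes, round-robin.
     * Round-robin is an assumption of this sketch, not the HDFS policy.
     */
    static Map<Integer, List<String>> place(int blocks, List<String> nodes, int replication) {
        if (replication > nodes.size()) {
            throw new IllegalArgumentException("need at least as many nodes as replicas");
        }
        Map<Integer, List<String>> placement = new TreeMap<>();
        for (int b = 0; b < blocks; b++) {
            List<String> holders = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                holders.add(nodes.get((b + r) % nodes.size()));
            }
            placement.put(b, holders);
        }
        return placement;
    }

    /** True if every block still has a replica on some node other than failedNode. */
    static boolean survivesFailure(Map<Integer, List<String>> placement, String failedNode) {
        for (List<String> holders : placement.values()) {
            boolean available = false;
            for (String node : holders) {
                if (!node.equals(failedNode)) {
                    available = true;
                }
            }
            if (!available) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        // A 300 MB file with 128 MB blocks needs three blocks.
        int blocks = blockCount(300L * 1024 * 1024, 128L * 1024 * 1024);
        Map<Integer, List<String>> placement = place(blocks, nodes, 3);
        System.out.println(placement);
        System.out.println(survivesFailure(placement, "node2"));
    }
}
```

With three replicas per block, losing any single node leaves every block readable; with a single replica, losing the node that holds a block makes that block unavailable.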

MapReduce

Hadoop limits the amount of communication which can be performed by the processes, as each
individual record is processed by a task in isolation from one another. While this sounds like a major
limitation at first, it makes the whole framework much more reliable. Hadoop will not run just any program
and distribute it across a cluster. Programs must be written to conform to a particular programming model,
named "MapReduce". In MapReduce, records are processed in isolation by tasks called Mappers. The
output from the Mappers is then brought together into a second set of tasks called Reducers, where results
from different mappers can be merged. Separate nodes in a Hadoop cluster still communicate with one another.
However, in contrast to more conventional distributed systems where application developers explicitly
marshal byte streams from node to node over sockets or through MPI buffers, communication in Hadoop is
performed implicitly. Pieces of data can be tagged with key names which inform Hadoop how to send
related bits of information to a common destination node. Hadoop internally manages all of the data
transfer and cluster topology issues.
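
The idea of key names deciding the destination can be sketched in plain Java. The class PartitionSketch and its methods are hypothetical and use ordinary strings rather than Hadoop's Writable types; the formula mirrors the one used by Hadoop's default HashPartitioner (hash code with the sign bit masked off, modulo the number of reduce tasks).

```java
import java.util.*;

/** Plain-Java illustration of routing keyed records to reducers (no Hadoop dependency). */
final class PartitionSketch {

    /** Chooses the reducer index for a key; equal keys always map to the same reducer. */
    static int partitionFor(String key, int numReducers) {
        // Masking the sign bit keeps the result non-negative for negative hash codes.
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Groups keys by the reducer that would receive them. */
    static Map<Integer, List<String>> route(List<String> keys, int numReducers) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            byReducer.computeIfAbsent(partitionFor(key, numReducers),
                                      i -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("This", "is", "the", "Hadoop", "This", "is");
        System.out.println(route(keys, 3));
    }
}
```

Because the reducer index depends only on the key, every occurrence of the same word ends up at the same reducer, which is what allows a reducer to compute a complete total for that word.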

Flat Scalability

One of the major benefits of using Hadoop in contrast to other distributed systems is its flat
scalability curve. Executing Hadoop on a limited amount of data on a small number of nodes may not
demonstrate particularly stellar performance as the overhead involved in starting Hadoop programs is
relatively high. Other parallel/distributed programming paradigms such as MPI (Message Passing Interface)
may perform much better on two, four, or perhaps a dozen machines.

A program written in distributed frameworks other than Hadoop may require large amounts of refactoring
when scaling from ten to one hundred or one thousand machines.

Procedure/ Program

Mapper:
 Mapper maps input key/value pairs to a set of intermediate key/value pairs.

 Maps are the individual tasks that transform input records into intermediate records. The
transformed intermediate records do not need to be of the same type as the input records. A
given input pair may map to zero or many output pairs.

 The Hadoop Map/Reduce framework spawns one map task for each InputSplit generated
by the InputFormat for the job.

Reducer:

 Reducer reduces a set of intermediate values which share a key to a smaller set of values.

 The number of reduces for the job is set by the user via JobConf.setNumReduceTasks(int).
Overall, Reducer implementations are passed the JobConf for the job via the
JobConfigurable.configure(JobConf) method and can override it to initialize themselves.

 The framework then calls reduce(WritableComparable, Iterator, OutputCollector, Reporter)
method for each <key, (list of values)> pair in the grouped inputs. Applications can then
override the Closeable.close() method to perform any required cleanup.

 Reducer has 3 primary phases: shuffle, sort and reduce.
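
The mapper and reducer roles listed above can be walked through in memory without a cluster. The sketch below is only an illustration: MiniWordCount and its methods are hypothetical names, everything runs in a single JVM, and the shuffle/sort step is modelled with a sorted map rather than Hadoop's actual machinery.

```java
import java.util.*;

/** In-memory walk-through of the map, shuffle/sort and reduce phases (no Hadoop dependency). */
final class MiniWordCount {

    /** Map phase: one input line yields zero or more (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return out;
    }

    /** Shuffle and sort: group the intermediate values by key, keys in sorted order. */
    static SortedMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    /** Reduce phase: collapse the list of values for one key into a single sum. */
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    /** Runs all three phases over the given input lines. */
    static SortedMap<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        for (String line : lines) {
            intermediate.addAll(map(line));
        }
        SortedMap<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(intermediate).entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("this is an example", "this is word count")));
    }
}
```

For the two sample lines, "this" and "is" each receive a count of 2 and the remaining four words a count of 1; an empty line produces no intermediate pairs at all, matching the statement that an input pair may map to zero output pairs.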

Apache Hadoop

Apache Hadoop is an open-source software framework written in Java for distributed storage and
distributed processing of very large data sets on computer clusters built from commodity hardware. All the
modules in Hadoop are designed with a fundamental assumption that hardware failures (of individual
machines or racks of machines) are commonplace and thus should be automatically handled in software by
the framework.

The core of Apache Hadoop consists of a storage part (Hadoop Distributed File System (HDFS))
and a processing part (MapReduce). Hadoop splits files into large blocks and distributes them amongst the
nodes in the cluster. To process the data, Hadoop MapReduce transfers packaged code for nodes to process
in parallel, based on the data each node needs to process. This approach takes advantage of data locality
(nodes manipulating the data that they have on hand) to allow the data to be processed faster and more
efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file
system where computation and data are connected via high-speed networking.

The base Apache Hadoop framework is composed of the following modules:

 Hadoop Common – contains libraries and utilities needed by other Hadoop modules.
 Hadoop Distributed File System (HDFS) – a distributed file-system that stores data on commodity
machines, providing very high aggregate bandwidth across the cluster.
 Hadoop YARN – a resource-management platform responsible for managing computing resources
in clusters and using them for scheduling of users' applications.
 Hadoop MapReduce – a programming model for large scale data processing.

The term "Hadoop" has come to refer not just to the base modules above, but also to the "ecosystem" or
collection of additional software packages that can be installed on top of or alongside Hadoop, such as
Apache Pig, Apache Hive, Apache HBase, Apache Spark, and others.

The Hadoop framework itself is mostly written in the Java programming language, with some
native code in C and command line utilities written as shell scripts. For end-users, though MapReduce Java
code is common, any programming language can be used with "Hadoop Streaming" to implement the
"map" and "reduce" parts of the user's program.

Sample Program

Hadoop Environment:

1. The Hadoop NameNode web interface runs at localhost:50070. To start the Hadoop environment, run
start-dfs.sh and start-yarn.sh.
2. Add files to the Hadoop file system using
hadoop dfs -copyFromLocal </OS_directory> </hadoop_directory>

3. Create a file named “File.txt” in HDFS with the following content:

“This is the Hadoop Distributed File System running. This is an example for word count. Keerthi
Vasan Gounder.”

4. Run the MapReduce job as
hadoop jar wordcount.jar WordCount <file_location> <result_location>
5. View the result of the job as
hadoop dfs -cat <result_location>/part-r-00000

Map Reduce Code:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1); // output value: a count of 1 per word
        private Text word = new Text();                            // output key: the word itself

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString()); // split the line into words
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());  // set word as each input keyword
                context.write(word, one);   // emit the pair <keyword, 1>
            }
        }
    }

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0; // initialize the sum for each keyword
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result); // emit the pair <keyword, number of occurrences>
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: WordCount <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "wordcount"); // replaces the deprecated new Job(conf, name)
        job.setJarByClass(WordCount.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setCombinerClass(Reduce.class); // summing is associative, so the reducer doubles as combiner
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Output:

Result:
Thus we have studied the different big data techniques and their implementation using Hadoop 2.6.0.
