0% found this document useful (0 votes)
36 views

Normalization of Database

The document discusses database normalization and various normal forms including: 1) First normal form requires each table have a primary key and each attribute contain a single value. 2) Second normal form ensures attributes depend on the full primary key rather than parts of it. 3) Third normal form removes transitive dependencies where a non-key attribute depends on another non-key attribute. 4) Boyce-Codd normal form ensures functional dependencies have a candidate key as a determinant. Normalization reduces data redundancy and anomalies like inconsistent updates.

Uploaded by

areeba khan
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
36 views

Normalization of Database

The document discusses database normalization and various normal forms including: 1) First normal form requires each table have a primary key and each attribute contain a single value. 2) Second normal form ensures attributes depend on the full primary key rather than parts of it. 3) Third normal form removes transitive dependencies where a non-key attribute depends on another non-key attribute. 4) Boyce-Codd normal form ensures functional dependencies have a candidate key as a determinant. Normalization reduces data redundancy and anomalies like inconsistent updates.

Uploaded by

areeba khan
Copyright
© © All Rights Reserved
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 26

Final Year Project

(Viva Questions)

Normalization of Database
Database Normalization is a technique of organizing the data in the database. Normalization is a
systematic approach of decomposing tables to eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts
data into tabular form by removing duplicated data from the relation tables.

Normalization is used for mainly two purpose,

 Eliminating redundant (useless) data.

 Ensuring data dependencies make sense i.e data is logically stored.

Problem Without Normalization

Without Normalization, it becomes difficult to handle and update the database, without facing data
loss. Insertion, Updation and Deletion Anamolies are very frequent if Database is not Normalized. To
understand these anomalies let us take an example of Student table.

S_id S_Name S_Address Subject_opted

401 Adam Noida Bio

402 Alex Panipat Maths

403 Stuart Jammu Maths

404 Adam Noida Physics


 Updation Anamoly : To update address of a student who occurs twice or more than twice in a

table, we will have to update S_Address column in all the rows, else data will become

inconsistent.

 Insertion Anamoly : Suppose for a new admission, we have a Student id(S_id), name and

address of a student but if student has not opted for any subjects yet then we have to

insert NULL there, leading to Insertion Anamoly.

 Deletion Anamoly : If (S_id) 401 has only one subject and temporarily he drops it, when we

delete that row, entire student record will be deleted along with it.

Normalization Rule

Normalization rule are divided into following normal form.

1. First Normal Form

2. Second Normal Form

3. Third Normal Form

4. BCNF

First Normal Form (1NF)

As per First Normal Form, no two Rows of data must contain repeating group of information i.e each
set of column must have a unique value, such that multiple columns cannot be used to fetch the
same row. Each table should be organized into rows, and each row should have a primary key that
distinguishes it as unique.

The Primary key is usually a single column, but sometimes more than one column can be combined
to create a single primary key. For example consider a table which is not in First normal form

Student Table :
Student Age Subject

Adam 15 Biology, Maths

Alex 14 Maths

Stuart 17 Maths

In First Normal Form, any row must not have a column in which more than one value is saved, like
separated with commas. Rather than that, we must separate such data into multiple rows.

Student Table following 1NF will be :

Student Age Subject

Adam 15 Biology

Adam 15 Maths

Alex 14 Maths

Stuart 17 Maths

Using the First Normal Form, data redundancy increases, as there will be many columns with same
data in multiple rows but each row as a whole will be unique.

Second Normal Form (2NF)

As per the Second Normal Form there must not be any partial dependency of any column on primary
key. It means that for a table that has concatenated primary key, each column in the table that is not
part of the primary key must depend upon the entire concatenated key for its existence. If any
column depends only on one part of the concatenated key, then the table fails Second normal
form.

In example of First Normal Form there are two rows for Adam, to include multiple subjects that he
has opted for. While this is searchable, and follows First normal form, it is an inefficient use of space.
Also in the above Table in First Normal Form, while the candidate key is {Student, Subject}, Age of
Student only depends on Student column, which is incorrect as per Second Normal Form. To
achieve second normal form, it would be helpful to split out the subjects into an independent table,
and match them up using the student names as foreign keys.

New Student Table following 2NF will be :

Student Age

Adam 15

Alex 14

Stuart 17

In Student Table the candidate key will be Student column, because all other column i.e Age is
dependent on it.

New Subject Table introduced for 2NF will be :

Student Subject

Adam Biology

Adam Maths

Alex Maths
Stuart Maths

In Subject Table the candidate key will be {Student, Subject} column. Now, both the above tables
qualifies for Second Normal Form and will never suffer from Update Anomalies. Although there are a
few complex cases in which table in Second Normal Form suffers Update Anomalies, and to handle
those scenarios Third Normal Form is there.

Third Normal Form (3NF)

Third Normal form applies that every non-prime attribute of table must be dependent on primary
key, or we can say that, there should not be the case that a non-prime attribute is determined by
another non-prime attribute. So this transitive functional dependency should be removed from the
table and also the table must be in Second Normal form. For example, consider a table with
following fields.

Student_Detail Table :

Student_id Student_name DOB Street city State Zip

In this table Student_id is Primary key, but street, city and state depends upon Zip. The dependency
between zip and other fields is called transitive dependency. Hence to apply 3NF, we need to
move the street, city and state to new table, with Zip as primary key.

New Student_Detail Table :

Student_id Student_name DOB Zip

Address Table :

Zip Street City state

The advantage of removing transtive dependency is,


 Amount of data duplication is reduced.

 Data integrity achieved.

Boyce and Codd Normal Form (BCNF)

Boyce and Codd Normal Form is a higher version of the Third Normal form. This form deals with
certain type of anamoly that is not handled by 3NF. A 3NF table which does not have multiple
overlapping candidate keys is said to be in BCNF. For a table to be in BCNF, following conditions
must be satisfied:

 R must be in 3rd Normal Form

 and, for each functional dependency ( X -> Y ), X should be a super Key.


1. What is MySQL?

MySQL is an open source DBMS which is built, supported and distributed by MySQL AB
(now acquired by Oracle)

2. What are the technical features of MySQL?


MySQL database software is a client or server system which includes
 Multithreaded SQL server supporting various client programs and libraries
 Different backend
 Wide range of application programming interfaces and
 Administrative tools.
3. Why MySQL is used?

MySQL database server is reliable, fast and very easy to use. This software can be
downloaded as freeware and can be downloaded from the internet.

4. What are Heap tables?

HEAP tables are present in memory and they are used for high speed storage on temporary

basis.

• BLOB or TEXT fields are not allowed

• Only comparison operators can be used =, <,>, = >,=<


• AUTO_INCREMENT is not supported by HEAP tables

• Indexes should be NOT NULL

5. What is the default port for MySQL Server?

The default port for MySQL server is 3306.

6. What are the advantages of MySQL when compared with Oracle?


 MySQL is open source software which is available at any time and has no cost involved.
 MySQL is portable
 GUI with command prompt.
 Administration is supported using MySQL Query Browser
7. Differentiate between FLOAT and DOUBLE?

Following are differences for FLOAT and DOUBLE:

• Floating point numbers are stored in FLOAT with eight place accuracy and it has four
bytes.

• Floating point numbers are stored in DOUBLE with accuracy of 18 places and it has eight
bytes.

8. Differentiate CHAR_LENGTH and LENGTH?

CHAR_LENGTH is character count whereas the LENGTH is byte count. The numbers are
same for Latin characters but they are different for Unicode and other encodings.

9. How to represent ENUMs and SETs internally?

ENUMs and SETs are used to represent powers of two because of storage optimizations.

10. What is the usage of ENUMs in MySQL?

ENUM is a string object used to specify set of predefined values and that can be used
during table creation.
1 Create table size(name ENUM('Small', 'Medium','Large');

11. Define REGEXP?

REGEXP is a pattern match in which matches pattern anywhere in the search value.

12. Difference between CHAR and VARCHAR?

Following are the differences between CHAR and VARCHAR:

 CHAR and VARCHAR types differ in storage and retrieval


 CHAR column length is fixed to the length that is declared while creating table. The length
value ranges from 1 and 255
 When CHAR values are stored then they are right padded using spaces to specific length.
Trailing spaces are removed when CHAR values are retrieved.
13. Give string types available for column?

The string types are:

 SET
 BLOB
 ENUM
 CHAR
 TEXT
 VARCHAR
14. How to get current MySQL version?

1 SELECT VERSION ();

is used to get the current version of MySQL.

15. What storage engines are used in MySQL?

Storage engines are called table types and data is stored in files using various techniques.

Technique involves:
 Storage mechanism
 Locking levels
 Indexing
 Capabilities and functions.
16. What are the drivers in MySQL?

Following are the drivers available in MySQL:

 PHP Driver
 JDBC Driver
 ODBC Driver
 C WRAPPER
 PYTHON Driver
 PERL Driver
 RUBY Driver
 CAP11PHP Driver
 Ado.net5.mxj
17. What does a TIMESTAMP do on UPDATE CURRENT_TIMESTAMP data type?
TIMESTAMP column is updated with Zero when the table is created. UPDATE
CURRENT_TIMESTAMP modifier updates the timestamp field to current time whenever
there is a change in other fields of the table.
18. What is the difference between primary key and candidate key?

Every row of a table is identified uniquely by primary key. There is only one primary key for
a table.

Primary Key is also a candidate key. By common convention, candidate key can be
designated as primary and which can be used for any foreign key references.

19. How do you login to MySql using Unix shell?

We can login through this command:

# [mysql dir]/bin/mysql -h hostname -u <UserName> -p <password>

20. What does myisamchk do?

It compress the MyISAM tables, which reduces their disk or memory usage.
21. How do you control the max size of a HEAP table?

Maximum size of Heal table can be controlled by MySQL config variable called
max_heap_table_size.

22. What is the difference between MyISAM Static and MyISAM Dynamic?

In MyISAM static all the fields will have fixed width. The Dynamic MyISAM table will have
fields like TEXT, BLOB, etc. to accommodate the data types with various lengths.

MyISAM Static would be easier to restore in case of corruption.

23. What are federated tables?

Federated tables which allow access to the tables located on other databases on other
servers.

24. What, if a table has one column defined as TIMESTAMP?

Timestamp field gets the current timestamp whenever the row gets altered.

25. What happens when the column is set to AUTO INCREMENT and if you reach maximum
value in the table?

It stops incrementing. Any further inserts are going to produce an error, since the key has
been used already.

26. How can we find out which auto increment was assigned on Last insert?

LAST_INSERT_ID will return the last value assigned by Auto_increment and it is not
required to specify the table name.

27. How can you see all indexes defined for a table?

Indexes are defined for the table by:

SHOW INDEX FROM <tablename>;

28. What do you mean by % and _ in the LIKE statement?


% corresponds to 0 or more characters, _ is exactly one character in the LIKE statement.

29. How can we convert between Unix & MySQL timestamps?

UNIX_TIMESTAMP is the command which converts from MySQL timestamp to Unix


timestamp

FROM_UNIXTIME is the command which converts from Unix timestamp to MySQL


timestamp.

30. What are the column comparisons operators?

The = , <>, <=, <, >=, >,<<,>>, <=>, AND, OR, or LIKE operators are used in column
comparisons in SELECT statements.

31. How can we get the number of rows affected by query?

Number of rows can be obtained by

1 SELECT COUNT (user_id) FROM users;

32. Is Mysql query is case sensitive?

No.

1 SELECT VERSION(), CURRENT_DATE;


2 SeLect version(), current_date;
3 seleCt vErSiOn(), current_DATE;

All these examples are same. It is not case sensitive.

33. What is the difference between the LIKE and REGEXP operators?
LIKE and REGEXP operators are used to express with ^ and %.

1 SELECT * FROM employee WHERE emp_name REGEXP "^b";


2 SELECT * FROM employee WHERE emp_name LIKE "%b";

34. What is the difference between BLOB AND TEXT?

A BLOB is a binary large object that can hold a variable amount of data. There are four
types of BLOB –

 TINYBLOB
 BLOB
 MEDIUMBLOB and
 LONGBLOB

They all differ only in the maximum length of the values they can hold.

A TEXT is a case-insensitive BLOB. The four TEXT types

 TINYTEXT
 TEXT
 MEDIUMTEXT and
 LONGTEXT

They all correspond to the four BLOB types and have the same maximum lengths and
storage requirements.

The only difference between BLOB and TEXT types is that sorting and comparison is
performed in case-sensitive for BLOB values and case-insensitive for TEXT values.
35. What is the difference between mysql_fetch_array and mysql_fetch_object?

Following are the differences between mysql_fetch_array and mysql_fetch_object:

mysql_fetch_array() -Returns a result row as an associated array or a regular array from


database.
mysql_fetch_object – Returns a result row as object from database.

36. How can we run batch mode in mysql?

Following commands are used to run in batch mode:

1 mysql ;
2 mysql mysql.out

37. Where MyISAM table will be stored and also give their formats of storage?

Each MyISAM table is stored on disk in three formats:

 The ‘.frm’ file stores the table definition


 The data file has a ‘.MYD’ (MYData) extension
 The index file has a ‘.MYI’ (MYIndex) extension
38. What are the different tables present in MySQL?

Total 5 types of tables are present:

 MyISAM
 Heap
 Merge
 INNO DB
 ISAM

MyISAM is the default storage engine as of MySQL .

39. What is ISAM?

ISAM is abbreviated as Indexed Sequential Access Method.It was developed by IBM to


store and retrieve data on secondary storage systems like tapes.

40. What is InnoDB?


lnnoDB is a transaction safe storage engine developed by Innobase Oy which is a Oracle
Corporation now.

41. How MySQL Optimizes DISTINCT?

DISTINCT is converted to a GROUP BY on all columns and it will be combined with


ORDER BY clause.

1 SELECT DISTINCT t1.a FROM t1,t2 where t1.a=t2.a;

42. How to enter Characters as HEX Numbers?

If you want to enter characters as HEX numbers, you can enter HEX numbers with single
quotes and a prefix of (X), or just prefix HEX numbers with (Ox).

A HEX number string will be automatically converted into a character string, if the
expression context is a string.

43. How to display top 50 rows?

In MySql, top 50 rows are displayed by using this following query:

1 SELECT * FROM
2 LIMIT 0,50;

44. How many columns can be used for creating Index?

Maximum of 16 indexed columns can be created for any standard table.

45. What is the different between NOW() and CURRENT_DATE()?


NOW () command is used to show current year,month,date with hours,minutes and
seconds.

CURRENT_DATE() shows current year,month and date only.

46. What are the objects can be created using CREATE statement?

Following objects are created using CREATE statement:

 DATABASE
 EVENT
 FUNCTION
 INDEX
 PROCEDURE
 TABLE
 TRIGGER
 USER
 VIEW
47. How many TRIGGERS are allowed in MySql table?

SIX triggers are allowed in MySql table. They are as follows:

 BEFORE INSERT
 AFTER INSERT
 BEFORE UPDATE
 AFTER UPDATE
 BEFORE DELETE and
 AFTER DELETE
48. What are the nonstandard string types?

Following are Non-Standard string types:

 TINYTEXT
 TEXT
 MEDIUMTEXT
 LONGTEXT
49. What are all the Common SQL Function?
CONCAT(A, B) – Concatenates two string values to create a single string output. Often
used to combine two or more fields into one single field.

FORMAT(X, D) – Formats the number X to D significant digits.

CURRDATE(), CURRTIME() – Returns the current date or time.

NOW() – Returns the current date and time as one value.

MONTH(), DAY(), YEAR(), WEEK(), WEEKDAY() – Extracts the given data from a date
value.

HOUR(), MINUTE(), SECOND() – Extracts the given data from a time value.

DATEDIFF(A, B) – Determines the difference between two dates and it is commonly used
to calculate age

SUBTIMES(A, B) – Determines the difference between two times.

FROMDAYS(INT) – Converts an integer number of days into a date value.

50. Explain Access Control Lists.

An ACL (Access Control List) is a list of permissions that is associated with an object. This
list is the basis for MySQL server’s security model and it helps in troubleshooting problems
like users not being able to connect.

MySQL keeps the ACLs (also called grant tables) cached in memory. When a user tries to
authenticate or run a command, MySQL checks the authentication information and
permissions against the ACLs, in a predetermined order.
Software Development Life Cycle
Models and Methodologies
Introduction

The software industry includes many different processes, for example, analysis, development,
maintenance and publication of software. This industry also includes software services, such as
training, documentation, and consulting.

Our focus here about software development life cycle (SDLC). So, due to that different types of
projects have different requirements. Therefore, it may be required to choose the SDLC phases
according to the specific needs of the project. These different requirements and needs give us various
software development approaches to choose from during software implementation.
Types of Software developing life cycles (SDLC)
· Waterfall Model
· V-Shaped Model

· Evolutionary Prototyping Model


· Spiral Method (SDM)
· Iterative and Incremental Method
· Extreme programming (Agile development)
Waterfall Model
Description
The waterfall Model is a linear sequential flow. In which progress is seen as flowing steadily
downwards (like a waterfall) through the phases of software implementation. This means that any
phase in the development process begins only if the previous phase is complete. The waterfall
approach does not define the process to go back to the previous phase to handle changes in
requirement. The waterfall approach is the earliest approach that was used for software development.
The usage
Projects which not focus on changing the requirements, for example, projects initiated from request
for proposals (RFPs)
Advantages and Disadvantages
Advantages Disadvantages

· Easy to explain to the user· Structures · Assumes that the requirements of a system
approach.· Stages and activities are well can be frozen· Very difficult to go back to
defined· Helps to plan and schedule the any stage after it finished.· Little flexibility
project· Verification at each stage ensures and adjusting scope is difficult and
early detection of errors / misunderstanding· expensive.· Costly and required more time,
Each phase has specific deliverables in addition to detailed plan

V-Shaped Model
Description
It is an extension for waterfall model, Instead of moving down in a linear way, the process steps are
bent upwards after the coding phase, to form the typical V shape. The major difference between v-
shaped model and waterfall model is the early test planning in v-shaped model.
The usage
· Software requirements clearly defined and known

· Software development technologies and tools is well-known

Advantages and Disadvantages


Advantages Disadvantages

· Simple and easy to use.· Each phase has · Very inflexible, like the waterfall model.·
specific deliverables.· Higher chance of Little flexibility and adjusting scope is
success over the waterfall model due to the difficult and expensive.· Software is
development of test plans early on during the developed during the implementation phase,
life cycle.· Works well for where so no early prototypes of the software are
requirements are easily produced.· Model doesn’t provide a clear
understood. Verification and validation of the path for problems found during testing
product in early stages of product phases.· Costly and required more time, in
development addition to detailed plan

Prototyping Model
Description
It refers to the activity of creating prototypes of software applications, for example, incomplete
versions of the software program being developed. It is an activity that can occur in software
development. It used to visualize some component of the software to limit the gap of
misunderstanding the customer requirements by the development team. This also will reduce the
iterations may occur in waterfall approach and hard to be implemented due to inflexibility of the
waterfall approach. So, when the final prototype is developed, the requirement is considered to be
frozen.

It has some types, such as:

· Throwaway prototyping: Prototypes that are eventually discarded rather than becoming a part of the
finally delivered software

· Evolutionary prototyping: prototypes that evolve into the final system through iterative
incorporation of user feedback.

· Incremental prototyping: The final product is built as separate prototypes. At the end the separate
prototypes are merged in an overall design.
· Extreme prototyping: used at web applications mainly. Basically, it breaks down web development
into three phases, each one based on the preceding one. The first phase is a static prototype that
consists mainly of HTML pages. In the second phase, the screens are programmed and fully
functional using a simulated services layer. In the third phase the services are implemented

The usage
· This process can be used with any software developing life cycle model. While this shall be focused
with systems needs more user interactions. So, the system do not have user interactions, such as,
system does some calculations shall not have prototypes.

Advantages and Disadvantages


Advantages Disadvantages

· Insufficient analysis· User confusion of


· Reduced time and costs, but this can be prototype and finished system· Developer
disadvantage if the developer loses time in misunderstanding of user objectives·
developing the prototypes· Improved and Excessive development time of the prototype·
increased user involvement Expense of implementing prototyping

Spiral Method (SDM)


Description
It is combining elements of both design and prototyping-in-stages, in an effort to combine advantages
of top-down and bottom-up concepts. This model of development combines the features of the
prototyping model and the waterfall model. The spiral model is favored for large, expensive, and
complicated projects. This model uses many of the same phases as the waterfall model, in essentially
the same order, separated by planning, risk assessment, and the building of prototypes and
simulations.
The usage
It is used in shrink-wrap large applications and systems which built-in small phases or segments.

Advantages and Disadvantages


Advantages Disadvantages

· Estimates (i.e. budget, schedule, etc.)


become more realistic as work progresses,
because important issues are discovered · High cost and time to reach the final
earlier.· Early involvement of developers· product.· Needs special skills to evaluate the
Manages risks and develops system into risks and assumptions· Highly customized
phases limiting re-usability

Iterative and Incremental Method


Description
It is developed to overcome the weaknesses of the waterfall model. It starts with an initial planning
and ends with deployment with the cyclic interactions in between. The basic idea behind this method
is to develop a system through repeated cycles (iterative) and in smaller portions at a time
(incremental), allowing software developers to take advantage of what was learned during
development of earlier parts or versions of the system.

It consists of mini waterfalls

The usage
It is used in shrink-wrap application and large system which built-in small phases or segments. Also
can be used in system has separated components, for example, ERP system. Which we can start with
budget module as first iteration and then we can start with inventory module and so forth.

Advantages and Disadvantages


Advantages Disadvantages

· Requires heavy documentation· Follows a


defined set of processes· Defines increments
· Produces business value early in the based on function and feature dependencies·
development life cycle· Better use of scarce Requires more customer involvement than
resources through proper increment the linear approaches· Partitioning the
definition· Can accommodate some change functions and features might be problematic·
requests between increments· More focused Integration between iteration can be an issue
on customer value than the linear if this is not considered during the
approaches· Problems can be detected earlier development.
Extreme programming (Agile development)
Description
It is based on iterative and incremental development, where requirements and solutions evolve
through collaboration between cross-functional teams.

The usage
It can be used with any type of the project, but it needs more involvement from customer and to be
interactive. Also, it can be used when the customer needs to have some functional requirement ready
in less than three weeks.

Advantages and Disadvantages


Advantages Disadvantages

· Decrease the time required to avail some


system features.· Face to face communication
and continuous inputs from customer · Scalability· Skill of the software
representative leaves no space for developers· Ability of customer to express
guesswork.· The end result is the high quality user needs· Documentation is done at later
software in least possible time duration and stages· Reduce the usability of components.·
satisfied customer Needs special skills for the team.
Other Questions

1. What is the scope of your project?


2. What is the starting date of your project?
3. Who had made this project?
4. Who helped you to complete your project?
5. What are the hurdles that you face in your project

You might also like