Normalization of Database
Normalization of Database
(Viva Questions)
Normalization of Database
Database Normalization is a technique of organizing the data in the database. Normalization is a
systematic approach of decomposing tables to eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts
data into tabular form by removing duplicated data from the relation tables.
Without Normalization, it becomes difficult to handle and update the database, without facing data
loss. Insertion, Updation and Deletion Anamolies are very frequent if Database is not Normalized. To
understand these anomalies let us take an example of Student table.
table, we will have to update S_Address column in all the rows, else data will become
inconsistent.
Insertion Anamoly : Suppose for a new admission, we have a Student id(S_id), name and
address of a student but if student has not opted for any subjects yet then we have to
Deletion Anamoly : If (S_id) 401 has only one subject and temporarily he drops it, when we
delete that row, entire student record will be deleted along with it.
Normalization Rule
4. BCNF
As per First Normal Form, no two Rows of data must contain repeating group of information i.e each
set of column must have a unique value, such that multiple columns cannot be used to fetch the
same row. Each table should be organized into rows, and each row should have a primary key that
distinguishes it as unique.
The Primary key is usually a single column, but sometimes more than one column can be combined
to create a single primary key. For example consider a table which is not in First normal form
Student Table :
Student Age Subject
Alex 14 Maths
Stuart 17 Maths
In First Normal Form, any row must not have a column in which more than one value is saved, like
separated with commas. Rather than that, we must separate such data into multiple rows.
Adam 15 Biology
Adam 15 Maths
Alex 14 Maths
Stuart 17 Maths
Using the First Normal Form, data redundancy increases, as there will be many columns with same
data in multiple rows but each row as a whole will be unique.
As per the Second Normal Form there must not be any partial dependency of any column on primary
key. It means that for a table that has concatenated primary key, each column in the table that is not
part of the primary key must depend upon the entire concatenated key for its existence. If any
column depends only on one part of the concatenated key, then the table fails Second normal
form.
In example of First Normal Form there are two rows for Adam, to include multiple subjects that he
has opted for. While this is searchable, and follows First normal form, it is an inefficient use of space.
Also in the above Table in First Normal Form, while the candidate key is {Student, Subject}, Age of
Student only depends on Student column, which is incorrect as per Second Normal Form. To
achieve second normal form, it would be helpful to split out the subjects into an independent table,
and match them up using the student names as foreign keys.
Student Age
Adam 15
Alex 14
Stuart 17
In Student Table the candidate key will be Student column, because all other column i.e Age is
dependent on it.
Student Subject
Adam Biology
Adam Maths
Alex Maths
Stuart Maths
In Subject Table the candidate key will be {Student, Subject} column. Now, both the above tables
qualifies for Second Normal Form and will never suffer from Update Anomalies. Although there are a
few complex cases in which table in Second Normal Form suffers Update Anomalies, and to handle
those scenarios Third Normal Form is there.
Third Normal form applies that every non-prime attribute of table must be dependent on primary
key, or we can say that, there should not be the case that a non-prime attribute is determined by
another non-prime attribute. So this transitive functional dependency should be removed from the
table and also the table must be in Second Normal form. For example, consider a table with
following fields.
Student_Detail Table :
In this table Student_id is Primary key, but street, city and state depends upon Zip. The dependency
between zip and other fields is called transitive dependency. Hence to apply 3NF, we need to
move the street, city and state to new table, with Zip as primary key.
Address Table :
Boyce and Codd Normal Form is a higher version of the Third Normal form. This form deals with
certain type of anamoly that is not handled by 3NF. A 3NF table which does not have multiple
overlapping candidate keys is said to be in BCNF. For a table to be in BCNF, following conditions
must be satisfied:
MySQL is an open source DBMS which is built, supported and distributed by MySQL AB
(now acquired by Oracle)
MySQL database server is reliable, fast and very easy to use. This software can be
downloaded as freeware and can be downloaded from the internet.
HEAP tables are present in memory and they are used for high speed storage on temporary
basis.
• Floating point numbers are stored in FLOAT with eight place accuracy and it has four
bytes.
• Floating point numbers are stored in DOUBLE with accuracy of 18 places and it has eight
bytes.
CHAR_LENGTH is character count whereas the LENGTH is byte count. The numbers are
same for Latin characters but they are different for Unicode and other encodings.
ENUMs and SETs are used to represent powers of two because of storage optimizations.
ENUM is a string object used to specify set of predefined values and that can be used
during table creation.
1 Create table size(name ENUM('Small', 'Medium','Large');
REGEXP is a pattern match in which matches pattern anywhere in the search value.
SET
BLOB
ENUM
CHAR
TEXT
VARCHAR
14. How to get current MySQL version?
Storage engines are called table types and data is stored in files using various techniques.
Technique involves:
Storage mechanism
Locking levels
Indexing
Capabilities and functions.
16. What are the drivers in MySQL?
PHP Driver
JDBC Driver
ODBC Driver
C WRAPPER
PYTHON Driver
PERL Driver
RUBY Driver
CAP11PHP Driver
Ado.net5.mxj
17. What does a TIMESTAMP do on UPDATE CURRENT_TIMESTAMP data type?
TIMESTAMP column is updated with Zero when the table is created. UPDATE
CURRENT_TIMESTAMP modifier updates the timestamp field to current time whenever
there is a change in other fields of the table.
18. What is the difference between primary key and candidate key?
Every row of a table is identified uniquely by primary key. There is only one primary key for
a table.
Primary Key is also a candidate key. By common convention, candidate key can be
designated as primary and which can be used for any foreign key references.
It compress the MyISAM tables, which reduces their disk or memory usage.
21. How do you control the max size of a HEAP table?
Maximum size of Heal table can be controlled by MySQL config variable called
max_heap_table_size.
22. What is the difference between MyISAM Static and MyISAM Dynamic?
In MyISAM static all the fields will have fixed width. The Dynamic MyISAM table will have
fields like TEXT, BLOB, etc. to accommodate the data types with various lengths.
Federated tables which allow access to the tables located on other databases on other
servers.
Timestamp field gets the current timestamp whenever the row gets altered.
25. What happens when the column is set to AUTO INCREMENT and if you reach maximum
value in the table?
It stops incrementing. Any further inserts are going to produce an error, since the key has
been used already.
26. How can we find out which auto increment was assigned on Last insert?
LAST_INSERT_ID will return the last value assigned by Auto_increment and it is not
required to specify the table name.
27. How can you see all indexes defined for a table?
The = , <>, <=, <, >=, >,<<,>>, <=>, AND, OR, or LIKE operators are used in column
comparisons in SELECT statements.
No.
33. What is the difference between the LIKE and REGEXP operators?
LIKE and REGEXP operators are used to express with ^ and %.
A BLOB is a binary large object that can hold a variable amount of data. There are four
types of BLOB –
TINYBLOB
BLOB
MEDIUMBLOB and
LONGBLOB
They all differ only in the maximum length of the values they can hold.
TINYTEXT
TEXT
MEDIUMTEXT and
LONGTEXT
They all correspond to the four BLOB types and have the same maximum lengths and
storage requirements.
The only difference between BLOB and TEXT types is that sorting and comparison is
performed in case-sensitive for BLOB values and case-insensitive for TEXT values.
35. What is the difference between mysql_fetch_array and mysql_fetch_object?
1 mysql ;
2 mysql mysql.out
37. Where MyISAM table will be stored and also give their formats of storage?
MyISAM
Heap
Merge
INNO DB
ISAM
If you want to enter characters as HEX numbers, you can enter HEX numbers with single
quotes and a prefix of (X), or just prefix HEX numbers with (Ox).
A HEX number string will be automatically converted into a character string, if the
expression context is a string.
1 SELECT * FROM
2 LIMIT 0,50;
46. What are the objects can be created using CREATE statement?
DATABASE
EVENT
FUNCTION
INDEX
PROCEDURE
TABLE
TRIGGER
USER
VIEW
47. How many TRIGGERS are allowed in MySql table?
BEFORE INSERT
AFTER INSERT
BEFORE UPDATE
AFTER UPDATE
BEFORE DELETE and
AFTER DELETE
48. What are the nonstandard string types?
TINYTEXT
TEXT
MEDIUMTEXT
LONGTEXT
49. What are all the Common SQL Function?
CONCAT(A, B) – Concatenates two string values to create a single string output. Often
used to combine two or more fields into one single field.
MONTH(), DAY(), YEAR(), WEEK(), WEEKDAY() – Extracts the given data from a date
value.
HOUR(), MINUTE(), SECOND() – Extracts the given data from a time value.
DATEDIFF(A, B) – Determines the difference between two dates and it is commonly used
to calculate age
An ACL (Access Control List) is a list of permissions that is associated with an object. This
list is the basis for MySQL server’s security model and it helps in troubleshooting problems
like users not being able to connect.
MySQL keeps the ACLs (also called grant tables) cached in memory. When a user tries to
authenticate or run a command, MySQL checks the authentication information and
permissions against the ACLs, in a predetermined order.
Software Development Life Cycle
Models and Methodologies
Introduction
The software industry includes many different processes, for example, analysis, development,
maintenance and publication of software. This industry also includes software services, such as
training, documentation, and consulting.
Our focus here about software development life cycle (SDLC). So, due to that different types of
projects have different requirements. Therefore, it may be required to choose the SDLC phases
according to the specific needs of the project. These different requirements and needs give us various
software development approaches to choose from during software implementation.
Types of Software developing life cycles (SDLC)
· Waterfall Model
· V-Shaped Model
· Easy to explain to the user· Structures · Assumes that the requirements of a system
approach.· Stages and activities are well can be frozen· Very difficult to go back to
defined· Helps to plan and schedule the any stage after it finished.· Little flexibility
project· Verification at each stage ensures and adjusting scope is difficult and
early detection of errors / misunderstanding· expensive.· Costly and required more time,
Each phase has specific deliverables in addition to detailed plan
V-Shaped Model
Description
It is an extension for waterfall model, Instead of moving down in a linear way, the process steps are
bent upwards after the coding phase, to form the typical V shape. The major difference between v-
shaped model and waterfall model is the early test planning in v-shaped model.
The usage
· Software requirements clearly defined and known
· Simple and easy to use.· Each phase has · Very inflexible, like the waterfall model.·
specific deliverables.· Higher chance of Little flexibility and adjusting scope is
success over the waterfall model due to the difficult and expensive.· Software is
development of test plans early on during the developed during the implementation phase,
life cycle.· Works well for where so no early prototypes of the software are
requirements are easily produced.· Model doesn’t provide a clear
understood. Verification and validation of the path for problems found during testing
product in early stages of product phases.· Costly and required more time, in
development addition to detailed plan
Prototyping Model
Description
It refers to the activity of creating prototypes of software applications, for example, incomplete
versions of the software program being developed. It is an activity that can occur in software
development. It used to visualize some component of the software to limit the gap of
misunderstanding the customer requirements by the development team. This also will reduce the
iterations may occur in waterfall approach and hard to be implemented due to inflexibility of the
waterfall approach. So, when the final prototype is developed, the requirement is considered to be
frozen.
· Throwaway prototyping: Prototypes that are eventually discarded rather than becoming a part of the
finally delivered software
· Evolutionary prototyping: prototypes that evolve into the final system through iterative
incorporation of user feedback.
· Incremental prototyping: The final product is built as separate prototypes. At the end the separate
prototypes are merged in an overall design.
· Extreme prototyping: used at web applications mainly. Basically, it breaks down web development
into three phases, each one based on the preceding one. The first phase is a static prototype that
consists mainly of HTML pages. In the second phase, the screens are programmed and fully
functional using a simulated services layer. In the third phase the services are implemented
The usage
· This process can be used with any software developing life cycle model. While this shall be focused
with systems needs more user interactions. So, the system do not have user interactions, such as,
system does some calculations shall not have prototypes.
The usage
It is used in shrink-wrap application and large system which built-in small phases or segments. Also
can be used in system has separated components, for example, ERP system. Which we can start with
budget module as first iteration and then we can start with inventory module and so forth.
The usage
It can be used with any type of the project, but it needs more involvement from customer and to be
interactive. Also, it can be used when the customer needs to have some functional requirement ready
in less than three weeks.