8 - Databases DBMS DBA Normalisation SQL Coding
Data integrity
A database maintains the integrity of the stored data by validating it and rejecting corrupted or incomplete data
A file simply stores whatever data it is given, so the data can be corrupted or overwritten
Data privacy
Databases provide access rights, which limit each user's access to the data stored in the database
Files do not have proper access rights. The only way to restrict access is to make copies of the file and give each user an edited copy which contains only the data meant for that user
Data redundancy
If files are copied and distributed, the same data is stored many times, which takes up unnecessary storage
A database organises the data efficiently to minimise data redundancy. Many users with different access rights can also access the same single database
Data dependency
When data is stored in a file, the structure of the file must be known. A database instead uses queries to store and retrieve data, so the user does not need to know how the data is stored in the database (this is known as the external level).
Also, when new data has to be stored in a file the programs may have to be changed, whereas in a database this is handled by the DBMS software and the users just add the new data
Relational databases
A database contains tables which store data
In the relational database model the tables are linked to one another
Relation
A table in the relational model
Tuple
A record (row) of a relation
Attribute
A field (column) of a relation
Candidate key
One or more attributes which together contain a unique value for each tuple
A table (relation) may have several attributes which contain unique values. Any of these could be used to uniquely identify each record (tuple)
Primary key
This is the candidate key which is chosen to uniquely identify each tuple
Each table must have a primary key. Because the primary key value is unique, no two tuples in the relation can be identical, which prevents duplicate records
So the primary key is chosen from the candidate keys
Secondary key
A candidate key which is not used as the primary key
This is mainly used for indexing, which is covered later
Foreign key
An attribute (or set of attributes) in one table which refers to the primary key of another table
This is how the relationships between tables are made
Referential integrity
The use of a foreign key ensures that a value can only be entered in one table if the same value already exists in the referenced table
This means that every foreign key value must match a primary key value in the referenced table
When the tables are created, the attributes are defined first, then the primary and foreign keys
Also, a primary key value cannot be updated or deleted while a foreign key value still refers to it
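A minimal sketch of referential integrity, using Python's built-in sqlite3 module so it can be run without installing a DBMS. The Band and Member tables and their field names are made up for illustration. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which other DBMSs do not need.

```python
import sqlite3

# Hypothetical tables: Band is the referenced table, Member holds the foreign key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch for enforcement

conn.execute("CREATE TABLE Band (BandID INTEGER PRIMARY KEY, BandName VARCHAR(30))")
conn.execute(
    "CREATE TABLE Member ("
    " MemberID INTEGER PRIMARY KEY,"
    " Surname VARCHAR(30),"
    " BandID INTEGER,"
    " FOREIGN KEY (BandID) REFERENCES Band(BandID))"
)

conn.execute("INSERT INTO Band VALUES (1, 'The Tuples')")
conn.execute("INSERT INTO Member VALUES (10, 'Newton', 1)")  # accepted: band 1 exists


def try_sql(statement):
    """Run one statement and report whether the DBMS accepted it."""
    try:
        conn.execute(statement)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Band 99 does not exist, so this foreign key value has nothing to refer to.
orphan_insert = try_sql("INSERT INTO Member VALUES (11, 'Xian', 99)")
# Band 1 is still referenced by member 10, so it cannot be deleted yet.
delete_referenced = try_sql("DELETE FROM Band WHERE BandID = 1")

print(orphan_insert, delete_referenced)  # rejected rejected
```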
An entity is an object, thing or person about which data is stored
In relational databases the tables must be linked, so E-R (entity-relationship) modelling helps us to find the links between the entities
2. Identify the type of relationship. Remember one to one relationships can be ignored
Cardinality of relationships
When finding the relationship between two entities we need to use logic
There are 4 types of relationship between entities, and we will give examples
Detailed E-R diagrams are not usually tested, so you can ignore them, but you need to know how to draw a basic one
One to One
Many to many
For example, in an auction centre many people can bid for a single product, and a single person can bid for many products
For example, what is the relationship between the BAND and MEMBER entities?
It is one to many: one band has many members
A diagram which represents these relationships is called an E-R diagram, and the questions usually give a lot of hints
An E-R diagram usually shows the relationships between many entities, not just two
There are a few things you need to remember every time you see these questions
The table/entity which contains the foreign key is the many side, and the table/entity whose primary key it refers to is the one side.
Always remember this
For a many to many relationship we make an intermediate table called a link entity, which joins the two entities
The link entity is a separate table with two foreign keys, one referring to the primary key of each entity
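The link entity idea can be sketched with the auction example, again using Python's sqlite3 module. The table names Bidder, Product and Bid and all of the data are assumptions made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Bidder  (BidderID  INTEGER PRIMARY KEY, Surname VARCHAR(30));
CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Title   VARCHAR(30));
-- Link entity: two foreign keys, one to each side of the many to many relationship.
CREATE TABLE Bid (
    BidderID  INTEGER REFERENCES Bidder(BidderID),
    ProductID INTEGER REFERENCES Product(ProductID),
    PRIMARY KEY (BidderID, ProductID)
);
INSERT INTO Bidder  VALUES (1, 'Newton'), (2, 'Xian');
INSERT INTO Product VALUES (100, 'Clock'), (200, 'Vase');
INSERT INTO Bid     VALUES (1, 100), (2, 100), (1, 200);
""")

# Many bidders for one product...
bidders_for_clock = [row[0] for row in conn.execute(
    "SELECT Bidder.Surname FROM Bid"
    " INNER JOIN Bidder ON Bid.BidderID = Bidder.BidderID"
    " WHERE Bid.ProductID = 100 ORDER BY Bidder.Surname")]

# ...and many products for one bidder.
products_for_newton = [row[0] for row in conn.execute(
    "SELECT Product.Title FROM Bid"
    " INNER JOIN Product ON Bid.ProductID = Product.ProductID"
    " WHERE Bid.BidderID = 1 ORDER BY Product.Title")]

print(bidders_for_clock)    # ['Newton', 'Xian']
print(products_for_newton)  # ['Clock', 'Vase']
```

Each side is now in a one to many relationship with the link entity, which is why the Bid table sits on the many side of both links.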
Normalisation
Normalisation is closely related to E-R modelling. However, normalisation organises already recorded data in the most efficient way possible, whereas E-R diagrams are made before the data is recorded
A flat file table (a single stand-alone table) often has repeating groups and other inefficiencies. Normalisation breaks the single table into several tables called entities and links them together
1 NF or 1st Normal Form
A relation in 1 NF can't have repeating groups, so any repeating groups are removed
In these notes a repeating group means a set of attributes whose same values appear again and again across records
The attributes whose values repeat are moved to a separate table, where each value is stored only once, and that table is linked to the table which still contains the repeating attribute
Jobdetails
Department   Payrate
Accounts     $25
HR           $28
So the repeating groups are collapsed together
Employeedetails
Department   Surname   Country   City
HR           Newton    UK        London
As we can see, the department values still repeat in the 2nd table but the pay rate no longer does. This keeps data redundancy to a minimum, which is very significant in large sets of data where many attributes have repeating groups
The 1st table is the Jobdetails entity, in which the department field acts as the primary key. The Employeedetails entity refers to the Jobdetails entity through the foreign key department
So again by the rule, the table which has the foreign key is the many side and the table with the primary key is the one side
The relationship is one to many
The foreign key of the Employeedetails entity is department, but what is its primary key? As the department field is not unique, it is not enough to uniquely identify each record, so we need to define another key.
The surname could be used as the primary key on its own. However, in this example we use the department and the surname together as a composite primary key, so that the foreign key is also part of the primary key
2 NF or 2nd Normal Form
A rule in the relational database model is that the non-key attributes must depend fully on the whole primary key. This means the primary key alone must determine the values of every other attribute in the record
However, when the primary key is made of two attributes, some attributes may depend on only one of them rather than on both. This is known as partial dependency
The example above is not ideal, so just focus on the method. We will state that the country and city depend on the surname but not on the department
The 2nd normal form removes any partial dependency, and the relation must already be in 1 NF
Jobdetails - Surnameorigin
Department   Surname
Accounts     Donaldson
HR           Newton
Accounts     Xian
Surnameorigin
Surname   Country   City
Newton    UK        London
As we can see, we made a separate table so that the country and the city depend only on its primary key. In the Surnameorigin table the primary key is the surname. The previous table is renamed from Employeedetails to Jobdetails - Surnameorigin
In the Jobdetails - Surnameorigin table both the surname and the department are foreign keys, as they link to the other two tables
This is the same as the link entity method used to implement a many to many relationship
3 NF or 3rd Normal Form
Again, the non-key attributes must depend only on the primary key.
Sometimes a non-key attribute depends on another non-key attribute
For example the city depends on the country, so the 3rd normal form removes any transitive dependency (non-key dependency). So a 4th table is made
Surnameorigin
Surname     Country
Donaldson   USA
Newton      UK
Xian        China
Countrydetails
Country   City
UK        London
China     Shanghai
As we can see, we made a separate table so that the city field depends only on the country field
The primary key of this table is Country, and the matching foreign key is in the previous table (Surnameorigin)
Normalisation is a method and it can be tricky, but just remember the principle below
Normal forms can't have repeating groups or any partial or transitive dependency
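The four tables from the worked example can be sketched in Python's sqlite3 module to check that nothing was lost by splitting the data. The table name JobdetailsSurnameorigin (written without the dash) and the choice of data types are assumptions; the rows are the ones shown in the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Jobdetails     (Department VARCHAR(20) PRIMARY KEY, Payrate INTEGER);
CREATE TABLE Countrydetails (Country VARCHAR(20) PRIMARY KEY, City VARCHAR(20));
CREATE TABLE Surnameorigin  (Surname VARCHAR(20) PRIMARY KEY, Country VARCHAR(20));
-- Link table: both parts of the composite primary key are foreign keys.
CREATE TABLE JobdetailsSurnameorigin (
    Department VARCHAR(20) REFERENCES Jobdetails(Department),
    Surname    VARCHAR(20) REFERENCES Surnameorigin(Surname),
    PRIMARY KEY (Department, Surname)
);
INSERT INTO Jobdetails     VALUES ('Accounts', 25), ('HR', 28);
INSERT INTO Countrydetails VALUES ('UK', 'London'), ('China', 'Shanghai');
INSERT INTO Surnameorigin  VALUES ('Donaldson', 'USA'), ('Newton', 'UK'), ('Xian', 'China');
INSERT INTO JobdetailsSurnameorigin VALUES
    ('Accounts', 'Donaldson'), ('HR', 'Newton'), ('Accounts', 'Xian');
""")

# Joining the tables back together rebuilds the original flat record for Newton.
newton = conn.execute("""
    SELECT j.Department, j.Payrate, s.Surname, s.Country, c.City
    FROM JobdetailsSurnameorigin AS js
    INNER JOIN Jobdetails     AS j ON js.Department = j.Department
    INNER JOIN Surnameorigin  AS s ON js.Surname    = s.Surname
    INNER JOIN Countrydetails AS c ON s.Country     = c.Country
    WHERE s.Surname = 'Newton'
""").fetchone()

# The Accounts pay rate is stored once even though two employees work there.
payrate_rows = conn.execute("SELECT COUNT(*) FROM Jobdetails").fetchone()[0]

print(newton)        # ('HR', 28, 'Newton', 'UK', 'London')
print(payrate_rows)  # 2
```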
External level
This is the level used by the external users of the data in the database, who store and retrieve data without any knowledge of the structure of the database. Access rights control which data they are able to access
Conceptual level
This is the level used by the DBA (database administrator), who creates the databases and tables. The DBA is also responsible for setting the access rights of the users at the external level. The DBA uses the DBMS software to perform these functions
Internal level
This is the level handled by the DBMS software, which controls how and where data is held in physical storage (the internal schema). This is the level where data is directly accessed or stored
DBA
This is the person who is in charge of the conceptual level and uses the DBMS software to create the
structure of the database
There are some tasks which the DBA is in charge of and you need to know these
In other words the DBMS software handles the internal operations of the database
Usually the DBA does not need to know SQL to create the database or to access or store data
The developer interface lets the DBA create the structures easily
The query processor is the software which is used to retrieve and manipulate data in the database
You will need to know the features provided by the DBMS software
5. Data dictionary - which gives the DBA the details of how the data in the database is stored in physical memory
Indexing
An index is a small secondary table which contains an attribute of unique values, each pointing to the matching tuple of the main table
This makes accessing data in a large table much faster
The separate table, called an index table, contains an attribute of the main table which has unique values (a candidate key)
We usually use the secondary key for this, so the index table points to the corresponding tuple for fast access
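A rough sketch of indexing a secondary key in Python's sqlite3 module. The Email field and the index name idx_client_email are hypothetical, and EXPLAIN QUERY PLAN is a SQLite-specific way of asking the DBMS how it intends to find the record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Clientdetails ("
    " ClientID INTEGER PRIMARY KEY,"
    " Email VARCHAR(50),"  # hypothetical secondary key (unique, but not the primary key)
    " DOB DATE)"
)
# The index plays the role of the small secondary table described above.
conn.execute("CREATE UNIQUE INDEX idx_client_email ON Clientdetails (Email)")
conn.execute("INSERT INTO Clientdetails VALUES (2345, 'newton@example.com', '2002-07-23')")

# Ask SQLite how it would look the record up by the secondary key.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Clientdetails WHERE Email = 'newton@example.com'"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # the last column holds the description
uses_index = "idx_client_email" in plan_text

print(plan_text)
print(uses_index)
```

The exact wording of the plan differs between SQLite versions, so the sketch only checks that the index name appears in it.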
SQL
We need to learn some SQL scripting for manipulating the data in the database, such as storing and retrieving data
DDL
We will see an example
A statement can span many lines, but each statement must end with ;
We use the keywords DATABASE and TABLE only in DDL and not in DML
The ALTER keyword is used to alter an existing table, such as adding a new field
For each attribute we need to define the data type. Standard SQL does not have a STRING data type, so we use varchar(x), where x is the maximum number of characters
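A minimal DDL sketch, run through Python's sqlite3 module. The Surname and Phone fields are assumptions added for illustration. SQLite has no CREATE DATABASE statement (opening the connection creates the database), so only CREATE TABLE and ALTER TABLE are shown.

```python
import sqlite3

# Connecting creates the (in-memory) database; on a server DBMS this step
# would be a CREATE DATABASE statement instead.
conn = sqlite3.connect(":memory:")

# Each attribute is given a data type, and the statement ends with ;
conn.execute("""
    CREATE TABLE Clientdetails (
        ClientID INTEGER,
        Surname  VARCHAR(30),
        DOB      DATE,
        PRIMARY KEY (ClientID)
    );
""")

# ALTER changes the structure of an existing table, here by adding a new field.
conn.execute("ALTER TABLE Clientdetails ADD COLUMN Phone VARCHAR(15);")

# PRAGMA table_info is SQLite-specific; it lists the fields of a table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Clientdetails)")]
print(columns)  # ['ClientID', 'Surname', 'DOB', 'Phone']
```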
DML
Adding data to the table
Clientdetails is the table name and inside the brackets are the field names
The values must be in the same order as the fields in the brackets
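A short sketch of adding a record with INSERT INTO, run through Python's sqlite3 module. The Surname field and its value are assumptions; the ClientID and date of birth are the ones used in the queries in these notes, with the date written in year-month-day form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Clientdetails ("
    " ClientID INTEGER PRIMARY KEY, Surname VARCHAR(30), DOB DATE)"
)

# Table name, then the field names in brackets, then the values in the same order.
conn.execute("""
    INSERT INTO Clientdetails (ClientID, Surname, DOB)
    VALUES (2345, 'Newton', '2002-07-23');
""")

row = conn.execute("SELECT ClientID, Surname, DOB FROM Clientdetails").fetchone()
print(row)  # (2345, 'Newton', '2002-07-23')
```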
SELECT DOB
FROM Clientdetails
WHERE ClientID = 2345;
So this will return 23/7/2002
We select which field we need displayed and from which table. The WHERE clause gives the condition; without it every value in the DOB attribute is displayed
SELECT DOB
FROM Clientdetails;
This has no condition, so it displays all the values in the DOB attribute/field
SELECT *
FROM Clientdetails
WHERE ClientID = 2345;
So this displays all the fields in the table Clientdetails for the record with ClientID 2345
If we need the data displayed in a particular order or grouping we can use:
SELECT DOB
FROM Clientdetails
WHERE ClientID = 2345
ORDER BY DOB ASC;
So this orders the results by date of birth. ASC gives ascending order, which is the default, and DESC gives descending order
or we could use:
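SELECT Department, COUNT(ClientID)
FROM Clientdetails
GROUP BY Department;
This means that the records are put into groups by department, with one row displayed per group (here the number of clients in each department). Each selected field should either be named in GROUP BY or be inside a function such as COUNT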
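A small sketch of ORDER BY and GROUP BY on made-up data, run through Python's sqlite3 module. It assumes Clientdetails has a Department field, and the extra clients 2346 and 2347 are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clientdetails (ClientID INTEGER PRIMARY KEY, Department VARCHAR(20), DOB DATE);
INSERT INTO Clientdetails VALUES
    (2345, 'HR',       '2002-07-23'),
    (2346, 'Accounts', '1999-01-15'),
    (2347, 'Accounts', '2001-11-02');
""")

# ASC sorts from the earliest date of birth to the latest, DESC the other way round.
oldest_first = [r[0] for r in conn.execute(
    "SELECT ClientID FROM Clientdetails ORDER BY DOB ASC")]
youngest_first = [r[0] for r in conn.execute(
    "SELECT ClientID FROM Clientdetails ORDER BY DOB DESC")]

# GROUP BY gives one row per department; COUNT(*) summarises each group.
per_department = conn.execute(
    "SELECT Department, COUNT(*) FROM Clientdetails"
    " GROUP BY Department ORDER BY Department").fetchall()

print(oldest_first)    # [2346, 2347, 2345]
print(youngest_first)  # [2345, 2347, 2346]
print(per_department)  # [('Accounts', 2), ('HR', 1)]
```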
The WHERE condition can also compare field values from different tables, not only from the same table
SELECT DOB
FROM Clientdetails
INNER JOIN Client-booking
WHERE Clientdetails.ClientID = 2345 AND Client-bookin
So this displays the field value when the conditions on both tables are true
We need to use INNER JOIN to join the other table. We also need to use a dot to show which table a field belongs to
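A complete INNER JOIN can be sketched as follows in Python's sqlite3 module. The second table is called Clientbooking here (without the dash), and its fields BookingID and BookingDate are hypothetical, as is all the booking data. In most DBMSs the matching condition between the two tables is written after the keyword ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Clientdetails (ClientID INTEGER PRIMARY KEY, DOB DATE);
CREATE TABLE Clientbooking (
    BookingID   INTEGER PRIMARY KEY,
    ClientID    INTEGER REFERENCES Clientdetails(ClientID),
    BookingDate DATE
);
INSERT INTO Clientdetails VALUES (2345, '2002-07-23'), (2346, '1999-01-15');
INSERT INTO Clientbooking VALUES (1, 2345, '2024-03-01'), (2, 2346, '2024-03-02');
""")

# The dot notation says which table each field belongs to.
rows = conn.execute("""
    SELECT Clientdetails.DOB, Clientbooking.BookingDate
    FROM Clientdetails
    INNER JOIN Clientbooking
        ON Clientdetails.ClientID = Clientbooking.ClientID
    WHERE Clientdetails.ClientID = 2345;
""").fetchall()

print(rows)  # [('2002-07-23', '2024-03-01')]
```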
UPDATE Clientdetails
SET ClientID = 2678
WHERE ClientID = 2345;
So this changes the ClientID which was 2345 to 2678
DELETE FROM Clientdetails
WHERE ClientID = 2678;
This deletes the whole record which has the ClientID 2678
Always remember that the records containing the foreign key must be deleted before the primary key record is deleted from the referenced table (referential integrity)
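The deletion order can be sketched in Python's sqlite3 module. The Clientbooking table is a hypothetical table holding the foreign key, and PRAGMA foreign_keys = ON is a SQLite-specific switch needed to enforce the rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn enforcement on
conn.executescript("""
CREATE TABLE Clientdetails (ClientID INTEGER PRIMARY KEY, DOB DATE);
CREATE TABLE Clientbooking (
    BookingID INTEGER PRIMARY KEY,
    ClientID  INTEGER REFERENCES Clientdetails(ClientID)
);
INSERT INTO Clientdetails VALUES (2678, '2002-07-23');
INSERT INTO Clientbooking VALUES (1, 2678);
""")

# Deleting the primary key record first fails, because a booking still refers to it.
try:
    conn.execute("DELETE FROM Clientdetails WHERE ClientID = 2678;")
    first_attempt = "deleted"
except sqlite3.IntegrityError:
    first_attempt = "rejected"

# Correct order: foreign key records first, then the primary key record.
conn.execute("DELETE FROM Clientbooking WHERE ClientID = 2678;")
conn.execute("DELETE FROM Clientdetails WHERE ClientID = 2678;")
remaining = conn.execute("SELECT COUNT(*) FROM Clientdetails").fetchone()[0]

print(first_attempt, remaining)  # rejected 0
```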
Summary
Here is a short summary of the keywords
ALTER - DDL, changes the structure of an existing table, such as adding a new field/attribute
INNER JOIN - DML, used to combine records from two different tables by comparing their field values