Lecture 1 Addition What Is Data?
Lecture 1 Addition What Is Data?
What is Data?
Data is nothing but facts and statistics stored or free flowing over a network, generally it's
raw and unprocessed. For example: When you visit any website, they might store you IP
address, that is data, in return they might add a cookie in your browser, marking you that
you visited the website, that is data, your name, it's data, your age, it's data.
Data becomes information when it is processed, turning it into something meaningful. Like,
based on the cookie data saved on user's browser, if a website can analyse that generally
men of age 20-25 visit us more, that is information, derived from the data collected.
What is a Database?
A Database is a collection of related data organised in a way that data can be easily
accessed, managed and updated. Database can be software based or hardware based,
with one sole purpose, storing data.
During early computer days, data was collected and stored on tapes, which were mostly
write-only, which means once data is stored on it, it can never be read again. They were
slow and bulky, and soon computer scientists realised that they needed a better solution to
this problem.
Larry Ellison, the co-founder of Oracle was amongst the first few, who realised the need
for a software based Database Management System.
What is DBMS?
A DBMS is a software that allows creation, definition and manipulation of database,
allowing users to store, process and analyse data easily. DBMS provides us with an
interface or a tool, to perform various operations like creating database, storing data in it,
updating data, creating tables in the database and a lot more.
DBMS also provides protection and security to the databases. It also maintains data
consistency in case of multiple users.
Here are some examples of popular DBMS used these days:
MySql
Oracle
SQL Server
IBM DB2
PostgreSQL
Amazon SimpleDB (cloud based) etc.
1. Data stored into Tables: Data is never directly stored into the database. Data is stored
into tables, created inside the database. DBMS also allows to have relationships
between tables which makes the data more meaningful and connected. You can easily
understand what type of data is stored where by looking at all the tables created in a
database.
2. Reduced Redundancy: In the modern world hard drives are very cheap, but earlier
when hard drives were too expensive, unnecessary repetition of data in database was a
big problem. But DBMS follows Normalisation which divides the data in such a way
that repetition is minimum.
3. Data Consistency: On Live data, i.e. data that is being continuosly updated and added,
maintaining the consistency of data can become a challenge. But DBMS handles it all
by itself.
4. Support Multiple user and Concurrent Access: DBMS allows multiple users to work
on it(update, insert, delete data) at the same time and still manages to maintain the data
consistency.
5. Query Language: DBMS provides users with a simple Query language, using which
data can be easily fetched, inserted, deleted and updated in a database.
6. Security: The DBMS also takes care of the security of data, protecting the data from
un-authorised access. In a typical DBMS, we can create user accounts with different
access permissions, using which we can easily secure our data by restricting user
access.
7. DBMS supports transactions, which allows us to better handle and manage data
integrity in real world applications where multi-threading is extensively used.
Advantages of DBMS
Segregation of applicaion program.
Minimal data duplicacy or data redundancy.
Easy retrieval of data using the Query Language.
Reduced development time and maintainance need.
With Cloud Datacenters, we now have Database Management Systems capable of
storing almost infinite data.
Seamless integration into the application programming languages which makes it very
easier to add a database to almost any application or website.
Disadvantages of DBMS
It's Complexity
Except MySQL, which is open source, licensed DBMSs are generally costly.
They are large in size.
Components of DBMS
The database management system can be divided into five major components, they are:
1. Hardware
2. Software
3. Data
4. Procedures
5. Database Access Language
Let's have a simple diagram to see how they all fit together to form a database management
system.
Such an architecture provides the DBMS extra security as it is not exposed to the End User
directly. Also, security can be improved by adding security and authentication checks in the
Application layer too.
Hierarchical Model
Network Model
Entity-relationship Model
Relational Model
Hierarchical Model
This database model organises data into a tree-like-structure, with a single root, to which all
the other data is linked. The heirarchy starts from the Root data, and expands like a tree,
adding child nodes to the parent nodes.
In this model, a child node will only have a single parent node.
This model efficiently describes many real-world relationships like index of a book, recipes
etc.
In hierarchical model, data is organised into tree-like structure with one one-to-many
relationship between two different types of data, for example, one department can have
many courses, many professors and of-course many students.
Network Model
This is an extension of the Hierarchical model. In this model data is organised more like a
graph, and are allowed to have more than one parent node.
In this database model data is more related as more relationships are established in this
database model. Also, as the data is more related, hence accessing the data is also easier
and fast. This database model was used to map many-to-many data relationships.
This was the most widely used database model, before Relational Model was introduced.
Entity-relationship Model
In this database model, relationships are created by dividing object of interest into entity and
its characteristics into attributes.
Different entities are related using relationships.
E-R Models are defined to represent the relationships into pictorial form to make it easier for
different stakeholders to understand.
This model is good to design a database, which can then be turned into tables in relational
model(explained below).
Let's take an example, If we have to design a School Database, then Student will be
an entity with attributes name, age, address etc. As Address is generally complex, it can
be another entity with attributes street name, pincode, city etc, and there will be a
relationship between them.
Relationships can also be of different types. To learn about E-R Diagrams in details, click
on the link.
Relational Model
In this model, data is organised in two-dimensional tables and the relationship is
maintained by storing a common field.
This model was introduced by E.F Codd in 1970, and since then it has been the most widely
used database model, infact, we can say the only database model used around the world.
The basic structure of data in the relational model is tables. All the information related to a
particular type is stored in rows of that table.
Hence, tables are also known as relations in relational model.
In the coming tutorials we will learn how to design tables, normalize them to reduce data
redundancy and how to use Structured Query language to access data from tables.
Basic Concepts of ER Model in
DBMS
As we described in the tutorial Database models, Entity-relationship model is a model used
for design and representation of relationships between data.
The main data objects are termed as Entities, with their details defined as attributes, some
of these attributes are important and are used to identity the entity, and different entities are
related using relationships.
In short, to understand about the ER Model, we must understand about:
Let's take an example to explain everything. For a School Management Software, we will
have to store Student information, Teacher information, Classes, Subjects taught in each
class etc.
ER Model: Attributes
If a Student is an Entity, then student's roll no., student's name, student's age,
student's gender etc will be its attributes.
An attribute can be of many types, here are different types of attributes defined in ER
database model:
1. Simple attribute: The attributes with values that are atomic and cannot be broken down
further are simple attributes. For example, student's age.
2. Composite attribute: A composite attribute is made up of more than one simple
attribute. For example, student's address will contain, house no., street
name, pincode etc.
3. Derived attribute: These are the attributes which are not present in the whole database
management system, but are derived using other attributes. For example, average age
of students in a class.
4. Single-valued attribute: As the name suggests, they have a single value.
5. Multi-valued attribute: And, they can have multiple values.
ER Model: Keys
If the attribute roll no. can uniquely identify a student entity, amongst all the students, then
the attribute roll no. will be said to be a key.
Following are the types of Keys:
1. Super Key
2. Candidate Key
3. Primary Key
ER Model: Relationships
When an Entity is related to another Entity, they are said to have a relationship. For
example, A Class Entity is related to Student entity, becasue students study in classes,
hence this is a relationship.
Depending upon the number of entities involved, a degree is assigned to relationships.
For example, if 2 entities are involved, it is said to be Binary relationship, if 3 entities are
involved, it is said to be Ternary relationship, and so on.
In the next tutorial, we will learn how to create ER diagrams and design databases using ER
diagrams.
Components of ER Diagram
Entitiy, Attributes, Relationships etc form the components of ER Diagram and there are
defined symbols and shapes to represent each one of them.
Let's see how we can represent these in our ER Diagram.
Entity
Simple rectangular box represents an Entity.
Relationships between Entities - Weak and Strong
Rhombus is used to setup relationships between two or more entities.
Weak Entity
A weak Entity is represented using double rectangular boxes. It is generally connected to
another entity.
1. Binary Relationship
2. Recursive Relationship
3. Ternary Relationship
ER Diagram: Binary Relationship
Binary Relationship means relation between two Entities. This is further divided into three
types.
The above example describes that one student can enroll only for one course and a course
will also have only one Student. This is not what you will usually see in real-world
relationships.
The above diagram represents that one student can enroll for more than one courses. And
a course can have more than 1 student enrolled in it.