Lesson 2


Objectives of this Lesson

 To understand basic concepts of Database

 To know about traditional file system

 To know about different types of databases

 To understand the database management system (DBMS).


Definition

 A database is a shared collection of logically related data that is stored
to meet the requirements of different users of an organization.
Database
A database is a collection of information that is organized so that it can be
easily accessed, managed and updated.

The collected information could be in any number of formats.

Examples:
 Phone Book
 Address Book
 A database typically consists of:

 Tables (data file) • Collection of related records

 Fields (columns) • Single category of data to be stored in a database (ID,
name, age, etc.)

 Records (rows) • Collection of related fields in a database (all the fields for
one person, for example)
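
The table, field, and record terms above can be illustrated with a minimal Python sketch. The table name, field names, and values are made up for illustration and do not come from the lesson.

```python
# Illustrative only: the table, field names, and values are made up.
# A table is a collection of related records.
people = [
    {"id": 1, "name": "Asha", "age": 34},  # one record (row)
    {"id": 2, "name": "Ben", "age": 27},   # another record (row)
]

# A field (column) is a single category of data across all records.
names = [record["name"] for record in people]

# A record (row) holds all the fields for one person.
first_person = people[0]

print(names)
print(first_person)
```

Here each dictionary plays the role of a record, and each dictionary key plays the role of a field.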

 Character: smallest piece of data


 Entity: object that has data
 People
 Events
 Products

 Database management system (DBMS): program used to build databases

 Query: message requesting access to data


Database structures
 Database management system (DBMS) packages are designed to use a
specific data structure to provide end users with quick, easy access to
information stored in databases.
 Five fundamental database structures are:
 Hierarchical,
 Network,
 Relational,
 Object-oriented,
 Multidimensional models.
Hierarchical Structure
 Early mainframe DBMS packages used the hierarchical structure , in which
the relationships between records form a hierarchy or treelike structure.
 In the traditional hierarchical model, all records are dependent and arranged
in multilevel structures, consisting of one root record and any number of
subordinate levels.
 Thus, all of the relationships among records are one-to-many because each
data element is related to only one element above it.
 The data element or record at the highest level of the hierarchy (the
department data element in this illustration) is called the root element.
 Any data element can be accessed by moving progressively downward from a
root and along the branches of the tree until the desired record (e.g., the
employee data element) is located.
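
The downward walk from the root can be sketched in Python as follows. This is only an illustration: the department root and employee records follow the lesson's example, but the middle "project" level, the names, and the `find_path` helper are assumptions.

```python
# Hypothetical hierarchy: department (root) -> project -> employee.
# Every record has exactly one parent, so all relationships are one-to-many.
tree = {
    "name": "Department: Sales",
    "children": [
        {"name": "Project: A", "children": [
            {"name": "Employee: Asha", "children": []},
            {"name": "Employee: Ben", "children": []},
        ]},
        {"name": "Project: B", "children": [
            {"name": "Employee: Chen", "children": []},
        ]},
    ],
}


def find_path(node, target, path=()):
    """Walk downward from the root; return the path to target, or None."""
    path = path + (node["name"],)
    if node["name"] == target:
        return list(path)
    for child in node["children"]:
        found = find_path(child, target, path)
        if found is not None:
            return found
    return None


path_to_chen = find_path(tree, "Employee: Chen")
print(path_to_chen)
```

The only way to reach an employee record is through its single chain of parents, which is what makes the structure simple but rigid.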
Network Structure
 The network structure can represent more complex logical relationships
and is still used by some mainframe DBMS packages.
 It allows many-to-many relationships among records; that is, the network
model can access a data element by following one of several paths because
any data element or record can be related to any number of other data
elements.
 For example, in Figure , departmental records can be related to more than
one employee record, and employee records can be related to more than
one project record.
 Thus, you could locate all employee records for a particular department or
all project records related to a particular employee.
 It should be noted that neither the hierarchical nor the network data
structures are commonly found in the modern organization.
 The next data structure we discuss, the relational data structure, is the
most common of all and serves as the foundation for most modern
databases in organizations.
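
A rough Python sketch of the many-to-many links in the network structure is shown below. The record names and the three helper functions are hypothetical, and links are stored as plain pairs rather than in the pointer format a real network DBMS would use.

```python
# Hypothetical records; each link is stored as an (owner, member) pair.
dept_employees = [("Sales", "Asha"), ("Sales", "Ben"), ("R&D", "Chen")]
employee_projects = [("Asha", "P1"), ("Asha", "P2"), ("Ben", "P2"), ("Chen", "P2")]


def employees_in(dept):
    """All employee records linked to one department."""
    return [e for d, e in dept_employees if d == dept]


def projects_of(employee):
    """All project records linked to one employee."""
    return [p for e, p in employee_projects if e == employee]


def employees_on(project):
    """Follow the same links in the other direction."""
    return [e for e, p in employee_projects if p == project]


print(employees_in("Sales"))
print(projects_of("Asha"))
print(employees_on("P2"))
```

Because a record can take part in any number of links, the same project can be reached through several different employees, which a strict tree cannot express.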
Relational Structure
 The relational model is the most widely used of the five database
structures.
 In the relational model, all data elements within the database are viewed
as being stored in the form of simple two-dimensional tables, sometimes
referred to as relations.
 The tables in a relational database are flat files that have rows and
columns.
 Each row represents a single record in the file, and each column
represents a field.
Relational Operations
 Three basic operations can be performed on a relational database to create
useful sets of data.
 The select operation is used to create a subset of records that meet a
stated criterion.
 For example, a select operation might be used on an employee database to
create a subset of records that contain all employees who make more than
$30,000 per year and who have been with the company more than three
years.
 Another way to think of the select operation is that it temporarily creates a
table whose rows have records that meet the selection criteria.
 The join operation can be used to combine two or more tables temporarily
so that a user can see relevant data in a form that looks like it is all in one
big table.

 Using this operation, a user can ask for data to be retrieved from multiple
files or databases without having to go to each one separately.
 The project operation is used to create a subset of the columns contained in the
temporary tables created by the select and join operations.
 Just as the select operation creates a subset of records that meet stated criteria,
the project operation creates a subset of the columns, or fields, that the user
wants to see.
 Using a project operation, the user can decide not to view all of the columns in
the table but instead view only those that have the data necessary to answer a
particular question or construct a specific report.
 Because of the widespread use of relational models, an abundance of commercial
products exist to create and manage them.
 Leading mainframe relational database applications include Oracle 10g from
Oracle Corp. and DB2 from IBM.
 The most commonly used database application for the PC is Microsoft Access.
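
The select, join, and project operations described in this section can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table names, column names, and sample rows are invented, and SQLite stands in for whichever relational product is actually in use.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER,
        years INTEGER, dept_id INTEGER
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employee VALUES
        (1, 'Asha', 42000, 5, 1),
        (2, 'Ben', 28000, 6, 1),
        (3, 'Chen', 55000, 2, 2);
""")

# Select: keep only the rows (records) that meet the stated criterion.
selected = conn.execute(
    "SELECT * FROM employee WHERE salary > 30000 AND years > 3"
).fetchall()

# Project: keep only the columns (fields) the user wants to see.
projected = conn.execute(
    "SELECT name, salary FROM employee ORDER BY emp_id"
).fetchall()

# Join: combine two tables temporarily so the result looks like one table.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

print(selected)
print(projected)
print(joined)
```

In SQL the WHERE clause does the selecting and the column list after SELECT does the projecting, so the relational "select" operation is not the same thing as the SQL keyword SELECT.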
Object-Oriented Structure
 The object-oriented model is considered one of the key technologies of a new
generation of multimedia Web-based applications.

 An object consists of data values describing the attributes of an entity, plus the
operations that can be performed upon the data.

 The object-oriented model also supports inheritance ; that is, new objects can be
automatically created by replicating some or all of the characteristics of one or
more parent objects.

 Such capabilities have made object-oriented database management systems
(OODBMS) popular in computer-aided design (CAD) and a growing number of
applications.
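
The ideas of an object (data plus operations) and inheritance can be sketched with ordinary Python classes. The `Shape` and `Rectangle` classes are hypothetical, and this shows the programming-language notion of an object, not the storage format of any particular OODBMS.

```python
class Shape:
    """An object bundles attribute values with operations on that data."""

    def __init__(self, name):
        self.name = name  # attribute describing the entity

    def area(self):
        return 0  # placeholder; specialised shapes override this

    def describe(self):
        # Operation defined once here and reused by every child class.
        return f"{self.name} with area {self.area()}"


class Rectangle(Shape):
    """Inherits the attributes and operations of Shape, then adds its own."""

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


rect = Rectangle(3, 4)
description = rect.describe()
print(description)
```

`Rectangle` never defines `describe`, yet it can use it, which is the kind of reuse that makes the model attractive for design data with many related part types.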
Multidimensional Structure
 The multidimensional model is a variation of the relational model that uses
multidimensional structures to organize data and express the relationships
between data.

 You can visualize multidimensional structures as cubes of data and cubes within
cubes of data.

 Each side of the cube is considered a dimension of the data.

 Multidimensional databases have become the most popular database structure
for the analytical databases that support online analytical processing (OLAP)
applications, in which fast answers to complex business queries are expected.
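
A data cube can be approximated in Python as a dictionary keyed by one value per dimension. The dimensions, sales figures, and the `roll_up` and `slice_cube` helpers below are assumptions chosen for illustration; real OLAP products use far more elaborate storage.

```python
from collections import defaultdict

# Hypothetical sales facts keyed by three dimensions: (product, region, quarter).
DIMENSIONS = ("product", "region", "quarter")
cube = {
    ("laptop", "north", "Q1"): 120,
    ("laptop", "south", "Q1"): 80,
    ("phone", "north", "Q1"): 200,
    ("phone", "north", "Q2"): 150,
}


def roll_up(cube, dimension):
    """Total the measure for each member of one dimension."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[key[axis]] += value
    return dict(totals)


def slice_cube(cube, dimension, member):
    """Keep only the cells where one dimension has a fixed value."""
    axis = DIMENSIONS.index(dimension)
    return {key: value for key, value in cube.items() if key[axis] == member}


by_product = roll_up(cube, "product")
north_only = slice_cube(cube, "region", "north")
print(by_product)
print(north_only)
```

Each position in the key tuple corresponds to one side of the cube, so summarising "by product" or "by quarter" is the same operation applied along a different axis.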
 Key:
 A key is a database field whose purpose is to uniquely identify a record.
 Keys help enforce data integrity and avoid duplication. The main types of
keys used in a database are candidate keys, primary keys, and foreign keys.

 A primary key is the candidate key most appropriate to serve as the main
key for a table. It uniquely identifies each record in the table.
 A foreign key is a column that creates a relationship between two tables.

 The purpose of foreign keys is to maintain data integrity and allow
navigation between two different instances of an entity.

 A foreign key acts as a cross-reference between two tables because it
references the primary key of another table.
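
How primary and foreign keys protect data integrity can be demonstrated with Python's sqlite3 module. The tables, rows, and the `try_insert` helper are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other database systems usually enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'Asha', 1)")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Reuses primary key 10, so it would duplicate an existing record.
duplicate_ok = try_insert("INSERT INTO employee VALUES (10, 'Ben', 1)")

# Points at department 99, which does not exist.
orphan_ok = try_insert("INSERT INTO employee VALUES (11, 'Chen', 99)")

print(duplicate_ok, orphan_ok)
```

Both inserts are rejected: the first by the primary key (no duplicate records) and the second by the foreign key (no reference to a missing department).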
Foreign key
 A database query is a request for data from a database. Usually the
request is to retrieve data; however, data can also be manipulated
using queries. The data can come from one or more tables, or even
other queries.
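
A short sketch of one retrieval query and one manipulation query, using Python's sqlite3 module and a made-up phone book table (the table name, columns, and phone numbers are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contact (name, phone) VALUES (?, ?)",
    [("Asha", "555-0100"), ("Ben", "555-0101")],
)

# Retrieval query: ask the database for Ben's phone number.
phone = conn.execute(
    "SELECT phone FROM contact WHERE name = ?", ("Ben",)
).fetchone()[0]

# Manipulation query: change the stored data instead of just reading it.
conn.execute("UPDATE contact SET phone = ? WHERE name = ?", ("555-0199", "Ben"))
updated = conn.execute(
    "SELECT phone FROM contact WHERE name = ?", ("Ben",)
).fetchone()[0]

print(phone, updated)
```

The `?` placeholders pass values separately from the query text, which is the usual way to build queries from user input safely.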
 Centralized database: a type of database that stores all of its data in
one central database system.
 It is located, stored, and maintained in a single location.
 An example of a centralized database is a central library that holds the
central database for every library in a college or university.
Advantages
 The data integrity is maximized as the whole database is stored at a single
physical location. This means that it is easier to coordinate the data and it is as
accurate and consistent as possible.

 The data redundancy is minimal in the centralized database. All the data is
stored together and not scattered across different locations. So, it is easier to
make sure there is no redundant data available.

 Since all the data is in one place, there can be stronger security measures around
it. So, the centralized database is much more secure.

 Data is easily portable because it is stored at the same place.


 The centralized database is cheaper than other types of databases as it
requires less power and maintenance.

 All the information in the centralized database can be easily accessed
from the same location and at the same time.
Disadvantages
 Since all the data is at one location, it takes more time to search and access
it. If the network is slow, this process takes even more time.

 There is a lot of data access traffic for the centralized database. This may
create a bottleneck situation.

 Since all the data is at the same location, if multiple users try to access it
simultaneously it creates a problem. This may reduce the efficiency of the
system.

 If there are no database recovery measures in place and a system failure
occurs, then all the data in the database will be destroyed.
 Distributed Database: Unlike a centralized database system, in
distributed systems, data is distributed among different database systems
of an organization.
 These database systems are connected via communication links. Such links
help the end-users to access the data easily.
 Examples of the Distributed database are Apache Cassandra, HBase,
Ignite, etc.

 A distributed database management system (DDBMS) is a centralized
software system that manages a distributed database in a manner as if it
were all stored in a single location.
 Features
 It is used to create, retrieve, update and delete distributed databases.
 It synchronizes the database periodically and provides access mechanisms
by the virtue of which the distribution becomes transparent to the users.
 It ensures that the data modified at any site is universally updated.
 It is used in application areas where large volumes of data are processed
and accessed by numerous users simultaneously.
 It is designed for heterogeneous database platforms.
 It maintains confidentiality and data integrity of the databases.
Disadvantages
 Complexity of management and control

 Increased storage requirement

 More complex query processing

 More complexity in shared updates


 Relational Database: This database is based on the relational data model,
which stores data in the form of rows (tuples) and columns (attributes) that
together form a table (relation).
 A relational database uses SQL for storing, manipulating, as well as
maintaining the data.
 Each table in the database carries a key that makes the data unique from
others. Examples of Relational databases are MySQL, Microsoft SQL
Server, Oracle, etc.
Relationships in database design.
 There are three types of relationships in database design.
 One-to-many:
 The most common type. For example, one student can study multiple
subjects.
 Many-to-many:
 Each student could work on multiple projects, and each project could
employ multiple students.

 One-to-one:
 Probably the least common type of database relationship.
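
The many-to-many case (students and projects) is usually stored with a third, linking table. The sketch below uses Python's sqlite3 module; the table names, column names, and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT);
    -- Linking table: one row per (student, project) pairing.
    CREATE TABLE student_project (
        student_id INTEGER REFERENCES student(student_id),
        project_id INTEGER REFERENCES project(project_id),
        PRIMARY KEY (student_id, project_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO project VALUES (1, 'Robot'), (2, 'Website');
    INSERT INTO student_project VALUES (1, 1), (1, 2), (2, 2);
""")

# One student, many projects.
projects_of_asha = [row[0] for row in conn.execute("""
    SELECT p.title
    FROM project AS p
    JOIN student_project AS sp ON sp.project_id = p.project_id
    WHERE sp.student_id = 1
    ORDER BY p.project_id
""")]

# One project, many students.
students_on_website = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM student AS s
    JOIN student_project AS sp ON sp.student_id = s.student_id
    WHERE sp.project_id = 2
    ORDER BY s.student_id
""")]

print(projects_of_asha)
print(students_on_website)
```

The linking table turns one many-to-many relationship into two one-to-many relationships, which is why one-to-many is the type seen most often in practice.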
Advantages
 Accuracy:
 Data is stored just once, eliminating data duplication.

 Flexibility:
 Complex queries are easy for users to carry out.

 Collaboration:
 Multiple users can access the same database.
 Trust:
 Relational database models are mature and well-understood.
 Cloud Database: A type of database where data is stored in a virtual
environment and executes over the cloud computing platform. It provides
users with various cloud computing services (SaaS, PaaS, IaaS, etc.) for
accessing the database. There are numerous cloud platforms, but the best
options are:
 Amazon Web Services(AWS)
 Microsoft Azure
 ScienceSoft
 Google Cloud SQL, etc.
Disadvantages?
 Hierarchical Databases: It is the type of database that stores data in the
form of parent-children relationship nodes. Here, it organizes data in a
tree-like structure.
 Data get stored in the form of records that are connected via links. Each
child record in the tree will contain only one parent. On the other hand,
each parent record can have multiple child records.
Database Management System
 A Database Management System (DBMS) is software for storing and
retrieving users' data while applying appropriate security measures.
 A general-purpose DBMS allows users to create their own databases as
per their requirement.
 MySQL
 Oracle
 SQL Server
 IBM DB2
 PostgreSQL
 Amazon SimpleDB (cloud based), etc.
Advantages of DBMS
 Improved data sharing

 Improved data security

 A DBMS uses various powerful functions to store and retrieve data
efficiently.

 DBMS offers a variety of techniques to store & retrieve data

 Improved decision making


Traditional File Processing System
 The traditional filing system is a method
of storing and arranging computer files
and the information in the file (data).
 In the traditional approach, we used to
store information in flat files which are
maintained by the file system under
the operating system’s control. Here, flat
files are files containing records having
no structured relationship among them.
 Before computers were used, a manual file system was used to maintain
records and files. Data was stored and processed using this traditional
file system, which made it easy to find information. In the traditional
file system, each file is independent of the other files.
 Each functional area in the organization creates and processes its own files.
 Areas such as inventory and payroll generate separate files that do not
communicate with each other. This organization was simple to set up and
gave better local control, but the organization's data is dispersed
throughout the functional subsystems.
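
The effect of independent, non-communicating files can be sketched in Python. The two flat files, their columns, and the `read_records` helper are hypothetical; the files are held as strings so the example runs without touching the disk.

```python
import csv
import io

# Hypothetical flat files owned by two functional areas.
# Both keep their own copy of the employee's address.
payroll_file = "emp_id,name,address,salary\n1,Asha,12 Oak St,42000\n"
personnel_file = "emp_id,name,address,dept\n1,Asha,12 Oak St,Sales\n"


def read_records(text):
    """Parse one flat file into a list of records (dictionaries)."""
    return list(csv.DictReader(io.StringIO(text)))


payroll = read_records(payroll_file)
personnel = read_records(personnel_file)

# Personnel updates its own copy; nothing carries the change over to payroll.
personnel[0]["address"] = "7 Elm Rd"

inconsistent = payroll[0]["address"] != personnel[0]["address"]
print(inconsistent)
```

Because nothing links the two files, the same fact is stored twice and the copies drift apart, which is the dispersal problem a DBMS is meant to address.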
Advantages of Traditional File System
 Store and arrange the computer files.

 Simple to use

 Less complex.

 No need for a specialist.
Features of Traditional File System
 It stores data in a group of files.

 File data is tightly coupled to the application programs that use it.

 C/C++ and COBOL languages were used to design the files.

 It is very difficult to maintain the traditional file processing system.

 The traditional file system is also called a flat file system.


Disadvantages
 It is time consuming.

 It is inefficient for maintaining the records of a big firm with a large
number of items.

 It requires a lot of manual labor.

 It becomes more complex whenever the information has to be changed.
