IM 101 - Fundamentals of Database Systems - Unit 2


Copyright © 2020 by the Pamantasan ng Lungsod ng Valenzuela

All rights reserved. No part of this module may be reproduced, repurposed, distributed, or transmitted in
any form or by any means including photocopying, reprinting, or other electronic or mechanical
methods without the prior written permission of PLV and the individual developers of instructional
materials (IMs) except in the case of brief quotations embodied in critical and creative reviews and
certain other noncommercial uses permitted by the Copyright Law. For permission request, address your
written correspondence whether printed or electronic to the Chair of the University Committee on
Instructional Materials Development and Evaluation at the address below:

Pamantasan ng Lungsod ng Valenzuela


Tongco St., Maysan, Valenzuela City
College: Engineering and Information Technology
Department: Information Technology
Course Code: IM 101
Course Title: Fundamentals of Database Systems
Faculty: Rommel P. Apostol, MIT
Chairperson: Patrick Luis M. Francisco, MIT

Understanding the Flow of Data in Everyday Transactions,


A module in IM 101: Fundamentals of Database Systems
Foreword
This module aims to help students understand, become familiar with, and adopt fundamental data
processes and operations so that they can use them to develop more efficient and secure information
systems. It contains lessons that introduce the core concepts of database systems and database
management, and it answers the essential questions presented in each part of the module. The learning
outcomes in each part help students address those questions, and at certain points in the discussion
students are evaluated through different activities.
At the end of this module, the student will be able to understand the basic concepts and uses of
database systems and to use tools and software to manipulate them.
Table of Contents
Unit Two
Essential Questions
Intended Learning Outcomes
Assessment Task
Diagnostic
Formative
Summative
Lessons Input
References

Unit Two – Types of Databases and File Systems


This unit introduces the different database and file systems that were used, and are still being used, by
individuals and companies worldwide.

 Essential Question
How can we differentiate the types of database systems?
How do we differentiate the different file systems?
What are the advantages and disadvantages of these file systems?

 Intended Learning Outcomes


Know the different types of Database and File Systems, their uses, advantages and
disadvantages.

 Diagnostic Assessment Task


At the start of the lesson, the instructor will provide the following activity to gauge the students'
prior understanding of the lesson:
1. The instructor will give the students a group of items that they need to categorize into their
respective types of database and file system.

 Lessons Input
Types of Databases
 number of users

o A single-user database supports only one user at a time.

 A single-user database that runs on a personal computer is called a desktop database.

 Examples include a database on a personal computer, on a smartphone, or on a game console running a standalone game.

Figure 6. Single-user database setup


o A multiuser database supports multiple users at the same time.

 When a multiuser database supports a relatively small number of users (usually fewer than 50) or a specific department within an organization, it is called a workgroup database.

 When the database is used by the entire organization and supports many users (more than 50, usually hundreds) across many departments, it is known as an enterprise database.

 Examples are the databases of banks, insurance agencies, stock exchanges, supermarkets, etc.

Figure 7. Multiuser database setup



 Location

o A database that supports data located at a single site is called a centralized database.

Centralized systems use a client/server architecture in which one or more client nodes are directly
connected to a central server. This is the most commonly used type of system in many organisations:
a client sends a request to a company server and receives the response.

Figure 8. Centralised system visualisation


Example:
Wikipedia. Consider a massive server to which we send our requests and which responds with the
article that we requested. Suppose we enter the search term ‘junk food’ in the Wikipedia search bar.
This search term is sent as a request to the Wikipedia servers (mostly located in Virginia, U.S.A.),
which then respond with articles ranked by relevance. In this situation, we are the client node and
the Wikipedia servers are the central server.

Characteristics of Centralized System

o Presence of a global clock: As the entire system consists of a central node (a server/master) and
many client nodes (computers/slaves), all client nodes sync up with the global clock (the clock of
the central node).
o One single central unit: A single central unit serves/coordinates all the other nodes in the
system.
o Dependent failure of components: Failure of the central node causes the entire system to fail.
This makes sense because when the server is down, no other entity is there to receive requests
and send responses.
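
As a rough illustration of the "single central unit" and "dependent failure" characteristics, the following Python sketch models one server and several clients inside a single process. The class names `CentralServer` and `Client` and the sample article are hypothetical, chosen only for this example; no real networking is involved.

```python
class CentralServer:
    """Hypothetical stand-in for the single central node."""

    def __init__(self, articles):
        self.articles = articles  # all data lives on this one node
        self.online = True

    def handle(self, term):
        if not self.online:
            raise ConnectionError("central server is down")
        return self.articles.get(term, "no article found")


class Client:
    """Hypothetical client node; it can only talk to the central server."""

    def __init__(self, name, server):
        self.name = name
        self.server = server

    def search(self, term):
        try:
            return self.server.handle(term)
        except ConnectionError as err:
            return f"{self.name}: request failed ({err})"


server = CentralServer({"junk food": "Article about junk food"})
clients = [Client("client-1", server), Client("client-2", server)]

print([c.search("junk food") for c in clients])  # both clients are served

server.online = False  # the central node fails...
print([c.search("junk food") for c in clients])  # ...and every client fails with it
```

Because every client holds a reference to the same server object, switching that one object off is enough to make all requests fail, which is the behaviour the third characteristic describes.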

Scaling
Only vertical scaling of the central server is possible. Horizontal scaling would contradict the
defining characteristic of this system: a single central unit.

Components of Centralized System


Components of Centralized System are,
 Node (Computer, Mobile, etc.).
 Server.
 Communication link (Cables, Wi-Fi, etc.).

Architecture of Centralized System


Client-Server architecture. The central node that serves the other nodes in the system is the server
node and all the other nodes are the client nodes.

Limitations of Centralized System


 Can’t scale up vertically after a certain limit – Beyond a certain point, even if you increase the
hardware and software capabilities of the server node, the performance will not increase appreciably,
so the benefit gained no longer justifies the cost (a benefit/cost ratio below 1).
 Bottlenecks can appear when the traffic spikes – The server can only handle a finite number of
simultaneous connections from client nodes. So, when high traffic occurs, as during a shopping
sale, the server can be overwhelmed in much the same way as under a Denial-of-Service or
Distributed Denial-of-Service attack.

Advantages of Centralized System


 Easy to physically secure – It is easy to secure and service the server and client nodes by virtue of
their location
 Smooth and elegant personal experience – A client has a dedicated system to use (for
example, a personal computer), and the company has a similar system which can be modified to
suit custom needs
 Dedicated resources (memory, CPU cores, etc.)
 More cost-efficient for small systems up to a certain limit – As central systems take less funds
to set up, they have an edge when small systems have to be built
 Quick updates are possible – Only one machine to update
 Easy detachment of a node from the system – Just remove the connection of the client node from
the server and voila! Node detached.

Disadvantages of Centralized System


 Highly dependent on network connectivity – The system can fail if the nodes lose connectivity, as
there is only one central node
 No graceful degradation of the system – The entire system fails abruptly
 Less possibility of data backup – If the server node fails and there is no backup, you lose the data
straight away
 Difficult server maintenance – There is only one server node, and for availability reasons it is
inefficient and unprofessional to take the server down for maintenance. So, updates have to be
done on the fly (hot updates), which is difficult and could break the system.

Applications of Centralized System


 Application development – It is very easy to set up a central server and send client requests. Many
modern frameworks come with default test servers which can be launched with a couple of
commands, for example an Express server or a Django server.
 Data analysis – Easy to do data analysis when all the data is in one place and available for
analysis
 Personal computing

Use Cases
 Centralized databases – all the data in one server for use.
 Single-player games like Need For Speed and GTA Vice City – the entire game runs on one
system (commonly a personal computer)
 Application development by deploying test servers leading to easy debugging, easy deployment,
easy simulation
 Personal Computers

Organisations Using
National Informatics Center (India), IBM

o A database in which every node makes its own decisions, so that the final behavior of
the system is the aggregate of the decisions of the individual nodes, is
called a decentralized database

Figure 9. Decentralised system visualisation


Example
Bitcoin. Let’s take Bitcoin as an example because it is the most popular use case of decentralized
systems. No single entity/organisation owns the Bitcoin network. The network is the sum of all the
nodes, which talk to each other to maintain the amount of bitcoin every account holder has.

Characteristics of Decentralized System


 Lack of a global clock: Every node is independent of the others and hence runs and follows
its own clock.
 Multiple central units (computers/nodes/servers): More than one central unit
can listen for connections from other nodes
 Dependent failure of components: Failure of one central node causes a part of the system to
fail, not the whole system

Scaling
Vertical scaling is possible. Each node can add resources (hardware, software) to itself to increase
its performance, which in turn increases the performance of the entire system.

Components
Components of Decentralized System are,
 Node (Computer, Mobile, etc.)
 Communication link (Cables, Wi-Fi, etc.)

Architecture of Decentralized System


 peer-to-peer architecture – all nodes are peers of each other. No one node has supremacy
over other nodes
 master-slave architecture – One node can become a master by voting and help coordinate
a part of the system, but this does not mean the node has supremacy over
the nodes it is coordinating

Limitations of Decentralized System


 May lead to problems of coordination at the enterprise level – When every node is the owner of
its own behavior, it is difficult to achieve collective tasks
 Not suitable for small systems – It is not beneficial to build and operate small decentralized
systems because of the low benefit/cost ratio
 No way to regulate a node on the system – no superior node overseeing the behavior of
subordinate nodes

Advantages of Decentralized System


 Minimal problem of performance bottlenecks occurring – The entire load gets balanced on
all the nodes; leading to minimal to no bottleneck situations
 High availability – Some nodes(computers, mobiles, servers) are always available/online
for work, leading to high availability
 More autonomy and control over resources – As each node controls its own behavior, it has
better autonomy leading to more control over resources

Disadvantages of Decentralized System


 Difficult to achieve global big tasks – No chain of command to command others to perform
certain tasks
 No regulatory oversight
 Difficult to know which node failed – Each node must be pinged to check availability,
and work has to be partitioned to find out which node failed by comparing
the expected output with what each node generated
 Difficult to know which node responded – When a request is served by a decentralised
system, it is served by one of the nodes in the system, but it is
difficult to find out which node served it.

Applications of Decentralized System


 Private networks – peer nodes joined with each other to make a private network.
 Cryptocurrency – Nodes join to become part of a system in which digital currency is
exchanged without revealing the identity and location of who sent what to whom. In Bitcoin,
the public addresses and the amounts transferred are visible, but users can change the public
addresses they use, which makes the parties difficult to trace.

Use Cases
 Blockchain
 Decentralized databases – Entire database split in parts and distributed to different nodes
for storage and use. For example, records with names starting from ‘A’ to ‘K’ in one node,
‘L’ to ‘N’ in second node and ‘O’ to ‘Z’ in third node
 Cryptocurrency
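
The alphabetical split described in the decentralized-database use case can be sketched as a small routing function. This is only an illustration: the node names `node-1` to `node-3` and the function `node_for` are hypothetical, and real systems usually choose partitions that balance the load more evenly.

```python
# Ranges taken from the use case above: A-K, L-N, O-Z.
PARTITIONS = [
    ("A", "K", "node-1"),
    ("L", "N", "node-2"),
    ("O", "Z", "node-3"),
]


def node_for(name):
    """Return the node that stores the record for the given name."""
    first = name.strip()[:1].upper()
    for low, high, node in PARTITIONS:
        if low <= first <= high:
            return node
    raise ValueError(f"no partition covers {name!r}")


for person in ["Alice", "Mac", "Bob", "Zed"]:
    print(person, "->", node_for(person))
```

Every node can apply the same rule locally, so a request can be routed to the right part of the database without asking a central coordinator.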

Organisations Using
Bitcoin, Tor network

o A database that supports data distributed across several different sites is called a
distributed database

Figure 10. Distributed system visualisation

Example
Google search system. Each request is worked upon by hundreds of computers which crawl the
web and return the relevant results. To the user, Google appears to be one system, but it
is actually multiple computers working together to accomplish a single task (returning the
results of the search query).

Characteristics of Distributed System


 Concurrency of components: Nodes apply consensus protocols to agree on same
values/transactions/commands/logs.
 Lack of a global clock: All nodes maintain their own clock.
 Independent failure of components: In a distributed system, nodes fail independently
without having a significant effect on the entire system. If one node fails, the rest of the
system continues to work.

Scaling
Horizontal and vertical scaling is possible.

Components of Distributed System


Components of Distributed System are,
 Node (Computer, Mobile, etc.)
 Communication link (Cables, Wi-Fi, etc.)

Architecture of Distributed System


 peer-to-peer – all nodes are peers of each other and work towards a common goal
 client-server – some nodes become server nodes and take on roles such as coordinator or arbiter
 n-tier architecture – different parts of an application are distributed in different nodes of the
systems and these nodes work together to function as an application for the user/client

Limitations of Distributed System


 Difficult to design and debug algorithms for the system. These algorithms are difficult
because of the absence of a common clock; so no temporal ordering of commands/logs can
take place. Nodes can have different latencies which have to be kept in mind while
designing such algorithms. The complexity increases with increase in number of nodes.
 No common clock causes difficulty in the temporal ordering of events/transactions
 Difficult for a node to get the global view of the system and hence take informed decisions
based on the state of other nodes in the system
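
A small sketch of why the missing common clock matters. It assumes two hypothetical nodes whose local clocks disagree by a fixed amount (the skew values and event descriptions are invented for illustration) and shows that sorting a merged log by local timestamps can place an effect before its cause.

```python
# "True" times are known only to this simulation; real nodes cannot see them.
SKEW = {"node-A": 0.0, "node-B": -5.0}  # node-B's clock runs 5 seconds behind


def local_timestamp(node, true_time):
    """Timestamp the node would write in its own log."""
    return true_time + SKEW[node]


events = [
    # (true time, node, description)
    (100.0, "node-A", "Alice's account debited"),
    (102.0, "node-B", "Bob's account credited"),  # really happens 2 s later
]

log = [(local_timestamp(node, t), node, what) for t, node, what in events]
merged = sorted(log)  # order events by the timestamps the nodes recorded

for stamp, node, what in merged:
    print(stamp, node, what)
# The credit (stamped 97.0) is listed before the debit (stamped 100.0),
# even though the debit really happened first.
```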

Advantages of Distributed System


 Lower latency than a centralized system – Distributed systems have lower latency because of
their wide geographical spread: a request can be served by a nearby node, which reduces the time to get a response

Disadvantages of Distributed System


 Difficult to achieve consensus
 The conventional way of logging events by the absolute time at which they occur is not possible here

Applications of Distributed System


 Cluster computing – A technique in which many computers are coupled together to work
toward global goals. The computer cluster acts as if it were a single computer
 Grid computing – All the resources are pooled together for sharing, essentially turning
the systems into a powerful supercomputer.

Use Cases
 SOA-based systems
 Multiplayer online games

Organisations Using
Apple, Google, Facebook.

 how they will be used and the time sensitivity of the information gathered from them

o A database that is designed primarily to support a company’s day-to-day operations is
classified as an operational database (sometimes referred to as a transactional or
production database)

o A data warehouse focuses primarily on storing data used to generate information
required to make tactical or strategic decisions. Such decisions typically require
extensive “data massaging” (data manipulation) to extract information to formulate
pricing decisions, sales forecasts, market positioning, and so on

 the degree to which the data are structured

o Unstructured data are data that exist in their original (raw) state, that is, in the format
in which they were collected

o Structured data are the result of taking unstructured data and formatting (structuring)
such data to facilitate storage, use, and the generation of information.

o Semistructured data are data that have already been processed to some extent. For
example, if you look at a typical Web page, the data are presented to you in a
prearranged format to convey some information.

Extensible Markup Language (XML) is a special language used to represent and
manipulate data elements in a textual format. An XML database supports the storage and
management of semistructured XML data.
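
To make the idea of semistructured XML data concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element names (`customers`, `customer`, `name`, `phone`) and the values are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A small XML document: the tags give the data some structure, but each
# record may carry different elements (the second customer has no phone).
xml_text = """
<customers>
  <customer id="1">
    <name>Alice</name>
    <phone>555-0100</phone>
  </customer>
  <customer id="2">
    <name>Bob</name>
  </customer>
</customers>
"""

root = ET.fromstring(xml_text)
rows = []
for customer in root.findall("customer"):
    rows.append(
        {
            "id": customer.get("id"),
            "name": customer.findtext("name"),
            "phone": customer.findtext("phone"),  # None when the element is absent
        }
    )

print(rows)
```

Turning the parsed elements into uniform rows, as the loop does, is one way of structuring semistructured data for storage.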

Database design refers to the activities that focus on the design of the database structure that will be
used to store and manage end-user data. A database that meets all user requirements does not just
happen; its structure must be designed carefully.

Manual File Systems


Early file systems were often manual, paper-and-pencil systems. The papers within these systems were
organized in order to facilitate the expected use of the data. Typically, this was accomplished
through a system of file folders and filing cabinets. As long as a data collection was relatively small
and an organization’s business users had few reporting requirements, the manual system served its
role well as a data repository.

Computerized File Systems


A data processing (DP) specialist was hired to create a computer-based system that would track data
and produce required reports. Initially, the computer files within the file system were similar to the
manual files. When business users wanted data from the computerized file, they sent requests for the
data to the DP specialist. For each request, the DP specialist had to create programs to retrieve the
data from the file, manipulate it in whatever manner the user had requested, and present it as a
printed report.
Basic File Terminology
Data – “Raw” facts, such as a telephone number, a birth date, a customer name, and a
year-to-date (YTD) sales value. Data have little meaning unless they have been organized in some
logical manner.
Field – A character or group of characters (alphabetic or numeric) that has a specific meaning. A
field is used to define and store data.
Record – A logically connected set of one or more fields that describes a person, place, or thing. For
example, the fields that constitute a record for a customer might consist of the customer’s name,
address, phone number, date of birth, credit limit, and unpaid balance.
File – A collection of related records. For example, a file might contain data about the students
currently enrolled at Gigantic University.
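
The terms above map naturally onto simple data structures. The sketch below is one possible modelling in Python, not a prescribed layout; the `Customer` class and its sample values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:  # a record type: a logically connected set of fields
    name: str  # each attribute is a field
    address: str
    phone: str
    date_of_birth: str
    credit_limit: float
    unpaid_balance: float


# A "file" is a collection of related records.
customer_file = [
    Customer("Alice", "1 Main St", "555-0100", "1990-01-15", 1000.0, 150.0),
    Customer("Bob", "2 Side St", "555-0101", "1988-07-02", 500.0, 50.0),
]

print([f.name for f in fields(Customer)])  # the field names of a record
print(len(customer_file), "records in the file")
print(customer_file[0].phone)  # a single piece of data (a raw fact)
```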

PROBLEMS WITH FILE SYSTEM DATA PROCESSING


 Lengthy development times. The first and most glaring problem with the file system
approach is that even the simplest data-retrieval task requires extensive programming. With
the older file systems, programmers had to specify what must be done and how it was to be
done.

 Difficulty of getting quick answers. The need to write programs to produce even the
simplest reports makes ad hoc queries impossible.

 Complex system administration. System administration becomes more difficult as the
number of files in the system expands.

 Lack of security and limited data sharing. Another fault of a file system data repository is
a lack of security and limited data sharing. Data sharing and security are closely related.
Sharing data among multiple geographically dispersed users introduces a lot of security risks.

 Extensive programming. Making changes to an existing file structure can be difficult in a
file system environment.
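
The first two problems in the list above can be illustrated with a hedged sketch. In a file system approach, even a simple question such as "which customers owe more than $100?" needs a purpose-built program that spells out how to read and filter the file; with a DBMS the same question can be asked as a short query. The file layout, the table name `customer`, and the sample rows are assumptions made for this example, and Python's built-in `csv` and `sqlite3` modules stand in for the file system and the DBMS.

```python
import csv
import io
import sqlite3

# Stand-in for a data file on disk.
FILE_CONTENT = "name,unpaid_balance\nAlice,150\nBob,50\nMac,300\n"


# File system approach: the program states WHAT is wanted and HOW to get it.
def customers_owing_more_than(file_obj, limit):
    result = []
    for row in csv.DictReader(file_obj):  # read record by record
        if float(row["unpaid_balance"]) > limit:  # filter by hand
            result.append(row["name"])
    return result


from_file = customers_owing_more_than(io.StringIO(FILE_CONTENT), 100)

# DBMS approach: load the same data once, then ask ad hoc questions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, unpaid_balance REAL)")
rows = [(r["name"], float(r["unpaid_balance"]))
        for r in csv.DictReader(io.StringIO(FILE_CONTENT))]
conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
from_db = [name for (name,) in conn.execute(
    "SELECT name FROM customer WHERE unpaid_balance > 100 ORDER BY rowid")]

print(from_file, from_db)
```

A different question (for example, customers owing less than $100) needs a new or modified program in the first approach, but only a different query string in the second.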

ACID Properties in DBMS with Examples


Before going through the properties, consider a real-life example, which makes them easier to
understand. Suppose Alice has an account with $150 and Bob has an account with $50. We are
transferring $100 from Alice’s account to Bob’s account. Let us see how the ACID properties of a
DBMS ensure data reliability in this transfer.

1. Atomicity
It simply says “all or nothing”. There is no intermediate state.
When you perform a database transaction (a set of read/write operations), either all of the
operations are executed or none of them are.
All the operations in the transaction are considered to be one unit, or atomic task.
If the system fails or any read/write conflict occurs during the transaction, the system needs to
revert to its previous state.

Example:
Here, the set of operations is:

Deduct $100 from Alice’s account.
Add $100 to Bob’s account.

All operations in this set should be done.

If the system fails to add the amount to Bob’s account after deducting it from Alice’s account,
the deduction from Alice’s account must be reverted.
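
The transfer can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under stated assumptions, not the only way to implement atomicity: the table name `account`, the `transfer` function, and the `fail_midway` flag (which simulates a system failure between the two operations) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Alice", 150), ("Bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount, fail_midway=False):
    """Move money as one atomic unit: both updates happen, or neither does."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE owner = ?",
                     (amount, sender))
        if fail_midway:  # simulated failure between the two operations
            raise RuntimeError("system failure after the deduction")
        conn.execute("UPDATE account SET balance = balance + ? WHERE owner = ?",
                     (amount, receiver))
        conn.commit()  # make both changes permanent together
    except Exception:
        conn.rollback()  # undo the deduction as well
        raise


def balances(conn):
    return dict(conn.execute("SELECT owner, balance FROM account"))


try:
    transfer(conn, "Alice", "Bob", 100, fail_midway=True)
except RuntimeError:
    pass
print(balances(conn))  # unchanged: the failed transfer left no partial update

transfer(conn, "Alice", "Bob", 100)
print(balances(conn))  # both updates applied
```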

2. Consistency
Every attribute in the database has rules (constraints) that keep the database in a valid state. The
constraints placed on data values must hold both before and after the execution of the transaction.
If the system fails because of invalid data while performing an operation, it must revert to its
previous state.

Example:
The total amount in Alice’s and Bob’s accounts should be the same before and after the
transaction. Before the transfer the sum is $150 + $50 = $200; after it, the sum is
$50 + $150 = $200. So this transaction preserves consistency.
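
As a hedged sketch of consistency, the example below adds one rule to the accounts, namely that a balance may never be negative, and checks that the total stays at $200. The `CHECK` rule, the table name `account`, and the `transfer` function are assumptions made for illustration; Python's built-in `sqlite3` module is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # rule: no negative balance
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("Alice", 150), ("Bob", 50)])
conn.commit()


def total(conn):
    return conn.execute("SELECT SUM(balance) FROM account").fetchone()[0]


def transfer(conn, sender, receiver, amount):
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE owner = ?",
                     (amount, receiver))
        conn.execute("UPDATE account SET balance = balance - ? WHERE owner = ?",
                     (amount, sender))  # may violate the CHECK constraint
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # invalid data: return to the previous valid state
        return False


before = total(conn)
ok = transfer(conn, "Alice", "Bob", 100)        # valid transfer
rejected = transfer(conn, "Alice", "Bob", 500)  # would overdraw Alice's account
after = total(conn)
print(ok, rejected, before, after)
```

The second transfer credits Bob first and then fails on Alice's side, so the rollback is what keeps the total unchanged.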

3. Isolation
If you are performing multiple transactions on a single database, the operations of one transaction
should not interfere with the operations of other transactions. Each transaction should execute in
isolation from the others.

Example:
If there is any other transaction in progress (say, between Mac and Alice), it should not have any
effect on the transaction between Alice and Bob. The two transactions should be isolated from each other.

4. Durability
The first three properties must be satisfied while the transaction is in progress. Durability, by
contrast, concerns what happens after the transaction has completed.
The changes made during the transaction should persist after the transaction completes.
Sometimes all the operations in a transaction complete but the system fails immediately
afterwards. In that case, the changes made by the transaction should still persist: when the system
recovers, it should return to the state that includes those committed changes.

Example:
Suppose the system crashes after all the operations have completed. When the system restarts,
it should preserve the committed state: the amounts in Alice’s and Bob’s accounts should be the
same before and after the restart.
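
Durability can be sketched by committing the transfer to a database file, closing the connection, and opening the file again. Closing and reopening is only a stand-in for a real crash and restart, and the file name `bank.db` and table name `account` are assumptions made for this example, which uses Python's built-in `sqlite3` and `tempfile` modules.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "bank.db")  # data lives on disk, not in memory

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("Alice", 150), ("Bob", 50)])
    conn.execute("UPDATE account SET balance = balance - 100 WHERE owner = 'Alice'")
    conn.execute("UPDATE account SET balance = balance + 100 WHERE owner = 'Bob'")
    conn.commit()  # the transfer is complete from this point on
    conn.close()   # stand-in for the system stopping after the commit

    restarted = sqlite3.connect(path)  # stand-in for the system restarting
    after_restart = dict(restarted.execute("SELECT owner, balance FROM account"))
    restarted.close()

print(after_restart)
```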

ACID properties in DBMS make the transaction over the database more reliable and secure. This is one
of the advantages of the database management system over the file system.

 Formative Assessment Task


1. The instructor will ask the students how they interpret the function of a database system
(discussion and recitation)
2. The instructor will ask the students how they interpret the function of a file system
(discussion and recitation)
3. The instructor will ask the students to write a 100-word essay on how they see the importance
of ACID properties in business transactions.

*** End of Lesson Input ***



 Summative Assessment Task

1. From the discussed types of database systems, create your own database system, including a
sample diagram of the nodes and server/s. Explain how it will work and its advantages and
disadvantages. Also, explain the use cases of your design.

2. Create 3 sample scenarios using ACID properties of a Database Management System.



References

Garcia-Molina, H., Ullman, J., & Widom, J. (2008). Database Systems: The Complete Book (2nd ed.,
pp. 2-14). Pearson.
