Module1-Topic1-Data Base Revolutions
• Once computers became available, they brought the opportunity for
better data management.
• Early “databases” used paper tape initially and eventually magnetic tape to
store data sequentially.
• 1955: spinning magnetic disk. Data on a magnetic disk can be modified or
deleted easily, and the disk allows random access to data, i.e., to
individual records.
• 1961: ISAM (Indexed Sequential Access Method) made fast record-oriented access
feasible and consequently led to OLTP (On-line Transaction Processing)
computer systems.
Dr. Karthika Natarajan 5/23/2022 6
ISAM
• ISAM is an advanced sequential file organization method. The records are sorted by
primary key.
• For each primary key, an index value is generated and mapped to the record. This index is
simply the address of the record in the file.
• To retrieve a record by its index value, the address of the data block is
fetched, and the record is retrieved from that block.
Pros of ISAM:
•Because each record is indexed by the address of its data block, searching for a record in a
huge database is quick and easy.
•This method supports range retrieval and partial retrieval of records. Since the index is
based on the primary key values, we can retrieve the data for a given range of values.
In the same way, a partial value can be searched easily, e.g., student names
starting with 'JA'.
Cons of ISAM:
•This method requires extra disk space to store the index values.
•When new records are inserted, the files must be reconstructed to
maintain the sequence.
•When a record is deleted, the space it used must be released. Otherwise,
the performance of the database will degrade.
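The ISAM behaviour described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real on-disk ISAM implementation: the class name `IsamFile`, its methods, and the student data are all hypothetical, and the list position stands in for the record's address in the file. Student names are assumed to be the primary key so that the 'JA' prefix search from the slide can be shown.

```python
import bisect


class IsamFile:
    """Toy ISAM-style file: records kept sorted by primary key, plus a key index."""

    def __init__(self, records):
        # records: iterable of (primary_key, payload) pairs (hypothetical data).
        self._records = sorted(records, key=lambda r: r[0])
        # The index: the i-th key "points at" the i-th record (its address here).
        self._keys = [key for key, _ in self._records]

    def get(self, key):
        """Exact lookup: binary search in the index, then fetch by address."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._records[i][1]
        return None

    def range(self, low, high):
        """Range retrieval: all records with low <= key <= high."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_right(self._keys, high)
        return self._records[lo:hi]

    def prefix(self, prefix):
        """Partial retrieval: all records whose key starts with prefix."""
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key, payload in self._records[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later key can match
            matches.append((key, payload))
        return matches

    def insert(self, key, payload):
        """Insert while keeping the sequence; later entries must shift."""
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._records.insert(i, (key, payload))


students = IsamFile([("JANE", 3.7), ("ADAM", 3.1), ("JACK", 3.4), ("MARY", 3.9)])
print(students.get("JACK"))         # 3.4
print(students.range("B", "K"))     # [('JACK', 3.4), ('JANE', 3.7)]
print(students.prefix("JA"))        # [('JACK', 3.4), ('JANE', 3.7)]
```

The `insert` method hints at the cost listed under the cons: keeping the file in key order means every later entry has to move, which on disk corresponds to reorganizing the file.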
By the early 1970s, two major models of DBMS were competing for
dominance.
• The network model was formalized by the CODASYL (Conference
on Data Systems Languages) standard and implemented in
databases such as IDMS (Integrated Database Management
System).
• The hierarchical model provided a somewhat simpler approach found
in IBM’s IMS (Information Management System).
In the late 1960s, Edgar Codd, who was working at an IBM laboratory, identified the
following drawbacks in first-generation DBMS:
• Existing databases were too hard to use.
• Existing databases lacked a theoretical foundation.
• Existing databases mixed logical and physical implementations.
To overcome these, he published the core ideas that defined the relational
database model, which became the most significant model for database
systems for a generation.
New in Second generation
Key concepts of the relational model include:
1.Attribute: Each column in a table. Attributes are the properties that define a
relation, e.g., Student_Rollno, NAME, etc.
2.Table: In the relational model, relations are saved in table format. The table is
stored along with its entities. A table has two properties, rows and columns. Rows
represent records and columns represent attributes.
3.Tuple: A single row of a table, which contains a single record.
4.Degree: The total number of attributes in the relation is called the degree of
the relation.
5.Cardinality: The total number of rows present in the table.
6.Column: The set of values for a specific attribute.
7.Relation instance: A finite set of tuples in the RDBMS.
Relation instances never have duplicate tuples.
8.Relation key: One or more attributes that uniquely identify each row are called the relation
key.
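These terms can be made concrete with Python's built-in `sqlite3` module. The sketch below is only an illustration: the `student` table, the `branch` column, and the sample rows are assumptions made up for the example (only `Student_Rollno` and `NAME` come from the list above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table (relation) with three attributes; student_rollno is the relation key.
conn.execute(
    "CREATE TABLE student ("
    " student_rollno INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " branch TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Meena", "CSE")],
)

# Attributes: the column names of the table.
attributes = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

# Degree: the number of attributes.
degree = len(attributes)

# Cardinality: the number of rows (tuples).
cardinality = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Tuple: a single row, selected here through the relation key.
one_tuple = conn.execute(
    "SELECT * FROM student WHERE student_rollno = 2"
).fetchone()

# Column: the set of values for one attribute.
name_column = [
    row[0] for row in conn.execute("SELECT name FROM student ORDER BY student_rollno")
]

# The relation key prevents a second row with the same key value.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Duplicate', 'ME')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(attributes, degree, cardinality, one_tuple, name_column, duplicate_rejected)
```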
Key concepts in relational model
•Disadvantages:
•A few relational databases impose limits on field lengths that cannot be exceeded.
Jim Gray defined the most widely accepted transaction model in the late 1970s. This model
was soon popularized as ACID transactions:
• Atomic: The transaction is indivisible - either all the statements in the transaction are
applied to the database or none are.
• Consistent: The database remains in a consistent state before and after transaction
execution.
• Isolated: While multiple transactions can be executed by one or more users
simultaneously, one transaction should not see the effects of other in-progress
transactions.
• Durable: Once a transaction is committed to the database, its changes are expected to
persist even if there is a failure of the operating system or hardware.
• Consistency example: if the value read by B and C is $300, the data is inconsistent, because
the result will no longer be consistent once the debit operation executes.
• Isolation example: account A makes two transactions, T1 and T2, to accounts B and C, and both
execute independently without affecting each other. This is known as isolation.
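Atomicity and consistency can be illustrated with a small transfer between two accounts, again using Python's built-in `sqlite3` module. This is a sketch under assumptions: the `account` table, the starting balances, and the `transfer` and `balances` helpers are hypothetical. The `CHECK` constraint plays the role of the consistency rule, and the `with conn:` block commits on success and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 300), ("B", 100)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the credit is undone too.
        return False


def balances(conn):
    """Return the current balances as a dict, e.g. {'A': 300, 'B': 100}."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "A", "B", 200)
after_valid = balances(conn)      # {'A': 100, 'B': 300}

failed = transfer(conn, "A", "B", 500)
after_invalid = balances(conn)    # unchanged: {'A': 100, 'B': 300}

print(ok, after_valid, failed, after_invalid)
```

The credit is deliberately executed before the debit so that the failed transfer has a partial change to undo; the unchanged balances afterwards show that the whole transaction was rolled back.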
• In 1998, the term NoSQL (commonly expanded as "not only SQL") was coined.
• It refers to databases that use a query language other than SQL to store and
retrieve data.
• NoSQL databases are useful for unstructured data.
• NoSQL can allow faster processing of larger, more varied datasets.
• NoSQL databases are more flexible than the traditional relational databases.
• Even the most expensive commercial RDBMS such as Oracle could not
provide sufficient scalability to meet the demands of large web sites.
• Amazon added other services such as storage (S3, EBS), Virtual Private Cloud
(VPC), a MapReduce service (EMR), and so on.
• The entire platform was known as Amazon Web Services (AWS) and was the first
practical implementation of an Infrastructure as a Service (IaaS) cloud.
• AWS became the inspiration for cloud computing offerings from Google, Microsoft, and
others.
In 2007, Michael Stonebraker and his team proposed a number of variants on the
existing RDBMS design.
• H-Store described a pure in-memory distributed database.
• C-Store specified a design for a columnar database.
Both these designs were extremely influential in the years to come and are the first
examples of what came to be known as NewSQL database systems.
NewSQL databases retain key characteristics of the RDBMS but diverge from
the common architecture exhibited by traditional systems such as Oracle and SQL
Server.
However, in late 2009, the term NoSQL quickly caught on as shorthand for any
database system that broke with the traditional SQL database.
• A relational database model may not be the best solution for all situations.
SQL | NoSQL
----|------
MySQL, Oracle, SQLite, Postgres and MS-SQL | MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase, Neo4j and CouchDB
Good fit for complex queries | Not a good fit for complex queries (NoSQL databases don't have standard interfaces)
Not the best fit for hierarchical data storage | Fits better for hierarchical data storage
Excellent support is available for all SQL databases | Still have to rely on community support
Emphasizes ACID properties (Atomicity, Consistency, Isolation and Durability) | Follows Brewer's CAP theorem (Consistency, Availability and Partition tolerance)
Classified as either open-source or closed-source | Classified by the way data is stored: graph databases, key-value store databases, document store databases, column store databases and XML databases
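The row on hierarchical data can be illustrated by storing the same nested record both ways. The sketch below is an assumption-laden illustration: the order data and the table names are hypothetical, a JSON string stands in for a document store, and `sqlite3` stands in for a relational database.

```python
import json
import sqlite3

# Hypothetical hierarchical record: one order with nested line items.
order_doc = {
    "order_id": 1,
    "customer": "Asha",
    "items": [
        {"sku": "pen", "qty": 2},
        {"sku": "book", "qty": 1},
    ],
}

# Document style: the whole hierarchy is stored and reloaded as one value.
stored = json.dumps(order_doc)
reloaded = json.loads(stored)

# Relational style: the hierarchy is split into a parent and a child table
# and has to be reassembled with a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute(
    "CREATE TABLE order_items ("
    " order_id INTEGER REFERENCES orders(order_id),"
    " sku TEXT,"
    " qty INTEGER)"
)
conn.execute(
    "INSERT INTO orders VALUES (?, ?)",
    (order_doc["order_id"], order_doc["customer"]),
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order_doc["order_id"], item["sku"], item["qty"]) for item in order_doc["items"]],
)
rows = conn.execute(
    "SELECT o.customer, i.sku, i.qty"
    " FROM orders o JOIN order_items i ON i.order_id = o.order_id"
    " ORDER BY i.rowid"
).fetchall()

print(reloaded["items"][0]["sku"])  # pen
print(rows)                         # [('Asha', 'pen', 2), ('Asha', 'book', 1)]
```

The relational version needs two tables and a join to rebuild the order, while the document version keeps the nesting intact; on the other hand, the SQL side can answer ad hoc questions across all orders with a standard query language.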