A Brief History of Database Management
Databases (or DBs) have played a very important part in the recent evolution of computers. The first
computer programs were developed in the early 1950s and focused almost completely on coding
languages and algorithms. At the time, computers were basically giant calculators and data (names,
phone numbers) was considered the leftovers of processing information. Computers were just starting
to become commercially available, and when business people started using them for real-world
purposes, this leftover data suddenly became important.
By the mid-1960s, as computers developed speed and flexibility and started becoming popular, many
kinds of general-use database systems became available. As a result, customers demanded that a
standard be developed. Charles Bachman, whose Integrated Data Store was one of the earliest
database systems, worked with the Database Task Group, a committee of CODASYL (the Conference
on Data Systems Languages, the body responsible for standardizing COBOL). The Database Task
Group presented its database standard in 1971, which came to be known as the “CODASYL approach.”
The CODASYL approach was a very complicated system and required substantial training. It
depended on a “manual” navigation technique using a linked data set, which formed a large network.
Searching for records could be accomplished by one of three techniques:
Using the primary key (also known as the CALC key)
Navigating relationships (called “sets”) from one record to another
Scanning all the records in sequential order
Eventually, the CODASYL approach lost its popularity as simpler, easier-to-work-with systems came
on the market.
Edgar Codd worked for IBM in the development of hard disk systems, and he was not happy with
the navigational structure of the CODASYL approach and the IMS model, particularly their lack of a
search facility. In 1970, he wrote a series of papers outlining novel ways to construct databases,
culminating in a paper titled A Relational Model of Data for Large Shared Data Banks, which
described a new method for storing data and processing large databases. Records would not be
stored in a free-form list of linked records, as in the CODASYL navigational model, but instead in a
“table with fixed-length records.”
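To illustrate Codd’s idea, here is a minimal sketch using Python’s built-in sqlite3 module (a modern relational engine used purely as a stand-in; the table and names are invented for the example). Data lives in a table with a fixed schema and is retrieved declaratively, rather than by navigating links between records:

import sqlite3

# In-memory relational database: data lives in a table with a fixed
# schema, not in a network of linked records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York"), (3, "Edgar", "Portland")],
)

# Retrieval is declarative: we state what we want, and the engine
# decides how to find it. No manual navigation is required.
for row in conn.execute("SELECT name FROM customers WHERE city = ?", ("New York",)):
    print(row)  # ('Grace',)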
IBM had invested heavily in the IMS model and wasn’t terribly interested in Codd’s ideas. Fortunately,
some people who didn’t work for IBM were interested. In 1973, Michael Stonebraker and Eugene
Wong (both then at UC Berkeley) decided to research relational database systems. Their project,
called INGRES (Interactive Graphics and Retrieval System), successfully demonstrated that a
relational model could be efficient and practical. INGRES used a query language known as QUEL,
in turn pressuring IBM to develop SQL in 1974, which was more advanced (SQL became an ANSI
standard in 1986 and an ISO standard in 1987). SQL quickly replaced QUEL as the more functional
query language.
Relational database management systems were an efficient way to store and process structured
data. Then processing speeds got faster, and “unstructured” data (art, photographs, music, etc.)
became much more commonplace. Unstructured data is both non-relational and schema-less, and
relational database management systems simply were not designed to handle this kind of data.
MySQL
In 1995, David Axmark, Allan Larsson, and Michael Widenius launched MySQL, an open-source
relational database management system (RDBMS) based on Structured Query Language (SQL).
According to Stack Overflow’s 2020 Developer Survey, MySQL was used by 55.6% of the
respondents, making it the most popular database in the world. MySQL has since evolved into an
extremely scalable database system with the ability to operate on multiple platforms.
Column Stores
A column-oriented DBMS is quite different from a traditional relational database system: it stores
data as sections of columns instead of as rows. The change in focus, from row to column, lets
column-oriented databases maximize their performance when large amounts of data must be read
from a single column, a strength that extends to data warehouses and CRM applications. Examples
of column-oriented databases include Cassandra and HBase (which is Hadoop-based).
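A minimal sketch of the idea in Python (the records and field names are invented for illustration): the same data laid out row-wise and column-wise, where the column layout lets an aggregate over one field read a single contiguous list:

# The same three records, stored two ways.
rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Grace", "age": 45},
    {"id": 3, "name": "Edgar", "age": 46},
]

# Column-oriented layout: one contiguous list per column.
columns = {
    "id": [1, 2, 3],
    "name": ["Ada", "Grace", "Edgar"],
    "age": [36, 45, 46],
}

# An aggregate over a single field must touch every record in the row layout...
avg_from_rows = sum(r["age"] for r in rows) / len(rows)
# ...but reads only one contiguous list in the column layout.
avg_from_columns = sum(columns["age"]) / len(columns["age"])
assert avg_from_rows == avg_from_columns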
Key-Value Stores
Key-value stores are based on Ken Thompson’s DBM (database manager) research from 1979. A
key-value database is useful for storing shopping cart data or user profiles. All access to the
database is done through a primary key. Typically, there is no fixed schema or data model, and the
value paired with a key can be an arbitrary blob of data. Key-value stores are not useful when there
are complex relationships between data elements or when data needs to be queried by anything
other than the primary key. Examples of key-value stores include Riak, Berkeley DB, and Aerospike.
An element can be any single “named” unit of stored data that might, or might not, contain other data
components.
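The access pattern can be sketched in Python using a plain dict as a stand-in for a key-value store (the keys and values are invented; a real store such as Riak or Berkeley DB adds persistence and distribution):

# A key-value store: opaque values, retrieved only by primary key.
store = {}

# "put": the value is an arbitrary blob; the store never inspects it.
store["cart:alice"] = {"items": ["book", "lamp"], "total": 41.50}
store["profile:alice"] = {"name": "Alice", "joined": "2021-03-14"}

# "get": the only supported query is a lookup by exact key.
cart = store.get("cart:alice")
print(cart["items"])  # ['book', 'lamp']

# There is no way to ask "which carts contain a lamp?" without scanning
# every key -- exactly the limitation described above.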
NoSQL
NoSQL (“Not only” Structured Query Language) came about as a response to the Internet and the
need for faster speed and for the processing of unstructured data. Generally speaking, NoSQL
databases are preferable to relational databases in certain use cases because of their speed and
flexibility. The NoSQL model is non-relational and uses a “distributed” database system. This non-
relational system is fast, uses an ad hoc method of organizing data, and processes high volumes of
different kinds of data.
“Not only” does a NoSQL database handle structured data, it can also process unstructured big data
very quickly. The widespread use of NoSQL can be connected to the services offered by Twitter,
LinkedIn, Facebook, and Google, each of which stores and processes colossal amounts of
unstructured data. These are the advantages NoSQL has over SQL and relational database
management systems:
Higher scalability
A distributed computing system
Lower costs
A flexible schema
The ability to process unstructured and semi-structured data
No need for complex relationships between data elements
Unfortunately, NoSQL does come with some problems. Some NoSQL databases can be quite
resource-intensive, demanding high RAM and CPU allocations. It can also be difficult to find tech
support if your open-source NoSQL system goes down.
NoSQL Data Distribution
Hardware can fail, but NoSQL databases are designed with a distributed architecture that includes
redundant backup storage of both data and function. They do this by using multiple nodes (database
servers). If one or more of the nodes goes down, the other nodes can continue with normal
operations and suffer no data loss. When used correctly, NoSQL databases can provide high
performance at an extremely large scale, and never shut down. In general, there are four kinds of
NoSQL databases (key-value, column-oriented, document-oriented, and graph), each with its own
qualities and characteristics.
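A toy sketch of this replication idea in Python (the node names and replication factor are invented, and real systems layer consistency protocols on top): every write goes to several nodes, so a read still succeeds after one node fails:

import hashlib

MEMBERS = ["node-a", "node-b", "node-c"]   # fixed cluster membership
DATA = {name: {} for name in MEMBERS}      # each node's local storage
UP = {name: True for name in MEMBERS}      # node health flags
REPLICAS = 2                               # copies kept of every key

def nodes_for(key):
    # Deterministically choose REPLICAS member nodes for this key.
    start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(MEMBERS)
    return [MEMBERS[(start + i) % len(MEMBERS)] for i in range(REPLICAS)]

def put(key, value):
    # Write the value to every replica node.
    for name in nodes_for(key):
        DATA[name][key] = value

def get(key):
    # Ask each replica in turn, tolerating a down node.
    for name in nodes_for(key):
        if UP[name] and key in DATA[name]:
            return DATA[name][key]
    return None

put("user:42", {"name": "Ada"})
UP[nodes_for("user:42")[0]] = False   # simulate one replica failing
print(get("user:42"))                 # {'name': 'Ada'}, from the surviving replica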
Document-Oriented Databases
Document-oriented databases store all the information for a given “object” within the database, and
each stored object can be quite different from the others. This makes mapping objects to the
database easier and makes document storage very attractive for web programming applications. (An
“object” is a set of relationships: an article object could be related to a tag [an object], a category
[another object], or a comment [another object].)
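As a sketch, here is such a document modeled in Python as JSON (the fields are invented; document stores such as MongoDB work along these lines): everything about one article lives in a single self-describing document, and two documents need not share a schema:

import json

# One self-describing document holds everything about one article.
article = {
    "title": "A Brief History of Database Management",
    "tags": ["databases", "history"],
    "category": "Data Education",
    "comments": [{"author": "reader1", "text": "Great overview!"}],
}

# A second document in the same collection may have a completely
# different shape; there is no shared schema.
note = {"title": "Draft", "body": "to do"}

print(json.dumps(article, indent=2))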
Graph Databases
Graph databases differ from relational databases, and from other NoSQL databases, by storing data
relationships as actual relationships. This type of storage for relationship data results in fewer
disconnects between an evolving schema and the actual database. A graph database holds
interconnected elements with an undetermined number of relationships between them. Examples of
graph databases include Neo4j, GraphBase, and Titan.
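A minimal sketch in Python of relationships stored as first-class, typed edges (the node and edge names are invented): answering “who is connected to whom” becomes a direct traversal rather than a join:

# Nodes and their outgoing edges, stored as explicit, typed relationships.
edges = {
    "article:1": [("TAGGED", "tag:databases"), ("IN_CATEGORY", "cat:history")],
    "user:ada": [("WROTE", "article:1")],
}

def neighbors(node, relation):
    # Follow only edges of the requested relationship type.
    return [dest for rel, dest in edges.get(node, []) if rel == relation]

print(neighbors("user:ada", "WROTE"))    # ['article:1']
print(neighbors("article:1", "TAGGED"))  # ['tag:databases']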
Polyglot Persistence
Polyglot persistence is a spin-off of “polyglot programming,” a concept developed in 2006 by Neal
Ford. The original idea promoted writing applications in a mix of languages, with the understanding
that a specific language may solve a certain kind of problem easily, while another language would
have difficulties. Different languages are suitable for tackling different problems.
Many NoSQL systems run on nodes and large clusters. This allows for significant scalability and
redundant backups of data on each node. Using different technologies at each node supports a
philosophy of polyglot persistence: “storing” data across multiple technologies with the
understanding that certain technologies will solve one kind of problem easily, while others will not.
An application communicating with different database management technologies uses each for the
best fit in achieving the end goal.
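A sketch of this routing idea in Python (the store classes are invented stand-ins, not real client libraries): one application sends each kind of data to the technology that handles it best:

class RelationalStore:
    # Stand-in for a relational engine such as MySQL: structured, transactional data.
    def save_order(self, order):
        print("SQL INSERT:", order)

class KeyValueStore:
    # Stand-in for a key-value store such as Riak: fast lookups by key.
    def save_session(self, session_id, data):
        print("KV PUT:", session_id, data)

class GraphStore:
    # Stand-in for a graph database such as Neo4j: richly connected data.
    def save_friendship(self, a, b):
        print("EDGE:", a, "<->", b)

class App:
    """One application, several storage technologies: polyglot persistence."""
    def __init__(self):
        self.orders = RelationalStore()
        self.sessions = KeyValueStore()
        self.social = GraphStore()

app = App()
app.orders.save_order({"id": 7, "total": 41.50})
app.sessions.save_session("sess-123", {"user": "ada"})
app.social.save_friendship("ada", "grace")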
Intelligent Databases
An intelligent database manages not simply data, but information: it presents the information in a
natural way that is useful and informative. The concept was introduced in 1989 in the book
Intelligent Databases by Mark Chignell, Setrag Khoshafian, Kamran Parsaye, and Harry Wong. They
described three levels of intelligence:
High-level tools
The user interface
The database engine
The high-level tools are used to manage Data Quality while automatically finding relevant patterns
within the data (this is, in fact, data mining). This layer often uses artificial intelligence techniques.
The use of intelligent databases is widespread: hospital databases bring up patient histories (charts,
x-ray images, and text) with just a few mouse clicks, and corporate intelligent databases include
decision-making support tools (often using sales pattern analysis).