database (DB)

What is a database (DB)?

A database is an organized collection of information that's set up for easy access, management and updating. Computer databases typically store aggregations of data records or files that contain information such as sales transactions, customer data, financials and product information.

Databases are used for storing, maintaining and accessing any sort of data. They collect information on people, places or things. This information is gathered in one place so it can be observed and analyzed.

Databases are essential for storing large amounts of data in one place. With databases, organizations can quickly access, manage, modify, update, organize and retrieve their data.

Databases are normally controlled using a database management system (DBMS). In many databases, data is organized into tables consisting of rows and columns. Many databases also use Structured Query Language (SQL) for writing and querying data. There are different kinds of databases, however, so the exact language used and how it works depend on the type of database.
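
As a minimal sketch of tables, rows, columns and SQL, the following Python example uses the standard library's sqlite3 module with an in-memory database. The customers table, its columns and the sample rows are made up for illustration and don't come from any particular system.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Each row is one record; each column is one attribute of that record.
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)
conn.commit()

# Query the data with SQL, passing the filter value as a parameter.
london_customers = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
    )
]
print(london_customers)
```

Other database types expose different query interfaces, so this pattern applies mainly to SQL-based systems.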

What are databases used for?

Business, government and scientific fields use databases for data storage, analysis and management. Organizations use the data stored in databases to make informed business decisions. Some of the ways organizations use databases include the following:

  • Improve business processes. Companies collect data about business processes, such as sales, order processing and customer service. They analyze that data to improve these processes, expand their business and grow revenue.
  • Keep track of customers. Databases often store information about people, such as customers or users. For example, social media platforms use databases to store user information, such as names, email addresses and user behavior. That data is used to recommend content to users and improve the user experience.
  • Secure personal health information. Healthcare providers use databases to securely store personal health data to inform and improve patient care.
  • Store personal data. Databases can also be used to store personal information. For example, personal cloud storage is available for individual users to store media, such as photos, in a managed cloud.

Types of databases

There are many types of databases. They can be classified according to content type, such as bibliographic, full text, numeric and images. In computing, databases are often classified by the organizational approach they use.

Some of the main types, classified by organizational approach, include the following.

Relational

This tabular approach defines data so it can be reorganized and accessed in several ways. Relational databases are composed of tables, and data is placed into predefined categories in those tables. Each table has at least one column representing a data category, and rows that each hold one data instance for the categories defined in the columns. For example, information about a specific customer is stored as a row in a customer table, with one column for each attribute. Tables are indexed to make them easier to search using SQL queries.

Relational databases use SQL in their user and application programming interfaces (APIs). A new data category can easily be added to a relational database without having to change the existing applications. A relational database management system (RDBMS) is used to store, manage, query and retrieve data in a relational database.
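
The claim about adding a data category can be illustrated with a small sketch, again using Python's sqlite3 module. The products table and the list_product_names function are hypothetical stand-ins for an existing application. The sketch assumes that application names its columns explicitly; code that relies on SELECT * or on column positions could still be affected by a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products (name, price) VALUES ('Widget', 9.99)")

def list_product_names(connection):
    """Stand-in for an existing application query that names its columns."""
    return [row[0] for row in connection.execute("SELECT name FROM products ORDER BY id")]

names_before = list_product_names(conn)

# Add a new data category (column). Existing rows receive the default value.
conn.execute("ALTER TABLE products ADD COLUMN category TEXT DEFAULT 'uncategorized'")

# The existing query runs unchanged after the schema change.
names_after = list_product_names(conn)
categories = [row[0] for row in conn.execute("SELECT category FROM products")]
```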

Typically, the RDBMS gives users the ability to control read/write access, specify report generation and analyze use. Many relational databases offer ACID compliance (atomicity, consistency, isolation and durability) to guarantee that data is consistent and transactions are complete.
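
The atomicity part of ACID can be sketched as follows, assuming SQLite through Python's sqlite3 module. The accounts table, the transfer function and the balances are invented for illustration. The point is that a transaction with two updates either applies both or neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(connection, sender, receiver, amount):
    """Move funds in one transaction; both updates apply or neither does."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # CHECK fails, so the credit is undone too
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer, the credit to the receiver is executed first and then rolled back when the debit violates the CHECK constraint, so no partial transfer is left behind.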

Distributed

This database stores records or files in several physical locations. Data processing is also spread out and replicated across different parts of the network. Distributed databases can be homogeneous, where all physical locations have the same underlying hardware and run the same operating systems (OSes) and database applications. They can also be heterogeneous. In those cases, the hardware, OS and database applications can be different in the various locations.
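
The idea of spreading and replicating records across locations can be sketched with a toy in-process model. This is not how any specific product works: the ToyDistributedStore class, the site names and the hash-based placement rule are all assumptions made for illustration, and each "site" is just a Python dictionary.

```python
import hashlib

class ToyDistributedStore:
    """Spreads records over several 'sites' (plain dicts) and keeps replicas."""

    def __init__(self, site_names, replicas=2):
        self.site_names = list(site_names)
        self.sites = {name: {} for name in self.site_names}
        self.replicas = replicas

    def _sites_for(self, key):
        # Hash the key to pick a primary site, then use the next sites as replicas.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.site_names)
        return [
            self.site_names[(start + i) % len(self.site_names)]
            for i in range(self.replicas)
        ]

    def put(self, key, value):
        for name in self._sites_for(key):
            self.sites[name][key] = value

    def get(self, key, unavailable=()):
        # Read from the first replica that is not marked unavailable.
        for name in self._sites_for(key):
            if name not in unavailable:
                return self.sites[name].get(key)
        raise LookupError(f"no available replica for {key!r}")

store = ToyDistributedStore(["site-a", "site-b", "site-c"])
store.put("order:1", {"total": 42})

# Simulate the primary site being down; the read falls back to a replica.
primary = store._sites_for("order:1")[0]
value = store.get("order:1", unavailable={primary})
copies = sum("order:1" in site for site in store.sites.values())
```

Real distributed databases also have to handle consistency between replicas, network partitions and rebalancing, which this sketch leaves out.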

Cloud

Cloud databases are built in a public, private or hybrid cloud for a virtualized environment. Users are charged based on how much storage and bandwidth they use. They also get scalability on demand and high availability. These databases can work with applications deployed as software as a service. The as-a-service offering is typically called database as a service, or DBaaS.

NoSQL

NoSQL databases are well suited to large collections of distributed data. For some workloads, they can address big data performance issues better than relational databases. They also do well analyzing large unstructured data sets and data on virtual servers in the cloud. These databases are also called non-relational databases.
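
One common NoSQL style is the document store, where records don't have to share a fixed set of fields. The following is a minimal in-memory sketch of that idea in Python. The insert and find functions are hypothetical and don't correspond to the API of any real NoSQL product.

```python
import json

# A toy document collection: each record is a self-describing JSON document,
# and documents in the same collection need not share the same fields.
collection = {}

def insert(doc_id, document):
    collection[doc_id] = json.dumps(document)  # stored as serialized JSON

def find(**criteria):
    """Return documents whose fields match every key/value in criteria."""
    results = []
    for raw in collection.values():
        doc = json.loads(raw)
        if all(doc.get(key) == value for key, value in criteria.items()):
            results.append(doc)
    return results

insert("u1", {"name": "Ada", "email": "ada@example.com"})
insert("u2", {"name": "Grace", "tags": ["navy", "compilers"], "active": True})
insert("u3", {"name": "Alan", "active": True})

# Documents without an "active" field simply don't match.
active_names = sorted(doc["name"] for doc in find(active=True))
```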

Object-oriented

Object-oriented databases hold data created using object-oriented programming languages. They focus on organizing objects rather than actions and data rather than logic. For instance, an image data record would be a data object rather than an alphanumeric value.

Graph

These databases are a type of NoSQL database. They store, map and query relationships using concepts from graph theory. Graph databases are made up of nodes and edges. Nodes are entities, and edges are the relationships that connect the nodes. These databases are often used to analyze interconnections, such as data about customers as they interact with a business on web pages and social media.

Some graph databases, particularly those built on the Resource Description Framework (RDF), use the SPARQL declarative query language and protocol for analytics. Others use graph query languages such as Cypher or Gremlin. SPARQL supports many of the same kinds of analytics as SQL but can also be used for semantic analysis -- or the examination of relationships. This makes it useful for performing analytics on data sets that have both structured and unstructured data. With suitable mapping tools, SPARQL can also query information stored in a relational database. Graph analytics commonly cover friend-of-a-friend relationships, PageRank and shortest path.
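
To make nodes, edges, friend-of-a-friend queries and shortest path concrete, here is a small sketch in plain Python rather than a graph query language. The people and relationships are made-up sample data, and the functions are illustrative helpers, not part of any graph database API.

```python
from collections import deque

# Nodes are people; edges are undirected "knows" relationships (made-up data).
edges = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "eve")]

graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

def friends_of_friends(person):
    """People two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    return two_hops - direct - {person}

def shortest_path(start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

fof = friends_of_friends("ann")
path = shortest_path("ann", "dan")
```

A graph database runs this kind of traversal natively over stored nodes and edges, typically through a declarative query rather than hand-written loops.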

Figure: A comparison between graph and relational databases. Graph and relational databases are two types of databases that represent relationships between data but operate in different ways.

Multimodel

A multimodel database supports multiple data models, which define the parameters for how the information in a database is organized and arranged. Being multimodel enables IT teams to meet various application requirements without needing to deploy different database systems. For example, multimodel databases can support data models such as relational, hierarchical, object, graph and other NoSQL models.

Self-driving

A self-driving -- or autonomous -- database is a newer type of database that automates regular data management tasks, such as backups, updates, tuning and security. These databases are cloud-based and use machine learning processes in their automation. Self-driving databases require minimal human intervention to handle day-to-day operations. This reduces the time required for database administrators to manage a database.

Data warehouse

This is a repository of data from an organization's operational systems and other sources. Data warehouses are commonly designed for fast querying and analysis. Typically, a data warehouse is a relational database that's either on-premises in the data center or in the cloud.
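
The kind of analytical query a data warehouse is designed for can be sketched with an aggregate over a small fact table. This example uses SQLite through Python's sqlite3 module only for convenience; the sales table and its rows are invented, and a real warehouse would hold far more data and use a purpose-built engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "widget", 120),
        ("east", "gadget", 80),
        ("west", "widget", 200),
        ("west", "widget", 50),
    ],
)

# Summarize many detail rows into one total per region.
revenue_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(revenue_by_region)
```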

What are the components of a database?

While the different types of databases vary in schema, data structure and the data types best suited to them, they're all composed of the following five basic components:

  • Hardware. These are the physical devices that database software runs on. Database hardware includes computers, servers and hard drives.
  • Software. Database software or applications give users control of the database. DBMS software is used to manage and control databases.
  • Data. This is the raw information that the database stores. Database administrators organize the data to make it more meaningful.
  • Data access language. This is the language used to define, query and modify the data in the database. The language and the DBMS must work together. One of the most common database languages is SQL.
  • Procedures. These rules determine how the database works and how it handles the data.

What are database challenges?

Setting up, operating and maintaining a database presents common challenges, such as the following:

  • Data security is required because data is a valuable business asset. Protecting data stores requires skilled cybersecurity staff, which can be costly.
  • Data integrity ensures data is trustworthy. It isn't always easy to achieve because it means restricting database access to only those qualified to handle the data.
  • Database performance requires regular database updates and maintenance. Without the proper support, database functionality can decline as the technology supporting the database changes or as the data it contains changes.
  • Database integration can also be difficult. It can involve integrating data sources from varying types of databases and structures into a single database or into data lakes and data warehouses.
  • Scalability is difficult for on-premises databases. It's challenging to predict the capacity needed. Cloud-based databases don't have this issue to the same extent.
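
Alongside access restrictions, many SQL databases can help with the data integrity challenge above by rejecting invalid rows through declared constraints. The following sketch assumes SQLite through Python's sqlite3 module; the patients table, its rules and the try_insert helper are hypothetical examples, not a recommended schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE patients (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        age INTEGER CHECK (age BETWEEN 0 AND 150)
    )
    """
)

def try_insert(email, age):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on failure
            conn.execute(
                "INSERT INTO patients (email, age) VALUES (?, ?)", (email, age)
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("a@example.com", 40),  # valid row
    try_insert("a@example.com", 41),  # duplicate email violates UNIQUE
    try_insert(None, 30),             # missing email violates NOT NULL
    try_insert("b@example.com", -5),  # impossible age violates CHECK
]
row_count = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]
```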

What is a database management system?

Figure: Various DBMS components. APIs connect the user or application to the database management system, which lets them interact with the database.

A DBMS is software that enables users to create and manage a database. It also helps them create, read, update and delete data in a database, and it assists with logging and auditing functions.

A DBMS provides physical and logical independence from data. Users and applications don't need to know either the physical or logical locations of data. A DBMS can also limit and control access to the database and provide different views of the same database schema to multiple users. Some examples of DBMSes include Microsoft SQL Server, MySQL and Oracle Database.
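
The create, read, update and delete operations and the idea of a restricted view can be sketched as follows. SQLite, accessed through Python's sqlite3 module, stands in for a DBMS here; the employees table and the employee_directory view are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")

# Create
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)", [("Ada", 100), ("Bob", 90)]
)
# Read
names = [row[0] for row in conn.execute("SELECT name FROM employees ORDER BY id")]
# Update
conn.execute("UPDATE employees SET salary = ? WHERE name = ?", (95, "Bob"))
# Delete
conn.execute("DELETE FROM employees WHERE name = ?", ("Ada",))

# A view exposes only part of the schema, here hiding the salary column.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employees")
cursor = conn.execute("SELECT * FROM employee_directory")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

Server-based DBMSes typically pair views like this with per-user permissions, which SQLite doesn't provide.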

Evolution of databases

Databases were first created in the 1960s. These early databases were network models where each record is related to many primary and secondary records. Hierarchical databases were also among the early models. They have tree schemas with a root directory of records linked to several subdirectories.

Relational databases were developed in the 1970s and became more popular in the following decade. E.F. Codd outlined the concept of the relational database in 1970 while at IBM. It became the standard for database systems because of its logical schema, or the way it's organized. The use of a logical schema separates the relational database from physical storage.

The relational database, combined with the growth of the internet beginning in the mid-1990s, led to a proliferation of databases as many business and consumer applications began relying on them.

Object-oriented databases came next in the 1990s. This type of database enables users to quickly query data with complex relationships. Today, we use SQL, NoSQL, cloud and self-driving databases.

What is the future of databases?

The technology behind databases has changed since the conception of network and hierarchical databases in the 1960s. Most databases now are of the SQL, NoSQL and cloud-based varieties. However, self-driving databases are also making headway with services such as Oracle Autonomous Database.

The advent of self-driving databases also introduces a potential future trend in databases. That's the incorporation of artificial intelligence and machine learning to manage and optimize database performance. These tools are designed to operate in databases to minimize the need for database administrators to manually maintain them. They're able to handle numerous data management tasks.

Another trend is cloud-native databases, which are databases built from the ground up to operate in the cloud. This type of database is more resilient, is designed to work with the distributed nature of the cloud, and can optimize performance and manage resources more efficiently.

Databases are also more likely to see an increased level of data security, as this is becoming a more significant factor for organizations that move their database to the cloud. Cloud databases commonly face threats such as exposed APIs, workload hijacking, data exposure and other exploits. Many cloud database services come with security features such as automatic backups, data encryption, identity and access management permissions and role-based access.

This was last updated in May 2024
