Data Structures
In computer science, a data structure is a format for organizing and storing data that
is usually chosen for efficient access to data.[1][2][3] More precisely, a data structure is a collection of
data values, the relationships among them, and the functions or operations that can be applied to
the data,[4] i.e., it is an algebraic structure about data.
Usage
Data structures serve as the basis for abstract data types (ADTs). An ADT defines the logical form of
a data type, while a data structure implements its physical form.[5]
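The split between logical and physical form can be sketched in Python. This is only an illustration: the names Stack, ArrayStack, and LinkedStack are hypothetical and do not come from the cited sources. The abstract class plays the role of the ADT, and the two subclasses are alternative data structures that implement it.

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """Logical form: which operations exist, not how they are stored."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...


class ArrayStack(Stack):
    """Physical form 1: items kept in a resizable array (a Python list)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack(Stack):
    """Physical form 2: items kept in linked (item, next) pairs."""

    def __init__(self):
        self._top = None

    def push(self, item):
        self._top = (item, self._top)

    def pop(self):
        if self._top is None:
            raise IndexError("pop from empty stack")
        item, self._top = self._top
        return item

    def is_empty(self):
        return self._top is None
```

Code written against Stack behaves the same with either implementation; only the storage layout and the cost profile differ.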
Different types of data structures are suited to different kinds of applications, and some are highly
specialized to specific tasks. For example, relational databases commonly use B-tree indexes for
data retrieval,[6] while compiler implementations usually use hash tables to look up identifiers.[7]
Data structures provide a means to manage large amounts of data efficiently for uses such as
large databases and internet indexing services. Usually, efficient data structures are key to
designing efficient algorithms. Some formal design methods and programming
languages emphasize data structures, rather than algorithms, as the key organizing factor in
software design. Data structures can be used to organize the storage and retrieval of information
stored in both main memory and secondary memory.[8]
Implementation
Data structures are generally based on the ability of a computer to fetch and store data at any place
in its memory, specified by a pointer: a bit string, representing a memory address, that can itself be
stored in memory and manipulated by the program. Thus, the array and record data structures are
based on computing the addresses of data items with arithmetic operations, while the linked data
structures are based on storing addresses of data items within the structure itself.
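The two addressing styles can be imitated in Python with a toy model. In this sketch a bytearray stands in for main memory and byte offsets stand in for addresses; the helper names (array_store, array_load, node_store, linked_values) and the NIL marker are assumptions made for illustration, not part of any real memory API.

```python
import struct

memory = bytearray(64)  # toy "main memory"; an address is a byte offset
ELEM = 4                # size in bytes of one 32-bit integer
NIL = -1                # hypothetical marker for "no next node"


def array_store(base, i, value):
    # Array: the address of element i is computed as base + i * ELEM.
    struct.pack_into("<i", memory, base + i * ELEM, value)


def array_load(base, i):
    return struct.unpack_from("<i", memory, base + i * ELEM)[0]


def node_store(addr, value, next_addr):
    # Linked structure: each node stores its value and the address of the next node.
    struct.pack_into("<ii", memory, addr, value, next_addr)


def linked_values(head):
    values = []
    addr = head
    while addr != NIL:
        value, addr = struct.unpack_from("<ii", memory, addr)
        values.append(value)
    return values


# An array of three integers starting at address 0.
for i, v in enumerate([10, 20, 30]):
    array_store(0, i, v)

# A three-node linked list whose nodes are scattered: 40 -> 16 -> 56.
node_store(40, 1, 16)
node_store(16, 2, 56)
node_store(56, 3, NIL)
```

Here array_load(0, 1) finds 20 with a single address calculation, while linked_values(40) has to follow the stored addresses node by node to collect [1, 2, 3].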
The implementation of a data structure usually requires writing a set of procedures that create and
manipulate instances of that structure. The efficiency of a data structure cannot be analyzed
separately from those operations. This observation motivates the theoretical concept of an abstract
data type, a data structure that is defined indirectly by the operations that may be performed on it,
and the mathematical properties of those operations (including their space and time cost).[9]
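A rough way to see why cost has to be stated per operation is to count the work each operation does. The sketch below uses two hypothetical classes, ArraySeq and LinkedSeq, with made-up counters (moves, steps); it is a counting exercise under those assumptions, not a benchmark.

```python
class ArraySeq:
    """Array-backed sequence: inserting at the front shifts every element."""

    def __init__(self):
        self.items = []
        self.moves = 0  # element relocations performed so far

    def push_front(self, value):
        self.items.append(None)  # grow by one slot at the end
        for i in range(len(self.items) - 1, 0, -1):
            self.items[i] = self.items[i - 1]  # shift one element right
            self.moves += 1
        self.items[0] = value

    def get(self, index):
        return self.items[index]  # direct indexing, no traversal


class LinkedSeq:
    """Linked sequence: inserting at the front touches only the head."""

    def __init__(self):
        self.head = None  # chain of (value, next) pairs
        self.steps = 0    # links followed during lookups

    def push_front(self, value):
        self.head = (value, self.head)

    def get(self, index):
        node = self.head
        for _ in range(index):
            node = node[1]
            self.steps += 1
        return node[0]
```

Neither structure is better in general: the array-backed version pays for push_front and the linked version pays for get, which is why the time and space costs are attached to the operations of the abstract data type.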
Examples
Figure: The standard type hierarchy of the programming language Python 3.
There are numerous types of data structures, generally built upon simpler primitive data types.
Well-known examples are:[10]
An array is a number of elements in a specific order, typically all of the same type (depending on
the language, individual elements may either all be required to be the same type, or may be of
almost any type). Elements are accessed using an integer index to specify which element is
required. Typical implementations allocate contiguous memory words for the elements of arrays
(though this is not strictly required). Arrays may be fixed-length or resizable.
A linked list (also just called list) is a linear collection of data elements of any type, called nodes,
where each node holds a value and points to the next node in the linked list. The principal
advantage of a linked list over an array is that, given a reference to the neighboring node, values
can be inserted and removed efficiently without relocating the rest of the list. Certain other
operations, such as random access to a particular element, are however slower on lists than on arrays.
A record (also called tuple or struct) is an aggregate data structure. A record is a value that
contains other values, typically in fixed number and sequence and typically indexed by names.
The elements of records are usually called fields or members. In the context of object-oriented
programming, records are known as plain old data structures to distinguish them from objects.[11]
Hash tables, also known as hash maps, are data structures that provide fast retrieval of values
based on keys. They use a hash function to map keys to indexes in an array, allowing for
constant-time access in the average case. Hash tables are commonly used to implement
dictionaries, caches, and database indexes. However, hash collisions can occur when two keys map
to the same index, which degrades performance. Techniques like chaining and open addressing are
employed to handle collisions.
Graphs are collections of vertices (nodes) connected by edges, which represent relationships
between entities. Graphs can be used to model social networks, computer networks, and
transportation networks, among other things. Graphs can be directed or undirected, and they can
have cycles or be acyclic. Graph traversal algorithms include breadth-first search and
depth-first search.
Stacks and queues are abstract data types that can be implemented using arrays or linked lists.
A stack has two primary operations, push (adds an element to the top of the stack) and pop
(removes the topmost element from the stack), which follow the Last In, First Out (LIFO) principle.
A queue has two main operations, enqueue (adds an element to the rear of the queue) and
dequeue (removes an element from the front of the queue), which follow the First In, First Out
(FIFO) principle.
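Two of the examples above, collision handling by chaining and breadth-first search driven by a FIFO queue, can be sketched briefly in Python. The names ChainedHashTable and bfs_order, the default bucket count, and the dictionary-of-lists graph representation are illustrative assumptions; production hash tables also resize as they fill, which this sketch omits.

```python
from collections import deque


class ChainedHashTable:
    """Hash table that resolves collisions by chaining (one list per bucket)."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash function maps the key to an index in the bucket array.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))       # colliding keys share the chain

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)


def bfs_order(graph, start):
    """Return vertices in breadth-first order.

    graph is assumed to be a dict mapping each vertex to a list of neighbors.
    """
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()          # dequeue from the front (FIFO)
        for neighbor in graph[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)    # enqueue at the rear
    return order
```

Replacing the queue in bfs_order with a stack (pushing and popping at the same end) would give a depth-first style traversal instead, which shows how the choice of container shapes the algorithm.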