
CMP 313 Data Structures and Algorithms - 085929

cmp313

Uploaded by

joshmelodymedia
Copyright
© © All Rights Reserved
CMP 313 Data Structures and Algorithms

Understanding Computer Languages


A computer language is a language used to communicate with a machine. Like all
languages, computer languages have syntax (form) and semantics (meaning). High-level
languages such as Java are designed to make programming easier, but the programmer
typically has little control over how efficiently the code runs on the hardware. Assembly
language programs, on the other hand, are harder to write but let the programmer
optimize the performance of the code. Then there is machine language, the language the
machine actually understands. All computer languages are ultimately designed to communicate
with hardware, but programs written in high-level languages may go through many translation
steps before being executed. Programs written in C are first converted to an assembly
program (designed for the underlying hardware), which in turn is converted to machine
language, the language understood by the hardware. There may be many steps in between.
Machine language “defines” the machine and vice versa. Machine language instructions are
very simple: they typically add two numbers, move data, or jump from one instruction to
another. However, it is very difficult to write and debug programs in machine language.
Understanding how a computer operates at the most fundamental level, in terms of
instruction execution in bits and bytes, plays a significant role in creating efficient data
structures and algorithms.

Background and Concept of Data Structure

The study of data structures helps to understand the basic concepts involved in organizing and
storing data as well as the relationship among the data sets. This in turn helps to determine the
way information is stored, retrieved and modified in a computer’s memory.

The study of data structures is the branch of computer science that helps you understand how
data is organized and how data flow is managed to increase the efficiency of a process or
program. A data structure is the structural representation of the logical relationships between
data elements; that is, it organizes data items according to the relationships between them.
Example:
A house can be identified by its name, location, number of floors and so on. This
structured set of variables works together to identify the exact house. Similarly, a data
structure is a structured set of variables that are linked to each other and that forms the basic
component of a system.
Basic Terminologies
The choice of data structure is important in developing efficient and error-free programs and
software, as it forms the foundation of how computers execute instructions from memory.
Below are the basic terminologies used in connection with data structures.
Data: Refers to a collection of raw facts or values that have not been organized or processed. Data
can take various forms such as numbers, text, images, or multimedia. In data structures, data is
organized to facilitate efficient storage, retrieval, and manipulation. A typical
scenario is a student database system, which could include specific details about a student, such as
"John Smith," "18," "Male," "Grade 12," "Physics Grade: A," "Math Grade: B," "English Grade:
A-". Each piece of information represents raw data about the student.

Group Items: These are collections of related or subordinate data elements that are treated as a
single entity. They can consist of homogeneous (same) or heterogeneous (different) data types.
In data structures, grouping items together allows for better organization and management of
data, making it easier to perform operations on the data as a whole. For example, the name
of a student can be made up of “First Name”, “Middle Name” and “Surname”; an attendance record
could be made up of “week 1 attendance record”, “week 2 attendance record” and so on; and
examination scores could list each registered course with its corresponding grade, such as
“Physics = A”, “Mathematics = B”, “Computer Science = A”.

Record: A record is a collection of various data items, each representing a piece
of data related to a particular entity or concept. For example, a student's record in a school
database might include fields such as student ID, name, address, date of birth, grade level,
parent/guardian contact information, and enrollment status. Each record represents a unique
student within the school system.

File: A file is a collection of various records of a particular entity stored together for efficient
storage and retrieval. Files can be organized in various ways, such as sequentially or using
indexed access methods. In data structures, files are used to store and manage data persistently,
often on secondary storage devices like hard drives or solid-state drives. Example: A file
containing student transcripts would consist of multiple records, each representing an individual
student's academic history. Each record would include details such as courses taken, grades,
GPA, and any honors or awards earned. The file organizes these records for easy access and
management.

Attribute and Entity: Attributes represent the properties or characteristics of an entity in a data
model. They describe the various aspects or features of the entity, and each attribute represents
one particular property of that entity.
Entities are objects or concepts in the real world that are represented in a database. Each entity is
described by a set of attributes, and entities may have relationships with other entities. For
example, in a student information system, an entity could be a "Student," with attributes
including "Name," "Date of Birth," "Gender," "Grade Level," and "Address." Each student entity
in the system would have specific values for these attributes, distinguishing one student from
another.

Field: A field is a single piece of data within a record or entity. It corresponds to a specific
attribute and holds a value representing some aspect of the entity. Within a student's record, a
field like "Grade Point Average (GPA)" would contain a numerical value representing the
student's overall academic performance, such as "3.5" or "4.0." Another field like "Address"
would store the student's residential address, such as "123 Main Street, City, State, ZIP." Each
field stores a specific piece of information about the student.
Aims of Data Structure
A data structure is designed to meet two goals:

Correctness: A data structure is designed to operate correctly for all kinds of input
within the domain of interest. Correctness is the primary goal of a
data structure, and what counts as correct always depends on the specific problem that the data
structure is intended to solve.

Efficiency: A data structure also needs to be efficient. It should process data at high speed
without using much of the computer's resources, such as memory space. In a real-time setting, the
efficiency of a data structure is an important factor that determines the success or failure of the
process.

Classification Of Data Structures


Data structures are divided into two categories: primitive data structures and non-primitive data
structures. Non-primitive data structures are further sub-categorised into linear data
structures and non-linear data structures. The diagram below illustrates these classifications with
examples.
Primitive data structures: These are basic data structures that are directly operated upon by the
machine's instructions. They are fundamental and typically defined by the programming
language itself. These data structures are simple and usually have fixed sizes in memory.
Primitive data structures are also known as elementary data structures.

Examples of primitive data structures include integers, floating-point (real) numbers, characters,
and Booleans. Each of these data types holds a single value and is used to represent a basic
type of data.

1. Integers: Integers represent whole numbers without any fractional or decimal part. They
can be either signed (positive, negative, or zero) or unsigned (positive or zero).

2. Floating-point Numbers (real): Floating-point numbers represent real numbers with a
fractional part. They include both the integer and fractional parts, separated by a decimal
point.

3. Characters: Characters represent individual symbols such as letters, digits, or special
characters. They are typically encoded using ASCII or Unicode standards.

4. Booleans: Booleans represent a binary value, typically denoting true or false.


Non-primitive data structures: These are more complex data structures derived from primitive data
structures and used to organize and manage collections of data. Unlike primitive data
structures, they cannot be operated on directly by machine-level instructions; instead they are
built from primitive data types using programming constructs. Examples include arrays, linked
lists, trees, graphs, and hash tables. Non-primitive data structures are sub-categorized into
linear and non-linear data structures.

Linear Data Structures: Linear data structures organize elements sequentially, meaning each element
is connected to its preceding and succeeding elements in a linear or sequential manner. Elements in linear
data structures are accessed and processed in a linear order, typically from one end to the other.

Examples of linear data structures include:

I. Arrays: An array is a fundamental data structure that stores elements of the same data
type in contiguous memory locations. Each element in an array is accessed using an
index, which specifies its position in the array. In a C-style language, an array can be
declared and given initial values by enclosing the values in braces {}:
int NUM[5] = {45, 32, 5, 10, 66};
This statement creates the array depicted in the figure below.

Index   0    1    2    3    4
NUM     45   32   5    10   66

The size declared in the square brackets [] must be at least the number of values listed
inside the braces {}; here both are 5. Each value is identified by an index from 0 to 4, and
all five values have the same data type (int) and are stored under the array name NUM.

Arrays can be classified as one-dimensional, two-dimensional or multidimensional arrays.

One-dimensional Array: It has only one row of elements, stored in ascending storage
locations.

Two-dimensional Array: It consists of multiple rows and columns of data elements. It is
also called a matrix.

Multidimensional Array: Multidimensional arrays can be defined as arrays of arrays. They
are not limited to two indices or two dimensions and can include as many indices as
required.


Disadvantages

❖ Fixed size: Arrays have a fixed size determined at the time of creation, making it challenging
to resize dynamically.

❖ Inefficient insertion and deletion: Inserting or deleting elements in the middle of an array
requires shifting all subsequent elements, so the cost grows with the size of the array.

❖ Memory wastage: If the array size is larger than required, it leads to memory wastage.

Advantages

❖ Constant-time access: Elements can be accessed directly using their index, providing
efficient random access.
❖ Compact memory representation: Arrays store elements in contiguous memory blocks, which
reduces memory overhead.
❖ Easy to implement: Arrays are supported by most programming languages and are
straightforward to use.

Applications

❖ Storing a list of data elements belonging to the same data type
❖ Auxiliary storage for other data structures
❖ Storage of binary tree elements of fixed count
❖ Storage of matrices
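
The trade-off between constant-time access and costly insertion can be sketched in a few lines. The example below is only an illustration: a Python list stands in for the array NUM, and the helper name insert_at is hypothetical, not part of any library.

```python
# A Python list stands in for the array NUM from the text (illustration only).
NUM = [45, 32, 5, 10, 66]


def insert_at(arr, index, value):
    """Insert value at index by shifting every later element one slot right."""
    arr.append(None)                          # make room for one more element
    for i in range(len(arr) - 1, index, -1):
        arr[i] = arr[i - 1]                   # move each later element right
    arr[index] = value
    return arr


third = NUM[2]                                # direct access by index
shifted = insert_at(list(NUM), 1, 99)         # works on a copy of NUM
print(third, shifted)
```

Reading NUM[2] touches one element, while inserting at index 1 has to move four elements before the new value can be stored.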

II. Linked Lists: A linked list is a linear data structure composed of nodes, where each node
contains data or information and a pointer to the next node in the sequence. Unlike arrays,
linked lists do not require contiguous memory, which allows dynamic memory allocation
and flexible resizing.

The figure above depicts a linked-list with 4 nodes. The left part of the nodes contains the data or
information of an entire record and the right part contains a pointer or reference to the next node.
The last node in the figure shows a null pointer.
Advantages:

❖ Dynamic memory allocation: Linked lists can dynamically allocate memory for nodes,
allowing flexible resizing and efficient memory usage.

❖ Efficient insertion and deletion: Once the position is known, inserting or deleting an element
in a linked list requires only updating pointers, with no shifting of other elements.

❖ Versatility: Linked lists support various operations such as insertion, deletion, and traversal,
making them suitable for diverse applications.

Disadvantages:

❖ Sequential access: Linked lists do not support random access to elements by index, requiring
sequential traversal to access specific elements.

❖ Additional memory overhead: Linked lists require extra memory for storing pointers,
increasing memory overhead compared to arrays.

❖ Lack of cache locality: Accessing elements in a linked list may result in poor cache locality,
leading to slower performance compared to arrays for certain operations.
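
A minimal sketch of a singly linked list may make the pointer updates concrete. The class and method names below (Node, LinkedList, push_front, insert_after, delete_after) are chosen for illustration and are not taken from the course material.

```python
class Node:
    """One node: the stored data plus a reference to the next node."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    def __init__(self):
        self.head = None                      # empty list: no first node

    def push_front(self, data):
        self.head = Node(data, self.head)     # new node points at old head

    def insert_after(self, node, data):
        node.next = Node(data, node.next)     # only two pointers change

    def delete_after(self, node):
        if node.next is not None:
            node.next = node.next.next        # bypass the removed node

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:                # sequential traversal
            out.append(cur.data)
            cur = cur.next
        return out


lst = LinkedList()
for value in (30, 20, 10):
    lst.push_front(value)                     # 10 -> 20 -> 30
lst.insert_after(lst.head, 15)                # 10 -> 15 -> 20 -> 30
lst.delete_after(lst.head.next)               # removes 20
print(lst.to_list())
```

Note that reaching a node still requires walking from the head, which is the sequential-access disadvantage listed above.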

III. Stacks: A stack is a linear data structure that follows the Last In, First Out (LIFO)
principle, where elements are added or removed from one end called the top. Stacks
support two primary operations: push, which adds an element to the top of the stack, and
pop, which removes and returns the top element.

Advantages

❖ Simple implementation: Stacks are easy to implement and understand, making them suitable
for a wide range of applications.
❖ LIFO ordering: Stacks enforce LIFO ordering, which is useful for managing function calls,
expression evaluation, and undo mechanisms.

❖ Space-efficient: Stacks typically use minimal memory, as they only require storage for
elements and a few pointers.

Disadvantages

❖ Limited access: Stacks provide limited access to elements, as only the top element can be
accessed or removed at any given time.

❖ Overflow and underflow: Stacks may encounter overflow (when attempting to push onto a
full stack) or underflow (when attempting to pop from an empty stack) conditions.

❖ Recursive operations: Recursive operations using stacks may lead to stack overflow if the
call stack grows too large.
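
The push and pop operations, together with the overflow and underflow conditions, can be sketched as follows. The class name BoundedStack and the fixed capacity are assumptions made for this example; the choice of exception types is also only one possibility.

```python
class BoundedStack:
    """A stack with a fixed capacity so that overflow can be demonstrated."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, item):
        if len(self.items) >= self.capacity:
            raise OverflowError("stack overflow: the stack is full")
        self.items.append(item)               # the end of the list is the top

    def pop(self):
        if not self.items:
            raise IndexError("stack underflow: the stack is empty")
        return self.items.pop()               # removes and returns the top


stack = BoundedStack(capacity=3)
for item in ("a", "b", "c"):
    stack.push(item)

try:
    stack.push("d")                           # a fourth push exceeds capacity
    overflowed = False
except OverflowError:
    overflowed = True

popped = [stack.pop(), stack.pop(), stack.pop()]   # last in, first out
print(popped, overflowed)
```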

IV. Queues: A queue is a linear data structure that follows the First In, First Out (FIFO)
principle, where elements are added to one end called the rear and removed from the
other end called the front. Queues support two primary operations: enqueue, which adds
an element to the rear of the queue, and dequeue, which removes and returns the front
element.
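
As a sketch of FIFO behaviour, the queue below wraps Python's collections.deque, which supports appending at one end and removing from the other. The class name Queue and its method names simply mirror the terms used above.

```python
from collections import deque


class Queue:
    """FIFO queue: enqueue at the rear, dequeue from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)              # add at the rear

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from an empty queue")
        return self._items.popleft()          # remove from the front

    def __len__(self):
        return len(self._items)


queue = Queue()
for name in ("first", "second", "third"):
    queue.enqueue(name)
served = [queue.dequeue(), queue.dequeue()]   # leaves "third" waiting
print(served, len(queue))
```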

Non-linear Data structures

Non-linear data structures organize elements in a hierarchical or non-sequential manner, where
each element can have multiple connections or relationships with other elements. Unlike linear
data structures, elements in non-linear data structures do not follow a strict order.

Examples of non-linear data structures include:

I. Trees: A tree is a non-linear data structure in which data is organized in branches,
imposing a hierarchical structure on the data elements. In some kinds of tree, such as
search trees, the elements are also kept in a sorted order. The figure below represents a
tree which consists of 8 nodes. The root of the tree is node 60 at the top. Nodes 29 and 44
are the successors of node 60. Nodes 6, 4, 12 and 67 are the terminal nodes, as they do not
have any successors.
Applications:

❖ Implementing the hierarchical structures in computer systems like directory and file system.
❖ Implementing the navigation structure of a website.
❖ Generating codes such as Huffman codes.
❖ Decision making in gaming applications.
❖ Implementation of priority queues for priority-based OS scheduling functions
❖ Parsing of expressions and statements in programming language compilers
❖ For storing data keys for DBMS for indexing
❖ Spanning trees for routing decisions in computer and communications networks
❖ Hash trees
❖ Path-finding algorithms in AI, robotics and video game applications
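
The first application, a directory hierarchy, can be sketched with a small general tree. The folder and file names below are invented for the example, and TreeNode and leaves are hypothetical names; the sketch does not reproduce the tree in the figure.

```python
class TreeNode:
    """A node holding a value and a list of child nodes (its successors)."""

    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def leaves(node):
    """Return the values of all terminal nodes, visited left to right."""
    if not node.children:                     # no successors: terminal node
        return [node.value]
    found = []
    for child in node.children:
        found.extend(leaves(child))           # recurse into each subtree
    return found


# A made-up directory layout used only for illustration.
root = TreeNode("root", [
    TreeNode("docs", [TreeNode("notes.txt")]),
    TreeNode("src", [TreeNode("main.c"), TreeNode("util.c")]),
    TreeNode("README"),
])
print(leaves(root))
```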

II. Graphs: A graph is also a non-linear data structure. In a tree, all data
elements are stored in a definite hierarchical structure; in other words, each node other
than the root has exactly one parent node. In a graph, each data element is called a vertex
and may be connected to many other vertices through connections called edges. Thus, a graph
is a mathematical structure composed of a set of vertices and a set of edges. The figure
shows a graph with six nodes A, B, C, D, E, F and seven edges [A, B], [A, C], [A, D],
[B, C], [C, F], [D, F] and [D, E].
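
One common way to store such a graph is an adjacency list, sketched below for the six vertices and seven edges just listed. The sketch assumes the edges are undirected, which the text does not state, and the function name build_adjacency is hypothetical.

```python
# The seven edges listed in the text, treated as undirected (an assumption).
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("C", "F"), ("D", "F"), ("D", "E")]


def build_adjacency(edges):
    """Map every vertex to the list of vertices it shares an edge with."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)   # undirected: record both ways
    return adjacency


graph = build_adjacency(EDGES)
for vertex in sorted(graph):
    print(vertex, "->", graph[vertex])
```

Unlike a tree node, a vertex such as C here is connected to three other vertices (A, B and F) with no single parent among them.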

Static Data Structure Vs Dynamic Data Structure

Static and dynamic data structures are fundamental concepts in computer science that define how
data is organized, accessed, and manipulated in memory. Understanding the essence of both
static and dynamic data structures is crucial for designing efficient algorithms and solving
various computational problems.

Static Data Structures: Static data structures are those whose size and memory allocation are
fixed at compile time and cannot be altered during program execution. Arrays are a typical
example: their size is determined at compile time, and elements of the
same data type are stored in contiguous memory locations. In some cases, linked lists may also be
implemented statically, where a fixed number of nodes is allocated at compile time and no
additional nodes can be created during runtime.

Dynamic Data Structures: Dynamic data structures are those whose size and memory
allocation can change during program execution. They can grow or shrink as needed
to accommodate varying amounts of data. One example is the dynamic array (e.g., ArrayList in Java,
std::vector in C++), a resizable array that grows automatically as
elements are added. Another example is the dynamic linked list, where memory for a node is
allocated when a new element is added and released when a node is removed, so the structure
occupies only as much memory as its current elements need.
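
How a dynamic array grows can be sketched by copying into a larger block whenever the current one is full. The class name DynamicArray and the growth factor of 2 are assumptions for illustration; real libraries such as ArrayList or std::vector choose their own growth policies.

```python
class DynamicArray:
    """A resizable array built on fixed-size blocks (illustrative sketch)."""

    def __init__(self):
        self.capacity = 1                     # slots currently allocated
        self.size = 0                         # slots currently in use
        self.slots = [None] * self.capacity

    def append(self, value):
        if self.size == self.capacity:        # block is full: grow first
            self._resize(2 * self.capacity)   # doubling is an assumption
        self.slots[self.size] = value
        self.size += 1

    def _resize(self, new_capacity):
        new_slots = [None] * new_capacity     # allocate a larger block
        for i in range(self.size):
            new_slots[i] = self.slots[i]      # copy the existing elements
        self.slots = new_slots
        self.capacity = new_capacity

    def to_list(self):
        return self.slots[:self.size]


arr = DynamicArray()
for value in range(5):
    arr.append(value)
print(arr.to_list(), arr.size, arr.capacity)
```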

OPERATIONS ON DATA STRUCTURES

1. Access:

• Accessing refers to retrieving the value of an element stored in the data structure.
It involves specifying the position or key of the element to be accessed.

• Examples: Accessing the nth element in an array, retrieving the value associated
with a specific key in a dictionary.

2. Insertion:

• Insertion involves adding a new element into the data structure. The element can
be inserted at a specified position or appended to the end of the structure.

• Examples: Inserting a new item at the end of an array, adding a node to the end of
a linked list.

3. Deletion:

• Deletion removes an existing element from the data structure. It can involve
removing an element at a specified position, removing the first or last element, or
removing an element with a specific value or key.

• Examples: Deleting the nth element from an array, removing a node from a linked
list.

4. Search:

• Searching involves finding a specific element within the data
structure based on its value or key. It determines whether the element exists in the
structure and, if so, returns its position or reference.

• Examples: Searching for a value in an array, traversing a linked list to find a
specific node.

5. Traversal:

• Traversal involves visiting and processing each element of the data structure in a
specific order. It allows for examining or performing operations on all elements of
the structure.

• Examples: Iterating over all elements of an array, traversing a linked list to
perform a specific operation on each node, recursively traversing a tree or graph.

6. Sorting:

• Sorting rearranges the elements of the data structure into a specific order based on
a defined criterion, such as numerical or alphabetical order.

• Examples: Sorting an array of integers in ascending or descending order, sorting a
linked list.

7. Merging:

• Merging combines two or more data structures into a single structure. It involves
integrating the elements of multiple structures while preserving the order or
properties of the original structures.

• Examples: Merging two sorted arrays into a single sorted array, merging two
sorted linked lists, merging the keys of two dictionaries into a single dictionary.

8. Splitting:

• Splitting divides a data structure into two or more smaller structures.
It separates the elements of the original structure based on a specified criterion or
condition.

• Examples: Splitting an array into two arrays based on a certain condition,
partitioning a linked list into two linked lists, or separating the elements of a set
into multiple subsets.

9. Update:

• Updating modifies the value of an existing element within the data structure.

• Examples: Updating the value of an element in an array, changing the data stored
in a node of a linked list, or updating the value associated with a key in a hash
table.
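
Two of the operations above, merging and splitting, can be sketched on plain Python lists. The function names merge_sorted and split_by are hypothetical, and the sample values reuse the numbers from the array example.

```python
def merge_sorted(a, b):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:                      # take the smaller front element
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])                      # one of these two is empty
    merged.extend(b[j:])
    return merged


def split_by(items, predicate):
    """Split items into (matching, non-matching), keeping original order."""
    matching, rest = [], []
    for item in items:
        (matching if predicate(item) else rest).append(item)
    return matching, rest


merged = merge_sorted([5, 10, 45], [32, 66])
evens, odds = split_by([45, 32, 5, 10, 66], lambda n: n % 2 == 0)
print(merged, evens, odds)
```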
Abstract Data Types

Abstract Data Types (ADTs) are a fundamental concept in computer science that allows
programmers to define data structures independently of their underlying implementation. This
means that neither the manner in which data is organised in the computer's memory nor the specific
algorithms used to carry out the operations is specified. An
ADT consists of two main components: data and operations. The data component defines the
type of data being stored, while the operations component defines the set of operations that can
be performed on that data.

The key characteristic of ADTs is abstraction, which means hiding the internal details of how the
data is stored and manipulated and exposing only the essential properties and behaviours of the
data structure. This abstraction enables users to work with data structures at a higher level of
understanding, without needing to know the specific implementation details.

ADTs are typically defined through interfaces or specifications, which specify the set of
operations that can be performed on the data structure and their semantics. Examples of ADTs
are the stack, map, tree, list, queue and set. A set ADT, for instance, specifies operations
such as add, remove, contains, union, intersection, difference, subset and empty-set tests.

Advantage of using ADTs

In the real world, programs evolve as a result of new requirements or constraints, so a
modification to a program commonly requires a change in one or more of its data structures. For
example, you may want to add a new field to a student’s record to keep track of more information
about each student, or replace an array with a linked structure to improve the
program’s efficiency. In such a scenario, rewriting every procedure that uses the changed
structure is not desirable. A better alternative is to separate the use of a data structure
from the details of its implementation. This is the principle underlying the use of abstract data
types.
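
The separation between use and implementation can be sketched with an abstract class and two interchangeable implementations. The names StackADT, ArrayStack, LinkedStack and reverse are invented for this example; Python's abc module is used only as one possible way to express an interface.

```python
from abc import ABC, abstractmethod


class StackADT(ABC):
    """The interface: which operations exist, not how they are carried out."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...


class ArrayStack(StackADT):
    """Implementation backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack(StackADT):
    """Implementation backed by linked (value, next) pairs."""

    def __init__(self):
        self._head = None

    def push(self, item):
        self._head = (item, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from an empty stack")
        item, self._head = self._head
        return item

    def is_empty(self):
        return self._head is None


def reverse(values, stack):
    """Client code: relies only on the StackADT operations."""
    for value in values:
        stack.push(value)
    out = []
    while not stack.is_empty():
        out.append(stack.pop())
    return out


print(reverse([1, 2, 3], ArrayStack()), reverse([1, 2, 3], LinkedStack()))
```

The reverse function gives the same result with either implementation, so the underlying structure can be swapped without rewriting the procedures that use it.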

Bit Representation of Integers in the computer System

A bit is the smallest unit of data in a computer and can have one of two possible values: 0 or 1.
The term "bit" is short for "binary digit" and represents the basic building block of digital
information storage and processing.

A byte is a unit of digital information that consists of a fixed number of bits, usually eight. Bytes
are commonly used to represent characters, numbers, and other data in computer systems.

Files are typically sequences of bytes, and their sizes are measured in bytes. Files in
this context can be subcategorized as either text files (human-readable, for example in ASCII
format) or binary files (not human-readable, such as bitmap files).
Standard units of Memory

1000 bytes = 1 kilobyte (KB)
1000 KB = 1 megabyte (MB)
1000 MB = 1 gigabyte (GB)
1000 GB = 1 terabyte (TB)
1000 TB = 1 petabyte (PB)

Each byte of data can be interpreted using ASCII or its extended version, which assigns a
numerical value to each character, e.g. ‘A’ = 65 and ‘a’ = 97. Printable
standard ASCII values are between 32 and 126. The 8th bit in the byte may be used for parity
checking in communication or other device-specific functions.
An integer is typically represented by 4 bytes (32 bits). However, this depends on the
compiler and machine you are using: some architectures may use 2 bytes, while others
may use 8 bytes. Each ASCII value can be represented using 7 bits, which can represent
numbers from
0 = 0000 0000 to 127 = 0111 1111 (a total of 128 values, or 2^7)
Example
Converting the string "john" to binary first requires understanding how characters are
represented in computers. In Java, characters are stored using Unicode
encoding, which assigns a unique numerical value to each character; for the letters in "john"
these values are the same as their ASCII codes. The conversion itself is worked through step by
step at the end of the Base Conversions section.
Base Conversions
Understanding different bases is critical to understanding how data is represented in
memory. We consider base-2 (binary), base-8 (octal), base-10 (decimal) and base-16
(hexadecimal). A number can be represented in any of these bases. The
following digits are used in each base.
Base-2  - 0, 1
Base-8  - 0, 1, 2, 3, ..., 7
Base-10 - 0, 1, 2, ..., 9
Base-16 - 0, 1, 2, ..., 9, A, B, C, D, E, F

While binary remains the primary base for storage and processing in computer memory because it
matches the underlying hardware, hexadecimal is favored for human-readable
representations and ease of conversion to binary, making it easier for programmers and system
administrators to work with. Octal, while still occasionally used in legacy systems, is less
common in modern computing because an octal digit covers 3 bits, which does not divide evenly
into 8-bit bytes. In computing,
hexadecimal numbers are commonly used to represent memory addresses, data dumps, and in
various programming tasks due to their more compact representation compared to binary. When
storing information in 4 bytes, hexadecimal representation becomes particularly useful, as each
hexadecimal digit corresponds to 4 bits (half a byte), making it easy to represent byte values in a
concise manner.

The standard ASCII table, which assigns a numerical value to each character (for example ‘A’
= 65 and ‘a’ = 97), is given in the figure below.

A number that is in base-10, such as 97, can be written in base-2 (binary) as follows. Divide
the number by 2 and keep the remainder, then divide the resulting quotient by 2 again. Keep
on dividing until the quotient is 0, then read the remainders from bottom to top.
Divide 97 by 2:

Division    Quotient    Remainder
97 ÷ 2      48          1
48 ÷ 2      24          0
24 ÷ 2      12          0
12 ÷ 2      6           0
6 ÷ 2       3           0
3 ÷ 2       1           1
1 ÷ 2       0           1
0 ÷ 2       0           0

The last row only supplies a leading zero so that the result fills 8 bits.
97 base 10 = 01100001 base 2

To convert the value 97 to octal (base-8), divide repeatedly by 8 in the same way:

Division    Quotient    Remainder
97 ÷ 8      12          1
12 ÷ 8      1           4
1 ÷ 8       0           1

97 base 10 = 141 base 8


To convert the value 97 to hexadecimal (base-16), divide repeatedly by 16:

Division    Quotient    Remainder
97 ÷ 16     6           1
6 ÷ 16      0           6

97 base 10 = 61 base 16
The remainders, read from bottom to top, give the hexadecimal representation of the decimal
number 97.
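
The repeated-division procedure used in the three tables above can be sketched as one small function. The name to_base and the DIGITS string are chosen for this example; Python's built-in format function is used only to cross-check the results.

```python
DIGITS = "0123456789ABCDEF"                   # digit symbols up to base 16


def to_base(number, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)   # quotient and remainder
        remainders.append(DIGITS[remainder])
    return "".join(reversed(remainders))      # read remainders bottom to top


binary = to_base(97, 2)
octal = to_base(97, 8)
hexadecimal = to_base(97, 16)
padded = binary.zfill(8)                      # pad to a full byte
print(binary, padded, octal, hexadecimal)
```

Without padding the binary result has 7 digits; padding it to 8 digits gives the byte-sized form shown in the table.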

To convert the hexadecimal number 61 to binary, we first convert each hexadecimal digit to its
binary representation and then concatenate them together. Here's the conversion:
1. Hexadecimal to Binary Conversion:
• Hexadecimal 6: Binary 0110
• Hexadecimal 1: Binary 0001
2. Concatenate Binary Digits:
• Combining the binary representations of each hexadecimal digit: 0110 0001
So, the binary representation of the hexadecimal number 61 is 01100001.

Each hexadecimal digit corresponds to 4 bits (half a byte), so a 4-byte representation
has 8 hexadecimal digits: 00 00 00 61. In binary this is
00000000 00000000 00000000 01100001
The leading zeros fill the unused high-order bits of the 4 bytes.
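
The digit-by-digit conversion and the 4-byte layout can be checked with a short sketch. The helper name hex_to_binary is hypothetical; the format specifications "04b", "08X" and "032b" are standard Python formatting codes for zero-padded binary and hexadecimal output.

```python
def hex_to_binary(hex_digits):
    """Replace every hexadecimal digit with its 4-bit binary group."""
    return "".join(format(int(digit, 16), "04b") for digit in hex_digits)


value = 0x61                                  # hexadecimal 61 = decimal 97
one_byte = hex_to_binary("61")
four_byte_hex = format(value, "08X")          # 8 hex digits for 4 bytes
bits = format(value, "032b")                  # 32 bits for 4 bytes
four_byte_bin = " ".join(bits[i:i + 8] for i in range(0, 32, 8))
print(one_byte, four_byte_hex, four_byte_bin)
```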
Example: Convert the word “john” to base two using the ASCII table given in the figure above.
Each resulting binary number can in turn be converted to any other base, such as octal (base-8),
decimal (base-10) or hexadecimal (base-16).

1. Convert characters to ASCII values:

• "j": ASCII value = 106
• "o": ASCII value = 111
• "h": ASCII value = 104
• "n": ASCII value = 110

2. Convert ASCII values to 8-bit binary:

• "j" (106): 106 in binary is 01101010
• "o" (111): 111 in binary is 01101111
• "h" (104): 104 in binary is 01101000
• "n" (110): 110 in binary is 01101110

3. Combine the binary representations of each character by concatenating them in order:

"j" "o" "h" "n": 01101010 01101111 01101000 01101110

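The three steps above can be reproduced with a short sketch. The function name text_to_binary is hypothetical; ord returns a character's numerical code, which for these letters matches the ASCII values listed in step 1.

```python
def text_to_binary(text):
    """Return each character's code as an 8-bit group, separated by spaces."""
    return " ".join(format(ord(ch), "08b") for ch in text)


codes = [ord(ch) for ch in "john"]            # step 1: character codes
encoded = text_to_binary("john")              # steps 2 and 3 combined
print(codes)
print(encoded)
```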