CSC 204 Session 1
Introduction
Computers can store and process vast amounts of data. Considering the large amount of data stored
on a computer system, how do you organize information so that you can find, update, add or delete
portions of it efficiently? This is done through the use of appropriate data structures. Data structures
enable a programmer to mentally structure large amounts of data into conceptually manageable
relationships.
In this study session, you will learn about the basic concepts of data structures and their usefulness
in the field of computer science. You will also be introduced to the system life cycle, data types and
data abstraction. Just as Mathematics is the “Queen and servant of science”, the contents covered
by this study session are fundamental to all aspects of computer science.
CSC 204: Fundamentals of Data Structures
Usually, efficient data structures are a key to designing efficient algorithms. Some formal design
methods and programming languages emphasize data structures, rather than algorithms, as the key
organizing factor in software design.
The implementation of a data structure usually requires writing a set of procedures that create and
manipulate instances of that structure. The efficiency of a data structure cannot be analysed
separately from those operations. This observation motivates the theoretical concept of an abstract
data type, a data structure that is defined indirectly by the operations that may be performed on it,
and the mathematical properties of those operations (including their space and time cost).
A data structure is a particular way of storing and organizing data in a computer so that it can be used
efficiently.
Many high-level programming languages, and some higher-level assembly languages such as MASM, have special syntax
or other built-in support for certain data structures, such as vectors (one-dimensional arrays) in the
C language or multi-dimensional arrays in Pascal.
Most programming languages feature some sort of library mechanism that allows data structure
implementations to be reused by different programs. Modern languages usually come with standard
libraries that implement the most common data structures. Examples are the C++ Standard
Template Library, the Java Collections Framework, and Microsoft's .NET Framework.
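As a small illustration of reusing library data structures, the sketch below uses Python's standard library. Python is chosen here only for illustration and is not one of the libraries named above; the variable names are hypothetical.

```python
from collections import deque
import heapq

# A double-ended queue used as a first-in, first-out queue.
queue = deque()
queue.append("job-1")
queue.append("job-2")
first = queue.popleft()  # removes and returns the oldest item, "job-1"

# A priority queue (binary heap) stored in an ordinary list.
heap = []
for priority in (5, 1, 3):
    heapq.heappush(heap, priority)
smallest = heapq.heappop(heap)  # removes and returns the smallest item, 1
```

Neither structure had to be written by hand; the program only calls the operations that the library exposes.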
Modern languages also generally support modular programming, the separation between the
interface of a library module and its implementation. Some provide opaque data types that hide
implementation details from clients. Object-oriented programming languages, such as C++, Java
and the languages of the .NET Framework, may use classes for this purpose. Many known data structures have
concurrent versions that allow multiple computing threads to access the data structure
simultaneously.
The System Development Life Cycle (SDLC) is used during the development of an IT project; it describes the
different stages involved in the project, from the drawing board through to its completion.
[Figure: System Development Life Cycle]
The system development life cycle framework provides a sequence of activities for system
designers and developers to follow. It consists of a set of steps or phases in which each phase of
the SDLC uses the results of the previous one. They are as follows:
what competitors are doing. With this finding, you will have three choices: leave the system
as it is, improve it, or develop a new system.
1.2.5 Testing
When the software is ready, it is sent to the testing department where Quality Analysts test it
thoroughly for different errors by forming various test cases. They either test the software manually
or use automated testing tools to ensure that every component of the software works correctly.
Once the QA team is satisfied that no known errors remain, the software goes to the next stage, which is
Implementation.
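To make the idea of a test case concrete, here is a minimal sketch of automated testing using Python's built-in unittest module. The function under test, average, and the helper run_tests are hypothetical and exist only for this illustration.

```python
import unittest


def average(values):
    """Hypothetical function under test: mean of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


class AverageTests(unittest.TestCase):
    """Each method is one test case: an input and its expected result."""

    def test_typical_input(self):
        self.assertEqual(average([2, 4, 6]), 4)

    def test_single_value(self):
        self.assertEqual(average([7]), 7)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            average([])


def run_tests():
    """Run all test cases and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() executes the three cases and reports any failure, which is the kind of check an automated testing tool repeats whenever the software changes.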
During the maintenance stage of the SDLC, the system is assessed to ensure it does not become
obsolete. This is also where changes are made to the initial software. It involves continuous evaluation
of the system in terms of its performance.
When designing data structures and algorithms, it is desirable to avoid making decisions based on
the accident of how you first sketch out a piece of code. All design should be motivated by the
explicit needs of the application. The idea of an Abstract Data Type (ADT) is to support this.
An Abstract Data Type (ADT) is a data type that separates the specification of objects and operations
from the representation of objects and the implementation of operations. Abstraction captures only those
details about an object that are relevant to the current perspective.
Data abstraction enforces a clear separation between the abstract properties of a data type and the
concrete details of its implementation. The abstract properties are those that are visible to client
code that makes use of the data type—the interface to the data type—while the concrete
implementation is kept entirely private, and indeed can change, for example to incorporate
efficiency improvements over time.
The idea is that such changes are not supposed to have any impact on client code, since they
involve no difference in the abstract behaviour. Data abstraction allows raw data bits to be handled in
meaningful ways. It is thus a basic motivation behind the notion of a data type.
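The separation between interface and implementation can be sketched as follows. This is only an illustration under assumed names: the stack ADT, the classes ListStack and LinkedStack, and the client function reverse are hypothetical and do not come from the text.

```python
class ListStack:
    """Stack represented by a Python list (the top is the end of the list)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack:
    """Same interface, different representation: linked (item, rest) pairs."""

    def __init__(self):
        self._top = None

    def push(self, item):
        self._top = (item, self._top)

    def pop(self):
        if self._top is None:
            raise IndexError("pop from empty stack")
        item, self._top = self._top
        return item

    def is_empty(self):
        return self._top is None


def reverse(sequence, stack):
    """Client code: relies only on push, pop and is_empty."""
    for item in sequence:
        stack.push(item)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result
```

Both reverse([1, 2, 3], ListStack()) and reverse([1, 2, 3], LinkedStack()) give [3, 2, 1]. The client function never needs to change when the representation does.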
A key feature of modern computer programs is the ability to manipulate ADTs using procedures or
methods that are predefined by the programmer or software designer. This requires that data
structures be specified carefully, with forethought, and in detail.
The type INTEGER comprises a subset of the whole numbers whose size may vary among
individual computer systems. It is assumed that all operations on data of this type are exact and
correspond to the ordinary laws of arithmetic, and that the computation will be interrupted in the
case of a result lying outside the representable subset. This event is called overflow.
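Python integers have no fixed size, so the following sketch simulates the behaviour described above. It assumes a 32-bit signed range; INT_MIN, INT_MAX and checked_add are hypothetical names chosen for this illustration.

```python
# Assumed representable subset: 32-bit signed integers.
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def checked_add(a, b):
    """Add two integers, interrupting the computation on overflow."""
    result = a + b
    if not (INT_MIN <= result <= INT_MAX):
        raise OverflowError("result lies outside the representable subset")
    return result
```

Here checked_add(1, 2) returns 3, while checked_add(INT_MAX, 1) raises OverflowError instead of returning a wrong value.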
The standard operators are the four basic arithmetic operations of addition (+), subtraction (-),
multiplication (*), and division (/, DIV). Whereas the slash denotes ordinary division resulting in a
value of type REAL, the operator DIV denotes integer division resulting in a value of type
INTEGER.
If we define the quotient q = m DIV n and the remainder r = m MOD n, the following relations
hold, assuming n > 0:
q*n + r = m and 0 ≤ r < n
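These relations can be checked with a short sketch. It assumes that Python's // and % operators play the roles of DIV and MOD; for n > 0 they satisfy both relations, including for negative m. The function name div_mod is hypothetical.

```python
def div_mod(m, n):
    """Return (q, r) with q*n + r == m and 0 <= r < n, assuming n > 0."""
    if n <= 0:
        raise ValueError("n must be positive")
    q = m // n  # integer (floor) division, playing the role of DIV
    r = m % n   # remainder, playing the role of MOD
    assert q * n + r == m and 0 <= r < n
    return q, r


print(div_mod(31, 10))   # (3, 1)
print(div_mod(-31, 10))  # (-4, 9)
```

Note that for m = -31 the quotient is -4 rather than -3, which is what keeps the remainder in the range 0 ≤ r < n.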
The term octet always refers to an 8-bit quantity. It is mostly used in the field of computer
networking, where computers with different byte widths might have to communicate. In modern
usage, byte almost invariably means eight bits, since all other sizes have fallen into disuse. Thus,
byte has come to be synonymous with octet.
b. Words
The term 'word' is used for a small group of bits which are handled simultaneously by processors of
a particular architecture. The size of a word is thus CPU-specific. Many different word sizes have
been used, including 6-, 8-, 12-, 16-, 18-, 24-, 32-, 36-, 39-, 48-, 60-, and 64-bit.
Since the size of a word is architectural, it is usually set by the first CPU in a family, rather than the
characteristics of a later compatible CPU. The meanings of terms derived from word, such as long-
word, double-word, quad-word, and half-word, also vary with the CPU and OS.
The standard operators are the four basic arithmetic operations of addition (+), subtraction (-),
multiplication (*), and division (/). It is the essence of data typing that different types are
incompatible under assignment. An exception to this rule is made for assignment of integer values
to real variables, because here the semantics are unambiguous. After all, integers form a subset of
real numbers. However, the inverse direction is not permissible: Assignment of a real value to an
integer variable requires an operation such as truncation or rounding.
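The two directions of conversion can be sketched in Python. Python is dynamically typed, so it does not enforce assignment compatibility the way the text describes; the sketch only shows that going from a real to an integer value needs an explicit choice of operation. The variable names are hypothetical.

```python
import math

count = 7
as_real = float(count)  # integer to real: 7.0

value = 3.7
truncated = math.trunc(value)  # 3: the fractional part is discarded
rounded = round(value)         # 4: the nearest integer

# Truncation and flooring differ for negative values.
neg_truncated = math.trunc(-3.7)  # -3: towards zero
neg_floored = math.floor(-3.7)    # -4: towards negative infinity

# Python's round() sends exact halves to the nearest even integer.
half = round(2.5)  # 2
```

Because truncation, flooring and rounding give different answers, the programmer has to state which one is intended.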
Example 2.3
For instance, given Boolean variables p and q and integer variables x = 5, y = 8, z = 10, the two
assignments
p := x = y
The Boolean operators & (AND) and OR have an additional property in most programming
languages that distinguishes them from other dyadic operators. Whereas, for example, the sum x
+ y is undefined if either x or y is undefined, the conjunction p & q is defined even if q is
undefined, provided that p is FALSE.
This conditionality is an important and useful property. The exact definition of & and OR is
therefore given by the following equations:
p & q = if p then q else FALSE
p OR q = if p then TRUE else q
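The two equations above can be written out directly. In this sketch the second operand is passed as a function so that it is evaluated only when needed; the names cand, cor and safe_ratio_exceeds are hypothetical. Python's own and/or operators behave in the same conditional way.

```python
def cand(p, q):
    """p & q = if p then q else FALSE; q is a function, called only if needed."""
    return q() if p else False


def cor(p, q):
    """p OR q = if p then TRUE else q; q is a function, called only if needed."""
    return True if p else q()


def safe_ratio_exceeds(m, n, limit):
    """True if m / n > limit; the division is skipped when n is zero."""
    return n != 0 and m / n > limit


# The second operand would raise ZeroDivisionError, but it is never evaluated.
print(cand(False, lambda: 1 / 0 > 0))  # False
print(cor(True, lambda: 1 / 0 > 0))    # True
print(safe_ratio_exceeds(10, 0, 1))    # False
```

The last call shows the usual practical use: the first operand guards an expression that would otherwise be undefined.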
In order to be able to design algorithms involving characters (i.e., values of type CHAR) that are
system independent, we should like to be able to assume certain minimal properties of character
sets, namely:
1. The type CHAR contains the 26 capital Latin letters, the 26 lower-case letters, the 10
decimal digits and a number of other graphic characters, such as punctuation marks.
2. The subsets of letters and digits are ordered and contiguous, i.e.,
("A" ≤ x) & (x ≤ "Z") implies that x is a capital letter
("a" ≤ x) & (x ≤ "z") implies that x is a lower-case letter
("0" ≤ x) & (x ≤ "9") implies that x is a decimal digit.
3. The type CHAR contains a non-printing, blank character and a line-end character that may
be used as separators.
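Assuming a character set with these properties (Python's Unicode strings satisfy them for the Latin letters and decimal digits), character classification reduces to range comparisons. The function names below are hypothetical, and each argument is assumed to be a single character.

```python
def is_capital(x):
    """True if the single character x is a capital Latin letter."""
    return "A" <= x <= "Z"


def is_lower(x):
    """True if the single character x is a lower-case Latin letter."""
    return "a" <= x <= "z"


def is_digit(x):
    """True if the single character x is a decimal digit."""
    return "0" <= x <= "9"


def digit_value(x):
    """Numeric value of a digit character; relies on the digits being contiguous."""
    if not is_digit(x):
        raise ValueError("not a decimal digit")
    return ord(x) - ord("0")
```

For example, digit_value("7") is 7 because the digits are ordered and contiguous, so the distance from "0" equals the digit's value.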
VAR r, s, t: SET
Here, the value assigned to r is the singleton set consisting of the single element 5; t is assigned the
empty set and s the elements x, y, y+1,…, z-1, z.
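A sketch of the same three sets using Python's built-in set type follows. The values x = 2, y = 4 and z = 7 are assumptions made only for this example and are not taken from the text.

```python
x, y, z = 2, 4, 7  # assumed example values

r = {5}                         # singleton set containing only 5
t = set()                       # empty set ({} would create a dict in Python)
s = {x} | set(range(y, z + 1))  # the element x together with y, y+1, ..., z

print(sorted(s))  # [2, 4, 5, 6, 7]
print(5 in r)     # True
print(len(t))     # 0
```

Membership tests such as 5 in r, and set operations such as union (|), are the basic operations on values of this type.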
In this class, we will concentrate only on data structures called arrays, sets, records and on ADTs
called lists, stacks, queues, heaps, graphs, and trees. Further details on them will be discussed in
forthcoming study sessions.
1. A data structure is a particular way of storing and organizing data in a computer so that
it can be used efficiently. Data structures provide a means to manage huge amounts
of data efficiently, such as large databases and internet indexing services.
2. The systems development life cycle (SDLC), also referred to as the application
development life-cycle, is a term used in systems engineering, information systems
and software engineering to describe a process for planning, creating, testing, and
deploying an information system.
3. The phases of the SDLC are: Requirements Analysis, System Analysis, System Design,
Refinement and Coding, Acceptance, Installation, Deployment and Maintenance.
4. A data type is a term which refers to the kinds of data that variables may hold. Every
programming language has a set of built-in data types.
5. Abstract Data Type (ADT) is a data type that separates specification of objects and
operations from representation of objects and implementation of operations.
6. Abstraction captures only those details about an object that are relevant to the current
perspective.
7. Data abstraction enforces a clear separation between the abstract properties of a data
type and the concrete details of its implementation.
8. Standard primitive types are those types that are available on most computers as
built-in features. They include the whole numbers, the logical truth values, and a set
of printable characters.
9. Standard primitive data types include integer, real, boolean, character and set.
Pilot Answers
Glossary of Terms
Abstract Data Type (ADT) is a data type that separates the specification of objects and operations from
the representation of objects and the implementation of operations.
Data Structure is a particular way of storing and organizing data in a computer so that it can be
used efficiently.