
DISTANCE LEARNING CENTRE

AHMADU BELLO UNIVERSITY


ZARIA, NIGERIA

COURSE MATERIAL

FOR

Course Code & Title: COSC 301 - DATA STRUCTURES AND ALGORITHMS

Programme Title: B.Sc COMPUTER SCIENCE

ACKNOWLEDGEMENT
We acknowledge the use of the Courseware of the National Open University of
Nigeria (NOUN) as the primary resource. Internal reviewers in Ahmadu Bello
University have also been duly listed.

COPYRIGHT PAGE
© 2018 Ahmadu Bello University (ABU) Zaria, Nigeria

All rights reserved. No part of this publication may be reproduced in any form or
by any means, electronic, mechanical, photocopying, recording or otherwise
without the prior permission of the Ahmadu Bello University, Zaria, Nigeria.

First published 2018 in Nigeria.

ISBN:

Published and printed in Nigeria by:


Ahmadu Bello University Press Ltd.
Ahmadu Bello University,
Zaria, Nigeria.

Tel: +234

E-mail:

COURSE WRITERS/DEVELOPMENT TEAM
Editor
Prof. M.I Sule

Course Materials Development Overseer


Dr. Usman Abubakar Zaria

Subject Matter Expert


Mohammed Tanko Yahaya

Subject Matter Reviewer


Dr. Aliyu Salisu

Language Reviewer
Enegoloinu Ojokojo

Instructional Designers/Graphics
Emmanuel Ekoja / Ibrahim Otukoya

Proposed Course Coordinator


Emmanuel Ameh Ekoja

ODL Expert
Dr. Abdulkarim Muhammad

QUOTE

Learning must travel the distance, from head to heart.


- Gloria Steinem

COURSE STUDY GUIDE
i. COURSE INFORMATION
Course Code: COSC 301
Course Title: Data Structures and Algorithms
Credit Units: 3
Year of Study: 3
Semester: 1

ii. COURSE INTRODUCTION AND DESCRIPTION


Introduction:
Data Structures and Algorithms is a foundational course for students pursuing a
Bachelor of Science degree in Computer Science. In this course, we will study
programming techniques, including data structures and the basic algorithms for
manipulating them. The overall aim of this course is to introduce you to
programming concepts and algorithm techniques. Topics related to data structures
and storage management are also discussed. A bottom-up approach is adopted
in structuring this course: we start with the basic building blocks of object-oriented
programming concepts and move on to the fundamental principles of data structures
and algorithms.

Description:
This course is an introduction to design patterns and recursion. Data structures
such as trees (including binary, AVL and multiway trees), heaps, stacks and queues
will be covered. Graphs and hashing techniques are also covered. In the course of
this study, you will learn how to design new algorithms for each new data structure
studied, create and perform simple operations on graph data structures, describe
and implement common algorithms for working with advanced data structures, and
recognise which data structure is best suited to each particular problem.

iii. COURSE PREREQUISITES


You should note that although this course has no subject prerequisite, you are
expected to have:
1. Satisfactory level of English proficiency
2. Basic Computer Operations proficiency
3. COSC 211, COSC 212 or equivalent

iv. COURSE LEARNING RESOURCES


i. Course Textbooks
Adam D., Data Structures and Algorithms in Java (2nd Edition), Thomson
Learning, ISBN 0-534-49252-5.
Anany L., Introduction to the Design and Analysis of Algorithms (3rd Edition),
Pearson Education, Inc., 2012.
Bruno R. P., Data Structures and Algorithms with Object-Oriented Design
Patterns in Java, John Wiley & Sons, Inc., 2000.
Cormen, T.H., Leiserson, C.E., and Rivest, R.L. (1989). Introduction to
Algorithms, New York: McGraw-Hill.
Deitel, H.M. and Deitel, P.J. (1998). C++ How to Program (2nd Edition),
New Jersey: Prentice Hall.
Nell D., Daniel T. J., Chip W., Object-Oriented Data Structures Using Java,
Jones and Bartlett Publishers, Inc., 2002.
Ford, W. and Topp, W. (2002). Data Structures with C++ Using the STL
(2nd Edition), New Jersey: Prentice Hall.
French, C. S. (1992). Computer Science (4th Edition), DP Publications,
pp. 199–217.
Robert L., Data Structures and Algorithms in Java (2nd Edition),
Sams Publishing, 2003.
Shaffer, Clifford A. (1998). A Practical Introduction to Data Structures
and Algorithm Analysis, Prentice Hall, pp. 77–102.
ii. Other Resources
http://www.gnu.org/manual/emacs-20.3/emacs.html
http://www.indiana.edu/~ucspubs/b131
http://yoda.cis.temple.edu:8080/UGAIWWW/help
http://www.cs.sunysb.edu/~skiena/214/lectures/
https://www.geeksforgeeks.org/doubly-linked-list/
v. COURSE AIM
The aim of this course is to give you a feel for algorithms and data
structures as a central part of what it is to be a computer scientist. You
will develop skills in selecting and implementing appropriate data
structures and algorithms to solve problems.
vi. COURSE GOAL
This course strives to strengthen your ability to solve problems
computationally by building an understanding of algorithm analysis and
data structures. A strong understanding of algorithm analysis and data
structures will greatly improve your ability to solve problems with
powerful and efficient programs. It should be of interest to you that
employers and graduate schools are keenly interested in students who
possess these skills, so you should take this course seriously.

vii. COURSE OUTCOMES
After studying this course, you should be able to:
1. Describe the basic operations on stacks, lists and queue data structures.
2. Explain the notions of trees, hashing and binary search trees.
3. Identify the basic concepts of object-oriented programming.
4. Develop Java programs for simple applications.
5. Discuss the underlying principles of basic data types: lists, stacks and
queues.
6. Describe structures and algorithms for external storage: external sorting,
external search trees.
7. Identify directed and undirected graphs.
8. Discuss sorting: internal and external sort.
9. Describe the efficiency of algorithms, recursion and recursive programs.

viii. ACTIVITIES TO MEET COURSE OBJECTIVES


Specifically, this course comprises the following activities:
1. Studying courseware
2. Listening to course audios
3. Watching relevant course videos
4. Field activities, industrial attachment or internship, laboratory or
studio work (whichever is applicable)
5. Course assignments (individual and group)
6. Forum discussion participation
7. Tutorials (optional)
8. Semester examinations (CBT and essay based).

ix. TIME (TO COMPLETE SYLLABUS/COURSE)
To cope with this course, you are expected to commit a minimum of three
hours weekly to it.

x. GRADING CRITERIA AND SCALE


Grading Criteria
A. Formative assessment
Grades will be based on the following:
Individual assignments/tests (CA 1, 2, etc.): 20%
Group assignments (GCA 1, 2, etc.): 10%
Discussions/quizzes/out-of-class engagements: 10%

B. Summative assessment (Semester examination)
CBT based: 30%
Essay based: 30%
TOTAL: 100%

C. Grading Scale:
A = 70–100
B = 60–69
C = 50–59
D = 45–49
F = 0–44

D. Feedback
Courseware based:

1. In-text questions and answers (answers preceding references)
2. Self-assessment questions and answers (answers preceding references)

Tutor based:
1. Discussion Forum tutor input
2. Graded Continuous assessments

Student based:
1. Online program assessment (administration, learning resource, deployment,
and assessment).

xi. LINKS TO OPEN EDUCATION RESOURCES


OSS Watch provides tips for selecting open source, or for procuring free or open
software.
SchoolForge and SourceForge are good places to find, create, and publish open
software. SourceForge, for one, has millions of downloads each day.
Open Source Education Foundation and Open Source Initiative, and other
organisations like these, help disseminate knowledge.
Creative Commons has a number of open projects from Khan
Academy to Curriki where teachers and parents can find educational materials for
children or learn about Creative Commons licenses. Also, they recently launched
the School of Open that offers courses on the meaning, application, and impact of
"openness."
Numerous open or open educational resource databases and search engines
exist. Some examples include:

 OEDb: over 10,000 free courses from universities as well as reviews of
colleges and rankings of college degree programs
 Open Tapestry: over 100,000 open licensed online learning resources for an
academic and general audience
 OER Commons: over 40,000 open educational resources from elementary
school through to higher education; many of the elementary, middle, and
high school resources are aligned to the Common Core State Standards
 Open Content: a blog, definition, and game of open source as well as a
friendly search engine for open educational resources from MIT, Stanford,
and other universities with subject and description listings
 Academic Earth: over 1,500 video lectures from MIT, Stanford, Berkeley,
Harvard, Princeton, and Yale
 JISC: Joint Information Systems Committee works on behalf of UK higher
education and is involved in many open resources and open projects
including digitising British newspapers from 1620-1900!
Other sources for open education resources
Universities
 The University of Cambridge's guide on Open Educational Resources for
Teacher Education (ORBIT)
 OpenLearn from Open University in the UK
Global
 Unesco's searchable open database is a portal to worldwide courses and
research initiatives
 African Virtual University (http://oer.avu.org/) has numerous modules on
subjects in English, French, and Portuguese
 https://code.google.com/p/course-builder/ is Google's open source software
that is designed to let anyone create online education courses
 Global Voices (http://globalvoicesonline.org/) is an international community
of bloggers who report on blogs and citizen media from around the world,
including on open source and open educational resources
Individuals (which include OERs)
 Librarian Chick: everything from books to quizzes and videos here, includes
directories on open source and open educational resources
 K-12 Tech Tools: OERs, from art to special education
 Web 2.0: Cool Tools for Schools: audio and video tools
 Web 2.0 Guru: animation and various collections of free open source
software
 Livebinders: search, create, or organise digital information binders by age,
grade, or subject (why re-invent the wheel?)

xii. ABU DLC ACADEMIC CALENDAR/PLANNER

The session runs over three semesters (Semesters 1, 2 and 3) across January to
December. The scheduled activities in each semester are, in order: Registration,
Resumption, Late Registration, Facilitation, Revision/Consolidation, and
Semester Examination.

N.B.:
- All sessions commence in January.
- There is a 1-week break between semesters and a 6-week vacation at the end of the session.
- Semester 3 is OPTIONAL (fast-tracking, making up carry-overs and deferments).
xiii. COURSE STRUCTURE AND OUTLINE
Course Structure

Each Study Session below follows the same weekly activity pattern:
1. Read the Courseware for the corresponding Study Session.
2. View the Video(s) on the Study Session.
3. Listen to the Audio on the Study Session.
4. View any other videos/YouTube links (listed per session below).
5. View the referred OER (address/site).
6. View the referred Animation (listed per session below).
7. Read the chapter/pages of the standard/relevant text.
8. Study any additional material.
9. Complete any out-of-class activity.

STUDY MODULE 1
Week 1 - Study Session 1: Review of Object-Oriented Concepts in JAVA (Pp. 26)
Videos: http://bit.ly/2FIJOOy, http://bit.ly/2KOX1t0, http://bit.ly/2KOGbKL, http://bit.ly/2Y0czSY, http://bit.ly/2YmsL0c, http://bit.ly/2GmSA63, http://bit.ly/2TVWV5L, http://bit.ly/2L5Kosd, http://bit.ly/2SAN2cW, http://bit.ly/2P9AhYQ, http://bit.ly/2SAlowE, http://bit.ly/2YaPGaA, http://bit.ly/30SYVOe, http://bit.ly/2JZzBzh, http://bit.ly/2Zjl479
Animation: http://bit.ly/2JZh2yt

Week 2 - Study Session 2: Design Patterns (Pp. 37)
Videos: http://bit.ly/33Zay8Y, http://bit.ly/325YXDbm, http://bit.ly/33VbZ8p, http://bit.ly/30yUQ1X, http://bit.ly/2PdEF9e, http://bit.ly/32dYO0H, http://bit.ly/33Y6khM
Animation: http://bit.ly/2zxFXBr

Week 3 - Study Session 3: Complexity Analysis (Pp. 47)
Videos: http://bit.ly/2ZkJtxx, http://bit.ly/2zbO71O, http://bit.ly/2ZoLvbD, http://bit.ly/2MDWQ5I, http://bit.ly/2PcWUeR, http://bit.ly/2zljMho, http://bit.ly/2U2QIoF
Animation: http://bit.ly/2U2QIoF

Study Session 4: Linked Lists (Pp. 66)
Videos: http://bit.ly/2Pbi90H, http://bit.ly/2UdSX8R, http://bit.ly/2ME8Y74, http://bit.ly/33ZOZoE, http://bit.ly/2U5ZHp4, http://bit.ly/2ZebDea, http://bit.ly/2MD0M70
Animation: http://bit.ly/33VilVj

Week 4 - Study Session 5: Stacks and Queues (Pp. 85)
Videos: http://bit.ly/2Lc6l91, http://bit.ly/341wCiX, http://bit.ly/329fKoZ, http://bit.ly/2ZAGrW4, http://bit.ly/2PoKrVM, http://bit.ly/33W2NAu, http://bit.ly/2MArY6p
Animation: http://bit.ly/341wCiX

Study Session 6: Recursion (Pp. 103)
Videos: http://bit.ly/30wvulo, http://bit.ly/2PeeWO5, http://bit.ly/33U1HoZ, http://bit.ly/2U3P5a2, http://bit.ly/2U97Bhp, http://bit.ly/2ZgAYnZ
Animation: http://bit.ly/2U5iHEj

Week 5 - Study Session 7: Analysis of recursive algorithms (Pp. 126)
Videos: http://bit.ly/2MDl2oP, http://bit.ly/2zocdH4, http://bit.ly/2ZwIc6k, http://bit.ly/2PdKL9C, http://bit.ly/2Zialiq, http://bit.ly/2L6Tg0Q
Animation: http://bit.ly/31XCPL4

STUDY MODULE 2
Study Session 1: Tree (Pp. 137)
Videos: http://bit.ly/33SfqfQ, http://bit.ly/33ZQbbC, http://bit.ly/2U5YIW4, http://bit.ly/2Pbi90H
Animation: http://bit.ly/2PbLkk5

Week 6 - Study Session 2: Binary Search Tree (Pp. 151)
Videos: http://bit.ly/2ZpdZCh, http://bit.ly/2KWL1FN, http://bit.ly/2Zsrsgz, http://bit.ly/2ZrXntm, http://bit.ly/3207oja, http://bit.ly/2PdLyHC, http://bit.ly/2zm08C3, http://bit.ly/2zn37Kt
Animation: http://bit.ly/2MChZxj

Study Session 3: Tree Traversal (Pp. 162)
Videos: http://bit.ly/2MDlWBJ, http://bit.ly/3419IZl, http://bit.ly/2KUgGYm, http://bit.ly/3497bMK, http://bit.ly/31XE78S
Animation: http://bit.ly/2MD3UQi

Week 7 - Study Session 4: Binary Heap (Pp. 172)
Videos: http://bit.ly/326pvE6, http://bit.ly/2Zfa1Ri, http://bit.ly/2U5kqJN, http://bit.ly/2Zp6xeh
Animation: http://bit.ly/2KXxlKw

Study Session 5: AVL Tree (Pp. 189)
Videos: http://bit.ly/30BdWVf, http://bit.ly/2ZfeBz7, http://bit.ly/2MDjPOl, http://bit.ly/30B1KE7, http://bit.ly/2L4Aasg
Animation: http://bit.ly/2U7Rq3R

Week 8 - Study Session 6: B-Tree (Pp. 211)
Videos: http://bit.ly/340haUk, http://bit.ly/2Hr7DMp, http://bit.ly/2Nxku3B, http://bit.ly/2NyTikU, http://bit.ly/30BgsuF, http://bit.ly/2NufU6d
Animation: http://bit.ly/2ZnVr5r

STUDY MODULE 3
Study Session 1: Huffman Coding (Pp. 231)
Videos: http://bit.ly/2KWfKCS, http://bit.ly/2HnJVR5, http://bit.ly/30zWjVW, http://bit.ly/33ZeA0Z, http://bit.ly/2KVExa7
Animation: http://bit.ly/33ZeA0Z

Week 9 - Study Session 2: Graphs (Pp. 246)
Videos: http://bit.ly/2Zm1nQo, http://bit.ly/2ZrUMQn, http://bit.ly/2KU4uaa, http://bit.ly/2zogYQW, http://bit.ly/2ZteSd4, http://bit.ly/2U2FkJv, http://bit.ly/2KXJgIA
Animation: http://bit.ly/33YXJLF

Study Session 3: Topological Sort (Pp. 274)
Videos: http://bit.ly/2MDqJTE, http://bit.ly/2U31Kdz, http://bit.ly/2ZeLfRv, http://bit.ly/31XKrgC, http://bit.ly/2LaTbJw, http://bit.ly/2Pekd8l
Animation: http://bit.ly/2MDqJTE

Week 10 - Study Session 4: Shortest Path algorithm (Pp. 281)
Videos: http://bit.ly/2MBOyvm, http://bit.ly/2z9zgoQ, http://bit.ly/2PcMVpW, http://bit.ly/2ZcxrXw, http://bit.ly/31XL6i6
Animation: http://bit.ly/2MBOyvm

Study Session 5: Minimum Spanning Tree (Pp. 289)
Videos: http://bit.ly/2L6NvjT, http://bit.ly/30Bvdxt, http://bit.ly/2U67xiE, http://bit.ly/2PfJ77y, http://bit.ly/2MqAy7o, http://bit.ly/30LDPSc
Animation: http://bit.ly/2PfJ77y

STUDY MODULE 4
Week 11 - Study Session 1: Hashing (Pp. 300)
Videos: http://bit.ly/2Zpk9lT, http://bit.ly/2L8qZXN, http://bit.ly/32efDZp, http://bit.ly/2MFqKqm, http://bit.ly/33Ywd0T
Animation: http://bit.ly/32efDZp

Study Session 2: Lempel-Ziv Compression Techniques (Pp. 328)
Videos: http://bit.ly/2ZsDDpH, http://bit.ly/2PbTCIJ, http://bit.ly/2ZrEnPW, http://bit.ly/2ZilcZM, http://bit.ly/2L707aA, http://bit.ly/2PfJE9y
Animation: http://bit.ly/2KXjR1F

Week 12 - Study Session 3: Garbage Collection (Pp. 346)
Videos: http://bit.ly/2Zx9yJw, http://bit.ly/33VptRz, http://bit.ly/2Pvg9kf, http://bit.ly/2PdPCYo, http://bit.ly/2MDbnPk, http://bit.ly/2ZsSYX8
Animation: http://bit.ly/2Pvg9kf

Study Session 4: Memory Management (Pp. 357)
Videos: http://bit.ly/2ZsSYX8, http://bit.ly/2ZoSUNj, http://bit.ly/2NwIe7R, http://bit.ly/2U9fT9c, http://bit.ly/2HFbc1V, http://bit.ly/2U5DeZ7
Animation: http://bit.ly/323jCaN

Week 13: REVISION/TUTORIALS (On Campus or Online) & CONSOLIDATION
Weeks 14 & 15: SEMESTER EXAMINATION

TABLE OF CONTENTS
Title Page……………………………………………...……………………………..……1
Acknowledgement Page …………………………………………………….……………2
Copyright Page ………………………………………..…………..………….…………..3
Course Writers/Development Team ……………………...……….………….…………4
Quote …………………………………..…………………...……….………….…………5

COURSE STUDY GUIDE


Course Information ………………………………………………………………………6
Course Introduction and Description ……………………………………………..…….6
Course Prerequisites ………………………………………………………………...........7
Course Textbook(s) ……………………………………………………………………….7
Course Aim………. ……………………………………………………………………….8
Course Goal ……………………………………………………………………………….8
Course Objectives and Outcome …………………………………..…………...………..9
Activities to Meet Course Objectives ……………………………………..……………..9
Time (to Study and Complete the Course) …………………………………….............10
Grading Criteria and Scale ……………………………………………………………..10
OER Resources ………………………………………………………………………......11
ABU DLC Academic Calendar ...…………………………………………….…………14
Course Study Guide and Outline ………………………………………..……….……..15
Table of Contents …………………………………………...……….……………………22
Course Outline …………………………………………...……….……………………...22
STUDY MODULES
1.0 MODULE 1: Design Patterns And Recursion ……………………………………..26
Study Session 1: Review of Object-Oriented Concepts in JAVA…………………………26
Study Session 2: Design Patterns……………………………………………………….....37
Study Session 3: Complexity Analysis…………………………………………………….47
Study Session 4: Linked Lists……………………………………………………………...66
Study Session 5: Stacks and Queues……………………………………………………….85
Study Session 6: Recursion………………………………………………………………..103
Study Session 7: Analysis of recursive algorithms………………………………………..126

2.0 MODULE 2: Trees……………………………………………………………….......137
Study Session 1: Tree …………………………………………………………………......137
Study Session 2: Binary Tree……………………………………………………………...151
Study Session 3: Binary Search Tree………………………………………………..…162
Study Session 4: Binary Heap…………………………………………………………......171
Study Session 5: AVL Tree……………………………………………………………......189
Study Session 6: B-Tree……………………………………………………………….......211

3.0 MODULE 3: Graphs And Sorting ………………………………………………….231


Study Session 1: Huffman Coding ……………………………………………………......231
Study Session 2: Graphs ………………………………………………………………......246
Study Session 3: Topological Sort ………………………………………………………..274
Study Session 4: Shortest Path Algorithm ………………………………………………..281
Study Session 5: Minimum Spanning Tree ……………………………………………….289

4.0 MODULE 4: Hashing ………………………………………………………………300


Study Session 1: Hashing …………………………………………………………………300
Study Session 2: Lempel-Ziv Compression Techniques ……………………………….....328
Study Session 3: Garbage Collection ……………………………………………………..346
Study Session 4: Memory Management …………………………………………………..357
GLOSSARY………………………………………………………………………………369

Course Outline
MODULE 1: Design Patterns and Recursion
Study Session 1: Review of Object-Oriented Concepts in JAVA
Study Session 2: Design Patterns
Study Session 3: Complexity Analysis
Study Session 4: Linked Lists
Study Session 5: Stacks and Queues
Study Session 6: Recursion
Study Session 7: Analysis of recursive algorithms

MODULE 2: Trees
Study Session 1: Tree
Study Session 2: Binary Tree
Study Session 3: Binary Search Tree
Study Session 4: Binary Heap
Study Session 5: AVL Tree
Study Session 6: B-Tree

MODULE 3: Graphs and Sorting


Study Session 1: Huffman Coding
Study Session 2: Graphs
Study Session 3: Topological Sort
Study Session 4: Shortest Path Algorithm
Study Session 5: Minimum Spanning Tree

MODULE 4: Hashing
Study Session 1: Hashing
Study Session 2: Lempel-Ziv Compression Techniques
Study Session 3: Garbage Collection

Study Session 4: Memory Management

xii. STUDY MODULES
MODULE 1: Design Patterns and Recursion
Contents:
Study Session 1: Review of Object-Oriented Concepts in JAVA
Study Session 2: Design Patterns
Study Session 3: Complexity Analysis
Study Session 4: Linked Lists
Study Session 5: Stacks and Queues
Study Session 6: Recursion
Study Session 7: Analysis of recursive algorithms

STUDY SESSION 1
Review of Object-Oriented Concepts in JAVA
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Object-Oriented Concepts supported by JAVA.
2.2 - Advantages of Object-Orientation.
2.3 - Review of Inheritance.
2.3.1- Notes about Inheritance
2.4 - Review of Abstract Classes.
2.5 - Review of Interfaces.
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers

8.0 References/Further Readings

Introduction:
We begin this course by taking a review of some object-oriented concepts that
you have learnt from COSC 211 and COSC 212. In this first study session, we
shall take a look at the object-oriented concepts supported by Java and their
advantages. We then go on to review concepts
such as inheritance, abstract classes and interfaces.

1.0 Study Session Learning Outcomes


After studying this session, you are expected to be able to:
1. Outline object-oriented concepts supported by JAVA
2. List advantages of object-orientation
3. Describe with examples inheritance, abstract classes and interfaces.

2.0 Main Content


2.1 Object-Oriented Concepts supported by JAVA
As you have learnt from the two Java courses (COSC 211 and COSC 212) you
were taught in the second year of your studies, Java provides explicit support
for many of the fundamental object-oriented concepts. Some of these are:
- Classification: Grouping related things together. This is supported
through classes, inheritance and packages.
- Encapsulation: Representing data and the set of operations on the data
as a single entity - exactly what classes do.
- Information Hiding: An object should be in full control of its data,
granting specific access only to whom it wishes.
- Inheritance: Java allows related classes to be organized in a hierarchical
manner using the extends keyword.

- Polymorphism: Same code behaves differently at different times during
execution. This is due to dynamic binding.

2.2 Advantages of Object-Orientation


A number of advantages can be derived as a result of the object-oriented
features as we have highlighted in section 2.1 above. Some of these are:
- Reusability: Rather than endlessly rewriting the same piece of code, you
write it once and use it or inherit it as needed.
- Extensibility: A class can be extended without affecting its users
provided the user-interface remains the same.
- Maintainability: Again, once the user-interface does not change, the
implementation can be changed at will.
- Security: Thanks to information hiding, a user can only access the
information he has been allowed to access.
- Abstraction: Classification and Encapsulation allow portrayal of real-
world problems in a simplified model.

2.3 Review of Inheritance


Now, let us review inheritance by looking at an example. Suppose we have the
following Employee class:

class Employee {
protected String name;
protected double payRate;
public Employee(String name, double payRate) {
this.name = name;
this.payRate = payRate;
}
public String getName() {return name;}
public void setPayRate(double newRate) {
payRate = newRate;
}
public double pay() {return payRate;}
public void print() {
System.out.println("Name: " + name);
System.out.println("Pay Rate: "+payRate);
}
}
Now, suppose we wish to define another class to represent a part-time employee
whose salary is paid per hour. We inherit from the Employee class as follows:

class HourlyEmployee extends Employee {


private int hours;
public HourlyEmployee(String hName, double hRate) {
super(hName, hRate);
hours = 0;
}
public void addHours(int moreHours) {hours +=
moreHours;}
public double pay() {return payRate * hours;}
public void print() {
super.print();
System.out.println("Current hours: " + hours);
}
}

2.3.1 Notes about Inheritance


We observe the following from the examples on inheritance:
- Methods and instance variables of the super class are inherited by
subclasses, thus allowing for code reuse.
- A subclass can define additional instance variables (e.g. hours) and
additional methods (e.g. addHours).
- A subclass can override some of the methods of the super class to make
them behave differently (e.g. the pay & print)

- Constructors are not inherited, but can be called using the super keyword.
Such a call must be the first statement.
- If the constructor of the super class is not called, then the compiler inserts
a call to the default constructor - watch out!
- The super keyword may also be used to call a method of the super class.
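The notes above can be seen in action with a short driver. The classes below are trimmed copies of the Employee and HourlyEmployee classes from this section (repeated so the sketch compiles on its own); the sample values are illustrative only:

```java
// Trimmed copies of the classes from this section, just enough to run the demo.
class Employee {
    protected String name;
    protected double payRate;
    public Employee(String name, double payRate) {
        this.name = name;
        this.payRate = payRate;
    }
    public double pay() { return payRate; }
}

class HourlyEmployee extends Employee {
    private int hours;
    public HourlyEmployee(String hName, double hRate) {
        super(hName, hRate);   // constructors are not inherited; super must be first
        hours = 0;
    }
    public void addHours(int moreHours) { hours += moreHours; }
    public double pay() { return payRate * hours; }  // overrides Employee.pay
}

public class InheritanceDemo {
    public static void main(String[] args) {
        Employee e = new HourlyEmployee("Azmat Ansari", 120);
        ((HourlyEmployee) e).addHours(60);
        // Dynamic binding: the overridden pay() runs, even via an Employee reference.
        System.out.println(e.pay());   // prints 7200.0
    }
}
```

Note how the static type of `e` is Employee, yet the subclass version of pay is the one executed.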

2.4 Review of Abstract Classes


As you already know, inheritance enforces hierarchical organisation, the benefits
of which are reusability, type sharing and polymorphism. Java uses other
concepts, known as abstract classes and interfaces, to further strengthen the
idea of inheritance.
To see the role of abstract classes, suppose that the pay method is not
implemented in the HourlyEmployee subclass; then the pay method in
the Employee class will be assumed, which will inevitably lead to a wrong
result. One solution is to move the pay method out of Employee and into another
extension of the Employee class, MonthlyEmployee. The problem with this
solution is that it does not force subclasses of the Employee class to implement the
pay method. The better solution is to declare the pay method of the Employee class as
abstract, thus making the class abstract.
abstract class Employee {
protected String name;
protected double payRate;
public Employee(String empName, double empRate) {
name = empName;
payRate = empRate;
}
public String getName() {return name;}
public void setPayRate(double newRate) {payRate =
newRate;}
abstract public double pay();
public void print() {
System.out.println("Name: " + name);
System.out.println("Pay Rate: "+payRate);
}
}

The following extends the Employee abstract class to get MonthlyEmployee
class.
class MonthlyEmployee extends Employee {
public MonthlyEmployee(String empName, double empRate) {
super(empName, empRate);
}
public double pay() {
return payRate;
}
}

The next example extends the MonthlyEmployee class to get the Executive
class.
class Executive extends MonthlyEmployee {
private double bonus;
public Executive(String exName, double exRate) {
super(exName, exRate);
bonus = 0;
}
public void awardBonus(double amount) {
bonus = amount;
}
public double pay() {
double paycheck = super.pay() + bonus;
bonus = 0;
return paycheck;
}
public void print() {
super.print();
System.out.println("Current bonus: " + bonus);
}
}

In-text Question 1
An Abstract class must contain at least one abstract method, (true or false)?

Answer
False. In Java, a class may be declared abstract even if it contains no abstract methods; however, any class that declares an abstract method must itself be declared abstract.

If we decide to represent the above examples hierarchically, we have the


following:

Figure 1.1.1: Inheritance and Abstract Classes
Next, we highlight the advantages of organising classes using abstract classes,
inheritance - same type, polymorphism using the following example which is
based on the examples from sections 2.3 and 2.4

public class TestAbstractClass {


public static void main(String[] args) {
Employee[] list = new Employee[3];
list[0] = new Executive("Jarallah Al-Ghamdi", 50000);
list[1] = new HourlyEmployee("Azmat Ansari", 120);
list[2] = new MonthlyEmployee("Sahalu Junaidu", 9000);
((Executive)list[0]).awardBonus(11000);
for(int i = 0; i < list.length; i++)
if(list[i] instanceof HourlyEmployee)
((HourlyEmployee)list[i]).addHours(60);
for(int i = 0; i < list.length; i++) {
list[i].print();
System.out.println("Paid: " + list[i].pay());
System.out.println("*************************");
}
}
}
The output of the program is:

Name: Jarallah Al-Ghamdi
Pay Rate: 50000.0
Current bonus: 11000.0
Paid: 61000.0
*************************
Name: Azmat Ansari
Pay Rate: 120.0
Current hours: 60
Paid: 7200.0
*************************
Name: Sahalu Junaidu
Pay Rate: 9000.0
Paid: 9000.0
*************************

2.5 Review of Interfaces


Interfaces are not classes; they are an entirely separate entity. They provide a list
of abstract methods which MUST be implemented by a class that implements
the interface. Unlike abstract classes which may contain implementation of
some of the methods, interfaces provide NO implementation. Like abstract
classes, the purpose of interfaces is to provide organisational structure. More

importantly, interfaces are here to provide a kind of "multiple inheritance"
which is not supported in Java. If both parents of a child implement a method,
which one does the child inherit? (Multiple inheritance confusion). Interfaces
allow a child to be of both type A and B. Recall that Java has the Comparable
interface defined as:
interface Comparable {
int compareTo(Object o);
}

Recall also that Java has the java.util.Arrays class, which has a sort method that
can sort any array whose contents are either primitive values or comparable
objects. Thus, to sort our list of Employee objects, all we need is to modify the
Employee class to implement the Comparable interface. Notice that this will
work even if the Employee class is extending another class or implementing
another interface. This modification is shown in the example below.

abstract class Employee implements Comparable {


protected String name;
protected double payRate;
public Employee(String empName, double empRate) {
name = empName;
payRate = empRate;
}
public String getName() {return name;}
public void setPayRate(double newRate) {
payRate = newRate;
}
abstract public double pay();
public int compareTo(Object o) {
Employee e = (Employee) o;
return name.compareTo( e.getName());
}
}

In-text Question 2
Not all methods provided by an interface must be implemented by a class that implements
that interface, (True or False).

Answer
False

Since Employee class implements the Comparable interface, the array of
employees can now be sorted as shown below:
import java.util.Arrays;
public class TestInterface {
public static void main(String[] args) {
Employee[] list = new Employee[3];
list[0] = new Executive("Jarallah Al-Ghamdi", 50000);
list[1] = new HourlyEmployee("Azmat Ansari", 120);
list[2] = new MonthlyEmployee("Sahalu Junaidu", 9000);
((Executive)list[0]).awardBonus(11000);
for(int i = 0; i < list.length; i++)
if(list[i] instanceof HourlyEmployee)
((HourlyEmployee)list[i]).addHours(60);
Arrays.sort(list);
for(int i = 0; i < list.length; i++) {
list[i].print();
System.out.println("Paid: " + list[i].pay());
System.out.println("**********************");
}
}
}
The output of the program is:

Name: Azmat Ansari
Pay Rate: 120.0
Current hours: 60
Paid: 7200.0
**********************
Name: Jarallah Al-Ghamdi
Pay Rate: 50000.0
Current bonus: 11000.0
Paid: 61000.0
**********************
Name: Sahalu Junaidu
Pay Rate: 9000.0
Paid: 9000.0
**********************

3.0 Tutor Marked Assignments (Individual or Group)


1. How does an interface differ from an abstract class?
2. Why does Java not support multiple inheritance? What feature of Java
helps realize the benefits of multiple inheritance?
3. A subclass typically represents a larger number of objects than its super
class, (true or false)?
4. A subclass typically encapsulates less functionality than its super class
does, (true or false)?

5. An instance of a class can be assigned to a variable of type any of the
interfaces the class implements, (true or false)?

4.0 Conclusion/Summary
In this study session, we reviewed some object-oriented concepts supported by
Java and their advantages. You then learnt about some concepts such as
inheritance, abstract classes and interfaces using examples. In the next study
session, you will learn about design patterns, which allow reusability among
unrelated classes.

5.0 Self-Assessment Questions


1. The super keyword may also be used to call a method of the super class.
(True or False)?
2. A subclass cannot override some of the methods of the super class to
make them behave differently. (True or False)?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/2FIJOOy , http://bit.ly/2KOX1t0 , http://bit.ly/2KOGbKL ,
http://bit.ly/2Y0czSY , http://bit.ly/2YmsL0c , http://bit.ly/2GmSA63 ,
http://bit.ly/2TVWV5L , http://bit.ly/2L5Kosd, http://bit.ly/2SAN2cW,
http://bit.ly/2P9AhYQ , http://bit.ly/2SAlowE , http://bit.ly/2YaPGaA ,
http://bit.ly/30SYVOe , http://bit.ly/2JZzBzh , http://bit.ly/2Zjl479 Watch the video &

summarise in 1 paragraph
b. View the animation on http://bit.ly/2JZh2yt and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on Abstract Classes and Interfaces; In
2 paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers
1. True
2. False

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 2
Design Patterns
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- What is Design Pattern?
2.1.1- The Container Pattern.
2.1.1.1- The AbstractContainer Class
2.1.2: The Iterator Pattern.
2.1.3: The Visitor Pattern.
2.1.3.1: The isDone method
2.1.3.2: The AbstractVisitor Class
2.1.3.3: The toString Method
2.1.3.4: The accept Method
2.1.4: The SearchableContainer Pattern.
2.1.5: The Association Pattern.
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In the last study session, we reviewed some object-oriented concepts. As a
follow up to that, you will learn about some basic design patterns in this study
session. Learning to use these patterns makes code even more reusable and
more flexible, as you will see.

1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Explain what design patterns are
2. Design and implement Containers
3. Implement visitors
4. Use searchable containers and associations

2.0 Main Content


2.1 What is Design Pattern?
We have seen that inheritance allows related classes to share code, thus
allowing for code reusability, flexibility, and the like. How about unrelated
classes - can we make them reusable? Experience in the object-oriented design
community has shown that interaction among even unrelated objects often takes
the form of a recurring pattern or set of patterns. A number of these patterns
have been identified and are referred to as object-oriented design patterns.
Learning to use these patterns makes code even more reusable, more
flexible, etc. We shall use some of these patterns throughout the course and the
basic ones are introduced in this study session.

2.1.1 The Container Pattern


A container is an object that holds within it other objects. Many of the data
structures we study in this course can be viewed as containers, i.e. they have
the container pattern. The Container interface is as follows:
public interface Container {
int getCount ();
boolean isEmpty ();
boolean isFull ();
void purge ();
void accept (Visitor visitor);
Iterator iterator ();
}
The first four methods are obvious. We will explain the other two after
introducing Iterator and Visitor patterns.

2.1.1.1 The Abstract Container Class
The following is the AbstractContainer class that implements the Container
interface which will be used as base from which concrete container classes are
derived.

public abstract class AbstractContainer


implements Container {
protected int count;

public int getCount () {return count;}


public boolean isEmpty () {return getCount () == 0;}
public boolean isFull () {
return false;
}
public abstract void purge();
public void accept(Visitor v){. . . }
public abstract Iterator iterator();
// ...
}

2.1.2 The Iterator Pattern


The Iterator pattern provides a means to access all the objects in a container one
after another.
The following shows the Iterator interface:
public interface Iterator {
boolean hasNext ();
Object next () throws NoSuchElementException;
}

An Iterator interacts closely with a container. Recall that a container has a


method iterator which returns an Iterator. The following shows how the Iterator
interface is used:
Container c = new SomeContainer();
Iterator e = c.iterator();
while (e.hasNext())
System.out.println(e.next ());
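The interaction above can be made concrete with a runnable sketch. RangeContainer is a hypothetical container invented for this illustration (it is not part of the course's class hierarchy), and the Iterator interface here is the course's own, not java.util.Iterator:

```java
import java.util.NoSuchElementException;

interface Iterator {
    boolean hasNext();
    Object next() throws NoSuchElementException;
}

// A hypothetical container holding the integers 0 .. n-1.
class RangeContainer {
    private final int n;
    RangeContainer(int n) { this.n = n; }

    // Each call returns a fresh, independent Iterator over the contents.
    Iterator iterator() {
        return new Iterator() {
            private int i = 0;
            public boolean hasNext() { return i < n; }
            public Object next() {
                if (!hasNext()) throw new NoSuchElementException();
                return i++;   // autoboxed to Integer
            }
        };
    }
}

public class IteratorDemo {
    public static void main(String[] args) {
        Iterator e = new RangeContainer(3).iterator();
        while (e.hasNext())
            System.out.println(e.next());   // prints 0, 1, 2
    }
}
```

Because iterator() returns a new object each time, several iterators can walk the same container at once without interfering with one another.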

2.1.3 The Visitor Pattern
Many operations on data structures are observed to have the pattern of visiting
each object - hence, the Visitor pattern. For this pattern we define the Visitor
interface as follows:
public interface Visitor {
void visit (Object object);
boolean isDone ();
}

A visitor interacts closely with a container. The interaction goes as follows:


- The container is passed as a reference to a visitor by calling the
container's “accept method.”
- The container then calls the visit method of the visitor one-by-one for
each object it contains.
The design framework for the accept method is as follows:
public void accept(Visitor visitor) {
for each object, obj, in this container
visitor.visit(obj);
}

The code for a sample visitor is shown below:


public class PrintingVisitor implements Visitor {
public void visit(Object object) {
System.out.println(object);
}}

Note that the loop that iterates over the container's objects is not in the visit method.
It is in the container's accept method. To print all objects in an instance c of
SomeContainer, the accept method is called as follows:
Container c = new SomeContainer();
//....
c.accept (new PrintingVisitor());
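Putting the pieces together, the container/visitor interaction can be sketched end-to-end as follows. ArrayContainer and CollectingVisitor are hypothetical classes made up for this illustration; a real container from the course hierarchy would also implement iterator(), getCount(), and so on:

```java
interface Visitor {
    void visit(Object object);
    boolean isDone();
}

// A hypothetical fixed-size container; accept() drives the visitor.
class ArrayContainer {
    private final Object[] items;
    ArrayContainer(Object... items) { this.items = items; }

    public void accept(Visitor visitor) {
        for (Object o : items) {
            if (visitor.isDone())   // allow early termination
                return;
            visitor.visit(o);
        }
    }
}

// A visitor that concatenates every element it is shown.
class CollectingVisitor implements Visitor {
    final StringBuilder buffer = new StringBuilder();
    public void visit(Object object) { buffer.append(object); }
    public boolean isDone() { return false; }   // always visit everything
}

public class VisitorDemo {
    public static void main(String[] args) {
        CollectingVisitor v = new CollectingVisitor();
        new ArrayContainer("a", "b", "c").accept(v);
        System.out.println(v.buffer);   // prints abc
    }
}
```

The key point is that the traversal loop lives in accept, while the per-element work lives in the visitor.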

2.1.3.1 The isDone method
The isDone method of the visitor is used to stop the visit method from visiting
the other objects of a container when a success is recorded. The following code
segment shows how the isDone method is used.

public void accept (Visitor visitor){


for each Object o in this container
if (visitor.isDone ())
return;
visitor.visit (o);
}

The following shows the usefulness of the isDone method:


public class MatchingVisitor implements Visitor {
private Object target; private boolean found;
public MatchingVisitor(Object target){
this.target = target;
}
public void visit (Object object) {
if(object.equals (target))
found = true;
}
public boolean isDone (){return found;}
}

In-text Question 1
What is the function of the isDone method of the Visitor class?

Answer
The isDone method is used to stop the visit method from visiting the remaining objects of a container once a success is recorded.

2.1.3.2 The AbstractVisitor Class


Many operations on a container involve visiting all the elements, i.e. they do
not need to call the isDone method. Thus, forcing the implementation of the
isDone method for such operations may not be desirable. To avoid this, we
define the following AbstractVisitor class.

public abstract class AbstractVisitor implements Visitor {
public abstract void visit (Object object);
public boolean isDone () {
return false;
}
}

2.1.3.3 The toString Method


The following defines the toString method for the AbstractContainer class
using a visitor. Defining it here is aimed at simplifying the implementation of
classes extending this class.

public abstract class AbstractContainer implements Container {


public String toString() {
final StringBuffer buffer = new StringBuffer();
AbstractVisitor visitor = new AbstractVisitor() {
private boolean comma;
public void visit(Object obj) {
if(comma) buffer.append(", ");
buffer.append(obj);
comma = true;
}
};
accept(visitor);
return "" + buffer;
}
// ...
}

2.1.3.4 The accept Method


We now define the accept method for the AbstractContainer class using an
iterator.
public abstract class AbstractContainer
implements Container {
public void accept(Visitor visitor) {
Iterator iterator = iterator();
while ( iterator.hasNext() && !visitor.isDone())
visitor.visit(iterator.next());
}
// ...
}

While the accept method takes only one visitor, a container can have more than
one Iterator at the same time.
2.1.4 The SearchableContainer Pattern
Some of the data structures we shall study have the additional property of being
searchable. The SearchableContainer interface extends the Container interface
by adding four more methods as shown below:
public interface SearchableContainer extends Container {
boolean isMember (Comparable object);
void insert (Comparable object);
void withdraw (Comparable obj);
Comparable find (Comparable object);
}

The find method is used to locate an object in the container and returns its
reference. It returns null if not found.

2.1.5 The Association Pattern


An association is an ordered pair of objects. The first element is called the key,
while the second is the value associated with the key. The following defines the
Association class which we shall use whenever we need to associate one object
to another.

public class Association implements Comparable {
protected Comparable key;
protected Object value;
public Association(Comparable comparable, Object obj){
key = comparable;
value = obj;
}
public Association(Comparable comparable){
this(comparable, null);
}
// ...
public Comparable getKey(){return key;}
public void setKey(Comparable key){this.key = key;}
public Object getValue(){return value;}
public void setValue(Object value){this.value = value;}
public int compareTo(Object obj){
Association association = (Association)obj;
return key.compareTo(association.getKey());
}
public boolean equals(Object obj){
return compareTo(obj) == 0;
}
public String toString() {
String s = "{ " + key;
if(value != null)
s = s + " , " + value;
return s + " }";
}
}
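A short driver shows how associations compare and print. The nested class below is a trimmed copy of the Association class above (only the members the demo needs), and the key/value pair is made up for illustration:

```java
public class AssociationDemo {
    // Trimmed copy of the Association class from this section.
    static class Association implements Comparable {
        protected Comparable key;
        protected Object value;
        Association(Comparable key, Object value) {
            this.key = key;
            this.value = value;
        }
        public Comparable getKey() { return key; }
        public int compareTo(Object obj) {
            // Only keys take part in the comparison; values are ignored.
            return key.compareTo(((Association) obj).getKey());
        }
        public boolean equals(Object obj) { return compareTo(obj) == 0; }
        public String toString() {
            String s = "{ " + key;
            if (value != null) s = s + " , " + value;
            return s + " }";
        }
    }

    public static void main(String[] args) {
        Association a = new Association(Integer.valueOf(7), "seven");
        Association b = new Association(Integer.valueOf(7), null);
        System.out.println(a);            // prints { 7 , seven }
        System.out.println(a.equals(b));  // prints true: only the keys are compared
    }
}
```

Because equality is defined through compareTo, two associations with the same key are equal even when their values differ.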

In-text Question 2
Which design pattern do we use whenever we need to associate one object to another?

Answer
The Association Pattern.

3.0 Tutor Marked Assignments (Individual or Group)


1. Let c be an instance of some concrete class derived from the
AbstractContainer class. Explain how the following statement prints the
content of the container: System.out.println(c);

2. Suppose we have a container which contains only instances of the Integer
class. Design a Visitor that computes the sum of all the integers in the
container.
3. Using visitors, devise implementations for the isMember and find
methods of the AbstractSearchableContainer class.
4. Consider the following pair of Associations:
Comparable a=new Association (new Integer(3), new Integer(4));
Comparable b=new Association (new Integer(3));
Give the sequence of methods called in order to evaluate a comparison
such as "a.equals(b)''. Is the result of the comparison true or false?

4.0 Conclusion/Summary
In this study session, you were introduced to some basic design patterns and
their implementations. You have learnt that using these patterns makes code
even more reusable and more flexible. In the next study session, you
will be introduced to complexity analysis.
5.0 Self-Assessment Questions
1. Which of the patterns that you have studied provides a means to access
one-by-one, all the objects in a container?
2. From what you have learnt so far, design patterns allow us to make
unrelated classes reusable. (True or False)?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/33Zay8Y , http://bit.ly/325YXDbm , http://bit.ly/33VbZ8p ,
http://bit.ly/30yUQ1X , http://bit.ly/2PdEF9e , http://bit.ly/32dYO0H , http://bit.ly/33Y6khM .

Watch the video & summarise in 1 paragraph


b. View the animation on http://bit.ly/2zxFXBr and critique it in the discussion
forum
c. Take a walk and engage any 3 students on Design Patterns; In 2 paragraphs
summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers


1. The Iterator Pattern
2. True

8.0 References/Further Readings

Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,


Thomson Learning, ISBN 0-534-49252-5.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 3
Complexity Analysis
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1: Introduction to Complexity Analysis
2.1.1: Motivations for Complexity Analysis
2.1.2: Machine independence
2.1.3: Example of Basic Operations
2.1.4: Best, Average, and Worst case complexities.
2.1.5: Simple Complexity Analysis Examples.
2.2: Asymptotic Complexity
2.2.1: Big-O (asymptotic) Notation
2.2.1.1: Warnings about O-Notation
2.2.2: Big-O Computation Rules
2.2.3: Proving Big-O Complexity
2.2.4: How to determine complexity of code structures
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
Away from design patterns, you will learn about complexity analysis in this
study session. You will also learn what basic operations are with examples.
Next, you will be introduced to the concept of average, best and worst cases, as
well as some examples of simple complexity analysis. The concluding part of
this study session will introduce you to the Big-O (asymptotic) Notation, how to
prove Big-O complexity and how to determine complexity of code structures.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Outline why we study complexity analysis
2. Describe basic operations and list examples
3. Explain average, best and worst cases
4. Explain and prove Big-O notation complexity
5. Determine complexity of code structures

2.0 Main Content


2.1 Introduction to Complexity Analysis
2.1.1 Motivations for Complexity Analysis
Do you know that there are often many different algorithms which can be used
to solve the same problem? Well if you did not know, now you do. It is for this
reason that it makes sense for us to develop techniques that allow us to:
- compare different algorithms with respect to their “efficiency”
- choose the most efficient algorithm for the problem.
The efficiency of any algorithmic solution to a problem is a measure of the:
- Time efficiency: the time it takes to execute.
- Space efficiency: the space (primary or secondary memory) it uses.
We will focus on an algorithm's efficiency with respect to time. But before we
dig further, let us take a look at some concepts.

2.1.2 Machine independence
The evaluation of efficiency should be as machine independent as possible. It is
not useful to measure how fast the algorithm runs as this depends on type of
computer, OS, programming language, compiler, and the kinds of input that are
used in testing. Instead,
- We count the number of basic operations the algorithm performs.
- We calculate how this number depends on the size of the input.
A basic operation is an operation which takes a constant amount of time to
execute. Hence, the efficiency of an algorithm is the number of basic operations
it performs. This number is a function of the input size n.

2.1.3 Basic Operations and Examples


You have seen from section 2.1.2 that a basic operation is an operation which
takes a constant amount of time to execute. The following are examples of basic
operations.
- Arithmetic operations: *, /, %, +, -
- Assignment statements of simple data types.
- Reading of primitive types
- writing of a primitive types
- Simple conditional tests: if (x < 12) ...
- method call (Note: the execution time of the method itself may depend on
the value of parameter and it may not be constant)
- a method's return statement
- Memory Access
- We consider an operation such as ++ , += , and *= as consisting of two
basic operations.
To simplify complexity analysis we will not consider memory access (fetch or
store) operations.

2.1.4 Best, Average, and Worst case complexities
It is reasonable to measure an algorithm's efficiency as a function of a
parameter indicating the size of the algorithm's input. There are a number of
algorithms whose running times depend not only on the input size but also on the
specifics of a particular input. Consider as an example, sequential search. This
is a straightforward algorithm that searches for a given item (some search key
K) in a list of n elements by checking successive elements of the list until either
a match with the search key is found or the list is exhausted. Here is the
algorithm's pseudocode, in which, for simplicity, a list is implemented as an
array. It also assumes that the second condition A[i] ≠ K will not be checked if
the first one, which checks that the array's index does not exceed its upper
bound, fails.
ALGORITHM SequentialSearch(A[0..n − 1], K)
//Searches for a given value in a given array by sequential search
//Input: An array A[0..n − 1] and a search key K
//Output: The index of the first element in A that matches K,
//        or −1 if there are no matching elements
i ← 0
while i < n and A[i] ≠ K do
    i ← i + 1
if i < n return i
else return −1
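The pseudocode above translates directly into Java. The following is a sketch of the same algorithm; the class and method names are our own, not part of the course material:

```java
public class SequentialSearch {
    // Returns the index of the first element equal to key, or -1 if none matches.
    static int search(int[] a, int key) {
        int i = 0;
        while (i < a.length && a[i] != key)
            i = i + 1;
        if (i < a.length)
            return i;
        else
            return -1;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 9, 7};
        System.out.println(search(a, 9));   // prints 2
        System.out.println(search(a, 4));   // prints -1
    }
}
```

Note that short-circuit evaluation of && plays the role of the pseudocode's assumption that A[i] ≠ K is not checked once i reaches n.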

You can see clearly that the running time of this algorithm can be quite
different for the same list size n. In the worst case, when there are no matching
elements or the first matching element happens to be the last one on the list, the
algorithm makes the largest number of key comparisons among all possible
inputs of size n: Cworst(n) = n.
The worst-case efficiency of an algorithm is its efficiency for the worst-case
input of size n, which is an input (or inputs) of size n for which the algorithm
runs the longest among all possible inputs of that size. The way to determine the
worst-case efficiency of an algorithm is, in principle, quite straightforward:

analyse the algorithm to see what kind of inputs yield the largest value of the
basic operation's count C(n) among all possible inputs of size n and then
compute this worst-case value Cworst(n).
The best-case efficiency of an algorithm is its efficiency for the best-case input
of size n, which is an input (or inputs) of size n for which the algorithm runs the
fastest among all possible inputs of that size. Accordingly, we can analyse the
best case efficiency as follows. First, we determine the kind of inputs for which
the count C(n) will be the smallest among all possible inputs of size n. Note that
the best case does not mean the smallest input; it means the input of size n for
which the algorithm runs the fastest.
It should be clear from our discussion, however, that neither the worst-case
analysis nor its best-case counterpart yields the necessary information about an
algorithm's behavior on a "typical" or "random" input. This is the information
that the average-case efficiency seeks to provide. To analyse the algorithm's
average-case efficiency, we must make some assumptions about possible inputs
of size n.
We are usually interested in the worst-case complexity, that is, the largest
number of operations that might be performed for a given problem size. We usually
focus on worst-case analysis because:
- Easier to compute
- Usually close to the actual running time
- Crucial to real-time systems such as air-traffic control, etc.

Figure 1.3.1: Best, Worst and Average Case complexities

For the sequential (linear) search considered in this section,


- Best Case: Item found at the beginning: one comparison
- Worst Case: Item found at the end: n comparisons
- Average Case: Item may be found at index 0, or 1, or 2, . . ., or n – 1
The Average number of comparisons is: (1 + 2 + . . . + n) / n = (n+1) / 2
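This average figure can be checked empirically. The sketch below (the class and method names are ours, and the array contents are illustrative) runs a linear search for every possible target position and averages the comparison counts; the result agrees with (n + 1) / 2:

```java
public class AvgSearch {
    // Linear search that returns the number of key comparisons performed.
    static int comparisons(int[] a, int key) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count++;                      // one key comparison per element probed
            if (a[i] == key) return count;
        }
        return count;                     // key absent: n comparisons
    }

    // Average comparisons over all n successful-search positions.
    static double averageComparisons(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;   // distinct keys 0 .. n-1
        double total = 0;
        for (int key = 0; key < n; key++)
            total += comparisons(a, key);
        return total / n;                 // equals (n + 1) / 2
    }

    public static void main(String[] args) {
        System.out.println(averageComparisons(100));   // prints 50.5
    }
}
```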
Table 1 below summarizes Worst and Average complexities of common
sorting algorithms

Table 1.3.1: Worst and Average complexities of common sorting algorithms

Method           Worst Case    Average Case
Selection sort   n²            n²
Insertion sort   n²            n²
Merge sort       n log n       n log n
Quick sort       n²            n log n

In-text Question 1
What is a basic operation?

Answer
A basic operation is an operation which takes a constant amount of time to execute.

2.1.5 Simple Complexity Analysis Examples


Loops
We start by considering how to count operations in for-loops.
- We use integer division throughout.
First of all, we should know the number of iterations of the loop; say it is x.
- Then the loop condition is executed x + 1 times.
- Each of the statements in the loop body is executed x times.
- The loop-index update statement is executed x times.
Loops (with <)
In the following for-loop:
for (int i = k; i < n; i = i + m){
statement1;
statement2;
}

- The number of iterations is: (n – k ) / m


- The initialization statement, i = k, is executed one time.
- The condition, i < n, is executed (n – k) / m + 1 times.
- The update statement, i = i + m, is executed (n – k) / m times.
- Each of statement1 and statement2 is executed (n – k) / m times.
Loops (with <=)
In the following for-loop:
for (int i = k; i <= n; i = i + m){
statement1;
statement2;
}
- The number of iterations is: (n – k) / m + 1
- The initialization statement, i = k, is executed one time.
- The condition, i <= n, is executed (n – k) / m + 2 times.
- The update statement, i = i + m, is executed (n – k) / m + 1 times.
- Each of statement1 and statement2 is executed (n – k) / m + 1 times.
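These counts can be confirmed by instrumenting a loop. The sketch below (the class and method names are ours) tallies how often the condition, the update and the body of a `<` loop execute; the figures match the formulas above when n – k is a multiple of m, which the integer-division convention assumes:

```java
public class LoopCounts {
    // Tallies the parts of: for (int i = k; i < n; i = i + m) { ... }
    // Returns {condition tests, update executions, body executions}.
    static int[] tally(int k, int n, int m) {
        int cond = 0, update = 0, body = 0;
        int i = k;                  // initialization: executed once
        while (true) {
            cond++;                 // the test i < n
            if (!(i < n)) break;
            body++;                 // the loop body
            i = i + m;              // the update i = i + m
            update++;
        }
        return new int[]{cond, update, body};
    }

    public static void main(String[] args) {
        int k = 0, n = 12, m = 3;              // n - k is a multiple of m
        int iters = (n - k) / m;               // 4 iterations
        int[] t = tally(k, n, m);
        System.out.println(t[0] == iters + 1); // condition: (n - k)/m + 1 times
        System.out.println(t[1] == iters);     // update:    (n - k)/m times
        System.out.println(t[2] == iters);     // body:      (n - k)/m times
    }
}
```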
Example 1: Find the exact number of basic operations in the following program
fragment:
double x, y;
x = 2.5 ; y = 3.0;
for(int i = 0; i < n; i++){
a[i] = x * y;
x = 2.5 * x;
y = y + a[i];
}

There are 2 assignments outside the loop => 2 operations.


The for loop actually comprises
- an assignment (i = 0) => 1 operation
- a test (i < n) => n + 1 operations
- an increment (i++) => 2 n operations
- the loop body that has three assignments, two multiplications, and an
addition => 6 n operations
- Thus the total number of basic operations is 6 * n + 2 * n + (n + 1) + 3 =
9n + 4
Example 2: Suppose n is a power of 2. Determine the number of basic
operations performed by the method myMethod():
static int myMethod(int n){
    int sum = 0;
    for(int i = 1; i < n; i = i * 2)
        sum = sum + i + helper(i);
    return sum;
}

static int helper(int n){
    int sum = 0;
    for(int i = 1; i <= n; i++)
        sum = sum + i;
    return sum;
}

Solution: The number of iterations of the loop:


for(int i = 1; i < n; i = i * 2)
sum = sum + i + helper(i);
is log2n (A Proof will be given later)
Hence the number of basic operations is:
1 + 1 + (1 + log2 n) + log2 n[2 + 4 + 1 + 1 + (n + 1) + n[2 + 2] + 1] + 1
= 3 + log2 n + log2 n[10 + 5n] + 1
= 5 n log2 n + 11 log2 n + 4
Loops with Logarithmic Iterations
In the following for-loop: (with <)
for (int i = k; i < n; i = i * m){
statement1;
statement2;
}
- The number of iterations is: (Logm (n / k) )

In the following for-loop: (with <=)


for (int i = k; i <= n; i = i * m){
statement1;
statement2;
}
- The number of iterations is: ⌊logm(n / k)⌋ + 1
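These logarithmic counts can also be checked by simply running the loops; in the sketch below (the class name and the chosen values are illustrative), ⌈·⌉ and ⌊·⌋ denote ceiling and floor:

```java
public class LogLoopCounts {
    // Iterations of: for (int i = k; i < n; i = i * m)
    static int iterLess(int k, int n, int m) {
        int count = 0;
        for (int i = k; i < n; i = i * m) count++;
        return count;
    }

    // Iterations of: for (int i = k; i <= n; i = i * m)
    static int iterLessEq(int k, int n, int m) {
        int count = 0;
        for (int i = k; i <= n; i = i * m) count++;
        return count;
    }

    public static void main(String[] args) {
        // k = 1, m = 2, n = 16: i takes 1, 2, 4, 8 -> 4 = ceil(log2(16/1))
        System.out.println(iterLess(1, 16, 2));     // prints 4
        // with <= : i takes 1, 2, 4, 8, 16 -> 5 = floor(log2(16/1)) + 1
        System.out.println(iterLessEq(1, 16, 2));   // prints 5
    }
}
```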

2.2 Asymptotic Complexity


Finding the exact complexity, f(n) = number of basic operations, of an
algorithm is difficult. We approximate f(n) by a function g(n) in a way that does
not substantially change the magnitude of f(n). The function g(n) is sufficiently
close to f(n) for large values of the input size n. This "approximate" measure of
efficiency is called asymptotic complexity. Thus the asymptotic complexity
measure does not give the exact number of operations of an algorithm, but it
shows how that number grows with the size of the input. This gives us a
measure that will work for different operating systems, compilers and CPUs.

2.2.1 Big-O (asymptotic) Notation


The most commonly used notation for specifying asymptotic complexity is the
big-O notation. The Big-O notation, O(g(n)), is used to give an upper bound
(worst-case) on a positive runtime function f(n) where n is the input size.

Definition of Big-O
Consider a function f(n) that is non-negative ∀ n ≥ 0. We say that "f(n) is Big-O
of g(n)", i.e., f(n) = O(g(n)), if ∃ n0 ≥ 0 and a constant c > 0 such that
f(n) ≤ c·g(n), ∀ n ≥ n0.

Figure 1.3.2 Definition of Big-O

This definition of Big-O implies that for all sufficiently large n, c *g(n) is an
upper bound of f(n)
Note: By the definition of Big-O: f(n) = 3n + 4 is O(n)
it is also O(n²),
it is also O(n³),
...
it is also O(nⁿ)
However when Big-O notation is used, the function g in the relationship f(n) is
O(g(n)) is chosen to be as small as possible. We call such a function g a tight
asymptotic bound of f(n)
Table 1.3.2 shows some Big-O complexity classes in order of magnitude from
smallest to highest

Table 1.3.2: Some Big-O complexity classes in order of magnitude from
smallest to highest
O(1) Constant

O(log(n)) Logarithmic

O(n) Linear

O(n log(n)) n log n

O(nˣ) {e.g., O(n²), O(n³), etc.} Polynomial

O(aⁿ) {e.g., O(1.6ⁿ), O(2ⁿ), etc.} Exponential

O(n!) Factorial

O(nⁿ)

Figure 1.3.4: Some Big-O Complexity Classes

2.2.1.1 Warnings about O-Notation
You should take note of the following about the O – Notation.
- Big-O notation cannot compare algorithms in the same complexity class.
- Big-O notation only gives sensible comparisons of algorithms in different
complexity classes when n is large.
- Consider two algorithms for same task:
Linear: f(n) = 1000 n
Quadratic: f'(n) = n²/1000
The quadratic one is faster for n < 1000000.
2.2.2 Big-O Computation Rules
For large values of input n, the constants and terms with lower degree of n are
ignored.
1. Multiplicative Constants Rule: Ignoring constant factors.
O(c f(n)) = O(f(n)), where c is a constant;
Example:
O(20n³) = O(n³)
2. Addition Rule: Ignoring smaller terms.
If O(f(n)) < O(h(n)) then O(f(n) + h(n)) = O(h(n)).
Example:
O(n² log n + n³) = O(n³)
O(2000n³ + 2n! + n⁸⁰⁰ + 10ⁿ + 27n log n + 5) = O(n!)
3. Multiplication Rule: O(f(n) * h(n)) = O(f(n)) * O(h(n))
Example:
O((n³ + 2n² + 3n log n + 7)(8n² + 5n + 2)) = O(n⁵)

2.2.3 Proving Big-O Complexity
To prove that f(n) is O(g(n)) we find any pair of values n0 and c that satisfy:
f(n) ≤ c · g(n), ∀ n ≥ n0
Note: The pair (n0, c) is not unique. If such a pair exists then there is an infinite
number of such pairs.
Example: Prove that f(n) = 3n² + 5 is O(n²)
We try to find some values of n and c by solving the following inequality:
3n² + 5 ≤ cn²   OR   3 + 5/n² ≤ c
By putting different values for n, we get corresponding values for c:

n0:  1    2     3     4       ...  → ∞
c:   8    4.25  3.55  3.3125  ...  → 3

Example:
Prove that f(n) = 3n² + 4n log n + 10 is O(n²) by finding appropriate values for
c and n0.
We try to find some values of n and c by solving the following inequality:
3n² + 4n log n + 10 ≤ cn²   OR   3 + 4 log n / n + 10/n² ≤ c
In-text Question 2
The most commonly used notation for specifying asymptotic complexity is the big-O notation.
(True or False)?

Answer
True

We used log base 2, but another base could be used as well:

n0:  1     2     3     4     ...  → ∞
c:   13    7.5   6.22  5.62  ...  → 3
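Once a candidate pair (n0, c) has been read off such a table, the defining inequality can be checked mechanically over a large range of n. The sketch below (the class name and the chosen witnesses are ours) verifies 3n² + 5 ≤ 4n² for n ≥ 3, and 3n² + 4n log₂ n + 10 ≤ 8n² for n ≥ 2. A finite scan is only supporting evidence, of course; the algebra above is what establishes the bound for all n.

```java
import java.util.function.DoubleUnaryOperator;

public class BigOCheck {
    // Checks f(n) <= c * g(n) for every integer n in [n0, limit].
    static boolean holds(DoubleUnaryOperator f, DoubleUnaryOperator g,
                         double c, int n0, int limit) {
        for (int n = n0; n <= limit; n++)
            if (f.applyAsDouble(n) > c * g.applyAsDouble(n)) return false;
        return true;
    }

    public static void main(String[] args) {
        // f(n) = 3n^2 + 5 is O(n^2); witnesses c = 4, n0 = 3
        System.out.println(holds(n -> 3 * n * n + 5, n -> n * n, 4, 3, 1_000_000));
        // f(n) = 3n^2 + 4n log2 n + 10 is O(n^2); witnesses c = 8, n0 = 2
        System.out.println(holds(
            n -> 3 * n * n + 4 * n * (Math.log(n) / Math.log(2)) + 10,
            n -> n * n, 8, 2, 1_000_000));
    }
}
```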

2.2.4 How to determine complexity of code structures


Loops: for, while, and do-while:
Complexity is determined by the number of iterations in the loop multiplied by
the complexity of the body of the loop.
Examples:
for (int i = 0; i < n; i++)
    sum = sum - i;
O(n)

for (int i = 0; i < n * n; i++)
    sum = sum + i;
O(n²)

i = 1;
while (i < n) {
    sum = sum + i;
    i = i * 2;
}
O(log n)

Nested Loops: Complexity of inner loop * complexity of outer loop.


Examples:

sum = 0;
for(int i = 0; i < n; i++)
    for(int j = 0; j < n; j++)
        sum += i * j;
O(n²)

i = 1;
while(i <= n) {
    j = 1;
    while(j <= n){
        // statements of constant complexity
        j = j * 2;
    }
    i = i + 1;
}
O(n log n)

Sequence of statements: Use Addition rule
O(s1; s2; s3; … sk) = O(s1) + O(s2) + O(s3) + … + O(sk) = O(max(s1, s2, s3, .
. . , sk))
Example:
for (int j = 0; j < n * n; j++)
    sum = sum + j;
for (int k = 0; k < n; k++)
    sum = sum - k;
System.out.print("sum is now " + sum);
Complexity is O(n²) + O(n) + O(1) = O(n²)

Switch: Take the complexity of the most expensive case

char key;
int[] X = new int[n];
int[][] Y = new int[n][n];
........
switch(key) {
    case 'a':
        for(int i = 0; i < X.length; i++)          // O(n)
            sum += X[i];
        break;
    case 'b':
        for(int i = 0; i < Y.length; i++)          // O(n²)
            for(int j = 0; j < Y[0].length; j++)
                sum += Y[i][j];
        break;
} // End of switch block

If Statement: Take the complexity of the most expensive case :


char key;
int[][] A = new int[n][n];
int[][] B = new int[n][n];
int[][] C = new int[n][n];
........
if(key == '+') {
    for(int i = 0; i < n; i++)
        for(int j = 0; j < n; j++)
            C[i][j] = A[i][j] + B[i][j];              // O(n²)
} // End of if block
else if(key == 'x')
    C = matrixMult(A, B);                             // O(n³)
else
    System.out.println("Error! Enter '+' or 'x'!");   // O(1)

Overall complexity: O(n³)

Sometimes if-else statements must carefully be checked:
O(if-else) = O(Condition)+ Max[O(if), O(else)]
int[] integers = new int[n];
........
if(hasPrimes(integers) == true)   // condition is O(n)
    integers[0] = 20;             // O(1)
else
    integers[0] = -20;            // O(1)

public boolean hasPrimes(int[] arr) {
    for(int i = 0; i < arr.length; i++)
        ..........
} // End of hasPrimes()

O(if-else) = O(Condition) + Max[O(if), O(else)] = O(n)

Note: Sometimes a loop may cause the if-else rule not to be applicable.
Consider the following loop:
while (n > 0) {
    if (n % 2 == 0) {
        System.out.println(n);
        n = n / 2;
    } else {
        System.out.println(n);
        System.out.println(n);
        n = n - 1;
    }
}
The else-branch has more basic operations; therefore one may conclude that the
loop is O(n). However the if-branch dominates. For example if n is 60, then the
sequence of n is: 60, 30, 15, 14, 7, 6, 3, 2, 1, and 0. Hence, the loop is
logarithmic and its complexity is O(log n).
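The claimed behaviour can be verified by recording the values n takes; a minimal sketch (the class and method names are ours):

```java
import java.util.ArrayList;
import java.util.List;

public class HalvingLoop {
    // Records the values n takes in the loop from the text, starting value included.
    static List<Integer> trace(int n) {
        List<Integer> seq = new ArrayList<>();
        seq.add(n);
        while (n > 0) {
            if (n % 2 == 0)
                n = n / 2;      // even branch: halve
            else
                n = n - 1;      // odd branch: subtract 1 (next value is even)
            seq.add(n);
        }
        return seq;
    }

    public static void main(String[] args) {
        System.out.println(trace(60));
        // [60, 30, 15, 14, 7, 6, 3, 2, 1, 0] : 9 iterations, i.e. O(log n)
    }
}
```

Because every odd value is immediately followed by an even one, n is at least halved every two iterations, which is why the loop is logarithmic.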

3.0 Tutor Marked Assignments (Individual or Group)


1. Use the most appropriate notation among O, Θ, and Ω to indicate the time
efficiency class of sequential search (see Section 2.1.4)
a. in the worst case.
b. in the best case.
c. in the average case.

2. Use the informal definitions of O, Θ, and Ω to determine whether the
following assertions are true or false.
a. n(n + 1)/2 ∈ O(n³)
b. n(n + 1)/2 ∈ O(n²)
c. n(n + 1)/2 ∈ Θ(n³)
d. n(n + 1)/2 ∈ Ω(n)
3. List the following functions according to their order of growth from the

lowest to the highest:


(n − 2)!, 5 lg(n + 100)¹⁰, 2²ⁿ, 0.001n⁴ + 3n³ + 1, ln² n, ∛n, 3ⁿ.
4. Consider the algorithm enigma below

ALGORITHM Enigma(A[0..n − 1, 0..n − 1])


//Input: A matrix A[0..n − 1, 0..n − 1] of real numbers
for i ←0 to n − 2 do
for j ←i + 1 to n − 1 do
if A[i, j] ≠ A[j, i]
return false
return true

a. What does this algorithm compute?


b. What is its basic operation?
c. How many times is the basic operation executed?
d. What is the efficiency class of this algorithm?
e. Suggest an improvement, or a better algorithm altogether, and indicate its
efficiency class. If you cannot do it, try to prove that, in fact, it cannot be
done.

4.0 Conclusion/Summary
In this study session, you were introduced to complexity analysis. You learnt
what basic operations are and some examples. You were also introduced to the
concept of average, best and worst cases, as well as some examples of simple
complexity analysis. The concluding part of the study session introduced you to

the Big-O (asymptotic) Notation, how to prove Big-O complexity and how to
determine complexity of code structures. In the next study session, you will be
introduced to linked lists.

5.0 Self-Assessment Questions


1. Explain the terms Time Efficiency and Space Efficiency
2. Outline two reasons why we are usually more interested in the worst case
complexity of algorithms.

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2ZkJtxx , http://bit.ly/2zbO71O, http://bit.ly/2ZoLvbD ,
http://bit.ly/2MDWQ5I, http://bit.ly/2PcWUeR , http://bit.ly/2zljMho , http://bit.ly/2U2QIoF

Watch the video & summarise in 1 paragraph


b. View the animation on http://bit.ly/2U2QIoF and critique it in the discussion
forum
c. Take a walk and engage any 3 students on average-, best- and worst-case
complexity of algorithms. In 2 paragraphs, summarise their opinion of the
discussed topic.

7.0 Self Assessment Question Answers


1. Time efficiency: The time an algorithm takes to execute.
Space efficiency: This is the space (primary or secondary memory) the
algorithm uses.
2.
i. Easier to compute
ii. Usually close to the actual running time
iii. Crucial to real-time systems such as air-traffic control, etc.

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 4
Linked Lists
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a List Data Structure?
2.2 - Elements of a List
2.3 - Operations on Lists
2.4 - Singly Linked Lists
2.4.1 - Representation
2.4.2 - Space Analysis
2.5 - Operations on Singly Linked Lists
2.5.1 - Creation and Insertion
2.5.1.1 - Insertion at the end (Append)
2.5.1.2 - Insertion at the beginning (Prepend)
2.5.1.3 - Insertion before and after an element
2.5.2 - Traversal
2.5.3 - Searching
2.5.4 - Deletion
2.5.4.1 - Deletion - Difference between the MyLinkedList and
the Element extracts
2.5.4.2 - Deletion – Deleting First and Last Element
2.6- Doubly Linked Lists
2.6.1- Doubly linked lists: Representation
2.6.2- Doubly Linked Lists: Space Analysis
2.7- Operations on doubly Linked Lists
2.7.1- Creation and Insertion
2.7.1.1- Insertion at the end (Append)

2.7.1.2- Insertion at the beginning (Prepend)
2.7.1.3- Insertion before an element
2.7.2- Traversal
2.7.3- Deletion
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction
What you will learn in this study session borders on Lists, their operations and
implementations. Typical examples are given to facilitate your understanding of
these features.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Describe a List
2. Identify the elements of a List
3. Explain the operations and implementations of Linked Lists.

2.0 Main Content


2.1 What is a List Data Structure?
A list data structure is a sequential data structure, i.e. a collection of items
accessible one after another, beginning at the head and ending at the tail. It is a
widely used data structure for applications which do not need random access.
Lists differ from the stacks and queues data structures in that additions and
removals can be made at any position in the list.

2.2 Elements of a List
Let us take a look at how the sentence “Dupe is not a boy” can be written as
'a list'.

Figure 1.4.1 Elements of a List


We regard each word in the sentence above as a data-item or datum, which is
linked to the next datum, by a pointer. Datum plus pointer makes one node of a
list. The last pointer in the list is called a terminator. It is often convenient to
speak of the first item as the head of the list, and the remainder of the list as the
tail.
2.3 Operations on Lists
The main primitive operations of a list we know are:
- Add: adds a new node
- Set: updates the contents of a node
- Remove: removes a node
- Get: returns the value at a specified index
- IndexOf: returns the index in the list of a specified element
Additional primitive operations can be defined:
- IsEmpty: reports whether the list is empty
- IsFull: reports whether the list is full
- Initialise: creates/initialises the list
- Destroy: deletes the contents of the list (may be implemented by re-
initialising the list)
- Initialise: creates the structure, i.e. ensures that the structure exists but
contains no elements, e.g. Initialise(L) creates a new empty list named L

Okay, now it is time for us to see the different types of lists we have, how we
represent them, their space analysis and how to perform some of the operations
we have just outlined on them. Let us get started.
2.4 Singly Linked Lists
A singly linked list is a data structure in which the data items are chained
(linked) in one direction. Figure 2 shows an example of a singly linked list.

Figure 1.4.2: A singly linked list

2.4.1 Representation
In representing a singly linked list, we will be using a representation in which a
linked list has both head and tail references.
public class MyLinkedList{
    protected Element head;
    protected Element tail;

    public final class Element{
        Object data;
        Element next;
        Element(Object obj, Element element){
            data = obj;
            next = element;
        }
        public Object getData(){return data;}
        public Element getNext(){return next;}
    }
}
2.4.2 Space Analysis
Now, we can take a look at the space requirements for a singly linked list:
S(n) = sizeof(MyLinkedList) + n sizeof(MyLinkedList.Element)
= 2 sizeof(MyLinkedList.Element ref) + n [sizeof(Object ref) +
sizeof(MyLinkedList.Element ref)]
= (n + 2) sizeof(MyLinkedList.Element ref) + n sizeof(Object ref)

- sizeof(MyLinkedList) = 2 sizeof(MyLinkedList.Element ref): the list
reference has two fields, head (type: Element) and tail (type: Element).
- n sizeof(MyLinkedList.Element): the list has n elements of type Element;
each element has two fields -- data (type Object) and next (type Element).

2.5 Operations on Singly Linked Lists


2.5.1 Creation and Insertion
An empty list is created as follows
MyLinkedList list = new MyLinkedList();

(head and tail both refer to null)

Once created, elements can be inserted into the list using either the append or
prepend methods.
for (int k = 0; k < 10; k++)
list.append(new Integer(k));

Also, if we have a reference to a node (an element), we can use the insertAfter
or insertBefore methods of the Element class.

2.5.1.1 Insertion at the end (Append)


To insert at the end of a list, we make the next reference of the last element
(the tail) point to the element being inserted, and then make the tail reference
that element. The code fragment for insertion at the end of a list is
shown below.

public void append(Object obj){
    Element element = new Element(obj, null);
    if(head == null)
        head = element;
    else
        tail.next = element;
    tail = element;
}
Complexity is O(1)
The figures below depict insertion at the end of a singly linked list.

Figure 1.4.3(a): Insertion at the end of list Figure 1.4.3(b): Insertion at the end of list

Figure 1.4.3(c): Insertion at the end of list

2.5.1.2 Insertion at the beginning (Prepend)


To insert at the beginning of a list, we make the next reference of the element
being inserted point to the first element (the head), and then make the head
reference the inserted element. The code fragment for insertion at the
beginning of a list is shown below.

public void prepend(Object obj) {
    Element element = new Element(obj, head);
    if(head == null)
        tail = element;
    head = element;
}
Complexity is O(1)
The figures below depict insertion at the beginning of a singly linked list.

Figure 1.4.4(a): Insertion at the beginning of list

Figure 1.4.4(b): Insertion at the beginning of list

2.5.1.3 Insertion before and after an element


The code fragments below insert a new element before or after the current
element (this); both are methods of the Element inner class.

public void insertBefore(Object obj) {
    Element element = new Element(obj, this);
    if(this == head) {
        head = element;
        return;
    }
    Element previous = head;
    while (previous.next != this) {
        previous = previous.next;
    }
    previous.next = element;
}
Complexity is O(n)

public void insertAfter(Object obj) {
    next = new Element(obj, next);
    if(this == tail)
        tail = next;
}
Complexity is O(1)

2.5.2 Traversal
To move a reference e from one node to the next, we use:
e = e.next;

Figure 1.4.5: Move one reference to another

Example: Count the number of nodes in a linked list.
public int countNodes(){
    int count = 0;
    Element e = head;
    while(e != null){
        count++;
        e = e.next;
    }
    return count;
}
Complexity is O(n)
In-text Question 1
If we have reference to an element in a list, then we can insert a new element before or after
it. (True of False)?

Answer
True

2.5.3 Searching
To search for an element, we traverse from head until we locate the object.
Example: Count the number of nodes with data field equal to a given object.

public int countNodes(Object obj){
    int count = 0;
    Element e = head;
    while(e != null){
        if(e.data.equals(obj))
            count++;
        e = e.next;
    }
    return count;
}
Complexity is O(n)

2.5.4 Deletion
To delete an element, we use either the extract method of MyLinkedList or that
of the Element inner class.

public void extract(Object obj) {
    Element element = head;
    Element previous = null;
    while(element != null && !element.data.equals(obj)) {
        previous = element;
        element = element.next;
    }
    if(element == null)
        throw new IllegalArgumentException("item not found");
    if(element == head)
        head = element.next;
    else
        previous.next = element.next;
    if(element == tail)
        tail = previous;
}
Complexity is O(n)

Figure 1.4.6 How to delete an element from a list

2.5.4.1 Deletion - Difference between the MyLinkedList and the Element
extracts
To delete an element, we use either the extract method of MyLinkedList or that
of the Element inner class.
try{
    list.extract(obj1);
} catch(IllegalArgumentException e){
    System.out.println("Element not found");
}

MyLinkedList.Element e = list.find(obj1);
if(e != null)
    e.extract();
else
    System.out.println("Element not found");

2.5.4.2 Deletion – Deleting First and Last Element


To delete the first element in a list, we first check whether the head is null; if
it is, the list is empty, so we throw the necessary exception. Otherwise we set
the head to reference the next element, and if the list has thereby become
empty we set the tail to null as well.
To delete the last element, we check whether the list is empty or has only one
element. If the tail is null, the list is empty, so we throw the necessary
exception. If the list has exactly one element, we simply set both the head and
tail to null. If the list has more than one element, we traverse the list to the
node just before the tail, unlink the old tail, and make that node the new tail.
The code fragments below describe how to delete the first and last elements.
public void extractFirst() {
    if(head == null)
        throw new IllegalArgumentException("item not found");
    head = head.next;
    if(head == null)
        tail = null;
}
Complexity is O(1)

public void extractLast() {
    if(tail == null)
        throw new IllegalArgumentException("item not found");
    if (head == tail)
        head = tail = null;
    else {
        Element previous = head;
        while (previous.next != tail)
            previous = previous.next;
        previous.next = null;
        tail = previous;
    }
}
Complexity is O(n)
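Putting the pieces of this section together, a stripped-down version of MyLinkedList can be compiled and exercised end to end. This is our condensed sketch of the class above (only append, prepend, countNodes and extractFirst), not the full listing:

```java
public class MyLinkedList {
    protected Element head, tail;

    public final class Element {
        Object data;
        Element next;
        Element(Object obj, Element element) { data = obj; next = element; }
    }

    public void append(Object obj) {            // O(1)
        Element element = new Element(obj, null);
        if (head == null) head = element; else tail.next = element;
        tail = element;
    }

    public void prepend(Object obj) {           // O(1)
        Element element = new Element(obj, head);
        if (head == null) tail = element;
        head = element;
    }

    public int countNodes() {                   // O(n)
        int count = 0;
        for (Element e = head; e != null; e = e.next) count++;
        return count;
    }

    public void extractFirst() {                // O(1)
        if (head == null) throw new IllegalArgumentException("item not found");
        head = head.next;
        if (head == null) tail = null;
    }

    public static void main(String[] args) {
        MyLinkedList list = new MyLinkedList();
        for (int k = 0; k < 10; k++) list.append(Integer.valueOf(k));
        list.prepend(Integer.valueOf(-1));      // list: -1, 0, 1, ..., 9
        System.out.println(list.countNodes());  // prints 11
        list.extractFirst();                    // removes -1
        System.out.println(list.countNodes());  // prints 10
    }
}
```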
2.6 Doubly Linked Lists
A doubly linked list is a data structure in which the data items are chained
(linked) in two directions. Doubly linked lists permit scanning or searching of
the list in both directions. Figure 1.4.7 shows an example of a doubly linked
list.

Figure 1.4.7: A doubly linked list

2.6.1 Doubly linked lists: Representation
For each element in a doubly linked list, we represent the data it carries, and a
reference to the previous and next elements. The code segment below shows the
representation of a doubly linked list.
public class DoublyLinkedList{
    protected Element head, tail;
    //. . .
    public class Element {
        Object data;
        Element next, previous;
        Element(Object obj, Element next, Element previous){
            data = obj;
            this.next = next;
            this.previous = previous;
        }
        public Object getData(){return data;}
        public Element getNext(){return next;}
        public Element getPrevious(){return previous;}
        // . . .
    }
}

2.6.2 Doubly Linked Lists: Space Analysis


The space requirements of our representation of the doubly linked lists is as
follows:
S(n) = sizeof(DoublyLinkedList) + n sizeof(DoublyLinkedList.Element)
= 2 sizeof(DoublyLinkedList.Element ref) + n [sizeof(Object ref)
+ 2 sizeof(DoublyLinkedList.Element ref)]
= (2n + 2) sizeof(DoublyLinkedList.Element ref) + n sizeof(Object ref)

- sizeof(DoublyLinkedList) = 2 sizeof(DoublyLinkedList.Element ref): the list
reference has two fields, head (type: Element) and tail (type: Element).
- n sizeof(DoublyLinkedList.Element): the list has n elements of type Element;
each element has three fields -- previous (type Element), data (type Object),
and next (type Element).
2.7 Operations on doubly Linked Lists
2.7.1 Creation and Insertion
An empty doubly linked list is created using the DoublyLinkedList class from
section 2.6.1 as follows:
DoublyLinkedList list = new DoublyLinkedList();

(head and tail both refer to null)

Like the singly linked list, once created, elements can be inserted into the list
using either the append or prepend methods.
for (int k = 0; k < 10; k++)
    list.append(new Integer(k));
Also, if we have a reference to a node (an element), we can use the insertAfter
or insertBefore methods of the Element class.
2.7.1.1 Insertion at the end (Append)
The new element is always added after the last element of the given linked
list. For example, if the given doubly linked list is 5→10→15→20→25 and we
add an element 30 at the end, then the list becomes 5→10→15→20→25→30.
Because our representation keeps a tail reference, there is no need to traverse
the list to the end: we link the new element after the current tail and then
update tail. The code segment below describes how to add an element at the
end of a doubly linked list.
public void append(Object obj){
    Element element = new Element(obj, null, tail);
    if(head == null)
        head = tail = element;
    else {
        tail.next = element;
        tail = element;
    }
}
Complexity is O(1)

Figure 1.4.8 (a): Before Insertion Figure 1.4.8 (b): After Insertion

2.7.1.2 Insertion at the beginning (Prepend)


The new element is always added before the head of the given Linked List and
newly added node becomes the new head of the doubly linked list. For example,
if the given Linked List is 10→15→20→25 and we add an item 5 at the front,
then the Linked List becomes 5→10→15→20→25. The code segment below describes how to
add an element at the beginning of a doubly linked list.
public void prepend(Object obj){
    Element element = new Element(obj, head, null);
    if(head == null)
        head = tail = element;
    else {
        head.previous = element;
        head = element;
    }
}
Complexity is O(1)

Figure 1.4.9 (a): Before Insertion Figure 1.4.9 (b): After Insertion

2.7.1.3 Insertion before an element


To insert before the current node (this) that is neither the first nor the last node,
we simply change the references as shown in the code segment below.
Element element = new Element(obj, this, this.previous);
this.previous.next = element;
this.previous = element;
Complexity is O(1)

Figure 1.4.10 (a): Before Insertion

Figure 1.4.10 (b): After Insertion

In-text Question 2
What is the complexity of appending an element to a linked list?

Answer
The complexity of appending an element to a linked list is O(1).

2.7.2 Traversal
For a doubly linked list, traversal can be done in either direction: forward,
starting from the head, or backward, starting from the tail. The code segments
below show both ways.

Forward:
Element e = head;
while (e != null) {
    //do something
    e = e.next;
}

Backward:
Element e = tail;
while (e != null) {
    //do something
    e = e.previous;
}

Example: Count the number of nodes in a linked list.


public int countNodes(){
    int count = 0;
    Element e = head;
    while(e != null){
        count++;
        e = e.next;
    }
    return count;
}
Complexity is O(n)

Example: The following computes the sum of the last n nodes:

public int sumLastNnodes(int n){
    if(n <= 0)
        throw new IllegalArgumentException("Wrong: " + n);
    if(head == null)
        throw new ListEmptyException();
    int count = 0, sum = 0;
    Element e = tail;
    while(e != null && count < n){
        sum += ((Integer)e.data).intValue();
        count++;
        e = e.previous;
    }
    if(count < n)
        throw new IllegalArgumentException("No. of nodes < " + n);
    return sum;
}
Complexity is O(n)
2.7.3 Deletion
To delete an element, we use either the extract method of DoublyLinkedList or
that of the Element inner class. We traverse the list to check if the element to be
deleted exists in the list. If it does, we change the reference to the next and
previous elements accordingly as shown in the code segment below.
public void extract(Object obj){
    Element element = head;
    while((element != null) && (!element.data.equals(obj)))
        element = element.next;
    if(element == null)
        throw new IllegalArgumentException("item not found");
    if(element == head) {
        head = element.next;
        if(element.next != null)
            element.next.previous = null;
    } else {
        element.previous.next = element.next;
        if(element.next != null)
            element.next.previous = element.previous;
    }
    if(element == tail)
        tail = element.previous;
}
Complexity is O(n)
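As with the singly linked list, the doubly linked operations can be exercised together. The sketch below is our condensed version of DoublyLinkedList; the bothWays helper (our addition) traverses forward and then backward to confirm that both chains stay consistent after an extract:

```java
public class DoublyLinkedList {
    protected Element head, tail;

    public class Element {
        Object data;
        Element next, previous;
        Element(Object obj, Element next, Element previous) {
            data = obj; this.next = next; this.previous = previous;
        }
    }

    public void append(Object obj) {            // O(1)
        Element element = new Element(obj, null, tail);
        if (head == null) head = tail = element;
        else { tail.next = element; tail = element; }
    }

    public void prepend(Object obj) {           // O(1)
        Element element = new Element(obj, head, null);
        if (head == null) head = tail = element;
        else { head.previous = element; head = element; }
    }

    public void extract(Object obj) {           // O(n)
        Element element = head;
        while (element != null && !element.data.equals(obj))
            element = element.next;
        if (element == null)
            throw new IllegalArgumentException("item not found");
        if (element == head) {
            head = element.next;
            if (head != null) head.previous = null;
        } else {
            element.previous.next = element.next;
            if (element.next != null) element.next.previous = element.previous;
        }
        if (element == tail) tail = element.previous;
    }

    // Forward traversal, a separator, then backward traversal.
    public String bothWays() {
        StringBuilder sb = new StringBuilder();
        for (Element e = head; e != null; e = e.next) sb.append(e.data).append(' ');
        sb.append('|');
        for (Element e = tail; e != null; e = e.previous) sb.append(' ').append(e.data);
        return sb.toString();
    }

    public static void main(String[] args) {
        DoublyLinkedList list = new DoublyLinkedList();
        for (int k : new int[]{10, 15, 20}) list.append(Integer.valueOf(k));
        list.prepend(Integer.valueOf(5));       // 5 10 15 20
        list.extract(Integer.valueOf(15));      // 5 10 20
        System.out.println(list.bothWays());    // prints "5 10 20 | 20 10 5"
    }
}
```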

3.0 Tutor Marked Assignments (Individual or Group)
1. Mention at least two types of lists.
2. For the MyLinkedList class, implement each of the following methods:
- String toString()
- Element find(Object obj)
- void insertAt(int n) //counting the nodes from 1.
State the complexity of each method.
- Which methods are affected if we do not use the tail reference in
MyLinkedList class.
3. For the DoublyLinkedList class, Implement each of the following
methods and state its complexity.
- String toString()
- Element find(Object obj)
- void ExtractLast()
- void ExtractFirst()
- void ExtractLastN(int n)
4. For the DoublyLinkedList.Element inner class, implement each of the
following methods and state its complexity.
- void insertBefore()
- void insertAfter()
- void extract()
Which methods of DoublyLinkedList and its Element inner class are more
efficient than those of the MyLinkedList class?

4.0 Conclusion/Summary
In this study session, you have learned about linked Lists. You have also been
able to identify the elements of a linked List. You have learnt that linked lists
are of two types, namely, singly linked lists and doubly linked lists. Finally, you
learnt the different operations that could be carried out on linked lists. We

introduce stacks and queues in the next study session.

5.0 Self-Assessment Questions


1. What is a List data structure?
2. What is the complexity of searching for an element in a linked list?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2Pbi90H, http://bit.ly/2UdSX8R , http://bit.ly/2ME8Y74 ,
http://bit.ly/33ZOZoE , http://bit.ly/2U5ZHp4, http://bit.ly/2ZebDea , http://bit.ly/2MD0M70.

Watch the video & summarise in 1 paragraph


b. View the animation on http://bit.ly/33VilVj and critique it in the discussion forum
c. Take a walk and engage any 3 students on inserting an element before or after
another element in a doubly linked list. In 2 paragraphs, summarise their opinion
of the discussed topic.

7.0 Self Assessment Question Answers


1. A list data structure is a sequential data structure, i.e. a collection of items
accessible one after the other, beginning at the head and ending at the tail.
2. The complexity of searching in a linked list is O(n)

8.0 References/Further Readings


Adam, D. "Data Structures and Algorithms in Java", 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany, L. "Introduction to the Design and Analysis of Algorithms",
3rd Edition, Pearson Education, Inc., 2012.
Bruno, R. P. "Data Structures and Algorithms with Object Oriented
Design Patterns in Java", John Wiley & Sons, Inc., 2000.
Cormen, T.H., Leiserson, C.E. and Rivest, R.L. (1989). Introduction to
Algorithms, New York: McGraw-Hill.
Deitel, H.M. and Deitel, P.J. (1998). C++ How to Program (2nd
Edition), New Jersey: Prentice Hall.
Nell, D., Daniel, T. J. and Chip, W. "Object-Oriented Data Structures using Java",
Jones and Bartlett Publishers, Inc., 2002.
Ford, W. and Topp, W. (2002). Data Structures with C++ Using the
STL (2nd Edition), New Jersey: Prentice Hall.
French, C. S. (1992). Computer Science, DP Publications, (4th Edition),
199-217.
Robert, L. "Data Structures and Algorithms in Java", 2nd Edition,
Sams Publishing, 2003.
Shaffer, Clifford A. (1998). A Practical Introduction to Data Structures
and Algorithm Analysis, Prentice Hall, pp. 77-102.

STUDY SESSION 5
Stacks and Queues
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- What is a Stack?
2.1.1- Stack Implementations
2.1.1.1- Stack implementation using array.
2.1.1.2- Stack implementation using linked list.
2.1.2- Applications of Stack.
2.1.2.1- Evaluating Postfix Expression
2.2- What is a queue?
2.2.1- Queue Implementations
2.2.1.1 - QueueAsArray Implementation
2.2.1.2 - QueueAsCircularArray Implementation
2.2.1.3 - QueueAsLinkedList Implementation
2.2.2- Applications of Queues.
2.2.3- Priority queues
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In this study session, we will look at two data structures – the stack and the
queue data structures. While the stack stores data on the basis of Last in First
out (LIFO), the queue stores data on the basis of First in First out (FIFO).
We shall study the operations of these data structures, their applications, as well
as their implementations using arrays and linked lists.

1.0 Study Session Learning Outcomes


After studying this session, you are expected to be able to:
1. Describe the stack and queue data structures
2. Explain implementation of stacks and queues using arrays and linked lists
3. Outline the applications of stacks and queues in computing
4. Evaluate postfix expressions using stacks

2.0 Main Content


2.1 What is a Stack?
Do you have an idea of what a stack data structure is? Well if you do not, a
stack is a linear data structure in which all insertions and deletions of data are
made only at one end of the stack, often called the top of the stack. To add
(push) an item to the stack, it must be placed on top of the stack. To remove
(pop) an item from the stack, it must be removed from the top of the stack too.
It is for this reason that we usually refer to a stack as a LIFO (last-in first-out)
structure.
An Example of Stack: Figure 1.5.1 below shows an example of a stack.

Figure 1.5.1: An example of a Stack
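Before building our own Stack interface, the LIFO behaviour can be previewed with Java's standard java.util.ArrayDeque (a small self-contained sketch; the course classes developed below do not depend on it):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class LifoDemo {
    // Pushes 1, 2, 3 and then pops everything; returns the pop order.
    public static String pushThenPopAll() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);                       // 1 is at the bottom
        stack.push(2);
        stack.push(3);                       // 3 is the last in...
        StringBuilder order = new StringBuilder();
        while (!stack.isEmpty())
            order.append(stack.pop()).append(' '); // ...and the first out
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(pushThenPopAll()); // 3 2 1
    }
}
```

Note how the pop order is the reverse of the push order, which is exactly the LIFO property described above.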

2.1.1 Stack Implementations


In our implementation, a stack is a container that extends the AbstractContainer
class and implements the Stack interface below.

public interface Stack extends Container {


public abstract Object getTop();
public abstract void push(Object obj);
public abstract Object pop();
}

Now, we can have two implementations: StackAsArray or StackAsLinkedList


as we will see in the next two sections.

2.1.1.1 Stack implementation using array


For the StackAsArray implementation, the underlying data structure is an array
(I am sure you are very conversant with what an array is) of Object.
StackAsArray – Constructor
In the StackAsArray implementation that follows, the top of the stack is
array[count – 1] (where count is the number of elements in the array) and the
bottom is array[0]. The constructor's single parameter, size, specifies the
maximum number of items that can be stored in the stack. The variable array is
initialised to be an array of length size as shown in the code fragment below.
public class StackAsArray extends AbstractContainer
implements Stack {
protected Object[] array;
public StackAsArray(int size){
array = new Object[size];
}
// …

StackAsArray – purge() Method


The purpose of the purge method is to remove all the contents of a container. To
empty the stack, the purge method simply assigns the value null to the first
count positions of the array as shown in the code fragment below.
public void purge(){
while (count > 0)
array[--count] = null;
}
Complexity is O(n)

StackAsArray – push() Method


The push method adds an element at the top of the stack. It takes as argument an
object to be pushed. It first checks if there is room left in the stack. If no room is
left, it throws a ContainerFullException exception. Otherwise, it puts the
object into the array, and then increments count variable by one. The code
fragment below shows how to push an object onto a stack.
public void push(Object object){
if (count == array.length)
throw new ContainerFullException();
else
array[count++] = object;
}
Complexity is O(1)

StackAsArray – pop() Method
The pop method removes an item from the stack and returns that item. The pop
method first checks if the stack is empty. If the stack is empty, it throws a
ContainerEmptyException. Otherwise, it simply decreases count by one and
returns the item found at the top of the stack. The code fragment below shows
how to pop an object from a stack.

public Object pop(){


if(count == 0)
throw new ContainerEmptyException();
else {
Object result = array[--count];
array[count] = null;
return result;
Complexity is O(1)
}
}

StackAsArray – getTop() Method


The getTop method is a stack accessor which returns the top item in the stack
without removing that item. It first checks if the stack is empty. If the stack is
empty, it throws a ContainerEmptyException. Otherwise, it returns the top
item found at position count - 1. The code fragment for the getTop method is
shown below.
public Object getTop(){
if(count == 0)
throw new ContainerEmptyException();
else
return array[count - 1];
} Complexity is O(1)

StackAsArray – iterator() Method


The iterator method is used to traverse the objects in the stack one after the
other. The code fragment below shows the implementation for the iterator
method.

public Iterator iterator() {
return new Iterator() {
private int position = count-1;
public boolean hasNext() {
return position >=0;
}
public Object next () {
if(position < 0)
throw new NoSuchElementException();
else
return array[position--];
}
};
}
2.1.1.2 Stack implementation using linked list
For the StackAsLinkedList implementation, the underlying data structure is an
object of MyLinkedList. We have described what the various methods of the
stack achieve; therefore, we only provide their implementations here.

public class StackAsLinkedList


extends AbstractContainer
implements Stack {
protected MyLinkedList list;

public StackAsLinkedList(){
list = new MyLinkedList();
}
public void purge(){
list.purge();
count = 0; Complexity is O(1)
}
// …

public void push(Object obj){
list.prepend(obj);
count++; Complexity is O(1)
}
public Object pop(){
if(count == 0)
throw new ContainerEmptyException();
else{
Object obj = list.getFirst();
list.extractFirst();
count--;
Complexity is O(1)
return obj;
}
}
public Object getTop(){
if(count == 0)
throw new ContainerEmptyException();
else Complexity is O(1)
return list.getFirst();
}

public Iterator iterator() {


return new Iterator() {
private MyLinkedList.Element position =
list.getHead();

public boolean hasNext() {


return position != null;
}

public Object next() {


if(position == null)
throw new NoSuchElementException();
else {
Object obj = position.getData();
position = position.getNext();
return obj;
}
}
};
}

2.1.2 Applications of Stack
Let us now outline some applications of stacks. Some direct applications,
which I am quite sure are very familiar to you, are:
- Page-visited history in a Web browser
- Undo sequence in a text editor
- Chain of method calls in the Java Virtual Machine
- Evaluating postfix expressions
Some indirect applications are:
- Auxiliary data structure for some algorithms
- Component of other data structures
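As a small illustration of the undo application, a text editor can push the previous buffer state onto a stack before each edit and pop it to undo (a hypothetical sketch; UndoDemo is not part of the course's class library):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class UndoDemo {
    // Appends each edit to the text, remembering the previous state on a
    // stack; each undo pops the most recently saved state back.
    public static String type(String start, String[] edits, int undos) {
        Deque<String> history = new ArrayDeque<>();
        String text = start;
        for (String edit : edits) {
            history.push(text);   // remember the state before the edit
            text = text + edit;
        }
        for (int i = 0; i < undos && !history.isEmpty(); i++)
            text = history.pop(); // the most recent edit is undone first
        return text;
    }

    public static void main(String[] args) {
        System.out.println(type("a", new String[]{"b", "c"}, 1)); // ab
    }
}
```

Because undo must reverse the most recent edit first, the LIFO discipline of a stack is exactly what is needed.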

2.1.2.1 Evaluating Postfix Expression


An ordinary arithmetical expression of the form (5+9)*2+6*5 is called an infix
expression because binary operators appear in between their operands. The
order of operations evaluation is determined by the precedence rules and
parenthesis. When an evaluation order is desired that is different from that
provided by the precedence, parentheses are used to override precedence rules.
Expressions can also be represented using postfix notation - where an operator
comes after its two operands. The advantage of postfix notation is that the order
of operation evaluation is unique without the need for precedence rules or
parenthesis.

Table 1.5.1: Infix expressions with postfix equivalent

Infix Postfix
16 / 2 16 2 /

(2 + 14)* 5 2 14 + 5 *

2 + 14 * 5 2 14 5 * +

(6 - 2) * (5 + 4) 6 2 - 5 4 + *

The following algorithm uses a stack to evaluate a postfix expression.


Start with an empty stack
for (each item in the expression) {
if (the item is a number)
Push the number onto the stack
else if (the item is an operator){
Pop two operands from the stack
Apply the operator to the operands
Push the result onto the stack
}
}

Pop the remaining number from the stack; that is the result of the evaluation.
Example: Consider the postfix expression, 2 10 + 9 6 - /, which is (2 +
10) / (9 - 6) in infix, the result of which is 12 / 3 = 4.
The following is a trace of the postfix evaluation algorithm for the above:
push 2, push 10; the operator + pops 10 and 2 and pushes 12; push 9, push 6;
the operator - pops 6 and 9 and pushes 9 - 6 = 3; finally, the operator / pops 3
and 12 and pushes 12 / 3 = 4, which is then popped as the final result.
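The algorithm can be written in Java using a stack (a minimal sketch, assuming space-separated tokens and integer operands; it uses the standard ArrayDeque rather than the course's Stack interface):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PostfixEvaluator {
    // Evaluates a space-separated postfix expression, e.g. "2 10 + 9 6 - /".
    public static int evaluate(String expression) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/":
                    int right = stack.pop();   // the second operand is on top
                    int left = stack.pop();
                    switch (token) {
                        case "+": stack.push(left + right); break;
                        case "-": stack.push(left - right); break;
                        case "*": stack.push(left * right); break;
                        default:  stack.push(left / right); break;
                    }
                    break;
                default:
                    stack.push(Integer.parseInt(token)); // a number
            }
        }
        return stack.pop(); // the single remaining value is the result
    }

    public static void main(String[] args) {
        System.out.println(evaluate("2 10 + 9 6 - /")); // prints 4
    }
}
```

Note that the first operand popped is the right-hand operand; getting this order wrong would break the non-commutative operators - and /.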

In-text Question 1
A stack is a Last in First Out structure. (True or False)?

Answer
True

2.2 What is a Queue?


The queue data structure is similar to the normal queue you are conversant with
where you form a line on the basis of who comes first. The queue data structure
is characterized by the fact that additions are made at the end, or tail, of the
queue while removals are made from the front, or head of the queue. The first
item to be en-queued is the first to be de-queued. Queue is therefore called a
First in First out (FIFO) structure.

Figure 1.5.2: A queue data structure

Queue operations
Some operations that we can perform on a queue include:
- Enqueue: Adds a new item
- Dequeue: Removes an item
- GetHead: Returns the item at the front of the queue without removing it
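These operations can be previewed with Java's standard ArrayDeque used as a FIFO queue (a small self-contained sketch, independent of the Queue interface developed below):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FifoDemo {
    // Enqueues 1, 2, 3 and then dequeues everything; returns the order.
    public static String enqueueThenDequeueAll() {
        Deque<Integer> queue = new ArrayDeque<>();
        queue.addLast(1);                    // enqueue at the tail
        queue.addLast(2);
        queue.addLast(3);
        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty())
            order.append(queue.pollFirst()).append(' '); // dequeue at the head
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(enqueueThenDequeueAll()); // 1 2 3
    }
}
```

Unlike the stack demo earlier, items come out in the same order they went in, which is the FIFO property.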

Figure 1.5.3: Example of a queue
2.2.1: Queue Implementations
In our implementation, a queue is a container that extends the
AbstractContainer class and implements the Queue interface below.
public interface Queue extends Container{
public abstract Object getHead();
public abstract void enqueue(Object obj);
public abstract Object dequeue();
}
Now we can have three implementations: QueueAsArray,
QueueAsCircularArray or QueueAsLinkedList, as we shall see in the next three
sections.

2.2.1.1 QueueAsArray Implementation


Apart from the queue operations highlighted at the beginning of this study
session, the queue also has the purge method and the iterator method, which we
explained in our discussion on stacks. We now provide the code segment for the
implementation of a queue using array as the underlying data structure.

public class QueueAsArray extends AbstractContainer
implements Queue {
protected Object[] array;
protected int rear = 0;
protected int size;

public QueueAsArray(int size) {


array = new Object[size];
this.size = size;
}
public void purge(){
int index = 0;
while(count > 0){
array[index] = null;
index++;
count--;
} Complexity is O(n)
rear = 0;
}
public Object getHead(){ Complexity is O(1)
if(count == 0)
throw new ContainerEmptyException();
else
return array[0];
}
public void enqueue(Object obj){
if(count == size){
throw new ContainerFullException();
}
else{
array[rear++] = obj;
count++;
} Complexity is O(1)
}
public Object dequeue(){
if(count == 0)
throw new ContainerEmptyException();
else {
Object obj = array[0];
count--;
for(int k = 1; k <= count; k++)
array[k - 1] = array[k];
rear--;
return obj;
} Complexity is O(n)
}

public Iterator iterator() {
return new Iterator() {
int index = 0;
public boolean hasNext(){
return index < count;
}
public Object next(){
if(index == count)
throw new NoSuchElementException();
else {
Object obj = array[index++];
return obj;
}
}
};
}

2.2.1.2 QueueAsCircularArray Implementation


By using modulo arithmetic for computing array indexes, we can have a queue
implementation in which each of the operations enqueue, dequeue, and
getHead has complexity O(1)
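The wrap-around effect of the modulo arithmetic can be seen by stepping an index repeatedly (a small self-contained illustration; WrapAround is a hypothetical helper, not part of the course's class library):

```java
import java.util.Arrays;

public class WrapAround {
    // Returns the sequence of rear indices produced by repeatedly
    // advancing with (rear + 1) % size, starting from 0.
    public static int[] indices(int size, int steps) {
        int[] result = new int[steps];
        int rear = 0;
        for (int i = 0; i < steps; i++) {
            result[i] = rear;
            rear = (rear + 1) % size; // wraps back to 0 after size - 1
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(indices(4, 6))); // [0, 1, 2, 3, 0, 1]
    }
}
```

Because the index wraps around instead of running off the end, no shifting of elements is ever needed, which is what makes dequeue O(1) here.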

Figure 1.5.4(a): Circular queue

Figure 1.5.4(b): Circular queue
We now provide the implementation of a queue as a circular array.

public class QueueAsCircularArray extends AbstractContainer


implements Queue {
protected Object[] array;
protected int front = 0;
protected int rear = 0;
protected int size;

public QueueAsCircularArray(int size) {


array = new Object[size];
this.size = size;
}
public void purge(){
int index = front;
while(count > 0){
array[index] = null;
index = (index + 1) % size;
count--;
} Complexity is O(n)
front = rear = 0;
}

public Object getHead(){
if(count == 0) throw new ContainerEmptyException();
else return array[front];
Complexity is O(1)
}
public void enqueue(Object obj){
if(count == size) throw new
ContainerFullException();
else {
array[rear] = obj;
rear = (rear + 1) % size; Complexity is O(1)
count++;
}
}
public Object dequeue(){
if(count == 0)throw new ContainerEmptyException();
else {
Object obj = array[front]; Complexity is O(1)
front = (front + 1) % size;
count--;
return obj;
}
}
public Iterator iterator(){
return new Iterator() {
int index = front;
int counter = 0;
public boolean hasNext(){
return counter < count;
}
public Object next(){
if(counter == count)
throw new NoSuchElementException();
else {
Object obj = array[index];
index = (index + 1) % size;
counter++;
return obj;
}
}
};
}

In-text Question 2
Queues can be used to evaluate postfix expressions. (True or False)?

Answer
False

2.2.1.3 QueueAsLinkedList Implementation
We also provide implementation of a queue and its operations using our
MyLinkedList singly linked list class.
public class QueueAsLinkedList extends AbstractContainer
implements Queue {
protected MyLinkedList list;

public QueueAsLinkedList(){list = new MyLinkedList();}


public void purge(){
list.purge();
Complexity is O(1)
count = 0;
}
public Object getHead(){
if(count == 0)
throw new ContainerEmptyException();
else
return list.getFirst(); Complexity is O(1)
}

public void enqueue(Object obj){


list.append(obj);
count++; Complexity is O(1)
}
public Object dequeue(){
if(count == 0)
throw new ContainerEmptyException();
else {
Object obj = list.getFirst();
list.extractFirst();
count--; Complexity is O(1)
return obj;
}
}

public Iterator iterator() {


return new Iterator() {
MyLinkedList.Element position = list.getHead();
public boolean hasNext(){
return position != null;
}
public Object next(){
if(position == null)
throw new NoSuchElementException();
else{
Object obj = position.getData();
position = position.getNext();
return obj;
}
}
};
}
2.2.2 Applications of Queues
Some direct applications of queues include:
- Waiting lines: Queues are commonly used in systems where waiting line
has to be maintained for obtaining access to a resource. For example, an
operating system may keep a queue of processes that are waiting to run
on the CPU.
- Access to shared resources (e.g., printer)
- Multiprogramming
Some Indirect applications of queues include:
- Auxiliary data structure for algorithms
- Component of other data structures

2.2.3 Priority queues


In a normal queue, the enqueue operation adds an item at the back of the queue,
and the dequeue operation removes an item in front of the queue. A priority
queue is a queue in which the dequeue operation removes the item in front of the
queue, but the enqueue operation inserts items according to their priorities. A
higher priority item is always enqueued before a lower priority element. An
element that has the same priority as one or more elements in the queue is
enqueued after all the elements with that priority.
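Java's standard java.util.PriorityQueue provides this behaviour, with the smallest element treated as the highest priority by default (a small sketch, separate from the Queue interface above):

```java
import java.util.PriorityQueue;

public class PriorityDemo {
    // Enqueues items in arbitrary order, then dequeues them all;
    // items come out by priority, not by insertion order.
    public static String drain() {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        pq.offer(5);
        pq.offer(1);
        pq.offer(3);
        StringBuilder order = new StringBuilder();
        while (!pq.isEmpty())
            order.append(pq.poll()).append(' '); // smallest (highest priority) first
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(drain()); // 1 3 5
    }
}
```

A Comparator can be supplied to the constructor to reverse this ordering so that the largest element is dequeued first.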

3.0 Tutor Marked Assignments (Individual or Group)


1. A stack is referred to as a LIFO structure, true or false? Give reasons for
your answer.
2. Write on two applications of stacks.
3. Why is a queue referred to as a FIFO structure?
4. Describe at least three applications of queues.

5. Applying the LIFO principle to the third stack S above, what would be
the state of the stack S, after the operation S. POP ( ) is executed?
Illustrate this with a simple diagram.

4.0 Conclusion/Summary
In this study session, you have learned about the stack and queue data structures. You
have also been able to understand the basic operations on stacks and queues.
You should also have learnt about the implementation of stacks and queues
using different data structures as well as their application in computing.

5.0 Self-Assessment Questions


1. Why is a stack called a Last in First Out structure?
2. Why is a queue called a First in First out structure?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2Lc6l91 , http://bit.ly/341wCiX , http://bit.ly/329fKoZ ,
http://bit.ly/2ZAGrW4 , http://bit.ly/2ZAGrW4 , http://bit.ly/2PoKrVM , http://bit.ly/33W2NAu ,

http://bit.ly/2MArY6p . Watch the video & summarise in 1 paragraph.


b. View the animation on http://bit.ly/341wCiX and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on applications of stacks and queues
in computing; In 2 paragraphs summarise their opinion of the discussed topic.
etc.

7.0 Self Assessment Question Answers
1. A stack is called a Last in First Out structure because the last element to
be added is always the first element to be removed.
2. A queue is called a First in First out structure because the first item to
be added to the queue is the first to be removed from the queue.

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L. (2012). “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc.
Bruno R. (2000), “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc.
Cormen, T. et al. (1989). Introduction to Algorithms, New York: McGraw-Hill.
Deitel, H. and Deitel, P. (1998). C++ How to Program (2nd
Edition), New Jersey: Prentice Hall.
Nell, D et al (2002). “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc.,
Ford, W. and Topp, W. (2002). Data Structures with C++ Using the
STL (2nd Edition), New Jersey: Prentice Hall.
French C. S. (1992). Computer Science, DP Publications, (4th Edition),
199-217.
Robert L. (2003). “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing,
Shaffer, Clifford A. (1998). Practical Introduction to Data Structures
and Algorithm Analysis, Prentice Hall, pp. 77–102.

STUDY SESSION 6
Recursion
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Review of Recursion
2.1.1- What is a Recursive Method?
2.1.2- The need for Auxiliary (or Helper) Methods
2.1.3- How Recursive Methods work
2.1.4- Tracing of Recursive Methods
2.2- Types of Recursive Methods
2.2.1- Direct and Indirect Recursive Methods
2.2.2- Nested and Non-Nested Recursive Methods
2.2.3- Tail and Non-Tail Recursive Methods
2.2.3.1- Converting tail-recursive method to iterative
2.2.3.2- Why tail recursion?
2.2.4- Converting non-tail to tail recursive method
2.2.5- Linear and Tree Recursive Methods
2.2.6- Excessive Recursion
2.3- More on Recursion
2.3.1- Recursion vs. Iteration
2.3.2- Why Recursion?
2.3.3- Common Errors in Writing Recursive Methods
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers

8.0 References/Further Readings

Introduction:
A recursive method is a method that calls itself either directly or indirectly. In
this study session, we introduce recursion and how it works. You will learn how
to make and trace recursive calls. We shall conclude the study session by
looking at different types of recursive methods, why we need to use recursions
and some common errors that could occur when writing recursion.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Define recursion
2. Describe how recursive methods work and know how to trace recursive
methods
3. Explain and differentiate between the different types of recursion
4. Analyse recursive algorithms

2.0 Main Content


2.1 Review of Recursion
2.1.1 What is a Recursive Method?
A method is recursive if it calls itself either directly or indirectly. Recursion is a
technique that allows us to break down a problem into one or more simpler sub-
problems that are similar in form to the original problem. You remember how to
compute the factorial of a given integer, right? Now let us see how to
recursively compute it.
Example 1: A recursive method for computing x!

long factorial (int x) {
if (x == 0)
return 1; //base case
else
return x * factorial (x - 1); //recursive case
}
This method illustrates all the important ideas of recursion:
1. A base (or stopping) case
i. Code first tests for stopping condition (is x = = 0?)
ii. Provides a direct (non-recursive) solution for the base case (0! = 1).
2. The recursive case
i. Expresses solution to problem in 2 (or more) smaller parts
ii. Invokes itself to compute the smaller parts, eventually reaching the
base case
Let us consider another example of a recursive method.
Example 2: count zeros in an array
int countZeros(int[] x, int index) {
if (index == 0)
return x[0] == 0 ? 1 : 0;
else if (x[index] == 0)
return 1 + countZeros(x, index - 1);
else
return countZeros(x, index - 1);
}
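The method is invoked with index set to the last position of the array, e.g. countZeros(x, x.length - 1). A quick check (the method is reproduced in a small class so the sketch is self-contained; ZeroCounter is a hypothetical wrapper):

```java
public class ZeroCounter {
    // Counts the zeros in x[0..index] recursively, one cell per call.
    public static int countZeros(int[] x, int index) {
        if (index == 0)
            return x[0] == 0 ? 1 : 0;       // base case: a single cell
        else if (x[index] == 0)
            return 1 + countZeros(x, index - 1);
        else
            return countZeros(x, index - 1);
    }

    public static void main(String[] args) {
        int[] data = {0, 4, 0, 7};
        System.out.println(countZeros(data, data.length - 1)); // 2
    }
}
```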

2.1.2 The need for Auxiliary (or Helper) Methods


Auxiliary or helper methods are used for one or more of the following reasons:
- To make recursive methods more efficient.
- To make the user interface to a method simpler by hiding the method's
initialisations.
Example 1: Consider the method:
public long factorial (int x){
if (x < 0)
throw new IllegalArgumentException("Negative
argument");
else if (x == 0)
return 1;
else
return x * factorial(x - 1);
}
The condition x < 0, which should be checked only once, is checked in each
recursive call. We can use a private auxiliary method to avoid this:

public long factorial(int x){
if (x < 0)
throw new IllegalArgumentException("Negative
argument");
else
return factorialAuxiliary(x);
}
private long factorialAuxiliary(int x){
if (x == 0)
return 1;
else
return x * factorialAuxiliary(x - 1);
}

Example 2: Consider the method:


public int binarySearch(int target, int[] array, int low,
int high) {
if(low > high)
return -1;
else {
int middle = (low + high)/2;
if(array[middle] == target)
return middle;
else if(array[middle] < target)
return binarySearch(target, array, middle + 1,
high);
else
return binarySearch(target, array, low, middle -
1);
}
}

The first time the method is called, the parameters low and high must be set to 0
and array.length - 1 respectively. Example:

int result = binarySearch (target, array, 0, array.length -


1);

From a user's perspective, the parameters low and high introduce an


unnecessary complexity that can be avoided by using an auxiliary method:

public int binarySearch(int target, int[] array){
return binarySearch(target, array, 0, array.length - 1);
}
private int binarySearch(int target, int[] array, int low,
int high){
if(low > high)
return -1;
else{
int middle = (low + high)/2;
if(array[middle] == target)
return middle;
else if(array[middle] < target)
return binarySearch(target, array, middle + 1,
high);
else
return binarySearch(target, array, low, middle -
1);
}
}

A call to the method becomes simple:


int result = binarySearch(target, array);
Example 3: Consider the following method that returns the length of a
MyLinkedList instance:

public int length(Element element){


if(element == null)
return 0;
else
return 1 + length(element.next);
}

The method must be invoked by a call of the form:


list.length(list.getHead());

By using an auxiliary method, we can simplify the call to:


list.length();
public int length(){
return auxLength(head);
}

private int auxLength(Element element){


if(element == null)
return 0;
else
return 1 + auxLength(element.next);
}
2.1.3 How Recursive Methods Work
Modern computers use a stack as the primary memory management model for a
running program. Each running program has its own memory allocation
containing the typical layout as shown below.
In-text Question 1
What is a recursive method?

Answer
A recursive method is a method that calls itself either directly or indirectly.

Figure 1.6.1: How recursive methods work


When a method is called, an Activation Record is created. It contains:
- The values of the parameters.
- The values of the local variables.
- The return address (The address of the statement after the call statement).
- The previous activation record address.
- A location for the return value of the activation record.
When a method returns:
- The return value of its activation record is passed to the previous
activation record or it is passed to the calling statement if there is no
previous activation record.

- The Activation Record is popped entirely from the stack.
Recursion is handled in a similar way. Each recursive call creates a separate
Activation Record; as each recursive call completes, its Activation Record is
popped from the stack. Ultimately, control passes back to the calling statement.

2.1.4 Tracing of Recursive Methods


A recursive method may be traced using the recursion tree it generates.
Example 1: Consider the recursive method f defined below. Draw the recursion
tree generated by the call f("KFU", 2) and hence determine the number of
activation records generated by the call and the output of the following
program:
public class MyRecursion3 {
public static void main(String[] args){
f("KFU", 2);
}

public static void f(String s, int index){


if (index >= 0) {
System.out.print(s.charAt(index));
f(s, index - 1);
System.out.print(s.charAt(index));
f(s, index - 1);
}
}
}

Figure 1.6.2: Output of the recursive call

Note: The red numbers indicate the order of execution


The output is: UFKKFKKUFKKFKK
The number of generated activation records is 15; it is the same as the number
of generated recursive calls.
Example 2: The Towers of Hanoi problem:
A total of n disks are arranged on a peg A from the largest to the smallest; such
that the smallest is at the top. Two empty pegs B and C are provided. It is
required to move the n disks from peg A to peg C under the following
restrictions:
- Only one disk may be moved at a time.
- A larger disk must not be placed on a smaller disk.
- In the process, any of the three pegs may be used as temporary storage.
Suppose we can solve the problem for n – 1 disks, then to solve for n disks use
the following algorithm:
Move n – 1 disks from peg A to peg B
Move the nth disk from peg A to peg C
Move n – 1 disks from peg B to peg C

Figure 1.6.3: Tower of Hanoi


This translates to the Java method hanoi given below:

import java.io.*;
public class TowersOfHanoi{
public static void main(String[] args) throws IOException
{
BufferedReader stdin =
new BufferedReader(new
InputStreamReader(System.in));
System.out.print("Enter the value of n: " );
int n = Integer.parseInt(stdin.readLine());
hanoi(n, 'A', 'C', 'B');
}

public static void hanoi(int n, char from, char to, char


temp){
if (n == 1)
System.out.println(from + " --------> " + to);
else{
hanoi(n - 1, from, temp, to);
System.out.println(from + " --------> " + to);
hanoi(n - 1, temp, to, from);
}
}
}

As an example, we draw the recursion tree of the method hanoi for n == 3 and
determine the output of the above program.

Figure 1.6.4: Recursive Tree for tower of Hanoi with three disks
Output of the program is:
A -------> C
A -------> B
C -------> B
A -------> C
B -------> A
B -------> C
A -------> C
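The seven moves above follow the general pattern: moving n disks takes 2^n - 1 moves, since each call performs two transfers of n - 1 disks plus one direct move. A small check (a sketch, separate from the program above; HanoiCount is a hypothetical helper):

```java
public class HanoiCount {
    // Counts the moves made by the hanoi method for n disks:
    // two recursive transfers of n - 1 disks plus one direct move.
    public static long moves(int n) {
        if (n == 1)
            return 1;
        return 2 * moves(n - 1) + 1;
    }

    public static void main(String[] args) {
        System.out.println(moves(3)); // 7, matching the output above
    }
}
```

The exponential move count is why the recursion tree doubles in width at every level.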

2.2 Types of Recursive Methods


A recursive method is characterised based on:
- Whether the method calls itself or not (direct or indirect recursion).
- Whether the recursion is nested or not.
- Whether there are pending operations at each recursive call (tail-recursive
or not).
- The shape of the calling pattern -- whether pending operations are also
recursive (linear or tree-recursive).
- Whether the method is excessively recursive or not.

2.2.1 Direct and Indirect Recursive Methods
A method is directly recursive if it contains an explicit call to itself.
long factorial (int x) {
if (x == 0)
return 1;
else
return x * factorial (x - 1);
}

A method x is indirectly recursive if it contains a call to another method which


in turn calls x. They are also known as mutually recursive methods:

public static boolean isEven(int n) {


if (n==0)
return true;
else
return(isOdd(n-1));
}
public static boolean isOdd(int n) {
return (! isEven(n));
}
Another example of mutually recursive methods:

public static double sin(double x){
if(x < 0.0000001)
return x - (x*x*x)/6;
else{
double y = tan(x/3);
return sin(x/3)*((3 - y*y)/(1 + y*y));
}
}

public static double tan(double x){


return sin(x)/cos(x);
}

public static double cos(double x){


double y = sin(x);
return Math.sqrt(1 - y*y);
}

2.2.2 Nested and Non-Nested Recursive Methods


Nested recursion occurs when a method is not only defined in terms of itself,
but a recursive call is also used as one of the arguments to another recursive call.
Example: The Ackermann function

public static long Ackmn(long n, long m){


if (n == 0)
return m + 1;
else if (n > 0 && m == 0)
return Ackmn(n - 1, 1);
else
return Ackmn(n - 1, Ackmn(n, m - 1));
}

The Ackermann function grows faster than a multiple exponential function.
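To get a feel for the definition, we can evaluate it for small arguments (the method is reproduced here so the sketch is self-contained; for n up to 3 the known closed forms are m + 2, 2m + 3 and 2^(m+3) - 3):

```java
public class AckermannDemo {
    public static long ackmn(long n, long m) {
        if (n == 0)
            return m + 1;
        else if (n > 0 && m == 0)
            return ackmn(n - 1, 1);
        else
            return ackmn(n - 1, ackmn(n, m - 1)); // nested recursive call
    }

    public static void main(String[] args) {
        System.out.println(ackmn(1, 3)); // 5  (m + 2)
        System.out.println(ackmn(2, 3)); // 9  (2m + 3)
        System.out.println(ackmn(3, 3)); // 61 (2^(m+3) - 3)
    }
}
```

Beyond n = 3 the values explode; ackmn(4, 2) already has 19,729 decimal digits, so do not try larger arguments with this sketch.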

2.2.3 Tail and Non-Tail Recursive Methods
A method is tail recursive if in each of its recursive cases it executes one
recursive call and if there are no pending operations after that call.
Example 1:
public static void f1(int n){
System.out.print(n + " ");
if(n > 0)
f1(n - 1);
}

Example 2:
public static void f3(int n){
if(n > 6){
System.out.print(2*n + " ");
f3(n - 2);
} else if(n > 0){
System.out.print(n + " ");
f3(n - 1);
}
}
The following are examples of non-tail recursive methods.
Example 1:
public static void f4(int n){
if (n > 0)
f4(n - 1);
System.out.print(n + " ");
}

After each recursive call there is a pending System.out.print(n + " ") operation.
Example 2:
long factorial(int x) {
if (x == 0)
return 1;
else
return x * factorial(x - 1);
}
After each recursive call there is a pending * operation.

2.2.3.1 Converting tail-recursive method to iterative


It is easy to convert a tail recursive method into an iterative one:

Tail recursive method:

public static void f1(int n) {
System.out.print(n + " ");
if (n > 0)
f1(n - 1);
}

public static void f3(int n) {
if (n > 6) {
System.out.print(2*n + " ");
f3(n - 2);
} else if (n > 0) {
System.out.print(n + " ");
f3(n - 1);
}
}

Corresponding iterative method:

public static void f1(int n) {
for (int k = n; k >= 0; k--)
System.out.print(k + " ");
}

public static void f3(int n) {
while (n > 0) {
if (n > 6) {
System.out.print(2*n + " ");
n = n - 2;
} else if (n > 0) {
System.out.print(n + " ");
n = n - 1;
}
}
}

2.2.3.2 Why tail recursion?


It is desirable to have tail-recursive methods, because:
1. The amount of information that gets stored during computation is
independent of the number of recursive calls.
2. Some compilers can produce optimized code that replaces tail recursion
by iteration (saving the overhead of the recursive calls).
3. Tail recursion is important in languages like Prolog and Functional
languages like Clean, Haskell, Miranda, and SML that do not have
explicit loop constructs (loops are simulated by recursion).

2.2.4 Converting non-tail to tail recursive method


A non-tail recursive method can often be converted to a tail-recursive method
by means of an "auxiliary" parameter. This parameter is used to form the result.
The idea is to attempt to incorporate the pending operation into the auxiliary
parameter in such a way that the recursive call no longer has a pending
operation. The technique is usually used in conjunction with an "auxiliary"

method. This is simply to keep the syntax clean and to hide the fact that
auxiliary parameters are needed.
Example 1: Converting non-tail recursive factorial to tail-recursive factorial
long factorial (int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial (n - 1);
}

We introduce an auxiliary parameter result and initialise it to 1. The parameter
result keeps track of the partial computation of n!:
public long tailRecursiveFact (int n) {
    return factAux(n, 1);
}
private long factAux (int n, long result) {
    if (n == 0)
        return result;
    else
        return factAux(n - 1, n * result);
}
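To check the conversion, a small self-contained sketch can run both versions side by side; the class name TailFactDemo is ours, and the auxiliary parameter is declared long here so large intermediate products are not truncated:

```java
public class TailFactDemo {
    // Non-tail version: a multiplication is pending after each recursive call.
    public static long factorial(int n) {
        if (n == 0) return 1;
        else return n * factorial(n - 1);
    }

    // Tail version: the partial product is carried in the result parameter.
    public static long factAux(int n, long result) {
        if (n == 0) return result;
        else return factAux(n - 1, n * result);
    }

    public static long tailRecursiveFact(int n) {
        return factAux(n, 1);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 15; n++)
            if (factorial(n) != tailRecursiveFact(n))
                throw new AssertionError("mismatch at n = " + n);
        System.out.println(tailRecursiveFact(10)); // 3628800
    }
}
```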
Example 2: Converting non-tail recursive method fib to tail-recursive fib
The fibonacci sequence is:
0 1 1 2 3 5 8 13 21 . . .
Each term except the first two is a sum of the previous two terms.
int fib(int n){
    if (n == 0 || n == 1)
        return n;
    else
        return fib(n - 1) + fib(n - 2);
}
Because there are two recursive calls, a tail-recursive fibonacci method can be
implemented by using two auxiliary parameters for accumulating results:

int fib (int n) {
    return fibAux(n, 1, 0);
}
int fibAux (int n, int next, int result) {
    if (n == 0)
        return result;
    else
        return fibAux(n - 1, next + result, next);
}
Figure 1.6.5: Recursive tree for method fib(4)
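Running the two versions against each other confirms the conversion; the class name FibDemo is ours, added for illustration:

```java
public class FibDemo {
    // Tree-recursive version: two recursive calls, one of them pending.
    public static int fib(int n) {
        if (n == 0 || n == 1) return n;
        else return fib(n - 1) + fib(n - 2);
    }

    // Tail-recursive version: next and result hold two consecutive
    // Fibonacci numbers, so nothing is pending after the call.
    public static int fibAux(int n, int next, int result) {
        if (n == 0) return result;
        else return fibAux(n - 1, next + result, next);
    }

    public static int fibTail(int n) {
        return fibAux(n, 1, 0);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 15; n++)
            if (fib(n) != fibTail(n))
                throw new AssertionError("mismatch at n = " + n);
        System.out.println(fibTail(10)); // 55
    }
}
```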

2.2.5 Linear and Tree Recursive Methods


Another way to characterise recursive methods is by the way in which the
recursion grows. The two basic ways are "linear" and "tree." A recursive
method is said to be linearly recursive when no pending operation involves
another recursive call to the method. For example, the factorial method is
linearly recursive. The pending operation is simply multiplication by a variable;
it does not involve another call to factorial.
long factorial (int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial (n - 1);
}
A recursive method is said to be tree recursive when the pending operation
involves another recursive call. The Fibonacci method fib provides a classic
example of tree recursion.
int fib(int n){
    if (n == 0 || n == 1)
        return n;
    else
        return fib(n - 1) + fib(n - 2);
}

2.2.6 Excessive Recursion
A recursive method is excessively recursive if it repeats computations for some
parameter values. Example: The call fib(6) results in two repetitions of fib(4).
This in turn results in repetitions of fib(3), fib(2), fib(1) and fib(0):

Figure 1.6.6: Recursive tree for call to fib(6)

In-text Question 2
Nested recursion occurs when a method is only defined in terms of itself but it is not used as
one of the parameters. (True/False?)

Answer
False

2.3 More on Recursion


2.3.1 Recursion vs. Iteration
In general, an iterative version of a method will execute more efficiently, in
terms of time and space, than a recursive version. This is because the overhead
of entering and exiting a method, with the associated pushing and popping of
activation records on the call stack, is avoided in the iterative version.
Sometimes we are forced to use iteration because the stack cannot hold enough
activation records, for example power(2, 5000).

2.3.2 Why Recursion?
Usually, recursive algorithms have less code, so they can be easier to write and
understand, e.g. Towers of Hanoi. However, avoid excessively recursive
algorithms even if the code is simple. Sometimes recursion provides a much
simpler solution, and obtaining the same result using iteration requires
complicated coding, e.g. Quicksort and Towers of Hanoi. Recursive methods
provide a very natural mechanism for processing recursive data structures. A
recursive data structure is a data structure that is defined recursively, e.g. a
tree. Functional programming languages such as Clean, FP, Haskell, Miranda,
and SML do not have explicit loop constructs; in these languages, looping is
achieved by recursion.
Some recursive algorithms are more efficient than equivalent iterative
algorithms.
Example:

public static long power1 (int x, int n) {
    long product = 1;
    for (int i = 1; i <= n; i++)
        product *= x;
    return product;
}

public static long power2 (int x, int n) {
    if (n == 1) return x;
    else if (n == 0) return 1;
    else {
        long t = power2(x, n / 2);
        if ((n % 2) == 0) return t * t;
        else return x * t * t;
    }
}
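To see the efficiency difference, a small instrumented sketch counts the recursive calls made by power2; the class name PowerDemo and the call counter are ours, added for illustration:

```java
public class PowerDemo {
    public static int calls; // counts invocations of power2 (our instrumentation)

    // O(n) multiplications: one per loop iteration.
    public static long power1(int x, int n) {
        long product = 1;
        for (int i = 1; i <= n; i++) product *= x;
        return product;
    }

    // O(log n) recursive calls: the exponent is halved each time.
    public static long power2(int x, int n) {
        calls++;
        if (n == 1) return x;
        else if (n == 0) return 1;
        else {
            long t = power2(x, n / 2);
            if ((n % 2) == 0) return t * t;
            else return x * t * t;
        }
    }

    public static void main(String[] args) {
        System.out.println(power1(2, 10)); // 1024
        calls = 0;
        System.out.println(power2(2, 10)); // 1024
        System.out.println(calls);         // 4 calls (n = 10, 5, 2, 1)
    }
}
```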
2.3.3 Common Errors in Writing Recursive Methods
We highlight some errors that could occur when writing recursive methods.
1. Non-terminating Recursive Methods (Infinite recursion)
(a) No base case.
int badFactorial(int x) {
return x * badFactorial(x-1);
}
(b) The base case is never reached for some parameter values.

int anotherBadFactorial(int x) {
    if(x == 0)
        return 1;
    else
        return x * (x - 1) * anotherBadFactorial(x - 2);
    // When x is odd, we never reach the base case!!
}

2. Post-increment and post-decrement operators must not be used on an
argument: the recursive call receives the old value of the variable, so the
recursion never advances, resulting in infinite recursion.

public static int sumArray (int[ ] x, int index) {
    if (index == x.length) return 0;
    else
        return x[index] + sumArray (x, index++);
}
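A corrected sketch passes index + 1 instead, which leaves the caller's variable alone and still advances the recursion; the class name SumArrayDemo is ours:

```java
public class SumArrayDemo {
    // Corrected version: index + 1 computes a new value for the callee
    // without touching the caller's copy, so each call moves one slot forward.
    public static int sumArray(int[] x, int index) {
        if (index == x.length) return 0;
        else return x[index] + sumArray(x, index + 1);
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 4, 1, 5};
        System.out.println(sumArray(a, 0)); // 14
    }
}
```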

3. Local variables must not be used to accumulate the result of a recursive
method. Each recursive call has its own copy of local variables.

public static int sumArray (int[ ] x, int index) {
    int sum = 0;
    if (index == x.length) return sum;
    else {
        sum += x[index];
        return sumArray(x, index + 1);
    }
}

4. Wrong placement of the return statement.
Consider the following method that is supposed to calculate the sum of
the first n integers:

public static int sum (int n, int result) {
    if (n >= 0)
        sum(n - 1, n + result);
    return result;
}
When result is initialised to 0, the method returns 0 for any value of the
parameter n: the value returned is that of the final return statement executed,
which belongs to the outermost call. Example: A trace of the call sum(3, 0) is:

Figure 1.6.7: A trace of the call sum(3, 0)

A correct version of the method is:

public static int sum(int n, int result){
    if (n == 0)
        return result;
    else
        return sum(n - 1, n + result);
}

Example: A trace of the call sum (3, 0) is:

Figure 1.6.8: A trace of the call sum (3, 0)

5. The use of instance or static variables in recursive methods should be
avoided. Although it is not an error, it is bad programming practice. These
variables may be modified by code outside the method and cause the
recursive method to return a wrong result.

public class Sum{
    private int sum;
    public int sumArray(int[ ] x, int index){
        if(index == x.length)
            return sum;
        else {
            sum += x[index];
            return sumArray(x, index + 1);
        }
    }
}
3.0 Tutor Marked Assignments (Individual or Group)
1. Explain what is meant by
a. base case.
b. general (or recursive) case.
c. indirect recursion.

Use the following method in answering Exercises 2 and 3:

int puzzle(int base, int limit)
{
    if (base > limit)
        return -1;
    else if (base == limit)
        return 1;
    else
        return base * puzzle(base + 1, limit);
}
2. Identify
a. the base case(s) of method puzzle.

b. the general case(s) of method puzzle.

3. Show what would be written by the following calls to the recursive
method puzzle.
a. System.out.println(puzzle (14, 10));
b. System.out.println(puzzle (4, 7));
c. System.out.println(puzzle (0, 0));

4. For each of the following recursive methods, identify the base and
general cases and explain what the method does.
a.
int power(int base, int exponent)
{
    if (exponent == 0)
        return 1;
    else
        return (base * power(base, exponent - 1));
}
b.

int recur(int n)
{
    if (n < 0)
        return -1;
    else if (n < 10)
        return 1;
    else
        return (1 + recur(n / 10));
}

c.
int recur2(int n)
{
    if (n < 0)
        return -1;
    else if (n < 10)
        return n;
    else
        return (n % 10) + recur2(n / 10);
}

4.0 Conclusion/Summary
In this study session, you learnt that a recursive method is a method that calls
itself either directly or indirectly. You were shown how recursion works. You
also learnt how to make and trace recursive calls. The different types of
recursive methods, why we need recursion, and some common errors that occur
when writing recursive methods were studied in the concluding part of the
study session. In the next study session, we will study how to analyse recursive
algorithms.

5.0 Self-Assessment Questions


1. Non-terminating recursive methods cannot result in infinite recursion.
(True or False)?
2. When is a method said to be directly recursive?

6.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube http://bit.ly/30wvulo , http://bit.ly/2PeeWO5 , http://bit.ly/33U1HoZ ,
http://bit.ly/2U3P5a2 , http://bit.ly/2U97Bhp , http://bit.ly/2ZgAYnZ. Watch the videos and
summarise them in 1 paragraph.
b. View the animation on http://bit.ly/2U5iHEj and critique it in the discussion
forum
c. Take a walk and engage any 3 students on how to trace recursive calls; in 2
paragraphs, summarise their opinion of the discussed topic.

7.0 Self-Assessment Question Answers


1. False
2. A method is directly recursive if it contains an explicit call to itself.

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 7
Analysis of Recursive Algorithms
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a recurrence relation?
2.2 - Forming Recurrence Relations
2.3 - Solving Recurrence Relations
2.4 - Analysis of Recursive Factorial method
2.5 - Analysis of Recursive Selection Sort
2.6 - Analysis of Recursive Binary Search
2.7 - Analysis of Recursive Towers of Hanoi Algorithm
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 In-text Question Answers
8.0 Self-Assessment Question Answers
9.0 References/Further Readings

Introduction:
In the last study session of this module, you will study how to analyse recursive
algorithms. In particular, we will introduce recurrence relations, how to form
and solve them. We conclude the session by analysing the algorithm of some
common recursive problems.

1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Define and form recurrence relations
2. Solve recurrence relations
3. Analyse recursive algorithms

2.0 Main Content


2.1 What is a recurrence relation?
A recurrence relation, T(n), is a recursive function of an integer variable n. Like
all recursive functions, it has both a recursive case and a base case.

Example:

T(0) = a                 if n = 0
T(n) = b + T(n - 1)      if n > 0

The portion of the definition that does not contain T is called the base case of
the recurrence relation; the portion that contains T is called the recurrent or
recursive case. Recurrence relations are useful for expressing the running times
(i.e., the number of basic operations executed) of recursive algorithms.
2.2 Forming Recurrence Relations
For a given recursive method, the base case and the recursive case of its
recurrence relation correspond directly to the base case and the recursive case of
the method.
Example 1: Write the recurrence relation for the following method.
public void f (int n) {
if (n > 0) {
System.out.println(n);
f(n-1);
}
}
The base case is reached when n == 0. The method performs one comparison.
Thus, the number of operations when n == 0, T(0), is some constant a.
When n > 0, the method performs two basic operations and then calls itself,
using ONE recursive call, with a parameter n – 1.
Therefore, the recurrence relation is:

T(0) = a                 if n = 0
T(n) = b + T(n - 1)      if n > 0

2.3 Solving Recurrence Relations


To solve a recurrence relation T(n) we need to derive a form of T(n) that is not a
recurrence relation. Such a form is called a closed form of the recurrence
relation.
There are four methods to solve recurrence relations that represent the running
time of recursive methods:
- Iteration method (unrolling and summing)
- Substitution method
- Recursion tree method
- Master method
In this course, we will only use the Iteration method.
In-text Question 1
A recurrence relation has only a recursive case but not a base case. (True or False)?

Answer
False

Iterative Method
Steps:
- Expand the recurrence
- Express the expansion as a summation by plugging the recurrence back
into itself until you see a pattern.

- Evaluate the summation
In evaluating the summation, one or more of the following summation formulae
may be used:
Arithmetic series:
1 + 2 + 3 + . . . + n = n(n + 1)/2

Geometric series:
1 + x + x^2 + . . . + x^n = (x^(n+1) - 1)/(x - 1), for x ≠ 1

Special cases of the geometric series (e.g. x = 2):
1 + 2 + 2^2 + . . . + 2^n = 2^(n+1) - 1

Harmonic series:
1 + 1/2 + 1/3 + . . . + 1/n ≈ ln n
Others:

Let us now look at how to analyse some of the recursive methods we previously
discussed.
2.4 Analysis of Recursive Factorial method
Example1: Form and solve the recurrence relation for the running time of
factorial method and hence determine its big-O complexity:

long factorial (int n) {
    if (n == 0)
        return 1;
    else
        return n * factorial (n - 1);
}
T(0) = c
T(n) = b + T(n - 1)
     = b + b + T(n - 2)
     = b + b + b + T(n - 3)
     = . . .
     = kb + T(n - k)
When k = n, we have:
T(n) = nb + T(n - n)
     = bn + T(0)
     = bn + c

Therefore, method factorial is O(n).
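The linear call count can be checked empirically with a call counter; the class name FactCountDemo and the counter are ours, added for illustration:

```java
public class FactCountDemo {
    public static int calls; // counts invocations of factorial (our instrumentation)

    public static long factorial(int n) {
        calls++;
        if (n == 0) return 1;
        else return n * factorial(n - 1);
    }

    public static void main(String[] args) {
        calls = 0;
        factorial(10);
        // One call per value of n from 10 down to 0: n + 1 calls in total,
        // matching the linear closed form T(n) = bn + c.
        System.out.println(calls); // 11
    }
}
```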

2.5 Analysis of Recursive Selection Sort
public static void selectionSort(int[] x) {
    selectionSort(x, x.length - 1);
}

private static void selectionSort(int[] x, int n) {
    int minPos;
    if (n > 0) {
        minPos = findMinPos(x, n);
        swap(x, minPos, n);
        selectionSort(x, n - 1);
    }
}

private static int findMinPos (int[] x, int n) {
    int k = n;
    for(int i = 0; i < n; i++)
        if(x[i] < x[k]) k = i;
    return k;
}

private static void swap(int[] x, int minPos, int n) {
    int temp = x[n]; x[n] = x[minPos]; x[minPos] = temp;
}

The findMinPos method is O(n), and swap is O(1), therefore the recurrence
relation for the running time of the selectionSort method is:
T(0) = a
T(n) = T(n - 1) + n + c, for n > 0
     = [T(n - 2) + (n - 1) + c] + n + c = T(n - 2) + (n - 1) + n + 2c
     = [T(n - 3) + (n - 2) + c] + (n - 1) + n + 2c = T(n - 3) + (n - 2) + (n - 1) + n + 3c
     = T(n - 4) + (n - 3) + (n - 2) + (n - 1) + n + 4c
     = . . .
     = T(n - k) + (n - k + 1) + (n - k + 2) + . . . + n + kc
When k = n, we have:
T(n) = T(0) + 1 + 2 + . . . + n + nc
     = a + n(n + 1)/2 + cn

Therefore, Recursive Selection Sort is O(n²).

2.6 Analysis of Recursive Binary Search
public int binarySearch (int target, int[] array, int low, int high) {
    if (low > high)
        return -1;
    else {
        int middle = (low + high)/2;
        if (array[middle] == target)
            return middle;
        else if (array[middle] < target)
            return binarySearch(target, array, middle + 1, high);
        else
            return binarySearch(target, array, low, middle - 1);
    }
}

The recurrence relation for the running time of the method is:
T(1) = a, if n = 1 (one-element array)
T(n) = T(n / 2) + b, if n > 1
Expanding:
T(n) = T(n / 2) + b
     = [T(n / 4) + b] + b = T(n / 2^2) + 2b
     = [T(n / 8) + b] + 2b = T(n / 2^3) + 3b
     = . . .
     = T(n / 2^k) + kb
When n / 2^k = 1, that is n = 2^k and k = log2 n, we have:
T(n) = T(1) + b log2 n
     = a + b log2 n
Therefore, Recursive Binary Search is O(log n)
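The logarithmic call count can be observed directly by instrumenting the method; the class name BinSearchDemo and the counter are ours, added for illustration:

```java
public class BinSearchDemo {
    public static int calls; // counts recursive invocations (our instrumentation)

    public static int binarySearch(int target, int[] array, int low, int high) {
        calls++;
        if (low > high) return -1;
        int middle = (low + high) / 2;
        if (array[middle] == target) return middle;
        else if (array[middle] < target)
            return binarySearch(target, array, middle + 1, high);
        else
            return binarySearch(target, array, low, middle - 1);
    }

    public static void main(String[] args) {
        int[] a = new int[1024];                  // sorted array 0..1023
        for (int i = 0; i < a.length; i++) a[i] = i;
        calls = 0;
        int pos = binarySearch(1023, a, 0, a.length - 1);
        System.out.println(pos);   // 1023
        System.out.println(calls); // 11, i.e. log2(1024) + 1 halvings
    }
}
```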

In-text Question 2
To solve a recurrence relation T(n) we need to derive a closed form of the recurrence
relation. (True or False)?

Answer
True

2.7 Analysis of Recursive Towers of Hanoi Algorithm
public static void hanoi(int n, char from, char to, char temp){
    if (n == 1)
        System.out.println(from + " --------> " + to);
    else {
        hanoi(n - 1, from, temp, to);
        System.out.println(from + " --------> " + to);
        hanoi(n - 1, temp, to, from);
    }
}

The recurrence relation for the running time of the method hanoi is:
T(n) = a, if n = 1
T(n) = 2T(n - 1) + b, if n > 1
Expanding:
T(n) = 2T(n - 1) + b
     = 2[2T(n - 2) + b] + b = 2^2 T(n - 2) + 2b + b
     = 2^2 [2T(n - 3) + b] + 2b + b = 2^3 T(n - 3) + 2^2 b + 2b + b
     = 2^3 [2T(n - 4) + b] + 2^2 b + 2b + b = 2^4 T(n - 4) + 2^3 b + 2^2 b + 2^1 b + 2^0 b
     = . . .
     = 2^k T(n - k) + b[2^(k-1) + 2^(k-2) + . . . + 2^1 + 2^0]
When k = n - 1, we have:
T(n) = 2^(n-1) T(1) + b(2^(n-1) - 1) = 2^(n-1) a + b(2^(n-1) - 1)

Therefore, the method hanoi is O(2^n)
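Counting the moves confirms the exponential growth; the class name HanoiDemo and the move counter are ours, and the printing is replaced by counting so that large n stays quiet:

```java
public class HanoiDemo {
    public static long moves; // counts disk moves (our instrumentation)

    // Same structure as the hanoi method above, with the println replaced
    // by a counter increment.
    public static void hanoi(int n, char from, char to, char temp) {
        if (n == 1) {
            moves++;
        } else {
            hanoi(n - 1, from, temp, to);
            moves++;
            hanoi(n - 1, temp, to, from);
        }
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++) {
            moves = 0;
            hanoi(n, 'A', 'C', 'B');
            System.out.println(n + " disks: " + moves + " moves"); // 2^n - 1
        }
    }
}
```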

3.0 Tutor Marked Assignments (Individual or Group)

1. Consider the following recursive algorithm for computing the sum of the
first n cubes: S(n) = 13 + 23 + . . . + n3.
ALGORITHM S(n)

//Input: A positive integer n
//Output: The sum of the first n cubes
if n = 1 return 1
else return S(n − 1) + n ∗ n ∗ n
a. Set up and solve a recurrence relation for the number of times the
algorithm‟s basic operation is executed.
b. How does this algorithm compare with the straightforward
nonrecursive algorithm for computing this sum?

2. Consider the following recursive algorithm.


ALGORITHM Q(n)
//Input: A positive integer n
if n = 1 return 1
else return Q(n − 1) + 2 ∗ n – 1
a. Set up a recurrence relation for this function‟s values and solve it to
determine what this algorithm computes.
b. Set up a recurrence relation for the number of multiplications made by
this algorithm and solve it.
c. Set up a recurrence relation for the number of additions/subtractions
made by this algorithm and solve it.

3. Consider the following recursive algorithm.


ALGORITHM Riddle(A[0..n − 1])
//Input: An array A[0..n − 1] of real numbers
if n = 1 return A[0]
else temp←Riddle(A[0..n − 2])
if temp ≤ A[n − 1] return temp
else return A[n − 1]
a. What does this algorithm compute?
b. Set up a recurrence relation for the algorithm‟s basic operation
count and solve it.

4.0 Conclusion/Summary
In the last study session of this module, you have studied how to analyse
recursive algorithms. You learnt about recurrence relations, how to form and
solve them. We concluded the study session by analysing the algorithm of some
common recursive problems.

5.0 Self-Assessment Questions


1. Master method is a method of solving recurrence relations. (True or
False)?
2. We discussed the iterative approach to solving recurrence relations. List
three other methods that you know.

6.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube http://bit.ly/2MDl2oP , http://bit.ly/2zocdH4 , http://bit.ly/2ZwIc6k ,
http://bit.ly/2PdKL9C , http://bit.ly/2Zialiq , http://bit.ly/2L6Tg0Q . Watch the videos and
summarise them in 1 paragraph.
b. View the animation on http://bit.ly/31XCPL4 and critique it in the discussion
forum
c. Take a walk and engage any 3 students on the iterative approach to solving
recurrence relations; in 2 paragraphs, summarise their opinion of the discussed
topic.

7.0 Self-Assessment Question Answers

1. True
2.
- Substitution method
- Recursion tree method
- Master method

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

MODULE 2
Trees
Contents:
Study Session 1: Tree
Study Session 2: Binary Search Tree
Study Session 3: Tree Traversal
Study Session 4: Binary Heap
Study Session 5: AVL Tree
Study Session 6: B-Tree

STUDY SESSION 1
Tree
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a Tree?
2.2 - Tree terminology
2.3 - Why trees?
2.4 - General Trees and its Implementation
2.4.1 - N-ary Trees
2.4.2 - N-ary Trees Implementation
2.5 - Binary trees
2.6 - Binary tree implementation
2.7 - Application of Binary trees
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)

7.0 In-text Question Answers
8.0 Self-Assessment Question Answers
9.0 References/Further Readings

Introduction:
In the first study session of this module, we shall introduce trees, which are used
to define a parent-child relationship. You will learn more about what trees are as
well as some tree terminologies in this session. We will also look at how to
implement trees. We shall conclude the session by introducing binary trees, its
implementation and applications.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. give a basic definition of a tree
2. outline and explain tree terminologies
3. explain why we need to use trees
4. implement trees
5. describe binary tree and its implementation
6. outline applications of binary trees

2.0 Main Content


2.1 What is a Tree?
A tree is a finite set of nodes together with a finite set of directed edges that
define parent-child relationships. Each directed edge connects a parent to its
child.
Example:
Nodes = {A, B, C, D, E, F, G, H}
Edges = {(A,B), (A,E), (B,F), (B,G), (B,H), (E,C), (E,D)}

Figure 2.1.1: Example of a tree

A directed path from node m1 to node mk is a list of nodes m1, m2, . . . , mk
such that each node is the parent of the next node in the list. The length of such
a path is k - 1.
Example: In figure 2.1.1, A, E, C is a directed path of length 2.
A tree satisfies the following properties:
1. It has one designated node, called the root, which has no parent.
2. Every node, except the root, has exactly one parent.
3. A node may have zero or more children.
4. There is a unique directed path from the root to each node.

2.2 Tree Terminology
We outline some tree terminologies in the section as follow:
Ordered tree: A tree in which the children of each node are linearly ordered
(usually from left to right).
Ancestor of a node v: Any node, including v itself, on the path from the root to
the node.
Proper ancestor of a node v: Any node, excluding v, on the path from the root
to the node.

Figure 2.1.2(a): Some tree terminologies


Descendant of a node v: Any node, including v itself, on any path from the
node to a leaf node (i.e., a node with no children).
Proper descendant of a node v: Any node, excluding v, on any path from the
node to a leaf node.
Subtree of a node v: A tree rooted at a child of v.

Figure 2.1.2(b): Some tree terminologies

Figure 2.1.2(c): Some tree terminologies

Figure 2.1.2(d): Some tree terminologies

Degree: The degree of a node is the number of subtrees of that node.
In the ordered tree in figure 2.1.2(e):
- Each of nodes D and B has degree 1.
- Each of nodes A and E has degree 2.
- Node C has degree 3.
- Each of nodes F, G, H, I and J has degree 0.
Leaf: A leaf node is a node with degree 0.
Internal or interior node: a node with degree greater than 0 is called an internal
or interior node.
Siblings: Nodes that have the same parent are known as siblings.
Size: The number of nodes in a tree is its size.

Figure 2.1.2(e): An Ordered Tree with a Size of 10

Level (or depth) of a node v: The level of a node is the length of the path from
the root to v.
Height of a node v: The height of a node is the length of the longest path from v
to a leaf node.
- The height of a tree is the height of its root node.
- By definition, the height of an empty tree is -1.

Figure 2.1.2(f): Some tree terminologies


The height of the tree in figure 2.1.2(f) is 4 while the height of node C is 3.

2.3 Why trees?


Trees are very important data structures in computing. They are suitable for:
1. Hierarchical structure representation, e.g.,
- File directory.
- Organisational structure of an institution.
- Class inheritance tree.
2. Problem representation, e.g.,
- Expression tree.
- Decision tree.
3. Efficient algorithmic solutions, e.g.,
- Search trees.
- Efficient priority queues via heaps.

2.4 General Trees and its Implementation
In a general tree, there is no limit to the number of children that a node can
have. In representing a general tree by linked lists:
- Each node has a linked list of the subtrees of that node.
- Each element of the linked list is a subtree of the current node
public class GeneralTree extends AbstractContainer {
protected Object key ;
protected int degree ;
protected MyLinkedList list ;
// . . .
}

2.4.1 N-ary Trees


An N-ary tree is an ordered tree that is either:
1. Empty, or
2. It consists of a root node and at most N non-empty N-ary subtrees.
It follows that the degree of each node in an N-ary tree is at most N.
Example of N-ary trees:

Figure 2.1.3: Examples of N-ary trees


In-text Question 1
What is a Tree?

Answer
A tree is a finite set of nodes together with a finite set of directed edges that define parent-
child relationships.

2.4.2 N-ary Trees Implementation
The code fragment below shows the implementation of the N-ary tree.

public class NaryTree extends AbstractTree {
    protected Object key ;
    protected int degree ;
    protected NaryTree[ ] subtree ;

    public NaryTree(int degree){
        key = null ;
        this.degree = degree ;
        subtree = null ;
    }

    public NaryTree(int degree, Object key){
        this.key = key ;
        this.degree = degree ;
        subtree = new NaryTree[degree] ;
        for(int i = 0; i < degree; i++)
            subtree[i] = new NaryTree(degree);
    }
    // . . .
}

2.5 Binary Trees


A binary tree is an N-ary tree for which N = 2. Thus, a binary tree is either:
- An empty tree, or
- A tree consisting of a root node and at most two non-empty binary
subtrees.
Example:

Figure 2.1.4: A binary tree


A full binary tree is either an empty binary tree or a binary tree in which each
level k, k ≥ 0, has 2^k nodes.

Figure 2.1.5: Example showing the growth of a full binary tree
A complete binary tree is either an empty binary tree or a binary tree in which:
1. Each level k, k ≥ 0, other than the last level contains the maximum
number of nodes for that level, that is 2^k.
2. The last level may or may not contain the maximum number of nodes.
3. If a slot with a missing node is encountered when scanning the last level
in a left to right direction, then all remaining slots in the level must be
empty.
Thus, every full binary tree is a complete binary tree, but the opposite is not
true.

Figure 2.1.5: Example showing the growth of a complete binary tree

2.6 Binary Tree Implementation


In our binary tree implementation, each node has a key representing the
node itself, together with a left and a right subtree.

We present the implementation of a binary tree by extending some of our
existing classes. But before that, we present figure 2.1.6 below, which is a binary
tree representing the expression a + (b - c) * d.

Figure 2.1.6: A binary tree representing a + (b - c) * d

In-text Question 2
Every full binary tree is a complete binary tree, but the opposite is not true. (True or False)?

Answer
True

public class BinaryTree extends AbstractTree {
    protected Object key ;
    protected BinaryTree left, right ;

    public BinaryTree(Object key, BinaryTree left, BinaryTree right){
        this.key = key ;
        this.left = left ;
        this.right = right ;
    }
    public BinaryTree( ) {
        this(null, null, null) ;
    }
    public BinaryTree(Object key){
        this(key, new BinaryTree( ), new BinaryTree( ));
    }
    // . . .
}

public boolean isEmpty( ){ return key == null ; }

public boolean isLeaf( ){
    return ! isEmpty( ) && left.isEmpty( ) && right.isEmpty( ) ;
}

public Object getKey( ){
    if(isEmpty( )) throw new InvalidOperationException( ) ;
    else return key ;
}

public int getHeight( ){
    if(isEmpty( )) return -1 ;
    else return 1 + Math.max(left.getHeight( ), right.getHeight( )) ;
}

public void attachKey(Object obj){
    if(! isEmpty( )) throw new InvalidOperationException( ) ;
    else {
        key = obj ;
        left = new BinaryTree( ) ;
        right = new BinaryTree( ) ;
    }
}

public Object detachKey( ){
    if(! isLeaf( )) throw new InvalidOperationException( ) ;
    else {
        Object obj = key ;
        key = null ;
        left = null ;
        right = null ;
        return obj ;
    }
}

public BinaryTree getLeft( ){
    if(isEmpty( )) throw new InvalidOperationException( ) ;
    else return left ;
}

public BinaryTree getRight( ){
    if(isEmpty( )) throw new InvalidOperationException( ) ;
    else return right ;
}
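As a usage sketch, the expression tree of figure 2.1.6 can be built and its height computed; this standalone Node class is ours for illustration and is not the course's BinaryTree class:

```java
public class ExprTreeDemo {
    // Minimal illustrative node type (hypothetical, not the course's BinaryTree).
    static class Node {
        String key;
        Node left, right;
        Node(String key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
        Node(String key) { this(key, null, null); }
    }

    // Height as defined in the text: an empty tree has height -1.
    public static int height(Node t) {
        if (t == null) return -1;
        return 1 + Math.max(height(t.left), height(t.right));
    }

    public static Node buildExpressionTree() {
        // a + (b - c) * d
        Node minus = new Node("-", new Node("b"), new Node("c"));
        Node times = new Node("*", minus, new Node("d"));
        return new Node("+", new Node("a"), times);
    }

    public static void main(String[] args) {
        System.out.println(height(buildExpressionTree())); // 3
    }
}
```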

2.7 Application of Binary trees


Binary trees have many important uses. Two examples are:
1. Binary decision trees.
- Internal nodes are conditions. Leaf nodes denote decisions.
2. Expression trees.

3.0 Tutor Marked Assignments (Individual or Group)


1. True or False: Not all trees are binary trees.
2. In a complete binary tree with 20 nodes, and the root considered to be at
level 0, how many nodes are there at level 4?

3. Answer the following questions about treeA.


a. What are the ancestors of node P?

b. What are the descendants of node K?
c. What is the maximum possible number of nodes at the level of node
W?
d. What is the maximum possible number of nodes at the level of node
N?

4. Answer the following questions about treeB.


a. What is the height of the tree?
b. What nodes are on level 3?
c. Which levels have the maximum number of nodes that they could
contain?
5. Mention three application of general trees and two applications of binary
trees.

4.0 Conclusion/Summary
In this study session, you learnt about trees and tree terminologies. We also
studied how to implement trees. We concluded the session by introducing
binary trees, its implementation and applications. In the next study session, we
will learn more on binary search trees.

5.0 Self-Assessment Questions
1. What do you understand by the term degree of a node?
2. What is a Leaf node?

6.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube http://bit.ly/33SfqfQ , http://bit.ly/33ZQbbC , http://bit.ly/2U5YIW4 ,
http://bit.ly/2Pbi90H. Watch the videos and summarise them in 1 paragraph.


b. View the animation on http://bit.ly/2PbLkk5 and critique it in
the discussion forum
c. Take a walk and engage any 3 students on the relationship between a tree and
a binary tree; in 2 paragraphs, summarise their opinion of the discussed topic.

7.0 Self-Assessment Question Answers


1. The degree of a node is the number of subtrees of that node.
2. A leaf node is a node with degree 0.

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 20012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

150
STUDY SESSION 2
Binary Search Tree
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a Binary search tree?
2.2 - Why Binary search trees?
2.3 - Binary search tree implementation
2.4 - Insertion in a BST
2.5 - Deletion from a BST
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In this study session, we shall concentrate on binary search trees and why we
need to use them. We shall also study their implementation, and conclude the
session by looking at some of the operations on a binary search tree.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Describe a binary search tree and why we need to use them.
2. Implement a binary search tree
3. Describe with examples operations on binary search trees.

2.0 Main Content
2.1 What is a Binary search tree?
A binary search tree (BST) is a binary tree that is empty or that satisfies the
BST ordering property:
1. The key of each node is greater than each key in the left subtree, if any, of
the node.
2. The key of each node is less than each key in the right subtree, if any, of
the node.
Thus, each key in a BST is unique.
Examples:

Figure 2.2.1: Examples of Binary Search Trees
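This ordering property can be checked mechanically. The sketch below is a minimal, hypothetical illustration (the BstCheck and Node names are not part of the course's classes): a subtree is a valid BST only if every key lies strictly between the bounds implied by its ancestors.

```java
// Hypothetical sketch; not the course's BinarySearchTree class.
class BstCheck {
    static class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    // A subtree rooted at n is valid if every key lies strictly between lo and hi.
    static boolean isBst(Node n, long lo, long hi) {
        if (n == null) return true;                    // an empty tree is a BST
        if (n.key <= lo || n.key >= hi) return false;  // key violates an ancestor's bound
        return isBst(n.left, lo, n.key) && isBst(n.right, n.key, hi);
    }

    static boolean isBst(Node root) {
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }
}
```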

2.2 Why Binary search trees?


BSTs provide good logarithmic time performance in the best and average cases.
Table 2.2.1 shows the average case complexities of using linear data structures
compared to BSTs.
In-text Question 1
Each key in a BST is unique. (True or False)?

Answer
True

Table 2.2.1: Average case complexities of using linear data structures
compared to BSTs
Data Structure       Retrieval         Insertion         Deletion
BST                  O(log n)  FAST    O(log n)  FAST    O(log n)  FAST
Sorted Array         O(log n)  FAST*   O(n)      SLOW    O(n)      SLOW
Sorted Linked List   O(n)      SLOW    O(n)      SLOW    O(n)      SLOW

2.3 Binary search tree implementation


Our BinarySearchTree class extends the BinaryTree class discussed in the last
study session. By so doing, it inherits the instance variables key, left, and
right of the BinaryTree class, as shown in the code segment below:

public class BinarySearchTree extends BinaryTree


implements SearchableContainer {
private BinarySearchTree getLeftBST(){
return (BinarySearchTree) getLeft( ) ;
}
private BinarySearchTree getRightBST( ){
return (BinarySearchTree) getRight( ) ;
}
// . . .
}

The find method of the BinarySearchTree class, which is used to locate elements
in the binary search tree, is implemented as follows:

public Comparable find(Comparable comparable)


{
if(isEmpty()) return null;
Comparable key = (Comparable) getKey();
if(comparable.compareTo(key)==0)
return key;
else if (comparable.compareTo(key)<0)
return getLeftBST().find(comparable);
else
return getRightBST().find(comparable);
}
By the BST ordering property, the minimum key is the key of the left-most
node, i.e., the node with an empty left subtree. The findMin method of the
BinarySearchTree class, which is used to find the minimum key in the tree, is
implemented as follows:
public Comparable findMin()
{
if(isEmpty())
return null;
if(getLeftBST().isEmpty())
return (Comparable)getKey();
else
return getLeftBST().findMin();
}

Similarly, by the BST ordering property, the maximum key is the key of the
right-most node, i.e., the node with an empty right subtree. The findMax method
of the BinarySearchTree class, which is used to find the maximum key in the
tree, is implemented as follows:
public Comparable findMax() {
if(isEmpty())
return null;
if(getRightBST().isEmpty())
return (Comparable)getKey();
else
return getRightBST().findMax();
}
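As an illustration of how find, findMin, and findMax behave, here is a stripped-down, hypothetical int-keyed sketch (the IntBst name and fields are illustrative stand-ins, not the course's BinarySearchTree class):

```java
// Hypothetical minimal stand-in for the course's BinarySearchTree.
class IntBst {
    int key; IntBst left, right;
    IntBst(int k, IntBst l, IntBst r) { key = k; left = l; right = r; }

    // Follow the ordering property left/right until the key is found or we fall off.
    Integer find(int k) {
        if (k == key) return key;
        IntBst next = (k < key) ? left : right;
        return next == null ? null : next.find(k);
    }
    int findMin() { return left  == null ? key : left.findMin();  }  // left-most node
    int findMax() { return right == null ? key : right.findMax(); }  // right-most node
}
```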

In-text Question 2
What is the complexity of searching for a key in a BST?

Answer
O(log n) on average; in the worst case (a degenerate tree), it is O(n).

2.4 Insertion in a BST


By the BST ordering property, a new node is always inserted as a leaf node. The
insert method, given below, recursively finds an appropriate empty subtree to
insert the new key. It then transforms this empty subtree into a leaf node by

invoking the attachKey method. The code for the attachKey and insert methods
are shown below:

public void attachKey(Object obj) {


if(!isEmpty())
throw new InvalidOperationException();
else {
key = obj;
left = new BinarySearchTree();
right = new BinarySearchTree();
}
}

public void insert(Comparable comparable){


if(isEmpty())
attachKey(comparable);
else {
Comparable key = (Comparable) getKey();
if(comparable.compareTo(key)==0)
throw new IllegalArgumentException("duplicate key");
else if (comparable.compareTo(key)<0)
getLeftBST().insert(comparable);
else
getRightBST().insert(comparable);
}
}

Figure 2.2.2: Inserting a new node into an existing BST
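The insert logic can be exercised with a minimal self-contained sketch (the InsertDemo class is hypothetical; duplicates are rejected, as in the insert method above):

```java
// Hypothetical sketch of BST insertion; not the course's class.
class InsertDemo {
    static final class Node {
        int key; Node left, right;
        Node(int k) { key = k; }
    }

    // Returns the (possibly new) subtree root; a new key always becomes a leaf.
    static Node insert(Node root, int k) {
        if (root == null) return new Node(k);        // empty subtree found: attach leaf
        if (k == root.key) throw new IllegalArgumentException("duplicate key");
        if (k < root.key) root.left = insert(root.left, k);
        else              root.right = insert(root.right, k);
        return root;
    }

    // In-order keys, for checking the ordering property.
    static void inorder(Node n, StringBuilder out) {
        if (n == null) return;
        inorder(n.left, out);
        out.append(n.key).append(' ');
        inorder(n.right, out);
    }
}
```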

2.5 Deletion from a BST


In the case of deleting a node from a binary search tree, there are three
scenarios:
1. The node to be deleted is a leaf node.
2. The node to be deleted has one non-empty child.

3. The node to be deleted has two non-empty children.
We will treat each case one after the other.

CASE 1: Deleting A Leaf Node


To delete a leaf node, convert it into an empty tree by using the detachKey
method shown below:
// In BinaryTree class
public Object detachKey( ){
    if(!isLeaf()) throw new InvalidOperationException();
    else {
        Object obj = key;
        key = null;
        left = null;
        right = null;
        return obj;
    }
}

Example: Delete 5 in the tree below:

Figure 2.2.3(a): Deleting a leaf node in a BST

CASE 2: The Node To Be Deleted Has One Non-Empty Child


(a) The right subtree of the node x to be deleted is empty.
We present the code segment to delete a node that satisfies this condition.

// Let target be a reference to the node x.


BinarySearchTree temp = target.getLeftBST();
target.key = temp.key;
target.left = temp.left;
target.right = temp.right;
temp = null;
Example:

Figure 2.2.3(b): Deleting a node with one non-empty child in a BST


(b) The left subtree of the node x to be deleted is empty.
We present the code segment to delete a node that satisfies this condition.
// Let target be a reference to the node x.
BinarySearchTree temp = target.getRightBST();
target.key = temp.key;
target.left = temp.left;
target.right = temp.right;
temp = null;

Example:

Figure 2.2.3(c): Deleting a node with one non-empty child in a BST

CASE 3: Deleting A Node That Has Two Non-Empty Children


We can delete a node that has two non-empty children in two ways. We discuss
the two approaches next.

Deletion By Copying: Method 1
Copy the minimum key in the right subtree of x to the node x, then delete the
one-child or leaf-node with this minimum key.
Example:

Figure 2.2.3(d): Deletion by copying method 1

Deletion By Copying: Method 2


Copy the maximum key in the left subtree of x to the node x, then delete the
one-child or leaf-node with this maximum key.
Example:

Figure 2.2.3(e): Deletion by copying method 2

Two-child deletion method 1 code

// find the minimum key in the right subtree of the target node
Comparable min = target.getRightBST().findMin();
// copy the minimum value to the target
target.key = min;
// delete the one-child or leaf node having the min
target.getRightBST().withdraw(min);
All the different cases for deleting a node are handled in the
withdraw(Comparable key) method of the BinarySearchTree class.
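Deletion by copying (method 1) can be sketched end-to-end with a hypothetical node type (the DeleteDemo names are illustrative only): copy the minimum key of the right subtree into the target node, then delete the node that held that minimum.

```java
// Hypothetical sketch of BST deletion by copying (method 1).
class DeleteDemo {
    static final class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    static Node delete(Node n, int k) {
        if (n == null) return null;
        if (k < n.key)      n.left  = delete(n.left, k);
        else if (k > n.key) n.right = delete(n.right, k);
        else if (n.left == null)  return n.right;   // leaf or one-child cases
        else if (n.right == null) return n.left;
        else {                                      // two children: copy min of right subtree
            Node min = n.right;
            while (min.left != null) min = min.left;
            n.key = min.key;
            n.right = delete(n.right, min.key);     // then remove the copied node
        }
        return n;
    }

    static String inorder(Node n) {
        return n == null ? "" : inorder(n.left) + n.key + " " + inorder(n.right);
    }
}
```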

3.0 Tutor Marked Assignments (Individual or Group)


1. In section 2.5, we discussed the different processes to follow when
deleting a node from a BST, and we provided the code for deleting a node
with two children using method one. Provide the corresponding code for
deleting a node with two children using method two.
2. Draw the binary search tree whose elements are inserted in the following
order:
50 72 96 94 107 26 12 11 9 2 10 25 51 16 17 95
Use treeB below to answer questions 3, 4 and 5

3. Trace the path that would be followed in searching for


a. a node containing 61.
b. a node containing 28.
4. Show how treeB would look after the deletion of 29, 59, and 47.
5. Show how the (original) treeB would look after the insertion of nodes
containing 63, 77, 76, 48, 9, and 10 (in that order).

4.0 Conclusion/Summary
In this study session, we introduced binary search trees and why we need to use
them. We studied their implementation and concluded the session by looking at
some of the operations on a binary search tree. In the next study session, we will
study tree traversal.

5.0 Self-Assessment Questions


1. What is a binary search tree (BST)?
2. Why do we use BSTs compared to other data structures?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2ZpdZCh , http://bit.ly/2KWL1FN , http://bit.ly/2Zsrsgz ,
http://bit.ly/2ZrXntm , http://bit.ly/3207oja , http://bit.ly/2PdLyHC , http://bit.ly/2zm08C3 ,

http://bit.ly/2zn37Kt . Watch the video & summarise in 1 paragraph


b. View the animation on http://bit.ly/2MChZxj and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on the different methods of deleting a
node in a binary search tree; In 2 paragraphs summarise their opinion of the
discussed topic. etc.

7.0 Self Assessment Question Answers


1. A binary search tree (BST) is a binary tree that is empty or that satisfies
the BST ordering property:
- The key of each node is greater than each key in the left subtree, if any, of
the node.
- The key of each node is less than each key in the right subtree, if any,
of the node.
2. BSTs provide good logarithmic time performance in the best and
average cases.

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 3
Tree Traversal
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Tree Traversal
2.2- Binary Tree Traversal classification
2.2.1- BreadthFirst traversal
2.2.2- DepthFirst traversal
2.3 - Accept method of BinaryTree class
2.4 - Binary Tree Iterator
2.4.1 - Using a Binary Tree Iterator
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In this study session, you will study tree traversals. You will learn the
classification of binary tree traversals and how they work. We will conclude
the session by revisiting the accept and iterator methods of the BinaryTree
class from study session two.

1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Define tree traversal
2. Outline the classification of tree traversal
3. Traverse trees using any given traversal technique

2.0 Main Content


2.1 Tree Traversal
The process of systematically visiting all the nodes in a tree and performing
some computation at each node in the tree is called a tree traversal.

2.2 Binary Tree Traversal Classifications


There are two methods by which we can traverse a tree:
1. Breadth-First Traversal.
2. Depth-First Traversal
Depth-first traversal is further classified into three types, namely:
- Preorder traversal
- Inorder traversal (for binary trees only)
- Postorder traversal

2.2.1 Breadth-First Traversal


Breadth-first traversal visits the nodes level by level: first the root, then
all nodes one edge away from the root, then all unvisited nodes two edges away
from it, and so on, until every node in the tree has been visited. The code
for the breadth-first traversal method is shown below:

public void breadthFirstTraversal(Visitor visitor){
    QueueAsLinkedList queueaslinkedlist = new QueueAsLinkedList();
    if(!isEmpty()) queueaslinkedlist.enqueue(this);
    while(!queueaslinkedlist.isEmpty() && !visitor.isDone()){
        BinaryTree tree = (BinaryTree)queueaslinkedlist.dequeue();
        visitor.visit(tree.getKey());
        if (!tree.getLeft().isEmpty())
            queueaslinkedlist.enqueue(tree.getLeft());
        if (!tree.getRight().isEmpty())
            queueaslinkedlist.enqueue(tree.getRight());
    }
}

Example:

Figure 2.3.1: Breadth-first traversal
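A self-contained sketch of the same level-order idea, with the JDK's ArrayDeque standing in for the course's QueueAsLinkedList (the BfsDemo and Node names are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of breadth-first (level-order) traversal.
class BfsDemo {
    static final class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    static String levelOrder(Node root) {
        StringBuilder out = new StringBuilder();
        Queue<Node> q = new ArrayDeque<>();
        if (root != null) q.add(root);
        while (!q.isEmpty()) {
            Node n = q.remove();                 // visit the node at the front
            out.append(n.key).append(' ');
            if (n.left != null)  q.add(n.left);  // its children join the back,
            if (n.right != null) q.add(n.right); // so the next level waits its turn
        }
        return out.toString().trim();
    }
}
```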

2.2.2 Depth-First Traversal


Depth-first traversal is further classified into three types. We summarize how
they work and the code for implementing them in Table 2.3.1 below.

In-text Question 1
List the two methods used in traversing a tree

Answer
Breadth-First Traversal and Depth-First Traversal

Table 2.3.1: Depth-First Traversal Classification

Preorder (N-L-R): visit the node, then the left subtree (if any), then the
right subtree (if any).

public void preorderTraversal(Visitor v){
    if(!isEmpty() && !v.isDone()){
        v.visit(getKey());
        getLeft().preorderTraversal(v);
        getRight().preorderTraversal(v);
    }
}

Inorder (L-N-R): visit the left subtree (if any), then the node, then the
right subtree (if any).

public void inorderTraversal(Visitor v){
    if(!isEmpty() && !v.isDone()){
        getLeft().inorderTraversal(v);
        v.visit(getKey());
        getRight().inorderTraversal(v);
    }
}

Postorder (L-R-N): visit the left subtree (if any), then the right subtree
(if any), then the node.

public void postorderTraversal(Visitor v){
    if(!isEmpty() && !v.isDone()){
        getLeft().postorderTraversal(v);
        getRight().postorderTraversal(v);
        v.visit(getKey());
    }
}
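The three depth-first orders can be compared on one small tree with a hypothetical self-contained sketch (DfsDemo is not the course's class); note that the inorder output of a BST comes out sorted:

```java
// Hypothetical sketch of the three depth-first orders on an int-keyed tree.
class DfsDemo {
    static final class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    // N-L-R, L-N-R, and L-R-N orders, each returning a space-separated key list.
    static String pre(Node n)  { return n == null ? "" : n.key + " " + pre(n.left) + pre(n.right); }
    static String in(Node n)   { return n == null ? "" : in(n.left) + n.key + " " + in(n.right); }
    static String post(Node n) { return n == null ? "" : post(n.left) + post(n.right) + n.key + " "; }
}
```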

Depth-first Preorder Traversal

Figure 2.3.2(a): depth-first Preorder Traversal

Depth-first Inorder Traversal

Figure 2.3.2(b): depth-first Inorder Traversal

You should note that an inorder traversal of a BST visits the keys sorted in
increasing order.

Depth-first Postorder Traversal

Figure 2.3.2(c): depth-first Postorder Traversal

The following code illustrates how to display the contents of a Binary tree using
each traversal method.
Visitor v = new PrintingVisitor() ;
BinaryTree t = new BinaryTree() ;
// . . .
t.breadthFirstTraversal(v) ;
t.preorderTraversal(v) ;
t.inorderTraversal(v) ;
t.postorderTraversal(v) ;
2.3 Accept Method of BinaryTree Class
Usually, the accept method of a container is allowed to visit the elements of
the container in any order. A depth-first tree traversal visits the nodes in
either preorder or postorder, and for binary trees an inorder traversal is
also possible. The BinaryTree class accept method does a preorder traversal:
public void accept(Visitor visitor)
{
preorderTraversal(visitor) ;
}

2.4 Binary Tree Iterator


The BinaryTree class provides a tree iterator that does a preorder traversal. The
iterator is implemented as an inner class as shown below:
private class BinaryTreeIterator implements Iterator{
Stack stack;

public BinaryTreeIterator(){
stack = new StackAsLinkedList();
if(!isEmpty())stack.push(BinaryTree.this);
}

public boolean hasNext(){return !stack.isEmpty();}

public Object next(){


if(stack.isEmpty())throw new NoSuchElementException();
BinaryTree tree = (BinaryTree)stack.pop();
if (!tree.getRight().isEmpty())
stack.push(tree.getRight());
if (!tree.getLeft().isEmpty()) stack.push(tree.getLeft());
return tree.getKey();
}
}

In-text Question 2
Postorder traversal of a BST visits the keys sorted in increasing order. (True or False)?

Answer
False

2.4.1 Using a Binary Tree Iterator
As shown in the code fragment below, the iterator() method of the BinaryTree
class returns a new instance of the BinaryTreeIterator inner class each time it is
called:
public Iterator iterator(){
return new BinaryTreeIterator();
}

The following program fragment shows how to use a tree iterator:

BinaryTree tree = new BinaryTree() ;
// . . .
Iterator i = tree.iterator() ;
while(i.hasNext()){
    Object obj = i.next() ;
    System.out.print(obj + " ") ;
}
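The iterator's stack discipline (push the right child before the left, so the left child is popped first) can be sketched in a self-contained form using the JDK's ArrayDeque as the stack; the IterDemo names are hypothetical:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of a stack-based preorder iterator,
// mirroring the BinaryTreeIterator inner class above.
class IterDemo {
    static final class Node {
        int key; Node left, right;
        Node(int k, Node l, Node r) { key = k; left = l; right = r; }
    }

    static Iterator<Integer> preorderIterator(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        return new Iterator<Integer>() {
            public boolean hasNext() { return !stack.isEmpty(); }
            public Integer next() {
                if (stack.isEmpty()) throw new NoSuchElementException();
                Node n = stack.pop();
                if (n.right != null) stack.push(n.right); // right first...
                if (n.left  != null) stack.push(n.left);  // ...so left pops first
                return n.key;
            }
        };
    }
}
```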

3.0 Tutor Marked Assignments (Individual or Group)

1. Answer the following questions about treeA above


a. What is the order in which the nodes are visited by an inorder

traversal?
b. What is the order in which the nodes are visited by a preorder

traversal?

c. What is the order in which the nodes are visited by a postorder

traversal?
2. True or False?
a. A preorder traversal of a binary search tree processes the nodes in
the tree in the exact reverse order that a postorder traversal
processes them.
b. An inorder traversal of a binary search tree always processes the
elements of the tree in the same order, regardless of the order in
which the elements were inserted.
c. A preorder traversal of a binary search tree always processes the
elements of the tree in the same order, regardless of the order in
which the elements were inserted.
4.0 Conclusion/Summary
In this study session, you learnt tree traversals, you also learnt the classification
of binary tree traversal and how they work. We concluded the session by
revisiting the accept and iterator methods from the BinaryTree class. In the next
study session, we will learn about binary heaps.
5.0 Self-Assessment Questions
1. What is tree traversal?
2. Depth-first traversal is categorized into three. List them.

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2MDlWBJ , http://bit.ly/3419IZl , http://bit.ly/2KUgGYm ,
http://bit.ly/3497bMK , http://bit.ly/31XE78S. Watch the video & summarise in 1
paragraph.
b. View the animation on http://bit.ly/2MD3UQi and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on the categories of depth-first
traversal; In 2 paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers
1. The process of systematically visiting all the nodes in a tree and
performing some computation at each node in the tree is called a tree
traversal.
2.
- Preorder traversal
- Inorder traversal (for binary trees only)
- Postorder traversal

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 4
Binary Heap
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a Binary Heap?
2.2 - Array representation of a Binary Heap
2.3 - MinHeap implementation
2.4 - Operations on Binary Heaps
2.4.1 - enqueue
2.4.2 - dequeue
2.4.3 - deleting an arbitrary key
2.4.4 - changing the priority of a key
2.5- Building a binary heap
2.5.1 - Top down approach
2.5.2 - Bottom up approach
2.6- Heap Applications
2.6.1 - Heap Sort
2.6.2 - Heap as a priority queue
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
We introduce a type of binary tree called the binary heap in this study
session. We will look at what a binary heap is, how to represent a binary heap
using arrays, and how to implement it. You will study operations on binary
heaps and how to build a heap, and conclude the session with applications of
binary heaps.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Define binary heap
2. Represent binary heaps using arrays.
3. Describe the operations on binary heaps
4. Build a binary heap
5. Outline applications of heaps

2.0 Main Content


2.1 What is a Binary Heap?
A binary heap is a complete binary tree with one (or both) of the following
heap order properties:
- MinHeap property: Each node must have a key that is less than or equal to
the key of each of its children.
- MaxHeap property: Each node must have a key that is greater than or equal
to the key of each of its children.
A binary heap satisfying the MinHeap property is called a MinHeap while a
binary heap satisfying the MaxHeap property is called a MaxHeap. A binary
heap with all keys equal is both a MinHeap and a MaxHeap.
Recall that a complete binary tree may have missing nodes only on the right
side of the lowest level. Figure 1 shows an example of a binary heap.

Figure 2.4.1: A binary heap

MinHeap and non-MinHeap examples

Figure 2.4.2(a): MinHeap and non-MinHeap examples

MaxHeap and non-MaxHeap examples

Figure 2.4.2(b): MaxHeap and non-MaxHeap examples

2.2 Array Representation of a Binary Heap


A heap is a dynamic data structure that is represented and manipulated more
efficiently using an array. Since a heap is a complete binary tree, its node values
can be stored in an array, without any gaps, in a breadth-first order, where:
Value(node i+1) = array[i], for i ≥ 0

Figure 2.4.3(a): Binary heap representation using array

- The root is array[0]


- The parent of array[i] is array[(i – 1)/2], where i > 0

- The left child, if any, of array[i] is array[2i+1].
- The right child, if any, of array[i] is array[2i+2].
We shall use an implementation in which the heap elements are stored in an
array starting at index 1.
Value(node i) = array[i], for i ≥ 1

Figure 2.4.3(b): Binary heap representation using array

- The root is array[1].


- The parent of array[i] is array[i/2], where i > 1
- The left child, if any, of array[i] is array[2i].
- The right child, if any, of array[i] is array[2i+1].
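The 1-based index arithmetic is simple enough to check directly (the HeapIndex class name is hypothetical):

```java
// Hypothetical sketch of 1-based heap index arithmetic.
class HeapIndex {
    static int parent(int i) { return i / 2; }      // valid for i > 1
    static int left(int i)   { return 2 * i; }
    static int right(int i)  { return 2 * i + 1; }
}
```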

2.3 MinHeap implementation


A binary heap can serve as a priority queue. Our MinHeap class will implement
the following PriorityQueue interface. We present the implementation of the
MinHeap below.
public interface PriorityQueue extends Container{
public abstract void enqueue(Comparable comparable);
public abstract Comparable findMin();
public abstract Comparable dequeueMin();
}

public class BinaryHeap extends AbstractContainer
implements PriorityQueue {
protected Comparable array[];

public BinaryHeap(int i){


array = new Comparable[i + 1];
}

public BinaryHeap(Comparable[] comparable) {


this(comparable.length);
for(int i = 0; i < comparable.length; i++)
array[i + 1] = comparable[i];
count = comparable.length;

buildHeapBottomUp();
}

2.4 Operations on Binary Heaps


2.4.1 Enqueue
Enqueue means to add a node to the existing tree while maintaining the
MinHeap property of the tree. The pseudo code algorithm for enqueuing a key in
a MinHeap is:
enqueue(e1)
{
    if(the heap is full) throw an exception ;
    insert e1 at the end of the heap ;
    while(e1 is not in the root node and e1 < parent(e1))
        swap(e1 , parent(e1)) ;
}

The process of swapping an element with its parent in order to restore the heap
order property is called percolate up, sift up, or reheapification upward.
Thus, the steps for enqueue are:
1. Enqueue the key at the end of the heap.
2. As long as the heap order property is violated, percolate up.

MinHeap Insertion Example

Figure 2.4.4: Insertion in a MinHeap

MinHeap enqueue implementation


For better efficiency, we avoid repeated swapping. We find a place (hole) for
the new key, move the hole upward as needed, and at the end put the key into
the hole. The code below is the MinHeap enqueue implementation.
In-text Question 1
A heap is a complete binary tree. (True or False)?

Answer
True

public void enqueue(Comparable comparable){
    if(isFull()) throw new ContainerFullException();
    int hole = ++count;
    // percolate up via a hole
    while(hole > 1 && array[hole / 2].compareTo(comparable) > 0){
        array[hole] = array[hole / 2];
        hole = hole / 2;
    }
    array[hole] = comparable;
}

public boolean isFull(){
    return count == array.length - 1;
}

2.4.2 Dequeue
To dequeue means to remove a node from the existing tree while maintaining
the MinHeap property of the tree. The pseudo code algorithm for deleting the
root key in a MinHeap is:
dequeueMin(){
    if(heap is empty) throw an exception ;
    extract the element from the root ;
    if(root is a leaf node){ delete root ; return; }
    copy the element from the last leaf to the root ;
    delete last leaf ;
    p = root ;
    while(p is not a leaf node and p > any of its children)
        swap p with the smaller child ;
    return ;
}

The process of swapping an element with its child, in order to restore the heap
order property is called percolate down, sift down, or reheapification
downward.

Thus, the steps for deletion are:
1. Replace the key at the root by the key of the last leaf node.
2. Delete the last leaf node.
3. As long as the heap order property is violated, percolate down.

MinHeap Dequeue Example

Figure 2.4.5: deletion in a MinHeap

MinHeap dequeue Implementation
The code below is the MinHeap dequeue implementation.
public Comparable dequeueMin(){
    if(isEmpty()) throw new ContainerEmptyException();
    Comparable minItem = array[1];
    array[1] = array[count];
    count--;
    percolateDown(1);
    return minItem;
}

private void percolateDown(int hole){
    int minChildIndex;
    Comparable temp = array[hole];
    while(hole * 2 <= count){
        minChildIndex = hole * 2;
        if(minChildIndex + 1 <= count &&
           array[minChildIndex + 1].compareTo(array[minChildIndex]) < 0)
            minChildIndex++;
        if(array[minChildIndex].compareTo(temp) < 0){
            array[hole] = array[minChildIndex];
            hole = minChildIndex;
        } else
            break;
    }
    array[hole] = temp;
}
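The enqueue and dequeueMin logic above can be exercised with a minimal array-based sketch for int keys (the MinHeapDemo class is a hypothetical stand-in for the course's BinaryHeap):

```java
// Hypothetical minimal array-based MinHeap with 1-based storage,
// mirroring the enqueue/dequeueMin logic above for int keys.
class MinHeapDemo {
    private final int[] a;
    private int count;

    MinHeapDemo(int capacity) { a = new int[capacity + 1]; }

    void enqueue(int x) {                       // percolate up via a hole
        int hole = ++count;
        while (hole > 1 && a[hole / 2] > x) {
            a[hole] = a[hole / 2];
            hole /= 2;
        }
        a[hole] = x;
    }

    int dequeueMin() {                          // root out, last leaf sinks down
        int min = a[1];
        a[1] = a[count--];
        percolateDown(1);
        return min;
    }

    private void percolateDown(int hole) {
        int tmp = a[hole];
        while (hole * 2 <= count) {
            int child = hole * 2;               // pick the smaller child
            if (child + 1 <= count && a[child + 1] < a[child]) child++;
            if (a[child] < tmp) { a[hole] = a[child]; hole = child; }
            else break;
        }
        a[hole] = tmp;
    }
}
```

Repeated dequeueMin calls return the keys in increasing order, which is the basis of heap sort later in this session.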

2.4.3 Deleting an arbitrary key


The algorithm of deleting an arbitrary key from a heap is:
- Copy the key x of the last node to the node containing the deleted key.
- Delete the last node.
- Percolate x down until the heap property is restored.
Example:

Figure 2.4.6: Deleting an arbitrary key in a MinHeap

2.4.4 Changing the priority of a key


There are three possibilities when the priority of a key x is changed:
1. The heap property is not violated.
2. The heap property is violated and x has to be percolated up to restore the
heap property.
3. The heap property is violated and x has to be percolated down to restore
the heap property.
Example:

Figure 2.4.6: Changing priority of a key in a MinHeap


2.5 Building a Binary Heap
2.5.1 Top down approach
A heap is built top-down by inserting one key at a time in an initially empty
heap. After each key insertion, if the heap property is violated, it is restored by
percolating the inserted key upward.
The algorithm is:
for(int i=1; i <= heapSize; i++){
read key;
binaryHeap.enqueue(key);
}

Example: Insert the keys 4, 6, 10, 20, and 8 in this order in an originally
empty max-heap

Figure 2.4.7: Building a binary heap

In-text Question 2
The process of swapping an element with its child, in order to restore the heap order
property is called ____________________

Answer
Percolate down, sift down, or reheapification downward.

2.5.2 Bottom up approach (Converting an array into a Binary heap)
The algorithm to convert an array into a binary heap is:
1. Start at the level containing the last non-leaf node (i.e., array[n/2], where
n is the array size).
2. Make the subtree rooted at the last non-leaf node into a heap by invoking
percolateDown.
3. Move in the current level from right to left, making each subtree, rooted
at each encountered node, into a heap by invoking percolateDown.
4. If the levels are not finished, move to a lower level then go to step 3.
The above algorithm can be refined to the following method of the BinaryHeap
class:
private void buildHeapBottomUp(){
for(int i = count / 2; i >= 1; i--)
percolateDown(i);
}
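The bottom-up build can also be sketched on a plain int array. This hypothetical version (names are illustrative) uses 0-based indices, so the children of index i are 2i+1 and 2i+2 and the last non-leaf is at n/2 - 1:

```java
// Hypothetical sketch of bottom-up min-heap construction on an int array.
class BuildHeapDemo {
    static void buildMinHeap(int[] a) {
        for (int i = a.length / 2 - 1; i >= 0; i--)   // last non-leaf down to the root
            percolateDown(a, i);
    }

    private static void percolateDown(int[] a, int hole) {
        int tmp = a[hole];
        while (2 * hole + 1 < a.length) {
            int child = 2 * hole + 1;                 // pick the smaller child
            if (child + 1 < a.length && a[child + 1] < a[child]) child++;
            if (a[child] < tmp) { a[hole] = a[child]; hole = child; }
            else break;
        }
        a[hole] = tmp;
    }

    static boolean isMinHeap(int[] a) {               // every parent <= its children
        for (int i = 1; i < a.length; i++)
            if (a[(i - 1) / 2] > a[i]) return false;
        return true;
    }
}
```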

Converting an array into a MinHeap (Example)

Figure 2.4.8 Converting an array into a MinHeap

2.6 Heap Applications
2.6.1 Heap Sort
A MinHeap or a MaxHeap can be used to implement an efficient sorting
algorithm called Heap Sort. The following algorithm uses a MinHeap:
public static void heapSort(Comparable[] array){
BinaryHeap heap = new BinaryHeap(array) ;
for(int i = 0; i < array.length; i++)
array[i] = heap.dequeueMin() ;
}

Because the dequeueMin algorithm is O(log n), heapSort is an O(n log n)
algorithm. Apart from needing the extra storage for the heap, heapSort is
among the efficient sorting algorithms.
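A self-contained sketch of heap sort along these lines (the HeapSortDemo names are hypothetical): build a min-heap bottom-up from a copy of the array, then dequeue the minimum n times:

```java
// Hypothetical sketch of heap sort via a min-heap.
class HeapSortDemo {
    static void heapSort(int[] a) {
        int[] heap = java.util.Arrays.copyOf(a, a.length);
        int size = heap.length;
        for (int i = size / 2 - 1; i >= 0; i--)       // bottom-up build
            sink(heap, i, size);
        for (int i = 0; i < a.length; i++) {          // n dequeueMin calls: O(n log n)
            a[i] = heap[0];
            heap[0] = heap[--size];
            sink(heap, 0, size);
        }
    }

    // Percolate the hole down within the first `size` slots (0-based indices).
    private static void sink(int[] h, int hole, int size) {
        int tmp = h[hole];
        while (2 * hole + 1 < size) {
            int c = 2 * hole + 1;
            if (c + 1 < size && h[c + 1] < h[c]) c++;
            if (h[c] < tmp) { h[hole] = h[c]; hole = c; }
            else break;
        }
        h[hole] = tmp;
    }
}
```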

2.6.2 Heap as a priority queue


A heap can be used as the underlying implementation of a priority queue. A
priority queue is a data structure in which the items to be inserted have
associated priorities. Items are withdrawn from a priority queue in order of their
priorities, starting with the highest priority items first.
Priority queues are often used in resource management, simulations, and in the
implementation of some algorithms (e.g., some graph algorithms, some
backtracking algorithms). Several data structures can be used to implement
priority queues. Below is a comparison of some:
Table 2.4.1: Comparison of some data structures

Data structure Enqueue Find Min Dequeue Min

Unsorted List O(1) O(n) O(n)

Sorted List O(n) O(1) O(1)

AVL Tree O(log n) O(log n) O(log n)

MinHeap O(log n) O(1) O(log n)

The algorithm for implementing a priority queue using a MinHeap is shown
below.

(Diagram: a heap used as a priority queue, where x is the element with the
highest priority.)

priorityQueueEnqueue(e1)
{
    if(priorityQueue is full) throw an exception;
    insert e1 at the end of the priorityQueue;
    while(e1 is not in the root node and e1 < parent(e1))
        swap(e1 , parent(e1));
}

priorityQueueDequeue(){
    if(priorityQueue is empty) throw an exception;
    extract the highest priority element from the root;
    if(root is a leaf node){ delete root ; return; }
    copy the element from the last leaf to the root;
    delete last leaf;
    p = root;
    while(p is not a leaf node and p > any of its children)
        swap p with the smaller child;
    return;
}

3.0 Tutor Marked Assignments (Individual or Group)
1. Which of the following trees are heaps?

2. Draw a tree that satisfies both the binary search property and the order
property of heaps.
3. In this study session, we provide the implementation for the MinHeap
enqueue() method. Provide the implementation of the MinHeap
dequeue().
4. We provided an iterative approach for the MinHeap enqueue(). Provide a
recursive equivalent of this method.

4.0 Conclusion/Summary
In this study session, you were introduced to a type of binary tree called the
binary heap. You learnt what a binary heap is, how to represent a binary heap using

arrays, and how to implement it. You also studied operations on binary heaps,
how to build a heap, and applications of binary heaps. In the next study
session, we will learn about a special type of binary search tree called the
AVL tree.

5.0 Self-Assessment Questions


1. What is a binary heap?
2. The process of swapping an element with its parent, in order to restore the
heap order property is called ____________________

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/326pvE6 , http://bit.ly/2Zfa1Ri ,
http://bit.ly/2U5kqJN , http://bit.ly/2Zp6xeh . Watch the video & summarise in 1
paragraph.
b. View the animation on http://bit.ly/2KXxlKw and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on your understanding of MinHeap
and MaxHeap; In 2 paragraphs summarise their opinion of the discussed topic.
etc.

7.0 Self Assessment Question Answers


1. A binary heap is a complete binary tree with one (or both) of the
following heap order properties:
- MinHeap property: Each node must have a key that is less than or equal to the key of each of its children.
- MaxHeap property: Each node must have a key that is greater than or equal to the key of each of its children.
2. Percolate up, sift up, or reheapification upward.

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 5
AVL Tree
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1: AVL Search Trees
2.1.1: What is an AVL Tree?
2.1.2: AVL Tree Implementation.
2.1.3: Why AVL Trees?
2.1.4: Rotations
2.1.4.1 Single Right Rotation
2.1.4.2 Single Left Rotation
2.1.4.3 Double Right-Left Rotation
2.1.4.4 Double Left-Right Rotation
2.1.4.5 Double Rotation Implementation
2.1.4.6 BST ordering property after a rotation
2.2: Inserting in an AVL tree
2.3: AVL Rotation Summary
2.4: Insertion implementation
2.5: Deleting from an AVL tree
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
An AVL tree is a binary search tree with a balance factor. In this study session,
you will learn about AVL tree, its implementation and the need to use AVL
trees. You will also study some operations on AVL trees such as rotation,
insertion, deletion and implementation of some of these operations.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Describe an AVL tree
2. Implement an AVL tree
3. Outline the need for an AVL tree
4. Rotate, insert and delete from an AVL tree

2.0 Main Content


2.1 AVL Search Trees
2.1.1 What is an AVL Tree?
An AVL tree is a binary search tree with a height balance property. By height
balance property, we mean that for each node v in the tree, the heights of the
subtrees of v differ by at most 1. A subtree of an AVL tree is also an AVL tree.
For each node of an AVL tree:
Balance factor = height(right subtree) - height(left subtree)
An AVL node can have a balance factor of -1, 0, or +1.

Figure 2.5.1: An AVL tree and a Non-AVL tree

2.1.2 AVL Tree Implementation
As we have seen so far in this course, we build implementation of structures on
existing codes. The AVL tree is a binary search tree; hence, its implementation
builds on the existing binary search tree class, and then introduces the additional
property of an AVL tree, the balance factor. The code segment below shows the
implementation of an AVL tree.
public class AVLTree extends BinarySearchTree{
protected int height;
public AVLTree(){ height = -1;}
public int getHeight(){ return height; }
protected void adjustHeight(){
if(isEmpty())
height = -1;
else
height = 1 + Math.max(left.getHeight() ,
right.getHeight());
}
protected int getBalanceFactor(){
if( isEmpty())
return 0;
else
return right.getHeight() - left.getHeight();
}
// . . .
}
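The same height and balance-factor conventions can be checked on a bare node type, independent of the BinarySearchTree hierarchy the class above extends (Node and BalanceDemo are illustrative names; as in the class above, an empty tree has height -1):

```java
// Standalone sketch of height and balance factor on a plain node type.
class Node {
    int key; Node left, right;
    Node(int key, Node left, Node right) { this.key = key; this.left = left; this.right = right; }
}

class BalanceDemo {
    // Height of an empty tree is -1, of a single leaf 0.
    static int height(Node t) {
        return t == null ? -1 : 1 + Math.max(height(t.left), height(t.right));
    }

    // Balance factor = height(right subtree) - height(left subtree).
    static int balanceFactor(Node t) {
        return t == null ? 0 : height(t.right) - height(t.left);
    }
}
```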

2.1.3 Why AVL Trees?


Insertion or deletion in an ordinary Binary Search Tree can cause large
imbalances. In the worst case, searching an imbalanced Binary Search Tree is
O(n). An AVL tree is rebalanced after each insertion or deletion.
The height-balance property ensures that the height of an AVL tree with n
nodes is O(log n). Searching, insertion, and deletion are all O(log n).

2.1.4 Rotations
A rotation is a process of switching children and parents among two or three
adjacent nodes to restore balance to a tree.

An insertion or deletion may cause an imbalance in an AVL tree. The deepest
node, which is an ancestor of a deleted or an inserted node, and whose balance
factor has changed to -2 or +2 requires rotation to rebalance the tree.

Figure 2.5.2(a): Rotation in an AVL tree (Before Rotation)

Figure 2.5.2(b): Rotation in an AVL tree (After Rotation)

There are two kinds of single rotation:


- Right Rotation.

- Left Rotation.

A double right-left rotation is a right rotation followed by a left rotation while a


double left-right rotation is a left rotation followed by a right rotation.

2.1.4.1 Single Right Rotation
In a Single right rotation:
- The left child x of a node y becomes y's parent.
- y becomes the right child of x.
- The right child T2 of x, if any, becomes the left child of y.

Figure 2.5.3: Single Right Rotation

You should note that the pivot of the rotation is the deepest unbalanced node.
Single Right Rotation Implementation
The implementation for a single right rotation is provided below.
protected void rightRotate(){
if( isEmpty()) throw new InvalidOperationException();
BinaryTree temp = right;
right = left;
left = right.left;
right.left = right.right;
right.right = temp;
Object tempObj = key;
key = right.key;
right.key = tempObj;
getRightAVL().adjustHeight();
adjustHeight();
}
We now show a detailed example of a single right rotation following the
implementation provided above line by line.

In-text Question 1
What is an AVL Tree?

Answer
An AVL tree is a binary search tree with a height balance property.

Figure 2.5.4(a) Single Right Rotation Example

Figure 2.5.4(b) Single Right Rotation Example

Figure 2.5.4(c) Single Right Rotation Example

Figure 2.5.4(d) Single Right Rotation Example

Figure 2.5.4(e) Single Right Rotation Example

Figure 2.5.4(f) Single Right Rotation Example

Figure 2.5.4(g) Single Right Rotation Example

Figure 2.5.4(h) Single Right Rotation Example

Figure 2.5.4(i) Single Right Rotation Example

Figure 2.5.4(j) Single Right Rotation Example

2.1.4.2 Single Left Rotation


In a Single left rotation:
- The right child y of a node x becomes x's parent.
- x becomes the left child of y.
- The left child T2 of y, if any, becomes the right child of x.

Figure 2.5.5: Single Left Rotation

Just as in the case of a single right rotation, you should also note that the pivot
of the rotation is the deepest unbalanced node.

2.1.4.3 Double Right-Left Rotation


Figure 2.5.6 below shows how a double right-left rotation is done.

Figure 2.5.6 Double right-left rotation


2.1.4.4 Double Left-Right Rotation
Figure 2.5.7 below shows how a double left-right rotation is done.

Figure 2.5.7 Double Left-Right Rotation

2.1.4.5 Double Rotation Implementation
The following code segment is an implementation for the double right-left and
double left-right rotations discussed in sections 2.1.4.3 and 2.1.4.4 respectively.
protected void rotateRightLeft(){
if( isEmpty())
throw new InvalidOperationException();
getRightAVL().rotateRight();
rotateLeft();
}
protected void rotateLeftRight(){
if( isEmpty())
throw new InvalidOperationException();
getLeftAVL().rotateLeft();
rotateRight();
}

2.1.4.6 BST ordering property after a rotation


A rotation does not affect the ordering property of a BST (Binary Search Tree).

Figure 2.5.8 BST ordering property after a rotation


The scenario is similar to a left rotation
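A quick way to convince yourself of this is to perform a pointer-based right rotation and compare the in-order traversals before and after (an illustrative sketch with assumed names; the course's rightRotate above instead swaps keys so that the same node object remains the subtree root):

```java
import java.util.ArrayList;
import java.util.List;

// Pointer-based single right rotation on a plain node type.
class RotationDemo {
    static class Node {
        int key; Node left, right;
        Node(int key, Node left, Node right) { this.key = key; this.left = left; this.right = right; }
    }

    // Right rotation: the left child x of y becomes y's parent,
    // and x's old right subtree becomes y's new left subtree.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        return x;   // x is the new subtree root
    }

    // In-order traversal: visits keys of a BST in ascending order.
    static void inorder(Node t, List<Integer> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.key);
        inorder(t.right, out);
    }
}
```

Rotating and then traversing produces the same key sequence as before the rotation, which is precisely the BST ordering property being preserved.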
In-text Question 2
Balance factor = height(left subtree) - height(right subtree). (True or False)?

Answer
False

2.2 Inserting in an AVL tree
We insert into an AVL tree using the BST insertion algorithm discussed in the
study session on binary search trees. If the insertion causes an imbalance, the
tree is rebalanced. An imbalance occurs if a node's balance factor changes from
-1 to -2 or from +1 to +2. Rebalancing is done at the deepest or lowest
unbalanced ancestor of the inserted node.
There are three insertion cases:
1. Insertion that does not cause an imbalance.
2. Same side (left-left or right-right) insertion that causes an imbalance.
- Requires a single rotation to rebalance.
3. Opposite side (left-right or right-left) insertion that causes an imbalance.
- Requires a double rotation to rebalance.
Insertion: case 1
Example: An insertion that does not cause an imbalance.

Figure 2.5.9(a): Insertion case 1

Insertion: case 2
Case 2a: The lowest node (with a balance factor of -2) had a taller left-subtree
and the insertion was on the left-subtree of its left child.
It requires a single right rotation to rebalance.

Figure 2.5.9 (b): Insertion case 2(a)
Case 2b: The lowest node (with a balance factor of +2) had a taller right-subtree
and the insertion was on the right-subtree of its right child.
It requires a single left rotation to rebalance.

Figure 2.5.9 (c): Insertion case 2(b)

Insertion: case 3
Case 3a: The lowest node (with a balance factor of -2) had a taller left-subtree
and the insertion was on the right-subtree of its left child.
It requires a double left-right rotation to rebalance.

Figure 2.5.9 (d): Insertion case 3(a)


Case 3b: The lowest node (with a balance factor of +2) had a taller right-subtree
and the insertion was on the left-subtree of its right child.
It requires a double right-left rotation to rebalance.

Figure 2.5.9(e): Insertion case 3(b)

2.3 AVL Rotation Summary
Figure 2.5.10 below is a summary of scenarios that could lead to an imbalance in an
AVL tree and the type of rotation required to rebalance the tree.

Figure 2.5.10: AVL Rotation Summary

2.4 Insertion implementation


We present the implementation for inserting a new node into an existing AVL
tree. The insert method of the AVLTree class is:
public void insert(Comparable comparable){
super.insert(comparable);
balance();
}

Recall that the insert method of the BinarySearchTree class is:


public void insert(Comparable comparable){
if(isEmpty()) attachKey(comparable);
else {
Comparable key = (Comparable) getKey();
if(comparable.compareTo(key)==0)
throw new IllegalArgumentException("duplicate key");
else if (comparable.compareTo(key)<0)
getLeftBST().insert(comparable);
else
getRightBST().insert(comparable);
}
}

The AVLTree class overrides the attachKey method of the BinarySearchTree
class:
public void attachKey(Object obj)
{
if(!isEmpty())
throw new InvalidOperationException();
else
{
key = obj;
left = new AVLTree();
right = new AVLTree();
height = 0;
}
}

The balance method to rebalance a tree in the case of an imbalance is defined as


follows.
protected void balance(){
adjustHeight();
int balanceFactor = getBalanceFactor();
if(balanceFactor == -2){
if(getLeftAVL().getBalanceFactor() < 0)
rotateRight();
else
rotateLeftRight();
}
else if(balanceFactor == 2){
if(getRightAVL().getBalanceFactor() > 0)
rotateLeft();
else
rotateRightLeft();
}
}
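As a self-contained companion to the insert, attachKey and balance methods above, the following compact sketch performs BST insertion and then rebalances on the way back up the recursion. It returns the (possibly new) root of each subtree instead of swapping keys in place, and all names are illustrative rather than part of the course's class hierarchy:

```java
// Compact recursive AVL insertion: insert as in a BST, then rebalance
// each node on the way back up, returning the new subtree root.
class AvlSketch {
    static class Node {
        int key, height;          // height of a single leaf is 0
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node t) { return t == null ? -1 : t.height; }

    // Balance factor = height(right subtree) - height(left subtree).
    static int balanceFactor(Node t) { return height(t.right) - height(t.left); }

    static void adjustHeight(Node t) {
        t.height = 1 + Math.max(height(t.left), height(t.right));
    }

    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        adjustHeight(y);
        adjustHeight(x);
        return x;
    }

    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        adjustHeight(x);
        adjustHeight(y);
        return y;
    }

    static Node insert(Node t, int key) {
        if (t == null) return new Node(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        else throw new IllegalArgumentException("duplicate key");
        return balance(t);
    }

    static Node balance(Node t) {
        adjustHeight(t);
        if (balanceFactor(t) == -2) {             // left-heavy
            if (balanceFactor(t.left) > 0)        // left-right case: double rotation
                t.left = rotateLeft(t.left);
            return rotateRight(t);                // left-left case: single rotation
        }
        if (balanceFactor(t) == 2) {              // right-heavy
            if (balanceFactor(t.right) < 0)       // right-left case: double rotation
                t.right = rotateRight(t.right);
            return rotateLeft(t);                 // right-right case: single rotation
        }
        return t;
    }
}
```

Inserting the keys 1, 2, 3 in order triggers the right-right case, so 2 becomes the root; inserting 1 through 1023 in ascending order leaves a tree of logarithmic height instead of the height-1022 chain an ordinary BST would produce.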

2.5 Deleting from an AVL tree


We delete a node in an AVL tree using the BST deletion-by-copying algorithm.
If the deletion results in an imbalance, the tree is rebalanced.
There are three deletion cases:
1. Deletion that does not cause an imbalance.
2. Deletion that requires a single rotation to rebalance.
3. Deletion that requires two or more rotations to rebalance.

Deletion case 1 example:

Figure 2.5.11(a): Deletion case 1

Deletion case 2 examples:

Figure 2.5.11(b): Deletion case 2

Figure 2.5.11(b): Deletion case 2

Deletion case 3 example:

Figure 2.5.11(c): Deletion case 3

3.0 Tutor Marked Assignments (Individual or Group)
1. Which of the following binary trees are AVL trees?

2. For each of the following lists, construct an AVL tree by inserting their

elements successively, starting with the empty tree.


a. 1, 2, 3, 4, 5, 6
b. 6, 5, 4, 3, 2, 1
c. 3, 6, 5, 1, 2, 4
3. In section 2.1.4.1, we provided the implementation for a single right
rotation. Implement the single left rotation.

4.0 Conclusion/Summary
In this study session, you learnt about AVL tree and its properties. You also
learnt about AVL tree implementation and the need to use AVL trees. We
concluded the study session by looking at some operations on AVL trees such
as rotation, insertion, deletion and implementation of some of these operations.
We will study B-Trees in the next study session.

5.0 Self-Assessment Questions


1. What do you understand by height balance property?
2. Why AVL trees?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube: http://bit.ly/30BdWVf, http://bit.ly/2ZfeBz7, http://bit.ly/2MDjPOl, http://bit.ly/30B1KE7, http://bit.ly/2L4Aasg. Watch the videos and summarise them in one paragraph.
b. View the animation on http://bit.ly/2U7Rq3R and critique it in the discussion forum.
c. Take a walk and engage any 3 students on the different rotations in an AVL tree; in 2 paragraphs, summarise their opinion of the discussed topic.

7.0 Self Assessment Question Answers


1. Height balance property means that for each node v in a tree, the
heights of the subtrees of v differ by at most 1.
2. Insertion or deletion in an ordinary Binary Search Tree can cause large
imbalances. In the worst case, searching an imbalanced Binary Search
Tree is O(n). An AVL tree is rebalanced after each insertion or deletion.
The height-balance property ensures that the height of an AVL tree with n
nodes is O(log n). Searching, insertion, and deletion are all O(log n).
8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 6
B-Trees
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- Disk Storage
2.2- What is a multiway tree?
2.2.1- The node structure of a Multi-way tree
2.2.2- Examples of Multi-way Trees
2.3- What is a B-tree?
2.3.1- B-Tree Examples
2.4- Motivation for studying Multi-way and B-trees
2.5- Why B-trees?
2.5.1- Comparing B-Trees with AVL Trees
2.6- Insertion in a B-tree
2.6.1- B-Tree Insertion Algorithm
2.7- Deletion in a B-tree
2.7.1- B-Tree Deletion Algorithm
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In the last study session of this module, we shall be looking at multi-way trees
and why we need to use them. Our focus will be on B-Trees. As we have done
in previous study sessions on trees, you will learn about operations on B-Trees
with examples and algorithms for implementing these operations.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Explain multi-way trees
2. Outline motivations for studying multi-way and B-Trees
3. Describe B-Trees
4. Describe the various operations on B-Trees with examples

2.0 Main Content


2.1 Disk Storage
Data is stored on disk (i.e., secondary memory) in blocks. A block is the
smallest amount of data that can be accessed on a disk. Each block has a fixed
number of bytes – typically 512, 1024, 2048, 4096 or 8192 bytes. Each block
may hold many data records. Figure 2.6.1 shows an example of a block.

Figure 2.6.1: A Block

2.2 What is a Multiway Tree?
A multi-way (or m-way) search tree of order m is a tree in which
- Each node has at most m subtrees, where the subtrees may be empty.
- Each node consists of at least 1 and at most m-1 distinct keys
- The keys in each node are sorted.

Figure 2.6.2: A Multi-way Tree


For the multi-way tree in figure 2, the keys and subtrees of a non-leaf node are
ordered as:
T0, k1, T1, k2, T2, . . . , km-1, Tm-1 such that:
- All keys in subtree T0 are less than k1.
- All keys in subtree Ti , 1 <= i <= m - 2, are greater than ki but less than
ki+1.
- All keys in subtree Tm-1 are greater than km-1
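Within one such node, these ordering rules determine which subtree a search must descend into: follow Ti, where i is the number of keys smaller than the target, unless the target equals one of the node's keys. A small sketch of that decision, assuming the node's keys are held in a sorted array (all names are illustrative):

```java
import java.util.Arrays;

// Directing a search inside one m-way node whose keys k1 < k2 < ... < kn
// are stored in a sorted array.
class MwayNodeSearch {
    // Index i of the subtree Ti to follow: the number of keys strictly
    // smaller than the target. (If the target equals a key, the search
    // stops at this node instead of descending.)
    static int childIndex(int[] keys, int target) {
        int i = 0;
        while (i < keys.length && keys[i] < target) i++;
        return i;
    }

    // True if the target is one of this node's keys.
    static boolean containsKey(int[] keys, int target) {
        return Arrays.binarySearch(keys, target) >= 0;
    }
}
```

For keys {20, 40, 60}, a target of 50 yields subtree T2, whose keys all lie between k2 = 40 and k3 = 60.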

2.2.1 The node structure of a Multi-way tree


The node structure of a multi-way tree is shown in Figure 2.6.3 below

Figure 2.6.3: Node structure of a Multi-way tree


Note that:

- Corresponding to each key, there is a data reference that refers to the data
record for that key in secondary memory.
- In our representations, we will omit the data references.
- The literature contains other node representations that we will not
discuss.

2.2.2 Examples of Multi-way Trees


Two examples of multi-way trees are shown in Figure 2.6.5 below

Figure 2.6.5: Examples of Multi-way trees

In a multiway tree, you should note that:


- The leaf nodes need not be at the same level.
- A non-leaf node with n keys may contain less than n + 1 non-empty
subtree(s).

2.3 What is a B-tree?


A B-tree of order m (or branching factor m), where m > 2, is either an empty
tree or a multiway search tree with the following properties:
- The root is either a leaf or it has at least two non-empty subtrees and at
most m non-empty subtrees.

- Each non-leaf node, other than the root, has at least ⌈m/2⌉ non-empty
subtrees and at most m non-empty subtrees. (Note: ⌈x⌉ is the smallest
integer ≥ x.)
- The number of keys in each non-leaf node is one less than the number of
non-empty subtrees for that node.
- All leaf nodes are at the same level; that is the tree is perfectly balanced.

For a non-empty B-tree of order m:

2.3.1 B-Tree Examples

Note that:
- The data references are not shown.
- The leaf references are to empty subtrees

2.4 Motivation for studying Multi-way and B-trees


A disk access is very expensive compared to a typical computer instruction
(mechanical limitations) - One disk access is worth about 200,000 instructions.
Thus, when data is too large to fit in main memory, the number of disk accesses
becomes important.
Many algorithms and data structures that are efficient for manipulating data in
primary memory are not efficient for manipulating large data in secondary
memory because they do not minimise the number of disk accesses. For
example, AVL trees are not suitable for representing huge tables residing in
secondary memory. The height of an AVL tree increases, and hence the number
of disk accesses required to access a particular record increases, as the number
of records increases.

2.5 Why B-trees?


B-trees are suitable for representing huge tables residing in secondary memory
because:
1. With a large branching factor m, the height of a B-tree is low resulting in
fewer disk accesses.
Note: As m increases the amount of computation at each node
increases; however this cost is negligible compared to hard-drive
accesses.
2. The branching factor can be chosen such that a node corresponds to a
block of secondary memory.
3. The most common data structure used for database indices is the B-tree.
An index is any data structure that takes as input a property (e.g. a value

for a specific field), called the search key, and quickly finds all records
with that property.

2.5.1 Comparing B-Trees with AVL Trees


The height h of a B-tree of order m, with a total of n keys, satisfies the
inequality: h ≤ 1 + log⌈m/2⌉((n + 1) / 2), where the logarithm is taken to base ⌈m/2⌉.
If m = 300 and n = 16,000,000 then h ≈ 4.
Thus, in the worst case, finding a key in such a B-tree requires 3 disk accesses
(assuming the root node is always in main memory).
The average number of comparisons for an AVL tree with n keys is log n + 0.25
where n is large.
If n = 16,000,000 the average number of comparisons is 24.
Thus, in the average case, finding a key in such an AVL tree requires 24 disk
accesses.
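The worst-case figures above can be reproduced numerically using the change-of-base formula, log⌈m/2⌉(x) = ln(x) / ln(⌈m/2⌉). A quick sketch (heightBound and BTreeHeightBound are illustrative names):

```java
// Evaluates the bound h <= 1 + log_ceil(m/2)((n + 1) / 2) numerically.
class BTreeHeightBound {
    static double heightBound(int m, long n) {
        int half = (m + 1) / 2;   // ceil(m/2) for positive m
        return 1 + Math.log((n + 1) / 2.0) / Math.log(half);
    }
}
```

For m = 300 and n = 16,000,000, the bound evaluates to about 4.17, so the height is at most 4, matching the text.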
2.6 Insertion in a B-tree
Inserting a key into an existing B-tree may result in what is known as an
overflow condition.
Overflow condition: A root-node or a non-root node of a B-tree of order m
overflows if, after a key insertion, it contains m keys.
For the Insertion algorithm, if a node overflows, split it into two, propagate the
"middle" key to the parent of the node. If the parent overflows, the process
propagates upward. If the node has no parent, create a new root node.
Note that insertion of a key always starts at a leaf node.

Insertion in a B-tree of odd order


Example: Insert the keys 78, 52, 81, 40, 33, 90, 85, 20, and 38 in this order in an
initially empty B-tree of order 3

Figure 2.6.6: Insertion in a B-tree

Insertion in a B-tree of even order


For a B-tree of even order, at each node the insertion can be done in two
different ways:
- right-bias: The node is split such that its right subtree has more keys than
the left subtree.
- left-bias: The node is split such that its left subtree has more keys than
the right subtree.
Example: Insert the key 5 in the following B-tree of order 4:

Figure 2.6.7: Insertion in a B-tree

2.6.1 B-Tree Insertion Algorithm
The algorithm for inserting a key into a B-tree is shown below.

insertKey (x){
if(the key x is in the tree)
throw an appropriate exception;
let the insertion leaf-node be the currentNode;
insert x in its proper location within the node;

if(the currentNode does not overflow)


return;
done = false;
do{
if (m is odd) {
split currentNode into two siblings such that the
right sibling rs has the ⌊m/2⌋ right-most keys,
and the left sibling ls has the ⌊m/2⌋ left-most keys;
let w be the middle key of the split node;
}
else { // m is even
split currentNode into two siblings by either of the
following methods:
• right-bias: the right sibling rs has the m/2
right-most keys, and the left sibling ls has the
⌊(m-1)/2⌋ left-most keys.
• left-bias: the right sibling rs has the ⌊(m-1)/2⌋
right-most keys, and the left sibling ls has the
m/2 left-most keys.
let w be the “middle” key of the split node;
}
if (the currentNode is not the root node) {
insert w in its proper location in the parent p of
the currentNode;
if (p does not overflow)
done = true;
else
let p be the currentNode;
}
} while (! done && currentNode is not the root node);
if (! done) {
create a new root node with w as its only key;
let the right sibling rs be the right child of the new
root;
let the left sibling ls be the left child of the new
root;
}
return;
}
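The split step at the heart of the algorithm can be sketched in isolation for an odd order m: the middle one of the m overflowing keys moves up to the parent, and the remaining keys form the two siblings (an array-based sketch with illustrative names; an even-order node would instead split with a right or left bias):

```java
import java.util.Arrays;

// Splitting an overflowed node of m sorted keys (odd m): the middle key w
// is promoted to the parent; the floor(m/2) left-most keys form the left
// sibling ls and the floor(m/2) right-most keys form the right sibling rs.
class BTreeSplit {
    // Returns {ls, {w}, rs}.
    static int[][] split(int[] keys) {
        int mid = keys.length / 2;
        int[] ls = Arrays.copyOfRange(keys, 0, mid);
        int[] w  = { keys[mid] };
        int[] rs = Arrays.copyOfRange(keys, mid + 1, keys.length);
        return new int[][] { ls, w, rs };
    }
}
```

For the order-3 example above, a node that has overflowed to the keys 33, 40, 52 splits into siblings {33} and {52}, with 40 promoted to the parent.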

Key  Predecessor  Successor
20       17          25
30       25          32
34       32          40
50       45          53
60       55          64
70       68          75
78       75          88

In-text Question 1
What is a block?

Answer
A block is the smallest amount of data that can be accessed on a disk

2.7 Deletion in a B-tree


Like insertion, deletion must be on a leaf node. If the key to be deleted is not in
a leaf, swap it with either its successor or predecessor (each will be in a leaf).
The successor of a key k is the smallest key greater than k.
The predecessor of a key k is the largest key smaller than k.
In a B-tree, the successor and predecessor, if any, of any key is in a leaf node.
Example: Consider the following B-tree of order 3:

Deleting a key from an existing B-tree may result in what is known as an


underflow condition.

Underflow Condition: A non-root node of a B-tree of order m underflows if,
after a key deletion, it contains ⌈m/2⌉ - 2 keys.
The root node does not underflow. If it contains only one key and this key is
deleted, the tree becomes empty.
Deletion algorithm:
If a node underflows, rotate the appropriate key from the adjacent right or
left sibling if that sibling contains at least ⌈m/2⌉ keys; otherwise perform a
merging.
Note: a key rotation must always be attempted before a merging.
There are five deletion cases:
1. The leaf does not underflow.
2. The leaf underflows and the adjacent right sibling has at least ⌈m/2⌉ keys.
Perform a left key-rotation.
3. The leaf underflows and the adjacent left sibling has at least ⌈m/2⌉ keys.
Perform a right key-rotation.
4. The leaf underflows and each of the adjacent right sibling and the adjacent
left sibling has at least ⌈m/2⌉ keys.
Perform either a left or a right key-rotation.
5. The leaf underflows and each adjacent sibling has ⌈m/2⌉ - 1 keys.
Perform a merging.
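Case 2, the left key-rotation, can be sketched on lists of keys: the separating parent key moves down into the underflowing leaf, and the smallest key of the right sibling moves up to take its place (an illustrative sketch with assumed names; the subtree move of the full algorithm is omitted because the nodes involved here are leaves):

```java
import java.util.List;

// Left key-rotation for a leaf underflow: parent.get(sepIndex) is the key
// that separates 'node' from its right sibling 'rightSib'.
class LeftKeyRotation {
    static void rotateLeft(List<Integer> node, List<Integer> parent,
                           int sepIndex, List<Integer> rightSib) {
        node.add(parent.get(sepIndex));           // 1. parent separator x comes down
        parent.set(sepIndex, rightSib.remove(0)); // 2. sibling's minimum y goes up
        // 3. (moving y's old left subtree is omitted: leaves have no subtrees)
    }
}
```

For an order-3 tree, an empty leaf with parent separator 30 and right sibling {40, 50} becomes {30}, with 40 promoted and {50} remaining in the sibling.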

Case 1: The leaf does not underflow.
Example:

Figure 2.6.8 (a): Deleting a key from a B-tree

Case 2: The leaf underflows and the adjacent right sibling has at least ⌈m/2⌉ keys.
Perform a left key-rotation:
1. Move the parent key x that separates the siblings to the node with
underflow
2. Move y, the minimum key in the right sibling, to where the key x was
3. Make the old left subtree of y to be the new right subtree of x.

Figure 2.6.8(b): Key rotation during deletion in B-tree


Example:

Figure 2.6.8(c): Deleting a key from a B-tree

Case 3: The leaf underflows and the adjacent left sibling has at least ⌈m/2⌉ keys.
Perform a right key-rotation:
1. Move the parent key x that separates the siblings to the node with underflow
2. Move w, the maximum key in the left sibling, to where the key x was
3. Make the old right subtree of w to be the new left subtree of x

Figure 2.6.8(d): Key rotation during deletion in B-tree

In-text Question 2
The height h of a B-tree of order m, with a total of n keys, satisfies the inequality _________

Answer
h ≤ 1 + log⌈m/2⌉((n + 1) / 2)

Example:

Figure 2.6.8(e): Deleting a key from a B-tree

Case 5: The leaf underflows and each adjacent sibling has ⌈m/2⌉ - 1 keys.

Figure 2.6.8(f): Merging process during deletion in B-tree

If the parent of the merged node underflows, the merging process propagates
upward. In the limit, a root with one key is deleted and the height decreases by
one.
Note: The merging could also be done by using the left sibling instead of the
right sibling.
Example:

Figure 2.6.8(g): Deleting a key from a B-tree


Example:

Figure 2.6.8(h): Deleting a key from a B-tree

Deletion: Special Case, involves rotation and merging


Some cases of deletion may require a combination of rotation and merging. An
example is presented below
Example: Delete the key 40 in the following B-tree of order 3:

Figure 2.6.8(i): Deleting a key from a B-tree

Deletion of a non-leaf node


Deletion of a non-leaf key can always be done in two different ways: by first
swapping the key with its successor or predecessor. The resulting trees may be
similar or they may be different.
Example: Delete the key 140 in the following partial B-tree of order 4:

Figure 2.6.8(j): Deleting a key from a B-tree

2.7.1 B-Tree Deletion Algorithm
The algorithm for deleting a key from a B-tree is shown below.
deleteKey (x) {
if (the key x to be deleted is not in the tree)
throw an appropriate exception;
if (the tree has only one node) {
delete x ;
return;
}
if (the key x is not in a leaf node)
swap x with its successor or predecessor; // each
will be in a leaf node
delete x from the leaf node;
if (the leaf node does not underflow) // after deletion, numKeys ≥ ⌈m/2⌉ - 1
return;
let the leaf node be the CurrentNode;
done = false;
while (! done && numKeys(CurrentNode) < ⌈m/2⌉ - 1) { // there is underflow
if (any of the adjacent siblings t of the CurrentNode
has at least ⌈m/2⌉ keys) { // ROTATION CASE
if (t is the adjacent right sibling) {
• rotate the separating-parent key w of
CurrentNode and t to CurrentNode;
• rotate the minimum key of t to the previous
parent-location of w;
• rotate the left subtree of t, if any, to become
the right-most subtree of CurrentNode;
}
else { // t is the adjacent left sibling
• rotate the separating-parent key w between
CurrentNode and t to CurrentNode;
• rotate the maximum key of t to the previous
parent-location of w;
• rotate the right subtree of t , if any, to
become the left-most subtree of CurrentNode;
}
done = true;
}
else { // MERGING CASE: each adjacent sibling has ⌈m/2⌉ - 1 keys
select any adjacent sibling t of CurrentNode;
create a new sibling by merging currentNode, the sibling t,
and their parent-separating key ;
If (parent node p is the root node) {
if (p is empty after the merging)
make the merged node the new root;
done = true;

} else
let parent p be the CurrentNode;
}
} // while
return;
}

3.0 Tutor Marked Assignment (Individual or Group)


1. Find the minimum order of the B-tree that guarantees that the number of
disk accesses in searching in a file of 100 million records does not exceed
3. Assume that the root's page is stored in main memory.

2. Draw the B-tree obtained after inserting 30 and then 31 in the B-tree in
the figure above. Assume that a leaf cannot contain more than three
items.
3. Outline an algorithm for finding the largest key in a B-tree.
4. Write a program implementing a key insertion algorithm in a B-tree.

4.0 Conclusion/Summary
In the last study session of this module, you learnt about multi-way trees and
why we need to use them. We focused on B-Trees and as we did in previous
study sessions on trees, you learnt about operations on B-Trees with examples
and algorithms for implementing these operations. In the next module, we will
study graphs and sorting.

5.0 Self-Assessment Questions


1. What do you understand by overflow condition that occurs when

inserting a key into a B-tree?
2. Why do we need B-Trees?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube: http://bit.ly/340haUk, http://bit.ly/2Hr7DMp, http://bit.ly/2Nxku3B, http://bit.ly/2NyTikU, http://bit.ly/30BgsuF, http://bit.ly/2NufU6d. Watch the videos and summarise them in one paragraph.
b. View the animation on http://bit.ly/2ZnVr5r and critique it in the discussion forum.
c. Take a walk and engage any 3 students on inserting and deleting keys from a B-Tree; in 2 paragraphs, summarise their opinion of the discussed topic.

7.0 Self Assessment Question Answers


1. A root-node or a non-root node of a B-tree of order m overflows if,
after a key insertion, it contains m keys.
2.
i. With a large branching factor m, the height of a B-tree is low
resulting in fewer disk accesses.
ii. The branching factor can be chosen such that a node corresponds to
a block of secondary memory.
iii. The most common data structure used for database indices is the
B-tree. An index is any data structure that takes as input a
property (e.g. a value for a specific field), called the search key,
and quickly finds all records with that property.

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

MODULE 3
Graph and Sorting
Contents:
Study Session 1: Huffman Coding
Study Session 2: Graphs
Study Session 3: Topological Sort
Study Session 4: Shortest Path algorithm
Study Session 5: Minimum Spanning Tree

STUDY SESSION 1
Huffman Coding
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Introduction to Data Compression
2.1.1 - What is Data Compression?
2.1.2 - Why Data Compression?
2.1.3 - How is Data Compression possible?
2.2 - Lossless and Lossy Data Compression
2.2.1- Classification of Lossless Compression Techniques
2.3 - Compression Utilities and Formats
2.4 - Run-length Encoding
2.5 - Static Huffman Coding
2.5.1 - Static Huffman Coding Algorithm
2.6 - The Prefix property
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion

5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 In-text Question Answers
8.0 Self-Assessment Question Answers
9.0 References/Further Readings

Introduction:
In the first study session of this module, you will be introduced to data
compression, the need to compress data and how to go about compressing data.
You will also learn the different techniques of compressing data. You will then
be introduced to compression utilities and formats, run-length encoding and
static Huffman coding. We will be concluding the study session by looking at
the prefix property.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Define data compression and outline its importance
2. Explain the different techniques for encoding data
3. List some compression utilities and formats
4. Describe run-length encoding
5. Encode a given text using static Huffman coding
6. Explain the prefix property

2.0 Main Content


2.1 Introduction to Data Compression
2.1.1 What is Data Compression?
Data compression is the representation of an information source (e.g. a data
file, a speech signal, an image, or a video signal) as accurately as possible using

the fewest number of bits. Compressed data can only be understood if the
decoding method is known by the receiver.

2.1.2 Why Data Compression?


Data compression has some advantages and disadvantages. We highlight some
of them.
Advantage:
Data storage and transmission cost money, and this cost increases with the
amount of data. It can be reduced by processing the data so that it occupies
less memory and takes less time to transmit.
Disadvantage:
Compressed data must be decompressed to be viewed (or heard), thus extra
processing is required. Due to its advantage and disadvantage, the design of data
compression schemes therefore involve trade-offs between various factors,
including the degree of compression, the amount of distortion introduced (if
using a lossy compression scheme), and the computational resources required to
compress and decompress the data.

2.1.3 How is Data Compression possible?


Compression is possible because information usually contains redundancies, or
information that is often repeated. Examples include reoccurring letters,
numbers or pixels. File compression programs remove this redundancy.

2.2 Lossless and Lossy Data Compression


We have two broad classifications of data compression techniques namely
lossless and lossy.
Lossless techniques enable exact reconstruction of the original document from
the compressed information. They exploit redundancy in data and are applied to
general data

- Examples: Run-length, Huffman, LZ77, LZ78, and LZW
Lossy compression reduces a file by permanently eliminating certain redundant
information. They exploit redundancy and human perception and are applied to
audio, image, and video
- Examples: JPEG and MPEG
Lossy techniques usually achieve higher compression rates than lossless ones
but the latter are more accurate.

2.2.1 Classification of Lossless Compression Techniques


Lossless techniques are classified into static, adaptive (or dynamic), and hybrid.
In a static method the mapping from the set of messages to the set of codewords
is fixed before transmission begins, so that a given message is represented by
the same codeword every time it appears in the message being encoded.
Static coding requires two passes: one pass to compute probabilities (or
frequencies) and determine the mapping, and a second pass to encode.

Examples: Static Huffman Coding


In an adaptive method, the mapping from the set of messages to the set of
codewords changes over time. All of the adaptive methods are one-pass
methods; only one scan of the message is required.
Examples: LZ77, LZ78, LZW, and Adaptive Huffman Coding
An algorithm may also be a hybrid, neither completely static nor completely
dynamic.

2.3 Compression Utilities and Formats


Some Compression tool examples are:
- winzip, pkzip, compress, gzip
General compression formats include:
- .zip, .gz

Common image compression formats:
- JPEG, JPEG 2000, BMP, GIF, PCX, PNG, TGA, TIFF, WMP
Common audio (sound) compression formats:
- MPEG-1 Layer III (known as MP3), RealAudio (RA, RAM, RP), AU,
Vorbis, WMA, AIFF, WAVE, G.729a
Common video (sound and image) compression formats:
- MPEG-1, MPEG-2, MPEG-4, DivX, Quicktime (MOV), RealVideo
(RM), Windows Media Video (WMV), Video for Windows (AVI), Flash
video (FLV)

2.4 Run-length Encoding


The following string BBBBHHDDXXXXKKKKWWZZZZ can be encoded
more compactly by replacing each repeated string of characters by a single
instance of the repeated character and a number that represents the number of
times it is repeated. The given string can be represented as
4B2H2D4X4K2W4Z. Here "4B" means four B's, and 2H means two H's, and so
on. Compressing a string in this way is called run-length encoding.
As another example, consider the storage of a rectangular image. As a single
color bitmapped image, it can be stored as:

The rectangular image can be compressed with run-length encoding by counting


identical bits as follows:
0, 40
0, 40
0, 10 1, 20 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10

0, 10 1, 20 0, 10
0, 40
The first line says that the first line of the bitmap consists of 40 0's. The third
line says that the third line of the bitmap consists of 10 0's followed by 20 1's
followed by 10 more 0's, and so on for the other lines.
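The character run-length scheme described above can be sketched in Java. This is a minimal version: it simply emits each run's length followed by the character, which matches the example string because no run there is ten characters or longer.

```java
public class RunLength {
    // Encode a string by replacing each run of a repeated character
    // with its length followed by one instance of the character,
    // e.g. "BBBB" becomes "4B".
    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int count = 0;
            while (i < s.length() && s.charAt(i) == c) { count++; i++; }
            out.append(count).append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("BBBBHHDDXXXXKKKKWWZZZZ")); // 4B2H2D4X4K2W4Z
    }
}
```

Decoding is the reverse substitution: read a count, then repeat the character that follows it that many times.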
In-text Question 1
What is data compression?

Answer
Data compression is the representation of an information source (e.g. a data file, a speech
signal, an image, or a video signal) as accurately as possible using the fewest number of bits.

2.5 Static Huffman Coding


Static Huffman coding assigns variable length codes to symbols based on their
frequency of occurrences in the given message. Low frequency symbols are
encoded using many bits, and high frequency symbols are encoded using fewer
bits. The message to be transmitted is first analysed to find the relative
frequencies of its constituent characters. The coding process generates a binary
tree, the Huffman code tree, with branches labeled with bits (0 and 1). The
Huffman tree (or the character codeword pairs) must be sent with the
compressed information to enable the receiver decode the message.

2.5.1 Static Huffman Coding Algorithm


Next, we present the static Huffman coding algorithm.
Find the frequency of each character in the file to be compressed; for each
distinct character create a one-node binary tree containing the character and its
frequency as its priority; insert the one-node binary trees in a priority queue in
increasing order of frequency;
while (there are more than one tree in the priority queue) {
dequeue two trees t1 and t2;
Create a tree t that contains t1 as its left subtree and t2 as
its right subtree; // 1
priority (t) = priority(t1) + priority(t2);
insert t in its proper location in the priority queue; // 2

}
Assign 0 and 1 weights to the edges of the resulting tree, such that the left and
right edge of each node do not have the same weight; // 3
Note: The Huffman code tree for a particular set of characters is not unique.
(Steps 1, 2, and 3 may be done differently).
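The algorithm above can be sketched in Java using the standard java.util.PriorityQueue. The frequencies in main are illustrative, not those of the worked example, and the sketch assumes at least two distinct characters. Since the Huffman tree is not unique, only the codeword lengths noted in the comment are guaranteed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class Huffman {
    // A node of the Huffman code tree; leaves carry a character.
    static class Node {
        char ch; int freq; Node left, right;
        Node(char ch, int freq) { this.ch = ch; this.freq = freq; }
        Node(Node l, Node r) { freq = l.freq + r.freq; left = l; right = r; }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Repeatedly dequeue the two lowest-frequency trees and merge them.
    static Node buildTree(Map<Character, Integer> freqs) {
        PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> a.freq - b.freq);
        for (Map.Entry<Character, Integer> e : freqs.entrySet())
            pq.add(new Node(e.getKey(), e.getValue()));
        while (pq.size() > 1) {                 // more than one tree in the queue
            Node t1 = pq.remove(), t2 = pq.remove();
            pq.add(new Node(t1, t2));           // t1 left subtree, t2 right subtree
        }
        return pq.remove();
    }

    // Walk the tree, labelling left edges 0 and right edges 1.
    static void assignCodes(Node n, String prefix, Map<Character, String> codes) {
        if (n.isLeaf()) { codes.put(n.ch, prefix); return; }
        assignCodes(n.left, prefix + "0", codes);
        assignCodes(n.right, prefix + "1", codes);
    }

    public static void main(String[] args) {
        Map<Character, Integer> freqs = new HashMap<>();
        freqs.put('a', 5); freqs.put('b', 2); freqs.put('c', 1); freqs.put('d', 1);
        Map<Character, String> codes = new HashMap<>();
        assignCodes(buildTree(freqs), "", codes);
        System.out.println(codes); // codeword lengths: a=1 bit, b=2, c=3, d=3
    }
}
```

Note how the high-frequency character a receives the shortest codeword and the low-frequency characters c and d the longest, exactly as the text describes.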
Example: Information to be transmitted over the internet contains the following
characters with their associated frequencies:

Use Huffman technique to answer the following questions:


1. Build the Huffman code tree for the message.
2. Use the Huffman tree to find the codeword for each character.
3. If the data consists of only these characters, what is the total number of
bits to be transmitted? What is the compression ratio?
4. Verify that your computed Huffman codewords satisfy the Prefix
property.
Solution

Figure 3.1.1(a): Static Huffman Coding

Figure 3.1.1(b): Static Huffman Coding

Figure 3.1.1(c): Static Huffman Coding

Figure 3.1.1(d): Static Huffman Coding

Figure 3.1.1(e): Static Huffman Coding


The sequence of zeros and ones that are the arcs in the path from the root to
each leaf node are the desired codes:

If we assume the message consists of only the characters a,e,l,n,o,s,t then the
number of bits for the compressed message will be 696:

If the message is sent uncompressed with 8-bit ASCII representation for the
characters, we have 261*8 = 2088 bits.
Assuming that the number of character-codeword pairs and the pairs are
included at the beginning of the binary file containing the compressed message
in the following format:

Number of bits for the transmitted file
= bits(7) + bits(characters) + bits(codewords) + bits(compressed message)
= 3 + (7*8) + 21 + 696 = 776
Compression ratio = bits for ASCII representation / number of bits transmitted
= 2088 / 776 = 2.69
Thus, the size of the transmitted file is 100 / 2.69 ≈ 37% of the original ASCII
file.
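The arithmetic above can be checked with a short calculation. The constants come directly from the worked example: 261 characters in the message, 7 character-codeword pairs, 21 codeword bits, and 696 bits for the compressed message.

```java
public class Ratio {
    public static void main(String[] args) {
        int uncompressed = 261 * 8;               // 8-bit ASCII for 261 characters
        int transmitted = 3 + (7 * 8) + 21 + 696; // count field + characters + codewords + message
        System.out.println(uncompressed);          // 2088
        System.out.println(transmitted);           // 776
        System.out.printf("%.2f%n", (double) uncompressed / transmitted); // 2.69
    }
}
```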

2.6 The Prefix property
Data encoded using Huffman coding is uniquely decodable. This is because
Huffman codes satisfy an important property called the prefix property. In a
given set of Huffman codewords, no codeword is a prefix of another Huffman
codeword. For example, in a given set of Huffman codewords, 10 and 101
cannot simultaneously be valid Huffman codewords because the first is a prefix
of the second. We can see by inspection that the codewords we generated in the
previous example are valid Huffman codewords.
To see why the prefix property is essential, consider the codewords given below
in which “e” is encoded with 110 which is a prefix of “f”

The decoding of 11000100110 is ambiguous:


11000100110 => face
11000100110 => eaace
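Whether a given set of codewords satisfies the prefix property can be checked mechanically. A minimal sketch, using the pair 10 and 101 mentioned earlier:

```java
public class PrefixCheck {
    // Returns true if no codeword in the set is a prefix of another.
    static boolean isPrefixFree(String[] codes) {
        for (int i = 0; i < codes.length; i++)
            for (int j = 0; j < codes.length; j++)
                if (i != j && codes[j].startsWith(codes[i]))
                    return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPrefixFree(new String[]{"10", "101"}));           // false
        System.out.println(isPrefixFree(new String[]{"00", "01", "10", "11"})); // true
    }
}
```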
Now let us look at some examples
Example 1:

Encode (compress) the message tenseas using the following codewords:

Answer: Replace each character with its codeword:


001011101010110010
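Encoding with a table of codewords is plain substitution, as in Example 1. A minimal sketch follows; the three-character table used in main is hypothetical, not the table from the example.

```java
import java.util.Map;

public class Encoder {
    // Replace each character of the message with its codeword.
    static String encode(String message, Map<Character, String> codes) {
        StringBuilder bits = new StringBuilder();
        for (char c : message.toCharArray())
            bits.append(codes.get(c));
        return bits.toString();
    }

    public static void main(String[] args) {
        Map<Character, String> codes = Map.of('a', "0", 'b', "10", 'c', "11");
        System.out.println(encode("abc", codes)); // 01011
    }
}
```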
Example 2: Decode (decompress) each of the following encoded messages, if
possible, using the Huffman codeword tree given below 0110011101000 and
11101110101011.

Answer: Decode a bit-stream by starting at the root and proceeding down the
tree according to the bits in the message (0 = left, 1 = right). When a leaf is
encountered, output the character at that leaf and restart at the root. If a leaf
cannot be reached, the bit-stream cannot be decoded.
(a) 0110011101000 => lost
(b) 11101110101011 => The decoding fails in this case because the
corresponding node for 11 is not a leaf
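The decoding procedure just described (start at the root, go left on 0 and right on 1, emit a character at each leaf) can be sketched with a small node type. The tree built in main is a hypothetical three-character code, not the tree of Example 2.

```java
public class Decoder {
    static class Node {
        char ch; Node left, right;              // leaf if both children are null
        Node(char ch) { this.ch = ch; }
        Node(Node l, Node r) { left = l; right = r; }
        boolean isLeaf() { return left == null && right == null; }
    }

    // Walk from the root: 0 = left, 1 = right; emit the character at each leaf.
    // Returns null if the stream ends mid-path, i.e. decoding fails.
    static String decode(String bits, Node root) {
        StringBuilder out = new StringBuilder();
        Node n = root;
        for (char b : bits.toCharArray()) {
            n = (b == '0') ? n.left : n.right;
            if (n.isLeaf()) { out.append(n.ch); n = root; }
        }
        return (n == root) ? out.toString() : null;
    }

    public static void main(String[] args) {
        // codewords: a = 0, b = 10, c = 11
        Node root = new Node(new Node('a'), new Node(new Node('b'), new Node('c')));
        System.out.println(decode("01011", root)); // abc
        System.out.println(decode("0101", root));  // null: stream ends off a leaf
    }
}
```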

In-text Question 2
The Huffman code tree for a particular set of characters is unique. (True or False)?

Answer
False

3.0 Tutor Marked Assignments (Individual or Group)


1. Using the Huffman tree constructed in Section 2.5.1, decode the
following sequence of bits, if possible. Otherwise, where does the
decoding fail?
10100010111010001000010011
2. Using the same Huffman tree as in question 1, write the bit sequences that
encode the messages:
test , state , telnet , notes
3. Mention one disadvantage of a lossless compression scheme and one
disadvantage of a lossy compression scheme.

4. Write a Java program that implements the Huffman coding algorithm.

4.0 Conclusion/Summary
In this study session, you were introduced to data compression, the need to
compress data and how to go about compressing data. You also learnt the
different techniques to compress data. You were then introduced to compression
utilities and formats, run-length encoding and static Huffman coding. We
concluded the study session by looking at the prefix property. In the next study
session, you will learn about graphs.

5.0 Self-Assessment Questions


1. Which category of data compression are applied to general data?
2. In which method of lossless compression techniques is the mapping from
the set of messages to the set of codewords fixed before transmission
begins?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/2KWfKCS , http://bit.ly/2HnJVR5 , http://bit.ly/2HnJVR5 ,

http://bit.ly/30zWjVW , http://bit.ly/33ZeA0Z, http://bit.ly/2KVExa7 . Watch the video &


summarise in 1 paragraph
b. View the animation on http://bit.ly/33ZeA0Z and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on static Huffman coding; In 2
paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers


1. Lossless compression
2. Static Method

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 2
Graphs
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- What is a Graph?
2.2- Some Example applications of Graphs
2.3- Graph Terminologies
2.4- Representation of Graphs
2.4.1- Adjacency Matrix
2.4.2- Adjacency Lists
2.4.3- Simple Lists
2.6- Implementation of Graph
2.6.1- Identification of Classes and Interfaces
2.6.2- Concrete Implementations for Graph
2.7- Graph Traversals
2.7.1- Depth-First Traversals.
2.7.2- Breadth-First Traversal.
2.8- Testing for Connectedness and Cycles
2.8.1- Connectedness of an Undirected Graph
2.8.2- Implementation of Connectedness detection Algorithm.
2.8.3- Connectedness of a Directed Graph
2.8.4- Implementation of Strong Connectedness Algorithm.
2.8.5- Cycles in a Directed Graph.
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)

7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In this study session, we introduce graphs. Graphs are a generalisation of trees.
You will learn what a graph is, application of graphs and some graph
terminologies. You will also learn different ways of representing graphs. Next,
you will study graph implementation and graph traversals. We conclude this
study session by studying how to test for connectedness and cycles in a graph.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Define a graph
2. Outline application of graphs
3. List and explain graph terminologies
4. Explain the different ways of representing graphs
5. Implement graphs
6. Traverse graphs using any of the traversal techniques

2.0 Main Content


2.1 What is a Graph?
Have you heard about the graph data structure before? Well, a graph is a
generalisation of a tree. A simple graph G = (V, E) consists of a non-empty set
V, whose members are called the vertices of G, and a set E of pairs of distinct
vertices from V, called the edges of G.

Figure 3.2.1: Examples of graph

Graphs could be undirected, directed or weighted. In an undirected graph, the


edges don't have a direction; you can go either way on them. Thus, you can go
from vertex A to vertex B, or from vertex B to vertex A, with equal ease.
For a directed graph, you can go in only one direction along an edge - from A to
B but not from B to A, as on a one-way street. The allowed direction is typically
shown with an arrowhead at the end of the edge. Such graphs are usually called
digraphs for short.
In some graphs, edges are given a weight, a number that can represent the
physical distance between two vertices, or the time it takes to get from one
vertex to another, or how much it costs to travel from vertex to vertex (on
airline routes, for example). Such graphs are called weighted graphs. Figure
3.2.1 above shows examples of these graphs.

2.2 Some Example applications of Graphs


Graphs have many applications in computing. Some of them are:
1. Finding the least congested route between two phones, given connections
between switching stations.
2. Determining if there is a way to get from one page to another, just by
following links.
3. Finding the shortest path from one city to another.
4. As a travelling salesperson, finding the cheapest path that passes through
all the cities that the salesperson must visit.
5. Determining an ordering of courses so that prerequisite courses are
always taken first.

2.3 Graph Terminologies


There are a number of graph terminologies. It is important that you have a clear
understanding of these terminologies and are able to differentiate between them.
We now outline some of them.
Adjacent Vertices: Two vertices are said to be adjacent if there is a connecting
edge between them.
A Path: A path is a sequence of adjacent vertices.
A Cycle: A cycle is a path in which the last and first vertices are adjacent.
Connected graph: A graph is said to be connected if there is a path from any
vertex to every other vertex in the graph.

Figure 3.2.2(a): Graph terminologies

Path and cycles in a digraph: Paths and cycles in a digraph must move in the
direction specified by the arrow.
Connectedness in a digraph: Connectedness in a digraph can be either strong
or weak.
- Strongly Connected: A digraph is said to be strongly connected if
connected as a digraph - following the arrows.
- Weakly connected: A digraph is said to be weakly connected if the
underlying undirected graph is connected (i.e. ignoring the arrows).

Figure 3.2.2(b): Graph terminologies
Emanate: An edge e = (v, w) is said to emanate from v.
- A(v) denotes the set of all edges emanating from v.
Incident: An edge e = (v, w) is said to be incident to w.
- I(w) denote the set of all edges incident to w.
Out-degree: The out-degree of a vertex v is the number of edges emanating
from v -- |A(v)|
In-degree: The in-degree of a vertex v is the number of edges incident to w --
|I(w)|.

Figure 3.2.2(c): Graph terminologies

2.4 Representation of Graphs


For vertices in a graph, an array or a linked list can be used to represent them.
For edges, any of the following could be used:
- Adjacency Matrix (Two-dimensional array)
- Adjacency List (One-dimensional array of linked lists)
- Linked List (one list only)

2.4.1 Adjacency Matrix
An Adjacency Matrix uses a 2-D array of dimension |V|x|V| for edges. (For
vertices, a 1-D array is used). The presence or absence of an edge, (v, w) is
indicated by the entry in row v, column w of the matrix. For an unweighted
graph, boolean values could be used while for a weighted graph, the actual
weights are used.

Figure 3.2.3(a): Adjacency Matrix

Here are some notes on adjacency matrix.


- For an undirected graph, the adjacency matrix is always symmetric.
- In a simple graph, all diagonal elements are zero (i.e. no edge from a
vertex to itself).
- The space requirement of an adjacency matrix is O(n²), most of it wasted
for a graph with few edges. However, entries in the matrix can be
accessed directly.

Figure 3.2.3(b): Adjacency Matrix

2.4.2 Adjacency Lists


This involves representing the set of vertices adjacent to each vertex as a
list, thus generating a set of lists. This can be implemented in different ways.

We will represent vertices as a one dimensional array and edges as an array of
linked list (the emanating edges of vertex 1 will be in the list of the first
element, and so on).

Figure 3.2.4: Adjacency List

2.4.3 Simple Lists


For simple list representation, vertices are represented as a 1-D array or a linked
list while edges are represented as one linked list with each edge containing the
information about its two vertices.

Figure 3.2.5: Simple List Representation

2.6 Implementation of Graph


2.6.1 Identification of Classes and Interfaces
A graph contains vertices and edges. We can identify three kinds of objects:
vertices, edges, and graphs. Accordingly, we define three interfaces:
- Vertex
- Edge

- Graph
A graph can be represented in different ways (we know three of them).
Accordingly, we use the following six classes.
- AbstractGraph (having the following two inner classes)
• GraphVertex
• GraphEdge
- GraphAsMatrix
- GraphAsArrayLists
- GraphAsLists

Figure 3.2.6: Classes and Interfaces

2.6.2 Concrete Implementations for Graph


We now provide concrete implementations for the classes and interfaces
identified.
The Vertex Interface
Each vertex must be distinguishable from other vertices. Thus, each vertex
should have a unique label. Some applications require vertex-weighted graphs.
A Vertex object belongs to exactly one graph; it is neither independent nor
shared between two graphs. Otherwise, getIncidentEdges() and other methods
would not make sense.

public interface Vertex extends Comparable


{
public String getLabel();
public Comparable getWeight();
public Iterator getIncidentEdges();
public Iterator getEmanatingEdges();
public Iterator getPredecessors();
public Iterator getSuccessors();
}

The Edge Interface


An edge in a directed graph is an ordered pair of vertices, while in an undirected
graph it is a set of two vertices. We use the same class for both; the context
determines whether it is directed or undirected. Some edges may have a weight.
public interface Edge extends Comparable
{
public abstract Vertex getFromVertex();
public abstract Vertex getToVertex();
public abstract Comparable getWeight();
public abstract boolean isDirected();
public abstract Vertex getMate(Vertex vertex);
}

The Graph Interface


The graph interface represents both directed and undirected graphs.
In-text Question 1
For undirected graphs, the adjacency matrix is always symmetric. (True or False)?

Answer
True

public interface Graph {
public int getNumberOfEdges();
public int getNumberOfVertices();
public Iterator getVertices();
public Iterator getEdges();
public void addVertex(String label);
public void addVertex(String label, Comparable weight);
public Vertex getVertex(String label);
public int getIndex(Vertex v);// vertices are indexed starting from zero
public void addEdge(String from, String to);
public void addEdge(String from, String to, Comparable weight);
public Edge getEdge(String from, String to);
public boolean isReachable(String from, String to);
public boolean isDirected();
public boolean isWeighted();
public boolean isConnected();
public abstract boolean isStronglyConnected();
public abstract boolean isWeaklyConnected();
public boolean isCyclic();
public void preorderDepthFirstTraversal(Visitor visitor, Vertex start);
public void postorderDepthFirstTraversal(Visitor visitor, Vertex start);
public void breadthFirstTraversal(Visitor visitor, Vertex start);
public abstract int topologicalOrderTraversal(Visitor visitor);
}

The AbstractGraph class


The following introduces the AbstractGraph class.

public abstract class AbstractGraph implements Graph {


protected int numberOfVertices;
protected int numberOfEdges;
protected boolean directed;

public AbstractGraph(boolean directed){


numberOfVertices = 0;
numberOfEdges = 0;
this.directed = directed;
}
public int getNumberOfVertices(){return numberOfVertices;}
public int getNumberOfEdges(){return numberOfEdges;}
public void purge() { // extended in subclasses
numberOfVertices = 0;
numberOfEdges = 0;
}
public void addVertex(String label){addVertex(label, null);}
public void addEdge(String from, String to){addEdge(from, to, null);}
public boolean isDirected() {return directed;}

public boolean isWeighted(){
Iterator p = getEdges();
if(((Edge)p.next()).getWeight() == null) return false;
return true;
}
public Vertex getVertex(String label){
Iterator i = getVertices();
while (i.hasNext()){
Vertex v = (Vertex) i.next();
if (v.getLabel().equals(label)) return v;
}
return null;
}
public Edge getEdge(String from, String to){
Iterator i = getEdges();
while (i.hasNext()){
Edge e = (Edge) i.next();
if (e.getFromVertex().getLabel().equals(from) &&

e.getToVertex().getLabel().equals(to))
return e;
}
return null;
}
public Iterator getEmanatingEdges(Vertex from) {
Iterator i = getEdges();
MyLinkedList emEdges = new MyLinkedList();
while (i.hasNext()){
Edge edge = (Edge) i.next();
if (edge.getFromVertex().equals(from))
emEdges.append(edge);
}
return emEdges.iterator();
}
public Iterator getIncidentEdges(Vertex to) {
Iterator i = getEdges();
MyLinkedList inEdges = new MyLinkedList();
while (i.hasNext()){
Edge edge = (Edge) i.next();
if (edge.getToVertex().equals(to))
inEdges.append(edge);
}
return inEdges.iterator();
}
public int getIndex(Vertex v){ return getIndex(v.getLabel()); }
protected abstract int getIndex(String label);

The GraphVertex class
The GraphVertex class is implemented as an inner class:
protected final class GraphVertex implements Vertex {
protected String label; protected Comparable weight;
protected GraphVertex(String s, Comparable w) {
label = s; weight = w;
}
protected GraphVertex(String s) {this(s, null);}
public int compareTo(Object obj) {
return label.compareTo(((GraphVertex)obj).getLabel());
}
public Iterator getIncidentEdges() {
return AbstractGraph.this.getIncidentEdges(this);
}
public Iterator getPredecessors() {
return new Iterator() {
Iterator edges = getIncidentEdges();
public boolean hasNext() {return edges.hasNext();}
public Object next() {
Edge edge = (Edge)edges.next();
return edge.getMate(GraphVertex.this);
}
};
}
}
The getEmanatingEdges and getSuccessors methods are implemented in the same way.

The GraphEdge class


The GraphEdge class is also implemented as an inner class:

protected final class GraphEdge implements Edge {


protected Vertex startVertex, endVertex;
protected Comparable weight;
protected GraphEdge(Vertex v1, Vertex v2, Comparable w) {
startVertex = v1; endVertex = v2; weight = w;
}
protected GraphEdge(Vertex v1, Vertex v2) {this(v1, v2, null);}
public Vertex getFromVertex() {return startVertex;}
public Vertex getToVertex() {return endVertex;}
public Comparable getWeight() {return weight;}
public Vertex getMate(Vertex v) {
if(v.equals(startVertex)) return endVertex;
if(v.equals(endVertex)) return startVertex;
else throw new InvalidOperationException("invalid vertex");
}
public boolean isDirected() {
return AbstractGraph.this.isDirected();
}
// …
}
Implementing GraphAsMatrix (Adjacency Matrix)
The following describes the concrete class, GraphAsMatrix:
public class GraphAsMatrix extends AbstractGraph {
private int size;
private Vertex[] vertices;
private Edge[][] edges;
public GraphAsMatrix(int size, boolean directed) {
super(directed);
this.size = size;
vertices = new GraphVertex[size];
edges = new Edge[size][size];
}
public void purge() {
for (int i=0;i<size;i++){
vertices[i] = null;
for (int j=0;j<size;j++) edges[i][j] = null;
}
super.purge();
}
public int getIndex(String label){
for (int i=0;i<numberOfVertices;i++)
if (vertices[i].getLabel().equals(label)) return i;
return -1;
}
public void addVertex(String label, Comparable weight){
if (getIndex(label)!=-1)
throw new IllegalArgumentException("Duplicate vertex");
if (numberOfVertices == size)
throw new IllegalArgumentException("Graph is full");
vertices[numberOfVertices++] = new GraphVertex(label, weight);
}
public void addEdge(String from, String to, Comparable weight){
    int i = getIndex(from);
    int j = getIndex(to);
    if (i==-1 || j==-1)
        throw new IllegalArgumentException("Vertex not in this graph");
    if (i == j)
        throw new IllegalArgumentException("Loops not supported");
    if (edges[i][j] == null){
        edges[i][j] = new GraphEdge(vertices[i], vertices[j], weight);
        numberOfEdges++;
        if (!isDirected() && edges[j][i]==null){
            edges[j][i] = new GraphEdge(vertices[j], vertices[i], weight);
            numberOfEdges++;
        }
    }
}

public Iterator getVertices(){
    return new Iterator(){
        int index = 0;
        public boolean hasNext(){return index < numberOfVertices;}
        public Object next(){return vertices[index++];}
    };
}
public Iterator getEdges() {
    return new Iterator(){
        int count = 0, i = 0, j = 0;
        public boolean hasNext(){return count < numberOfEdges;}
        public Object next(){
            if (count==numberOfEdges) throw new NoSuchElementException();
            while (i<numberOfVertices && j<numberOfVertices && edges[i][j]==null){
                j++; if (j==numberOfVertices){j=0; i++;}
            }
            Edge r = edges[i][j];
            count++;
            // for next call, adjust i and j
            j++; if (j==numberOfVertices){j=0; i++;}
            return r;
        }
    };
}

Implementing GraphAsLists (Simple List)


The following describes the concrete class, GraphAsLists:

public class GraphAsLists extends AbstractGraph {
private MyLinkedList listOfVertices, listOfEdges;
public GraphAsLists(boolean directed) {
super(directed);
listOfVertices = new MyLinkedList();
listOfEdges = new MyLinkedList();
}
public void purge() {
listOfVertices.purge();
listOfEdges.purge();
super.purge();
}
public int getIndex(String label){
int index = -1;
MyLinkedList.Element e = listOfVertices.getHead();
while (e != null){
index++;
Vertex v = (Vertex) e.getData();
if (label.equals(v.getLabel())) return index;
e = e.getNext();
}
return -1;
}

public void addVertex(String label, Comparable weight){
    if (getIndex(label)!=-1)
        throw new IllegalArgumentException("Duplicate vertex");
    listOfVertices.append(new GraphVertex(label, weight));
    numberOfVertices++;
}
public void addEdge(String from, String to, Comparable weight){
    Vertex fromVertex = getVertex(from);
    Vertex toVertex = getVertex(to);
    if (fromVertex==null || toVertex==null)
        throw new IllegalArgumentException("Vertex not in this graph");
    if (fromVertex == toVertex)
        throw new IllegalArgumentException("Loops not supported");
    if (getEdge(from, to)==null){
        listOfEdges.append(new GraphEdge(fromVertex, toVertex, weight));
        numberOfEdges++;
        if (!isDirected() && getEdge(to, from)==null){
            listOfEdges.append(new GraphEdge(toVertex, fromVertex, weight));
            numberOfEdges++;
        }
    }
}
public Iterator getEdges() {return listOfEdges.iterator();}
public Iterator getVertices() {return listOfVertices.iterator();}
}

Implementing GraphAsArrayLists (Adjacency List)

public class GraphAsArrayLists extends AbstractGraph {
private int size;
private Vertex[] vertices;
private MyLinkedList[] edges;
public GraphAsArrayLists(int size, boolean directed) {
super(directed);
this.size = size;
vertices = new GraphVertex[size];
edges = new MyLinkedList[size];
for (int i=0;i<size;i++) edges[i] = new MyLinkedList();
}
// These methods are similar to those in the GraphAsMatrix class
public int getIndex(String label)
public void addVertex(String label, Comparable weight)
public Iterator getVertices()
// These methods will be implemented in the lab
public void purge()
public void addEdge(String from, String to, Comparable weight)
public Iterator getEdges()
}

2.7 Graph Traversals


2.7.1 Depth-First Traversals Algorithm
In this method, after visiting a vertex v that is adjacent to w1, w2, w3, ...,
we next visit one of v's adjacent vertices, say w1. We then visit all vertices
adjacent to w1 before coming back to w2, and so on. We must keep track of
vertices already visited to avoid cycles. The method can be implemented using
recursion or iteration.

The iterative preorder depth-first algorithm is:

1 push the starting vertex onto the stack


2 while(stack is not empty){
3 pop a vertex off the stack, call it v
4 if v is not already visited, visit it
5 push vertices adjacent to v, not visited, onto the stack
6 }

Note: Adjacent vertices can be pushed in any order; but to obtain a unique
traversal, we will push them in reverse alphabetical order.
Example: Demonstrates depth-first traversal using an explicit stack.

Figure 3.2.7: depth-first traversal
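The iterative algorithm above can be sketched directly, using java.util.ArrayDeque as the explicit stack and a plain adjacency map in place of the Graph classes of Section 2.6. Neighbours are pushed in reverse alphabetical order so that they are popped alphabetically, as the note prescribes.

```java
import java.util.*;

public class IterativeDfs {
    // Preorder depth-first traversal with an explicit stack.
    static List<String> dfs(Map<String, List<String>> adj, String start) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);                                  // step 1
        while (!stack.isEmpty()) {                          // step 2
            String v = stack.pop();                         // step 3
            if (visited.add(v)) {                           // step 4: visit if new
                order.add(v);
                List<String> ws = new ArrayList<>(adj.getOrDefault(v, List.of()));
                ws.sort(Comparator.reverseOrder());         // push in reverse order
                for (String w : ws)                         // step 5
                    if (!visited.contains(w)) stack.push(w);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = new HashMap<>();
        adj.put("a", List.of("b", "c"));
        adj.put("b", List.of("d"));
        adj.put("c", List.of("d"));
        System.out.println(dfs(adj, "a")); // [a, b, d, c]
    }
}
```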

Recursive preorder Depth-First Traversal Implementation

dfsPreorder(v){
visit v;
for(each neighbour w of v)
if(w has not been visited)
dfsPreorder(w);
}

The following is the code for the recursive preorderDepthFirstTraversal method of the AbstractGraph class:
public void preorderDepthFirstTraversal(Visitor visitor, Vertex start) {
    boolean visited[] = new boolean[numberOfVertices];
    for (int v = 0; v < numberOfVertices; v++)
        visited[v] = false;
    preorderDepthFirstTraversal(visitor, start, visited);
}

private void preorderDepthFirstTraversal(Visitor visitor, Vertex v, boolean[] visited) {
    if (visitor.isDone())
        return;
    visitor.visit(v);
    visited[getIndex(v)] = true;
    Iterator p = v.getSuccessors();
    while (p.hasNext()) {
        Vertex to = (Vertex) p.next();
        if (!visited[getIndex(to)])
            preorderDepthFirstTraversal(visitor, to, visited);
    }
}

Figure 3.2.8: Recursive Preorder Depth-first implementation

Recursive postorder Depth-First Traversal Implementation


dfsPostorder(v){
mark v;
for(each neighbour w of v)
if(w is not marked)
dfsPostorder(w);
visit v;
}
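The two recursive variants differ only in where the visit happens. The self-contained sketch below (illustrative names, plain integer vertices rather than the course's Vertex classes) collects both visit sequences from one traversal:

```java
import java.util.*;

public class RecursiveDfs {
    // Recursive DFS; 'pre' records vertices when first reached (preorder),
    // 'post' records them after all their neighbours are explored (postorder).
    static void dfs(List<List<Integer>> adj, int v, boolean[] seen,
                    List<Integer> pre, List<Integer> post) {
        seen[v] = true;
        pre.add(v);                       // visit before the neighbours
        for (int w : adj.get(v))
            if (!seen[w]) dfs(adj, w, seen, pre, post);
        post.add(v);                      // visit after the neighbours
    }

    public static String demo() {
        // 0 -> {1, 2}, 1 -> {3}; vertex numbers stand in for labels
        List<List<Integer>> adj = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3),
            Arrays.<Integer>asList(), Arrays.<Integer>asList());
        List<Integer> pre = new ArrayList<>(), post = new ArrayList<>();
        dfs(adj, 0, new boolean[4], pre, post);
        return pre + " / " + post;
    }
}
```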
The following is the code for the recursive postorderDepthFirstTraversal
method of the AbstractGraph class:

public void postorderDepthFirstTraversal(Visitor visitor, Vertex start) {
    boolean visited[] = new boolean[numberOfVertices];
    for (int v = 0; v < numberOfVertices; v++)
        visited[v] = false;
    postorderDepthFirstTraversal(visitor, start, visited);
}

In-text Question 2
What is the in-degree of a vertex v?

Answer
The in-degree of a vertex v is the number of edges directed into v.

private void postorderDepthFirstTraversal(Visitor visitor, Vertex v, boolean[] visited) {
    if (visitor.isDone())
        return;
    // mark v
    visited[getIndex(v)] = true;
    Iterator p = v.getSuccessors();
    while (p.hasNext()) {
        Vertex to = (Vertex) p.next();
        if (!visited[getIndex(to)])
            postorderDepthFirstTraversal(visitor, to, visited);
    }
    // visit v
    visitor.visit(v);
}

Figure 3.2.8: Recursive Postorder Depth-first implementation

2.7.2 Breadth-First Traversal


In this method, after visiting a vertex v, we visit all its adjacent vertices w1, w2, w3, ..., before going down to the next level to visit the vertices adjacent to w1, and so on. The method can be implemented using a queue. A boolean array is used to ensure that each vertex is enqueued only once.

1 enqueue the starting vertex


2 while(queue is not empty){
3 dequeue a vertex v from the queue;
4 visit v.
5 enqueue vertices adjacent to v that were never enqueued;
6 }

Note: Adjacent vertices can be enqueued in any order; but to obtain a unique
traversal, we will enqueue them in alphabetical order.
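The queue-based steps above can be sketched in a self-contained form. As with the earlier sketches, the class name and integer-vertex encoding are illustrative, not the course's Vertex and Queue classes:

```java
import java.util.*;

public class BfsSketch {
    // Breadth-first traversal; the 'enqueued' array guarantees each vertex
    // enters the queue at most once.
    public static List<Integer> bfs(List<List<Integer>> adj, int start) {
        boolean[] enqueued = new boolean[adj.size()];
        Queue<Integer> queue = new ArrayDeque<>();
        List<Integer> order = new ArrayList<>();
        enqueued[start] = true;
        queue.add(start);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            order.add(v);                          // visit v
            for (int w : adj.get(v))
                if (!enqueued[w]) { enqueued[w] = true; queue.add(w); }
        }
        return order;
    }

    public static String demo() {
        // 0 -> {1, 2}, 1 -> {3}, 2 -> {3}: vertex 3 is enqueued only once
        List<List<Integer>> adj = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3),
            Arrays.asList(3), Arrays.<Integer>asList());
        return bfs(adj, 0).toString();
    }
}
```

The demo graph shows why the boolean array matters: both 1 and 2 point at 3, but 3 is visited exactly once.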
Example: Demonstrating breadth-first traversal using a queue.

Figure 3.2.9: breadth-first traversal

Breadth-First Traversal Implementation

Below is the implementation for the breadth-first traversal algorithm.

public void breadthFirstTraversal(Visitor visitor, Vertex start) {
    boolean enqueued[] = new boolean[numberOfVertices];
    for (int i = 0; i < numberOfVertices; i++)
        enqueued[i] = false;
    Queue queue = new QueueAsLinkedList();
    enqueued[getIndex(start)] = true;
    queue.enqueue(start);
    while (!queue.isEmpty() && !visitor.isDone()) {
        Vertex v = (Vertex) queue.dequeue();
        visitor.visit(v);
        Iterator it = v.getSuccessors();
        while (it.hasNext()) {
            Vertex to = (Vertex) it.next();
            int index = getIndex(to);
            if (!enqueued[index]) {
                enqueued[index] = true;
                queue.enqueue(to);
            }
        }
    }
}

2.8 Testing for Connectedness and Cycles
2.8.1 Connectedness of an Undirected Graph
An undirected graph G = (V, E) is connected if there is a path between every pair of vertices. Although figure 3.2.11 below appears to show two graphs, it is actually a single graph. Clearly, G is not connected; for example, there is no path between A and D. G consists of two unconnected parts, each of which is a connected sub-graph, called a connected component.

Figure 3.2.11: Connectedness of an undirected graph

2.8.2 Implementation of Connectedness detection Algorithm


A simple way to test for connectedness in an undirected graph is to use either depth-first or breadth-first traversal: the graph is connected if and only if the traversal visits every vertex. The algorithm uses the following visitor:
public class CountingVisitor extends AbstractVisitor {
protected int count;
public int getCount(){ return count;}
public void visit(Object obj) {count++;}
}

Using the CountingVisitor, the isConnected method is implemented as follows:

public boolean isConnected() {
    CountingVisitor visitor = new CountingVisitor();
    Iterator i = getVertices();
    Vertex start = (Vertex) i.next();
    breadthFirstTraversal(visitor, start);
    return visitor.getCount() == numberOfVertices;
}
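The same count-the-visits idea can be sketched without the course's Visitor classes. The version below (illustrative names, integer vertices) runs one breadth-first traversal and compares the visit count with |V|:

```java
import java.util.*;

public class Connectedness {
    // An undirected graph is connected iff a single BFS from any vertex
    // reaches all |V| vertices.
    public static boolean isConnected(List<List<Integer>> adj) {
        if (adj.isEmpty()) return true;
        boolean[] seen = new boolean[adj.size()];
        Queue<Integer> q = new ArrayDeque<>();
        seen[0] = true;
        q.add(0);
        int count = 0;                   // plays the role of CountingVisitor
        while (!q.isEmpty()) {
            int v = q.remove();
            count++;
            for (int w : adj.get(v))
                if (!seen[w]) { seen[w] = true; q.add(w); }
        }
        return count == adj.size();
    }

    public static String demo() {
        // a path 0-1-2 (connected) versus an edge 0-1 plus an isolated 2
        List<List<Integer>> connected = Arrays.asList(
            Arrays.asList(1), Arrays.asList(0, 2), Arrays.asList(1));
        List<List<Integer>> twoParts = Arrays.asList(
            Arrays.asList(1), Arrays.asList(0), Arrays.<Integer>asList());
        return isConnected(connected) + "," + isConnected(twoParts);
    }
}
```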

2.8.3 Connectedness of a Directed Graph
A directed graph G = (V, E) is strongly connected if there is a directed path
between every pair of vertices. Is the directed graph in figure 12 connected?
G is not strongly connected: there is no directed path between some pairs of vertices involving {D, E, F}. However, G is weakly connected, since the underlying undirected graph is connected.

Figure 3.2.12: Connectedness of a directed graph

2.8.4 Implementation of Strong Connectedness Algorithm


A simple way to test for strong connectedness is to use |V| traversals, one from each vertex: the graph is strongly connected if all the vertices are visited in each traversal.
public boolean isStronglyConnected() {
    if (!this.isDirected())
        throw new InvalidOperationException("Invalid for Undirected Graph");
    Iterator it = getVertices();
    while (it.hasNext()) {
        CountingVisitor visitor = new CountingVisitor();
        breadthFirstTraversal(visitor, (Vertex) it.next());
        if (visitor.getCount() != numberOfVertices)
            return false;
    }
    return true;
}

2.8.5 Cycles in a Directed Graph


An easy way to detect the presence of cycles in a directed graph is to attempt a
topological order traversal. This algorithm visits all the vertices of a directed
graph if the graph has no cycles.

In the following graph, after A is visited and removed, all the remaining vertices
have in-degree of one. Thus, a topological order traversal cannot complete. This
is because of the presence of the cycle {B, C, D, B}.

Figure 3.2.13: Cycles in digraphs

public boolean isCyclic() {
    CountingVisitor visitor = new CountingVisitor();
    topologicalOrderTraversal(visitor);
    return visitor.getCount() != numberOfVertices;
}
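The test can also be sketched in a self-contained form: repeatedly remove vertices of in-degree zero and count how many can be removed. The integer-vertex encoding and class name below are illustrative assumptions, not the course's classes:

```java
import java.util.*;

public class CycleCheck {
    // Topological-removal check: repeatedly delete in-degree-0 vertices;
    // if fewer than |V| vertices can be deleted, the leftovers lie on a cycle.
    public static boolean isCyclic(List<List<Integer>> adj) {
        int n = adj.size();
        int[] inDeg = new int[n];
        for (List<Integer> ns : adj) for (int w : ns) inDeg[w]++;
        Deque<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < n; v++) if (inDeg[v] == 0) ready.add(v);
        int removed = 0;
        while (!ready.isEmpty()) {
            int v = ready.remove();
            removed++;
            for (int w : adj.get(v)) if (--inDeg[w] == 0) ready.add(w);
        }
        return removed != n;
    }

    public static String demo() {
        // 0 -> 1 -> 2 -> 3 -> 1 contains the cycle {1, 2, 3, 1},
        // like the cycle {B, C, D, B} in the text; the second graph is a DAG
        List<List<Integer>> g = Arrays.asList(
            Arrays.asList(1), Arrays.asList(2), Arrays.asList(3), Arrays.asList(1));
        List<List<Integer>> dag = Arrays.asList(
            Arrays.asList(1), Arrays.asList(2), Arrays.asList(3), Arrays.<Integer>asList());
        return isCyclic(g) + "," + isCyclic(dag);
    }
}
```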

3.0 Tutor Marked Assignments (Individual or Group)

1. Consider the undirected graph GA shown above. List the elements of V


and E. Then, for each vertex v in V, do the following:
i. Compute the in-degree of v
ii. Compute the out-degree of v
iii. List the elements of A(v)
iv. List the elements of I(v).
2. Consider the undirected graph GA shown above.
i. Show how the graph is represented using adjacency matrix.
ii. Show how the graph is represented using adjacency lists.
3. Repeat Exercises 1 and 2 for the directed graph GB shown above.

4. Write an instance method public Edge minWeightEdge() in one of the
concrete Graph classes that returns the minimum-weight edge. Your
method must throw an appropriate exception if the graph is not weighted.
Your method must not use any Iterator.
5. Write an instance method public int countSpecialEdges() of
AbstractGraph that counts the number of edges in the invoking object
that have starting vertex greater than ending vertex (based on compareTo
method).

6. Consider a depth-first traversal of the undirected graph GA shown above,


starting from vertex a.
i. List the order in which the nodes are visited in a preorder traversal.
ii. List the order in which the nodes are visited in a postorder traversal
7. Repeat exercise 1 above for a depth-first traversal starting from vertex d.
8. List the order in which the nodes of the undirected graph GA shown
above are visited by a breadth first traversal that starts from vertex a.
Repeat this exercise for a breadth-first traversal starting from vertex d.
9. Repeat Exercises 6 and 8 for the directed graph GB.
10.Every tree is a directed, acyclic graph (DAG), but there exist DAGs that
are not trees.
a) How can we tell whether a given DAG is a tree?
b) Devise an algorithm to test whether a given DAG is a tree.
11.Consider an acyclic, connected, undirected graph G that has n vertices.
How many edges does G have?

12.In general, an undirected graph contains one or more connected
components.
a) Devise an algorithm that counts the number of connected
components in a graph.
b) Devise an algorithm that labels the vertices of a graph in such a
way that all the vertices in a given connected component get the
same label and vertices in different connected components get
different labels.
13. Devise an algorithm that takes as input a graph, and a pair of vertices, v
and w, and determines whether w is reachable from v.
4.0 Conclusion/Summary
In this study session, we introduced graphs as a generalisation of trees. You
learnt what a graph is, application of graphs and some graph terminologies. You
also learnt different ways of representing graphs. Next, you studied graph
implementation and graph traversals. We concluded the study session by
studying how to test for connectedness and cycles in a graph. In the next study
session, you will learn about topological sort.
5.0 Self-Assessment Questions
1. What is a graph?
2. When is a graph said to be connected?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2Zm1nQo , http://bit.ly/2ZrUMQn , http://bit.ly/2KU4uaa,

http://bit.ly/2zogYQW, http://bit.ly/2ZteSd4, http://bit.ly/2U2FkJv, http://bit.ly/2KXJgIA . Watch


the video & summarise in 1 paragraph
b. View the animation on http://bit.ly/33YXJLF and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on applications of graphs; In 2
paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers
1. A simple graph G = (V, E) consists of a non-empty set V, whose
members are called the vertices of G, and a set E of pairs of distinct
vertices from V, called the edges of G.
2. A graph is said to be connected if there is a path from any vertex to
every other vertex in the graph.

8.0References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 3
Topological Sort
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Introduction.
2.2 - Definition of Topological Sort.
2.3 - Topological Sort is Not Unique.
2.4 - Topological Sort Algorithm
2.5 - Implementation of Topological Sort
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
There are many problems involving a set of tasks in which some of the tasks
must be done before others. In this study session, we introduce topological sort
as one of the ways of solving these problems. You will learn what topological
sort is, examples and finally its implementation.

1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. define topological sort
2. demonstrate topological sort
3. Implement topological sort

2.0 Main Content


2.1 Introduction
There are many problems involving a set of tasks in which some of the tasks
must be done before others. For example, consider the problem of taking a
course only after taking its prerequisites. Is there any systematic way of linearly
arranging the courses in the order that they should be taken? Yes, this is where
topological sort comes in.

2.2 Definition of Topological Sort


Topological sort is a method of arranging the vertices of a directed acyclic graph (DAG) in a sequence, such that no vertex appears in the sequence before its predecessor.
The graph in figure 1(a) can be topologically sorted as in figure 1(b)

Figure 3.3.1(a): Before sort Figure 3.3.1(b): After sort


2.3 Topological Sort is Not Unique
Topological sort is not unique. The following are all topological sorts of the
graph below:

Figure 3.3.2: Topological Sort

2.4 Topological Sort Algorithm


One way to find a topological sort is to consider in-degrees of the vertices. The
first vertex must have in-degree zero -- every DAG must have at least one
vertex with in-degree zero.

The Topological sort algorithm is:


int topologicalOrderTraversal( ){
int numVisitedVertices = 0;
while(there are more vertices to be visited){
if(there is no vertex with in-degree 0)
break;
else{
select a vertex v that has in-degree 0;
visit v;
numVisitedVertices++;
delete v and all its emanating edges;
}
}
return numVisitedVertices;
}

In-text Question 1
Topological sort is not unique. (True or False)?

Answer
True

Topological Sort Example
Demonstrating Topological Sort.

Figure 3.3.3: Topological sort example

2.5 Implementation of Topological Sort


The algorithm is implemented as a traversal method that visits the vertices in a
topological sort order. An array of length |V| is used to record the in-degrees of
the vertices. Hence no need to remove vertices or edges. A priority queue is
used to keep track of vertices with in-degree zero that are not yet visited.
public int topologicalOrderTraversal(Visitor visitor) {
    int numVerticesVisited = 0;
    int[] inDegree = new int[numberOfVertices];
    for (int i = 0; i < numberOfVertices; i++)
        inDegree[i] = 0;
    Iterator p = getEdges();
    while (p.hasNext()) {
        Edge edge = (Edge) p.next();
        Vertex to = edge.getToVertex();
        inDegree[getIndex(to)]++;
    }

    BinaryHeap queue = new BinaryHeap(numberOfVertices);
    p = getVertices();
    while (p.hasNext()) {
        Vertex v = (Vertex) p.next();
        if (inDegree[getIndex(v)] == 0)
            queue.enqueue(v);
    }

    while (!queue.isEmpty() && !visitor.isDone()) {
        Vertex v = (Vertex) queue.dequeueMin();
        visitor.visit(v);
        numVerticesVisited++;
        p = v.getSuccessors();
        while (p.hasNext()) {
            Vertex to = (Vertex) p.next();
            if (--inDegree[getIndex(to)] == 0)
                queue.enqueue(to);
        }
    }
    return numVerticesVisited;
}
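The same in-degree counting idea can be written against the standard java.util collections. This is a sketch under the assumption that vertices are plain integers; the PriorityQueue makes the traversal pick the smallest ready vertex first, so the order is deterministic:

```java
import java.util.*;

public class TopoSort {
    // Topological order via in-degree counting. A PriorityQueue selects the
    // smallest-numbered in-degree-0 vertex, giving a unique order.
    public static List<Integer> sort(List<List<Integer>> adj) {
        int n = adj.size();
        int[] inDeg = new int[n];
        for (List<Integer> ns : adj) for (int w : ns) inDeg[w]++;
        PriorityQueue<Integer> ready = new PriorityQueue<>();
        for (int v = 0; v < n; v++) if (inDeg[v] == 0) ready.add(v);
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order.add(v);
            for (int w : adj.get(v)) if (--inDeg[w] == 0) ready.add(w);
        }
        return order;       // shorter than n if the graph has a cycle
    }

    public static String demo() {
        // edges: 0 -> 2, 1 -> 2, 2 -> 3
        List<List<Integer>> adj = Arrays.asList(
            Arrays.asList(2), Arrays.asList(2),
            Arrays.asList(3), Arrays.<Integer>asList());
        return sort(adj).toString();
    }
}
```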

In-text Question 2
In the topological sort algorithm, the first vertex must have in-degree zero. (True or False)?

Answer
True

3.0 Tutor Marked Assignments (Individual or Group)

1. List the order in which the nodes of the directed graph GB are visited by
topological order traversal that starts from vertex A.
2. What kind of DAG has a unique topological sort?
3. Generate a directed graph using the required courses for your major. Now
apply topological sort on the directed graph you obtained.

4.0 Conclusion/Summary
In this study session, we introduced topological sort. You also learnt what
topological sort is, examples and finally its implementation. In the next study
session, you will learn about the shortest path problem.

5.0 Self-Assessment Questions


1. Explain what you understand by topological sort
2. In our implementation of topological sort, a stack is used to keep track of
vertices with in-degree zero that are not yet visited. (True or False)?

6.0 Additional Activities (Videos, Animations & Out of Class activities)


a. Visit YouTube http://bit.ly/2MDqJTE , http://bit.ly/2U31Kdz , http://bit.ly/2ZeLfRv ,
http://bit.ly/31XKrgC , http://bit.ly/2LaTbJw, http://bit.ly/2Pekd8l. Watch the video &
summarise in 1 paragraph
b. View the animation on http://bit.ly/2MDqJTE and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on complexity of topological sort; In
2 paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers


1. Topological sort is a method of arranging the vertices of a directed acyclic graph (DAG) in a sequence, such that no vertex appears in the sequence before its predecessor.
2. False

8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 4
Shortest Path Algorithm
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is the Shortest Path Problem?
2.2 - Is the shortest path problem well defined?
2.3 - Dijkstra's Algorithm for the Shortest Path Problem.
2.4 - Data Structures Required for Implementing Dijkstra's Algorithm
2.5 - Implementation of Dijkstra's Algorithm
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
Shortest path deals with finding the fastest way to get to a vertex from another
vertex in a weighted graph. In this study session, we introduce the shortest path
algorithm. In particular, you will learn about Dijkstra's Algorithm for shortest
path problem and its implementation.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Define what a shortest path problem is
2. Explain Dijkstra's algorithm for the shortest path problem

2.0 Main Content
2.1 What is the Shortest Path Problem?
In an edge-weighted graph, the weight of an edge measures the cost of traveling
that edge. For example, in a graph representing a network of airports, the
weights could represent: distance, cost or time. Such a graph could be used to
answer any of the following:
- What is the fastest way to get from A to B?
- Which route from A to B is the least expensive?
- What is the shortest possible distance from A to B?
Each of these questions is an instance of the same problem: The shortest path
problem!

2.2 Is the shortest path problem well defined?


If all the edges in a graph have non-negative weights, then it is possible to find the shortest path between any two vertices. For example, in figure 3.4.1(a) below, the shortest path from B to F is {B, A, C, E, F} with a total cost of nine. Thus, the problem is well defined for a graph that contains non-negative weights.

Figure 3.4.1(a): A weighted graph


Things get difficult for a graph with negative weights. For example, the path D, A, C, E, F costs 4 even though the edge (D, A) costs 5: the longer path is the less costly one. The problem gets even worse if the graph has a negative-cost cycle, e.g. {D, A, C, D}.

A solution can be found even for negative-weight graphs but not for graphs
involving negative cost cycles.

Figure 3.4.1(b): A weighted graph


In-text Question 1
Explain shortest path

Answer
Shortest path deals with finding the fastest way to get to a vertex from another vertex in a
weighted graph.

2.3 Dijkstra's Algorithm for the Shortest Path Problem


Dijkstra's algorithm solves the single-source shortest path problem for a graph with non-negative edge weights. It finds the shortest path from an initial vertex, say s, to all the other vertices. Dijkstra's algorithm is presented below.
// Let V be the set of all vertices in G, and s the start vertex.
for(each vertex v){
    currentDistance(s-v) = ∞;
    predecessor(v) = undefined;
}
currentDistance(s-s) = 0;
T = V;
while(T ≠ ∅){
    v = a vertex in T with minimal currentDistance from s;
    T = T – {v};
    for(each vertex u adjacent to v and in T){
        if(currentDistance(s-u) > currentDistance(s-v) + weight(edge(v,u))){
            currentDistance(s-u) = currentDistance(s-v) + weight(edge(v,u));
            predecessor(u) = v;
        }
    }
}

For each vertex, the algorithm keeps track of its current distance from the starting vertex and its predecessor on the current path.
Example: Trace Dijkstra's algorithm starting at vertex B in the graph below:

Figure 3.4.2: Tracing Dijkstra’s algorithm

2.4 Data Structures Required for Implementing Dijkstra's Algorithm


The implementation of Dijkstra's algorithm uses the Entry structure, which contains the following three fields:
- known: a boolean variable indicating whether the shortest path to v is known, initially false for all vertices.
- distance: the shortest known distance from s to v, initially infinity for all vertices except s, whose distance is 0.
- predecessor: the predecessor of v on the path from s to v, initially unknown for all vertices.

public class Algorithms {
    static final class Entry {
        boolean known;
        int distance;
        Vertex predecessor;

        Entry() {
            known = false;
            distance = Integer.MAX_VALUE;
            predecessor = null;
        }
    }

2.5 Implementation of Dijkstra's Algorithm


The dijkstrasAlgorithm method shown below takes two arguments, a directed graph and the starting vertex. The method returns a vertex-weighted digraph from which the shortest path from s to any vertex can be found. Since in each pass the vertex with the smallest known distance is chosen, a minimum priority queue is used to store the vertices.

public static Graph dijkstrasAlgorithm(Graph g, Vertex start) {
    int n = g.getNumberOfVertices();
    Entry table[] = new Entry[n];
    for (int v = 0; v < n; v++)
        table[v] = new Entry();
    table[g.getIndex(start)].distance = 0;
    PriorityQueue queue = new BinaryHeap(g.getNumberOfEdges());
    queue.enqueue(new Association(new Integer(0), start));

    while (!queue.isEmpty()) {
        Association association = (Association) queue.dequeueMin();
        Vertex v1 = (Vertex) association.getValue();
        int n1 = g.getIndex(v1);
        if (!table[n1].known) {
            table[n1].known = true;
            Iterator p = v1.getEmanatingEdges();
            while (p.hasNext()) {
                Edge edge = (Edge) p.next();
                Vertex v2 = edge.getMate(v1);
                int n2 = g.getIndex(v2);
                Integer weight = (Integer) edge.getWeight();
                int d = table[n1].distance + weight.intValue();
                if (table[n2].distance > d) {
                    table[n2].distance = d;
                    table[n2].predecessor = v1;
                    queue.enqueue(new Association(new Integer(d), v2));
                }
            }
        }
    }

    Graph result = new GraphAsLists(true); // result is a digraph
    Iterator it = g.getVertices();
    while (it.hasNext()) {
        Vertex v = (Vertex) it.next();
        result.addVertex(v.getLabel(),
            new Integer(table[g.getIndex(v)].distance));
    }

    it = g.getVertices();
    while (it.hasNext()) {
        Vertex v = (Vertex) it.next();
        if (v != start) {
            String from = v.getLabel();
            String to = table[g.getIndex(v)].predecessor.getLabel();
            result.addEdge(from, to);
        }
    }
    return result;
}
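The same algorithm can be sketched with the standard java.util.PriorityQueue. This is an illustrative version, not the course code: vertices are plain integers, and stale queue entries are skipped on removal, which plays the same role as the known flag:

```java
import java.util.*;

public class DijkstraSketch {
    // edges[v] holds {neighbour, weight} pairs; returns the array of shortest
    // distances from 'start' (Integer.MAX_VALUE marks unreachable vertices).
    public static int[] shortestPaths(int[][][] edges, int start) {
        int n = edges.length;
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[start] = 0;
        boolean[] known = new boolean[n];
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> a[1] - b[1]);
        pq.add(new int[]{start, 0});
        while (!pq.isEmpty()) {
            int v = pq.remove()[0];
            if (known[v]) continue;       // stale (lazily deleted) entry
            known[v] = true;
            for (int[] e : edges[v]) {
                int u = e[0], w = e[1];
                if (!known[u] && dist[v] + w < dist[u]) {
                    dist[u] = dist[v] + w;
                    pq.add(new int[]{u, dist[u]});
                }
            }
        }
        return dist;
    }

    public static String demo() {
        // 0 -(1)-> 1, 0 -(4)-> 2, 1 -(2)-> 2: going via 1 beats the direct edge
        int[][][] edges = { {{1, 1}, {2, 4}}, {{2, 2}}, {} };
        return Arrays.toString(shortestPaths(edges, 0));
    }
}
```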
In-text Question 2
A shortest path solution can be found for graphs involving negative cost cycles. (True or
False)?

Answer
False

3.0 Tutor Marked Assignments (Individual or Group)

1. Use the graph Gc shown above to trace the execution of Dijkstra's


algorithm as it solves the shortest path problem starting from vertex a.
2. Dijkstra's algorithm works as long as there are no negative edge weights.
Given a graph that contains negative edge weights, we might be tempted
to eliminate the negative weights by adding a constant weight to all of the
edges. Explain why this does not work.
3. Dijkstra's algorithm can be modified to deal with negative edge weights
(but not negative cost cycles) by eliminating the known flag and by
inserting a vertex back into the queue every time its tentative distance
decreases. Implement this modified algorithm.

4.0 Conclusion/Summary
In this study session, we introduced the shortest path algorithm. You learnt
about Dijkstra's algorithm for shortest path problems, how it works and its
implementation. In the next study session, you will learn about minimum
spanning tree.

5.0 Self-Assessment Questions
1. Dijkstra's algorithm solves the single-source shortest path problem for a
non-negative weights graph. (True or False)?
2. Dijkstra's algorithm uses the Entry structure. Mention the three fields an
Entry structure has

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2MBOyvm , http://bit.ly/2z9zgoQ , http://bit.ly/2PcMVpW ,

http://bit.ly/2ZcxrXw , http://bit.ly/31XL6i6. Watch the video & summarise in 1


paragraph
b. View the animation on http://bit.ly/2MBOyvm and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on finding shortest path using
Dijkstra's algorithm; In 2 paragraphs summarise their opinion of the discussed
topic. etc.

7.0 Self Assessment Question Answers


1. True
2.
- known
- distance
- predecessor

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 5
Minimum Spanning Tree
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - What is a Minimum Spanning Tree.
2.2 - Constructing Minimum Spanning Trees.
2.3 - What is a Minimum-Cost Spanning Tree.
2.4 - Applications of Minimum Cost Spanning Trees
2.5 - Prim's Algorithm
2.5.1: Implementation of Prim's Algorithm
2.6: Kruskal's Algorithm
2.6.1: Implementation of Kruskal's Algorithm
2.7: Prim's and Kruskal's Algorithms
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In the last study session of this module, you will learn about minimum spanning
trees and how to construct minimum spanning trees with some examples.
Minimum-cost spanning trees and their applications will also be discussed.

289
1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Define a minimum spanning tree
2. Construct a minimum spanning tree
3. Explain what a minimum-cost spanning tree is
4. Outline applications of minimum-cost spanning tree

2.0 Main Content


2.1 What is a Minimum Spanning Tree?
Let G = (V, E) be a simple, connected, undirected graph that is not edge-weighted. A spanning tree of G is a free tree (i.e., a tree with no root) with |V| - 1 edges that connects all the vertices of the graph.
Thus a spanning tree for G is a graph T = (V', E') with the following properties:
- V' = V
- T is connected
- T is acyclic.
A spanning tree is called a tree because every acyclic undirected graph can be viewed as a general, unordered tree. Because the edges are undirected, any vertex may be chosen to serve as the root of the tree.

2.2 Constructing Minimum Spanning Trees


Any traversal of a connected, undirected graph visits all the vertices in that
graph. The set of edges which are traversed during a traversal forms a spanning
tree.
For example, Figure 1(b) shows the spanning tree obtained from a breadth-first
traversal of the graph in figure 1(a) starting at vertex b.
Similarly, Figure 1(c) shows the spanning tree obtained from a depth-first
traversal starting at vertex c.

Figure 3.5.1(a): Graph G Figure 3.5.1(b): spanning tree of G Figure 3.5.1(c): spanning tree of G

2.3 What is a Minimum-Cost Spanning Tree?


For an edge-weighted, connected, undirected graph, G, the total cost of G is the
sum of the weights on all its edges.
A minimum-cost spanning tree for G is a spanning tree of G that has the least total cost.
Example: The graph below

has 16 spanning trees. Some of them are:

Figure 3.5.2: Some spanning trees of the graph.

The graph has two minimum-cost spanning trees, each with a cost of 6:

Figure 3.5.3: Minimum-cost spanning trees of the graph.

2.4 Applications of Minimum Cost Spanning Trees


Minimum-cost spanning trees have many applications. Some are:
1. Building cable networks that join n locations with minimum cost.
2. Building a road network that joins n cities with minimum cost.
3. Obtaining an independent set of circuit equations for an electrical
network.
4. In pattern recognition minimal spanning trees can be used to find noisy
pixels.

2.5 Prim's Algorithm


Prim's algorithm finds a minimum-cost spanning tree by selecting edges from the graph one-by-one as follows:
- It starts with a tree, T, consisting of the starting vertex, x.
- Then, it adds the shortest edge emanating from x that connects T to the rest of the graph.
- It then moves to the added vertex and repeats the process.
Prim's algorithm is shown below:
Consider a graph G = (V, E);
Let T be a tree consisting of only the starting vertex x;
while (T has fewer than |V| vertices)
{
    find a smallest edge connecting T to G-T;
    add it to T;
}

Example: For the graph below, trace Prim's algorithm starting at vertex a:
Solution:

Figure 3.5.4: Prim's Algorithm Example

In-text Question 1
Any traversal of a connected, undirected graph visits all the vertices in that graph. (True or
False)?

Answer
True

2.5.1 Implementation of Prim's Algorithm


Prim's algorithm can be implemented similarly to Dijkstra's algorithm, as shown below:
public static Graph primsAlgorithm(Graph g, Vertex start) {
    int n = g.getNumberOfVertices();
    Entry table[] = new Entry[n];
    for (int v = 0; v < n; v++)
        table[v] = new Entry();
    table[g.getIndex(start)].distance = 0;
    PriorityQueue queue = new BinaryHeap(g.getNumberOfEdges());
    queue.enqueue(new Association(new Integer(0), start));
    while (!queue.isEmpty()) {
        Association association = (Association) queue.dequeueMin();
        Vertex v1 = (Vertex) association.getValue();
        int n1 = g.getIndex(v1);
        if (!table[n1].known) {
            table[n1].known = true;
            Iterator p = v1.getEmanatingEdges();
            while (p.hasNext()) {
                Edge edge = (Edge) p.next();
                Vertex v2 = edge.getMate(v1);
                int n2 = g.getIndex(v2);
                Integer weight = (Integer) edge.getWeight();
                int d = weight.intValue();
                if (!table[n2].known && table[n2].distance > d) {
                    table[n2].distance = d;
                    table[n2].predecessor = v1;
                    queue.enqueue(new Association(new Integer(d), v2));
                }
            }
        }
    }
    GraphAsLists result = new GraphAsLists(false);
    Iterator it = g.getVertices();
    while (it.hasNext()) {
        Vertex v = (Vertex) it.next();
        result.addVertex(v.getLabel());
    }
    it = g.getVertices();
    while (it.hasNext()) {
        Vertex v = (Vertex) it.next();
        if (v != start) {
            int index = g.getIndex(v);
            String from = v.getLabel();
            String to = table[index].predecessor.getLabel();
            result.addEdge(from, to, new Integer(table[index].distance));
        }
    }
    return result;
}
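Stripped of the course's Graph classes, the heart of Prim's algorithm is small. The sketch below is illustrative (integer vertices, {neighbour, weight} pairs) and returns only the total MST weight; stale queue entries are skipped, mirroring the known flag above:

```java
import java.util.*;

public class PrimSketch {
    // Returns the total weight of a minimum spanning tree of a connected,
    // undirected graph; edges[v] holds {neighbour, weight} pairs.
    public static int mstWeight(int[][][] edges, int start) {
        int n = edges.length;
        boolean[] inTree = new boolean[n];
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> a[1] - b[1]);
        pq.add(new int[]{start, 0});
        int total = 0;
        while (!pq.isEmpty()) {
            int[] top = pq.remove();
            int v = top[0];
            if (inTree[v]) continue;          // a cheaper edge got there first
            inTree[v] = true;
            total += top[1];                  // weight of the edge that added v
            for (int[] e : edges[v])
                if (!inTree[e[0]]) pq.add(new int[]{e[0], e[1]});
        }
        return total;
    }

    public static int demo() {
        // triangle a-b (1), b-c (2), a-c (3): the MST keeps weights 1 and 2
        int[][][] g = {
            {{1, 1}, {2, 3}},   // a
            {{0, 1}, {2, 2}},   // b
            {{0, 3}, {1, 2}},   // c
        };
        return mstWeight(g, 0);
    }
}
```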

2.6 Kruskal's Algorithm
Kruskal's algorithm also finds the minimum-cost spanning tree of a graph by adding edges one-by-one. The algorithm is shown below:
enqueue the edges of G in a queue in increasing order of cost;
T = ∅;
while(queue is not empty){
    dequeue an edge e;
    if(e does not create a cycle with the edges in T)
        add e to T;
}
return T;

Example: Trace Kruskal's algorithm in finding a minimum-cost spanning tree for the undirected, weighted graph given below:

Solution

Figure 3.5.5: Kruskal's Algorithm Example

2.6.1 Implementation of Kruskal's Algorithm


The implementation of Kruskal‟s algorithm is shown below:

public static Graph kruskalsAlgorithm(Graph g){
Graph result = new GraphAsLists(false);
Iterator it = g.getVertices();
while (it.hasNext()){
Vertex v = (Vertex)it.next();
result.addVertex(v.getLabel());
}
PriorityQueue queue = new BinaryHeap(g.getNumberOfEdges());
it = g.getEdges();
while(it.hasNext()){
Edge e = (Edge) it.next();
if (e.getWeight()==null)
throw new IllegalArgumentException("Graph is only,
adds an edge not if it does not
weighted"); create a cycle
queue.enqueue(e);
}
while (!queue.isEmpty()){
Edge e = (Edge) queue.dequeueMin();
String from = e.getFromVertex().getLabel();
String to = e.getToVertex().getLabel();
// add the edge only if it does not create a cycle
if (!result.isReachable(from, to))
result.addEdge(from, to, e.getWeight());
}
return result;
}

public abstract class AbstractGraph implements Graph {


public boolean isReachable(String from, String to){
Vertex fromVertex = getVertex(from);
Vertex toVertex = getVertex(to);
if (fromVertex == null || toVertex==null)
throw new IllegalArgumentException("Vertex not in the
graph");
PathVisitor visitor = new PathVisitor(toVertex);
this.preorderDepthFirstTraversal(visitor, fromVertex);
return visitor.isReached();
}
private class PathVisitor implements Visitor {
boolean reached = false;
Vertex target;
PathVisitor(Vertex t){target = t;}
public void visit(Object obj){
Vertex v = (Vertex) obj;
if (v.equals(target)) reached = true;
}
public boolean isDone(){return reached;}
boolean isReached(){return reached;}
}
}
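The isReachable test above runs a depth-first search for every candidate edge. A common, more efficient alternative (not shown in this text) is a disjoint-set (union-find) structure; the sketch below is illustrative, uses hypothetical names, and assumes vertices are numbered 0..n-1 (for example via g.getIndex):

```java
// Disjoint-set (union-find) with path compression: a standard alternative to
// the DFS-based isReachable test for Kruskal's cycle detection.
class DisjointSet {
    private final int[] parent;

    DisjointSet(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;  // each vertex starts alone
    }

    int find(int x) {
        // follow parent links to the set representative, compressing the path
        return parent[x] == x ? x : (parent[x] = find(parent[x]));
    }

    // merge the sets containing a and b; returns false if they were already
    // in the same set, i.e. the edge (a, b) would create a cycle
    boolean union(int a, int b) {
        int ra = find(a), rb = find(b);
        if (ra == rb) return false;
        parent[ra] = rb;
        return true;
    }
}
```

In Kruskal's main loop, a call such as union(g.getIndex(v1), g.getIndex(v2)) would replace the isReachable check: the dequeued edge is added to the result exactly when union returns true.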

2.7 Prim’s and Kruskal’s Algorithms


You should note that Prim's and Kruskal's algorithms do not necessarily generate
the same minimum-cost spanning tree.

For example for the graph:

Kruskal's algorithm (that imposes an ordering on edges with equal weights)


results in the following minimum cost spanning tree:

The same tree is generated by Prim's algorithm if the start vertex is any of: A, B,
or D; however if the start vertex is C the minimum cost spanning tree is:

In-text Question 2
Prim's and Kruskal's algorithms always generate the same minimum-cost spanning tree. (True
or False)?

Answer
False

3.0 Tutor Marked Assignments (Individual or Group)

1. Find the breadth-first spanning tree and depth-first spanning tree of the
graph GA shown above.

2. For the graph GB shown above, trace the execution of Prim's algorithm
as it finds the minimum-cost spanning tree of the graph starting from
vertex a.
3. Repeat question 2 above using Kruskal's algorithm.

4.0 Conclusion/Summary
In the last study session of this module, you learnt about minimum spanning
trees. You learnt how to construct minimum spanning trees and some examples.
Minimum-cost spanning trees and their applications were also discussed. In
the final module of this course, you will be introduced to hashing.

5.0 Self-Assessment Questions


1. Considering a graph G, what is the minimum-cost spanning tree for G?
2. The set of edges which are traversed during a traversal forms a spanning
tree. (True or False)?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit YouTube http://bit.ly/2L6NvjT , http://bit.ly/30Bvdxt , http://bit.ly/2U67xiE ,
http://bit.ly/2PfJ77y , http://bit.ly/2MqAy7o , http://bit.ly/30LDPSc . Watch the videos and
summarise in 1 paragraph.
b. View the animation on http://bit.ly/2PfJ77y and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on how to use Prim's and Kruskal's
algorithms to generate minimum-cost spanning trees. In 2 paragraphs, summarise
their opinion of the discussed topic.

7.0 Self Assessment Question Answers


1. The minimum-cost spanning tree for G is the spanning tree of
G that has the least total cost.

2. True

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

MODULE 4
Hashing
Contents:
Study Session 1: Hashing
Study Session 2: Lempel-Ziv Compression Techniques
Study Session 3: Garbage Collection
Study Session 4: Memory Management

STUDY SESSION 1
Hashing
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Introduction to Hashing and Hashing Techniques
2.1.1 - Review of Searching Techniques
2.1.2 - Introduction to Hashing
2.1.3 - Hash Tables
2.1.4 - Types of Hashing
2.1.5 - Hash Functions
2.1.6 - Applications of Hash Tables
2.1.7 - Problems for which Hash Tables are not suitable
2.2 - Hashing: Collision Resolution Schemes
2.2.1 - Collision Resolution Techniques
2.2.2 - Separate Chaining
2.2.3 - Separate Chaining with String Keys
2.2.4 - Separate Chaining versus Open-addressing
2.2.5 - The class hierarchy of Hash Tables

2.2.6 - Implementation of Separate Chaining
2.2.7 - Introduction to Collision Resolution using Open Addressing
2.2.8 - Linear Probing
2.3 - Collision Resolution: Open Addressing
2.3.1 - Quadratic Probing
2.3.2 - Double Hashing
2.3.3 - Rehashing
2.3.4 - Implementation of Open Addressing
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In the first study session of the final module in this course, you will learn about
hashing. We will start by reviewing some search techniques and then go on to
introduce hashing, hash tables, types of hashing, hash functions and applications
of hashing. You will also learn about collision resolution techniques in hashing.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Explain what you understand by hashing
2. Explain hash tables
3. Highlight the types of hashing
4. Explain hash functions
5. Highlight applications of hash tables and list problems for which hash
tables are not suitable.

6. Discuss collision resolution techniques in hashing

2.0 Main Content


2.1 Introduction to Hashing and Hashing Techniques
2.1.1 Review of Searching Techniques
Recall the efficiency of some of the searching techniques we covered earlier in
the course.
- The sequential search algorithm takes time proportional to the data size,
i.e., O(n).
- Binary search improves on linear search, reducing the search time to
O(log n). With a BST, an O(log n) search efficiency can be obtained, but the
worst-case complexity is O(n). To guarantee the O(log n) search time,
BST height balancing is required (i.e., AVL trees).
The efficiency of these search strategies depends on the number of items in the
container being searched. Search methods whose efficiency is independent of the
data size would be better.
Consider the following Java class that describes a student record:
class StudentRecord {
String name; // Student name
double height; // Student height
long id; // Unique id
}
The id field in this class can be used as a search key for records in the container.

2.1.2 Introduction to Hashing


Suppose that we want to store 10,000 students records (each with a 5-digit ID)
in a given container.
- A linked list implementation would take O(n) time.
- A height balanced tree would give O(log n) access time.
- Using an array of size 100,000 would give O(1) access time but will lead
to a lot of space wastage.

Is there some way that we could get O(1) access without wasting a lot of space?
The answer is hashing.
Hashing is based on the idea of distributing keys among a one-dimensional
array H[0..m − 1] called a hash table. The distribution is done by computing,
for each of the keys, the value of some predefined function h called the hash
function. This function assigns an integer between 0 and m − 1, called the hash
address, to a key.

Example 1: Illustrating Hashing


Use the function f(r) = r.id % 13 to load the following records into an array of
size 13.

Solution

2.1.3 Hash Tables
There are two types of Hash Tables namely Open-addressed Hash Tables and
Separate-Chained Hash Tables.
An Open-addressed Hash Table is a one-dimensional array indexed by integer
values that are computed by an index function called a hash function.
A Separate-Chained Hash Table is a one-dimensional array of linked lists
indexed by integer values that are computed by an index function called a hash
function.
Hash tables are sometimes referred to as scatter tables. Typical hash table
operations are Initialisation, Insertion, Searching and Deletion.

2.1.4 Types of Hashing


We have two types of hashing namely:
1. Static hashing: In static hashing, the hash function maps search-key
values to a fixed set of locations.
2. Dynamic hashing: In dynamic hashing a hash table can grow to handle
more items. The associated hash function must change as the table grows.
The load factor of a hash table is the ratio of the number of keys in the table to
the size of the hash table. However, you should note that the higher the load
factor, the slower the retrieval. With open addressing, the load factor cannot
exceed 1. With chaining, the load factor often exceeds 1.

2.1.5 Hash Functions


A hash function, h, is a function which transforms a key from a set, K, into an
index in a table of size n:
h: K -> {0, 1, ..., n-2, n-1}
A key can be a number, a string, a record, etc. The size of the set of keys, |K|, is
typically very large. It is possible for different keys to hash to the same array

location. This situation is called collision and the colliding keys are called
synonyms.
A good hash function should:
- Minimize collisions.
- Be easy and quick to compute.
- Distribute key values evenly in the hash table.
- Use all the information provided in the key.

We will now highlight some common hashing functions. They are as follows:
1. Division Remainder (using the table size as the divisor)
It computes the hash value from the key using the % operator. A table size that is
a power of 2 like 32 and 1024 should be avoided, for it leads to more
collisions. Also, powers of 10 are not good for table sizes when the keys
rely on decimal integers. Prime numbers not close to powers of 2 are
better table size values.
2. Truncation or Digit/Character Extraction
It works based on the distribution of digits or characters in the key. More
evenly distributed digit positions are extracted and used for hashing
purposes. For instance, students IDs or ISBN codes may contain common
subsequences which may increase the likelihood of collision. It is very
fast but digits/characters distribution in keys may not be very even.
3. Folding
It involves splitting keys into two or more parts and then combining the
parts to form the hash addresses. To map the key 25936715 to a range
between 0 and 9999, we can:
- split the number into two as 2593 and 6715 and
- add these two to obtain 9308 as the hash value.

It is very useful if we have keys that are very large. It is also fast and
simple, especially with bit patterns. A great advantage is the ability to
transform non-integer keys into integer values.
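The 25936715 example above can be folded in code; this sketch assumes an 8-digit key split into two 4-digit halves (the class and method names are illustrative):

```java
// Folding hash sketch: split the decimal key into two 4-digit halves and add
// them, keeping the result in the range 0..9999.
class FoldHash {
    static int fold(long key) {
        long low = key % 10000;    // e.g. 6715 for key 25936715
        long high = key / 10000;   // e.g. 2593 for key 25936715
        return (int) ((high + low) % 10000);
    }
}
```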
4. Radix Conversion
It transforms a key into another number base to obtain the hash value. It
typically uses number bases other than base 10 and base 2 to calculate the
hash addresses. To map the key 55354 in the range 0 to 9999 using base
11 we have:
55354 (base 10) = 38652 (base 11)
We may truncate the high-order digit 3 to yield 8652 as our hash address
within 0 to 9999.
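The base-11 trick above can be expressed in code; this sketch (with illustrative names) writes the key's digits in the given base, then reads the low-order digits back as a decimal numeral:

```java
// Radix-conversion hash sketch: re-express the key in another base, then keep
// only the low-order `digits` digits (read as a decimal numeral) as the hash.
class RadixHash {
    static int radixHash(long key, int base, int digits) {
        long out = 0, place = 1;
        for (int i = 0; i < digits; i++) {
            out += (key % base) * place;  // next base-`base` digit of the key
            key /= base;
            place *= 10;                  // reassemble the digits in decimal
        }
        return (int) out;
    }
}
```

For key 55354 in base 11 this keeps the four low digits 8652, matching the worked example.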
5. Mid-Square
This hash function squares the key, and the middle part of the result is taken
as the hash value. To map the key 3121 into a hash table of size 1000, we
square it: 3121² = 9740641, and extract 406 as the hash value. It works
well if the keys do not contain a lot of leading or trailing zeros. Non-integer
keys have to be preprocessed to obtain corresponding integer values.
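The 3121 example can be reproduced directly; in this sketch the digit window is chosen to match the text's extraction of the middle three digits of a seven-digit square (names are illustrative):

```java
// Mid-square hash sketch: square the key and take the middle digits of the
// result as the hash value.
class MidSquareHash {
    static int midSquare(int key, int tableSize) {
        long sq = (long) key * key;              // 3121 * 3121 = 9740641
        int middle = (int) ((sq / 100) % 1000);  // middle three of seven digits
        return middle % tableSize;
    }
}
```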
6. Use of a Random-Number Generator
Given a seed as parameter, this hash function generates a random
number.
The algorithm must ensure that:
• It always generates the same random value for a given key.
• It is unlikely for two keys to yield the same random value.
The random number produced can be transformed to produce a valid hash
value.

2.1.6 Applications of Hash Tables


Some applications of hash tables include:

- Database systems: Specifically, those that require efficient random
access. Generally, database systems try to optimize between two types of
access methods: sequential and random. Hash tables are an important part
of efficient random access because they provide a way to locate data in a
constant amount of time.
- Symbol tables: The tables used by compilers to maintain information
about symbols from a program. Compilers access information about
symbols frequently. Therefore, it is important that symbol tables be
implemented very efficiently.
- Data dictionaries: Data structures that support adding, deleting, and
searching for data. Although the operations of a hash table and a data
dictionary are similar, other data structures may be used to implement
data dictionaries. Using a hash table is particularly efficient.
- Network processing algorithms: Hash tables are fundamental
components of several network processing algorithms and applications,
including route lookup, packet classification, and network monitoring.
- Browser Caches: Hash tables are used to implement browser caches.
In-text Question 1
When is a collision said to occur during hashing?

Answer
Collision is a situation in which different keys hash to the same array location.

2.1.7 Problems for which Hash Tables are not suitable


There are situations that hash tables are not suitable for. Some of them are:
1. Problems for which data ordering is required.
Because a hash table is an unordered data structure, certain operations are
difficult and expensive. Range queries, proximity queries, selection, and
sorted traversals are possible only if the keys are copied into a sorted data
structure. There are hash table implementations that keep the keys in
order, but they are far from efficient.
2. Problems having multidimensional data.
3. Prefix searching especially if the keys are long and of variable-lengths.
4. Problems that have dynamic data:
Open-addressed hash tables are based on 1D-arrays, which are difficult to
resize once they have been allocated, unless you implement the
table as a dynamic array and rehash all of the keys whenever the size
changes, which is an incredibly expensive operation. An alternative is to use
separate-chained hash tables or dynamic hashing.
5. Problems in which the data does not have unique keys.
Open-addressed hash tables cannot be used if the data does not have
unique keys. An alternative is to use separate-chained hash tables.

2.2 Hashing: Collision Resolution Schemes


2.2.1 Collision Resolution Techniques
There are two broad ways of collision resolution:
1. Separate Chaining: An array of linked list implementation.
2. Open Addressing: Array-based implementation.
i. Linear probing (linear search)
ii. Quadratic probing (nonlinear search)
iii. Double hashing (uses two hash functions)

2.2.2 Separate Chaining


In separate chaining, the hash table is implemented as an array of linked lists.
Inserting an item, r that hashes at index i is simply insertion into the linked list
at position i. Synonyms are chained in the same linked list.

Figure 4.1.1: Separate Chaining
Retrieval of an item, r, with hash address, i, is simply retrieval from the linked
list at position i.
Deletion of an item, r, with hash address, i, is simply deleting r from the linked
list at position i.

Example: Load the keys 23, 13, 21, 14, 7, 8, and 15 , in this order, in a hash
table of size 7 using separate chaining with the hash function: h(key) = key % 7
Solution
h(23) = 23 % 7 = 2
h(13) = 13 % 7 = 6
h(21) = 21 % 7 = 0
h(14) = 14 % 7 = 0 collision
h(7) = 7 % 7 = 0 collision
h(8) = 8 % 7 = 1
h(15) = 15 % 7 = 1 collision

2.2.3 Separate Chaining with String Keys
Recall that search keys can be numbers, strings or some other object. A hash
function for a string s = c0c1c2…cn-1 can be defined as:
hash = (c0 + c1 + c2 + … + cn-1) % tableSize

This can be implemented as:


public static int hash(String key, int tableSize){
int hashValue = 0;
for (int i = 0; i < key.length(); i++){
hashValue += key.charAt(i);
}
return hashValue % tableSize;
}
Example: The following class describes commodity items:
class CommodityItem {
String name; // commodity name
int quantity; // commodity quantity needed
double price; // commodity price
}
Use the hash function hash to load the following commodity items into a hash
table of size 13 using separate chaining:
onion 1 10.0
tomato 1 8.50
cabbage 3 3.50
carrot 1 5.50
okra 1 6.50
mellon 2 10.0
potato 2 7.50
Banana 3 4.00
olive 2 15.0
salt 2 2.50
cucumber 3 4.50
mushroom 3 5.50
orange 2 3.00

Solution:

hash(onion) = (111 + 110 + 105 + 111 + 110) % 13 = 547 % 13 = 1

hash(salt) = (115 + 97 + 108 + 116) % 13 = 436 % 13 = 7
hash(orange) = (111 + 114 + 97 + 110 + 103 + 101)%13 = 636 %13 = 12

Alternative hash functions for a string s = c0c1c2…cn-1 exist. Some are:

- hash = (c0 + 27 * c1 + 729 * c2) % tableSize
- hash = (c0 + cn-1 + s.length()) % tableSize
- hash = [∑ ci * 37^i] % tableSize, summing over i = 0 to n-1 (a common polynomial form)
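A polynomial hash of this kind can be evaluated without computing powers explicitly by using Horner's rule; the sketch below assumes base 37 and illustrative names:

```java
// Polynomial string hash via Horner's rule:
// hash = (c0 + 37*c1 + 37^2*c2 + ...) % tableSize, computed without powers.
class PolyHash {
    static int hash(String key, int tableSize) {
        int h = 0;
        for (int i = key.length() - 1; i >= 0; i--) {
            h = (37 * h + key.charAt(i)) % tableSize;  // fold in one character
        }
        return h;
    }
}
```

Taking the remainder at every step keeps the intermediate value small, so the loop never overflows for moderate table sizes.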

2.2.4 Separate Chaining versus Open-Addressing
Separate Chaining has several advantages over open addressing. Some of them
are:
- Collision resolution is simple and efficient.
- The hash table can hold more elements without the large performance
deterioration of open addressing (The load factor can be 1 or greater).
- The performance of chaining declines much more slowly than open
addressing.
- Deletion is easy - no special flag values are necessary.
- Table size need not be a prime number.
- The keys of the objects to be hashed need not be unique.

Disadvantages of Separate Chaining:


- It requires the implementation of a separate data structure for chains, and
code to manage it.
- The main cost of chaining is the extra space required for the linked lists.
- For some languages, creating new nodes (for linked lists) is expensive
and slows down the system.

2.2.5 The Class Hierarchy of Hash Tables
Figure 4.1.2 below shows the hierarchy tree for implementing hash tables.

Figure 4.1.2: The class hierarchy of Hash Tables.

2.2.6 Implementation of Separate Chaining


The code fragment for implementing separate chaining is shown below.
public class ChainedHashTable extends AbstractHashTable {
protected MyLinkedList [ ] array;
public ChainedHashTable(int size) {
array = new MyLinkedList[size];
for(int j = 0; j < size; j++)
array[j] = new MyLinkedList( );
}
public void insert(Object key) {
array[h(key)].append(key); count++;
}
public void withdraw(Object key) {
array[h(key)].extract(key); count--;
}
public Object find(Object key){
int index = h(key);
MyLinkedList.Element e = array[index].getHead( );
while(e != null){
if(key.equals(e.getData())) return e.getData();
e = e.getNext();
}
return null;
}
}
2.2.7 Introduction to Collision Resolution using Open Addressing
In open addressing, all items are stored in the hash table itself. In addition to
the cell data (if any), each cell keeps one of the three states: EMPTY,
OCCUPIED, DELETED. While inserting, if a collision occurs, alternative cells
are tried until an empty cell is found.
Deletion: (lazy deletion): When a key is deleted the slot is marked as
DELETED rather than EMPTY otherwise subsequent searches that hash at the
deleted cell will fail.
Probe sequence: A probe sequence is the sequence of array indexes that is
followed in searching for an empty cell during an insertion, or in searching for a
key during find or delete operations.
The most common probe sequences are of the form:
hi(key) = [h(key) + c(i)] % n, for i = 0, 1, …, n-1.
where h is a hash function and n is the size of the hash table
The function c(i) is required to have the following two properties:
Property 1: c(0) = 0
Property 2: The set of values {c(0) % n, c(1) % n, c(2) % n, . . . , c(n-1) %
n} must be a permutation of {0, 1, 2,. . ., n – 1}, that is, it must contain every
integer between 0 and n - 1 inclusive.
The function c(i) is used to resolve collisions. To insert item r, we examine
array location h0(r) = h(r). If there is a collision, array locations h1(r), h2(r), ...,
hn-1(r) are examined until an empty slot is found. Similarly, to find item r, we
examine the same sequence of locations in the same order.
Note: For a given hash function h(key), the only difference in the open
addressing collision resolution techniques (linear probing, quadratic probing
and double hashing) is in the definition of the function c(i).
Common definitions of c(i) are:
- Linear probing: c(i) = i
- Quadratic probing: c(i) = i²
- Double hashing: c(i) = i * hp(key)
where hp(key) is another hash function.
Some advantages of Open Addressing are:
- All items are stored in the hash table itself. There is no need for another
data structure.
- Open addressing is more efficient storage-wise.
Disadvantages of Open Addressing are:
- The keys of the objects to be hashed must be distinct.
- Dependent on choosing a proper table size.
- Requires the use of a three-state (Occupied, Empty, or Deleted) flag in
each cell.

Open Addressing Facts


- In general, primes give the best table sizes.
- With any open addressing method of collision resolution, as the table
fills, there can be a severe degradation in the table performance.
- Load factors between 0.6 and 0.7 are common.
- Load factors > 0.7 are undesirable.
- The search time depends only on the load factor, not on the table size.
- We can use the desired load factor to determine an appropriate table size:
tableSize ≈ numberOfItems / desiredLoadFactor (typically rounded up to a prime).
2.2.8 Linear Probing
In linear probing, c(i) is a linear function in i of the form c(i) = a*i. Usually c(i)
is chosen as:
c(i) = i for i = 0, 1, . . . , tableSize – 1
The probe sequences are then given by:
hi(key) = [h(key) + i] % tableSize for i = 0, 1, . . . , tableSize – 1
For c(i) = a*i to satisfy Property 2, a and n must be relatively prime.
Example: Perform the operations given below, in the given order, on an
initially empty hash table of size 13 using linear probing with c(i) = i and the
hash function: h(key) = key % 13:
insert(18), insert(26), insert(35), insert(9), find(15), find(48), delete(35),
delete(40), find(9), insert(64), insert(47), find(35)
Solution
The required probe sequences are given by:
hi(key) = (h(key) + i) % 13, for i = 0, 1, 2, . . ., 12
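The trace can also be checked mechanically with a minimal sketch of such a table, using lazy deletion as described in section 2.2.7 (the class and its methods are illustrative, not the text's implementation, and keys are assumed to be distinct non-negative integers):

```java
// Minimal open-addressed table with linear probing, c(i) = i, and lazy deletion.
class LinearProbeTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;   // one of EMPTY, OCCUPIED, DELETED per slot

    LinearProbeTable(int size) {
        keys = new int[size];
        state = new int[size];   // all slots start EMPTY (0)
    }

    private int h(int key) { return key % keys.length; }

    void insert(int key) {
        for (int i = 0; i < keys.length; i++) {
            int idx = (h(key) + i) % keys.length;  // probe sequence h(key)+i
            if (state[idx] != OCCUPIED) {          // EMPTY or DELETED: reuse
                keys[idx] = key;
                state[idx] = OCCUPIED;
                return;
            }
        }
        throw new IllegalStateException("Hash table is full");
    }

    int find(int key) {          // returns the slot index, or -1 if absent
        for (int i = 0; i < keys.length; i++) {
            int idx = (h(key) + i) % keys.length;
            if (state[idx] == EMPTY) return -1;    // probe chain ends here
            if (state[idx] == OCCUPIED && keys[idx] == key) return idx;
            // DELETED slots do not stop the search (lazy deletion)
        }
        return -1;
    }

    void delete(int key) {       // lazy deletion: mark, do not clear
        int idx = find(key);
        if (idx >= 0) state[idx] = DELETED;
    }
}
```

For the operations above, 9 collides with 35 at slot 9 and lands in slot 10; after delete(35) it is still found because the search probes past the DELETED slot.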

Disadvantage of Linear Probing: Primary Clustering
Linear probing is subject to a primary clustering phenomenon. Elements tend to
cluster around table locations that they originally hash to. Primary clusters can
combine to form larger clusters. This leads to long probe sequences and hence
deterioration in hash table efficiency.

Figure 4.1.3: Primary Clustering


Example of a primary cluster: Insert keys: 18, 41, 22, 44, 59, 32, 31, 73, in this
order, in an originally empty hash table of size 13, using the hash function
h(key) = key % 13 and c(i) = i:
Solution
h(18) = 5
h(41) = 2
h(22) = 9
h(44) = 5+1 = 6
h(59) = 7
h(32) = 6+1+1 = 8
h(31) = 5+1+1+1+1+1 = 10
h(73) = 8+1+1+1 = 11

Figure 4.1.4: Primary Clustering example

2.3 Collision Resolution: Open Addressing
2.3.1 Quadratic Probing
Quadratic probing eliminates primary clusters suffered by linear probing. c(i) is
a quadratic function in i of the form c(i) = a*i² + b*i. Usually c(i) is chosen as:
c(i) = i² for i = 0, 1, . . . , tableSize – 1
or
c(i) = ±i² for i = 0, 1, . . . , (tableSize – 1) / 2
The probe sequences are then given by:
hi(key) = [h(key) + i²] % tableSize for i = 0, 1, . . . , tableSize – 1
or
hi(key) = [h(key) ± i²] % tableSize for i = 0, 1, . . . , (tableSize – 1) / 2
Note for Quadratic Probing however:
- Hashtable size should not be an even number; otherwise Property 2 will
not be satisfied.
- Ideally, table size should be a prime of the form 4j+3, where j is an
integer. This choice of table size guarantees Property 2.
Example: Load the keys 23, 13, 21, 14, 7, 8, and 15, in this order, in a hash
table of size 7 using quadratic probing with c(i) = ±i² and the hash function:
h(key) = key % 7
Solution
The required probe sequences are given by:
hi(key) = (h(key) ± i²) % 7, for i = 0, 1, 2, 3
h0(23) = (23 % 7) % 7 = 2
h0(13) = (13 % 7) % 7 = 6
h0(21) = (21 % 7) % 7 = 0
h0(14) = (14 % 7) % 7 = 0 collision
h1(14) = (0 + 1²) % 7 = 1
h0(7) = (7 % 7) % 7 = 0 collision
h1(7) = (0 + 1²) % 7 = 1 collision
h-1(7) = (0 - 1²) % 7 = -1
NORMALIZE: (-1 + 7) % 7 = 6 collision
h2(7) = (0 + 2²) % 7 = 4
h0(8) = (8 % 7) % 7 = 1 collision
h1(8) = (1 + 1²) % 7 = 2 collision
h-1(8) = (1 - 1²) % 7 = 0 collision
h2(8) = (1 + 2²) % 7 = 5
h0(15) = (15 % 7) % 7 = 1 collision
h1(15) = (1 + 1²) % 7 = 2 collision
h-1(15) = (1 - 1²) % 7 = 0 collision
h2(15) = (1 + 2²) % 7 = 5 collision
h-2(15) = (1 - 2²) % 7 = -3
NORMALIZE: (-3 + 7) % 7 = 4 collision
h3(15) = (1 + 3²) % 7 = 3

Secondary Clusters
Quadratic probing is better than linear probing because it eliminates primary
clustering. However, it may result in secondary clustering: if h(k1) = h(k2) the
probing sequences for k1 and k2 are exactly the same. This sequence of
locations is called a secondary cluster. Secondary clustering is less harmful than
primary clustering because secondary clusters do not combine to form large
clusters.
Example of Secondary Clustering: Suppose keys k0, k1, k2, k3, and k4 are
inserted in the given order in an originally empty hash table using quadratic
probing with c(i) = i². Assume that each of the keys hashes to the same array
index x. A secondary cluster will develop and grow in size:

Figure 4.1.5: Secondary Cluster Example

2.3.2 Double Hashing


To eliminate secondary clustering, synonyms must have different probe
sequences. Double hashing achieves this by having two hash functions that both
depend on the hash key.
c(i) = i * hp(key) for i = 0, 1, . . . , tableSize – 1
where hp (or h2) is another hash function.
The probing sequence is:
hi(key) = [h(key) + i*hp(key)]% tableSize for i = 0, 1, . . . , tableSize – 1
The function c(i) = i*hp(r) satisfies Property 2 provided hp(r) and tableSize are
relatively prime.
To guarantee Property 2, tableSize must be a prime number.
Common definitions for hp are:
- hp(key) = 1 + key % (tableSize - 1)
- hp(key) = q - (key % q), where q is a prime less than tableSize
- hp(key) = q * (key % q), where q is a prime less than tableSize
Performance of double hashing is much better than that of linear or quadratic
probing because it eliminates both primary and secondary clustering, but it
requires the computation of a second hash function, hp.
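For a fixed table, the probe sequence is easy to compute; this sketch hard-codes a table size of 13 and the pair of hash functions h(key) = key % 13 and hp(key) = 1 + key % 12 (class and method names are illustrative):

```java
// Double-hashing probe sequence: hi(key) = [h(key) + i * hp(key)] % tableSize.
class DoubleHashProbe {
    static final int TABLE_SIZE = 13;

    static int h(int key)  { return key % TABLE_SIZE; }
    static int hp(int key) { return 1 + key % 12; }   // second hash function

    static int probe(int key, int i) {
        return (h(key) + i * hp(key)) % TABLE_SIZE;
    }
}
```

Because hp(96) = 1, the probes for key 96 step through 5, 6, 7, ..., while hp(9) = 10 makes key 9 jump from slot 9 straight to slot 6.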

Example: Load the keys 18, 26, 35, 9, 64, 47, 96, 36, and 70 in this order, in an
empty hash table of size 13
a. using double hashing with the first hash function: h(key) = key % 13 and
the second hash function: hp(key) = 1 + key % 12
b. using double hashing with the first hash function: h(key) = key % 13 and
the second hash function: hp(key) = 7 - key % 7
Solution

a. Using hi(key) = [h(key) + i*hp(key)] % 13 with h(key) = key % 13 and hp(key) = 1 + key % 12:
h0(18) = (18%13)%13 = 5
h0(26) = (26%13)%13 = 0
h0(35) = (35%13)%13 = 9
h0(9) = (9%13)%13 = 9 collision
hp(9) = 1 + 9%12 = 10
h1(9) = (9 + 1*10)%13 = 6
h0(64) = (64%13)%13 = 12
h0(47) = (47%13)%13 = 8
h0(96) = (96%13)%13 = 5 collision
hp(96) = 1 + 96%12 = 1
h1(96) = (5 + 1*1)%13 = 6 collision
h2(96) = (5 + 2*1)%13 = 7
h0(36) = (36%13)%13 = 10
h0(70) = (70%13)%13 = 5 collision
hp(70) = 1 + 70%12 = 11
h1(70) = (5 + 1*11)%13 = 3

b. Using hi(key) = [h(key) + i*hp(key)] % 13 with h(key) = key % 13 and hp(key) = 7 - key % 7:
h0(18) = (18%13)%13 = 5
h0(26) = (26%13)%13 = 0
h0(35) = (35%13)%13 = 9
h0(9) = (9%13)%13 = 9 collision
hp(9) = 7 - 9%7 = 5
h1(9) = (9 + 1*5)%13 = 1
h0(64) = (64%13)%13 = 12
h0(47) = (47%13)%13 = 8
h0(96) = (96%13)%13 = 5 collision
hp(96) = 7 - 96%7 = 2
h1(96) = (5 + 1*2)%13 = 7
h0(36) = (36%13)%13 = 10
h0(70) = (70%13)%13 = 5 collision
hp(70) = 7 - 70%7 = 7
h1(70) = (5 + 1*7)%13 = 12 collision
h2(70) = (5 + 2*7)%13 = 6

2.3.3 Rehashing
As noted before, with open addressing, if the hash tables become too full,
performance can suffer a lot. So, what can we do? We can double the hash table
size, modify the hash function, and re-insert the data.
More specifically, the new size of the table will be the first prime that is more
than twice as large as the old table size.
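Choosing the new size can be sketched as a small helper that finds the first prime greater than twice the old size (illustrative names; a simple trial-division primality test is assumed to be adequate for table sizes):

```java
// Rehash sizing sketch: the new table size is the first prime that is more
// than twice as large as the old size.
class RehashSize {
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    static int nextSize(int oldSize) {
        int n = 2 * oldSize + 1;   // first candidate beyond 2x
        while (!isPrime(n)) n++;
        return n;
    }
}
```

A table of size 13, for example, would grow to 29; after resizing, every key must be re-inserted using the new hash function.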

In-text Question 2
What is the main disadvantage of Linear probing?

Answer
Primary Clustering.

2.3.4 Implementation of Open Addressing


We now provide the code for the implementation of the Open Addressing.
public class OpenScatterTable extends AbstractHashTable {
protected Entry array[];
protected static final int EMPTY = 0;
protected static final int OCCUPIED = 1;
protected static final int DELETED = 2;
protected static final class Entry {
public int state = EMPTY;
public Comparable object;
// …
}
public OpenScatterTable(int size) {
array = new Entry[size];
for(int i = 0; i < size; i++)
array[i] = new Entry();
}
// …
}

/* finds the index of the first unoccupied slot
in the probe sequence of obj */
protected int findIndexUnoccupied(Comparable obj){
int hashValue = h(obj);
int tableSize = getLength();
int indexDeleted = -1;
for(int i = 0; i < tableSize; i++){
int index = (hashValue + c(i)) % tableSize;
if(array[index].state == OCCUPIED
&& obj.equals(array[index].object))
throw new IllegalArgumentException(
"Error: Duplicate key");
else if(array[index].state == EMPTY ||
(array[index].state == DELETED &&
obj.equals(array[index].object)))
return indexDeleted ==-1?index:indexDeleted;
else if(array[index].state == DELETED &&
indexDeleted == -1)
indexDeleted = index;
}
if(indexDeleted != -1) return indexDeleted;
throw new IllegalArgumentException(
"Error: Hash table is full");
}
protected int findObjectIndex(Comparable obj){
int hashValue = h(obj);
int tableSize = getLength();
for(int i = 0; i < tableSize; i++){
int index = (hashValue + c(i)) % tableSize;
if(array[index].state == EMPTY
|| (array[index].state == DELETED
&& obj.equals(array[index].object)))
return -1;
else if(array[index].state == OCCUPIED
&& obj.equals(array[index].object))
return index;
}
return -1;
}
public Comparable find(Comparable obj){
int index = findObjectIndex(obj);
if(index >= 0)return array[index].object;
else return null;
}

public void insert(Comparable obj){
if(count == getLength()) throw new ContainerFullException();
else {
int index = findIndexUnoccupied(obj);
// throws exception if an UNOCCUPIED slot is not found
array[index].state = OCCUPIED;
array[index].object = obj;
count++;
}
}

public void withdraw(Comparable obj){


if(count == 0) throw new ContainerEmptyException();
int index = findObjectIndex(obj);
if(index < 0)
throw new IllegalArgumentException("Object not found");
else {
array[index].state = DELETED;
// lazy deletion: DO NOT SET THE LOCATION TO null
count--;
}
}

3.0 Tutor Marked Assignments (Individual or Group)


1. What in your opinion is the single most important motivation for the
development of hashing schemes while there already are other techniques
that can be used to realize the same functionality provided by hashing
methods?
2. How many storage cells will be wasted in an array implementation with
O(1) access for records of 10,000 students each with a 7-digit ID
number?
3. Must a hash table be implemented using an array? Will an
alternative data structure achieve the same efficiency? If yes, why?
If no, what condition must the data structure satisfy to ensure the same
efficiency as provided by arrays?
4. Which of the techniques for creating hash functions is most general?
Why?

5. Why do prime numbers generally make a good selection for hash table
sizes?
6. Given that,
c(i) = a*i,
for c(i) in linear probing, we discussed that this equation satisfies
Property 2 only when a and n are relatively prime. Explain what the
requirement of being relatively prime means in simple plain language.
7. Consider the general probe sequence,
hi (r) = (h(r) + c(i))% n.
Are we sure that if c(i) satisfies Property 2, then hi(r) will cover all n
hash table locations, 0,1,...,n-1? Explain.
8. Suppose you are given k records to be loaded into a hash table of size n,
with k < n using linear probing. Does the order in which these records are
loaded matter for retrieval and insertion? Explain.
9. A prime number is always the best choice of a hash table size. Is this
statement true or false? Justify your answer either way.
10. If a hash table is 25% full, what is its load factor?
11.Given that,
c(i) = i2,
for c(i) in quadratic probing, we discussed that this equation does not
satisfy Property 2, in general. What cells are missed by this probing
formula for a hash table of size 17? Characterise using a formula, if
possible, the cells that are not examined by using this function for a hash
table of size n.
12. It was mentioned in this session that secondary clusters are less harmful
than primary clusters because the former cannot combine to form larger
secondary clusters. Use an appropriate hash table of records to exemplify
this situation.

4.0 Conclusion/Summary
In this study session, we looked at hashing. We started by reviewing some
search techniques and then went on to introduce hashing, hash tables, types of
hashing, hash functions and applications of hashing. You also learnt about
collision resolution techniques in hashing. In the next study session, you will
learn about Lempel-Ziv Compression Techniques.

5.0 Self-Assessment Questions


1. Explain the concept of hashing.
2. List two collision resolution techniques discussed in this study session.

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/2Zpk9lT , http://bit.ly/2L8qZXN , http://bit.ly/32efDZp ,
http://bit.ly/2MFqKqm , http://bit.ly/33Ywd0T . Watch the video & summarise in 1
paragraph
b. View the animation on http://bit.ly/32efDZp and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on the different Open Addressing
collision resolution schemes; In 2 paragraphs summarise their opinion of the
discussed topic. etc.

7.0 Self Assessment Question Answers


1. Hashing is based on the idea of distributing keys among a one-
dimensional array H[0..m − 1] called a hash table. The distribution is
done by computing, for each of the keys, the value of some predefined
function h called the hash function. This function assigns an integer
between 0 and m − 1, called the hash address, to a key.
2.
- Separate Chaining

- Open Addressing

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 2
Lempel-Ziv Compression Techniques
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Lempel-Ziv Compression Techniques
2.1.1 - Classification of Lossless Compression techniques
2.1.2 - Introduction to Lempel-Ziv Encoding: LZ77 & LZ78
2.1.2.1 - LZ78 Compression Algorithm
2.2 - Lempel-Ziv-Welch (LZW) Compression Algorithm
2.2.1 - Introduction to the LZW Algorithm
2.2.2 - LZW Encoding Algorithm
2.2.3 - LZW Decoding Algorithm
2.2.4 - LZW Limitations
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
Earlier in the course, we studied lossless compression techniques which are
classified into static, adaptive (or dynamic), and hybrid. We also saw some
examples of these techniques. In this study session, you will be introduced to
the Lempel-Ziv encoding. You will learn that this encoding technique consists
of two different algorithms namely LZ77 and LZ78. We will study these

algorithms with some examples. You will also learn about Lempel-Ziv-Welch
(LZW) compression algorithm.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Describe the Lempel-Ziv encoding
2. Explain the LZ78 compression algorithm
3. Encode and decode strings using the LZ78 compression algorithm
4. Explain the LZW encoding algorithm
5. Encode and decode strings using the LZW encoding algorithm
6. Outline the limitations of the LZW encoding algorithm

2.0 Main Content


2.1 Lempel-Ziv Compression Techniques
2.1.1 Classification of Lossless Compression techniques
Recall that during our discussion on hashing, we saw that:
Lossless compression techniques are classified into static, adaptive (or
dynamic), and hybrid. Static coding requires two passes: one pass to compute
probabilities (or frequencies) and determine the mapping, and a second pass to
encode.
Examples of Static techniques: Static Huffman Coding
All of the adaptive methods are one-pass methods; only one scan of the message is required.
Examples of adaptive techniques: LZ77, LZ78, LZW, and Adaptive Huffman
Coding

2.1.2 Introduction to Lempel-Ziv Encoding: LZ77 & LZ78


Data compression up until the late 1970s was mainly directed towards
creating better methodologies for Huffman coding. An innovative, radically
different method was introduced in 1977 by Abraham Lempel and Jacob Ziv.
This technique (called Lempel-Ziv) actually consists of two considerably
different algorithms, LZ77 and LZ78. Due to patents, LZ77 and LZ78 led to
many variants:
Table 4.2.1: Variants of LZ77 and LZ78

The zip and unzip programs use the LZH technique, while UNIX's compress
method belongs to the LZW and LZC classes.

2.1.2.1 LZ78 Compression Algorithm


LZ78 inserts one- or multi-character, non-overlapping, distinct patterns of the
message to be encoded in a Dictionary. The multi-character patterns are of the
form: C₀C₁ … Cₙ₋₁Cₙ. The prefix of a pattern consists of all the pattern
characters except the last: C₀C₁ … Cₙ₋₁. Table 4.2.2 below shows the output of
LZ78.

Table 4.2.2: LZ78 Output

Note that the dictionary is usually implemented as a hash table. The LZ78
compression algorithm is presented below:
Dictionary ← empty; Prefix ← empty; DictionaryIndex ← 1;
while(characterStream is not empty)
{
    Char ← next character in characterStream;
    if(Prefix + Char exists in the Dictionary)
        Prefix ← Prefix + Char;
    else
    {
        if(Prefix is empty)
            CodeWordForPrefix ← 0;
        else
            CodeWordForPrefix ← DictionaryIndex for Prefix;
        Output: (CodeWordForPrefix, Char);
        insertInDictionary( (DictionaryIndex, Prefix + Char) );
        DictionaryIndex++;
        Prefix ← empty;
    }
}
if(Prefix is not empty)
{
    CodeWordForPrefix ← DictionaryIndex for Prefix;
    Output: (CodeWordForPrefix, );
}
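As a concrete illustration, the pseudocode above can be sketched in Java, with the Dictionary as a HashMap from patterns to their indices. The class name `LZ78`, the `encode` method, and the textual `(codeword,char)` pair format are our own choices for this sketch, not part of the course material:

```java
import java.util.HashMap;
import java.util.Map;

public class LZ78 {
    // Encodes the input into (codeword, character) pairs, e.g. "(0,A)(0,B)(2,C)..."
    public static String encode(String input) {
        Map<String, Integer> dictionary = new HashMap<>();
        StringBuilder output = new StringBuilder();
        String prefix = "";
        int dictionaryIndex = 1;
        for (char ch : input.toCharArray()) {
            if (dictionary.containsKey(prefix + ch)) {
                prefix += ch;                           // grow the current pattern
            } else {
                int code = prefix.isEmpty() ? 0 : dictionary.get(prefix);
                output.append('(').append(code).append(',').append(ch).append(')');
                dictionary.put(prefix + ch, dictionaryIndex++);
                prefix = "";
            }
        }
        if (!prefix.isEmpty())                          // leftover pattern at end of input
            output.append('(').append(dictionary.get(prefix)).append(",)");
        return output.toString();
    }
}
```

Running `LZ78.encode("ABBCBCABABCAABCAAB")` reproduces the pairs of Example 1.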

Example 1: Encode (i.e., compress) the string ABBCBCABABCAABCAAB


using the LZ78 algorithm.

Solution

Figure 4.2.1: LZ78 Compression example 1


The compressed message is: (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)

Note: The above is just a representation, the commas and parentheses are not
transmitted; we will discuss the actual form of the compressed message later on.
In-text Question 1
Static compression methods require a single pass. (True or False)?

Answer
False

1. A is not in the Dictionary; insert it


2. B is not in the Dictionary; insert it
3. B is in the Dictionary.
BC is not in the Dictionary; insert it.
4. B is in the Dictionary.
BC is in the Dictionary.
BCA is not in the Dictionary; insert it.
5. B is in the Dictionary.
BA is not in the Dictionary; insert it.
6. B is in the Dictionary.
BC is in the Dictionary.
BCA is in the Dictionary.
BCAA is not in the Dictionary; insert it.
7. B is in the Dictionary.
BC is in the Dictionary.
BCA is in the Dictionary.
BCAA is in the Dictionary.
BCAAB is not in the Dictionary; insert it.

Example 2: Encode (i.e., compress) the string BABAABRRRA using the LZ78
algorithm.

Solution

Figure 4.2.2: LZ78 Compression example 2
The compressed message is: (0,B)(0,A)(1,A)(2,B)(0,R)(5,R)(2, )
1. B is not in the Dictionary; insert it
2. A is not in the Dictionary; insert it
3. B is in the Dictionary.
BA is not in the Dictionary; insert it.
4. A is in the Dictionary.
AB is not in the Dictionary; insert it.
5. R is not in the Dictionary; insert it.
6. R is in the Dictionary.
RR is not in the Dictionary; insert it.
7. A is in the Dictionary and it is the last input character; output a pair
containing its index: (2, )

Example 3: Encode (i.e., compress) the string AAAAAAAAA using the LZ78
algorithm.
Solution

Figure 4.2.3: LZ78 Compression example 3
1. A is not in the Dictionary; insert it
2. A is in the Dictionary
AA is not in the Dictionary; insert it
3. A is in the Dictionary.
AA is in the Dictionary.
AAA is not in the Dictionary; insert it.
4. A is in the Dictionary.
AA is in the Dictionary.
AAA is in the Dictionary and it is the last pattern; output a pair containing its
index:
(3, )

LZ78 Compression: Number of bits transmitted


Example: Uncompressed String: ABBCBCABABCAABCAAB
Number of bits = Total number of characters * 8
= 18 * 8
= 144 bits
Suppose the codewords are indexed starting from 1:
Compressed string (codewords): (0, A) (0, B) (2, C) (3, A) (2, A) (4, A) (6, B)
Codeword index:                   1      2      3      4      5      6      7
Each codeword consists of an integer and a character. The character is
represented by 8 bits, and the number of bits n required to represent the
integer part of the codeword with index i is given by:

n = ⌈log₂ i⌉, with a minimum of 1 bit (for index 1)

Alternatively, the number of bits required to represent the integer part of the
codeword with index i is the number of significant bits required to represent the
integer i – 1 (taking one bit for the integer 0).

Codeword: (0, A)   (0, B)   (2, C)   (3, A)   (2, A)   (4, A)   (6, B)
Index:       1        2        3        4        5        6        7
Bits:     (1 + 8) + (1 + 8) + (2 + 8) + (2 + 8) + (3 + 8) + (3 + 8) + (3 + 8) = 71 bits
The actual compressed message is: 0A0B10C11A010A100A110B where each
character is replaced by its binary 8-bit ASCII code.
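The 71-bit total can be checked mechanically. The helper below is a sketch that applies the formula above (the class and method names are our own):

```java
public class LZ78Bits {
    // Bits for the integer part of the codeword at (1-based) index i,
    // plus 8 bits for the ASCII character.
    public static int codewordBits(int i) {
        int intBits = 32 - Integer.numberOfLeadingZeros(i - 1); // significant bits of i - 1
        return Math.max(1, intBits) + 8;                        // index 1 still needs 1 bit
    }

    // Total bits for a compressed message of codewordCount pairs.
    public static int totalBits(int codewordCount) {
        int total = 0;
        for (int i = 1; i <= codewordCount; i++)
            total += codewordBits(i);
        return total;
    }
}
```

For the 7 codewords of the example, `LZ78Bits.totalBits(7)` gives 71, matching the hand count.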
LZ78 Decompression Algorithm
We now present the LZ78 decompression algorithm.
Dictionary ← empty; DictionaryIndex ← 1;
while(there are more (CodeWord, Char) pairs in codestream)
{
    CodeWord ← next CodeWord in codestream;
    Char ← character corresponding to CodeWord;
    if(CodeWord == 0)
        String ← empty;
    else
        String ← string at index CodeWord in Dictionary;
    Output: String + Char;
    insertInDictionary( (DictionaryIndex, String + Char) );
    DictionaryIndex++;
}

In summary,
- input: (CW, character) pairs
- output:
if(CW == 0)
output: currentCharacter
else
output: stringAtIndex CW + currentCharacter
- Insert: current output in dictionary
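The decompression algorithm above can be sketched in Java with the Dictionary as an ArrayList indexed by codeword. The `decode` signature taking parallel arrays of codewords and characters is an illustrative choice; a terminal pair with no character is passed as the char value 0:

```java
import java.util.ArrayList;
import java.util.List;

public class LZ78Decoder {
    // codeWords[k] and chars[k] form the k-th (CodeWord, Char) pair;
    // a terminal pair with no character is passed as the char value 0.
    public static String decode(int[] codeWords, char[] chars) {
        List<String> dictionary = new ArrayList<>();
        dictionary.add("");                        // index 0 = empty string
        StringBuilder output = new StringBuilder();
        for (int k = 0; k < codeWords.length; k++) {
            String s = dictionary.get(codeWords[k]);
            String entry = (chars[k] == 0) ? s : s + chars[k];
            output.append(entry);
            dictionary.add(entry);                 // next DictionaryIndex
        }
        return output.toString();
    }
}
```

Decoding the pairs of Example 4 this way reproduces ABBCBCABABCAABCAAB.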

Example 4: Decode (i.e., decompress) the sequence (0, A) (0, B) (2, C) (3, A)
(2, A) (4, A) (6, B)
Solution

Figure 4.2.4: LZ78 Decompression example 1


The decompressed message is: ABBCBCABABCAABCAAB
Example 5: Decode (i.e., decompress) the sequence (0, B) (0, A) (1, A) (2, B)
(0, R) (5, R) (2, )
Solution

The decompressed message is: BABAABRRRA

Example 6: Decode (i.e., decompress) the sequence (0, A) (1, A) (2, A) (3, )
Solution

The decompressed message is: AAAAAAAAA

2.2 Lempel-Ziv-Welch (LZW) Compression Algorithm


2.2.1 Introduction to the LZW Algorithm
If the message to be encoded consists of only one character, LZW outputs the
code for this character; otherwise it inserts two- or multi-character,
overlapping, distinct patterns of the message to be encoded in a Dictionary:
the last character of a pattern is the first character of the next pattern.

The patterns are of the form: C₀C₁ … Cₙ₋₁Cₙ. The prefix of a pattern consists of
all the pattern characters except the last: C₀C₁ … Cₙ₋₁.
LZW output if the message consists of more than one character:
- If the pattern is not the last one; output: The code for its prefix.
- If the pattern is the last one:
i. if the last pattern exists in the Dictionary; output: The code
for the pattern.
ii. If the last pattern does not exist in the Dictionary; output:
code(lastPrefix) then output: code(lastCharacter)
Note: LZW outputs codewords that are 12 bits each. Since there are 2¹² = 4096
codeword possibilities, the minimum size of the Dictionary is 4096; however,
since the Dictionary is usually implemented as a hash table, its size is larger
than 4096.

2.2.2 LZW Encoding Algorithm


The LZW encoding algorithm is presented below.
Initialise Dictionary with 256 single-character strings and their corresponding
ASCII codes;
Prefix ← first input character;
CodeWord ← 256;
while(not end of character stream)
{
    Char ← next input character;
    if(Prefix + Char exists in the Dictionary)
        Prefix ← Prefix + Char;
    else
    {
        Output: the code for Prefix;
        insertInDictionary( (CodeWord, Prefix + Char) );
        CodeWord++;
        Prefix ← Char;
    }
}
Output: the code for Prefix;

Example 7: Encode the string BABAABAAA by the LZW encoding algorithm.

Solution

Figure 4.2.5: LZW Compression example 1

1. BA is not in the Dictionary; insert BA, output the code for its prefix: code(B)
2. AB is not in the Dictionary; insert AB, output the code for its prefix: code(A)
3. BA is in the Dictionary.
BAA is not in Dictionary; insert BAA, output the code for its prefix:
code(BA)
4. AB is in the Dictionary.
ABA is not in the Dictionary; insert ABA, output the code for its prefix:
code(AB)
5. AA is not in the Dictionary; insert AA, output the code for its prefix: code(A)
6. AA is in the Dictionary and it is the last pattern; output its code: code(AA)

The compressed message is: <66><65><256><257><65><260>


Example 8: Encode the string BABAABRRRA by the LZW encoding
algorithm.
Solution

Figure 4.2.6: LZW Compression example 2


1. BA is not in the Dictionary; insert BA, output the code for its prefix: code(B)
2. AB is not in the Dictionary; insert AB, output the code for its prefix: code(A)
3. BA is in the Dictionary.

BAA is not in Dictionary; insert BAA, output the code for its prefix:
code(BA)
4. AB is in the Dictionary.
ABR is not in the Dictionary; insert ABR, output the code for its prefix:
code(AB)
5. RR is not in the Dictionary; insert RR, output the code for its prefix: code(R)
6. RR is in the Dictionary.
RRA is not in the Dictionary and it is the last pattern; insert RRA, output
code for its prefix:
code(RR), then output code for last character: code(A)

The compressed message is: <66><65><256><257><82><260> <65>

LZW: Number of bits transmitted


Example: Uncompressed String: aaabbbbbbaabaaba
Number of bits = Total number of characters * 8
= 16 * 8
= 128 bits
Compressed string (codewords): <97><256><98><258><259><257><261>
Number of bits = Total Number of codewords * 12
= 7 * 12
= 84 bits
Note: Each codeword is 12 bits because the minimum Dictionary size is
taken as 4096, and 2¹² = 4096.

2.2.3 LZW Decoding Algorithm


The LZW decompressor creates the same string table during decompression.
The LZW decoding algorithm is presented below:

Initialize Dictionary with 256 ASCII codes and the corresponding single-character
strings as their translations;
PreviousCodeWord ← first input code;
Output: string(PreviousCodeWord);
Char ← character(first input code);
CodeWord ← 256;
while(not end of code stream)
{
    CurrentCodeWord ← next input code;
    if(CurrentCodeWord exists in the Dictionary)
        String ← string(CurrentCodeWord);
    else
        String ← string(PreviousCodeWord) + Char;
    Output: String;
    Char ← first character of String;
    insertInDictionary( (CodeWord, string(PreviousCodeWord) + Char) );
    PreviousCodeWord ← CurrentCodeWord;
    CodeWord++;
}

In-text Question 2
Which compression algorithm inserts one- or multi-character, non-overlapping, distinct
patterns of the message to be encoded in a dictionary?

Answer
LZ78 Compression Algorithm

Summary of LZW decoding algorithm:
output: string(first CodeWord);
while(there are more CodeWords)
{
    if(CurrentCodeWord is in the Dictionary)
        output: string(CurrentCodeWord);
    else
        output: PreviousOutput + first character of PreviousOutput;
    insert in the Dictionary: PreviousOutput + first character of CurrentOutput;
}
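A Java sketch of the decoder, mirroring the summary above (the `decode` signature is an illustrative choice). Note how the else branch handles a codeword that is not yet in the Dictionary:

```java
import java.util.ArrayList;
import java.util.List;

public class LZWDecoder {
    public static String decode(int[] codes) {
        List<String> dictionary = new ArrayList<>();
        for (int i = 0; i < 256; i++)               // 256 single-character translations
            dictionary.add(String.valueOf((char) i));
        String previous = dictionary.get(codes[0]);
        StringBuilder output = new StringBuilder(previous);
        for (int k = 1; k < codes.length; k++) {
            String current;
            if (codes[k] < dictionary.size())       // codeword already in the Dictionary
                current = dictionary.get(codes[k]);
            else                                    // not yet inserted: previous + its first char
                current = previous + previous.charAt(0);
            output.append(current);
            dictionary.add(previous + current.charAt(0));
            previous = current;
        }
        return output.toString();
    }
}
```

Decoding [66, 65, 256, 257, 65, 260] this way reproduces BABAABAAA, as in Example 9.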

Example 9: Use LZW to decompress the output sequence <66> <65> <256>
<257> <65> <260>
Solution

Figure 4.2.7: LZW Decompression example 1


1. 66 is in Dictionary; output string(66) i.e. B
2. 65 is in Dictionary; output string(65) i.e. A, insert BA
3. 256 is in Dictionary; output string(256) i.e. BA, insert AB
4. 257 is in Dictionary; output string(257) i.e. AB, insert BAA
5. 65 is in Dictionary; output string(65) i.e. A, insert ABA
6. 260 is not in Dictionary; output
previous output + previous output first character: AA, insert AA

Example 10: Decode the sequence <67> <70> <256> <258> <259> <257> by
LZW decode algorithm.
Solution

Figure 4.2.8: LZW Decompression example 2


1. 67 is in Dictionary; output string(67) i.e. C
2. 70 is in Dictionary; output string(70) i.e. F, insert CF
3. 256 is in Dictionary; output string(256) i.e. CF, insert FC
4. 258 is not in Dictionary; output previous output + C i.e. CFC, insert
CFC
5. 259 is not in Dictionary; output previous output + C i.e. CFCC, insert
CFCC
6. 257 is in Dictionary; output string(257) i.e. FC, insert CFCCF

2.2.4 LZW Limitations


What happens when the dictionary gets too large? One approach is to clear
entries 256-4095 and start building the dictionary again. The same approach
must also be used by the decoder.

3.0 Tutor Marked Assignments (Individual or Group)
1. Use LZ78 to trace encoding the string SATATASACITASA.
2. Write a Java program that encodes a given string using LZ78.
3. Write a Java program that decodes a given set of encoded codewords
using LZ78.
4. Use LZW to trace encoding the string ABRACADABRA.
5. Write a Java program that encodes a given string using LZW.
6. Write a Java program that decodes a given set of encoded codewords
using LZW.

4.0 Conclusion/Summary
In this study session, you were introduced to the Lempel-Ziv encoding. You
learnt that this encoding technique consists of two different algorithms namely
LZ77 and LZ78. You studied these algorithms with some examples. You also
learnt about Lempel-Ziv-Welch (LZW) compression algorithm in the
concluding part of this study session.

5.0 Self-Assessment Questions


1. All of the adaptive compression methods are two-pass methods. (True or
False)?
2. If the message to be encoded consists of only one character, __________
compression algorithm outputs the code for this character; otherwise it
inserts two- or multi-character, overlapping, distinct patterns of the
message to be encoded in a dictionary.

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/2ZsDDpH , http://bit.ly/2PbTCIJ , http://bit.ly/2ZrEnPW ,

http://bit.ly/2ZilcZM , http://bit.ly/2L707aA , http://bit.ly/2PfJE9y. Watch the video &


summarise in 1 paragraph.
b. View the animation on http://bit.ly/2KXjR1F and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on comparison between LZ78 and
LZW compression algorithms; In 2 paragraphs summarise their opinion of the
discussed topic. etc.

7.0 Self Assessment Question Answers


1. False
2. LZW Compression algorithm

8.0 References/Further Readings


Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Anany L., “Introduction to the Design and Analysis of Algorithms”,
3rd Edition, Pearson Education, Inc., 2012.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.

STUDY SESSION 3
Garbage Collection
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1- How Objects are created in Java
2.1.1 - How Java Reclaims Objects Memory
2.2 - What is Garbage?
2.2.1 - What is Garbage Collection?
2.2.2 - Advantages and disadvantages of garbage collection
2.2.3 - Helping the garbage collector
2.3 - Garbage collection schemes
2.3.1 - Reference Counting
2.3.2 - Mark and Sweep
2.3.3 - Stop and Copy
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
Have you ever wondered how programming languages (Java in our case) handle
unreferenced objects? This study session is an introduction to garbage
collection, with Java as our focus programming language. Not only will you
learn what garbage collection is, its advantages and disadvantages will also be
considered. We shall conclude the study session by learning about garbage
collection schemes.

1.0 Study Session Learning Outcomes


After studying this session, I expect you to be able to:
1. Describe how objects are created in Java and how Java reclaims objects
memory
2. Define garbage collection
3. Outline advantages and disadvantages of garbage collection
4. List and explain different garbage collection schemes

2.0 Main Content


2.1 How Objects are created in Java
An object is created in Java by invoking the new operator. When the new
operator is called, the JVM will do the following:
- allocate memory;
- assign fields their default values;
- run the constructor;
- return a reference.

2.1.1 How Java Reclaims Objects Memory


Java does not provide the programmer any means to destroy objects explicitly.
Instead, it reclaims the memory of unreferenced objects implicitly through an
internal garbage collector.
The advantages of this are:
- No dangling reference problem in Java
- Easier programming
- No memory leak problem

2.2 What is Garbage?


Garbage is defined as unreferenced objects. For example, assuming we have a
Java class called Student and we do the following:
Student ali= new Student();
Student khalid= new Student();
ali=khalid;
Now the Student object originally referenced by ali becomes garbage, because it
is an unreferenced object, as shown in Figure 4.3.1 below.

Figure 4.3.1: Example of garbage

2.2.1 What is Garbage Collection?
Garbage collection is the process of finding garbage and reclaiming the memory
allocated to it. Garbage collection is done so that the heap space occupied by
an unreferenced object can be recycled and made available for subsequent new
objects. The garbage collection process is invoked when the total memory
allocated to a Java program exceeds some threshold. A running program is
affected by garbage collection because the program is suspended while garbage
collection takes place.

2.2.2 Advantages and disadvantages of garbage collection


The following are the advantages that come with garbage collection:
1. GC eliminates the need for the programmer to deallocate memory blocks
explicitly.
2. Garbage collection helps ensure program integrity.
3. Garbage collection can also dramatically simplify programs.
Just as garbage collection has advantages, it also has some disadvantages. They
are:
1. Garbage collection adds an overhead that can affect program
performance.
2. GC requires extra memory.
3. Programmers have less control over the scheduling of CPU time.

2.2.3 Helping the garbage collector


There are some programming practices you can employ to make the job of the
garbage collector easier. We outline some of them as follows.
1. Reuse objects instead of generating new ones.
- This program generates one million objects and prints them out,
generating a lot of garbage:
for (int i = 0; i < 1000000; ++i) {
    SomeClass obj = new SomeClass(i);
    System.out.println(obj);
}
- By using only one object instead and implementing a setInt() method,
we drastically reduce the garbage generated:
SomeClass obj = new SomeClass();
for (int i = 0; i < 1000000; ++i) {
    obj.setInt(i);
    System.out.println(obj);
}
2. Eliminate all references to objects that are no longer needed
- This can be done by assigning null to every variable that refers to an
object that is no longer needed

2.3 Garbage collection schemes


There are three main methods of garbage collection. They are:
1. Reference counting
2. Mark-and-sweep
3. Stop-and-copy garbage collection.

2.3.1 Reference Counting


The main idea of reference counting is to add a reference count field to every
object. This field is updated whenever the number of references to the object
changes.
Example
Object p= new Integer(57);
Object q = p;

Figure 4.3.2: Reference Counting

In-text Question 1
Java provides programmers with a means to destroy objects explicitly. (True or False)?

Answer
False

The update of the reference count field when we have a reference assignment
(i.e., p = q) can be implemented as follows (pseudocode: refCount is the
conceptual count field of the referenced object):
Example:
Object p = new Integer(57);
Object q = new Integer(99);
p = q;
is implemented as:
if (p != q) {
    if (p != null)
        --p.refCount;
    p = q;
    if (p != null)
        ++p.refCount;
}

Figure 4.3.3: Reference counting example


Reference counting will fail whenever the data structure contains a cycle of
references and the cycle is not reachable from a global or local reference.
Figure 4.3.4: When Reference counting fails

Here are some pros and cons of the reference counting approach to garbage
collection.
Advantages
1. Conceptually simple: Garbage is easily identified
2. It is easy to implement.
3. Immediate reclamation of storage
4. Objects are not moved in memory during garbage collection.
Disadvantages
1. Reference counting does not detect garbage with cyclic references.
2. The overhead of incrementing and decrementing the reference count each
time.
3. Extra space: A count field is needed in each object.
4. It may increase heap fragmentation.

2.3.2 Mark and Sweep


The mark-and-sweep algorithm is divided into two phases:
- Mark phase: the garbage collector traverses the graph of references from
the root nodes and marks each heap object it encounters. Each object has
an extra bit: the mark bit – initially the mark bit is 0. It is set to 1 for the
reachable objects in the mark phase.
- Sweep phase: the GC scans the heap looking for objects with mark bit 0
– these objects have not been visited in the mark phase – they are
garbage. Any such object is added to the free list of objects that can be
reallocated. The objects with a mark bit 1 have their mark bit reset to 0.
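As an illustration only, the two phases can be sketched over a toy heap of objects. The `HeapObject` class with its mark bit and references list is hypothetical; a real collector traverses the JVM's internal object graph:

```java
import java.util.ArrayList;
import java.util.List;

class HeapObject {
    boolean marked = false;                    // the extra mark bit, initially 0
    List<HeapObject> references = new ArrayList<>();
}

public class MarkAndSweep {
    // Mark phase: depth-first traversal from a root, setting the mark bit
    // of every reachable object (cycles are handled by the marked check).
    static void mark(HeapObject obj) {
        if (obj == null || obj.marked) return;
        obj.marked = true;
        for (HeapObject ref : obj.references)
            mark(ref);
    }

    // Sweep phase: unmarked objects are garbage; survivors get their bit reset.
    static List<HeapObject> sweep(List<HeapObject> heap) {
        List<HeapObject> survivors = new ArrayList<>();
        for (HeapObject obj : heap) {
            if (obj.marked) {
                obj.marked = false;            // reset for the next collection
                survivors.add(obj);
            }
            // unmarked objects would be added to the free list here
        }
        return survivors;
    }
}
```

Note that a cycle of two objects reachable from a root survives the sweep, while an unreferenced object is reclaimed, which is exactly the case where reference counting fails.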

The pros and cons of the mark and sweep approach are:
Advantages
1. It is able to reclaim garbage that contains cyclic references.
2. There is no overhead in storing and manipulating reference count fields.

3. Objects are not moved during GC – no need to update the references to
objects.

Disadvantages
1. It may increase heap fragmentation.
2. It does work proportional to the size of the entire heap.
3. The program must be halted while garbage collection is being performed.

2.3.3 Stop and Copy


In the stop and copy approach to garbage collection, the heap is divided into
two regions: Active and Inactive. Objects are allocated from the active region
only. When all the space in the active region has been exhausted, program
execution is stopped and the heap is traversed. Live objects are copied to the
other region as they are encountered by the traversal. The role of the two
regions is reversed, i.e., swap (active, inactive).
Figure 4.3.5 shows a graphical depiction of a garbage-collected heap that uses a
stop and copy algorithm. This figure shows nine snapshots of the heap over
time.

In-text Question 2
_____________ Garbage collection scheme adds a reference count field for every object and
updates this field whenever the number of references to an object changes.

Answer
Reference Counting

Figure 4.3.5: A graphical depiction of a garbage-collected heap using stop and copy
approach

The pros and cons of the stop and copy approach are:
Advantages
1. Only one pass through the data is required.
2. It de-fragments the heap.
3. It does work proportional to the amount of live objects and not to the
memory size.
4. It is able to reclaim garbage that contains cyclic references.
5. There is no overhead in storing and manipulating reference count fields.
Disadvantages
1. Twice as much memory is needed for a given amount of heap space.
2. Objects are moved in memory during garbage collection (i.e., references
need to be updated)
The program must be halted while garbage collection is being performed.

3.0 Tutor Marked Assignments (Individual or Group)


1. What is garbage? Show an example of how it is created by a Java
program.
2. List two advantages of garbage collection.
3. List two disadvantages of garbage collection.

4. Describe how objects are created in Java.
5. List and explain the ways in which a programmer can help the garbage
collector.
6. List and explain the garbage collection schemes we discussed.

4.0 Conclusion/Summary
In this study session, you were introduced to garbage collection, with Java as
our focus programming language. You learnt what garbage collection is, and its
advantages and disadvantages. We concluded the study session by learning
about garbage collection schemes.

5.0 Self-Assessment Questions


1. What is Garbage Collection?
2. Which garbage collection scheme is divided into two phases named mark
and sweep?

6.0 Additional Activities (Videos, Animations & Out of Class activities) e.g.
a. Visit U-tube http://bit.ly/2Zx9yJw , http://bit.ly/33VptRz , http://bit.ly/2Pvg9kf,

http://bit.ly/2PdPCYo , http://bit.ly/2MDbnPk , http://bit.ly/2ZsSYX8. Watch the video &


summarise in 1 paragraph.
b. View the animation on http://bit.ly/2Pvg9kf and critique it in the discussion
forum.
c. Take a walk and engage any 3 students on Garbage collection schemes; In 2
paragraphs summarise their opinion of the discussed topic. etc.

7.0 Self Assessment Question Answers


1. Garbage Collection is the process of finding garbage and reclaiming
memory allocated to it.
2. Mark and Sweep
8.0 References/Further Readings
Adam D., “Data Structures and Algorithms in Java”, 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Bruno R. P., “Data Structures and Algorithms with Object Oriented
Design Patterns in Java”, John Wiley & Sons, Inc., 2000.
Nell D., Daniel T. J., Chip W., “Object-Oriented Data Structures using Java”,
Jones and Bartlett Publishers, Inc., 2002.
Robert L., “Data Structures and Algorithms in Java”, 2nd Edition,
Sams Publishing, 2003.

STUDY SESSION 4
Memory Management
Section and Subsection Headings:
Introduction
1.0 Learning Outcomes
2.0 Main Content
2.1 - Memory Areas and their use
2.2 - Memory Manager Tasks
2.3 - Free List
2.3.1 - Free List Implementations
2.3.1.1 - Singly-linked list implementation of free-list
2.3.1.2 - Doubly-linked list implementation of free-list
2.3.1.3 - Buddy Systems implementation of free-list
2.3.1.4 - Binary Buddy System implementation of free-list
3.0 Tutor Marked Assignments (Individual or Group assignments)
4.0 Study Session Summary and Conclusion
5.0 Self-Assessment Questions and Answers
6.0 Additional Activities (Videos, Animations & Out of Class activities)
7.0 Self-Assessment Question Answers
8.0 References/Further Readings

Introduction:
In the last study session of this module and this course, we shall study memory
management. You will learn how memories used by programs are allocated.
You will also be briefly introduced to the memory manager and problems faced
by memory allocation. We will conclude this study session by looking at free
lists and the implementation of free lists with examples using different data
structures.

1.0 Study Session Learning Outcomes
After studying this session, I expect you to be able to:
1. Explain memory management
2. Describe memory manager and its problems
3. Explain free lists and some common implementations of free lists
4. Outline allocation policies

2.0 Main Content


2.1 Memory Areas and Their Use
In programming languages like C or Java, the memory used by a program can be
allocated from three different areas:
- Static: laid out at compilation time, and allocated when the program
starts. It is used for global variables and constants.
- Stack: memory is allocated and freed dynamically, in LIFO order. It is
used for local variables and parameters.
- Heap: memory is allocated and freed dynamically, in any order. It is used
for data outliving the method which created them. In Java, all objects are
stored in the heap.
The memory management techniques we discuss in this study session apply
exclusively to the management of the heap.
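The three areas can be seen in a small Java sketch (the class and method names below are ours, for illustration):

```java
public class MemoryAreas {
    // Static area: laid out at compilation time, allocated when the program starts.
    static final int LIMIT = 100;        // constant
    static int counter = 0;              // global (class) variable

    static int[] makeBuffer(int size) {  // 'size' is a parameter: stack
        int doubled = size * 2;          // local variable: stack, freed on return
        return new int[doubled];         // the array object itself: heap, since
    }                                    // it outlives the method that created it

    public static void main(String[] args) {
        int[] buf = makeBuffer(LIMIT);   // the reference 'buf' lives on the stack;
        counter++;                       // the array it points to lives on the heap
        System.out.println(buf.length + " " + counter); // prints "200 1"
    }
}
```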

Figure 4.4.1: Memory area

2.2 Memory Manager Tasks


The memory manager is part of the Operating System. It must keep track of
which parts of the heap are free, and which are allocated.
A memory manager supports the following operations:
- acquire: allocates memory needed by programs
- release: deallocates memory no longer needed by programs
It also defragments memory when needed.

Problems faced in memory allocation


Memory allocation has some problems. We outline some of them as follows:
1. Memory fragmentation: Memory fragmentation is of two types namely:
- External fragmentation: Memory wasted outside allocated blocks
- Internal fragmentation: Memory wasted inside allocated block. Results
when memory allocated is larger than memory requested.

Figure 4.4.2: Memory allocation
2. Overhead: Additional memory that must be allocated, above and beyond
that requested by programs, in order to provide for the management of the
heap.
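A quick worked number for internal fragmentation, assuming an allocator that can only hand out power-of-two block sizes (the method name is ours):

```java
public class Fragmentation {
    // Round a request up to the next power of two and report the wasted bytes.
    static int internalWaste(int requested) {
        int block = 1;
        while (block < requested) block <<= 1; // smallest power of two >= requested
        return block - requested;              // bytes wasted inside the block
    }

    public static void main(String[] args) {
        System.out.println(internalWaste(600)); // 600 -> 1024-byte block: prints 424
        System.out.println(internalWaste(512)); // exact power of two: prints 0
    }
}
```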

2.3 Free List


The Memory manager uses a free list data structure that keeps track of free
memory blocks in a scheme for dynamic memory allocation. Some common
implementations for free list are:
- Singly-linked list
- Doubly-linked list
- Buddy systems: an array of doubly-linked lists
Some allocation policies used by the memory manager in allocating memory are:
- First fit chooses the first block in the free list big enough to satisfy the
request, and splits it.
- Next fit is like first fit, except that the search for a fitting block starts
where the last one stopped, instead of at the beginning of the free list.
- Best fit chooses the smallest block bigger than the requested one.
- Worst fit chooses the biggest, with the aim of avoiding the creation of
too many small fragments – but it doesn't work well in practice.
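First fit and best fit can be sketched over a simple list of free-block sizes (a simplification: a real free list also tracks addresses; the method names are ours):

```java
import java.util.List;

public class AllocationPolicies {
    // First fit: return the index of the first block large enough, or -1.
    static int firstFit(List<Integer> free, int request) {
        for (int i = 0; i < free.size(); i++)
            if (free.get(i) >= request) return i;
        return -1;
    }

    // Best fit: return the index of the smallest block that still fits, or -1.
    static int bestFit(List<Integer> free, int request) {
        int best = -1;
        for (int i = 0; i < free.size(); i++)
            if (free.get(i) >= request
                    && (best == -1 || free.get(i) < free.get(best)))
                best = i;
        return best;
    }

    public static void main(String[] args) {
        List<Integer> free = List.of(500, 120, 700, 300);
        System.out.println(firstFit(free, 200)); // 500 is the first big enough: 0
        System.out.println(bestFit(free, 200));  // 300 is the smallest big enough: 3
    }
}
```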

2.3.1 Free List Implementations
2.3.1.1 Singly-linked list implementation of free-list
In the singly-linked list implementation of free lists, each node represents a free
block of memory. Nodes must be sorted according to start addresses of free
blocks so that adjacent free memory blocks can be combined.
The acquire( ) and release( ) operations which are used to allocate and release
memory respectively, are O(n), where n is the number of blocks in the heap. In
order to acquire a block, a node is searched for following one of the allocation
policies. If the block is bigger than requested, it is divided into two: one part
is allocated and one remains in the list.
In order to release a block, a new node must be inserted (if the adjacent block is
not on the free list) or a node, which contains the adjacent free block, must be
modified. Searching for the place of the new or existing node has complexity
O(n).
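A minimal sketch of this address-sorted singly-linked free list (class and field names are ours; addresses and sizes are plain ints):

```java
public class FreeList {
    static class Node {               // one node per free block
        int start, size;              // start address and length of the block
        Node next;
        Node(int start, int size) { this.start = start; this.size = size; }
    }
    Node head;                        // nodes kept sorted by start address

    // First-fit acquire: O(n) scan; split the block if it is bigger than needed.
    int acquire(int size) {
        Node prev = null;
        for (Node n = head; n != null; prev = n, n = n.next) {
            if (n.size >= size) {
                int addr = n.start;
                n.start += size;      // split: the remainder stays in the list
                n.size -= size;
                if (n.size == 0) {    // exact fit: unlink the node
                    if (prev == null) head = n.next; else prev.next = n.next;
                }
                return addr;
            }
        }
        return -1;                    // no block large enough
    }

    // Release: O(n) sorted insert, combining with adjacent free blocks.
    void release(int start, int size) {
        Node prev = null, cur = head;
        while (cur != null && cur.start < start) { prev = cur; cur = cur.next; }
        Node node = new Node(start, size);
        node.next = cur;
        if (prev == null) head = node; else prev.next = node;
        if (cur != null && node.start + node.size == cur.start) {   // merge right
            node.size += cur.size; node.next = cur.next;
        }
        if (prev != null && prev.start + prev.size == node.start) { // merge left
            prev.size += node.size; prev.next = node.next;
        }
    }

    public static void main(String[] args) {
        FreeList fl = new FreeList();
        fl.release(0, 1000);                 // the whole heap, 0..999, is free
        System.out.println(fl.acquire(200)); // first fit: block at address 0
        System.out.println(fl.acquire(300)); // next block at address 200
    }
}
```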

2.3.1.2 Doubly-linked list implementation of free-list


In this implementation, nodes are not sorted according to start addresses of free
blocks. All memory blocks have boundary tags between them. The tags have
information about the size and status (allocated/free). Each node in the doubly
linked list represents a free block. It keeps size & start address of the free block
and start addresses and sizes of the previous and next memory blocks. The
adjacent blocks may or may not be free.
The release operation does not combine adjacent free blocks. It simply prepends
a node corresponding to a released block at the front of the free list. This
operation is thus O(1). Adjacent free blocks are combined by acquire().
The acquire operation traverses the free list in order to find a free area of a
suitable size. As it does, so it also combines adjacent free blocks.

In-text Question 1
A memory manager is part of the operating system. (True or False)?

Answer
True

Example:

Figure 4.4.3: Node structure in a doubly-linked list implementation of free list

Figure 4.4.4: Initial state of memory (shaded=allocated, grayed=boundary tags)

Figure 4.4.5: The corresponding free list


Now the operation release(400, 4000) will result in:

Figure 4.4.6: Result of release(400, 4000)


The node corresponding to the freed block is prepended to the front of the free
list. The nodes x, y, and z correspond to the three free blocks that have not yet
been combined.

Figure 4.4.7: The node corresponding to the freed block

The operation acquire(600) using the first-fit allocation policy will first result
in the combination of the three adjacent free blocks:

Figure 4.4.8: Result of acquire(600)


At this point the corresponding free list is:

Figure 4.4.9: The corresponding free list


The required 600 bytes are then allocated, resulting in:

Figure 4.4.10: Allocation of the 600 bytes


The corresponding free list is:

Figure 4.4.11: The corresponding free list


2.3.1.3 Buddy Systems implementation of free-list
Instead of having a single free list, a buddy system has an array of free lists;
each element of the array holding blocks of the same size. One type of buddy
system is the binary buddy system.
For a memory of size m, there are free lists of sizes 2^0, 2^1, 2^2, ..., 2^k,
where 2^k ≤ m.
The heap is viewed as one large block which can be split into two equal smaller
blocks, called buddies. Each of these smaller blocks can again be split into two
equal smaller buddies, and so on. Each memory block has its "buddy". The
"buddy" of a block of size 2^k that starts at address x is the block of size 2^k
that starts at address y = complementBit_k(x), where the address bits are
numbered from right to left starting with 0.

Buddies
If each block is of size 8 bytes (i.e., 2^3 bytes), then the buddy of a block is
obtained by complementing bit 3 of its starting address. If each block is of size
4 bytes (i.e., 2^2 bytes), then the buddy of a block is obtained by complementing
bit 2 of its starting address.
Example: What is the starting address of the buddy of a block that starts at
address 1100101010101101 if each block is 16 bytes?
Solution: 16 = 2^4; the starting address of the buddy is obtained by
complementing bit 4: 1100101010111101
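Complementing bit k of an address is a single XOR, which is exactly what complementBit_k computes. A minimal sketch (the method name is ours):

```java
public class BuddyAddress {
    // The buddy of a block of size 2^k starting at x differs from x
    // only in bit k (bits numbered from the right, starting at 0).
    static int buddyOf(int x, int k) {
        return x ^ (1 << k);          // complement bit k
    }

    public static void main(String[] args) {
        int addr = 0b1100101010101101;
        // Block size 16 = 2^4, so complement bit 4:
        System.out.println(Integer.toBinaryString(buddyOf(addr, 4)));
        // prints 1100101010111101, matching the worked example above
    }
}
```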

Figure 4.4.12: Buddies

2.3.1.4 Binary Buddy System implementation of free-list


In this implementation, each array element is a list of free blocks of the same
size. The size of each block is a power of 2.

Figure 4.4.13: Binary Buddy System implementation of free-list

Binary Buddy System Algorithms


acquire(x): the smallest k with x ≤ 2^k is found, and the corresponding free
list is searched.
- If there is a block in this list, it is allocated;
- otherwise a block of size 2^(k+1), 2^(k+2), and so on is searched for and
taken off its free list. The block is divided into two buddies. One buddy is
put on the free list for the next lower size and the other is either
allocated or further split if needed.
release(x): the block, of size 2^k, is placed back in the free list of its size, and
- if its buddy is also free, they are combined to form a free block of size
2^(k+1). This block is then moved to the corresponding free list.
- If the buddy of this larger block is also free, they are combined to form a
free block of size 2^(k+2), which is then moved to the appropriate free
list, and so on.
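These acquire/release steps can be sketched with an array of free lists, one list per block size 2^k (a simplified model; all names are ours, and a real allocator would store the lists in the heap itself):

```java
import java.util.ArrayList;
import java.util.List;

public class BinaryBuddy {
    final int maxK;                       // heap size is 2^maxK bytes
    final List<List<Integer>> free;       // free.get(k) = starts of free 2^k blocks

    BinaryBuddy(int maxK) {
        this.maxK = maxK;
        free = new ArrayList<>();
        for (int k = 0; k <= maxK; k++) free.add(new ArrayList<>());
        free.get(maxK).add(0);            // initially one free block: the whole heap
    }

    // acquire: find the smallest k with x <= 2^k, then split larger blocks as needed.
    int acquire(int x) {
        int k = 0;
        while ((1 << k) < x) k++;
        int j = k;
        while (j <= maxK && free.get(j).isEmpty()) j++;  // first non-empty list
        if (j > maxK) return -1;                         // out of memory
        int addr = free.get(j).remove(free.get(j).size() - 1);
        while (j > k) {                   // split down to the requested size:
            j--;                          // keep the lower half, put the upper
            free.get(j).add(addr ^ (1 << j)); // buddy on the next lower free list
        }
        return addr;
    }

    // release: put the 2^k block back, combining with its buddy while possible.
    void release(int addr, int k) {
        while (k < maxK) {
            int buddy = addr ^ (1 << k);
            if (!free.get(k).remove((Integer) buddy)) break; // buddy not free: stop
            addr = Math.min(addr, buddy); // merged block starts at the lower address
            k++;
        }
        free.get(k).add(addr);
    }

    public static void main(String[] args) {
        BinaryBuddy b = new BinaryBuddy(4);     // a 16-byte heap
        System.out.println(b.acquire(4));       // splits 16 -> 8 -> 4: prints 0
        System.out.println(b.acquire(4));       // takes the waiting buddy: prints 4
    }
}
```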

Buddy Systems Advantages/Disadvantages
Advantage:
- Both acquire( ) and release( ) operations are fast.
Disadvantages:
- Only memory whose size is a power of 2 can be allocated, leading to
internal fragmentation if the requested size is not a power of 2.
- When a block is released, its buddy may not be free, resulting in external
fragmentation.

In-text Question 2
Highlight the advantage of buddy systems discussed in the study session.

Answer
Both acquire( ) and release( ) operations are fast.

3.0 Tutor Marked Assignments (Individual or Group)


1. In programming languages like C or Java, the memory used by a program
can be allocated from three different areas. List and explain them.
2. List the operations supported by a memory manager.
3. List and explain the problems faced in memory allocation.
4. Highlight the memory allocation policies discussed in this study session.

4.0 Conclusion/Summary
In the last study session of this module and course, you studied memory
management. You learnt how memories used by programs are allocated. You
were also briefly introduced to the memory manager and problems faced by
memory allocation. We concluded this study session by looking at free lists and
the implementation of free lists with examples using different data structures.

5.0 Self-Assessment Questions
1. In which implementation of free list must nodes be sorted according to
start addresses of free blocks so that adjacent free memory blocks can be
combined?
2. Highlight the disadvantages of buddy systems discussed in the study
session.

6.0 Additional Activities (Videos, Animations & Out of Class activities)
a. Visit YouTube http://bit.ly/2ZsSYX8 , http://bit.ly/2ZoSUNj , http://bit.ly/2NwIe7R,

http://bit.ly/2U9fT9c , http://bit.ly/2HFbc1V , http://bit.ly/2U5DeZ7. Watch the videos and
summarise them in 1 paragraph.
b. View the animation on http://bit.ly/323jCaN and critique it in the discussion
forum
c. Take a walk and engage any 3 students on memory allocation policies; in 2
paragraphs, summarise their opinions of the discussed topic.

7.0 Self Assessment Question Answers


1. Singly-linked list implementation of free-list
2.
i. Only memory whose size is a power of 2 can be allocated, leading
to internal fragmentation if the requested size is not a power of 2.
ii. When a block is released, its buddy may not be free, resulting in
external fragmentation.

8.0 References/Further Readings


Adam D., "Data Structures and Algorithms in Java", 2nd Edition,
Thomson Learning, ISBN 0-534-49252-5.
Bruno R. P., "Data Structures and Algorithms with Object Oriented
Design Patterns in Java", John Wiley & Sons, Inc., 2000.

GLOSSARY
Classification: Grouping related things together. This is supported through
classes, inheritance and packages.
Encapsulation: Representing data and the set of operations on the data as a
single entity - exactly what classes do.
Information Hiding: An object should be in full control of its data, granting
specific access only to whom it wishes.
Inheritance: Java allows related classes to be organized in a hierarchical
manner using the extends keyword.
Polymorphism: Same code behaves differently at different times during
execution. This is due to dynamic binding.

Abstract classes: These are classes that contain a mix of methods declared with
or without an implementation. However, they cannot be instantiated.

Interfaces: An interface is a reference type in Java, similar to a class. It is a
collection of abstract methods and static constants.

Time efficiency: the time an algorithm takes to execute.


Space efficiency: the space (primary or secondary memory) an algorithm uses.
Basic operation: is an operation which takes a constant amount of time to
execute.
Worst-case efficiency: Worst-case efficiency of an algorithm is its efficiency
for the worst-case input of size n, which is an input (or inputs) of size n for
which the algorithm runs the longest among all possible inputs of that size.
Best-case efficiency: The best-case efficiency of an algorithm is its efficiency
for the best-case input of size n, which is an input (or inputs) of size n for which
the algorithm runs the fastest among all possible inputs of that size.

Big-O notation, O(g(n)), is used to give an upper bound (worst-case) on a
positive runtime function f(n) where n is the input size.
List data structure: is a sequential data structure, i.e. a collection of items
accessible one after another, beginning at the head and ending at the tail.

Singly linked list: is a data structure in which the data items are chained (linked)
in one direction

Doubly linked list: is a data structure in which the data items are chained
(linked) in two directions.

Stack: is a linear data structure in which all insertions and deletions of data are
made only at one end of the stack, often called the top of the stack.

Queue data structure: is characterized by the fact that additions are made at the
end, or tail, of the queue while removals are made from the front, or head of the
queue.

Recursion: is a technique that allows us to break down a problem into one or
more simpler sub-problems that are similar in form to the original problem.

Recurrence relation: A recurrence relation, T(n), is a recursive function of
integer variable n.

Tree: A tree is a finite set of nodes together with a finite set of directed edges
that define parent-child relationships.

Tree Traversal: The process of systematically visiting all the nodes in a tree
and performing some computation at each node in the tree is called a tree
traversal.

AVL tree: An AVL tree is a binary search tree with a height balance property.

Data compression: Data compression is the representation of an information
source (e.g. a data file, a speech signal, an image, or a video signal) as
accurately as possible using the fewest number of bits.

Simple Graph: A simple graph G = (V, E) consists of a non-empty set V,
whose members are called the vertices of G, and a set E of pairs of distinct
vertices from V, called the edges of G.

Topological sort: Topological sort is a method of arranging the vertices in a
directed acyclic graph (DAG), as a sequence, such that no vertex appears in the
sequence before its predecessor.

Garbage: Garbage is defined as unreferenced objects.

Garbage Collection: Garbage Collection is the process of finding garbage and
reclaiming memory allocated to it.
