Data Structures and Algorithms Using Python, 1st Edition
Author(s): Rance D. Necaise
ISBN(s): 9780470618295, 0470618299
Edition: 1
Year: 2010
Language: English
Data Structures and
Algorithms Using
Python
Rance D. Necaise
Department of Computer Science
College of William and Mary
This book was printed and bound by Hamilton Printing Company. The cover was
printed by Hamilton Printing Company.
Copyright ©2011 John Wiley & Sons, Inc. All rights reserved. No part of this
publication may be reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning or
otherwise, except as permitted under Sections 107 or 108 of the 1976 United States
Copyright Act, without either the prior written permission of the Publisher, or
authorization through payment of the appropriate per-copy fee to the Copyright
Clearance Center, Inc. 222 Rosewood Drive, Danvers, MA 01923, website
www.copyright.com. Requests to the Publisher for permission should be addressed
to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken,
NJ 07030-5774, (201)748-6011, fax (201)748-6008, website
http://www.wiley.com/go/permissions.
“Evaluation copies are provided to qualified academics and professionals for review
purposes only, for use in their courses during the next academic year. These copies
are licensed and may not be sold or transferred to a third party. Upon completion
of the review period, please return the evaluation copy to Wiley. Return
instructions and a free of charge return shipping label are available at
www.wiley.com/go/returnlabel. Outside of the United States, please contact your
local representative.”
Necaise, Rance D.
Data structures and algorithms using Python / Rance D. Necaise.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-61829-5 (pbk.)
1. Python (Computer program language) 2. Algorithms.
3. Data structures (Computer science) I. Title.
QA76.73.P98N43 2011
005.13'3—dc22 2010039903
10 9 8 7 6 5 4 3 2 1
To my nieces and nephews
Allison, Janey, Kevin, RJ, and Maria
Contents
Preface
Chapter 2: Arrays
2.1 The Array Structure
2.1.1 Why Study Arrays?
2.1.2 The Array Abstract Data Type
2.1.3 Implementing the Array
2.2 The Python List
The standard second course in computer science has traditionally covered the fun-
damental data structures and algorithms, but more recently these topics have been
included in the broader topic of abstract data types. This book is no exception,
with the main focus on the design, use, and implementation of abstract data types.
The importance of designing and using abstract data types for easier modular
programming is emphasized throughout the text. The traditional data structures
are presented in terms of implementing the various abstract data types, and
multiple implementations using different data structures reinforce the
abstraction concept. Common algorithms are also presented, as appropriate, to
provide complete coverage of the typical data structures course.
Overview
The typical data structures course, which introduces a collection of fundamental
data structures and algorithms, can be taught using any of the different program-
ming languages available today. In recent years, more colleges have begun to adopt
the Python language for introducing students to programming and problem solv-
ing. Python provides several benefits over other languages such as C++ and Java,
the most important of which is that Python has a simple syntax that is easier to
learn. This book expands upon that use of Python by providing a Python-centric
text for the data structures course. The clean syntax and powerful features of the
language are used throughout, but the underlying mechanisms of these features
are fully explored not only to expose the “magic” but also to study their overall
efficiency.
For a number of years, many data structures textbooks have been written to
serve a dual role of introducing data structures and providing an in-depth study
of object-oriented programming (OOP). In some instances, this dual role may
compromise the original purpose of the data structures course by placing more focus
on OOP and less on the abstract data types and their underlying data structures.
To stress the importance of abstract data types, data structures, and algorithms, we
limit the discussion of OOP to the use of base classes for implementing the various
abstract data types. We do not use class inheritance or polymorphism in the main
part of the text but instead provide a basic introduction as an appendix. This
choice was made for several reasons. First, our objective is to provide a “back to
Prerequisites
This book assumes that the student has completed the standard introduction to
programming and problem-solving course using the Python language. Since the
contents of the first course can differ from college to college and instructor to
instructor, we assume the students are familiar with or can do the following:
Apply the basic data types and constructs, including loops, selection statements, and subprograms (functions)
Design and implement basic classes, including the use of helper methods and private attributes
reordering of some topics. For example, the chapters on recursion and hashing can
be presented at any time after the discussion of algorithm analysis in Chapter 4.
Chapter 1: Abstract Data Types. Introduces the concept of abstract data types
(ADTs) for both simple types, those containing individual data fields, and the more
complex types, those containing data structures. ADTs are presented in terms
of their definition, use, and implementation. After discussing the importance of
abstraction, we define several ADTs and then show how a well-defined ADT can
be used without knowing how it's actually implemented. The focus then turns to
the implementation of the ADTs with an emphasis placed on the importance of
selecting an appropriate data structure. The chapter includes an introduction to
the Python iterator mechanism and provides an example of a user-defined iterator
for use with a container type ADT.
Chapter 2: Arrays. Introduces the student to the array structure, which is im-
portant since Python only provides the list structure and students are unlikely to
have seen the concept of the array as a fixed-sized structure in a first course using
Python. We define an ADT for a one-dimensional array and implement it using a
hardware array provided through a special mechanism of the C-implemented ver-
sion of Python. The two-dimensional array is also introduced and implemented
using a 1-D array of arrays. The array structures will be used throughout the text
in place of Python's list when it is the appropriate choice. The implementation
of the list structure provided by Python is presented to show how the various
operations are implemented using a 1-D array. The Matrix ADT is introduced and
includes an implementation using a two-dimensional array that exposes the stu-
dents to an example of an ADT that is best implemented using a structure other
than the list or dictionary.
Chapter 3: Sets and Maps. This chapter reintroduces the students to both
the Set and Map (or dictionary) ADTs with which they are likely to be familiar
from their first programming course using Python. Even though Python provides
these ADTs, they both provide great examples of abstract data types that can be
implemented in many different ways. The chapter also continues the discussion of
arrays from the previous chapter by introducing multi-dimensional arrays (those
of two or more dimensions) along with the concept of physically storing these
using a one-dimensional array in either row-major or column-major order. The
chapter concludes with an example application that can benefit from the use of a
three-dimensional array.
Chapter 8: Queues. Introduces the Queue ADT and includes three different
implementations: Python list, circular array, and linked list. The priority queue
is introduced to provide an opportunity to discuss different structures and data
organization for an efficient implementation. The application of the queue presents
the concept of discrete event computer simulations using an airline ticket counter
as the example.
Chapter 10: Recursion. Introduces the use of recursion to solve various pro-
gramming problems. The properties of creating recursive functions are presented
along with common examples, including factorial, greatest common divisor, and
the Towers of Hanoi. The concept of backtracking is revisited to use recursion for
solving the eight-queens problem.
Chapter 11: Hash Tables. Introduces the concept of hashing and the use of hash
tables for performing fast searches. Different addressing techniques are presented,
including those for both closed and open addressing. Collision resolution techniques
and hash function design are also discussed. The magic behind Python’s dictionary
structure, which uses a hash table, is exposed and its efficiency evaluated.
Chapter 12: Advanced Sorting. Continues the discussion of the sorting problem
by introducing the recursive sorting algorithms—merge sort and quick sort—along
with the radix distribution sort algorithm, all of which can be used to sort se-
quences. Some of the common techniques for sorting linked lists are also presented.
Chapter 13: Binary Trees. Presents the tree structure and the general binary
tree specifically. The construction and use of the binary tree are presented along
with various properties and the various traversal operations. The binary tree is
used to build and evaluate arithmetic expressions and in decoding Morse Code
sequences. The tree-based heap structure is also introduced along with its use in
implementing a priority queue and the heapsort algorithm.
Chapter 14: Search Trees. Continues the discussion from the previous chapter
by using the tree structure to solve the search problem. The basic binary search
tree and the balanced binary search tree (AVL) are both introduced along with
new implementations of the Map ADT. Finally, a brief introduction to the 2-3
multi-way tree is also provided, which shows an alternative to both the binary
search and AVL trees.
Acknowledgments
There are a number of individuals I would like to thank for helping to make this
book possible. First, I must acknowledge two individuals who served as mentors
in the early part of my career. Mary Dayne Gregg (University of Southern Mis-
sissippi), who was the best computer science teacher I have ever known, shared
her love of teaching and provided a great role model in academia. Richard Prosl
(Professor Emeritus, College of William and Mary) served not only as my graduate
advisor but also shared great insight into teaching and helped me to become a good
teacher.
A special thanks to the many students I have taught over the years, especially
those at Washington and Lee University, who during the past five years used draft
versions of the manuscript and provided helpful suggestions. I would also like to
thank some of my colleagues who provided great advice and the encouragement
to complete the project: Sara Sprenkle (Washington and Lee University), Debbie
Noonan (College of William and Mary), and Robert Noonan (College of William
and Mary).
I am also grateful to the following individuals who served as outside review-
ers and provided valuable feedback and helpful suggestions: Esmail Bonakdarian
(Franklin University), David Dubin (University of Illinois at Urbana-Champaign),
Mark E. Fenner (Norwich University), Robert Franks (Central College), Charles J.
Leska (Randolph-Macon College), Fernando Martincic (Wayne State University),
Joseph D. Sloan (Wofford College), David A. Sykes (Wofford College), and Stan
Thomas (Wake Forest University).
Finally, I would like to thank everyone at John Wiley & Sons who helped make
this book possible. I would especially like to thank Beth Golub, Mike Berlin, and
Amy Weintraub, with whom I worked closely throughout the process and who
helped to make this first book an enjoyable experience.
Rance D. Necaise
CHAPTER 1
Abstract Data Types
1.1 Introduction
Data items are represented within a computer as a sequence of binary digits. These
sequences can appear very similar but have different meanings since computers
can store and manipulate different types of data. For example, the binary se-
quence 01001100110010110101110011011100 could be a string of characters, an in-
teger value, or a real value. To distinguish between the different types of data, the
term type is often used to refer to a collection of values and the term data type to
refer to a given type along with a collection of operations for manipulating values
of the given type.
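To make the point concrete, the following minimal sketch interprets the 32-bit pattern quoted above in three different ways. The choices of big-endian byte order, IEEE 754 single precision for the real value, and Latin-1 for the characters are our assumptions; the text does not fix any particular encoding.

```python
import struct

# The 32-bit pattern quoted in the text, packed into four bytes.
bits = "01001100110010110101110011011100"
raw = int(bits, 2).to_bytes(4, byteorder="big")

# Same four bytes, three interpretations (all encodings are assumptions).
as_integer = int.from_bytes(raw, byteorder="big")   # unsigned 32-bit integer
as_real = struct.unpack(">f", raw)[0]               # IEEE 754 single precision
as_text = raw.decode("latin-1")                     # one character per byte

print(as_integer)
print(as_real)
print(repr(as_text))
```

The bytes never change; only the data type we attach to them determines which values and operations make sense.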
Programming languages commonly provide data types as part of the language
itself. These data types, known as primitives, come in two categories: simple
and complex. The simple data types consist of values that are in the most
basic form and cannot be decomposed into smaller parts. Integer and real types,
for example, consist of single numeric values. The complex data types, on the
other hand, are constructed of multiple components consisting of simple types or
other complex types. In Python, objects, strings, lists, and dictionaries, which can
contain multiple values, are all examples of complex types. The primitive types
provided by a language may not be sufficient for solving large complex problems.
Thus, most languages allow for the construction of additional data types, known
as user-defined types since they are defined by the programmer and not the
language. Some of these data types can themselves be very complex.
1.1.1 Abstractions
To help manage complex problems and complex data types, computer scientists
typically work with abstractions. An abstraction is a mechanism for separat-
ing the properties of an object and restricting the focus to those relevant in the
current context. The user of the abstraction does not have to understand all of
the details in order to utilize the object, but only those relevant to the current task
or problem.
Two common types of abstractions encountered in computer science are proce-
dural, or functional, abstraction and data abstraction. Procedural abstraction
is the use of a function or method knowing what it does but ignoring how it’s
accomplished. Consider the mathematical square root function which you have
probably used at some point. You know the function will compute the square root
of a given number, but do you know how the square root is computed? Does it
matter if you know how it is computed, or is simply knowing how to correctly use
the function sufficient? Data abstraction is the separation of the properties of a
data type (its values and operations) from the implementation of that data type.
You have used strings in Python many times. But do you know how they are
implemented? That is, do you know how the data is structured internally or how
the various operations are implemented?
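As a small illustration of procedural abstraction, the sketch below computes a square root in two ways. The function newton_sqrt is a hypothetical helper written for this example using Newton's method; it is not a description of how Python's math.sqrt is actually implemented. A caller who only needs the result can use either one without caring about the difference.

```python
import math

def newton_sqrt(value, tolerance=1e-12):
    """One possible way to compute a square root (Newton's method)."""
    assert value >= 0, "value must be non-negative"
    if value == 0:
        return 0.0
    # Start at or above the true root so the iteration decreases steadily.
    guess = value if value >= 1 else 1.0
    while abs(guess * guess - value) > tolerance * value:
        guess = (guess + value / guess) / 2.0
    return guess

# The caller sees the same "what" regardless of the "how".
print(math.sqrt(2.0))
print(newton_sqrt(2.0))
```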
Typically, abstractions of complex problems occur in layers, with each higher
layer adding more abstraction than the previous. Consider the problem of repre-
senting integer values on computers and performing arithmetic operations on those
values. Figure 1.1 illustrates the common levels of abstractions used with integer
arithmetic. At the lowest level is the hardware with little to no abstraction since it
includes binary representations of the values and logic circuits for performing the
arithmetic. Hardware designers would deal with integer arithmetic at this level
and be concerned with its correct implementation. A higher level of abstraction
for integer values and arithmetic is provided through assembly language, which in-
volves working with binary values and individual instructions corresponding to the
underlying hardware. Compiler writers and assembly language programmers would
work with integer arithmetic at this level and must ensure the proper selection of
assembly language instructions to compute a given mathematical expression. For
example, suppose we wish to compute x = a + b − 5. At the assembly language
level, this expression must be split into multiple instructions for loading the values
from memory, storing them into registers, and then performing each arithmetic
operation separately, as shown in the following pseudocode:
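A rough simulation of those steps can be sketched in Python, with memory and registers modeled as dictionaries. The register names R0 to R2, the helper functions, and the sample values of a and b are illustrative assumptions and do not correspond to any real instruction set.

```python
# Hypothetical register machine: memory and registers are plain dictionaries.
memory = {"a": 12, "b": 30, "x": None}
registers = {"R0": 0, "R1": 0, "R2": 0}

def load_from_mem(reg, name):
    """Copy a value from memory into a register."""
    registers[reg] = memory[name]

def store_to_mem(reg, name):
    """Copy a value from a register back to memory."""
    memory[name] = registers[reg]

def compute_x():
    """Evaluate x = a + b - 5 one low-level step at a time."""
    load_from_mem("R1", "a")                               # R1 <- a
    load_from_mem("R2", "b")                               # R2 <- b
    registers["R0"] = registers["R1"] + registers["R2"]    # R0 <- R1 + R2
    registers["R0"] = registers["R0"] - 5                  # R0 <- R0 - 5
    store_to_mem("R0", "x")                                # x <- R0
    return memory["x"]

print(compute_x())
```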
[Figure 1.1: Levels of abstraction used with integer arithmetic, from lower level to higher level: hardware implementation, assembly language instructions, high-level language instructions, software-implemented big integers.]
One problem with the integer arithmetic provided by most high-level languages
and in computer hardware is that it works with values of a limited size. On 32-bit
architecture computers, for example, signed integer values are limited to the range
$-2^{31} \ldots (2^{31} - 1)$. What if we need larger values? In this case, we can provide
long or “big integers” implemented in software to allow values of unlimited size.
This would involve storing the individual digits and implementing functions or
methods for performing the various arithmetic operations. The implementation
of the operations would use the primitive data types and instructions provided by
the high-level language. Software libraries that provide big integer implementations
are available for most common programming languages. Python, however, actually
provides software-implemented big integers as part of the language itself.
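The sketch below illustrates both ideas. The function add_digit_lists is a hypothetical helper showing one simple way a software big integer could store individual digits and add them; it is not how Python represents integers internally. The last lines show that Python's built-in int simply keeps growing past the 32-bit limit.

```python
def add_digit_lists(a, b):
    """Add two non-negative integers stored as lists of decimal digits,
    least significant digit first."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        result.append(total % 10)   # digit that stays in this position
        carry = total // 10         # digit carried to the next position
    if carry:
        result.append(carry)
    return result

# 999 + 1, written with the least significant digit first.
print(add_digit_lists([9, 9, 9], [1]))

# Python's own integers are not limited to 32 bits.
INT32_MAX = 2**31 - 1
beyond_limit = INT32_MAX + 1
print(beyond_limit)
```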
implementation, allowing us to focus on the use of the new data type instead of
how it’s implemented. This separation is typically enforced by requiring interac-
tion with the abstract data type through an interface or defined set of operations.
This is known as information hiding. By hiding the implementation details and
requiring ADTs to be accessed through an interface, we can work with an ab-
straction and focus on what functionality the ADT provides instead of how that
functionality is implemented.
Abstract data types can be viewed like black boxes as illustrated in Figure 1.2.
User programs interact with instances of the ADT by invoking one of the several
operations defined by its interface. The set of operations can be grouped into four
categories:
The implementation of the various operations is hidden inside the black box,
the contents of which we do not have to know in order to utilize the ADT. There
are several advantages of working with abstract data types and focusing on the
“what” instead of the “how.”
We can focus on solving the problem at hand instead of getting bogged down
in the implementation details. For example, suppose we need to extract a
collection of values from a file on disk and store them for later use in our
program. If we focus on the implementation details, then we have to worry
about what type of storage structure to use, how it should be used, and
whether it is the most efficient choice.
We can reduce logical errors that can occur from accidental misuse of storage
structures and data types by preventing direct access to the implementation. If
we used a list to store the collection of values in the previous example, there
is the opportunity to accidentally modify its contents in a part of our code
where it was not intended. This type of logical error can be difficult to track
down. By using ADTs and requiring access via the interface, we have fewer
access points to debug.
The implementation of the abstract data type can be changed without having
to modify the program code that uses the ADT. There are many times when
we discover the initial implementation of an ADT is not the most efficient or
we need the data organized in a different way. Suppose our initial approach
to the previous problem of storing a collection of values is to simply append
new values to the end of the list. What happens if we later decide the items
should be arranged in a different order than simply appending them to the
end? If we are accessing the list directly, then we will have to modify our code
at every point where values are added and make sure they are not rearranged
in other places. By requiring access via the interface, we can easily “swap out”
the black box with a new implementation with no impact on code segments
that use the ADT.
It’s easier to manage and divide larger programs into smaller modules, al-
lowing different members of a team to work on the separate modules. Large
programming projects are commonly developed by teams of programmers in
which the workload is divided among the members. By working with ADTs
and agreeing on their definition, the team can better ensure the individual
modules will work together when all the pieces are combined. Using our pre-
vious example, if each member of the team directly accessed the list storing
the collection of values, they may inadvertently organize the data in different
ways or modify the list in some unexpected way. When the various modules
are combined, the results may be unpredictable.
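The idea of swapping out the black box can be sketched with two small classes that share one interface. The class names AppendCollection and SortedCollection and the function load_values are hypothetical and exist only for this illustration. The client function depends only on add(), so it works unchanged with either implementation.

```python
import bisect

class AppendCollection:
    """Stores values in arrival order."""
    def __init__(self):
        self._items = []

    def add(self, value):
        self._items.append(value)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

class SortedCollection:
    """Same interface, but keeps the values in ascending order."""
    def __init__(self):
        self._items = []

    def add(self, value):
        bisect.insort(self._items, value)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

def load_values(collection, values):
    """Client code: relies only on add(), never on the underlying list."""
    for value in values:
        collection.add(value)
    return collection

print(list(load_values(AppendCollection(), [3, 1, 2])))
print(list(load_values(SortedCollection(), [3, 1, 2])))
```

Because load_values never touches _items directly, changing how the values are organized requires no change to the client code.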
There are many common data structures, including arrays, linked lists, stacks,
queues, and trees, to name a few. All data structures store a collection of values,
but differ in how they organize the individual data items and by what operations
can be applied to manage the collection. The choice of a particular data structure
depends on the ADT and the problem at hand. Some data structures are better
suited to particular problems. For example, the queue structure is perfect for
implementing a printer queue, while the B-Tree is the better choice for a database
index. No matter which data structure we use to implement an ADT, by keeping
the implementation separate from the definition, we can use an abstract data type
within our program and later change to a different implementation, as needed,
without having to modify our existing code.
list or vector abstract data type. To avoid confusion, we will use the term list to
refer to the data type provided by Python and use the terms general list or list
structure when referring to the more general list structure as defined earlier.
A date represents a single day in the proleptic Gregorian calendar in which the
first day starts on November 24, 4714 BC.
Date( month, day, year ): Creates a new Date instance initialized to the
given Gregorian date which must be valid. Year 1 BC and earlier are indicated
by negative year components.
day(): Returns the Gregorian day number of this date.
month(): Returns the Gregorian month number of this date.
year(): Returns the Gregorian year of this date.
monthName(): Returns the Gregorian month name of this date.
dayOfWeek(): Returns the day of the week as a number between 0 and 6 with
0 representing Monday and 6 representing Sunday.
numDays( otherDate ): Returns the number of days as a positive integer be-
tween this date and the otherDate.
isLeapYear(): Determines if this date falls in a leap year and returns the
appropriate boolean value.
advanceBy( days ): Advances the date by the given number of days. The date
is incremented if days is positive and decremented if days is negative. The
date is capped to November 24, 4714 BC, if necessary.
comparable ( otherDate ): Compares this date to the otherDate to deter-
mine their logical ordering. This comparison can be done using any of the
logical operators <, <=, >, >=, ==, !=.
toString (): Returns a string representing the Gregorian date in the format
mm/dd/yyyy. Implemented as the Python operator that is automatically called
via the str() constructor.
The abstract data types defined in the text will be implemented as Python
classes. When defining an ADT, we specify the ADT operations as method pro-
totypes. The class constructor, which is used to create an instance of the ADT, is
indicated by the name of the class used in the implementation.
Python allows classes to define or overload various operators that can be used
more naturally in a program without having to call a method by name. We define
all ADT operations as named methods, but implement some of them as operators
when appropriate instead of using the named method. The ADT operations that
will be implemented as Python operators are indicated in italicized text and a brief
comment is provided in the ADT definition indicating the corresponding operator.
This approach allows us to focus on the general ADT specification that can be
easily translated to other languages if the need arises but also allows us to take
advantage of Python’s simple syntax in various sample programs.
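A minimal sketch of this convention follows. The class SimpleDate is a hypothetical stand-in that stores the three components directly; it is not the Date implementation developed in the text. It shows how the toString and comparable operations can be written as Python operator methods.

```python
class SimpleDate:
    """Hypothetical date-like class used to illustrate operator methods."""
    def __init__(self, month, day, year):
        self._month = month
        self._day = day
        self._year = year

    def _key(self):
        # Ordering by (year, month, day) gives chronological order.
        return (self._year, self._month, self._day)

    def __str__(self):                  # toString: called via str()
        return "%02d/%02d/%04d" % (self._month, self._day, self._year)

    def __eq__(self, other):            # comparable: == (and != by default)
        return self._key() == other._key()

    def __lt__(self, other):            # comparable: < (and > when reflected)
        return self._key() < other._key()

    def __le__(self, other):            # comparable: <= (and >= when reflected)
        return self._key() <= other._key()

first = SimpleDate(6, 1, 1988)
second = SimpleDate(1, 1, 2000)
print(str(first))
print(first < second)
```

In Python 3, defining the three comparison methods shown is enough for all six logical operators, since != falls back on == and the greater-than forms use the reflected methods of the other operand.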
via the constructor before any operation can be used. Other than the initialization
requirement, an operation may not have any other preconditions. It all depends
on the type of ADT and the respective operation. Likewise, some operations may
not have a postcondition, as is the case for simple access methods, which simply
return a value without modifying the ADT instance itself. Throughout the text,
we do not explicitly state the precondition and postcondition as such, but they are
easily identified from the description of the ADT operations.
When implementing abstract data types, it’s important that we ensure the
proper execution of the various operations by verifying any stated preconditions.
The appropriate mechanism is to test each precondition and raise an exception
when it fails. The user of the ADT can then decide how to handle the error:
either catch the exception or allow the program to abort.
Python, like many other object-oriented programming languages, raises an ex-
ception when an error occurs. An exception is an event that can be triggered
and optionally handled during program execution. When an exception is raised
indicating an error, the program can contain code to catch and gracefully handle
the exception; otherwise, the program will abort. Python also provides the assert
statement, which can be used to raise an AssertionError exception. The assert
statement is used to state what we assume to be true at a given point in the pro-
gram. If the assertion fails, Python automatically raises an AssertionError and
aborts the program, unless the exception is caught.
Throughout the text, we use the assert statement to test the preconditions
when implementing abstract data types. This allows us to focus on the implemen-
tation of the ADTs instead of having to spend time selecting the proper exception
to raise or creating new exceptions for use with our ADTs. For more information
on exceptions and assertions, refer to Appendix C.
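The following sketch shows the pattern on a small scale. The class MonthDay and the function try_create are hypothetical and serve only to illustrate precondition checks with assert; they are not part of the Date ADT. Note that Python skips assert statements when it is run with the -O option, so this style assumes the program runs without optimization.

```python
class MonthDay:
    """Hypothetical ADT used only to illustrate precondition checks."""
    _DAYS_IN_MONTH = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

    def __init__(self, month, day):
        # Preconditions: the month and day must form a plausible date.
        assert 1 <= month <= 12, "Invalid month."
        assert 1 <= day <= self._DAYS_IN_MONTH[month - 1], "Invalid day."
        self._month = month
        self._day = day

def try_create(month, day):
    """Client code decides how to handle a failed precondition."""
    try:
        return MonthDay(month, day)
    except AssertionError as err:
        print("Rejected:", err)
        return None

try_create(2, 29)    # accepted
try_create(2, 30)    # rejected: the precondition on day fails
```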
Date Representations
There are two common approaches to storing a date in an object. One approach
stores the three components—month, day, and year—as three separate fields. With
this format, it is easy to access the individual components, but it’s difficult to
compare two dates or to compute the number of days between two dates since the
number of days in a month varies from month to month. The second approach
stores the date as an integer value representing the Julian day, which is the number
of days elapsed since the initial date of November 24, 4714 BC (using the Gregorian
calendar notation). Given a Julian day number, we can compute any of the three
Gregorian components and simply subtract the two integer values to determine
which occurs first or how many days separate the two dates. We are going to use
the latter approach as it is very common for storing dates in computer applications
and provides for an easy implementation.
T = (M - 14) / 12
jday = D - 32075 + (1461 * (Y + 4800 + T) / 4) +
(367 * (M - 2 - T * 12) / 12) -
(3 * ((Y + 4900 + T) / 100) / 4)
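The equations above can be sketched in Python as follows, assuming each division is an integer division that truncates toward zero. Under that assumption T is -1 for January and February and 0 otherwise, so we compute it with a condition rather than relying on Python's floor division of a negative number. The function names are ours, and the sketch is intended for dates whose intermediate values stay non-negative, which covers all years of the common era.

```python
import datetime

def to_julian_day(month, day, year):
    """Convert a Gregorian date to its Julian day number."""
    # Matches T = (M - 14) / 12 with truncating division.
    t = -1 if month < 3 else 0
    return (day - 32075
            + (1461 * (year + 4800 + t) // 4)
            + (367 * (month - 2 - t * 12) // 12)
            - (3 * ((year + 4900 + t) // 100) // 4))

def day_of_week(jday):
    """Return 0 for Monday through 6 for Sunday, since day number 0
    fell on a Monday."""
    return jday % 7

jday = to_julian_day(1, 1, 2000)
print(jday)
print(day_of_week(jday))

# Cross-check against the standard library's proleptic Gregorian ordinal.
print(jday == datetime.date(2000, 1, 1).toordinal() + 1721425)
```

Subtracting two such day numbers gives the number of days between two dates, which is the main attraction of this representation.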
NOTE
To conserve space, however, classes and methods presented in this book
do not routinely include these comments since the surrounding text provides
a full explanation.
This allows for a more natural use of the objects instead of having to call
specific named methods. It can be tempting to define operators for every
class you create, but you should limit the definition of operator methods to
classes where the specific operator has a meaningful purpose.
1.3 Bags
The Date ADT provided an example of a simple abstract data type. To illustrate
the design and implementation of a complex abstract data type, we define the Bag
ADT. A bag is a simple container like a shopping bag that can be used to store a
collection of items. The bag container restricts access to the individual items by
only defining operations for adding and removing individual items, for determining
if an item is in the bag, and for traversing over the collection of items.
A bag is a container that stores a collection in which duplicate values are allowed.
The items, each of which is individually stored, have no particular order but they
must be comparable.
Bag(): Creates a bag that is initially empty.
length (): Returns the number of items stored in the bag. Accessed using
the len() function.
contains ( item ): Determines if the given target item is stored in the bag
and returns the appropriate boolean value. Accessed using the in operator.
add( item ): Adds the given item to the bag.
remove( item ): Removes and returns an occurrence of item from the bag.
An exception is raised if the element is not in the bag.
iterator (): Creates and returns an iterator that can be used to iterate over
the collection of items.
You may have noticed our definition of the Bag ADT does not include an
operation to convert the container to a string. We could include such an operation,
but creating a string for a large collection is time consuming and requires a large
amount of memory. Such an operation can be beneficial when debugging a program
that uses an instance of the Bag ADT. Thus, it’s not uncommon to include the
__str__ operator method for debugging purposes, but it would not typically be used
in production software. We will usually omit the inclusion of a __str__ operator
method in the definition of our abstract data types, except in those cases where it’s
meaningful, but you may want to include one temporarily for debugging purposes.
Examples
Given the abstract definition of the Bag ADT, we can create and use a bag without
knowing how it is actually implemented. Consider the following simple example,
which creates a bag and asks the user to guess one of the values it contains.
myBag = Bag()
myBag.add( 19 )
myBag.add( 74 )
myBag.add( 23 )
myBag.add( 19 )
myBag.add( 12 )
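The guessing step of this example might look like the following sketch. The Bag class here is only a minimal list-backed stand-in so that the snippet runs on its own, and the helper name check_guess and its messages are assumptions made for illustration, not part of the ADT definition.

```python
class Bag:
    """Minimal stand-in for the Bag ADT (list-backed; an assumption)."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def __contains__(self, item):
        # Supports the `in` operator described in the ADT definition.
        return item in self._items


def check_guess(bag, guess):
    """Return a message saying whether `guess` is stored in `bag`."""
    if guess in bag:
        return f"The bag contains the value {guess}"
    return f"The bag does not contain the value {guess}"


my_bag = Bag()
for value in (19, 74, 23, 19, 12):
    my_bag.add(value)

# In an interactive program the guess would come from input(); fixed
# values are used here so the sketch runs without user interaction.
print(check_guess(my_bag, 23))
print(check_guess(my_bag, 50))
```

Because the check relies only on the in operator, the same helper would work unchanged with any other implementation of the Bag ADT.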
Next, consider the checkdates.py sample program from the previous section
where we extracted birth dates from the user and determined which ones were
for individuals who were at least 21 years of age. Suppose we want to keep the
collection of birth dates for later use. It wouldn’t make sense to require the user to
re-enter the dates multiple times. Instead, we can store the birth dates in a bag as
they are entered and access them later, as many times as needed. The Bag ADT
is a perfect container for storing objects when the position or order of a specific
item does not matter. The following is a new version of the main routine for our
birth date checking program from Listing 1.1:
def main():
    bornBefore = Date( 6, 1, 1988 )
    bag = Bag()
    # Extract dates from the user and place them in the bag.
    date = promptAndExtractDate()
    while date is not None :
        bag.add( date )
        date = promptAndExtractDate()
1. Does the data structure provide for the storage requirements as specified by
the domain of the ADT? Abstract data types are defined to work with a
specific domain of data values. The data structure we choose must be capable
of storing all possible values in that domain, taking into consideration any
restrictions or limitations placed on the individual items.
2. Does the data structure provide the necessary data access and manipulation
functionality to fully implement the ADT? The functionality of an abstract
data type is provided through its defined set of operations. The data structure
must allow for a full and correct implementation of the ADT without having
to violate the abstraction principle by exposing the implementation details to
the user.
3. Does the data structure lend itself to an efficient implementation of the oper-
ations? An important goal in the implementation of an abstract data type is
to provide an efficient solution. Some data structures allow for a more effi-
cient implementation than others, but not every data structure is suitable for
implementing every ADT. Efficiency considerations can help to select the best
structure from among multiple candidates.
There may be multiple data structures suitable for implementing a given ab-
stract data type, but we attempt to select the best possible based on the context
in which the ADT will be used. To accommodate different contexts, language
libraries will commonly provide several implementations of some ADTs, allowing
the programmer to choose the most appropriate. Following this approach, we in-
troduce a number of abstract data types throughout the text and present multiple
implementations as new data structures are introduced.
The efficiency of an implementation is based on complexity analysis, which is
not introduced until Chapter 3. Thus, we postpone consideration of the
efficiency of an implementation in selecting a data structure until that time. In
the meantime, we only consider the suitability of a data structure based on the
storage and functional requirements of the abstract data type.
We now turn our attention to selecting a data structure for implementing the
Bag ADT. The possible candidates at this point include the list and dictionary
structures. The list can store any type of comparable object, including duplicates.
Each item is stored individually, which means the reference
to each individual object is stored and later accessible when needed. This satisfies
the storage requirements of the Bag ADT, making the list a candidate structure
for its implementation.
The dictionary stores key/value pairs in which the key component must be
comparable and unique. To use the dictionary in implementing the Bag ADT, we
must have a way to store duplicate items as required by the definition of the ab-
stract data type. To accomplish this, each unique item can be stored in the key
part of the key/value pair and a counter can be stored in the value part. The
counter would be used to indicate the number of occurrences of the corresponding
item in the bag. When a duplicate item is added, the counter is incremented; when
a duplicate is removed, the counter is decremented.
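The counter scheme just described can be sketched as follows. This is an illustration under stated assumptions, not an implementation given in the text: the class name CountingBag, the count() helper, and the separate _size field are all hypothetical. Note also that Python requires dictionary keys to be hashable, which is a stricter condition than being comparable.

```python
class CountingBag:
    """Dictionary-backed bag: each key is an item, each value its count."""

    def __init__(self):
        self._counts = {}
        self._size = 0   # total number of items, duplicates included

    def __len__(self):
        return self._size

    def __contains__(self, item):
        return item in self._counts

    def add(self, item):
        # A first occurrence starts at 1; a duplicate increments the counter.
        self._counts[item] = self._counts.get(item, 0) + 1
        self._size += 1

    def remove(self, item):
        assert item in self._counts, "The item must be in the bag."
        self._counts[item] -= 1
        if self._counts[item] == 0:
            # The last occurrence is gone, so drop the key entirely.
            del self._counts[item]
        self._size -= 1
        return item

    def count(self, item):
        """Return how many occurrences of `item` are stored (hypothetical helper)."""
        return self._counts.get(item, 0)


bag = CountingBag()
for value in (19, 74, 19):
    bag.add(value)
bag.remove(19)
print(len(bag), bag.count(19), 74 in bag)
```

One consequence of this design is that only one reference per distinct key is kept, so equal items are not stored individually as they are in a list.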
Both the list and dictionary structures could be used to implement the Bag
ADT. For the simple version of the bag, however, the list is a better choice since
the dictionary would require twice as much space to store the contents of the bag
in the case where most of the items are unique. The dictionary is an excellent
choice for the implementation of the counting bag variation of the ADT.
Having chosen the list, we must ensure it provides the means to implement the
complete set of bag operations. When implementing an ADT, we must use the
functionality provided by the underlying data structure. Sometimes, an ADT op-
eration is identical to one already provided by the data structure. In this case, the
implementation can be quite simple and may consist of a single call to the corre-
sponding operation of the structure, while in other cases, we have to use multiple
operations provided by the structure. To help verify a correct implementation
of the Bag ADT using the list, we can outline how each bag operation will be
implemented:
From this itemized list, we see that each Bag ADT operation can be imple-
mented using the available functionality of the list. Thus, the list is suitable for
implementing the bag.
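As a rough sketch of how each bag operation maps onto list functionality, consider the following. It is an illustration under stated assumptions rather than the text's own listing: the field name _the_items and the assertion message are illustrative choices, and the iterator operation is left out because it is covered in the next section.

```python
class Bag:
    """List-backed bag: each operation delegates to the underlying list."""

    def __init__(self):
        # An empty bag is represented by an empty list.
        self._the_items = []

    def __len__(self):
        # The size of the bag is the length of the list.
        return len(self._the_items)

    def __contains__(self, item):
        # Membership is decided by the list's own `in` operator.
        return item in self._the_items

    def add(self, item):
        # New items are appended; duplicates are stored individually.
        self._the_items.append(item)

    def remove(self, item):
        # Precondition: the item must be in the bag.
        assert item in self._the_items, "The item must be in the bag."
        index = self._the_items.index(item)
        return self._the_items.pop(index)


bag = Bag()
for value in (19, 74, 23, 19, 12):
    bag.add(value)
removed = bag.remove(19)
print(len(bag), 19 in bag, removed)
```

Every method except remove() is a single call to a list operation, while remove() combines a membership test, index(), and pop().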
Most of the implementation details follow the specifics discussed in the previous
section. There are some additional details, however. First, the ADT definition
of the remove() operation specifies the precondition that the item must exist
in the bag in order to be removed. Thus, we must first assert that condition
and verify the existence of the item. Second, we need to provide an iteration
mechanism that allows us to iterate over the individual items in the bag. We delay
the implementation of this operation until the next section where we discuss the
creation and use of iterators in Python.
Figure 1.3: Sample instance of the Bag class implemented using a list. (The theItems field references a list holding 19, 74, 23, 19, and 12 at indices 0 through 4.)
A list stores references to objects and technically would be illustrated as shown
in the accompanying figure. To conserve space and reduce the clutter that can result
in some figures, however, we illustrate objects in the text as boxes with rounded
edges and show them stored directly within the list structure. Variables will be
illustrated as square boxes with a bullet in the middle and the name of the
variable printed nearby.
(Figure: the same Bag instance, with theItems holding references to the objects 19, 74, 23, 19, and 12 at indices 0 through 4.)
1.4 Iterators
Traversals are very common operations, especially on containers. A traversal iter-
ates over the entire collection, providing access to each individual element. Traver-
sals can be used for a number of operations, including searching for a specific item
or printing an entire collection.
Python’s container types—strings, tuples, lists, and dictionaries—can be tra-
versed using the for loop construct. For our user-defined abstract data types, we
can add methods that perform specific traversal operations when necessary. For
example, if we wanted to save every item contained in a bag to a text file, we could
add a saveElements() method that traverses the underlying list and writes each value
to a file. But this would limit the format of the resulting text file to that specified
in the new method. In addition to saving the items, perhaps we would like to
simply print the items to the screen in a specific way. To perform the latter, we
would have to add yet another operation to our ADT.
Not all abstract data types should provide a traversal operation, but it is appro-
priate for most container types. Thus, we need a way to allow generic traversals to
be performed. One way would be to provide the user with access to the underlying
data structure used to implement the ADT. But this would violate the abstraction
principle and defeat the purpose of defining new abstract data types.
Python, like many of today’s object-oriented languages, provides a built-in it-
erator construct that can be used to perform traversals on user-defined ADTs. An
iterator is an object that provides a mechanism for performing generic traversals
through a container without having to expose the underlying implementation. Iter-
ators are used with Python’s for loop construct to provide a traversal mechanism
for both built-in and user-defined containers. Consider the code segment from the
checkdates2.py program in Section 1.3 that uses the for loop to traverse the
collection of dates:
Listing 1.4 The BagIterator class, which is part of the linearbag.py module.
The __next__ method is called to return the next item in the container. The
method first saves a reference to the current item indicated by the loop variable.
The loop variable is then incremented by one to prepare it for the next invocation
of the __next__ method. If there are no additional items, the method must raise a
StopIteration exception that flags the for loop to terminate. Finally, we must
add an __iter__ method to our Bag class, as shown here:
This method, which is responsible for creating and returning an instance of the
BagIterator class, is automatically called at the beginning of the for loop to
create an iterator object for use with the loop construct.
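The iterator mechanism described above can be sketched as follows. This is a minimal reconstruction from the prose, not a copy of Listing 1.4: the field names _bag_items and _cur_item are assumed, and the Bag class is trimmed down to only what the example needs.

```python
class _BagIterator:
    """Iterator over the list that backs a Bag."""

    def __init__(self, the_list):
        self._bag_items = the_list   # reference to the bag's list, not a copy
        self._cur_item = 0           # index of the next item to return

    def __iter__(self):
        return self

    def __next__(self):
        if self._cur_item < len(self._bag_items):
            # Save the current item, then advance for the next call.
            item = self._bag_items[self._cur_item]
            self._cur_item += 1
            return item
        # No items remain: signal the for loop to terminate.
        raise StopIteration


class Bag:
    """Trimmed-down list-backed bag with only what the example needs."""

    def __init__(self):
        self._the_items = []

    def add(self, item):
        self._the_items.append(item)

    def __iter__(self):
        # Called automatically at the start of a for loop.
        return _BagIterator(self._the_items)


my_bag = Bag()
for value in (19, 74, 23):
    my_bag.add(value)

for item in my_bag:
    print(item)
```

Each for loop receives a fresh iterator object, so the same bag can be traversed any number of times without exposing its list to the caller.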
is executed, Python automatically calls the __iter__ method on the bag object
to create an iterator object. Figure 1.4 illustrates the state of the BagIterator
object immediately after being created. Notice the bagItems field of the iterator
object references the theItems field of the bag object. This reference was assigned
by the constructor when the BagIterator object was created.
Figure 1.4: The Bag and BagIterator objects before the first loop iteration. (The iterator's curItem field is 0, and the iterator references the bag's list holding 19, 74, 23, 19, and 12 at indices 0 through 4.)
The for loop then automatically calls the __next__ method on the iterator
object to access the next item in the container. The state of the iterator object
changes with the curItem field having been incremented by one. This process
continues until a StopIteration exception is raised by the __next__ method when
the items have been exhausted as indicated by the curItem. After all of the items
have been processed, the iteration is terminated and execution continues with the
next statement following the loop. The following code segment illustrates how
Python actually performs the iteration when a for loop is used with an instance
of the Bag class:
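# Create a BagIterator object for myBag.
iterator = myBag.__iter__()
# Repeat until the iterator reports that no items remain.
while True :
    try :
        # Fetch the next item, then run the loop body (printing is illustrative).
        item = iterator.__next__()
        print( item )
    # Catch the exception and break from the loop when we are done.
    except StopIteration:
        break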
1.5 Application: Student Records
Our contact in the Registrar’s office, who assigned the task, has provided some
information about the data. We know each record contains five pieces of
information for an individual student: (1) the student’s id number represented as an
integer; (2) their first and (3) last names, which are strings; (4) an integer classification
code in the range [1 . . . 4] that indicates if the student is a freshman, sophomore,
junior, or senior; and (5) their current grade point average represented as a
floating-point value. What we have not been told, however, is how the data is stored on
disk. It could be stored in a plain text file, in a binary file, or even in a database.
In addition, if the data is stored in a text or binary file, we will need to know how
the data is formatted in the file, and if it’s in a relational database, we will need
to know the type and the structure of the database.
A student file reader is used to extract student records from external storage. The
five data components of the individual records are extracted and stored in a storage
object specific for this collection of student records.
StudentFileReader( filename ): Creates a student reader instance for ex-
tracting student records from the given file. The type and format of the file
is dependent on the specific implementation.
open(): Opens a connection to the input source and prepares it for extracting
student records. If a connection cannot be opened, an exception is raised.
close(): Closes the connection to the input source. If the connection is not
currently open, an exception is raised.
fetchRecord(): Extracts the next student record from the input source and
returns a reference to a storage object containing the data. None is returned
when there are no additional records to be extracted. An exception is raised
if the connection to the input source was previously closed.
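One possible realization of this ADT is sketched below for a plain text file. The file layout (one field per line, in the order id, first name, last name, classification code, grade point average) is an assumption made for this example, as are the StudentRecord storage class, the helper read_all_records, the sample data, and the choice of RuntimeError for a connection that is not open.

```python
import os
import tempfile


class StudentRecord:
    """Storage object holding the five fields of one student record."""

    def __init__(self, id_num, first_name, last_name, class_code, gpa):
        self.id_num = id_num
        self.first_name = first_name
        self.last_name = last_name
        self.class_code = class_code
        self.gpa = gpa


class StudentFileReader:
    """Reads records from a text file with one field per line (assumed format)."""

    def __init__(self, filename):
        self._filename = filename
        self._file = None

    def open(self):
        # The built-in open() raises OSError if the file cannot be opened.
        self._file = open(self._filename, "r")

    def close(self):
        if self._file is None:
            raise RuntimeError("The connection is not open.")
        self._file.close()
        self._file = None

    def fetchRecord(self):
        if self._file is None:
            raise RuntimeError("The connection is not open.")
        line = self._file.readline()
        if line == "":
            return None  # End of file: no more records.
        id_num = int(line)
        first_name = self._file.readline().rstrip()
        last_name = self._file.readline().rstrip()
        class_code = int(self._file.readline())
        gpa = float(self._file.readline())
        return StudentRecord(id_num, first_name, last_name, class_code, gpa)


def read_all_records(filename):
    """Open the file, fetch every record, and return them as a list."""
    reader = StudentFileReader(filename)
    reader.open()
    records = []
    record = reader.fetchRecord()
    while record is not None:
        records.append(record)
        record = reader.fetchRecord()
    reader.close()
    return records


# Usage: write two made-up records to a temporary file and read them back.
sample = "10015\nJohn\nSmith\n2\n3.01\n10007\nJane\nRoberts\n4\n3.75\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

records = read_all_records(sample_path)
records.sort(key=lambda rec: rec.id_num)   # ascending by id number
os.remove(sample_path)
for rec in records:
    print(rec.id_num, rec.first_name, rec.last_name, rec.class_code, rec.gpa)
```

Because callers use only open(), close(), and fetchRecord(), a reader for a binary file or a database could replace this class without changing the code that builds and sorts the list.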
sorted in ascending order based on the student identification number. The actual
report is produced by passing the sorted list to the printReport() function.
Storage Class
When the data for an individual student is extracted from the input file, it will
need to be saved in a storage object that can be added to a list in order to first
sort and then print the records. We could use tuples to store the records, but we