02 Data Stru
2. Intro. To Data Structures & ADTs – C-Style Types
• Goal: to organize data
• Criteria: to facilitate efficient
– storage of data
– retrieval of data
– manipulation of data
• Design Issue:
– select and design appropriate data types.
(This is the real essence of OOP.)
Examples – Factors to Consider
Airline Reservations: Trans-Fryslan Airlines (pp. 30-31) (Lab 2A)
Attempt 1:
enum SeatStatus {OCCUPIED, UNOCCUPIED};
SeatStatus seat1, seat2, . . . , seat10;
Horrible algorithms for the basic operations!
Attempt 2:
const int MAX_SEATS = 10;   // upper limit on number of seats
enum SeatStatus {OCCUPIED, UNOCCUPIED};
typedef SeatStatus SeatList[MAX_SEATS];
SeatList seat;
Nice algorithms for the basic operations!
C++ Types
[Diagram: hierarchy of C++ types]
• Arithmetic
  – Integral
      Characters: char, unsigned char, signed char
      Enumerations
      Integers: int, short int, long int, unsigned, unsigned short, unsigned long
  – Floating point (reals): float, double, long double
• void, pointers, bool, complex
• array, struct, union, class
• istream, ostream, iostream, ifstream, ofstream, fstream, istringstream, ostringstream, stringstream
• string
• vector, deque, list, stack, queue, priority_queue, map, multimap, set, multiset, bitset, valarray
Simple Data Types (§2.2)
Memory:
2-state devices → bits 0 and 1
In C/C++: false = 0, true = 1 (or nonzero)
Two's complement representation
For nonnegative n:
Use ordinary base-two representation with leading (sign) bit 0. (Same as sign-magnitude.)
Good for arithmetic computations (see p. 38)
5 + 7:
              111     carry bits
  0000000000000101
+ 0000000000000111
  0000000000001100
5 + –6:
  0000000000000101
+ 1111111111111010
  1111111111111111
Biased representation
Add a constant bias to the number (typically, 2^(w – 1));
then find its base-two representation.
Examples:
88 using w = 16 bits and bias of 2^15 = 32768
1. Add the bias to 88, giving 32856
2. Represent the result in base-two notation:
1000000001011000
–88:
1. Add the bias to –88, giving 32680
2. Represent the result in base-two notation:
0111111110101000
Good for comparisons; so, it is commonly used for
exponents in floating-point representation of reals.
Problems with Integer Representation
Real Data
Types float and double (and variations) in C++
Single precision (IEEE Floating-Point Format) p. 756
1. Write binary representation in floating-point form:
   b1.b2b3 . . . × 2^k with each bi a bit and b1 = 1 (unless number is 0)
   (b1.b2b3 . . . is the mantissa or fractional part; k is the exponent)
2. Store:
   – sign of mantissa in leftmost bit (0 = +, 1 = –)
   – biased binary rep. of exponent in next 8 bits (bias = 127)
   – bits b2b3 . . . in rightmost 23 bits. (Need not store b1: know it's 1)
   double: Exp: 11 bits, bias 1023; Mant: 52 bits
Example: 22.625 = 10110.101 (base 2) (see p. 41)
Floating point form: 1.0110101 (base 2) × 2^4
Stored exponent: 4 + 127 = 131
Problems with Real Representation
Exponent overflow/underflow (p. 41)
Only a finite range of reals can be stored exactly.
What's 10*DBL_MAX? See cfloat (App. C)
C-Style Data Structures: Arrays (§2.3)
Defn of an array as an ADT:
An ordered set (sequence) with a fixed number of elements, all of the same type,
where the basic operation is
direct access to each element in the array so values can be
retrieved from or stored in this element.
Properties:
• Ordered so there is a first element, a second one, etc.
• Fixed number of elements — fixed capacity
• Elements must be the same type (and size);
use arrays only for homogeneous data sets.
• Direct access: Access an element by giving its location;
the time to access each element is the same for all elements,
regardless of position.
Declaring arrays in C++
element_type array_name[CAPACITY];
where
element_type is any type
array_name is the name of the array — any valid identifier
CAPACITY (a positive integer constant) is the number of
elements in the array. (Can't input the capacity.)
The compiler reserves a block of consecutive memory locations, enough
to hold CAPACITY values of type element_type.
The elements (or positions) of the array are indexed 0, 1, 2, . . .,
CAPACITY - 1.
e.g., double score[100];
Better to use a named constant to specify the array capacity:
const int CAPACITY = 100;
double score[CAPACITY];
Can use typedef with array declarations; e.g.,
const int CAPACITY = 100;
typedef double ScoresArray[CAPACITY];
ScoresArray score;
[Diagram: score[0], score[1], score[2], score[3], . . ., score[99] in consecutive memory locations]
How well does C/C++ implement an array ADT?
[Table: array properties "As an ADT" compared with "In C++"]
Subscript operator
[] is an actual operator and not simply a notation/punctuation as in some
other languages.
It takes two operands, an array variable and an integer index (or
subscript), and is written
array_name[i]
Here i is an integer expression with 0 ≤ i ≤ CAPACITY – 1.
[Margin note, truncated: "Also, it can have an index"]
[] returns a reference to the element in location i in array_name; so
array_name[i] is a variable, called an indexed (or subscripted) variable,
whose type is the specified element_type of the array.
This means that it can be used on the left side of an assignment, in input
statements, etc. to store a value in a specified location in the array. For
example:
// Zero out all the elements of score
for (int i = 0; i < CAPACITY; i++)
  score[i] = 0.0;
// Read values into the first numScores elements of score
for (int i = 0; i < numScores; i++)
  cin >> score[i];
// Display values stored in the first numScores elements
for (int i = 0; i < numScores; i++)
  cout << score[i] << endl;
Array Initialization
In C++, arrays can be initialized when they are declared.
Numeric arrays:
element_type num_array[CAPACITY] = {list_of_initial_values};
(The list in braces is an array literal.)
Example:
double rate[5] = {0.11, 0.13, 0.16, 0.18, 0.21};
        [0]   [1]   [2]   [3]   [4]
rate   0.11  0.13  0.16  0.18  0.21
Note 2: It is an error if more values are supplied than the declared size
of the array.
Standard C++ requires the compiler to diagnose this error, though the
message will vary from one compiler to another.
Character arrays:
Character arrays may be initialized in the same manner as numeric arrays.
char vowel[5] = {'A', 'E', 'I', 'O', 'U'};
Note 1: If fewer values are supplied than the declared size of the array,
the remaining elements are filled with zeroes. In a character array,
the zeroes used to fill uninitialized elements are interpreted as
the null character '\0' whose ASCII code is 0.
The capacity may be omitted; it is then the number of initial values.
Example:
double rate[] = {0.11, 0.13, 0.16};
        [0]   [1]   [2]
rate   0.11  0.13  0.16
An array reference array_name[index]
is equivalent to
*(array_name + index)
* is the dereferencing operator:
*ref returns the contents of the memory location with address ref
C-Style Multidimensional Arrays
Example: A table of test scores for several different students on
several different tests. (p. 52)
Test 1 Test 2 Test 3 Test 4
Student 1 99.0 93.5 89.0 91.0
Student 2 66.0 68.0 84.5 82.0
Student 3 88.5 78.5 70.0 65.0
: : : : :
: : : : :
Student-n 100.0 99.5 100.0 99.0
Declaring Two-Dimensional Arrays
Standard form of declaration:
element_type array_name[NUM_ROWS][NUM_COLUMNS];
Example:
const int NUM_ROWS = 30,
          NUM_COLUMNS = 4;
double scoresTable[NUM_ROWS][NUM_COLUMNS];
or
typedef double TwoDimArray [NUM_ROWS][NUM_COLUMNS];
TwoDimArray scoresTable;
[Diagram: scoresTable with rows [0], [1], [2], [3], . . ., [29] and columns [0], [1], [2], [3]]
Initialization
List the initial values in braces, row by row;
May use internal braces for each row to improve readability.
Example:
double rates[2][3] = {{0.50, 0.55, 0.53},   // first row
                      {0.63, 0.58, 0.55}};  // second row
Processing Two-Dimensional Arrays
Remember: Rows and columns are numbered from zero!!
Example: To store and process a table of test scores for several different
students on several different tests for several different semesters
b. Still higher dimensions
Example like the automobile-inventory example on pp. 54-5 (errors in text)
jeansInStock[b][s][w][i]--;
// sale of 1 brand-b, style-s, waist-w, inseam-i jeans
Arrays of Arrays
double scoresTable[30][4];
[Diagram: scoresTable as an array of rows [0], [1], [2], [3], . . ., [29], each row an array with columns [0], [1], [2], [3]]
Address Translation:
The array-of-arrays structure of multidimensional arrays explains
address translation.
Suppose the base address of scoresTable is 0x12348:
scoresTable[10]    → 0x12348 + 10*(sizeof RowOfTable)
                   = 0x12348 + 10 * (4 * 8)
                   = 0x12488
scoresTable[10][3] → base(scoresTable[10]) + 3*(sizeof double)
                   = 0x12348 + 10 * (4 * 8) + 3 * 8
                   = 0x124a0
In general, an n-dimensional array can be viewed (recursively) as a
one-dimensional array whose elements are (n - 1)-dimensional arrays.
Arrays as Parameters
Passing an array to a function actually passes the base address
of the array. This means:
1. The parameter has the same address as the argument.
So modifying the parameter will modify the corresponding
array argument.
  f(array);        void f(ArrayType param)
                   { ... }
2. Array capacity is not available to a function unless
passed as a separate parameter.
The following function prototypes are all equivalent
void Print(int A[100], int theSize);
void Print(int A[], int theSize);
void Print(int * A, int theSize);
Arrays as Parameters … Continued
Now, what about multidimensional arrays?
[Slide code not recovered; one form was annotated "doesn't work"]
• Virtually no predefined operations for non-char arrays.
Basic reason: No character to mark the end of a numeric sequence;
there is no numeric equivalent of the NUL character.
[Diagram: char array  J o h n   D o e \0 \0  : start processing at the first element, stop processing at \0]
[Diagram: int array  6 2 0 1 5 0 2 0 0 0  : start processing at the first element, stop processing where???]
Solution 1 (non-OOP): In addition to the array,
pass its size (and perhaps its capacity) to
functions.
The Deeper Problem:
C-style arrays aren't self-contained.
Basic principle of OOP:
An object should be autonomous (self-contained); it should
carry within itself all of the information needed to describe and
operate upon itself.
Why Needed?
Current OCD:
1. Identify the objects in the problem.
   1a. . . .
2. Identify the operations in the problem.
   2a. If the operation is not predefined, write a function to
       perform it.
   2b. If the function is useful for other problems, store it in a
       library.
3. Organize the objects and operations into an algorithm.
4. Code the algorithm as a program.
5. Test, execute, and debug the program.
6. Maintain the program.
But, predefined types may not be adequate; so we add:
   1a. If necessary, create a new data type to model it.
Especially true if object being modeled has multiple
attributes.
Examples:
A temperature has:
a degrees attribute
a scale attribute (Fahrenheit, Celsius, Kelvin)
32 F
degrees
scale
A date has:
a month attribute
a day attribute
a year attribute
February 14 2001
month day year
C++ provides structs and classes to create new
types with multiple attributes.
As an ADT:
A structure (usually abbreviated to struct and sometimes
called a record)
• has a fixed size
• is ordered
• elements may be of different types (the only difference from an array)
Examples:
32 F
degrees scale
struct Temperature
{
  double degrees;   // number of degrees
  char scale;       // temp. scale (F, C, or K)
};
Temperature temp;
February 14 2001
month day year
struct Date
{
  string month;     // name of month (C-style alternative: char month[10];)
  int day,          // day number
      year;         // year number
};
Date birthday, currentDate;
Phone Listing:
John Q. Doe | 12345 Calvin Rd. | Grand Rapids, MI | 9571234
name        | street           | city & state     | phone #
struct DirectoryListing
{
  string name,            // name of person
         street,          // street address
         cityAndState;    // city, state (no zip)
  unsigned phoneNumber;   // 7-digit phone number
};
// C-style alternative for the three string members:
//   char name[20], street[20], cityAndState[20];
DirectoryListing
  entry,        // entry in phone book
  group[20];    // array of directory listings
Coordinates of a point:
(Members need not have different types.)
Test scores:
(Members may be structured types, e.g., arrays.)
Hierarchical (or nested) structs
Since the type of a member may be any type,
it may be another struct.
John Q. Doe | 123 Calvin Rd. | Detroit, MI 95714 | May 17 1975 | 3.95 | 92.5
name, street, city & state-zip: DirectoryListing
month, day, year: Date
gpa, credits: real
struct PersonalInfo
{
DirectoryListing ident;
Date birth;
double cumGPA,
credits;
};
PersonalInfo student;
Some Properties:
The scope of a member identifier is the struct in which it is
defined.
Consequences:
— A member identifier may be used outside the struct for
some other purpose.
e.g. int month; // legal declaration alongside Date
Examples:
Input a value into the month member of birthday
cin >> birthday.month;
A Quick Look at Unions (p. 68)
Unions can be used to define structs that have some
common members — a fixed part — and a variant
part that makes it possible for the fields of a struct to
differ from one data value to the next. For example to
process a file of information about various categories
of people:
John Doe 40 M               <--- name, age, marital status (married)
January 30 1980             <--- wedding date
Mary Smith Doe 8            <--- spouse, # dependents
Fred Jones 17 S             <--- name, age, marital status (single)
T                           <--- available
Jane VanderVan 24 D         <--- name, age, marital status (divorced)
February 21 1998 N          <--- divorce date, remarried (No)
Peter VanderVan 25 W        <--- name, age, marital status (widower)
February 22 1998 Y          <--- date became a widower, remarried (Yes)
struct Date
{
  string month;
  short day, year;
};
struct MarriedInfo
{
  Date wedding;
  string spouse;
  short dependents;
};
struct SingleInfo
{
  bool available;
};
struct WasMarriedInfo
{
  Date divorceOrDeath;
  char remarried;
};
struct PersonalInfo
{
  string name;
  short age;
  char marStatus;   // Tag: S = single, M = married,
                    //      W = was married
  union
  {
    MarriedInfo married;
    SingleInfo single;
    WasMarriedInfo wasMarried;
  };
};
// Caution: the string members make the union's member types non-trivial,
// so standard C++ rejects this declaration as written; C-style char
// arrays in place of string inside the union avoid the problem.
PersonalInfo person;
Structs with variant parts aren't used much anymore.
(p. 69)
Instead, in OOP languages:
Encapsulate the common information in a base class.
address of member i of s = base address of s + (w1 + w2 + . . . + w(i–1)),
where wk is the amount of storage used by member k
/* Set sets the time to the specified values.
 *
 * Receive:   Time object t
 *            hours, the number of hours in standard time
 *            minutes, the number of minutes in standard time
 *            AMPM ('A' if AM, 'P' if PM)
 * Pass back: The modified Time t with data members set to
 *            the specified values
 ******************************************************************/
void Set(Time & t, unsigned hours, unsigned minutes, char AMPM);
(Notice the documentation!)
#include "Time.h"