
CITS2002

Systems Programming


Dynamic data structures


Initially, we focused on scalar and array variables, whose size is known at compile-time.

More recently, we've focused on arrays of values, whose required size was only known at run-time.

In the case of dynamic arrays, we've used standard C99 functions such as malloc(), calloc(), realloc(), and free() to manage the required storage for us.

An extension to this idea is the use of dynamic data structures - collections of data whose required size is not known until run-time. Again, we'll use C99's
standard memory allocation functions whenever we require more memory.

However, unlike our use of realloc to grow (or shrink) a single data structure (there, an array), we'll see two significant differences:

we'll manage a complete data structure by allocating and deallocating its "pieces", and
we'll keep all of the "pieces" linked together by including, in each piece, a "link" to other pieces.

To implement these ideas in C99, we'll develop data structures that contain pointers to other data structures.

CITS2002 Systems Programming, Lecture 20, p1, 17th October 2017.



A simple dynamic data structure - a stack


We'll commence with a simple stack - a data structure that maintains a simple list of items by adding new items, and removing existing items, from the
head of the list.

Such a data structure is also termed a first-in-last-out data structure, a FILO, because the first item added to the stack is the last item removed from it (not
the sort of sequence you want while queueing for a bank's ATM!).

Let's consider the appropriate type definition in C99:

typedef struct _s {
    int         value;
    struct _s   *next;
} STACKITEM;

STACKITEM   *stack = NULL;

Of note:

we haven't really defined a stack datatype, but a single item that will "go into" the stack.
the datatype STACKITEM contains a pointer field, named next, that will point to another item in the stack.
we've defined a new type, a structure named _s, so that the pointer field next can be of a type that already exists.
we've defined a single pointer variable, named stack, that will point to a stack of items.


Adding items to our stack data structure


As a program's execution progresses, we'll need to add and remove data items from the data structure.

The need to do this is not known until run-time, and data (perhaps read from file) will determine how large our stack eventually grows.

As its name suggests, we speak of pushing new items onto the stack when adding them, and of popping existing items from the stack when removing them.

#include <stdio.h>
#include <stdlib.h>

typedef struct _s {             // same definition as before
    int         value;
    struct _s   *next;
} STACKITEM;

STACKITEM   *stack = NULL;

....

void push_item(int newvalue)
{
    STACKITEM *new = malloc( sizeof(STACKITEM) );

    if(new == NULL) {           // check for insufficient memory
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    new->value  = newvalue;
    new->next   = stack;        // the new item links to the old head (or NULL)
    stack       = new;          // and becomes the new head
}

The functions push_item and pop_item are quite simple, but each must behave correctly when the stack is empty. We use a NULL pointer to represent the condition of the stack being empty.


Removing items from our stack data structure


The function pop_item now removes an item from the stack, and returns the actual data's value.

In this example, the data held in each STACKITEM is just a single integer, but it could involve several fields of data. In that case, we may need more
complex functions to return all of the data (perhaps using a structure or pass-by-reference parameters to the pop_item function).
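As a rough illustration of the pass-by-reference idea, the following sketch assumes a hypothetical two-field item. The names RECORD, RECORDITEM, record_stack, push_record and pop_record are invented for this example and are not part of the lecture's code. Unlike pop_item, this variant reports an empty stack through its return value instead of exiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// HYPOTHETICAL two-field payload, not part of the lecture's code
typedef struct {
    int         id;
    double      price;
} RECORD;

typedef struct _r {
    RECORD      record;
    struct _r   *next;
} RECORDITEM;

RECORDITEM  *record_stack = NULL;

void push_record(RECORD newrecord)
{
    RECORDITEM *new = malloc( sizeof(RECORDITEM) );

    if(new == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    new->record     = newrecord;    // structure assignment copies every field
    new->next       = record_stack;
    record_stack    = new;
}

// Copies the top record into *result and returns true,
// or returns false (leaving *result untouched) if the stack is empty
bool pop_record(RECORD *result)
{
    assert(result != NULL);

    if(record_stack == NULL) {
        return false;
    }
    RECORDITEM *old = record_stack;

    *result         = old->record;
    record_stack    = old->next;
    free(old);
    return true;
}
```

A caller would declare its own RECORD variable and pass its address, for example pop_record(&r), checking the returned boolean before using r.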

Again, we must ensure that we don't attempt to remove (pop) an item from an empty stack:

int pop_item(void)
{
    STACKITEM   *old;
    int         oldvalue;

    if(stack == NULL) {
        fprintf(stderr, "attempt to pop from an empty stack\n");
        exit(EXIT_FAILURE);
    }
    oldvalue    = stack->value;
    old         = stack;
    stack       = stack->next;
    free(old);

    return oldvalue;
}


Printing our stack data structure


To print out our whole data structure, we can't just call a standard C99 function, as the standard library knows nothing about our data structure.

Thus we'll write our own function, print_stack, to traverse the stack and successively print each item using printf.

Again, we must check for the case of the empty stack:

void print_stack(void)
{
    STACKITEM *thisitem = stack;

    while(thisitem != NULL) {
        printf("%i", thisitem->value);
        thisitem = thisitem->next;
        if(thisitem != NULL)
            printf(" -> ");
    }
    if(stack != NULL)
        printf("\n");
}

Again, our stack is simple because each node only contains a single integer. If each node were more complex, we might call a different function from within print_stack to perform the actual printing:

....
print_stack_item( thisitem );


Using our stack in a Reverse Polish Calculator


Let's employ our stack data structure to evaluate basic integer arithmetic, as if using a Reverse Polish Calculator.

Each integer read from a line of the file is pushed onto the stack. Each arithmetic operator pops 2 integers from the stack, performs some arithmetic, and pushes the result back onto the stack.

int evaluate_RPN(FILE *fp)
{
    char    line[BUFSIZ];
    int     val1, val2;

    while( fgets(line, sizeof(line), fp) != NULL ) {
        if(line[0] == '#')
            continue;
        if(isdigit(line[0]) || line[0] == '-')
            push_item( atoi(line) );
        else if(line[0] == 'a') {
            val1 = pop_item();
            val2 = pop_item();
            push_item( val1 + val2 );
        }
        ....
        else if(line[0] == 'd') {
            val1 = pop_item();
            val2 = pop_item();
            push_item( val2 / val1 );
        }
        else
            break;
    }
    return pop_item();
}

The input file:

# Our input data:
12
3
add
5
div


Using our stack in a Reverse Polish Calculator, continued


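Careful readers may have noticed that in some cases we don't actually need the integer variables val1 and val2.

For addition, we can use the 2 results returned from pop_item directly as operands, and pass their sum to push_item. This only works for commutative operators: C does not specify which of the two calls to pop_item is made first, so subtraction and division still need val1 and val2.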

int evaluate_RPN(FILE *fp)
{
    char    line[BUFSIZ];

    while( fgets(line, sizeof line, fp) != NULL ) {
        if(line[0] == '#')
            continue;
        if(isdigit(line[0]) || line[0] == '-')
            push_item( atoi(line) );
        else if(line[0] == 'a') {
            push_item( pop_item() + pop_item() );
        }
        ....
    }
    return pop_item();
}

int main(int argc, char *argv[])
{
    printf("%i\n", evaluate_RPN( stdin ) );
    return 0;
}
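For operators where operand order matters, one option is to keep the two pops as separate statements, so that the order in which they happen is fixed. The sketch below repeats the stack functions so that it stands alone; apply_operator is a hypothetical helper introduced only for this illustration, and its single-character operator codes are an assumption, not the lecture's input format.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct _s {
    int         value;
    struct _s   *next;
} STACKITEM;

STACKITEM   *stack = NULL;

void push_item(int newvalue)
{
    STACKITEM *new = malloc( sizeof(STACKITEM) );

    if(new == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    new->value  = newvalue;
    new->next   = stack;
    stack       = new;
}

int pop_item(void)
{
    if(stack == NULL) {
        fprintf(stderr, "attempt to pop from an empty stack\n");
        exit(EXIT_FAILURE);
    }
    STACKITEM   *old        = stack;
    int         oldvalue    = old->value;

    stack = old->next;
    free(old);
    return oldvalue;
}

// HYPOTHETICAL helper: applies one binary operator to the top two items.
// The two pops are separate statements, so their order is guaranteed.
void apply_operator(char op)
{
    int right   = pop_item();       // the item pushed most recently
    int left    = pop_item();       // the item pushed before it

    switch (op) {
    case '+':   push_item(left + right);    break;
    case '-':   push_item(left - right);    break;
    case '*':   push_item(left * right);    break;
    case '/':   assert(right != 0);         // division by zero is undefined
                push_item(left / right);    break;
    default:
        fprintf(stderr, "unknown operator '%c'\n", op);
        exit(EXIT_FAILURE);
    }
}
```

With the values 12 and 3 pushed in that order, apply_operator('-') leaves 9 on the stack, matching the usual Reverse Polish reading of "12 3 sub".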


Problems with our stack data structure


As written, our stack data structure works, but may be difficult to deploy in a large program.

In particular, the whole stack was represented by a single global pointer variable, and all functions accessed or modified that global variable.

What if our program required 2, or more, stacks?
What if the required number of stacks was determined at run-time?
Could the stacks be manipulated by functions that didn't actually "understand" the data they were manipulating?

Ideally we'd re-write all of our functions, push_item, pop_item, and print_stack, so that they received the required stack as a parameter, and used or manipulated that stack.

Techniques on how, and why, to design and implement robust data structures are a focus of the unit CITS2200 Data Structures & Algorithms.
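One possible shape for such a re-write is sketched below. It assumes each function receives the address of the caller's stack pointer (a STACKITEM **), which is only one of several reasonable designs; the lecture does not prescribe a particular signature.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct _s {
    int         value;
    struct _s   *next;
} STACKITEM;

// Each function receives the address of the caller's stack pointer,
// so it can update whichever stack the caller chooses
void push_item(STACKITEM **stack, int newvalue)
{
    assert(stack != NULL);

    STACKITEM *new = malloc( sizeof(STACKITEM) );

    if(new == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    new->value  = newvalue;
    new->next   = *stack;
    *stack      = new;
}

int pop_item(STACKITEM **stack)
{
    assert(stack != NULL);

    if(*stack == NULL) {
        fprintf(stderr, "attempt to pop from an empty stack\n");
        exit(EXIT_FAILURE);
    }
    STACKITEM   *old        = *stack;
    int         oldvalue    = old->value;

    *stack = old->next;
    free(old);
    return oldvalue;
}

void print_stack(STACKITEM *stack)      // read-only, so a plain pointer suffices
{
    for(STACKITEM *thisitem = stack ; thisitem != NULL ; thisitem = thisitem->next) {
        printf("%i%s", thisitem->value, thisitem->next != NULL ? " -> " : "\n");
    }
}
```

A program can then hold any number of independent stacks, each starting as a NULL pointer, and call for example push_item(&operands, 12) or pop_item(&history), where operands and history are the caller's own variables.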


Declaring a list of items


Let's develop a similar data structure that, unlike the first-in-last-out (FILO) approach of the stack, provides first-in-first-out (FIFO) storage - much fairer for
queueing at the ATM!

We term such a data structure a list, and its datatype declaration is very similar to our stack:

typedef struct _l {
    char        *string;
    struct _l   *next;
} LISTITEM;

LISTITEM    *list = NULL;

As with the stack, we'll need to support empty lists, and will again employ a NULL pointer to represent it.

This time, each data item to be stored in the list is a string, and we'll often term such a structure "a list of strings".


Adding (appending) a new item to our list


When adding (appending) new items to our list, we need to be careful about two special (edge) cases: appending to the empty list, and appending to the end of an existing list:

// strdup() is declared in <string.h>; it comes from POSIX, not from C99 itself
void append_item(char *newstring)
{
    if(list == NULL) {                  // append to an empty list
        list = malloc( sizeof(LISTITEM) );
        if(list == NULL) {
            perror( __func__ );
            exit(EXIT_FAILURE);
        }
        list->string    = strdup(newstring);
        list->next      = NULL;
    }
    else {                              // append to an existing list
        LISTITEM *p = list;

        while(p->next != NULL) {        // walk to the end of the list
            p = p->next;
        }
        p->next = malloc( sizeof(LISTITEM) );
        if(p->next == NULL) {
            perror( __func__ );
            exit(EXIT_FAILURE);
        }
        p               = p->next;      // append after the last item
        p->string       = strdup(newstring);
        p->next         = NULL;
    }
}

Notice how we needed to traverse the whole list to locate its end.
Such traversal can become expensive (in time) for very long lists.


Removing an item from the head of our list


Removing items from the head of our list is much easier.

Of course, we again need to be careful about the case of the empty list:

char *remove_item(void)
{
    LISTITEM    *old = list;
    char        *string;

    if(old == NULL) {
        fprintf(stderr, "cannot remove item from an empty list\n");
        exit(EXIT_FAILURE);
    }
    list    = list->next;
    string  = old->string;
    free(old);

    return string;
}

Notice that we return the string (data value) to the caller, and deallocate the old node that was at the head of the list.

We say that the caller now owns the storage required to hold the string - even though the caller did not initially allocate that
storage.

It will be up to the caller to deallocate that memory when no longer required.


Failure to deallocate such memory can lead to memory leaks, that may eventually crash long running programs.
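The ownership rule can be sketched from the caller's side as follows. The list functions are repeated so the example stands alone, with append_item simplified and using malloc and strcpy as a portable stand-in for strdup; drain_list is a hypothetical caller invented for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct _l {
    char        *string;
    struct _l   *next;
} LISTITEM;

LISTITEM    *list = NULL;

// Simplified append, using malloc()+strcpy() as a stand-in for strdup()
void append_item(char *newstring)
{
    assert(newstring != NULL);

    LISTITEM *new = malloc( sizeof(LISTITEM) );

    if(new == NULL || (new->string = malloc(strlen(newstring) + 1)) == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    strcpy(new->string, newstring);
    new->next = NULL;

    if(list == NULL) {                  // append to an empty list
        list = new;
    }
    else {                              // append to an existing list
        LISTITEM *p = list;

        while(p->next != NULL) {
            p = p->next;
        }
        p->next = new;
    }
}

char *remove_item(void)
{
    LISTITEM    *old = list;

    if(old == NULL) {
        fprintf(stderr, "cannot remove item from an empty list\n");
        exit(EXIT_FAILURE);
    }
    char *string = old->string;

    list = old->next;
    free(old);                          // the node is freed, the string is not
    return string;
}

// HYPOTHETICAL caller: empties the list, returning how many items it removed
int drain_list(void)
{
    int count = 0;

    while(list != NULL) {
        char *string = remove_item();   // the caller now owns this storage
        printf("%s\n", string);
        free(string);                   // so the caller must release it
        ++count;
    }
    return count;
}
```

If drain_list omitted its call to free, every removed string would remain allocated but unreachable, which is exactly the kind of leak described above.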


Problems with our list data structure


As written, our list data structure works, but also has a few problems:

Again, our list accessing functions use a single global variable. What if our program required 2, or more, lists?
Continually searching for the end-of-list can become expensive.
Could the lists be manipulated by functions that didn't actually "understand" the data they were manipulating?

We'll address all of these by developing a similar first-in-first-out (FIFO) data structure, which we'll name a queue.


A general-purpose queue data structure


Let's develop a first-in-first-out (FIFO) data structure that queues (almost) arbitrary data.

We're hoping to address the main problems that were exhibited by the stack and list data structures:

We should be able to manage the data without knowing what it is.
We'd like operations, such as appending, to be independent of the number of items already stored. Such (highly desirable) operations are performed in constant time.

typedef struct _e {
    void        *data;
    size_t      datalen;
    struct _e   *next;
} ELEMENT;

typedef struct {
    ELEMENT     *head;
    ELEMENT     *tail;
} QUEUE;

Of note:

We've introduced a new datatype, ELEMENT, to hold each individual item of data.
Because we don't require our functions to "understand" the data they're queueing, each element will just hold a void pointer to the data it's holding,
and remember its length.
Our "traditional" datatype QUEUE now holds 2 pointers - one to the head of the list of items, one to the tail.


Creating a new queue


We'd like our large programs to have more than a single queue - thus we don't want a single, global, variable, and we don't know until run-time how many
queues we'll require.

We thus need a function to allocate space for, and to initialize, a new queue:

QUEUE *queue_new(void)
{
    QUEUE *q = malloc( sizeof(QUEUE) );

    if(q == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    q->head = NULL;
    q->tail = NULL;
    return q;
}

QUEUE *queue_new(void)      // same outcome, often seen
{
    QUEUE *q = calloc( 1, sizeof(QUEUE) );

    if(q == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    return q;
}

....
QUEUE *people_queue = queue_new();
QUEUE *truck_queue  = queue_new();

If we remember that:

the calloc function both allocates memory and sets all of its bytes to the zero-bit-pattern, and
that (most) C99 implementations represent the NULL pointer as the zero-bit-pattern,

then we appreciate the simplicity of allocating new items with calloc.


Deallocating space used by our queue


It's considered a good practice to always write a function that deallocates all space used in our own user-defined dynamic data
structures.

In the case of our queue, we need to deallocate 3 things:

1. the memory required for the data in every element,
2. the memory required for every element,
3. the queue itself.

void queue_free(QUEUE *q)
{
    ELEMENT *this, *save;

    this = q->head;
    while( this != NULL ) {
        save = this;
        this = this->next;
        free(save->data);
        free(save);
    }
    free(q);
}

QUEUE *my_queue = queue_new();
....
// use my local queue
....
queue_free( my_queue );


Adding (appending) new items to our queue


Finally, we'll consider adding new items to our queue.

Remember two of our objectives:

To quickly add items - we don't wish appending to a very long queue to be slow.
We achieve this by remembering where the tail of the queue is, and quickly adding to it without searching.
To be able to queue data that we don't "understand".
We achieve this by treating all data as "a block of bytes", allocating memory for it, copying it (as we're told its length), all without ever interpreting its
contents.


Adding (appending) new items to our queue, continued

void queue_add(QUEUE *q, void *data, size_t datalen)
{
    ELEMENT *newelement;

// ALLOCATE MEMORY FOR A NEW ELEMENT
    newelement = calloc(1, sizeof(ELEMENT));
    if(newelement == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }

// ALLOCATE MEMORY FOR THE DATA IN THE NEW ELEMENT
    newelement->data = malloc(datalen);
    if(newelement->data == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }

// SAVE (COPY) THE UNKNOWN DATA INTO OUR NEW MEMORY
    memcpy(newelement->data, data, datalen);        // declared in <string.h>
    newelement->datalen = datalen;
    newelement->next    = NULL;

// APPEND THE NEW ELEMENT TO AN EMPTY LIST
    if(q->head == NULL) {
        q->head         = newelement;
        q->tail         = newelement;
    }
// OR APPEND THE NEW ELEMENT TO THE TAIL OF THE LIST
    else {
        q->tail->next   = newelement;
        q->tail         = newelement;
    }
}

Writing a function to remove items from our queue is left as a simple exercise.
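For readers who want to compare their own attempt, here is one possible answer. The name queue_remove, its signature, and its choice to return NULL for an empty queue are all assumptions made for this sketch; the lecture does not specify them. The supporting definitions are repeated in compact form so the example stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct _e {
    void        *data;
    size_t      datalen;
    struct _e   *next;
} ELEMENT;

typedef struct {
    ELEMENT     *head;
    ELEMENT     *tail;
} QUEUE;

QUEUE *queue_new(void)
{
    QUEUE *q = calloc( 1, sizeof(QUEUE) );

    if(q == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    return q;
}

void queue_add(QUEUE *q, void *data, size_t datalen)
{
    ELEMENT *newelement = calloc(1, sizeof(ELEMENT));

    if(newelement == NULL || (newelement->data = malloc(datalen)) == NULL) {
        perror( __func__ );
        exit(EXIT_FAILURE);
    }
    memcpy(newelement->data, data, datalen);
    newelement->datalen = datalen;

    if(q->head == NULL) {
        q->head         = newelement;
    }
    else {
        q->tail->next   = newelement;
    }
    q->tail = newelement;
}

// ONE POSSIBLE queue_remove: returns the oldest element's data (which the
// caller must later free) and stores its length in *datalen,
// or returns NULL, leaving *datalen untouched, if the queue is empty
void *queue_remove(QUEUE *q, size_t *datalen)
{
    assert(q != NULL && datalen != NULL);

    ELEMENT *old = q->head;

    if(old == NULL) {
        return NULL;
    }
    void *data  = old->data;

    *datalen    = old->datalen;
    q->head     = old->next;
    if(q->head == NULL) {               // we removed the only element
        q->tail = NULL;
    }
    free(old);
    return data;
}
```

Resetting tail when the queue becomes empty keeps the two pointers consistent. As with the list of strings, the caller takes ownership of the returned block and is responsible for freeing it.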


Storing and searching ordered data - a binary tree


Each of the previous self-referential data-structures stored their values in their order of arrival, and accessed or removed them in the same order or the
reverse. The actual time of insertion is immaterial, with the relative times 'embedded' in the order of the elements.

More common is to store data in a structure that embeds the relative magnitude or priority of the data. Doing so requires insertions to keep the data-
structure ordered, but this makes searching much quicker as well.

Let's consider the type definition and insertion of data into a binary tree in C99:

typedef struct _bt {
    int         value;
    struct _bt  *left;
    struct _bt  *right;
} BINTREE;

BINTREE *tree_root = NULL;

BINTREE *tree_insert(BINTREE *t, int value)
{
    if(t == NULL) {
        BINTREE *new = calloc(1, sizeof(BINTREE));

        if(new == NULL) {
            perror( __func__ );
            exit(EXIT_FAILURE);
        }
        new->value = value;
// the calloc() call has set both left and right to NULL
        return new;
    }
    int order = (t->value - value);

    if(order > 0) {
        t->left = tree_insert(t->left, value);
    }
    else if(order < 0) {
        t->right = tree_insert(t->right, value);
    }
    return t;
}

Of note:

we've defined a data-structure containing two pointers to other instances of the data-structure.
the use of the struct _bt data type is temporary, and never used again.
here, each element of the data-structure, each node of the tree, holds a unique instance of a data value - here, a single integer - though it's very
common to hold multiple data values.
we insert into the tree with:
tree_root = tree_insert(tree_root, new_value);

the (magnitude of the) integer data value embeds the order of the structure - elements with lesser integer values are stored 'below' and to the left of
the current node, higher values to the right.
unlike some (more complicated) variants of the binary-tree, we've made no effort to keep the tree balanced. If we insert already sorted elements into
the tree, the tree will degenerate into a list, with every node having either a NULL left or a NULL right pointer.
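The effect of insertion order can be observed with a small experiment. In the sketch below, tree_height is a hypothetical helper added for this illustration, and tree_insert compares values directly instead of subtracting them (a substitution that avoids integer overflow for extreme values but otherwise follows the same logic).

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct _bt {
    int         value;
    struct _bt  *left;
    struct _bt  *right;
} BINTREE;

BINTREE *tree_insert(BINTREE *t, int value)
{
    if(t == NULL) {
        BINTREE *new = calloc(1, sizeof(BINTREE));

        if(new == NULL) {
            perror( __func__ );
            exit(EXIT_FAILURE);
        }
        new->value = value;             // left and right are already NULL
        return new;
    }
    if(value < t->value) {
        t->left     = tree_insert(t->left, value);
    }
    else if(value > t->value) {
        t->right    = tree_insert(t->right, value);
    }
    return t;                           // equal values are not stored twice
}

// HYPOTHETICAL helper: the number of nodes on the longest root-to-leaf path
int tree_height(BINTREE *t)
{
    if(t == NULL) {
        return 0;
    }
    int lheight = tree_height(t->left);
    int rheight = tree_height(t->right);

    return 1 + (lheight > rheight ? lheight : rheight);
}
```

Inserting the values 1 to 7 in ascending order gives a tree of height 7 (a list in all but name), whereas inserting the same values in the order 4, 2, 6, 1, 3, 5, 7 gives a height of 3.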


Storing and searching ordered data - a binary tree, continued


Knowing that we've built the binary tree to maintain an order of its elements, we exploit this property to find elements:

bool find_recursively(BINTREE *t, int wanted)
{
    if(t != NULL) {
        int order = (t->value - wanted);

        if(order == 0) {
            return true;
        }
        else if(order > 0) {
            return find_recursively(t->left, wanted);
        }
        else {
            return find_recursively(t->right, wanted);
        }
    }
    return false;
}

bool find_iteratively(BINTREE *t, int wanted)
{
    while(t != NULL) {
        int order = (t->value - wanted);

        if(order == 0) {
            return true;
        }
        else if(order > 0) {
            t = t->left;
        }
        else {
            t = t->right;
        }
    }
    return false;
}

Of note:

we search for a value in the tree with:


bool found = find_recursively(tree_root, wanted_value);

we do not modify the tree when searching, we simply 'walk' over its elements, determining whether to go-left or go-right depending on the relative
value of each element's data to the wanted value.
some (more complicated) variants of the binary-tree re-balance the tree by moving recently found values (their nodes) closer to the root of the tree in
the hope that they'll be required again, soon.
if the required value is found, the searching functions return true; otherwise we keep walking the tree until we find the value or until we can no longer walk in the required direction (because either the left or the right pointer is NULL).
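The earlier advice about always writing a function to deallocate a dynamic data structure applies to the tree as well. A minimal sketch follows; tree_free is a hypothetical name, and returning the number of freed nodes is a choice made here only so the result can be checked. As an assumption of this sketch, tree_insert again compares values directly instead of subtracting them.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct _bt {
    int         value;
    struct _bt  *left;
    struct _bt  *right;
} BINTREE;

BINTREE *tree_insert(BINTREE *t, int value)
{
    if(t == NULL) {
        BINTREE *new = calloc(1, sizeof(BINTREE));

        if(new == NULL) {
            perror( __func__ );
            exit(EXIT_FAILURE);
        }
        new->value = value;
        return new;
    }
    if(value < t->value) {
        t->left     = tree_insert(t->left, value);
    }
    else if(value > t->value) {
        t->right    = tree_insert(t->right, value);
    }
    return t;
}

// HYPOTHETICAL helper: frees t and every node below it,
// returning how many nodes were freed.
// Both subtrees are freed before their parent, so no freed node is ever read.
size_t tree_free(BINTREE *t)
{
    if(t == NULL) {
        return 0;
    }
    size_t nfreed = tree_free(t->left) + tree_free(t->right);

    free(t);
    return nfreed + 1;
}
```

After calling tree_free(tree_root), the caller should also set tree_root to NULL so that the stale pointer is not used again.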
