Binary Trees

Uploaded by

Pedro Pereira
Copyright
© © All Rights Reserved

Chapter 6 Binary Trees

If parentheses are not allowed, the task is simple; parentheses, however, allow for
many levels of nesting. Therefore, an algorithm should be powerful enough to process
any number of nesting levels in an expression. A natural approach is a recursive
implementation. We modify the recursive descent interpreter discussed in Chapter 5’s
case study and outline a recursive descent expression tree constructor.
As Figure 6.64 indicates, a node contains either an operator or an operand, the
latter being either an identifier or a number. To simplify the task, all of them can be
represented as strings in an instance of the class defined as
class ExprTreeNode {
public:
    // k is copied into key, so string literals can be passed safely;
    ExprTreeNode(const char *k, ExprTreeNode *l, ExprTreeNode *r) {
        key = new char[strlen(k)+1];
        strcpy(key,k);
        left = l; right = r;
    }
    . . . . . . . .
private:
    char *key;
    ExprTreeNode *left, *right;
};

Expressions that are converted to trees use the same syntax as expressions in
the case study in Chapter 5. Therefore, the same syntax diagrams can be used. Using
these diagrams, a class ExprTree can be created in which member functions for
processing a factor and term have the following pseudocode (a function for processing
an expression has the same structure as the function processing a term):
factor()
    if (token is a number, id or operator)
        return new ExprTreeNode(token);
    else if (token is '(')
        ExprTreeNode *p = expr();
        if (token is ')')
            return p;
        else error;

term()
    ExprTreeNode *p1, *p2;
    p1 = factor();
    while (token is '*' or '/')
        oper = token;
        p2 = factor();
        p1 = new ExprTreeNode(oper,p1,p2);
    return p1;
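
The pseudocode above can be turned into a small runnable program. The following is only a sketch under stated assumptions, not the book's ExprTree class: the names Node, ExprParser, advance(), and toString() are hypothetical, keys are held in std::string rather than char*, tokens are either runs of letters and digits or single characters, and an operator is not accepted as a factor.

```cpp
#include <cctype>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical node type for this sketch; keys are stored as std::string.
struct Node {
    std::string key;
    std::unique_ptr<Node> left, right;
    Node(std::string k, std::unique_ptr<Node> l = nullptr,
         std::unique_ptr<Node> r = nullptr)
        : key(std::move(k)), left(std::move(l)), right(std::move(r)) {}
};
using NodePtr = std::unique_ptr<Node>;

class ExprParser {
public:
    explicit ExprParser(std::string text) : src(std::move(text)), pos(0) {
        advance();                           // load the first token
    }
    NodePtr parse() {
        NodePtr p = expr();
        if (!token.empty())
            throw std::runtime_error("unexpected token: " + token);
        return p;
    }
private:
    std::string src, token;                  // token is empty at end of input
    std::size_t pos;

    static bool alnum(char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0;
    }
    void advance() {                         // read the next token
        while (pos < src.size() &&
               std::isspace(static_cast<unsigned char>(src[pos])))
            ++pos;
        token.clear();
        if (pos == src.size())
            return;
        if (alnum(src[pos]))                 // number or identifier
            while (pos < src.size() && alnum(src[pos]))
                token += src[pos++];
        else token = src[pos++];             // operator or parenthesis
    }
    NodePtr factor() {
        if (token == "(") {
            advance();
            NodePtr p = expr();
            if (token != ")")
                throw std::runtime_error("missing ')'");
            advance();
            return p;
        }
        if (token.empty() || !alnum(token[0]))
            throw std::runtime_error("operand expected");
        NodePtr p(new Node(token));
        advance();
        return p;
    }
    NodePtr term() {
        NodePtr p1 = factor();
        while (token == "*" || token == "/") {
            std::string oper = token;
            advance();
            NodePtr p2 = factor();
            p1 = NodePtr(new Node(oper, std::move(p1), std::move(p2)));
        }
        return p1;
    }
    NodePtr expr() {                         // same structure as term()
        NodePtr p1 = term();
        while (token == "+" || token == "-") {
            std::string oper = token;
            advance();
            NodePtr p2 = term();
            p1 = NodePtr(new Node(oper, std::move(p1), std::move(p2)));
        }
        return p1;
    }
};

// Fully parenthesized infix form, used only to inspect the tree.
std::string toString(const Node* p) {
    if (p->left == nullptr && p->right == nullptr)
        return p->key;
    return "(" + toString(p->left.get()) + " " + p->key + " " +
           toString(p->right.get()) + ")";
}
```

Parsing "a + b * c" with this sketch gives a tree whose root is +, because term() consumes b * c before expr() sees the next additive operator; this is how the grammar encodes operator precedence.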

Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Section 6.12 Polish Notation and Expression Trees

The tree structure of expressions is very suitable for generating assembly code
or intermediate code in compilers, as shown in this pseudocode of a function from
the ExprTree class:
void generateCode() {
    generateCode(root);
}
generateCode(ExprTreeNode *p) {
    if (p->key is a number or id)
        return p->key;
    else if (p->key is an addition operator)
        result = newTemporaryVar();
        output << "add\t" << generateCode(p->left) << "\t"
               << generateCode(p->right) << "\t"
               << result << endl;
        return result;
    . . . . . . . . .
}
With these member functions, an expression
(var2 + n) * (var2 + var1)/5
is transformed into an expression tree shown in Figure 6.65, and from this tree,
generateCode() generates the following intermediate code:

Figure 6.65 An expression tree.

(children are indented under their parent, left child first)

/
    *
        +
            var2
            n
        +
            var2
            var1
    5

add     var2    n       _tmp_3
add     var2    var1    _tmp_4
mul     _tmp_3  _tmp_4  _tmp_2
div     _tmp_2  5       _tmp_1
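
The numbering of the temporaries in this listing follows from the order of operations in generateCode(): a node reserves its result variable before its subtrees are processed, but its instruction is written only after both subtrees have produced theirs. The sketch below illustrates this under assumptions of its own: Node, CodeGenerator, and mnemonic() are hypothetical names, the instructions are collected in a string instead of being sent to an output stream, and the "sub" mnemonic is a guess, since the text shows only add, mul, and div.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical node type; a leaf has no children.
struct Node {
    std::string key;
    const Node *left, *right;
};

class CodeGenerator {
public:
    // Returns the name that holds the value of the subtree rooted at p.
    std::string generate(const Node* p) {
        assert(p != nullptr);
        if (p->left == nullptr && p->right == nullptr)
            return p->key;                        // number or id
        std::string result = newTemporaryVar();   // reserved before recursion
        std::string l = generate(p->left);
        std::string r = generate(p->right);
        out << mnemonic(p->key) << '\t' << l << '\t' << r << '\t'
            << result << '\n';
        return result;
    }
    std::string code() const { return out.str(); }
private:
    std::ostringstream out;
    int counter = 0;
    std::string newTemporaryVar() {
        return "_tmp_" + std::to_string(++counter);
    }
    static std::string mnemonic(const std::string& op) {
        if (op == "+") return "add";
        if (op == "-") return "sub";              // assumed mnemonic
        if (op == "*") return "mul";
        return "div";              // sketch: anything else is treated as '/'
    }
};
```

Evaluating the two operands into local variables before writing the instruction also fixes the order in which the subtrees are visited, which a single chained output expression did not guarantee before C++17.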
Expression trees are also very convenient for performing other symbolic operations,
such as differentiation. Rules for differentiation (given in the programming
assignments in Chapter 5) are shown in the form of tree transformations in Figure 6.66
and in the following pseudocode:


Figure 6.66 Tree transformations for differentiation of multiplication and division.

Multiplication: if p is the tree Left * Right, differentiate(p) is the tree
    (Left * (Right)') + ((Left)' * Right)

Division: if p is the tree Left / Right, differentiate(p) is the tree
    ((Left)' * Right - Left * (Right)') / (Right * Right)

differentiate(p,x) {
    if (p == 0)
        return 0;
    if (p->key is the id x)
        return new ExprTreeNode("1");
    if (p->key is another id or a number)
        return new ExprTreeNode("0");
    if (p->key is '+' or '-')
        return new ExprTreeNode(p->key,differentiate(p->left,x),
                                differentiate(p->right,x));
    if (p->key is '*')
        ExprTreeNode *q = new ExprTreeNode("+");
        q->left = new ExprTreeNode("*",p->left,new ExprTreeNode(*p->right));
        q->left->right = differentiate(q->left->right,x);
        q->right = new ExprTreeNode("*",new ExprTreeNode(*p->left),p->right);
        q->right->left = differentiate(q->right->left,x);
        return q;
    . . . . . . . . .
}

Here p is a pointer to the expression to be differentiated with respect to x.


The rule for division is left as an exercise.
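
For readers who want to experiment, the transformations of Figure 6.66 can be written as a compact recursive function. This is a sketch under stated assumptions rather than the book's implementation: Node, make(), clone(), and toString() are hypothetical helpers, only the binary operators +, -, *, and / are handled, and unchanged subtrees are deep-copied so that the derivative shares no nodes with the original tree.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical node type for this sketch.
struct Node {
    std::string key;
    std::unique_ptr<Node> left, right;
    Node(std::string k, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : key(std::move(k)), left(std::move(l)), right(std::move(r)) {}
};
using NodePtr = std::unique_ptr<Node>;

NodePtr make(const std::string& k, NodePtr l = nullptr, NodePtr r = nullptr) {
    return NodePtr(new Node(k, std::move(l), std::move(r)));
}

NodePtr clone(const Node* p) {               // deep copy of a subtree
    if (p == nullptr)
        return nullptr;
    return make(p->key, clone(p->left.get()), clone(p->right.get()));
}

NodePtr differentiate(const Node* p, const std::string& x) {
    if (p == nullptr)
        return nullptr;
    const Node *l = p->left.get(), *r = p->right.get();
    if (l == nullptr && r == nullptr)        // leaf: the id x, another id, or a number
        return make(p->key == x ? "1" : "0");
    if (p->key == "+" || p->key == "-")
        return make(p->key, differentiate(l, x), differentiate(r, x));
    if (p->key == "*")                       // Left*(Right)' + (Left)'*Right
        return make("+", make("*", clone(l), differentiate(r, x)),
                         make("*", differentiate(l, x), clone(r)));
    if (p->key == "/")                       // ((Left)'*Right - Left*(Right)') / (Right*Right)
        return make("/",
                    make("-", make("*", differentiate(l, x), clone(r)),
                              make("*", clone(l), differentiate(r, x))),
                    make("*", clone(r), clone(r)));
    throw std::invalid_argument("unsupported operator: " + p->key);
}

// Fully parenthesized infix form, used only to inspect the result.
std::string toString(const Node* p) {
    if (p->left == nullptr && p->right == nullptr)
        return p->key;
    return "(" + toString(p->left.get()) + " " + p->key + " " +
           toString(p->right.get()) + ")";
}
```

The result is not simplified: differentiating x * x with respect to x yields the tree for (x * 1) + (1 * x), exactly as the transformation prescribes.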

6.13 Case Study: Computing Word Frequencies


One tool in establishing authorship of text in cases when the text is not signed, or it is
attributed to someone else, is using word frequencies. If it is known that an author A
wrote text T1 and the distribution of word frequencies in a text T2 under scrutiny is
very close to the frequencies in T1, then it is likely that T2 was written by author A.
Regardless of how reliable this method is for literary studies, our interest lies in writing
a program that scans a text file and computes the frequency of occurrence of each word in
this file. For the sake of simplification, punctuation marks are disregarded and case
sensitivity is disabled. Therefore, the word man’s is counted as two words, man and s,
whether it is in fact one word (the possessive) or two words (a contraction of man is or
man has). Similarly, separators in the middle of words, such as hyphens, cause portions
of the same word to be considered separate words; for example, pre-existence is split into
pre and existence. Also, because case sensitivity is disabled, Good in the phrase Mr. Good
is counted as another occurrence of the word good. On the other hand, Good used in its
normal sense at the beginning of a sentence is properly included as another occurrence of good.
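
These splitting rules can be checked in isolation with a short function. The sketch below is an illustration only: splitWords() is a hypothetical helper that works on a string in memory, whereas the program in Figure 6.68 reads characters from a file; like that program, it folds every letter to upper case and treats any nonletter as a separator.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split text into upper-case words; every nonletter ends the current word.
std::vector<std::string> splitWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (char c : text) {
        // <cctype> functions require a value representable as unsigned char.
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalpha(uc))
            current += static_cast<char>(std::toupper(uc));
        else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())                    // word that ends the text
        words.push_back(current);
    return words;
}
```

With this helper, the input man's pre-existence produces the four words MAN, S, PRE, and EXISTENCE.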
This program focuses not so much on linguistics as on building a self-adjusting
binary search tree using the semisplaying technique. If a word is encountered in the
file for the first time, it is inserted in the tree; otherwise, the semisplaying is started
from the node corresponding to this word.
Another concern is keeping track of all predecessors (ancestors) of a node when
scanning the tree. This is achieved by storing in each node a pointer to its parent.
In this way, from each node we can access any predecessor of this node up to the
root of the tree.
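
As a small illustration of what the parent pointer makes possible, the following sketch climbs from a node to the root without a stack or recursion. The names Node, attachLeft(), attachRight(), and ancestors() are hypothetical and are not part of the code in Figure 6.68.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical node with a parent link.
struct Node {
    std::string word;
    Node *left = nullptr, *right = nullptr, *parent = nullptr;
    explicit Node(std::string w) : word(std::move(w)) {}
};

// Link a child to its parent in both directions.
void attachLeft(Node& parent, Node& child) {
    parent.left = &child;
    child.parent = &parent;
}
void attachRight(Node& parent, Node& child) {
    parent.right = &child;
    child.parent = &parent;
}

// Words stored in the ancestors of p, from its parent up to the root.
std::vector<std::string> ancestors(const Node* p) {
    std::vector<std::string> path;
    for (const Node* a = p->parent; a != nullptr; a = a->parent)
        path.push_back(a->word);
    return path;
}
```

The cost of this convenience is that every operation that relinks nodes, such as a rotation, must update the parent fields as well as the child fields.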
Figure 6.67 shows the structure of the tree using the content of a short file,
and Figure 6.68 contains the complete code. The program reads a word, which is any
sequence of letters (run() in Figure 6.68 tests each character with isalpha(), so digits,
spaces, punctuation marks, and the like are discarded) and checks whether the word is
in the tree. If so, the semisplaying technique is used to reorganize the tree and then
the word’s frequency count

Figure 6.67 Semisplay tree used for computing word frequencies.

Each node holds a word and its frequency (children are indented under their parent):

YE (2)
    left:  MORE (2)
        left:  LAURELS (1)
            left:  AND (1)
                right: BROWN (1)
        right: ONCE (2)
            left:  O (1)
                left:  MYRTLES (1)
    right: YET (1)

The text processed to produce this tree is the beginning of
John Milton's poem, Lycidas:
    Yet once more, o ye laurels,
    and once more
    ye myrtles brown, ...


Figure 6.68 Implementation of word frequency computation.

//************************ genSplay.h ************************
// generic splaying tree class

#ifndef SPLAYING
#define SPLAYING

#include <cstdlib>  // exit(), used by insert()
#include <iostream> // cerr, used by insert()

template<class T> class SplayTree;

template<class T>
class SplayingNode {
public:
    SplayingNode() {
        left = right = parent = 0;
    }
    SplayingNode(const T& el, SplayingNode *l = 0, SplayingNode *r = 0,
                 SplayingNode *p = 0) {
        info = el; left = l; right = r; parent = p;
    }
    T info;
    SplayingNode *left, *right, *parent;
};

template<class T>
class SplayTree {
public:
    SplayTree() {
        root = 0;
    }
    void inorder() {
        inorder(root);
    }
    T* search(const T&);
    void insert(const T&);
protected:
    SplayingNode<T> *root;
    void rotateR(SplayingNode<T>*);
    void rotateL(SplayingNode<T>*);
    void continueRotation(SplayingNode<T>* gr, SplayingNode<T>* par,
                          SplayingNode<T>* ch, SplayingNode<T>* desc);
    void semisplay(SplayingNode<T>*);
    void inorder(SplayingNode<T>*);

    virtual void visit(SplayingNode<T>*) { // overridden in derived classes;
    }
};

template<class T>
void SplayTree<T>::continueRotation(SplayingNode<T>* gr,
        SplayingNode<T>* par, SplayingNode<T>* ch, SplayingNode<T>* desc) {
    if (gr != 0) { // if par has a parent (the grandparent of ch);
        if (gr->right == ch->parent)
            gr->right = ch;
        else gr->left = ch;
    }
    else root = ch;
    if (desc != 0)
        desc->parent = par;
    par->parent = ch;
    ch->parent = gr;
}

template<class T>
void SplayTree<T>::rotateR(SplayingNode<T>* p) {
    p->parent->left = p->right;
    p->right = p->parent;
    continueRotation(p->parent->parent,p->right,p,p->right->left);
}

template<class T>
void SplayTree<T>::rotateL(SplayingNode<T>* p) {
    p->parent->right = p->left;
    p->left = p->parent;
    continueRotation(p->parent->parent,p->left,p,p->left->right);
}

template<class T>
void SplayTree<T>::semisplay(SplayingNode<T>* p) {
    while (p != root) {
        if (p->parent->parent == 0) // if p's parent is the root;
            if (p->parent->left == p)
                rotateR(p);
            else rotateL(p);
        else if (p->parent->left == p) // if p is a left child;

            if (p->parent->parent->left == p->parent) {
                rotateR(p->parent);
                p = p->parent;
            }
            else {
                rotateR(p); // rotate p and its parent;
                rotateL(p); // rotate p and its new parent;
            }
        else // if p is a right child;
            if (p->parent->parent->right == p->parent) {
                rotateL(p->parent);
                p = p->parent;
            }
            else {
                rotateL(p); // rotate p and its parent;
                rotateR(p); // rotate p and its new parent;
            }
        if (root == 0) // update the root;
            root = p;
    }
}

template<class T>
T* SplayTree<T>::search(const T& el) {
    SplayingNode<T> *p = root;
    while (p != 0)
        if (p->info == el) { // if el is in the tree,
            semisplay(p);    // move it upward;
            return &p->info;
        }
        else if (el < p->info)
            p = p->left;
        else p = p->right;
    return 0;
}

template<class T>
void SplayTree<T>::insert(const T& el) {
    SplayingNode<T> *p = root, *prev = 0, *newNode;
    while (p != 0) { // find a place for inserting a new node;
        prev = p;
        if (el < p->info)

            p = p->left;
        else p = p->right;
    }
    // plain new throws std::bad_alloc on failure in standard C++,
    // so this test matters only for compilers that return 0 instead;
    if ((newNode = new SplayingNode<T>(el,0,0,prev)) == 0) {
        std::cerr << "No room for new nodes\n";
        std::exit(1);
    }
    if (root == 0) // the tree is empty;
        root = newNode;
    else if (el < prev->info)
        prev->left = newNode;
    else prev->right = newNode;
}

template<class T>
void SplayTree<T>::inorder(SplayingNode<T> *p) {
    if (p != 0) {
        inorder(p->left);
        visit(p);
        inorder(p->right);
    }
}

#endif

//********************* splay.cpp ************************

#include <iostream>
#include <fstream>
#include <cctype>
#include <cstring>
#include <cstdlib> // exit()
#include "genSplay.h"
using namespace std;

class Word {
public:
    Word() {
        word = 0; // no string is attached yet;
        freq = 1;
    }
    int operator== (const Word& ir) const {

        return strcmp(word,ir.word) == 0;
    }
    int operator< (const Word& ir) const {
        return strcmp(word,ir.word) < 0;
    }
private:
    char *word;
    int freq;
    friend class WordSplay;
    friend ostream& operator<< (ostream&,const Word&);
};

class WordSplay : public SplayTree<Word> {
public:
    WordSplay() {
        differentWords = wordCnt = 0;
    }
    void run(ifstream&,char*);
private:
    int differentWords, // counter of different words in a text file;
        wordCnt;        // counter of all words in the same file;
    void visit(SplayingNode<Word>*);
};

void WordSplay::visit(SplayingNode<Word> *p) {
    differentWords++;
    wordCnt += p->info.freq;
}

void WordSplay::run(ifstream& fIn, char *fileName) {
    char ch = ' ';
    int i;             // index into s;
    char s[100];
    Word rec;
    while (!fIn.eof()) {
        while (1)
            if (!fIn.eof() && !isalpha((unsigned char)ch)) // skip nonletters
                fIn.get(ch);
            else break;
        if (fIn.eof()) // spaces at the end of fIn;
            break;
        // i < 99 leaves room for '\0'; a longer run of letters is split;
        for (i = 0; i < 99 && !fIn.eof() && isalpha((unsigned char)ch); i++) {
            s[i] = toupper((unsigned char)ch);

            fIn.get(ch);
        }
        s[i] = '\0';
        if (!(rec.word = new char[strlen(s)+1])) {
            cerr << "No room for new words.\n";
            exit(1);
        }
        strcpy(rec.word,s);
        Word *p = search(rec);
        if (p == 0)
            insert(rec);        // the new node takes over rec.word;
        else {
            p->freq++;
            delete [] rec.word; // the word is already in the tree;
        }
    }
    inorder();
    cout << "\n\nFile " << fileName
         << " contains " << wordCnt << " words among which "
         << differentWords << " are different\n";
}

int main(int argc, char* argv[]) {
    char fileName[80];
    WordSplay splayTree;
    if (argc != 2) {
        cout << "Enter a file name: ";
        cin >> fileName;
    }
    else strcpy(fileName,argv[1]);
    ifstream fIn(fileName);
    if (fIn.fail()) {
        cerr << "Cannot open " << fileName << endl;
        return 1; // signal the failure to the caller;
    }
    splayTree.run(fIn,fileName);
    fIn.close();
    return 0;
}

is incremented. Note that this movement toward the root is accomplished by changing
links of the nodes involved, not by physically transferring information from one node to
its parent and then to its grandparent and so on. If a word is not found in the tree, it is
inserted in the tree by creating a new leaf for it. After all words are processed, an inorder
tree traversal goes through the tree to count all the nodes and add up all frequency counts,
so that the final result printed is the number of different words (nodes in the tree) and the
total number of words in the file.
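
The two totals are easy to cross-check independently of the splay tree. The sketch below swaps in std::map from the standard library for the semisplay tree, so it says nothing about splaying itself; countWords() is a hypothetical helper that expects the words to be already split and folded to upper case.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns {number of different words, number of all words}.
std::pair<int, int> countWords(const std::vector<std::string>& words) {
    std::map<std::string, int> freq;         // ordered by key, like a BST
    for (const std::string& w : words)
        ++freq[w];                           // a new key starts at 0
    int differentWords = 0, wordCnt = 0;
    for (const auto& entry : freq) {         // visits keys in sorted order
        ++differentWords;
        wordCnt += entry.second;
    }
    return std::make_pair(differentWords, wordCnt);
}
```

For the twelve words of the Lycidas fragment in Figure 6.67, this gives 9 different words and 12 words in total, which agrees with the frequencies stored in the nodes of that tree.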