Graphs

The document discusses C++ input/output (I/O) and introduces graphs and bit-by-bit I/O. It covers C++ I/O buffering and how to perform bit-by-bit I/O. It provides examples of reading and writing raw binary data to files using bitwise operators and discusses defining a class for bitwise I/O streams. The class would extend existing I/O stream classes and add methods like writeBit() and readBit() while buffering bits in a byte.

Uploaded by

Umar Ali
Copyright
© © All Rights Reserved

CSE 100: C++ I/O; Introduction to Graphs
Today’s Class
• C++ I/O
• I/O buffering
• Bit-by-bit I/O
• Introduction to Graphs
Reading and writing numbers
#include <iostream>
#include <fstream>

using namespace std;

int main( int argc, char** argv )
{
    ofstream numFile;
    int num = 12345;
    numFile.open( "numfile" );
    numFile << num;
    numFile.close();
}

Assuming ints are represented with 4 bytes, how large is numfile after this
program runs?
A. 1 byte
B. 4 bytes
C. 5 bytes
D. 20 bytes
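Formatted output with << writes the number as text, one byte per digit. If you write several numbers this way, you’ll need to include delimiters between them so they can be read back.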
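The need for delimiters can be sketched with in-memory string streams, which format numbers the same way an ofstream does. This is only an illustration: the function names `writeFormatted` and `readFormatted` are hypothetical, and a single space is assumed as the delimiter.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Write each number as text, separated by a single space (the delimiter).
std::string writeFormatted(const std::vector<int>& nums) {
    std::ostringstream out;
    for (std::size_t i = 0; i < nums.size(); ++i) {
        if (i > 0) out << ' ';
        out << nums[i];
    }
    return out.str();
}

// Read whitespace-separated numbers back with operator>>.
std::vector<int> readFormatted(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> nums;
    int n;
    while (in >> n) nums.push_back(n);
    return nums;
}
```

Writing 12 and 345 with the space yields the text "12 345", which reads back as two numbers. Without the space the text would be "12345", which reads back as a single number.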


Writing raw numbers
#include <iostream>
#include <fstream>

using namespace std;

int main( int argc, char** argv )
{
    ofstream numFile;
    int num = 12345;
    numFile.open( "numfile" );
    numFile.write( (char*)&num, sizeof(num) );
    numFile.close();
}

This is the method you’ll use for the final submission


Let’s look at the file after we run this code…
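One way to look at the raw bytes without a separate hex viewer is to dump them from inside the program. The sketch below is an illustration only; `hexDump` and `rawBytes` are hypothetical helper names, not part of the course code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Render each byte of `bytes` as two hex digits, separated by spaces.
std::string hexDump(const std::string& bytes) {
    std::string out;
    char cell[3];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i > 0) out += ' ';
        std::snprintf(cell, sizeof(cell), "%02x",
                      static_cast<unsigned>(static_cast<unsigned char>(bytes[i])));
        out += cell;
    }
    return out;
}

// Copy the in-memory bytes of an int, the same bytes that
// numFile.write((char*)&num, sizeof(num)) sends to the file.
std::string rawBytes(int num) {
    return std::string(reinterpret_cast<const char*>(&num), sizeof(num));
}
```

Since 12345 is 0x3039, `hexDump(rawBytes(12345))` should give `39 30 00 00` on a little-endian machine with 4-byte ints. The byte order and count depend on the platform, which is worth remembering when a file is written on one machine and read on another.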
Reading raw numbers
#include <iostream>
#include <fstream>

using namespace std;

int main( int argc, char** argv )
{
    ofstream numFile;
    int num = 12345;
    numFile.open( "numfile" );
    numFile.write( (char*)&num, sizeof(num) );
    numFile.close();

    // Getting the number back!
    ifstream numFileIn;
    numFileIn.open( "numfile" );
    int readN;
    numFileIn.read( (char*)&readN, sizeof(readN) );
    cout << readN << endl;
    numFileIn.close();
}
Opening a binary file
#include <iostream>
#include <fstream>

using namespace std;

int main( int argc, char** argv )
{
    ifstream theFile;
    unsigned char nextChar;
    theFile.open( "testerFile", ios::binary );
    while ( 1 ) {
        nextChar = theFile.get();
        if (theFile.eof()) break;
        cout << nextChar;
    }
    theFile.close();
}
Binary and nonbinary file streams
• Ultimately, all streams are sequences of bytes: input streams, output streams... text
streams, multimedia streams, TCP/IP socket streams...

• However, for some purposes, on some operating systems, text files are handled
differently from binary files
• Line termination characters for a particular platform may be inserted or removed
automatically
• Conversion to or from a Unicode encoding scheme might be performed

• If you don’t want those extra manipulations to occur, use the flag ios::binary when you
open it, to specify that the file stream is a binary stream
• To test your implementation on small strings, use formatted I/O

• Then add the binary I/O capability

• But there is one small detail: binary I/O operates on units of information such as whole bytes, or a string of bytes
• We need variable-length strings of bits
Reading binary data from a file: an example
#include <fstream>
#include <iostream>
using namespace std;
/** Count and output the number of times char 'a' occurs in
 * a file named by the first command line argument. */
int main(int argc, char** argv) {
    if (argc < 2) {              // no file name given
        cerr << "Usage: " << argv[0] << " <file>" << endl;
        return -1;
    }
    ifstream in;
    in.open(argv[1], ios::binary);
    int count = 0;
    unsigned char ch;
    while (1) {
        ch = in.get();           // or: in.read((char*)&ch, 1);
        if (!in.good()) break;   // failure, or eof
        if (ch == 'a') count++;  // read an 'a', count it
    }

    if (!in.eof()) {             // loop stopped for some bad reason...
        cerr << "There was a problem, sorry." << endl;
        return -1;
    }
    cerr << "There were " << count << " 'a' chars." << endl;
    return 0;
}
Writing the compressed file
• Header (some way to reconstruct the HCTree)
• Encoded data (bits)

Now let’s talk about how to write the bits…
Buffering
• The C++ I/O classes ofstream, ifstream, and fstream use buffering

• I/O buffering is the use of an intermediate data structure, called the buffer, usually an
array used with FIFO behavior, to hold data items

• Output buffering: the buffer holds items destined for output until there are enough of
them to send to the destination; then they are sent in one large chunk

• Input buffering: the buffer holds items that have been received from the source in
one large chunk, until the user needs them

• The reason for buffering is that it is often much faster per byte to receive data from a source, or to send data to a destination, in large chunks, instead of one byte at a time

• This is true, for example, of disk files and internet sockets; even small buffers (512 bytes or 1 KB) can make a big difference in performance

• Also, operating system I/O calls and disk drives themselves typically perform buffering
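The output-buffering idea above can be sketched with a toy class. This is an illustration under stated assumptions, not how ofstream is implemented: `ChunkedWriter` is a hypothetical name, the "destination" is just a string, and the counter records how many separate transfers were made.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// A toy output buffer: collects bytes and hands them to the
// "device" (here just a string) one chunk at a time.
class ChunkedWriter {
public:
    explicit ChunkedWriter(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Buffer one byte; send the whole buffer once it is full.
    void put(char c) {
        buffer_.push_back(c);
        if (buffer_.size() == capacity_) flush();
    }

    // Send whatever is buffered, even if the buffer is not full.
    void flush() {
        if (buffer_.empty()) return;
        device_ += buffer_;
        buffer_.clear();
        ++transfers_;
    }

    const std::string& device() const { return device_; }
    int transfers() const { return transfers_; }

private:
    std::size_t capacity_;
    std::string buffer_;
    std::string device_;
    int transfers_ = 0;
};
```

With a capacity of 4, writing ten bytes causes two transfers automatically and leaves two bytes waiting in the buffer until `flush()` is called. Forgetting that final flush is the usual way buffered output loses its last few bytes.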
Streams and Buffers

BitOutputStream: DATA IN -> encoder -> (bits) -> Buffer -> (bytes) -> ostream

ostream -> (bytes) -> Buffer (4KB) -> disk
You can also manually flush this buffer

disk -> (4KB) -> Buffer -> (bytes) -> istream

BitInputStream: istream -> (bytes) -> Buffer -> (bits) -> decoder -> DATA OUT
Buffering and bit-by-bit I/O
• The standard C++ I/O classes do not have any methods for doing I/O a bit at a time

• The smallest unit of input or output is one byte (8 bits)

• This is standard not only in C++, but in just about every other language in the world

• If you want to do bit-by-bit I/O, you need to write your own methods for it

• Basic idea: use a byte as an 8-bit buffer!
• Use the bitwise shift and OR operators to write individual bits into the byte, or to read individual bits from it
• Flush the byte when it is full, or when you are done with I/O

• For a nice object-oriented design, you can define a class that extends an existing
iostream class, or that delegates to an object of an existing iostream class, and
adds writeBit or readBit methods (and a flush method which flushes the 8-bit buffer)
C++ bitwise operators
• C++ has bitwise logical operators &, |, ^, ~ and shift operators <<, >>

• Operands to these operators can be of any integral type. Operands narrower than int (such as char) are promoted to int first; for the shift operators, the result has the type of the promoted left operand

& does bitwise and of its arguments;
| does bitwise or of its arguments;
^ does bitwise xor of its arguments;
~ does bitwise complement of its one argument

<< shifts its left argument left by the number of bit positions given by its right argument, shifting in 0 on the right;
>> shifts its left argument right by the number of bit positions given by its right argument, shifting in 0 on the left if the left argument is unsigned or non-negative; for a negative signed value, compilers typically shift in the sign bit (guaranteed since C++20)
C++ bitwise operators: examples
unsigned char a = 5, b = 67;   (one byte each)

a: 0 0 0 0 0 1 0 1
b: 0 1 0 0 0 0 1 1
(the leftmost bit is the most significant bit, the rightmost is the least significant bit)

What is the result of a & b?


A. 01000111
B. 00000001
C. 01000110
D. Something else

Scott B. Baden / CSE 100-A / Spring 2013
What is the result of b >> 5?


A. 00000010
B. 00000011
C. 01100000
D. Something else

C++ bitwise operators: examples
unsigned char a = 5, b = 67;

a               0 0 0 0 0 1 0 1
b               0 1 0 0 0 0 1 1

a & b           0 0 0 0 0 0 0 1
a | b           0 1 0 0 0 1 1 1
~a              1 1 1 1 1 0 1 0
a << 4          0 1 0 1 0 0 0 0
b >> 1          0 0 1 0 0 0 0 1
(b >> 1) & 1    0 0 0 0 0 0 0 1
a | (1 << 5)    0 0 1 0 0 1 0 1
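The rows of this table can be checked mechanically. The helper below is a small sketch; `toBits` is a hypothetical name, and it simply renders the low 8 bits of its argument as a string of 0s and 1s, most significant bit first.

```cpp
#include <cassert>
#include <string>

// Render one byte as 8 characters, most significant bit first.
std::string toBits(unsigned char byte) {
    std::string bits(8, '0');
    for (int i = 0; i < 8; ++i) {
        if ((byte >> i) & 1) bits[7 - i] = '1';
    }
    return bits;
}
```

For example, `toBits(static_cast<unsigned char>(a | (1 << 5)))` with `a = 5` gives "00100101". The cast matters for `~a`: because `a` is promoted to int before the complement is taken, the result has many more than 8 bits, and converting back to `unsigned char` keeps only the low byte shown in the table.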
C++ bitwise operators: an exercise
• Selecting a bit: Suppose we want to return the value (1 or 0) of the nth bit from the right of a byte argument. How to do that?
byte bitVal(char b, int n) {

    return
}
• Setting a bit: Suppose we want to set the nth bit from the right of a byte argument to a given value (1 or 0), leaving other bits unchanged, and return the result. How to do that?
byte setBit(char b, int bit, int n) {

    return
}
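One possible solution to the exercise is sketched below; other answers are equally valid. Two assumptions are made that the slide does not state: `byte` is taken to be an 8-bit `unsigned char`, and bit positions are counted from 0 at the rightmost (least significant) bit. The parameter type is also changed from `char` to `byte` to avoid sign surprises.

```cpp
#include <cassert>

typedef unsigned char byte;  // assumption: `byte` means an 8-bit unsigned type

// Return the value (0 or 1) of bit n of b, counting from 0 at the right.
byte bitVal(byte b, int n) {
    assert(n >= 0 && n < 8);
    return (b >> n) & 1;                 // move bit n to position 0, mask the rest
}

// Return b with bit n set to `bit` (0 or 1), other bits unchanged.
byte setBit(byte b, int bit, int n) {
    assert(n >= 0 && n < 8);
    byte cleared = b & ~(1 << n);        // force bit n to 0
    return cleared | ((bit & 1) << n);   // then OR in the requested value
}
```

With `b = 67` (01000011), `bitVal(b, 0)` is 1 and `bitVal(b, 2)` is 0. With `a = 5`, `setBit(a, 1, 5)` is 37, matching the `a | (1 << 5)` row of the table.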
Defining classes for bitwise I/O
• For a nice object-oriented design, let’s define a class BitOutputStream that delegates to
an object of an existing iostream class, and that adds a writeBit method (and a flush
method which flushes the 8-bit buffer)
• If instead BitOutputStream subclassed an existing class, it would inherit all the
existing methods of its parent class, and so they become part of the subclass’s interface
also
• some of these methods might be useful, but...
• in general it will complicate the interface

• Otherwise the two design approaches are very similar to implement, except that:
• with inheritance, BitOutputStream uses superclass methods to perform operations
• with delegation, BitOutputStream uses methods of a contained object to perform
operations

• We will also consider a BitInputStream class, for bitwise input


Outline of a BitOutputStream class using delegation
#include <iostream>
class BitOutputStream {
private:
    char buf;            // one byte buffer of bits
    int nbits;           // how many bits have been written to buf
    std::ostream & out;  // reference to the output stream to use
public:

    /** Initialize a BitOutputStream that will use
     * the given ostream for output.
     */
    BitOutputStream(std::ostream & os) : buf(0), nbits(0), out(os) {
        // clear buffer and bit counter
    }

    /** Send the buffer to the output, and clear it */
    void flush() {
        out.put(buf);
        out.flush();
        buf = nbits = 0;
    }
Outline of a BitOutputStream class using delegation, cont’d
    /** Write the least significant bit of the argument to
     * the bit buffer, and increment the bit buffer index.
     * But flush the buffer first, if it is full.
     */
    void writeBit(int i) {
        // Is the bit buffer full? Then flush it

        // Write the least significant bit of i into the buffer
        // at the current index

        // Increment the index
    }
};

(Diagram: a BitOutputStream object holds char buf, int nbits, and ostream out.)
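The outline can be completed in more than one way. The sketch below is one possibility, with three choices the slides leave open, all of them assumptions: bits are packed starting from the most significant position of the byte, the buffer is an `unsigned char` so that shifts behave predictably, and `flush()` writes nothing when no bits are buffered.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

class BitOutputStream {
public:
    explicit BitOutputStream(std::ostream& os) : buf(0), nbits(0), out(os) {}

    // Send the buffered byte to the output (if it holds any bits) and clear it.
    // Unused low-order positions of a partial byte are left as 0.
    void flush() {
        if (nbits == 0) return;            // nothing buffered
        out.put(static_cast<char>(buf));
        out.flush();
        buf = 0;
        nbits = 0;
    }

    // Write the least significant bit of i into the buffer.
    void writeBit(int i) {
        if (nbits == 8) flush();           // buffer full: flush it first
        buf |= static_cast<unsigned char>((i & 1) << (7 - nbits));
        ++nbits;                           // advance the bit index
    }

private:
    unsigned char buf;   // one-byte buffer of bits
    int nbits;           // how many bits have been written to buf
    std::ostream& out;   // stream that receives whole bytes
};
```

Writing the bits 0,1,0,0,0,0,0,1 to a `BitOutputStream` wrapped around a `std::ostringstream` and then calling `flush()` produces the single character 'A' (0x41). Because a final partial byte is padded with zeros, a decoder also needs some way to know how many bits are real, for example a symbol count stored in the file header.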
Outline of a BitInputStream class, using delegation
#include <iostream>
class BitInputStream {
private:
    char buf;           // one byte buffer of bits
    int nbits;          // how many bits have been read from buf
    std::istream & in;  // the input stream to use
public:

    /** Initialize a BitInputStream that will use
     * the given istream for input.
     */
    BitInputStream(std::istream & is) : in(is) {
        buf = 0;     // clear buffer
        nbits = ??   // initialize bit index
    }

    /** Fill the buffer from the input */
    void fill() {
        buf = in.get();
        nbits = 0;
    }

What should we initialize nbits to?
A. 0
B. 1
C. 7
D. 8
E. Other
Outline of a BitInputStream class, using delegation (cont’d)
    /** Read the next bit from the bit buffer.
     * Fill the buffer from the input stream first if needed.
     * Return 1 if the bit read is 1;
     * return 0 if the bit read is 0.
     */
    int readBit() {
        // If all bits in the buffer are read, fill the buffer first

        // Get the bit at the appropriate location in the bit
        // buffer, and return the appropriate int

        // Increment the index
    }
};
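As with the output class, the outline admits several completions. The sketch below is one of them and rests on assumptions not stated in the slides: bits are read from the most significant position first, `nbits` starts in the "all bits consumed" state so that the first read triggers a fill, and `readBit()` returns -1 at end of input (the slide only specifies the return values 1 and 0).

```cpp
#include <cassert>
#include <istream>
#include <sstream>

class BitInputStream {
public:
    // nbits starts at 8 ("everything consumed") so the first readBit() fills.
    explicit BitInputStream(std::istream& is) : buf(0), nbits(8), in(is) {}

    // Read the next bit, most significant first.
    // Returns 1 or 0, or -1 if the input has no more bytes.
    int readBit() {
        if (nbits == 8 && !fill()) return -1;
        int bit = (buf >> (7 - nbits)) & 1;
        ++nbits;                           // advance the bit index
        return bit;
    }

private:
    // Load the next byte from the input; returns false if none is available.
    bool fill() {
        int next = in.get();
        if (!in) return false;             // eof or read failure
        buf = static_cast<unsigned char>(next);
        nbits = 0;
        return true;
    }

    unsigned char buf;   // one-byte buffer of bits
    int nbits;           // how many bits have been read from buf
    std::istream& in;    // stream that supplies whole bytes
};
```

Reading from a `std::istringstream` that holds the single character 'A' yields the bits 0,1,0,0,0,0,0,1 and then -1. The bit order here has to match whatever order the writer used.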
Sources of information and entropy
• A source of information emits a sequence of symbols drawn independently from some alphabet
• Suppose the alphabet is the set of symbols $a_1, \dots, a_N$
• Suppose the probability of symbol $a_i$ occurring in the source is $p_i$
• Then the information contained in symbol $a_i$ is $\log \frac{1}{p_i}$ bits, and the average information per symbol is (logs are base 2):

$$H = \sum_{i=1}^{N} p_i \log \frac{1}{p_i}$$

• This quantity H is the “entropy” or “Shannon information” of the information source
• For example, suppose a source uses 3 symbols, which occur with probabilities 1/3, 1/4, 5/12
• The entropy of this source is

$$H = \frac{1}{3}\log 3 + \frac{1}{4}\log 4 + \frac{5}{12}\log \frac{12}{5} \approx 1.555 \text{ bits}$$
Lower bound on average code length
Symbol   Code A   Code B   Code C
s        00       0        0
p        01       1        10
a        10       10       110
m        11       11       111

Symbol   Frequency
s        0.6
p        0.2
a        0.1
m        0.1

Shannon’s entropy provides a lower bound on the average code length purely as a function of symbol frequencies and independent of ANY encoding scheme

L_ave = 0.6 * -lg(0.6) + 0.2 * -lg(0.2) + 0.1 * -lg(0.1) + 0.1 * -lg(0.1)
      = 0.6 * lg(5/3) + 0.2 * lg(5) + 0.1 * lg(10) + 0.1 * lg(10)
      = 1.57
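The lower bound and the average length of a concrete code can both be computed in a few lines. The following is a minimal sketch; the function names `entropy` and `averageLength` are hypothetical, and a symbol with probability 0 is taken to contribute nothing to the sum.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy in bits: sum of p * lg(1/p), skipping symbols with p == 0.
double entropy(const std::vector<double>& probs) {
    double h = 0.0;
    for (double p : probs) {
        if (p > 0.0) h += p * std::log2(1.0 / p);
    }
    return h;
}

// Average codeword length: sum of p_i * (length of the codeword for symbol i).
double averageLength(const std::vector<double>& probs,
                     const std::vector<int>& lengths) {
    assert(probs.size() == lengths.size());
    double total = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        total += probs[i] * lengths[i];
    }
    return total;
}
```

For the frequencies 0.6, 0.2, 0.1, 0.1, the entropy is about 1.57 bits. Code A (lengths 2, 2, 2, 2) averages 2.0 bits per symbol and Code C (lengths 1, 2, 3, 3) averages 1.6, both above the bound. Code B (lengths 1, 1, 2, 2) averages 1.2, below the bound, which is only possible because it cannot be decoded unambiguously: "10" could be "a" or "p" followed by "s".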
How large and how small can entropy be?
• A source of information emits a sequence of symbols drawn independently from the alphabet $a_1, \dots, a_N$ such that the probability of symbol $a_i$ occurring is $p_i$
• The entropy (Shannon information) of the source, in bits, is defined as (logs are base 2):

$$H = \sum_{i=1}^{N} p_i \log \frac{1}{p_i}$$

• Q: What is the possible range of values of H? A: We always have $0 \le H \le \log N$
• The smallest possible value of H is 0:
  • If one symbol $a_i$ occurs all the time, so $p_i = 1$ and so $\log \frac{1}{p_i} = 0$, and all the other symbols never occur, so the other $p_j = 0$, then you don’t get any information by observing the source: H = 0
• The largest possible value of H is $\log N$. This is the ‘maximum entropy’ condition
  • If each of the N symbols is equally likely, then $p_i = \frac{1}{N}$ for all i and so:

$$H = \sum_{i=1}^{N} \frac{1}{N} \log N = \log N$$
What is the best possible average length of a coded symbol with these frequencies?

Symbol   Frequency
S        1.0
P        0.0
A        0.0
M        0.0

A. 0
B. 0.67
C. 1.0
D. 1.57
E. 2.15

What is the best possible average length of a coded symbol with this frequency distribution? (why?)

Symbol   Frequency
S        0.25
P        0.25
A        0.25
M        0.25

A. 1
B. 2
C. 3
D. lg(2)
Graphs

Kinds of Data Structures

• Unstructured structures (sets)
• Sequential, linear structures (arrays, linked lists)
• Hierarchical structures (trees)
• Graphs

Graphs consist of:
• A collection of elements (“nodes” or “vertices”)
• A set of connections (“edges” or “links” or “arcs”) between pairs of nodes
• Edges may be directed or undirected
• Edges may have a weight associated with them

Graphs are not hierarchical or sequential: there is no requirement for a “root” or for “parent/child” relationships between nodes

(Figure: an example graph with nodes A, B, C, D, E.)

Graphs
A. They consist of both vertices and edges
B. They do NOT have an inherent order
C. Edges may be weighted or unweighted
D. Edges may be directed or undirected
E. They may contain cycles

Graphs
Which of the following is true?
A. A graph can always be represented as a tree
B. A tree can always be represented as a graph
C. Both A and B
D. Neither A nor B
Note that trees are special cases of graphs; lists are special cases of trees.

Why Graphs?

Remember: If your problem maps to a well-known graph problem, it usually means you can solve it blazingly fast!

Graphs: Example
(Figure: a directed graph on vertices V0 through V5.)

V = {

|V| =

E = {

|E| =

Path:

Graphs: Definitions
(Figure: a directed graph on vertices V0 through V6.)

A graph G = (V,E) consists of a set of vertices V and a set of edges E
• Each edge in E is a pair (v,w) such that v and w are in V.
• If G is an undirected graph, (v,w) in E means vertices v and w are connected by an
edge in G. This (v,w) is an unordered pair
• If G is a directed graph, (v,w) in E means there is an edge going from vertex v to
vertex w in G. This (v,w) is an ordered pair; there may or may not also be an
edge (w,v) in E
• In a weighted graph, each edge also has a “weight” or “cost” c, and an edge in E is a
triple (v,w,c)
• When talking about the size of a problem involving a graph, the number of vertices
|V| and the number of edges |E| will be relevant
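
Connected, disconnected and fully connected graphs
• Connected graphs: there is a path between every pair of vertices
• Disconnected graphs: at least one pair of vertices has no path between them
• Fully connected (complete) graphs: there is an edge between every pair of vertices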

Q: What are the minimum and maximum number of edges in an undirected connected graph G(V,E) with no self loops, where N=|V|?

A. 0, N^2
B. N, N^2
C. N-1, N(N-1)/2

Sparse vs. Dense Graphs


(Figure: two graphs on vertices V0 through V3, one with few edges and one with many.)

A dense graph is one where |E| is “close to” |V|^2.
A sparse graph is one where |E| is “closer to” |V|.

Representing Graphs: Adjacency Matrix


(Figure: a directed graph on vertices V0 through V6, next to an empty 7 x 7 matrix whose rows and columns are labeled 0 through 6.)

A 2D array where each entry [i][j] encodes connectivity information between i and j
• For an unweighted graph, the entry is 1 if there is an edge from i to j, 0 otherwise
• For a weighted graph, the entry is the weight of the edge from i to j, or “infinity” if there is no edge
• Note an undirected graph’s adjacency matrix will be symmetrical
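The unweighted case described above can be sketched as a small class. This is an illustration only; `AdjacencyMatrix` and its member names are hypothetical, and vertices are assumed to be numbered from 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class AdjacencyMatrix {
public:
    // An n x n matrix of zeros: no edges yet.
    explicit AdjacencyMatrix(std::size_t n) : cells_(n, std::vector<int>(n, 0)) {}

    // Directed edge from i to j.
    void addDirected(std::size_t i, std::size_t j) {
        assert(i < cells_.size() && j < cells_.size());
        cells_[i][j] = 1;
    }

    // Undirected edge: record both directions, which keeps the matrix symmetric.
    void addUndirected(std::size_t i, std::size_t j) {
        addDirected(i, j);
        addDirected(j, i);
    }

    bool hasEdge(std::size_t i, std::size_t j) const { return cells_[i][j] == 1; }

    // True when entry [i][j] equals entry [j][i] for every pair.
    bool isSymmetric() const {
        for (std::size_t i = 0; i < cells_.size(); ++i)
            for (std::size_t j = 0; j < i; ++j)
                if (cells_[i][j] != cells_[j][i]) return false;
        return true;
    }

    // Number of cells stored, regardless of how many edges exist.
    std::size_t cellCount() const { return cells_.size() * cells_.size(); }

private:
    std::vector<std::vector<int>> cells_;
};
```

Checking whether an edge exists is a single array lookup, which is the main attraction of this representation. A weighted variant could store the weight in each cell and use a sentinel such as `std::numeric_limits<double>::infinity()` for "no edge".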

Representing Graphs: Adjacency Matrix

(Figure: the directed graph on vertices V0 through V6 with its 7 x 7 adjacency matrix filled in.)

How big is an adjacency matrix in terms of the number of nodes and edges (BigO, tightest bound)?
A. |V|
B. |V|+|E|
C. |V|^2
D. |E|^2
E. Other

When is that OK? When is it a problem?

Space efficiency of Adjacency Matrix


Sparse graph (left)        Dense graph (right)

   0 1 2 3                    0 1 2 3
0  0 1 0 0                 0  1 1 1 1
1  0 0 0 1                 1  1 1 0 1
2  1 0 0 0                 2  1 1 0 1
3  0 0 1 0                 3  0 1 1 1

A dense graph is one where |E| is “close to” |V|^2.
A sparse graph is one where |E| is “closer to” |V|.

Adjacency matrices are space inefficient for sparse graphs

Representing Graphs: Adjacency Lists


(Figure: a directed graph on vertices V0 through V5, drawn with its vertex list and edge list.)

• Vertices and edges stored as lists
• Each vertex points to all its edges
• Each edge points to the two vertices that it connects
• If the graph is directed: edge nodes differentiate between the head and tail of the connection
• If the graph is weighted: edge nodes also contain weights

Representing Graphs: Adjacency Lists


(Figure: a directed graph on vertices V0 through V5.)

Each vertex has a list with the vertices adjacent to it.
In a weighted graph this list will include weights.

How much storage does this representation need? (BigO, tightest bound)
A. |V|
B. |E|
C. |V|+|E|
D. |V|^2
E. |E|^2
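A common way to realize "each vertex has a list with the vertices adjacent to it" is a vector of vectors. The sketch below is one such layout, not necessarily the one used in the course; `AdjacencyList` and `storageUnits` are hypothetical names, and the graph is assumed to be directed and unweighted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class AdjacencyList {
public:
    // One (initially empty) neighbor list per vertex.
    explicit AdjacencyList(std::size_t n) : neighbors_(n) {}

    // Directed edge: append `to` to the list owned by `from`.
    void addDirected(std::size_t from, std::size_t to) {
        assert(from < neighbors_.size() && to < neighbors_.size());
        neighbors_[from].push_back(to);
    }

    const std::vector<std::size_t>& neighbors(std::size_t v) const {
        return neighbors_[v];
    }

    // One list header per vertex plus one list entry per directed edge.
    std::size_t storageUnits() const {
        std::size_t total = neighbors_.size();
        for (const auto& list : neighbors_) total += list.size();
        return total;
    }

private:
    std::vector<std::vector<std::size_t>> neighbors_;
};
```

Counting one unit per list and one per stored neighbor gives a concrete number to compare against the quiz options. The trade-off against the matrix is that testing for a specific edge now requires scanning a list rather than indexing a cell.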

Searching a graph
• Find if a path exists between any two nodes
• Find the shortest path between any two nodes
• Find all nodes reachable from a given node

(Figure: an example graph on vertices V0 through V4.)

Generic Goals:
• Find everything that can be explored
• Don’t explore anything twice

Generic approach to graph search

(Figure: the example graph on vertices V0 through V4.)

Depth First Search for Graph Traversal


• Search as far down a single path as possible before backtracking

(Figure: an example graph on vertices V0 through V5.)

Assuming DFS chooses the lower-numbered node to explore first, in what order does DFS visit the nodes in this graph?
A. V0, V1, V2, V3, V4, V5
B. V0, V1, V3, V4, V2, V5
C. V0, V1, V3, V2, V4, V5
D. V0, V1, V2, V4, V5, V3
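A recursive depth-first traversal that prefers the lower-numbered neighbor can be sketched as follows. The function names are hypothetical, and the graph is assumed to be stored as adjacency lists with vertices numbered from 0. Any graph passed to it here is an example of your own choosing, since the edges of the slide's figure are not given in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<int>> Graph;  // adjacency lists, vertices from 0

// Visit v, then recursively visit each unseen neighbor, lowest number first.
void visit(const Graph& g, int v, std::vector<bool>& seen, std::vector<int>& order) {
    seen[v] = true;                       // never explore a vertex twice
    order.push_back(v);
    std::vector<int> next = g[v];
    std::sort(next.begin(), next.end());  // lower-numbered neighbor first
    for (int w : next) {
        if (!seen[w]) visit(g, w, seen, order);
    }
}

// Return the vertices reachable from `start`, in the order DFS first visits them.
std::vector<int> dfsOrder(const Graph& g, int start) {
    assert(start >= 0 && static_cast<std::size_t>(start) < g.size());
    std::vector<bool> seen(g.size(), false);
    std::vector<int> order;
    visit(g, start, seen, order);
    return order;
}
```

On a made-up graph with edges 0→1, 0→2, 1→3, 3→2, starting at 0, the traversal goes 0, 1, 3, 2: it follows the path through 1 and 3 as far as it can, reaches 2 that way, and only then backtracks to 0, where 2 has already been seen.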
Does DFS always find the shortest path between nodes?
A. Yes
B. No
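One way to explore this question is to compare the path DFS happens to find with the fewest-edges path on a small example. The sketch below uses breadth-first search, a technique these slides do not cover, purely as the reference for "fewest edges". All names are hypothetical and the three-vertex graph in the note afterwards is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

typedef std::vector<std::vector<int>> Graph;  // adjacency lists, vertices from 0

// Number of edges on the path DFS discovers from `from` to `to`
// (neighbors tried in list order), or -1 if `to` is unreachable.
int dfsPathLength(const Graph& g, int from, int to, std::vector<bool>& seen) {
    if (from == to) return 0;
    seen[from] = true;
    for (int w : g[from]) {
        if (seen[w]) continue;
        int rest = dfsPathLength(g, w, to, seen);
        if (rest >= 0) return rest + 1;   // first path found wins
    }
    return -1;
}

// Fewest edges from `from` to `to`, found level by level; -1 if unreachable.
int bfsPathLength(const Graph& g, int from, int to) {
    assert(static_cast<std::size_t>(from) < g.size());
    std::vector<int> dist(g.size(), -1);
    std::queue<int> frontier;
    dist[from] = 0;
    frontier.push(from);
    while (!frontier.empty()) {
        int v = frontier.front();
        frontier.pop();
        if (v == to) return dist[v];
        for (int w : g[v]) {
            if (dist[w] < 0) {            // not reached yet
                dist[w] = dist[v] + 1;
                frontier.push(w);
            }
        }
    }
    return -1;
}
```

On the graph with edges 0→1, 0→2, 1→2, DFS from 0 tries neighbor 1 first and reaches 2 along 0→1→2 (two edges), while the direct edge 0→2 gives a path of one edge.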
