
Data Structures and Algorithm Analysis: E-Mail

The document outlines a course on Data Structures and Algorithm Analysis taught by Dr. Yuhao Yi at SCU, covering essential topics such as data structures, algorithms, and their applications in C++. It includes information on course prerequisites, grading policies, and a detailed course outline with specific chapters and learning goals. Key objectives of the course are to understand the costs and benefits of data structures, learn commonly used data structures, and measure their effectiveness.


Data Structures and

Algorithm Analysis
Yuhao Yi
E-mail: [email protected]

Modified from Wenzheng Xu's Slides


1
Instructor
 Dr. Yuhao Yi, Researcher in CS @ SCU

 Teaching: Operating Systems, DSA (this course)

 Research Interests: Algorithms, Control and


Optimization, Social Networks, Graph Theory,
Robust ML.

 Personal Website: https://yhyi15.github.io/

2
Course Introduction
311116030: Data Structures and
Algorithm Analysis
Course page:
https://yhyi15.github.io/teaching/DSA
3-credit course
weeks 1-16, 12 lectures, 3 lab classes
Prerequisite courses:
Program design methodology (C)
Object-oriented programming (C++)
Discrete mathematics
Wechat
Textbook:
Data Structures and
Algorithm Analysis
in C++ Third Edition

Clifford A. Shaffer
You can buy the print
version online
The electronic PDF
version is at:
http://people.cs.vt.edu/~shaffer/Book/
There are some errors in the textbook,
which are corrected at
http://people.cs.vt.edu/~shaffer/Book/errata.html
Other useful references
Introduction to Algorithms. 3rd edition, T. H.
Cormen, C. E. Leiserson, R. L. Rivest, and
C. Stein, 2009.
Algorithms. S. Dasgupta, C. H.
Papadimitriou, U. Vazirani , 2006.
Useful online course videos from the
University of Florida:
http://www.cise.ufl.edu/academics/courses/preview/cop3530sahni/
Grade Policy-1

Total 100 points


Homework 36%
Course Project 15%
Lab classes 9%
Attendance 10%
Final Exam 30%
Grade Policy-2
Homework (36%)
 Exercises and practice are very important for this course
 6 homework assignments, each worth 6%
 Submit homework in electronic form only; paper copies are not accepted
 Filename format example: 1stHomework-2017521460109-RhomanSohag.doc
 No plagiarism!
Lab classes (9%)
 Experiments are the core of this course
 3 experiments, each worth 3%
 Some coding is done in class
 16 weeks in total, including the experimental classes
Attendance (10%)
Grade Policy-3
Homework submission deadline
Submit your homework to Yixuan Liu within one week after
it is assigned; contact:
[email protected]
Please submit on time. This matters: a student once
failed the final exam, but his homework score
kept him from failing the course.
What about late submissions?
 Maximum of 80% credit for homework up to 1 week late. No
further delays will be accepted.
We have 3 Experimental Classes

Experiments, i.e., coding, are very important


for this course
Bring your laptop when we have the
experimental class
Borrow one if you do not have a laptop
About this course

Data structures is a core course of


computer science.
Algorithms + Data Structures = Programming
It provides a rich context for the study of
problem-solving techniques and program
design and utilizes powerful programming
constructs and algorithms.
This course uses C++, whose classes and
OO constructs are well suited to
implementing data structures efficiently.
Goals of this Course
1. Reinforce the concept that costs and
benefits exist for every data structure.
2. Learn the commonly used data structures.
 These form a programmer's basic data
structure "toolkit".

3. Understand how to measure the cost of a


data structure or program.
 These techniques also allow you to judge the
merits of new data structures that you or others
might invent.
Course Summary-1
1. Learning the basic concepts of data
structures, including data objects, data
types, abstract data types, classes,
methods and implementation.
2. Learning to describe the methods of
abstract data types with C++, and learning
some important concepts in object-oriented
programming, such as encapsulation,
template function, inheritance, virtual
functions, polymorphism, etc.
Course Summary-2
3. Learning the concepts, declaration and
implementation, benefits, and costs of
lists, stacks, queues, trees, binary trees, hash
tables, graphs, etc.
4. Learning the basic searching and sorting
algorithms, such as sequential search,
binary search, tree search, and hash-based
search, as well as simple sorts, Quicksort,
Heapsort, and Mergesort.
Course Summary-3
5. Learning some important applications and
algorithms, such as Huffman coding,
Floyd's algorithm, and Dijkstra's algorithm.
6. Learning the performance criteria,
including space efficiency and
computational efficiency, and Big-O notation.
Using Big-O notation to evaluate the
algorithms learned in this
course.
Course Outline-1

Part 1 Preliminaries
Chap 1. Data Structures and Algorithms 2h
Chap 2. Mathematical Preliminaries 1h
Chap 3. Algorithm Analysis 3h
Course Outline-2

Part 2 Fundamental Data Structures


Chap 4. Lists, Stacks and Queues
9h: 6h for theoretical learning, 3h for
experiments
Declaration and implementation of array-based
and linked lists, stacks, and queues
Course Outline-3

Chap 5. Binary Trees 5h


Definition, Implementation and Traversals
Binary search Tree
Huffman Tree
Course Outline-4
Part 3 Sorting and Searching
Chap 7. Internal Sorting
9h: 4h for theoretical learning, 3h for
experiments
Insertion/bubble/selection/Shell/quick/merge/heap/radix sort
Course Outline-5
Chap 9. Searching 3h
Searching sorted arrays
Hashing
Binary search
Course Outline -6

Part 4 Applications and advanced topics


Chap 11. Graph
12h: 6h for theoretical learning, 3h for
experiments
Connected Components
Graph Class Implementation
Graph Traversals
Prim Algorithm
Dijkstra Algorithm
Course Outline -7

Chap 13. Advanced Tree Structures 3h


Tries
Balanced Trees
Spatial Data Structures
Chap 16. Intro to Patterns of Algorithms 3h
Dynamic Programming
Randomized Algorithms
Numerical Algorithms
Three goals of this class

Present the commonly used data


structures
Introduce the idea of tradeoffs and
reinforce the concept that there are costs
and benefits associated with every data
structure.
Teach how to measure the effectiveness
of a data structure or algorithm.
24
Solving a problem by computers

There are many approaches; how do we choose
between them? Two goals:
To design an algorithm that is easy to
understand, code, and debug.
To design an algorithm that makes efficient use
of the computer's resources.
This class mostly addresses the second goal.

25
The Need for Data Structures

Data structures organize data
⇒ more efficient programs.
More powerful computers ⇒ more
complex applications.
More complex applications demand more
calculations.

26
Organizing Data

Any organization for a collection of records


can be searched, processed in any order, or
modified.
The choice of data structure and algorithm
can make the difference between a program
running in a few seconds or many days.
E.g., sequential searching vs. binary search

27
Efficiency
A solution is said to be efficient if it solves the
problem within its resource constraints.
Space
Time
 The cost of a solution is the amount of resources
that the solution consumes.

 e.g.
Graphics rendering
Digital video analysis
Server applications
Communication applications
Selecting a Data Structure

Select a data structure as follows:


1. Analyze the problem to determine the
resource constraints a solution must
meet.
2. Determine the basic operations that must
be supported. Quantify the resource
constraints for each operation.
3. Select the data structure that best meets
these requirements.
29
Some Questions to Ask before Choosing
Data Structures
Are all data inserted into the data structure
at the beginning, or are insertions
interspersed with other operations?
Can data be deleted?
Are all data processed in some well-
defined order, or is random access
allowed?

30
Data Structure Philosophy

Each data structure has costs and benefits.


Rarely is one data structure better than
another in all situations.
A data structure requires:
space for each data item it stores,
time to perform each basic operation,
programming effort.

31
Data Structure Philosophy (cont.)

Each problem has constraints on available


space and time.
Only after a careful analysis of problem
characteristics can we know the best data
structure for the task.
Bank example:
Start account: a few minutes
Transactions: a few seconds
Close account: overnight

32
Goals of this Course
1. Reinforce the concept that costs and
benefits exist for every data structure.
2. Learn the commonly used data structures.
 These form a programmer's basic data
structure "toolkit".
3. Understand how to measure the cost of a
data structure or program.
 These techniques also allow you to judge the
merits of new data structures that you or
others might invent.
33
Abstract Data Types

Abstract Data Type (ADT): a definition for a


data type solely in terms of a set of values and
a set of operations on that data type.
Each ADT operation is defined by its inputs
and outputs.
Encapsulation: Hide implementation details.
Example 1: Integers: operations: +, -, x, /, mod
Example 2: Standard Template Library
34
Data Structure

A data structure is the physical


implementation of an ADT.
Each operation associated with the ADT is
implemented by one or more subroutines in the
implementation.
Data structure usually refers to an
organization for data in main memory.
File structure is an organization for data
on peripheral storage, such as a disk drive.
35
Metaphors

An ADT manages complexity through


abstraction: metaphor.
Hierarchies of labels

E.g., transistors → gates → CPU.


In a program, implement an ADT, then
think only about the ADT, not its
implementation.

36
Logical vs. Physical Form

Data items have both a logical and a


physical form.
Logical form: definition of the data item
within an ADT.
E.g., Integers in mathematical sense: +, -

Physical form: implementation of the data


item within a data structure.
Ex: 16/32 bit integers, overflow.

37
Data Type

ADT (type, operations) ↔ data items in logical form

Data structure (storage space, subroutines) ↔ data items in physical form

38
Example 1.8

A typical database-style project will have


many interacting parts.
Problems vs. algorithms vs. programs

Problem: a task to be performed.


Best thought of as inputs and matching outputs.
e.g., sort a set of numbers
Problem definition should include constraints on
the resources that may be consumed by any
acceptable solution.

40
Problems (cont.)

 Problems ⇔ mathematical functions
A function is a matching between inputs (the domain)
and outputs (the range).
An input to a function may be a single number or a
collection of information.
The values making up an input are called the
parameters of the function.
A particular input must always result in the same output
every time the function is computed.
 Mathematical functions are not exactly the same
as computer programs
41
Algorithms and Programs
Algorithm: a method or a process followed
to solve a problem.
A recipe.

An algorithm takes the input to a problem


(function) and transforms it to the output.
A mapping of input to output.

A problem can have many algorithms.

42
Algorithm Properties
An algorithm possesses the following five
properties:
It must be correct.
It must be composed of a series of concrete steps.
There can be no ambiguity as to which step will be
performed next.
It must be composed of a finite number of steps.
It must terminate.

A computer program is an instance, or
concrete representation, of an algorithm
in some programming language.
43
Incremental Development
How to fail at implementing your project:
1. Write the project
2. Debug the project
How to succeed:
1. Write the smallest possible kernel
2. Debug the kernel thoroughly
3. Repeat until completion:
i. Add a functional unit
ii. Debug the resulting program
4. Have a way to track details
Chapter 2 Mathematical Background

Set concepts and notation


Logarithms
Summations
Recursion and Recurrence Relations
Induction Proofs

45
Set

 A set is a collection of distinct elements or


members with the same type

 E.g., a set R consists of integers 3, 4, and 5,


then R={3,4,5}

 Cardinality: a measure of the number of


elements in a set
Two ways to represent a set

1) Enumeration: S={2,4,6,8,10}
2) Description: S={x | x is an even integer
and 0≤x≤10}
S={x | x is a positive even integer}
 The cardinality of S ?
Three characteristics of a set

1) Certainty: any element either is or is not in a
set; no ambiguity arises
2) Uniqueness: elements are distinct
3) No order
 Sets R={1, 2, 3} and S={3, 2, 1} are identical
Logarithm
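Definition: $\log_a N = b$ means $a^b = N$, where $a > 0$, $a \neq 1$, and $N > 0$
Where logarithms appear:
The minimum number of bits needed to encode N
distinct values is $\lceil \log_2 N \rceil$
The number of comparisons binary search needs
to find a particular value in an ordered
sequence of n values is at most $\lfloor \log_2 n \rfloor + 1$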
Summation
Recursion
An algorithm is recursive if it calls itself
to do part of its work.

Example:
 1. Compute n!
 2. Hanoi puzzle
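An example: the factorial function

$$\mathrm{fact}(n) = \begin{cases} 1 & n = 0 \\ n \cdot \mathrm{fact}(n-1) & n > 0 \end{cases}$$

int Fact(int n)
{
    int m;
    if (n == 0) return 1;      // base case: 0! = 1
    else
    {
        m = n * Fact(n - 1);   // recursive case: n! = n * (n-1)!
        return m;
    }
}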
Recursion-cont.

A recursive function contains two parts


 Base case, which can be solved easily,
e.g., fact(n)=1 if n=0.
Recursive case, which makes one or more
calls to itself on smaller problem
sizes, e.g., fact(n) = n * fact(n-1) if n > 0
Recursion-cont.

fact(n) = n * fact(n-1)
Recursive functions may be difficult to
understand. The key is as follows:
Do not think about how fact(n-1) executes.
Just assume that fact(n-1) returns the correct
result.
The ideas of abstraction and divide and conquer
are very useful, not only in algorithm design
but also in solving many problems in our lives.
Mathematical Proof

 Three templates for mathematical proof


1. Direct Proof
 Logic deduction
2. Proof by Contradiction
3. Proof by mathematical induction
Proof by contradiction

To show that P → Q, assume that P holds and Q does
not, and derive a contradiction; then P → Q holds.
(Proving the contrapositive, (not Q) → (not P), is a closely related technique.)
E.g., prove that there is no largest integer.
Proof:
Step 1: Assume that there is a largest integer; call it B.
Step 2: Consider C = B + 1. C is an integer and C > B,
so B is not the largest integer: a contradiction.
Therefore, the assumption that there is a largest integer is
incorrect.
Proof by induction

Prove S(n) = 1 + 2 + …+n = n(n+1)/2, for n ≥1


Proof:
1. Check the base case. S(1) = 1 = 1(1+1)/2 √
2. Assume the equation holds for n-1, i.e.,
S(n-1)=1+2+…+(n-1)=(n-1)(n-1+1)/2=(n-1)n/2
3. Consider the case for n.
S(n) = 1+2+…+(n-1)+n
= S(n-1) + n
= (n-1)n/2 + n, by the assumption
= n(n+1)/2 √
Estimation Techniques

Known as “back of the envelope” or


“back of the napkin” calculation
1. Determine the major parameters that affect the
problem.
2. Derive an equation that relates the parameters
to the problem.
3. Select values for the parameters, and apply
the equation to yield an estimated solution.

58
Estimation Example

How many library bookcases does it


take to store books totaling one
million pages?

Estimate:
 Pages/inch
 Feet/shelf
 Shelves/bookcase

59
Conclusion
The three goals of this course
The concepts of different data structures and
their costs and benefits
Adopt and design appropriate data structures
and algorithms for applications
Analyze the complexity of an algorithm/program
The definition of an abstract data type
Grasp the idea of abstraction
Hide implementation details
Mathematical background
Set, logarithm, summation, recursion, three
proof techniques
Homework 1

Exercises
1.2, 1.14
2.17, 2.20, 2.29, 2.33, 2.34, 2.47
 Deadline: TBA
