
BLG 336E

Analysis of Algorithms II
Lecture 1:
Logistics, and introduction

Slides partially based on Stanford's CS 161 course

1


The big questions
• Who are we?
• Professor, TAs, students?
• Why are we here?
• Why learn about algorithms?
• What is going on?
• What is this course about?
• Logistics?

2
3
Teaching Assistants
• Doğukan Arslan
• Enes Erdoğan
• Erdi Sarıtaş
• Mehmet Selahaddin Şentop
• Yusuf Kızılkaya

4
Who are you?

• Sophomores
• Juniors
• Seniors
• MS Students
• PhD Students

5
Why are we here?
• I’m here because I’m super excited about algorithms!

6
Why are you here?
• Algorithms are fundamental.
• Algorithms are useful.
• Algorithms are fun!
• BLG 336E is a required course.

Why is BLG 336E required?


• Algorithms are fundamental.
• Algorithms are useful.
• Algorithms are fun!
7
Algorithms are fundamental

[Diagram: "The Algorithmic Lens" connecting operating systems, security, compilers, networking, and computational biology.]
8
Algorithms are useful
• As inputs get bigger and bigger, having good algorithms becomes more and more important!
• Many companies base their interview questions on material from this course.

9
Algorithms are fun!
• Algorithm design is both an art and a science.
• Many surprises!
• Many exciting research questions!

10
What’s going on?
• Course goals/overview
• Logistics

11
Course goals
• The design and analysis of algorithms
• These go hand-in-hand

• In this course you will:


• Learn to think analytically about algorithms
• Flesh out an “algorithmic toolkit”
• Learn to communicate clearly about algorithms

12
Our guiding questions:

Does it work?
Is it fast?
Can I do better?

13
Our internal monologue…

Precision (detail-oriented, precise, rigorous):
"What exactly do we mean by better? And what about that corner case? Shouldn't we be zero-indexing?"

Intuition (big-picture, intuitive, hand-wavey):
"Dude, this is just like that other time. If you do the thing and the stuff like you did then, it'll totally work real fast!"

In the middle, the guiding questions: Does it work? Is it fast? Can I do better?

Both sides are necessary!

14


Aside: the bigger picture
• Does it work? Is it fast? Can I do better?
• Should it work? Should it be fast?

• We want to reduce crime.
• It would be more "efficient" to put cameras in everyone's homes/cars/etc.

• We want advertisements to reach the people to whom they are most relevant.
• It would be more "efficient" to make everyone's private data public.

• We want to design algorithms that work well, on average, in the population.
• It would be more "efficient" to focus on the majority population.
15
How to get the most out of lectures
• During lecture:
• Show up or tune in, ask questions.
• Engage with in-class questions.
• Before lecture:
• Prepare for the topic in advance
• After lecture:
• Go through the exercises and assignments.

• Do the reading
• either before or after lecture, whatever works best for you.
• do not wait to “catch up” the week before the exam.
16
Textbook
• "Algorithm Design" by Kleinberg and Tardos
• "CLRS": Introduction to Algorithms by Cormen, Leiserson, Rivest, and Stein

• CLRS and Algorithm Design are the reference books.
• We may also refer to the following (optional) resources:
• Stanford CS-161 course

17
Syllabus and Grading

• 3 Programming Projects (30%)
• 1 Midterm (30%)
• Final (40%)

Please note:
• VF condition: 15/50 (first 2 homeworks + midterm)
Week Date Topic

1 12-Feb Introduction. Some representative problems

2 19-Feb Stable Matching

3 26-Feb Basics of algorithm analysis.

4 4-Mar Graphs (Project 1 announced)

5 11-Mar Greedy algorithms-I

6 18-Mar Greedy algorithms-II

7 25-Mar Divide and conquer (Project 2 announced)

8 1-Apr Dynamic Programming I

9 15-Apr Dynamic Programming II

10 22-Apr Network Flow-I (Project 3 announced)

11 29/30-Apr Midterm

12 6-May Network Flow II

13 13-May NP and computational intractability-I

14 20-May NP and computational intractability-II


18
Applications
Wide range of applications.
• Caching.
• Compilers.
• Databases.
• Scheduling.
• Networking.
• Data analysis.
• Signal processing.
• Computer graphics.
• Scientific computing.
• Operations research.
• Artificial intelligence.
• Computational biology.
• ...

We focus on algorithms and techniques that are useful in practice.

19
Homework!
• First HW: 4th of March
• Use C++ and an object-oriented approach in your assignments.
• The goal is to practice your implementation skills, so copy-pasting is not encouraged.
• There will be explicit instructions on which files to submit, how they should be compiled, example cases, etc.
• Some examples will be provided so that you can test your program yourselves.
• However, your program will be graded based on how well it solves the given problem.
20
Talk to each other!
• Recitation sections:
• See Ninova for schedule (to be posted soon)
• Extra practice with the material, example problems, etc.
• Technically optional, but highly recommended!

Talk to us!
21
Roadmap
[Roadmap diagram: asymptotic analysis, recurrences, sorting, data structures, randomized algorithms, greedy algorithms, dynamic programming, graphs (longest, shortest, max and min ...). We are here: the start of the course.]
Contents
1. Introduction. Some representative problems.

2. Basics of algorithm analysis.

3. Graphs

4. Greedy algorithms

5. Divide and conquer

6. Dynamic programming

7. Network Flow

8. NP and computational intractability


Introduction, Some
representative problems
Stable matching problem
• How to bound the number of iterations of the algorithm
• How to prove that the solution returned is correct (stable)

Five representative problems:

• Interval Scheduling (greedy algorithm)
• Weighted Interval Scheduling (dynamic programming)
• Bipartite Matching (network flow problems, augmentation)
• Independent Set (NP-complete problem: a given solution can be checked in linear time, but finding a solution takes a lot of time)
• Competitive Facility Location (PSPACE-complete problem)
Basics of Algorithm Analysis
• Computational tractability = running efficiently = in polynomial time
• Asymptotic order of growth (O(·), Ω(·), Θ(·))
• Implementation of the Stable Matching Problem using arrays and lists
• Survey of common running times (linear, n log n, quadratic, cubic, O(n^k), beyond polynomial, sublinear)
• Implementation of the Stable Matching Problem using priority queues (heaps)
Graphs
• Definition and applications (transportation networks, communication networks, information networks, social networks, dependency networks)
• Paths and connectivity, trees
• Graph connectivity and graph traversal: Breadth-First Search (BFS), Depth-First Search (DFS)
• Implementation using queues and stacks (a minimal BFS sketch follows this list)
• Testing bipartiteness: an application of BFS
• Directed acyclic graphs, topological ordering
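
A minimal sketch of BFS with a queue, in C++ (the language required for the course assignments); the adjacency-list representation and the function name are illustrative choices, not from the lecture:

#include <queue>
#include <vector>

// Breadth-first search from source s over an adjacency-list graph.
// Returns the BFS distance (in edges) to each vertex, or -1 if unreachable.
std::vector<int> bfs(const std::vector<std::vector<int>>& adj, int s) {
    std::vector<int> dist(adj.size(), -1);
    std::queue<int> q;
    dist[s] = 0;
    q.push(s);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int v : adj[u]) {
            if (dist[v] == -1) {        // v not yet discovered
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}

Using a stack instead of a queue gives a depth-first traversal; checking afterwards whether any distance is still -1 tests connectivity, and the parity of the BFS layers gives the bipartiteness test mentioned above.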
Greedy Algorithms
• An algorithm that builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.
• Interval Scheduling: design and analysis (how to prove that a greedy algorithm produces an optimal solution; see the sketch after this list)
• Scheduling all intervals
• Scheduling to minimize lateness
• Optimal caching
• Shortest paths in a graph (Dijkstra)
• Minimum(-cost) spanning trees (Kruskal, Prim)
• Implementation of Kruskal's algorithm: the Union-Find data structure
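
As a concrete instance of the greedy pattern above, a minimal sketch of interval scheduling by the earliest-finish-time rule (the struct and function names are illustrative):

#include <algorithm>
#include <climits>
#include <vector>

struct Interval { int start, finish; };

// Greedy interval scheduling: sort by finish time and repeatedly take the
// next request that is compatible with everything chosen so far.
// Returns the size of the selected set.
int scheduleIntervals(std::vector<Interval> jobs) {
    std::sort(jobs.begin(), jobs.end(),
              [](const Interval& a, const Interval& b) { return a.finish < b.finish; });
    int count = 0;
    int lastFinish = INT_MIN;
    for (const Interval& j : jobs) {
        if (j.start >= lastFinish) {    // compatible with the current selection
            ++count;
            lastFinish = j.finish;
        }
    }
    return count;
}

The interesting part, covered in lecture, is proving that this myopic rule is in fact optimal.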
Divide and Conquer
• An algorithm that breaks the problem up into smaller subproblems, solves each part separately, and then combines the results. Analysis via recurrence relations.
• Mergesort (a minimal sketch follows this list)
• Recurrences
• Divide-and-conquer applications
• Counting inversions (collaborative filtering)
• Finding the closest pair of points
• Integer multiplication
• Convolutions and the FFT (maybe)
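
A minimal mergesort sketch, illustrating the pattern above (divide in half, solve each half recursively, combine by merging); names are illustrative:

#include <algorithm>
#include <vector>

// Merge the two sorted ranges a[lo..mid) and a[mid..hi) into a sorted a[lo..hi).
void merge(std::vector<int>& a, int lo, int mid, int hi) {
    std::vector<int> tmp;
    tmp.reserve(hi - lo);
    int i = lo, j = mid;
    while (i < mid && j < hi) tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i < mid) tmp.push_back(a[i++]);
    while (j < hi)  tmp.push_back(a[j++]);
    std::copy(tmp.begin(), tmp.end(), a.begin() + lo);
}

// Sort a[lo..hi).
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (hi - lo <= 1) return;           // base case: at most one element
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);              // divide and conquer each half
    mergeSort(a, mid, hi);
    merge(a, lo, mid, hi);              // combine
}

The recurrence T(n) = 2T(n/2) + O(n) gives O(n log n); the same merge step, with a counter for elements taken from the right half, counts inversions.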
Dynamic Programming
• Draws on the intuition behind divide and conquer, but is in a sense the opposite of the greedy strategy: explore the space of all possible solutions by carefully decomposing the problem into a series of subproblems, then build up solutions to larger and larger subproblems. Dynamic programming is better than brute-force search: it does not explore all possibilities, only the ones it needs to.
• Weighted Interval Scheduling
• Principles of dynamic programming
• Segmented least squares
• Subset sums and knapsacks (a minimal subset-sum sketch follows this list)
• Shortest paths in a graph
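
As a small example of the dynamic-programming idea (building up answers to subproblems instead of enumerating all subsets), a minimal subset-sum sketch; names are illustrative:

#include <vector>

// Decide whether some subset of the weights sums exactly to target.
// reachable[s] records whether sum s can be formed from the items processed so far.
bool subsetSum(const std::vector<int>& weights, int target) {
    std::vector<bool> reachable(target + 1, false);
    reachable[0] = true;                     // the empty subset sums to 0
    for (int w : weights)
        for (int s = target; s >= w; --s)    // downward, so each item is used at most once
            if (reachable[s - w]) reachable[s] = true;
    return reachable[target];
}

Brute force would examine 2^n subsets; here the table has only (target + 1) entries updated once per item.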
Network Flow
• Bipartite matching is a special case, but there are many diverse applications.
• The max-flow problem and the Ford-Fulkerson algorithm
• Maximum flows and minimum cuts
• Choosing good augmenting paths
• A first application: bipartite matching
• Disjoint paths in directed and undirected graphs
• Extensions to the max-flow problem
NP and Computational
Intractability
• NP-complete problems: a large set of problems for which we do not know an efficient (polynomial-time) solution. Once a solution is given, though, we can check whether it is correct in polynomial time.
• Polynomial-time reductions (of one problem to another)
• Satisfiability problem reductions
• Definition of NP
• NP-complete problems
How was the word
“algorithm” created?

32
Muhammad ibn Mūsā al-Khwārizmī (780–850)
The words 'algorithm' and 'algorism' come from the name al-Khwārizmī. Al-Khwārizmī (Persian: خوارزمی, 8th century) was a Persian mathematician, astronomer, geographer, and scholar in the House of Wisdom in Baghdad.

About 825, he wrote a treatise in the Arabic language, which was translated into Latin in the 12th century under the title Algoritmi de numero Indorum. This title means "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's Latinization of al-Khwārizmī's name. Al-Khwārizmī was the most widely read mathematician in Europe in the late Middle Ages, primarily through his book, the Algebra.

His name has taken on a special significance in computer science, where the word "algorithm" has come to refer to a method that can be used by a computer for the solution of a problem.

33
In Class discussion: what is an
algorithm?

• Program?
• Function?

34
A Picture of an Algorithm

Input(s) → Algorithm → Output(s)

35
First Algorithm
• It is in human nature to think algorithmically!
• Long before computers were invented, people could create
precise algorithms that could compute. One of the first
algorithms (computing the greatest common divisor) is 300
years older than the Bible! It was written by Euclid (~300 BC).

• Here is an idea:
Gcd(m,n) = n, if m mod n = 0 or
Gcd(n, (m mod n)) otherwise.

36
Gcd (54,24)

Use Euclid’s algorithm to determine the greatest common divisor of 54 and 24?

Gcd(m,n) = n, if m mod n = 0
or Gcd(n, (m mod n)) otherwise

37
Gcd (54,24)

Use Euclid’s algorithm to determine the greatest common divisor of 54 and 24?

Gcd(m,n) = n, if m mod n = 0
or Gcd(n, (m mod n)) otherwise

Solution:

Gcd(54,24) = ?
54 mod 24 = 6 (= 54 − 2 × 24), so Gcd(54,24) = Gcd(24,6)
Gcd(24,6) = 6 (since 24 mod 6 = 0), so Gcd(54,24) = 6

38
First Algorithm

“Homework”: implement the Euclidean algorithm in the language of your choice.

39
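
One possible answer to the "homework" above, as a minimal C++ sketch that follows the recursive definition on the previous slides (not an official solution):

#include <iostream>

// Euclid's algorithm: Gcd(m, n) = n if m mod n = 0, and Gcd(n, m mod n) otherwise.
int gcd(int m, int n) {
    return (m % n == 0) ? n : gcd(n, m % n);
}

int main() {
    std::cout << gcd(54, 24) << "\n";   // prints 6, matching the worked example
    return 0;
}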
First Algorithm (Euclid’s) =
function

input(s) → [function] → output(s)

40
How to think about an
algorithm differently?

41
Alan Turing. 1912 - 1954

Alan Turing was an English mathematician, logician, cryptanalyst, and computer scientist. He was influential in the development of computer science, provided a formalization of the concepts of algorithm and computation with the Turing machine, and played a significant role in the creation of the modern computer.

42
Turing Machine

A Turing machine is a kind of a state machine. At any time the machine is in any one of a finite
number of states.
A Turing machine has an infinite one-dimensional tape divided into cells. Traditionally we think
of the tape as being horizontal with the cells arranged in a left-right orientation. The tape has
one end, at the left say, and stretches infinitely far to the right. Each cell is able to contain one
symbol, ‘0’ or ‘1’.
The machine has a read-write head, which scans a single cell on the tape. This read-write head
can move left and right along the tape to scan successive cells.
The action of a Turing machine is determined completely by (1) the current state of the
machine (2) the symbol in the cell currently being scanned by the head and (3) a table of
transition rules, which serve as the “program” for the machine.

43
Turing Machine
Each transition rule is a 4-tuple:

〈 State0, Symbol, Statenext, Action 〉

which can be read as saying "if the machine is in state State0 and the current cell contains Symbol, then move into state Statenext, taking Action".

The actions available to a Turing machine are either to write a symbol on the tape in the current cell (which we will denote with the symbol in question), or to move the head one cell to the left or right.

If the machine reaches a situation in which there is not exactly one transition rule specified (i.e., none or more than one), then the machine halts.

In modern terms, the tape serves as the memory of the machine, while the read-write head is the memory bus through which data is accessed (and updated) by the machine.

Side notes:
• A function is Turing-computable if there exists a set of instructions that will result in the machine computing the function, regardless of the amount of time it takes.
• Although the device looks primitive, even very complex modern computers can perform *only* Turing-computable tasks!

44
45
Church-Turing thesis

Everything computable is
computable by a Turing Machine.

Alonzo Church
(1903–1995)

46
Turing’s Algorithm =
program
“a series of instructions that can be put into a
computer in order to make it perform an
operation”

47
Is an algorithm a function or a
program?

• It can be both:
1. Declarative/functional (Euclid’s way of thinking)
2. Imperative/ via machine instructions (Turing’s way of thinking)

48
What is an Algorithm?
All algorithms must satisfy the following criteria:
1. Input. Zero or more quantities are externally supplied.
2. Output. At least one quantity is produced.
3. Definiteness. Each instruction is clear and unambiguous. E.g., “add 6 or 7 to x”
is not permitted.
4. Finiteness. If we trace out the instructions of an algorithm, then for all cases,
the algorithm terminates after a finite number of steps. (Termination!)
We can also add “effectiveness”. Every instruction must be very basic so that it can
be carried out, in principle, by a person using only pencil and paper; it must be
feasible.

49
4 questions to ask about algorithms
• How to devise algorithms?
Creating an algorithm is an art which will never be fully automated.
During your studies, you can learn useful techniques.
• How to validate/verify algorithms?
Once an algorithm is devised, it is necessary to show that it computes a correct answer for all possible inputs —this is called
algorithm validation.
Once the validity of the method is shown, a program can be written. Then a program verification is needed.
• How to analyze algorithms?
Performance analysis determines how much computing time and storage an algorithm requires [in best case, in worst case,
and in average].
• How to test a program? Testing a program has two phases – debugging and profiling.
“Debugging can only point to the presence of errors, and not to their absence!” A proof of correctness is much more certain.
Profiling is the process of executing a correct program on data sets and measuring the time and space required.

50
How do we know that an
algorithm is correct?

51
52
Shortest travel distance in a graph
The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the
distances between each pair of cities, what is the shortest possible route that visits each city exactly once and
returns to the origin city?"

Stop right now and think up an algorithm to solve this problem (make the travelling salesman happy!).

53
Nearest Neighbour Tour

• A popular heuristic starts at some point p0, walks to its nearest unvisited neighbor p1, then repeats from p1, and so on, until all points have been visited (sketched below).
54
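
A minimal C++ sketch of the nearest-neighbour heuristic just described (the Point type, distance function, and names are illustrative); as the next slides show, it can produce poor tours on some inputs:

#include <cmath>
#include <vector>

struct Point { double x, y; };

double dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Start at point 0 and repeatedly walk to the closest point not yet visited.
std::vector<int> nearestNeighbourTour(const std::vector<Point>& pts) {
    std::vector<int> tour{0};
    std::vector<bool> visited(pts.size(), false);
    visited[0] = true;
    int current = 0;
    for (std::size_t step = 1; step < pts.size(); ++step) {
        int best = -1;
        for (std::size_t j = 0; j < pts.size(); ++j) {
            if (!visited[j] &&
                (best == -1 || dist(pts[current], pts[j]) < dist(pts[current], pts[best])))
                best = static_cast<int>(j);
        }
        visited[best] = true;
        tour.push_back(best);
        current = best;                 // the tour closes back to point 0 at the end
    }
    return tour;
}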
Is this solution correct?
Let us try another example …

55
Nearest Neighbor Algorithm works
perfectly for this case too! ☺

56
To make sure that NN algorithm is
a correct solution to the TSP
problem, we need to keep testing
it on different examples.

57
What about this example?

0 is our starting
point.

58
Closest Pair Tour
• Another idea is to repeatedly connect the closest pair of points whose
connection will not cause a cycle or a three-way branch, until all points are in
one tour.

59
What about this example?

A good
solution

60
What about this example? Does
the Closest Pair Tour work on this?

Although it works correctly on the previous example, other data causes trouble.

61
What about this example? Does
the Closest Pair Tour work on this?

A good
solution

Although it works correctly on the previous example, other data causes trouble.

62
Does a GOOD algorithm
for the problem exist?

63
A Correct Algorithm: Exhaustive
Search
• We could try all possible orderings of the points, then select the
one which minimizes the total length:

• Since all possible orderings are considered, we are guaranteed to


end up with the shortest possible tour.
Factorial function:
n! = 1*2*3*...*n (e.g. 3! = 1*2*3 = 6)

64
Exhaustive Search is Slow!
Because it tries all n! permutations, it is extremely slow to use when there are more
than 10-20 points.
The fastest computer in the world couldn’t hope to enumerate all the
20! =2,432,902,008,176,640,000 orderings of 20 points within a day.
[For n=1000, will not be achieved in your lifetime!!!]
No algorithm that is both efficient and correct is known for the travelling salesman problem.
William Rowan Hamilton
(1805–1865)
Irish mathematician

• The problem of visiting each point exactly once in a closed tour is also known as the "Hamiltonian cycle" problem.

65
66
Searching for
counterexamples is the best
way to disprove the
correctness of an algorithm.
If the algorithm works in
some cases and fails in
others, it is generally called a
heuristic.
"In which cases might my algorithm fail?" 67
Think algorithmically ☺
Top interview questions on algorithms

• Sorting (plus searching / binary search)
• Divide-and-conquer
• Dynamic programming /
memoization
• Greediness
• Recursion
• Algorithms associated with a
specific data structure
68
A sample of interview questions
• How do you find the middle element of a linked list in a single pass? (a minimal sketch follows this list)
• Write a Java program to sort an array using the Bubble Sort algorithm.
• How do you reverse a linked list using recursion and iteration?
• Design an algorithm to find all the common elements in two sorted lists of numbers.

69
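
A minimal sketch for the first question above, using the classic slow/fast two-pointer idea so the list is traversed only once (the Node type is illustrative):

struct Node {
    int value;
    Node* next;
};

// Returns the middle node of a singly linked list in a single pass:
// fast advances two steps for every step of slow, so when fast reaches
// the end, slow is at the middle.
Node* middle(Node* head) {
    Node* slow = head;
    Node* fast = head;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
    }
    return slow;
}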
Sorting Problem
Input: A sequence of n numbers <a1, a2, ..., an>
Output: A permutation (reordering) <a1', a2', ..., an'> such that a1' <= a2' <= ... <= an'

70
Insertion Sort
• Simple algorithm
• Basic idea (a minimal C++ sketch follows this slide):
• Assume the first j−1 elements are sorted
• Until you find the place to insert the jth element, move array elements to the right
• Copy the jth element into its place
• Insertion Sort is an "in place" sorting algorithm. No extra storage is required.

71
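
A minimal C++ sketch of insertion sort as described on the previous slide (shift larger elements right until the jth element's place is found):

#include <vector>

// Insertion sort: for each j, insert a[j] into the already-sorted prefix a[0..j-1].
// Sorts in place; no extra storage beyond the key being inserted.
void insertionSort(std::vector<int>& a) {
    for (std::size_t j = 1; j < a.size(); ++j) {
        int key = a[j];
        std::size_t i = j;
        while (i > 0 && a[i - 1] > key) {   // move larger elements one position right
            a[i] = a[i - 1];
            --i;
        }
        a[i] = key;                          // copy the jth element into its place
    }
}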
Insertion Sort Example

72
Insertion Sort Example

73
Insertion Sort Example

74
Insertion Sort

75
Pseudocode Conventions
• Indentation
• indicates block structure
• saves space and writing time
• Looping constructs (while, for, repeat) and conditional
constructs (if, then, else)
• like in C, C++, and Java
• we assume that the loop variable in a for loop is still defined when the loop exits
• Multiple assignment i ← j ← e assigns the value of e to both variables i and j (equivalent to j ← e, i ← j)
• Variables are local, unless otherwise specified

76
Pseudocode Conventions
• Array elements are accessed by specifying array name
followed by index in square brackets
• A[i] indicates ith element of array A
• Notation ".." is used to indicate a range of values within an array (A[i..j] = A[i], A[i+1], …, A[j])
• We often use objects, which have attributes (equivalently,
fields)
• For an attribute attr of object x, we write attr[x]
• Equivalent of x.attr in Java or x->attr in C++
• Objects are treated as references, like in Java
• If x and y denote objects, then assignment y ← x makes x and y
reference same object
• It does not cause attributes of one object to be copied to another

77
Pseudocode Conventions
• Parameters are passed by value, as in Java and C (and the
default mechanism in C++).
• When an object is passed by value, it is actually a reference (or
pointer) that is passed
• Changes to the reference itself are not seen by caller, but changes
to the object’s attributes are
• Boolean operators “and” and “or” are short-circuiting
• If after evaluating left-hand operand, we know result of expression,
then we do not evaluate right-hand operand
• If x is FALSE in “x and y”, then we do not evaluate y
• If x is TRUE in “x or y”, then we do not evaluate y

78
Efficiency
• Correctness alone is not sufficient
• Brute-force algorithms exist for most problems
• To sort n numbers, we can enumerate all
permutations of these numbers and test which
permutation has the correct order
• Why cannot we do this?
• Too slow!
• By what standard?

79
How to measure complexity?
• Accurate running time is not a good measure
• It depends on input
• It depends on the machine you used and who
implemented the algorithm

• We would like to have an analysis that does not


depend on those factors

80
Machine-independent
• A generic uniprocessor random-access machine
(RAM) model
• No concurrent operations
• Each simple operation (e.g. +, -, =, *, if, for) takes 1 step.
• Loops and subroutine calls are not simple operations.
• All memory equally expensive to access
• Constant word size
• Unless we are explicitly manipulating bits
• No memory hierarchy (caches, virtual memory) is modeled

81
Running Time
• Running time T(n): the number of primitive operations or steps executed for an input of size n.
• Running time depends on input
• already sorted sequence is easier to sort
• Parameterize running time by size of input
• short sequences are easier to sort than long ones
• Generally, we seek upper bounds on running time
• everybody likes a guarantee

82
Kinds of Analysis
• Worst-case: (usually)
• T(n) = maximum time of algorithm on any input of size n
• Average-case: (sometimes)
• T(n) = expected time of algorithm over all inputs of size n
• Need assumption about statistical distribution of inputs
• Best-case: (bogus)
• Cheat with a slow algorithm that works fast on some input

83
Analyzing Insertion Sort
• T(n) = c1·n + c2·(n−1) + c3·(n−1) + c4·S + c5·(S − (n−1)) + c6·(S − (n−1)) + c7·(n−1)
       = c8·S + c9·n + c10, where S = Σj tj (tj = number of inner-loop tests when inserting element j)
• What can S be?
• Best case: inner loop body never executed
• tj = 1 ⇒ S = n − 1
• T(n) = an + b, a linear function
• Worst case: inner loop body executed for all previous elements
• tj = j ⇒ S = 2 + 3 + … + n = n(n+1)/2 − 1
• T(n) = an² + bn + c, a quadratic function
• Average case
• We can assume that, on average, A[j] is inserted into the middle of A[1..j−1], so tj = j/2
• S ≈ n(n+1)/4
• T(n) is still a quadratic function
84
Insertion Sort Running Time
(Theta notation: see next week.)

• Best-case: Θ(n) (input already sorted; inner loop not executed at all)
• Worst-case: Θ(n²) (input reverse sorted) [arithmetic series]
• Average-case: Θ(n²) (all permutations equally likely)

Is Insertion Sort a fast sorting algorithm?

• Moderately so, for small n
• Not at all, for large n

85
Asymptotic Analysis
• Ignore actual and abstract statement costs
• Order of growth is the interesting measure:
• Highest-order term is what counts
• As the input size grows larger it is the high order term that
dominates

86
Next Week
• Stable Matching

Week Date Topic
1 12-Feb Introduction. Some representative problems
2 19-Feb Stable Matching
3 26-Feb Basics of algorithm analysis.
4 4-Mar Graphs (Project 1 announced)
5 11-Mar Greedy algorithms-I
6 18-Mar Greedy algorithms-II
7 25-Mar Divide and conquer (Project 2 announced)
8 1-Apr Dynamic Programming I
9 15-Apr Dynamic Programming II
10 22-Apr Network Flow-I (Project 3 announced)
11 29/30-Apr Midterm
12 6-May Network Flow II
13 13-May NP and computational intractability-I
14 20-May NP and computational intractability-II

87
