CSIT571 - 01SP23 - Module - 01 Introduction To Algorithms


INTRODUCTION TO ALGORITHMS

January 20, 2023

Professor: Maggie Zhang

ATTENTION: distribution outside this group of students is NOT allowed.


A Dream Job and $1,000,000 Prize?

THE DEPARTMENT OF COMPUTER SCIENCE


REALITY OF WHY ALGORITHMS
• Important for all other branches of computer science (CS)
• Database/Operating System/Machine Learning/Networking
• Plays a critical role in modern technological innovation
• Provides a novel “lens” on processes outside of CS and
technology
• Quantum mechanics, economic markets, evolution
• Challenging
• good for the brain! And good for getting in FLAAG!
• Fun (my favorite part)



REALITY OF WHY ALGORITHMS

“Everyone knows Moore’s Law – a prediction made in 1965 by Intel
co-founder Gordon Moore that the density of transistors in integrated
circuits would continue to double every 1 to 2 years....in many areas,
performance gains due to improvements in algorithms have vastly
exceeded even the dramatic performance gains due to increased
processor speed.”
https://lazowska.cs.washington.edu/nitrd/



TOPICS THAT WE WILL COVER IN CSIT571
• Vocabulary for design and analysis of algorithms
• E.g., “Big-Oh” notation
• “sweet spot” for high-level reasoning about algorithms
• Divide and conquer algorithm design paradigm
• Will apply to: Integer multiplication, sorting, matrix multiplication, closest pair
• General analysis methods (“Master Method/Theorem”)
• Randomization in algorithm design
• Will apply to: QuickSort, primality testing, graph partitioning, hashing.
• Primitives for reasoning about graphs
• Connectivity information, shortest paths, structure of information and social
networks.
• Use and implementation of data structures
• Heaps, balanced binary search trees, hashing and some variants (e.g., Bloom filters)



ADVANCED TOPICS THAT WE MAY NOT COVER IN CSIT571

• Greedy algorithm design paradigm
• Dynamic programming algorithm design paradigm
• NP-complete problems and what to do about them
• Fast heuristics with provable guarantees
• Fast exact algorithms for special cases
• Exact algorithms that beat brute-force search



SKILLS THAT YOU WILL LEARN IN CSIT571

• Become a better programmer


• Sharpen your mathematical and analytical skills
• Start “thinking algorithmically”
• Literacy with computer science’s “greatest hits”
• Ace your technical interviews



ARE YOU PERFECT FOR CSIT571?

• It doesn’t really matter.
• Ideally, you know some programming.
• Doesn’t matter which language(s) you know.
• Some (perhaps rusty) mathematical experience.
• Basic discrete math, proofs by induction, etc.
• Excellent free reference: “Mathematics for Computer
Science”, by Eric Lehman and Tom Leighton. (Easy to
find on the Web.)



MATERIALS FOR CSIT571

• Syllabus (Your guidance to survive) & Slides.
• Textbook: Introduction to Algorithms (Third Edition), by
Cormen, Leiserson, Rivest, and Stein [Recommended,
designated]. A few of the many other good ones:
• Kleinberg/Tardos, Algorithm Design, 2005.
• Dasgupta/Papadimitriou/Vazirani, Algorithms, 2006.
• Mehlhorn/Sanders, Algorithms and Data Structures: The
Basic Toolbox, 2008.
• No specific development environment required.
• But you should be able to write and execute programs.



COLLECTION VS SEQUENCE VS SET
• Collection
• Can be a sequence or a set
• Set
• The order of its elements does not matter
• Sequence
• The order matters
• Examples
Are {1, 2, 3} and {2, 1, 3} the same as sets? And as sequences?
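The question can be checked directly. A minimal sketch, assuming Python's built-in `set` stands in for a set and a `tuple` stands in for a sequence (this mapping is an illustration, not part of the slides):

```python
# Sets ignore element order; sequences do not.
as_sets_equal = {1, 2, 3} == {2, 1, 3}
as_sequences_equal = (1, 2, 3) == (2, 1, 3)

print(as_sets_equal)       # True: same elements, order is irrelevant
print(as_sequences_equal)  # False: same elements, different order
```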



LOGARITHM
• lg n
• log base 2 of n
• log n
• log base 10 of n
• ln n
• log base e of n



DATA TYPE
• The set of values the variable takes
• Built-in (primitive or elementary)
• char
• int
• double
• Composite
• struct
• class
• int[]
• Programming language
• Strongly-typed language
• Weakly-typed language



DATA MODEL
• An abstraction that describes how data are represented and used.
• Characters
• Integers of different sizes
• Floating-point numbers of different sizes
• Array
• Classes
• Structures



ABSTRACT DATA TYPE
• A mathematical model and a collection of operations defined on this
model
• Dictionary
• Insert
• Delete
• Search
• Priority queue
• Insert
• ExtractHigh
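To make the ADT idea concrete, here is a small sketch of the priority queue with its two operations. The class name `MaxPriorityQueue`, the method names, and the choice of a binary heap via Python's `heapq` module are assumptions made for illustration; the ADT itself only fixes the operations, not the representation. Keys are assumed to be numbers.

```python
import heapq


class MaxPriorityQueue:
    """Priority queue ADT with Insert and ExtractHigh, backed by a binary heap."""

    def __init__(self):
        # heapq implements a min-heap, so keys are stored negated.
        self._heap = []

    def insert(self, key):
        heapq.heappush(self._heap, -key)

    def extract_high(self):
        # Removes and returns the largest key; raises IndexError if empty.
        return -heapq.heappop(self._heap)


pq = MaxPriorityQueue()
for key in (5, 1, 8, 3):
    pq.insert(key)
print(pq.extract_high())  # 8
print(pq.extract_high())  # 5
```

A sorted list or an unsorted list could implement the same two operations; only the running times would differ.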



DATA STRUCTURE
• A representation of the mathematical model underlying an ADT.
• A systematic way of organizing and accessing data in a computer.

• Question to you: Does it matter what data structures we use?



COMPUTATIONAL PROBLEM
• A computational problem has an input (a set of values) and an output
(a set of values).
• Question to you: Can you define some computational problems?
• Examples:
Input: two numbers
Output: the sum of the two numbers

Input: two n-bit non-negative integers a, b
Output: an (n+1)-bit integer c such that c = a + b
Problem: Addition of bounded-size non-negative integers



INTEGER MULTIPLICATION
Input: two n-digit numbers x and y
Output: the product x*y
(Follow this pattern when stating a problem!)
“Primitive Operation” – add or multiply 2 single-digit numbers

Grade-School Algorithm
Kids care about getting correct answers. We care about running time!

      5678
   x  1234
   -------
     22712      roughly n to 2n basic operations per row (up to a constant)
    17034
   11356
   5678
   -------
   7006652      # of operations overall ~ constant * $n^2$
Problem: Multiplication of bounded-size non-negative integers
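The grade-school algorithm can be sketched in code together with an operation counter. This is an illustration only: the function name `grade_school_multiply`, the string-based digit representation, and the convention of charging 3 primitive operations per digit pair (one single-digit multiply and two additions) are assumptions, not something the slides fix.

```python
def grade_school_multiply(x: str, y: str):
    """Multiply two non-negative decimal strings digit by digit.

    Returns (product_as_string, ops), where ops charges 3 primitive
    operations per pair of digits (an assumed counting convention).
    """
    xs = [int(ch) for ch in reversed(x)]  # least significant digit first
    ys = [int(ch) for ch in reversed(y)]
    result = [0] * (len(xs) + len(ys))
    ops = 0
    for j, yd in enumerate(ys):           # one row per digit of y
        carry = 0
        for i, xd in enumerate(xs):
            total = result[i + j] + xd * yd + carry
            ops += 3
            result[i + j] = total % 10
            carry = total // 10
        result[len(xs) + j] += carry
    digits = "".join(str(d) for d in reversed(result)).lstrip("0")
    return digits or "0", ops


print(grade_school_multiply("5678", "1234"))  # ('7006652', 48)
```

With two n-digit inputs the inner loop body runs n * n times, which is where the "constant * n^2" count comes from.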
ALGORITHMS
• a well-defined sequence of computational steps that performs a task
by generating a set of output values from a set of input values.
• a “method” that transforms an input into an output.

• A sorted sequence is an increasing sequence or non-decreasing
sequence or monotonically-increasing sequence such as {1, 2, 3, 3, 4}.
• A reverse-sorted sequence is a decreasing / non-increasing /
monotonically-decreasing sequence such as {4, 3, 3, 2, 1}.



CORRECT ALGORITHMS
• An algorithm for a problem is “correct” if, for every input instance,
it halts with the correct output.
• An input instance of the problem consists of all the inputs needed to
compute a solution to the problem.



CAN WE DO BETTER?
Algorithm Designer’s Mindset: Always ask the question – can we design
a better algorithm than the “obvious” method?

“Perhaps the most important principle for the good algorithm designer
is to refuse to be content.”



KARATSUBA MULTIPLICATION & BEYOND
Input: two n-digit numbers x and y
Output: product x*y
x = 5678: a = 56, b = 78
y = 1234: c = 12, d = 34
Step 1: a x c = 672
Step 2: b x d = 2652
Step 3: (a+b)(c+d) = 134 x 46 = 6164
Step 4: (3) – (2) – (1) = 2840
Step 5:
   6720000
      2652
    284000
   -------
   7006652 = (1234)(5678)

Problem: Multiplication of bounded-size non-negative integers


DICTIONARY
• an ADT with data consisting of words (aka strings of characters) and
a few operations involving insertion, deletion and look-up of a word.
• Insert: an operation that inserts a word into the dictionary thus increasing its
size by one.
• Search: an operation that searches for a word in an already formed
dictionary.
• Delete: an operation that deletes a specified word from the dictionary thus
reducing its size by one.



STATIC VS DYNAMIC
• a static dictionary is one that does not support operations Insert and
Delete. Only a Search or similar operations are allowed.
• “probing operations” that do not modify the data structure
• Search
• Minimum
• Maximum
• Predecessor
• Successor
• A dynamic dictionary is a data structure that supports operations Insert
and Delete in addition to Search. Its size can thus change.
• Data structures: array (linear search/binary search), sorted array,
(singly/doubly) linked list, sorted linked list, binary search tree.
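A static dictionary over a sorted array can be sketched as follows. The class name `StaticDictionary` and its method names are hypothetical; the sketch assumes the words are fixed at construction time, so only probing operations are offered, each using binary search through Python's standard `bisect` module.

```python
from bisect import bisect_left, bisect_right


class StaticDictionary:
    """Probing-only dictionary stored as a sorted array of distinct words."""

    def __init__(self, words):
        self._words = sorted(set(words))

    def search(self, word):
        i = bisect_left(self._words, word)
        return i < len(self._words) and self._words[i] == word

    def minimum(self):
        return self._words[0]

    def maximum(self):
        return self._words[-1]

    def successor(self, word):
        # Smallest stored word strictly greater than `word`, or None.
        i = bisect_right(self._words, word)
        return self._words[i] if i < len(self._words) else None

    def predecessor(self, word):
        # Largest stored word strictly smaller than `word`, or None.
        i = bisect_left(self._words, word)
        return self._words[i - 1] if i > 0 else None
```

Supporting Insert and Delete on the same sorted array would require shifting elements, which is one reason a dynamic dictionary often uses a linked structure or a search tree instead.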



PSEUDOCODE
• Describes an algorithm
• Includes free-form English to describe more complex interactions.
• Black-box algorithm invocation.
• Array indexing
• Does not need to compile

Purpose of this course: introduce concepts around algorithms.



MODEL OF COMPUTATION
• Random-Access Machine (RAM)
• show that if an algorithm uses A computation resources under the RAM it
would also require at least A under more realistic models.
• WORD and BIT
• WORD - where addition of integer a and integer b takes unit time i.e. one
step, i.e. one operation (thus RAM is the Word model)
• BIT - where the addition of a and b would take time proportional to the
number of bits of a and the number of bits of b.
• Straight Line Program (SLP)



PROBLEM SIZE
• The value that would be used in determining the running time
and/or space resources used by the algorithm.
• Both space and time are to be expressed as functions of this value
• Problem size: n
• A positive integer (a natural, non-zero number)
• MOST IMPORTANT parameter
• T(n) or S(n)
• vs Input Size
• the number of elements of the input or the size of all the input.
• unit of measurement



RESOURCES
• TIME
• How long an algorithm runs to completion
• T(n)
• SPACE
• How much memory/space the algorithm uses.
• S(n)
• Big-O: S(n) = O(n) – upper bounded by a linear function of n
• Big-Omega: S(n) = Ω(n) – lower bounded by a linear function of n
• Theta: S(n) = ϴ(n) – linear in n
• Question: how would you measure TIME and SPACE?
• Precise or not?
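For reference, the three bounds above can be stated precisely. These are the standard textbook definitions specialized to a linear bound; the constants $c$ and $n_0$ are the usual existential witnesses and are not named on the slide.

```latex
\begin{align*}
S(n) = O(n)      &\iff \exists\, c > 0,\ n_0 \ge 1 \text{ such that } S(n) \le c\,n \text{ for all } n \ge n_0,\\
S(n) = \Omega(n) &\iff \exists\, c > 0,\ n_0 \ge 1 \text{ such that } S(n) \ge c\,n \text{ for all } n \ge n_0,\\
S(n) = \Theta(n) &\iff S(n) = O(n) \text{ and } S(n) = \Omega(n).
\end{align*}
```

For example, $S(n) = 3n + 5$ is $\Theta(n)$, because $3n \le 3n + 5 \le 4n$ for all $n \ge 5$.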



COMPARISON
Algorithm A and B:

Asymptotic Comparison



INCREMENTAL TECHNIQUE
• iteration/iterative
• generates the output incrementally, one element at a time, or walks
through the input and generates a solution one input element at a
time.
• n-th Fibonacci Function
Input: An integer n ≥ 0
Output: Integer $F_n$, the n-th Fibonacci number: $F_n = F_{n-1} + F_{n-2}$ if $n > 1$,
where $F_0 = 0$ and $F_1 = 1$
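The incremental approach to the Fibonacci problem can be sketched as below. The function name `fibonacci` and the decision to keep the whole table of values are assumptions for illustration; the slides do not show the program itself.

```python
def fibonacci(n: int) -> int:
    """Return F_n, building the table F_0, F_1, ..., F_n one entry at a time."""
    if n < 0:
        raise ValueError("n must be non-negative")
    table = [0, 1]                      # F_0 and F_1
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]


print([fibonacci(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Keeping the full table uses space linear in n. Since each step only reads the last two entries, a variant that keeps just two variables would need only a constant number of extra words.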



FIBONACCI
• SPACE
• ϴ(n)
• TIME
• ϴ(n)

• Let’s exercise!



RUNNING TIME
• Worst-case scenario
• Bound
• FIBONACCI: the naive incremental approach computes $F_n$ in linear
running time.
• Recursion/Recursive
• Direct
• Indirect



KARATSUBA’S RECURSION
Write $x = 10^{n/2} a + b$ and $y = 10^{n/2} c + d$,
where a, b, c, d are n/2-digit numbers
[example: a=56, b=78, c=12, d=34]

Then $x \cdot y = (10^{n/2} a + b)(10^{n/2} c + d) = 10^n ac + 10^{n/2}(ad + bc) + bd$   (*)
(simple base case omitted)

Idea: Recursively compute ac, ad, bc, bd, then compute (*) in the
obvious way.

Problem: Multiplication of bounded-size non-negative integers


KARATSUBA MULTIPLICATION
Remember $x \cdot y = 10^n ac + 10^{n/2}(ad + bc) + bd$

Idea: Reduce the number of recursive multiplications to 3!

1. Recursively compute ac
2. Recursively compute bd
3. Recursively compute (a+b)(c+d) = ac + ad + bc + bd

Gauss’ Trick: (3)-(1)-(2) = ad + bc


Upshot: only need 3 recursive multiplications (and some additions)

Question: which is the fastest algorithm?


Problem: Multiplication of bounded-size non-negative integers
KARATSUBA’S IMPLEMENTATION

Problem: Multiplication of bounded-size non-negative integers
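The content of this slide did not survive extraction, so what follows is only a sketch of how the three-multiplication recursion might be implemented, not the original code. The function name `karatsuba`, the base case (a single-digit operand), and splitting on half the digit count of the longer operand are assumptions.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using three recursive products."""
    if x < 10 or y < 10:                 # base case: a single-digit operand
        return x * y
    n = max(len(str(x)), len(str(y)))
    p = 10 ** (n // 2)                   # split point 10^(n/2)
    a, b = divmod(x, p)                  # x = a * p + b
    c, d = divmod(y, p)                  # y = c * p + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # Gauss' trick: (a+b)(c+d) - ac - bd = ad + bc
    cross = karatsuba(a + b, c + d) - ac - bd
    return ac * p * p + cross * p + bd


print(karatsuba(5678, 1234))  # 7006652
```

For 5678 and 1234 the intermediate values are ac = 672, bd = 2652 and cross = 6164 - 672 - 2652 = 2840, matching the worked example earlier in the deck.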


RECURSION
• Recursion/Recursive
• Direct
• Indirect

• Question: Can you improve the previous Fibonacci program?



DIVIDE AND CONQUER

• DIVIDE the problem into two or more sub-problems
• CONQUER those sub-problems recursively, and then COMBINE their solutions
• Easy to design
• Less efficient due to resource costs



FIBONACCI EXERCISE

a) Show that $F_n \le 2^n$ (or $F_n \le 2^{n-1}$) by using induction (also, $F_n \ge 2^{n/2-1}$).

b) Find the exact solution for $F_n$ using Discrete Math techniques.

c) What is the number of function (recursive) calls in the recursive solution?

d) Show that $T(n) \ge 2^{n/2-1}$



DYNAMIC PROGRAMMING
• Divide-and-conquer with additional storage for the results of sub-problems.

• Reduce duplication

• Memoization
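Memoization can be illustrated on the Fibonacci recursion. This is a sketch under assumptions: the names `fib_naive` and `fib_memo` are hypothetical, the call counter is added only to make the duplicated work visible, and `functools.lru_cache` from the standard library is used as the cache.

```python
from functools import lru_cache


def fib_naive(n, counter):
    """Plain recursion; counter[0] records how many calls are made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_naive(n - 1, counter) + fib_naive(n - 2, counter)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursion, but each F_k is computed once and then looked up."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


calls = [0]
print(fib_naive(20, calls))  # 6765
print(calls[0])              # number of calls made by the plain recursion
print(fib_memo(20))          # 6765
```

The memoized version trades extra storage (one cached value per distinct argument) for the removal of repeated sub-problem evaluations.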



SORTING
• Insertion Sort
• Merge Sort
• Selection Sort
• Bubble Sort
• Maximum and Minimum

Input: array of n numbers, unsorted. (Assume distinct or not)
5 4 1 8 7 2 6 3
Output: Same numbers, sorted in increasing order.
1 2 3 4 5 6 7 8



INSERTION SORT
• Incremental Algorithm

• Sort in place
• A sort is in place if the same space is used for input and output and the
extra memory used by the algorithm is constant.

• Stable sort
• A sort is stable if the relative order of same-valued keys remains the same
in the input and output sequences.



RUNNING TIME
• n sorted keys
• $\Theta(n)$

• n reverse-sorted keys
• $\Theta(n^2)$

• Lower bound and upper bound for the number of comparisons
• n − 1 ≤ Comparison(n) ≤ n(n − 1)/2.

• Each pseudocode line i costs a constant $c_i$; its exact value does not matter

• Best case/Worst Case/Average Case
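The comparison bounds for insertion sort can be observed with a small sketch. The function name `insertion_sort` and the choice to return the comparison count are assumptions for illustration; the slides do not show their pseudocode here.

```python
def insertion_sort(keys):
    """Sort a list in place; return the number of key comparisons made."""
    comparisons = 0
    for i in range(1, len(keys)):
        key = keys[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if keys[j] <= key:           # equal keys are not moved: stable
                break
            keys[j + 1] = keys[j]        # shift the larger key to the right
            j -= 1
        keys[j + 1] = key
    return comparisons


best = list(range(1, 9))                 # already sorted, n = 8
worst = list(range(8, 0, -1))            # reverse-sorted, n = 8
print(insertion_sort(best))              # 7  = n - 1
print(insertion_sort(worst))             # 28 = n(n - 1)/2
```

Only one extra variable (`key`) is used besides the loop indices, which is why the sort is in place.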



MERGE SORT
• Good introduction to Divide-and-conquer Algorithm
• Divide Phase: Divide n numbers into two sequences of n/2 numbers each.
• Conquer Phase: Recursively sort the two sequences (recursion terminates when a
sequence of size 1 is to be sorted).
• Combine Phase: Merge the two sorted sequences of size n/2 to produce the sorted
result.

• Improves over Selection, Insertion, Bubble Sort


• Calibrate your preparation
• Motivates guiding principles for algorithms analysis (worst-case and
asymptotic analysis)
• Analysis generalizes to “Master Method”



MERGE SORT

5 4 1 8 7 2 6 3

5 4 1 8 | 7 2 6 3

Recursive Calls

1 4 5 8 | 2 3 6 7

merge

1 2 3 4 5 6 7 8



MERGE SORT: PSEUDOCODE DESIGN
Ignore base cases and odd numbers of items.

• Recursively sort 1st half of the input array


• Recursively sort 2nd half of the input array
• Merge two sorted sub lists into one



MERGE SORT: PSEUDOCODE

Combine: merge two sorted arrays together

C = output [length = n]
A = 1st sorted array [n/2]
B = 2nd sorted array [n/2]
i = 1
j = 1
for k = 1 to n
    if A(i) < B(j)
        C(k) = A(i)
        i++
    else [B(j) < A(i)]
        C(k) = B(j)
        j++
end

Upshot: the running time of Merge on an array of n numbers is ≤ 4n + 2 ≤ 6n
operations.
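The pseudocode ignores base cases and what happens when one of the two arrays runs out. A runnable sketch that handles both is given below; the function names `merge` and `merge_sort` and the use of Python list slicing (which copies) are assumptions for illustration.

```python
def merge(a, b):
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:                 # '<=' takes from the left first: stable
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])                    # at most one of these is non-empty
    out.extend(b[j:])
    return out


def merge_sort(keys):
    """Return a new sorted list; the input list is left unchanged."""
    if len(keys) <= 1:                   # base case: nothing to sort
        return list(keys)
    mid = len(keys) // 2
    return merge(merge_sort(keys[:mid]), merge_sort(keys[mid:]))


print(merge_sort([5, 4, 1, 8, 7, 2, 6, 3]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The extra output list in `merge` is the reason merge sort is usually not counted as sorting in place.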



RUNNING TIME
• Merge sort requires $6n \log_2 n + 6n$ operations to sort n numbers
(lg n is the # of times you divide n by 2 until you get down to 1)

• T(n) = 2T(n/2) + cn.

• T(n) = dn lg n

• Sort in place (No?)

• Stable sort (Yes?)
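One way to see why the recurrence gives an $n \lg n$ bound is to unroll it. The sketch below assumes n is a power of 2 and treats $T(1)$ as a constant; both are simplifying assumptions not stated on the slide.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 8\,T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn .
\end{align*}
```

Setting $k = \lg n$ gives $T(n) = n\,T(1) + cn \lg n$, which is $\Theta(n \lg n)$.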



WHY $6n \log_2 n + 6n$?
Merge sort requires $6n \log_2 n + 6n$ operations to sort n numbers

• Roughly how many levels does this recursion tree have (as a function of n,
the length of the input array)? Choices: a constant $c$, $\log_2 n$, $\sqrt{n}$, $n$

• What is the pattern? Fill in the blanks in the following statement: at each
level $j = 0, 1, 2, \ldots, \log_2 n$, there are <blank> subproblems, each of size
<blank>. Answer: $2^j$ and $n/2^j$, respectively

• Every level: at most $2^j \cdot 6\,(n/2^j) = 6n$ operations
(# of level-j subproblems × Merge cost on a level-j subproblem, which is at
most 6 × its size)

• Total: $6n(\log_2 n + 1) = 6n \log_2 n + 6n$
(work per level × # of levels)



MAXIMUM AND MINIMUM

MAXIMUM(A[1..n], n)
    result = A[n];
    for (i = n-1; i >= 1; i--) {
        if (A[i] > result)
            result = A[i];
    }
    return result;

MINIMUM(A[1..n], n)
    result = A[1];
    for (i = 2; i <= n; i++) {
        if (A[i] < result)
            result = A[i];
    }
    return result;



RUNNING TIME
• TIME efficient - T(n) = ϴ(n)

• comparison: n-1

• n-2?

• No better than ϴ(n)

• SPACE efficient - S(n) = ϴ(1)



SELECTION SORT
• Incremental Algorithm
• Let $A_n = A$ be the sequence of input keys. In the i-th iteration of selection
sort the minimum m of A is found and is placed in the output sequence in
position i; then m is removed from A, and thus $A_{n-1}$ contains the
remaining n − 1 keys.

• Sort In place
• This can become stable if, instead of swapping A[index] and A[i], we shift
the keys to the right, starting with the key at i and ending with the key at
index-1, and then conclude by moving the original A[index] into A[i].
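The shifting variant described above can be sketched as follows. The function name `selection_sort_shift` and the optional `key` parameter are assumptions added for illustration; shifting rather than swapping keeps equal keys in their original relative order, and no extra array is needed.

```python
def selection_sort_shift(items, key=lambda item: item):
    """Selection sort that shifts instead of swapping; sorts the list in place."""
    n = len(items)
    for i in range(n - 1):
        index = i
        for j in range(i + 1, n):
            if key(items[j]) < key(items[index]):   # strict '<': leftmost minimum
                index = j
        m = items[index]
        # Shift items[i..index-1] one position to the right.
        for j in range(index, i, -1):
            items[j] = items[j - 1]
        items[i] = m                                # move the minimum into place
    return items


pairs = [(2, "a"), (1, "x"), (2, "b"), (1, "y")]
print(selection_sort_shift(pairs, key=lambda p: p[0]))
# [(1, 'x'), (1, 'y'), (2, 'a'), (2, 'b')]
```

The shift costs extra data moves compared with a single swap, so the number of comparisons is unchanged but the number of assignments can grow.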



BUBBLE SORT
• Odd-Even Transposition Sort
In an odd round i (i = 1, 3, 5, etc) odd-indexed keys A[j] compare themselves to
neighboring keys A[j + 1] and exchange their values if necessary. In an even
round i, the same occurs with even indexed keys.

• Parallel Sort
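The odd and even rounds can be sketched sequentially as below. The function name `odd_even_transposition_sort` is hypothetical, the code uses 0-based indices (so the slide's odd-indexed key A[1] is `keys[0]`), and it runs n rounds, which is the number known to suffice for n keys. Within a round the compared pairs are disjoint, which is what makes the method suitable for parallel execution.

```python
def odd_even_transposition_sort(keys):
    """Sort a list in place using n alternating odd/even rounds."""
    n = len(keys)
    for rnd in range(1, n + 1):
        # Odd round: pairs (A[1],A[2]), (A[3],A[4]), ... in 1-based terms.
        # Even round: pairs (A[2],A[3]), (A[4],A[5]), ...
        start = 0 if rnd % 2 == 1 else 1
        for j in range(start, n - 1, 2):
            if keys[j] > keys[j + 1]:
                keys[j], keys[j + 1] = keys[j + 1], keys[j]
    return keys


print(odd_even_transposition_sort([5, 4, 1, 8, 7, 2, 6, 3]))
# [1, 2, 3, 4, 5, 6, 7, 8]
```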



SUMMARY OF SORTING
• Sort in place:
• Insertion Sort, Selection Sort, Bubble Sort, Heap Sort
• Not sort in place:
• Merge Sort [uses an extra array for merging],
Quick Sort [non-recursive Quick Sort must use Ω(lg n) space]
• Stable:
• Insertion Sort, Selection Sort, Bubble Sort, Merge Sort
• Not Stable :
• Quick Sort, Heap Sort



RUNNING TIME VS RUNNING TIME BOUND
• RUNNING TIME
• Single instance

• Running Time Bound Count Sort


• All possible instances of the same input size



SETS
• Static
• Dynamic

• Probing operation
• Search, FindMin, FindMax

• Modifying operation
• Insert, Delete, ExtractMax, DeleteMax



OPERATIONS
• Search(S, k)
• Insert(S, x)
• Delete(S, x)
• Min(S)
• Max(S)
• Successor(S, x)
• Predecessor(S, x)



TERMINOLOGIES FOR THIS COURSE
• Array
• Random Access Memory
• Linked-list
• Doubly-linked list
• Circular
• Sentinel
• Garbage-collector
• Free list
• Binary tree (left/right child/parent)



PRINCIPLE 1
“worst–case analysis”: our running time bound holds for every input
of length n.
• Particularly appropriate for “general-purpose” routines

As Opposed to
-- ”average case” analysis
-- benchmarks

BONUS: worst case usually easier to analyze.



PRINCIPLE 2
Ignore constant factors, lower-order terms

Justifications
1. Way easier
2. Constants depend on architecture/compiler/programmer anyways
3. Lose very little predictive power
(as we’ll see)



PRINCIPLE 3
Asymptotic Analysis: focus on running time for large input sizes n

E.g.: Merge Sort's $6n \log_2 n + 6n$ is “better than” Insertion Sort's $\frac{1}{2} n^2$

Justification: Only big problems are interesting!
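The two operation counts can be compared numerically to see where the asymptotically better bound starts to win. This is a sketch: the function names are hypothetical, and the formulas are taken at face value as exact operation counts, which they are not in practice.

```python
import math


def merge_sort_bound(n):
    return 6 * n * math.log2(n) + 6 * n


def insertion_sort_bound(n):
    return 0.5 * n * n


def first_n_where_merge_wins(limit=10_000):
    """Smallest n >= 2 with 6n*log2(n) + 6n < n^2/2, or None if not found."""
    for n in range(2, limit + 1):
        if merge_sort_bound(n) < insertion_sort_bound(n):
            return n
    return None


print(first_n_where_merge_wins())  # 90
```

Below that point the quadratic bound is actually the smaller one, which is why the principle speaks only about large n.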



RUNNING TIME COMPARISON



FAST ALGORITHM

Fast algorithm ≈ worst-case running time grows slowly with input size

Usually: Want as close to linear (O(n)) as possible!
