
Lecture1 Intro Streaming

The document discusses a lecture on algorithms and complexity. It covers the topics of streaming algorithms, sublinear time algorithms, and distributed algorithms. It also describes a testing algorithm for list sortedness, a streaming algorithm for counting distinct elements, and mentions an algorithm for all-pairs shortest paths.

Lecture 1: Introduction

Logistics
• Prerequisites:
Algorithms + Complexity
or
Probability + Computational Models with grade
Logistics
• Grade:
• 70% exam
• 30% HW assignments (5-6)
• 5 bonus points for participating in Mentimeter quiz during class
• Participate in at least 11 (out of 13) quizzes

• Office hours: email me ([email protected])


Logistics

IF YOU DON’T FEEL WELL, STAY HOME


What Is This Course About?
• Traditional models of computing:
Algorithm:

data workspace
This Course
• Part I: Streaming Algorithms
• Part II: Sublinear-Time Algorithms
• Part III: Distributed Algorithms
Streaming Algorithms

Algorithm
(workspace)

data

Goal: compute
… approximately, w.h.p.
Streaming Algorithms
• Useful when:
• Data really is a stream
• Many cases where it’s not
Sublinear-Time Algorithms
Algorithm

$x \in \{0,1\}^n$
?
?
?

Goal: compute
… approximately, w.h.p.
One Current Example….
Distributed Algorithms
data4

data1
data5

data3

data2

Goal: compute
… approximately, w.h.p.
Course Goals
• See some cool algorithms and lower bounds
• Get a “feel” for randomized algorithms and probability
Today: a Tasting Menu
• One sublinear-time algorithm
• One streaming algorithm
• One distributed algorithm
Testing List Sortedness in Sublinear Time
[Ergün, Kannan, Kumar, Rubinfeld, Viswanathan ‘00]
List Sortedness
• Input: a list $x$ of $n$ integers
• Output: is $x$ sorted?
For every $i < j$: $x_i \le x_j$

• Can’t answer without reading the entire list


• What can we do?
Property Testing

universe

NO

YES
???
Property $P$
“close to $P$”: need to change at most an $\varepsilon$-fraction of the object to get into $P$
“far from $P$”
Property Testing (Formally)
Given $x$ and a property $P$, distinguish between:
• $x \in P$,
• $x$ is $\varepsilon$-far from $P$:
for all $y \in P$ we have $d(x, y) > \varepsilon n$, where $d$ = “edit distance”

Back to Sortedness
• “$\varepsilon$-close to sorted”?
• Need to change at most $\varepsilon n$ values to get a sorted list
Naïve Attempt
• Sample $k$ uniformly random indices $i$ and verify $x_i \le x_{i+1}$
• How large should $k$ be?
• Bad example:

• How far from sorted?


• How large ?

• What about checking pairs $x_i, x_j$ for random $i < j$?
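To see why sampling adjacent pairs can need many samples, the sketch below builds one standard hard input: two sorted halves, with the second half entirely smaller than the first. This input is an assumption for illustration and is not necessarily the bad example used in the lecture. The helper names are hypothetical. Distance to sortedness is computed as the list length minus the length of a longest non-decreasing subsequence.

```python
from bisect import bisect_right


def distance_to_sorted(xs):
    """Minimum number of values to change so that xs becomes non-decreasing.

    This equals len(xs) minus the length of a longest non-decreasing
    subsequence, computed here with patience sorting.
    """
    tails = []  # tails[k] = smallest tail of a non-decreasing subsequence of length k + 1
    for v in xs:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return len(xs) - len(tails)


def adjacent_violations(xs):
    """Number of indices i with xs[i] > xs[i + 1]."""
    return sum(1 for i in range(len(xs) - 1) if xs[i] > xs[i + 1])


n = 16
hard = list(range(n // 2, n)) + list(range(n // 2))  # 8..15 followed by 0..7

print(distance_to_sorted(hard))   # half of the values must change
print(adjacent_violations(hard))  # yet only one adjacent pair is out of order
```

On this input a random adjacent pair hits the single violation with probability 1/(n-1), so a constant number of samples almost never detects it even though the list is 1/2-far from sorted.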


Actual Algorithm
Repeat $O(1/\varepsilon)$ times:
• Sample uniform index $i$
• Perform binary search for the value $x_i$
• If binary search ends at position different from $i$ – reject
Finally: accept
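The steps of this slide can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the lecture's reference implementation: the list values are assumed distinct, the number of repetitions is taken as ceil(c / eps) with a hypothetical constant `c`, and the names `binary_search_position` and `sortedness_tester` are made up for illustration.

```python
import math
import random


def binary_search_position(xs, target):
    """Run a standard binary search for target, as if xs were sorted.

    Returns the index where target is found, or -1 if the search fails.
    """
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def sortedness_tester(xs, eps, c=2, rng=random):
    """Return True (accept) or False (reject). Assumes distinct values."""
    if not xs:
        return True
    repetitions = math.ceil(c / eps)  # assumed constant c; the analysis needs O(1/eps)
    for _ in range(repetitions):
        i = rng.randrange(len(xs))  # uniform index
        if binary_search_position(xs, xs[i]) != i:
            return False  # the search for xs[i] did not end at position i
    return True


print(sortedness_tester(list(range(100)), eps=0.1))  # a sorted list is always accepted
```

Each repetition reads O(log n) positions, so the tester reads O(log n / eps) positions in total instead of the whole list.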
Example

value:  3  2  1  6  5  4  9  8  7 12 11 10 15 14 13 16
index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Correctness
• Need to show:
• If $x$ is sorted, we accept w.h.p.
• If $x$ is $\varepsilon$-far from sorted, we reject w.h.p.

• Say $i$ is a good index if binary search for $x_i$ ends up in position $i$


• Claim: the elements at good indices are sorted!
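The claim can be checked on the 16-element example list from the earlier slide. The sketch below assumes a particular binary search convention (midpoint `(lo + hi) // 2`, 0-based indices); a different convention could change which indices count as good, but the values at good indices would still come out sorted.

```python
def binary_search_position(xs, target):
    """Standard binary search; returns the index where target is found, else -1."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


xs = [3, 2, 1, 6, 5, 4, 9, 8, 7, 12, 11, 10, 15, 14, 13, 16]

# An index i is good if the search for xs[i] ends at position i.
good = [i for i in range(len(xs)) if binary_search_position(xs, xs[i]) == i]
good_values = [xs[i] for i in good]

print(good)         # [1, 3, 7, 11, 13, 15]
print(good_values)  # [2, 6, 8, 10, 14, 16]
```

The values at the good indices form a sorted sequence, as the claim states, even though the list as a whole is far from sorted.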
Proof of Claim
• Let $i < j$ be good indices
• Let $k$ = last common index in the binary search for $x_i$ and for $x_j$
• Then: $x_i \le x_k \le x_j$
Using the Claim
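• Suppose $x$ is $\varepsilon$-far from sorted
⇒ at most $(1-\varepsilon)n$ good indices in $x$
(otherwise: replace just the bad indices)
⇒ at least $\varepsilon n$ bad indices in $x$, so: $\Pr[\text{a uniformly random index is bad}] \ge \varepsilon$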

• How many samples needed to find a bad index w.h.p.?
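One way to answer the question, sketched under the assumption that at least an $\varepsilon$-fraction of the indices are bad and that samples are independent and uniform. The symbols $s$ (number of samples) and $\delta$ (allowed failure probability) are introduced here for illustration.

```latex
\Pr[\text{no bad index among } s \text{ samples}]
  \le (1-\varepsilon)^{s}
  \le e^{-\varepsilon s}
  \le \delta
\quad\text{whenever}\quad
s \ge \frac{\ln(1/\delta)}{\varepsilon}.
```

So $s = O(1/\varepsilon)$ samples suffice for any constant $\delta$, and since each sample triggers one binary search, the tester reads $O(\log n / \varepsilon)$ list positions overall.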


Streaming Algorithm for Distinct Elements
[Flajolet, Martin’84]
Distinct Elements
• , where
• Naïve solution?

• Claim: can’t do it deterministically with bits


• Another claim: can’t do it exactly with bits. But…
• Can get -multiplicative approximation in bits
Lower Bound of for Exact, Deterministic
Algorithms
Flajolet-Martin Algorithm
• Choose random hash function
• Define sequence of events over , of increasing probability:
• Use the smallest event that occurred to estimate the number of elements
Flajolet-Martin Algorithm
• Sequence of events
• Event : the binary encoding of the number ends with zeroes
Mentimeter Experiment
Flajolet-Martin Algorithm
• Let
• Let be a random hash function*
• To process :
• Let number of trailing zeroes in binary representation of

• Output:
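A minimal sketch of the procedure on this slide, with several details filled in as assumptions: the state is a single counter `z` holding the largest number of trailing zeroes seen so far, the output is taken to be 2^z (variants such as 2^(z + 1/2) are also used), stream items are non-negative integers, and the hash is a random linear function modulo a Mersenne prime truncated to `bits` bits, which is only approximately pairwise independent. The class and method names are hypothetical.

```python
import random


def trailing_zeros(v, width):
    """Number of trailing zero bits of v; by convention, width when v == 0."""
    if v == 0:
        return width
    return (v & -v).bit_length() - 1


class FlajoletMartinSketch:
    """Keeps only the hash parameters and one small counter z."""

    def __init__(self, bits=32, seed=None):
        rng = random.Random(seed)
        self.p = (1 << 61) - 1              # a Mersenne prime (assumed modulus)
        self.a = rng.randrange(1, self.p)   # random multiplier
        self.b = rng.randrange(self.p)      # random offset
        self.bits = bits
        self.z = 0

    def _hash(self, x):
        # Linear hash mod p, then keep the low `bits` bits.
        return ((self.a * x + self.b) % self.p) & ((1 << self.bits) - 1)

    def process(self, x):
        self.z = max(self.z, trailing_zeros(self._hash(x), self.bits))

    def estimate(self):
        return 2 ** self.z  # assumed output rule


sketch = FlajoletMartinSketch(seed=1)
for item in [5, 7, 5, 9, 7, 5, 11]:
    sketch.process(item)
print(sketch.estimate())  # always a power of two
```

Repeated items hash to the same value, so duplicates never change `z`; only the set of distinct elements affects the estimate.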
Analysis of Flajolet-Martin
Analysis of Flajolet-Martin
Space Complexity
The Hash Function
• Pairwise-independence: for every and ,

• Example: for every prime , the family

is pairwise-independent.
• Representing ?
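As an illustration, the sketch below assumes the family on this slide is the standard linear one, $h_{a,b}(x) = (ax + b) \bmod p$ with $a, b \in \{0, \dots, p-1\}$, and checks pairwise independence exhaustively for small $p$: for every pair of distinct inputs, each pair of outputs must be produced by exactly one choice of $(a, b)$. The function name is hypothetical.

```python
from collections import Counter
from itertools import product


def is_pairwise_independent(p):
    """Exhaustively check the family h_{a,b}(x) = (a*x + b) mod p over {0..p-1}."""
    for x, y in product(range(p), repeat=2):
        if x == y:
            continue
        counts = Counter(
            ((a * x + b) % p, (a * y + b) % p)
            for a, b in product(range(p), repeat=2)
        )
        # Uniform over all p*p output pairs: each pair is hit by exactly one (a, b).
        if len(counts) != p * p or any(c != 1 for c in counts.values()):
            return False
    return True


print(is_pairwise_independent(5))  # True: 5 is prime
print(is_pairwise_independent(4))  # False: 4 is not prime
```

A function from this family is fully described by the two numbers $a$ and $b$, so it can be stored in $O(\log p)$ bits.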
Improving the Accuracy
• Result must be of the form
• High variance
• How to improve?
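One standard way to reduce the variance, offered here as an assumption rather than as the lecture's answer, is to run several independent copies of the estimator and report the median. The sketch below assumes each copy outputs 2^z, uses a linear hash modulo a Mersenne prime truncated to `bits` bits, and simulates the copies by re-reading a stored list; a real streaming implementation would update all copies in a single pass. All names are hypothetical.

```python
import random
from statistics import median


def fm_estimate(stream, seed, bits=32):
    """One Flajolet-Martin style estimate, 2**z, using a hash chosen from `seed`."""
    rng = random.Random(seed)
    p = (1 << 61) - 1
    a, b = rng.randrange(1, p), rng.randrange(p)
    z = 0
    for x in stream:
        v = ((a * x + b) % p) & ((1 << bits) - 1)
        tz = bits if v == 0 else (v & -v).bit_length() - 1  # trailing zero count
        z = max(z, tz)
    return 2 ** z


def median_fm_estimate(stream, copies=15, seed=0):
    """Median of `copies` estimates, each with its own hash function."""
    items = list(stream)
    return median(fm_estimate(items, seed=seed + j) for j in range(copies))


print(median_fm_estimate(range(1000)))
```

With an odd number of copies the median is itself one of the individual estimates, so it is still a power of two; the gain is that a few unlucky hash functions no longer decide the answer.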
Distributed Algorithm for All-Pairs Shortest Paths
