18: Submodular Functions

■ Date:
▪ Monday, March 11, 2:00 PM – Wednesday, March 13, 2:00 PM Pacific Time
▪ Logistics:
▪ Administered on Gradescope
▪ 3 hours long (timer starts once you open the exam)
▪ Submitting answers (all questions visible at the same time):
■ One PDF for the entire exam (uploaded at the top of the exam)
■ One PDF for each question (uploaded to each question)
▪ You can do this as you go through the questions (no need to wait until the end)
■ Write answers directly in text boxes
▪ Please budget your time for submission (~10 min) and solve the questions you find easy first – the exam tends to be on the longer side
■ If you think a question isn't clear on the exam...
▪ Ask on Ed or state your (reasonable and valid) assumptions in your answer
▪ We will actively monitor Ed on...
▪ Monday: 2 PM – 10 PM PT
▪ Tuesday: 8 AM – 3 PM, 5 PM – 10 PM PT
▪ Wednesday: 8 AM – 2 PM PT
▪ We will answer clarifying questions only
■ Exam Review Session: Friday, 6 PM – 7 PM PT via Zoom (see Ed, Canvas for details)
■ Final exam is open book and open notes
■ A calculator or computer is REQUIRED
▪ You may only use your computer to do arithmetic calculations (i.e., the buttons found on a standard scientific calculator)
▪ You may also use your computer to read course notes or the textbook
▪ No use of AI chatbots (including, but not limited to, ChatGPT)
▪ No collaboration with other students
■ Practice finals are posted on Ed, Gradescope
Good luck with the exam! ☺
■ You Have Done a Lot!!!
■ And (hopefully) learned a lot!!!
▪ Answered questions and proved many interesting results
▪ Implemented a number of methods

Thank You for the Hard Work!
Note to other teachers and users of these slides: We would be delighted if you found our material useful for giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. If you make use of a significant portion of these slides in your own lecture, please include this message, or a link to our web site: http://www.mmds.org

CS246: Mining Massive Datasets
Jure Leskovec, Stanford University
http://cs246.stanford.edu
■ Redundancy leads to a bad user experience
▪ Uncertainty around the information need => don't put all your eggs in one basket
■ How do we optimize for diversity directly?
[Example news selection, Monday, January 14: France intervenes · Chuck for Defense · Argo wins big · Hagel expects fight]
[Alternative selection, Monday, January 14: France intervenes · Chuck for Defense · Argo wins big · New gun proposals]
■ Idea: Encode diversity as a coverage problem
■ Example: Word cloud of news for a single day
▪ Want to select articles so that most words are "covered"
■ Q: What is being covered?
■ A: Concepts (in our case: named entities)
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example document: "Hagel expects fight"]
■ Q: Who is doing the covering?
■ A: Documents
■ Suppose we are given a set of documents D
▪ Each document d covers a set X_d of words/topics/named entities from W
■ For a set of documents A ⊆ D we define
  F(A) = |⋃_{i∈A} X_i|
■ Goal: We want to
  max_{|A| ≤ k} F(A)
■ Note: F(A) is a set function: F(A): Sets → ℕ
■ Given a universe of elements W = {w_1, …, w_n} and sets X_1, …, X_m ⊆ W
  [Figure: sets X_1, …, X_4 inside universe W]
■ Goal: Find k sets X_i that cover the most of W
▪ More precisely: Find k sets X_i whose union has the largest size
▪ Bad news: A known NP-complete problem
Simple Heuristic: Greedy Algorithm:
■ Start with A_0 = { }
■ For i = 1 … k:
▪ Find the document d that maximizes F(A_{i−1} ∪ {d})
▪ Let A_i = A_{i−1} ∪ {d}
  (Here F(A) = |⋃_{d∈A} X_d|; a short code sketch follows below.)
Example:
▪ Evaluate F({d_1}), …, F({d_m}); pick the best (say d_1)
▪ Evaluate F({d_1} ∪ {d_2}), …, F({d_1} ∪ {d_m}); pick the best (say d_2)
▪ Evaluate F({d_1, d_2} ∪ {d_3}), …, F({d_1, d_2} ∪ {d_m}); pick the best
▪ And so on…
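A minimal sketch of this greedy selection in Python, assuming each document is represented by the set of concepts it covers and F(A) counts the distinct concepts covered (document names and data here are illustrative, not from the slides):

```python
def coverage(selected, doc_sets):
    """F(A): number of distinct concepts covered by the selected documents."""
    covered = set()
    for d in selected:
        covered |= doc_sets[d]
    return len(covered)

def greedy_max_coverage(doc_sets, k):
    """Pick k documents, each round adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        remaining = [d for d in doc_sets if d not in selected]
        best = max(remaining, key=lambda d: coverage(selected + [d], doc_sets))
        selected.append(best)
    return selected

# Toy example: documents mapped to the named entities they mention (hypothetical data)
docs = {
    "france_intervenes":   {"France", "Mali"},
    "chuck_for_defense":   {"Hagel", "Pentagon", "Obama"},
    "argo_wins_big":       {"Argo", "Zero Dark Thirty"},
    "hagel_expects_fight": {"Hagel", "Pentagon"},
}
print(greedy_max_coverage(docs, 2))  # picks two documents covering the most entities
```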
■ Goal: Maximize the covered area
  [Figure sequence: the greedy algorithm repeatedly adds the set that enlarges the covered area the most]
[Figure: three overlapping sets A, B, C]
■ Goal: Maximize the size of the covered area
■ Greedy first picks A and then C
■ But the optimal way would be to pick B and C
■ Greedy produces a solution A where F(A) ≥ (1 − 1/e)·OPT, i.e., F(A) ≥ 0.63·OPT
[Nemhauser, Fisher, Wolsey '78]

■ Claim holds for functions F(·) with 2 properties:
▪ F is monotone (adding more docs doesn't decrease coverage):
  if A ⊆ B then F(A) ≤ F(B), and F({}) = 0
▪ F is submodular:
  adding an element to a set gives less improvement than adding it to one of its subsets
Definition:
■ A set function F(·) is called submodular if:
  For all A, B ⊆ W:
  F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B)
  [Figure: Venn diagrams illustrating A, B, A ∪ B, and A ∩ B]
■ Diminishing returns characterization
Equivalent definition:
■ A set function F(·) is called submodular if:
  For all A ⊆ B:
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
  (gain of adding d to a small set) ≥ (gain of adding d to a large set)
  [Figure: adding d to the smaller set A gives a large improvement; adding d to the larger set B gives a small improvement]
■ F(·) is submodular: for A ⊆ B,
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
  (gain of adding d to a small set) ≥ (gain of adding d to a large set)
■ Natural example:
▪ Sets d_1, …, d_m
▪ F(A) = |⋃_{i∈A} d_i| (size of the covered area)
▪ Claim: F(A) is submodular! (see the quick check below)
  [Figure: adding area d to the smaller region A covers more new area than adding it to the larger region B]
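A quick numerical check of the diminishing-returns property for this coverage function on a toy instance (the sets below are made up for illustration):

```python
def F(A, doc_sets):
    """Coverage: number of distinct elements covered by the sets chosen in A."""
    covered = set()
    for name in A:
        covered |= doc_sets[name]
    return len(covered)

doc_sets = {"d1": {1, 2}, "d2": {2, 3}, "d3": {3, 4}, "d": {4, 5}}

A = ["d1"]                # small set
B = ["d1", "d2", "d3"]    # superset of A
gain_A = F(A + ["d"], doc_sets) - F(A, doc_sets)   # d contributes {4, 5}: gain 2
gain_B = F(B + ["d"], doc_sets) - F(B, doc_sets)   # 4 already covered:   gain 1
assert gain_A >= gain_B
print(gain_A, gain_B)  # 2 1
```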
■ Submodularity is the discrete analogue of concavity
  [Figure: concave curve of F(·) vs. solution size |A|, with A ⊆ B; adding d to B helps less than adding it to A]
F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
(gain of adding X_d to a small set) ≥ (gain of adding X_d to a large set)
■ Marginal gain:
  Δ_F(d | A) = F(A ∪ {d}) − F(A)
■ Submodularity: for A ⊆ B,
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
■ Concavity: for a ≤ b,
  f(a + d) − f(a) ≥ f(b + d) − f(b)
  [Figure: concave function F(A) plotted against |A|]
■ Let F_1, …, F_m be submodular and λ_1, …, λ_m > 0; then F(A) = Σ_{i=1}^{m} λ_i F_i(A) is submodular
▪ Submodularity is closed under non-negative linear combinations!

■ This is an extremely useful fact:
▪ Average of submodular functions is submodular: F(A) = Σ_i P(i) · F_i(A)
▪ Multicriterion optimization: F(A) = Σ_i λ_i F_i(A)
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ F(A): the number of concepts covered by A
▪ Elements … concepts; sets … concepts in docs
▪ F(A) is submodular and monotone!
▪ We can use the greedy algorithm to optimize F
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
The good: penalizes redundancy; submodular.
The bad: ignores concept importance; all-or-nothing coverage is too harsh.
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ Each concept c has importance weight w_c
■ Document coverage function cover_d(c):
  probability that document d covers concept c
  [e.g., how strongly d covers c]
  [Example: how strongly "Enthusiasm for Inauguration wanes" covers the concepts Obama and Romney]
■ Document coverage function: cover_d(c) = probability that document d covers concept c
▪ cover_d(c) can also model how relevant concept c is for user u
■ Set coverage function:
  cover_A(c) = 1 − ∏_{d∈A} (1 − cover_d(c))
▪ Probability that at least one document in A covers c
■ Objective (a code sketch follows below):
  F(A) = Σ_c w_c · cover_A(c), where the w_c are concept weights
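A small sketch of this probabilistic coverage objective in Python; the coverage probabilities and concept weights below are hypothetical, chosen only to illustrate the formula F(A) = Σ_c w_c · (1 − ∏_{d∈A}(1 − cover_d(c))):

```python
def prob_coverage_objective(selected, cover, weights):
    """F(A) = sum over concepts c of w_c * (1 - prod_{d in A} (1 - cover_d(c)))."""
    total = 0.0
    for c, w in weights.items():
        p_not_covered = 1.0
        for d in selected:
            p_not_covered *= 1.0 - cover[d].get(c, 0.0)
        total += w * (1.0 - p_not_covered)
    return total

# Hypothetical per-document coverage probabilities and concept weights
cover = {
    "inauguration_wanes": {"Obama": 0.9, "Romney": 0.3},
    "argo_wins_big":      {"Argo": 0.8, "Obama": 0.1},
}
weights = {"Obama": 0.5, "Romney": 0.2, "Argo": 0.3}

print(prob_coverage_objective(["inauguration_wanes"], cover, weights))
print(prob_coverage_objective(["inauguration_wanes", "argo_wins_big"], cover, weights))
```

Adding a second document that mostly repeats already-covered concepts increases F only slightly, which is exactly the diminishing-returns behavior the greedy guarantee relies on.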
■ The objective function is also submodular
▪ Intuitively, it has a diminishing returns property
▪ The greedy algorithm leads to a (1 − 1/e) ≈ 63% approximation, i.e., a near-optimal solution
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ Each concept c has importance weight w_c
■ Documents partially cover concepts: cover_d(c)
■ Greedy algorithm is slow!
▪ At each iteration we need to re-evaluate the marginal gains F(A ∪ {x}) − F(A) of all remaining documents
▪ Runtime O(|D| · K) for selecting K documents out of the set D of them
  [Figure: marginal gains of candidate documents a, b, c, d; add the document with the highest marginal gain]
[Leskovec et al., KDD '07]

■ In round i: so far we have A_{i−1} = {d_1, …, d_{i−1}}
▪ Now we pick d_i = argmax_{d∈V} F(A_{i−1} ∪ {d}) − F(A_{i−1})
▪ The greedy algorithm maximizes the "marginal benefit"
  Δ_i(d) = F(A_{i−1} ∪ {d}) − F(A_{i−1})
■ By the submodularity property:
  F(A_i ∪ {d}) − F(A_i) ≥ F(A_j ∪ {d}) − F(A_j) for i < j
■ Observation: By submodularity, for every d ∈ D:
  Δ_i(d) ≥ Δ_j(d) for i < j, since A_i ⊆ A_j
■ Marginal benefits Δ_i(d) only shrink (as i grows): selecting document d at step i covers more words than selecting d at step j (j > i)
[Leskovec et al., KDD '07]

■ Idea:
▪ Use Δ_i as an upper bound on Δ_j (j > i)
■ Lazy Greedy (a code sketch follows below):
▪ Keep an ordered list of the marginal benefits Δ_i from the previous iteration
▪ Re-evaluate Δ_i only for the top element
▪ Re-sort and prune
F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B) for A ⊆ B
  [Figure sequence: candidate documents a–e ordered by (upper bounds on) their marginal gains; after picking a (A_1 = {a}) and then the next re-evaluated top element (A_2 = {a, b}), only the current top candidate's gain is recomputed and the list is re-sorted each round]
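A sketch of lazy greedy for the simple set-coverage objective, using a max-heap of cached marginal gains as upper bounds (one possible implementation of the idea above, not the exact code from the paper):

```python
import heapq

def lazy_greedy(doc_sets, k):
    """Lazy greedy: cache marginal gains as upper bounds; re-evaluate only the top candidate."""
    covered, selected = set(), []
    # Max-heap of (negated cached gain, document, round in which the gain was computed)
    heap = [(-len(s), d, 0) for d, s in doc_sets.items()]
    heapq.heapify(heap)
    for rnd in range(1, k + 1):
        while True:
            neg_gain, d, last = heapq.heappop(heap)
            if last == rnd:                    # gain is fresh, so d is the true argmax
                selected.append(d)
                covered |= doc_sets[d]
                break
            gain = len(doc_sets[d] - covered)  # stale upper bound: recompute and push back
            heapq.heappush(heap, (-gain, d, rnd))
    return selected

docs = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1, 6, 7}, "e": {2}}
print(lazy_greedy(docs, 3))
```

Because cached gains can only over-estimate the true marginal gains (by submodularity), popping a freshly re-evaluated element from the top of the heap is enough to certify it is the best choice in that round.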
■ Summary so far:
▪ Diversity can be formulated as a set cover problem
▪ Set cover is a submodular optimization problem
▪ It can be (approximately) solved using the greedy algorithm
▪ Lazy greedy gives a significant speedup
[Figure: running time (seconds) vs. number of blogs selected (1–10); exhaustive search over all subsets is slowest, naive greedy is much faster, and lazy greedy is fastest; lower is better]
But what about personalization?
[Figure: example personalized recommendations: Election trouble, Songs of Syria, Sandy delays]
We assumed the same concept weighting for all users
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: France intervenes, Chuck for Defense, Argo wins big]
■ Each user has different preferences over concepts
  [Figure: concept weights of a "politico" user vs. a "movie buff" user over France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL]
■ Assume each user u has a different preference vector w_c(u) over concepts c
■ Goal: Learn personal concept weights from user feedback
[Figure: user feedback on recommended articles (France intervenes, Chuck for Defense, Argo wins big) over the concepts France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL]
■ Multiplicative Weights algorithm (a code sketch follows below)
▪ Assume each concept c has weight w_c
▪ We recommend document d and receive feedback, say r = +1 or −1
▪ Update the weights:
▪ For each c ∈ X_d set w_c = β^r · w_c
▪ If concept c appears in doc d and we received positive feedback r = +1, we increase the weight w_c by multiplying it by β (β > 1); otherwise we decrease the weight (divide by β)
▪ Normalize the weights so that Σ_c w_c = 1
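A minimal sketch of this update rule in Python (β, the concept names, and the feedback values are illustrative choices, not prescribed by the slides):

```python
def multiplicative_weights_update(weights, doc_concepts, r, beta=1.5):
    """After feedback r (+1 or -1) on a recommended document, set w_c <- beta**r * w_c
    for every concept c the document covers, then renormalize so the weights sum to 1."""
    for c in doc_concepts:
        weights[c] = (beta ** r) * weights[c]
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical example: the user gives positive feedback (r = +1) on a movie article
weights = {"Obama": 0.25, "Hagel": 0.25, "Argo": 0.25, "NFL": 0.25}
weights = multiplicative_weights_update(weights, {"Argo"}, r=+1)
print(weights)  # "Argo" gains weight; the others shrink after normalization
```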
■ Steps of the algorithm:
1. Identify items to recommend from
2. Identify concepts [what makes items redundant?]
3. Weigh concepts by general importance
4. Define the item–concept coverage function
5. Select items using probabilistic set cover
6. Obtain feedback, update weights
