18: Submodular Functions

■ Date:
▪ Monday, March 11, 2:00 PM – Wednesday, March 13, 2:00 PM Pacific Time
▪ Logistics:
▪ Administered on Gradescope
▪ 3 hours long (timer starts once you open the exam)
▪ Submitting answers (all questions visible at the same time):
■ One PDF for the entire exam (uploaded at the top of the exam)
■ One PDF for each question (uploaded to each question)
▪ You can do this as you go through the questions (no need to wait until the end)
■ Write answers directly in text boxes
▪ Please budget your time for submission (~10 min) and solve the questions you find easy first – the exam tends to be on the longer side
■ If you think a question isn't clear on the exam...
▪ Ask on Ed or state your (reasonable and valid) assumptions in your answer
▪ We will actively monitor Ed on...
▪ Monday: 2 PM – 10 PM PT
▪ Tuesday: 8 AM – 3 PM, 5 PM – 10 PM PT
▪ Wednesday: 8 AM – 2 PM PT
▪ We will answer clarifying questions only
■ Exam Review Session: Friday, 6 PM – 7 PM PT via Zoom (see Ed, Canvas for details)
■ Final exam is open book and open notes
■ A calculator or computer is REQUIRED
▪ You may only use your computer to do arithmetic calculations (i.e., the buttons found on a standard scientific calculator)
▪ You may also use your computer to read course notes or the textbook
▪ No use of AI chatbots (including, but not limited to, ChatGPT)
▪ No collaboration with other students
■ Practice finals are posted on Ed, Gradescope
Good luck with the exam! ☺
■ You Have Done a Lot!!!
■ And (hopefully) learned a lot!!!
▪ Answered questions and proved many interesting results
▪ Implemented a number of methods

Thank You for the Hard Work!
Note to other teachers and users of these slides: We would be delighted if you found our material useful for giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. If you make use of a significant portion of these slides in your own lecture, please include this message, or a link to our web site: http://www.mmds.org

CS246: Mining Massive Datasets
Jure Leskovec, Stanford University
http://cs246.stanford.edu
■ Redundancy leads to a bad user experience
▪ Uncertainty around the information need => don't put all your eggs in one basket
■ How do we optimize for diversity directly?
[Example news selection, Monday, January 14: France intervenes · Chuck for Defense · Argo wins big · Hagel expects fight]
[Alternative selection, Monday, January 14: France intervenes · Chuck for Defense · Argo wins big · New gun proposals]
■ Idea: Encode diversity as a coverage problem
■ Example: Word cloud of news for a single day
▪ Want to select articles so that most words are "covered"
■ Q: What is being covered?
■ A: Concepts (in our case: named entities)
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example document: "Hagel expects fight"]
■ Q: Who is doing the covering?
■ A: Documents
■ Suppose we are given a set of documents D
▪ Each document d covers a set X_d of words/topics/named entities from W
■ For a set of documents A ⊆ D we define
  F(A) = |⋃_{i∈A} X_i|
■ Goal: We want to
  max_{|A| ≤ k} F(A)
■ Note: F(A) is a set function: F(A): Sets → ℕ
■ Given a universe of elements W = {w_1, …, w_n} and sets X_1, …, X_m ⊆ W
  [Figure: sets X_1, …, X_4 inside universe W]
■ Goal: Find k sets X_i that cover the most of W
▪ More precisely: Find k sets X_i whose union has the largest size
▪ Bad news: A known NP-complete problem
Simple Heuristic: Greedy Algorithm:
■ Start with A_0 = { }
■ For i = 1 … k:
▪ Find the document d that maximizes F(A_{i−1} ∪ {d})
▪ Let A_i = A_{i−1} ∪ {d}
  (Here F(A) = |⋃_{d∈A} X_d|; a short code sketch follows below.)
Example:
▪ Evaluate F({d_1}), …, F({d_m}); pick the best (say d_1)
▪ Evaluate F({d_1} ∪ {d_2}), …, F({d_1} ∪ {d_m}); pick the best (say d_2)
▪ Evaluate F({d_1, d_2} ∪ {d_3}), …, F({d_1, d_2} ∪ {d_m}); pick the best
▪ And so on…
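A minimal sketch of this greedy selection in Python, assuming each document is represented by the set of concepts it covers and F(A) counts the distinct concepts covered (document names and data here are illustrative, not from the slides):

```python
def coverage(selected, doc_sets):
    """F(A): number of distinct concepts covered by the selected documents."""
    covered = set()
    for d in selected:
        covered |= doc_sets[d]
    return len(covered)

def greedy_max_coverage(doc_sets, k):
    """Pick k documents, each round adding the one with the largest marginal gain."""
    selected = []
    for _ in range(k):
        remaining = [d for d in doc_sets if d not in selected]
        best = max(remaining, key=lambda d: coverage(selected + [d], doc_sets))
        selected.append(best)
    return selected

# Toy example: documents mapped to the named entities they mention (hypothetical data)
docs = {
    "france_intervenes":   {"France", "Mali"},
    "chuck_for_defense":   {"Hagel", "Pentagon", "Obama"},
    "argo_wins_big":       {"Argo", "Zero Dark Thirty"},
    "hagel_expects_fight": {"Hagel", "Pentagon"},
}
print(greedy_max_coverage(docs, 2))  # picks two documents covering the most entities
```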
■ Goal: Maximize the covered area
  [Figure sequence: the greedy algorithm repeatedly adds the set that enlarges the covered area the most]
[Figure: three overlapping sets A, B, C]
■ Goal: Maximize the size of the covered area
■ Greedy first picks A and then C
■ But the optimal way would be to pick B and C
■ Greedy produces a solution A where F(A) ≥ (1 − 1/e)·OPT, i.e., F(A) ≥ 0.63·OPT
[Nemhauser, Fisher, Wolsey '78]

■ Claim holds for functions F(·) with 2 properties:
▪ F is monotone (adding more docs doesn't decrease coverage):
  if A ⊆ B then F(A) ≤ F(B), and F({}) = 0
▪ F is submodular:
  adding an element to a set gives less improvement than adding it to one of its subsets
Definition:
■ A set function F(·) is called submodular if:
  For all A, B ⊆ W:
  F(A) + F(B) ≥ F(A ∪ B) + F(A ∩ B)
  [Figure: Venn diagrams illustrating A, B, A ∪ B, and A ∩ B]
■ Diminishing returns characterization
Equivalent definition:
■ A set function F(·) is called submodular if:
  For all A ⊆ B:
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
  (gain of adding d to a small set) ≥ (gain of adding d to a large set)
  [Figure: adding d to the smaller set A gives a large improvement; adding d to the larger set B gives a small improvement]
■ F(·) is submodular: for A ⊆ B,
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
  (gain of adding d to a small set) ≥ (gain of adding d to a large set)
■ Natural example:
▪ Sets d_1, …, d_m
▪ F(A) = |⋃_{i∈A} d_i| (size of the covered area)
▪ Claim: F(A) is submodular! (see the quick check below)
  [Figure: adding area d to the smaller region A covers more new area than adding it to the larger region B]
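A quick numerical check of the diminishing-returns property for this coverage function on a toy instance (the sets below are made up for illustration):

```python
def F(A, doc_sets):
    """Coverage: number of distinct elements covered by the sets chosen in A."""
    covered = set()
    for name in A:
        covered |= doc_sets[name]
    return len(covered)

doc_sets = {"d1": {1, 2}, "d2": {2, 3}, "d3": {3, 4}, "d": {4, 5}}

A = ["d1"]                # small set
B = ["d1", "d2", "d3"]    # superset of A
gain_A = F(A + ["d"], doc_sets) - F(A, doc_sets)   # d contributes {4, 5}: gain 2
gain_B = F(B + ["d"], doc_sets) - F(B, doc_sets)   # 4 already covered:   gain 1
assert gain_A >= gain_B
print(gain_A, gain_B)  # 2 1
```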
■ Submodularity is the discrete analogue of concavity
  [Figure: concave curve of F(·) vs. solution size |A|, with A ⊆ B; adding d to B helps less than adding it to A]
F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
(gain of adding X_d to a small set) ≥ (gain of adding X_d to a large set)
■ Marginal gain:
  Δ_F(d | A) = F(A ∪ {d}) − F(A)
■ Submodularity: for A ⊆ B,
  F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B)
■ Concavity: for a ≤ b,
  f(a + d) − f(a) ≥ f(b + d) − f(b)
  [Figure: concave function F(A) plotted against |A|]
■ Let F_1, …, F_m be submodular and λ_1, …, λ_m > 0; then F(A) = Σ_{i=1}^{m} λ_i F_i(A) is submodular
▪ Submodularity is closed under non-negative linear combinations!

■ This is an extremely useful fact:
▪ Average of submodular functions is submodular: F(A) = Σ_i P(i) · F_i(A)
▪ Multicriterion optimization: F(A) = Σ_i λ_i F_i(A)
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ F(A): the number of concepts covered by A
▪ Elements … concepts; sets … concepts in docs
▪ F(A) is submodular and monotone!
▪ We can use the greedy algorithm to optimize F
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
The good: penalizes redundancy; submodular.
The bad: ignores concept importance; all-or-nothing coverage is too harsh.
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ Each concept c has importance weight w_c
■ Document coverage function cover_d(c):
  probability that document d covers concept c
  [e.g., how strongly d covers c]
  [Example: how strongly "Enthusiasm for Inauguration wanes" covers the concepts Obama and Romney]
■ Document coverage function: cover_d(c) = probability that document d covers concept c
▪ cover_d(c) can also model how relevant concept c is for user u
■ Set coverage function:
  cover_A(c) = 1 − ∏_{d∈A} (1 − cover_d(c))
▪ Probability that at least one document in A covers c
■ Objective (a code sketch follows below):
  F(A) = Σ_c w_c · cover_A(c), where the w_c are concept weights
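A small sketch of this probabilistic coverage objective in Python; the coverage probabilities and concept weights below are hypothetical, chosen only to illustrate the formula F(A) = Σ_c w_c · (1 − ∏_{d∈A}(1 − cover_d(c))):

```python
def prob_coverage_objective(selected, cover, weights):
    """F(A) = sum over concepts c of w_c * (1 - prod_{d in A} (1 - cover_d(c)))."""
    total = 0.0
    for c, w in weights.items():
        p_not_covered = 1.0
        for d in selected:
            p_not_covered *= 1.0 - cover[d].get(c, 0.0)
        total += w * (1.0 - p_not_covered)
    return total

# Hypothetical per-document coverage probabilities and concept weights
cover = {
    "inauguration_wanes": {"Obama": 0.9, "Romney": 0.3},
    "argo_wins_big":      {"Argo": 0.8, "Obama": 0.1},
}
weights = {"Obama": 0.5, "Romney": 0.2, "Argo": 0.3}

print(prob_coverage_objective(["inauguration_wanes"], cover, weights))
print(prob_coverage_objective(["inauguration_wanes", "argo_wins_big"], cover, weights))
```

Adding a second document that mostly repeats already-covered concepts increases F only slightly, which is exactly the diminishing-returns behavior the greedy guarantee relies on.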
■ The objective function is also submodular
▪ Intuitively, it has a diminishing returns property
▪ The greedy algorithm leads to a (1 − 1/e) ≈ 63% approximation, i.e., a near-optimal solution
■ Objective: pick k docs that cover the most concepts
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: "Enthusiasm for Inauguration wanes", "Inauguration weekend"]
■ Each concept c has importance weight w_c
■ Documents partially cover concepts: cover_d(c)
■ Greedy algorithm is slow!
▪ At each iteration we need to re-evaluate the marginal gains F(A ∪ {x}) − F(A) of all remaining documents
▪ Runtime O(|D| · K) for selecting K documents out of the set D of them
  [Figure: marginal gains of candidate documents a, b, c, d; add the document with the highest marginal gain]
[Leskovec et al., KDD '07]

■ In round i: so far we have A_{i−1} = {d_1, …, d_{i−1}}
▪ Now we pick d_i = argmax_{d∈V} F(A_{i−1} ∪ {d}) − F(A_{i−1})
▪ The greedy algorithm maximizes the "marginal benefit"
  Δ_i(d) = F(A_{i−1} ∪ {d}) − F(A_{i−1})
■ By the submodularity property:
  F(A_i ∪ {d}) − F(A_i) ≥ F(A_j ∪ {d}) − F(A_j) for i < j
■ Observation: By submodularity, for every d ∈ D:
  Δ_i(d) ≥ Δ_j(d) for i < j, since A_i ⊆ A_j
■ Marginal benefits Δ_i(d) only shrink (as i grows): selecting document d at step i covers more words than selecting d at step j (j > i)
[Leskovec et al., KDD '07]

■ Idea:
▪ Use Δ_i as an upper bound on Δ_j (j > i)
■ Lazy Greedy (a code sketch follows below):
▪ Keep an ordered list of the marginal benefits Δ_i from the previous iteration
▪ Re-evaluate Δ_i only for the top element
▪ Re-sort and prune
F(A ∪ {d}) − F(A) ≥ F(B ∪ {d}) − F(B) for A ⊆ B
  [Figure sequence: candidate documents a–e ordered by (upper bounds on) their marginal gains; after picking a (A_1 = {a}) and then the next re-evaluated top element (A_2 = {a, b}), only the current top candidate's gain is recomputed and the list is re-sorted each round]
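A sketch of lazy greedy for the simple set-coverage objective, using a max-heap of cached marginal gains as upper bounds (one possible implementation of the idea above, not the exact code from the paper):

```python
import heapq

def lazy_greedy(doc_sets, k):
    """Lazy greedy: cache marginal gains as upper bounds; re-evaluate only the top candidate."""
    covered, selected = set(), []
    # Max-heap of (negated cached gain, document, round in which the gain was computed)
    heap = [(-len(s), d, 0) for d, s in doc_sets.items()]
    heapq.heapify(heap)
    for rnd in range(1, k + 1):
        while True:
            neg_gain, d, last = heapq.heappop(heap)
            if last == rnd:                    # gain is fresh, so d is the true argmax
                selected.append(d)
                covered |= doc_sets[d]
                break
            gain = len(doc_sets[d] - covered)  # stale upper bound: recompute and push back
            heapq.heappush(heap, (-gain, d, rnd))
    return selected

docs = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {5, 6}, "d": {1, 6, 7}, "e": {2}}
print(lazy_greedy(docs, 3))
```

Because cached gains can only over-estimate the true marginal gains (by submodularity), popping a freshly re-evaluated element from the top of the heap is enough to certify it is the best choice in that round.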
■ Summary so far:
▪ Diversity can be formulated as a set cover problem
▪ Set cover is a submodular optimization problem
▪ It can be (approximately) solved using the greedy algorithm
▪ Lazy greedy gives a significant speedup
[Figure: running time (seconds) vs. number of blogs selected (1–10); exhaustive search over all subsets is slowest, naive greedy is much faster, and lazy greedy is fastest; lower is better]
But what about personalization?
[Figure: example personalized recommendations: Election trouble, Songs of Syria, Sandy delays]
We assumed the same concept weighting for all users
  Concepts: France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL
  [Example documents: France intervenes, Chuck for Defense, Argo wins big]
■ Each user has different preferences over concepts
  [Figure: concept weights of a "politico" user vs. a "movie buff" user over France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL]
■ Assume each user u has a different preference vector w_c(u) over concepts c
■ Goal: Learn personal concept weights from user feedback
[Figure: user feedback on recommended articles (France intervenes, Chuck for Defense, Argo wins big) over the concepts France · Mali · Hagel · Pentagon · Obama · Romney · Zero Dark Thirty · Argo · NFL]
■ Multiplicative Weights algorithm (a code sketch follows below)
▪ Assume each concept c has weight w_c
▪ We recommend document d and receive feedback, say r = +1 or −1
▪ Update the weights:
▪ For each c ∈ X_d set w_c = β^r · w_c
▪ If concept c appears in doc d and we received positive feedback r = +1, we increase the weight w_c by multiplying it by β (β > 1); otherwise we decrease the weight (divide by β)
▪ Normalize the weights so that Σ_c w_c = 1
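A minimal sketch of this update rule in Python (β, the concept names, and the feedback values are illustrative choices, not prescribed by the slides):

```python
def multiplicative_weights_update(weights, doc_concepts, r, beta=1.5):
    """After feedback r (+1 or -1) on a recommended document, set w_c <- beta**r * w_c
    for every concept c the document covers, then renormalize so the weights sum to 1."""
    for c in doc_concepts:
        weights[c] = (beta ** r) * weights[c]
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical example: the user gives positive feedback (r = +1) on a movie article
weights = {"Obama": 0.25, "Hagel": 0.25, "Argo": 0.25, "NFL": 0.25}
weights = multiplicative_weights_update(weights, {"Argo"}, r=+1)
print(weights)  # "Argo" gains weight; the others shrink after normalization
```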
■ Steps of the algorithm:
1. Identify items to recommend from
2. Identify concepts [what makes items redundant?]
3. Weigh concepts by general importance
4. Define the item–concept coverage function
5. Select items using probabilistic set cover
6. Obtain feedback, update weights
