0% found this document useful (0 votes)
139 views2 pages

Algorithms Quiz Solution - Georgia Tech - Machine Learning

The speakers discuss that finding the optimal subset from all possible subsets is an NP-hard problem with an exponential number of possibilities. This is because to find the truly best subset, one would need to evaluate all possible combinations. While this problem is difficult, machine learning more broadly would be easy if an optimal solution could be found efficiently. The key question then is how can approaches tackle this challenging optimization problem.

Uploaded by

yousef shaban
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
139 views2 pages

Algorithms Quiz Solution - Georgia Tech - Machine Learning

The speakers discuss that finding the optimal subset from all possible subsets is an NP-hard problem with an exponential number of possibilities. This is because to find the truly best subset, one would need to evaluate all possible combinations. While this problem is difficult, machine learning more broadly would be easy if an optimal solution could be found efficiently. The key question then is how can approaches tackle this challenging optimization problem.

Uploaded by

yousef shaban
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 2

Okay Michael, you think you know the answer?

>> I do. So, if you hadn't said, yeah, up

until the very end I was thinking along, different lines. So,

if it's the case that we just want to choose, if,

if this f thing that is scoring is say, constant, then

choosing the best subset is easy, I can just choose

any subset I want. But you said then, it could actually

be arbitrary and it is unknown. So the only way

that I can know that I have the best scoring subset

is if I try all the subsets. And there's an exponential

number of subsets. So I'm going, I would go with exponential.

>> That's correct exactly what is the, the form of the exponential?

>> Oh, I want to say n choose m.

>> Yeah, but you don't actually even know what m is. N

choose m, n choose m is right and it gives you an exponential.

>> Good, alright. Thanks very much.

>> But, the other way is if you don't what

m is before you start. Then, it's just 2 to the n.

>> It's hard for me to imagine what that

means given that you made that part of the

input. If you mean just the best subset, then

yeah there's 2 to the n subsets. I agree.

>> So either one's fine because it gives

you the same answer. Sometimes we say, we

want half as many or, we want no more than half as many, and sometimes we

say we don't know what the best subset is. You figure it out. So this becomes

like a kind of clustering problem in its

own way except you don't know what k is.

But the other way of thinking about this is that of course this problem is hard,

of course it's exponential, it is effectively an


optimization problem over a set of arbitrary discreet variables.

>> That's how I was thinking about it

because we always seem to come back to that.

>> Right and in fact this problem is known to be

NP-hard. And it's exactly because you have to find all possible

subsets, so they match the three set. I will not prove

that but it does turn out that this is, in fact,

exactly the hard problem that you think it is. So given

this, and, and by the way, of course, of course this problem

is hard. Because if this problem weren't hard then most machine learning

would be pretty easy. So, it's not that big of a, it's

not that, it's not that surprising that the problem is difficult,

so the real question we have in front of us is given

that we've got yet another really difficult problem, in fact, a difficult

optimization problem, how might we go about tackling it? And it turns

out that there are two ways in general that people try to approach this problem.

You might also like