EECS 545 - Machine Learning
Lecture 3: Convex Optimization + Probability/Stat
Overview
Date: January 13, 2016
Instructor: Jacob Abernethy
In [1]: from IPython.core.display import HTML, Image
from IPython.display import YouTubeVideo
from sympy import init_printing, Matrix, symbols, Rational
import sympy as sym
from warnings import filterwarnings
init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
%pylab inline
import numpy as np
Populating the interactive namespace from numpy and matplotlib
Some important notes
HW1 is out! Due January 25th at 11pm
Homework will be submitted via Gradescope. Please see Piazza for precise instructions. Do it
soon, not at the last minute!!
There is an optional tutorial this evening, 5pm, in Dow 1013. Come see Daniel go over some
tough problems.
No class on Monday January 18, MLK day!
Python: We recommend Anaconda
Anaconda is a standalone Python distribution that includes all the most important scientific
packages: numpy, scipy, matplotlib, sympy, sklearn, etc.
Easy to install, available for OS X, Windows, Linux.
Small warning: it's kind of large (250MB)
Some notes on using Python
HW1 has only a very simple programming exercise, just as a warmup. We don't expect you to
submit code this time
This is a good time to start learning Python basics
There are a ton of good places on the web to learn python, we'll post some
Highly recommended: ipython; it's a much more user-friendly terminal interface to Python
Even better: the jupyter notebook, a web-based interface. This is how I'm making these slides!
Checking if all is installed, and HelloWorld
If you got everything installed, this should run:
# numpy is crucial for vectors, matrices, etc.
import numpy as np
# Lots of cool plotting tools with matplotlib
import matplotlib.pyplot as plt
# For later: scipy has a ton of stats tools
import scipy as sp
# For later: sklearn has many standard ML algs
import sklearn
# Here we go!
print("Hello World!")
More on learning python
We will have one tutorial devoted to this
If you're new to Python, go slow!
First learn the basics (lists, dicts, for loops, etc.)
Then spend a couple days playing with numpy
Then explore matplotlib
etc.
Piazza = your friend. We have a designated python instructor (IA Ben Bray) who has lots of
answers.
Lecture Cat #2
(credit to Johann for the suggestion)
Functions and Convexity
Let $f$ be a function mapping $\mathbb{R}^n \to \mathbb{R}$, and assume $f$ is twice differentiable.
The gradient and Hessian of $f$, denoted $\nabla f(x)$ and $\nabla^2 f(x)$, are the vector and matrix functions:
$$\nabla f(x) = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix} \qquad \nabla^2 f(x) = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$$
Note: the Hessian is always symmetric!
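As a quick check (a sketch, not part of the original slides; the example function $f(x, y) = x^2 + 3xy + y^2$ is made up), sympy, which we imported above, can compute both objects symbolically:
# Sketch: compute the gradient and Hessian of an example function with sympy.
# The function f(x, y) = x**2 + 3*x*y + y**2 is just for illustration.
x, y = symbols('x y')
f = x**2 + 3*x*y + y**2
grad = Matrix([f.diff(x), f.diff(y)])  # gradient: column vector of partials
hess = sym.hessian(f, (x, y))          # Hessian: matrix of second partials
print(grad)  # Matrix([[2*x + 3*y], [3*x + 2*y]])
print(hess)  # Matrix([[2, 3], [3, 2]]) -- symmetric, as promised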
Convex functions
We say that a function $f$ is convex if, for any distinct pair of points $x, y$ and any $\lambda \in [0, 1]$, we have
$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y)$$
Fun facts about convex functions
If $f$ is differentiable, then $f$ is convex iff $f$ "lies above its linear approximation", i.e.:
$$f(y) \ge f(x) + \nabla f(x)^\top (y - x) \quad \text{for all } x, y$$
If $f$ is twice-differentiable and convex, then the Hessian $\nabla^2 f(x)$ is always positive semi-definite!
This last one you will show on your homework :-)
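As an aside (an illustrative sketch, not in the slides; the choice $f(x) = e^x$ and the point $x_0 = 0.5$ are arbitrary), we can verify the "lies above its linear approximation" property numerically:
# Sketch: check numerically that a convex function lies above its tangent line.
# We use f(x) = exp(x), whose derivative is also exp(x).
f = lambda z: np.exp(z)
fprime = lambda z: np.exp(z)
x0 = 0.5                              # point of linearization
xs = np.linspace(-2, 2, 401)
tangent = f(x0) + fprime(x0) * (xs - x0)
print(np.all(f(xs) >= tangent))       # True: f(y) >= f(x0) + f'(x0) * (y - x0)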
Convex Sets
A set $C \subseteq \mathbb{R}^n$ is convex if
$$\lambda x + (1 - \lambda) y \in C$$
for any $x, y \in C$ and $\lambda \in [0, 1]$
that is, a set is convex if the line segment connecting any two points in the set is entirely inside the set
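For instance (a small numerical sketch, not in the slides), the unit disk is convex: random convex combinations of points in the disk stay in the disk.
# Sketch: sample pairs of points in the unit disk {x : ||x|| <= 1} and check
# that every sampled point on the connecting segment stays inside.
rng = np.random.RandomState(0)
for _ in range(1000):
    u, v = rng.randn(2), rng.randn(2)
    u = u / max(1, np.linalg.norm(u))     # pull each point into the disk
    v = v / max(1, np.linalg.norm(v))
    lam = rng.rand()
    assert np.linalg.norm(lam * u + (1 - lam) * v) <= 1 + 1e-12
print("all segment points stayed inside the disk")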
Not all sets are convex
The Most General Optimization Problem
Assume $f : \mathbb{R}^n \to \mathbb{R}$ is some function, and $C \subseteq \mathbb{R}^n$ is some set. The following is an optimization problem:
$$\text{minimize } f(x) \quad \text{subject to } x \in C$$
How hard is it to find a solution that is (near-) optimal? This is one of the fundamental problems
in Computer Science and Operations Research.
A huge portion of ML relies on this task
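As a concrete taste (a sketch, not from the slides; the quadratic objective and box constraint are made up), scipy can solve small instances of such problems numerically:
# Sketch: solve a tiny instance of min_{x in C} f(x) with scipy.
# Here f(x) = (x0 - 3)^2 + (x1 + 1)^2 and C is the box [0, 1] x [0, 1].
from scipy.optimize import minimize
f = lambda x: (x[0] - 3)**2 + (x[1] + 1)**2
res = minimize(f, x0=[0.5, 0.5], bounds=[(0, 1), (0, 1)])
print(res.x)  # ~[1, 0]: the point of the box closest to the unconstrained minimum (3, -1)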
A rough optimization hierarchy
[Really Easy] $C = \mathbb{R}^n$ (i.e. problem is unconstrained), $f$ is convex, $f$ is differentiable, strictly
convex, and has "slowly-changing" gradients
[Easyish] $C = \mathbb{R}^n$, $f$ is convex
[Medium] $C$ is a convex set, $f$ is convex
[Hard] $C$ is a convex set, $f$ is non-convex
[REALLY Hard] $C$ is an arbitrary set, $f$ is non-convex
Optimization without constraints
This problem tends to be easier than constrained optimization
We just need to find an $x$ such that $\nabla f(x) = 0$
Techniques like gradient descent or Newton's method work in this setting. (More on this later)
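Here is a minimal gradient-descent sketch (illustrative only; the quadratic objective, step size, and iteration count are hand-picked assumptions, not part of the slides):
# Sketch: gradient descent on f(x) = 0.5 * x^T A x - b^T x, whose gradient is A x - b.
# Since A is positive definite, f is strictly convex and the minimizer solves A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
step = 0.1                            # hand-picked step size
for _ in range(200):
    x = x - step * (A.dot(x) - b)     # step against the gradient
print(x, np.linalg.solve(A, b))       # the two should (approximately) agree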
Optimization with constraints
Here the set
$$C = \{x : g_i(x) \le 0, \ i = 1, \ldots, m\}$$
is convex as long as all the $g_i$ are convex
The solution of this optimization may occur in the interior of $C$, in which case the optimal $x^*$ will
have $\nabla f(x^*) = 0$
But what if the solution occurs on the boundary of $C$?
A Quick Overview of Lagrange Duality
Here we need to work with the Lagrangian:
$$L(x, \lambda) = f(x) + \sum_{i=1}^m \lambda_i g_i(x)$$
The vector $\lambda = (\lambda_1, \ldots, \lambda_m)$, with each $\lambda_i \ge 0$, holds the dual variables
For fixed $\lambda$, we now solve
$$\min_x L(x, \lambda)$$
A Quick Overview of Lagrange Duality
Assume, for every fixed $\lambda$, we found $x^*_\lambda$ such that
$$x^*_\lambda = \arg\min_x L(x, \lambda)$$
Now we have what is called the dual function,
$$h(\lambda) = L(x^*_\lambda, \lambda) = \min_x L(x, \lambda)$$
The Lagrange Dual Problem
What did we do here? We took one optimization problem:
$$\min_x f(x) \quad \text{subject to } g_i(x) \le 0, \ i = 1, \ldots, m$$
And then we got another optimization problem:
$$\max_{\lambda \ge 0} h(\lambda)$$
Sometimes this dual problem is easier to solve
We always have weak duality:
$$\max_{\lambda \ge 0} h(\lambda) \le \min_{x \in C} f(x)$$
Under nice conditions, we get strong duality:
$$\max_{\lambda \ge 0} h(\lambda) = \min_{x \in C} f(x)$$
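To make this concrete (a worked sketch, not from the slides; the toy problem is made up), take min $x^2$ subject to $x \ge 1$, i.e. $g(x) = 1 - x \le 0$, and let sympy do the algebra:
# Sketch of a full dual computation for: minimize x^2 subject to 1 - x <= 0.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x), with lam >= 0.
x, lam = symbols('x lam')
L = x**2 + lam * (1 - x)
x_star = sym.solve(L.diff(x), x)[0]        # minimize over x: x* = lam/2
h = sym.expand(L.subs(x, x_star))          # dual function h(lam) = lam - lam**2/4
lam_star = sym.solve(h.diff(lam), lam)[0]  # maximize over lam: lam* = 2
print(h.subs(lam, lam_star))               # 1, which equals the primal optimum f(1) = 1
Here strong duality holds: the dual optimum matches the primal optimum at $x = 1$.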
Recommended reading:
Boyd & Vandenberghe, Convex Optimization (Cambridge University Press)
Free online!
Chapter 5 covers duality