
\documentclass{article}
\usepackage{amsmath}

\title{Optimization Theory: Important Theorems and Examples}
\author{}
\date{}

\begin{document}
\maketitle
\section{Fermat's Theorem (First-order Necessary Condition)}
\textbf{Formula}: If $x^*$ is a local extremum (maximum or minimum) of a differentiable function $f(x)$, then the gradient of $f$ at $x^*$ must be zero:
\[ \nabla f(x^*) = 0 \]
where $\nabla f(x^*)$ is the gradient of the function at $x^*$.
\textbf{Example}: Suppose we want to find the extrema of the function $f(x) = x^2 - 4x + 3$.
- First, compute the derivative of the function: \[ f'(x) = 2x - 4 \]
- Set the derivative equal to zero to find the critical points: \[ 2x - 4 = 0 \quad \Rightarrow \quad x = 2 \]
- Now check whether this is a maximum or minimum by examining the second derivative: \[ f''(x) = 2 \]
Since $f''(x) > 0$, $x = 2$ is a minimum.
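The steps above can be checked numerically. A minimal sketch in plain Python (the function and its hand-computed derivatives are taken directly from the example):

```python
# f(x) = x^2 - 4x + 3 and its hand-computed derivative.
def f(x):
    return x**2 - 4*x + 3

def fprime(x):
    return 2*x - 4

x_star = 2.0
assert fprime(x_star) == 0        # Fermat's condition: f'(x*) = 0
# nearby points all have larger values, consistent with a minimum
assert all(f(x_star) <= f(x_star + h) for h in (-0.5, -0.1, 0.1, 0.5))
print(f(x_star))  # minimum value: -1.0
```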
\section{Second-order Sufficient Conditions}
\textbf{Formula}: If a twice-differentiable function $f(x)$ has a critical point $x^*$ where $\nabla f(x^*) = 0$, then:
- If $\nabla^2 f(x^*) > 0$ (positive definite), $x^*$ is a local minimum.
- If $\nabla^2 f(x^*) < 0$ (negative definite), $x^*$ is a local maximum.
- If $\nabla^2 f(x^*)$ is singular or indefinite, further tests are needed.
\textbf{Example}: Consider the function $f(x) = x^3 - 3x^2 + 2$.
- The first derivative is: \[ f'(x) = 3x^2 - 6x \]
- Setting it to zero to find the critical points: \[ 3x^2 - 6x = 0 \quad \Rightarrow \quad 3x(x - 2) = 0 \] So $x = 0$ or $x = 2$.
- Now compute the second derivative: \[ f''(x) = 6x - 6 \]
- At $x = 0$, $f''(0) = -6 < 0$, so $x = 0$ is a local maximum.
- At $x = 2$, $f''(2) = 6 > 0$, so $x = 2$ is a local minimum.
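The same classification can be verified in a few lines of Python, with the derivatives hand-coded from the example:

```python
# Critical points of f(x) = x^3 - 3x^2 + 2, classified by f''.
def fprime(x):
    return 3*x**2 - 6*x

def fsecond(x):
    return 6*x - 6

for x in (0.0, 2.0):
    assert fprime(x) == 0                       # critical point
    kind = "local min" if fsecond(x) > 0 else "local max"
    print(x, "->", kind)
```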
\section{Lagrange Multiplier Theorem (Constrained Optimization)}
\textbf{Formula}: For an optimization problem with an equality constraint, the method of Lagrange multipliers states that we need to find the stationary points of the Lagrangian function:
\[ \mathcal{L}(x, \lambda) = f(x) + \lambda g(x) \]
where:
- $f(x)$ is the objective function to be maximized or minimized.
- $g(x) = 0$ is the constraint.
- $\lambda$ is the Lagrange multiplier.
Candidate optimal points occur where the gradient of the Lagrangian vanishes:
\[ \nabla \mathcal{L}(x, \lambda) = 0 \]
\textbf{Example}: Maximize $f(x, y) = x + y$ subject to the constraint $x^2 + y^2 = 1$.

- The Lagrangian is: \[ \mathcal{L}(x, y, \lambda) = x + y + \lambda(x^2 + y^2 - 1) \]
- Taking partial derivatives and setting them equal to zero:
\[ \frac{\partial \mathcal{L}}{\partial x} = 1 + 2\lambda x = 0 \]
\[ \frac{\partial \mathcal{L}}{\partial y} = 1 + 2\lambda y = 0 \]
\[ \frac{\partial \mathcal{L}}{\partial \lambda} = x^2 + y^2 - 1 = 0 \]
- Solving this system of equations gives $x = y = \pm\frac{1}{\sqrt{2}}$; the maximum is attained at $x = y = \frac{1}{\sqrt{2}}$, with value $\sqrt{2}$.
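A quick numerical check of the stationarity system above, in plain Python. The multiplier value $\lambda = -1/\sqrt{2}$ follows from $\lambda = -1/(2x)$ at the maximizer:

```python
import math

# Verify the Lagrange conditions at the claimed maximizer of
# x + y subject to x^2 + y^2 = 1.
x = y = 1 / math.sqrt(2)
lam = -1 / math.sqrt(2)               # lambda = -1/(2x)
assert abs(1 + 2*lam*x) < 1e-12       # dL/dx = 0
assert abs(1 + 2*lam*y) < 1e-12       # dL/dy = 0
assert abs(x**2 + y**2 - 1) < 1e-12   # constraint holds
print(x + y)  # maximum value, sqrt(2) ~ 1.41421
```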
\section{Karush-Kuhn-Tucker (KKT) Conditions (Nonlinear Constrained Optimization)}
\textbf{Formula}: For a constrained optimization problem with inequality constraints $g_i(x) \leq 0$, the KKT conditions are:
1. \textbf{Primal feasibility}: $g_i(x) \leq 0$
2. \textbf{Dual feasibility}: $\lambda_i \geq 0$
3. \textbf{Stationarity}: $\nabla f(x) + \sum_i \lambda_i \nabla g_i(x) = 0$
4. \textbf{Complementary slackness}: $\lambda_i g_i(x) = 0$
\textbf{Example}: Minimize $f(x) = x^2$ subject to $x \geq 1$, written in standard form as $g(x) = 1 - x \leq 0$.
- Primal feasibility: $1 - x \leq 0$
- Dual feasibility: $\lambda \geq 0$
- Stationarity: $2x - \lambda = 0$
- Complementary slackness: $\lambda(1 - x) = 0$
Solving these conditions gives the optimal solution $x = 1$ with $\lambda = 2$: the constraint is active, so complementary slackness holds with a positive multiplier.
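A sketch checking these conditions numerically, writing the constraint in the standard form $g(x) = 1 - x \leq 0$ so that the multiplier comes out nonnegative:

```python
# KKT check for min x^2 subject to g(x) = 1 - x <= 0.
x, lam = 1.0, 2.0
g = 1 - x                           # constraint value at x
grad_f = 2 * x                      # gradient of the objective
grad_g = -1.0                       # gradient of the constraint
assert g <= 0                       # primal feasibility
assert lam >= 0                     # dual feasibility
assert grad_f + lam * grad_g == 0   # stationarity
assert lam * g == 0                 # complementary slackness
print("KKT conditions hold at x =", x)
```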
\section{Convex Optimization Theorem}
\textbf{Formula}: If $f(x)$ is a convex function and $X$ is a convex feasible region, then any local minimum is also a global minimum.
\textbf{Example}: Minimize $f(x) = x^2$ over $x \in \mathbb{R}$.
- Since $f(x) = x^2$ is convex (the second derivative $f''(x) = 2 > 0$), and the feasible region is $\mathbb{R}$, the global minimum occurs at $x = 0$.
\section{Duality Theorem (Strong Duality)}
\textbf{Formula}: The dual problem provides a lower bound on the primal problem. \textbf{Strong duality} means that the optimal value of the dual problem is equal to the optimal value of the primal problem; for convex problems it holds under a constraint qualification such as Slater's condition.
If the primal problem is:
\[ \min_x f(x) \quad \text{subject to} \quad g(x) \leq 0 \]
then the dual problem is:
\[ \max_{\lambda \geq 0} \, \min_x \, L(x, \lambda) \]
where $L(x, \lambda) = f(x) + \lambda g(x)$ is the Lagrangian.
\textbf{Example}: For a linear programming problem, strong duality states that solving the dual problem gives the same optimal value as solving the primal problem.
\section{Subdifferential Theorem (Subgradient)}
\textbf{Formula}: For a (possibly non-smooth) convex function $f(x)$, the subdifferential at a point $x$ is the set of all vectors $g$ such that:
\[ f(y) \geq f(x) + g^\top (y - x) \quad \forall y \]
where each such $g$ is called a \textbf{subgradient} of $f$ at $x$.
\textbf{Example}: For the function $f(x) = |x|$, the subdifferential at $x = 0$ is the interval $[-1, 1]$, because the function has a sharp corner at $x = 0$.
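The subgradient inequality for $f(x) = |x|$ at $x = 0$ reduces to $|y| \geq g\,y$ for all $y$, which can be spot-checked on a grid of sample points:

```python
# For f(x) = |x| at x = 0, g is a subgradient iff |y| >= g*y for all y.
def is_subgradient_at_zero(g, samples):
    return all(abs(y) >= g * y - 1e-12 for y in samples)

ys = [i / 10 for i in range(-50, 51)]   # sample points in [-5, 5]
assert is_subgradient_at_zero(0.5, ys)      # inside [-1, 1]: holds
assert is_subgradient_at_zero(-1.0, ys)     # boundary of [-1, 1]: holds
assert not is_subgradient_at_zero(1.5, ys)  # outside [-1, 1]: fails
```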
\section{Frank-Wolfe Algorithm (Conditional Gradient Method)}
\textbf{Formula}: At each step, the Frank-Wolfe algorithm minimizes a linear approximation of the objective over the feasible region $X$:
\[ s_k = \text{argmin}_{s \in X} \, \nabla f(x_k)^\top s \]
and then moves part of the way toward that point:
\[ x_{k+1} = x_k + \gamma_k (s_k - x_k), \quad \gamma_k \in [0, 1] \]
\textbf{Example}: For a convex problem such as minimizing $f(x) = x^2$ over the interval $[0, 1]$, the Frank-Wolfe method iteratively moves toward the optimal solution by solving these linearized subproblems.
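A minimal sketch of the method for this one-dimensional example. The feasible set $[0, 1]$ has vertices $0$ and $1$, so the linear subproblem just picks an endpoint; the step size $\gamma_k = 2/(k+2)$ is the standard schedule:

```python
# Frank-Wolfe for f(x) = x^2 over the interval [0, 1].
def frank_wolfe(x0, iters=50):
    x = x0
    for k in range(iters):
        grad = 2 * x
        # Linear subproblem: minimize grad * s over s in [0, 1],
        # which is attained at one of the two endpoints.
        s = 0.0 if grad >= 0 else 1.0
        gamma = 2 / (k + 2)           # standard step-size schedule
        x = x + gamma * (s - x)
    return x

print(frank_wolfe(0.9))  # converges to the minimizer x = 0
```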
\section{Gradient Descent and Convergence Theorem}
\textbf{Formula}: The gradient descent algorithm updates the solution by moving in the direction of the negative gradient:
\[ x_{k+1} = x_k - \alpha \nabla f(x_k) \]
where $\alpha > 0$ is the step size.
\textbf{Example}: Minimize $f(x) = x^2$ using gradient descent:
- Start with $x_0 = 2$ and choose $\alpha = 0.1$.
- Iteratively update $x_k$ until convergence; here the update is $x_{k+1} = x_k - 0.1 \cdot 2x_k = 0.8\, x_k$, so the iterates shrink geometrically toward the minimizer $x = 0$.
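This iteration is a few lines of Python; with these values each step multiplies $x$ by $0.8$, so the iterate is well within any tolerance after a hundred steps:

```python
# Gradient descent on f(x) = x^2: x_{k+1} = x_k - alpha * f'(x_k).
def gradient_descent(x0, alpha=0.1, iters=100):
    x = x0
    for _ in range(iters):
        x = x - alpha * (2 * x)   # f'(x) = 2x
    return x

x_final = gradient_descent(2.0)
print(abs(x_final) < 1e-6)  # converged near the minimizer 0
```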
\section{No-Free-Lunch Theorem (Optimization)}
\textbf{Formula}: The No-Free-Lunch Theorem states that, averaged over all possible problems, every optimization algorithm performs equally well; no algorithm can outperform the others across all problems, so there is no universal best algorithm.
\textbf{Example}: For smooth, well-behaved problems, methods like gradient descent work well, but for non-smooth or discrete problems, algorithms like genetic algorithms or simulated annealing may perform better.
\end{document}
