Preamble: Run the cells below to import the necessary Python packages
This notebook was created by William Gilpin. Consult the course website for all content, and the GitHub repository for raw files and runnable online code.
# Import local plotting functions and in-notebook display functions
import matplotlib.pyplot as plt
%matplotlib inline
# Import linear algebra module
import numpy as np
Constrained optimization
Constrained optimization problems arise in physics when we have an objective function like energy that we want to minimize, but we have constraints like mechanical limits (fixed length of a pendulum) or thermodynamic (conservation of energy) that restrict the set of solutions we can consider.
We start by stating our optimization problem in the form of a minimization problem:

$$\min_x f(x) \quad \text{subject to} \quad x \geq 0$$

where the univariate $f(x)$ is a convex function. Any value of $x$ that satisfies the constraint is called a feasible solution. The set of all feasible solutions is called the feasible set.

As an example, below we choose the function $f(x) = x^2 - 2x$.
x = np.linspace(-4, 4, 100)
f = lambda x: x**2 - 2 * x
plt.plot(x, f(x))
plt.xlabel('x')
plt.ylabel('f(x)')

We start by writing this problem in the form of a Lagrangian:

$$\mathcal{L}(x, \lambda) = f(x) - \lambda x$$

where $\lambda$ is the Lagrange multiplier. If $\lambda \geq 0$, then the solution to the optimization problem is given by taking the derivative of the Lagrangian with respect to $x$ and setting it equal to zero:

$$\frac{\partial \mathcal{L}}{\partial x} = f'(x) - \lambda = 0$$
lagrange_primal = lambda x, lam: f(x) - lam * x
xx = np.linspace(-4, 4, 100)
lamvals = np.linspace(0, 4, 100)
# create uniform meshgrid
x_mesh, lam_mesh = np.meshgrid(xx, lamvals)
langrange_primal_values = lagrange_primal(x_mesh, lam_mesh)
plt.figure()
plt.scatter(x_mesh.ravel(), lam_mesh.ravel(),
c=langrange_primal_values.ravel(),
cmap='viridis'
)
plt.ylim(np.min(lamvals), np.max(lamvals))
plt.xlim(np.min(xx), np.max(xx))
plt.xlabel('x')
plt.ylabel(r'$\lambda$')
plt.colorbar(label='f(x) - lambda*x')

The resulting relationship, $\lambda = f'(x)$, is the KKT condition for the optimization problem. Notice that there is an implicit constraint on $\lambda$ that $\lambda \geq 0$. If $\lambda = 0$, then the solution to the original optimization problem is the unconstrained minimum, where $f'(x) = 0$.

For our example, we have $f(x) = x^2 - 2x$, so the KKT condition becomes

$$\lambda = 2x - 2$$

with the constraint $\lambda \geq 0$. We can think of the resulting line in the $(x, \lambda)$ plane as the equilibrium line, where the Lagrangian is stationary with respect to $x$.
# The KKT condition
lam_kkt = lambda x: 2 * x - 2
plt.figure()
plt.scatter(x_mesh.ravel(), lam_mesh.ravel(),
c=langrange_primal_values.ravel(),
cmap='viridis'
)
plt.plot(xx, lam_kkt(xx), '--r', dashes=(8, 3))
plt.ylim(np.min(lamvals), np.max(lamvals))
plt.xlim(np.min(xx), np.max(xx))
plt.xlabel('x')
plt.ylabel(r'$\lambda$')

Now, we seek to write the dual function. We revisit the Lagrangian and use the KKT condition to write it as a function of $\lambda$ only.

For our example, we have $\mathcal{L}(x, \lambda) = x^2 - 2x - \lambda x$, so the Lagrangian becomes

$$\mathcal{L}(x, \lambda) = x^2 - (2 + \lambda)x$$

We insert our condition, $x = (\lambda + 2)/2$, into the Lagrangian to get the dual function:

$$g(\lambda) = \left(\frac{\lambda + 2}{2}\right)^2 - (2 + \lambda)\left(\frac{\lambda + 2}{2}\right)$$

which reduces to,

$$g(\lambda) = -\frac{1}{4}(\lambda + 2)^2$$

The dual optimization problem is then

$$\max_\lambda g(\lambda) \quad \text{subject to} \quad \lambda \geq 0$$
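As a quick symbolic sanity check (an addition using sympy, which is not used elsewhere in this notebook), we can rederive the dual function directly from the Lagrangian:

```python
import sympy as sp

x, lam = sp.symbols('x lambda', real=True)
L = x**2 - 2*x - lam*x                    # Lagrangian for f(x) = x^2 - 2x, constraint x >= 0
x_star = sp.solve(sp.diff(L, x), x)[0]    # stationarity: dL/dx = 0 gives x = (lambda + 2)/2
g_dual = sp.simplify(L.subs(x, x_star))   # dual function g(lambda)
print(x_star)
print(g_dual)
```

The printed expressions should match the KKT relation $x = (\lambda + 2)/2$ and the dual $g(\lambda) = -(\lambda + 2)^2/4$ derived above.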
g = lambda lam: -(1 / 4) * (lam + 2)**2
plt.figure()
plt.plot(lamvals, g(lamvals))
plt.xlabel(r'$\lambda$')
plt.ylabel('Dual function g')

We can see that we now have two formulations of our optimization problem.
The primal problem is to minimize $f(x)$ subject to $x \geq 0$.
The dual problem is to maximize $g(\lambda)$ subject to $\lambda \geq 0$.
Once we solve for $\lambda^\ast$, we can find the optimal $x^\ast$ by using the KKT condition, and vice versa.
import scipy.optimize
lam_opt = scipy.optimize.minimize(lambda lam: -g(lam), 0, bounds=[(0, 6)]).x[0]
x_from_kkt = (2 + lam_opt) / 2
print('Optimal lambda:', lam_opt)
print('Optimal x:', x_from_kkt)
x_opt = scipy.optimize.minimize(f, 0, bounds=[(0, 6)]).x[0]
lam_from_kkt = 2 * x_opt - 2
print('Optimal lambda:', lam_from_kkt)
print('Optimal x:', x_opt)
Optimal lambda: 0.0
Optimal x: 1.0
Optimal lambda: -1.829984008772101e-08
Optimal x: 0.99999999085008
The two approaches are equivalent, and the optimal values of the primal and dual problems are the same. This is known as strong duality.
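We can confirm this directly by comparing the optimal primal and dual values (a quick self-contained check; `f` and `g` are restated from above):

```python
f = lambda x: x**2 - 2 * x                  # primal objective
g = lambda lam: -(1 / 4) * (lam + 2)**2     # dual function

x_opt, lam_opt = 1.0, 0.0                   # optima found above
duality_gap = f(x_opt) - g(lam_opt)
print(duality_gap)  # 0.0: the primal and dual optima coincide
```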
When is duality broken?
Duality is broken when the primal problem is not convex. In this case, the KKT condition may not be sufficient to find the optimal solution. In general, the KKT condition is necessary but not sufficient for optimality. Let’s try a different function,

$$f(x) = g x^3 + x^2 - 2x$$

The parameter $g$ is a constant that we can vary. For $g = 0$, the function is convex, but for all other values of $g$, the function is not convex. The Lagrangian is

$$\mathcal{L}(x, \lambda) = g x^3 + x^2 - 2x - \lambda x$$

The KKT condition becomes

$$3 g x^2 + 2x - 2 - \lambda = 0$$

Solving this expression for $x$ gives

$$x = \frac{-1 \pm \sqrt{1 + 3g(2 + \lambda)}}{3g}$$

We insert this expression into the Lagrangian to get the dual function $q(\lambda)$.

To resolve the $\pm$ ambiguity, we take the limit as $g \to 0$. The correct root is the one that recovers the dual function we derived earlier for the convex case.
Let’s plot the primal and dual functions to see what happens.
import numpy as np
import matplotlib.pyplot as plt
def f(x, g=0):
    return g * x**3 + x**2 - 2 * x

def L(x, lam):
    return f(x) - lam * x

def q(lam, g=0.0001):
    # Lagrangian evaluated at the stationary point, using the root that
    # recovers the convex-case dual as g -> 0
    return ((-1 + np.sqrt(1 + 3*g*(2 + lam))) *
            (-1 - 6*g*(2 + lam) + np.sqrt(1 + 3*g*(2 + lam)))) / (27. * g**2)
x = np.linspace(-1, 3, 100)
lam = np.linspace(-3, 3, 100)
plt.figure()
plt.plot(x, f(x), label='f(x)')
plt.xlabel('x')
plt.ylabel('f(x)')
plt.legend()
plt.figure()
plt.plot(lam, q(lam), label='q(lambda)')
plt.xlabel(r'$\lambda$')
plt.ylabel(r'$q(\lambda)$')
plt.legend()


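As a numerical sanity check on the choice of root (an addition, restating `q` for self-containment): for small $g$, $q(\lambda)$ should approach the convex-case dual $-(\lambda + 2)^2/4$.

```python
import numpy as np

def q(lam, g=1e-4):
    # dual-candidate function for f(x) = g*x**3 + x**2 - 2*x, using the
    # root that reduces to the convex-case dual as g -> 0
    s = np.sqrt(1 + 3*g*(2 + lam))
    return ((-1 + s) * (-1 - 6*g*(2 + lam) + s)) / (27. * g**2)

g_convex = lambda lam: -(1 / 4) * (lam + 2)**2   # convex-case dual

for gval in [1e-2, 1e-4, 1e-6]:
    print(gval, q(1.0, g=gval) - g_convex(1.0))  # difference shrinks as g -> 0
```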
Our two optimization problems remain the same:
The primal problem is to minimize $f(x)$ subject to $x \geq 0$.
The dual problem is to maximize $q(\lambda)$ subject to $\lambda \geq 0$.
We’ll start with $g \approx 0$ to confirm that we get the same results as before.
import scipy.optimize
gval = 1e-8
lam_opt = scipy.optimize.minimize(lambda lam: -q(lam, g=gval), 0, bounds=[(0, 5)]).x[0]
x_from_kkt = (-1 + np.sqrt(1 + 6 * gval + 3 * gval * lam_opt))/(3. * gval)
print('Optimal lambda:', lam_opt)
print('Optimal x:', x_from_kkt)
x_opt = scipy.optimize.minimize(lambda x: f(x, g=gval), 0, bounds=[(0, 5)]).x[0]
lam_from_kkt = 3 * gval * x_opt**2 + 2 * x_opt - 2
print('Optimal lambda:', lam_from_kkt)
print('Optimal x:', x_opt)
Optimal lambda: 0.0
Optimal x: 0.999999986521042
Optimal lambda: 5.9544609243289415e-09
Optimal x: 0.9999999879772308
The duality gap
The difference between the optimal values of the primal and dual problems is called the duality gap. When the duality gap is zero, the primal and dual problems are said to have strong duality. When the duality gap is positive, the primal and dual problems are said to have weak duality.
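Before sweeping over $g$, here is a minimal self-contained sketch of measuring the gap at a single non-convex $g$ (`f` and `q` are restated from above). Note that for this constraint the optimum lies in the interior ($x^\ast > 0$), so the measured gap can remain near zero even when $f$ is non-convex:

```python
import numpy as np
import scipy.optimize

def f(x, g=0.0):
    return g * x**3 + x**2 - 2 * x

def q(lam, g=1e-4):
    s = np.sqrt(1 + 3*g*(2 + lam))
    return ((-1 + s) * (-1 - 6*g*(2 + lam) + s)) / (27. * g**2)

gval = 1.0  # a strongly non-convex case
x_opt = scipy.optimize.minimize(lambda x: f(x, g=gval), 0, bounds=[(0, 5)]).x[0]
lam_opt = scipy.optimize.minimize(lambda lam: -q(lam, g=gval), 0, bounds=[(0, 5)]).x[0]
gap = f(x_opt, g=gval) - q(lam_opt, g=gval)
print(gap)
```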
gvals = np.logspace(-14, 10, 50)
all_lam_dual, all_x_dual = [], []      # from solving the dual problem
all_lam_primal, all_x_primal = [], []  # from solving the primal problem
for gval in gvals:
    # Solve the dual problem, then recover x from the KKT condition
    lam_opt = scipy.optimize.minimize(lambda lam: -q(lam, g=gval), 0, bounds=[(0, 5)]).x[0]
    x_from_kkt = (-1 + np.sqrt(1 + 6 * gval + 3 * gval * lam_opt)) / (3. * gval)
    all_lam_dual.append(lam_opt)
    all_x_dual.append(x_from_kkt)
    # Solve the primal problem, then recover lambda from the KKT condition
    x_opt = scipy.optimize.minimize(lambda x: f(x, g=gval), 0, bounds=[(0, 5)]).x[0]
    lam_from_kkt = 3 * gval * x_opt**2 + 2 * x_opt - 2
    all_lam_primal.append(lam_from_kkt)
    all_x_primal.append(x_opt)

plt.figure()
plt.semilogx(gvals, all_lam_dual, label='lambda (dual problem)')
plt.semilogx(gvals, all_lam_primal, label='lambda (primal problem, via KKT)')
plt.xlabel('g')
plt.ylabel(r'$\lambda$')
plt.legend()
