Stanford EE364A Convex Optimization I Stephen Boyd I 2023 I Lecture 15
25 Mar 2024
Self-Concordance
- Self-concordance allows for an analysis of Newton's method independent of affine changes of coordinates.
- Self-concordance provides a way to measure the smoothness of the Hessian that is independent of affine changes of coordinates.
- Many functions commonly used in optimization satisfy the self-concordant property.
- The analysis of Newton's method for self-concordant functions is simple and provides strong convergence guarantees.
- Self-concordance is often used to prove the convergence of interior-point methods.
- Formulations that involve self-concordant functions tend to perform better in practice.
- Each iteration of Newton's method minimizes a quadratic approximation of the function, which amounts to solving a linear system with the Hessian; this can be done efficiently using linear algebra techniques.
- Sparse or banded Hessian matrices can be exploited to speed up the solution of the linear systems.
- Separable functions have a diagonal Hessian, which simplifies the computations.
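- A diagonal plus low-rank Hessian arises, for example, when a separable function is added to a function of a small number p of linear combinations of the variables; separability alone gives a purely diagonal Hessian.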
- If the Hessian matrix is diagonal plus low rank, then the Newton step can be computed in O(p^2 N) time, where p is the rank of the low-rank part.
- For banded Hessian matrices with a fixed bandwidth, the Newton step can be computed in O(N) time (O(k^2 N) for bandwidth k).
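The diagonal plus low-rank case can be sketched as follows. This is a minimal illustration, assuming the Hessian has the form diag(d) + U U^T with d > 0 and U of size N x p; the function name and the test data are hypothetical. It uses the matrix inversion lemma so that the N x N Hessian is never formed.

```python
import numpy as np

def newton_step_diag_plus_low_rank(d, U, g):
    """Solve (diag(d) + U @ U.T) v = -g without forming the N x N Hessian."""
    Dinv_g = g / d                           # D^{-1} g, O(N)
    Dinv_U = U / d[:, None]                  # D^{-1} U, O(p N)
    S = np.eye(U.shape[1]) + U.T @ Dinv_U    # small p x p matrix, O(p^2 N) to form
    w = np.linalg.solve(S, U.T @ Dinv_g)     # O(p^3)
    return -(Dinv_g - Dinv_U @ w)

# Hypothetical data: N = 2000 variables, rank-5 correction.
rng = np.random.default_rng(0)
N, p = 2000, 5
d = rng.uniform(1.0, 2.0, N)
U = rng.standard_normal((N, p))
g = rng.standard_normal(N)
v = newton_step_diag_plus_low_rank(d, U, g)
```

The dominant cost is forming the p x p matrix, which matches the O(p^2 N) count quoted above.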
Equality Constraints
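- The KKT matrix is invertible if and only if the null spaces of A and P intersect only at zero, where P is the Hessian of the quadratic objective and A is assumed to have full row rank.
- Equality constraints can be eliminated by parameterizing the feasible set as the image of an affine function, x = Fz + x0, where the columns of F span the null space of A and x0 is any feasible point.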
- Newton's method can be used to solve the resulting unconstrained problem.
- The problem involves resource allocation, where resources X1 to Xn are allocated to agents 1 to n, with a total budget B.
- The goal is to minimize the total cost function F(X1, …, Xn) subject to the equality constraint that the sum of the allocated resources equals the total budget.
- To eliminate the equality constraint, one can express Xn in terms of X1 to Xn-1 and the total budget B.
- The resulting reduced problem is unconstrained, so each Newton step comes down to solving a linear system.
- Alternatively, the equality constraints can be kept: the Newton step is then obtained directly from the KKT system, which can be derived either from the quadratic approximation of the objective or by linearizing the optimality conditions.
- Newton's method with equality constraints produces the same iterates as Newton's method applied to the reduced problem, but it solves a larger linear system (the KKT system) at each step.
- Started from a feasible point, Newton's method with equality constraints is a feasible descent method: the iterates remain feasible and the function value decreases at each iteration.
- Its convergence analysis therefore follows from the convergence analysis of Newton's method without equality constraints.
- Solving the larger KKT system is often more efficient than solving the smaller reduced system, because elimination can destroy sparsity or other structure that the KKT system preserves.
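The two routes to the Newton step for the resource allocation problem can be compared in a small sketch. The cost function f(x) = c^T x - sum(log x_i), the vector c, and the starting point are assumptions chosen only for illustration; the lecture does not specify them. One function solves the KKT system, the other eliminates the last variable using the budget constraint.

```python
import numpy as np

def grad_hess(x, c):
    """Gradient and diagonal Hessian of the assumed cost f(x) = c @ x - sum(log(x))."""
    return c - 1.0 / x, np.diag(1.0 / x**2)

def newton_step_kkt(x, c):
    """Newton step from the KKT system with the constraint sum(x) = b."""
    n = x.size
    g, H = grad_hess(x, c)
    A = np.ones((1, n))
    K = np.block([[H, A.T], [A, np.zeros((1, 1))]])
    sol = np.linalg.solve(K, np.concatenate([-g, [0.0]]))
    return sol[:n]                      # drop the dual variable

def newton_step_eliminated(x, c):
    """Newton step after eliminating x_n = b - x_1 - ... - x_{n-1}."""
    n = x.size
    g, H = grad_hess(x, c)
    F = np.vstack([np.eye(n - 1), -np.ones((1, n - 1))])  # columns span null space of 1^T
    dz = np.linalg.solve(F.T @ H @ F, -F.T @ g)           # reduced Newton system
    return F @ dz

c = np.array([1.0, 2.0, 0.5, 1.5])
x = np.array([0.5, 1.0, 1.5, 1.0])      # feasible for a budget b = 4
dx_kkt = newton_step_kkt(x, c)
dx_elim = newton_step_eliminated(x, c)
```

Both steps agree and sum to zero, so a step from a feasible point stays on the budget constraint.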
Infeasible Start Newton Method
- The infeasible start Newton method is an extension of Newton's method for equality-constrained problems that works even when the initial point is not feasible (i.e., does not satisfy the equality constraints), as long as it lies in the domain of the function.
- If the method takes a full step of size one, the equality constraints are satisfied after that step and at every later iterate.
- The backtracking search in the infeasible start Newton method is performed on the norm of the residual rather than the function value.
- If the Newton step takes the point outside the domain of the function, the step size is reduced until the point is back in the domain.
- The directional derivative of the norm of the residual along the Newton step equals minus the norm of the residual, so the step is a descent direction for the residual norm and the backtracking search always finds a step size that reduces it.
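A minimal sketch of the infeasible start Newton method, under assumptions: the objective is f(x) = -sum(log x_i) with constraint Ax = b (an analytic centering problem chosen here for illustration), and the parameters alpha, beta, the tolerance, and the small test problem are hypothetical. The backtracking first pulls the step back into the domain x > 0, then backtracks on the residual norm.

```python
import numpy as np

def residual(x, nu, A, b):
    """Stacked dual and primal residuals for f(x) = -sum(log(x)), Ax = b."""
    return np.concatenate([-1.0 / x + A.T @ nu, A @ x - b])

def infeasible_start_newton(A, b, x, nu, alpha=0.1, beta=0.5, tol=1e-10, max_iter=50):
    p, n = A.shape
    for _ in range(max_iter):
        r = residual(x, nu, A, b)
        if np.linalg.norm(r) <= tol:
            break
        H = np.diag(1.0 / x**2)
        K = np.block([[H, A.T], [A, np.zeros((p, p))]])
        step = np.linalg.solve(K, -r)            # primal-dual Newton step
        dx, dnu = step[:n], step[n:]
        t = 1.0
        while np.any(x + t * dx <= 0):           # stay inside the domain x > 0
            t *= beta
        while (np.linalg.norm(residual(x + t * dx, nu + t * dnu, A, b))
               > (1 - alpha * t) * np.linalg.norm(r)):
            t *= beta                            # backtrack on the residual norm
        x, nu = x + t * dx, nu + t * dnu
    return x, nu

A = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]])
b = np.array([3.0, 6.0])
x0 = np.array([2.0, 2.0, 2.0])                   # positive but A @ x0 != b
x_star, nu_star = infeasible_start_newton(A, b, x0, np.zeros(2))
```

For this small problem the first step has size one, so the iterate is feasible from then on.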
Solving Linear Systems
- LDL^T factorization or block elimination can be used to solve the KKT system efficiently, especially when the Hessian matrix H is sparse or diagonal.
- Three methods are compared: Newton's method with equality constraints (feasible start), Newton's method applied to the dual problem, and the infeasible start Newton method.
- All these methods have different initialization requirements, but in terms of the work done per iteration, they are essentially the same.
- The choice of method depends on the problem at hand and the ease of initialization.
- With naive linear algebra the methods differ in cost per iteration, but with proper linear algebra techniques they have identical complexity.
- Written out directly, the two primal methods solve a KKT system, which is larger than the linear system in Newton's method applied to the dual.
- After block elimination, the linear systems in all three methods have the same structure, so the work per iteration is the same.
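Block elimination of the KKT system with a diagonal Hessian can be sketched as below. The function name, the right-hand side convention -[g; h], and the random test data are assumptions for illustration. Eliminating the primal block leaves a p x p system with matrix A H^-1 A^T.

```python
import numpy as np

def solve_kkt_block_elim(h_diag, A, g, h):
    """Solve [[diag(h_diag), A.T], [A, 0]] @ [v; w] = -[g; h] by eliminating v."""
    Hinv_g = g / h_diag                      # H^{-1} g, cheap because H is diagonal
    Hinv_At = A.T / h_diag[:, None]          # H^{-1} A^T
    S = A @ Hinv_At                          # p x p matrix A H^{-1} A^T
    w = np.linalg.solve(S, h - A @ Hinv_g)   # reduced system for the dual block
    v = -(Hinv_g + Hinv_At @ w)              # back-substitute for the primal block
    return v, w

# Hypothetical data: n = 6 variables, p = 2 equality constraints.
rng = np.random.default_rng(1)
n, p = 6, 2
h_diag = rng.uniform(0.5, 2.0, n)
A = rng.standard_normal((p, n))
g = rng.standard_normal(n)
h = rng.standard_normal(p)
v, w = solve_kkt_block_elim(h_diag, A, g, h)
```

A dense solve of the full (n + p) x (n + p) system gives the same answer at higher cost when n is large.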
Flow Utility Maximization
- The KKT system for flow utility maximization has a diagonal Hessian and a sparse incidence matrix.
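- The full incidence matrix is not full rank because its rows sum to zero, so the row for one node, called the ground node, is removed.
- Block elimination leads to the matrix A H^-1 A^T, where A is the incidence matrix and H is the diagonal Hessian; its diagonal entries are positive and its off-diagonal entries are nonpositive.
- With the full incidence matrix, each row of this matrix sums to zero, making it a weighted Laplacian matrix; removing the ground node gives a reduced Laplacian.
- Solving Laplacian systems is a specialized field in theoretical computer science.
- The sparsity pattern of the Laplacian matrix can be determined immediately from the graph (an off-diagonal entry is nonzero only if the two nodes share an arc), making it suitable for solving network flow optimization problems for large-scale systems.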
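The Laplacian structure can be checked on a small graph. The four-node graph, the arc list, and the weights (standing in for the diagonal of H^-1) are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical directed graph: 4 nodes, 4 arcs given as (tail, head).
arcs = [(0, 1), (1, 2), (2, 3), (0, 2)]
n_nodes = 4

# Node-arc incidence matrix: +1 where the arc leaves a node, -1 where it enters.
A = np.zeros((n_nodes, len(arcs)))
for j, (tail, head) in enumerate(arcs):
    A[tail, j] = 1.0
    A[head, j] = -1.0

h_inv = np.array([0.5, 2.0, 1.0, 4.0])      # assumed diagonal of H^{-1}, all positive

L = A @ np.diag(h_inv) @ A.T                # weighted Laplacian from the full incidence matrix
A_red = A[:-1, :]                           # drop the row of the ground node (node 3)
L_red = A_red @ np.diag(h_inv) @ A_red.T    # reduced Laplacian, used in the Newton step
```

Nodes 0 and 3 share no arc, so the corresponding entry of L is zero, and the reduced matrix is positive definite because the graph is connected.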
Linearizing Optimality Conditions
- An alternative approach to solving optimization problems with matrix variables involves linearizing the optimality conditions, which leads to a dense positive definite system of linear equations.
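- This approach reduces the cost of a Newton step from order n^6 for a generic solve (a symmetric n x n matrix variable has about n^2/2 entries) to the cost of forming and solving a p x p system, where p is the number of equality constraints; the dominant terms are of order p n^3, p^2 n^2, and p^3.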
- This method is commonly used in solvers for semidefinite programming (SDP) problems.
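A sketch of such a Newton step for a matrix variable, under assumptions: the problem is taken to be minimizing -log det X subject to tr(A_i X) = b_i with X feasible, which is a standard example but is not spelled out in the summary; the function name and random data are hypothetical. Linearizing the optimality conditions and eliminating the matrix step leaves a dense positive definite p x p system.

```python
import numpy as np

def lmi_newton_step(X, A_list):
    """Newton step dX for min -log det X s.t. tr(A_i X) = b_i, at a feasible X.

    Eliminating dX = X - sum_j w_j X A_j X leaves G w = b with
    G[i, j] = tr(A_i X A_j X) and b[i] = tr(A_i X).
    """
    p = len(A_list)
    XA = [X @ Aj for Aj in A_list]                   # p products, each O(n^3)
    G = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            G[i, j] = np.sum(XA[i] * XA[j].T)        # tr((X A_i)(X A_j)), O(n^2)
    b = np.array([np.trace(M) for M in XA])          # tr(X A_i) = tr(A_i X)
    w = np.linalg.solve(G, b)                        # dense p x p system
    dX = X - sum(wj * (M @ X) for wj, M in zip(w, XA))
    return dX, w

# Hypothetical data: n = 4, p = 2 symmetric constraint matrices.
rng = np.random.default_rng(2)
n, p = 4, 2
A_list = []
for _ in range(p):
    M = rng.standard_normal((n, n))
    A_list.append(M + M.T)
B = rng.standard_normal((n, n))
X = B @ B.T + n * np.eye(n)                          # positive definite current point
dX, w = lmi_newton_step(X, A_list)
```

The step satisfies tr(A_i dX) = 0, so it keeps the equality constraints satisfied.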
Debugging
- It is important to recognize when debugging efforts are taking too long and to seek help or take a break to avoid wasting time.
Approximating Smooth Functions
- The video discusses methods for solving optimization problems with smooth functions subject to inequality constraints.
- The approach approximates the smooth function by a quadratic at each step, taking on the order of 22 steps in total.
- Each step reduces to minimizing a quadratic function with equality constraints, which can be solved using linear algebra methods.
- Examples of problems that can be solved using this approach include LP, QP, QCQP, GP, and entropy maximization with linear inequality constraints, as well as SDPs and SOCPs.
- The log barrier function is a smooth approximation of the indicator function of the inequality constraints, which turns the problem into a smooth one.
- The story of the log barrier function is compared to the story of Lagrangian duality, where an initially poor approximation leads to a good approximation under certain conditions.
- The video provides an intuition for using the log barrier function to apply Newton's method to problems with inequality constraints.
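The idea that the log barrier approximates the constraints can be illustrated with a small sketch. It assumes the usual scaled form -(1/t) log(-u) for a constraint value u < 0 with accuracy parameter t > 0; the function name and sample values are hypothetical. The exact indicator is 0 for u <= 0 and infinite otherwise, and the approximation improves as t grows.

```python
import numpy as np

def log_barrier_penalty(u, t):
    """Smooth stand-in for the indicator of u <= 0: -(1/t) log(-u), +inf if u >= 0."""
    u = np.asarray(u, dtype=float)
    out = np.full(u.shape, np.inf)           # outside the domain the penalty is infinite
    neg = u < 0
    out[neg] = -np.log(-u[neg]) / t
    return out

u = np.array([-2.0, -0.5, -0.01])
coarse = log_barrier_penalty(u, t=1.0)       # poor approximation of the indicator
fine = log_barrier_penalty(u, t=1e4)         # much closer to zero on the feasible side
```

For any fixed strictly feasible value, the penalty shrinks toward zero as t increases, while it still blows up at the boundary.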