MA630 Advanced Optimization Methods
Course Catalog Description
The objective of this course is to introduce students to advanced topics in the
theory and methods of optimization. The first part of the course focuses on subgradient
calculus for non-smooth convex functions, selected results of convex analysis, optimality
conditions for non-smooth optimization problems, composite optimization problems, and
conjugate and Lagrangian duality for non-smooth problems. Problems defined by
differentiable but possibly non-convex functions will be reviewed briefly. The second part
of the course discusses numerical methods for non-smooth optimization as well as
approaches to large-scale optimization problems. The latter include decomposition methods,
the design of distributed and parallel optimization methods, and stochastic approximation
methods. Along with the theoretical results and methods, examples of optimization models
in statistical learning, data mining, compressed sensing, and image reconstruction will be
discussed in order to illustrate the challenges and phenomena involved and to demonstrate
the scope of applications.
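The subgradient method for non-smooth convex functions mentioned above admits a very compact implementation. The following is a minimal illustrative sketch (not taken from the course materials; the function, step-size rule, and parameters are chosen here for concreteness): it minimizes f(x) = ||x||_1 with diminishing step sizes, tracking the best iterate because subgradient steps need not decrease f.

```python
import numpy as np

def subgradient_method(x0, steps=200):
    """Minimize f(x) = ||x||_1 by the subgradient method (minimizer: x = 0)."""
    x = np.asarray(x0, dtype=float)
    best = x.copy()
    for k in range(steps):
        g = np.sign(x)            # a subgradient of ||x||_1 at x
        x = x - g / (k + 1)       # diminishing steps: sum a_k = inf, a_k -> 0
        if np.abs(x).sum() < np.abs(best).sum():
            best = x.copy()       # subgradient methods are not descent methods,
                                  # so the best iterate seen so far is retained
    return best

x_best = subgradient_method([3.0, -2.0])
print(np.abs(x_best).sum())       # a small value near 0
```

The divergent-series condition on the step sizes (their sum diverges while the steps themselves vanish) is what drives the best objective value to the optimum, at the cost of slow, non-monotone progress.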
The students are expected to have knowledge of calculus, linear algebra, and
optimization corresponding to Ma 230 or an equivalent course.
Students who successfully complete this course will be able to:
- use the notions of subgradients of convex and Lipschitz functions, conjugate and
bi-conjugate functions, and stochastic subgradients, both in theoretical analysis
and in numerical methods;
- use Fenchel and Lagrange duality to analyze problems and to devise numerical
methods based on duality relations;
- apply subgradient calculus and calculate subgradients of conjugate and of dual
functions using the Fenchel-Moreau theorem, the Rockafellar-Moreau theorem, and
related results;
- use first-order necessary and sufficient conditions of optimality for problems with
non-differentiable functions and for composite problems (compositions of smooth,
possibly non-convex, functions with non-smooth convex functions), and verify
various constraint qualification conditions;
- describe the basic steps of the numerical methods listed in the syllabus, their
convergence proofs, and their advantages and disadvantages;
- apply the fundamental decomposition and distribution approaches to create tailored
distributed methods for particular applied problems, and state the applicability
and convergence properties of these methods;
- understand and use stochastic approximation and randomized techniques, identifying
when these techniques are needed and how to apply them.
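As a small worked illustration of conjugacy and the Fenchel-Moreau theorem (an example chosen here for concreteness, not taken from the course notes), consider f(x) = |x| on the real line:

```latex
% The conjugate of f(x) = |x|:
f^*(y) = \sup_{x \in \mathbb{R}} \{ xy - |x| \}
       = \begin{cases} 0, & |y| \le 1, \\ +\infty, & |y| > 1, \end{cases}
% i.e. f^* is the indicator function of the interval [-1, 1]. Hence
f^{**}(x) = \sup_{|y| \le 1} xy = |x| = f(x),
```

recovering f itself, as the Fenchel-Moreau theorem guarantees for every proper, lower semicontinuous convex function.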
Required Text
Andrzej Ruszczynski, Nonlinear Optimization, Princeton University Press, 2006.
Lecture Notes distributed in class.
Supplementary Texts (not required)
- James C. Spall, Introduction to Stochastic Search and Optimization: Estimation,
Simulation, and Control, Wiley, Hoboken, 2003.
- Heinz H. Bauschke, Patrick L. Combettes, Convex Analysis and Monotone Operator
Theory in Hilbert Spaces, Springer, 2011.
Seven homework assignments will be given. In addition, four short in-class examinations
(quizzes) will be conducted, and a final project will be assigned. The final grade will be
based on the following weighted score:
0.35 × Assignments + 0.35 × In-class Quizzes + 0.30 × Final Project
The syllabus is subject to change. All announcements about this class will be posted on Canvas.
All assignments must be submitted by their due date. Late submissions will be accepted only
if the student has obtained written permission. Submissions more than one week late will
not be accepted.
- Review of basic properties of convex sets.
- Non-smooth functions. Subgradient calculus.
- Tangent and normal cones.
- Fenchel and Lagrange duality.
- Gradient descent and Newton's method.
- Conjugate gradient method and quasi-Newton methods.
- Subgradient and non-gradient methods of optimization.
- Cutting plane methods. Proximal point method.
- Bundle methods. Trust region methods.
- Dual decomposition approaches: ADMM, ASM, ADAL.
- Stochastic subgradient methods.
- Stochastic methods with constraints.
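Since the schedule closes with stochastic subgradient methods, here is a minimal illustrative sketch (the synthetic data, step-size rule, and all parameter choices are assumptions made for this example, not course material): a stochastic subgradient method for least-absolute-deviations regression, sampling one data point per step and averaging the iterates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Least-absolute-deviations fit: minimize (1/n) * sum_i |a_i @ x - b_i|.
# Each iteration samples one term and steps along its subgradient
# sign(a_i @ x - b_i) * a_i, with diminishing steps.
n, d = 500, 3
A = rng.normal(size=(n, d))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true                      # noiseless data, so x_true is a minimizer

x = np.zeros(d)
avg = np.zeros(d)                   # averaged iterates behave more reliably
for k in range(20000):
    i = rng.integers(n)             # draw one random data point
    g = np.sign(A[i] @ x - b[i]) * A[i]   # stochastic subgradient of the i-th term
    x -= 0.5 / np.sqrt(k + 1) * g   # diminishing step size
    avg += (x - avg) / (k + 1)      # running average of the iterates

print(np.abs(avg - x_true).max())   # distance of the averaged iterate from x_true
```

Averaging the iterates, rather than reporting the last one, is a standard way to suppress the oscillation inherent in stochastic subgradient steps.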