Favorite Theorems: Gradient Descent
Computational Complexity 2024-10-02
Who would have thought the algorithm behind machine learning would have such cool complexity implications?
Let's unpack the classes in the theorem CLS = PPAD ∩ PLS, proved by John Fearnley, Paul Goldberg, Alexandros Hollender and Rahul Savani. All three are subclasses of TFNP, problems where every input is guaranteed to have an easily verifiable solution and the question is the complexity of finding one. PPAD is the class for which finding a Nash Equilibrium is complete; see this post for details.
PLS (polynomial local search) is the class of problems where we look for a local minimum. Finding a global minimum is NP-complete--think minimum vertex cover. Finding a local minimum is often easier, but it can still be hard when you are optimizing over an exponentially large set of solutions.
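To make the local-vs-global distinction concrete, here is a minimal toy sketch (my own illustration, not a PLS-complete problem): local search over n-bit strings by single-bit flips. Each improving step is cheap to find, but the space has 2^n strings and the walk can halt at a local minimum nowhere near the global one.

```python
def local_search(cost, n, start):
    """Flip one bit at a time until no single flip lowers the cost.

    Returns a local minimum: a string none of whose n neighbors is better.
    It need not be the global minimum over all 2**n strings.
    """
    x = list(start)
    while True:
        current = cost(x)
        best_move = None
        for i in range(n):          # try every single-bit flip
            x[i] ^= 1
            if cost(x) < current:
                current, best_move = cost(x), i
            x[i] ^= 1               # undo the flip
        if best_move is None:       # no improving neighbor: local minimum
            return x
        x[best_move] ^= 1           # commit the best improving flip

# Illustrative cost: all-ones scores 0 (the global minimum), all-zeros
# scores 1, and every other string scores more, so all-zeros is a trap.
cost = lambda x: 0 if all(x) else (1 if not any(x) else 2 + sum(x))
print(local_search(cost, 4, [0, 0, 0, 0]))   # stuck at [0, 0, 0, 0]
```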
CLS (continuous local search) is a bit trickier to define: roughly, you are looking for an approximate local minimum of a continuous function mapping three real variables to one real value.
The authors show that the problem gradient descent solves--finding an approximate stationary point--is complete for PPAD ∩ PLS, even for functions of just two input variables. Since that problem lies in CLS, and CLS was already known to sit inside PPAD ∩ PLS, the equality CLS = PPAD ∩ PLS follows.
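For intuition about the kind of solution the theorem concerns, here is a minimal sketch of plain gradient descent on a function of two variables, stopping once the gradient is tiny, i.e. at an approximate stationary point. The objective, step size and tolerance are illustrative choices of mine, not taken from the paper.

```python
import numpy as np

def gradient_descent(grad, x0, step=0.01, eps=1e-6, max_iters=100_000):
    """Run plain gradient descent from x0 until the gradient is small.

    Returns an approximate stationary point: a point where the gradient
    norm has dropped below eps (or the last iterate if max_iters is hit).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < eps:
            break                    # approximate stationary point found
        x = x - step * g             # move against the gradient
    return x

# Illustrative objective with two input variables (not from the paper):
# f(x, y) = (x^2 - 1)^2 + y^2, with minima at (1, 0) and (-1, 0).
grad_f = lambda v: np.array([4 * v[0] * (v[0]**2 - 1), 2 * v[1]])

print(gradient_descent(grad_f, x0=[0.3, 0.5]))   # lands near (1, 0)
```

Each step here is trivial; the theorem says that for carefully constructed, succinctly described functions, the search problem this loop is trying to solve is as hard as anything in PPAD ∩ PLS.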
More in my 2021 post. On that post, co-author Paul Goldberg commented:
The paper is a fine example of the humorous stereotype of complexity theorists proving a problem "hard" when meanwhile the problem is being routinely solved in practice in the real world.
Nevertheless it's a neat complexity result and now officially one of my favorite theorems.