2020
27 April
Mathematical physics seminar
at 15:30
at - Room To Be Determined -
We study the performance of stochastic gradient descent in high-dimensional inference tasks. Our focus is on the initial "search" phase, where the algorithm is far from a trust region and the loss landscape is highly non-convex. We develop a classification of the difficulty of this problem, namely whether the problem requires linearly, quasilinearly, or polynomially many samples in the dimension to achieve weak recovery of the parameter. This classification depends on an intrinsic property of the population loss which we call the "information exponent". We illustrate our approach by applying it to a wide variety of estimation tasks, such as parameter estimation for generalized linear models, two-component Gaussian mixture models, phase retrieval, and spiked matrix and tensor models, as well as supervised learning for single-layer networks with general activation functions. In the latter case, our results translate the difficulty of this task for teacher-student networks into the Hermite decomposition of the activation function.
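As a rough illustration of the last point, the sketch below (our own, not code from the talk) estimates the index of the first nonzero coefficient in the expansion of an activation function in probabilists' Hermite polynomials He_k, which is how the information exponent is read off in the single-layer setting; the helper name information_exponent and the numerical tolerances are illustrative assumptions.

import numpy as np
from numpy.polynomial import hermite_e as He

def information_exponent(f, k_max=8, tol=1e-8, quad_points=120):
    """Smallest k >= 1 with c_k = E[f(Z) He_k(Z)] != 0, Z ~ N(0, 1).
    (Illustrative helper; name and defaults are not from the talk.)"""
    x, w = He.hermegauss(quad_points)  # Gauss-Hermite nodes/weights for e^{-x^2/2}
    w /= np.sqrt(2.0 * np.pi)          # renormalize to the standard Gaussian measure
    fx = f(x)
    for k in range(1, k_max + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0                 # coefficient vector selecting He_k
        c_k = np.sum(w * fx * He.hermeval(x, basis))
        if abs(c_k) > tol:
            return k
    return None                        # all coefficients up to k_max vanish

# Per the abstract's classification: exponent 1 -> linearly many samples,
# 2 -> quasilinear, >= 3 -> polynomially many in the dimension.
print(information_exponent(np.tanh))                   # odd activation, exponent 1
print(information_exponent(lambda x: x ** 2))          # even activation, exponent 2
print(information_exponent(lambda x: x ** 3 - 3 * x))  # He_3 itself, exponent 3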