Seminars of the Department of Mathematics

2020 Seminar

21 May 2020

Alessandro Achille

Structure of Learning Tasks and the Information in the Weights of a Deep Network

in the seminar series: GEOMETRIA E DEEP LEARNING (Geometry and Deep Learning)

Algebra and Geometry Seminar

Abstract: What are the fundamental quantities needed to understand the learning process of a deep neural network? Why are some datasets easier than others? What does it mean for two tasks to have a similar structure? We argue that information-theoretic quantities, and in particular the amount of information that SGD stores in the weights, can be used to characterize the training process of a deep network. In fact, we show that the information in the weights bounds the generalization error and the invariance of the learned representation. It also allows us to connect the learning dynamics with the so-called "structure function" of the dataset, and to define a notion of distance between tasks, which relates to fine-tuning. The non-trivial dynamics of information during training give rise to phenomena, such as critical periods for learning, that closely mimic those observed in humans and may suggest that forgetting information about the training data is a necessary part of the learning process.
