## Title:
A theoretical framework for deep learning
## Abstract:
In recent years, machine learning researchers have achieved impressive results. Although theory has lagged behind, some of the main questions about deep learning are now being answered. I will review the state of three main puzzles, which involve three separate branches of mathematics: approximation theory, optimization, and learning theory:
- Approximation Theory: When and why are deep networks, with many layers of neurons, better than shallow networks which correspond to older machine learning techniques? When can they avoid the curse of dimensionality?
- Optimization: Why is it easy to train a deep network and often achieve global minima of the empirical loss?
- Learning Theory: Can we characterize the convergence of the weights during gradient descent? How can deep learning avoid overfitting and predict well on new data despite overparametrization? Do deep networks generalize according to classical theory?
I will also discuss the future of AI. To create artifacts that are as intelligent as we are, we need several additional breakthroughs. A good bet is that several of them will come from interdisciplinary research between the natural science and the engineering of intelligence. This vision is in fact at the core of CBMM and of the new MIT Quest for Intelligence, whose organization and research strategy I will outline.