AI Neuroscience: can we understand the neural networks we train?
Deep neural networks have recently made a bit of a splash, enabling machines to learn to solve problems that are easy for humans but had previously been difficult for computers, like playing Atari games or identifying lions and jaguars in photos. But how do these neural nets actually work? What concepts do they learn en route to their goals? We built and trained the networks, so on the surface these questions might seem trivial to answer. However, network training dynamics, internal representations, and mechanisms of computation turn out to be surprisingly tricky to study and understand, because networks have so many connections - often millions or more - that the resulting computation is fundamentally complex.
This complexity is what enables the models to master their tasks, but it also means that we now need something like neuroscience just to understand the AI that we've constructed! As we continue to train more complex networks on larger and larger datasets, the gap between what we can build and what we can understand will only grow wider. This gap both inhibits progress toward more competent AI and bodes ill for a society that will increasingly be run by learned algorithms that are poorly understood. In this talk, we'll look at a collection of research aimed at shrinking this gap, with approaches including interactive model exploration, optimization, and visualization.