Rayan Saab (UCSD)

Time: 1:45 pm on Friday, October 7th, 2022

Quantizing neural networks

Neural networks are highly non-linear functions, often parametrized by a staggering number of weights. Miniaturizing these networks and implementing them in hardware is a direction of research that is fueled by a practical need and, at the same time, connects to interesting mathematical problems. For example, quantizing a neural network, that is, replacing its weights with counterparts drawn from a small discrete alphabet (e.g., binary), can yield massive savings in cost, computation time, memory, and power consumption. Of course, one wishes to attain these savings while preserving the action of the function on domains of interest.
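To make the idea concrete, here is a minimal sketch of the simplest possible scheme: each weight of a single ReLU neuron is replaced by a binary counterpart, and the change in the neuron's output is measured on sample inputs. This is only an illustration, not the method of the talk. The function names, the choice of the scale `alpha` as the mean absolute weight, and the Gaussian weights and inputs are all assumptions made for the example.

```python
import random

def quantize_binary(weights):
    """Replace each weight by +alpha or -alpha.

    alpha (the mean absolute weight) is an assumed, commonly used scale.
    """
    alpha = sum(abs(w) for w in weights) / len(weights)
    return [alpha if w >= 0 else -alpha for w in weights]

def neuron(weights, x):
    """A single ReLU neuron: max(0, <weights, x>)."""
    return max(0.0, sum(w * xi for w, xi in zip(weights, x)))

def relative_error(weights, q_weights, data):
    """Relative l2 error between original and quantized outputs on data."""
    num = sum((neuron(weights, x) - neuron(q_weights, x)) ** 2 for x in data)
    den = sum(neuron(weights, x) ** 2 for x in data)
    return (num / den) ** 0.5 if den > 0 else 0.0

# Hypothetical "trained" weights and inputs, drawn at random for illustration.
rng = random.Random(0)
n = 256
w = [rng.gauss(0, 1) for _ in range(n)]
data = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(200)]

q = quantize_binary(w)
err = relative_error(w, q, data)
print(f"relative output error after binary quantization: {err:.3f}")
```

Each quantized weight now needs one bit plus a single shared scale, and the relative error measures how well the action of the function is preserved on the sampled inputs.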

We present data-driven and computationally efficient methods for quantizing the weights of already trained neural networks, and we prove that our methods have favorable error guarantees under a variety of assumptions. We also discuss extensions and provide the results of numerical experiments on large multi-layer networks to illustrate the performance of our methods. Time permitting, we will also discuss open problems and related areas of research.
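As an illustration of what "data-driven" can mean here, the following sketch quantizes the weights of one neuron sequentially, choosing each quantized weight so that the accumulated output error on sample data stays small, and compares the result with data-independent rounding. The abstract does not specify the algorithms presented in the talk, so this greedy scheme, the ternary alphabet, the function names, and the random weights and data are assumptions for the example only. The error is measured on the linear part of the neuron (before any activation).

```python
import random

ALPHABET = (-1.0, 0.0, 1.0)  # assumed ternary alphabet

def nearest(value):
    """Closest alphabet element to value."""
    return min(ALPHABET, key=lambda p: abs(value - p))

def quantize_naive(w):
    """Data-independent baseline: round each weight on its own."""
    return [nearest(wt) for wt in w]

def quantize_greedy(w, data):
    """Pick q[t] to minimize the running error ||u + (w[t] - p) * X_t||,
    where X_t holds feature t of every sample and u is the error so far."""
    u = [0.0] * len(data)
    q = []
    for t, wt in enumerate(w):
        col = [x[t] for x in data]
        norm_sq = sum(c * c for c in col)
        if norm_sq == 0.0:
            p = nearest(wt)
        else:
            # The quadratic in p is minimized at wt + <u, col> / ||col||^2,
            # so the best alphabet element is the one nearest to that value.
            p = nearest(wt + sum(ui * c for ui, c in zip(u, col)) / norm_sq)
        q.append(p)
        u = [ui + (wt - p) * c for ui, c in zip(u, col)]
    return q

def output_error(w, q, data):
    """Relative l2 error of the pre-activations <q, x> versus <w, x>."""
    num = sum(sum((a - b) * xi for a, b, xi in zip(w, q, x)) ** 2 for x in data)
    den = sum(sum(a * xi for a, xi in zip(w, x)) ** 2 for x in data)
    return (num / den) ** 0.5

# Hypothetical weights and data: 400 weights, 10 samples.
rng = random.Random(0)
n, m = 400, 10
w = [rng.uniform(-1, 1) for _ in range(n)]
data = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]

naive_err = output_error(w, quantize_naive(w), data)
q_greedy = quantize_greedy(w, data)
greedy_err = output_error(w, q_greedy, data)
print(f"naive rounding: {naive_err:.3f}, greedy data-driven: {greedy_err:.3f}")
```

The greedy pass costs one inner product and one vector update per weight, and in this setting, with many more weights than samples, it lets later weights compensate for the rounding errors of earlier ones.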