What is model compression quantization?

by SMEBOOK (admin) · February 28, 2021

Model compression quantization is one of the techniques used to compress models. It involves bundling weights together, by clustering them or rounding them off, so that the same number of connections can be represented using less memory. Quantization is the idea of representing these weights with fewer bits. The weights can be quantized to 16-bit, 8-bit, 4-bit, or even 1-bit. By reducing the number of bits used per weight, the size of a deep neural network can be significantly reduced.
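As a rough illustration, the sketch below shows symmetric 8-bit quantization of a weight matrix using plain NumPy. It is not tied to any particular framework, and the helper names quantize_weights and dequantize_weights are invented here purely for the example: each float weight is scaled, rounded to an integer in the range [-127, 127], and can later be approximately recovered by multiplying back by the scale.

import numpy as np

def quantize_weights(weights, num_bits=8):
    # Uniform symmetric quantization: map the largest |weight| to the largest integer level.
    qmax = 2 ** (num_bits - 1) - 1              # e.g. 127 for 8-bit
    scale = np.max(np.abs(weights)) / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize_weights(q, scale):
    # Recover approximate float weights from the stored integers.
    return q.astype(np.float32) * scale

# Example: a 32-bit float weight matrix stored as 8-bit integers plus one scale factor.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_weights(w, num_bits=8)
w_approx = dequantize_weights(q, scale)
print("max reconstruction error:", np.max(np.abs(w - w_approx)))

Storing the weights as 8-bit integers instead of 32-bit floats cuts their memory footprint to roughly a quarter, at the cost of the small rounding error printed above.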
