Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | AISC
Video Link: https://www.youtube.com/watch?v=aX4Tm1s01wY
For slides and more information on the paper, visit https://ai.science/e/q-bert-hessian-based-ultra-low-precision-quantization-of-bert--t5TyN0LewFq33knWRHWY
Speaker: Amir Gholami; Host: Xi Chen
Motivation:
The motivation of Q-BERT is to enable efficient deployment of BERT at the edge, with lower inference latency and power consumption. Furthermore, enabling high-accuracy inference at the edge would help protect user privacy, since users' data would not need to be transmitted to the cloud for inference.