Automatically Quantize LLMs with AutoRound | Intel Software
If you want to deploy smaller, faster language models without experimenting to find the right quantization settings for your deployment requirements, AutoRound makes it easy. You specify a model, a small amount of calibration data, the target bit width, and whether to prioritize accuracy or speed, and AutoRound automatically tunes the weight rounding and clipping ranges. It supports CPUs, GPUs, and AI accelerators from multiple vendors. Learn how to get started with this coding LLM example.
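A minimal sketch of that workflow in Python, following the AutoRound API shown in the project's README (the model name, output directory, and 4-bit/group-size settings here are illustrative assumptions; swap in your own):

```python
# Illustrative settings -- adjust for your deployment requirements.
MODEL_NAME = "Qwen/Qwen2.5-0.5B-Instruct"  # hypothetical example model
BITS = 4          # target weight precision
GROUP_SIZE = 128  # per-group quantization granularity


def quantize(model_name: str = MODEL_NAME, output_dir: str = "./qmodel"):
    """Quantize a Hugging Face causal LM with AutoRound and save it."""
    # Requires: pip install auto-round
    from auto_round import AutoRound
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # AutoRound tunes the weight rounding and clipping ranges over a
    # small calibration set (sensible defaults are built in).
    ar = AutoRound(model, tokenizer, bits=BITS, group_size=GROUP_SIZE)
    ar.quantize_and_save(output_dir, format="auto_round")


# To run: quantize()  (downloads the model, then writes the quantized
# copy to ./qmodel)
```

The saved model can then be loaded and served like any other Hugging Face checkpoint; see the AutoRound repo below for supported export formats and hardware backends.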
Resources:
Learn more about AutoRound: https://huggingface.co/blog/autoround
AutoRound GitHub repo: https://github.com/intel/auto-round
Intel AI software resources: https://developer.intel.com/ai
About Intel Software:
Intel® Developer Zone is committed to empowering and assisting software developers in creating applications for Intel hardware and software products. The Intel Software YouTube channel is an excellent resource for those seeking to enhance their knowledge. Our channel provides the latest news, helpful tips, and engaging product demos from Intel and our many industry partners. Our videos cover a range of topics; explore them further through the links below.
Connect with Intel Software:
INTEL SOFTWARE WEBSITE: https://intel.ly/2KeP1hD
INTEL SOFTWARE on FACEBOOK: http://bit.ly/2z8MPFF
INTEL SOFTWARE on TWITTER: http://bit.ly/2zahGSn
INTEL SOFTWARE GITHUB: http://bit.ly/2zaih6z
INTEL DEVELOPER ZONE LINKEDIN: http://bit.ly/2z979qs
INTEL DEVELOPER ZONE INSTAGRAM: http://bit.ly/2z9Xsby
INTEL GAME DEV TWITCH: http://bit.ly/2BkNshu
#intelsoftware