Speed Up Inference with Mixed Precision | AI Model Optimization with Intel® Neural Compressor

Subscribers: 257,000
Published on: 2023-07-26
Video Link: https://www.youtube.com/watch?v=H7Gg-EmGpAI
Duration: 4:08
Views: 8,208


Learn one of the simplest model optimization techniques for speeding up AI inference. Mixed precision, often used to speed up training, can also speed up inference without having to sacrifice accuracy.

Mixed precision is a popular technique for speeding up the training of large AI models, and it can also be a simple way to reduce model size and inference latency. The approach mixes lower-precision floating-point formats, such as FP16 and BFloat16, with the original 32-bit floating-point parameters. Choosing how to mix formats requires assessing the effect on accuracy, knowing which formats a given device supports, and knowing which layers the model uses.
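For illustration, here is a minimal sketch of the idea in PyTorch* (the tiny model and input are stand-ins, not from the video): autocast runs supported operations in BFloat16 while the stored parameters stay in FP32.

import torch

# Stand-in FP32 model and input; substitute your own.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).eval()
x = torch.randn(8, 64)

# autocast executes supported ops (e.g., Linear) in bfloat16 while the
# stored parameters remain FP32: the "mixing" described above.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)  # torch.bfloat16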

Intel® Neural Compressor automatically mixes in the lower-precision formats supported by the hardware and the model’s layers. This video shows how to get started, whether you’re using PyTorch*, TensorFlow*, or ONNX* Runtime, and how to automatically assess the accuracy effects of lower precision.
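For reference, the conversion itself takes only a few lines with Intel® Neural Compressor. A minimal sketch, assuming the 2.x Python API, a PyTorch model, and the default BF16 target (the tiny model here is a placeholder):

import torch
from neural_compressor import mix_precision
from neural_compressor.config import MixedPrecisionConfig

# Placeholder FP32 model; substitute your own PyTorch, TensorFlow,
# or ONNX Runtime model.
fp32_model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).eval()

# The default config targets BF16 and leaves unsupported layers in FP32.
# An evaluation function can also be passed to fit() so the tool tunes
# the mix against an accuracy criterion (see the documentation below).
conf = MixedPrecisionConfig()
converted_model = mix_precision.fit(fp32_model, conf=conf)
converted_model.save("./mixed_precision_model")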

Intel® Neural Compressor: bit.ly/3Nl6pVj

Intel® Neural Compressor GitHub: bit.ly/3NlBgkH

About Intel Software:
Intel® Developer Zone is committed to empowering and assisting software developers in creating applications for Intel hardware and software products. The Intel Software YouTube channel is an excellent resource for those seeking to enhance their knowledge. Our channel provides the latest news, helpful tips, and engaging product demos from Intel and our numerous industry partners. Our videos cover various topics; you can explore them further by following the links.

Connect with Intel Software:
INTEL SOFTWARE WEBSITE: https://intel.ly/2KeP1hD
INTEL SOFTWARE on FACEBOOK: http://bit.ly/2z8MPFF
INTEL SOFTWARE on TWITTER: http://bit.ly/2zahGSn
INTEL SOFTWARE GITHUB: http://bit.ly/2zaih6z
INTEL DEVELOPER ZONE LINKEDIN: http://bit.ly/2z979qs
INTEL DEVELOPER ZONE INSTAGRAM: http://bit.ly/2z9Xsby
INTEL GAME DEV TWITCH: http://bit.ly/2BkNshu

#intelsoftware #ai





Other Videos By Intel Software


2023-08-14  SYCL ND-Range | Intel Software
2023-08-11  Increasing Trust in Confidential Computing with Project Amber | InTechnology Podcast
2023-08-11  Social-Technical Systems with Maria Bezaitis | InTechnology Podcast | Intel Software
2023-08-10  OpenVINO Demos Overview | Intel Software
2023-08-09  Overview of Intel® Optimizations for PyTorch* | Intel Software
2023-07-28  Create Custom Layers | Intel® Graphics Performance Analyzers Framework Quick Tips | Intel Software
2023-07-27  July 2023 | oneAPI Dev News | Intel Software
2023-07-26  July 2023 | Intel Software
2023-07-26  Introduction to Intel's AI Solutions Stack | Intel Software
2023-07-26  Speed Up Inference with Mixed Precision | AI Model Optimization with Intel® Neural Compressor
2023-07-25  July 2023 | IDZ News | Intel Software
2023-07-24  Style-Transfer (Gen AI) with OpenVINO | Intel Software
2023-07-24  Create Custom Layers | Intel® Graphics Performance Analyzers Framework Quick Tips | Intel Software
2023-07-18  Unlock Generative AI with Software Powered by oneAPI | Intel Software
2023-07-18  Visual Inspection AI Reference Kit | Introduction | Intel Software
2023-07-17  Visual Inspection AI Reference Kit | The Full Flow | Intel Software
2023-07-17  Visual Inspection AI Reference Kit | Introduction | Intel Software
2023-07-13  Hugging Face + OpenVINO | Intel Software
2023-07-12  Start Post-Training Static Quantization | AI Model Optimization with Intel® Neural Compressor



Tags:
Intel Developer Zone
IDZ
Intel Software
Software Developer
Developer Tools
Software Tools
Developer
Intel
AI model optimization
deep learning
model compression
model optimization
mixed precision
bfloat16
float16
half precision
Intel Neural Compressor