OpenAI CLIP Embeddings: Walkthrough + Insights
If there is one thing that has been impactful ever since its launch, it has to be CLIP Embeddings.
CLIP stands for Contrastive Language–Image Pre-training.
From Stable Diffusion to DALL-E to robotics tasks involving vision and text, CLIP bridges the gap between images and text using an embedding space common to both.
Granted, CLIP does not do everything well. It inherits the limitations of vector embeddings, so context may not be captured well.
It also inherits the limitations of its image encoder, such as the loss of positional information with Vision Transformers.
That said, it is pretty useful for generic tasks, and my experiments with it have impressed me with its versatility across various situations.
Web-scale training does produce wonders.
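To make the "common embedding space" idea concrete, here is a minimal sketch of how classification with such a space can work: compare one image embedding against several text embeddings by cosine similarity, then apply a softmax. The vectors below are made-up toy values, not real CLIP outputs, and the helper names and the temperature of 100 are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_embedding, text_embeddings, temperature=100.0):
    """Score one image embedding against each candidate caption embedding.

    `temperature` scales the similarities before the softmax; the value
    here is an assumption chosen for illustration.
    """
    labels = list(text_embeddings)
    logits = [
        temperature * cosine_similarity(image_embedding, text_embeddings[label])
        for label in labels
    ]
    return dict(zip(labels, softmax(logits)))

# Toy 3-dimensional embeddings (hypothetical values; real embeddings
# would come from the image and text encoders and be much larger).
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
}

probs = zero_shot_classify(image_embedding, text_embeddings)
best_label = max(probs, key=probs.get)
print(best_label)
```

The caption whose embedding points in the most similar direction to the image embedding gets the highest probability, which is the basic idea behind using CLIP for classification without task-specific training.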
~~~~
CLIP Paper: https://arxiv.org/abs/2103.00020
CLIP Code: https://github.com/openai/CLIP
Code for my experiments: https://github.com/tanchongmin/TensorFlow-Implementations/tree/main/Paper_Reviews/CLIP/CLIP%20Code
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/CLIP/CLIP_Embeddings.pdf
~~~~
0:00 Introduction
1:19 CLIP Experiments
28:37 Key Takeaways
34:57 Prediction in latent space is faster learning than in input space
39:53 Dataset
46:32 Final Architecture
48:45 CLIP Training
55:04 CLIP Inference
57:42 Code details
1:01:00 Performance over 27 datasets
1:05:19 Using CLIP for Classification
1:06:50 Prompt Engineering and Ensembling to Improve Classification performance
1:11:03 CLIP is good for datasets with limited samples
1:12:10 CLIP is bad for specialised tasks
1:16:19 Broad Training vs Specific Training
1:24:50 CLIP and Multiple Abstraction Spaces
1:30:51 Discussion
~~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin