Our fourth stream in the Python + AI series is all about vision models!
Vision models are LLMs that can accept both text and images, like GPT 4o and GPT 4o-mini. You can use those models for image captioning, data extraction, question-answering, classification, and more!
We'll use Python to send images to vision models, build a basic chat app with image upload, and even use vision models inside a RAG application.
📌 Follow-along live, thanks to GitHub Models (https://github.com/marketplace/models) and GitHub Codespaces.
If you'd like to follow along with the live examples, make sure you've got a GitHub account.
📌 You can also join a weekly office hours to ask any questions that don't get answered in the chat, in our AI Discord: https://aka.ms/aipython/oh