Multimodal Query for Images: Text/Image Multimodal Query with Negative Filter and Folder Selection
This is a practical demonstration of how to use o4-mini-high with web search and documentation to vibe-code a proof of concept for multimodal retrieval.
You can retrieve with either text or image input prompts, or even a hybrid search.
As Cohere embeddings do not handle negative prompts well, I also created negative text/image input filters to prevent certain images from being retrieved.
You can also use filters to restrict the search to sub-folders of your choice!
Also works if the image contains text or is a sketch! Try it out!
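The exact scoring used in the notebook is in the repo above; as a rough sketch (names and the 0.5 negative weight are my own assumptions, not the author's), a hybrid query with a negative filter can be done by averaging similarity to the positive text/image embeddings and subtracting similarity to the negative ones:

```python
import numpy as np

def cosine_sim(query, db_vecs):
    # Cosine similarity between one query vector and a matrix of stored vectors
    query = query / np.linalg.norm(query)
    db_vecs = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    return db_vecs @ query

def retrieve(db_vecs, pos_queries, neg_queries=None, neg_weight=0.5, top_k=3):
    """Rank stored image embeddings by mean similarity to the positive
    queries (text and/or image), penalised by similarity to negatives.
    pos_queries / neg_queries are lists of 1-D embedding vectors."""
    score = np.mean([cosine_sim(q, db_vecs) for q in pos_queries], axis=0)
    if neg_queries:
        score -= neg_weight * np.mean(
            [cosine_sim(q, db_vecs) for q in neg_queries], axis=0)
    return np.argsort(score)[::-1][:top_k]
```

Because text and images share one embedding space in a multimodal model, the same function covers text-only, image-only, and hybrid queries.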
Code can be found at:
https://github.com/tanchongmin/john-youtube/tree/main/Multimodal_Index
Dataset used:
https://www.kaggle.com/datasets/shreyapmaher/fruits-dataset-images
Download it and put it into a folder named Fruits in the same directory as this Jupyter Notebook.
File structure:
Current Directory
Fruits (folder)
.env
embeddings.db (automatically generated by the sqlite3 code in this notebook)
Multimodal_Index.ipynb (this notebook)
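The embeddings.db cache avoids re-embedding the same image on every run. The schema and key scheme below are my own illustration (the notebook's actual table layout may differ); the idea is just to key each file's embedding and store the vector as raw float32 bytes:

```python
import sqlite3
import hashlib
import numpy as np

def get_embedding(path, conn, embed_fn):
    """Return the cached embedding for an image path, computing and
    storing it (as raw float32 bytes) on a cache miss.
    embed_fn is whatever function calls the embedding API."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings (key TEXT PRIMARY KEY, vec BLOB)")
    key = hashlib.sha256(path.encode()).hexdigest()
    row = conn.execute(
        "SELECT vec FROM embeddings WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return np.frombuffer(row[0], dtype=np.float32)  # cache hit: no API call
    vec = np.asarray(embed_fn(path), dtype=np.float32)
    conn.execute("INSERT INTO embeddings VALUES (?, ?)", (key, vec.tobytes()))
    conn.commit()
    return vec
```

User query inputs should bypass this store (as covered at 1:25:51) so that one-off queries don't pollute the cache.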
Multimodal Embedding Used:
Cohere Embed v4: https://cohere.com/blog/embed-4
~~~
0:00 Introduction
3:24 First Prompt
9:48 Introducing SQLite Database to Cache Embeddings
16:56 Testing out Cohere Embeddings
24:10 Hybrid Image/Text Query
1:00:22 Sub-folder Filtering
1:07:03 Gradio UI
1:25:51 Not storing user input to cache
1:28:14 Negative Filter
1:30:41 Moment of Truth - Final Testing
~~~
AI and ML enthusiast. Likes to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. Also an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin