LESS VRAM, 8K+ Tokens & HUGE SPEED INCREASE | ExLlama for Oobabooga

Subscribers: 284,000
Video Link: https://www.youtube.com/watch?v=WiuXZBtcdpE
Duration: 10:25
Views: 8,532
230


Oobabooga WebUI had a HUGE update adding the ExLlama and ExLlama_HF model loaders, which use LESS VRAM, bring HUGE speed increases, and even give you 8K tokens to play around with, compared to the previous limit of 2K! This is insanely powerful and will be a huge timesaver for creators, and may even help users with less powerful graphics cards run LLMs!

OpenAI Tokenizer: https://platform.openai.com/tokenizer

Timestamps:
0:00 - What's new (It's CRAZY!)
0:44 - Open Oobabooga install directory
1:02 - Update Oobabooga WebUI
1:18 - VRAM usage & speed before update (4.3 tokens/s)
1:56 - Fix missing option or update errors
2:33 - Choosing new ExLlama model loader
2:52 - Downloading new model types (8k models)
4:25 - New VRAM & Speed (20 tokens/s! INSANE!)
5:25 - Raise token limit from 2,000 to 8,000+!
7:17 - How many tokens is your text?
7:50 - How long is 8k tokens?
8:45 - EVEN LESS VRAM with ExLlama_HF
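
To put the numbers from the timestamps in perspective, here is a rough Python sketch. It is not from the video: the function names are made up for illustration, and the 4-characters-per-token figure is only the common rule of thumb for English text with OpenAI-style tokenizers. LLaMA-based models use a different tokenizer, so treat the token estimate as a ballpark and use a real tokenizer (like the OpenAI Tokenizer page linked above) when you need an exact count.

```python
import math


def speedup(before_tps: float, after_tps: float) -> float:
    """How many times faster generation is after the update."""
    return after_tps / before_tps


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate a given number of tokens at a steady rate."""
    return tokens / tokens_per_second


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Ballpark token count (ASSUMPTION: about 4 characters per token).

    This is a rule of thumb for English text, not a real tokenizer.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, token_limit: int = 8000) -> bool:
    """Check whether the estimated token count fits within the limit."""
    return estimate_tokens(text) <= token_limit


if __name__ == "__main__":
    # Speeds quoted in the timestamps: 4.3 tokens/s before, 20 tokens/s after.
    print(f"Speedup: {speedup(4.3, 20):.2f}x")  # Speedup: 4.65x
    print(f"8,000 tokens at 20 tokens/s: {seconds_to_generate(8000, 20):.0f} s")  # 400 s
    print(f"8,000 tokens at 4.3 tokens/s: {seconds_to_generate(8000, 4.3):.0f} s")  # 1860 s

    prompt = "Tell me a story about a llama. " * 100
    print(estimate_tokens(prompt), fits_in_context(prompt, 2000))
```

By the same rule of thumb, 8,000 tokens works out to roughly 32,000 characters of English text, versus about 8,000 characters at the old 2,000-token limit.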

#Oobabooga #AI #LLM
-----------------------------
💸 Found this useful? Help me make more! Support me by becoming a member: https://youtube.com/channel/UCkih2oVTbXPEpVwE-U7kmHw/join
-----------------------------
💸 Support me on Patreon: https://patreon.com/TroubleChute
💸 Direct donations via Ko-Fi: https://ko-fi.com/TCNOco
💬 Discuss the video & Suggest (Discord): https://s.tcno.co/Discord
👉 Game guides & Simple tips: https://YouTube.com/TroubleChuteBasics
🌐 Website: https://tcno.co
📧 Need voiceovers done? Business query? Contact my business email: TroubleChute (at) tcno.co
-----------------------------
🎨 My Themes & Windows Skins: https://hub.tcno.co/faq/my-windows/
👨‍💻 Software I use: https://hub.tcno.co/faq/my-software/
➡️ My Setup: https://hub.tcno.co/faq/my-hardware/
🖥️ My Current Hardware:
Intel i9-13900k - https://amzn.to/42xQuI1
GIGABYTE Z790 AORUS Master - https://amzn.to/3nHuBHx
G.Skill RipJaws 2x(2x32G) [128GB] - https://amzn.to/42cilxN
Corsair H150i 360mm AIO - https://amzn.to/42cznvP
MSI 3080Ti Gaming X Trio - https://amzn.to/3pdnLdb
Corsair 1000W RM1000i - https://amzn.to/42gOTGY
Corsair MP600 PRO XT 2TB - https://amzn.to/3NSvwzx
🎙️ My Current Mic/Recording Gear:
Shure SM7B - https://amzn.to/3nDGYo1
Audient iD14 - https://amzn.to/3pgf2XK
dbx 286s - https://amzn.to/3VNaq7O
Triton Audio FetHead - https://amzn.to/3pdjIgZ

Everything in this video is my personal opinion and experience and should not be considered professional advice. Always do your own research and ensure what you're doing is safe.

Tags:
oobabooga
optimization
update
ExLlama
ExLlama_HF
increase vram
openai chatbot
openai chatgpt
gpt 4
openai
chatgpt
gpt-4
gpt4
chatbot
text generation
text generation webui
ai chatbot
ai chat
language models
large language models
llm
web ui
ooga booga text-generation-webui
text-generation-webui
ai text generation
gpt4all
wizardlm
alpaca
fine tuning
open source llm
tutorial
oobabooga texgen