Locally Hosted Chatbots with RWKV through ChatRWKV and the Text-Generation-WebUI | 14B Model on 3GB!
A quick look at installing RWKV, ChatRWKV, the Text-Generation-WebUI, and the Long Term Memory extension for chatbots. RWKV is a strong LLM built on an RNN architecture rather than a GPT-style transformer, and many of the released models produce very impressive output. The WebUI has developed fast and now supports features like TavernAI character cards; combined with the Long Term Memory extension, it should make for some interesting chatbots.
Configure swap space for low-RAM (less than 32 GB) users: https://youtu.be/UeAD1qWNb1U?t=170
Install ChatRWKV:
https://youtu.be/UeAD1qWNb1U?t=229
Models:
https://youtu.be/UeAD1qWNb1U?t=408
Model loading strategies:
https://youtu.be/UeAD1qWNb1U?t=453
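A loading strategy is a string telling the rwkv package how to spread layers across devices and precisions, e.g. "cpu fp32", "cuda fp16", or a split like "cuda fp16i8 *12 -> cpu fp32". As a rough sketch of the syntax only (split_strategy below is a hypothetical illustration; the real parsing lives inside the rwkv package):

```python
# Hypothetical helper illustrating the shape of an RWKV strategy string.
def split_strategy(strategy):
    """Split 'device dtype [*N] -> device dtype ...' into stages."""
    stages = []
    for part in strategy.split("->"):
        tokens = part.split()
        device, dtype = tokens[0], tokens[1]
        layers = None  # None = "all remaining layers"
        for tok in tokens[2:]:
            if tok.startswith("*"):
                layers = int(tok[1:].rstrip("+"))  # trailing '+' marks stream mode
        stages.append((device, dtype, layers))
    return stages

# First 12 layers int8-quantized on the GPU, the rest in fp32 on the CPU:
print(split_strategy("cuda fp16i8 *12 -> cpu fp32"))
# [('cuda', 'fp16i8', 12), ('cpu', 'fp32', None)]
```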
Editing the ChatRWKV script chat.py:
https://youtu.be/UeAD1qWNb1U?t=481
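The edits shown in the video amount to pointing chat.py at your downloaded model and picking a strategy. Roughly (variable names as in ChatRWKV's chat.py at the time of the video — check your copy; the path is a placeholder):

```python
# Near the top of ChatRWKV's chat.py:
args.strategy = 'cuda fp16i8 -> cpu fp32 *12'  # layer placement / precision
# Model path, given WITHOUT the .pth extension (placeholder — use your own):
args.MODEL_NAME = '/path/to/models/RWKV-4-Pile-14B-model'
```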
Installing Text-Generation-WebUI:
https://youtu.be/UeAD1qWNb1U?t=596
Installing Long Term Memory extension:
https://youtu.be/UeAD1qWNb1U?t=630
Models:
https://huggingface.co/BlinkDL/
RWKV pip package: https://pypi.org/project/rwkv/ (always check for the latest version and upgrade). The strategy syntax is documented there.
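A minimal usage sketch of the pip package, following the API shown on the rwkv PyPI page — the model path and tokenizer path are placeholders you must download yourself (the 20B_tokenizer.json ships with the ChatRWKV repo):

```python
from rwkv.model import RWKV
from rwkv.utils import PIPELINE

# Placeholders: model path is given without the .pth extension.
model = RWKV(model='/path/to/RWKV-4-Pile-14B-model',
             strategy='cuda fp16i8 -> cpu fp32 *12')
pipeline = PIPELINE(model, '/path/to/20B_tokenizer.json')

# Generate a short continuation from a prompt.
print(pipeline.generate('The quick brown fox', token_count=64))
```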
ChatRWKV:
conda create -n rwkv
conda activate rwkv
conda install -c conda-forge cudatoolkit=11.7 cudnn
git clone https://github.com/BlinkDL/ChatRWKV.git
cd ChatRWKV
pip install rwkv
pip install ninja
pip install -r requirements.txt
pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
Text-Generation-WebUI:
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
Long Term Memory extension:
cd text-generation-webui
git clone https://github.com/wawawario2/long_term_memory extensions/long_term_memory
pip install -r extensions/long_term_memory/requirements.txt
python -m pytest -v extensions/long_term_memory/
Example of launching the Text-Generation-WebUI with an RWKV strategy and the Long Term Memory extension:
python server.py --rwkv-strategy "cuda fp16i8 -> cpu fp32 *12" --rwkv-cuda-on --extensions long_term_memory