Fine-Tuning Mistral 7B using QLoRA and PEFT on Unstructured Scraped Text Data | Making it Evil?
I go over my experience fine-tuning Mistral 7B on a few large datasets of scraped text, including English-language song lyrics and a huge dataset of KiwiFarms posts.
Training script and video resources are linked below.
Introduction/Topics
https://www.youtube.com/watch?v=9bl1mJImj10&t=1m
Tools for Bulk Text Extraction
https://www.youtube.com/watch?v=9bl1mJImj10&t=2m
Model Choice: Mistral 7B
https://www.youtube.com/watch?v=9bl1mJImj10&t=3m
QLoRA
https://www.youtube.com/watch?v=9bl1mJImj10&t=3m20s
Discussing the linked article and comparing/contrasting it with my own training experience
https://www.youtube.com/watch?v=9bl1mJImj10&t=4m10s
Training script used
https://www.youtube.com/watch?v=9bl1mJImj10&t=6m20s
Merge LoRA script
https://www.youtube.com/watch?v=9bl1mJImj10&t=10m45s
Testing the model with the LM Evaluation Harness
https://www.youtube.com/watch?v=9bl1mJImj10&t=11m30s
Esoterically evaluating the LoRAs with the WebUI / what can be expected from crude raw-text training
https://www.youtube.com/watch?v=9bl1mJImj10&t=13m
Internet "celebrities"
https://www.youtube.com/watch?v=9bl1mJImj10&t=15m
Song parody test
https://www.youtube.com/watch?v=9bl1mJImj10&t=18m10s
Memorization test
https://www.youtube.com/watch?v=9bl1mJImj10&t=19m20s
ALL LINKS AND NOTEBOOK DOWNLOAD ALSO HERE:
http://nanonomad.com/2023/10/27/fine-tuning-mistral-7b/
Jupyter Notebook
https://drive.google.com/file/d/1mnew-Y1DQ0Z7AGxulF04Xur1w7SHhj3q/view?usp=sharing
Finetuning LLMs with LoRA and QLoRA: Insights from Hundreds of Experiments by Sebastian Raschka
https://lightning.ai/pages/community/lora-insights/
Can LLMs learn from a single example?
https://www.fast.ai/posts/2023-09-04-learning-jumps/
LM Evaluation Harness
https://github.com/EleutherAI/lm-evaluation-harness
Convert with Calibre
https://gist.github.com/rohshall/8980b8f73374c767dbe0a82bcf8ae86c
Calibre
https://calibre-ebook.com/
Unstructured IO
https://github.com/Unstructured-IO
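A minimal sketch of using it for bulk text extraction (partition is unstructured's auto-detecting entry point; the "raw_docs" folder and "corpus.txt" output name are placeholders, not from my actual pipeline):

# Sketch: extract plain text from a folder of mixed documents (PDF/EPUB/HTML/etc.)
# Assumes: pip install "unstructured[all-docs]"
from pathlib import Path
from unstructured.partition.auto import partition

with open("corpus.txt", "w", encoding="utf-8") as out:
    for path in Path("raw_docs").iterdir():
        try:
            elements = partition(filename=str(path))  # file type is auto-detected
        except Exception as err:
            print(f"Skipping {path}: {err}")
            continue
        out.write("\n".join(el.text for el in elements if el.text) + "\n\n")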
QLoRA
https://github.com/artidoro/qlora
PEFT
https://github.com/huggingface/peft
Bitsandbytes
https://github.com/TimDettmers/bitsandbytes
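For reference, a minimal sketch of how these pieces fit together for QLoRA training on Mistral 7B. The hyperparameters, target modules, and model ID here are illustrative assumptions, not the exact settings from my training script/notebook:

# Sketch: 4-bit (NF4) base model via bitsandbytes + LoRA adapters via PEFT
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mistral-7B-v0.1"  # placeholder base model
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)
lora = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
# ...then train on the raw-text dataset with transformers.Trainer or TRL's SFTTrainer.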
Original LongLoRA merge script
https://github.com/dvlab-research/LongLoRA/blob/main/merge_lora_weights_and_save_hf_model.py
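Roughly what the merge step boils down to if you use PEFT's merge_and_unload instead of the LongLoRA script (the adapter path, output folder, and model ID are placeholders):

# Sketch: merge a trained LoRA adapter back into the base model and save it
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.bfloat16, device_map="auto"
)
merged = PeftModel.from_pretrained(base, "path/to/lora-checkpoint").merge_and_unload()
merged.save_pretrained("mistral-7b-merged")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained("mistral-7b-merged")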
OpenLLM Leaderboard
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
LM Eval Harness example command:
python main.py --model hf-causal-experimental \
  --model_args pretrained="/home/nano/textgen/models/mistral-books-br-2048-v2-7300",low_cpu_mem_usage=True,load_in_4bit=True,bnb_4bit_use_double_quant=True,bnb_4bit_quant_type="nf4",bnb_4bit_compute_dtype=bfloat16 \
  --tasks arithmetic_2ds,arithmetic_4ds,truthfulqa_mc \
  --batch_size 8 --num_fewshot 0 \
  --output_path "/home/nano/textgen/models/mistral-books-br-2048-v2-7300-arith-truthfulqa_mc.json"
Text Generation WebUI
https://github.com/oobabooga/text-generation-webui