Train or Fine-Tune VITS on (Theoretically) Any Language | Train a Multi-Speaker Model | Train YourTTS
VITS Multispeaker English Training and Fine Tuning Notebook:
https://colab.research.google.com/drive/178Nv5lmMdI1pMHmE0X0EsAxZEDZoklqQ?usp=sharing
VITS Alternate Language Training and Fine Tuning Notebook:
https://colab.research.google.com/drive/1zQXTel8AyqNvnnBLMItbzs-kUv51Dwat?usp=sharing
YourTTS Training and Fine Tuning Notebook:
https://colab.research.google.com/drive/1MqiLjNaVNIEmD31A0s48U0GgyEHRk7Vo?usp=sharing
I've updated the YourTTS and multi-speaker English-language VITS notebooks. The new notebook is for training a VITS model in languages other than English.
In this video I take a look at alternate-language training of a VITS model using Coqui TTS on Google Colab. I trained a Spanish-speaking model on sample data mostly blind: I don't speak Spanish, so I can't properly evaluate the results, but it started sounding pretty good for what it was.
Then I review some of the changes and differences in the multi-speaker VITS and YourTTS notebooks.
Other videos:
Multispeaker VITS https://www.youtube.com/watch?v=45DiA-aJwXI
YourTTS training https://www.youtube.com/watch?v=1yt2W-uK8mk
Check out Unscripted Coding if you want to watch someone explore cool open source projects: https://www.youtube.com/@UnscriptedCoding
Download my multilingual, multispeaker YourTTS model on Huggingface: https://huggingface.co/AOLCDROM/YourTTS-Fr-En-De-Es
See allvoices.txt for information about each speaker:language training pair. The model was trained on character sets and uses 'artificial' language codes.
RTFM:
https://tts.readthedocs.io/en/latest/
https://github.com/openai/whisper
https://tts.readthedocs.io/en/latest/models/vits.html
https://arxiv.org/pdf/2106.06103.pdf