Even more Voice Cloning | Train a Multi-Speaker VITS model using Google Colab and a Custom Dataset

Channel: NanoNomad

Subscribers: 2,860

Published on February 13, 2023 3:35:07 AM ● Video Link: https://www.youtube.com/watch?v=45DiA-aJwXI

Duration: 9:05

4,158 views

I've been looking at multi-speaker VITS TTS models lately, so I thought I'd share the Google Colab notebook. It's similar to the others I've posted, but this one uses precomputed vectors. The configuration is similar to the YourTTS model; however, this seems a little easier to fine-tune.

As always, this stuff is experimental, but this should help you get started if you want to poke around at training a multi-speaker, English language VITS model using the Coqui TTS framework.

Multi-Speaker English language VITS training Colab Notebook:
https://colab.research.google.com/drive/1wAuG-TcZeAUYhff0f6ZiG-so9KT-sBIE?usp=sharing

YourTTS video discussing the same training options that can be used here as well:
https://www.youtube.com/watch?v=1yt2W-uK8mk

Real time noise suppression plugin:
https://github.com/werman/noise-suppression-for-voice

Audacity:
https://www.audacityteam.org/

Coqui's Dataset Guide:
https://github.com/coqui-ai/TTS/wiki/What-makes-a-good-TTS-dataset

rnnoise:
https://github.com/xiph/rnnoise

Download my multilingual, multispeaker YourTTS model on Huggingface: https://huggingface.co/AOLCDROM/YourTTS-Fr-En-De-Es
See allvoices.txt for information about each speaker:language training pair. The model was trained on character sets and uses 'artificial' language codes.

Generate speech from text with the CLI:
tts --text "text" --out_path outfile.wav --model_path path/to/model_file.pth --config_path path/to/config.json --speakers_file_path speakers/index/path/speakers.pth --speaker_idx VCTK_speaker
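
If you'd rather call the command above from a script (for example, to loop over several speakers or sentences), here is a minimal Python sketch. It only assembles the same flags shown in the CLI line; the helper names `build_tts_command` and `synthesize` are hypothetical, and the paths are the same placeholders as above, so swap in your own files. It assumes the `tts` command from Coqui TTS is installed and on your PATH.

```python
import shlex
import subprocess
from typing import List


def build_tts_command(
    text: str,
    out_path: str,
    model_path: str,
    config_path: str,
    speakers_file_path: str,
    speaker_idx: str,
) -> List[str]:
    """Assemble the argument list for the `tts` CLI using the flags from the command above."""
    return [
        "tts",
        "--text", text,
        "--out_path", out_path,
        "--model_path", model_path,
        "--config_path", config_path,
        "--speakers_file_path", speakers_file_path,
        "--speaker_idx", speaker_idx,
    ]


def synthesize(cmd: List[str]) -> None:
    """Run the assembled command; raises CalledProcessError if `tts` exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths and speaker name, copied from the CLI example; replace with your own.
    cmd = build_tts_command(
        text="text",
        out_path="outfile.wav",
        model_path="path/to/model_file.pth",
        config_path="path/to/config.json",
        speakers_file_path="speakers/index/path/speakers.pth",
        speaker_idx="VCTK_speaker",
    )
    # Print the shell-quoted command so it can be checked before anything runs.
    print(shlex.join(cmd))
    # Uncomment once the paths point at real files:
    # synthesize(cmd)
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or quotes.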

Other Videos By NanoNomad

2023-05-04	Make Using Tortoise TTS Faster with Fine-Tuned Models
2023-05-01	AI Voice Swap and Lip Sync using Wav2Lip-HQ-Updated
2023-04-22	Voice Cloning with Tortoise TTS and Model Training Using the AI Voice Cloning WebUI
2023-04-07	Locally Hosted Chatbots with RWKV through ChatRWKV and the Text-Generation-WebUI | 14B Model on 3GB!
2023-03-29	Create Datasets for Voice Model Training on Google Colab | Updated Tools for Coqui TTS Training
2023-03-22	Train a VITS Speech Model using Coqui TTS | Updated Script and Audio Processing Tools
2023-03-15	Training or Fine Tuning a Hindi Language VITS TTS Voice Model with Coqui TTS on Google Colab
2023-03-05	Install and Configure Retroarch for PS Vita with Thumbnails, Overlays and Shaders
2023-03-03	Fallout 1 on the PS Vita is the Best Way to Play
2023-02-24	Train or Fine Tune VITS on (theoretically) Any Language | Train Multi-Speaker Model | Train YourTTS
2023-02-12	Even more Voice Cloning | Train a Multi-Speaker VITS model using Google Colab and a Custom Dataset
2023-02-04	Updated | Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab
2023-01-30	YourTTS Training Discussion | Experiences, Multistage Training, Demos, Prior Training Preservation
2023-01-27	Updated | Fine-Tuning YourTTS with Automated STT Datasets on Google Colab for AI Voice Cloning
2023-01-13	Fine-Tune YourTTS with Near-Automated Datasets on Google Colab for AI Voice Cloning
2022-12-22	Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab or Linux
2022-12-09	Dreambooth and Fine Tuning for Stable Diffusion 1.5 and 2 with this Versatile Script
2022-11-30	If Bill Gates could rap? AI Synthesized Voice, AI Upsampled Video | Deltron 3030's Virus
2022-11-14	Training Stable Diffusion Dreambooth on Multiple Subjects for Combined Image Generation
2022-10-31	Locally Train Stable Diffusion with Dreambooth using WSL Ubuntu
2022-10-25	Animated Stable Diffusion and Synthesized Voice Demo with Facial Movements

Tags: voice cloning, ai voice, tts, speech synthesis, vits, machine learning
