Voice Cloning with Tortoise TTS and Model Training Using the AI Voice Cloning WebUI

Channel: NanoNomad
Subscribers: 2,970

Published on April 22, 2023 8:31:24 PM ● Video Link: https://www.youtube.com/watch?v=snz-VzgGgmA

Duration: 22:27

4,894 views

A look at Tortoise TTS, the AI Voice Cloning WebUI, and the Tortoise TTS-Fast fork. Most of the audio was generated using the Tortoise TTS 'fast' preset, with a selection of random, cloned, and trained voices throughout. Clips were cherry-picked only when the output was nonsensical or misspoken badly enough that the instructions would have been incorrect.

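WSL 2 Reinstall:

# In Windows PowerShell or Command Prompt: list the installed distros,
# then unregister and reinstall one (unregistering deletes that distro's files).
wsl -l
wsl --unregister [distro]
wsl --install -d [distro]
# Inside the freshly installed distro:
sudo apt update
sudo apt upgrade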
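
Before going further, it can help to confirm that the shell is really running inside WSL. The following is a minimal sketch, assuming the WSL kernel identifies itself with the string "microsoft" in /proc/version (treat that as an assumption, not a guarantee). The function name is_wsl is hypothetical.

```shell
#!/bin/sh
# is_wsl: hypothetical helper. Succeeds if the given kernel version file
# (default /proc/version) mentions "microsoft", as WSL kernels typically do.
is_wsl() {
    version_file="${1:-/proc/version}"
    grep -qi "microsoft" "$version_file" 2>/dev/null
}

if is_wsl; then
    echo "Running inside WSL"
else
    echo "Not running inside WSL (or /proc/version is unreadable)"
fi
```

This only distinguishes WSL from other environments. To see whether a distro uses WSL 1 or WSL 2, run wsl -l -v on the Windows side and check the VERSION column.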

Conda: https://docs.conda.io/en/latest/miniconda.html#linux-installers

Conda Install:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh

Install CUDA:
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
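
After the install, nvcc may not be found because the toolkit's bin directory is not on PATH. The sketch below adds it for the current shell, assuming the packages placed the toolkit under /usr/local/cuda-11.8 (verify the location on your system). The helper name add_to_path is hypothetical.

```shell
#!/bin/sh
# add_to_path: hypothetical helper. Prepends a directory to PATH only if it
# exists and is not already listed.
add_to_path() {
    dir="$1"
    [ -d "$dir" ] || return 1
    case ":$PATH:" in
        *":$dir:"*) ;;                          # already present, nothing to do
        *) PATH="$dir:$PATH"; export PATH ;;
    esac
}

# Assumed install location for the CUDA 11.8 packages installed above.
if add_to_path /usr/local/cuda-11.8/bin; then
    nvcc --version
else
    echo "CUDA bin directory not found at the assumed location"
fi
```

To make the change permanent, the same export could go in ~/.bashrc, which is a common choice but not the only one.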

Install the CUDA toolkit (do this in the conda environment you are using, or install it in base to cache the packages):
conda activate [name]
conda install -c conda-forge cudatoolkit=11.8 cudnn

Torch install:
pip install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
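
A quick way to check whether the installed torch build can see the GPU is to ask torch.cuda.is_available() from the shell. This is a rough sketch, run with the intended conda environment active. The helper name torch_cuda_status and its four status strings are made up for illustration.

```shell
#!/bin/sh
# torch_cuda_status: hypothetical helper. Prints exactly one of:
#   python-missing | torch-missing | cuda-available | cuda-unavailable
torch_cuda_status() {
    command -v python3 >/dev/null 2>&1 || { echo "python-missing"; return 0; }
    python3 -c "import torch" >/dev/null 2>&1 || { echo "torch-missing"; return 0; }
    if python3 -c "import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)" >/dev/null 2>&1; then
        echo "cuda-available"
    else
        echo "cuda-unavailable"
    fi
}

torch_cuda_status
```

A result of cuda-unavailable with torch installed usually points at the driver or library setup rather than at torch itself, which is what the library fix in the next section is aimed at.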

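Fix missing libraries:
# On Windows, in PowerShell (likely needs to be run as administrator, since the folder is under System32).
# mv is a PowerShell alias for Move-Item. The existing files are backed up here
# so they can be re-created as symlinks from inside WSL afterwards.
cd C:\Windows\System32\lxss\lib
mv libcuda.so libcuda.so.bak
mv libcuda.so.1 libcuda.so.1.bak
mv libnvoptix.so.1 libnvoptix.so.1.bak
# libnvoptix.so.1 is re-created afterwards as a link to libnvoptix_loader.so.1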

wsl -e /bin/bash
# in WSL
ln -s libcuda.so.1.1 libcuda.so.1
ln -s libcuda.so.1.1 libcuda.so
ln -s libnvoptix_loader.so.1 libnvoptix.so.1
exit

wsl --shutdown
wsl -e /bin/bash
sudo ldconfig
exit
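
Once the links have been re-created, they can be checked from inside WSL. This is a minimal sketch, assuming the Windows lxss\lib folder is exposed at /usr/lib/wsl/lib (the usual location, but an assumption here). The helper name check_links is hypothetical.

```shell
#!/bin/sh
# check_links: hypothetical helper. For each name, reports whether DIR/name is
# a symlink whose target exists. Returns non-zero if any name fails the check.
check_links() {
    dir="$1"; shift
    rc=0
    for name in "$@"; do
        if [ -L "$dir/$name" ] && [ -e "$dir/$name" ]; then
            echo "ok: $name -> $(readlink "$dir/$name")"
        else
            echo "broken or not a symlink: $name"
            rc=1
        fi
    done
    return "$rc"
}

# Assumed mount point of C:\Windows\System32\lxss\lib inside WSL.
check_links /usr/lib/wsl/lib libcuda.so libcuda.so.1 libnvoptix.so.1 \
    || echo "Some links need attention"
```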

Create the Conda environment for Tortoise:
conda create -n tortoise python=3.9 git pip
conda activate tortoise

Install Tortoise:
git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
python -m pip install -r ./requirements.txt
python setup.py install

Replace the Tortoise requirements.txt with this (do it before running the pip install -r step above):
tqdm
rotary_embedding_torch
transformers==4.19
tokenizers
inflect
progressbar
einops==0.4.1
unidecode
#scipy==0.10.1
scipy==1.10.1
librosa==0.9.1
#numba==0.48.0
ffmpeg
#numpy==1.20.0
numba==0.56.4
numpy==1.23.5
torchaudio
threadpoolctl
llvmlite
appdirs
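
Because the list mixes commented-out old pins with their replacements, it can be useful to print only the pins that are actually in effect. A small sketch, with the helper name active_pins being hypothetical:

```shell
#!/bin/sh
# active_pins: hypothetical helper. Prints the version-pinned requirements
# (lines containing "==") that are not commented out.
active_pins() {
    grep -v '^[[:space:]]*#' "$1" | grep '=='
}

# Run from the tortoise-tts checkout after replacing requirements.txt.
if [ -f ./requirements.txt ]; then
    active_pins ./requirements.txt
fi
```

For the list above this should print the six active pins: transformers, einops, scipy, librosa, numba, and numpy.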

AI Voice Cloning:
sudo apt install espeak-ng
conda create -n tort python=3.9 git pip
conda activate tort
conda install -c conda-forge cudatoolkit=11.8 cudnn
git clone https://git.ecker.tech/mrq/ai-voice-cloning.git
cd ai-voice-cloning
chmod +x *.sh
./setup-cuda.sh
source ./venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install phonemizer
deactivate
./start.sh
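
The steps above rely on a few command-line tools already being installed. As a pre-flight check before running ./setup-cuda.sh, something like the following sketch can list what is missing. The helper name missing_tools is hypothetical, and the list of tools is only a guess at what matters most.

```shell
#!/bin/sh
# missing_tools: hypothetical helper. Prints each named command that is not
# found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools git conda espeak-ng python3)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All prerequisites found"
fi
```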

Tortoise TTS-Fast:
https://github.com/152334H/tortoise-tts-fast

Download one of my fine-tuned Tortoise TTS models:
Base model: https://huggingface.co/AOLCDROM/Tortoise-TTS-MSFT-VCTK-4V-En
Requires the custom tokenizer file pt-t.json (put it in ./models/tokenizers, then switch the tokenizer in the settings menu)
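
Placing the tokenizer file could look like the sketch below. It assumes pt-t.json has already been downloaded to the current directory and that ./models/tokenizers is relative to the ai-voice-cloning checkout (both are assumptions). The helper name install_tokenizer is hypothetical.

```shell
#!/bin/sh
# install_tokenizer: hypothetical helper. Copies a tokenizer JSON file into
# ROOT/models/tokenizers, creating the directory if needed.
install_tokenizer() {
    src="$1"
    root="${2:-.}"
    [ -f "$src" ] || { echo "tokenizer file not found: $src" >&2; return 1; }
    mkdir -p "$root/models/tokenizers" && cp "$src" "$root/models/tokenizers/"
}

# Assumes this is run from the ai-voice-cloning checkout.
install_tokenizer ./pt-t.json . || echo "Download pt-t.json first, then re-run"
```

The tokenizer still has to be selected in the WebUI settings menu afterwards. Copying the file only makes it available.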

Tags: Tortoise TTS, voice cloning, AI voice, text to speech
