Voice Cloning with Tortoise TTS and Model Training Using the AI Voice Cloning WebUI

Channel: NanoNomad
Published: 2023-04-22
Video Link: https://www.youtube.com/watch?v=snz-VzgGgmA
Duration: 22:27

A look at Tortoise TTS, the AI Voice Cloning WebUI, and the Tortoise TTS-Fast fork. Most of the audio was generated using the Tortoise TTS 'fast' preset, with a selection of random, cloned, and trained voices throughout. Clips were cherry-picked only when the output was nonsensical, or misspoken badly enough that the instructions would have been incorrect.

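WSL 2 Reinstall:

# In Windows PowerShell or Command Prompt (note: --unregister deletes the distro and all of its files)
wsl -l
wsl --unregister [distro]
wsl --install -d [distro]
# Inside the newly installed distro
sudo apt update
sudo apt upgrade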
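
After the reinstall, it can help to confirm that the shell is really running under WSL 2. The sketch below is not part of the original steps. It assumes that WSL 2 kernels report a release string containing "microsoft" and "WSL2", and the function name is_wsl2_kernel is made up for illustration.

```shell
# Hypothetical helper: succeeds when a kernel release string looks like WSL 2.
# Assumes WSL 2 kernels report something like "5.15.90.1-microsoft-standard-WSL2".
is_wsl2_kernel() {
  case "$1" in
    *[Mm]icrosoft*WSL2*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_wsl2_kernel "$(uname -r)"; then
  echo "Running under WSL 2"
else
  echo "Not a WSL 2 kernel: $(uname -r)"
fi
```

From the Windows side, wsl -l -v lists each distro together with its WSL version.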

Conda: https://docs.conda.io/en/latest/miniconda.html#linux-installers

Conda Install:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x ./Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh


Install CUDA:
wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-11-8-local_11.8.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
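
The CUDA packages do not necessarily put the compiler on PATH, so a quick check after the install may be useful. This is a sketch only: /usr/local/cuda/bin is an assumed install location, and prepend_once is a hypothetical helper name.

```shell
# Sketch: prepend a directory to a PATH-style string unless it is already present.
prepend_once() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;
    *) printf '%s\n' "$1:$2" ;;
  esac
}

# /usr/local/cuda/bin is the assumed toolkit location; adjust if yours differs.
PATH=$(prepend_once /usr/local/cuda/bin "$PATH")
export PATH

if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found on PATH"
fi
```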

Install CUDA toolkit (do this for each conda environment you use, or install it in base to cache the packages):
conda activate [name]
conda install -c conda-forge cudatoolkit=11.8 cudnn

Torch install:
pip install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
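
Once torch is installed, it is worth checking that it can actually see the GPU before moving on. The following is a minimal sketch, not the author's method. PYTHON is an assumed override variable and check_torch_cuda is a made-up name; the torch calls used are torch.version.cuda and torch.cuda.is_available().

```shell
# Sketch: report whether the active Python can import torch and see a CUDA device.
# PYTHON is an assumed override; it defaults to "python" (the conda env's interpreter).
check_torch_cuda() {
  py="${PYTHON:-python}"
  if ! command -v "$py" >/dev/null 2>&1; then
    echo "interpreter not found: $py"
    return 2
  fi
  "$py" - <<'EOF'
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
    raise SystemExit(1)
print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
EOF
}

check_torch_cuda || echo "check did not pass; see message above"
```

If "CUDA available" prints False, the library fix in the next section is one thing to try.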

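Fix missing libraries:
# In Windows PowerShell (an Administrator prompt is likely needed to modify this folder)
cd C:\Windows\System32\lxss\lib
mv libcuda.so libcuda.so.bak
mv libcuda.so.1 libcuda.so.1.bak
# libnvoptix.so.1 gets re-linked to libnvoptix_loader.so.1 in the next step
mv libnvoptix.so.1 libnvoptix.so.1.bak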

wsl -e /bin/bash
# in WSL
ln -s libcuda.so.1.1 libcuda.so.1
ln -s libcuda.so.1.1 libcuda.so
ln -s libnvoptix_loader.so.1 libnvoptix.so.1
exit


wsl --shutdown
wsl -e /bin/bash
sudo ldconfig
exit
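
To confirm that the links were created, the files can be inspected from inside WSL. This sketch assumes that WSL exposes C:\Windows\System32\lxss\lib at /usr/lib/wsl/lib; report_cuda_links is a hypothetical helper name.

```shell
# Sketch: report whether the CUDA libraries in a directory are symlinks.
# /usr/lib/wsl/lib is assumed to be where WSL exposes C:\Windows\System32\lxss\lib.
report_cuda_links() {
  dir="${1:-/usr/lib/wsl/lib}"
  for name in libcuda.so libcuda.so.1 libnvoptix.so.1; do
    if [ -L "$dir/$name" ]; then
      echo "$name -> $(readlink "$dir/$name")"
    else
      echo "$name is not a symlink"
    fi
  done
}

report_cuda_links
```

After the fix, each of the three names should be reported as a link rather than as a regular file.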

Create the conda environment:
conda create -n tortoise python=3.9 git pip
conda activate tortoise

Install Tortoise:
git clone https://github.com/neonbjb/tortoise-tts.git
cd tortoise-tts
python -m pip install -r ./requirements.txt
python setup.py install

Replace the Tortoise requirements.txt with this (do so before running the pip install step above):
tqdm
rotary_embedding_torch
transformers==4.19
tokenizers
inflect
progressbar
einops==0.4.1
unidecode
#scipy==0.10.1
scipy==1.10.1
librosa==0.9.1
ffmpeg
#numpy==1.20.0
#numba==0.48.0
numba==0.56.4
numpy==1.23.5
torchaudio
threadpoolctl
llvmlite
appdirs
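
Because the file mixes commented-out and active pins, a quick listing of the pins that are actually in effect can help when debugging version conflicts. A small sketch, with list_active_pins as a hypothetical helper name:

```shell
# Sketch: print the active (uncommented) version pins from a requirements file.
list_active_pins() {
  grep -v '^[[:space:]]*#' "$1" | grep '=='
}

# Only runs when a requirements.txt exists in the current directory.
if [ -f ./requirements.txt ]; then
  list_active_pins ./requirements.txt || true
fi
```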



AI Voice Cloning:
sudo apt install espeak-ng
conda create -n tort python=3.9 git pip
conda activate tort
conda install -c conda-forge cudatoolkit=11.8 cudnn
git clone https://git.ecker.tech/mrq/ai-voice-cloning.git
cd ai-voice-cloning
chmod +x *.sh
./setup-cuda.sh
source ./venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install phonemizer
deactivate
./start.sh
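
Before running the setup script, it may save time to check that the command-line tools these steps rely on are present. This is an illustrative sketch; missing_tools is a made-up helper, and the tool list (git, espeak-ng) is taken from the steps above.

```shell
# Sketch: print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools git espeak-ng)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi
```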


Tortoise TTS-Fast:
https://github.com/152334H/tortoise-tts-fast

Download one of my fine-tuned Tortoise TTS models:
Base model: https://huggingface.co/AOLCDROM/Tortoise-TTS-MSFT-VCTK-4V-En
Requires the custom tokenizer file pt-t.json (put it in ./models/tokenizers, then switch the tokenizer in the settings menu)
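
Placing the tokenizer file can be scripted roughly as follows. This is a sketch under assumptions: ./models/tokenizers is taken to be relative to the ai-voice-cloning checkout, and install_tokenizer and its arguments are hypothetical.

```shell
# Sketch: copy a downloaded tokenizer file into the WebUI's tokenizer folder.
# Arguments are assumptions: $1 = path to the downloaded pt-t.json,
# $2 = root of the ai-voice-cloning checkout (defaults to the current directory).
install_tokenizer() {
  src="$1"
  root="${2:-.}"
  if [ ! -f "$src" ]; then
    echo "tokenizer file not found: $src" >&2
    return 1
  fi
  mkdir -p "$root/models/tokenizers"
  cp "$src" "$root/models/tokenizers/"
}

# Example call (paths are placeholders):
# install_tokenizer ~/Downloads/pt-t.json ~/ai-voice-cloning
```

The tokenizer still has to be selected in the settings menu afterwards.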







Tags:
Tortoise TTS
voice cloning
AI voice
text to speech