Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab or Linux

Channel: NanoNomad
Subscribers: 2,550
Published on: 2022-12-22
Video Link: https://www.youtube.com/watch?v=e_DCb1XPWS0



Duration: 12:19
Views: 7,636
Likes: 216


This is about as close to automated as I can make things. I've put together a Colab notebook that uses a bunch of spaghetti code, rnnoise, OpenAI's Whisper Speech to Text, and Coqui Text to Speech to train a VITS model.

Upload audio files, split and process clips, denoise clips, transcribe clips with Whisper, then use that dataset to fine tune a VITS model.
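
Very roughly, the last step of that pipeline comes down to restoring a pretrained VITS checkpoint and continuing training on the new dataset. A minimal sketch of that step only (the model choice, config name, and paths here are placeholders, not the exact commands from the notebook):
# Pull a pretrained English VITS checkpoint once (the tts CLI caches models under ~/.local/share/tts/)
tts --model_name tts_models/en/ljspeech/vits --text "test" --out_path /tmp/test.wav
# Fine-tune: config.json points at the transcribed dataset, restore_path at the cached checkpoint
python3 TTS/bin/train_tts.py --config_path config.json --restore_path /path/to/pretrained_vits.pth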

The second part of the video is a quick look at installing the same thing on WSL2 Ubuntu 20.04 Linux on Windows 10. A copy-paste command list and the scripts (rnnoise denoise and Whisper transcription) are linked down below.
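
For reference, the install itself is mostly stock packages; it looks something like this (see the pastebin below for the exact, tested list):
sudo apt update && sudo apt install -y ffmpeg sox autoconf automake libtool build-essential
pip install TTS                                          # Coqui TTS trainer and tts CLI
pip install git+https://github.com/openai/whisper.git    # OpenAI Whisper STT
git clone https://github.com/xiph/rnnoise
cd rnnoise && ./autogen.sh && ./configure && make        # builds examples/rnnoise_demo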

Use single-speaker, clear audio samples. rnnoise can't work miracles.

Whisper STT + Coqui TTS Colab Notebook
https://colab.research.google.com/drive/1xy0qmej_G3skZL2BpY1sBm_k3BTTkf7V?usp=sharing

Linux command list:
https://pastebin.com/9MeCYi4p

rnnoise voice clip denoise script:
https://pastebin.com/5wrAt1UG
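
The script above is the tested version; the gist of it is that the rnnoise_demo example binary only takes raw 48 kHz 16-bit mono PCM, so each clip gets converted with sox, denoised, and converted back. An illustrative loop (paths and filenames are placeholders):
mkdir -p denoised
for FILE in splits/*.wav; do
    sox "$FILE" -r 48000 -c 1 -b 16 -e signed-integer -t raw tmp.raw       # wav -> raw PCM
    ./rnnoise/examples/rnnoise_demo tmp.raw tmp_denoised.raw               # denoise
    sox -r 48000 -c 1 -b 16 -e signed-integer -t raw tmp_denoised.raw denoised/"$(basename "$FILE")"
done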

Whisper STT voice clip transcription script:
https://pastebin.com/Q4VSsktk
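
As with the denoise step, the pastebin has the real script; conceptually it loops over the denoised clips, asks Whisper for a transcript, and appends a pipe-separated line in the LJSpeech metadata format that Coqui's dataset loader expects. A minimal sketch (model size, language, and paths are placeholders):
mkdir -p transcripts
for FILE in denoised/*.wav; do
    whisper "$FILE" --model small --language en --output_format txt --output_dir transcripts
    NAME="$(basename "$FILE" .wav)"
    TEXT="$(tr '\n' ' ' < transcripts/"$NAME".txt)"
    echo "$NAME|$TEXT|$TEXT" >> metadata.csv             # LJSpeech format: id|text|normalized text
done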

Alternate command to split files into 8-second chunks instead of on silence (the newfile pseudo-effect is what makes sox write each chunk to its own numbered file):
for FILE in *.wav; do sox "$FILE" splits/"$FILE" --show-progress trim 0 8 : newfile : restart ; done
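
For completeness, splitting on silence can also be done with sox alone; an illustrative version (not necessarily what the notebook uses, and the duration/threshold values will need tuning per recording):
mkdir -p splits
for FILE in *.wav; do sox "$FILE" splits/"$FILE" silence 1 0.2 1% 1 0.5 1% : newfile : restart ; done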




Other Videos By NanoNomad


2023-03-22  Train a VITS Speech Model using Coqui TTS | Updated Script and Audio Processing Tools
2023-03-15  Training or Fine Tuning a Hindi Language VITS TTS Voice Model with Coqui TTS on Google Colab
2023-03-05  Install and Configure Retroarch for PS Vita with Thumbnails, Overlays and Shaders
2023-03-03  Fallout 1 on the PS Vita is the Best Way to Play
2023-02-24  Train or Fine Tune VITS on (theoretically) Any Language | Train Multi-Speaker Model | Train YourTTS
2023-02-12  Even more Voice Cloning | Train a Multi-Speaker VITS model using Google Colab and a Custom Dataset
2023-02-04  Updated | Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab
2023-01-30  YourTTS Training Discussion | Experiences, Multistage Training, Demos, Prior Training Preservation
2023-01-27  Updated | Fine-Tuning YourTTS with Automated STT Datasets on Google Colab for AI Voice Cloning
2023-01-13  Fine-Tune YourTTS with Near-Automated Datasets on Google Colab for AI Voice Cloning
2022-12-22  Near-Automated Voice Cloning | Whisper STT + Coqui TTS | Fine Tune a VITS Model on Colab or Linux
2022-12-09  Dreambooth and Fine Tuning for Stable Diffusion 1.5 and 2 with this Versatile Script
2022-11-30  If Bill Gates could rap? AI Synthesized Voice, AI Upsampled Video | Deltron 3030's Virus
2022-11-14  Training Stable Diffusion Dreambooth on Multiple Subjects for Combined Image Generation
2022-10-31  Locally Train Stable Diffusion with Dreambooth using WSL Ubuntu
2022-10-25  Animated Stable Diffusion and Synthesized Voice Demo with Facial Movements
2022-10-24  Stable Diffusion Image to Video, Synthesized Lauretta Young 1930s voice, Wav2Lip Demo
2022-10-16  Animate Images using AI with Frame Interpolation for Large Motion
2022-10-14  Animated Stable Diffusion Images using Google's FILM Frame Interpolation for Large Motion demo
2022-10-07  Training Textual Inversion for Stable Diffusion | Customizable AI Image Generation
2022-09-26  How to Download All Styles and Objects from the Stable Diffusion Concepts Library | AI Images



Tags:
Voice Cloning
OpenAI Whisper
Coqui TTS
Voice Synthesis
Vocaloid
VITS Fine-Tuning