Animated Stable Diffusion and Synthesized Voice Demo with Facial Movements
Short test video, version 2. The base image was generated with Stable Diffusion, then altered with inpainting to close the eyes, change the smile, and so on. Selected image pairs were fed to the Frame Interpolation for Large Motion (FILM) ML model to interpolate the frames in between; clips generated by FILM were then assembled into a loop of face movements. Sketches of both steps follow.
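The description doesn't name an inpainting tool, so as a minimal sketch, here is how the eye/smile edits could be done with Hugging Face diffusers' StableDiffusionInpaintPipeline; the checkpoint name, prompt, and file names are assumptions:

    # Hypothetical inpainting pass: repaint only the masked region (e.g. the eyes).
    import torch
    from diffusers import StableDiffusionInpaintPipeline
    from PIL import Image

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting",  # assumed checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")

    face = Image.open("face.png").convert("RGB")       # SD-generated portrait
    mask = Image.open("eyes_mask.png").convert("RGB")  # white = region to repaint

    result = pipe(
        prompt="portrait of a woman with closed eyes",  # assumed prompt
        image=face,
        mask_image=mask,
    ).images[0]
    result.save("face_eyes_closed.png")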
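FILM is published as a TensorFlow SavedModel on TF Hub; a minimal sketch of interpolating the midpoint frame between two of the edited stills (file names are assumptions):

    # Interpolate the frame halfway between two face images with FILM (TF Hub).
    import numpy as np
    import tensorflow_hub as hub
    from PIL import Image

    model = hub.load("https://tfhub.dev/google/film/1")

    def load(path):
        # float32 RGB in [0, 1], with a leading batch dimension
        img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
        return img[np.newaxis, ...]

    inputs = {
        "x0": load("face_eyes_open.png"),
        "x1": load("face_eyes_closed.png"),
        "time": np.array([[0.5]], dtype=np.float32),  # 0.5 = halfway between the pair
    }
    mid = model(inputs)["image"][0].numpy()           # interpolated frame
    Image.fromarray(np.uint8(np.clip(mid, 0.0, 1.0) * 255)).save("face_mid.png")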
The voice is a synthesized Loretta Young; most samples come from 1930s movies and radio plays. The audio quality of the source recordings is very poor, but the rnnoise ML model did a reasonable job of cleaning them up. The TTS model is VITS, fine-tuned with Coqui TTS for 1,153,000 steps. Sketches of the denoising and synthesis steps follow.
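rnnoise's bundled rnnoise_demo example operates on raw 48 kHz 16-bit mono PCM, so each sample has to be converted on the way in and out; a minimal sketch using ffmpeg for the conversions (file names are assumptions):

    # Denoise one voice sample with rnnoise_demo (raw 48 kHz 16-bit mono PCM in/out).
    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    # WAV -> raw PCM expected by rnnoise_demo
    run(["ffmpeg", "-y", "-i", "sample_noisy.wav",
         "-f", "s16le", "-ac", "1", "-ar", "48000", "sample_noisy.raw"])
    # Denoise (rnnoise_demo is built in rnnoise's examples/ directory)
    run(["./examples/rnnoise_demo", "sample_noisy.raw", "sample_clean.raw"])
    # raw PCM -> WAV for the TTS training set
    run(["ffmpeg", "-y", "-f", "s16le", "-ac", "1", "-ar", "48000",
         "-i", "sample_clean.raw", "sample_clean.wav"])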
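Coqui TTS can load a fine-tuned VITS checkpoint through its Python API; a minimal synthesis sketch, with the checkpoint and config paths assumed:

    # Synthesize a line with the fine-tuned VITS checkpoint via Coqui TTS.
    from TTS.api import TTS

    tts = TTS(
        model_path="run/checkpoint_1153000.pth",  # assumed checkpoint path
        config_path="run/config.json",            # assumed config path
    )
    tts.tts_to_file(
        text="Futures made of virtual insanity",  # a line of the script
        file_path="line_01.wav",
    )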
Synced the voice to the video using Wav2Lip with the wav2lip_gan model, output as a 512x512 video (a sketch follows the link).
https://github.com/Rudrabha/Wav2Lip
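Wav2Lip is driven by its inference.py script; a minimal invocation, wrapped in Python here, with the face-loop and audio file names assumed (the checkpoint name follows the repo's convention):

    # Lip-sync the face loop to the synthesized speech with Wav2Lip's inference script.
    import subprocess

    subprocess.run([
        "python", "inference.py",
        "--checkpoint_path", "checkpoints/wav2lip_gan.pth",  # GAN-trained weights
        "--face", "face_loop.mp4",   # looping face-movement video (assumed name)
        "--audio", "speech.wav",     # synthesized voice track (assumed name)
    ], check=True)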
Upscaled the video using Aaron Feng's massively feature-rich Waifu2x-Extension-GUI (a command-line sketch of an equivalent step follows the link).
https://github.com/AaronFeng753/Waifu2x-Extension-GUI
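The GUI wraps command-line engines such as waifu2x-ncnn-vulkan; a rough sketch of an equivalent frame-by-frame upscale outside the GUI, with the frame rate, flags, and file names all assumptions:

    # Rough equivalent of the GUI's video upscale: split frames, upscale, reassemble.
    import os
    import subprocess

    def run(cmd):
        subprocess.run(cmd, check=True)

    os.makedirs("frames", exist_ok=True)
    os.makedirs("upscaled", exist_ok=True)

    run(["ffmpeg", "-y", "-i", "result.mp4", "frames/%06d.png"])  # extract frames
    run(["waifu2x-ncnn-vulkan", "-i", "frames", "-o", "upscaled",
         "-n", "2", "-s", "2"])                                   # denoise 2, 2x scale
    run(["ffmpeg", "-y", "-framerate", "30", "-i", "upscaled/%06d.png",
         "-i", "result.mp4", "-map", "0:v", "-map", "1:a",        # keep original audio
         "-c:v", "libx264", "-pix_fmt", "yuv420p", "upscaled.mp4"])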
Speech is the lyrics to Jamiroquai's Virtual Insanity