AI Voice Swap and Lip Sync using Wav2Lip-HQ-Updated
This workflow uses Flowframes to smooth a 30-second video loop made from 3 seconds of footage, then uses Wav2Lip-HQ-Updated (ESRGAN) to lip-sync that video to synthesized speech.
Wav2Lip-HQ Updated: https://github.com/GucciFlipFlops1917/wav2lip-hq-updated-ESRGAN
Flowframes: https://nmkd.itch.io/flowframes
AI Voice Cloning GUI: https://www.youtube.com/watch?v=snz-VzgGgmA
Video on Using Facebook's FILM Interpolation model: https://www.youtube.com/watch?v=aTYTTRxD1Hw
Model files mirrored here, as original sources appear offline:
https://huggingface.co/AOLCDROM/WAV2LIP-HQ-Updated-MIRROR/tree/main
Installation:
conda create -n env_name python=3.8 git pip
conda activate env_name
git clone https://github.com/GucciFlipFlops1917/wav2lip-hq-updated-ESRGAN.git
cd wav2lip-hq-updated-ESRGAN
pip3 install -r requirements.txt
Model files to download:
Face detector - download and rename to s3fd.pth:
https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth
wav2lip_gan.pth:
https://drive.google.com/uc?id=10Iu05Modfti3pDbxCFPnofmfVlbkvrCm
face_segmentation.pth:
https://drive.google.com/uc?id=154JgKpzCPW82qINcVieuPH3fZ2e0P812
pretrained.state:
https://drive.google.com/uc?id=1_MGeOLdARWHylC1PCU2p5_FQztD4Bo7B
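The download and inference steps above can be sketched as a shell session. This is a sketch under assumptions, not confirmed against the fork: it uses gdown (pip install gdown) for the Google Drive links, and the checkpoint locations and inference.py flag names are taken from the original Wav2Lip / Wav2Lip-HQ layout - verify both against this fork's README and inference.py before running.

```shell
cd wav2lip-hq-updated-ESRGAN
# Folder layout assumed from the original Wav2Lip-HQ repo
mkdir -p checkpoints face_detection/detection/sfd

# Face detector: download and rename to s3fd.pth
wget -O face_detection/detection/sfd/s3fd.pth \
    https://www.adrianbulat.com/downloads/python-fan/s3fd-619a316812.pth

# GAN, face segmentation, and super-resolution weights (Google Drive, via gdown)
gdown 10Iu05Modfti3pDbxCFPnofmfVlbkvrCm -O checkpoints/wav2lip_gan.pth
gdown 154JgKpzCPW82qINcVieuPH3fZ2e0P812 -O checkpoints/face_segmentation.pth
gdown 1_MGeOLdARWHylC1PCU2p5_FQztD4Bo7B -O checkpoints/pretrained.state

# Lip-sync the Flowframes-smoothed video to the synthesized speech.
# input_video.mp4 / synthesized_speech.wav are placeholder filenames.
python inference.py \
    --checkpoint_path checkpoints/wav2lip_gan.pth \
    --segmentation_path checkpoints/face_segmentation.pth \
    --sr_path checkpoints/pretrained.state \
    --face input_video.mp4 \
    --audio synthesized_speech.wav \
    --outfile result.mp4
```

The downloads and the inference run both need the conda environment from the installation step active, so activate it first if you are in a fresh terminal.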