ABSOLUTE SPEED -- AI
Credit for Wally West x My Ordinary Life: • My Ordinary Life but slowed at best part
Explanation: TL;DR: Others get 1-2 seconds of latency. Mine is 0.233 seconds, i.e. 233 milliseconds of latency.
It is notoriously hard to achieve low latency in a Speech-To-Text (STT) to Large Language Model (LLM) to Text-To-Speech (TTS) pipeline. Every solution I have seen (from companies and individuals alike) has only achieved 1-2 seconds of latency, so roughly 1.5 seconds on average (to be fair, companies also have to deal with network latency). Basically, getting below 1 second, let alone into the ~200 millisecond range, has been all but unheard of for this kind of pipeline until now.
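For a rough idea of how a pipeline like this can overlap its stages instead of waiting on each one, here is a minimal sketch (not the actual code from the repo): STT, LLM, and TTS run as separate processes, as the multiprocessing branch name suggests, and stream chunks to each other through queues. Every function, queue name, and string in it is a hypothetical placeholder.

# Minimal sketch, assuming a process-per-stage pipeline with streaming handoff.
# TTS can start speaking as soon as the first LLM tokens arrive instead of
# waiting for the full reply. All stage bodies are hypothetical placeholders.
import multiprocessing as mp

def stt_stage(audio_q: mp.Queue, text_q: mp.Queue):
    # Placeholder: pretend each incoming audio chunk transcribes to text.
    while (chunk := audio_q.get()) is not None:
        text_q.put(f"user said: {chunk}")
    text_q.put(None)  # signal end of stream

def llm_stage(text_q: mp.Queue, token_q: mp.Queue):
    # Placeholder: stream "tokens" out as soon as they are generated.
    while (utterance := text_q.get()) is not None:
        for token in ["Hello", " there", "!"]:
            token_q.put(token)
    token_q.put(None)

def tts_stage(token_q: mp.Queue):
    # Placeholder: synthesize/play each chunk immediately on arrival.
    while (token := token_q.get()) is not None:
        print(f"speaking: {token}")

if __name__ == "__main__":
    audio_q, text_q, token_q = mp.Queue(), mp.Queue(), mp.Queue()
    stages = [
        mp.Process(target=stt_stage, args=(audio_q, text_q)),
        mp.Process(target=llm_stage, args=(text_q, token_q)),
        mp.Process(target=tts_stage, args=(token_q,)),
    ]
    for p in stages:
        p.start()
    audio_q.put("mic chunk 1")  # feed one fake audio chunk, then close the stream
    audio_q.put(None)
    for p in stages:
        p.join()

The point of the layout is simply that no stage blocks on the one after it, which is where most of the end-to-end latency in a naive sequential pipeline comes from.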
P.S. The source code is fully free for all (link here: https://github.com/Scikous/Vtuber-AI -- this version lives in the multiprocessing branch, which is a very misleading name, don't ask why). The README is heavily out of date, and until I fix it the setup process is not going to be easy.