DiffRhythm: Generative Music (done quickly)

Channel: NanoNomad
Subscribers: 2,810
Published on: 2025-04-17
Video Link: https://www.youtube.com/watch?v=oYRSijJOBEA





A look at DiffRhythm, a diffusion model for generative music. This one is FAST. I look over the demos, share some installation notes, try some demo generations to see what works and what doesn't, and try to make a decent-sounding tune. This is not a detailed tutorial.

DiffRhythm Huggingface demo - https://huggingface.co/spaces/ASLP-lab/DiffRhythm
DiffRhythm demo page - https://aslp-lab.github.io/DiffRhythm.github.io/
DiffRhythm GitHub - https://github.com/ASLP-lab/DiffRhythm

The modified scripts I mention in the video (not required, may not work, your mileage may vary, etc.) - http://nanonomad.com/2025/04/17/diffrhythm-fast-full-length-song-generation/

[00:01] What am I doing? What is all of this?
[00:46] DiffRhythm is fast
[01:02] There is an online demo, but...
[01:30] Let's listen and critique some of the demo songs
[08:10] Installation notes
[10:10] Generating the demo examples
[12:02] A few notes
[13:40] Trying some of my own random demos with questionable results
[14:20] Poor results attempting to generate sample-based music (rap)
[15:12] Better results with acoustic music
[15:30] Regenerating the folksy-country tune from the demo page
[16:40] Extending the lyrics and adjusting the timestamps of the song and results
[17:05] The adjusted infer.py and infer_utils.py

#generativeai #texttoaudio #diffrhythm




Other Videos By NanoNomad


2025-04-17 DiffRhythm: Generative Music (done quickly)
2025-02-25 Is YuE the Stable Diffusion of Music? | Generate Full-Length Songs with Vocals at Home
2025-02-10 Portable Whisper Speech to Text with Speaker Diarization and VAD | Purfview Faster Whisper XXL
2024-07-03 Fine Tuning XTTS v2 for Hindi Speech with forked Coqui TTS
2024-06-26 Fine Tuning XTTS v2 with forked Coqui | Coqui AI is dead; Long live Coqui!
2024-06-20 2x Faster LLM Training on Windows | LLaMA-Factory with Unsloth and Flash Attention 2
2024-06-15 64kb Scene Demo/Intro/Cracktro Multimedia Mix #1 (90 min) | Flash/Photo-sensitivity Warning
2024-06-10 Stable Audio Open 1.0 | Open Source* Generative Audio and Fine Tuning*
2024-06-04 Troubleshooting Sega Saturn Emulation with Retroarch for iOS/Apple
2024-05-29 Play Windows 98 and MS-DOS Games on iPad/iOS/iPhone with DOSBox-Pure and Retroarch for FREE
2024-05-25 The Lost Art of Optical Disc Repair | Fixing and Testing a PlayStation Disc
2024-05-22 Retroarch iOS Updates | Improved Performance, MS-DOS Core, Doom and Touch Input
2024-05-17 RetroArch for iPad and iPhone now on the App Store | Installation, Setup, Quick Performance Overview
2024-05-13 Micca Speck 4K Media Player | Unboxing, Firmware Update, Setup, Demos, and Opinions
2024-05-06 Training SDXL to Generate Text Using IA3 LoRA | It's like Kai's Power Tools, I Guess?
2024-04-17 Replacing Faulty Asus Phoenix RTX 3060 GPU Cooler - It's Easy
2024-03-21 Bark TTS, Seamless Translation, RVC, Music Generation and More with the TTS Generation WebUI
2024-02-14 Train Better Stable Diffusion Models | Prep Datasets Using this Free "Magic" Image Tool
2024-02-12 Emulate a Sound Blaster in real MS-DOS on Modern Hardware | Retro Gaming on "Current" PCs
2024-01-28 How to Play Hundreds of Point-and-Click Adventures on iOS for FREE with ScummVM with NO SIDELOADING
2024-01-18 Training LoRAs and GLoRAs for Stable Diffusion 1.5 and XL Using the New Prodigy Optimizer