Archspire - Bleed the Future - 08 A.U.M. (Instrumental, wip)

Subscribers: 2,350
Published on ● Video Link: https://www.youtube.com/watch?v=jMVcX9RQCbg
Duration: 3:04
1,464 views

Some vocals bleed through in a few places, but less than usual; the model is still training, and I'll probably have a more finalized version in a day or two.

Might do a video on coding this, though the architecture is somewhat complex and uses concepts from both convolutional neural networks and transformers.

For anyone who cares, my model architecture has been uploaded here: https://github.com/carperbr/vocal-remover-frame-transformer (all code for this architecture is in lib/frame_transformer.py; a normal vocal remover is in nets.py and layers.py for comparison)

Have been training a new version of my model. Like the last video, it uses a weird U-Net that acts only on the frequency axis: all encoder and decoder 2D convolutions use 3x1 kernels, so features are convolved only within the same frame and temporal resolution is preserved. This appears to have a drastic effect on learning. There are no 3x3 convolutions anywhere in this architecture; interframe communication is handled entirely by transformers, specifically a modified evolved transformer encoder and decoder.
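Roughly, a frequency-only conv block looks like this (a minimal sketch with made-up names, not the actual code from the repo):

```python
import torch
import torch.nn as nn

class FrameConv(nn.Module):
    """Sketch: a 2D conv that only mixes the frequency axis.
    Input is (B, C, H, W) with H = frequency bins, W = frames; the 3x1
    kernel never looks across frames, so temporal resolution is kept."""
    def __init__(self, in_channels, out_channels, downsample=False):
        super().__init__()
        stride = (2, 1) if downsample else (1, 1)  # halve frequency only
        self.conv = nn.Conv2d(in_channels, out_channels,
                              kernel_size=(3, 1), stride=stride,
                              padding=(1, 0))
        self.norm = nn.BatchNorm2d(out_channels)
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x):
        return self.act(self.norm(self.conv(x)))

x = torch.randn(1, 2, 1024, 256)                   # (B, C, freq bins, frames)
print(FrameConv(2, 32, downsample=True)(x).shape)  # (1, 32, 512, 256)
```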

On the encoding side of the U-Net, evolved transformer encoders take the input, bottleneck it to one channel using a 1x1 convolution, and then treat the frequency bins as the embedding. The tensor is rearranged to B,W,H to reflect this and sent through the evolved transformer encoder, transposing W and H where needed for the 1D convolutions (H acts as the channel dimension there). The only places interframe communication occurs are the wide convolutions of the evolved transformer architecture and the multihead attention module, which in this project is called MultiheadFrameAttention.
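A minimal sketch of that bottleneck-and-rearrange step (hypothetical module name; the real version lives in lib/frame_transformer.py):

```python
import torch
import torch.nn as nn

class FrameEmbedding(nn.Module):
    """Sketch: squeeze the channel dim to 1 with a 1x1 conv, then treat
    each frame's frequency bins as that frame's embedding vector."""
    def __init__(self, channels):
        super().__init__()
        self.bottleneck = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                 # x: (B, C, H, W)
        e = self.bottleneck(x)            # (B, 1, H, W)
        e = e.squeeze(1).transpose(1, 2)  # (B, W, H): one embedding per frame
        return e                          # ready for attention across frames

x = torch.randn(1, 32, 512, 256)
print(FrameEmbedding(32)(x).shape)  # (1, 256, 512)
```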

There are 4 sets of transformer encoders at 4 different frequency resolutions, and 4 sets of decoders at the same resolutions on the decoding path. In that sense this U-Net has extra skip connections: the evolved transformer decoder takes the corresponding skip connection as its memory input, bottlenecked to B,1,H,W and used in multihead attention as normal.
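Sketched with stock PyTorch modules (the repo uses its own MultiheadFrameAttention, so treat the shapes here as the point, not the exact code):

```python
import torch
import torch.nn as nn

bins = 512                                   # frequency bins at this level
skip = torch.randn(1, 64, bins, 256)         # encoder skip: (B, C, H, W)
memory = nn.Conv2d(64, 1, 1)(skip)           # bottleneck -> (B, 1, H, W)
memory = memory.squeeze(1).transpose(1, 2)   # -> (B, W, H), frames as tokens

query = torch.randn(1, 256, bins)            # decoder frames as queries
attn = nn.MultiheadAttention(bins, num_heads=8, batch_first=True)
out, _ = attn(query, memory, memory)         # cross-attend over skip frames
print(out.shape)                             # (1, 256, 512)
```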

Because the convolutional part of the network expands channels as it downscales frequency, both the input and the memory need a bottleneck to project them into an HxW representation. Each resolution stacks two transformer encoders or decoders, each with its own bottleneck, which in effect lets them iteratively pull information from all channels as needed. More layers could draw even more information from the 2D convolutional portion of the architecture, but who knows. Within a stack, the layers are connected in a DenseNet fashion: each layer's output is concatenated with its input and sent through the next layer's bottleneck, as in the sketch below.
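Something like this, using a stock TransformerEncoderLayer as a stand-in for the evolved transformer encoder (hypothetical sketch, not the repo code):

```python
import torch
import torch.nn as nn

class DenseTransformerStack(nn.Module):
    """Sketch of the dense connectivity: each layer has its own 1x1
    bottleneck, and each layer's output is concatenated back onto the
    channel axis before the next layer's bottleneck."""
    def __init__(self, channels, num_layers=2, bins=512, heads=8):
        super().__init__()
        self.bottlenecks = nn.ModuleList([
            nn.Conv2d(channels + i, 1, kernel_size=1)
            for i in range(num_layers)])
        self.encoders = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=bins, nhead=heads,
                                       batch_first=True)
            for _ in range(num_layers)])

    def forward(self, x):                                  # x: (B, C, H, W)
        for bottleneck, encoder in zip(self.bottlenecks, self.encoders):
            e = bottleneck(x).squeeze(1).transpose(1, 2)   # (B, W, H)
            e = encoder(e)                                 # frame attention
            e = e.transpose(1, 2).unsqueeze(1)             # (B, 1, H, W)
            x = torch.cat([x, e], dim=1)                   # dense concat
        return x

x = torch.randn(1, 32, 512, 128)
print(DenseTransformerStack(32)(x).shape)  # (1, 34, 512, 128)
```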

Positional encoding is weird in this model. It uses a distance encoding rather than a standard positional encoding: after Q*K^T, a precomputed matrix of pairwise frame distances (built in the module's constructor) is applied through a collection of learned per-head distance weights. This lets each head focus on general regions, a more big-picture way of honing in on where to pay attention. I am going to try adding relative positional embeddings next as a kind of fine-tuning; you could think of the distance encoding plus distance weights as a low-frequency form of positional attention and relative positional embeddings as a higher-frequency form.
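In rough PyTorch, the distance bias works like this (hypothetical names, simplified relative to what's in the repo):

```python
import torch
import torch.nn as nn

class DistanceBiasedAttention(nn.Module):
    """Sketch: a precomputed |i - j| frame-distance matrix is mapped
    through per-head learned weights and added to the attention logits
    after Q @ K^T."""
    def __init__(self, bins, heads, max_frames):
        super().__init__()
        self.heads = heads
        self.scale = (bins // heads) ** -0.5
        self.qkv = nn.Linear(bins, bins * 3)
        self.out = nn.Linear(bins, bins)
        idx = torch.arange(max_frames)
        self.register_buffer('dist', (idx[None, :] - idx[:, None]).abs())
        # one learned weight per (head, distance) pair
        self.dist_weights = nn.Parameter(torch.zeros(heads, max_frames))

    def forward(self, x):                                  # x: (B, W, H)
        b, w, h = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, w, self.heads, -1).transpose(1, 2)   # (B, heads, W, d)
        k = k.view(b, w, self.heads, -1).transpose(1, 2)
        v = v.view(b, w, self.heads, -1).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) * self.scale      # (B, heads, W, W)
        bias = self.dist_weights[:, self.dist[:w, :w]]     # (heads, W, W)
        attn = (logits + bias.unsqueeze(0)).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, w, h)
        return self.out(out)

x = torch.randn(1, 128, 512)
print(DistanceBiasedAttention(512, 8, 256)(x).shape)  # (1, 128, 512)
```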




Other Videos By Benjamin Carper


2022-08-04 Within The Ruins - World Undone (Instrumental, work in progress)
2022-08-04 Within The Ruins - Resurgence (Instrumental, work in progress)
2022-08-04 AC DC - Back In Black (Instrumental, work in progress)
2022-05-04 Equilibrium - Sagas - 03 Blut Im Auge (Instrumental)
2022-05-04 The Zenith Passage - Solipsist - 02 Holographic Principle II - Convergence (Instrumental)
2022-04-29 Sonata in B Minor (intro section, v2; neo-Baroque metal)
2022-04-26 The Zenith Passage - Algorithmic Salvation (Instrumental, v3)
2022-04-26 The Zenith Passage - Synaptic Depravation (Instrumental, v2)
2022-03-27 Between the Buried and Me - The Parallax II Future Sequence - 02 Astral Body (Instrumental, wip)
2022-03-27 Archspire - Bleed the Future - 08 A.U.M. (Vocals only, wip)
2022-03-27 Archspire - Bleed the Future - 08 A.U.M. (Instrumental, wip)
2022-03-15 Archspire - Bleed the Future - 05 Drain of Incarnation (Instrumental, wip v2)
2022-03-12 First Fragment - Gloire Éternelle - 05 De Chair Et De Haine (v2, AI Instrumental WIP)
2022-03-12 First Fragment - Gloire Éternelle - 02 Solus (AI Instrumental WIP)
2022-02-19 The Devils of Loudun - Escaping Eternity - 01 The Scourge of Beasts (Instrumental, wip)
2022-01-06 First Fragment - Gloire Éternelle - 09 In'el (AI Instrumental)
2022-01-05 Archspire - Bleed the Future - 05 Drain of Incarnation (Instrumental, work in progress)
2021-11-23 First Fragment - Gloire Éternelle (AI Instrumental, work in progress)
2021-11-16 Archspire - Bleed the Future - 06 Acrid Canon (Instrumental WIP, v2)
2021-11-13 Archspire - Bleed the Future - 07 Reverie on the Onyx (Instrumental WIP)
2021-11-06 First Fragment - Dasein - 08 Gula (Instrumental WIP)