Between the Buried and Me - The Parallax II Future Sequence - 02 Astral Body (Instrumental, wip)

Subscribers: 2,350
Video Link: https://www.youtube.com/watch?v=bAJ_zUlUcAA

Game: Parallax (1986)
Duration: 5:02
Views: 341
Likes: 16

Running out of bands to post lol. This was generated with a machine learning model I've been working on, so I can't just reach in and remove the obvious errors - unfortunately humans suck at visualizing 50-million-dimensional space lol.

I've been training a new version of my model. This one uses a weird U-Net that acts only on the frequency axis: all encoder and decoder 2d convolutions use 3x1 kernels, so they only convolve features from within the same frame and retain the full temporal resolution. This appears to have a drastic effect on learning - going to get my Google Cloud setup dusted off; with more resources, I suspect this architecture would reach vastly higher quality than both this version and the vanilla U-Net version. There are no 3x3 convolutions anywhere in this architecture - interframe communication is handled by the transformer, specifically a modified evolved transformer encoder and decoder architecture. This means all communication between frames occurs in either the multihead attention modules or the wide convolutions (1d convolutions with kernel size greater than 1; the channels in this case are the frequency bins of the frame).
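To make the frequency-only convolution concrete, here's a minimal PyTorch sketch of one encoder block - not the exact code from my project; the channel counts, norm/activation, and downsampling stride are placeholder assumptions. The point is the (3, 1) kernel that mixes frequency bins within a single frame:

import torch
import torch.nn as nn

class FrameConv(nn.Module):
    # Hypothetical block name; input is (B, C, H, W) with H = frequency bins, W = frames.
    def __init__(self, in_channels, out_channels, downsample=False):
        super().__init__()
        self.conv = nn.Conv2d(
            in_channels, out_channels,
            kernel_size=(3, 1),                       # convolve along frequency only
            stride=(2, 1) if downsample else (1, 1),  # halve H, keep every frame
            padding=(1, 0))                           # pad the frequency axis only
        self.norm = nn.BatchNorm2d(out_channels)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x):  # x: (B, C, H, W)
        return self.act(self.norm(self.conv(x)))

x = torch.randn(1, 2, 1024, 256)           # stereo spectrogram: 1024 bins, 256 frames
y = FrameConv(2, 32, downsample=True)(x)
print(y.shape)                             # torch.Size([1, 32, 512, 256]) - W untouched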

On the encoding side of the U-Net, there are evolved transformer encoders that take the input, bottleneck it to 1 channel using a 1x1 convolution, and then treat the frequency bins as the embedding. The tensor is rearranged to B,W,H to reflect this and sent through the evolved transformer encoder, transposing W and H where needed for the 1d convolutions (so H plays the role of C for those). The only places interframe communication occurs are the wide convolutions of the evolved transformer architecture and the multihead attention module, which in this project is called MultiheadFrameAttention.
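Roughly, the bottleneck-and-rearrange step looks like this - a stock nn.TransformerEncoderLayer stands in for the evolved transformer encoder here (the real one has the wide 1d convolutions and MultiheadFrameAttention inside), and the names are placeholders:

import torch
import torch.nn as nn

class TransformerEncoderBranch(nn.Module):
    def __init__(self, channels, num_bins, num_heads=4):
        super().__init__()
        self.bottleneck = nn.Conv2d(channels, 1, kernel_size=1)  # C -> 1
        # stand-in for the evolved transformer encoder; embed dim = frequency bins
        self.encoder = nn.TransformerEncoderLayer(
            d_model=num_bins, nhead=num_heads, batch_first=True)

    def forward(self, x):                      # x: (B, C, H, W)
        z = self.bottleneck(x)                 # (B, 1, H, W)
        z = z.squeeze(1).transpose(1, 2)       # (B, W, H): each frame is a token,
                                               # its H frequency bins are the embedding
        z = self.encoder(z)                    # attention mixes information across frames
        return z.transpose(1, 2).unsqueeze(1)  # back to (B, 1, H, W)

x = torch.randn(1, 32, 512, 256)
print(TransformerEncoderBranch(32, 512)(x).shape)  # torch.Size([1, 1, 512, 256])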

There are 4 sets of encoders at 4 different frequency resolutions and 4 sets of decoders at the same frequency resolutions on the decoding path. In this sense, the U-Net gains extra skip connections in the form of the memory input to the evolved transformer decoder: the U-Net skip connection is bottlenecked to B,1,H,W and used as the memory in multihead attention as normal.
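The decoder-side wiring, sketched with a standard nn.TransformerDecoderLayer standing in for the evolved transformer decoder - module names and internals are placeholders, only the shapes follow what's described above:

import torch
import torch.nn as nn

class TransformerDecoderBranch(nn.Module):
    def __init__(self, channels, skip_channels, num_bins, num_heads=4):
        super().__init__()
        self.in_bottleneck = nn.Conv2d(channels, 1, kernel_size=1)
        self.mem_bottleneck = nn.Conv2d(skip_channels, 1, kernel_size=1)  # skip -> (B,1,H,W)
        self.decoder = nn.TransformerDecoderLayer(
            d_model=num_bins, nhead=num_heads, batch_first=True)

    @staticmethod
    def to_tokens(z):                 # (B, 1, H, W) -> (B, W, H)
        return z.squeeze(1).transpose(1, 2)

    def forward(self, x, skip):       # x, skip: (B, C, H, W) at the same resolution
        tgt = self.to_tokens(self.in_bottleneck(x))
        mem = self.to_tokens(self.mem_bottleneck(skip))
        out = self.decoder(tgt, mem)  # cross-attention pulls from the encoder skip
        return out.transpose(1, 2).unsqueeze(1)

x = torch.randn(1, 64, 512, 256)
skip = torch.randn(1, 32, 512, 256)
print(TransformerDecoderBranch(64, 32, 512)(x, skip).shape)  # torch.Size([1, 1, 512, 256])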

Because this is a convolutional network that expands the channels as it downscales the frequency axis, both the input and the memory in the encoder and decoder need a bottleneck to project them into an HxW representation. Each resolution has two transformer encoders or decoders stacked, each with its own bottleneck, which in effect lets them iteratively pull information from all channels as needed. More layers would be able to use more information from the 2d convolutional portion of the architecture, but who knows. Within a transformer encoder/decoder stack, the layers are connected in a DenseNet fashion: each layer's output is concatenated with its input and sent through the next layer's bottleneck.
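The dense wiring looks roughly like this - layer internals are stand-ins again; only the concat-then-bottleneck pattern is the actual idea:

import torch
import torch.nn as nn

class DenseTransformerStack(nn.Module):
    def __init__(self, channels, num_bins, depth=2, num_heads=4):
        super().__init__()
        # each layer's bottleneck sees the original channels plus one per earlier layer
        self.bottlenecks = nn.ModuleList(
            nn.Conv2d(channels + i, 1, kernel_size=1) for i in range(depth))
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=num_bins, nhead=num_heads,
                                       batch_first=True)
            for _ in range(depth))

    def forward(self, x):                                  # x: (B, C, H, W)
        for bottleneck, layer in zip(self.bottlenecks, self.layers):
            z = bottleneck(x).squeeze(1).transpose(1, 2)   # (B, W, H)
            z = layer(z).transpose(1, 2).unsqueeze(1)      # (B, 1, H, W)
            x = torch.cat([x, z], dim=1)                   # dense connection: grow C by 1
        return x

x = torch.randn(1, 32, 512, 256)
print(DenseTransformerStack(32, 512)(x).shape)  # torch.Size([1, 34, 512, 256]) with depth=2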

Positional encoding is weird in this model. It uses more of a distance encoding than a positional encoding: after Q*K^T, a matrix of pairwise distances is applied (the matrix itself is computed once in the module's constructor), and each head has its own collection of learned weights over those distances. This lets each head focus on general areas, giving it a more big-picture way of honing in on where to pay attention. I'm going to try adding relative positional embeddings next as a kind of fine tuning - I'd imagine you could consider the distance encoding + distance weights a low-frequency form of positional attention and relative positional embeddings a higher-frequency form.
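Roughly, and with the caveat that exactly how the per-head weights combine with the distance matrix is simplified here (this sketch treats them as a per-distance lookup table added to the Q*K^T scores):

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceBiasedAttention(nn.Module):
    def __init__(self, embed_dim, num_heads, max_frames):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)
        self.out = nn.Linear(embed_dim, embed_dim)
        idx = torch.arange(max_frames)
        # pairwise frame distances, built once in the constructor
        self.register_buffer("dist", (idx[None, :] - idx[:, None]).abs())  # (W, W)
        # one learned weight per distance, per head
        self.dist_weight = nn.Parameter(torch.zeros(num_heads, max_frames))

    def forward(self, x):                       # x: (B, W, E), one token per frame
        B, W, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, W, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)  # (B, h, W, W)
        bias = self.dist_weight[:, self.dist[:W, :W]]                # (h, W, W)
        attn = F.softmax(scores + bias, dim=-1)  # each head learns which distances matter
        out = (attn @ v).transpose(1, 2).reshape(B, W, -1)
        return self.out(out)

x = torch.randn(1, 256, 512)
print(DistanceBiasedAttention(512, 4, max_frames=256)(x).shape)  # torch.Size([1, 256, 512])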

Bands I could post videos of:
Vale of Pnath (II, Accursed)
Archspire (RM, BTF, could get the other one I guess too)
Beyond Creation - Algorythm
Between the Buried and Me - The Parallax II: Future Sequence
Chialism - Flesh Over Finite
First Fragment - Gloire Eternelle, Dasein
Equipoise - Demiurgus
Devils of Loudun - Enduring Creation, Escaping Eternity
Necrophagist - Epitaph
Flub - Flub
Substructures - Monolith
The Faceless - Planetary Duality
Singularity - Palace of Chains




Other Videos By Benjamin Carper


2022-08-07  Archspire - Bleed the Future - 01 Drone Corpse Aviator (Instrumental, v3)
2022-08-04  Between the Buried and Me - The Parallax II Future Sequence - Lay Your Ghosts to Rest (Instrumental)
2022-08-04  Within The Ruins - World Undone (Instrumental, work in progress)
2022-08-04  Within The Ruins - Resurgence (Instrumental, work in progress)
2022-08-04  AC DC - Back In Black (Instrumental, work in progress)
2022-05-04  Equilibrium - Sagas - 03 Blut Im Auge (Instrumental)
2022-05-04  The Zenith Passage - Solipsist - 02 Holographic Principle II - Convergence (Instrumental)
2022-04-29  Sonata in B Minor (intro section, v2; neo-Baroque metal)
2022-04-26  The Zenith Passage - Algorithmic Salvation (Instrumental, v3)
2022-04-26  The Zenith Passage - Synaptic Depravation (Instrumental, v2)
2022-03-27  Between the Buried and Me - The Parallax II Future Sequence - 02 Astral Body (Instrumental, wip)
2022-03-27  Archspire - Bleed the Future - 08 A.U.M. (Vocals only, wip)
2022-03-27  Archspire - Bleed the Future - 08 A.U.M. (Instrumental, wip)
2022-03-15  Archspire - Bleed the Future - 05 Drain of Incarnation (Instrumental, wip v2)
2022-03-12  First Fragment - Gloire Éternelle - 05 De Chair Et De Haine (v2, AI Instrumental WIP)
2022-03-12  First Fragment - Gloire Éternelle - 02 Solus (AI Instrumental WIP)
2022-02-19  The Devils of Loudun - Escaping Eternity - 01 The Scourge of Beasts (Instrumental, wip)
2022-01-06  First Fragment - Gloire Éternelle - 09 In'el (AI Instrumental)
2022-01-05  Archspire - Bleed the Future - 05 Drain of Incarnation (Instrumental, work in progress)
2021-11-23  First Fragment - Gloire Éternelle (AI Instrumental, work in progress)
2021-11-16  Archspire - Bleed the Future - 06 Acrid Canon (Instrumental WIP, v2)